Data Worlds Collide – Seeking the 21st Century EDW?
When I’m talking with clients about how they are using their Big Data environments, the common response is “we’ve given our Data Science Team access”. These statistical boffins are well prepared for the challenges of analyzing unstructured, semi-structured and structured data at volume.
In contrast, I see a plethora of roles from finance, marketing, sales and other business departments using off-the-shelf BI tools to access data in existing data warehouses and datamarts – and making data-driven decisions. No statistics or computer science degree required. Three decades of honing data integration techniques, along with the evolution of tools and platforms, have made structured data available to the business masses. A key part of this enablement was presenting one cohesive view of data across a department or enterprise – the data warehouse.
EDW 2.0?
The EDW concept is simple and common sense – merge disparate data sources together into one central repository – a single version of the truth for the enterprise. In practice, building an EDW was often a mammoth effort; viewed as a necessary step, but not undertaken lightly. The Data Warehousing Institute (TDWI) declared in its BI Maturity Model that an organization did not become a BI ‘Adult’ until the perilous ‘Chasm’ of challenges had been crossed and a functional EDW established. To those of you who successfully crossed the Chasm – congratulations, it’s a feat worth celebrating!
Today, organizations are often supplementing their EDW with a Hadoop environment and adding a few truckloads of unstructured data from varying sources into it. Many of you want access to this new, promising data ALONGSIDE your current information AND without having to be a data scientist to get it.
How to Survive (and Thrive) in a New World of Data
This is the big question we’re focused on at Excella – how do you best access data from unstructured/semi-structured sources (aka Big Data) and structured sources (the data you have in your Warehouse or Marts) with the objective of providing a coherent view across ALL your data sources?
Here are the options we’re exploring at Excella:
- Creating structured (summary) data from unstructured/semi-structured data and adding this into an existing Data Warehouse – when does this approach work and when does it not work?
- Moving everything into Hadoop – is it worth it?
- Using an open source NoSQL platform as the intermediary for structured and unstructured workloads (instead of Hadoop).
- Using data virtualization or data federation tools to bridge the gap, sourcing data on demand and leaving it stored disparately.
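To make the first option a little more concrete, here is a minimal sketch of turning semi-structured events into structured summary rows that sit next to existing warehouse data. Everything in it is an assumption for illustration: the event fields, the `customer_dim` and `event_summary` table names, and the in-memory SQLite database standing in for a real warehouse. It is not a description of any particular platform or of how Excella implements this.

```python
import json
import sqlite3
from collections import Counter

# Hypothetical semi-structured input: one JSON event per line, with
# optional fields and the occasional malformed record.
RAW_EVENTS = """
{"customer_id": 1, "day": "2020-01-01", "event": "page_view", "meta": {"url": "/home"}}
{"customer_id": 1, "day": "2020-01-01", "event": "page_view"}
{"customer_id": 2, "day": "2020-01-01", "event": "purchase", "amount": 20.0}
not valid json
{"customer_id": 1, "day": "2020-01-02", "event": "purchase", "amount": 5.5}
"""


def summarize(raw_text):
    """Count events per (customer_id, day, event), skipping unusable lines."""
    counts = Counter()
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
            key = (record["customer_id"], record["day"], record["event"])
        except (ValueError, KeyError, TypeError):
            # Malformed JSON or missing fields: leave the record out of the summary.
            continue
        counts[key] += 1
    return sorted((c, d, e, n) for (c, d, e), n in counts.items())


def load_into_warehouse(rows):
    """Load summary rows beside a (hypothetical) existing dimension table."""
    conn = sqlite3.connect(":memory:")  # stand-in for the real warehouse
    conn.execute("CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customer_dim VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
    conn.execute(
        "CREATE TABLE event_summary "
        "(customer_id INTEGER, day TEXT, event TEXT, event_count INTEGER)"
    )
    conn.executemany("INSERT INTO event_summary VALUES (?, ?, ?, ?)", rows)
    return conn


rows = summarize(RAW_EVENTS)
conn = load_into_warehouse(rows)

# A BI-style query: the new summary data joined to existing structured data.
report = conn.execute(
    "SELECT d.name, SUM(s.event_count) "
    "FROM event_summary s JOIN customer_dim d ON d.customer_id = s.customer_id "
    "GROUP BY d.name ORDER BY d.name"
).fetchall()
print(report)
```

The trade-off is visible even at this scale: the summary table is easy to query with ordinary SQL and BI tools, but detail that was not aggregated (the URLs and purchase amounts here) never reaches the warehouse, which is part of what "when does it not work?" is asking.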
Stay tuned as we publish our observations in the coming weeks, and subscribe to get access to all Excella blog posts. Have an alternate option you’d like us to prove out? Contact us!