I recently attended the 2017 Data Architecture Summit hosted by DATAVERSITY in Chicago. The audience, as you would imagine, was mostly data architects from a variety of industries coming together to review (and debate) architecture trends and tools. Architects are rarely afraid to speak their minds, especially if you ask for an opinion. After attending the sessions and speaking with many attendees and sponsors, here are the key themes I identified:
Long gone are the days of a ‘single source of truth’ for analytics via the enterprise data warehouse (EDW). Adding a data lake is now a common strategy for meeting more advanced analytics needs, and the lake can also serve as a repository that feeds other downstream analytics solutions. For users, it can be confusing to know where to go and which tools to use. Enter the idea of data lake zones.
The concept of presenting a single Lake with data zones targeted for different user audiences is compelling as users seek to understand the new enterprise data landscape. A presentation at the Summit by Robert Nocera from NEOS LLC illustrated this best in my opinion.
Robert proposed five zones in the Lake – Raw, Structured, Curated, Consumer, and Analytics. As data moves through the first four zones it becomes more structured and more transformations are applied.
While a single lake with zones simplifies user messaging, under the covers each zone can use the same or different data platforms and tools, giving development and maintenance flexibility. The zones idea and terminology (who liked “data lake island”?) are gaining traction, and we are introducing them at Excella as well.
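To make the zones idea a bit more concrete, here is a minimal Python sketch of how the five zones and the "each zone can sit on a different platform" point might be modeled. It is an illustration under my own assumptions, not Robert's design: the backend names, the `Dataset` class, and the `promote` helper are all hypothetical, and I have assumed that data is promoted one zone at a time through the first four zones, with Analytics treated as a separate work area.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List


class Zone(IntEnum):
    """The five zones named in the talk, ordered as presented."""
    RAW = 0
    STRUCTURED = 1
    CURATED = 2
    CONSUMER = 3
    ANALYTICS = 4


# Hypothetical mapping: each zone may use the same or a different platform.
# These backend names are placeholders, not product recommendations.
ZONE_BACKEND: Dict[Zone, str] = {
    Zone.RAW: "object-storage",
    Zone.STRUCTURED: "object-storage",
    Zone.CURATED: "relational-warehouse",
    Zone.CONSUMER: "relational-warehouse",
    Zone.ANALYTICS: "analytics-sandbox",
}


@dataclass
class Dataset:
    name: str
    zone: Zone = Zone.RAW
    applied: List[str] = field(default_factory=list)  # transformations so far


def promote(ds: Dataset, transformation: str) -> Dataset:
    """Move a dataset to the next zone, recording the transformation applied.

    Assumption: promotion is linear through Raw -> Structured -> Curated ->
    Consumer, and stops there.
    """
    if ds.zone >= Zone.CONSUMER:
        raise ValueError(f"{ds.name} is already in {ds.zone.name}; nothing to promote to")
    return Dataset(ds.name, Zone(ds.zone + 1), ds.applied + [transformation])


if __name__ == "__main__":
    orders = Dataset("orders")
    for step in ("parse", "conform", "publish"):
        orders = promote(orders, step)
        print(orders.name, orders.zone.name, ZONE_BACKEND[orders.zone], orders.applied)
```

Users only ever see the zone names, while the backend mapping can change underneath without altering the message to the business.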
Something almost everyone that I spoke to agreed upon was the difficulty of navigating the vast market of tools and platforms available. There are simply too many choices in the current marketplace and little agreement on where to invest money, time, and training efforts.
Best advice? The analytics tech market is expected to consolidate over the next few years, so either hold tight before making a purchase decision or choose tools that you can deploy and switch more easily (in case you want/need to keep pace with change).
If you are not doing so already, plan for a Cloud-first strategy for analytics workloads to provide the elastic scalability and cost efficiency that modern processing and users demand. Over time migrate to tools that meet your functional needs and are easily deployed in popular Cloud platforms like AWS and Azure.
In a keynote led by Forrester analyst Michele Goetz, the continuing challenge of hiring “data science unicorns” caught my attention. We all know that finding candidates with sophisticated statistics skills AND advanced computer science skills is difficult and expensive (some command salaries of up to $500K).
Forrester recommended creating a collaborative work environment that pairs data engineers who have computer science and automation skills with data scientists who are strong in math and statistics, to tackle 21st-century data projects. This reinforces the (successful) approach we’ve taken at Excella – see our recent blog post “No More Unicorns” for more details on how we do this.
The demand for data engineers goes further. They also have a vital role in data governance efforts – building data interfaces that connect data consumers in an organization to data governance policies and master (reference) data. They are the glue behind enterprise data and analytics enablement. Forrester quoted a stat from Indeed showing that 13% of all data-related job postings were for data engineers, versus 1% for data scientists; in-demand skills include Python, Java, C++, and working with APIs to build data pipelines.
Looking for more insights? Contact us at Excella.com.