Along with a New Year come predictions for what’s hot and what’s not. Here are five trends we’re watching in the data and analytics space in 2018. Off we go!
1. Relentless Adoption of Machine Learning
Yes, it’s a buzzword. It’s also driving a technology revolution where machines are becoming more efficient and effective at certain tasks than humans. The application of algorithms (aka machines) that can learn (and refine) based upon new data inputs is becoming a significant competitive advantage. Take time to understand the possible applications and consult with experts in the space to get started.
If you’re not aware of machine learning (ML), start here.
If you are starting to adopt ML and want help navigating neural networks, stay tuned for an upcoming post by Sean Cantrell titled ‘The Top 3 Neural Networks’.
Excel remains a gateway tool into analytics at many organizations and works well for low data volumes (fewer than 1 million rows) and simple visualizations. In today’s world, with more data generated than ever before, getting fast and easy access to data is a common challenge for many business users and analysts.
For those looking to become more independent or more efficient (or both), stepping up to a data preparation tool like Trifacta, Paxata or Alteryx may be a good next step. These tools are designed for analysts to gain “direct access to data and a significant power assist” per The Forrester Wave™: Data Preparation Tools, Q1 2017. Data prep tools offer intuitive interfaces that enable analysts to explore, blend, clean, enhance, and share data at a scale and speed that was previously reserved for developers (imagine going from a paddle boat to a speed boat!).
Adding these tools to your enterprise stack lets analysts work with less dependence on IT by promoting self-service access to data and simplifying the steps to collect, integrate and share it. This is especially helpful when fulfilling ad hoc reporting requests, trialing new dashboards or predictive models, or exploring new data sources. In addition, reducing the ad hoc business demand on IT allows developers to focus on building out the large-scale, frequently used data pipelines that need to be robust and highly reliable. When deployed thoughtfully alongside data governance standards, data prep tools can be a boon to analysts and developers alike.
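To make the “explore, blend, clean, enhance” steps a little more concrete, here is a minimal sketch in plain Python. It is not how Trifacta, Paxata or Alteryx work under the hood; the sample data, column names and function names are all made up for illustration, and a real data prep tool would wrap steps like these in a visual interface.

```python
import csv
import io

# Hypothetical sample data standing in for two source systems.
ORDERS_CSV = """order_id,customer_id,amount
1001,c1, 250.00
1002,c2,
1003,c1,99.50
1004,c3,40.00
"""

CUSTOMERS_CSV = """customer_id,region
c1,West
c2,East
"""


def read_rows(text):
    """Explore: parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(orders):
    """Clean: drop rows with a missing amount and convert amounts to floats."""
    cleaned = []
    for row in orders:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip orders with no recorded amount
        cleaned.append({**row, "amount": float(amount)})
    return cleaned


def blend(orders, customers):
    """Blend: left-join orders to customers on customer_id."""
    region_by_id = {c["customer_id"]: c["region"] for c in customers}
    return [
        {**o, "region": region_by_id.get(o["customer_id"], "Unknown")}
        for o in orders
    ]


def enhance(rows, threshold=100.0):
    """Enhance: add a derived column flagging large orders (threshold is assumed)."""
    return [{**r, "large_order": r["amount"] >= threshold} for r in rows]


def prepare():
    """Run the full sketch: read, clean, blend, then enhance."""
    orders = clean(read_rows(ORDERS_CSV))
    customers = read_rows(CUSTOMERS_CSV)
    return enhance(blend(orders, customers))


if __name__ == "__main__":
    for row in prepare():
        print(row)
```

Each step returns new rows rather than editing the input in place, which makes it easy to inspect the data between steps, much as a data prep tool lets you preview each transformation.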
It used to be simpler – the enterprise data warehouse was the ‘single source of truth’ for analytics. Typically, it was one central database with possibly some subject-specific data marts hanging off it. Then came the era of Big Data, introducing data lakes, data islands and the dreaded data swamp. Many analysts and business users were left asking, which one am I supposed to use?
As I’ve discussed before, the concept of data lake zones brings some language simplicity back to enterprise analytics. The idea is a single lake with multiple zones, each defined by its intended usage and user type. Users know to go to the lake for analysis and should be aware of the different zones available, along with the entry criteria (tech skills and data knowledge) required to access each one. IT can determine the data platforms and tools that best fit each zone, mixing and matching the underlying technologies as needed. Sign me up!
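One way to picture zones and their entry criteria is as a small lookup. The sketch below is purely illustrative: the zone names (“raw”, “refined”, “curated”), the skill labels and the `accessible_zones` helper are assumptions for the example, not a standard or a description of any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    intended_use: str
    required_skills: frozenset  # entry criteria for this zone


@dataclass(frozen=True)
class User:
    name: str
    skills: frozenset


# Hypothetical zones, ordered from least to most prepared data.
ZONES = [
    Zone("raw", "landing area for unprocessed source data",
         frozenset({"sql", "python", "source-data-knowledge"})),
    Zone("refined", "cleaned and integrated data for exploration",
         frozenset({"sql"})),
    Zone("curated", "governed, report-ready data",
         frozenset()),
]


def accessible_zones(user, zones=ZONES):
    """Return the names of zones whose entry criteria the user meets."""
    return [z.name for z in zones if z.required_skills <= user.skills]


if __name__ == "__main__":
    analyst = User("analyst", frozenset({"sql"}))
    print(accessible_zones(analyst))
```

Keeping the entry criteria as data rather than burying them in tooling means IT can swap the platform behind a zone without changing what users need to know to get in.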
We are generating more data than ever in this age of Internet-connected devices. Inevitably, companies capturing data are considering how they can best leverage it both internally AND externally. Usurping the traditional data brokers, the same technology used for Ad Exchanges (real-time marketplaces to buy and sell online ad space) is now being used to establish real-time data exchanges – matching buyers and sellers of data on demand.
Data exchanges provide an easy way for organizations to monetize the data they collect and turn it into an additional revenue stream. On the positive side, the more an organization knows about its customers and prospects, the better it can target products and services to interested audiences. On the negative side is the question of privacy: who really owns data about you and your online behavior? Which leads us into the final trend for 2018…
Data privacy and data ownership become increasingly important in a world where data is likened to oil, the fuel for the economy. When does public policy intervene to protect the rights of citizens? What impact could new regulations have on innovation and the economy?
In the U.S. there is no single, comprehensive federal law protecting citizens’ data rights. However, there are multiple federal laws protecting certain categories of information (e.g. financial, healthcare) or regulating activities (e.g. telemarketing, email marketing).
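The European Union took a big step towards harmonizing data privacy regulations with the passage of the General Data Protection Regulation (GDPR), which comes into effect on May 25, 2018. It imposes significant penalties if data is collected without clear consent, sensitive data is stored without appropriate security infrastructure, or personal data is transferred out of the EU without defined safeguards in place. The GDPR also requires certain organizations to appoint an independent Data Protection Officer (DPO) who is responsible for overseeing the protection of user information. The requirement applies to public authorities and to organizations whose core activities involve large-scale, regular monitoring of individuals or large-scale processing of sensitive data. The 250-employee threshold that appeared in earlier drafts is not part of the final regulation.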
The world is watching to see the impact and efficacy of GDPR and whether this becomes the precedent for future privacy legislation.