Near Real-Time Data Streaming with Azure Stream Analytics
Azure Stream Analytics is Microsoft’s latest addition to its suite of advanced, fully managed, serverless Platform-as-a-Service (PaaS) cloud components. As we move into the era of big data, more and more organizations find it imperative to process large amounts of data in near real-time and to act on the results. Spinning up complex data pipelines and analytics has in the past been both time-consuming and expensive, but it can now be done within minutes to hours at a very reasonable cost.
The Technology
At its core, Azure Stream Analytics is nothing more than a set of ingress and egress data streams, plus some plain old SQL with a temporal aspect that serves as the analytical engine. Despite its simplicity, it is an extremely powerful technology, and it integrates seamlessly with both custom and out-of-the-box machine learning algorithms.
Ingress (Inputs)
Azure Stream Analytics currently supports three types of inputs: Blob storage, IoT Hub, and Event Hub. Each has its own specific use case. Consider an IoT Hub if you would like to stream temperature data from a Raspberry Pi, since the IoT Hub gracefully decouples the event producers (the Raspberry Pi) from the event consumers (Azure Stream Analytics). An Event Hub is very similar to an IoT Hub, with the difference that it only supports one-way communication; it can be used to stream social media data or other kinds of big data streams. Blob storage, on the other hand, is a great source for reference data: slow-moving data used to enrich a data stream (for example, VINs for specific license plates, or customer demographics if streaming customer data). The beauty of Azure Stream Analytics is that it allows you to combine data from multiple streams into a single result set.
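To make the idea of enriching a stream with reference data more concrete, here is a minimal sketch in plain Python. It is not Azure Stream Analytics code: the function name enrich, the field names license_plate, speed and vin, and the sample values are all hypothetical and chosen only for illustration. The sketch also assumes left-join behavior, where events without a matching reference row pass through unchanged.

```python
def enrich(stream_events, reference_data, key="license_plate"):
    """Attach slow-moving reference attributes to each streaming event.

    stream_events: iterable of dicts, one per incoming event.
    reference_data: dict mapping a key value to a dict of extra attributes.
    Events with no matching reference row are passed through unchanged
    (left-join behavior, an assumption of this sketch).
    """
    for event in stream_events:
        extra = reference_data.get(event[key], {})
        # Merge the event with its reference attributes into one record.
        yield {**event, **extra}


# Hypothetical reference data, e.g. loaded from a file in Blob storage.
reference = {"ABC123": {"vin": "VIN-0001"}}

# Hypothetical stream events.
events = [
    {"license_plate": "ABC123", "speed": 42},
    {"license_plate": "XYZ999", "speed": 30},
]

enriched = list(enrich(events, reference))
```

In a real job, the same combination would be expressed as a join in the query itself; the sketch only shows the shape of the result.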
Egress (Outputs)
Triggering intelligent actions based on the processed data is one of Azure Stream Analytics’ strong suits. It is currently possible to output the aggregated result to any of the following: Azure Function, Event Hub, Service Bus, Cosmos DB, SQL Server, Blob or Table storage, Data Lake, or a streaming Power BI dashboard. Event Hubs are a good choice if you would like the data to trigger another workflow, while Azure Functions can be used to send emails or trigger other functions downstream.
Azure Stream Analytics SQL
The data processing itself is performed using a variation of SQL, a familiar syntax to those who have worked with databases. Stream processing, as opposed to normal batch processing, does not operate on a static set of data. Instead, it constantly performs analysis on a moving data stream. To handle this, Microsoft has added tumbling, hopping, and sliding windows to the analytics SQL syntax. These built-in temporal windows allow you to group your data as a set within a window of time. You may, for example, use a hopping window if you want to aggregate your data in one-minute windows while starting a new window every 10 seconds. In addition, Azure Stream Analytics SQL has many other built-in functions for handling complex data structures, and it can utilize built-in and remote Azure Machine Learning algorithms that assist with anomaly detection, semantic analysis, and more.
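The hopping window example above (one-minute windows, with a new window starting every 10 seconds) can be sketched in plain Python to show how a single event ends up in several overlapping windows. This is only an illustration of the windowing idea, not Azure Stream Analytics code: the function name hopping_window_counts and the sample timestamps are hypothetical, and the convention that a window is identified by its end time and covers the interval (end - size, end] is an assumption of this sketch.

```python
from collections import defaultdict


def hopping_window_counts(events, size=60, hop=10):
    """Count events per hopping window.

    events: iterable of (timestamp_seconds, value) tuples.
    Each window is identified by its end time, a multiple of `hop`, and
    covers the interval (end - size, end]. That boundary convention is an
    assumption of this sketch.
    """
    counts = defaultdict(int)
    for ts, _value in events:
        # Earliest window end at or after the timestamp (ceiling to a hop).
        end = -(-ts // hop) * hop
        # The event belongs to every window that starts before it.
        while end - size < ts:
            counts[end] += 1
            end += hop
    return dict(counts)


# Hypothetical events: (seconds since start, payload).
sample = [(5, "a"), (10, "b"), (65, "c")]

hopping = hopping_window_counts(sample, size=60, hop=10)

# A tumbling window is the special case where hop equals size,
# so windows do not overlap and each event is counted once.
tumbling = hopping_window_counts(sample, size=60, hop=60)
```

With a size of 60 and a hop of 10, each event is counted in six overlapping windows. Setting the hop equal to the size yields tumbling windows, where each event is counted exactly once.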
Why Is It Important?
Stream processing is used all around us, and its use will continue to grow as we continue on the AI/ML journey.
Self-driving car manufacturers already rely heavily on this technology to get their cars to stop or turn based on live sensor data streams. Just knowing that there is a pedestrian in your path is not enough to know whether you should hit the brakes. Only when that knowledge is paired with information such as your current speed, the speed at which the pedestrian is moving, tire quality, and so forth can you make an informed decision on when to stop. This is where streaming analytics comes into play. We may also find streaming analytics in an abundance of other industries, such as banking, where it is used to counter bank fraud, or industrial manufacturing, where it is used to monitor industrial sensors such as pressure valves.
Traditional industries such as large shopping malls may also benefit. Although this is most likely regulated in the US, it is not far-fetched to think that shopping malls could use image processing software to capture customers’ license plates in the parking lot. Based on the license plate information, they would be able to get additional information about their clientele, such as demographic information, family size, income, latest international trips, etc. Streaming this data in real-time would give the stores a better picture of which categories of products would sell better at any point in time during the day.
Conclusion
Azure Stream Analytics is here to stay. The ease with which it integrates with other cloud components makes the threshold for organizations to get started very low. In my upcoming blog posts, I’ll continue to dig deeper into Azure Stream Analytics and demonstrate how easily we can set up a real-world example: streaming data from a Raspberry Pi or the Twitter API. I will also compare Azure Stream Analytics to its current AWS alternative for streaming analytics, discussing the differences and similarities. Stay tuned!
Want to learn more? Watch our webinar on real-time data streaming with Azure Stream Analytics!