TLDR: An Azure Stream Analytics Job is a fantastic engine that can ingest data from multiple sources and output it to multiple destinations, and that’s exactly how I recommend using it. Using the Stream Analytics Query Language to analyze and manipulate data will, sooner or later, cause you problems.
As a reminder, this is what the Job topology looks like:
The Inputs and Outputs clearly fulfil their purpose, but why do Functions and the Query exist? Well, the Query (which is written in the Stream Analytics Query Language) is the one that produces the output, and Functions are user-defined helpers (written in JavaScript, for example) that the Query can call on incoming events.
The Stream Analytics Query Language gives you great capabilities for data analysis and data manipulation. You can filter things out, group things, change formats (text-to-date conversion is a must) and so much more, but I would urge you to avoid it. Why? Because, in my experience, it is a double-edged sword: it can make your life easier, but it will also cause you problems. Here are some of the problems I faced:
- It’s hard to debug! There is a way (and Microsoft actually recommends it) to edit the Stream Analytics Job in VS Code, where you have a bit more control over what’s happening between Input and Output. But even there, if something fails, you get little to no information about why it failed.
- You don’t have the full spectrum of T-SQL. It is a SQL-like query language, so it is easy to get carried away and feel that you can do everything, but many features are missing. Also, if you have the possibility, why not do all those manipulations in C# or any other programming language before streaming the data up?
- You will notice delays! All those manipulations take time, and the more you do, the more time they will take.
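To make the "do it before streaming" idea concrete, here is a minimal sketch in Python of cleaning events on the sender side, so the Query no longer needs to convert or validate anything. Everything specific in it is an assumption for illustration: the field names (`deviceId`, `ts`, `temp`), the raw timestamp format, and the helper names `prepare_event` and `to_payload` are all hypothetical, and actually sending the payload to the Job's input is left out.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Hypothetical raw format: the device reports timestamps as text.
RAW_TIMESTAMP_FORMAT = "%d/%m/%Y %H:%M:%S"


def prepare_event(raw: dict) -> Optional[dict]:
    """Return a cleaned event ready for streaming, or None if it is unusable."""
    try:
        # Text-to-date conversion happens here instead of in the Query.
        parsed = datetime.strptime(raw["ts"], RAW_TIMESTAMP_FORMAT)
        temperature = float(raw["temp"])
    except (KeyError, TypeError, ValueError):
        # Missing or malformed fields: drop the event before it is streamed.
        return None

    return {
        "deviceId": raw.get("deviceId", "unknown"),
        # Assumption: the device clock is already in UTC.
        "eventTime": parsed.replace(tzinfo=timezone.utc).isoformat(),
        "temperature": temperature,
    }


def to_payload(raw_events: list) -> str:
    """Serialize only the usable events as a JSON array string."""
    prepared = (prepare_event(raw) for raw in raw_events)
    return json.dumps([event for event in prepared if event is not None])
```

A failure here is an ordinary exception in ordinary code, which you can step through in a debugger, and that is exactly what the Query does not give you.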
I hope those reasons are enough, but I would be happy to discuss this further. In any case, my recommendation is that you use the Query part only to filter the events and route them to the correct output. That way, you will be using the Stream Analytics Job for what it’s great at, i.e. ingesting and outputting data!