Streaming data analysis with Free Azure Data Explorer

If you are new to Azure Data Explorer, a simple and cheap way to start is the Free Azure Data Explorer cluster.

There are a number of limitations compared to the paid version of Azure Data Explorer. But hey, you get a Time Series database cluster with data ingestion, KQL querying, and dashboarding for free!

Personally, I moved over to the paid dev/test version because I work a lot with streaming Azure IoT telemetry ingestion, and this kind of data ingestion was not offered in the free version.

Until last week!

Although a direct data connection to the Azure IoT Hub or ingestion via Event Grid is not offered, we can now ingest streaming data via Event Hub.

This brings us to the following architecture:

This post will explore how to ingest streaming data using the new Event Hub support.

Please create a free cluster first, if you have not done this yet.

Later on, we attach an Event Hub, an IoT Hub, and a device simulation.

Creating a free cluster

Go to My Cluster and select Create cluster:

You will be asked to fill in details like the database name and region:

Fill in the dialog fields until you get this view:

Note: the actual region your cluster is created in is shown in the cluster URI.

You are now ready to ingest data and query using KQL.
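To quickly verify the cluster responds, any trivial KQL query will do; for example:

print message = "Hello from my free cluster", utc = now()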

Check out my previous blog for more details about using Azure Data Explorer. It also shows you how to get training.

Create an Event Hub

To work with streaming IoT data, we need an IoT Hub, connected to ADX via an Event Hub.

Notice that although the IoT Hub is part of the list of free services offered by Azure, an Event Hub is not.

To keep the cost impact as low as possible, we use a Basic tier:

Please check the pricing calculator for the costs in your region.

Navigate to the Azure portal and log into your Azure subscription.

Note: If you do not have a subscription yet, create one for free.

First, create an Event Hub (namespace) in the same region your ADX cluster is living in:

Choose the Basic tier if you want to keep the costs as low as possible.

Review and create the resource.

Once the Event Hub namespace is created, navigate to it.

Add the actual Event Hub. I named it ‘TelemetryEH’:

The partition count is set to four, on par with the number of partitions of an average IoT Hub.

Review and create the Event Hub.

Before we set up the Event Hub data ingestion in Azure Data Explorer, we first set up the IoT Hub and start ingesting data.

This will help us when we need to create the table mapping later on.

Creating an IoT Hub

Within the same subscription and resource group, add a new (free tier) IoT Hub:

Notice that we use the same region again.

Review and create the IoT Hub.

Add an Event Hub route

Once created, navigate to the IoT Hub resource, select the Message routing pane, and add an Event Hub endpoint:

Give the endpoint a name:

Make sure the endpoint relates to our own Event Hub.

Create the endpoint.

Once created, the endpoint is seen in the list:

Now, add a route:

Give the route a name and point it to the endpoint we just created.

Save the route:

See that it is now listed in the IoT Hub message routing pane.

Add a device registration

Before we can start ingesting device telemetry, we need to register a device name and get the unique security credentials for that device.

Navigate to the (still empty) Devices pane:

Add a device:

Just give it a name and keep the default values so a symmetric key is auto-generated.

Note: This is good enough for development and testing purposes.

Save the device settings.

Your device will be listed (perhaps you need to refresh the list):

Select the device to open the device details pane:

Copy the primary connection string; we need it when we connect our device to the cloud.

Create a Raspberry Pi simulation

You could build your own device, but today we use a simple, pre-defined device.

Navigate to this GitHub page where a real Raspberry Pi simulation is located:

Fill in the connection string you copied from the device registration details.

Hit run.

See that the device starts sending telemetry to the IoT Hub immediately:

The RPI sends a new message every two seconds.
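Each telemetry message is a small JSON document. As an illustration, a message from the simulator looks roughly like this (field names and values are indicative, not the exact output):

{ "messageId": 42, "deviceId": "Raspberry Pi Web Client", "temperature": 27.8, "humidity": 62.3 }

The 'temperature alert' mentioned later on is not part of this body; it travels as an application property on the message.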

If you wait for a short time, the IoT Hub tells you messages are arriving on the overview page:

By design, the IoT Hub does not show message details in the portal.

You need a tool like the Azure IoT Explorer to inspect message details.

I also checked the Event Hub; messages are arriving there too:

So, the routing is working as expected.

Let’s stream the telemetry messages to the free Azure Data Explorer.

Ingesting data into Azure Data Explorer from an Event Hub

Open the Free Azure Data Explorer landing page.

We continue with the Ingest dialog:

Click the button, and the first dialog regarding data ingestion is shown:

The cluster and database are selected.

Fill in the table name and go to the next dialog.

Amongst others, the source type Event Hubs is available:

Once selected, you can easily select all fields:

Because the basic tier Event Hub only supports one consumer group, select ‘$Default’.

Note: it is recommended to give every Event Hub consumer its own consumer group. So, if you are using a standard tier Event Hub, add a new consumer group named e.g. 'adx'.

If you are interested in IoT Hub system properties like DeviceId or Enqueued time, check out More Parameters:
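Once ingestion is running, any property you map this way can be queried like a normal column. A hypothetical example, assuming the table name Telemetry used later in this post and assuming the device id and enqueued time were selected (the exact column names depend on your choices here):

Telemetry
| project ['iothub-connection-device-id'], ['iothub-enqueuedtime'], temperature, humidity
| take 10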

Go to the next step, Schema:

If everything works out well, you will see the arrival of actual RPI simulation telemetry!

This is why we connected the device first: the incoming messages automatically help create the correct table column names…

There is only one small issue: we are ingesting JSON, but the table will be filled with the raw JSON-encoded string.

We would rather go for the individual fields.

There is a simple solution to that:

Just change the data format from TXT to JSON.

See how the individual fields are now put in separate columns.
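Behind the scenes, the wizard generates a table and a JSON ingestion mapping for you. A rough sketch of equivalent management commands, assuming the default simulator fields (the names and types the wizard actually generates may differ):

.create table Telemetry (messageId: long, deviceId: string, temperature: real, humidity: real)

.create table Telemetry ingestion json mapping "TelemetryMapping" '[{"column":"messageId","path":"$.messageId"},{"column":"deviceId","path":"$.deviceId"},{"column":"temperature","path":"$.temperature"},{"column":"humidity","path":"$.humidity"}]'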

Note: the application property in the device message, the 'temperature alert', is not ingested. This is by design; Azure Data Explorer does not support user/application properties…

You are now ready to start ingesting:

Hit the ‘number of rows’ button.

This takes us to the query pane:

I see it ingests telemetry.
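A quick way to follow the ingestion from the query pane, assuming the table name Telemetry chosen earlier:

Telemetry
| count

Telemetry
| take 10

Re-running the count shows the number of rows growing as new messages arrive.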

Initially, when this feature arrived, the ingestion rate was quite a bit slower than what the device produces because batching was used for ingestion.

The same real-time experience applies when that checkbox is not set in the previous step:

Just to be sure, I force the streaming ingestion policy on the Telemetry table:

.alter table Telemetry policy streamingingestion enable

This gave, and still gives, a much better response. The count changes every two seconds, matching the interval seen on the RPI:
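You can verify the policy is applied with the related show command:

.show table Telemetry policy streamingingestion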

Upgrading your free cluster

Azure Data Explorer offers you the option to upgrade your free cluster to a full cluster.

Unfortunately, migrating the Event Hub ingestion is not supported:

How free is free?

The free ADX is free. You do not have to pay anything to get it started.

You do not even have to register for an Azure subscription.

So you can set up a test environment for your IoT solution.

But this free version comes with limitations:

It can still store a fair amount of data, but the limit on the number of columns in particular can be a constraint for more serious tests.

And there are more limitations:

The direct IoT Hub ingestion connector is not shown in this list; that one is only available in the full version.

Finally, Azure Data Explorer is not mentioned in the list of free Azure resources.

Therefore, it is not clear how long this free ADX will be provided to you. The Azure Data Explorer team uses this environment for the ADX training facilities, e.g. the Kusto Detective Agency, so I do not expect it will be removed soon.

Still, please treat this environment as just a test and training environment.

Bonus: Microsoft Fabric KQL Queryset support

Microsoft Fabric is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, Real-Time Analytics, and business intelligence. 

For IoT Developers this is a great addition to our Azure IoT resource toolkit.

For those familiar with Microsoft Fabric, your free ADX database can also be used in a Fabric KQL Queryset if the same credentials are used.

Read about it in this blog post.

Conclusion

So, the Free Azure Data Explorer supports streaming data after we give it a little encouragement.

The complete solution is not entirely free anymore due to the (very small) costs of the Event Hub.

Still, this is an excellent solution for testing and demonstrating the power of Azure Data Explorer in IoT solutions.

More background information is available here and here.

This post is the ninth part of this Azure Data Explorer blog series: