Azure Time Series Insights introduction

Just this week, I was part of the Microsoft Tech Days: Flight into IoT event.

With a whole team of MVPs, we all explained different parts of Azure IoT using a simulation of an airplane flight from London to Budapest.

I myself talked about the pros and cons of Azure Time Series Insights:

Because I only had twenty minutes to explain what TSI is and to demonstrate how it works, I had to skip some topics.

In this blog, I give an overview of what I demonstrated plus I add some extra goodies and in-depth information because there luckily is no time limit to this blog 🙂

My TSI presentation was the third in the series of topics.

During the first presentation, the airplane flight and the sensor device were introduced. The data from that sensor device was then sent to the Azure IoT Hub.

Some alerts were sent towards Logic Apps in the second presentation, using the basic routing feature in the IoT Hub.

Still, individual incoming IoT Hub messages were just ‘debugged’ using the Azure IoT Explorer.

To get more insights (pun intended), I introduced Azure Time Series Insights in the story because it fits well in almost any IoT project:

What is Time Series Insights?

TSI combines a blazing-fast storage solution, capable of storing decades of raw sensor data, with query capabilities and a programmable API:

So, incoming streams of data (either from an IoT Hub or an Event Hub) can be ingested. This essentially means that any event stream in Azure can be ingested.

What about representing that data in a dashboard? There is already a TSI Explorer available to show both charts and raw data based on your queries.

The raw data is not kept hidden inside TSI. All incoming data is stored in an Azure Data Lake (Azure storage account, blob storage), so it’s accessible for e.g. Data Scientists and dashboard developers.

TSI also supports adding calculations on top of the raw data in so-called models.

These models, together with the raw data, can be accessed using REST API calls. There is even a Power BI integration available.
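To give an impression of that API, here is a hedged Python sketch of what a Gen2 raw-events query could look like; the environment FQDN, device ID, and time span are placeholders, and the exact payload shape should be verified against the official Query API reference:

```python
import json

# Placeholder values -- replace with your own TSI environment details.
ENVIRONMENT_FQDN = "<your-environment>.env.timeseries.azure.com"
API_VERSION = "2020-07-31"

def build_get_events_payload(time_series_id, search_from, search_to):
    """Build a Gen2 'getEvents' query body for the TSI Query API."""
    return {
        "getEvents": {
            "timeSeriesId": time_series_id,  # e.g. ["device-1"]
            "searchSpan": {"from": search_from, "to": search_to},
        }
    }

payload = build_get_events_payload(
    ["device-1"], "2021-03-01T00:00:00Z", "2021-03-01T01:00:00Z"
)
url = f"https://{ENVIRONMENT_FQDN}/timeseries/query?api-version={API_VERSION}"
print(url)
print(json.dumps(payload))

# To actually run the query, you would POST `payload` to `url` with an
# AAD bearer token in the Authorization header (e.g. using `requests`).
```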

So, TSI gives you the ability to store raw data and investigate that data using several options.

Why do I need TSI?

Traditionally, an IoT platform can be divided into several data paths: cold path, warm path, and hot path.

The hot path is all about alerting, sending specific derived messages based on certain rules.

The cold path does the opposite: it stores all raw data which potentially has some future value for your solution. The storage solution is cheap but reliable. This raw data usually also needs some kind of transformation before it can be queried in an efficient way.

The warm path aggregates the data to provide actual value. This is custom-built to the specification and current needs of the end user.

Looking at the feature set of TSI, it is the perfect wrapper for the cold path. With TSI, you can store that raw telemetry data in a queryable way. The raw data is not altered; it’s just stored in Parquet format.

So, the raw data is directly available for people who have questions about the incoming data:

Especially administrators, operators, IoT developers, and Data Scientists will thank you for the opportunity to look into that lake of raw data and have their first questions answered in seconds.

Note: If you are not interested in this query feature and you want to save on TSI, just route the raw data to an Azure Data Lake Storage Gen2 account using the IoT Hub storage endpoint (in JSON or AVRO format). Or use the Event Hub capture option (AVRO format).

How does it work?

TSI is divided into several parts:

It can ingest raw messages both from Azure Event Hubs and Azure IoT Hubs. Data is not mixed or joined.

It’s ingested and persisted in cold storage. This is just an Azure Data Lake; I prefer the latest one, Gen2. The retention time is… forever. There are some costs involved when querying the cold data.

The same data can also be stored in a warm storage for up to thirty days. This warm storage comes with a (small) price, but the good news is that querying the warm storage does not cost extra. The warm path is stored (in the background) in Azure Data Explorer. This is ideal when you base automatic, recurring reports on the TSI REST API.

There is a separate model repository which makes it possible to add calculated fields to the incoming messages. The model is somewhat auto-populated based on the incoming data, but extending it by hand or via the import/export option is a powerful feature.

Using the API, you can both manipulate the model and query the data.

There is a TSI explorer available for you so you can start querying within a minute after the first data arrives.

Please expect a little latency between the moment data arrives at the hubs and the moment it is stored in either the cold or warm store. Still, the experience is good: you look at data in near real-time.

How do I start?

Starting is actually very simple.

In the simplest scenario, you just attach a brand new TSI to your already existing IoT Hub:

Let’s connect an IoT Hub to a new Time Series Insights environment.

I expect you already have an IoT Hub running. Here, I use the IoT Hub and telemetry as seen in the Tech Days event.

First, we create a new TSI environment using the Azure Portal:

Attach it to the usual resource group and give it a unique name and region. Normally, the TSI environment is placed next to the IoT Hub, in the same region.

Then watch carefully!

TSI is all about storing data in a very efficient format. This format is called Time Series, hence the name 😉

To make storing information efficient we need to store it using a unique Time series ID.

The fun part is that once it’s chosen, it cannot be changed anymore!

It is a one-time insertable setting for a TSI environment.

But we are in luck!

Just follow this link to learn more about the ID to choose:

There, we learn that there is a default ID combination if you are using an IoT Hub (with additional Azure IoT Plug and Play support).

So, we just take that combination as seen above.
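For context, that default means every incoming message is bucketed by the IoT Hub device ID system property. A tiny Python sketch of how such a Time Series ID partitions messages (the message layout is simplified for illustration):

```python
# Illustrative only: how a Time Series ID partitions incoming messages.
# 'iothub-connection-device-id' is the IoT Hub system property that TSI
# can use as the default Time Series ID; the message dict below is a
# simplified stand-in for a real IoT Hub message.
TIME_SERIES_ID_PROPERTIES = ["iothub-connection-device-id"]

def time_series_id(message_system_properties):
    """Compose the Time Series ID tuple for one incoming message."""
    return tuple(message_system_properties[p] for p in TIME_SERIES_ID_PROPERTIES)

msg = {"iothub-connection-device-id": "plane-simulator-01"}
print(time_series_id(msg))  # one unique series per device
```

Because this ID decides how the Parquet files are partitioned, it is easy to see why it cannot be changed afterwards.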

Next, we need to create a new storage account; TSI has to leave the raw data somewhere.

I prefer the latest Gen2 version of the storage account data lake:

Strangely, the Azure portal only offers you the option to create and select a new storage account. The ARM template of the TSI resource makes it possible to connect to an already existing storage account…

What about the hierarchical namespace? I’m not sure what the impact of the ‘hierarchical namespace’ setting is for TSI. It’s a setting for a Gen2 DLS (which I use), so I enabled it.

Last but not least, you can activate the warm store. This can be changed later. I leave it at the default settings, which gives us a seven-day warm store. If you do not query regularly, you can disable it and save costs upfront.

Next, we can connect an event source:

Just simply select the already existing IoT Hub of your choice.

To connect to the IoT Hub, we need a security policy. The default available IoT Hub ‘service’ policy is good enough for us.

We also need to specify a UNIQUE IoT Hub consumer group. You can either use one you created already or add a new one to the IoT Hub here.

The last step is selecting the message property that represents the creation date and time of incoming messages (part of the message format).

I just leave it empty and let TSI use the IoT Hub-related timestamp.

Once this is set up, we have a running TSI environment:

We see the Time Series ID’s being shown on the main page.

Note: I imagine that if these are not valid anymore, you start a new TSI environment with new IDs and another data lake. Once all event sources are redirected to the new environment, you can optionally delete the old environment. The old data lake is not something you delete! If you want to keep the old TSI explorer for convenience, you cannot remove the old TSI environment yet.

You can see the relation with event sources and storage account using the related menu items: storage configuration and event sources:

Notice the Data access policies menu item.

Access to the TSI explorer is controlled using Azure Active Directory. So, you control who is accessing it and their role:

We are ready to start using TSI; you have now seen the most interesting features to know about Time Series Insights.

Probably, by now some sensor data is already ingested by the IoT Hub, and picked up by the TSI environment.

We can start accessing the TSI explorer:

Show me the data in the TSI Explorer

Click that ‘Go to TSI Explorer’ button and find yourself back in a new browser tab, in the main page of the explorer:

If you have not been granted access yet, check if the right account is shown in the upper right corner of the dashboard. If not, log out and log in again with the right account. If the issue persists, contact your administrator.

The explorer UI is divided into four parts:

  1. The timeline where you can select a certain portion of data in the timeline to show
  2. The (long) list of devices which contribute to the incoming data
  3. The area where the magic happens. Here, the selected raw telemetry is shown in some visualization
  4. Navigation menu towards the models (types and hierarchies)

These four parts work together to offer you the best experience.

Let’s check out each part.

The TSI Explorer timeline

You see the number of messages coming in on this simple timeline. The higher the line, the more messages are coming in:

The blue portion is the actual selection that is used in the chart.

You can play with the sliders to go back and forth in time and select a portion of the data.

Note: If you are just starting with an ‘almost empty’ TSI, there is not that much to select yet. Be patient.

You can also play with the calendar dialog if you want to be more precise regarding the time frame you are interested in:

I use the options on the left side (last X minutes, hours, or days) a lot.

On the main dashboard, there is also an option to auto-refresh so you get a more ‘sliding’ experience:

Notice the time interval is configurable.

As we will see later on, in region 3, you can further zoom in on the data in the chart. The timeline is automatically adjusted.

The UI remembers the previous selection so you can easily zoom in and out.

Once you have selected a period, let’s select telemetry from one or more devices to be represented.

The TSI Explorer device telemetry selection

In region 2, a long list of devices is shown. This can be a list representing thousands of devices…

There is a simple device name filter but later on, we will see how hierarchies could make your life easier.

In our example, we see two devices. From the second device, I select a few telemetry values and click that Add button:

The values will end up in the ‘swimming lanes’ inside the charting area, in region 3.

There is no direct relationship between the separate device properties, except for the timestamp. You have to find out if there is a relation yourself…

Note: if only one property named ‘EventCount’ is shown, wait for a moment until TSI has indexed the incoming messages. Closing and reopening the explorer also helps to access the properties…

Note: text columns are hard to show in graphical charts and are therefore omitted. You can still see the text values in the raw data table representation. You can try to add calculated properties that turn text into different numbers or categories.

Click Add if not done yet.

The TSI charting section

I selected altitude, outside temperature, and speed.

Here, in section 3, we see the actual complete flight, including taxiing, lifting off, and landing (because this was the timespan I selected in the timeline at the top of the screen):

The chart has a maximum of four ‘swimming lanes’.

You can add many more values to the chart, but you have to divide them over those four lanes.

You can combine them in the same swimming lane using the dots:

You can also select a section:

Then, you can do three things:

  • zoom in into that time section for more details (You can zoom out afterwards)
  • you can explore the raw telemetry in a table (and export this raw data even into CSV)
  • see some basic statistics like min/max/avg and standard deviation
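As a small aside, those statistics are easy to reproduce offline on a CSV export. A minimal Python sketch, assuming a hypothetical ‘speed’ column in the exported file:

```python
import csv
import io
import statistics

# A tiny stand-in for a CSV exported from the TSI explorer; the column
# name 'speed' is an assumption based on the flight telemetry.
exported = io.StringIO(
    "timestamp,speed\n"
    "2021-03-01T10:00:00Z,0\n"
    "2021-03-01T10:05:00Z,120\n"
    "2021-03-01T10:10:00Z,240\n"
)

speeds = [float(row["speed"]) for row in csv.DictReader(exported)]
print("min", min(speeds), "max", max(speeds))
print("avg", statistics.mean(speeds), "stdev", statistics.stdev(speeds))
```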

Now, let’s copy the (orange) speed value by clicking the + icon on that selected property. This adds a new copy of the same property. I give it another color (red). I also put it in the same lane. Then I do some time shifting:

From there, I select the gear icon on that new lane:

I give that new Speed property an offset of twenty minutes so I can compare the red line with what happened twenty minutes earlier (the orange line).

I can even add (and remove) markers to make the comparison even simpler:

Markers are semi permanent and can be given a description.
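Under the hood, a time shift is nothing more than offsetting the timestamps of a copy of the series. A minimal sketch of that idea in Python (the sample timestamps are made up):

```python
from datetime import datetime, timedelta

# Minimal sketch of what the explorer's time shift does: offset a copy
# of a series by twenty minutes so both lines can share one lane.
series = [
    (datetime(2021, 3, 1, 10, 0), 0.0),
    (datetime(2021, 3, 1, 10, 20), 150.0),
]

offset = timedelta(minutes=20)
shifted = [(ts + offset, value) for ts, value in series]
print(shifted[0][0])  # the first sample now lines up 20 minutes later
```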

Different kinds of charts

These are just the abilities of the line chart.

Two more chart types are available, heatmap and scatter plot:

This is an example of the heatmap, using the speed as an example:

Note: I disabled the other lines for now.

This heatmap can point me in the direction of anomalies when I see unexpected colors.

The other chart is the scatter plot, which plots circles on a grid.

That scatter plot takes three variables: X, Y, and size of the bullets (radius):

Notice, the time-shifted speed is also selectable 🙂

The result is very useful for finding anomalies:

I have only shown the lift-off part of the journey, so there is no overlap in bullets when the plane lands (where altitude and speed are low again).

We see:

  • the higher we fly, the colder it gets (size of the bullets)
  • the higher we fly, the faster the plane travels
  • no outliers, the sensors are confirming what we expect
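If you export the selection, you can also verify this scatter-plot intuition numerically. A small Python sketch, using made-up altitude/speed samples, that computes the Pearson correlation:

```python
import statistics

# Made-up altitude/speed pairs for the lift-off; checking the
# scatter-plot intuition numerically: higher altitude, higher speed.
altitude = [0, 1000, 3000, 6000, 9000]  # meters
speed = [0, 200, 450, 700, 850]         # km/h

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson(altitude, speed)
print(round(r, 2))  # close to 1.0: strongly correlated, no outliers
```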

So, this is great.

Still, we are only looking at the raw values. How can we add some calculations?

TSI Explorer models

In section 4, navigation to the models section, we can navigate to the other explorer page showing three tab items:

  1. Instances, showing all devices and the ability to manipulate them
  2. Hierarchies, relating a device to one or more logical subsets
  3. Types, extending the default capabilities of a device with calculations

First, let’s check out the types.

There is always that ‘DefaultType’ for each device so that one is there already. Leave it.

I already created this new type called ‘FlightData’:

I already gave it the usual properties (speed, altitude, etc.) but also a few new calculated properties:

Altitude is now also available in kilometers and the outside temperature can be shown in Fahrenheit. Those are simple numeric calculations:
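In TSI Gen2, such value expressions are written in the Time Series Expression (TSX) syntax on the type, along the lines of `$event.altitude.Double / 1000` (the property names here are assumptions from this flight demo). The plain-Python equivalents below show the arithmetic:

```python
def altitude_in_km(altitude_m: float) -> float:
    """Same calculation as a TSX value expression like
    '$event.altitude.Double / 1000' (property name assumed)."""
    return altitude_m / 1000

def celsius_to_fahrenheit(temp_c: float) -> float:
    """Equivalent of '$event.outsideTemperature.Double * 9 / 5 + 32'
    (again, the property name is an assumption)."""
    return temp_c * 9 / 5 + 32

print(altitude_in_km(10500))       # 10.5
print(celsius_to_fahrenheit(-40))  # -40.0 (the scales cross here)
```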

The aggregation type of a calculated property is added in a similar way:

Notice I added that extra optional filter to only take actual values for the average aggregation.

The last type of calculated property, the categorical property, is set up a bit differently.

As an example, I wanted to see if a plane is parked, on the taxiway, or flying. For that, I use the speed of the plane and created three categories.

First, I turn speed values into the ‘identifiers’: F, T, and P:

These identifiers are then matched with a human-readable label and a separate color.
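Conceptually, a categorical property is just a threshold mapping from a numeric value to an identifier, label, and color. A sketch in Python; note that the speed thresholds are my own assumptions, not the values from the actual type definition:

```python
# Sketch of the categorical mapping; the speed thresholds below are
# assumptions for illustration, not the blog's actual type definition.
CATEGORIES = {
    "P": ("Parked", "grey"),
    "T": ("Taxiing", "orange"),
    "F": ("Flying", "green"),
}

def speed_to_identifier(speed_kmh: float) -> str:
    """Turn a raw speed value into a category identifier."""
    if speed_kmh < 5:
        return "P"  # standing still / parked
    if speed_kmh < 250:
        return "T"  # taxiing or take-off run
    return "F"      # flying

label, color = CATEGORIES[speed_to_identifier(0)]
print(label, color)  # Parked grey
```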

Now we have a more meaningful type. We can take our device and relate it to this type.

Select the device in the Instances tab and edit it:

I selected the new type named FlightData and also gave the device a more meaningful name.

Save the changes (as in: publish it).

Go back to the chart, remove the current raw properties and add the new calculated properties in the line chart:

We see the new calculated values shown up. Great!

The chart looks like a heatmap now, and I can see the plane was standing still a few times before it lifted off.

Last thing, that long list of devices is bothering me. I want to see those hierarchies in action.

I added a few of them already. You can see them in the Model Hierarchies tab:

Hierarchies are quite simple to add:

Just fill in a few (nested) levels.

Once this is published, I select our airplane device and make it part of all three hierarchies:

Select the three check boxes and fill in the instance fields with the related value:

Save the changes and check out the hierarchies in section one:

The same device can now be found in all three hierarchy subsets:

Notice the markers are still in place.

Save and load model information

Editing the model section can be cumbersome. Can you store it once it works out for you?

There are some options to save and load model information:

  • Bulk upload instances upfront
  • Download current list of instances
  • Upload and download hierarchies
  • Upload and download types

This makes it possible to automate and have more control over the model information:

The stored model information is just saved in JSON format.
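To give an impression, an instances export roughly looks like the JSON below. Treat the exact schema as an assumption and check a real export from your own environment; the GUIDs and names are placeholders:

```python
import json

# Hedged sketch of an exported instances file; the field names follow
# the TSI instances model, but verify against a real export since the
# exact schema here is an assumption. GUIDs/names are placeholders.
instances = {
    "put": [
        {
            "timeSeriesId": ["plane-simulator-01"],
            "name": "Flight London-Budapest",
            "typeId": "<FlightData-type-guid>",
            "hierarchyIds": ["<hierarchy-guid>"],
            "instanceFields": {"Airline": "TechDays Air"},
        }
    ]
}

text = json.dumps(instances, indent=2)
roundtrip = json.loads(text)
print(roundtrip["put"][0]["name"])
```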

Power BI support

If you have created a beautiful visualization of a timeframe of the data using a certain chart and properties, there are a few actions that can be performed:

Next to looking at the raw data of that timeframe, there is also an option to connect that current visualization to Power BI (with some additional options):

You can then import that visualization into Power BI Desktop and work further on that data.

I see some value in this workflow. Still, this is just raw data from a few devices. Handle with care; it’s not a replacement for an actual Power BI dashboard based on the warm data path.

Save and share views

What we can do with visualizations for Power BI can also be done within the TSI Explorer: sharing visualizations.

We can save the current chart by giving it a name and perhaps allow other users to look at it too.

This is then called a view:

You can open your own and shared views:

You can also share the view as just a link with others, e.g. in an email:

Notice that the receiver still needs access to the TSI explorer using their AAD credentials before it is shown in the browser.

A more elaborate workflow

We have just seen what we can do with TSI Explorer, just connected to an IoT Hub.

Keep in mind, the same raw data is also available in the Azure Storage account which TSI uses to store the data:

The data is available for you to do your Data Science magic:

Keep in mind that you are not supposed to delete any data there; this is not recommended by Microsoft.

Accessing this raw data is also part of a more mature workflow (bottom rectangle):

For example, Azure Databricks could be a good option to start data mining the raw data using notebooks.

At the top of that picture, you can see the warm path (and even the hot path) supported by Azure Stream Analytics (as rule engine).

It could be interesting to ingest that warm/hot data also back into the TSI storage.

This way, you have a historical recording of what happened. You also have the full round trip between the cold path and decisions being made in the other paths.

Finally, using the APIs (rectangle to the right), other dashboards can make use of this solution in a programmable way. Especially the TSI warm storage has added value for checking early for anomalies, because of the ‘free’ query support.

Time Series Insights also supports metrics monitoring and alerting.


We have seen how TSI can contribute to almost all of your IoT projects.

This is your first tool to look for anomalies and trends.

The fact that you get technical dashboards with near real-time charts on top of your queryable cold storage is a big advantage.

Please check it out and have fun!