Azure IoT Hub device query language

The Azure IoT Hub can register thousands of devices. To manage them at scale, several kinds of tooling is made available.

First, using the Device Twin of each device, devices can store extra context (type, brand, vendor, version, location, features, etc.) using the Tags section.

In the Tags section of the device twin, you are free to add a number of (sub) nodes to this JSON document section:

The tags will never be readable by the device itself, these tags are used to query all devices in your IoT Hub and make subsets.

You, both as Azure portal user or as a programmer, can query all devices for specific features.

In the Azure portal, this ability to query is most visible when you open the list of (edge) devices:

This shows a new section where we can enter a SQL-ish query starting with “SELECT FROM devices WHERE”:

Let’s dive further into this query language.

The query language is used for several Azure IoT Hub tasks:

  1. Querying devices
  2. Routing messages to other services
  3. Querying jobs

The same query language is also seen in the Azure IoT device configuration section.

Note: The query language also supports groups. For now, this is out of scope.

Querying devices

This query language is made available in the Azure Portal. It can also be executed using the CLI of the Azure IoT Hub SDK (using the RegistryManager class):

using Microsoft.Azure.Devices;
using Newtonsoft.Json;
using System;
internal class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Hello IoT Hub query!");

        var registryManager = RegistryManager.CreateFromConnectionString("[IoT Hub connection string]");
            
        var query = registryManager.CreateQuery("SELECT * FROM devices", 100);
            
        while (query.HasMoreResults)
        {
            var page = query.GetNextAsJsonAsync().Result;
            foreach (string json in page)
            {
                dynamic deviceTwin = JsonConvert.DeserializeObject(json);

                Console.WriteLine(deviceTwin.deviceId);
            }
        }            
    }
}

Here, we get a list of device ID’s. Paging is supported, as shown here. We have to loop twice through the result. First, we get pages, each with a maximum of 100 devices (batch size as specified in this code). Then, we access each individual device.

Note: we need the shared access key of the registryRead access policy. We only want to read the registry, nothing more in this case.

From each device, we get access to the complete device twin:

{
  "deviceId": "device5",
  "etag": "AAAAAAAAAAI=",
  "deviceEtag": "MjczMDYzNzYy",
  "status": "enabled",
  "statusUpdateTime": "0001-01-01T00:00:00Z",
  "connectionState": "Disconnected",
  "lastActivityTime": "2022-02-11T10:21:32.901667Z",
  "cloudToDeviceMessageCount": 0,
  "authenticationType": "sas",
  "x509Thumbprint": {
    "primaryThumbprint": null,
    "secondaryThumbprint": null
  },
  "modelId": "dtmi:com:example:NonExistingController;1",
  "version": 3,
  "tags": {
    "feature2": 5
  },
  "properties": {
    "desired": {
      "$metadata": {
        "$lastUpdated": "2022-02-11T09:34:02.5332887Z"
      },
      "$version": 1
    },
    "reported": {
      "$metadata": {
        "$lastUpdated": "2022-02-11T09:34:02.5332887Z"
      },
      "$version": 1
    }
  },
  "capabilities": {
    "iotEdge": false
  }
}

As you can see, this ‘device5’ has one tag named ‘feature2’. This is an integer value. We also get access to the desired and reported properties.

Note: as noted in the official documentation, accessing device twins of individual devices this way is not optimal regarding refresh speed. You can experience refresh lag. When you want to read the latest values of a single device, in near real-time, use the Rest API or Azure IoT SDK to access that device registration directly.

Device test set

To be able to play around with the query language, I created a couple of devices with different tags. We should be able to both query individual devices and subsets of devices.

Note: We can query on a number of DeviceTwin settings. For simplicity, I only used tags. I did not add desired or reported properties. These DeviceTwin elements can be queried also.

Based on that set of devices, we will execute a number of queries. This will give a nice impression of what can be done using this query language.

Limited feature set

Here, we query devices. IoT Hub devices also support the concept of (logical) modules. The query language will also work for these device modules.

Still there are some limitations.

For querying devices, we can only apply:

  • Arithmetic operators: +, -, *, /, %
  • Logical operators: AND, OR, NOT
  • Comparison operators: =, !=, <, >, <=, >=, <>
  • One function: IS_DEFINED(property)

The query language is much richer: math functions, casting functions and string functions.

Unfortunately, these functions are only applicable to the routing functionality!

Today, we focus on the device querying functionality.

Function: is_defined

We can check if a device has certain tags, apart from the actual value.

Here I query for all devices without any (specified) features:

SELECT * FROM devices WHERE not is_defined(tags.feature1) and not is_defined(tags.feature2) and not is_defined(tags.feature3) and not is_defined(tags.feature4)

In my test, only two devices are selected:

Notice we can also make use of NOT and AND.

!= comparison

You can compare with =, !=, <, >, <=, >=, <>.

Notice what happens if I query for tags “not equal to” a certain value:

SELECT * FROM devices WHERE tags.feature1 != 'TWO'

This results only in devices having this tag in the DeviceTwin but with another value:

Devices not having the tag for comparison are ignored (I added them too). This means we do not need to care about ‘null’ values.

Other abilities

Let’s circle back to the DeviceTwin.

Perhaps you remembered the Azure Portal offers a simple query to search for a certain device using the id of devices.

Check the location of this element within the DeviceTwin:

{
  "deviceId": "device5",
  "etag": "AAAAAAAAAAI=",
  "deviceEtag": "MjczMDYzNzYy",
  "status": "enabled",
  "statusUpdateTime": "0001-01-01T00:00:00Z",
  "connectionState": "Disconnected",
  "lastActivityTime": "2022-02-11T10:21:32.901667Z",
  "cloudToDeviceMessageCount": 0,
  "authenticationType": "sas",
  "x509Thumbprint": {
    "primaryThumbprint": null,
    "secondaryThumbprint": null
  },
  "modelId": "dtmi:com:example:NonExistingController;1",
  "version": 3,
  ...
}

The same goes for the other fields.

Below, I show how to query some of them. You can try the others yourself.

Enabled/Disabled devices

The query language supports checking which devices are enabled or disabled:

SELECT * FROM devices WHERE status != 'enabled'

If needed, this can be combined with the other ‘where’ clauses:

Connection status

We can also check for the list of devices which are currently connected:

SELECT FROM devices WHERE connectionState = 'Connected'

This results in this list but keep in mind this is not a near real-time ‘snapshot’:

There can be a delay between Device Twin updates.

Still, over time, this should give you enough insights in querying device connection.

You could even combine this with the lastActivityTime element:

SELECT FROM devices WHERE lastActivityTime > '2022-02-11T10:21:32.901667Z'

This provides a subset of devices with a valid last active date time (again, taking the update lag into account):

Conclusion

The IoT Hub query language is the solution to be able to handle devices at scale once you have to search through thousands of devices.

The query language is quite powerful with all the functions available. Unfortunately, only a subset is available for querying devices.

A simple workaround is to query the devices outside the Azure portal using some programming language and use your programming skills to go a step further.

Advertentie