How to cope with IoT Hub enrichment restrictions

As seen in my previous post, the IoT Hub routing feature supports message enrichment, both for IoT devices and IoT Edge modules.

Using routing message enrichments, each incoming message gets extra user properties based on static values, device twin tags, or device twin desired properties.

Unfortunately, a maximum of ten enrichments can be added.

If you want to pass more values, this will not work for you.

It would be great if nested JSON properties counted as one.

Again, unfortunately, only simple types (string, decimal, boolean, date/time, etc.) are supported so this excludes nested JSON (complex types).

Below, a viable solution to overcome both restrictions, using Azure Stream Analytics, is presented.

Let’s see how this works out.

Putting actual JSON as pseudo JSON in a string

Of course, it starts with adding an (extra) message enrichment to the IoT Hub message routing.

Here, I take the (device specific) Configuration property from the device twin Tags and try to put it into the user properties of each message of each device:
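In the portal, this is a single enrichment entry. A sketch of the settings (the 'config' key name and the endpoint are my own choices for this example):

```
Name:     config
Value:    $twin.tags.configuration
Endpoint: <your routing endpoint>
```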

Because the number of enrichments is limited to a maximum of ten, it’s impossible to enrich messages with more than ten simple values.

By now, you could still be thinking: could nested JSON be a possible solution?

So, imagine we want to read a complex configuration like below from a device twin tag:

// does not work
"tags": {
  "configuration": {
    "c1":"bla",
    "c2":14.4,
    "c3":true,
    "c4":"2022-05-17T13:01:03.7290717Z"
  }
},

Unfortunately, as stated already, nested (complex) JSON objects are not supported by the enrichments.

The trick is to fill a string (a simple type, but very flexible) with JSON without exposing it as actual JSON. At a later stage, we transform it back into the original complex structure.

Clever 🙂

I tried several strategies to turn a complex structure into a string (like arrays, key-value pairs, etc.).

In the end, this ‘pseudo’ JSON (if you know a better name, put it in the comments) did the job:

"tags": {
  "configuration": "{\"c1\":\"bla\",\"c2\":14.4,\"c3\":true,\"c4\":\"2022-05-17T13:01:03.7290717Z\"}"
},
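For completeness, this string can be produced with a plain JSON.stringify call (a sketch; the variable names are mine):

```javascript
// The complex, device-specific configuration we want to store in a twin tag.
const configuration = {
  c1: "bla",
  c2: 14.4,
  c3: true,
  c4: "2022-05-17T13:01:03.7290717Z"
};

// JSON.stringify turns the object into the 'pseudo JSON' string; once that
// string is embedded in the twin's JSON document, the quotes show up escaped.
const pseudoJson = JSON.stringify(configuration);

console.log(pseudoJson);
// {"c1":"bla","c2":14.4,"c3":true,"c4":"2022-05-17T13:01:03.7290717Z"}
```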

As a human, you still recognize the original JSON, don’t you?

As you can see, the same JSON structure is now escaped.

So, the string property now contains pseudo JSON. The enrichment doesn’t care so it’s supported.

Once the device starts sending telemetry, this string is used when enriching the ‘config’ user property:

Note: Screenshot is taken from an EventHub process data screen, added later in the process.

Handling Pseudo JSON in Stream Analytics

At this point, each message gets that configuration as a user property string.

Now, we want to use these settings inside the configuration as part of rules in Azure Stream Analytics.

How do we transform that pseudo JSON back into regular JSON? Or, how can I take parts of that string, the settings, to make decisions?

Using Azure Stream Analytics, I found out it is not that hard.

This is because we have JavaScript User Defined Functions available. So we could program whatever we need…

See how the ASA job calls the custom ‘udf.extractJsonStructure’ function (we will look into this function later on):

SELECT
  *
  , udf.extractJsonStructure(GetMetadataPropertyValue(eventhubinput, '[User].[config]')).c1 as C1
  , udf.extractJsonStructure(GetMetadataPropertyValue(eventhubinput, '[User].[config]')).c2 as C2
  , udf.extractJsonStructure(GetMetadataPropertyValue(eventhubinput, '[User].[config]')).c3 as C3
  , udf.extractJsonStructure(GetMetadataPropertyValue(eventhubinput, '[User].[config]')).c4 as C4
  , udf.extractJsonStructure(GetMetadataPropertyValue(eventhubinput, '[User].[config]')).missing as missing
FROM
  eventhubinput

First, “GetMetadataPropertyValue(eventhubinput, ‘[User].[config]’)” is used to get access to the user property string representing the pseudo JSON.

This string is then transformed into some ‘object’ where we can call the properties directly:

udf.extractJsonStructure([pseudo JSON string]).missing

Nice, this looks promising.

So, what magic is put into that function? How complex is this function?

As you will see, the function is simple!

We just need to parse the pseudo JSON and return actual JSON:

Yes, it’s just a couple of lines of code:

function main(jsonString) {
  // Parse the pseudo JSON string back into a real JSON object.
  return JSON.parse(jsonString);
}

Nothing more! You are done!

So, because that ‘object’ returned by the function is a real JSON structure, the Stream Analytics query language can consume existing properties directly:

udf.extractJsonStructure(GetMetadataPropertyValue(eventhubinput, '[User].[config]')).c1 as C1

Here, the property named ‘c1’ is extracted.

Using this logic, I can extract all other columns too. Look at the response:

Notice how each column is represented by the right format (string, float, datetime).

The boolean ‘true’ is transformed into a ‘1’ float value.

I also referenced a column named ‘missing’, which is not available in the configuration structure. This column is transformed into a ‘null’ value. This is perfect.
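In plain JavaScript, the same round trip (including the missing property) looks like this. Note that a missing property is ‘undefined’ in JavaScript; ASA then renders it as a ‘null’ value:

```javascript
// The enriched user property, as a pseudo JSON string.
const pseudoJson =
  '{"c1":"bla","c2":14.4,"c3":true,"c4":"2022-05-17T13:01:03.7290717Z"}';

// What the UDF does: parse the string back into a real object.
const config = JSON.parse(pseudoJson);

console.log(config.c1);      // "bla"
console.log(config.c3);      // true
console.log(config.missing); // undefined (ASA maps this to null)
```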

I have not tested this, but I think it should be possible to use nested values too…
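Just to illustrate: the parsing side, at least, has no problem with nesting (whether ASA can then navigate into the nested object is exactly the part I have not tested):

```javascript
// A pseudo JSON string with a nested object inside.
const nested = JSON.parse('{"outer":{"inner":42}}');

console.log(nested.outer.inner); // 42
```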

Browser versus Visual Studio Code

Here, I used the Azure portal to program my ASA job.

It is strongly recommended to use Visual Studio Code with the Azure Stream Analytics extension.

This tool is far superior and has many more features (local testing, local testing with live data, visual debugging, version control, etc.) and adds the ability to program UDF functions in C#. Check out this post about how to start constructing a Stream Analytics job in Visual Studio Code.

Note: The same goes for Azure Functions.

Conclusion

We have seen how to encode and decode a special structure, pseudo JSON, to enrich IoT messages with a complex user property.

Note: This is demonstrated with a device twin tag. It should work with device twin (desired and reported) properties too.

This overcomes the limitations of the IoT Hub routing enrichment.

Using extra resources like an Azure Function or Stream Analytics, this is a viable solution.

I demonstrated it for ASA.

Note: Potentially, this also overcomes the need for joining reference data in an ASA job. The only limitation is that device twin tags only hold a ‘current version’ and do not offer the historical view seen in reference data. This is all right for data which never changes.

The same ‘workaround’ should be trivial to implement when you code your own solution in an Azure Function.
