Plotting the Azure Digital Twins graph in Azure Data Explorer

There is a very strong relationship between these two Azure services, Azure Digital Twins and Azure Data Explorer.

Not only can historical Twin changes be stored in Azure Data Explorer and represented in Azure Digital Twins 3D Scenes histogram widgets, but any Digital Twin graph life cycle update can also be persisted.

How to store ADT historical twin changes and twin graph changes is described in this post.

But what if we could actually show the twin graph in ADX? Could we see how it evolves over time?

Using the new Plotly graph and the Python plugin, we can.

That post is actually describing how to use the new KQL graph semantics feature (now in public preview).

With a little bit of tweaking, this is how a Digital Twin graph is represented at a certain point in time in Azure Data Explorer:

Let’s check out how we can use this to represent your Azure Digital Twins graph.

As already mentioned, most of the work is done already by Henning Rauch aka Professor Smoke, Principal Program Manager of the Azure Data Explorer (Kusto) product team.

Check out this blog post and the accompanying GitHub gist containing the Python code needed to create a ‘plotly viz object’. It’s the starting point of our journey.

Azure Digital Twins, historical graph changes

A graph is a combination of ‘nodes’ and ‘edges’. In Azure Digital Twins, the twins are the nodes and the relations are the edges.
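To make the nodes-and-edges idea concrete, here is a minimal Python sketch. The twin identifiers and the ‘contains’ relationship name are hypothetical and only serve as an illustration; they are not taken from the actual twin graph used in this post.

```python
from collections import defaultdict

# Hypothetical twins and relationships, for illustration only.
# Each edge is (source twin, relationship name, target twin).
relationships = [
    ("factory-1", "contains", "floor-1"),
    ("floor-1", "contains", "line-1"),
    ("floor-1", "contains", "line-2"),
]


def build_graph(edges):
    """Derive the node set and an adjacency map from a list of edges."""
    nodes = set()
    adjacency = defaultdict(list)
    for source, name, target in edges:
        nodes.update((source, target))            # every endpoint is a node
        adjacency[source].append((name, target))  # outgoing edges per twin
    return nodes, dict(adjacency)


nodes, adjacency = build_graph(relationships)
print(sorted(nodes))
print(adjacency["floor-1"])
```

Note that the nodes are derived purely from the endpoints of the edges, so a twin without any relationship would not show up in a graph built this way.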

Azure Digital Twins makes it possible to export graph changes to Azure Data Explorer.

So, ADX learns how nodes and edges appear and disappear over time when the graph is changed.

I used to do this myself as seen in this blog post. Nowadays, a standard solution is made available.

If you implement that export, up to three tables are made available:

Once the historical connection is set up, we start with building a graph in the Digital Twin environment:

This is almost the same graph as seen in this ADT hands-on lab. The only thing I changed was removing one relation.

All changes are stored in ADX: the lifecycle of twins, relationships, and properties.

Here, we are interested in the relationship events:

As you can see, this table describes, for each relationship event, the name of the relationship, the action (creation or deletion), the timestamp, the source, and the target.

There are eight initial entries for relationships, but one relationship is removed at a later moment.

This means we only have seven relationships to be represented, between eight twins.

But first, let’s set everything in place for working with Python and Plotly in Azure Data Explorer.

Activating the Python language extension

Azure Data Explorer makes it possible to write custom queries to extend the already very powerful KQL query language.

These queries can even be written in custom Python code.

You first need to activate this Python language extension option in the ADX cluster configuration, before Python functions are accepted.

Go to the cluster configuration page. There, Python can be activated:

Notice that the layout we see here seems new; it differs quite a bit from the documentation.

Check the Python language extension. Python version 3.6.5 seems to be selected automatically and I was not able to select any other version. This is OK for our test.

Note: It took me a couple of attempts to activate the option. Make sure you actually select that version line after switching the radio buttons and before hitting the Save button (otherwise, I got a ‘Missing required languageExtensionImageName property’ error).

All you have to do is save this configuration change, preferably at a quieter moment. You are not able to make other changes for up to fifteen minutes:

Once the update has succeeded, you can proceed with loading the already prepared Python function into your ADX database.

Loading the Python function

Once the Python language extension is activated, we need to import that Python function, made available so generously already.

Just go to the GitHub gist and select the raw version of it so you can copy it in one go:

Note: you can also download it as a zip file first.

This will create a function named ‘VisualizeGraphPlotly’ within the same database where the relationship table is found:

.create-or-alter function with (skipvalidation = "true") VisualizeGraphPlotly(
    E:(sourceId:long,targetId:long), N:(nodeId:long), 
    pLayout:string="spring_layout", pColorscale:string="Picnic", pTitle:string="Happy kraphing!") {
let pythonCodeBlueprint = ```
  ...

Note: I have no clue what ‘Happy kraphing!’ means either 🙂

Note: Regarding the ‘spring_layout’ layout option, I was hoping for a more tree-like layout. I tried a couple of other NetworkX layouts (shell_layout, spectral_layout, circular_layout, kamada_kawai_layout, random_layout) but these did not differ that much.

Notice the empty lines in the Python script. Although this is normal for Python, the Azure portal query editor treats an empty line as the end of a command, so it tries to execute only the first (thus broken) part of the command. The same goes for the Web UI version of that query editor.
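A possible workaround, sketched here as an assumption and not verified against the portal editor, is to strip the empty lines from the command text before pasting it. The helper name ‘strip_blank_lines’ is hypothetical and not part of the gist. Be aware that this also removes empty lines inside multi-line string literals, which would change their content.

```python
def strip_blank_lines(command_text: str) -> str:
    """Return the text without empty or whitespace-only lines."""
    kept = [line for line in command_text.splitlines() if line.strip()]
    return "\n".join(kept) + "\n"


# Tiny stand-in for a script that contains an empty line.
sample = "def f():\n    x = 1\n\n    return x\n"
cleaned = strip_blank_lines(sample)
print(cleaned)
```

Removing empty lines does not change the behavior of regular Python statements, because indentation and not vertical spacing defines the block structure.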

It seems the Kusto editor understands the Python part better and executes it as one command.

Once the function is loaded, it’s part of your database:

Now, it’s time to use this function for the Azure Data Explorer relationships.

Building the graph query

Based on the example seen in the original post, we can now start collecting the data.

Here, we start by using just the relationship events, found in the ‘AdtLifecycleRelationshipEvents’ table. Their sources and targets are converted to nodes and edges:

AdtLifecycleRelationshipEvents
| make-graph Source --> Target
| graph-to-table 
    edges as E with_source_id=sourceId with_target_id=targetId, 
    nodes as N with_node_id=nodeId;
N;
E;

This results in two separate tables, starting with the nodes which are just unique identifiers for each individual node:

Notice that nine nodes or Twins are listed.

This is wrong, there are only eight twins expected!

That deleted relationship is still taken into account:

We are only interested in relationships that are not finally deleted!

With the power of KQL, we can fix the query:

let latestDeletedRelations =
 AdtLifecycleRelationshipEvents
 | summarize maxif(TimeStamp, Action=="Delete") by RelationshipId
 | where isnotnull(maxif_TimeStamp)
 | project-away maxif_TimeStamp;
AdtLifecycleRelationshipEvents 
| where RelationshipId !in (latestDeletedRelations)
| make-graph Source --> Target
| graph-to-table 
    edges as E with_source_id=sourceId with_target_id=targetId, 
    nodes as N with_node_id=nodeId;
N;
E;

Here, we skip every relationship that has a Delete action recorded against it.

The ‘latestDeletedRelations’ expression (defined with let) returns the identifier of every relationship that has at least one ‘Delete’ event. The summarize maxif aggregation yields the timestamp of the most recent Delete per relationship, and null for relationships that were never deleted, so we filter with isnotnull. Note that this equals ‘the latest action is a Delete’ only as long as a relationship identifier is not created again after its deletion.

We then keep only the events of relationships that are not in that list, using the !in operator.
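The filtering steps of this query can be mirrored in plain Python, which may help to see what each step contributes. This is only a sketch: the column names come from the query, but the rows and the helper names (‘deleted_ids’, ‘remaining_edges’) are hypothetical.

```python
from datetime import datetime

# Hypothetical rows; only the column names come from the query.
events = [
    {"RelationshipId": "r1", "Action": "Create", "Source": "a", "Target": "b",
     "TimeStamp": datetime(2023, 1, 1, 10, 0)},
    {"RelationshipId": "r2", "Action": "Create", "Source": "a", "Target": "c",
     "TimeStamp": datetime(2023, 1, 1, 10, 1)},
    {"RelationshipId": "r2", "Action": "Delete", "Source": "a", "Target": "c",
     "TimeStamp": datetime(2023, 1, 1, 11, 0)},
]


def deleted_ids(rows):
    """Ids with at least one Delete event (the maxif + isnotnull step)."""
    return {r["RelationshipId"] for r in rows if r["Action"] == "Delete"}


def remaining_edges(rows):
    """(Source, Target) pairs of rows whose id has no Delete event (the !in step)."""
    gone = deleted_ids(rows)
    return [(r["Source"], r["Target"]) for r in rows
            if r["RelationshipId"] not in gone]


print(remaining_edges(events))
```

In this sample, both rows of relationship ‘r2’ (the creation and the deletion) are dropped, so only the edge from ‘a’ to ‘b’ remains. Without the filter, the Create and Delete rows of ‘r2’ would both be turned into edges.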

This results in eight nodes and seven edges:

This query is the starting point of the visualization using the ‘VisualizeGraphPlotly’ function:

let latestDeletedRelations =
 AdtLifecycleRelationshipEvents
 | summarize maxif(TimeStamp, Action=="Delete") by RelationshipId
 | where isnotnull(maxif_TimeStamp)
 | project-away maxif_TimeStamp;
AdtLifecycleRelationshipEvents 
| where RelationshipId !in (latestDeletedRelations)
| make-graph Source --> Target
| graph-to-table 
    edges as E with_source_id=sourceId with_target_id=targetId, 
    nodes as N with_node_id=nodeId;
VisualizeGraphPlotly(E, N, pTitle='Azure Digital Twins graph' )

Notice how the title name is overridden.

Execution of this function takes a lot of time! Be patient!

For just this handful of nodes and edges, it took more than 25 seconds (on the smallest VM, a single machine in my cluster):

It’s not clear to me why it takes that large amount of time…

But the function works; we get this single plotly JPath.

We can now add the visualization.

From query to Plotly visualization

If you follow the original post, you need to select ‘Add visual’ in the Web UI query editor:

In that dialog, you need to select Plotly as a visual type.

It seems KQL already supports Plotly directly, using the render operator (render plotly):

That was easy!

If you hover over a node, a popup dialog is shown, only containing the number of connections and that ‘random’ node identifier which is not the Azure Digital Twin name:

This seems to be not that helpful but you need to combine this with the dialog shown when hovering over the nearest relationship:

This shows how the relationship, the edge, is set up. It feels a bit technical but it is the same as what was shown in the original table.

If extra relationship-related information is added to the original relationship table (like ‘| extend extra='extra'’), it pops up here too:

So, we have a functional Azure Digital Twins twin graph in Azure Data Explorer.

Now, how about ADX Dashboard integration?

Just put the query without the render statement in the base query area and select Plotly as Visual type:

As a column, ‘Infer: plotly’ should be selectable.

This gives us the same graph as seen in the KQL editor.

But wait, we have another trick!

ADX Dashboards support the concept of parameters. These are extra filters made available as simple dropdown lists, including the standard available Time range filter.

I included support for ‘_endTime’ so we can go back in history and see how the graph looked back then:

let latestDeletedRelations =
 AdtLifecycleRelationshipEvents
 | where TimeStamp < _endTime
 | summarize maxif(TimeStamp, Action=="Delete") by RelationshipId
 | where isnotnull(maxif_TimeStamp)
 | project-away maxif_TimeStamp;
AdtLifecycleRelationshipEvents 
| where TimeStamp < _endTime
| where RelationshipId !in (latestDeletedRelations)
| make-graph Source --> Target
| graph-to-table 
    edges as E with_source_id=sourceId with_target_id=targetId, 
    nodes as N with_node_id=nodeId;
VisualizeGraphPlotly(E, N, pTitle='Azure Digital Twins graph' )

Note: I ignored the ‘_startTime’ parameter.
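The effect of the ‘_endTime’ filter can be sketched in plain Python as well. Again, the rows and the helper name ‘edges_at’ are hypothetical; only the column names come from the query.

```python
from datetime import datetime

# Hypothetical rows; only the column names come from the query.
events = [
    {"RelationshipId": "r1", "Action": "Create", "Source": "a", "Target": "b",
     "TimeStamp": datetime(2023, 1, 1, 10, 0)},
    {"RelationshipId": "r2", "Action": "Create", "Source": "a", "Target": "c",
     "TimeStamp": datetime(2023, 1, 1, 10, 5)},
    {"RelationshipId": "r2", "Action": "Delete", "Source": "a", "Target": "c",
     "TimeStamp": datetime(2023, 1, 1, 12, 0)},
]


def edges_at(rows, end_time):
    """Edges as they looked just before end_time."""
    # Only events before end_time are visible, for the deleted-id list too.
    visible = [r for r in rows if r["TimeStamp"] < end_time]
    gone = {r["RelationshipId"] for r in visible if r["Action"] == "Delete"}
    return [(r["Source"], r["Target"]) for r in visible
            if r["RelationshipId"] not in gone]


before = edges_at(events, datetime(2023, 1, 1, 11, 0))
after = edges_at(events, datetime(2023, 1, 1, 13, 0))
print(before)
print(after)
```

The time filter is applied in both places on purpose. If it were applied only to the main table and not to the list of deleted relationships, a deletion that happens after the selected end time would already hide that relationship in the historical view.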

If I set the end time to a moment before that single relationship deletion, I get the original full graph:

Keep in mind that altering the filter has a considerable time impact: you need to wait until the function has executed (the circular arrow in the upper right corner spins during recalculation).

Conclusion

We have seen how Azure Data Explorer can represent the Azure Digital Twins graph containing all twins and their relationships.

We can even go back in time to see how the graph evolved, and how it matured over time!

Adding extra data to a relationship, shown in the related popup dialog, is also a nice addition to the value of this visualization.

This is a nice usage of the already available Azure Digital Twins historical events export.

Unfortunately, building up the graph is very slow (it takes almost half a minute to refresh the graph in my situation).

It would be nice to see if the node identifiers could be turned into the original, unique, twin names; the ‘random’ node identifiers do not add value.

As mentioned above, this KQL graph semantics feature is in public preview. You can give feedback to the product team here.

Update: A more elaborate example, produced by the ADX Product team, especially Henning Rauch, can be found here.

This post is the fourteenth part of this Azure Data Explorer blog series:

This post is part six of a series of posts about Azure Digital Twins:
