Add local storage to Azure IoT Edge modules using Docker Bind

Azure IoT Edge uses the Moby container runtime, so IoT Edge modules (which are Docker containers) can work together and offer logic on the edge.

Docker containers are ‘sandboxed’. This means that the logic within the containers has limited access to the environment they ‘live’ in.

By default, containers have no sudo rights, no access to the host filesystem, and only limited network capabilities.

However, containers can be granted elevated rights. One of these is the right to access the host filesystem.

In this blog, we will see how to configure a container with access to the filesystem. To demonstrate this, a custom IoT Edge module is introduced, an IoT Edge filewatcher for CSV files:

Before we proceed with the module, we first take a deeper look at the filesystem access for IoT Edge modules.

The IoT Edge documentation shows the local file system being used in multiple ways. For example, it is used to store messages for offline support, SQL Server uses it to store database files, and the Blob Storage module needs filesystem access to store blobs locally.

Moby/Docker supports two ways to access the filesystem:

  • Binds
  • Volumes

This is what the Docker documentation writes about bind mounts:

Bind mounts have been around since the early days of Docker. Bind mounts have limited functionality compared to volumes. When you use a bind mount, a file or directory on the host machine is mounted into a container.

https://docs.docker.com/storage/bind-mounts/

Note: According to the Docker documentation, volumes are the preferred mechanism for persisting data generated by and used by Docker containers.

In the Azure IoT Edge documentation, binds are used more commonly. In this blog, we focus on binds.

Using Binds

Local storage access is controlled using Container Create Options like this:

{
  "HostConfig": {
    "Binds": [
      "[Host folder]:/app/exchange"
    ]
  }
}

The Bind has a left side and a right side (divided by the colon). The path on the left represents the path on the host; the path on the right represents the path within the container.

For now, I demonstrate file system access using a ‘generic’ module, the iot-edge-echo module.

Note: this Echo module is intended to echo incoming routed messages to the console log. There is no logic within the module that actually accesses the file system. For now, that is not needed; I could have used any other module.

We will make use of the capability to enter a running module (this Echo module) with a bash shell (inside the module!) to look around and test what we can do on the file system. This is demonstrated on a Linux host, which gives good insight into the user and group rights.

We add an echo module to the IoT Edge runtime:

1 Container Create options

We add these container create options. A ‘/var/echo’ folder on the host is mapped to ‘/app/echo’ within the module.
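Written out, these create options look like this:

{
  "HostConfig": {
    "Binds": [
      "/var/echo:/app/echo"
    ]
  }
}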

Warning: Only alter the host path. The right side is normally fixed so the internal logic knows how to handle its path.

2 Host system path

Once the Echo module is deployed successfully, we will see that the ‘echo’ subfolder is automatically created on the host system (if it does not exist already):
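On the host, the ownership of that folder can be checked with a command like this (assuming the ‘/var/echo’ path from the create options above):

ls -ld /var/echo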

Notice that the folder is owned by root.

3 Path within the container

We enter the running module with this command:

sudo docker exec -i -t echo /bin/bash

This brings us inside the container! See how the prompt changes:

We see that the working folder is /app. This folder contains the actual (.NET Core) application. There is also an ‘echo’ folder inside the container, which conforms to the expected local path.

4 Trying to create a file

In the /app folder, we are not allowed to create a file:

If we enter the Echo subfolder, we are also not allowed to create a file:
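For example, attempts like these fail inside the container (the file name ‘test.txt’ is just an illustration):

touch /app/test.txt               # fails: Permission denied
sudo touch /app/echo/test.txt     # fails: sudo is not available inside the container
touch /app/echo/test.txt          # fails: Permission denied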

As you can see:

  1. sudo is not available inside the container
  2. Creating a file results in ‘Permission denied’

We are allowed to READ file content. Here, a file was created on the host in the ‘echo’ folder and read within the container:
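Something like this (the file name ‘hello.txt’ is again just an illustration):

echo "hello from the host" | sudo tee /var/echo/hello.txt    # on the host

cat /app/echo/hello.txt                                      # inside the container; reading works without elevated rights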

5 Variations on binding path

Let’s check whether we really need that subfolder on the right side. Here, I remove the subfolder from the container path:

If we try to deploy the echo container using these options, the module crashes:

So, we need to supply that subfolder!

We can also choose a different subfolder name inside the container:
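For example, keeping ‘/var/echo’ on the host but using ‘/app/lima’ inside the container:

{
  "HostConfig": {
    "Binds": [
      "/var/echo:/app/lima"
    ]
  }
}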

Once deployed, inside the container, we now have a ‘lima’ folder which redirects to the ‘echo’ folder on the host:

6A Elevated rights for write access

So, if you only want to read information, you do not need elevated file system rights. If you want to ‘persist’ information, you need to elevate rights.

The simplest way is to use chmod with a permissive mode:

sudo chmod 777 echo

Here, we give the owning user (root), the owning group (root), and everybody else (including our module) the rights to read, write, and execute:

This is not the most elegant way: we open up the folder for everyone on the system.

There is some Azure IoT Edge documentation about elevated rights related to blob storage.

I tried it myself. In the end, I went for a different approach.

6B Elevated rights on par with the module

If we look at the Dockerfile of our Echo module, we see that we run the app as ‘moduleuser’.
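In the standard IoT Edge .NET module template, the relevant Dockerfile lines look roughly like this (the exact lines can differ per module):

RUN useradd -ms /bin/bash moduleuser
USER moduleuser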

This is on par with the rights within our container:

We have a group and user available within the module, both called moduleuser. The IDs of both user and group are equal to 1000.

If we check the ‘lima’ subfolder inside the container, it is owned by ‘root’:

If we check the host file system, we also see the same ‘root’ owner for the ‘echo’ folder:

Still, access is denied.

Why?

Apparently, the root within the module is not the same as the root on the host!

This is confirmed: the UID (user ID) and GID (group ID) do not match:
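This can be verified with commands like these:

id                                # inside the container: uid=1000(moduleuser) gid=1000(moduleuser)
id root                           # on the host: uid=0(root) gid=0(root)
stat -c '%u:%g %n' /var/echo      # on the host: shows the numeric owner of the bound folder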

On the host, both group and user id are 0.

So, we change the ownership of the ‘echo’ folder to the user ID and group ID used inside the container. This is executed on the host:
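A sketch of those host commands, assuming the module user has UID 1000 and GID 1000 and ‘/var/echo’ is the bound host folder:

sudo chown 1000:1000 /var/echo    # hand the folder to the UID/GID used inside the container
sudo chmod 770 /var/echo          # read/write/execute for owner and group only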

Note that the user ‘ark1123sv’ is the one with UID 1000 and GID 1000, not the host root!

As you can see, we only give read/write/exec rights to that user and group. Other users have fewer rights.

If we check the ‘lima’ subfolder in the container again, we are now allowed to write a file:

Example of usage: CSV file watcher

To demonstrate access to the host filesystem, I created this module which can import CSV files. Each row in the file is sent as a message to the cloud.

This container is available on Docker Hub:

svelde/iot-edge-filewatcher:1.0.0-amd64

In my example, I use the following Container Create Options:

{
  "HostConfig": {
    "Binds": [
      "/var/iot-edge-filewatcher/exchange:/app/exchange"
    ]
  }
}

The logic is simple:

  1. Every X seconds the module checks for new files with the right extension (default: txt)
  2. If found, that file is opened and each line is sent as a telemetry message. Meanwhile, no other files are processed
  3. If all goes well, the file is renamed to another extension so it’s not processed twice (default: old)
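Just to make this flow concrete, here is a rough shell sketch of that loop. The actual module is written in .NET Core and sends its messages through the IoT Edge SDK (and also handles the column headers), so the names and defaults here are only illustrative:

WATCH_DIR=/app/exchange
INTERVAL=5                               # 'every X seconds'

while true; do
  for file in "$WATCH_DIR"/*.txt; do
    [ -e "$file" ] || continue           # no matching files found
    while IFS= read -r line; do
      echo "send as telemetry: $line"    # the real module sends a message per line
    done < "$file"
    mv -- "$file" "${file%.txt}.old"     # rename so the file is not processed twice
  done
  sleep "$INTERVAL"
done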

I use the default desired twin properties.

The module checks for new files with the extension ‘txt’:

The files are expected to be created in the exchange folder. So we need to give that folder elevated rights:
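Following the same approach as before, a sketch of the host commands (again assuming UID/GID 1000 for the module user):

sudo chown 1000:1000 /var/iot-edge-filewatcher/exchange
sudo chmod 770 /var/iot-edge-filewatcher/exchange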

Note: I have elevated the rights of the file too, because the module has to rename that file.

If a .txt file is found, the module tries to read the column headers from the first row. For every other line, a message is created, and for each column header a property with the corresponding value is added to that message.

We test with this file content:

Note: I added some empty values to it. The number of separators (commas) always has to be correct (I do not compensate for missing separators).

Keep in mind, the file has to have enough rights to be opened/renamed. As you can see here, the module was not able to open the file (there is a simple check built in):

Once the rights are elevated, the file is processed:

Thus, a file like this is processed. It should result in four messages:

column01,column02,column03
aaa,bbb,ccc
111,222,333
ddd,eee,
444,,666

This results in these four messages (the newest message at the top):

5:53:16 PM, 01/07/2021:
{
  "body": {
    "column01": "444",
    "column02": "",
    "column03": "666",
    "fileName": "a.txt",
    "timestamp": "2021-01-07T16:53:15.4468751Z",
    "moduleId": "file",
    "deviceId": "ark1123"
  },
  "enqueuedTime": "2021-01-07T16:53:16.397Z",
  "properties": {
    "content-type": "iot-edge-filewatcher",
    "deviceid": "$twin.tags.deviceid"
  }
}
5:53:16 PM, 01/07/2021:
{
  "body": {
    "column01": "ddd",
    "column02": "eee",
    "column03": "",
    "fileName": "a.txt",
    "timestamp": "2021-01-07T16:53:15.4282776Z",
    "moduleId": "file",
    "deviceId": "ark1123"
  },
  "enqueuedTime": "2021-01-07T16:53:16.397Z",
  "properties": {
    "content-type": "iot-edge-filewatcher",
    "deviceid": "$twin.tags.deviceid"
  }
}
5:53:16 PM, 01/07/2021:
{
  "body": {
    "column01": "111",
    "column02": "222",
    "column03": "333",
    "fileName": "a.txt",
    "timestamp": "2021-01-07T16:53:15.4107237Z",
    "moduleId": "file",
    "deviceId": "ark1123"
  },
  "enqueuedTime": "2021-01-07T16:53:16.397Z",
  "properties": {
    "content-type": "iot-edge-filewatcher",
    "deviceid": "$twin.tags.deviceid"
  }
}
5:53:16 PM, 01/07/2021:
{
  "body": {
    "column01": "aaa",
    "column02": "bbb",
    "column03": "ccc",
    "fileName": "a.txt",
    "timestamp": "2021-01-07T16:53:15.2547108Z",
    "moduleId": "file",
    "deviceId": "ark1123"
  },
  "enqueuedTime": "2021-01-07T16:53:16.146Z",
  "properties": {
    "content-type": "iot-edge-filewatcher",
    "deviceid": "$twin.tags.deviceid"
  }
}
Receiving events...

Note: I added a timestamp and the names of the file, the edge device, and the module to the message.

The Echo module was restored to its original usage: it ingests the messages from the filewatcher and shows the routed messages on the edge:

The file watcher module log also shows four lines being processed. It even ignored an empty line at the end:

Contribution to open source

This module is available on GitHub as open-source under an MIT license.

The functionality is used to demonstrate container file system access.

We welcome your contributions in the form of issues accompanied by pull requests.

Conclusion

Once you understand how a bind is processed within a container, it’s easy to read or persist data on the local filesystem.

Keep the file and directory access rights in mind. Take some time to fiddle with them so they meet your (access) requirements.

With the demonstrated filewatcher container it is now easy to process CSV files. If you have similar use-cases, please share them.