Using CrateDB on Azure IoT Edge

During the SPS IPC Drives of 2018, I was introduced to the team of CrateDB.

They offer this blazing fast database:

CrateDB is a distributed SQL database built on a NoSQL foundation. It is familiar to use, simple to scale, and versatile for handling any type of structured or unstructured data with real-time query performance.

It’s always nice being able to choose from several services like databases. So I checked out how to develop a simple application and Azure IoT Edge module against Crate if running in a container.

In this blog, we see how we can use the CrateDB in Azure IoT Edge.

Crate is available in the Azure marketplace, to be run in the cloud:

Crate offers the database as a container also. So it should be possible to run the container on Azure IoT Edge.

For this, I created an Azure IoT Edge device registration and deployed it to an Advantech ARK 1233 running Ubuntu Linux. Create a new module in the ‘Set Module’ wizard. Just fill in the module name and the image URI (tag):

Note: Only Linux containers for Crate are supported in Docker Hub. See the documentation for other supported operating systems.

In the Container Create Options, expose the ports of the crate service and provide a volume bind so the crate database files can be placed on the host filesystem:

{
    "HostConfig": {
        "ExposedPorts": {
            "4200/tcp": {},
            "5432/tcp": {}
        },
        "PortBindings": {
            "4200/tcp": [
                {
                    "HostPort": "4200"
                }
            ],
            "5432/tcp": [
                {
                    "HostPort": "5432"
                }
            ]
        },
        "Binds": [
            "/var/crate/data:/data"
        ]
    }
}

This gives access to the volume ‘/var/crate/data’ and this also gives access to port 4200 (the admin portal) and port 5432 needed to interact with the database.

As always, we need to create and share the volume where the actual database is stored:

sudo mkdir /var/crate
sudo mkdir /var/crate/data
sudo chmod 775 /var/crate
sudo chmod 775 /var/crate/data

Once the module is deployed. It should start and run.

In my particular situation, the container failed to run!

Notice the exit code seen in the Azure portal:

I checked the logging of the ‘cratedb’ module, it was complaining about virtual memory:

Luckily, the logging gives two hints to solve this.

I prefer the change to the ‘/etc/sysctl.conf’ file because it survives a reboot. Just add this line at the bottom of that file:

vm.max_map_count = 262144

Once the fix was effectuated (restarting the edge device), access is given to the management portal of Crate:

To the left, several menu items guide you to forms for interacting with the database:

  • Console, to execute Crate SQL commands
  • Tables, to see the health and size of tables and other information
  • Privileges, to see created users
  • etc.

On the main screen, notice the Free tier information. By default, this Crate version is limited to three nodes and no expiration date. See it as a trial version for a minimal cluster. The featureset is the same as that of the Enterprise edition. There is no support available, you have to rely on the Crate community forum.

This is great if you want to learn about Crate and see what it can do for you.

Crate also provides a Community Edition. It does not include the Enterprise features but can be run on as many nodes as you wish.

Now, let’s create a table and store some data.

Creating tables

Crate comes with a tutorial where Twitter tweets are imported and queried. I encourage you to check this out first.

In this blog, we check out another tutorial and use it to our advantage.

We are going to import live ISS locations.

First, we create this table using the Console form in the portal:

CREATE TABLE iss (
  timestamp TIMESTAMP GENERATED ALWAYS AS CURRENT_TIMESTAMP,
  position GEO_POINT)

This looks like:

In the Console form, we can see the ISS table is empty at this point:

We also create this user in Crate (remember name and password):


CREATE USER edge WITH (password = 'iot')

GRANT ALL PRIVILEGES to edge

Let’s program against this database table.

Python app

This Python app is able to read ISS locations and write them to the database:

from crate import client
import requests
import time

def position():
    response = requests.get("http://api.open-notify.org/iss-now.json")
    position = response.json()["iss_position"]
    return f'POINT ({position["longitude"]} {position["latitude"]})'

print ("Press CTRL-C to exit...")

connection = client.connect("ark1123:4200",username="edge")

while True:
  cursor = connection.cursor()
  p = position()
  cursor.execute("INSERT INTO location (position) VALUES (?)", [p])
  print (p)
  time.sleep(1)

Note: This code is an extended version of this example

Once the application is running, we can see the incoming location; the number of rows increases every few seconds:

Great. We are able to insert rows into the table.

I noticed there are more clients supported. One of them is C#:

For programming with C#, a simple tutorial is available. Let’s check it out more closely.

C# app

Start an empty .Net Core C# app in Visual Studio.

To access the database, we need a NuGet package and program our code using it:

Here is a copy of the functionality seen in the Python app, now programmed in C#:

using Newtonsoft.Json;
using Npgsql;
using Npgsql.CrateDb;
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

namespace CrateDBClient
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            while (success)
            {
                var location = GetLocation().Result;

                var success = SaveLocation(location).Result;

                Console.WriteLine($"Saved={success}");

                Thread.Sleep(1000);
            }

            Console.ReadKey();
        }

        public static async Task<bool> SaveLocation(Location location)
        {
            NpgsqlDatabaseInfo.RegisterFactory(new CrateDbDatabaseInfoFactory());

            await using var conn = new NpgsqlConnection("Host=ark1123;Username=edge;SSL Mode=Prefer;");
            await conn.OpenAsync();

            // Insert some data
            using var cmd = new NpgsqlCommand();

            cmd.Connection = conn;
            cmd.CommandText = "INSERT INTO doc.location (position) VALUES ([@longitude , @latitude])";
            cmd.Parameters.Clear();
            cmd.Parameters.AddWithValue("longitude", NpgsqlTypes.NpgsqlDbType.Double, (Double)location.iss_position.longitude);
            cmd.Parameters.AddWithValue("latitude", NpgsqlTypes.NpgsqlDbType.Double, (Double)location.iss_position.latitude);

            try
            {
                return cmd.ExecuteNonQueryAsync().Result == 1;
            }
            catch (Exception dbEx)
            {
                Console.WriteLine("Exception writing message to Crate DB. " + dbEx.Message);

                return false;
            }
        }

        private static async Task<Location> GetLocation()
        {
            var url = $"http://api.open-notify.org/iss-now.json";

            using var client = new HttpClient();
            var req = new HttpRequestMessage(HttpMethod.Get, url);

            using var res = await client.SendAsync(req);
            var jsonString = await res.Content.ReadAsStringAsync();

            var location = JsonConvert.DeserializeObject<Location>(jsonString);

            return location;
        }
    }

    public class Location
    {
        public long timestamp { get; set; }

        public string message { get; set; }

        public Position iss_position { get; set; }

        public DateTime datetime
        {
            get
            {
                DateTime unixStart = new DateTime(1970, 1, 1, 0, 0, 0, 0, System.DateTimeKind.Utc);
                long unixTimeStampInTicks = (long)(timestamp * TimeSpan.TicksPerSecond);
                return new DateTime(unixStart.Ticks + unixTimeStampInTicks, System.DateTimeKind.Utc);
            }
        }
    }

    public class Position
    {
        public decimal longitude { get; set; }
        public decimal latitude { get; set; }
    }
}

Note: this code is written for example purposes. When used in production, validate if opening and closing the database connection multiple times is a smart thing to do.

The code is pretty straight-forward. It takes the ISS location, opens a database connection, and inserts the location into the table.

I only had two challenges writing this code:

  1. I had to convert the datetime I got from the location into something that the database would accept
  2. I forgot to register this line with “new CrateDbDatabaseInfoFactory()” which resulted in some ugly exceptions. Just add that single line once in your code and the connection opens without any problems

So I executed this console app and again, rows are written to Crate:

Again, we see the arrival of the locations in the table:

We can check the individual rows using a query:

The rows are shown at the bottom of the screen once the query is executed.

Each row comes with a link. If you open it, the ISS location is shown in OpenStreetMaps:

This is a nice feature of the Crate portal.

Credential check

Maybe you noticed already in both programming samples and the portal access: we are never asked to provide any credentials.

By default, Crate does not enforce the usage of credentials.

In an enterprise environment, using resources without credentials is normally not allowed. So let’s learn how to enforce the credential check.

For this, we need to add a file named ‘crate.yml’ to the shared folder ‘/var/crate’:

## Standard Docker crate.yml
# Crate Configuration
network.host: _local_,_site_
# Paths
path:
  logs: /data/log
  data: /data/data
blobs:
  path: /data/blobs

## --------------------
## Added Authentication
auth:
  host_based:
    enabled: true
    config:
      0:
        user: crate
        address: _local_
        method: trust
      99:
        method: password

These settings enforce the usage of a password when the client is connecting from another location than the machine where the Crate database is running on. In the case of a Docker container, other containers on the same machine need to provide a name/password too (due to the sandboxed nature).

The same security enforcement goes for the portal. We need to login to the portal too, once these settings are read.

Letting Crate read these sessings, change the Container Create Options by adding an extra line:

{
    "HostConfig": {
        "ExposedPorts": {
            "4200/tcp": {},
            "5432/tcp": {}
        },
        "PortBindings": {
            "4200/tcp": [
                {
                    "HostPort": "4200"
                }
            ],
            "5432/tcp": [
                {
                    "HostPort": "5432"
                }
            ]
        },
        "Binds": [
            "/var/crate/data:/data",
            "/var/crate/crate.yml:/crate/config/crate.yml"
        ]
    }
}

Note: the name of the file is predetermined, as seen in the binding in the container create options.

Restart the module so it picks up that extra binding.

Refresh the Crate portal and now you will be asked to provide credentials:

The two code examples above have to be updated too.

Alter the database call in the Python example:

...
connection = client.connect("ark1123:4200",username="edge",password="iot") # password added
...

Alter the C# example too:

...
await using var conn = new NpgsqlConnection("Host=ark1123;Username=edge;Password=iot;SSL Mode=Prefer;");
...

For example, executing the original C# program will fail with a database related exception:

I ported the C# console app to a custom IoT Edge module and it works like expected. And yes, I have to provide credentials in the desired properties:

Conclusion

You have seen how simple it is to start using Crate in Azure IoT Edge. And we have enforced the security using credentials.

The speed and footprint of the Crate container are ideal for usage in Azure IoT Edge. You are now able to program to another database for local storage of telemetry coming from sensors or local logic.

My friend Jurgen Mayrbaeurl has an OPC-UA demonstration posted on GitHub which makes use of CrateDB. Check out his project too.