top of page

Why is Machbase an ultra-fast time series DBMS optimized for IoT devices better than InfluxDB?

Inputting DHT11 temperature and humidity sensor and querying data in Machbase


Overview

Machbase is an ultra-high-speed time-series DBMS optimized for IoT. It provides data merging and monitoring functions for operational optimization and predictive analysis in industrial sites. It allows high-speed input and analysis even on small computers like Raspberry Pi. However, it is a concept that can be difficult for beginner developers who only have basic knowledge of using databases and are unfamiliar with the term time-series database itself. I, the writer of this article, also have almost no knowledge of databases, SQL, queries, etc. in the database field. To help these beginners easily understand and easily access and use Machbase products, I would like to write a simple “Machbase User Guide” using sensors that are easily accessible around us.


I have been using both Machbase and InfluxDB, a world-renowned time series DBMS, for some time. Based on my experience with both products, Machbase has no performance or usability shortcomings compared to InfluxDB, and in fact is far superior to InfluxDB in terms of speed. In particular, it showed a speed that outperformed InfluxDB in input and query on Raspberry Pi, which has limited hardware performance, unlike most computers. I can therefore guarantee that users reading this article will not regret choosing Machbase.


Before we get into the main topic, let’s briefly summarise what we’re going to do in the following guide. First, let’s install the base Machbase. Rather than using static, pre-created data, I think it’s better to use data that is generated in real time and closer to the actual experience. That’s why I’m going to use the IoT temperature and humidity sensor called DHT11. By connecting the DHT11 sensor to the Raspberry Pi, you can get temperature and humidity data on an hourly basis. We’ll feed this data into the Machbase DB and use an external tool to graph it so that we can understand a data processing flow at a high level. As it’s organized in detail from installation to use, it’s easy to proceed.


Machbase

Installing Machbase-Neo

To use the Machbase DB, you need to install Macbase-neo. Machbase-neo can be used on a wide range of servers, from small computers like Raspberry Pi to high-end servers, and can use MachSQL with a familiar syntax. It also allows you to write and query data in various ways via HTTP, MQTT, and gRPC, making it easy to use Macbase.

Please refer to https://neo.machbase.com/ for the installation method. Another advantage of Machbase is that it is quite easy to install. Just download the compressed file, unpack it, and run the server.




Unzip it to your desired directory.



After unzipping, there will be a machbase-neo executable in the subdirectory. Let’s start the server with serve.



When the server starts, a data directory named Machbase will be created in the same directory. This is the directory where data and settings entered through the server will be stored. If you reinstall for reasons such as updating to a higher version, you can preserve the data by moving the Machbase directory to the new Machbase directory. When you start the server, the server screen will be displayed, and you can check the port here, so use it appropriately.


You can enter SQL shell mode with the following command. Here, you can use various functions such as creating tables, deleting them, printing them, and querying them.



What is the DHT11 sensor?

I don’t think there are many people who don’t know about DHT11 sensors, but let’s talk about them.



The DHT11 sensor is an 8-bit microcontroller that can obtain temperature and humidity data in real time as serial data. It is widely used in small IoT devices such as Arduino and Raspberry Pi and is easy to obtain. It can measure temperatures ranging from 0 to 50 degrees Celsius and humidity ranging from 20 to 90%. It consists of three pins. Each is VCC, DAT, and GND. VCC receives power between 3V and 5V, DAT transmits data to the main body, and GND serves as a ground.


Connecting to Raspberry Pi 4



Let’s connect the DHT11 sensor to the Raspberry Pi board. Using the pin map above, you can connect each pin of the DHT11 according to its role. VCC is connected to 5V power (pin 2), DAT is connected to GPIO 2 (pin 4), and GND is connected to Ground (pin 6). GND is connected to Ground (pin 6). The connections are pretty straightforward. There’s nothing to worry about, but remember that it’s connected to GPIO 2. This will be used in the code to get the data. Once connected, you should see a red LED on the sensor as shown below.



Get DHT11 data and convert it to text

To understand how to use DHT11, let’s get some data and output it in CSV format. CSV refers to a data format in which data is divided by ‘,’. It is a format that is easy to change according to the DB used, so it is often used to record raw data. The code is written in Python. In fact, it is difficult to use Python as the main language on Raspberry Pi in a limited environment in terms of performance. However, I chose Python because it provides a variety of APIs, including an Adafruit module for handling dht11, and is well-documented.


Adafruit is not pre-installed on the Raspberry Pi. Let’s install it.




The basic usage is as follows.


Import the Adafruit_dht module.



Initialize the DHT11 device and get the temperature and humidity. Since we connected GPIO2, the pin number is 2.




Get temperature and humidity data every 3 seconds and output it to the console in CSV format along with the time of the measurement. The time is usually in the Unix time format. Unix time (= Epoach, Posix time) represents the number of seconds accumulated from 1 January 1970 00:00:00 UTC to the present. It can also be expressed in nanoseconds, and we usually enter nanoseconds in the DB. However, in this code, we only output seconds.


datetime.datetime.now(datetime.timezone.utc).timestamp() outputs the unix time based on UTC.


The Adafruit_DHT.read_retry(self. sensor, self. pin) method gets the humidity and temperature data at the time of the call.



I saved it as dht_csv_print.py and ran it. The console window output is as follows.



The format above is CSV format. Of course, it looks complicated and has poor visibility, but it can store a large amount of data in a small capacity and is easy to convert.


Enter real-time temperature and humidity data into Machbase.

Now that you know how to use DHT11, let’s upload data directly to the db. We will get data in the same 3-second cycle and write it in Python. Machbase provides various input methods such as mqtt and rest, but rest API is the simplest and easiest to use in Python, so we will use it. In Python, if you import the requests module, you can use the HTTP protocol with a simple method.


The code can be used as a module, so I created a separate Machbase class. The class consists of the to_jsonmessage() method, which changes the received data into an appropriate JSON format, and the write_data_machbase() method, which uploads the data to the DB. The content is easy to understand, so there will be no difficulty in understanding it. The thing to know is that in write_data_machbase(), the response value comes as a post-result of requests, and the input is successful only when the response value is 200.


The data will be received in list form. We could have received the data in a csv file like before, but since we will receive the data from the same code file and input it immediately, we received it in a list form for easy parsing. We looped every 3 seconds to get the data and input it into the Machbase DB.

There was some trial and error in this part. At first, I thought that there might be a bottleneck in the process of entering data into the DB, and I might not be able to get data accurately every 3 seconds, so I tried to apply “multi-threading” or “multi-processing” by specifying a separate process for getting data. However, when I used it myself, Python did not produce efficient multi-performance and instead created a bigger bottleneck than before, so I couldn’t get data for much longer than 3 seconds. Instead, after giving up on “multi,” I was able to get data accurately every 3 seconds.


I took the approach of getting data for each loop, entering it into the machbase, measuring the time during that time, and then sleeping the process by the amount of time subtracted from 3 seconds. This way, the recording was done exactly every 3 seconds. The configuration code is as follows.

==================================================================

import datetime, time

import Adafruit_DHT

import json

import requests

class machbase():

def init(self, url :str, table :str):

self._url = url

self._tablename = table

self._header = {'Content-Type': 'application/json; charset=utf-8'}

def to_jsonmessage(self, dataset):

table_dict = dict()

table_dict["table"] = self._tablename

data_dict = dict()

data_dict["columns"] = ["sensorname", "time", "value"]

row_list = list()

row_list.append(["temperature", dataset[2], dataset[0]])

row_list.append(["humidity", dataset[2], dataset[1]])

data_dict["rows"] = row_list

table_dict["data"] = data_dict

return table_dict

def write_data_machbase(self, record):

table_dict = self.to_jsonmessage(record)

table_json = json.dumps(table_dict)

response = requests.post(url = self._url, data = table_json, headers = self._header)

if response.status_code == 200:

print("Http post success!")

return 0

else:

print("Http post failed!")

return -1

if name == "__main__":

machbase_table = "TEMP_HUM_EXAMPLE"

pinnum = 2

machbase_client = machbase(url = machbase_url, table = machbase_table)

while True:

start = float(time.time())

utctime = int(datetime.datetime.now(datetime.timezone.utc).timestamp())*1000000000

hum, temp = Adafruit_DHT.read_retry(Adafruit_DHT.DHT11 ,pinnum)

if hum is not None and temp is not None:

datalist = list()

datalist.append(float(temp))

datalist.append(float(hum))

datalist.append(utctime)

machbase_client.write_data_machbase(datalist)

stop = float(time.time())

elapsed = stop - start

sleep_t = 3 - elapsed

if sleep_t >= 0:

time.sleep(sleep_t)

==================================================================


The JSON format used in the HTTP protocol in Machbase is as follows.

Name represents the name of the tag key, time represents a timestamp, and value represents the actual value of the tag.


I ran the Python program and it started taking input.

After about 30 minutes, I checked the data by selecting * from TEMP_HUM_EXAMPLE in the MachSQL shell screen.


==================================================================

gon@raspberrypi:~/Desktop/exer ./machbase shell sql "select * from TEMP_HUM_EXAMPLE"

║38 temperature 2023-01-26 04:35:09.000000 26.000000

║39 humidity 2023-01-26 04:35:06.000000 19.000000

║40 temperature 2023-01-26 04:35:06.000000 26.000000

║41 temperature 2023-01-26 04:35:03.000000 26.000000

║42 humidity 2023-01-26 04:35:03.000000 19.000000

║43 temperature 2023-01-26 04:35:00.000000 26.000000

║44 humidity 2023-01-26 04:35:00.000000 19.000000

║45 humidity 2023-01-26 04:34:57.000000 19.000000

║46 temperature 2023-01-26 04:34:57.000000 26.000000

==================================================================


The data was entered successfully. I left the program on for about a week and entered the data.


Queries that can be used easily

It is not the end after entering data. If you only enter data and output all data, there is no reason to use DB. You need to organize and filter the table according to the direction the user wants and query it so that only the desired values ​​are output. I will proceed with two simple queries to show you how to query in Machbase.


1. Find the maximum values ​​of temperature and humidityMachSQL provides a syntax that is almost identical to basic SQL.Let’s query using the max function to find the maximum value and group by.



2. Calculate the average temperature and humidity values ​​every hour.Machbase can truncate tables by desired time units using the date_trunc() function.


It takes three arguments. The first argument is the unit of time expression. Various units can be entered, such as ‘h’ (hour), ‘m’ (minute), etc. The second argument is TIME, and the third argument means the interval of the selected time unit. For example, if you want to truncate the table by one hour, enter date_trunc(‘h’, TIME, 1).


==================================================================

select name, avg(value), date_trunc('hour', TIME, 1) as _TIME from temp_hum_example

group by _TIME,

name order by _TIME;

==================================================================



Even if you have a lot of data, it’s very fast to output results. If you know SQL syntax, you can easily make queries and output tables in the way you want. Try out different queries. The syntax is easy, and it’s very similar to what you’re used to, so it should be easy to get into.


Conclusion

To run and use the Machbase server, you need to install Machbase-neo. Machbase-neo is convenient because you can use it right away by unzipping the compressed file. After running the server, I received temperature and humidity data from the DHT11 sensor and entered it into the DB. Although there is no separate Machbase API for Python, you can easily use the requests module to perform HTTP posts, so you can easily enter data if you create a JSON format that matches Machbase HTTP. However, it was disappointing that there was no client library. You can use MachSQL, which is similar to a general-purpose SQL language, for queries. The query process was also excellent because you could execute the desired query statement on the console by opening the SQL shell mode in the console window. It showed a fast speed of almost 1 second to output the query result even for large amounts of data.


It has been about a month since I started studying and using Machbase. I was almost a complete outsider to DBs, including Machbase. However, from a beginner’s perspective, Machbase was an easy-to-access DB. In addition, I conducted all the processes on an IoT device called Raspberry Pi. Machbase showed excellent input and query performance even with such limited hardware. Thanks to this, I was able to input and test data of various sizes without any burden. Most of these tasks could be performed on the console window. Most of the tasks were connected via ssh, and there was no need to connect a monitor when using Machbase. Users can freely access and use Machbase remotely with just one inexpensive Raspberry Pi. I consider this a big advantage. I hope that users reading this article will also experience Machbase, which is easy to access.



7 views
bottom of page