Introduction
In the last article, we compared InfluxDB 1.8.1 with Machbase. Version 1.8 is what the Linux package manager installs by default, so this time we wanted to see how the latest open-source version, 2.7, compares to Machbase 6.1.
Machbase
Machbase is a DBMS designed for fast ingestion, search, and statistical aggregation of time series sensor data.
It runs on single servers, on multi-server clusters, and on edge devices such as the Raspberry Pi. Its features and architecture are specialized for processing time series sensor data and machine log data.
InfluxDB
InfluxDB is an open-source time series DBMS developed by InfluxData. It is one of the most popular products for processing time series data. InfluxDB also supports clustering.
For more information about InfluxDB, see https://docs.influxdata.com/influxdb
Test
Test environment
For this test, we used the following environment.
CPU : AMD EPYC 7742 64-Core Processor (128 threads)
Memory : 256GB
Disk : Samsung NVMe 2TB (PCI Express 3.0 x4)
OS : CentOS Linux release 7.8.2003
Database : Machbase 6.1.8 Fog / InfluxDB v2.7.0
Test constraints
InfluxDB 2.x differs from 1.8 in several ways: the database concept has been removed, buckets and organizations have been added, SQL support has been reduced, and security has been tightened with token-based authentication. Together these changes make existing client applications incompatible. As a result, the existing tests written in SQL could not be reused, so we rewrote the queries in the Flux language and ran those instead.
For Machbase, we reused the existing test results.
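To make the client-side change concrete, here is a minimal sketch of what a 2.x query request looks like: the Flux text is posted to the `/api/v2/query` endpoint with an organization parameter and a token header. The host, organization name, and token value are placeholders invented for illustration, not values from the test setup, and the request is only constructed here, never sent.

```python
import urllib.parse
import urllib.request


def build_flux_request(base_url: str, org: str, token: str, flux: str) -> urllib.request.Request:
    """Build (but do not send) an InfluxDB 2.x Flux query request."""
    # 2.x scopes queries to an organization instead of a 1.8-style database.
    url = f"{base_url}/api/v2/query?" + urllib.parse.urlencode({"org": org})
    headers = {
        "Authorization": f"Token {token}",       # token-based authentication
        "Content-Type": "application/vnd.flux",  # body is raw Flux text
        "Accept": "application/csv",             # results come back as annotated CSV
    }
    return urllib.request.Request(url, data=flux.encode("utf-8"), headers=headers, method="POST")


# Placeholder values; none of these come from the original test setup.
request = build_flux_request(
    base_url="http://localhost:8086",
    org="example-org",
    token="example-token",
    flux='from(bucket: "sensor_data/autogen") |> range(start: -1h) |> count()',
)
print(request.get_method(), request.full_url)
```

A 1.8 client that sends SQL-like InfluxQL with a database name and a username/password would need all three parts (endpoint, credentials, and query text) changed to work this way.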
Test input performance
With the update of InfluxDB from 1.8 to 2.7, resource usage has increased significantly.
As before, we ran the tests in the following environment.
Ingestion performance test results
Despite the increased resource headroom on our test machine, InfluxDB’s input performance is slightly worse than before (160,000 eps). In other words, the version upgrade has not improved InfluxDB's input performance.
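For readers who want to reproduce an events-per-second (eps) figure, the sketch below shows the two pieces involved: encoding a reading in InfluxDB line protocol and dividing the record count by the elapsed time. The schema (measurement `tag_data`, tag `name`, field `value`) is inferred from the queries in this article, and the record count and timing in the usage lines are made-up numbers, not measurements from this test.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))  # the +09:00 offset used in the queries


def to_line_protocol(name: str, value: float, ts: datetime) -> str:
    """Encode one sensor reading as an InfluxDB line protocol record."""
    # Commas, equals signs, and spaces in tag values must be backslash-escaped.
    escaped = name.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")
    # Whole-second precision only; sub-second parts are truncated in this sketch.
    nanos = int(ts.timestamp()) * 1_000_000_000
    return f"tag_data,name={escaped} value={value} {nanos}"


def events_per_second(record_count: int, elapsed_seconds: float) -> float:
    """Ingestion throughput: records written divided by wall-clock seconds."""
    return record_count / elapsed_seconds


record = to_line_protocol("EQ0^TAG1", 1.5, datetime(2018, 1, 1, tzinfo=KST))
print(record)

# Hypothetical run: 1,000,000 records written in 8 seconds.
print(events_per_second(1_000_000, 8.0))
```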
Performance testing of queries
InfluxDB is reducing support for SQL queries and increasing support for Flux, its own query language. As a result, the existing query test tool can no longer be used, so we translated each SQL query into a Flux query for testing. The queries are shown below.
Machbase(SQL) / Q1
select count(*) from tag;
InfluxDB(Flux) / Q1
from(bucket:"sensor_data/autogen")
|> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
|> filter(fn: (r) => r._measurement == "tag_data" )
|> group()
|> count() |> yield(name: "count")
Machbase(SQL) / Q2
select count(*) from (select * from tag where name = 'EQ0^TAG1' and
time between to_date( '2018-01-01 00:00:00') and to_date( '2018-01-02 00:00:00'));
InfluxDB(Flux) / Q2
from(bucket:"sensor_data/autogen")
|> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
|> filter(fn: (r) => r._measurement == "tag_data" and r.name == "EQ0^TAG1")
|> group()
|> count() |> yield(name: "count")
Machbase(SQL) / Q3
select count(*) from (select * from tag where name in ('EQ0^TAG1', 'EQ0^TAG2', 'EQ0^TAG3',
'EQ0^TAG4', 'EQ0^TAG5', 'EQ0^TAG6', 'EQ0^TAG7', 'EQ0^TAG8', 'EQ0^TAG9', 'EQ0^TAG10') and
time between to_date( '2018-01-01 00:00:00') and to_date( '2018-01-02 00:00:00'));
InfluxDB(Flux) / Q3
from(bucket:"sensor_data/autogen")
|> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
|> filter(fn: (r) => r._measurement == "tag_data" and (r.name == "EQ0^TAG1" or r.name == "EQ0^TAG2" or r.name == "EQ0^TAG3" or r.name == "EQ0^TAG4" or r.name == "EQ0^TAG5" or r.name == "EQ0^TAG6" or r.name == "EQ0^TAG7" or r.name == "EQ0^TAG8" or r.name == "EQ0^TAG9" or r.name == "EQ0^TAG10"))
|> group()
|> count() |> yield(name: "count")
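Writing the ten-way `or` chain by hand is error-prone, so a test harness might generate it. The following is a minimal sketch of such a generator, not the tool used for this test; the function name is hypothetical, and the time bounds in the usage lines are taken from the SQL version of Q3. Because `and` binds more tightly than `or` in Flux, the name comparisons are wrapped in parentheses so the measurement check applies to every tag.

```python
def build_count_query(bucket: str, measurement: str, tag_names: list[str],
                      start: str, stop: str) -> str:
    """Build a Flux query that counts records for a set of tag names."""
    # Assumes tag names contain no double quotes or backslashes (no escaping is done).
    name_predicate = " or ".join(f'r.name == "{name}"' for name in tag_names)
    return "\n".join([
        f'from(bucket: "{bucket}")',
        f"  |> range(start: {start}, stop: {stop})",
        # Parentheses keep the measurement check applied to every name alternative.
        f'  |> filter(fn: (r) => r._measurement == "{measurement}" and ({name_predicate}))',
        "  |> group()",
        '  |> count() |> yield(name: "count")',
    ])


names = [f"EQ0^TAG{i}" for i in range(1, 11)]
query = build_count_query(
    bucket="sensor_data/autogen",
    measurement="tag_data",
    tag_names=names,
    start="2018-01-01T00:00:00+09:00",
    stop="2018-01-02T00:00:00+09:00",
)
print(query)
```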
Machbase(SQL) / Q4
select name, stddev(value) from tag where name = 'EQ0^TAG1'
and time >= to_date('2018-01-01 00:00:00') and time < to_date('2018-01-02 00:00:00')
group by name;
InfluxDB(Flux) / Q4
from(bucket:"sensor_data/autogen")
|> range(start:2018-01-01T00:00:00+09:00, stop:2018-01-02T01:00:00+09:00)
|> filter(fn: (r) => r._measurement == "tag_data" and r.name == "EQ0^TAG1" and r._field == "value")
|> stddev()
|> group(columns: ["_measurement"], mode: "by")
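When cross-checking Q4 results between the two engines, keep in mind that "standard deviation" has two common estimators. Flux's `stddev()` defaults to the sample estimator (`mode: "sample"`); it is worth confirming which one the SQL side uses before comparing numbers. The sketch below illustrates the difference on a small made-up data set, not on data from this test.

```python
import statistics

# Made-up readings, chosen so the population result is a round number.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_sd = statistics.stdev(samples)       # divides by n - 1 (sample estimator)
population_sd = statistics.pstdev(samples)  # divides by n (population estimator)

print(f"sample: {sample_sd:.4f}  population: {population_sd:.4f}")
```

With large record counts the two values converge, so a mismatch would mainly show up on small result sets.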
Test results for query performance
The query execution time was measured with the machsql and influx command-line tools, respectively. The results are shown below.
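One simple way to collect comparable timings from two different command-line clients is to wrap each invocation in the same wall-clock timer. The sketch below is an assumption about how such a harness could look, not the procedure used for this test; it measures the whole process, including client startup and connection time. The usage line times a trivial Python command as a stand-in, since the exact machsql and influx arguments depend on the local setup.

```python
import subprocess
import sys
import time


def time_command(argv: list[str], repeats: int = 3) -> list[float]:
    """Run a command several times and return each wall-clock duration in seconds."""
    timings = []
    for _ in range(repeats):
        started = time.perf_counter()
        # check=True raises if the client exits with an error, so failed
        # queries are not silently counted as fast ones.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - started)
    return timings


# Stand-in command; replace with the real client invocation for each database.
timings = time_command([sys.executable, "-c", "pass"])
print(f"best of {len(timings)}: {min(timings):.3f}s")
```

Reporting the best or median of several runs reduces the influence of cold caches on the first execution.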
Comparison of test results in aggregate
The gap on the first query, which returns the total number of records, is so large that it dwarfs the others, so the results of the other three queries are shown separately below.
Conclusion
As in the previous Machbase vs. InfluxDB performance comparison, we see that InfluxDB is very slow on input and Machbase is better on queries. We can also see that InfluxDB has reduced its support for SQL and increased its support for its own query language, Flux, making it less compatible with existing applications.
Machbase promises to maintain compatibility with existing programs by adhering to industry standards, including SQL, to improve support for various new APIs in Machbase neo, and to keep improving performance.
Thank you.
Machbase CRO, Grey Shim