Write once, read many
Once log data is entered into the database, it is seldom changed or deleted. To preserve its integrity, Machbase is designed so that log data cannot be updated once it has been entered. Hence, users need not worry about log data being changed or deleted by malicious third parties.
The biggest performance issue for DBMS solutions that process log data has been whether change operations (inserts and deletes) can run independently of read operations without conflict.
To resolve this issue, Machbase is designed so that select operations acquire no locks. Furthermore, change operations such as inserts and deletes never conflict with select operations. With this structure, Machbase can compute statistics over millions of records in a select operation at very high speed, even while hundreds of thousands of records are inserted per second and some of them are deleted in real time.
Ultra high-speed data storage
Machbase stores data dozens of times faster than conventional database management systems. Even in the unfavorable case where a table has several indices, Machbase can still store data at between 200,000 and 2,000,000 records per second.
This is possible because Machbase is designed to store data in "append only" mode.
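The append-only idea can be illustrated with a minimal Python sketch. This is not Machbase's storage engine; the class and method names are hypothetical and chosen only for illustration. It shows why readers of an append-only store need no locks for existing data: records are never rewritten, so a reader that remembers the length of the log at the start of its query sees a stable view while new records keep arriving.

```python
from types import MappingProxyType


class AppendOnlyLog:
    """Toy append-only store: records are added at the tail and never rewritten."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store a read-only copy of the record and return its position."""
        self._records.append(MappingProxyType(dict(record)))
        return len(self._records) - 1

    def snapshot(self):
        """A reader's view is just the current length; earlier records never change."""
        return len(self._records)

    def scan(self, upto):
        """Yield the records visible at a snapshot, newest first."""
        for position in range(upto - 1, -1, -1):
            yield self._records[position]
```

A reader calls snapshot() once and then scan() with that value; records appended afterwards are simply outside its view.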
Real-time index configuration
In a conventional database, the more indices a table has, the slower data entry becomes. Machbase improves on this structure so that indices can be built in real time even when hundreds of thousands of records are inserted per second. This is the key feature that separates Machbase from other solutions: data can be searched the moment it is generated. This core technology is critical in machine data analysis.
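What "searchable the moment it is generated" means can be sketched in a few lines of Python. The sketch below is a conceptual illustration with hypothetical names, not a description of how Machbase builds its indices; it simply updates a per-column lookup table as part of every insert, so a record can be found by value immediately after it is written. It does not attempt to reproduce the insert throughput described above.

```python
from collections import defaultdict


class IndexedLog:
    """Toy log whose per-column indices are updated as part of every insert."""

    def __init__(self, indexed_columns):
        self.rows = []
        # One value -> [row positions] map per indexed column.
        self.indexes = {col: defaultdict(list) for col in indexed_columns}

    def append(self, row):
        """Store the row and register it in every index before returning."""
        position = len(self.rows)
        self.rows.append(row)
        for col, index in self.indexes.items():
            if col in row:
                index[row[col]].append(position)
        return position

    def lookup(self, col, value):
        """Return the rows whose indexed column equals value, in insert order."""
        return [self.rows[p] for p in self.indexes[col].get(value, [])]
```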
High-performance data compression
One characteristic of machine data is that it is generated constantly. This inevitably means that the database's storage space will run out sooner or later, and that at some point the database will no longer be able to retain enough data for processing.
A conventional database structure is poorly suited to storing and analyzing machine data because, as data and indices grow, the available storage space shrinks ever faster. To cope with this problem without sacrificing performance, Machbase compresses the flood of incoming data in real time to between dozens and hundreds of times smaller than the original, using two types of compression technology (physical and logical).
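A two-stage scheme of this general kind can be sketched in Python. The sketch rests on an assumption that is not stated in the text: that "logical" compression resembles dictionary encoding of repeated column values and "physical" compression resembles byte-level compression of the result. The function names are hypothetical, and the ratio this toy achieves says nothing about Machbase's actual ratios.

```python
import json
import zlib


def dictionary_encode(values):
    """Logical step (assumed): replace repeated values with small integer codes."""
    dictionary = {}
    codes = [dictionary.setdefault(v, len(dictionary)) for v in values]
    return list(dictionary), codes  # dict keys keep first-seen order


def compress_column(values):
    """Physical step (assumed): byte-level compression of the encoded column."""
    dictionary, codes = dictionary_encode(values)
    return zlib.compress(json.dumps([dictionary, codes]).encode("utf-8"))


def decompress_column(blob):
    """Reverse both steps and return the original list of string values."""
    dictionary, codes = json.loads(zlib.decompress(blob).decode("utf-8"))
    return [dictionary[c] for c in codes]
```

Machine data tends to repeat a small set of values (log levels, host names, status codes), which is why both steps pay off on this kind of input.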
Text search
One of the most important practical needs of users who store and use time-series log data is to determine whether a "particular event" occurred at a "particular point in time."
Users can determine a "particular point in time" by handling the data as a time series. To determine whether a "particular event" has occurred, however, in most cases they need to find a specific word in a text field stored in a specific column. With a conventional database management system, searching for words in a field generally means using a B+Tree index, which supports only an exact match or, through the like clause, a match on the first several characters of a word. To find words that appear in the middle of the text, users cannot use the index and must resort either to like '%word%' or to an IR (Information Retrieval) function, if one is provided. Because it is virtually impossible to build an IR index in real time, it has been thought that the search tools of conventional DBMS solutions could not be applied to machine data. Machbase differs from conventional DBMS solutions in that it provides "search" as an SQL keyword in addition to "like," which makes it possible to search for words in real time. The following are practical examples of this function:
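Independently of Machbase's own SQL syntax, the difference between a substring scan and a word search can be sketched in Python. Everything below is a hypothetical illustration: the class name, the tokenization rule (lowercased alphanumeric runs), and the AND semantics for multiple words are assumptions, not Machbase behavior. The word index is updated on every insert, so a word is searchable as soon as its message is stored, whereas the substring scan has to read every message.

```python
import re
from collections import defaultdict


class KeywordIndex:
    """Toy inverted index over message text, updated on every insert."""

    def __init__(self):
        self.messages = []
        self.postings = defaultdict(set)  # word -> positions of messages containing it

    def add(self, message):
        """Store the message and index each of its words."""
        position = len(self.messages)
        self.messages.append(message)
        for token in re.findall(r"\w+", message.lower()):
            self.postings[token].add(position)
        return position

    def search(self, *words):
        """Return messages containing every given word, newest first."""
        if not words:
            return []
        hits = set.intersection(*(self.postings.get(w.lower(), set()) for w in words))
        return [self.messages[p] for p in sorted(hits, reverse=True)]


def like_scan(messages, fragment):
    """Substring match in the spirit of like '%fragment%': reads every message."""
    return [m for m in messages if fragment.lower() in m.lower()]
```

Note the trade-off: the index finds whole words only, while the scan also matches fragments inside words.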
Support for SQL syntax with time-series features
Machbase supports not only most of the SQL syntax that conventional DBMS solutions provide but also SQL syntax that reflects the characteristics of time-series data. For machine data and machine-generated log data, the latest data is much more valuable than older data, and recently generated data is accessed several times more often than older data. Machbase therefore offers its users the following additional benefits:
• It stores a timestamp with nanosecond resolution in the "_arrival_time" column at the moment each record is stored, which means that every record in Machbase can be searched by time or filtered with time conditions.
• When searching data or otherwise performing a select operation, it returns the most recent data first.
• It provides a DURATION keyword. In machine data analysis it is normal to specify a particular time span, and Machbase supports this practice at the SQL level. With this feature, users can easily analyze data without writing complicated time arithmetic in the where clause. The following are practical examples of this function:
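The three behaviors listed above can be mimicked in plain Python to make their semantics concrete. This is a sketch under stated assumptions, not Machbase SQL: the class, the duration_s parameter, and the injectable clock are hypothetical, and only the "_arrival_time" name comes from the text. Each row is stamped with a nanosecond arrival time on insert, results come back newest first, and an optional duration limits the result to the most recent time span.

```python
import time


class TimeSeriesLog:
    """Toy log that stamps rows on arrival and returns the newest rows first."""

    def __init__(self, clock=time.time_ns):
        self._clock = clock  # injectable so the behavior can be tested
        self._rows = []      # (arrival_ns, row) pairs in arrival order

    def append(self, row):
        """Stamp the row with the current time in nanoseconds and store it."""
        arrival = self._clock()
        self._rows.append((arrival, row))
        return arrival

    def select(self, duration_s=None):
        """Return rows newest first, optionally only the last duration_s seconds."""
        now = self._clock()
        start = None if duration_s is None else now - int(duration_s * 1_000_000_000)
        result = []
        for arrival, row in reversed(self._rows):
            # Assumes arrival times never decrease, so older rows can be skipped.
            if start is not None and arrival < start:
                break
            result.append({"_arrival_time": arrival, **row})
        return result
```

For example, select(duration_s=600) returns only the rows that arrived in the last ten minutes, with no time arithmetic needed at the call site.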
Support for selective deletion
In the case of machine data, records are seldom deleted once they have been entered. Embedded devices, however, have obvious limits on storage capacity, and that storage is often not carefully managed by users.
Reflecting this, Machbase provides an optional deletion function that lets users delete the records matching specified conditions. Embedded product developers can therefore easily manage storage by using "cron" or another periodic program to keep the data Machbase holds from growing beyond a certain size. The following are practical examples of this function:
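The kind of retention rule a periodic job might apply can be sketched in Python. The three conditions below (a fixed number of oldest rows, everything except the latest rows, everything before a cutoff time) are assumptions chosen for illustration, and the function names are hypothetical; they are not taken from Machbase's SQL. Rows are modeled as (arrival_ns, payload) pairs kept in arrival order.

```python
def delete_oldest(rows, count):
    """Remove up to count of the oldest rows; return how many were removed."""
    removed = min(max(count, 0), len(rows))
    del rows[:removed]
    return removed


def delete_except_latest(rows, keep):
    """Remove everything except the keep most recent rows."""
    return delete_oldest(rows, len(rows) - max(keep, 0))


def delete_before(rows, cutoff_ns):
    """Remove rows that arrived strictly before cutoff_ns."""
    removed = 0
    while removed < len(rows) and rows[removed][0] < cutoff_ns:
        removed += 1
    del rows[:removed]
    return removed
```

A job scheduled through cron could call one of these at a fixed interval so that storage on the device stays bounded.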