
Time-Based Data Valuation and Efficient Management Strategies (Feat. Mount)

Updated: Oct 18


Background and Value of Time Series Data

Time series data is data recorded over time. It is generated by sources such as sensors, log files, financial transactions, stock prices, and weather observations. Its defining characteristic is that data points are recorded continuously, each together with the time at which it occurred.

The value of time series data comes from knowing when each data point occurred and how far apart the data points are, which makes it possible to analyze patterns and trends. This makes it crucial input for data-driven decision-making and for predictive models that forecast future events. For example, in manufacturing it can be used to monitor equipment status and predict maintenance needs, while in the finance industry it can be used to analyze market trends and develop investment strategies.



[Figure: Typical time series data, a server resource monitoring example]


The Temporal Value of Data

The temporal value of data refers to how important data is depending on when it was generated. Newly generated data is usually needed immediately and is called 'hot data'. Older data is needed less urgently but remains valuable for long-term analysis and as reference material, and is called 'cold data'.


Hot Data

Definition:

  • Frequently accessed and updated in real-time.

  • Mainly used for real-time sensor data, stock trading systems, and real-time analysis.

Characteristics:

  • Fast read/write speeds

  • Low latency required

  • High I/O performance demands

Cold Data

Definition:

  • Low access frequency, primarily used for reference purposes or archiving.

  • Includes historical data, past logs, backup data, etc.

Characteristics:

  • Lower performance requirements due to infrequent access

  • High data compression ratio

  • Use of cost-effective storage


The temporal value of data is a key element in data management strategy. Understanding the temporal value of data allows organizations to maximize efficiency in their approach to data storage and access. This enables them to reduce storage costs and quickly access and use the data they need.
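One simple way to act on the hot/cold distinction is an age-based rule. The following is a minimal Python sketch of that idea, not anything prescribed by Machbase: the 30-day window and the function names classify and split_by_tier are assumptions chosen purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical cutoff: records newer than this window are treated as hot.
HOT_WINDOW = timedelta(days=30)


def classify(recorded_at: datetime, now: datetime,
             hot_window: timedelta = HOT_WINDOW) -> str:
    """Return 'hot' if the record is newer than hot_window, else 'cold'."""
    age = now - recorded_at
    return "hot" if age < hot_window else "cold"


def split_by_tier(timestamps, now, hot_window=HOT_WINDOW):
    """Partition timestamps into (hot, cold) lists, preserving input order."""
    hot, cold = [], []
    for ts in timestamps:
        target = hot if classify(ts, now, hot_window) == "hot" else cold
        target.append(ts)
    return hot, cold
```

In practice the window would be set from how far back operational queries actually reach, rather than from a fixed number.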



Management Strategy for Hot Data and Cold Data

To effectively manage time series data, a strategy that distinguishes between Hot Data and Cold Data is necessary. This is crucial for improving cost efficiency and optimizing data accessibility.




Operational Storage Management for Hot Data

Hot data requires real-time analysis and quick access, so it is kept in an operational state on high-performance storage. This typically means relatively expensive media such as SSDs (solid-state drives). Indexing and caching techniques can further optimize database performance and minimize response times.


Backup and Affordable Storage for Cold Data

Cold data is not frequently accessed but needs long-term preservation. Such data is backed up and stored in affordable storage. Examples include hard disk drives (HDDs), cloud storage, and tape drives. This is an efficient method that reduces costs while allowing data recovery when needed.
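The cost argument can be made concrete with simple arithmetic. The sketch below is a rough Python illustration only: the function names are hypothetical, and the per-GB prices are parameters the reader must supply from their own storage pricing, since the article gives no figures.

```python
def monthly_storage_cost(hot_gb: float, cold_gb: float,
                         hot_price_per_gb: float,
                         cold_price_per_gb: float) -> float:
    """Total monthly cost when data is split across a hot and a cold tier."""
    return hot_gb * hot_price_per_gb + cold_gb * cold_price_per_gb


def tiering_savings(total_gb: float, hot_fraction: float,
                    hot_price_per_gb: float,
                    cold_price_per_gb: float) -> float:
    """Monthly savings from tiering versus keeping all data on hot storage."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    all_hot = monthly_storage_cost(total_gb, 0.0,
                                   hot_price_per_gb, cold_price_per_gb)
    hot_gb = total_gb * hot_fraction
    tiered = monthly_storage_cost(hot_gb, total_gb - hot_gb,
                                  hot_price_per_gb, cold_price_per_gb)
    return all_hot - tiered
```

The savings grow with the share of data that can move to the cold tier and with the price gap between the two tiers, which is why minimizing the hot set matters.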


Recovery and Utilization of Cold Data When Necessary

Cold data should be recoverable and usable when needed. This refers to situations where backed-up data is reactivated for analysis and decision-making. The recovery process should be quick and efficient, requiring appropriate data management and backup systems. Additionally, the data recovery procedure must ensure data integrity and may include converting data to the latest format if necessary.
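One common, generic way to check integrity is to record a checksum when a file is archived and compare it after recovery. The Python sketch below assumes the backup can be treated as a single file and uses SHA-256 from the standard library; the helper names are hypothetical, and this is not a description of how any particular DBMS verifies its backups.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restored(path: Path, expected_digest: str) -> bool:
    """Return True if the restored file matches the digest recorded at backup time."""
    return sha256_of_file(path) == expected_digest
```

The digest would be stored alongside the archive at backup time and checked before the recovered data is used for analysis.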


Machbase's Innovative Approach to Cold Data Utilization

Most DBMSs require a restore operation to the operational DB to query backup data again. If the backup data volume is large, this restore process can take hours to days.

In contrast, Machbase provides functionality similar to the file system mount concept in Linux. The MOUNT feature attaches a backup to the operational DB so that it can be queried immediately, regardless of data volume. This reduces the time needed to make backup data available to within seconds, allowing immediate utilization when needed. Consequently, Hot Data can be minimized to only what is necessary for operations, while the rest can be backed up and stored as Cold Data without compromising data utilization.


Example of Backup/Mount Usage

Here's a brief explanation of how to use Machbase's Backup and Mount commands:


1) Database Backup

Create a backup DB named backup_20240528 using the following SQL command:

BACKUP DATABASE INTO DISK = 'backup_20240528';

2) Mounting the Backup DB

Connect the created backup DB to the operational DB with the name MountDB:

MOUNT DATABASE 'backup_20240528' TO MountDB;

3) Querying Mount Data

To query the mounted DB, you need to specify the table as "mountname.username.tablename" to distinguish it from the operational DB:

SELECT * FROM MountDB.SYS.TAG;

4) Unmount

After data utilization is complete, disconnect the mounted DB from the operational DB:

UNMOUNT DATABASE MountDB;

For more detailed information, please refer to the Machbase manual: https://docs.machbase.com/dbms/feature-table/backup-mount/overview/
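If the four steps above are run regularly, the statements can be generated from a date rather than typed by hand. The Python sketch below only builds the SQL strings shown in this article; the helper names (backup_name, workflow_statements) are hypothetical, the backup_YYYYMMDD naming is inferred from the example name backup_20240528, and actually executing the statements against a Machbase server is left out.

```python
import re
from datetime import date

# Conservative identifier check so names can be safely placed in SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def backup_name(day: date) -> str:
    """Build a backup name in the backup_YYYYMMDD form used in the example."""
    return f"backup_{day:%Y%m%d}"


def _checked(name: str) -> str:
    """Reject anything that is not a plain identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def workflow_statements(day: date, mount_name: str,
                        user: str, table: str) -> list[str]:
    """Return the backup, mount, query, and unmount statements in order."""
    name = backup_name(day)
    mount_name, user, table = (_checked(n) for n in (mount_name, user, table))
    return [
        f"BACKUP DATABASE INTO DISK = '{name}';",
        f"MOUNT DATABASE '{name}' TO {mount_name};",
        # Mounted tables are addressed as mountname.username.tablename.
        f"SELECT * FROM {mount_name}.{user}.{table};",
        f"UNMOUNT DATABASE {mount_name};",
    ]
```

Calling workflow_statements(date(2024, 5, 28), "MountDB", "SYS", "TAG") reproduces the four statements from the example above. In a real deployment the backup and the later mount would normally run at different times, so the list is best treated as a template rather than a script to run in one go.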


Conclusion

Time series data has established itself as a crucial asset across various industries. By understanding the temporal value of data and implementing efficient management strategies for Hot Data and Cold Data, organizations can reduce data storage costs and quickly access and utilize necessary data. Effective time series data management plays a vital role in supporting data-driven decision-making and improving the accuracy of future prediction models.


The Machbase time series DBMS provides a MOUNT feature that allows for immediate utilization of Cold Data, enabling more efficient data management strategies than DBMSs that must restore a backup before it can be queried.

