Time series and flat file storage solutions are both specialized databases. Like log stores, they were designed with a specific purpose, and attempting to do much outside of what they are good at will result in inefficiency and hang-ups. In this installment of our series on datastores, let’s discuss what time series and flat file databases are and when to use them.
Time series datastores are usually intended for extremely fast-moving data: millions or even billions of data points per minute, or even per second. That data is also chronological and time stamped. A time series store is not meant for schema-heavy data, and it is not meant to run complex queries or relate data to other data. It is meant to take in tons and tons of data in near real time, so that the data gets in and is time stamped appropriately. You’re not waiting and queuing data while complex writes occur or a table is locked on a complicated join. The function of a time series datastore is limited, and when you try to push it beyond those limits, it’s going to put up an index finger and tell you to hang on for a minute. Only that one minute is closer to 30 minutes, or sometimes longer.
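To make the idea concrete, here is a minimal sketch in Python of what "take data in, stamp it, keep it in order" can look like. It is a toy, not how any particular time series product works: the `TimeSeries` class, its `append` and `between` methods, and the sample readings are all made up for illustration.

```python
import bisect
import time


class TimeSeries:
    """Toy append-only series of (timestamp, value) points, kept in time order."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, value, timestamp=None):
        # Stamp the point on arrival unless the caller supplies a time.
        ts = time.time() if timestamp is None else timestamp
        if self._timestamps and ts < self._timestamps[-1]:
            raise ValueError("points must arrive in chronological order")
        self._timestamps.append(ts)
        self._values.append(value)

    def between(self, start, end):
        # The only query offered: a time window, found by binary search
        # on the already-sorted timestamps. No joins, no relations.
        lo = bisect.bisect_left(self._timestamps, start)
        hi = bisect.bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))


series = TimeSeries()
for second, reading in enumerate([20.1, 20.4, 20.9, 21.3]):
    series.append(reading, timestamp=1000.0 + second)

window = series.between(1001.0, 1002.0)
print(window)
```

Writes are a plain append and the only read is a time window, which is why this kind of structure stays fast as long as you don’t ask it to behave like a relational database.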
Time series stores are not efficient at anything other than high-throughput, rapid-fire data capture. They are usually meant for simple data, but because they handle a massive number of transactions, they can still be expensive. The data they handle typically lives in the memory of a large server. And while that data is written to disk later as a backup, there is still a rate of error that must be accepted. There may also be circumstances where you care less about the time series itself and just need to capture massive amounts of data in a short window, and time series databases can help there too. Just remember that their function is limited and doesn’t allow for highly relational data, cross-sectioning, or complex queries.
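One way to picture the trade-off between capturing in memory and writing down later is a small buffer that only flushes to disk every few points. This sketch rests on assumptions: `BufferedRecorder`, the `flush_every` threshold, and the file name are hypothetical, and treating unflushed points as the "rate of error" is one reading of the paragraph above, not a description of a specific product.

```python
import json
import os
import tempfile


class BufferedRecorder:
    """Toy recorder: points live in memory and reach disk only on flush."""

    def __init__(self, path, flush_every=3):
        self.path = path
        self.flush_every = flush_every
        self.pending = []  # points captured but not yet written to disk

    def record(self, timestamp, value):
        self.pending.append({"ts": timestamp, "value": value})
        if len(self.pending) >= self.flush_every:
            self.flush()

    def flush(self):
        # Append every pending point as one JSON object per line.
        with open(self.path, "a", encoding="utf-8") as f:
            for point in self.pending:
                f.write(json.dumps(point) + "\n")
        self.pending.clear()


path = os.path.join(tempfile.mkdtemp(), "points.jsonl")
recorder = BufferedRecorder(path, flush_every=3)
for i in range(5):
    recorder.record(1000 + i, i * 1.5)

with open(path, encoding="utf-8") as f:
    on_disk = sum(1 for _ in f)
at_risk = len(recorder.pending)
print(on_disk, "points on disk,", at_risk, "still only in memory")
```

If the process died at this moment, the points still sitting in `pending` would be gone. A larger `flush_every` means faster ingest but a bigger window of data at risk.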
Flat file storage can be anything from files in various formats on your hard drive to S3 buckets or something similar. It’s simply for storage: you’re writing data to a file for later use. You might put it in a JSON file or one of several other formats, but either way you have the data, and it’s easily parsed and consumed. This is a form of data management that is often overlooked.
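As a small illustration of "write it to a file for later use," the sketch below saves a couple of records to a JSON file and reads them back using Python’s standard `json` module. The `orders` records and the `orders.json` file name are invented for the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records we want to keep for later use.
orders = [
    {"id": 1, "item": "keyboard", "quantity": 2},
    {"id": 2, "item": "monitor", "quantity": 1},
]

# Write the data to a flat file.
path = Path(tempfile.mkdtemp()) / "orders.json"
with path.open("w", encoding="utf-8") as f:
    json.dump(orders, f, indent=2)

# Later: parse the file and consume the data.
with path.open(encoding="utf-8") as f:
    loaded = json.load(f)

total_quantity = sum(order["quantity"] for order in loaded)
print(total_quantity)
```

There is no server to run and no schema to migrate; any tool that can read JSON can consume the file.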
There is inherent latency in using flat file storage. If you are writing to a general-purpose database, oftentimes you are actually working in the computer’s RAM. That data gets put onto the actual hard drive later, especially if the database has been optimized to work fast.
With flat file storage, you are writing to files instead of writing to memory or to a datastore. Writing to files is also how a traditional database ends up with data that is not ephemeral and won’t disappear. But sometimes it’s easier to do it directly, on your own, and sometimes it’s faster, depending on what you are doing and how you’re interacting with the data itself. Because of how it works, flat file storage is also a way to back up data, and it’s almost always cheaper than other methods of data storage.
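Here is a rough sketch of writing directly to a file so the data is not ephemeral, then backing it up by copying the file. The `append_durably` helper, the file names, and the sample log lines are assumptions made for illustration. `os.fsync` and `shutil.copy2` are standard-library calls, though how strong the durability guarantee is depends on the operating system and the storage hardware.

```python
import os
import shutil
import tempfile


def append_durably(path, line):
    """Append one line and ask the OS to push it to the storage device."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()             # move data from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write it out to the device


workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "events.log")
backup_path = os.path.join(workdir, "events.log.bak")

append_durably(data_path, "2024-01-01T00:00:00Z,login")
append_durably(data_path, "2024-01-01T00:00:05Z,logout")

# Backing up a flat file can be as simple as copying it.
shutil.copy2(data_path, backup_path)

with open(backup_path, encoding="utf-8") as f:
    backed_up = f.read().splitlines()
print(backed_up)
```

Syncing on every write is where the latency comes from: each call waits on the disk rather than returning as soon as the data is in memory.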
Time series databases and flat file storage are like other datastores in that they are efficient at what they were designed for. Each is built for a specific purpose and limited to that purpose, so trying to use it for something else makes it inefficient and will cause issues. It is usually a good idea to employ more than one form of data storage to enable various methods of working with your data. Understanding how each form of storage works is the first step to determining which storage method your business would benefit from the most.