Having a data storage solution is only the first step in the process. The second step is protecting that data from would-be thieves and nefarious evil-doers who will attempt to breach your walls.
To discuss data security you have to start with CIA – confidentiality, integrity and availability. One of the biggest considerations in data storage solutions is how to best protect storage resources against cyber-attacks. Cyber-attacks and data breaches are continually on the rise, so setting proper protections around your sensitive data is incredibly important.
The number one factor in securing any data storage solution is proper configuration. Next is durability. You need to ensure that your company will have its data no matter what, so durability is about having a system where you don’t lose the data even if the underlying technology fails. Whether that comes from a five 9’s guarantee, a seven 9’s guarantee or hourly backups, you have to have the data in order to secure it. Data should be replicated to multiple locations and multiple drives, or into multiple data stores underlying the service that you’re using, so that there are multiple copies of it at any given time. If one copy is compromised or fails, your business still has access to its data.
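To make the "9’s" and the value of multiple copies concrete, here is a rough back-of-the-envelope sketch. It assumes the guarantee is quoted as an annual, per-object durability and that copies fail independently of each other, which real systems only approximate. The function names are hypothetical and exist only for illustration.

```python
def nines_to_probability(nines: int) -> float:
    """Convert a count of nines (5 means 99.999%) into a probability."""
    return 1.0 - 10.0 ** (-nines)


def expected_annual_losses(total_objects: int, nines: int) -> float:
    """Expected number of objects lost per year at the given durability.

    Assumption: the durability figure applies per object, per year.
    """
    return total_objects * 10.0 ** (-nines)


def all_copies_lost(per_copy_loss: float, copies: int) -> float:
    """Chance that every copy is lost at once.

    Assumption: copies fail independently, so the probabilities multiply.
    """
    return per_copy_loss ** copies
```

Under these assumptions, ten million objects stored at five 9’s would lose about 100 objects a year on average, and three independent copies that each have a 1% chance of loss would all be lost together roughly one time in a million.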
Availability depends on durability. Data that has been lost cannot be served, so the availability of your data is directly related to the durability of your data storage solution.
Scalability also contributes to availability. You want your data stores to scale with you, so that you can add more data as your business grows and still retrieve it as needed without hitting a bottleneck.
Integrity is protected in data stores through versioning and logging. When you have a data storage solution, it’s really important that you keep saved versions of the data objects. That way, if data is tampered with or destroyed, you have backups and versions that are intact and can ensure integrity. With versioning, every change to the data, including a deletion, creates a new version. Deleting an object doesn’t delete its old versions unless you intentionally delete all versions, so your data remains recoverable.
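The versioning behavior described above can be sketched with a small in-memory store. This is a toy model under stated assumptions, not how any particular storage service implements versioning; the class name VersionedStore and its methods are hypothetical.

```python
from typing import Dict, List, Optional


class VersionedStore:
    """Toy key-value store that keeps every version of each object."""

    def __init__(self) -> None:
        # Each key maps to its full history; None stands for a delete marker.
        self._history: Dict[str, List[Optional[bytes]]] = {}

    def put(self, key: str, data: bytes) -> int:
        """Store a new version and return its version number."""
        versions = self._history.setdefault(key, [])
        versions.append(data)
        return len(versions) - 1

    def delete(self, key: str) -> int:
        """Record a delete as a new version; older versions are kept."""
        versions = self._history.setdefault(key, [])
        versions.append(None)
        return len(versions) - 1

    def get(self, key: str, version: Optional[int] = None) -> Optional[bytes]:
        """Return the latest version, or a specific older one if requested."""
        versions = self._history.get(key, [])
        if not versions:
            return None
        return versions[-1] if version is None else versions[version]

    def purge(self, key: str) -> None:
        """Intentionally delete all versions; this is the only real erase."""
        self._history.pop(key, None)
```

In this sketch, writing an object, overwriting it and then deleting it leaves three versions behind. Reading the latest version returns nothing, but asking for the first version still returns the original bytes until purge is called.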
Other considerations alongside versioning are change management and the logging of data access and data manipulation. Logs tell you who accessed what data and how it was altered, which also helps ensure integrity. Controlling and recording access in this way also contributes to confidentiality.
When it comes to confidentiality, data should be isolated away from the public as much as possible. This includes everything from not having public-facing S3 buckets to isolating data storage solutions, data drives and databases into non-public networks. Ensure that access is limited, especially in large-scale production environments. You really only want machines to access production data systems. Users should reach prod data only indirectly, through robust credentialing systems. However, even when you eliminate direct human access in prod, machine access will still require keys and/or credentials. Those keys and access credentials should be kept in key stores or other secure stores, isolated away from everything else on your network.
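One way to catch an accidentally public bucket is to scan its policy document for statements that allow access to everyone. The sketch below assumes the policy follows the usual JSON shape of an S3 bucket policy, with "Statement", "Effect" and "Principal" fields. It deliberately ignores "Condition" blocks, so it may flag statements that are in fact restricted. The function names are hypothetical.

```python
import json
from typing import Any, Dict, List


def _is_wildcard_principal(principal: Any) -> bool:
    """True if the principal is "*" or contains "*" (for example {"AWS": "*"})."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        for value in principal.values():
            values = value if isinstance(value, list) else [value]
            if "*" in values:
                return True
    return False


def public_statements(policy_json: str) -> List[Dict[str, Any]]:
    """Return the Allow statements that grant access to any principal.

    Assumption: Condition blocks are not evaluated, so this over-reports
    rather than under-reports.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may carry a single statement object instead of a list.
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow" and _is_wildcard_principal(s.get("Principal"))
    ]
```

A check like this is only a safety net. The stronger control is still to keep the data store off public networks in the first place.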
It is incredibly important to have a key store or key vault as part of your data storage solution. A variety of systems can be used, such as key management systems and secret management systems. If the information is highly confidential, you can also look into a hardware security module (HSM). HSMs are much more expensive, even as a fractional cloud service, but they are built and operated with a separation of concerns: setup and initialization, management of the hardware key, and maintenance are each handled by different parties. Many are also built to destroy the data they house if tampering is attempted.
As you’re setting up these protections, don’t fall prey to the two most common mistakes: failing to implement versioning and accidentally putting your data at risk. You’re more likely to see accidental or malicious data destruction than theft of data. Truthfully, the “theft” that is seen is regularly traced back to either incidental or negligent exposure, meaning that people have left access points or other information open to the public when they should have been private or, worse, have failed to patch a known vulnerability.
The best way to prevent the problems above is by selecting the right data solutions, configuring them correctly and isolating them as much as possible. When it comes to data, as with most of information security, use least privilege. No human being should be directly accessing data storage solutions and data stores in a production environment. It frankly should not be necessary for a human to access them unless it’s an extreme emergency, a “break glass in case of fire” situation. That means something critical has happened and someone needs to go in and repair the problem with the least amount of access possible. The access should last only a short period of time, and the process should be logged and recorded.
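The "break glass" idea can be sketched as a grant that is scoped to one user and one resource, expires quickly and leaves an audit trail. This is a minimal illustration rather than a production access system. The class names, the 15-minute default cap and the log format are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BreakGlassGrant:
    user: str
    resource: str
    reason: str
    expires_at: float  # seconds since the epoch


class BreakGlassAccess:
    """Hands out short-lived, logged emergency access to a single resource."""

    def __init__(self, max_seconds: float = 900.0) -> None:
        self._max_seconds = max_seconds  # hypothetical cap: 15 minutes
        self._grants: List[BreakGlassGrant] = []
        self.audit_log: List[str] = []

    def grant(self, user: str, resource: str, reason: str,
              seconds: float, now: Optional[float] = None) -> BreakGlassGrant:
        """Create a grant, capped at max_seconds, and record it."""
        if not reason.strip():
            raise ValueError("an emergency grant needs a stated reason")
        now = time.time() if now is None else now
        expires_at = now + min(seconds, self._max_seconds)
        g = BreakGlassGrant(user, resource, reason, expires_at)
        self._grants.append(g)
        self.audit_log.append(
            f"GRANT user={user} resource={resource} until={expires_at} reason={reason}"
        )
        return g

    def is_allowed(self, user: str, resource: str,
                   now: Optional[float] = None) -> bool:
        """Check for an unexpired grant and log the check either way."""
        now = time.time() if now is None else now
        allowed = any(
            g.user == user and g.resource == resource and now < g.expires_at
            for g in self._grants
        )
        self.audit_log.append(
            f"CHECK user={user} resource={resource} allowed={allowed}"
        )
        return allowed
```

The grant covers exactly one user and one resource, which keeps the access as narrow as possible, and every grant and every check is written to the log so the emergency can be reviewed afterwards.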
Having the proper data storage solution is only part of the puzzle. You must also ensure that your data is secure, durable and available, and that it has integrity. Your business literally depends on this, because if something becomes exposed, you’re on the hook. And if the cause was a known vulnerability, it can get very ugly in terms of fines and compliance crackdowns. Don’t expose your data! Always remember, if you question anything, hire an expert to help you! Make sure it’s done right the first time.