Data security is due for a rethink.
We define data security as the process and technologies used to safeguard data. It’s a critical part of protecting a company’s reputation and fiscal well-being. The problem is that traditional data security models don’t do enough to account for the current realities of data management. Most companies equate data security with access control, i.e., determining who has permissions to view, modify, and retrieve data.
Access control is important, but, in today’s cloud-first environments, it’s not sufficient to protect data across living systems because data no longer lives in hermetic data banks behind closed doors. These days, data management means tracking rapidly evolving datasets, queries, configurations, and permissions across data lakes and throughout the cloud. To achieve true data security, we need models that actually account for the entire lifecycle and ecosystem of data.
Traditional Data Security Measures
Traditional data security measures focus on protecting against outside threats. The aim is to put walls up around data, obscure it, and dispose of it in a way that leaves nothing for hackers and other malicious outsiders to access or use.
Traditional data security controls include:
- Authentication: verifying a user’s credentials
- Access control: restricting who can see/use which data sets
- Backup and recovery: regularly backing up and restoring data in the event of catastrophic data loss
- Encryption: making data unreadable (without an authorized key) when the data is at rest, in transit, and/or in use
- Data masking & tokenization: obscuring data with proxy characters or random characters
- Deletion and erasure: removing data from the system
These traditional data security measures imply that as long as only “the right” people can access data, there’s no risk of a data breach. But to protect all your data and tools against data breaches, you’re going to need to go further.
Go Beyond Access Control To Tackle The Biggest Data Security Problem: Data in Use (and Insider Threat)
Most data security solutions focus on mitigating outsider threats. In fact, you’ll notice that three of the six traditional controls listed above primarily deal with keeping outsiders out: authentication, access control, and encryption.
But keeping hackers out isn’t the only security concern. Every time data is accessed and manipulated, even by internal users who are supposed to be there, there’s risk. As the data environment evolves, the data lifecycle lengthens and diversifies, and the problem takes on new dimensions. It’s not enough to focus all your energy on evil hackers anymore. To protect all your data and tools, you’re going to need to tackle the tangle of insider threat.
Insider threat is the risk of data loss from either employee malfeasance or employee carelessness, mistakes, or misuse.
Many security professionals dismiss the importance of insider threat because they equate insider threat to employee malfeasance. But malfeasance is the smallest part of insider threat—carelessness, mistakes, and misuse are the main drivers of insider threat.
“Insider threat via a company’s own employees (and contractors and vendors) is one of the largest unsolved issues in cybersecurity. It’s present in 50 percent of breaches reported in a recent study,” according to McKinsey. Many large data breaches in the headlines today are complex, multi-step hacks, and the first step is often a compromised or phished credential from a careless employee—i.e., they’re the direct result of insider threat.
Credentials will continue to be compromised. To truly protect sensitive data, data security teams need to take a more fine-grained look at the actual behavior of credentialed users, and that means monitoring data in use.
The only way to monitor data in use is to examine actions at the query level. A query is a request for data from a database. Most relational databases and cloud data warehouses support a query language called SQL (Structured Query Language), and most access to the data they hold is performed through SQL queries. The flexibility of SQL makes it incredibly powerful, but it also allows users who are writing queries to create security risks, intentionally or inadvertently. And that’s why we say queries are the ground truth for data security. A poorly constructed query can accidentally return way more user data than intended, leaving personal information vulnerable. And a user with a small amount of coding skill can write a query to access information that could be damaging in the wrong hands.
So consider automated query inspection as a crucial next step to go beyond access control. Don’t just focus on who has the keys, but watch how data is used (or misused) once the keys have been presented. Track what data users come and go with, and make sure you are able to recognize dangerous queries in time, and you’ll be a major step closer to data security.
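As a rough illustration of what automated query inspection can look like, here is a minimal Python sketch. It is not how any particular product works: the SENSITIVE_TABLES set, the inspect_query function, and both rules are assumptions made up for this example, and the regular expressions are simple heuristics rather than a real SQL parser.

```python
import re

# Hypothetical list of tables known to hold sensitive data (an assumption).
SENSITIVE_TABLES = {"users", "payments"}


def inspect_query(sql: str) -> list[str]:
    """Return human-readable warnings for one SQL statement."""
    # Lowercase and collapse whitespace so the heuristics below are simpler.
    normalized = " ".join(sql.lower().split())

    # Collect table names that follow FROM or JOIN, dropping any schema prefix.
    names = re.findall(r"\b(?:from|join)\s+([a-z_][a-z0-9_.]*)", normalized)
    tables = {name.split(".")[-1] for name in names}

    warnings = []
    if not tables & SENSITIVE_TABLES:
        return warnings  # Query does not touch a sensitive table.

    if re.search(r"\bselect\s+\*", normalized):
        warnings.append("selects every column from a sensitive table")
    if not re.search(r"\b(where|limit)\b", normalized):
        warnings.append("reads a sensitive table with no WHERE or LIMIT clause")
    return warnings
```

A production system would parse the query properly and weigh context such as who ran it and how many rows came back, but even this small check shows how the same credential can produce either a routine lookup or a bulk export.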
Automatically Discover New Data Stores
More and more companies are collecting data and working to make intelligent use of their data, and that’s a good thing. But as data and data stores get replicated, sensitive data ends up spilling into stores where it isn’t supposed to be. When this happens, it can be accessed and modified by internal users who may not have any need or training to interact with such sensitive data.
To tackle this data sprawl problem, data security teams need to automate the process of tracking and mapping new data stores. Next-generation data security needs to be able to crawl a multi-cloud environment and automatically find new data stores. This is not a human-scale problem. This is a machine-scale problem, and it needs a software solution.
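The core of automated discovery is a comparison between what you already know about and what a crawl of your environment just found. The sketch below shows only that comparison step. The DataStore class and find_new_stores function are hypothetical names for this example, and the crawl itself (which would call each cloud provider’s inventory APIs) is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    """Hypothetical record for one data store found in a cloud account."""
    provider: str    # e.g. "aws" or "gcp"
    identifier: str  # e.g. a bucket name or database ID


def find_new_stores(known, discovered):
    """Return stores present in the latest crawl but absent from the registry."""
    new_stores = set(discovered) - set(known)
    # Sort for stable, reviewable output.
    return sorted(new_stores, key=lambda s: (s.provider, s.identifier))
```

Running a diff like this on a schedule turns discovery into a routine report of new stores that someone needs to review, instead of a manual audit.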
Identify Data Store Misconfigurations
Cloud misconfiguration is one of the biggest security threats for organizations in the cloud. Organizations make a fatal (and frequent) mistake when they assume that the cloud service provider (CSP) is responsible for secure configurations. CSPs are only responsible for the security of the cloud; organizations are responsible for security in the cloud, which includes configurations. A 2018 IBM X-Force Report noted a 424% increase in data breaches resulting from cloud misconfiguration caused by human error.
Use tools like Dasera to automatically find and flag data store misconfigurations before your data infrastructure becomes just another leaky ship.
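To make the idea concrete, a misconfiguration check can be as simple as a set of rules applied to each data store’s settings. The following is a generic sketch, not a description of how Dasera or any other tool works. The setting names (public_access, encrypted_at_rest, audit_logging) are assumptions for illustration and do not correspond to any specific cloud provider’s API.

```python
def find_misconfigurations(config: dict) -> list[str]:
    """Check one data store's settings against a few illustrative rules."""
    findings = []
    # Treat a missing setting as the unsafe value so gaps are flagged, not ignored.
    if config.get("public_access", True):
        findings.append("store is publicly accessible")
    if not config.get("encrypted_at_rest", False):
        findings.append("encryption at rest is disabled")
    if not config.get("audit_logging", False):
        findings.append("audit logging is disabled")
    return findings
```

Defaulting missing settings to the unsafe value is a deliberate choice in this sketch: a store whose configuration cannot be read should be surfaced for review rather than silently passed.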
Identify and Classify Sensitive Data
The next step is data classification. As new data stores are identified, sensitive data needs to be correctly identified so only the right users can access it and so it can be deleted properly upon request. Companies can opt for a manual or automated solution for data classification, but the best option is a hybrid solution. Choose a data security system that automatically tracks new data stores, classifies sensitive data, tags data identified as private, allows data owners to seamlessly review and reclassify data, and flags when sensitive data is found in unexpected places. Ideally, it also surfaces suggested classifications to help your teams learn about what kinds of data you have and what questions they might be able to ask and answer in the future.
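One way to picture the automated half of a hybrid approach is a classifier that samples values from a column and suggests a label for a data owner to confirm or correct. The sketch below assumes two simplified patterns (email addresses and US Social Security numbers) and an arbitrary 80 percent threshold. The names PATTERNS and suggest_classification are hypothetical, and real classifiers use far richer signals than regular expressions.

```python
import re

# Simplified, illustrative patterns; real detection needs more than regexes.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def suggest_classification(samples, threshold=0.8):
    """Suggest a label if enough sampled values match one pattern, else None."""
    values = [s for s in samples if s]  # Ignore empty and missing values.
    if not values:
        return None
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in values if pattern.fullmatch(v))
        if hits / len(values) >= threshold:
            return label
    return None
```

The function returns a suggestion rather than a verdict, which leaves room for the human review and reclassification step.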
Data security and data intelligence both need to become interwoven in the operations of data-first companies in order to make good on the potential in data science and stem the flood of database breaches.
Find Over-Permissioning
In today’s cloud-first environments, sensitive data tends to sprawl across your environment, into less well protected data stores.
In order to have a proactive data security posture, you not only need to know where all sensitive data resides in your data stores (see Identify and Classify Sensitive Data, above) -- you also need to know who has access to those data stores, and who should/shouldn’t have access to sensitive data.
Again, regularly reconciling where sensitive data is stored against who does/doesn’t have access to sensitive data isn’t a human-scale problem. You need a real-time surveillance system that identifies movement in data and changes in permissions, and notifies both the security and data teams when over-permissioning is detected.
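The reconciliation itself boils down to a set difference: for each store that holds sensitive data, compare who currently has access with who is approved to have it. Here is a minimal sketch of that step. The function name and the three inputs are assumptions for this example; in practice the grants would come from each platform’s permission system and the approved list from an access policy.

```python
def find_over_permissioned(sensitive_stores, grants, approved):
    """Map each sensitive store to the users who have access without approval.

    sensitive_stores: set of store names known to hold sensitive data
    grants:           dict of store name -> set of users with access today
    approved:         dict of store name -> set of users who should have access
    """
    findings = {}
    for store in sensitive_stores:
        extra = grants.get(store, set()) - approved.get(store, set())
        if extra:
            findings[store] = extra
    return findings
```

Each non-empty result is a candidate alert for the security and data teams, and rerunning the check whenever data moves or permissions change keeps the picture current.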
Data Security Requires Both Collaboration And Automation
Data security is an evolving interaction between humans and technology; it’s a living system that is tasked with protecting a fundamental human right to privacy.
So the solution for data security is going to require both humans and technology: both collaboration and automation.
This realization is at the heart of the services Dasera provides, but its implications are platform agnostic:
- Provide relevant security training that is targeted to the specific access and needs of every member of your organization. Use data store management software that gives those teams access to the intelligence they need.
- Automate a system that sends alerts with context that makes sense to the right people, so those people can learn from what happened in real time.
- Monitor data in use with query analysis to help teams get the insights they need to make use of the incredible potential in your data without accidentally (or intentionally) causing a breach.
- Automate the discovery of new data stores and sensitive data so they can be used in both a safe and a compliant manner.
To learn more about the lifecycle of data and how this framework can help protect all of your data and tools, download our whitepaper.