Cloud data protection isn't working for most companies. Here's how to fix it.

Cloud data protection needs to be reimagined. Traditional security solutions have focused on protecting the perimeter, but in cloud-first environments there is no perimeter; companies need to focus on protecting the cloud data itself.

Cybersecurity has become a board-level issue, so why are we still seeing data breaches in the news every day? The industry needs a better definition and a more fine-grained approach to cloud data protection. It’s time to wrap our heads around the entire cloud data lifecycle and to protect the data itself at every stage.

What is cloud data protection?

Cloud data protection refers to the practices required to secure an organization’s data in the cloud, no matter where that data is located: in motion or at rest, held and managed internally, or moving through external systems managed by third parties.

Anything less leaves you unprotected and vulnerable to data breaches. Data breaches now cost companies an average of $4.24 million per incident, according to the latest IBM Security report. This is the highest figure in the report’s 17-year history.

The Default Approach to Cloud Data Protection

The current default approach to cloud data protection that most organizations use looks something like this (a brief configuration sketch follows the list):

  • Encrypting data while it’s at rest within data stores
  • Encrypting data while it’s in transit with TLS (SSL) connections
  • Keeping data stores within a virtual private cloud (VPC)
  • Defining security groups within a VPC, creating virtual firewalls
  • Using SSO and/or the cloud provider's IAM functionality to authenticate user identity and potentially govern access to some data stores
  • Using access controls within the data store to determine a given user's permissions (read-only or write access) on a specific database, schema, or table
  • Deploying Data Loss Prevention (DLP) solutions on endpoints to detect if sensitive data is leaving the perimeter
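
To make two of these defaults concrete, here is a minimal sketch assuming an AWS-hosted Postgres data store; the instance identifier, host, credentials, and CA bundle path are all placeholders:

```python
import boto3       # AWS SDK; assumes credentials and region are configured
import psycopg2    # Postgres client

# Encryption at rest: verify the RDS instance was created with storage encryption.
rds = boto3.client("rds")
db = rds.describe_db_instances(DBInstanceIdentifier="analytics-db")["DBInstances"][0]
assert db["StorageEncrypted"], "data store is not encrypted at rest"

# Encryption in transit: require a verified TLS connection to the database.
conn = psycopg2.connect(
    host="analytics-db.internal",      # placeholder host
    dbname="analytics",
    user="app",
    password="...",                    # placeholder credential
    sslmode="verify-full",             # refuse unencrypted or unverified connections
    sslrootcert="rds-ca-bundle.pem",   # placeholder CA certificate bundle
)
```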

Why the Default Approach to Cloud Data Protection Doesn’t Work

The default approach to cloud data protection takes a traditional security model built for on-premises environments and tries to apply it to cloud-first environments.

In the on-premises world, all of an organization’s servers and data are in a server room or in server cages at a hosting facility. Security teams focus on hardening the perimeter -- both physical and network -- and making sure anyone who passes through the perimeter has proper credentials.

The default approach to cloud data protection tries to establish a perimeter around cloud data: the VPC and security groups act as the network perimeter, and SSO and IAM enforce identity and user authentication. DLP solutions then try to detect when sensitive data passes through the perimeter.

But the key problem is that, in a cloud-first environment, the perimeter is extremely porous. For example, even with a VPC, security groups, SSO, IAM, and DLP all deployed, a single data store could be configured to be open to the public internet, creating a massive hole in the perimeter.
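
A scan like the following sketch can reveal exactly that kind of hole; it assumes AWS with credentials already configured:

```python
import boto3  # assumes AWS credentials and region are configured

# Any RDS instance flagged PubliclyAccessible is reachable from outside the VPC,
# bypassing the network perimeter entirely.
rds = boto3.client("rds")
for db in rds.describe_db_instances()["DBInstances"]:
    if db.get("PubliclyAccessible"):
        print(f"Perimeter hole: {db['DBInstanceIdentifier']} is open to the public internet")
```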

Organizations need to realize that, in a cloud-first environment, there is no perimeter. To truly have effective cloud data protection, enterprises need to protect the data itself -- they need to understand where data is stored, how those data stores are configured, and how the data is used. Enterprises need to evolve their cloud data protection by understanding and securing the full lifecycle of their data.

7 ways to fix cloud data protection

With businesses operating in the cloud, it's become easy to copy and transfer data to new places. So easy, in fact, that businesses lose track of all the places their data lives. The solution is to better monitor the data lifecycle and to ground your data protection work in the reality of what's going on beyond the perimeter.

1. Automatically discover new data stores

The cloud has increased the speed of data creation, making it much harder to keep track of all the places where your data lives. Data is queried and copy-pasted across teams collaborating in the cloud, while data stores are created and shared faster than security teams can review them.

To protect cloud data at this stage, we need to use software and AI to crawl and find new data stores. The goal should be to track and map data as it evolves, in real time.
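
A minimal discovery sketch, assuming an AWS environment with credentials configured (a production crawler would cover far more services, regions, and accounts):

```python
import boto3  # assumes AWS credentials and region are configured

def discover_data_stores():
    """Enumerate common AWS data stores so new ones can be diffed against an inventory."""
    found = []
    for db in boto3.client("rds").describe_db_instances()["DBInstances"]:
        found.append(("rds", db["DBInstanceIdentifier"]))
    for bucket in boto3.client("s3").list_buckets()["Buckets"]:
        found.append(("s3", bucket["Name"]))
    for cluster in boto3.client("redshift").describe_clusters()["Clusters"]:
        found.append(("redshift", cluster["ClusterIdentifier"]))
    return found

# Diff against the last crawl to surface stores no one has reviewed yet.
known = set()  # in practice, load this from your inventory store
for store in discover_data_stores():
    if store not in known:
        print(f"New data store discovered: {store[0]} / {store[1]}")
```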

2. Automatically detect database misconfigurations

Organizations often assume the cloud service provider (CSP) is responsible for secure configurations, but responsibility in the cloud is shared. CSPs are only responsible for the security of the cloud; organizations are responsible for security in the cloud, which includes configurations. A 2018 IBM X-Force Report noted a 424% increase in data breaches resulting from cloud misconfiguration caused by human error.

To protect cloud data at this stage: determine the correct configuration parameters for the types of data you steward. Once you know your answers, you need software that can crawl new configurations and automatically identify misconfigurations.
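
As a sketch of what that crawl could look like against AWS RDS; the required parameters below are illustrative stand-ins for your own answers:

```python
import boto3  # assumes AWS credentials and region are configured

# Illustrative policy: the correct parameters for the data you steward may differ.
REQUIRED = {
    "StorageEncrypted": True,      # data must be encrypted at rest
    "PubliclyAccessible": False,   # no direct internet exposure
    "DeletionProtection": True,    # guard against accidental drops
}

rds = boto3.client("rds")
for db in rds.describe_db_instances()["DBInstances"]:
    for param, expected in REQUIRED.items():
        if db.get(param) != expected:
            print(f"Misconfiguration: {db['DBInstanceIdentifier']} has "
                  f"{param}={db.get(param)!r}, expected {expected!r}")
```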

3. Automatically classify and tag sensitive fields

In the cloud, data is shared quickly within departments, and sensitive data can be quickly loaded into different data stores or into different tables within the same data store. Companies lose track of what kind of data is actually being loaded into databases and whether it should exist in the store at all.

One solution to better protect cloud data at this stage is crawling new data stores, and another lies in smarter data classification. Data security designed for the multiverse of cloud-based data storage needs to automatically find new fields and tag and classify sensitive ones.
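
A toy classification sketch; the patterns, sampling, and threshold here are hypothetical, and a real classifier would also weigh column names, context, and ML signals:

```python
import re

# Hypothetical patterns; extend with the data types your organization stewards.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def classify_column(sample_values, threshold=0.8):
    """Tag a column as sensitive if most sampled values match a known pattern."""
    for label, pattern in PATTERNS.items():
        hits = sum(1 for value in sample_values if pattern.match(str(value)))
        if sample_values and hits / len(sample_values) >= threshold:
            return label
    return None

print(classify_column(["ana@example.com", "bo@example.com"]))  # -> email
print(classify_column(["blue", "green"]))                      # -> None
```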

4. Regularly sync all data store permissions and detect over-permission

In the cloud, new permissions are granted faster than security operations can review them. Over-permission to sensitive data can occur as the result of sensitive data sprawl – permission to a dataset is granted without the company being aware of sensitive data in that store – or because of errors or oversights while granting permissions. For example, an executive grants dataset access to an entire email list supplied by a department head when only one person on that team required access.

Companies need intelligent monitoring and reporting of access use. Permission analysis needs to be core to data security practices along with methods to regularly sync all data store permissions and detect permission errors.
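
A sketch of what a permission sync might look like against a Postgres data store; the connection details and the sensitive-table inventory are placeholders:

```python
import psycopg2  # assumes a Postgres data store; connection details are placeholders

conn = psycopg2.connect(host="db.internal", dbname="analytics",
                        user="auditor", password="...")  # placeholder credentials

# Pull every table-level grant so it can be reviewed centrally.
with conn.cursor() as cur:
    cur.execute("""
        SELECT grantee, table_schema, table_name, privilege_type
        FROM information_schema.role_table_grants
        WHERE grantee NOT IN ('postgres', 'PUBLIC')
    """)
    grants = cur.fetchall()

# Flag write access to tables your classification step tagged as sensitive.
SENSITIVE_TABLES = {("public", "customers")}  # placeholder inventory
for grantee, schema, table, privilege in grants:
    if (schema, table) in SENSITIVE_TABLES and privilege in ("INSERT", "UPDATE", "DELETE"):
        print(f"Review: {grantee} has {privilege} on {schema}.{table}")
```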

5. Provide use-specific training

Employee training about security needs to be relevant to how the employee uses data. A customer support employee fielding hundreds of inbound queries will need training on how to spot phishing and other inbound threats, while someone working on onboarding needs to know how to manage permissions and avoid database configuration errors.

According to a 2021 report by Fujitsu, 60% of senior executives admitted that all employees were receiving the same data security training regardless of their role. Unsurprisingly, 61% of employees said that training was ineffective.

6. Analyze sensitive data use

Understand users’ underlying behavior and how they use data so you can detect when a credentialed user is a bad actor or an outsider masquerading as an insider. Watching usage patterns at the query level will help you spot when a credentialed user is accessing data inappropriately, either in one fell swoop or in a low-and-slow attack across months.

Security disciplines such as IAM, DLP, and encryption are useful, but they still follow a perimeter-based security strategy: they concentrate on detecting dangers based on access behavior. Query analysis, on the other hand, can determine the level of danger each time someone accesses consumer data.
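
A toy sketch of query-level baselining; the log records below are fabricated for illustration, and a real system would analyze full query logs and many more signals:

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical query-log records: (user, rows_returned) per query.
log = [("ana", 120), ("ana", 95), ("ana", 110), ("bo", 40),
       ("bo", 55), ("bo", 48), ("ana", 105), ("bo", 9800)]

per_user = defaultdict(list)
for user, rows in log:
    per_user[user].append(rows)

# Flag a query that returns far more rows than the user's own historical baseline.
for user, counts in per_user.items():
    if len(counts) < 3:
        continue  # not enough history to form a baseline
    baseline, spread = mean(counts[:-1]), stdev(counts[:-1])
    latest = counts[-1]
    if spread and (latest - baseline) / spread > 3:
        print(f"Anomaly: {user} pulled {latest} rows vs. a baseline of ~{baseline:.0f}")
```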

7. Detect permission and lineage issues as they happen

As a company that stores data, you are responsible for following the rules in your sector for how long you must retain data and when that data must be deleted. Derivative datasets created throughout the lifecycle can make this complicated. Data needs to be properly archived so it can be retrieved as required for legal reasons, and when it’s time for it to be deleted, you need to be sure that data is really and truly deleted everywhere, in compliance with data privacy laws. Security teams need AI to keep track of all past and future data events and detect permission and lineage issues as they happen.
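
As a simple sketch, with a hypothetical retention rule and dataset inventory, a lineage-aware retention check might look like this:

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365 * 7)  # hypothetical seven-year sector rule

# Hypothetical inventory tracking each dataset's creation time and derivatives.
datasets = {
    "orders_2015": {"created": datetime(2015, 3, 1, tzinfo=timezone.utc),
                    "derivatives": ["orders_2015_ml_features"]},
}

now = datetime.now(timezone.utc)
for name, meta in datasets.items():
    if now - meta["created"] > RETENTION:
        # Deletion must cascade through lineage: derivatives inherit the deadline.
        targets = [name] + meta["derivatives"]
        print(f"Past retention: delete {', '.join(targets)} everywhere")
```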

It takes a digital village

IBM’s Cost of a Data Breach Report showed that 2021 had the highest average cost in 17 years, but it also revealed the biggest mitigating factor: security artificial intelligence (AI) and automation, when fully deployed, saved organizations up to USD 3.81 million compared to organizations without them.

The threats to data in the cloud exceed the grasp of even the most skilled security professionals. This isn’t a job for the solo security hero anymore; it’s going to take a sophisticated digital village, including AI and automation, to extend our reach into these complex new dimensions. Data’s lifecycle in the cloud makes for a quick-moving multiverse, and the tools required to protect it need to be on the same level.

When you're ready for help, contact us here.

Author

Thi Thumasathit