Five Cloud Data Threat Scenarios Only Query Analysis Can Protect You From

In today’s digital age, companies have more access than ever before to people’s PII and general consumer data. As companies handle more and more data, they become more vulnerable to data breaches. As 21st-century technology evolves, so too does the threat landscape, and there is nowhere CISOs should apply the magnifying glass more closely than on their own employees.

With so many businesses moving to a remote work environment, the organizational perimeter has disappeared. The line between insiders and outsiders is blurring, making you more vulnerable to threats such as credential theft. Query inspection is critical to an effective data protection strategy because it captures the intent of the employees interacting with structured data.

Because we store most of our consumer data in cloud databases, insiders reach it through query interfaces, BI reports, data pipelines, and APIs. All of these, at the end of the day, gather the data through a query. This makes queries the ground truth for how securely someone uses that information.

With more than 40% of breaches caused by employee negligence, it’s become more important than ever before to ensure that you have the right tools in place to protect your business.

Here are five situations where query analysis can protect your cloud assets:

1. An Analyst Accidentally Exfiltrates PII Fields

Let’s say that you are the CISO of a fairly large consumer app company. With everything shifting to digital, your VP of Growth asks an analyst to review new-accounts data and compare it with previous months or quarters. The analyst accidentally pulls more than just statistical results: the query generated by the BI tool also requests raw fields, including PII such as account names. That data, covering thousands of consumers, is now on the analyst’s laptop, waiting to be accidentally exfiltrated.

SELECT
    count(*),
    acc.account_name,
    acc.created_date,
    acc.referral_source
FROM
    accounts acc
    JOIN orders ord ON ord.account_id = acc.id
WHERE
    acc.created_date > (CURRENT_DATE - INTERVAL '3 months')
GROUP BY acc.account_name, acc.created_date, acc.referral_source;

From the point of view of your current stack, the analyst did nothing wrong. They used Tableau to run a report on a database they had legitimate access to. Without analyzing the query, in this situation the company could have lost PII for tens of thousands of consumers. Makes you wonder how many such queries were run across your organization today, doesn’t it?
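To make the idea concrete, here is a minimal sketch of what inspecting that report query could look like. It is an illustration only: the SENSITIVE_COLUMNS set and both function names are hypothetical, and the regex parsing is deliberately crude (a real tool would use a proper SQL parser and a data catalog).

```python
import re

# Assumption: a hand-maintained set of column names treated as sensitive.
SENSITIVE_COLUMNS = {"account_name", "email", "phone"}


def select_list(sql: str) -> list[str]:
    """Return the comma-separated expressions between SELECT and FROM."""
    match = re.search(r"select\s+(.*?)\s+from\s", sql, re.IGNORECASE | re.DOTALL)
    if not match:
        return []
    return [expr.strip() for expr in match.group(1).split(",")]


def raw_sensitive_columns(sql: str) -> list[str]:
    """List sensitive columns selected directly rather than inside a function call."""
    flagged = []
    for expr in select_list(sql):
        if "(" in expr:  # crude: treat any function call, e.g. count(*), as an aggregate
            continue
        column = expr.split(".")[-1].lower()  # drop a table alias such as "acc."
        if column in SENSITIVE_COLUMNS:
            flagged.append(column)
    return flagged


report_query = """
SELECT count(*), acc.account_name, acc.created_date, acc.referral_source
FROM accounts acc
JOIN orders ord ON ord.account_id = acc.id
WHERE acc.created_date > (CURRENT_DATE - INTERVAL '3 months')
GROUP BY acc.account_name, acc.created_date, acc.referral_source;
"""

print(raw_sensitive_columns(report_query))  # ['account_name']
```

The point of the sketch is that the risk is visible in the query text itself: the report asks for a count, but it also carries a raw account name along with every row.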

But wait, there’s more.

2. The MBA Who Doesn’t Know SQL

It’s been a busy summer for your eCommerce business, and your support organization just hired an MBA graduate to join its strategy team. He finished training a couple of weeks ago, where he learned how to use SQL for data analysis. Today is his first day on the floor, and his manager has asked him to work on understanding trends in recent support tickets. While doing this research, our friend, whose understanding of SQL is basic, runs a SELECT * query on the entire customer database, which has millions of records.

SELECT * FROM customers;

The MBA then downloads the data for millions of users, loads it into an Excel file, and puts the file on his laptop or in an S3 bucket. Now this data exists outside the secured data warehouse infrastructure.

From the point of view of access control, the new hire had the right credentials. He also had the right intent, but accidentally used the data in an unsafe way. The only way to catch that is to analyze every query individually.
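As a rough illustration of per-query analysis, the sketch below flags a query that selects every column with no WHERE or LIMIT clause. The function name is hypothetical and the string matching is intentionally simple; it is not how any particular product works.

```python
import re


def is_unbounded_select_star(sql: str) -> bool:
    """True when a query selects every column with no WHERE or LIMIT clause."""
    normalized = " ".join(sql.lower().split())  # collapse whitespace, ignore case
    selects_everything = re.match(r"select \* from ", normalized) is not None
    has_filter = re.search(r"\b(where|limit)\b", normalized) is not None
    return selects_everything and not has_filter


print(is_unbounded_select_star("SELECT * FROM customers;"))               # True
print(is_unbounded_select_star("SELECT * FROM customers WHERE id = 7;"))  # False
```

A check like this says nothing about credentials, which were valid here. It only looks at what the query asks for.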

3. A Nosey Employee Looks Up Their Ex-Boyfriend

Every business has them; not every business knows what they’re up to. With most of your teams working from home, it’s become more difficult than ever to monitor whether they are doing something malicious or unethical. You have a data scientist who works in the AI department at your dating app company. She’s been with the company for a couple of years, so you trust her. Little do you know that she just broke up with her longtime boyfriend and heard from his friend that he has been seeing someone new.

She runs a search in the database for her ex-boyfriend’s profile and history to gain access to who he has been chatting with as well as whose pictures he’s been liking.

SELECT
    ch.active_connections
FROM
    accounts acc
    JOIN chat_metadata ch ON acc.id = ch.account_id
WHERE
    acc.name = 'johnSmith1989';

This behavior violates the privacy of her ex, a user of your app. Even though the data scientist needs access to the database as part of her job, spying on an individual is a gross violation and probably merits strong punishment. However, unless you analyze her individual query, she has done nothing that violates security protocols (she wasn’t even looking at PII, which might still be encrypted).

Now imagine a system which could understand that her query violated the privacy of an individual and automatically changed the query to offer statistical or masked output. Would be great, wouldn’t it?
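The detection half of that idea can be sketched in a few lines. The example below only flags a query whose WHERE clause does an exact-match lookup on a column that identifies one person; it does not attempt the rewrite. The IDENTIFYING_COLUMNS set and the function name are assumptions made for illustration.

```python
import re

# Assumption: columns where an exact-match lookup singles out one person.
IDENTIFYING_COLUMNS = {"name", "email", "username", "phone"}

# Matches  [alias.]column = 'literal'  and captures the column name.
EQUALITY_ON_LITERAL = re.compile(r"(?:\w+\.)?(\w+)\s*=\s*'[^']*'")


def targets_individual(sql: str) -> bool:
    """True when the WHERE clause pins an identifying column to a literal value."""
    parts = re.split(r"\bwhere\b", sql, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) < 2:
        return False  # no WHERE clause at all
    columns = EQUALITY_ON_LITERAL.findall(parts[1])
    return any(column.lower() in IDENTIFYING_COLUMNS for column in columns)


lookup = """
SELECT ch.active_connections
FROM accounts acc
JOIN chat_metadata ch ON acc.id = ch.account_id
WHERE acc.name = 'johnSmith1989';
"""

print(targets_individual(lookup))  # True
```

A statistical query over the same tables, filtered by date range for example, would pass this check, which is the distinction that access control alone cannot draw.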

4. The Gig Engineer Creates a Temp Table

You are the CIO of a decently sized healthcare company. Because recent layoffs hit much of the engineering department, you need to hire a freelance engineer for a three-month gig while the business gets back on its feet. You ask the engineer to run some statistical analysis, but instead of running it directly on the data warehouse, they create a materialized view (a derived table) in the cloud environment so that they can run their queries on it and then delete it.

CREATE MATERIALIZED VIEW joined_tables AS
    SELECT *
    FROM customers c
    JOIN payments p ON c.id = p.customer_id
    JOIN reviews r ON c.id = r.customer_id
WITH DATA;

Except they forget to delete it. The engineer was so busy trying to get their bearings that the stopgap materialized view was never dropped. Now the data (PHI, PII, etc.) is lying there completely unprotected in the cloud environment.
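One way to picture how query analysis would surface this: walk the query log and keep track of derived objects that were created but never dropped. The sketch below is a simplified illustration with a hypothetical function name; the two regular expressions cover only the plain CREATE ... AS and DROP forms, not every dialect variant.

```python
import re

# Matches CREATE TABLE <name> AS ... and CREATE MATERIALIZED VIEW <name> AS ...
CREATE = re.compile(
    r"create\s+(?:materialized\s+view|table)\s+(\w+)\s+as\b", re.IGNORECASE
)
# Matches DROP TABLE / DROP MATERIALIZED VIEW, with an optional IF EXISTS.
DROP = re.compile(
    r"drop\s+(?:materialized\s+view|table)\s+(?:if\s+exists\s+)?(\w+)", re.IGNORECASE
)


def leftover_objects(query_log):
    """Return names of derived objects created in the log and never dropped."""
    live = set()
    for sql in query_log:
        for name in CREATE.findall(sql):
            live.add(name.lower())
        for name in DROP.findall(sql):
            live.discard(name.lower())
    return live


log = [
    "CREATE MATERIALIZED VIEW joined_tables AS SELECT * FROM customers c "
    "JOIN payments p ON c.id = p.customer_id WITH DATA;",
    "SELECT count(*) FROM joined_tables;",
]

print(leftover_objects(log))  # {'joined_tables'}
```

Had the engineer finished with a DROP MATERIALIZED VIEW statement, the same function would return an empty set.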

5. A Disgruntled Employee Sells Data to Competitor

When COVID-19 took the world by storm and upended normal business processes, your retail shop had to make cuts to certain departments that weren’t essential to the business. A group of employees in the marketing department caught wind that they were getting laid off and got really upset.

In order to make a quick buck and also jab the company that plans on firing them, these employees started stealing customer data in small batches (to avoid UEBA detection) to sell to your top competitor.

SELECT * FROM leads
WHERE qualified = true
    AND signed = false
ORDER BY created_on ASC
LIMIT 500;

And then, an hour later:

SELECT * FROM leads
WHERE qualified = true
    AND signed = false
ORDER BY created_on ASC
LIMIT 500 OFFSET 500;

(This repeats until the rest of your high-value leads are exfiltrated.)

Because so many queries run every day, flagging and blocking these small but mighty queries is nearly impossible without analyzing the queries themselves. As a result, you can hardly detect an exfiltration attempt when it is hidden in multiple, coordinated, and sufficiently randomized data pulls.
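For the simple paging pattern shown above, a sketch of the analysis might strip the LIMIT/OFFSET clause, group what remains by user, and add up the rows requested. Everything named here (the user label, the threshold, the function names) is hypothetical, and the sketch only catches pulls whose query text is otherwise identical; randomized variations would need real parsing and fuzzier matching.

```python
import re
from collections import defaultdict

# Trailing paging clause: LIMIT n [OFFSET m], optionally followed by a semicolon.
PAGING = re.compile(r"\s+limit\s+(\d+)(?:\s+offset\s+\d+)?\s*;?\s*$", re.IGNORECASE)


def rows_requested_per_shape(query_log):
    """Sum LIMIT values across queries that differ only in their paging clause."""
    totals = defaultdict(int)
    for user, sql in query_log:
        normalized = " ".join(sql.split())
        match = PAGING.search(normalized)
        if not match:
            continue  # not a paged query
        shape = normalized[: match.start()].lower()  # query text without paging
        totals[(user, shape)] += int(match.group(1))
    return dict(totals)


def suspicious(query_log, threshold):
    """Return (user, shape) pairs whose combined pulls reach the threshold."""
    totals = rows_requested_per_shape(query_log)
    return [key for key, total in totals.items() if total >= threshold]


base = ("SELECT * FROM leads WHERE qualified = true AND signed = false "
        "ORDER BY created_on ASC")
log = [
    ("marketing_user", base + " LIMIT 500;"),
    ("marketing_user", base + " LIMIT 500 OFFSET 500;"),
    ("marketing_user", base + " LIMIT 500 OFFSET 1000;"),
]

print(suspicious(log, threshold=1000))
```

Each 500-row pull looks harmless on its own. Only when the three are read as one query repeated with a moving offset does the 1,500-row total stand out.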

Protect The Structured Data in Your Cloud

Existing security disciplines like IAM, DLP, and encryption are great, but they apply a vault approach to security: they focus on identifying threats through access behavior. Query analysis, on the other hand, can assess the amount of risk every time someone uses consumer data.

In this day and age, when companies are collecting valuable consumer data, it’s imperative that CISOs have the right tools in place to protect their businesses. Because customer data can so easily be abused by accidental or malicious internal queries, having a solution in place that tracks and rewrites unsafe queries can help you stay ahead of potential breaches and keep your company out of the headlines.

Looking for a solution that offers real-time protection for the structured data in your cloud data warehouses and data lakes?

Dasera is a cybersecurity platform that automates the security of your cloud data through visibility, governance, and remediation for your structured databases. By using an innovative AI and machine learning algorithm, Dasera’s query analysis can help you go beyond access control and get direct insights into how your data is being used internally. For more information, contact us for a consultation.