Data Security Defined: Your Guide to Industry Jargon

Cloud-Native Application Protection Platform (CNAPP)

Cloud-Native Application Protection Platform (CNAPP) is a security and compliance solution that helps teams build, deploy, and run secure cloud-native applications. CNAPPs provide centralized controls, threat detection, and incident response capabilities. They also help security teams collaborate more effectively with developers and DevOps. 

CNAPPs consolidate a large number of previously siloed capabilities, including: 

  • Container scanning
  • Cloud Security Posture Management (CSPM)
  • Cloud Service Network Security (CSNS)
  • Cloud Workload Protection Platform (CWPP)

CNAPPs provide complete end-to-end cloud security via a single holistic platform. They help teams identify hidden risks from misconfigurations, threats, and vulnerabilities.

Cloud Security Posture Management (CSPM)

  • Overview of CSPM: Cloud Security Posture Management (CSPM) is a set of IT security tools that enhance cloud security. They offer organizations automated visibility, continuous monitoring, and remediation workflows to assess, maintain, and improve security posture in the cloud.

  • Assessment & Alignment: CSPM ensures that an organization's cloud resources and configurations adhere to security best practices, industry standards, and regulatory requirements. It proactively identifies and addresses vulnerabilities, misconfigurations, and risks.

  • Functional Capabilities of CSPM Tools: These tools offer automated visibility, uninterrupted monitoring, threat detection, and remediation workflows. They're equipped to examine various cloud infrastructures, including SaaS, PaaS, IaaS, containers, and serverless code. A minimal example of such an automated check is sketched below.
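
As a concrete illustration, a CSPM-style check might flag cloud storage that lacks guardrails against public exposure. The sketch below uses AWS's boto3 SDK; treating a missing S3 public-access block as a finding is an assumption made for this example, not a full CSPM implementation.

```python
# Minimal sketch of a CSPM-style misconfiguration check (assumes AWS
# credentials are configured and the boto3 SDK is installed).
import boto3
from botocore.exceptions import ClientError

def buckets_missing_public_access_block():
    """Flag S3 buckets with no public-access block configuration at all."""
    s3 = boto3.client("s3")
    findings = []
    for bucket in s3.list_buckets()["Buckets"]:
        try:
            s3.get_public_access_block(Bucket=bucket["Name"])
        except ClientError as err:
            # Treating a missing configuration as a finding is an
            # assumption made for this illustration.
            if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
                findings.append(bucket["Name"])
            else:
                raise
    return findings

if __name__ == "__main__":
    for name in buckets_missing_public_access_block():
        print(f"Potential misconfiguration: {name} has no public access block")
```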

Cloud Service Network Security (CSNS)

CSNS solutions protect user traffic to and from cloud-based applications. They also secure network connectivity between cloud services. 

CSNS solutions include:

  • Web application firewalls
  • Intrusion detection systems
  • Deep packet inspection
  • Virtual private networks (VPNs)
  • Distributed denial-of-service protection
  • Load balancing
  • Transport Layer Security (TLS) inspection

Cloud Workload Protection Platform (CWPP)

A cloud workload protection platform (CWPP) is a cybersecurity solution that protects workloads in cloud and data center environments.

  • Agent-Based Monitoring: A CWPP typically relies on a software agent that runs on target machines in cloud and data center environments. This agent gathers security-related data and events to provide protection.
  • Versatile Security Controls: CWPPs offer security oversight and controls for various workloads, including physical machines, virtual machines, containers, and serverless architectures. They are designed to mitigate vulnerabilities resulting from poor cybersecurity practices and help prioritize high-risk issues.
  • Multi-Cloud and IaaS Compatibility: Primarily used for securing server workloads in public cloud Infrastructure as a Service (IaaS) environments, CWPPs enable cloud providers and customers to maintain the security of workloads as they pass through different domains.

Compliance Regulatory Standards

GDPR

GDPR stands for General Data Protection Regulation. It's the core of Europe's digital privacy legislation. The General Data Protection Regulation is a regulation of the European Union (EU) that became effective on May 25, 2018. It strengthens and builds on the EU's previous data protection framework, replacing the 1995 Data Protection Directive.

CCPA

The California Consumer Privacy Act (CCPA) is a law that allows any California consumer to demand to see all the information a company has saved on them, as well as a full list of all the third parties that data is shared with. The law also allows consumers to sue companies if its privacy guidelines are violated, even if there is no breach. In essence, the CCPA is about letting consumers know what data a company collects, when it is sold or shared for a business purpose, which third parties it is shared with, and what sources of personal data have been used.

PCI

PCI DSS is the global security standard for all entities that store, process or transmit cardholder data and/or sensitive authentication data. PCI DSS sets a baseline level of protection for consumers and helps reduce fraud and data breaches across the entire payment ecosystem. It is applicable to any organization that accepts or processes payment cards.

SOX

SOX compliance is an annual obligation derived from the Sarbanes-Oxley Act (SOX) that requires publicly traded companies doing business in the U.S. to establish financial reporting standards, including safeguarding data, tracking attempted breaches, logging electronic records for auditing, and proving compliance.

FERPA

The Family Educational Rights and Privacy Act (FERPA) is a federal law enacted in 1974 that protects the privacy of student education records. FERPA applies to any public or private elementary, secondary, or post-secondary school. It also applies to any state or local education agency that receives funds under an applicable program of the US Department of Education.

HIPAA

The Health Insurance Portability and Accountability Act (HIPAA) of 1996 is a set of regulatory standards intended to protect private and sensitive patient data held by hospitals, insurance companies, and healthcare providers.

Data Access Governance

Data Access Governance (frequently referred to as DAG) is a market segment that focuses on identifying and addressing the malicious and non-malicious threats that can come from unauthorized access to sensitive and valuable unstructured data.

Organizations look to Data Access Governance to:

  • Determine if sensitive and valuable files are being stored in secure locations
  • Identify who has access to these files
  • Correct and enforce access permissions (a minimal permission-audit sketch follows this list)
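
To make the first two goals concrete, here is a minimal sketch of a permission audit that walks a directory tree and flags world-readable files. The path "/srv/sensitive" and the idea that everything under it is sensitive are assumptions for illustration.

```python
# Minimal sketch of a Data Access Governance permission audit:
# flag world-readable files under a directory assumed to hold sensitive data.
import stat
from pathlib import Path

def find_world_readable(root):
    """Return files under `root` that anyone on the machine can read."""
    return [path for path in Path(root).rglob("*")
            if path.is_file() and path.stat().st_mode & stat.S_IROTH]

if __name__ == "__main__":
    # "/srv/sensitive" is a hypothetical location for this example.
    for f in find_world_readable("/srv/sensitive"):
        print(f"World-readable sensitive file: {f}")
```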

Data deployment

  • Data deployment refers to the process of moving, transferring, or releasing data from one environment to another.
  • This can involve various stages of data management, including data integration, data migration, and data synchronization.
  • Data deployment is a critical aspect of modern IT operations and is used to ensure that accurate and up-to-date data is available in the appropriate systems and environments.

Data exfiltration

Data exfiltration is sometimes referred to as data extrusion, data exportation, or data theft. All of these terms are used to describe the unauthorized transfer of data from a computer or other device.

Data flows

  • Data flows refer to the movement, transformation, and manipulation of data as it travels through various stages of a system, application, or business process.
  • A data flow illustrates how data moves from one point to another, how it is processed or modified along the way, and how it ultimately reaches its intended destination.
  • Data flows are essential for understanding how information is managed within an organization's technology ecosystem.

Data Governance

Data governance is the process of organizing, securing, managing, and presenting data using methods and technologies that ensure it remains correct, consistent, and accessible to verified users.

Data lakes

A data lake is a centralized repository that ingests and stores large volumes of data in its original form. The data can then be processed and used as a basis for a variety of analytic needs. Due to its open, scalable architecture, a data lake can accommodate all types of data from any source, from structured (database tables, Excel sheets) to semi-structured (XML files, webpages) to unstructured (images, audio files, tweets), all without sacrificing fidelity. The data files are typically stored in staged zones – raw, cleansed and curated – so that different types of users may use the data in its various forms to meet their needs. Data lakes provide core data consistency across a variety of applications, powering big data analytics, machine learning, predictive analytics and other forms of intelligent action.

Data lakes vs Data warehouses

While data lakes and data warehouses are similar in that they both store and process data, each has its specialties and, therefore, its use cases. That's why it's common for an enterprise-level organization to include a data lake and a data warehouse in its analytics ecosystem. Both repositories work together to form a secure, end-to-end system for storage, processing, and faster time to insight.

A data lake captures both relational and non-relational data from a variety of sources—business applications, mobile apps, IoT devices, social media, or streaming—without having to define the structure or schema of the data until it is read. Schema-on-read ensures that any type of data can be stored in its raw form. As a result, data lakes can hold a wide variety of data types, from structured to semi-structured to unstructured, at any scale. Their flexible and scalable nature makes them essential for performing complex forms of data analysis using different types of compute processing tools like Apache Spark or Azure Machine Learning.

By contrast, a data warehouse is relational in nature. The structure or schema is modeled or predefined by business and product requirements that are curated, conformed, and optimized for SQL query operations. While a data lake holds data of all structure types, including raw and unprocessed data, a data warehouse stores data that has been treated and transformed with a specific purpose in mind, which can then be used to source analytic or operational reporting. This makes data warehouses ideal for producing more standardized forms of BI analysis, or for serving a business use case that has already been defined.
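
The schema-on-read versus schema-on-write distinction can be shown in a few lines. In the sketch below, with invented sample records, raw JSON lines stand in for a data lake (structure is applied only when reading), while a SQLite table with a predefined schema stands in for a warehouse.

```python
# Minimal sketch of schema-on-read (data lake) vs. schema-on-write
# (data warehouse); the sample records are invented.
import json
import sqlite3

raw_events = [
    '{"user": "alice", "action": "login", "device": "mobile"}',
    '{"user": "bob", "action": "purchase", "amount": 42.0}',  # different fields
]

# Lake style: keep records raw; interpret the structure only at read time.
for line in raw_events:
    event = json.loads(line)  # schema-on-read
    print(event.get("user"), event.get("amount", "n/a"))

# Warehouse style: the schema is fixed before any data is written.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user TEXT, amount REAL)")  # schema-on-write
for line in raw_events:
    event = json.loads(line)
    if event.get("action") == "purchase":  # only conforming rows are loaded
        conn.execute("INSERT INTO purchases VALUES (?, ?)",
                     (event["user"], event["amount"]))
print(conn.execute("SELECT user, amount FROM purchases").fetchall())
```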

Data lineage

  • Data lineage refers to the visualization and documentation of the end-to-end journey of data as it moves through various stages of processing within a system or across different systems.
  • It provides a detailed view of the flow, transformations, and interactions that data undergoes from its origin to its final destination.
  • Data lineage helps organizations understand data movement, transformation processes, and relationships, which is crucial for data governance, compliance, troubleshooting, and decision-making.

Data locations

  • "Data locations" can refer to the physical or logical places where data is stored, managed, or processed within an organization's information technology infrastructure.
  • These locations could include physical data centers, cloud-based storage, databases, file systems, and more.
  • The concept of data locations is essential for understanding where data resides and how it's accessed and managed.

Data Maps

In the world of data privacy, data mapping is the process of inventorying the personal data in your business systems. This inventory is called a data map. An up-to-date data map is vital for compliance with modern data privacy regulations – like GDPR in the EU and CCPA in the US.

  • Data maps, produced through data mapping, are graphical or textual representations that illustrate the flow, transformation, and relationships of data between different systems, databases, applications, or components within a technology ecosystem.
  • Data mapping is fundamental to data integration, migration, and synchronization processes.
  • It helps ensure that data is accurately and appropriately transferred or transformed as it moves from one source to another.

Data mapping requires answers to basic questions, including the following (the sketch after the list shows one way to record such answers):

  • What personal data does my company collect?
  • When does my company erase this data?
  • Why does my company collect and process this data?
  • How does my company process this data?
  • Besides my company, who else receives this data?
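
The sketch below captures these answers as one structured entry per system, using an invented record layout; the field names and values are assumptions for illustration, not a standard data-map format.

```python
# Minimal sketch of one entry in a data map; the layout is invented.
from dataclasses import dataclass

@dataclass
class DataMapEntry:
    system: str                # where the data lives
    data_collected: list[str]  # what personal data is collected
    purpose: str               # why it is collected and processed
    processing: str            # how it is processed
    recipients: list[str]      # who else receives it
    retention: str             # when it is erased

crm_entry = DataMapEntry(
    system="CRM",  # hypothetical system name
    data_collected=["name", "email address"],
    purpose="customer support",
    processing="stored and queried by support staff",
    recipients=["email delivery vendor"],
    retention="deleted 24 months after last contact",
)
print(crm_entry)
```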

Data observability

  • Data observability, also known as dataOps observability, refers to the practice of monitoring and understanding the behavior, performance, and quality of data as it moves through various stages of data pipelines and systems.
  • Similar to the concept of observability in software systems, data observability focuses on gaining insights into the "black box" of data operations to ensure data reliability, accuracy, and compliance. Two common checks, freshness and volume, are sketched below.
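
A freshness check asks whether data arrived recently enough; a volume check asks whether roughly the expected number of rows arrived. The sketch below hard-codes thresholds and sample values as assumptions for illustration.

```python
# Minimal sketch of two data observability checks: freshness and volume.
# Thresholds and sample values below are assumptions for illustration.
from datetime import datetime, timedelta, timezone

def is_fresh(last_loaded_at, max_age=timedelta(hours=1)):
    """Freshness: has the table been updated recently enough?"""
    return datetime.now(timezone.utc) - last_loaded_at <= max_age

def volume_ok(row_count, expected=10_000, tolerance=0.2):
    """Volume: is today's row count within tolerance of the expected count?"""
    return abs(row_count - expected) <= expected * tolerance

last_load = datetime.now(timezone.utc) - timedelta(minutes=30)
print("fresh:", is_fresh(last_load))   # True: loaded 30 minutes ago
print("volume ok:", volume_ok(7_500))  # False: 25% below expectation
```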

Data pipelines

  • A data pipeline is a series of processes and tools used to extract, transform, and load (ETL) data from various sources and deliver it to its destination, typically a data storage or analysis platform; a minimal ETL sketch follows this list.
  • Data pipelines are a fundamental component of modern data architecture, allowing organizations to efficiently collect, manage, and utilize large volumes of data for various purposes, such as analytics, reporting, and machine learning.
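
The sketch below shows the three ETL stages in miniature: extracting records from an in-memory CSV source, transforming them, and loading them into SQLite. The source data and field names are invented for the example.

```python
# Minimal sketch of an extract-transform-load (ETL) pipeline.
# The source data and schema are invented for this example.
import csv
import io
import sqlite3

SOURCE = "name,signup_date\nAlice,2023-01-15\nBob,2023-02-20\n"

def extract(source_text):
    """Extract: read raw records from the source (an in-memory CSV here)."""
    return list(csv.DictReader(io.StringIO(source_text)))

def transform(rows):
    """Transform: normalize names and derive a signup year."""
    return [{"name": row["name"].strip().lower(),
             "signup_year": int(row["signup_date"][:4])} for row in rows]

def load(rows, conn):
    """Load: write transformed rows to the destination store."""
    conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, signup_year INTEGER)")
    conn.executemany("INSERT INTO users VALUES (:name, :signup_year)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE)), conn)
print(conn.execute("SELECT * FROM users").fetchall())
# [('alice', 2023), ('bob', 2023)]
```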

Data portfolio

  • A data portfolio, also known as a data asset portfolio, refers to the collection of all the data assets owned, managed, and utilized by an organization.
  • It encompasses a wide range of data types, including structured and unstructured data, from various sources and systems.
  • A data portfolio is a strategic view of an organization's data assets, outlining their characteristics, usage, value, and relationships.

Data Quality

Data quality is the degree to which data is accurate, complete, timely, and consistent with your business’s requirements. Here are some data quality dimensions (two of which are measured in the sketch below):

  • Compliance
  • Consistency
  • Integrity
  • Latency
  • Recoverability
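
As a small illustration, the sketch below measures completeness and consistency over invented sample records; the specific rules are assumptions for the example.

```python
# Minimal sketch of two data quality measurements over invented records.
records = [
    {"id": 1, "email": "a@example.com", "country": "US"},
    {"id": 2, "email": None, "country": "US"},               # incomplete
    {"id": 3, "email": "c@example.com", "country": "usa"},   # inconsistent
]

def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    return sum(1 for r in rows if r.get(field)) / len(rows)

def consistency(rows, field, allowed):
    """Share of rows whose value conforms to an agreed set of codes."""
    return sum(1 for r in rows if r.get(field) in allowed) / len(rows)

print(f"email completeness: {completeness(records, 'email'):.0%}")                  # 67%
print(f"country consistency: {consistency(records, 'country', {'US', 'CA'}):.0%}")  # 67%
```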

Data repositories

  • Data repositories are centralized storage locations or databases where data is stored, managed, and organized for easy access and retrieval.
  • These repositories provide a structured way to store data, making it available for various purposes such as analysis, reporting, application development, and more.
  • Data repositories play a crucial role in modern data management and are used by organizations to maintain data consistency, improve data quality, and enable efficient data sharing.

Data residency

  • Data residency refers to the physical or geographical location where data is stored, processed, or maintained.
  • It is a crucial consideration for organizations that handle sensitive or regulated data, as data residency often intersects with data privacy laws, compliance regulations, and security concerns.

Data Retention

Data retention, or record retention, is exactly what it sounds like — the practice of storing and managing data and records for a designated period of time. There are any number of reasons why a business might need to retain data: to maintain accurate financial records, to abide by local, state and federal laws, to comply with industry regulations, to ensure that information is easily accessible for eDiscovery and litigation purposes and so on. To fulfill these and other business requirements, it’s imperative that every organization develop and implement data retention policies.

Typically, a data retention policy will define the following (a sketch of applying such a policy in code appears after the list):

  • What data needs to be retained
  • The format in which it should be kept
  • How long it should be stored for
  • Whether it should eventually be archived or deleted
  • Who has the authority to dispose of it, and
  • What procedure to follow in the event of a policy violation
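
The sketch below encodes a toy retention policy and decides whether a record should be retained, archived, or deleted based on its age; the record categories and retention periods are invented for the example.

```python
# Minimal sketch of applying a data retention policy.
# Record categories and retention periods are invented for this example.
from datetime import date, timedelta

RETENTION_POLICY = {
    # category: (how long to keep, what to do afterwards)
    "financial_records": (timedelta(days=7 * 365), "archive"),
    "support_tickets": (timedelta(days=2 * 365), "delete"),
}

def retention_action(category, created_on, today=None):
    today = today or date.today()
    keep_for, end_of_life = RETENTION_POLICY[category]
    if today - created_on < keep_for:
        return "retain"
    return end_of_life

print(retention_action("support_tickets", date(2020, 1, 1)))    # "delete"
print(retention_action("financial_records", date(2023, 6, 1)))  # "retain"
```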

Disambiguated Users

Let's say there is a data store that is accessible only to User A and User B. Now suppose an application, such as a BI tool, can also access this data store using its own credentials, say BI_USER. When the BI tool queries the data store, the queries appear under the BI tool's username. But Users C and D, who have no direct access to the data store, may be querying it through the BI tool. Resolving which real person is behind each shared-credential query is called disambiguation, and C and D are the disambiguated users.
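
The sketch below illustrates one way such attribution might work: joining the data store's query log (which only sees the shared BI_USER credential) with the BI tool's own session log, which knows the human behind each session. The log formats and field names are invented for the example.

```python
# Minimal sketch of disambiguating end users behind a shared service account.
# Both logs and their fields are invented for this example.

# The data store only sees the BI tool's shared credential, plus a session id
# that the BI tool passes along (an assumption for this illustration).
datastore_queries = [
    {"login": "BI_USER", "session": "s1", "query": "SELECT * FROM sales"},
    {"login": "BI_USER", "session": "s2", "query": "SELECT * FROM salaries"},
]

# The BI tool's own audit log knows which human started each session.
bi_sessions = {"s1": "user_c", "s2": "user_d"}

def disambiguate(queries, sessions):
    """Attribute each shared-credential query to the real end user."""
    return [{**q, "actual_user": sessions.get(q["session"], "unknown")}
            for q in queries]

for q in disambiguate(datastore_queries, bi_sessions):
    print(f'{q["actual_user"]} ran: {q["query"]}')
```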

Data risk assessment (DRA)

  • Data risk assessment, also known as data risk analysis or data security risk assessment, is the process of evaluating potential risks and vulnerabilities related to an organization's data assets.
  • The goal of a data risk assessment is to identify and prioritize risks that could lead to data breaches, unauthorized access, data loss, or other security incidents.
  • By understanding these risks, organizations can develop strategies to mitigate them and protect their data assets.

Data security

  • Data security refers to the practice of protecting digital data from unauthorized access, use, disclosure, modification, or destruction.
  • It encompasses a range of measures and strategies designed to safeguard sensitive information and ensure that data remains confidential, available, and reliable.
  • Data security is crucial to prevent data breaches, protect individual privacy, maintain business continuity, and comply with data protection regulations.

Data Security Posture Management (DSPM)

  • Definition & Need: DSPM offers a practical approach to secure cloud data (structured and unstructured), ensuring sensitive and regulated data maintains the correct security posture, regardless of location.
  • Core Functions: DSPM helps organizations discover and classify cloud data, detect and alert on policy violations, prioritize these alerts, and offer remediation strategies.
  • Three-Step Process: DSPM operates on a three-fold principle: Find (locating and classifying data), Flag (identifying security risks), and Fix (remediating those risks), as sketched below.
  • Benefits: DSPM answers the pivotal cybersecurity question: “Where is my data?” providing visibility into data storage, access, usage, and security posture.
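
The Find, Flag, Fix loop can be sketched in a few lines. Everything below, from the toy data stores to the regex classifier and the placeholder remediation, is an invented stand-in meant only to show the shape of the process.

```python
# Minimal sketch of DSPM's Find / Flag / Fix loop.
# The data stores, classifier, and remediation are invented stand-ins.
import re

data_stores = {
    "s3://reports": ["quarterly totals", "board deck"],
    "s3://exports": ["ssn: 123-45-6789", "customer list"],  # fake SSN for demo
}

def find(stores):
    """Find: locate and classify sensitive data (SSN-like strings here)."""
    ssn = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
    return {loc: [r for r in records if ssn.search(r)]
            for loc, records in stores.items()}

def flag(findings):
    """Flag: turn classified data into prioritized policy violations."""
    return [{"location": loc, "severity": "high", "count": len(hits)}
            for loc, hits in findings.items() if hits]

def fix(alert):
    """Fix: apply a remediation (a placeholder action in this sketch)."""
    print(f"Restrict access to {alert['location']} ({alert['count']} record(s))")

for alert in flag(find(data_stores)):
    fix(alert)
```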

Dive deeper into DSPM by reading our white paper, "Empowering Data Security: DSPM and Beyond," today!

Data security governance (DSG)

  • Data security governance refers to the set of policies, processes, and controls that an organization establishes to manage and safeguard its data assets.
  • The goal of data security governance is to ensure that data is protected, maintained, and used in a secure and compliant manner.
  • It involves defining roles, responsibilities, and procedures to manage risks related to data breaches, unauthorized access, data loss, and other security threats.

Data security posture

  • A data security posture refers to an organization's overall approach, readiness, and effectiveness in safeguarding its data assets against potential security threats and risks.
  • It encompasses the combination of security measures, policies, practices, and technologies that an organization has in place to protect its data from unauthorized access, breaches, data loss, and other security incidents.
  • A strong data security posture reflects the organization's commitment to data protection and risk mitigation.

Data sets

  • A dataset is a collection of structured or unstructured data that is organized and stored in a specific format.
  • Datasets are used for various purposes, including analysis, research, training machine learning models, and generating insights.
  • They can contain data of different types, such as text, numbers, images, audio, and video.

Data sensitivity

  • Data sensitivity refers to the level of confidentiality and importance associated with a piece of data.
  • It indicates how critical or private the data is and determines the level of protection and access control that should be applied to it.
  • Understanding data sensitivity is crucial for implementing appropriate security measures and ensuring that data is handled, stored, and shared in a manner that aligns with its importance and confidentiality.

Data Sovereignty

Data sovereignty is not just about where the data is stored but also about the laws and regulations that govern the data at the location where it is physically stored. Under data sovereignty, the same data may be subject to different privacy and security regulations depending on the location of the data centers where it is stored.

Data Store Sprawl

Data store sprawl, or data sprawl, is the accumulation of vast amounts of data by organizations, to the point where they no longer know what data they have or what is happening with that data.

Data vulnerabilities

  • Data vulnerabilities refer to weaknesses or gaps in the security measures and practices that expose data to potential threats, breaches, unauthorized access, or loss.
  • These vulnerabilities can arise from various factors, such as software flaws, misconfigurations, human errors, and inadequate security controls.
  • Exploiting data vulnerabilities can lead to data breaches, compromise sensitive information, and result in financial, legal, and reputational damage for organizations.

Data warehouses

A data warehouse is a centralized database that stores huge amounts of business information and is accessible for analysis and decision-making.

In short: it's a database system on steroids for managing and storing data on a large scale, especially at the enterprise level.

In this system, data from different sources is extracted, transformed, and loaded into the data warehouse, making it accessible, easy to manage, and ready to be translated into useful information for various business and technological purposes.

Personally Identifiable Information (PII)

In short, both direct and quasi-identifiers refer to pieces of information that can be used to identify an individual, either by themselves or in combination with other readily available information.

Direct Identifiers

As the name suggests, direct personal identifiers are pieces of information that can be used to directly identify an individual. Examples of direct identifiers include:

- Name

- Social Security number

- Email address

- Credit card number

- Medical record number

Quasi-Identifiers

Quasi-identifiers, also known as indirect personal identifiers, are pieces of information that, when combined with other data, can identify an individual. Examples of quasi-identifiers include the following (a short identifier-scanning sketch follows the examples):

- Date of birth

- Gender

- Zip code

- Occupation

- Medical diagnosis

- Approximate location
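
As a small illustration, the sketch below scans text for two direct identifiers, email addresses and US Social Security numbers, using deliberately simplified regular expressions; production PII scanners are far more thorough, and quasi-identifiers generally cannot be found by pattern matching alone.

```python
# Minimal sketch of scanning text for two direct identifiers.
# The regular expressions are simplified assumptions, not production-grade.
import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_for_direct_identifiers(text):
    """Return all matches for each identifier pattern found in the text."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}

sample = "Contact alice@example.com; SSN on file: 123-45-6789 (fake, for demo)."
for label, hits in scan_for_direct_identifiers(sample).items():
    print(label, hits)
```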

Semi Structured Data

  • Semi-structured data is information that doesn't fit the relational model of structured data but still has some structure to it.
  • Semi-structured data often consists of documents held in JavaScript Object Notation (JSON) format. It also includes key-value stores and graph databases.

Shadow data

  • Shadow data, also known as "dark data," refers to data that is generated, collected, or stored by employees or within an organization's IT environment without the knowledge or control of the IT or security teams.
  • This data often exists outside the official data management and governance processes and can pose security, compliance, and operational risks.

Structured Data

  • Structured data is generally tabular data that is represented by columns and rows in a database.
  • Databases that hold tables in this form are called relational databases.
  • The mathematical term "relation" refers to a set of data organized as a table.
  • In structured data, every row in a table has the same set of columns.
  • SQL (Structured Query Language) is the programming language used to query structured data.

Unstructured Data

  • Unstructured data is information that is not organized in a pre-defined manner or does not have a pre-defined data model.
  • Unstructured information is typically text-heavy, but it may contain data such as numbers, dates, and facts as well.
  • Videos, audio, and binary data files might not have any specific structure; they are also classified as unstructured data.

Data localization

Data localization demands that all data generated within a country's borders remains within them. Unlike data residency and data sovereignty, data localization is consistently applied to the creation and storage of personal data.