
Unstructured Data, GenAI, and Regulatory Compliance

Author

Jack Berkowitz

Chief Data Officer at Securiti

For most enterprises, unstructured data is difficult to manage, govern, and secure. The sheer volume and variety of unstructured data sources, from text-heavy emails and documents to photo, video, and social media files, complicate governance and security teams’ efforts to enforce consistent policies. Uncontrolled access and sharing obscure data provenance, and inconsistent formats make it difficult to manage unstructured data uniformly.

Partly because the tools and technologies built to manage unstructured data have proven less effective than those for structured data, many enterprises have deprioritized the proper management of unstructured data, and some struggle to even identify where it lives throughout their organization.

The risks and opportunities behind unstructured data

Unstructured data poses increased cybersecurity risks and challenges compared to structured data, primarily due to its lack of organization and the potential presence of sensitive information hidden within its content. A range of threats exist, including data breach and exposure, insider threats, shadow data, unclear or inconsistent data classification, data sprawl, and delivery of ransomware or malware.

At the same time, unstructured data is the primary input for most GenAI systems, whose models are trained on massive amounts of unstructured text. This data enables them to develop (in effect, “learn”) the natural language capabilities and contextual understanding they need to generate human-like output. With the rise of GenAI and the opportunity it represents for enterprises, unstructured data, along with the challenges that beset it, is very much in the spotlight.

To mitigate these risks, organizations must implement robust data governance strategies, access controls, encryption, data loss prevention (DLP) tools, and employee training programs tailored to the unique challenges of unstructured data. Continuous monitoring, auditing, and the adoption of advanced technologies like machine learning and data classification can also help organizations better identify, protect, and manage their unstructured data assets. Combining risk identification, threat detection, and identity and entitlement management with data classification gives organizations a holistic way to classify, manage, and protect their data, and helps prevent data loss or theft.

The regulatory landscape around unstructured data and GenAI

Regulatory requirements around GenAI systems are already here, and they’re rapidly expanding in number and scope. In March 2024, the European Parliament approved the EU AI Act, the most comprehensive set of regulations to date on the use of AI by businesses, adding to the established regulations that already cover unstructured data.

Existing regulations around unstructured data

Even before the AI Act, which specifically covers generative artificial intelligence, privacy and security regulations had increasingly recognized the importance of protecting unstructured data, as it often contains sensitive and personal information. Key regulations that increase the scrutiny of unstructured data include:

  1. General Data Protection Regulation (GDPR): The GDPR, which applies to the European Union, defines personal data broadly, including unstructured data such as emails, documents, and multimedia files that can directly or indirectly identify an individual.
  2. California Consumer Privacy Act (CCPA): The CCPA defines personal information to include unstructured data like emails, text messages, and photos. Businesses must disclose the categories of personal information they collect, including unstructured data, and provide consumers with the right to access, delete, and opt-out of the sale of their personal information.
  3. Health Insurance Portability and Accountability Act (HIPAA): HIPAA in the United States requires covered entities to implement safeguards to protect the confidentiality, integrity, and availability of electronic protected health information (ePHI), which can include unstructured data such as medical records, physician notes, and diagnostic images.
  4. Payment Card Industry Data Security Standard (PCI DSS): PCI DSS requires merchants and service providers to protect cardholder data, which can include unstructured data like customer emails, call recordings, and scanned documents containing payment card information.
  5. Sarbanes-Oxley Act (SOX): SOX in the United States requires public companies to maintain and protect corporate records, including unstructured data such as emails, memos, and financial reports, for auditing purposes.
  6. NIS2 and DORA: In the EU, the NIS2 Directive (with a national transposition deadline of October 2024) and the Digital Operational Resilience Act, or DORA (applying from January 2025), cover a wide range of business and financial information, including unstructured data, and raise the bar for the security techniques required (e.g., anomaly detection, identity management, and vulnerability and threat reporting).
  7. SEC cybersecurity disclosure rules: In the US, the SEC’s cybersecurity incident disclosure requirements (in effect since December 18, 2023) apply to most US public companies, and to smaller reporting companies starting June 2024. They require a determination of materiality to be made “without unreasonable delay” following a cyber incident. If the incident is determined to be material, a disclosure must be made to the SEC within four business days of that determination. This applies to both unstructured and structured data. Meeting these tight reporting timelines requires active threat detection and reporting, because this information is key to determining whether an incident is material and to justifying that determination with evidence.
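The SEC reporting window described above lends itself to a small worked example. The sketch below assumes the four-day window is counted in business days from the date materiality is determined, and that only weekends are skipped. Public holidays are not handled, and the function name and example date are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumption: only Saturdays and Sundays are skipped.
    Public holidays are NOT taken into account.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# Hypothetical example: materiality determined on Thursday, 7 March 2024.
determined_on = date(2024, 3, 7)
deadline = add_business_days(determined_on, 4)
print(deadline.isoformat())  # 2024-03-13, a Wednesday
```

A real compliance calendar would also need to account for holidays and for any delay provisions, so treat this only as an illustration of how short the window is.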

Many regulations emphasize the need for data discovery, classification, and protection measures to identify and secure sensitive unstructured data. Organizations must implement appropriate access controls, encryption, data loss prevention (DLP) tools, and other security measures to protect unstructured data containing personal, financial, or confidential information.
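As a rough illustration of what discovery and classification can look like at the smallest scale, the sketch below scans free text for a few sensitive data types. It uses simple regular expressions plus a Luhn checksum rather than the machine-learning techniques commercial tools rely on, and the pattern set, labels, and function names are all assumptions made for this example.

```python
import re
from typing import Dict, List

# Illustrative patterns only; production discovery tools use far richer detection.
PATTERNS: Dict[str, re.Pattern] = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def luhn_valid(candidate: str) -> bool:
    """Check a candidate card number with the Luhn checksum."""
    digits = [int(c) for c in candidate if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return len(digits) >= 13 and checksum % 10 == 0


def classify(text: str) -> List[str]:
    """Return the sorted labels of sensitive data types found in `text`."""
    found = []
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if label == "card_number":
            # Drop digit runs that fail the checksum to reduce false positives.
            matches = [m for m in matches if luhn_valid(m)]
        if matches:
            found.append(label)
    return sorted(found)


sample = "Contact jane@example.com about card 4111 1111 1111 1111."
print(classify(sample))  # ['card_number', 'email']
```

The labels returned here could then drive the access control, encryption, or DLP decisions the paragraph above describes.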

Where the new EU AI Act comes into play

Here are some key points from the EU AI Act that are relevant to unstructured data:

  1. Data governance: The AI Act emphasizes the importance of data governance and data management practices for AI systems.
  2. Data quality: The regulation requires that the training, validation, and testing data used for high-risk AI systems be relevant, sufficiently representative, and, to the best extent possible, free of errors and complete.
  3. Documentation: AI system providers are required to document their systems' characteristics, capabilities, and limitations, including information about the training data used.
  4. Data protection: The AI Act reinforces the need to comply with existing data protection regulations, such as the GDPR, when processing personal data for AI systems. This is particularly relevant for unstructured data that may contain personal information.
  5. Bias and discrimination: The act aims to mitigate the risks of bias and discrimination in AI systems, which could arise from biases present in the training data, including unstructured data sources.

What should companies do to ensure compliance?

To effectively manage their unstructured data, companies should implement the following strategies:

  1. Discover and classify unstructured data: Identify unstructured data assets (including documents, emails, and multimedia files) across the organization, and use machine-learning data discovery tools to automatically categorize them by sensitivity, content, purpose, and more.
  2. Establish a data governance framework: Manage unstructured data throughout its lifecycle with a comprehensive framework that defines policies, roles, and responsibilities — and includes rules for data creation, storage, access, retention, and disposal.
  3. Implement metadata management practices: Enrich unstructured data with contextual information, such as data owners, access permissions, retention periods, and so on.
  4. Apply access controls and data security: Protect sensitive unstructured data from unauthorized access, data breaches, or accidental exposure by establishing and implementing appropriate measures for access controls, encryption, and data loss prevention (DLP).
  5. Manage the entire data lifecycle: Define and enforce policies for data retention, archiving, and disposal. Ensure regulatory compliance and minimize data storage costs by automating processes for managing data lifecycle stages.
  6. Integrate cloud and on-prem: Manage unstructured data across cloud and on-premises environments, ensuring consistent governance, security, and compliance across hybrid infrastructure.
  7. Enable continuous monitoring and auditing: Implement processes to track data access, usage, and potential data leakage or misuse.
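Several of the strategies above (metadata management, access controls, and lifecycle policies) come down to attaching a few governed fields to each file and acting on them. The following is a minimal sketch of that idea. The record fields, sensitivity labels, paths, and retention periods are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List


@dataclass
class FileRecord:
    """Hypothetical metadata record for one unstructured file."""
    path: str
    owner: str
    sensitivity: str  # e.g. "public", "internal", "restricted"
    created: date
    retention_days: int
    allowed_groups: List[str] = field(default_factory=list)

    def disposal_date(self) -> date:
        """Date on which the retention period ends."""
        return self.created + timedelta(days=self.retention_days)

    def is_due_for_disposal(self, today: date) -> bool:
        return today >= self.disposal_date()

    def can_access(self, group: str) -> bool:
        """Public files are open to all; others need an explicit group grant."""
        return self.sensitivity == "public" or group in self.allowed_groups


def due_for_disposal(records: List[FileRecord], today: date) -> List[str]:
    """Return the paths of records whose retention period has ended."""
    return [r.path for r in records if r.is_due_for_disposal(today)]


records = [
    FileRecord("/shares/hr/old_contract.pdf", "hr-team", "restricted",
               date(2023, 1, 1), 365, ["hr"]),
    FileRecord("/shares/finance/q1_report.docx", "finance-team", "internal",
               date(2024, 1, 1), 730, ["finance", "audit"]),
]

print(due_for_disposal(records, date(2024, 6, 1)))
# ['/shares/hr/old_contract.pdf']
```

In practice these fields would be populated by the discovery and classification step and enforced by the storage platform itself, with each access check and disposal action logged to support the monitoring and auditing described in the last item.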

Today, Securiti and Lacework offer a combined solution to give companies the end-to-end visibility into their multicloud and on-prem environments that they need to govern and protect unstructured data at scale — and achieve regulatory compliance now and in the future. With the ability to prioritize risk based on Lacework security findings and determine the sensitivity of data with the Securiti Data Command Center, companies can identify their highest priority risks, focus on high-impact threats, intelligently prioritize and remediate data, protect sensitive information at scale, and properly manage their unstructured data environment. Learn more about the partnership to see what our combined solution can do for your org.
