Securiti leads GigaOm's DSPM Vendor Evaluation with top ratings across technical capabilities & business value.

View

AB 1008: California’s Move to Regulate AI and Personal Data

Contributors

Anas Baig

Product Marketing Manager at Securiti

Sadaf Ayub Choudary

Data Privacy Analyst at Securiti

CIPP/US

Omer Imran Malik

Data Privacy Legal Manager, Securiti

FIP, CIPT, CIPM, CIPP/US

Listen to the content

This post is also available in: Brazilian Portuguese

As artificial intelligence (AI) continues revolutionizing industries, concerns about data privacy are becoming increasingly critical, especially as AI systems increasingly rely on vast datasets often containing personal information. In a major step toward regulating AI, the California Senate passed Assembly Bill 1008 (AB 1008) on August 30, 2024, which was subsequently signed into law by the Governor on September 28, 2024.

This law expands the definition of personal information under the California Privacy Rights Act (CPRA) to include a wide array of formats, including AI models. Thus, it broadens the scope of privacy protections to data utilized by automated systems and machine learning models or large language models (LLMs).

It will impose new requirements on businesses using AI, significantly altering the governance and management of AI models, particularly those trained on personal information.

In this blog, we will explore the key provisions of AB 1008, what they mean for AI developers and users, their broader implications for data privacy and compliance, and how Securiti Genstack AI automation enables enterprises to ensure swift compliance with AI regulations.

Understanding AB 1008

California's AB 1008 introduces additional privacy law obligations for AI systems trained on personal information. This bill ensures that AI models, particularly LLMs, comply with the CPRA by expanding the scope of privacy law to include personal information processed within these systems. Below are the key changes introduced by AB 1008:

Expanded Definition of “Personal Information”

AB 1008 revises the definition of "personal information" under the CPRA to include "abstract digital formats," such as data processed in AI systems capable of outputting personal information. This includes model weights, tokens, or other outputs, derived from a consumer’s personal information that could produce an output linked to that individual.

This change significantly impacts AI systems, particularly LLMs, trained on personal information, by subjecting them to CPRA obligations as those dealing with conventional forms of personal information.

Biometric Data Protection

AB 1008 clarifies that biometric data- including fingerprints, facial recognition data, and iris scans- collected without a consumer’s knowledge is not considered publicly available information (which is exempted under CPRA) and must be treated as personal information under CPRA.

This is especially important for businesses using AI systems for facial recognition, voice analysis, or other biometric data processing. Even if collected in public, such data remains protected under the CPRA, requiring businesses to comply with privacy regulations, including obtaining consent and respecting consumers' data rights.

Consumer Rights Over AI Models

Following AB 1008, the business obligations regarding CPRA will continue beyond a model’s training phase. Even after their personal information has been used to train a machine-learning model, consumers have the right to access, delete, correct, and restrict the sale or sharing of personal data contained within the trained AI system as tokens or model weights.

Neural Data as Sensitive Personal Information

SB 1223, passed alongside AB 1008, introduces neural data as a category of sensitive personal information. Neural data refers to information generated from measuring a consumer’s central or peripheral nervous system activity. This means that AI models utilizing neural data will be subject to even stricter data protection obligations under the CPRA.

Implications for AI Developers and Companies

AB 1008 poses several challenges for AI developers and businesses who rely on AI models trained on personal information:

Cost of Compliance

It may be costly and time-consuming to retrain AI models after each consumer data request, particularly for enterprises that handle big volumes of data, such as LLMS created by Google, OpenAI, and other digital behemoths, which could need expensive and time-consuming retraining cycles.

Technical Feasibility

Organizations are required to respond to data subject requests within 90 days, which can pose significant operational challenges. While retraining smaller models within this timeframe may be feasible, meeting these requirements for large language models is much more difficult. This presents serious technological hurdles as the retraining process for LLMs requires extensive computing resources, specialized hardware, and time.

Operational Challenges in Data Management

Managing DSR requests to access, delete, or correct personal data in AI systems introduces significant operational complexity. Businesses will need to track the flow of personal information from collection to AI model outputs. Keeping track of which personal information is used in training datasets and AI outputs can be difficult, especially in complex systems involving third-party data providers, vendors, or multiple data sources.

Data Integrity

Another technological challenge is ensuring an AI model retains its performance and integrity while honoring DSR requests. Deleting or correcting individual data points may affect an AI system's general accuracy and behavior, which might reduce the system's overall effectiveness.

Data Privacy as a Design Consideration

AB 1008 will probably require AI developers and companies to include data privacy protections from the very beginning of AI development. The importance of privacy-by-design strategies—which remove or anonymize personal information from training datasets—will only grow, and they might result in non-personalized AI outputs.

Do AI Models Contain Personal Information?

The question of whether AI models contain personal information has already sparked debate among regulators. Currently European authorities are divided about this issue.

For instance, the Hamburg Data Protection Authority (DPA) has adopted a different stance, maintaining that LLMs do not contain personal data and are, therefore, not subject to data subject rights such as deletion or correction.

This position contrasts with California’s stance under AB 1008 which treats AI models as potential repositories of personal information, thereby subjecting them to consumer privacy rights and regulatory obligations. This stance was solidified when the California Privacy Protection Agency (CPPA) voted to support the bill, following a staff position paper that emphasized the need to regulate AI models under existing privacy laws.

This discrepancy between California and European perspectives may make compliance more challenging for international companies. Organizations must implement adaptable and dynamic data management practices that comply with local regulations to successfully navigate these diverse regulatory landscapes.

Possible Solution: Sanitizing the AI Data Pipeline

Following the enactment of AB 1008, businesses must take proactive measures to navigate the compliance complexities it introduces. One effective strategy is sanitizing the AI data pipeline during the training phase, ensuring that personal information is not used to train AI models in the first place. This approach could avoid the need for costly retraining in response to consumer requests.

For example, businesses can adopt data anonymization, de-identification, or synthetic data generation techniques that allow them to train AI models without personal information. This would not only ensure compliance with AB 1008 but also reduce the costs and operational challenges associated with retraining models.

Accelerate AI Compliance with Securiti Genstack AI

Despite the challenges presented by AB 1008, the main challenge in scaling enterprise generative AI systems is securely connecting to diverse data systems while maintaining controls and governance throughout the AI pipeline.

Large enterprises orchestrating GenAI systems face several challenges: securely processing extensive structured and unstructured datasets, safeguarding data privacy, managing sensitive information, protecting GenAI models from threats like data poisoning and prompt injection, and performing these operations at scale.

Securiti’s Genstack AI Suite removes the complexities and risks inherent in the GenAI lifecycle, empowering organizations to swiftly and safely utilize their structured and unstructured data anywhere with any AI and LLMs. It provides features such as secure data ingestion and extraction, data masking, anonymization, redaction, and indexing and retrieval capabilities.

Additionally, it facilitates the configuration of LLMs for Q&A, inline data controls for governance, privacy, and security, and LLM firewalls to enable the safe adoption of GenAI.

Securiti’s Genstack AI enables organizations to:

  • Streamlines data connectivity: Genstack AI simplifies the connection to hundreds of data systems (unstructured and structured data), ensuring seamless integration across diverse data environments, including public, private, SaaS, and data clouds.
  • Accelerates AI pipeline development: Enables faster construction of generative AI pipelines by supporting popular Vector databases (DBs), large language models (LLMs), and AI prompt interfaces.
  • Secure deployment: Facilitates the secure deployment of enterprise-grade generative AI systems by maintaining data governance, security, and compliance controls throughout the AI pipeline.
  • Comprehensive and flexible solution: Genstack AI offers multiple components that can be used collectively for end-to-end enterprise retrieval-augmented generation (RAG) systems or individually for various AI use cases.
  • Enterprise-grade AI: Designed specifically to meet the needs of enterprises, ensuring that generative AI systems are safe, scalable, and compliant with industry regulations.
  • Data Sanitization: Classification and redaction of sensitive data on the fly, ensuring data privacy and compliance policies are properly enforced before data is fed to the AI models.
  • Data Vectorization and Integration: Turn data into custom embeddings with associated metadata and load them to your chosen vector database, making your enterprise data ready for LLM to use.
  • LLM Model Selection: Select from a wide range of vector databases and LLM models to build an AI system that aligns with your business goals and operational requirements for a specific use case.
  • LLM Firewalls: Protect AI interactions, including prompts, responses, and data retrievals with context-aware LLM firewalls. Custom and pre-configured policies block malicious attacks, prevent sensitive data leaks, ensure your AI systems align with corporate policies, and preserve access entitlements to documents and files.

Securiti Genstack AI enables organizations to accelerate their transition from generative AI POCs to production by ensuring the safe use of enterprise data, alignment with corporate policies, compliance with evolving AI laws, and continuous monitoring and enforcement of guardrails.

What’s Next for AB 1008?

With AB 1008 now signed into law, the potential implications on businesses using AI are obvious: compliance will become more complex, and the cost of managing AI systems trained on personal data could rise significantly. Companies operating in California will need to rethink their data strategies, prioritizing privacy-first approaches and adopting technologies that allow for easy removal, correction, and management of personal information in AI models. This law could set a precedent for other states or countries, pushing the global conversation on how AI systems handle personal data.

Join Our Newsletter

Get all the latest information, law updates and more delivered to your inbox


Share

More Stories that May Interest You
Videos
View More
Mitigating OWASP Top 10 for LLM Applications 2025
Generative AI (GenAI) has transformed how enterprises operate, scale, and grow. There’s an AI application for every purpose, from increasing employee productivity to streamlining...
View More
Top 6 DSPM Use Cases
With the advent of Generative AI (GenAI), data has become more dynamic. New data is generated faster than ever, transmitted to various systems, applications,...
View More
Colorado Privacy Act (CPA)
What is the Colorado Privacy Act? The CPA is a comprehensive privacy law signed on July 7, 2021. It established new standards for personal...
View More
Securiti for Copilot in SaaS
Accelerate Copilot Adoption Securely & Confidently Organizations are eager to adopt Microsoft 365 Copilot for increased productivity and efficiency. However, security concerns like data...
View More
Top 10 Considerations for Safely Using Unstructured Data with GenAI
A staggering 90% of an organization's data is unstructured. This data is rapidly being used to fuel GenAI applications like chatbots and AI search....
View More
Gencore AI: Building Safe, Enterprise-grade AI Systems in Minutes
As enterprises adopt generative AI, data and AI teams face numerous hurdles: securely connecting unstructured and structured data sources, maintaining proper controls and governance,...
View More
Navigating CPRA: Key Insights for Businesses
What is CPRA? The California Privacy Rights Act (CPRA) is California's state legislation aimed at protecting residents' digital privacy. It became effective on January...
View More
Navigating the Shift: Transitioning to PCI DSS v4.0
What is PCI DSS? PCI DSS (Payment Card Industry Data Security Standard) is a set of security standards to ensure safe processing, storage, and...
View More
Securing Data+AI : Playbook for Trust, Risk, and Security Management (TRiSM)
AI's growing security risks have 48% of global CISOs alarmed. Join this keynote to learn about a practical playbook for enabling AI Trust, Risk,...
AWS Startup Showcase Cybersecurity Governance With Generative AI View More
AWS Startup Showcase Cybersecurity Governance With Generative AI
Balancing Innovation and Governance with Generative AI Generative AI has the potential to disrupt all aspects of business, with powerful new capabilities. However, with...

Spotlight Talks

Spotlight 11:29
Not Hype — Dye & Durham’s Analytics Head Shows What AI at Work Really Looks Like
Not Hype — Dye & Durham’s Analytics Head Shows What AI at Work Really Looks Like
Watch Now View
Spotlight 11:18
Rewiring Real Estate Finance — How Walker & Dunlop Is Giving Its $135B Portfolio a Data-First Refresh
Watch Now View
Spotlight 13:38
Accelerating Miracles — How Sanofi is Embedding AI to Significantly Reduce Drug Development Timelines
Sanofi Thumbnail
Watch Now View
Spotlight 10:35
There’s Been a Material Shift in the Data Center of Gravity
Watch Now View
Spotlight 14:21
AI Governance Is Much More than Technology Risk Mitigation
AI Governance Is Much More than Technology Risk Mitigation
Watch Now View
Spotlight 12:!3
You Can’t Build Pipelines, Warehouses, or AI Platforms Without Business Knowledge
Watch Now View
Spotlight 47:42
Cybersecurity – Where Leaders are Buying, Building, and Partnering
Rehan Jalil
Watch Now View
Spotlight 27:29
Building Safe AI with Databricks and Gencore
Rehan Jalil
Watch Now View
Spotlight 46:02
Building Safe Enterprise AI: A Practical Roadmap
Watch Now View
Spotlight 13:32
Ensuring Solid Governance Is Like Squeezing Jello
Watch Now View
Latest
Shrink The Blast Radius: Automate Data Minimization with DSPM View More
Shrink The Blast Radius
Recently, DaVita disclosed a ransomware incident that ultimately impacted about 2.7 million people, and it’s already booked $13.5M in related costs this quarter. Healthcare...
Why I Joined Securiti View More
Why I Joined Securiti
I’m beyond excited to join Securiti.ai as a sales leader at this pivotal moment in their journey. The decision was clear, driven by three...
View More
EU Publishes Template for Public Summaries of AI Training Content
The EU released the Explanatory Notice and Template for the Public Summary of Training Content for General-Purpose AI (GPAI) Models. Learn more.
Decoding Saudi Arabia’s Cybersecurity Risk Management Framework View More
Decoding Saudi Arabia’s Cybersecurity Risk Management Framework
Discover the Kingdom of Saudi Arabia’s National Framework for Cybersecurity Risk Management by the NCA. Learn how TLP, risk assessment and proactive strategies protect...
View More
The Rise of AI in Financial Institutions: Realignment of Risk & Reward
Learn how AI is transforming financial institutions by reshaping risk management, regulatory compliance, and growth opportunities. Learn how organizations can realign risk and reward...
Redefining Data Privacy Careers in the Age of AI View More
Redefining Data Privacy Careers in the Age of AI
Securiti's whitepaper provides a detailed overview of the impact AI is poised to have on data privacy jobs and what it means for professionals...
7 Data Minimization Best Practices View More
7 Data Minimization Best Practices: A DSPM Powered Guide
Discover 7 core data minimization best practices in this DSPM-powered infographic checklist. Learn how to cut storage waste, automate discovery, detection and remediation.
Navigating the Minnesota Consumer Data Privacy Act (MCDPA) View More
Navigating the Minnesota Consumer Data Privacy Act (MCDPA): Key Details
Download the infographic to learn about the Minnesota Consumer Data Privacy Act (MCDPA) applicability, obligations, key features, definitions, exemptions, and penalties.
The DSPM Architect’s Handbook View More
The DSPM Architect’s Handbook: Building an Enterprise-Ready Data+AI Security Program
Get certified in DSPM. Learn to architect a DSPM solution, operationalize data and AI security, apply enterprise best practices, and enable secure AI adoption...
Gencore AI and Amazon Bedrock View More
Building Enterprise-Grade AI with Gencore AI and Amazon Bedrock
Learn how to build secure enterprise AI copilots with Amazon Bedrock models, protect AI interactions with LLM Firewalls, and apply OWASP Top 10 LLM...
What's
New