Assembly Bill 2013: Generative Artificial Intelligence: Training Data Transparency

Author: Sadaf Ayub Choudary, CIPP/US, Data Privacy Analyst at Securiti


California Assembly Bill 2013 (AB 2013) on Generative Artificial Intelligence: Training Data Transparency was signed into law on September 28, 2024, after the State Assembly and the State Senate approved it.

The law introduces transparency requirements for generative AI (GenAI) system developers. It mandates that developers publicly disclose information about the data used to train and test their GenAI models. GenAI systems and services used for purposes related to national security, military, or defense are exempt from such requirements.

The law addresses growing regulatory and public concerns around model bias, privacy, and other ethical accountability factors. To that end, it serves as a vital first step toward requiring developers to be more transparent about their development processes. The law helps Californians better understand how AI systems work while promoting responsible innovation.

Read on to learn about the law in greater detail.

Who Does the Law Apply To?

The law applies to developers of generative artificial intelligence (AI) systems or services, as well as entities that substantially modify such systems. The term "developer" includes any person, partnership, state or local government agency, or corporation that designs, codes, produces, or substantially modifies an AI system or service for use by members of the public. "Members of the public" does not include:

  • Affiliates: entities that, directly or indirectly, through one or more intermediaries, control, are controlled by, or are under common control with, another entity. This means the requirement to post public documentation under AB 2013 applies only when AI systems are made available outside an organization's internal or affiliated network.
  • Members of a hospital's medical staff.

The phrase “substantially modifies” means creating a new version, new release, or other update to a generative artificial intelligence system or service that materially changes its functionality or performance, including the results of retraining or fine-tuning.

What Does It Regulate?

The law regulates “generative artificial intelligence,” defined as “artificial intelligence that can generate derived synthetic content, such as text, images, video, and audio, that emulates the structure and characteristics of the artificial intelligence’s training data.” The law applies to systems or services released on or after January 1, 2022.

Obligations on Developers

Developers are required to post documentation about the training data on their public websites on or before January 1, 2026, and before each substantially modified version of a GenAI system or service is made publicly available. The documentation must include:

  • Sources or owners of the datasets.
  • A description of how the datasets align with the intended purpose of the AI system.
  • The number and types of data points in the datasets.
  • Whether the datasets contain copyrighted, trademarked, patented, or public domain information.
  • Whether the developer purchased or licensed the datasets.
  • Whether the datasets include “personal information” or “aggregate consumer information.”
  • Whether the developer cleaned, processed, or modified the datasets, and the intended purpose of those efforts in relation to the AI system or service.
  • The time period during which the data in the datasets was collected, including a notice if the data collection is ongoing.
  • Information about synthetic data generation, if used.
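
AB 2013 does not prescribe a format for this documentation, so one practical option is to keep the required disclosure fields in a structured, machine-readable record per dataset and generate the public page from it. The Python sketch below is a hypothetical illustration of that approach; the `DatasetDisclosure` class, its field names, and `render_public_documentation` are our own inventions mapped to the items listed above, not a schema defined by the statute.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import json

@dataclass
class DatasetDisclosure:
    """One dataset's entry in the public training-data documentation.

    Field names are illustrative only; AB 2013 lists what the disclosure
    must cover but does not mandate a schema or file format.
    """
    sources_or_owners: list[str]                  # sources or owners of the dataset
    purpose_alignment: str                        # how the dataset serves the system's intended purpose
    num_data_points: int                          # number of data points
    data_point_types: list[str]                   # types of data points (e.g., text, images, audio)
    contains_ip_protected_data: bool              # copyrighted, trademarked, or patented material
    contains_public_domain_data: bool
    purchased_or_licensed: bool
    contains_personal_information: bool
    contains_aggregate_consumer_information: bool
    cleaning_or_processing: Optional[str]         # what cleaning/processing was done and why
    collection_period: str                        # e.g., "2019-01 through 2023-06"
    collection_ongoing: bool                      # note whether collection is still ongoing
    synthetic_data_use: Optional[str]             # how synthetic data was generated or used, if at all

def render_public_documentation(datasets: list[DatasetDisclosure]) -> str:
    """Serialize all dataset disclosures for posting on the developer's public website."""
    return json.dumps([asdict(d) for d in datasets], indent=2)
```

Keeping the record in code or configuration also makes it straightforward to regenerate the posted documentation each time a system is retrained or substantially modified.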

Exemptions

Certain AI systems or services are exempt from the training data transparency requirements:

  • AI systems or services solely used for security and integrity purposes.
  • AI systems or services used for the operation of aircraft in the national airspace.
  • AI systems or services developed for national security, military, or defense purposes that are made available only to a federal entity.

Key Takeaway

Maintaining a data provenance record is crucial for compliance with Assembly Bill 2013, which mandates transparency regarding the datasets used to train generative AI systems. By accurately tracking datasets' origin, ownership, modifications, and usage, businesses can meet the law’s requirements to disclose how data supports AI functionality, whether it contains personal or sensitive information, and whether any synthetic data is used.
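
One way to put that into practice, assuming no particular tooling, is to log provenance events (collection, licensing, cleaning, training use) per dataset as they happen and roll the log up into the AB 2013 disclosure when a system or a substantial modification ships. The sketch below is a minimal, hypothetical illustration; `ProvenanceLog`, `ProvenanceEvent`, and the action names are assumptions made for illustration, not part of the statute or any specific product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenanceEvent:
    """A single recorded fact about a training dataset (names are illustrative)."""
    dataset_id: str
    action: str          # e.g., "collected", "licensed", "cleaned", "used_for_training"
    detail: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class ProvenanceLog:
    """Append-only record of what happened to each dataset over time."""
    events: list[ProvenanceEvent] = field(default_factory=list)

    def record(self, dataset_id: str, action: str, detail: str) -> None:
        self.events.append(ProvenanceEvent(dataset_id, action, detail))

    def history(self, dataset_id: str) -> list[ProvenanceEvent]:
        """Everything known about one dataset: the raw material for the public disclosure."""
        return [e for e in self.events if e.dataset_id == dataset_id]

# Example: track a licensed dataset through cleaning and training use.
log = ProvenanceLog()
log.record("support-tickets-2023", "licensed", "Purchased under a commercial license")
log.record("support-tickets-2023", "cleaned", "Removed customer names and emails (personal information)")
log.record("support-tickets-2023", "used_for_training", "Fine-tuning run for assistant model v2")

for event in log.history("support-tickets-2023"):
    print(event.timestamp.date(), event.action, "-", event.detail)
```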
