EU Publishes Template for Public Summaries of AI Training Content

Contributors

Anas Baig

Product Marketing Manager at Securiti

Rohma Fatima Qayyum

Associate Data Privacy Analyst at Securiti

Published September 2, 2025

Introduction

The EU AI Act, which entered into force on August 1, 2024, is the first comprehensive regulation laying down rules on artificial intelligence. Chapter V of the Act sets out the obligations of providers of general-purpose artificial intelligence (GPAI) models, which became applicable on August 2, 2025.

Article 53(1)(d) of the EU AI Act requires providers of GPAI models to draw up and publish a sufficiently detailed summary of the content used to train the model. This summary must follow a template provided by the AI Office. All providers of GPAI models, including providers of models released under free and open-source licenses, are required to fulfill this obligation.

To support this obligation, the European Commission released the Explanatory Notice and Template for the Public Summary of Training Content for General-Purpose AI (GPAI) Models on July 24, 2025.

What is the Objective of the Summary

The summary aims to ensure transparency about the data used to train GPAI models. With the template in place, GPAI model providers are now required to publish a consistent summary of the data used to train their models.

The summary will help various parties, especially copyright holders and data subjects, exercise and enforce their rights. Moreover, it will assist downstream providers in assessing data diversity to prevent bias, allow researchers to evaluate risks, and promote a more competitive market.

The summary must be comprehensive, covering data from all training stages, but it does not need to be technically detailed. Providers are encouraged to voluntarily disclose additional details to help copyright holders verify whether their content was used for training.

What Must Be Disclosed

The European Commission’s template outlines three main sections:

1. General Information

This section covers basic information such as the names and contact details of the provider and their authorised representative, the versioned model name(s), model dependencies, the date of placement on the EU market, and similar details.

It should also specify the modalities present in the training data (text, image, video, and audio), the size of each modality within broad ranges, and a description of the types of content included (e.g., fiction, press publications, photography, audiobooks, music videos).

2. List of Data Sources

To ensure the summary is complete with respect to the content used for model training, this section requires disclosure of the main data sources used to train the model, including:

  • Large private or public databases,
  • A detailed narrative description of the data scraped online by the provider or on their behalf (including a summary of the most pertinent domain names scraped), and
  • A narrative description of all other data sources used (such as user data or synthetic data).

3. Relevant Data Processing Aspects

This section requires disclosure of the methods and steps undertaken to process the data before model training. This is particularly important for compliance with EU copyright and related rights legislation (including respect for opt-outs under text and data mining rules), as well as for the removal of illegal content, minimizing the risk that the GPAI model could replicate and widely distribute such content.
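
For teams preparing these disclosures, the sketch below shows one way the three sections of the template might be captured as an internal, structured record before the public summary is drafted. The field names are illustrative assumptions based on the items listed above, not the official template's wording.

```python
# A minimal sketch (not the official template) of how a provider's compliance
# team might record the three sections of the public training-content summary.
# All field names are illustrative assumptions, not the Commission's wording.
from dataclasses import dataclass


@dataclass
class GeneralInformation:
    provider_name: str
    authorised_representative_contact: str
    model_names: list[str]            # versioned model name(s) covered
    model_dependencies: list[str]
    eu_market_placement_date: str     # e.g. "2025-08-02"
    modalities: dict[str, str]        # modality -> approximate size range
    content_types: list[str]          # e.g. fiction, press publications


@dataclass
class DataSources:
    major_datasets: list[str]         # large public or private databases
    scraping_description: str         # narrative incl. key domain names scraped
    other_sources_description: str    # e.g. user data, synthetic data


@dataclass
class DataProcessingAspects:
    copyright_compliance_measures: str          # incl. respect for TDM opt-outs
    illegal_content_removal_measures: str


@dataclass
class TrainingContentSummary:
    general: GeneralInformation
    sources: DataSources
    processing: DataProcessingAspects
```

Keeping the information in a structured form like this can make it easier to regenerate the published summary whenever the training data changes.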

Who is Affected and When

All GPAI model providers, including providers of GPAI models with systemic risk, must publish their summaries before placing their models on the EU market. Providers of models made available under free and open-source licenses are also subject to this obligation.

The requirement to publish the training content summary applies from August 2, 2025. Providers of GPAI models placed on the market before that date must publish the summary no later than August 2, 2027. Providers must also explicitly identify and explain any information missing from the public summary where, despite their best efforts, that information is unavailable or would be unreasonably burdensome to retrieve.

The summary must be posted on the provider's official website in an easily readable format, explicitly identifying the model or models (and, where applicable, the model version or versions) that it covers. It is recommended that the summary also be made publicly accessible alongside the model through all of its public distribution channels, including online platforms.

Enforcement

Beginning August 2, 2026, the AI Office will supervise and enforce the rules for GPAI models. Providers are required to update their summaries at least every six months, or whenever there is a material change to the training data, such as fine-tuning or additional training. Non-compliance can result in substantial penalties, with fines of up to 3% of a provider's total worldwide annual turnover or €15 million, whichever is higher.
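
As a quick illustration of the "whichever is higher" penalty ceiling, the sketch below applies it to a hypothetical turnover figure; the turnover value is invented for illustration only.

```python
# Illustrative only: upper bound of the fine for GPAI non-compliance,
# i.e., 3% of total worldwide annual turnover or EUR 15 million,
# whichever is higher.
def max_gpai_fine_eur(annual_turnover_eur: float) -> float:
    return max(0.03 * annual_turnover_eur, 15_000_000)


# Hypothetical provider with EUR 2 billion in worldwide annual turnover:
# 3% of 2,000,000,000 = 60,000,000 > 15,000,000, so the ceiling is EUR 60M.
print(f"{max_gpai_fine_eur(2_000_000_000):,.0f} EUR")  # 60,000,000 EUR
```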

The Broader Outlook

Transparency is at the core of the EU AI Act, which places obligations on AI model providers, developers, deployers, distributors, and other relevant stakeholders.

The Explanatory Notice is a step in the right direction, giving the general public and copyright holders deeper insight into AI training data without requiring providers to disclose raw training data, business-critical trade secrets, or other sensitive information.

With the template in place, copyright holders can better assess whether their copyrighted material has been used to train an AI model. The template also establishes a common standard for internal documentation, providing greater transparency across teams and to relevant authorities.
