EU Publishes Template for Public Summaries of AI Training Content

Contributors

Anas Baig

Product Marketing Manager at Securiti

Rohma Fatima Qayyum

Associate Data Privacy Analyst at Securiti

Published September 2, 2025

Introduction

The EU AI Act is the first comprehensive regulation of its kind, laying down harmonized rules on artificial intelligence. It entered into force on August 1, 2024. Chapter V of the EU AI Act specifies the obligations of providers of general-purpose artificial intelligence (GPAI) models; these obligations took effect on August 2, 2025.

Article 53(1)(d) of the EU AI Act obligates providers of GPAI models to create and publish a detailed summary of the content used to train the model. This summary must follow a template provided by the AI Office. All providers of GPAI models, including those releasing models under free and open-source licenses, must fulfill this obligation.

To support this obligation, the European Commission released the Explanatory Notice and Template for the Public Summary of Training Content for General-Purpose AI (GPAI) Models on July 24, 2025.

What Is the Objective of the Summary

The summary aims to ensure transparency across the board concerning the data used for training GPAI models. With the template in place, GPAI model providers will now be required to release a consistent summary of the data that was used to train their models.

The summary will help various parties, especially copyright holders and data subjects, exercise and enforce their rights. Moreover, it will assist downstream providers in assessing data diversity to prevent bias, allow researchers to evaluate risks, and promote a more competitive market.

The summary must be comprehensive, covering data from all training stages, but it is not required to be overly technical. Providers are encouraged to voluntarily disclose more details to help copyright holders verify if their content was used for training.

What Must Be Disclosed

The European Commission’s template outlines three main sections:

1. General Information

This contains basic information such as the provider and their authorised representative’s name and contact details, versioned model name(s), model dependencies, date of placement on the EU market, etc.

In addition, it should specify the modalities present in the training data (text, image, video, and audio), the approximate size of each modality within broad ranges, and a description of the types of content included (e.g., fiction, press publications, photography, audiobooks, music videos).

2. List of Data Sources

To ensure the summary comprehensively reflects the content used for model training, this section requires disclosure of the primary data sources used to train the model, such as:

  • Large private or public databases,
  • A detailed narrative description of the data scraped online by the provider or on their behalf (including a summary of the most pertinent domain names scraped), and
  • A narrative description of all other data sources used (such as user data or synthetic data).

3. Relevant Data Processing Aspects

This mandates disclosure of the methods and steps undertaken to process the data before model training. This is particularly important for compliance with EU copyright and related-rights legislation (including respect for opt-outs under text and data mining rules), as well as for the removal of illegal content, minimizing the possibility that the GPAI model will replicate and distribute such content widely.
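To illustrate, the three sections above could be tracked internally as a structured record before being rendered into the official template. Below is a minimal sketch in Python; all field names and values are hypothetical placeholders and do not reproduce the Commission's actual template fields:

```python
# Hypothetical internal record mirroring the template's three sections.
# Field names are illustrative only; the official template defines the exact fields.
training_content_summary = {
    "general_information": {
        "provider_name": "Example AI GmbH",           # placeholder
        "model_names": ["example-model-v1"],
        "eu_market_placement_date": "2025-09-01",
        "modalities": {"text": "1-10 TB", "image": "100 GB - 1 TB"},  # wide ranges
        "content_types": ["press publications", "fiction"],
    },
    "data_sources": {
        "large_datasets": ["Example public research corpus"],
        "scraped_data_description": "Narrative description, incl. main domains scraped",
        "other_sources": ["synthetic data", "user data"],
    },
    "data_processing": {
        "tdm_optouts_respected": True,   # opt-outs under text and data mining rules
        "illegal_content_removal": "Description of filtering steps",
    },
}

# Simple completeness check: all three template sections must be present.
required_sections = ("general_information", "data_sources", "data_processing")
missing = [s for s in required_sections if s not in training_content_summary]
print("Complete" if not missing else f"Missing sections: {missing}")
```

A record like this makes it straightforward to regenerate the public summary whenever the training data changes.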

Who is Affected and When

All GPAI model providers, including providers of models with systemic risk, must publish their summaries before placing their models on the EU market. Providers of models made available under free and open-source licenses are also subject to this obligation.

The requirement to publish the training data summary takes effect on August 2, 2025. Providers of GPAI models placed on the market before that date must publish the summary no later than August 2, 2027. Providers must also explicitly identify and explain any information missing from the public summary where, despite their best efforts, that information is unavailable or would be unreasonably burdensome to retrieve.

The summary must be posted on the provider's official website in an easily readable format, explicitly identifying the model or models (and, where applicable, the model version or versions) that the summary covers. It is recommended that the summary also be made publicly accessible alongside the model through all of its public distribution channels, including online platforms.

Enforcement

Beginning August 2, 2026, the AI Office will supervise and enforce these rules for GPAI models. Providers must update their summaries at least every six months, or whenever there are material changes to the training data, such as model fine-tuning or additional training. Non-compliance can result in substantial penalties, with fines of up to 3% of a provider's total worldwide annual turnover or €15 million, whichever is higher.
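The penalty ceiling described above is simply the maximum of two figures. A quick arithmetic sketch (the turnover figures are hypothetical):

```python
def max_fine_eur(global_annual_turnover_eur: float) -> float:
    """Upper bound on fines for non-compliance: the higher of 3% of total
    worldwide annual turnover or EUR 15 million."""
    return max(0.03 * global_annual_turnover_eur, 15_000_000)

# For a provider with EUR 2 billion turnover, 3% (EUR 60M) exceeds EUR 15M:
print(max_fine_eur(2_000_000_000))   # 60000000.0
# For a provider with EUR 100 million turnover, the EUR 15M floor applies:
print(max_fine_eur(100_000_000))     # 15000000
```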

The Broader Outlook

Transparency is at the core of the EU AI Act, which imposes obligations on AI model providers, developers, deployers, distributors, and other relevant stakeholders.

The Explanatory Notice is a step in the right direction, giving the general public and copyright holders deeper insight into AI training data without requiring providers to disclose raw training data, business-critical trade secrets, or other sensitive information.

With the template in place, copyright holders can verify whether copyrighted material has been used to train an AI model. The template also establishes a common standard for internal documentation, providing greater transparency across teams and to relevant authorities.
