Summary of EDPB’s Opinion 28/2024 Concerning AI Models & Processing of Personal Data

Contributors

Rohma Fatima Qayyum

Assoc. Data Privacy Analyst

Muhammad Faisal Sattar

Data Privacy Legal Manager at Securiti

FIP, CIPT, CIPM, CIPP/Asia


Introduction

The primary duty of the European Data Protection Board (EDPB) is to ensure the consistent application of the General Data Protection Regulation (GDPR) throughout the European Economic Area (EEA). On 17 December 2024, the EDPB issued Opinion 28/2024 in response to a request from the Irish supervisory authority (IE SA) for an opinion concerning the processing of personal data in the context of the development and deployment phases of artificial intelligence (AI) models.

The opinion responds to the following four questions:

  1. When and how can an AI model be considered ‘anonymous?’
  2. How can controllers demonstrate the appropriateness of legitimate interest as a legal basis in the development phase of an AI model?
  3. How can controllers demonstrate the appropriateness of legitimate interest as a legal basis in the deployment phase of an AI model?
  4. What are the consequences of the unlawful processing of personal data in the development phase of an AI model on the subsequent processing or operation of the AI model?

Scope of the Opinion

The EDPB has issued this opinion without prejudice to the obligations of controllers and processors under the GDPR. However, the opinion does not clarify how it fits within the wider scheme of obligations under the EU AI Act. Under the accountability principle enshrined in Article 5(2) GDPR, controllers must be able to demonstrate compliance with all the principles relating to the processing of personal data. Most importantly, the opinion should be read with the understanding that the technologies underlying AI models evolve rapidly, so it may not address every possible scenario involving AI models and the processing of personal data. It is also important to note that AI systems are broader in scope than AI models: AI models are typically integrated into, and form part of, AI systems.

I. Provisions Not Addressed in the Opinion

This opinion does not examine the following provisions, which could still be significant in evaluating the data protection requirements applicable to AI models:

  1. Processing of special categories of data: Article 9(1) of the GDPR prohibits processing special categories of data unless exceptions under Article 9(2) apply. The Court of Justice of the European Union (CJEU) has clarified that even a single sensitive data item triggers this prohibition unless a valid derogation is met. For the Article 9(2)(e) exception that relates to the processing of special categories of personal data manifestly made public by the data subject, it must be verified that the data subject explicitly intended to make their data public. These principles are crucial when processing special category data in AI models.
  2. Automated-decision making, including profiling: The data processing operations carried out by AI models may fall under the scope of Article 22 of GDPR, which imposes additional obligations on controllers and provides additional safeguards to data subjects, as detailed in Working Party Guidelines on Automated individual decision-making and Profiling for the purposes of Regulation 2016/679.
  3. Compatibility of purposes: Controllers shall assess whether processing for a secondary purpose aligns with the purpose for which personal data was originally collected, as per Article 6(4) GDPR criteria. This provision may be relevant for AI model development and deployment and Supervisory Authorities (SAs) should assess its applicability accordingly.
  4. Data protection impact assessments (DPIAs): Under Article 35 GDPR, DPIAs are required when processing in the context of AI models is likely to pose high risks to the rights and freedoms of natural persons, reinforcing the element of accountability.
  5. Principle of data protection by design: SAs should evaluate compliance with Article 25(1) GDPR, ensuring AI models incorporate data protection by design principles.

II. AI Models in the Context of the Opinion

The EDPB understands that an AI system relies on an AI model to perform its intended objectives by incorporating the model into a larger framework. For example, an AI system for customer service might use an AI model trained on historical conversation data to respond to user queries. The AI models relevant to this opinion are those developed through a training process, in which the model learns from data to perform its intended tasks. The opinion covers only the subset of AI models that result from training on personal data.

Question 1: When and how can an AI model be considered ‘anonymous?’

AI models, whether trained with personal data or not, are designed to infer predictions or conclusions. Even if an AI model is not explicitly designed to produce personal data from its training dataset, it may in some cases be possible to use means reasonably likely to extract personal data from the model, or simply to obtain personal data accidentally through interactions with it. Such an AI model cannot be considered 'anonymous.' The EDPB considers that an AI model trained on personal data cannot automatically be regarded as anonymous in all cases; instead, the anonymity of an AI model should be assessed on a case-by-case basis.

For an AI model to be considered anonymous, it should be very unlikely:

  1. to directly or indirectly extract the personal data of individuals whose data was used to create the model;
  2. to extract such personal data from the model through queries.

In essence, for an AI model to be truly 'anonymous,' the extraction of personal data and the re-identification of the individuals concerned must be nearly impossible.

The SAs must carry out a thorough evaluation of the likelihood of identification to conclude on the anonymous nature of an AI model. In conducting this assessment, SAs should take into account 'all means reasonably likely to be used' by the controller or another person, as well as the unintended (re)use or disclosure of the AI model. The EDPB thereby establishes a high threshold for proving the anonymity of AI models. Looking ahead, demonstrating anonymity may become increasingly challenging, as controllers will have to show that the likelihood of identification through "all means" is negligible.

Questions 2 & 3: How can controllers demonstrate the appropriateness of legitimate interest as a legal basis in the development and deployment phases of an AI model?

In answering both questions, the EDPB builds on Guidelines 1/2024 on the processing of personal data based on Article 6(1)(f) GDPR and sets out the three-step legitimate interest assessment (LIA) to be carried out when evaluating the appropriateness of legitimate interest as a legal basis in the context of the development and deployment of AI models. The EDPB also highlights the significance of Article 5 GDPR, which sets out the principles relating to the processing of personal data; SAs should assess compliance with these principles when evaluating specific AI models.

To assess whether certain processing of personal data is based on legitimate interest, SAs should verify that the controllers meet the following three conditions:

  1. the pursuit of legitimate interest by the controller or by a third party;
  2. the processing is necessary to pursue the legitimate interest (‘necessity test’); and
  3. the legitimate interest is not overridden by the interests or fundamental rights and freedoms of the data subjects (‘balancing test’).

A. Pursuit of Legitimate Interest

The EDPB has further elaborated on the criteria that an interest must meet to be considered legitimate. It is important that:

  1. the interest is lawful;
  2. the interest is clearly and precisely articulated; and
  3. the interest is real and present, not speculative.

B. Necessity Test

Moreover, the assessment of necessity entails the following two elements:

  1. whether the processing activity will allow the pursuit of the purpose; and
  2. whether there is no less intrusive way of pursuing this purpose.

One way to apply the necessity test in the context of AI models is to analyse whether the processing purpose can be adequately pursued without processing personal data in the development phase of the AI model. The second step is to assess whether that purpose can be achieved with a smaller amount of personal data and by means that are less intrusive to the rights of data subjects, such as the implementation of technical safeguards to protect their personal data.

C. Balancing Test

The third step of LIA is the balancing test, where the rights and freedoms of data subjects have to be balanced against the interests of the controller or a third party. AI models can pose risks to the rights and freedoms of data subjects, such as scraping their personal data without their knowledge during the development phase and inferring personal data contained in the AI model’s training database during the AI model’s deployment phase.

Within the balancing test, the EDPB assigns significant weight to the reasonable expectations of data subjects. The EDPB states that if personal data is processed for a purpose other than those reasonably expected by data subjects at the time of collection, the fundamental rights of the data subjects could, in particular, override the interest of the controller. Reasonable expectations can also vary depending on various factors, such as whether the data subjects made their data publicly available and whether the data was obtained directly from them or from another source. Moreover, if the balancing test shows that the processing should not take place because of its negative impact on individuals, mitigating measures may limit that impact. The opinion includes a non-exhaustive list of such measures, including facilitating the exercise of data subject rights, such as the right to erasure and the right to opt out, and adopting technical measures to prevent the regurgitation of personal data by an AI model.

Question 4: What are the consequences of the unlawful processing of personal data in the development phase of an AI model on the subsequent processing or operation of the AI model?

Unlawful processing means processing personal data in a manner that does not comply with Article 5(1)(a) and Article 6 GDPR. Possible measures for SAs to remediate the unlawfulness of the initial processing include issuing a fine, imposing temporary limitations on the processing, erasing the part of the dataset that was unlawfully processed, or ordering the erasure of the whole dataset, having regard to the proportionality of the measure.

Scenario 1: A controller unlawfully processes personal data to develop the model, the personal data is retained in the model and is subsequently processed by the same controller

The SA has the power to impose corrective measures against unlawful processing and may order the controller to delete the unlawfully processed personal data. Such corrective measures would prevent the controller from carrying out subsequent processing of that personal data. For example, if the subsequent processing relies on Article 6(1)(f) GDPR (legitimate interests), the unlawfulness of the initial processing should influence the assessment, particularly regarding risks to data subjects and their reasonable expectations. Hence, unlawful development-phase processing can affect the legitimacy of subsequent processing activities.

Scenario 2: A controller unlawfully processes personal data to develop the model, the personal data is retained in the model and is processed by another controller in the context of the deployment of the model

In such a scenario, the SA should assess the lawfulness of the processing carried out by:

  1. the controller that originally developed the AI model; and
  2. the controller that acquired the AI model and processes the personal data by itself.

While making such an assessment, SAs should consider whether the controller has evaluated certain non-exhaustive criteria, such as the source of the data and whether the AI model results from an infringement of the GDPR. The depth of the controller's assessment and the level of detail expected by SAs may vary depending on diverse factors, including the type and degree of risks that processing in the AI model raises during its deployment for the data subjects whose data was used to develop the model.

Scenario 3: A controller unlawfully processes personal data to develop the model and ensures that the model is anonymised before the same or another controller initiates another processing of personal data in the context of the deployment

If the AI model’s subsequent operation does not involve processing personal data, the GDPR would not apply, and the initial unlawful processing would not affect the model's operation. However, claims of anonymity must be substantiated, and SAs must assess this on a case-by-case basis as guided by the EDPB's considerations.

If personal data is processed during the deployment phase after anonymization, the GDPR applies to these activities. In such cases, the lawfulness of the deployment-phase processing is not affected by the initial unlawful processing.


1. Personal data refers to any information relating to an identified or identifiable natural person.

2. AI models are essential components of AI systems, but they do not constitute AI systems on their own. AI models require the addition of further components, such as a user interface, to become AI systems.

3. AI system refers to a machine-based system that is designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment, and that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.

4. Training refers to the part of the development phase where AI models learn from data to perform their intended tasks.

5. Web scraping refers to a technique used for collecting information from publicly available online sources. 
