All high-risk AI systems must be designed and developed to achieve an appropriate level of accuracy, robustness, and cybersecurity, and to perform consistently in those respects throughout their lifecycle.
The European Commission must take a proactive role in developing appropriate benchmarks and measurement methodologies for assessing the accuracy, robustness, and cybersecurity of high-risk AI systems, working closely with relevant stakeholders and organizations such as metrology and benchmarking authorities.
The levels of accuracy and the relevant accuracy metrics of high-risk AI systems should be declared in the accompanying instructions for use.
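As a loose illustration of what such a declaration could rest on, the sketch below computes two common accuracy metrics with scikit-learn on a synthetic dataset; the classifier, the data, and the choice of metrics are assumptions for this example only, as the text prescribes no particular metrics or tooling.

```python
# Minimal sketch: computing accuracy metrics of the kind that could be
# declared in the instructions for use. The model, data, and metric choices
# are hypothetical illustrations, not requirements from the text.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Values like these would accompany the declared metrics.
declared_metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(declared_metrics)
```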
High-risk AI systems shall be designed to be as resilient as possible to errors, faults, and inconsistencies that may occur within the system or the environment in which it operates, in particular as a result of interactions with natural persons or other systems.
The robustness of such systems may be achieved through technical redundancy solutions, such as backups and fail-safe plans.
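By way of illustration only, one common redundancy pattern wraps a primary model with a fail-safe fallback; every name in the sketch below is hypothetical, and this is one of many possible designs rather than anything mandated here.

```python
# Minimal sketch of a technical redundancy pattern: a primary model with a
# fail-safe fallback. All names are hypothetical illustrations.
from typing import Callable, Sequence

def predict_with_failsafe(
    primary: Callable[[Sequence[float]], float],
    fallback: Callable[[Sequence[float]], float],
    features: Sequence[float],
) -> float:
    """Return the primary model's prediction, degrading gracefully on error."""
    try:
        return primary(features)
    except Exception:
        # Fail-safe plan: fall back to a simpler, well-understood backup
        # rather than failing outright.
        return fallback(features)

# Hypothetical usage: a failing primary model and a conservative backup.
def primary_model(x: Sequence[float]) -> float:
    raise RuntimeError("primary model unavailable")

def backup_model(x: Sequence[float]) -> float:
    return 0.0  # conservative default decision

print(predict_with_failsafe(primary_model, backup_model, [1.0, 2.0]))
```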
Furthermore, high-risk AI systems that continue to learn after being placed on the market or put into service should be developed to minimize the risk of biased outputs influencing the input data used in future operations (‘feedback loops’). Any such feedback loops should be duly addressed with effective mitigation measures.
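As a loose illustration of one such mitigation, the sketch below gates model-generated outputs behind a human-review flag and a confidence threshold before they re-enter the training pool; the field names and the threshold are assumptions made for this example, not requirements from the text.

```python
# Minimal sketch of one possible feedback-loop mitigation: outputs of the
# deployed model are admitted into future training data only after human
# verification and a confidence gate. All fields and the threshold are
# hypothetical assumptions.
from dataclasses import dataclass

@dataclass
class CandidateRecord:
    features: list[float]
    model_label: int
    model_confidence: float
    human_verified: bool

def admit_to_training_pool(rec: CandidateRecord, min_conf: float = 0.9) -> bool:
    # Exclude self-labeled records unless independently verified, so biased
    # outputs do not silently reinforce themselves in future training runs.
    return rec.human_verified and rec.model_confidence >= min_conf

records = [
    CandidateRecord([0.1, 0.2], model_label=1, model_confidence=0.95, human_verified=True),
    CandidateRecord([0.3, 0.4], model_label=0, model_confidence=0.55, human_verified=False),
]
training_pool = [r for r in records if admit_to_training_pool(r)]
print(len(training_pool))  # only the verified, high-confidence record survives
```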
High-risk AI systems shall be resilient against attempts by unauthorized third parties to alter their use, outputs, or performance by exploiting system vulnerabilities.
Any technical solutions developed to address AI-specific vulnerabilities shall include, where appropriate, measures to prevent, detect, respond to, resolve, and control for: attacks that attempt to manipulate the training dataset (‘data poisoning’) or pre-trained components used in training (‘model poisoning’); inputs designed to cause the AI model to make a mistake (‘adversarial examples’ or ‘model evasion’); and confidentiality attacks or model flaws.
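To make the term ‘adversarial examples’ concrete, the sketch below perturbs an input to a toy logistic model along the gradient sign direction (the fast gradient sign method) until the prediction flips; the weights, input, and perturbation budget are illustrative assumptions, and real mitigations such as adversarial training or input sanitization go well beyond this.

```python
# Minimal sketch of an adversarial example ('model evasion') against a toy
# logistic model, via the fast gradient sign method (FGSM). Weights, input,
# and epsilon are illustrative assumptions; defenses are out of scope here.
import numpy as np

w = np.array([2.0, -1.5])   # hypothetical model weights
b = 0.1                     # hypothetical bias

def predict_proba(x: np.ndarray) -> float:
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

x = np.array([0.4, 0.3])    # a benign input, classified positive
eps = 0.25                  # perturbation budget

# FGSM step: the gradient of the positive-class score w.r.t. x is
# proportional to w, so stepping along -sign(w) lowers that score and
# pushes the prediction toward the negative class.
x_adv = x - eps * np.sign(w)

print(predict_proba(x))     # > 0.5: original prediction
print(predict_proba(x_adv)) # < 0.5: flipped by a small perturbation
```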