EU AI ACT — GUIDE

Data governance & bias testing — Article 10 of the EU AI Act

Article 10 of Regulation (EU) 2024/1689 is the data backbone of the high-risk regime. It sits between Article 9 (risk management) and Article 11 (technical documentation) and decides what data may be used, under what governance, and how bias must be addressed. This page walks through the four operative paragraphs and the narrow Article 10(5) basis for processing special-category data to detect and correct bias.

Who Article 10 applies to

Article 10 applies to high-risk AI systems that make use of techniques involving the training of AI models with data. For high-risk AI systems developed without such techniques, paragraphs 2 to 5 apply only to the testing data sets (Article 10(6)). The primary obligated party is the provider. Under Article 25, however, a deployer that substantially modifies a system is treated as its provider and takes on these duties, and under Article 26(4) a deployer that exercises control over input data must ensure it is relevant and sufficiently representative in view of the intended purpose.

Article 10(2) — data-governance practices

Training, validation and testing data sets must be subject to data-governance and management practices appropriate for the intended purpose of the high-risk AI system. Those practices must concern in particular:

(a) the relevant design choices;
(b) data collection processes and the origin of data and, in the case of personal data, the original purpose of the data collection;
(c) relevant data-preparation processing operations, such as annotation, labelling, cleaning, updating, enrichment and aggregation;
(d) the formulation of assumptions, in particular with respect to the information that the data is supposed to measure and represent;
(e) an assessment of the availability, quantity and suitability of the data sets that are needed;
(f) examination in view of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations;
(g) appropriate measures to detect, prevent and mitigate possible biases identified under point (f);
(h) the identification of relevant data gaps or shortcomings that prevent compliance with the Regulation, and how those gaps and shortcomings can be addressed.
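
One way to keep these eight points auditable is a per-data-set record with one field per point, so that anything left blank surfaces as an open item. The sketch below is illustrative only: the class name GovernanceRecord, its field names, the open_items helper and the example data set name are assumptions made for this guide, not terms taken from the Regulation.

```python
from dataclasses import dataclass, fields


@dataclass
class GovernanceRecord:
    """One record per data set; each field maps to a point of Article 10(2).

    Field names are hypothetical labels chosen for this sketch.
    """
    dataset_name: str
    design_choices: str = ""           # point (a)
    collection_and_origin: str = ""    # point (b)
    preparation_operations: str = ""   # point (c)
    assumptions: str = ""              # point (d)
    availability_assessment: str = ""  # point (e)
    bias_examination: str = ""         # point (f)
    bias_measures: str = ""            # point (g)
    gaps_and_remedies: str = ""        # point (h)

    def open_items(self) -> list[str]:
        """Return the names of the Article 10(2) fields still left blank."""
        return [
            f.name
            for f in fields(self)
            if f.name != "dataset_name" and not getattr(self, f.name).strip()
        ]


record = GovernanceRecord(
    dataset_name="cv-screening-train-v3",  # made-up example name
    design_choices="Gradient-boosted ranking on structured CV features.",
    collection_and_origin="Applicant tracking export; original purpose: hiring.",
)
print(record.open_items())
```

Here the record reports six open items, from preparation_operations through gaps_and_remedies. A filled-in field shows only that something was written down; whether the content is adequate remains a matter for review.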

Article 10(3) — quality criteria

Training, validation and testing data sets must be relevant, sufficiently representative, and to the best extent possible, free of errors and complete in view of the intended purpose. They must have the appropriate statistical properties, including, where applicable, as regards the persons or groups of persons in relation to whom the high-risk AI system is intended to be used. Those characteristics of the data sets may be met at the level of individual data sets or at the level of a combination thereof.
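
Two of these criteria lend themselves to simple first-pass checks: representativeness can be screened by comparing group shares in the data set against a reference population, and completeness by measuring missing values per field. The following is a minimal sketch under stated assumptions. The function names, the reference shares and the 0.05 tolerance are all illustrative; Article 10(3) does not prescribe any numeric threshold.

```python
from collections import Counter


def share_gaps(groups, reference_shares, tolerance=0.05):
    """Compare each group's share in the data set with a reference share.

    Returns {group: observed_share - reference_share} for groups whose
    absolute gap exceeds `tolerance`. The tolerance is an illustrative
    choice, not a legal threshold.
    """
    counts = Counter(groups)
    total = len(groups)
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[group] = round(observed - expected, 3)
    return gaps


def missing_rate(rows, field):
    """Fraction of rows where `field` is absent or None (completeness check)."""
    missing = sum(1 for row in rows if row.get(field) is None)
    return missing / len(rows)


# Toy data: age bands of 10 training records vs. assumed population shares.
age_bands = ["18-34"] * 7 + ["35-54"] * 2 + ["55+"] * 1
reference = {"18-34": 0.35, "35-54": 0.40, "55+": 0.25}
print(share_gaps(age_bands, reference))

rows = [{"salary": 30000}, {"salary": None}, {}, {"salary": 42000}]
print(missing_rate(rows, "salary"))  # 0.5
```

In the toy data the youngest band is over-represented by 35 percentage points and the other two are under-represented. Because the criteria may be met by a combination of data sets, the same check can be run on the pooled data rather than on each set alone.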

Article 10(4) — context-relevant properties

Training, validation and testing data sets must take into account, to the extent required by the intended purpose, the characteristics or elements that are particular to the specific geographical, contextual, behavioural or functional setting within which the high-risk AI system is intended to be used.

Practically, a recruitment system trained predominantly on data from one labour market and deployed in another is a presumptive Article 10(4) compliance gap that the deployer should flag to the provider before purchase.
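
A deployer or provider can screen for this kind of gap by comparing the settings present in the training data with the settings where the system will run. The sketch below assumes each training record carries a setting label such as a country code; the function name and the 10% floor are illustrative assumptions, not values from the Regulation.

```python
from collections import Counter


def uncovered_settings(training_settings, deployment_settings, min_share=0.10):
    """List deployment settings that are under-represented in training data.

    `min_share` is an assumed, illustrative floor, not a legal threshold.
    """
    counts = Counter(training_settings)
    total = len(training_settings)
    return sorted(
        setting
        for setting in set(deployment_settings)
        if counts.get(setting, 0) / total < min_share
    )


# Toy example: country codes attached to training records.
training = ["DE"] * 90 + ["AT"] * 8 + ["PL"] * 2
print(uncovered_settings(training, ["DE", "PL", "ES"]))  # ['ES', 'PL']
```

A flagged setting is a prompt for a question to the provider, not a finding of non-compliance; the intended purpose decides how much coverage is needed.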

Article 10(5) — special-category data for bias detection

To the extent that it is strictly necessary for the purposes of ensuring bias detection and correction in relation to high-risk AI systems, providers of such systems may exceptionally process special categories of personal data referred to in Article 9(1) of Regulation (EU) 2016/679 (GDPR), subject to appropriate safeguards for the fundamental rights and freedoms of natural persons. In addition to the requirements of Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680, all of the following conditions must be met:

(a) bias detection and correction cannot be effectively fulfilled by processing other data, including synthetic or anonymised data;
(b) the special categories of personal data are subject to technical limitations on the re-use of the personal data, and to state-of-the-art security and privacy-preserving measures, including pseudonymisation;
(c) the special categories of personal data are subject to measures to ensure that the personal data processed are secured, protected and subject to suitable safeguards, including strict controls and documentation of the access, to avoid misuse and ensure that only authorised persons have access to those personal data with appropriate confidentiality obligations;
(d) the special categories of personal data are not to be transmitted, transferred or otherwise accessed by other parties;
(e) the special categories of personal data are deleted once the bias has been corrected or the personal data has reached the end of its retention period, whichever comes first;
(f) the records of processing activities pursuant to Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680 include the reasons why the processing of special categories of personal data was strictly necessary to detect and correct biases, and why that objective could not be achieved by processing other data.
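
Because every one of the six conditions has to hold, an internal gate that refuses to proceed while any of them is open is a natural fit, and the deletion rule in the fifth condition reduces to taking the earlier of two dates. The sketch below is a hypothetical internal form, not a compliance test: the class, field and function names are assumptions, and ticking a box does not establish that the underlying condition is actually satisfied.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SpecialCategoryRequest:
    """Hypothetical internal request form; one flag per Article 10(5) condition."""
    other_data_insufficient: bool          # synthetic/anonymised data not enough
    reuse_limited_and_pseudonymised: bool  # re-use limits, security, pseudonymisation
    access_controlled_and_logged: bool     # strict, documented access controls
    no_third_party_access: bool            # no transmission or transfer
    deletion_scheduled: bool               # deletion rule in place
    necessity_recorded: bool               # reasons entered in processing records

    def unmet_conditions(self) -> list[str]:
        """Names of the conditions not yet confirmed, in form order."""
        return [name for name, ok in vars(self).items() if not ok]


def deletion_due(bias_corrected_on: Optional[date], retention_ends: date) -> date:
    """Delete at bias correction or at retention end, whichever comes first."""
    if bias_corrected_on is None:
        return retention_ends
    return min(bias_corrected_on, retention_ends)


request = SpecialCategoryRequest(True, True, True, True, False, False)
print(request.unmet_conditions())
print(deletion_due(date(2026, 3, 1), date(2026, 12, 31)))  # 2026-03-01
```

With two flags still false, the request lists deletion_scheduled and necessity_recorded as unmet, so processing should not start. The dates are made up for the example.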

Article 10(6) and the testing-only carve-out

For the development of high-risk AI systems not using techniques involving the training of AI models, paragraphs 2 to 5 only apply to the testing data sets.

How Article 10 connects to deployer duties

Pre-purchase due diligence. Deployers should ask the provider for a statement on Article 10(2)(g) bias-mitigation measures and Article 10(4) context fit before signing; a missing or evasive answer is a red flag.
Article 26(4) input-data quality. Where deployers exercise control over the input data, they must ensure that input data is relevant and sufficiently representative in view of the intended purpose of the high-risk AI system.
Article 27 FRIA. Article 10 outcomes feed the fundamental-rights impact assessment that Article 27 requires of certain deployers: bodies governed by public law, private entities providing public services, and deployers of the systems listed in Annex III points 5(b) and 5(c).
Article 11 / Annex IV. Article 10 findings are recorded in the provider’s technical documentation (Article 11 and Annex IV). The data-related information a deployer works from is the part passed on in the instructions for use under Article 13.
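
For the Article 26(4) duty, a deployer that controls input data can at least check each record against whatever input specification the provider has communicated. The sketch below covers only that narrow relevance check and says nothing about representativeness across records. The spec format, the field names and the function name are assumptions made for illustration.

```python
def out_of_spec(record, spec):
    """Return the fields of `record` that fall outside the provider's spec.

    `spec` maps a field name to either a (min, max) tuple for numeric
    fields or a set of allowed values for categorical fields. The spec
    format is an assumption of this sketch.
    """
    problems = []
    for field, rule in spec.items():
        value = record.get(field)
        if value is None:
            problems.append(field)  # missing input counts as out of spec
        elif isinstance(rule, tuple):
            low, high = rule
            if not low <= value <= high:
                problems.append(field)
        elif value not in rule:
            problems.append(field)
    return problems


spec = {"years_experience": (0, 40), "country": {"DE", "AT"}}
print(out_of_spec({"years_experience": 12, "country": "DE"}, spec))  # []
print(out_of_spec({"years_experience": 55, "country": "ES"}, spec))
print(out_of_spec({"country": "AT"}, spec))
```

Records that fail the check can be held back or routed to human review, and a rising failure rate is worth reporting to the provider.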

Common misconceptions

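“Article 10 makes bias illegal.” It does not. It requires providers to examine data sets for biases likely to affect health and safety, harm fundamental rights or lead to prohibited discrimination, and to take appropriate measures to detect, prevent and mitigate them. The text sets no zero-bias standard; what it demands is that identified bias is examined, documented and addressed.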
“Article 10(5) lets us collect race or sexual-orientation data.” Only for providers, only where strictly necessary for bias detection and correction, only when other data is not enough, and only with all six conditions met. It is a narrow exception, not a general licence.
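“Synthetic data avoids Article 10.” It does not. Article 10(5)(a) names synthetic and anonymised data as alternatives that must be shown to be insufficient before special-category data is processed, which makes them the preferred route for bias work. Synthetic data sets used for training, validation or testing remain subject to the Article 10(3) quality criteria.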
“Article 10 is the provider’s problem.” Not entirely. Article 26(4) puts an input-data quality duty on deployers that control the input data, and a substantial modification under Article 25 makes the deployer the provider for all Section 2 obligations, including Article 10.
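
The first misconception turns on what examining bias looks like in practice. One common starting point is to compare outcome rates across groups. The sketch below computes per-group selection rates and the ratio of the lowest to the highest. It is a screening heuristic chosen for illustration: the Regulation does not name this metric or any pass mark, and the function names and toy data are assumptions.

```python
from collections import defaultdict


def selection_rates(outcomes):
    """outcomes: iterable of (group, selected) pairs -> {group: rate}."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in outcomes:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Toy outcomes: group A selected 6 of 10 times, group B 3 of 10 times.
outcomes = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 3 + [("B", False)] * 7
)
rates = selection_rates(outcomes)
print(rates)                   # {'A': 0.6, 'B': 0.3}
print(disparity_ratio(rates))  # 0.5
```

A ratio well below 1.0 does not by itself show unlawful discrimination; it marks where the documented examination and mitigation should focus.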

Sources

Regulation (EU) 2024/1689, Articles 9, 10, 11, 13, 25, 26, 27 — EUR-Lex: eur-lex.europa.eu/eli/reg/2024/1689/oj
Regulation (EU) 2016/679 (GDPR), Article 9 — EUR-Lex: eur-lex.europa.eu/eli/reg/2016/679/oj
European Commission — AI Act Service Desk, Article 10: ai-act-service-desk.ec.europa.eu/en/ai-act/article-10

Note: Article 10(5) requires a tight necessity-and-proportionality assessment under GDPR Article 9. Always consult your DPO or external counsel before invoking it. PowerQuant supplies documentation templates — not legal advice.

PowerQuant Module 1

AI inventory plus a per-system Article 10 vendor-attestation pack — data provenance, bias examination summary and Article 26(4) input-data statement — delivered in 5 working days. Fixed fee, no subscription.
