Security · Saturday, April 25, 2026 · 4 min read

arXiv:2604.21854 'Bounding the Black Box': A Statistical Framework for Certifying High-Risk AI Systems Under the EU AI Act



Why it matters

Natan Levy and Gadi Perl posted a paper to arXiv on April 23, 2026 that targets a gap shared by the EU AI Act, the NIST framework, and the Council of Europe Convention: none of them defines acceptable risk in numbers. They propose a two-step statistical framework built on the RoMA and gRoMA tools, which compute an auditable upper bound on a system's failure rate without access to the model's internal structure.

The paper by researchers Natan Levy and Gadi Perl, titled “Bounding the Black Box” (arXiv:2604.21854), tackles a problem that has troubled both regulators and industry for two years: how to prove that a high-risk AI system is sufficiently safe when no law specifies what “sufficiently safe” means in numbers.

The paper is 11 pages long and arrives at a moment when the EU AI Act is entering operational application, and organizations across the continent must begin conducting conformity assessments for their AI systems without a clear methodological foundation.

What Exactly Is the Regulatory Gap?

The authors frame the problem precisely. Three key regulatory instruments, namely the EU AI Act, the NIST AI Risk Management Framework (AI RMF), and the Council of Europe Framework Convention on Artificial Intelligence and Human Rights, Democracy and the Rule of Law, all require that operators of high-risk systems demonstrate safety before deployment. However, as the authors state: “none specifies what ‘acceptable risk’ means in quantitative terms, and none provides a technical method for verifying that a deployed system actually meets such a threshold.”

In other words, the regulator demands proof, but specifies neither what needs to be proved nor how to prove it. This creates legal uncertainty for regulated entities and opens the door to “compliance theater” — paper-based risk assessments without any real measure of quality.

What Does the Proposed Two-Step Framework Look Like?

Levy and Perl propose a framework inspired by aviation safety protocols, where safety is established not by assertion but by showing that measured failure rates stay below a predefined threshold.

Step one: political. The competent authority (in the EU context, this would be a national regulatory body or the European AI Office) formally establishes two values: an acceptable failure probability denoted δ (delta) and an operational input domain denoted ε (epsilon). This step is a political and legal decision, not a technical one: whoever has the authority to define “acceptable” sets the threshold.

Step two: technical. The statistical tools RoMA and gRoMA calculate an auditable upper bound on the actual failure rate of the system over the given domain ε. If the upper bound falls below δ, the system passes certification. If it does not, it fails.
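The pass/fail rule in the technical step can be illustrated with a minimal Python sketch. It is not the RoMA or gRoMA procedure, whose details the abstract does not give. As a stand-in it uses a one-sided Hoeffding bound, which assumes the test inputs are independent random samples from the domain. The function names, the 99% confidence level, and the example counts are all hypothetical.

```python
import math


def hoeffding_upper_bound(failures: int, trials: int, confidence: float = 0.99) -> float:
    """One-sided Hoeffding upper bound on the true failure probability.

    Stand-in technique for illustration only, NOT the RoMA/gRoMA method.
    Assumes the trials are independent samples from the operational domain.
    """
    if trials <= 0 or not 0 <= failures <= trials:
        raise ValueError("need trials > 0 and 0 <= failures <= trials")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence  # allowed probability that the bound is wrong
    observed_rate = failures / trials
    # Hoeffding: P(p - observed_rate >= t) <= exp(-2 * trials * t**2)
    margin = math.sqrt(math.log(1.0 / alpha) / (2.0 * trials))
    return min(1.0, observed_rate + margin)


def certify(failures: int, trials: int, delta: float, confidence: float = 0.99) -> bool:
    """Pass only if the upper bound on the failure rate is below delta."""
    return hoeffding_upper_bound(failures, trials, confidence) < delta


if __name__ == "__main__":
    # Hypothetical audit: 3 failures observed in 100,000 sampled inputs.
    bound = hoeffding_upper_bound(failures=3, trials=100_000)
    print(f"upper bound on failure rate: {bound:.5f}")
    print("passes at delta = 0.01: ", certify(3, 100_000, delta=0.01))
    print("passes at delta = 0.001:", certify(3, 100_000, delta=0.001))
```

The same evidence passes under a lenient δ and fails under a strict one, which is why the choice of δ is left to the authority rather than the auditor. A tighter bound, such as an exact binomial one, would need fewer samples for the same δ.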

Why Is the RoMA Approach Particularly Important for Closed Models?

The key technical characteristic of the RoMA and gRoMA tools, according to the abstract, is that they work without access to the internal model structure. The auditor does not need weights, gradients, or architectural details: the tools operate on input and output data alone to compute the statistical failure bound.
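To make “input and output data only” concrete, here is a small Python sketch of a black-box audit loop. It is an illustration under stated assumptions, not the authors' implementation. The model is treated as an opaque callable, ε is interpreted (as an assumption) as a maximum per-feature perturbation around a reference input, and `toy_model` is a hypothetical stand-in for a real system behind an API.

```python
import random
from typing import Callable, Sequence

# The auditor sees the model only as a function from inputs to a label.
Model = Callable[[Sequence[float]], int]


def count_failures(
    model: Model,
    reference_input: Sequence[float],
    expected_label: int,
    epsilon: float,
    n_samples: int,
    seed: int = 0,
) -> int:
    """Query the model on random perturbations and count wrong outputs.

    Assumption: the input domain is the box of per-feature perturbations
    of size at most epsilon around reference_input, sampled uniformly.
    """
    rng = random.Random(seed)  # fixed seed keeps the audit reproducible
    failures = 0
    for _ in range(n_samples):
        perturbed = [x + rng.uniform(-epsilon, epsilon) for x in reference_input]
        if model(perturbed) != expected_label:
            failures += 1
    return failures


def toy_model(features: Sequence[float]) -> int:
    """Hypothetical classifier; its internals are never inspected by the audit."""
    return 1 if sum(features) > 1.0 else 0


if __name__ == "__main__":
    n = 1_000
    robust = count_failures(toy_model, [1.0, 1.0], 1, epsilon=0.1, n_samples=n)
    fragile = count_failures(toy_model, [0.5, 0.5], 1, epsilon=0.1, n_samples=n)
    print(f"far from the decision boundary: {robust}/{n} failures")
    print(f"on the decision boundary:       {fragile}/{n} failures")
```

Nothing in `count_failures` touches weights or gradients, so the same loop would work against a remote API. The resulting counts are the raw material from which a certification scheme would derive its statistical bound.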

This matters for the European market because many high-risk systems covered by the EU AI Act are likely to be built on closed commercial models from providers such as OpenAI, Anthropic, Google, and Mistral. A certification method that requires access to model weights would be impractical for such systems. RoMA lets a third party conduct meaningful verification even on a black-box system.

What Does This Mean for Regulated Entities and Regulators?

For organizations developing or integrating high-risk AI systems (healthcare, finance, HR processes, critical infrastructure), the paper offers a concrete technical template for their own compliance assessments until regulators publish their own guidelines. The approach is also useful as a negotiating position with vendors: buyers can ask model providers to supply statistical evidence computed in the RoMA style, rather than generic “model card” statements.

For regulatory bodies, the paper provides a methodological starting point that is publicly available and technically specific enough to inform secondary legislation. It is an arXiv preprint, however, and arXiv posting does not by itself involve peer review. The abstract does not cite concrete thresholds or case studies, so the full paper must be read before implementation. Still, the direction is clear: quantitative certification of AI safety is no longer a theoretical challenge but an operational one.


This article was generated using artificial intelligence from primary sources.