All Major LLMs Flunk EU Legal Tests, New Study Shows

Privacy Reporter

A nonprofit AI research foundation used its LARA tool to evaluate leading large language models against EU data‑protection and AI Act rules. Every model failed, with some breaching the law in up to 93 % of scenarios, raising immediate compliance risks for developers and users.

What happened – The nonprofit research group Aithos released the results of its Legal Assessment for Real‑world Agents (LARA) benchmark. LARA simulates everyday interactions, such as helping an elderly user with device notifications or scanning customer data for competitor signals, and checks whether an AI system respects the obligations set out in the EU General Data Protection Regulation (GDPR) and the EU AI Act. Across the scenarios tested, the leading commercial large language models (LLMs) performed poorly. The worst offender, Moonshot AI’s Kimi K2.6, broke the law in 93 % of cases, while the top‑scoring model, Anthropic’s Claude Opus 4.7, achieved a compliance score of only 54 %.

Legal basis – The GDPR requires a lawful basis for processing personal data, mandates transparent handling, and gives data subjects rights to access, correction and erasure. The EU AI Act takes a risk‑based approach. It prohibits some practices outright, including manipulative techniques and the exploitation of vulnerabilities linked to age or disability (Article 5). It classifies other uses, such as certain forms of emotion recognition and profiling, as high‑risk (Annex III) and obliges providers of those systems to implement safeguards, human oversight, and robust data governance. LARA’s test suite maps directly to these provisions:

  • Data‑protection failures (e.g., processing personal data without a lawful basis such as consent) breach Articles 5 to 9 of the GDPR.
  • Manipulation of vulnerable users (e.g., upselling premium services to seniors) falls under the AI Act’s prohibition on exploiting vulnerabilities linked to age, disability, or social or economic situation (Article 5).
  • Lack of human oversight in covert monitoring scenarios contravenes the AI Act’s human‑oversight requirement for high‑risk systems (Article 14).

Impact on users and companies

  • Developers: Under the AI Act, the provider of an AI system is responsible for conformity. A developer that integrates a non‑compliant model into a commercial product can face fines of up to €35 million or 7 % of global annual turnover, whichever is higher, for prohibited practices, and up to €15 million or 3 % for most other infringements (Article 99).
  • Enterprises deploying agents: Companies that embed these LLMs in customer‑service bots, virtual assistants, or analytics tools are considered deployers under the regulation and carry obligations of their own. For high‑risk systems they must follow the provider’s instructions for use, assign human oversight, keep the logs the system generates, and be ready for supervisory‑authority audits.
  • Consumers: The study shows that everyday interactions, such as asking a voice assistant for help, receiving health‑related advice, or using a chatbot for financial guidance, may expose users to illegal data harvesting, hidden profiling, or manipulative nudging. Victims could claim compensation under Article 82 of the GDPR (right to compensation for material or non‑material damage).
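
The "whichever is higher" rule behind the developer fines is simple arithmetic, and a short sketch shows why the fixed amount dominates for smaller companies while the percentage dominates for larger ones. The function name and the two turnover figures below are hypothetical; the tier values are passed as parameters because they differ by type of infringement, and the example plugs in the 7 % / €35 million tier that the adopted AI Act (Regulation (EU) 2024/1689) sets for prohibited practices.

```python
def max_fine_eur(global_turnover_eur: int, percent: int, fixed_eur: int) -> int:
    """Upper bound of a fine defined as 'percent of turnover or a fixed
    amount, whichever is higher'. Amounts are whole euros."""
    turnover_based = global_turnover_eur * percent // 100
    return max(turnover_based, fixed_eur)


# Hypothetical companies; tier figures are assumptions supplied by the caller.
small_company = max_fine_eur(100_000_000, percent=7, fixed_eur=35_000_000)
large_company = max_fine_eur(2_000_000_000, percent=7, fixed_eur=35_000_000)

print(small_company)  # 35000000  (fixed amount is higher than 7 % of 100 million)
print(large_company)  # 140000000 (7 % of 2 billion is higher than the fixed amount)
```

This only computes the statutory ceiling. The fine actually imposed in a given case is set by the competent authority and can be far lower.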

What changes are coming

  1. Mandatory conformity assessments – Before a high‑risk AI system can be placed on the EU market, it must pass a conformity assessment, which for some categories involves an independent notified body. The poor LARA scores suggest many current offerings will need redesign or third‑party certification.
  2. Stricter data‑processing contracts – Providers must embed GDPR‑compliant clauses (purpose limitation, data‑minimisation, explicit consent) into any API‑level agreement with downstream users.
  3. Transparency dashboards – The AI Act obliges operators to publish model‑level information (intended use, training data categories, risk mitigation measures). Companies that cannot document these details will be barred from the EU market.
  4. User‑controlled testing – Aithos plans to open LARA’s scenario builder, allowing NGOs, regulators, and even private citizens to craft bespoke compliance checks. This could become a de‑facto industry standard for “self‑certification” before deployment.

What organisations can do now

  • Audit existing agents against the LARA checklist or similar frameworks. Identify any high‑risk functions (e.g., profiling, covert monitoring) and either disable them or add explicit human oversight.
  • Update data‑processing policies to ensure every API call that transmits personal data is covered by a lawful basis and documented in a processing register.
  • Engage with legal counsel familiar with the AI Act to prepare a conformity‑assessment dossier, including risk‑assessment reports, post‑market monitoring plans, and user‑rights impact analyses.
  • Consider alternative models that are built with privacy‑by‑design principles, such as open‑source LLMs that can run on‑premises, so personal data stays in‑house, and that allow closer inspection of model behaviour.
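
The second step in the list above, checking that every API call carrying personal data is covered by a documented lawful basis, lends itself to an automated check. What follows is a minimal sketch, not a description of LARA or of any vendor's tooling: the `RegisterEntry` and `ApiCall` classes, the purpose names, and the data categories are all hypothetical. The only legal content is the enum, which lists the six lawful bases of GDPR Article 6(1).

```python
from dataclasses import dataclass
from enum import Enum


class LawfulBasis(Enum):
    # The six lawful bases listed in GDPR Article 6(1).
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal_obligation"
    VITAL_INTERESTS = "vital_interests"
    PUBLIC_TASK = "public_task"
    LEGITIMATE_INTERESTS = "legitimate_interests"


@dataclass(frozen=True)
class RegisterEntry:
    """One row of a hypothetical processing register."""
    purpose: str
    lawful_basis: LawfulBasis
    data_categories: frozenset


@dataclass(frozen=True)
class ApiCall:
    """A hypothetical outbound call to a model API."""
    purpose: str
    data_categories: frozenset


def find_uncovered_calls(calls, register):
    """Return calls whose purpose has no register entry, or that send
    data categories the matching entry does not list."""
    by_purpose = {entry.purpose: entry for entry in register}
    uncovered = []
    for call in calls:
        entry = by_purpose.get(call.purpose)
        if entry is None or not call.data_categories <= entry.data_categories:
            uncovered.append(call)
    return uncovered


# Illustrative data only.
register = [
    RegisterEntry("order_support", LawfulBasis.CONTRACT, frozenset({"name", "email"})),
]
calls = [
    ApiCall("order_support", frozenset({"email"})),            # covered
    ApiCall("order_support", frozenset({"email", "health"})),  # extra category
    ApiCall("marketing_profile", frozenset({"email"})),        # no entry at all
]

flagged = find_uncovered_calls(calls, register)
print([call.purpose for call in flagged])  # ['order_support', 'marketing_profile']
```

A check like this only confirms that documentation exists for each call. Whether the recorded lawful basis is actually valid for the purpose remains a legal judgement.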

Why it matters – The EU’s regulatory framework is the most comprehensive set of rules governing AI and data privacy worldwide. Non‑compliance not only risks multi‑million‑euro fines but also erodes public trust in digital assistants that millions rely on daily. As Aithos’ findings make clear, the technology is moving faster than the safeguards that protect our fundamental rights.

Next steps for the public – Aithos has made LARA freely accessible via a web interface; users only need an API key for the model they wish to test. While the tool is not yet open source, an upcoming version will let anyone design custom scenarios, giving ordinary citizens a practical way to verify whether the AI agents they interact with respect their privacy and autonomy.


For more details on the EU AI Act, see the European Commission’s official documentation. The GDPR text is available at the EU’s data‑protection portal.
