
Multilingual Legal AI Benchmarks & Datasets
Multilingual legal datasets and human-audited benchmark infrastructure for enterprise AI evaluation, multilingual legal retrieval, regulatory grounding, and trustworthy legal AI systems.
Available now
EU AI & Data Governance — Gold Dataset & Benchmark Suite
17 EU Digital Regulatory Frameworks
12 Aligned EU Languages
5,000+ Structurally Aligned Legal Segments Across 12 EU Languages
Human-Audited Benchmark Infrastructure
Why it matters
Most legal AI systems are still evaluated on general-purpose NLP benchmarks that were not designed for multilingual legal reasoning, cross-language regulatory consistency, or enterprise legal retrieval.
QA & Validation
Human-audited benchmark validation
Dual-stage annotation workflows
Multilingual alignment review
92.54% audited accuracy
Structured QA governance
Upcoming Releases
CJEU Case Law
EU Competition Law
International Arbitration
ESG & Sustainability
Financial Regulation
About
THT Legal Data was created by François-Olivier Manson, PhD in Law.
The project supports multilingual legal AI evaluation through structured legal datasets, human-audited benchmark infrastructure, and cross-language regulatory alignment workflows.
Its objective is to contribute to more reliable, auditable, and trustworthy legal AI systems operating across multilingual regulatory environments.
© François-Olivier MANSON
14, rue des Malapets
65400 Beaucens
France
Hosting
Framer B.V.
Rozengracht 207B
1016 LZ Amsterdam