Legal AI’s Multilingual Promise Hinges on Data, Not Just Smarter Models

Sources: Artificial Lawyer

Michael Krallmann of TransLegal argues that effective multilingual legal AI depends on access to large volumes of high-quality, structured data.

Why it matters: Legal professionals adopting AI for cross-border work need reliable, accurate translations of nuanced legal concepts. Without robust multilingual legal data, AI's utility and trustworthiness in global practice remain limited, which poses real risks for law firms and their clients.

  • TransLegal developed a multilingual legal database covering 75+ countries and targeting 10,000 legal terms per jurisdiction.
  • A review of 50+ AI models found that 70–80% of training data is in English, underrepresenting 100+ languages.
  • 85% of legal professionals reported translation issues negatively impacted their outcomes and client relationships.
  • 96% of legal professionals use AI tools, with nearly half calling them essential to daily workflows.

The rise of legal AI has made cross-border and comparative work more accessible, but most tools are built on English-heavy data. Michael Krallmann, CEO of TransLegal, warns that improving AI models alone won’t resolve the nuances of multilingual legal practice. The linchpin, he contends, is access to high-quality, structured multilingual data.

While AI excels at basic translation, legal meaning depends on context and jurisdiction-specific interpretation. "The value of AI in legal contexts lies not in identifying lexical counterparts, but in reflecting functional equivalence in context," Krallmann says in Artificial Lawyer.

  • TransLegal addresses this by building a multilingual legal terminology database, currently spanning over 75 countries and aiming for 10,000 terms per jurisdiction.
  • A survey of 50+ multilingual AI models revealed stark language disparities: 70–80% of training data is English, with 100+ languages still underserved.
  • This gap isn’t theoretical: in an industry survey, 85% of legal professionals said they had faced translation- or language-related setbacks that affected cases or client trust.
  • "Providing models with TransLegal’s data makes multilingual legal content accessible with an unprecedented level of accuracy and trust," said founder Michael G. Lindner.

As legal AI becomes central to operations (96% of legal professionals now use AI tools, and nearly half deem them essential), closing the data gap is vital for firms with international footprints. "Language and context are the foundation of all legal work," adds DeepL’s André Barrow, noting that translation is already among the top AI use cases in law.

By the numbers:

  • 10,000 — Target number of legal terms per jurisdiction in TransLegal’s database
  • 70–80% — Share of multilingual AI model data that is English
  • 85% — Legal professionals reporting translation issues hurt outcomes