Computer Science Researchers Secure US Patent for Multisource Translation

August 28, 2024

Ondřej Bojar and Dominik Macháček from the Institute of Formal and Applied Linguistics (ÚFAL) at Charles University have obtained a U.S. patent for simultaneous machine translation of speech from multiple language sources. This patent gives the Faculty of Mathematics and Physics (Matfyz) exclusive rights to commercialise its research outcomes in the United States. The authors are currently seeking a suitable application partner.

Dominik Macháček (left) and Ondřej Bojar

The patent, registered as US 12,056,457 B2, is the second successful U.S. patent application by Ondřej Bojar's team. It follows the 2021 Bojar-Sudarikov patent on training translation systems for small languages and specific domains, such as SMS text messages.

Machine translation is the automatic translation of text or speech from one language into another by a computer, for example from English into Czech. The technology, which has significant research and practical potential, belongs to the field of computational linguistics and has been studied since the 1950s.

A key challenge in speech translation is simultaneity. Many lectures, conferences, and meetings need to be translated with as little delay as possible, ideally simultaneously, so that listeners who rely on the translation can interact with the speaker in real time. Simultaneous machine translation addresses this challenge in two ways: it produces the translation as fast as the technology allows, and it decides on linguistic grounds when to translate and when to wait for additional words that clarify the meaning, which improves translation quality.
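To make the "translate or wait" decision concrete, here is a minimal Python sketch of one generic policy: a word is released only once two consecutive hypotheses agree on it. This is an illustration under our own assumptions, not necessarily the method covered by the patent, and the names `AgreementPolicy` and `common_prefix` are hypothetical.

```python
from typing import List


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest shared word prefix of two hypotheses."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class AgreementPolicy:
    """Hypothetical policy: commit a word once two consecutive hypotheses agree on it."""

    def __init__(self) -> None:
        self.committed: List[str] = []  # words already shown to the listener
        self.previous: List[str] = []   # hypothesis from the previous update

    def update(self, hypothesis: List[str]) -> List[str]:
        """Take the newest full hypothesis and return the newly committed words."""
        stable = common_prefix(self.previous, hypothesis)
        self.previous = hypothesis
        # Words beyond the stable prefix are still uncertain, so the system waits.
        new_words = stable[len(self.committed):]
        self.committed.extend(new_words)
        return new_words
```

In this sketch, a word that changes between updates (for instance "founded" becoming "funded") is held back until the hypotheses settle, at the cost of a short delay.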

Automatic speech recognition and the subsequent simultaneous translation are complex and error-prone. Ondřej Bojar and Dominik Macháček have developed an innovative method to eliminate such errors. Many international meetings, such as those in the European Parliament, are interpreted simultaneously into multiple languages. This makes it possible to combine the interpretation outputs into a translation in a further selected language. For example, the English word “funded” sounds similar to “founded”. A computer can easily confuse the two and make a translation error, say, into German. However, if a simultaneous Czech translation by an interpreter is available, in which “funded” and “founded” are clearly distinguished as “financováno” and “založeno”, the ambiguity can be resolved correctly.
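The funded/founded example can be sketched in a few lines of Python. The sketch assumes that the speech recogniser returns scored candidates for an ambiguous word and that a small bilingual lexicon is available. The `LEXICON` table, the `pick_candidate` function, the scores, and the `bonus` value are all hypothetical and only illustrate the idea of using a second source as evidence.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon: English word -> possible Czech equivalents.
LEXICON: Dict[str, Set[str]] = {
    "funded": {"financováno"},
    "founded": {"založeno"},
}


def pick_candidate(
    candidates: Dict[str, float],
    second_source: List[str],
    bonus: float = 0.5,
) -> str:
    """Pick the recogniser candidate best supported by a second source.

    candidates:    English candidates with recogniser scores (assumed input).
    second_source: words from a parallel channel, e.g. a Czech interpreter.
    bonus:         assumed score boost for a candidate the second source confirms.
    """
    seen = set(second_source)

    def score(word: str) -> float:
        supported = bool(LEXICON.get(word, set()) & seen)
        return candidates[word] + (bonus if supported else 0.0)

    return max(candidates, key=score)


# The recogniser slightly prefers "founded", but the interpreter said "financováno".
asr = {"founded": 0.55, "funded": 0.45}
choice = pick_candidate(asr, ["projekt", "bylo", "financováno"])
```

Without the second channel, the same call would fall back to the recogniser's own preference.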

The patented system thus does not “listen” to just one speaker, as is typically the case, but to several speakers at once. By monitoring multiple channels, it can compensate for the shortcomings of individual inputs, such as poor audio quality or unclear pronunciation. It also allows for a smooth transition to an alternative source, such as another available interpreter.
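Switching between channels can be pictured as a simple selection rule. The following Python sketch assumes that each channel reports a quality estimate between 0 and 1. The function name `choose_source`, the quality scores, and the threshold are hypothetical and are not taken from the patent.

```python
from typing import Dict


def choose_source(current: str, quality: Dict[str, float], threshold: float = 0.6) -> str:
    """Keep the current source while it is usable, otherwise switch to the best one.

    quality maps a channel name to an assumed quality estimate in [0, 1].
    """
    if not quality:
        return current  # nothing to compare against, stay where we are
    if quality.get(current, 0.0) >= threshold:
        return current  # avoid needless switching while the source is good enough
    return max(quality, key=lambda name: quality[name])
```

Staying with the current source while it remains above the threshold keeps the output stable. The system changes channel only when the current one degrades.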

Obtaining a technology patent is not easy because of the scientific, administrative, and financial challenges involved. The patent application for multisource speech translation came about thanks to an extraordinary alignment of circumstances. Between 2019 and 2022, the research team led by Ondřej Bojar worked on the European research and innovation project ELITR (European Live Translator), where they implemented the technological content of the solution. The relatively costly patent protection was effectively made possible by the COVID-19 pandemic, which altered the project’s plans.

With the U.S. patent granted, the Faculty of Mathematics and Physics at Charles University holds exclusive rights to commercialise this innovation in a globally significant market: the United States. The inventors’ next goal is to put the patent into practice. They are therefore seeking clients who would value more reliable speech translation.

ÚFAL, OPMK