One month after making its ground-breaking AI speech-to-speech translator generally available, KUDO, the world leader in real-time multilingual solutions, prepares for the rollout of its AI Engine v.2.0. This update is set to increase the accuracy of KUDO AI’s speech translation capabilities by up to 30%.
KUDO AI (patent pending) is a speech-to-speech translation solution that offers multilingual audio and captioning. The technology lets users listen to speakers in their preferred language, eliminating the need to rely solely on subtitles. A standout feature of the KUDO AI Speech Translator is its capacity to operate in real time, supporting near-simultaneous, uninterrupted translation for a frictionless user experience. It has been specifically designed and optimized for live translation of speeches, lectures, and presentations, among others.
Quality is measured by having linguists manually evaluate the output of the simultaneity module on a balanced corpus of speeches in each language. The corpus represents the prototypical inputs for which the application has been designed and comprises several categories, such as technical presentations, political speeches, and lectures, as well as more challenging ones, such as casual talks, speeches rich in disfluencies, and speeches by non-native speakers. This evaluation is performed at each iteration and is used internally to assess improvements to the engine.
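To make the arithmetic behind such an iteration-over-iteration comparison concrete, here is a minimal sketch of how a relative error reduction could be computed from linguists' error counts. The function names, category labels, and all counts are hypothetical placeholders chosen for illustration; they are not KUDO's evaluation data or tooling.

```python
def relative_error_reduction(errors_before: int, errors_after: int) -> float:
    """Fraction of errors removed between two engine iterations."""
    if errors_before <= 0:
        raise ValueError("errors_before must be positive")
    return (errors_before - errors_after) / errors_before


# Hypothetical error counts per corpus category: (previous engine, new engine).
# These numbers are invented for illustration only.
counts = {
    "technical presentations": (40, 30),
    "political speeches": (50, 35),
    "casual talks": (60, 40),
}

# Per-category reduction, then the reduction over the pooled corpus.
for category, (before, after) in counts.items():
    print(f"{category}: {relative_error_reduction(before, after):.0%}")

total_before = sum(before for before, _ in counts.values())
total_after = sum(after for _, after in counts.values())
overall = relative_error_reduction(total_before, total_after)
print(f"overall: {overall:.0%}")
```

Pooling the counts before dividing weights each category by how many errors it contributes, which is one reasonable choice; averaging the per-category percentages would weight every category equally instead.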
Isabel Canovas, Linguistic Analyst, adds: “A roughly 30% reduction of errors in the machine learning-based simultaneity module corresponds to a similar uplift in the overall translation experience.” The improvement affects not only the accuracy with which the original message is conveyed in translation but also, and foremost, the grammatical and syntactic naturalness of the translated output.
Underpinning this capability is our simultaneity module, based on advanced machine learning and NLP techniques, which processes and analyzes the structure of the speech as it unfolds, making informed decisions with each new word spoken. “This task is intricate due to the necessity of maintaining a delicate balance: creating short and compact segments of text to facilitate real-time translation, while also considering that machine translation often delivers better results with longer texts and more context”, says Claudio Fantinuoli, CTO and designer of KUDO AI. Achieving the right balance is crucial to enhancing translation accuracy and elevating the overall user experience.
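The trade-off described above can be illustrated with a deliberately simple sketch. It is a rule-based stand-in, not KUDO's machine learning-based module: it revisits the segmentation decision with each incoming word and emits a segment either at a plausible clause boundary, once enough context has accumulated, or when a maximum length is reached so that latency stays bounded. The function name, the thresholds, and the assumption that the incoming words already carry punctuation are all hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def segment_stream(
    words: Iterable[str],
    min_words: int = 4,   # assumed lower bound: keep enough context for translation
    max_words: int = 12,  # assumed upper bound: cap the delay before translating
    boundary_marks: Tuple[str, ...] = (".", ",", ";", "?", "!"),
) -> Iterator[str]:
    """Yield text segments from a word stream, deciding after every new word."""
    buffer: List[str] = []
    for word in words:
        buffer.append(word)
        at_boundary = word.endswith(boundary_marks)
        long_enough = len(buffer) >= min_words
        too_long = len(buffer) >= max_words
        if (at_boundary and long_enough) or too_long:
            yield " ".join(buffer)
            buffer = []
    if buffer:  # flush whatever remains when the speech ends
        yield " ".join(buffer)


speech = "Good morning everyone, thank you for joining us today.".split()
for segment in segment_stream(speech, min_words=3, max_words=6):
    print(segment)
```

Raising `min_words` gives the translation step more context per segment at the cost of delay, while lowering `max_words` reduces delay at the cost of context, which is the balance the quote refers to.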