Translation Quality Report. June 2024

This report compares translation quality between the old and new language models. The new models improve not only quality but also performance and memory usage. We used the BLEU and COMET metrics, primarily on the Flores 101 test set.

BLEU is the most widely used metric for machine translation evaluation. The Flores 101 test set was released by Facebook Research and covers a very large number of language pairs.

QUALITY METRICS DESCRIPTION

BLEU

BLEU is an automatic metric based on n-grams. It measures the precision of n-grams of the machine translation output compared to the reference, weighted by a brevity penalty to punish overly short translations. We use a particular implementation of BLEU, called sacreBLEU. It outputs corpus scores, not segment scores.
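To make the description above concrete, here is a minimal sketch of corpus-level BLEU written in plain Python. It is not sacreBLEU and will not reproduce the scores in this report: it assumes whitespace tokenization, a single reference per segment, and no smoothing, and the function names `ngrams` and `corpus_bleu` are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per segment, no smoothing."""
    matches = [0] * max_n  # clipped n-gram matches, summed over the corpus
    totals = [0] * max_n   # hypothesis n-gram counts, summed over the corpus
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty punishes output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Because the counts are summed over all segments before the precisions are computed, the result is a single corpus score rather than an average of segment scores.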

References

  • Papineni, Kishore, S. Roukos, T. Ward and Wei-Jing Zhu. “Bleu: a Method for Automatic Evaluation of Machine Translation.” ACL (2002).
  • Post, Matt. “A Call for Clarity in Reporting BLEU Scores.” WMT (2018).

COMET

COMET (Crosslingual Optimized Metric for Evaluation of Translation) is a metric for automatic evaluation of machine translation that calculates the similarity between a machine translation output and a reference translation using token or sentence embeddings. Unlike other metrics, COMET is trained on predicting different types of human judgments in the form of post-editing effort, direct assessment, or translation error analysis.

On-premise Private Software Updates

New version - 1.29.0.

Changes in functionality:

  • Added support for additional Speech Recognizer models.
  • Improved document translation quality.

New version - 1.28.0.

Changes in functionality:

  • Improved translation quality.
  • Improved the Slack bot service.
  • Updated dependencies.

New version - 1.27.0.

Changes in functionality:

  • Improved translation quality.
  • Added glossary support.
  • Improved the Slack bot service.
  • Improved DOCX and DOC translation quality.
  • Improved handling of alternative translation variants.

New version - 1.26.0.

Changes in functionality:

  • Improved the alternative translation variants feature.
  • Improved translation quality.
  • Added normalizer and denoiser for Speech Recognizer.

LANGUAGE PAIRS

Note: A smaller model on the hard drive consumes less GPU memory, which lowers deployment costs. Smaller models also translate faster. Approximate GPU memory usage is calculated as the model's size on the hard drive × 1.2.
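The rule of thumb in the note can be sketched in a few lines of Python. The helper names `parse_decimal_comma` and `estimate_gpu_memory_mb` are hypothetical, and the 1.2 factor is simply the approximation stated above, not a measured value.

```python
def parse_decimal_comma(value: str) -> float:
    """Convert a number written with a decimal comma, e.g. '190,63', to a float."""
    return float(value.replace(",", "."))


def estimate_gpu_memory_mb(model_size_mb: float, factor: float = 1.2) -> float:
    """Approximate GPU memory use as on-disk model size times a fixed factor."""
    return model_size_mb * factor


# Model sizes as they are written in this report (decimal comma, megabytes).
for size in ("113,91", "184,00", "190,63"):
    estimate = estimate_gpu_memory_mb(parse_decimal_comma(size))
    print(f"{size} MB on disk -> about {estimate:.1f} MB of GPU memory")
```

Under this approximation, models of 113,91 MB, 184,00 MB, and 190,63 MB need roughly 137 MB, 221 MB, and 229 MB of GPU memory respectively.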

| Language Pair | Current Model's Size, MB | Test Data | Previous Model's BLEU | Current Model's BLEU | BLEU Difference | Previous Model's COMET | Current Model's COMET | COMET Difference |
|---|---|---|---|---|---|---|---|---|
| English - Arabic | 190,63 | Flores 101 | 33,21 | 33,40 | +0,19 | 87,81 | 88,27 | +0,46 |
| Greek - English | 184,00 | Lingvanex | 62,93 | 64,45 | +1,52 | 91,86 | 92,39 | +0,53 |
| Lithuanian - English | 113,91 | Flores 101 | 34,06 | 34,96 | +0,90 | 85,74 | 86,24 | +0,50 |
| English - Croatian | 184,00 | Flores 101 | 31,18 | 34,95 | +3,77 | 89,03 | 91,09 | +2,06 |
| Russian - Kazakh | 190,63 | Lingvanex | 38,10 | 38,39 | +0,29 | 92,06 | 92,13 | +0,07 |
| Kazakh - Russian | 190,63 | Flores 200 | 22,75 | 22,79 | +0,04 | 87,59 | 88,12 | +0,53 |
| Catalan - English | 113,91 | Flores 101 | 46,44 | 47,54 | +1,10 | 88,05 | 88,55 | +0,50 |
| Hmong - English | 113,91 | NTREX | 20,36 | 21,30 | +0,94 | 60,89 | 61,95 | +1,06 |
| German - English | 190,65 | NTREX | 38,69 | 40,96 | +2,27 | 87,52 | 87,98 | +0,46 |
| English - Spanish | 184,02 | Lingvanex | 62,82 | 63,04 | +0,22 | 93,44 | 93,50 | +0,06 |
| Nepali - English | 113,91 | Flores 101 | 33,64 | 41,67 | +8,03 | 88,48 | 89,94 | +1,46 |
| Tajik - English | 113,91 | Flores 101 | 32,19 | 33,74 | +1,55 | 76,10 | 77,46 | +1,36 |
| English - Lithuanian | 113,91 | Flores 101 | 30,84 | 31,28 | +0,44 | 89,61 | 90,11 | +0,50 |
| English - Estonian | 113,91 | Flores 101 | 30,93 | 31,48 | +0,55 | 91,09 | 91,64 | +0,55 |
| Ukrainian - English | 184,00 | Flores 101 | 41,15 | 41,54 | +0,39 | 86,92 | 86,98 | +0,06 |
| English - Hebrew | 184,11 | Flores 101 | 35,42 | 36,00 | +0,91 | 87,87 | 88,53 | +0,66 |
| English - Malay | 184,11 | Flores 101 | 44,12 | 44,63 | +0,51 | 89,41 | 89,77 | +0,36 |
| Estonian - English | 113,91 | Flores 101 | 39,19 | 41,07 | +1,88 | 88,81 | 88,33 | +0,52 |
| Japanese - English | 190,63 | Flores 101 | 29,59 | 31,05 | +1,46 | 87,28 | 88,08 | +0,80 |
| English - Ukrainian | 184,00 | Flores 101 | 29,59 | 34,30 | +4,72 | 87,06 | 89,88 | +2,82 |
| French - English | 190,65 | Flores 101 | 48,35 | 48,82 | +0,47 | 89,31 | 89,46 | +0,15 |
| Hebrew - English | 184,11 | Flores 101 | 45,01 | 46,31 | +1,30 | 87,82 | 88,82 | +0,50 |
| Albanian - English | 113,91 | Lingvanex | 55,43 | 56,43 | +1,00 | 86,63 | 87,83 | +1,20 |
| English - Hmong | 113,91 | Lingvanex | 42,26 | 60,99 | +18,73 | 75,48 | 77,35 | +1,87 |
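The Difference columns are the current model's score minus the previous model's score. As a small illustration, the following sketch recomputes that value from scores written with a decimal comma. The helper names `to_decimal` and `format_difference` are hypothetical, and `Decimal` is used here only to avoid binary floating-point rounding on two-decimal values.

```python
from decimal import Decimal


def to_decimal(value: str) -> Decimal:
    """Parse a score written with a decimal comma, e.g. '33,21'."""
    return Decimal(value.replace(",", "."))


def format_difference(previous: str, current: str) -> str:
    """Return current minus previous as a signed, two-decimal, comma-formatted string."""
    diff = to_decimal(current) - to_decimal(previous)
    return f"{diff:+.2f}".replace(".", ",")


# English - Arabic, BLEU: previous 33,21, current 33,40
print(format_difference("33,21", "33,40"))  # prints +0,19
# English - Arabic, COMET: previous 87,81, current 88,27
print(format_difference("87,81", "88,27"))  # prints +0,46
```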

Frequently Asked Questions (FAQ)

How do you evaluate translation quality?

Translation quality can be assessed manually or automatically. In manual evaluation, human translators check the texts for accuracy and look for errors. Automatic evaluation of machine translation relies on metrics such as BLEU, COMET, and METEOR.

Why do we need translation quality assessment?

Translation quality assessment ensures that translated texts meet the required standards. It allows linguists to evaluate the accuracy and fluency of a translated text and how well it fits its intended purpose. For machine translation systems, quality assessment helps to improve the engines, compare different MT providers, and identify strengths and weaknesses for future development.

How can you improve translation quality?

There are many ways to improve the quality of your translations:
1. Set clear standards or guidelines
2. Run quality checks at multiple stages of the translation process
3. Ensure that humans review the translated texts
4. Hire professional translators with appropriate skills
5. Continuously train and improve MT models
6. Use advanced NLP techniques to ensure accuracy
7. Combine MT with human post-editing to get the best results
8. Collect and analyze feedback from your clients
