Machine Translation Post-Editing

Undoubtedly, machine translation allows both small private companies and large international corporations to save considerable time, but how reliable are the results? Complex phrases often turn into gibberish, stylistic nuances are lost, and cultural subtleties are left out. This becomes even more challenging in literary translation, where the need for creative expression and nuance is paramount. What can be done when almost every fifth sentence requires correction? This is where machine translation post-editing (MTPE) comes to the rescue.

Today, post-editing is not just manual error correction by professional translators and editors. It’s an automated process where large language models (LLMs) play a key role. Rather than relying on labour-intensive manual corrections, LLMs take on the task of refining translations. With post-editing powered by LLMs, not only are obvious errors corrected, but the style is also enhanced, resulting in a more natural and readable text. Trained on extensive datasets, these models are capable of "understanding" the language instead of just matching words.

What's the result? Time and cost savings, quality improvement, and, of course, a translation that sounds as if it was made by a human. In a world where time is money, post-editing with LLMs becomes a necessary step in any machine translation system.

Steps in Machine Translation

The machine translation process consists of several stages:

  • Text Analysis. The system breaks down the text into components — words, phrases, sentences. It attempts to "understand" the grammatical structure and context to choose the correct equivalents in the target language. But here potential errors arise: the machine may misinterpret complex structures or ambiguous words, affecting the translation later.
  • Choosing Translation Options. The system automatically selects the most suitable equivalents for each word and phrase. The problem is that this process is often mechanical: the machine does not always account for context, or may simply lack it, leading to incorrect choices, especially with homonyms or idiomatic expressions.
  • Generating the Final Text. The machine assembles the translated parts into a coherent text. However, even if each individual element was translated correctly, problems may arise at the level of syntax (incorrect word order), style (robotic and lifeless tone), and logic (loss of connection between sentences).

Consider the example of translating the English sentence "I saw her duck when the ball was coming towards her. She reacted quickly to avoid getting hit" into French.

The English word "duck" can refer both to an animal and to the action. The automatic system will most likely output the result: "J'ai vu son canard quand le ballon arrivait vers elle. Elle a réagi rapidement pour éviter d'être frappée." This results in an utterly nonsensical sentence.

The correct version would be "Je l'ai vue se baisser quand la balle arrivait vers elle. Elle a réagi rapidement pour éviter d'être frappée."

Without post-editing, translations may remain a jumble of words, losing the original meaning and cultural context. Even the most powerful machine translation algorithm requires additional refinement.

Common Machine Translation Errors

  • Grammar Errors: Automatic translations often contain tense mismatches, incorrect use of prepositions, articles, or cases, especially in complex sentences with many nuances.

English: The teacher was proud of her student.

Machine translation: Le professeur était fier de son étudiante. (The masculine adjective "fier" is used instead of the feminine "fière", and the noun is left in the masculine as well.)

Correction: La professeure était fière de son étudiante.

  • Syntax Errors: The machine may misplace words in a sentence, especially when the source and target languages have different structures, making the text unnatural and hard to understand.

English: He found a sacred book in the old library.

Machine translation: Il a trouvé un sacré livre dans la vieille bibliothèque. (Here "sacré" is meant as "sacred" and should follow the noun. Placing it before the noun changes the meaning to the colloquial "damned".)

Correction: Il a trouvé un livre sacré dans la vieille bibliothèque.

  • Semantic Errors: Machine translators can misinterpret words or phrases, especially polysemous ones, leading to incorrect translations and distorted meaning.

English: The bank is next to the river.

Machine translation: La banque est à côté de la rivière. (The word "bank" is translated as a financial institution, whereas in this context it means "shore".)

Correction: La berge est à côté de la rivière.

  • Stylistic Errors: Machine translation often fails to retain the original style, making formal texts too simple and informal ones overly official, disrupting the text's tone and flow.

English: Hey, how’s it going?

Machine translation: Bonjour, comment ça va ? (The translation is too formal.)

Correction: Salut, ça va ?

  • Cultural Errors: Idioms, sayings, and cultural references are often translated literally, making them unclear or ridiculous for the target audience.

English: It’s raining cats and dogs.

Machine translation: Il pleut des chats et des chiens. (The idiom is translated literally, and doesn’t make sense to a French audience.)

Correction: Il pleut des cordes.

These errors clearly demonstrate that even the most advanced machine translation systems cannot fully replace the human approach, and it is machine translation post-editing that ensures a high level of quality and accuracy.

LLM Algorithms in Post-Editing

LLMs use advanced algorithms to deeply analyse and enhance machine translations, reaching a level of accuracy once exclusive to humans. The process starts with the model assessing the context, structure, and meaning of the source text, identifying semantic links between words and sentences to catch errors from the automatic translator.

Source Text Analysis. LLMs go beyond word-by-word translation—they "understand" context. These models break down complex phrases, detect polysemous words, and interpret them correctly, reducing misinterpretation and avoiding literal translations.

Translation Correction. After receiving the text, the LLM adjusts syntax, improves grammar, and refines the natural flow of phrases. Trained on vast data, it identifies not only obvious mistakes but also subtle inconsistencies in style or logic.
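The two steps above can be sketched as a small prompt-building routine. This is only a minimal illustration, not Lingvanex's implementation: the function names, the prompt wording, and the `llm` callable are all assumptions, and `fake_llm` is a stand-in so the sketch runs without any real model.

```python
from typing import Callable


def build_post_edit_prompt(source: str, draft: str, target_lang: str = "French") -> str:
    """Give the model both the source text and the machine-translated draft."""
    return (
        f"You are a professional {target_lang} post-editor.\n"
        "Fix mistranslations, grammar, and word order in the draft, "
        "keep the meaning of the source, and return only the corrected text.\n\n"
        f"Source (English): {source}\n"
        f"Draft ({target_lang}): {draft}\n"
        "Corrected:"
    )


def post_edit(source: str, draft: str, llm: Callable[[str], str]) -> str:
    """Send the prompt to any text-in/text-out model (`llm` is a placeholder, not a real API)."""
    return llm(build_post_edit_prompt(source, draft)).strip()


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model client; always returns a canned answer."""
    return " Je l'ai vue se baisser quand la balle arrivait vers elle. "


source = "I saw her duck when the ball was coming towards her."
draft = "J'ai vu son canard quand le ballon arrivait vers elle."
result = post_edit(source, draft, fake_llm)
print(result)
```

Passing the source alongside the draft matters: without it, the model can only polish the French and has no way to notice that "canard" was the wrong reading of "duck".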

LLM's Role in Translation Refinement

Large language models (LLMs) have completely transformed the approach to post-editing machine translation. But in what areas do MTPE services outperform traditional manual post-editing? LLMs possess unique ways of working with text that go beyond mere error correction. They can enhance the overall meaning and coherence of a text, making it smoother and more logical.

  • Style Adaptation. Imagine an editor reviewing hundreds of pages, meticulously searching for errors and refining the style. LLMs not only correct errors but also adapt the text to the desired style, from formal to conversational. This is especially valuable for texts where tone and emotional nuance matter. Manual editing costs time and money, whereas LLMs achieve this almost instantly, ensuring consistent style throughout the text.
  • Grammar Correction. While humans must carefully reread and check every grammatical detail, LLMs correct errors almost instantly. Complex tense agreements, cases, and prepositions are handled automatically, dramatically speeding up the process.
  • Readability Improvement. We've all encountered cumbersome and convoluted sentences after an automatic translation. Manually refining such constructions takes time and skill. LLMs, however, can effortlessly transform complex phrases into natural and understandable sentences, making the text easier to read.
  • Specialised Vocabulary. In traditional post-editing, translators often spend time searching for the correct equivalents for complex terms, which not every editor is familiar with. LLMs, trained on vast amounts of text, recognise context and suggest accurate terms automatically, eliminating manual search. One of their key strengths is maintaining lexical consistency. If different terms are used for the same concept in various parts of the translation, the model selects the right match, ensuring uniformity and logical flow throughout the text.
  • Context. Polysemous words and expressions are a nightmare for any editor, requiring careful reading to avoid mistakes. LLMs handle this swiftly and accurately, selecting the correct meaning based on context and preventing misunderstandings. They go beyond fixing individual words, analysing sentence meaning within the broader text. This allows them to propose corrections that enhance the translation's precision and clarity for the target audience, avoiding overly literal translations or loss of original meaning.
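Lexical consistency, mentioned in the list above, is one property that can also be checked with a simple deterministic pass before or after the LLM step. The sketch below is an illustration under assumptions: the glossary contents and the function name are hypothetical, and a real workflow would load approved terms from a client termbase.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical glossary: approved target term -> variants that should be unified.
GLOSSARY: Dict[str, List[str]] = {
    "charge de travail": ["volume de travail", "quantité de travail"],
}


def find_inconsistent_terms(text: str, glossary: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (variant_found, approved_term) pairs for every non-approved variant in the text."""
    hits = []
    lowered = text.lower()
    for approved, variants in glossary.items():
        for variant in variants:
            # Word boundaries avoid matching the variant inside a longer word.
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
                hits.append((variant, approved))
    return hits


text = "La charge de travail est élevée. Ce volume de travail m'a surpris."
hits = find_inconsistent_terms(text, GLOSSARY)
print(hits)
```

A check like this only flags candidates. Whether a variant is really the same concept in context is the kind of decision the text assigns to the LLM or to an editor.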

Manual post-editing has always been considered the gold standard, but it requires a lot of time, patience, and expertise. LLMs not only complete the task faster but do so with a high level of accuracy. Where an editor might need to reread the text multiple times to catch subtle nuances, LLMs see the entire picture instantly. LLMs take on the bulk of the work, saving time and resources, especially when translating large volumes of text.

Metrics for Translation Quality

Among the many metrics used to assess translation quality, two stand out: BLEU and COMET. Each has its characteristics and is suitable for different tasks.

BLEU (Bilingual Evaluation Understudy) is a classic and one of the most well-known metrics. It compares the translated text to one or more reference translations by counting matching words and word sequences (n-grams) in the target language. BLEU’s advantage lies in its simplicity and speed. However, it doesn’t account for depth of meaning, context, or stylistic features. This can lead to high BLEU scores for texts that do not accurately convey the original meaning or sound unnatural.
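To make the word-matching idea concrete, here is a simplified sketch of the clipped n-gram precision that BLEU builds on. It is not full BLEU: the brevity penalty and the averaging over several n-gram orders are left out, tokenisation is a plain whitespace split, and the two example sentences are invented for illustration.

```python
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-token sequences in the list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate: str, reference: str, n: int) -> float:
    """Share of candidate n-grams that also occur in the reference (counts clipped)."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


reference = "Le chat mange la souris"   # "The cat eats the mouse"
scrambled = "La souris mange le chat"   # "The mouse eats the cat"

print(clipped_precision(scrambled, reference, 1))  # every single word matches
print(clipped_precision(scrambled, reference, 2))  # only half of the word pairs match
```

The scrambled sentence reverses the meaning, yet all of its individual words appear in the reference. Longer n-grams catch part of the problem, but the score still says nothing about who eats whom.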

COMET (Cross-lingual Optimised Metric for Evaluation of Translation) is a more modern and advanced metric that looks beyond word matches to the semantic and contextual coherence between the source and translated texts. COMET excels at capturing semantic connections and can evaluate more complex aspects of translation, but it still has limitations.

Limitations of Translation Metrics

While BLEU and COMET are useful tools, they don’t always show the whole picture, especially when dealing with post-edited texts, whether edited manually or with the help of LLMs. These metrics measure how closely a translation matches a reference in wording or meaning, but they cannot capture everything: style, readability, and cultural nuances remain out of reach. For instance, a text may score high on COMET but still appear unnatural and be poorly received by the target audience.

Post-editing often requires more than just error correction — it’s about adapting the text for a specific context, enhancing its flow, and adjusting its tone. In this regard, no metric can replace a careful editor or deeper analysis. For a truly qualitative assessment, subjective factors, which automatic systems are not yet capable of considering, are essential.

Manual Evaluation of LLM Post-Edited Texts

Despite the advancements in automatic metrics, nothing replaces the keen eye of a professional editor. While algorithms correct errors, experienced editors add a human touch by identifying subtle stylistic nuances that machines often miss. They carefully choose words and phrases, ensuring the text feels natural and is contextually accurate in the target language. This attention to detail is especially crucial when tone and style are paramount.

Direct feedback from the target audience is a valuable way to assess the quality of MT post-editing. Surveys and questionnaires help determine how well the text resonates with readers, whether it sounds natural, and if it meets their expectations. Users provide insights into readability, highlight awkward sections, and evaluate the text’s overall style and meaning. This feedback complements technical assessments, offering a more emotional and human perspective.

Together, manual evaluation methods — through the work of professional editors and user feedback — offer a deeper understanding of post-editing success. While automatic metrics provide quantitative data, human evaluation captures the full richness and nuance of how a text is truly perceived.

Lingvanex Post-Editing Solutions

Lingvanex offers powerful tools for machine translation post-editing services, combining cutting-edge technologies and the capabilities of large language models (LLMs). These solutions help companies improve translation quality by adapting it to specific tasks and needs.

Let’s explore in more detail how LLMs handle post-editing texts translated into French.

Original Text:

I recently started a new job at a large multinational company. The office is located in the heart of the city, and the commute is quite convenient for me. My team consists of people from different countries, which makes every day interesting as we exchange ideas and perspectives. However, the workload has been heavier than I anticipated. I sometimes find it difficult to keep up with all the deadlines, especially since my previous job was much more relaxed. Despite the challenges, I’m learning a lot, and I appreciate the opportunity to develop new skills. I’m also getting used to the company’s culture, which emphasises teamwork and collaboration. Hopefully, I will find a better work-life balance soon.

Translation into French using Lingvanex Translator:

J'ai récemment commencé un nouvel emploi dans une grande entreprise multinationale. Le bureau est situé au cœur de la ville, et le trajet est assez pratique pour moi. Mon équipe est composée de personnes de différents pays, ce qui rend chaque jour intéressant lorsque nous échangeons des idées et des perspectives. Cependant, la charge de travail a été plus lourde que je ne l'avais prévu. J'ai parfois du mal à respecter tous les délais, d'autant plus que mon emploi précédent était beaucoup plus détendu. Malgré les défis, j'apprends beaucoup, et j'apprécie l'opportunité de développer de nouvelles compétences. Je m'habitue aussi à la culture de l'entreprise, qui met l'accent sur le travail d'équipe et la collaboration. Espérons que je trouverai bientôt un meilleur équilibre entre vie professionnelle et vie privée.

Text After Post-Editing with LLM by Lingvanex:

Je viens de commencer un nouvel emploi dans une grande entreprise multinationale. Le bureau est situé en plein centre-ville, ce qui facilite mon trajet. Mon équipe est composée de personnes issues de divers pays, ce qui rend chaque jour intéressant grâce aux échanges d'idées et de perspectives. Cependant, la charge de travail s'avère plus importante que je ne l'avais anticipé. Je peine parfois à respecter tous les délais, car mon précédent emploi était bien moins exigeant. Malgré ces défis, j'apprends beaucoup et je suis reconnaissant de pouvoir développer de nouvelles compétences. Je m'adapte également à la culture d'entreprise, qui encourage le travail d'équipe et la collaboration. J'espère trouver prochainement un meilleur équilibre entre vie professionnelle et vie privée.

Result: Overall, the text is translated correctly with Lingvanex, but some unnatural phrases remain, which post-editing successfully smooths out. Replacing the grammatical construction "J'ai récemment commencé" with "Je viens de commencer" makes the text more conversational and natural for French speakers. The new construction indicates a recent job start and fits better in this context. The phrase "au cœur de la ville" is technically correct, but "en plein centre-ville" sounds more natural for describing an office’s location in the city. The word "lourde" was replaced with "importante", which sounds more professional and stylistically better reflects the level of workload. In the phrase "J'ai parfois du mal", replacing the verb with "peine" makes the text smoother and more fluent. Simplifying the phrase "Espérons que" to "J'espère" makes it more personal and direct, improving the style of the text.

Original Text:

Coral reefs are some of the most diverse and valuable ecosystems on the planet. They provide habitat for thousands of species and protect coastlines from erosion. However, coral reefs are highly sensitive to changes in water temperature, acidity, and pollution. Climate change has led to widespread coral bleaching, where the coral expels the algae living in its tissues, causing it to turn white. This process weakens the coral, making it more susceptible to disease and death. Protecting coral reefs is vital not only for biodiversity but also for the millions of people who rely on these ecosystems for food, income, and coastal protection.

Translation into French using Lingvanex Translator:

Les récifs coralliens comptent parmi les écosystèmes les plus diversifiés et les plus précieux de la planète. Ils fournissent un habitat à des milliers d’espèces et protègent les côtes de l’érosion. Cependant, les récifs coralliens sont très sensibles aux changements de température de l’eau, d’acidité et de pollution. Le changement climatique a conduit à un blanchissement généralisé des coraux, où le corail expulse les algues vivant dans ses tissus, les faisant devenir blanches. Ce processus affaiblit le corail, le rendant plus sensible aux maladies et à la mort. La protection des récifs coralliens est vitale non seulement pour la biodiversité mais aussi pour les millions de personnes qui dépendent de ces écosystèmes pour leur alimentation, leurs revenus et leur protection côtière.

Text After Post-Editing with LLM by Lingvanex:

Les récifs coralliens comptent parmi les écosystèmes les plus diversifiés et les plus précieux de la planète. Ils offrent un habitat à des milliers d'espèces et protègent les côtes de l'érosion. Cependant, les récifs coralliens sont très sensibles aux changements de température, d'acidité et de pollution de l'eau. Le changement climatique a entraîné un blanchissement généralisé des coraux, où le corail expulse les algues vivant dans ses tissus, les faisant devenir blancs. Ce processus affaiblit le corail, le rendant plus sensible aux maladies et à la mort. La protection des récifs coralliens est vitale non seulement pour la biodiversité, mais aussi pour les millions de personnes qui dépendent de ces écosystèmes pour leur alimentation, leurs revenus et leur protection côtière.

Result: The translated text is generally good, but it requires a few corrections. Replacing "fournissent" with the more context-appropriate "offrent" makes the text smoother and more natural. In the phrase "les récifs coralliens sont très sensibles aux changements de température, d'acidité et de pollution de l'eau", post-editing moved "de l'eau" to the end of the list, so that it applies to temperature, acidity, and pollution alike, which improves clarity. A small grammatical adjustment was made: "blanches" was changed to "blancs" so that the adjective agrees with the masculine plural "coraux" rather than the feminine plural "algues", since in the source it is the coral, not the algae, that turns white. Additionally, the LLM added a comma before "mais aussi", which aligns with French punctuation rules and makes the text more grammatically sound.

Thus, as we can see, post-editing with the help of LLMs significantly improves the style and naturalness of texts. The corrections not only address grammatical and punctuation errors but also enhance the flow, structure of sentences, and selection of more natural French phrases.
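Comparisons like the two above can be partly automated. As a rough sketch, Python's standard `difflib` module can list the word-level changes between the machine translation and the post-edited version and give a similarity ratio. The function name is an assumption, and the sample uses one sentence from the coral reef example with apostrophes normalised so that typography does not show up as a change.

```python
import difflib
from typing import List, Tuple


def word_changes(mt: str, post_edited: str) -> Tuple[List[Tuple[str, str, str]], float]:
    """Return the non-equal word-level edits and the overall similarity ratio."""
    a, b = mt.split(), post_edited.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is one of "replace", "delete", or "insert"
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes, matcher.ratio()


mt = "Ils fournissent un habitat à des milliers d'espèces et protègent les côtes de l'érosion."
pe = "Ils offrent un habitat à des milliers d'espèces et protègent les côtes de l'érosion."

changes, similarity = word_changes(mt, pe)
print(changes)
print(round(similarity, 3))
```

A high ratio means the post-editor changed little. It says nothing about whether the changes were the right ones, which still calls for the human review described earlier.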

Advantages of Lingvanex in Post-Editing

  • LLM Integration for Quality Enhancement: Lingvanex integrates with LLMs for automated post-editing of translations. This means that complex phrases, specific terminology, and grammatical errors are corrected not just on a basic level, but with context and style in mind, making the text as natural and accurate as possible.
  • Tailored to Client Needs: One of the key features of Lingvanex’s solutions is the ability to fully adapt to the specific needs of clients. Every industry and business has unique translation requirements, and Lingvanex takes this into account, offering customizable solutions for every task.
  • Support for Multiple Formats and Easy Integration: Lingvanex supports numerous file formats, making it a flexible tool for various workflows. Integration into existing systems and processes is seamless, saving time and resources for companies.
  • High-Level Data Security: In an age when data protection is a priority for most companies, Lingvanex ensures a high level of security. All translations are processed according to the strictest data protection standards.

The Future of LLM Post-Editing

The future of post-editing using LLMs looks truly exciting. With each passing year, the technology becomes smarter and more accurate, making post-editing faster and more efficient. We are on the verge of a time when human intervention in the translation process will be minimal, especially when working with large volumes of text.

LLM technologies continue to evolve rapidly, and one of their key features is the ability to be fine-tuned. These models can be trained and adapted to the data of a particular company or project, making post-editing even more precise and personalised. Every new iteration improves their ability to understand context, adapt style, and handle industry-specific terminology. This paves the way for a significant reduction in reliance on manual work, especially in translating large volumes of texts.

However, despite all these advancements, the role of humans in post-editing will not disappear; it will simply evolve. Humans will act more as supervisors, overseeing the final output rather than performing the actual editing.


Frequently Asked Questions (FAQ)

What are the 3 main techniques used for machine translation?

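1. Rule-Based Machine Translation (RBMT) relies on hand-written linguistic rules and bilingual dictionaries to convert text from one language to another.

2. Statistical Machine Translation (SMT) learns translation probabilities from large collections of parallel texts and selects the most likely output.

3. Neural Machine Translation (NMT) is based on deep learning techniques, particularly artificial neural networks, which process entire sentences as a single unit rather than word-by-word. This allows the model to capture contextual meanings more effectively, improving fluency and accuracy.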

Why is neural machine translation better?

Neural machine translation better captures context and meaning across longer text segments, providing smoother and more accurate translations compared to older methods.

What is the difference between LLM and machine translation?

LLMs are general-purpose models for text generation, while machine translation is specialised for translating text between languages.

Are LLMs good at language translation?

Yes, LLMs can handle translation but may lack consistency compared to specialised systems. LLMs are highly effective at improving machine translations through post-editing by correcting grammatical errors, enhancing fluency, and adapting the text to sound more natural.

What are the disadvantages of LLM?

LLMs can be computationally expensive and require significant resources for fine-tuning. Additionally, they may still require human oversight for complex linguistic or cultural nuances.
