ChatGPT Translator VS Lingvanex which one is better

We’ve seen many articles and rave reviews claiming that ChatGPT’s translation capabilities are on par with DeepL and Google, and sometimes even surpass them. As a company who have been developing our own translation solutions for the past six years, we became curious about how true all of this is and how our solution compares to ChatGPT. Should we be worried about such a strong competitor?

chatgpt-lingvanex

To compare the quality of translation, we prepared test datasets for seven language pairs:

  • English-Spanish
  • English-German
  • English-Russian
  • English-French
  • English-Italian
  • English-Portuguese
  • English-Finnish


Each test dataset contains around 2500 lines and includes sentences of various themes, lengths, styles, and formatting, to eliminate text selection bias for a specific translator.

ChatGPT offers the API version 4 for limited access. At the moment, access is only available to previously created accounts that have already paid for version 3.5. According to reviews, version 4 has made significant advancements in terms of quality compared to version 3.5. We will check that too!

For testing, we will use two metrics: BLEU and COMET.

BLEU — a widely recognized standard for testing the quality of translation. By default, we will use the Sacre Bleu version. This version is used in the MT machine translation conference and various international competitions. In this metric, the comparison of translations is based on the number of n-grams (combinations of words) that follow one another. The goal of the metric is to find the maximum matching combinations between the translation made by a human and the one made by a machine. The comparison begins with clusters of four words. If there are none, it searches for three-word n-grams. If further matches are not found, it can go down to one n-gram. Points are awarded for each sequence of words (tokens) that the program finds. The drawback of the metric is that it does not account for synonyms, and if the translation accurately conveys the thought but with different words, it will show 0.

COMET — A metric designed to solve the problem of comparing synonyms, which metrics based on the symbolic comparison of two strings cannot handle. If the result of the translation is a semantically similar phrase, but described with different words, the metric will indicate comparable results. It should be noted that its result will also depend on the diversity of the language corpus used to construct the comparison classifier. This metric is widely used as an alternative to the BLEU metric.

Prompts we used for ChatGPT translation:

You are TranslateGPT. You translate user messages from English to Italian (Finnish / French / German / Portuguese / Russian / Spanish). You are the most accurate English to X translator in the world.

Below are graphs with test results:

English-Finnish translation

This language pair shows noticeable improvement in the translation quality of ChatGPT 4 compared to version 3.5. According to the COMET metric, ChatGPT4 slightly outperforms Lingvanex.

English-French translation

When it comes to translating into German, the situation is the same as with French. However, Lingvanex’s lag in the COMET metric is minimal.

English-German translation

When it comes to translating into German, the situation is the same as with French. However, Lingvanex’s lag in the COMET metric is minimal.

Let’s compile all the differences in a table. In red font, we’ll indicate where ChatGPT falls short of Lingvanex. In green font, we’ll mark where it surpasses. The data is relevant as of July 31, 2023.

chatgpt-lingvanex

The Lingvanex translation price was calculated based on the cost of a month’s rent for a basic GPU server (150 dollars) + the monthly price for a Lingvanex language model translation (from 100 dollars), and the number of characters that can be translated in a month with this configuration.

Conclusion

The test results show that while ChatGPT 3.5 is mostly inferior to Lingvanex in translation quality, according to the COMET metric, ChatGPT4 often matches Lingvanex. It’s worth noting that currently, translating large volumes of text with ChatGPT4 is very expensive. In order to perform the tests for this article and translate roughly 20,000 lines through ChatGPT4, a total of $45 was expended. The calculation of the translation cost can be confusing, as it’s difficult to estimate in advance, in tokens, how much you’ll end up paying for the translation.

At the moment, the translation speed of ChatGPT 4 is unstable, most likely depending on the current load on their servers. We had to take 3–4 second breaks between requests. Overall, the translation speed with the test dataset was about 8 words per second. Our solution enables the translation of several thousand words per second, even on weaker servers. In addition, we observed censorship in the translations: if a line contains obscene language, ChatGPT will not proceed to translate the entire sentence.

Therefore, ChatGPT is better used for stylistic translation of small volumes of text without special security requirements. Furthermore, styles and themes can be altered on the go. By carefully selecting prompts, you can achieve improved quality tailored to a specific task, but this requires going through a significant number of prompts.

Lingvanex translation solutions are better used where large volumes of translation are required at a low cost, with safety, speed, and stability.

We’ll admit that for some language pairs, the difference in translation quality may be different, but testing all possible pairs is lengthy and costly.

In general, solutions from ChatGPT and Lingvanex are designed for different purposes and should be chosen depending on the task.


Frequently Asked Questions (FAQ)

Does ChatGPT translate better?

ChatGPT can do a better job than some translation services in capturing stylistic nuances of the texts. For example, in comparison with Google Translate, ChatGPT understands idioms, slang and dialects better. Traditional translation services on the other hand, excel at providing more precise translations. Overall, the quality of the translation output depends on the chosen language pairs and the complexity of the texts.

Which is the best AI translator?

Currently, there are a number of translation programs and services based on AI. Many of them, for example, Google, DeepL, Lingvanex and Yandex, train their own neural models on unique data sets. The choice depends on specific tasks and requests.

How accurate is ChatGPT translation?

The quality of the ChatGPT translation performance depends on various factors: language pairs, training techniques, the genre, the length and the subject of the texts. Translations between closely related languages are more likely to be accurate. Shorter and simpler texts are easier to translate than long complex ones. Technical texts can be a challenge here. AI translation models are constantly developing and improving. The more training data they are put through - the better translation output will be.

More fascinating reads await

Best Free Apps for Slack

Best Free Apps for Slack

May 19, 2025

Speech Recognition Quality Comparison

Speech Recognition Quality Comparison

April 30, 2025

Machine Translation in the Military Sphere

Machine Translation in the Military Sphere

April 16, 2025

Contact Us

* Required fields

By submitting this form, I agree that the Terms of Service and Privacy Policy will govern the use of services I receive and personal data I provide respectively.

Email

Completed

Your request has been sent successfully

×