What is BLEU score?

Because machine translation (MT) is an automated process, its output must be evaluated regularly to ensure quality.

And how can you do that? Companies have two options. One is to have professional translators review the translations.

Of course, this would almost contradict the whole point of MT, which is a translation process that does not involve humans. The other option is an automated metric, and the BLEU score was created to solve this problem. The BLEU score is a well-known concept for those who are familiar with MT.

BLEU stands for Bilingual Evaluation Understudy, an algorithm that evaluates texts machine-translated from one language to another.

How does the BLEU score evaluate quality?

To evaluate a text’s quality, the BLEU score compares it with reference translations. In other words, the BLEU score measures the similarity between an automated translation and a professional one. Ideally, the closer the automated version is to the human-translated content, the better it is. The algorithm behind the score quantifies how “close” a text is to the reference content.

This algorithm compares consecutive word sequences (n-grams) from the MT text with those found in the reference translation and measures how much they overlap, without taking grammar or intelligibility into account. BLEU then provides a score from 0 to 1 based on this information, where 1 is the best possible result.
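The comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `ngrams` and `bleu` are hypothetical, tokenization is assumed to be a simple whitespace split, sequences of one to four words are assumed, and no smoothing is applied, so a text with no matching four-word sequence scores 0.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every consecutive n-word sequence in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU sketch: clipped n-gram precision times a brevity penalty."""
    cand = candidate.split()  # assumption: whitespace tokenization
    refs = [r.split() for r in references]
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # For each n-gram, keep the highest count seen in any single reference.
        max_ref_counts = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        # Clip candidate counts so repeating a word cannot inflate the score.
        matched = sum(min(count, max_ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matched == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precision_sum += math.log(matched / total) / max_n

    # Brevity penalty: penalize output shorter than the closest reference.
    closest_ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    if len(cand) >= closest_ref_len:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - closest_ref_len / len(cand))
    return brevity_penalty * math.exp(log_precision_sum)


reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", [reference]))        # identical to the reference
print(bleu("the cat sat on the mat today", [reference]))  # one extra word
print(bleu("the cat sat on", [reference]))                # shorter than the reference
```

The identical sentence scores 1, while the other two land between 0 and 1: the extra word lowers the precision, and the truncated sentence is reduced by the brevity penalty. Real evaluations usually rely on an established implementation and score a whole set of sentences rather than a single one.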

However, it is important to point out that few texts will ever achieve this score, as it would imply that the MT text is identical to at least one text in the reference corpus.

As you may have noticed, BLEU needs reference content in order to evaluate machine-translated texts. In fact, it is generally recommended to have at least 1,000 sentences in the reference translations to ensure a reliable evaluation.

There are a few downsides to this technology. For example, its need for a significant number of reference sentences can be a problem, depending on the nature of the translated content. However, this method is still highly popular among MT users and is one of the most cost-efficient ways to evaluate machine-translated texts.
