Post-Editing Certification
May 2017
1. Introduction
1.1. MT Development in the Last Century
1.2. A Brief History of MT at SDL
2. PEMT and Translation
2.1. Global Developments and the Localization Industry
2.2. The Right Solution for the Right Content
2.3. Content Evaluation and the Post-Editing Process
3. MT Technologies
3.1. Rules-Based Machine Translation
3.2. Statistical Machine Translation (also known as data-driven machine translation)
3.3. Hybrid MT Systems
4. How the MT Output Is Created
4.1. MT Engine Creation
4.1.1. Baselines
4.1.2. Verticals
4.1.3. Customizations
4.2. MT Engine Training
4.3. MT Output Evaluation — Testing Methodologies
5. Using the MT Output: The Basics of Post-Editing
5.1. Introduction to Post-Editing
5.2. Degrees of Post-Editing
5.3. The Quality Check Process
6. How to Get the Most out of MT
6.1. What Makes an Effective Post-Editor?
6.2. Post-Editing Quality Expectations
6.3. Under-Editing
6.4. Over-Editing
7. Expected Statistical MT Behavior
7.1. Common Patterns to Watch for When Post-Editing
7.2. How to Provide Feedback to Improve the MT Output
8. Using SDL Language Cloud in SDL Trados Studio
8.1. How to Add SDL Language Cloud as a Machine Translation Provider in SDL Trados Studio
8.1.1. Applying MT Segment-by-Segment
8.1.2. Applying MT to Whole Files/Projects Using the “Pre-translate Files” Batch Task
8.1.3. Using the Studio AutoSuggest Feature to Retrieve MT in Studio
8.2. How to Use Dictionaries in Language Cloud to Improve MT Output
8.2.1. Best Practices for LC Dictionary Creation
8.2.2. How to Upload and Maintain a Dictionary in LC
8.2.3. Testing That the Dictionary Has Been Applied
9. AdaptiveMT Engines Powered by Language Cloud
10. The Future of Post-Editing
11. Summary
12. More Resources on MT and Post-Editing
In response to the growing demand for post-editing, SDL introduced its Post-Editing Machine
Translation (PEMT) Certification in 2014 to enable translators to gain a foothold in the emerging
post-editing market. The ready availability of MT means that more material is earmarked for
translation, which in turn fuels the need for skilled and knowledgeable post-editors. PEMT is now
very much a mainstream skill for translators, and one of the main aims of the PEMT Certification
program is to give them the right skills to deal with the ever-increasing demand for post-editing.
Our certification program is geared toward anyone who is impacted by post-editing and wants to
gain a better understanding of the history and theory behind MT as well as the practical applications
of MT. The interest in and uptake of the PEMT Certification program is testament to the continuous
need for MT training, and this updated training guide aims to capture the latest MT developments
and best practice guidelines.
Machine Translation (MT) is automated translation that uses software to translate text from one
natural language to another. It is one of the oldest applications of artificial intelligence and both
facilitates and accelerates the creation of translated materials.
Why is PEMT so important in today’s market? A structured and controlled PEMT workflow allows
companies to join the fast-track MT highway and successfully deliver their brand message to a global
audience.
With this in mind, it is important to remember that MT does not replace the need for human
translation and human translators. PEMT is the process of allowing machines to do the heavy lifting
of translation, with editing and quality assurance being performed by a trained translator. MT is an
effective tool to assist translators in their everyday work.
Early attempts at MT typically failed due to a lack of coverage. The models functioned by encoding a
limited selection of transformational rules, which simply did not provide for the diversity of natural
language translation. Consequently, the first attempts to commercialize MT in the 1970s and 1980s
operated by drastically increasing the number of encoded transformational rules. This produced
Rules-Based Machine Translation (RBMT), which functioned relatively successfully with targeted
human feedback over a particular domain. However, this led to the further problem of how to make
the huge number of transformational rules needed to encode language pairs cooperate with each
other. The answer was a statistical approach to MT.
In the late 1980s, computational power increased and became less expensive. As a result, interest
picked up in Statistical Machine Translation (SMT). From the 1990s, statistical learning approaches
came to the fore, led by cutting-edge work from the research team at IBM. SMT systems no longer
required the same human effort to encode transformational rules and update lexicons and
terminology lists, but instead exploited the wealth of existing translations, covering numerous
language pairs, to extract rules based on statistical probability.
Since the 1990s, SMT has been pushed forward through intensive research and training as well as
support from industry, the US Defense Advanced Research Projects Agency (DARPA) and the
European Commission’s FP7 program. Statistical MT has been deployed in real-world, commercial
contexts by SDL, Google, Microsoft and IBM, alongside ongoing research and new developments in
the field of statistical and hybrid MT. In 2011, SMT received a boost with Google’s announcement
that it would begin charging for access to the Google Translate API. Shortly afterwards, Microsoft
also announced that it would start charging for use of the Microsoft Translator API. These two events can be viewed
as a key milestone for the machine translation industry and the localization industry as a whole. The
progression to a paid API model for machine translation was a clear sign that the use, spread and
quality of MT have matured to a level where enterprises and developers see sufficient value in MT to
invest in it.
After many decades, the models used in MT appear increasingly in line with our understanding
of how human language cognition and processing operate.
This does not mean that the MT output is of an equal standard to anything the human brain can
produce, but it reinforces that MT is an essential tool in the translation lifecycle.
MT accuracy is improving constantly, and new research and developments put MT at the forefront
of linguistic improvements. MT is now a truly interdisciplinary field, drawing from computer science,
linguistics, probability theory, algorithm design, automata theory and engineering.
1.2. A Brief History of MT at SDL
Our MT journey started back in 2000 when we acquired a Rules-Based Machine Translation (RBMT)
engine from Transparent Language, which became SDL Enterprise Translation Server (ETS). In 2004,
the Knowledge-based Translation System (KbTS) group was set up to use the MT system as part of a
high-quality translation process and offer post-editing as a service to our customers.
In 2009, Statistical Machine Translation (SMT) was starting to establish itself firmly as a contender in
the localization industry, following rapid development. SDL forged a strategic partnership with a
leading SMT developer, Language Weaver, allowing MT usage to grow exponentially.
In 2010, SDL acquired Language Weaver and has continued to invest heavily in the development and
deployment of SMT technology. KbTS was rebranded as iMT (intelligent Machine Translation) in
2011, underlining the importance of machine translation within the translation productivity
workflow. The same year also saw the execution of a radical MT growth strategy by scaling MT
projects through our in-country language offices.
Today, the MT team is a truly interdisciplinary team encompassing computational linguists, project
managers, data specialists and post-editors who educate, promote and support MT usage
throughout SDL. Best results can only be achieved through an integrated effort, and we are
supported by two research labs focusing on MT technology, as well as by SDL’s language offices,
where MT is fully integrated into the production lifecycle.
Recent improvements to machine translation technology at SDL include the implementation of XMT
technology.
Developed by the SDL Language Research Group based in Los Angeles, California, XMT is a complete
rewrite of MT technology by a team who understands the limitations of previous generations of
machine translation.
While previous technologies apply a single translation algorithm universally to all language pairs,
XMT allows different translation algorithms to be used for different language pairs, providing much
higher-quality output. SDL XMT incorporates all previous innovations and algorithms, and its modular,
robust design also enables new technologies and innovations to be integrated rapidly.
Another recent exciting release saw the implementation of real-time MT learning mechanisms under
the name AdaptiveMT. This technology will allow XMT MT engines to learn and remember the
translation preferences of individual users.
This development will form the basis of a recalibration of the relationship between MT and
translation experts, and will put the post-editor firmly at the heart of the MT improvement cycle.
Section 9 of this manual focuses on the latest developments in AdaptiveMT at SDL.
With regard to the application of neural network technology to machine translation, SDL has been
experimenting with this approach for some time. Information about developments on this front will
follow in the near future.
Why is post-editing so important? Now more than ever, language is a business requirement. An
estimated three-quarters of web users take advantage of free translation tools thanks to the greater
accessibility and integration of MT solutions. Over 90% of non-native English speakers use MT
software to translate English websites they visit. Nowadays, businesses need their websites to cover
14 languages to reach 90% of the most economically active people. Overall, there is an enormous
growth in digital content which is fueling localization volumes.
Furthermore, the importance of English as a global lingua franca is slowly decreasing. In recent
years, the two languages with the greatest growth on the Internet were Arabic and Mandarin
Chinese — both of which grew twentyfold. In contrast, content in English only trebled.
Proportionally, English is declining in importance. It is estimated that by 2020, English will have lost
its status as lingua franca altogether. However, rather than English being replaced by another natural
language, linguistic diversity will become the new status quo, and translation will be key to
communication.
How can MT and post-editing help respond to these trends? The only answer to the digital content
explosion is an automated solution with the ability to accelerate content availability and scale to
enable a faster time to market. Structured PEMT solutions create the framework to accommodate
the ever-increasing volume of data, which is set to outstrip the current capacity of both conventional
translators and post-editors.
2.2. The Right Solution for the Right Content
The last few years have seen significant investment and progress in MT technology. SDL remains at
the forefront of MT development and uses a structured PEMT process to increase efficiency while
delivering a high-quality end product. Our MT technology is integrated with SDL’s translation
environments across the translation workflow: SDL Trados Studio, TMS and WorldServer.
It is worth noting that language service providers such as SDL are not the only ones implementing an
MT roadmap — many of our clients have mature MT strategies and rely on MT as an integral part of
their localization process.
With so many of SDL’s clients and partners looking to MT as a standard, there has been a shift in
terms of which domains and content types can be considered for machine translation. Content types
that were considered unsuitable a couple of years ago can now be handled productively using
machine translation.
Human translation
Post-editing to publishable level: MT output is post-edited by professional
linguists to a quality level equivalent to conventional translation. Post-editing MT content is
the preferred solution for publishable documents. It is used as part of a high-quality
translation process
Light post-editing or post-editing to understandable quality: MT output is post-edited to a
level suitable for an acceptable and actionable translation, not with perfect grammar and
style. This is a viable option for perishable and low-visibility content
Raw MT or FAUT (Fully Automated Useful Translation): MT is generated by baseline engines
or customized engines and the output is used directly, with no human intervention. This
solution is used mostly for content such as emails, support content or instant messages,
where the user wants to have an idea of the content, without the need for high quality
Deciding which content can successfully be translated using MT is a commercial and financial
decision, and it is important to follow best practices when taking it.
Many companies now consider MT the only viable option to process the volume of content they
need to localize. MT also allows the translation of content that would previously have remained
untranslated. An example of this is UGC (user-generated content). Most clients have some UGC in
their content portfolio and are facing increasing demands to make it available to a global audience.
This cannot usually be achieved with human translation; however, MT is ideally suited to translating
this high-volume, perishable content.
Post-editing machine translation is no longer a marginal translation activity but has become a
mainstream way to satisfy the translation productivity needs of our data-driven world. Post-editing
is now an established practice, widely used by translators and routinely taught at universities.
Translators and linguists are an integral part of any post-editing process, and their expertise is
needed throughout. As a language services and machine translation provider, SDL offers its
customers over 600 MT engines for a variety of language pairs. This is only possible with the right
linguistic support — SDL’s linguists and in-house and freelance translators are involved in every stage
of the MT lifecycle, from rigorously testing MT solutions before deployment to providing the
feedback needed for engine improvements.
A careful content evaluation should be at the start of every successful post-editing implementation
process. Some types of content are simply not suitable for an MT process as they require a more
nuanced approach harnessing skills that can only be provided by human translators.
A content audit will help determine whether PEMT is the right solution for the content. The current
and future needs of a project should be evaluated on the basis of:
Quality of the evaluated source data (issues deriving from language errors, formatting and
authoring)
Volume requirements that are too capital and time intensive for human translators to
manage
The technical integration required to produce a production workflow (CAT tools/CMS and
workflow systems)
Chances of MT translation success (based on previous experiences)
Companies can achieve the most significant gains through PEMT by targeting content that is both
strategic and high volume with tight turnaround times.
As part of the wider process, it is critical for the evaluator to identify and record not just the content
types where PEMT has been successful but also where PEMT has not met project goals. Collecting
feedback from editors and users is critical for this to happen. A continual feedback loop will
eventually increase the speed at which evaluations take place and help predict translation outcomes
more accurately.
3.1. Rules-Based Machine Translation
Chronologically speaking, Rules-Based Machine Translation (RBMT) was the first approach to
automated translation. RBMT uses a linguistics-based set of rules in combination with a dictionary.
This is also known as knowledge-driven MT. A language pair is built by looking at the construction of
both source and target, taking into account source and target grammar and vocabulary. This is a
transfer-based approach with three translation phases: analysis, transfer and generation. It involves
parsing a source sentence, analyzing the structure, converting this to a machine-readable code and
then transforming it into the target, as shown below:
The core system is based on a set of grammatical rules for each language pair, combined with a
dictionary. The dictionary contains source words and phrases, their translation and detailed
grammatical information, such as the part of speech and inflection. It provides the modules with the
linguistic knowledge they need.
The rules are the “linguistic processor” of the system, responsible for analysis and generation. They
use linguistic information stored in the dictionary. These rules are intended to represent the
grammatical knowledge of speakers and specify inherent agreement and relational information.
Example: the determiner and noun need to agree in number and gender, and the subject and finite verb need to agree in number.
At the translation stage, the MT engine analyzes each source sentence and tags the words and
phrases with their part of speech to identify grammatical components (for example, the subject,
object and verb). The MT system then looks up the translations of these grammatically tagged words
and phrases in the machine dictionary and combines them using the coded language rules for the
target language. This builds the translated sentence.
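The three-phase transfer process described above can be sketched in miniature. The dictionary entries, tags and agreement rules below are invented for illustration; a production RBMT system encodes many thousands of such rules and a far richer dictionary.

```python
# Toy rules-based transfer: English -> Spanish, illustrating analysis
# (tagging), transfer (dictionary lookup) and generation (agreement
# and reordering). All entries and rules are invented examples.

DICTIONARY = {
    "the":  {"pos": "DET",  "es": "el"},
    "red":  {"pos": "ADJ",  "es": "rojo"},
    "cars": {"pos": "NOUN", "es": "coche", "number": "plural", "gender": "m"},
}

def translate(sentence):
    # Analysis: tag each source word using the dictionary
    tagged = [(w, DICTIONARY[w]) for w in sentence.lower().split()]
    noun = next(e for _, e in tagged if e["pos"] == "NOUN")
    out = []
    for word, entry in tagged:
        lemma = entry["es"]  # transfer: look up the target lemma
        # Generation: enforce number agreement with the head noun
        if entry["pos"] == "DET":
            lemma = "los" if noun["number"] == "plural" else "el"
        elif entry["pos"] in ("ADJ", "NOUN") and noun["number"] == "plural":
            lemma += "s"
        out.append((entry["pos"], lemma))
    # Generation: Spanish places the adjective after the noun
    order = {"DET": 0, "NOUN": 1, "ADJ": 2}
    return " ".join(w for _, w in sorted(out, key=lambda t: order[t[0]]))

print(translate("the red cars"))  # -> "los coches rojos"
```

Even in this tiny sketch, the agreement and reordering logic is hard-wired per language pair, which hints at why scaling RBMT to full natural language requires so many rules.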
Disadvantages:
It is very time consuming to develop a set of rules and a dictionary for any given language
pair
This approach is limited in terms of applications, as source content needs to be well written
to generate good output
Translations are often literal
Rules-based systems do not allow for context sensitivity in terms of style or terminology
Challenges of RBMT
RBMT allows for excellent terminology control. There is no need for pre-existing TMs, as project
dictionaries can be created from scratch and the output is systematic (rightly or wrongly), allowing
experienced post-editors to work quickly and reliably. However, it can take a number of years to
develop a new language pair and the source must be well written to generate good output.
As explained previously, RBMT uses a linguistics-based set of rules together with a dictionary.
Translation rules are encoded manually. Issues arise from the number of rules that need to be
created and the fact that rules conflict with each other. In addition, both words and grammar can be
ambiguous. One word can have several meanings (for example, “bank”) and different grammatical
rules apply only in certain contexts.
RBMT output is often not very fluent or sensitive to context, providing a single translation per term.
The creation and maintenance of project dictionaries can also be time consuming.
3.2. Statistical Machine Translation
A Statistical Machine Translation (SMT) system learns to translate by analyzing large volumes of
previously translated content. How does this work? A large database of aligned source and target
texts is entered into a statistical learning system. On the basis of these examples, the system creates
an engine for automated translation. In essence, the system learns how to translate by analyzing the
statistical relationships between source and target data.
The starting point for training an engine is an aligned corpus of source and translated sentences
containing hundreds of millions of words. The training process subdivides each of the source
sentences into words and series of words (n-grams) and analyzes the associated translated
sentences. In this way, the training process determines the most likely set of translations for each n-
gram in the source. By analyzing just the translated content, the training process learns the order in
which the translated words are most likely to occur. The more training data and the more
consistency there is in the data, the more accurate the process becomes.
In the next stage of the process, the system compiles all of the learned data into the runtime MT
engine. The runtime MT engine subdivides each sentence into smaller chunks and looks up the
possible translations in the compiled database. For a given source sentence, this process results in
many possible translated sentences. The MT engine uses the statistical data on the probability of a
translation and the word order to determine the best candidate for the MT output.
The following graphic illustrates how SMT learns from translations rather than using language rules.
It shows very simply how the system uses statistical analysis to decide how to translate the Spanish
word “banco”.
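That statistical choice can be sketched with a toy corpus. The sentence pairs and counts below are invented; a real SMT system works with millions of aligned sentences and conditions on surrounding context rather than simple co-occurrence.

```python
from collections import Counter

# Toy aligned corpus of (Spanish, English) sentence pairs.
# The sentences and counts are invented purely for illustration.
corpus = [
    ("ingresó dinero en el banco", "paid money into the bank"),
    ("el banco subió los tipos", "the bank raised its rates"),
    ("se sentó en el banco", "she sat on the bench"),
]

# "Training": count which English candidates co-occur with "banco".
translations = Counter()
for src, tgt in corpus:
    if "banco" in src.split():
        for candidate in ("bank", "bench"):
            if candidate in tgt.split():
                translations[candidate] += 1

# "Decoding": choose the statistically most likely translation.
total = sum(translations.values())
best, count = translations.most_common(1)[0]
print(f"p({best} | banco) = {count}/{total}")  # bank wins with 2/3
```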
The quality of the MT output depends on both the linguistic and the technical quality of the material
included.
Disadvantages:
Large databases of sufficient quality are required
Engine cannot be influenced directly
Little control over terminology
Compared with RBMT, statistical machine translation can offer a larger number of languages for
post-editing, as engines cost less and are faster to train, as well as being easier to maintain. Because
SMT is trained using “real” sentences and phrases, the direct output tends to be more fluent than
RBMT output.
The following table summarizes the key differences between SMT and RBMT.
[Table: SMT vs. RBMT compared on criteria including raw fluency and raw accuracy]
3.3. Hybrid MT Systems
Hybrid MT can be any combination of SMT and RBMT technology. There are several forms of hybrid
MT, which are based on different approaches:
Coupling two or more existing systems in series or in parallel without modifying either
Using either RBMT or SMT as the basic architecture and extending it with either knowledge-
driven or data-driven components
Using a new hybrid architecture that combines knowledge-driven or data-driven
components
A common form of hybrid MT is RBMT post-processed with SMT (also known as statistical
smoothing). RBMT is used as a starting point for translation and SMT is added as a post-
process to correct the output and improve the fluency of the RBMT system
Disadvantages:
Risk of introducing errors with SMT
SDL takes a three-pronged approach to SMT and uses the following engine types, matching the
solution to the particular use case:
Baselines
Verticals
Customized engines
We will now explore the characteristics of each solution in more detail. It is helpful for post-editors
to know which type of engine they are working with in order to determine the correct approach for
post-editing.
4.1.1. Baselines
The baselines are the core generic engines developed by SDL for any given language pair and contain
hundreds of millions of words of bilingual data. The baseline systems are used as the starting point
for new language directions. They use existing translation databases to build up language pairs. The
data used is mined from reliable sources available in the public domain, such as news, IT
documentation, technical manuals and publicly available government material, and covers a variety
of subjects, for example IT, automotive, news, sports and electronics.
Baselines can be used as powerful backup engines for customizations and verticals. This means that
if a word, phrase or grammatical structure is not found in the training data for a vertical or
customized engine, the system can fall back on the baseline data to translate it.
This solution produces good results for clients who require immediate access to MT or do not have
sufficient data volumes for a custom solution. Baselines work particularly well if the content for
translation is general and varied.
4.1.2. Verticals
Verticals are trained statistical engines that are exclusive to a specific domain or subject area, for
example automotive, IT, electronics or travel. The data used for creating a vertical is selected from
sources within a domain or an industry. The MT output is more likely to follow domain-specific
technical terminology.
This solution can be used if there is not enough project-specific data to create a customized engine.
These domain-specific engines therefore provide a point of entry for projects with small project TMs.
Verticals are off-the-shelf solutions that also prove useful in cases where there is not enough time to
create a project-specific solution before the project starts.
Owing to the higher volume of data used in a vertical compared with a customization, the engine
is less likely to take translations from the baseline and is therefore more likely to produce a specific
technical translation instead of a general one. However, as the data used to create a vertical will
come from different sources within a domain, the post-editor will need to look out for
inconsistencies in terms of style and terminology.
SDL verticals are available for a range of domains in a wide range of languages.
These engines are reviewed on a regular basis and are retrained when new data or new technical
features become available.
4.1.3. Customizations
One of the biggest advantages of a customized engine is the adherence to client-specific terminology
and style. As the machine translation output is fully based on the bilingual corpus, with no
syntactical or lexical data added, the quality of the output can only be as good as the quality of the
corpus. If the corpus data contains inconsistent terminology or style, the resulting MT output may
also show inconsistencies.
Baselines work well for general and varied content. Verticals are an off-the-shelf solution when no or
not enough client data is available for a customization. Customized engines are trained with
customer-specific data and are the recommended solution for specific client projects.
The type of MT solution chosen for a particular client or project not only depends on the type of
content to be translated, but also on the MT use case (for example raw MT versus post-editing).
Choosing the right solution within this context serves to improve the usefulness and efficiency of MT
and post-editing.
4.2. MT Engine Training
Building an engine to handle large data sets across several file formats requires a structured
process. Specialist tools at the data intake stage are required to optimize the data and
ensure that it is understood, cleaned and prioritized. Engine design must also include a view
of future translation requirements to ensure the ability to process data quickly and provide
insightful and relevant analytics further down the road.
Successful engine training is based on fully understanding the quality expectations for a
particular project. This includes having access to important project assets such as
terminology and style guides. These must be incorporated into the MT engine training and
used to measure the success of the output.
Once the training data corpus (a large collection of words, typically a translation
memory) has been selected, it is the role of the computational linguists, working with
data specialists and engineers, to decide how to optimize the data, with the key
objective of ensuring high-quality MT performance.
Data cleaning can also take place at this stage. This is a process applied to the
training corpus in order to make it compatible with the platform where the SMT
engines are created. This process improves the quality of the data by removing
content that could adversely affect the MT output, such as tags, entities, misaligned
segments and corruptions. These elements could appear in the output and provoke a
drop in productivity. Some parts are also harmonized to achieve MT output that will
be faster to post-edit, as fewer changes will be required.
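A cleaning pass of this kind might be sketched as follows. The heuristics shown (tag stripping, empty-segment removal and a length-ratio filter for misaligned pairs) are illustrative assumptions, not SDL's actual pipeline, which applies many more checks.

```python
import re

def clean_corpus(pairs, max_ratio=3.0):
    """Filter an aligned corpus of (source, target) segment pairs.

    Strips markup tags, drops empty segments and discards pairs whose
    source/target length ratio suggests a misalignment. The threshold
    is illustrative, not a vendor's actual setting.
    """
    cleaned = []
    for src, tgt in pairs:
        src = re.sub(r"<[^>]+>", "", src).strip()   # strip inline tags
        tgt = re.sub(r"<[^>]+>", "", tgt).strip()
        if not src or not tgt:
            continue                                # drop empty segments
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_ratio:
            continue                                # likely misaligned
        cleaned.append((src, tgt))
    return cleaned

pairs = [
    ("<b>Press OK</b>", "Pulse Aceptar"),
    ("Save the file", ""),                               # empty target
    ("Yes", "Una frase larguísima que no corresponde"),  # misaligned
]
print(clean_corpus(pairs))  # -> [('Press OK', 'Pulse Aceptar')]
```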
There is a close relationship between engine data and future client content. A
combination of automated processes and human validation ensures that tests
remain relevant for use with future data. Being able to spot obvious issues such as
corruption in a million-word data set (impossible for the human eye) is critical when
assessing content. Recording the results in a structured way will ensure that lessons
are learned for future efforts.
It is likely that several attempts are needed (possibly simultaneously) in order to find
the optimal engine. The choice of engine design must be agile, based on experience,
incorporate knowledge of previously successful configurations and draw on an
inherent understanding of statistical machine translation behavior. Trainers must be
able to incorporate industry-standard automated evaluations to narrow down the
best engine candidates for human evaluation.
4.3. Measuring MT Quality
Once an MT solution has been proposed for a specific use case, it is important to measure how well
it performs. To achieve this, a number of tests can be performed, both human and automated.
Human evaluation of MT quality normally uses Likert-based scales. With this method,
evaluators are asked to score aspects of the MT output by following a list of parameters associated
with a numerical scale. For example, “score 5 if the output is entirely correct, score 4 if the output is
understandable but has grammatical errors,” and so on. This kind of assessment mainly focuses on
the understandability, utility and actionability of the MT output, although some vendors have
started looking into Likert-based scales that could help assess the post-editing effort.
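As a toy illustration, Likert scores from several evaluators might be aggregated as follows (the segment names, the three-evaluator setup and the 1–5 scale are invented for the example):

```python
# Aggregating 1-5 Likert adequacy scores from human evaluators.
# 5 = entirely correct, 4 = understandable with grammatical errors, etc.
from statistics import mean

scores = {
    "segment_1": [5, 4, 5],   # scores from three evaluators
    "segment_2": [3, 3, 2],
    "segment_3": [4, 5, 4],
}

# Average per segment, then an overall engine-level average.
segment_averages = {seg: mean(vals) for seg, vals in scores.items()}
overall = mean(segment_averages.values())

print(round(overall, 2))
```

In practice the per-criterion breakdown (fluency vs. adequacy, for instance) matters as much as the overall average, since it points at what kind of post-editing effort to expect.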
Human evaluation can also be used to compare two or more MT engines or systems in order to
select the best one, and is based on the evaluator stating their preference between two or more MT
outputs generated for the same source sentences.
Human evaluation testing is also used to assess the level of productivity enhancement when working
with MT compared to human translation. In this case, the tests should try to match the real-life
production environment of the user while still focusing on the quality of the MT output. Some tests
mimic the features of a standard production environment while recording detailed analytics. Most
productivity tests in the industry are based on a combination of measuring post-editing speed, and
post-editing effort and distance, or comparing post-editing speed with conventional translation
speed. This makes it possible to correctly predict the increased productivity that post-editing can
offer.
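As a minimal illustration of the speed comparison described above (the throughput figures are invented):

```python
# Comparing post-editing speed with conventional translation speed,
# measured here in words per hour.
def productivity_gain(pe_words_per_hour: float, ht_words_per_hour: float) -> float:
    """Relative gain of post-editing over translating from scratch."""
    return (pe_words_per_hour - ht_words_per_hour) / ht_words_per_hour

# A post-editor processing 750 words/hour against a 500 words/hour
# conventional baseline shows a 50% productivity increase.
gain = productivity_gain(750, 500)
print(f"{gain:.0%}")
```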
The main disadvantages inherent in human evaluation are higher costs and the fact that humans are
prone to subjectivity. However, as the use case becomes more complex (such as in the case of
PEMT), so does the requirement to test in a way that is relevant to the particular project. Quality and
productivity evaluations that involve human testers are naturally more costly, but provide conclusive
results overall.
In the last few decades, many methods for automated evaluation have been proposed. For some use
cases, these offer a quick, recognized and cost-effective way to analyze the potential quality of an MT
engine.
Most automated measures assess the quality of the machine translation compared to a reference
translation that is deemed to be high quality. Some of the most widely used approaches are detailed
below.
BLEU (BiLingual Evaluation Understudy) score: This algorithm aims to evaluate the quality of text
that has been machine translated. The central idea behind BLEU is “the closer a machine translation
is to a professional human translation, the better it is.” To assess this, scores are calculated for
individual translated segments—generally sentences—by comparing them with a set of good-quality
reference translations. Those scores are then averaged over the whole corpus to reach an estimate
of the translation’s overall quality. Even though BLEU has become a standard in the industry, it has
its limitations. Intelligibility or grammatical correctness are not taken into account explicitly, for
instance, as they are supposed to be included in the correct reference translations.
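The core idea of clipped n-gram precision plus a brevity penalty can be sketched in a few lines. This simplified version only goes up to bigrams and scores a single sentence, whereas real BLEU uses up to 4-grams over a whole corpus (toolkits such as NLTK or sacreBLEU implement the full metric):

```python
# Simplified BLEU sketch: modified n-gram precision (up to bigrams),
# geometric mean, and a brevity penalty for short hypotheses.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(hyp, ref, n):
    hyp_counts = Counter(ngrams(hyp, n))
    ref_counts = Counter(ngrams(ref, n))
    if not hyp_counts:
        return 0.0
    # Clip each hypothesis n-gram count by its count in the reference.
    clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    return clipped / sum(hyp_counts.values())

def simple_bleu(hyp, ref, max_n=2):
    precisions = [modified_precision(hyp, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * geo_mean

ref = "the cat sat on the mat".split()
print(simple_bleu("the cat sat on the mat".split(), ref))  # 1.0 for an exact match
print(simple_bleu("the cat on mat".split(), ref))          # lower for a degraded hypothesis
```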
NIST: The name of this metric comes from the US National Institute of Standards and Technology.
The measure is based on BLEU but differs from it in several ways: NIST gives more weight to
n-gram matches that are rarer and therefore more informative, and some penalties are calculated
differently. For example, small variations in translation length do not impact the overall NIST
score as much as they do in BLEU.
METEOR (Metric for Evaluation of Translation with Explicit ORdering): This metric was designed to
address some of the problems found in the more popular BLEU metric, and also produces a good
correlation with human judgment at the sentence or segment level (this differs from the BLEU
metric in that BLEU seeks correlation at the corpus level). With this system, several features that had
not been part of any other metrics at the time were introduced. Matches in METEOR are made by
following the parameters below, among others:
Exact words: As with other metrics, a match is made if two words are identical in the
machine translation output and the reference translation
Stem: Words are reduced to their stem form. If two words have the same stem, a match is
also made
Synonymy: Words are matched if they are synonyms of one another. Words are considered
synonymous if they share any synonym sets according to an external database
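The staged matching can be illustrated with a toy sketch. The suffix-stripping "stemmer" and the synonym table below are deliberately tiny stand-ins; the real metric uses proper stemmers and WordNet synonym sets:

```python
# Toy illustration of METEOR's staged matching: exact, then stem,
# then synonym. All linguistic resources here are minimal stand-ins.
def toy_stem(word):
    # Crude suffix stripping, standing in for a real stemmer.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Stand-in for WordNet synonym sets.
SYNONYMS = {"fast": {"quick", "rapid"}, "quick": {"fast", "rapid"}}

def match_stage(hyp_word, ref_word):
    if hyp_word == ref_word:
        return "exact"
    if toy_stem(hyp_word) == toy_stem(ref_word):
        return "stem"
    if ref_word in SYNONYMS.get(hyp_word, set()):
        return "synonym"
    return None  # no match at any stage

print(match_stage("cat", "cat"))         # exact
print(match_stage("walking", "walked"))  # stem
print(match_stage("fast", "quick"))      # synonym
```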
Levenshtein distance: This metric measures the similarity or dissimilarity (“distance”) between two
text strings by calculating the minimum number of single-character edits (insertions, deletions and
substitutions) required to change one string into the other. In the field of machine translation, this
can be applied by comparing the raw MT output to the human translation.
The Levenshtein distance between “dog” and “frog” is 2, as it is not possible to convert the first
word into the second with fewer edits (replace “d” with “f” and add “r”).
The Levenshtein distance is bounded by the length of the longer input string: even if two strings
have nothing in common, the minimum number of edits will never exceed the number of characters
in the longer string.
Example: If we have “computer” and “alibi”, the Levenshtein distance will be 8 and no higher than 8:
Replace “c” with “a”
Replace “o” with “l”
Replace “m” with “i”
Replace “p” with “b”
Replace “u” with “i”
Delete “t”
Delete “e”
Delete “r”
Example:
MT: “If I go home after 10pm, I will let you know”
Reference human translation: “I will let you know if I go home after 10 pm”
In this case, the MT output is correct and no changes would be necessary during a post-editing stage.
However, the Levenshtein distance will be quite high, as many changes would be required to turn
the first sentence into the second one.
This demonstrates once again the importance of selecting large test beds to run any of these
automated evaluations on, as that will allow us to obtain more reliable results.
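The distance itself is computed with a classic dynamic-programming algorithm; a minimal implementation that reproduces the examples above:

```python
# Standard dynamic-programming Levenshtein distance.
def levenshtein(a: str, b: str) -> int:
    # prev[j] holds the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[len(b)]

print(levenshtein("dog", "frog"))        # 2
print(levenshtein("computer", "alibi"))  # 8
```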
TER and TERp: Translation Edit Rate (TER) is a word-based metric that calculates the minimum
number of edits required to match an MT output to a correct reference translation, normalized by
the length of the reference:

TER = # of edits / average # of reference words
TERp (TER-Plus) is an extension of Translation Edit Rate (TER) and builds on the success of TER as an
evaluation metric and alignment tool. At the same time, it addresses several of TER’s weaknesses
through the use of paraphrases, morphological stemming and synonyms, as well as edit costs that
are optimized to correlate more closely with various types of human judgments.
Put simply, TERp measures the number of edits that are necessary to go from the raw MT output to
a final edited version. As such, it is a helpful metric to measure typing and editing effort.
The TERp score is typically expressed as a number from 0 to 100; the higher the number, the more editing was required.
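A simplified sketch of the TER calculation, assuming a single reference and ignoring the block shifts that full TER counts as single edits:

```python
# Simplified TER: word-level edit distance normalized by reference
# length, expressed as a percentage. Real TER additionally treats
# shifts of whole word blocks as single edits.
def word_edits(hyp, ref):
    # Word-level Levenshtein distance between two token lists.
    prev = list(range(len(ref) + 1))
    for i, hw in enumerate(hyp, start=1):
        curr = [i]
        for j, rw in enumerate(ref, start=1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[len(ref)]

def simple_ter(hyp_sentence, ref_sentence):
    hyp, ref = hyp_sentence.split(), ref_sentence.split()
    return 100.0 * word_edits(hyp, ref) / len(ref)

print(simple_ter("the cat sat on the mat", "the cat sat on the mat"))  # 0.0
print(simple_ter("a cat sat on a mat", "the cat sat on the mat"))      # two substitutions
```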
Another alternative present in the industry is TAUS’s Dynamic Quality Evaluation Framework (DQF),
which was developed to tackle the general problem of evaluating translation quality.
The framework allows users to profile their content and receive guidance on best-fit evaluation
methods. DQF provides a vendor-independent environment for evaluating translated content.
A knowledge base documenting best practices provides detailed practical information on how to
carry out different types of quality evaluation. By establishing best practices, metrics and
benchmarks within a dynamic framework, best-fit evaluation approaches are applied depending on
content type and usage.
In conclusion, automatic measures do present some limitations. Especially in the case of assessing
productivity, the cognitive effort made by the post-editor to deliver a high-quality end product is not
really accounted for.
The task of post-editing really amounts to much more than purely typing and editing. For that
reason, automated metrics are more suited to engine training development and comparison, but are
not necessarily practical in a production scenario. Human evaluations are difficult to replace in this
sense.
However, testing methodologies also need to evolve in response to the demands of the MT
industry. The potential to use Translation Edit Rate as a measure of MT productivity in particular
has generated considerable interest, and many MT clients are well aware of it.
For us, it is important that our engines are fit for purpose, and our current testing methodology
ensures that the engine quality is good enough for post-editing. However, we know that our testing
methods also need to evolve in line with MT developments.
Translators new to post-editing will develop this skill with time. Post-editors will not be fully
productive from day one as they need to learn their trade. Industry research has shown that
experience is the single most important factor in translation productivity, and this becomes even
more influential in post-editing. Over time, translators learn strategies that help them adapt their
working practices to use the MT output to their advantage.
When preparing a file for post-editing, the translation memory is applied as usual to obtain the 100%
and fuzzy matches. Machine translation is applied to any untranslated text left after the TM is
applied.
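The cascade can be sketched schematically as follows. The similarity measure and the machine_translate() stub are placeholders rather than a real CAT-tool API, and the 75% threshold mirrors a typical minimum fuzzy match value:

```python
# Schematic pre-translation cascade: TM matches above a fuzzy threshold
# are applied first; MT fills in whatever remains untranslated.
from difflib import SequenceMatcher

# Toy translation memory: source segment -> target segment.
TM = {"Click Save to continue.": "Klicken Sie auf Speichern, um fortzufahren."}

def best_tm_match(segment, threshold=0.75):
    best = max(TM, key=lambda s: SequenceMatcher(None, segment, s).ratio())
    score = SequenceMatcher(None, segment, best).ratio()
    return (TM[best], score) if score >= threshold else (None, score)

def machine_translate(segment):
    return f"<MT:{segment}>"  # stand-in for a real MT engine call

def pretranslate(segment):
    target, score = best_tm_match(segment)
    if target is not None:
        origin = "100%" if score == 1.0 else "fuzzy"
        return target, origin
    return machine_translate(segment), "AT"  # AT = automated translation

print(pretranslate("Click Save to continue."))   # served from the TM
print(pretranslate("Restart the application."))  # falls through to MT
```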
The post-editing phase itself involves a number of key stages. Since the post-editor is attempting to
be as efficient and productive as possible, preparation is key. Do not rush ahead without taking time
to consider the source and MT output. Determine the useable parts and then build around these.
Focus on accuracy, without under-editing or over-editing, and finally check over the grammar and
terminology.
• Determine the usable elements (single words and phrases) and make them the basis
for your translation
• Build from the MT output and use every part of the MT output that can speed up
your work
• Correct any grammatical errors and make sure that the terminology of the MT output
is compliant with glossaries and termbases. These aspects will always need to be
checked, as any inconsistencies in the training material will be reproduced in the
output
• Finally, after post-editing each segment, reread your translation and make sure that
no details are missing and you have not left any words that are not needed
5.2. Degrees of Post-Editing
The market makes a distinction between two degrees of post-editing: post-editing to publishable
quality and post-editing to an understandable level or light PE. Quality expectations need to be
determined in conjunction with the customer — end quality is dependent on client requirements.
Post-editing to publishable level is the highest quality standard. This is in line with the expectations
of the majority of SDL’s clients. After post-editing, files undergo a quality check to ensure that the
translation is correct and fluent. The final quality should be comparable to conventional translation.
When post-editing a text to understandable quality, the user should not expect a grammatically or
stylistically perfect translation. Grammar and spelling mistakes will only be corrected if the meaning
is affected.
Instructions in the text must be actionable, not perfectly worded
The translation is not expected to be perfect but should be actionable. As an example, if
a knowledge base article or FAQ is post-edited to understandable quality, the user
should be able to understand how to fix a particular problem.
Terminology is appropriate in context, not client-specific
For light post-editing, the post-editor will be asked to fix critical terminology errors that
prevent the end-user from understanding the concept (NB: Automatic checks for client-
preferred terms can only be corrected if a glossary in MultiTerm format is provided).
Text is understandable, not consistent
Light post-editing does not allow the post-editor to check consistency across the text.
Information is accurate, not localized to in-country standards
Wrong punctuation is not corrected
The following aspects are not corrected in light post-editing:
• Punctuation
• Consistency
• Country standards
• Register and tone
The following table lays out the differences in quality requirements for different use cases.
When quality checking, always bear the MT in mind and understand the initial MT output. Identify
known problems in advance (see Section 7) and make sure to include them in your checks (e.g.
wrong prepositions, terminology, known issues with MT). It is important to learn to distinguish
between what needs to be changed and what can remain untouched. Note that there are some
items that always need to be amended by the post-editor. Examples include date formats, spacing,
wrong prepositions or terminology issues caused by several possible translations of the same word.
When quality checking post-edited material, focus on over-editing and under-editing (depending on
style and client requirements) (see Sections 6.3 and 6.4). Over-editing will lead to lower productivity
and needs to be avoided during both the PE and the QA check phase. Under-editing may result in
quality issues and will impact negatively on the time needed for quality checking.
Before starting a quality check, make sure that all the content has been translated. Then check that
the post-edited text reads well from a user’s point of view. The post-edited text must match the
source. Be careful to look for mistranslations, words left out of the translation or additional words
that are not in the source text. Check that there are no typos. Scrolling down the file will enable you
to spot spelling mistakes and inconsistencies. Terminology, especially product names, should be
consistent with the master glossary. Sometimes, terminology is not consistent in the TMs and there
are additional lists and guidelines for terminology. Finally, check that the overall style is consistent
with the rest of the files and complies with the style guide from the client.
The following guidelines will help you identify usable parts and achieve the maximum post-editing
productivity. The translator needs to achieve publishable quality at the post-editing stage without
sacrificing translation speed. Once you have learnt to identify usable parts and to use them, you will
find post-editing easier and faster than translating from scratch. Like any new skill, however, there is
a learning curve with MT post-editing — but the more you practice, the faster and easier it gets.
Post-Editing Tips
Apart from this, it is important to bear in mind that account knowledge is important for post-editing
as well. While this is important for all translation projects—conventional as well as MT—a solid
knowledge of the project requirements with regard to style guidelines, terminology, the TM and
client expectations will help you achieve good post-editing productivity.
The key skills and attributes of a successful post-editor include:
• Excellent linguistic skills
• Domain and subject knowledge
• Proficiency with CAT tools and automated text-checking
• A positive attitude towards MT
• Practice!
6.2. Quality Expectations
The quality expectations will vary according to the degree of post-editing and the client
requirements. However, certain general principles apply. The aim is to deliver a high-quality
translation more quickly than a conventional translation. Translation speed is a key factor when
post-editing. Therefore, the machine translation needs to be corrected with a view to maintaining
efficiency.
When post-editing to publishable quality, there should be no difference in quality between a human
translation and a post-edited translation. However, there may be a slight shift in style. Style should
be correct and appropriate to the project, but may need to be less refined to allow for a more
efficient use of the MT output. Where a client specifically asks for MT to be used on their project,
the client needs to be made aware of this and expectations need to be managed accordingly.
There will, of course, be a certain amount of variation — but this is a feature of conventional
translation as well. So long as the quality criteria are adhered to, a post-edited text will be
considered to have met the quality expectations.
• The style and register of the target must be appropriate for the document
type.
• The translation must read well and be suitable for the end user.
There are two main issues that post-editors often face when attempting to fulfill the highest-
possible quality criteria in the shortest amount of time. These are under-editing and over-editing.
There is always room to allow for stylistic changes and creativity with post-editing, and stylistic
features that do not meet the client style guidelines should certainly be amended. The important
thing to remember is not to let preferential changes distract from necessary amendments and not to
let these changes have a negative impact on the overall productivity.
The examples below contrast over-edited versions with efficient post-edits:

Language pair: DE-EN
Source: Fotos mit 1,3 Megapixeln
MT: Photos with 1.3 megapixels
Over-edited version: 1.3 megapixel photos
PE without over-editing: Photos with 1.3 megapixels
Comment: Unnecessary reordering of elements.

Language pair: DE-EN
Source: Zudem stehen verschieden SATA-Typen zur Auswahl, wie z.B. Micro SATA oder Slimline-SATA.
MT: In addition there are different SATA-types are available, such as micro SATA or Slimline SATA.
Over-edited version: There are various types of SATA available for this, such as micro SATA or slimline SATA.
PE without over-editing: In addition, there are different SATA types available, such as micro SATA or slimline SATA.
Comment: Unnecessary rephrasing and change of syntax.

Language pair: DE-EN
Source: Mit der 1 Meter langen Tischantenne können Sie Ihren WLAN-Empfang deutlich optimieren.
MT: With the 1 meter long Tischantenne you can significantly optimize your WLAN-reception.
Over-edited version: You can optimise your WLAN reception significantly using the 1-m table-top antenna.
PE without over-editing: With the 1-m table-top antenna you can significantly optimise your WLAN reception.
Comment: Unnecessary reordering of elements; more of the MT can be left unchanged if syntax is kept as is.

Language pair: EN-DE
Source: Make sure that the brake pedal is depressed while you perform this procedure.
MT: Sicherstellen, dass das Bremspedal niedergedrückt wird während Sie dieses Verfahren durchführen.
Over-edited version: Währenddessen muss das Bremspedal weiterhin gedrückt werden!
PE without over-editing: Das Bremspedal muss niedergedrückt sein, während Sie dieses Verfahren durchführen.
Comment: Unnecessary rewrite; usable parts of the MT were ignored in the over-edited version.
Post-editing is not a light review of machine-translated content. MT output is rarely flawless and
post-editors need to notice the issues and evaluate which parts of the MT output can be used for the
final translation. Post-editing is most efficient when the translator knows what kind of issues to
expect. MT output issues depend to some extent on the project and language combinations as well
as the type of MT technology used.
Common error patterns include:
• Mistranslations/antonyms
• Vocabulary left in the source language
• Incorrect or missing articles and prepositions
Since statistical processes do not actively involve linguistic information in the translation process
(excluding any rules that are extracted from statistical regularities), the opposite meaning may occur
in the MT output (i.e. positive sentences instead of negative sentences and vice versa). This is
relatively easy to post-edit but should always be checked during the post-editing stage as it is a
major quality concern.
Some general guidelines for post-editors who would like to provide actionable feedback are
presented below:
Feedback needs to be precise and specific in order to be useful and actionable
It is important to report structural issues, not only individual mistranslations. Structural
issues are recurring errors with an immediate impact on productivity, e.g.:
○ Same particle wrongly appears at the start of sentences
○ Incorrect punctuation and additional spaces
Indicating the frequency of errors helps determine possible error patterns
Source and translation units are matched up based on statistical probability. As opposed to rules-
based engines, no grammatical rules are applied and no grammatical information is stored for
individual words or terms.
When evaluating style and terminology for an SMT engine, it is also important to take into account
whether the engine in question is a customized engine (i.e. specifically created for one project) or a
vertical engine (i.e. created for a domain). Vertical engines combine data from many different
projects within a domain and can therefore be expected to produce less consistent output than
customized engines, which use project-specific data for a single project/client only.
Unless you are working with adaptive MT technology of some kind (see Section 9), SMT engines are
static once they have been created and can only be changed by retraining the engine, which involves
reassembling and re-uploading the data corpus. Post-editor feedback becomes very relevant and can
be implemented at that stage.
Retraining of Engines
Any unusual issues in the MT output that do not relate back directly to the data corpus should
always be reported immediately. Examples of such issues include corrupt characters, the random
insertion of words or numbers, or a high number of untranslated words in the target. These
occurrences may indicate a general technical problem that needs to be fixed urgently.
At the same time, many of the issues reported for SMT engines are in fact expected SMT behaviors.
Post-editors and project leads should therefore familiarize themselves with the issues they are likely
to encounter when post-editing statistical machine translation output. These were detailed in the
previous section, “Common Patterns to Watch for When Post-Editing.”
In all cases, post-editors can still influence the quality of MT by making sure that project dictionaries
are updated when working with rules-based MT, and by carefully maintaining TMs for statistical MT
in order to build up a clean data corpus for future use.
SDL Language Cloud machine translation can be applied to the files you receive from your clients and
the output can be post-edited to full publishable quality. Generally, output from the baseline
engines might require additional work for projects with very specific and technical terminology.
Industry-specific or domain MT engines can generate translation results that require less post-
editing.
Through a series of available subscription packages, the Language Cloud online platform also allows
users to create their own custom MT engines from Translation Memory eXchange (TMX) files. By
using custom MT engines, future machine translation outputs will be closer to customers’ previous
content.
During the project creation process, or via Project Settings for an existing project, go to “Language
Pairs” -> “Translation Memory and Automated Translation”, click “Use…” and then select “SDL
Language Cloud Machine Translation” from the dropdown menu:
After the 30-day trial is over, you will be able to choose from a selection of packages available on the
SDL Language Cloud website, including a free package for free machine translation usage.
Segments where MT has been applied appear as “AT” (Automated Translation) in the Editor.
8.1.2. Pre-Translating Files
To apply MT to whole files or projects, you can use the “Pre-translate Files” batch task.
In the settings for this batch task (accessible via Project Settings or when you run the batch task
itself), choose “Apply automated translation” for the “When no match found” option. Also note that
this batch task will reapply your TMs, so ensure that the minimum match value is set appropriately
(usually 75):
Please note that machine-translated words are listed as “New” in Analysis reports.
8.1.3. AutoSuggest
Retrieving MT output using AutoSuggest in Studio is another excellent way to increase productivity
through clever leverage of machine translation.
AutoSuggest editing is an optional feature of SDL Trados Studio that can be used to speed up
conventional translation. AutoSuggest monitors what you type and, after you have typed the first
few characters of a word, presents you with a dropdown list of suggested words and phrases in the
target language that start with the same characters. If one of the words or phrases matches what
you were about to type, you can automatically complete the word or phrase by selecting it from the
list. As you continue to type, the list of suggested words is continuously updated.
Example of longer chunks that are not perfect, but using a shorter chunk can still provide
productivity increases through assisted typing.
Another example of longer chunks that are not perfect, but using a shorter chunk can still provide
productivity increases through assisted typing.
When the MT output is good enough, you can use longer chunks, or even fully translate the segment
with the suggested MT output.
Example of good MT where a large chunk can be inserted, improving productivity effectively.
Example of good MT where the whole sentence can be translated instantly, with no post-editing
required.
As the suggested MT output is entirely optional when using this feature, the user can simply ignore
the suggestion and continue as normal if a suggestion is not good enough for a specific section.
You can set up AutoSuggest to provide suggestions from sources other than MT. All
suggestions will be displayed in the same dropdown list.
As you can see in the following screenshots, an icon will help you identify the source of the
suggestions. These can include termbase suggestions (green icon), MT suggestions (blue AT icon)
and TM matches.
8.2. SDL Language Cloud Dictionaries
A Language Cloud subscription allows customers to enforce specific terminology by using Language
Cloud (LC) dictionaries. Dictionaries use the same underlying technology, and are relatively easy to
set up, select and maintain. Dictionaries can be used in combination with any type of engine
available in LC (baseline, domain or customized engines). Other advantages include the fact that
more than one dictionary can be applied per project and that exact term matches will be translated
correctly.
The main disadvantage is that the fluency of translations may suffer: If a term is found in a source
segment, the engines will translate the part of the sentence before the term, add the translation
from the dictionary, and then translate the part of the sentence after the term. If a sentence has one
or more terms, this may have a negative effect on the quality of the MT output. It is recommended
to carry out some tests using the actual data to check whether the quality loss is acceptable as a
trade-off for correct terminology.
LC requires dictionaries in TBX format. Lists of terms in Excel or MultiTerm termbase (SDLTB) format
can easily be converted to TBX using tools such as the Glossary Converter tool, which is available for
free in the SDL AppStore. The tool includes a help file with easy-to-follow instructions.
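For orientation, here is a heavily simplified sketch of the TBX skeleton (termEntry/langSet/tig structure) that such a conversion produces; in practice the Glossary Converter handles the full header and dialect details, and the term pair below is invented:

```python
# Building a minimal TBX-like skeleton for one bilingual term pair.
# Real TBX files also carry a martifHeader with encoding and dialect
# information, which the Glossary Converter generates automatically.
import xml.etree.ElementTree as ET

terms = [("brake pedal", "Bremspedal")]  # invented example pair

martif = ET.Element("martif", type="TBX")
body = ET.SubElement(ET.SubElement(martif, "text"), "body")
for i, (en, de) in enumerate(terms, start=1):
    entry = ET.SubElement(body, "termEntry", id=f"t{i}")
    for lang, term in (("en", en), ("de", de)):
        lang_set = ET.SubElement(entry, "langSet", {"xml:lang": lang})
        tig = ET.SubElement(lang_set, "tig")
        ET.SubElement(tig, "term").text = term

xml = ET.tostring(martif, encoding="unicode")
print(xml)
```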
8.2.1. Creating a Dictionary
Due to the high impact that enforcing terminology has on the MT output in LC, it is important to
follow the criteria below when creating a dictionary for this purpose:
Dictionaries are managed in a browser through your online LC account, which allows you to create
and maintain dictionaries. Let us look at how to get there. Once on the account page, use the menu
on the left to go to Machine Translation, and click on the Dictionaries tab.
Click on New Dictionary (the blue button at the top right of the tab), and enter a name and
description. It is strongly recommended to include the language(s) in the name, for example
Automotive Project 1_Dictionary_EN-ES.
Click on Add languages to open a language list. Select all the languages contained in your termbase,
and click Save.
Check the status on the details page, and wait for it to switch to Done.
If you upload the same file twice or have overlapping content, no duplicates are generated, so you
can upload a later version with additional terms and it will overwrite an existing one.
The details page of a dictionary is accessible by clicking the Edit button next to the name.
In your Language Cloud account, go to the Translator tab. Choose the source and target language,
and the dictionary you uploaded. If your dictionary does not appear in the list, you may have to
refresh the page or log out and then in again.
Translate a segment that contains a word from your term list. The translation should contain the
translation from the dictionary. Go to the Add a dictionary dropdown again, select None, and
translate the same sentence. You are likely to get a different translation. Of course, the translation
might have used the correct term anyway, so it is worth testing with several different terms and
segments.
9. AdaptiveMT
AdaptiveMT addresses the balance between MT technology and user experience by giving the
post-editor a much greater degree of control over machine-translated output and engine quality. The
machine learns from the corrections the translator makes, significantly reducing errors over time
and thus alleviating the frustration of having to correct the same mistakes again and again.
AdaptiveMT is self-learning machine translation where the MT engine adapts in real time to the
terminology and style used by the translator, based on each individual post-edited segment.
AdaptiveMT is available in Trados Studio 2017 for the following language pairs on the SDL baseline:
English > French, German, Spanish, Italian and Dutch. Reversed language directions and other
language pairs (such as Japanese <> English) are expected to follow in 2017.
Adaptive engines are personal at present; however, plans to allow multiple translators to share
adaptive engines are in the pipeline.
In contrast to static MT engines, with AdaptiveMT the machine learns seamlessly and continuously from user feedback,
in real time, during the post-editing process. Any post-edits are fed back to the machine translation
engine, hosted by SDL, and this engine is completely personal to the translator. The post-edits are
exclusive to the translator and are not used to train any of SDL’s own engines. Translators will be
able to benefit from improved MT output from job to job which is adapted to their specific tone,
terminology and content. AdaptiveMT has evolved from a traditional static MT system to a highly
scalable and responsive dynamic system capable of self-learning and self-improvement.
MT technology and MT innovation add the most value when embedded in a well-designed and
structured process.
When put into the context of post-editing, AdaptiveMT is a new and powerful tool for translators
and post-editors which needs to be integrated into the existing MT lifecycle. This innovation is
currently targeted at an individual level, giving control to post-editors, but also opens up new MT
solutions and scenarios that will need to be delivered and integrated for translation teams, LSPs and
organizations. This will create the need to set up projects efficiently and share adaptive engines in
the same way as other translation resources such as termbases or TMs.
The impact of machine translation on the translation industry is only set to grow, and technological
improvements pave the way for language pairs or content types previously considered unsuitable for
machine translation.
Interactivity will be key, not only during the post-editing process but also when training and
maintaining engines and measuring their quality. The role of the post-editor will become increasingly
strategic, and linguists and translators will be able to influence and drive MT output on several levels
— both through adaptive technology and through highly evolved tools to measure and predict
engine quality and human effort. The human element is one of the key influencing factors in machine translation development, and SDL remains committed to working with the translation community to advance the technology and improve the user experience.
Our objective is to harness the power of these technologies to connect people and help
organizations go global faster.
Readers are subsequently introduced to the technology behind RBMT and SMT systems in Chapter 3
and are then guided through the engine creation options currently available in Chapter 4. This
includes information on preparation, testing and engine training, which is central to the MT process
at SDL and ensures the highest-quality output.
Readers are given a series of examples and tips on how to post-edit the raw MT output to different
degrees of understandability and to a high-quality, publishable standard in Chapters 5 and 6. The
latter chapter explains how post-editing as a skill can be integrated into the translator’s workflow
and how best to improve productivity without compromising on quality. Common pitfalls of MT
technology and how to provide feedback on the issues observed are discussed in Chapter 7.
The training workbook explains how to access MT through SDL Language Cloud in SDL Trados Studio
and why this can be advantageous to translators, both as a means of increasing efficiency and as an
introduction to the practice of post-editing.
Finally, there is an introduction to the recent development of AdaptiveMT, which allows for an
unprecedented level of interactivity between the post-editor and the MT output, bringing the
technology closer to the user.
The aim of the training workbook has been to cover the material above, while reassuring aspiring
post-editors that MT does not remove the need for human translation or the creative input of the
translator, but simply facilitates and accelerates their task. MT provides a means to an end rather
than an end solution in itself.
Over the last few years, MT has been moving out of the research lab into a bigger, more critical
playing field, powered by multidisciplinary teams, end-user satisfaction and a more rewarding post-
editing experience. The combination of structured user feedback loops and the latest, best-of-breed
MT technology has put the post-editor firmly at the heart of the MT lifecycle and improvement
process.
Webinars
• TAUS: http://www.translationautomation.com/multimedia/multimedia#webinars
• GALA: https://www.gala-global.org/resources/videos-downloads
• Programme for the Translating Europe Forum 2016, featuring video recordings: https://ec.europa.eu/info/sites/info/files/tef2016_programme_with_links_en.pdf

Blogs & Other
• SDL: http://www.sdl.com/community/blog/
• SDL AppStore: http://appstore.sdl.com/
• Paul Filkin's Blog (CAT tools, Studio): https://multifarious.filkin.com/
Copyright © 2017 SDL plc. All Rights Reserved. All company product or service
names referenced herein are properties of their respective owners.