An Empirical Comparative Study of Cognitive Effort in Post-editing and Human Translation
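UDC __ No. 20141210057
Guangdong University of Foreign Studies Master's Thesis
Applicant: Zhou Bo (周博)
Supervisor and title: Professor Lu Zhi (卢植)
Degree category applied for: Literature
Discipline: Translation Studies
Date of thesis submission: March 17, 2017
Date of thesis defense: May 25, 2017
Associate Professor Liu Menglian (刘梦莲); Lecturer Zou Bing (邹兵)
Degree-granting institution: Guangdong University of Foreign Studies
Declaration of Originality
I solemnly declare that the thesis submitted is the research work carried out, and the
research results obtained, by me under the guidance of my supervisor. To the best of my knowledge, except where specially marked and acknowledged in the text, the thesis does not contain other people's
materials that have been used for a degree or certificate of an institution. Any contribution made to this research by those who worked with me has been
clearly stated and acknowledged in the thesis.
Author's signature:          Date signed:
Thesis Copyright Authorization
has the right to retain copies and disks of the thesis and submit them to the relevant national departments or institutions, and to allow the thesis to be consulted and borrowed.
retrieval, and may preserve and compile the thesis by photocopying, reduced-format printing, scanning or other means of reproduction.
Author's signature:          Supervisor's signature:
Date signed:                 Date signed: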
An Empirical Investigation of Cognitive Effort
Required to Machine Translation Post-editing
Compared to Human Translation
By Zhou Bo
Submitted
in Partial Fulfillment of the Requirements for
the Degree of Master of Arts
in Translation Studies
ACKNOWLEDGEMENT
I’d like to extend my sincere appreciation to those professors and students who
me much help and supported me through my thesis. Without his consistent guidance
University of Chinese Medicine, without whom, I could not get access to the
Ma offers me great encouragement and helps me tackle every difficulty I’ve
I also want to thank Dr. Sun Juan, who gave me much helpful advice. Besides,
my beloved parents and my dearest friends Wu Lijuan, Wang Daozhu and Wang Ya have
always been by my side and supported me through the thesis. Also, I would like to thank
Without all these distinguished and lovely people, this thesis could not have been
completed.
ABSTRACT
source text and target text between human translation and post-editing.
mixed design. Task and text type are within-subject factors; competence of translator
and 15 postgraduates. Each participant is asked to translate six short texts, three to
translate from scratch and three to post-edit. Translation materials are presented with
Translog II. Real-time eye movement data are collected by an eye tracker. Pupil dilation,
fixation count and fixation duration are collected as proxies of cognitive effort.
Results show that (1) post-editing is processed significantly faster than human
translation (p < 0.01); (2) fixation counts for post-editing are significantly lower than
those for human translation, as is the average fixation duration; pupil dilation for
post-editing is significantly smaller than that for human translation; all these indicate
that post-editing requires less cognitive effort than human translation
(p < 0.01); (3) the main effect of text type is significant (p < 0.01), which indicates
that cognitive effort for post-editing varies with text type; (4) the main effect of
competence proves marginally significant (p = 0.051); postgraduates require less
cognitive effort for post-editing than undergraduates; (5) translators fixate more on the
source text area in human translation than in post-editing (p < 0.01); yet no significant
difference is found in terms of fixation counts in the target text area; besides, fixation
durations on source text and target text are both significantly longer in translation
fixation counts and fixation duration together indicate that, compared with
post-editing, translators expend more cognitive effort on both source text
comprehension and target text production when translating from scratch; (6) for both
post-editing and human translation tasks, there are more fixations and longer
fixation durations in the target text area than in the source text area.
Based on the results above, post-editing can save temporal effort and
increase productivity. Moreover, post-editing saves cognitive effort in both source
text comprehension and target text production. Thus, post-editing should be a viable
Cognitive effort
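ABSTRACT (CHINESE)
Focusing on the Chinese-English language pair, this study uses eye tracking to explore, from a cognitive perspective, whether post-editing of machine translation, compared with traditional human
translation, can become a new method for college English learners to carry out Chinese-English
translation. To this end, three research questions are raised: (1) how do the time and cognitive effort spent
in post-editing and in human translation differ; (2) do participant competence and text type affect the cognitive effort
required in post-editing; (3) during translation, how do post-editing and human translation differ in the allocation of cognitive effort between source text comprehension and target text
production.
and text type (economic, political and literary) are within-subject variables; participant competence (undergraduate and postgraduate)
Real-time eye movement data are recorded by an eye tracker. The experiment collects pupil diameter, fixation count, fixation duration and total task
time to analyze temporal and cognitive effort in the translation or post-editing process.
The results are as follows: (1) post-editing is completed faster than human translation, and the difference is significant
(p < 0.01); (2) fixation counts in post-editing are significantly lower than in human translation; average fixation duration is far
shorter than in human translation; pupil diameter in post-editing is significantly smaller than in human translation; this indicates that the post-editing process
the cognitive effort consumed in post-editing varies with the text; (4) the main effect of participant competence is marginally significant (p
= 0.051): in post-editing, the higher the participant's competence, the less cognitive effort is consumed; (5) in human translation, translators
there is no significant difference between the two; in terms of average fixation duration on the source text and the target text, human translation is longer than post-editing in both cases
(6) in both human translation and post-editing, translators' fixation counts and fixation durations on the target text exceed those on the source
text.
The findings show that post-editing can shorten translation time, improve translation efficiency, and reduce the cognitive effort translators spend on source text
comprehension and target text production, making it a viable option for college English learners carrying out Chinese-English
translation.
Keywords: post-editing, machine translation, eye tracking, temporal effort, cognitive effort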
CONTENTS
ACKNOWLEDGEMENT
ABSTRACT
ABSTRACT (CHINESE)
CONTENTS
1.2 Significance
REFERENCES
APPENDICES
LIST OF ABBREVIATIONS
PE: Post-editing
PG: Postgraduate
UG: Undergraduate
LIST OF TABLES
(N=20)
(N=20)
task (N=20)
task (N=20)
Table 5-9: Descriptive Statistics of total fixation counts for both human translation
Table 5-11: Results of three-way ANOVA in terms of fixation counts in the ST AOI
Table 5-12: Results of three-way ANOVA in terms of fixation counts in the TT AOI
both AOIs
Table 5-14: Descriptive Statistics of the average fixation duration for all AOIs
AOI
Table 5-16: Descriptive Statistics of the average fixation duration for ST AOIs
AOI
Table 5-18: Descriptive Statistics of the average fixation duration for TT AOIs
LIST OF FIGURES
Figure 5-3: Replaying of the xml file after step 4 (blue circles in the picture refer to
eye fixations)
Figure 5-4: Average processing speed for translating or post-editing different types of
Figure 5-8: Estimated marginal means of total fixation counts of economic text
Figure 5-9: Estimated marginal means of total fixation counts of political text
Figure 5-10: Estimated marginal means of total fixation counts of literary text
Figure 5-11: Average fixation counts in win 1 (ST AOI) for translating or post-editing
Figure 5-12: Average fixation counts in win 2 (TT AOI) for translating or post-editing
CHAPTER ONE
INTRODUCTION
This chapter gives a brief introduction to this research, including the research
methodology of the research and data collection. At the end of this chapter, a brief
1.1 Rationale
information and global communication has produced huge needs for machine
translation not only for personal use but also for commercial use. Yet, although
boasting the advantage of high speed, machine translation systems have always been
criticized for the poor quality of their output. Considering the complexity of the machine
attracting attention not only from the translation industry but also from the world of
guidelines, quality evaluation, language pairs and cognitive effort, to name a few.
Some studies have indicated that machine translation post-editing has an obvious
shown that for each participant there’s a productivity increase, which means that
increase their productivity. However, studies by Carl (2011) did not find significant
compared with traditional human translation yet can be drawn. Productivity increase
name a few. Koponen (2016: 136) points out that productivity increase depends on
some “specific conditions” and that “sufficiently high quality machine translation
which is currently achievable for certain language pairs and machine translation
Many language pairs have already been studied, for instance English-Danish,
Chinese-English language pair is rarely studied. García (2010) has carried out
Although García (2010) did prove the usability of raw translation, no significant
difference was found in time saving between post-editing and human translation.
Besides, García only considered temporal effort. In this research, both temporal and
cognitive effort will be taken into account to test the applicability of post-editing to
Chinese-English translation.
technical and cognitive. Temporal effort is about time. Technical effort is related to the
technical operations of translators, such as deletion and insertion. Both of these
kinds of effort are easily observed. The cognitive effort of post-editing, as the most important
one among the three, influences temporal and technical effort and should be seriously
as well as the improvement of machine translation systems, few studies are conducted
1.2 Significance
To bridge the gap of post-editing studies in China, this thesis reports an exploratory
collects data that could reflect cognitive effort. The significance of this study is shown
as follows.
First, post-editing has become a hot topic for both the translation industry and
focus on literature review and ways to improve machine translation systems. This
research is among the first experimental translation process studies concerning
language pair was tested. As past studies have revealed that the applicability
of post-editing varies with language and text type, many language pairs
have been examined. Although García (2009, 2010) has conducted experiments
testing post-editing for the Chinese-English language pair, those studies were conducted
from a temporal perspective. The study in this paper was carried out from a cognitive perspective.
Cognitive effort was compared between post-editing and human translation and
most studies in the past were conducted with professional translators or post-editors in
small sample, the results of this study still indicate that for the Chinese-English
could become a new way for college English learners to conduct Chinese-English
translation work, compared with traditional human translation. Therefore, the research
cognitive perspective.
2) What influence will text type and the competence of translators have on
3) What is the difference in the allocation of cognitive effort to source text and
temporal and cognitive effort of post-editing with human translation. Question two
explores the influence of text type and translator competence on cognitive effort spent
in the post-editing process. Question three is a further investigation into the allocation
of cognitive effort to the source text and target text so as to learn more about the
post-editing process and (if possible) to explore the reason for the reduction in
cognitive effort.
To seek answers for the research questions, an empirical study is conducted with
of the participants, thus to compare cognitive effort expended in the process of PE and
conventional translation.
undergraduates and 15 postgraduates. Each participant has to complete six tasks, three
to translate and three to post-edit. The six tasks are presented to participants with
After the experiment, four kinds of data are extracted or derived from the XML
files: (i) total task time, (ii) fixation counts, (iii) average fixation duration and (iv)
Chapter One is the introduction of the thesis, including rationale and significance
of this research, research objectives and questions, methodology and data collection.
Chapter Two presents a brief review of studies on post-editing in the late 1950s
and early 1960s as well as in the twenty-first century. Studies on post-editing in China
this research. Definitions of major terms and the theoretical basis for this research are
Chapter Five is data analysis and discussion. It also elaborates some criteria for
eye movement data. Results of the experiment are presented in this chapter. Three
Chapter Six is the conclusion, briefly summarizing the major findings of this study,
elaborating on some limitations of the study, and offering suggestions for future
studies.
CHAPTER TWO
LITERATURE REVIEW
as PE) in the late 1950s and early 1960s as well as in the past decade, which are,
respectively, the “inception” of post-editing in the word used by García (2012) and the
prosperous period since the work of Krings (finished in 1994) was translated and
different methods and PE effort studies on different language pairs, which are closely
related to this research. At the end of this chapter, a summary of the previous studies
is presented.
machine translation begins to attract people’s attention and seemingly becomes a strong
candidate for the replacement of human translation. In fact, it is not a novel issue. As
one of the earliest envisioned uses for machine translation systems, PE is as old as
was a hot topic in the late 1950s and early 1960s.
The earliest vision for the use of PE was proposed by Edmundson and Hays
(1958). In their paper, they first introduced the methods then used at The RAND
translation and analysis. Edmundson and Hays (1958: 11) noted that “translation” in
their paper referred to the two-stage process of machine translation and post-editing.
The machine translation system would do a rough translation and “a post-editor works
on this list, converting it into a smooth English version of the Russian original”
(Edmundson & Hays, 1958: 11). Although Edmundson and Hays (1958) proposed
carrying out PE work, they considered it merely a simple revision step for a better
translation.
According to García (2012), Orr and Small (1967) were the first to take
their research to compare three different English versions of the same Russian
normal manual translation. The results showed that scores of the hand-translation
group were higher than post-edited group scores; and the post-edited group scores
were higher than machine translation scores, which consistently indicate that manual
García (2012: 293) called this period the “inception” period of PE, yet
García noted that studies during this period were mainly empirical and were
fundamentals established by theorists, since the computer technology at that time was
Between the inception period and the prosperity period (the past decade), there was
actually another period, named the “latency” period by García (2012), which was from
automatic language translation and computer linguistics and stated clearly that
compared with traditional translation, PE required more time, produced worse quality
and was more difficult to perform. The ALPAC Report, to some extent, cooled down machine
translation and PE studies. During this period, those who continued to
pursue PE were some institutions and enterprises like Systran and METEO. Therefore,
The past decade has witnessed prosperity in PE research with the advent of
Translation Memory tools and free online machine translation by the end of the
1990s, and advances in computer technologies which have led to high-quality machine
translation output. Besides, with an increasing demand for information and global
Experimental studies have reappeared since the late 1990s, when the work of Krings,
finished in 1994 in German, was translated and published in English in 2001 (García,
2012).
could increase productivity (e.g. Guerberof, 2009; García, 2010; Plitt & Masselot,
2010; Carl et al., 2011; Green & Manning, 2013; Zhechev, 2014); whether the output
of machine translation could be used or what the quality of PE output was (e.g. Blatz et
al., 2004; Specia et al., 2009; Fiederer & O’Brien, 2009; Plitt & Masselot, 2010;
Guerberof, 2014); whether it was feasible for a person to post-edit without referring to the
source text (e.g. Koehn, 2010; Callison-Burch et al., 2010; Hu et al., 2011; Mitchell et
al., 2013; Koponen & Salmi, 2015); how much effort would be required in the
post-editing process compared with human translation (e.g. Tatsumi, 2009; Specia et
al., 2010; Sousa et al., 2011; Callison-Burch et al., 2012; Moran et al., 2014; Vieira,
2014); and some new methods that could be applied to PE effort research (e.g. Specia,
2011; Carl et al., 2011; Lacruz et al., 2012; Elming et al., 2014; Lacruz et al., 2014).
Productivity increase and time saving have been demonstrated by many studies in
Plitt & Masselot, 2010). However, there are also some conflicting findings. For
instance, studies by Carl (2011) did not find a significant difference in productivity
between post-editing and human translation. When investigating the role of
non-experienced translators.
translators, text types, to name a few. Koponen (2016: 136) points out that
for certain language pairs and machine translation system geared toward the specific
Research on PE effort forms a large portion of all the research on PE. PE effort was
categorized into three kinds: temporal effort, technical effort and cognitive effort by
Krings (2001). PE effort studies in the past decade mainly focus on two or three kinds
and eye-tracking. Also, different language pairs were studied. However, in China,
most research on PE is literature review and empirical studies are barely conducted,
Research on PE effort has been carried out with different methods, including TAPs,
Krings (2001) carried out a TAPs study of the mental processes involved in PE
logging and eye tracking. Koglin (2015) used retrospective TAPs combined with eye
tracking and keystroke logging to investigate the cognitive effort in the process of PE
and human translation. O’Brien (2006) also used retrospective protocols, which she
called the “retro eye cue method” in her paper. The participants said aloud
what they were thinking and doing at a particular time when they looked at the replay
Some studies also use a specific scale to have the participants self-evaluate
the PE effort. In Specia’s (2010) study, professional translators were asked to rank
the four sentences based on the degree of PE effort each sentence needed to be done
asked translators to evaluate and rank the sentences after the translation or
post-editing tasks according to the PE degree with a five-point scale and found that
the evaluation was closely related to the quality. Yet, this kind of evaluation can
only roughly provide a prediction for the PE effort, since it is, to some degree,
Keystroke logging has long been used to investigate the cognitive aspect of
translation process (O’Brien, 2006). Elming et al. (2014) point out that the use of key
logging to track how the changes were made is a better measure of the actual technical
effort. In the study of Elming et al. (2014), professional translators were asked to
tool. Keystroke data were collected and analyzed. Elming et al. (2014) found that
post-editing could lead to a time saving of 25% over human translation and that the time
saving was largely related to the number of edits. This result coincides with the one
found by Tatsumi (2009) that the time for PE greatly depended on the number of edits
carried out.
interaction with Translation Memory tools. In this study, she found that when there
was no match for the source text, translators had to spend the most cognitive effort.
O’Brien (2008) further studied the cognitive effort translators required when dealing
with different values of fuzzy matches with eye tracking method, and found that
cognitive effort was not really inversely proportional to the value of fuzzy match. Carl
post-editing and translation from scratch. The results of this study indicated that the
processing speed and translation quality both showed improvement for
conclusion.
As Koponen (2016: 136) points out, the increase in productivity is subject to some
which means that the applicability of machine translation post-editing for different
Many language pairs have already been studied. Specia (2010) has studied
post-editing from English to Danish and Spanish and found positive results.
Sousa (2011) assessed the post-editing effort for English and Brazilian Portuguese and
found that when translating subtitles, PE is about 40% faster than conventional
characteristics (sentence length and text structure) have on PE speed was carried out in
terms of the English-Japanese language pair.
As for language pairs including Chinese, studies are fewer than for other
languages, especially English, which is included in almost every study (Tatsumi, 2009;
Specia et al., 2010; Sousa et al., 2011; Callison-Burch et al., 2012). Perhaps this is
(2015).
Lourenço da Silva, Schmaltz and Alves et al. (2015) carried out an exploratory
They collected keystroke and eye data of professional translators to compare the
process of PE and human translation. Results indicated that PE and human translation
required different amounts of cognitive effort for understanding the source text. Also, technical
effort was found to differ in text production. Lourenço da Silva et al. (2015)
found that if the number of deletions is greater than that of insertions, the cognitive
García (2010) conducted a study to test whether the translation suggested by the
Google Translator Toolkit for no-match sentences was suitable for use. Translation
students were asked to translate from scratch and post-edit respectively from English
significant difference was found considering the processing time, yet results supported
that the quality of PE output was comparable to the human translation output.
García (2011) further conducted another study. In this study, García included
factors like language directionality, difficulty of source text and performance level of
translators. 14 subjects were asked to translate and post-edit from English to Chinese
and 21 subjects were asked to translate and post-edit from Chinese to English. Time
for doing the task was recorded and the quality of final translation was marked as the
course grade of the subjects. This time, still no significant difference was found in
terms of PE productivity. But the results indicated that, compared with translating
from English to Chinese, subjects had a greater productivity increase when translating
from Chinese to English. García (2011: 229) concluded that “translating by
As Feng and Cui (冯功全 & 崔启亮, 2016: 68) have pointed out, post-editing
research in China has lagged far behind that in the West. In Translation Studies,
studies on PE are nearly twenty years later than those in Western countries. PE
China and abroad and research on ways to improve machine translation systems.
In the 1990s, research on PE was mainly about ways to develop and improve
the intelligent post-editor¹ (IPE) (黄河燕 & 陈肇雄, 1995; 韩培新, 1998). Studies
published by Wei and Zhang in 2007. Wei and Zhang (2007) introduced the basic
concept of PE, elaborated the necessity for PE and described how to do PE and who
output. The statistical results of this study support the view that machine translation has
made great progress in dealing with syntax. Luo and Li (2012) also stated that it was
Li and Zhu (2013) conducted further research based on that of Luo and
Li (2012). Li and Zhu (2013) suggested that secondary processing could be done to
post-editors. The error patterns identified in the study of Li and Zhu (2013) provided
a very helpful foundation for the future development of IPE. Cui and Li (2015)
also identified the error patterns of machine translation by practical examples and
summarized the characteristics of PE. Yet, Cui and Li (2015) focused on scientific and
technical texts.
¹ Here, post-editor refers to the machine system that could do post-editing work. In this thesis, post-editor refers to the person who conducts post-editing work, unless otherwise noted.
Cui (2014) and Feng and Cui (2016) elaborated on the focuses of PE studies in
China and abroad and predicted the trends for PE research. Wang (2013) focused on
the west. Feng and Zhang (2015) described the necessity of post-editor training in
translation education and proposed that a post-editor training course could improve
the competitiveness of graduates majoring in translation and also satisfy the
and factors that could affect productivity and found that quality of machine translation
machine translation and verified the viability with his own translation practice. Wang
(2016) also researched the applicability of machine translation plus post-editing to the
non-technical texts.
whole picture of the PE research. There are also some studies on the development or
post-editing has achieved abundant results abroad, especially in the West, yet
2.4 Summary
translation output, post-editing effort and so on. Post-editing research has become a
effort research. However, many studies have indicated that the applicability of
post-editing is condition-specific. It is subject to text, language, machine translation
researchers have proposed that text type might influence the feasibility of post-editing,
few studies really take this factor into account. In addition, almost all the studies are
conducted for the benefit of translation industry, thus the participants are mostly
gap, this paper presents a study testing the viability of post-editing for the
Chinese-English language pair. Participants for this experiment are all students so as
to test the usability of post-editing by college English learners. Moreover, three
kinds of texts are included to investigate whether text type influences post-editing
effort.
CHAPTER THREE
THEORETICAL FRAMEWORK
3.1.1 PE
Bar-Hillel stated in 1951 that “fully automatic” machine translation was “not
achievable in the foreseeable future”, and there had to be a “human brain” intervening
in the process (1951: 230). Here, the “human brain” refers to the person who does
post-editing work. Bar-Hillel (1951: 231) believed that the task of post-editing was
“to produce out of the raw output … a readable translation in a fraction of the time it
procedure”.
Allen (2003: 297) deemed that the task of post-editors was to “edit, modify and/or
correct pre-translated text that has been processed by an MT system from a source
output to insure it meets a level of quality negotiated in advance between client and
correcting the original output of machine translation system under certain purpose,
As noted by Krings (2001: 178), the “fully automatic high quality” machine
translation output is not really available in the foreseeable future. Therefore, the
amount of PE effort required in the post-editing process will be the primary factor
Effort expended in the post-editing process was classified by Krings (2001: 178) into
insertion and reconstruction of the sentence structure and so on. These two kinds of
the type and extent of those cognitive processes that must be activated in order to
can neither be observed nor be measured directly. It was defined by Krings from the
effort was the most important and most decisive variable among the three categories,
The fundamental theoretical basis, or operational basis, for this research is the
(1980) as they were researching into reading comprehension and trying to explain the
distribution of fixations.
concurrent with the action of readers’ seeing the word. In other words, when a reader
was reading an article, s/he would try to process every word s/he encountered as soon
as possible, even if s/he would interpret it erroneously. Just and Carpenter (1980) used
the word “interpretation” to refer to the processing of words. They noted that the
“interpretation” consisted of encoding the word, finding a proper referent (if the word
is polysemic) and determining the status of this word in the sentence and in the whole
text. Immediacy assumption emphasized that the interpretation, at all levels, was
carried out immediately without any delay (Just & Carpenter, 1980).
posited that as long as a person was processing a word, s/he would keep looking at
this word. In other words, the eye fixation would rest on this word, till s/he carried on
the processing of the next word. Just and Carpenter (1980: 331) believed that there
was “no appreciable lag between what is being fixated and what is being processed”.
Therefore, the word a reader fixates is the exact word s/he is processing. The time s/he
focuses on the word (gaze time on the word or fixation duration on the word) is the
time s/he spends processing it. The eye-mind assumption gave researchers a valuable
way to gain access to what might happen in people’s minds, which used to be a
black box.
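To make the operational reading of the eye-mind assumption concrete, the following minimal sketch sums fixation durations per fixated word and treats the total as an estimate of processing time. The function name `gaze_time_per_word` and the sample data are hypothetical illustrations, not taken from this study, and the estimate is only as good as the assumption itself.

```python
from collections import defaultdict


def gaze_time_per_word(fixations):
    """Sum fixation durations (ms) for each fixated word.

    `fixations` is a sequence of (word, duration_ms) pairs. Under the
    eye-mind assumption, the summed duration is read as the time spent
    processing that word.
    """
    totals = defaultdict(int)
    for word, duration_ms in fixations:
        totals[word] += duration_ms
    return dict(totals)


# Hypothetical fixation record: one word is fixated twice.
sample = [
    ("machine", 210),
    ("translation", 340),
    ("translation", 180),
    ("output", 250),
]
gaze = gaze_time_per_word(sample)
print(gaze)
```

A word that is fixated repeatedly accumulates a longer gaze time, which this reading interprets as heavier processing of that word.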
However, these two assumptions are not entirely correct. Holmqvist (2011) found
that what one was thinking was faster than his or her eye movement, which conflicts
somewhat with the immediacy assumption. Smallwood and Schooler (2006) pointed
out that mind wandering might happen during the task. Although one’s eyes fixate on
a word or other object, the mind may drift away and something irrelevant may be
considered during the fixation. What is more, there is no obvious evidence showing
that one is absent-minded. Mind wandering often occurs without one noticing it. At
this point, when one fixates on a word, s/he is not necessarily processing it. S/he
might be thinking about something else. Thus the fixation time might not necessarily
be the processing time in the mind. The same goes for translation studies. When the
translator focuses on a word in the source text, s/he may be considering the production
of target text and trying to find a proper referent in the target language for this word.
Or perhaps, when s/he is looking at the target area, s/he may be pondering over the
word of source text. At this point, fixations in source text do not always mean source
text comprehension and fixations in target text do not necessarily mean target text
production.
assumption still provide an appropriate basis for the correlation between eye fixations
have proven that there is a firm link between eye movement and cognitive
processing. Rayner (1998) concluded that fixations were firmly linked with
cognitive processing during reading tasks and that eye movement could informatively
reveal the mind. As for translation tasks, we can reasonably believe that most of
the eye movement data reflect moment-to-moment mental activities, since
Smallwood and Schooler (2006) noted that mind drifting always occurred when the
task was easy, and both the translation task and the post-editing task are far from easy.
from three dimensions: time, operation and cognition, that is, temporal effort,
the most important PE effort from an economic perspective as well as the most easily
measured during the research. The very name of temporal effort suggests that it
concerns the time for doing post-editing work. Usually, the time spent on the task
indicates the amount of temporal effort consumed. In PE effort research, total task
time is always combined with the word count of the source text. The processing speed,
derived from dividing the source text word count by the total task time, is
always used in PE effort research to indicate the productivity of a task, i.e., the number
of words processed within one minute (e.g. O’Brien, 2006; García, 2010). The
processing speed, also an indicator of temporal effort in this research, can tell more
[Figure: Post-editing Effort]
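The processing speed described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `processing_speed`, the choice of seconds as the input time unit, and the sample figures are hypothetical and are not the values or scripts used in this study.

```python
def processing_speed(source_word_count, total_task_time_s):
    """Return words processed per minute.

    source_word_count: number of words in the source text
    total_task_time_s: total task time in seconds (assumed unit)
    """
    if total_task_time_s <= 0:
        raise ValueError("total task time must be positive")
    minutes = total_task_time_s / 60.0
    return source_word_count / minutes


# Hypothetical example: the same 150-word text handled in two tasks.
ht_speed = processing_speed(150, 300)  # translated from scratch in 5 minutes
pe_speed = processing_speed(150, 200)  # post-edited in 3 min 20 s
print(ht_speed, pe_speed)
```

A higher value means less temporal effort per word, so comparing the two speeds for the same text gives a direct productivity comparison between the tasks.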
the raw machine translation or to adjust the arrangement of the text. As nowadays
most of the translations are conducted with computers, keyboard activities, like
effort. Technical effort is “purely technical operations” (Krings, 2001: 179). Before
technical effort was defined by Krings (2001), studies had found that error types
caused by machine translation system such as false verb form and preposition and
finding, optimization of text processing system was proposed to reduce effort (Slocum,
effort.
concerns the cognitive processing that happens in people’s minds, which cannot be
detected directly. The definition of cognitive effort was elaborated in detail in
Although Krings differentiated them into three categories, they were not completely
separated from one another. Temporal effort was the most easily measured. Technical
effort could also be measured externally. However, cognitive effort could not be
observed externally. It is the most decisive effort among the three. Krings especially
noted that even though technical effort was led by cognitive activities, technical
effort and cognitive effort should be differentiated (2001: 179). It is possible for
cognitive effort to be small while technical effort is great. A mistake in raw machine
translation output may be easily recognized, but it may take many operations to correct the
mistake.
indicator for temporal effort was processing speed of a task; indicators for cognitive
According to the immediacy and eye-mind assumptions, where the eye is looking reflects
what is being processed, although the assumptions have potential weaknesses. They nonetheless support a
strong correlation between eye movement data and cognitive effort. Eye movement
data such as fixations and pupil dilation have been studied and shown to reveal cognitive
effort (Just & Carpenter, 1976; Just & Carpenter, 1980; Hyönä, Tommola & Alaja,
1995). With advances in technology, online eye movement data could be collected
accurately by eye tracker. In this research, eye movement data was collected by the
remote eye tracker Tobii TX300. Indicators of cognitive effort of this research were
fixation count, fixation duration and pupil dilation. The relation of each of these indicators
to cognitive effort is introduced in turn in the following part.
Fixation count was the number of fixations formed during the translation or
Fixation count is commonly used to reflect the difficulty of the task and the expertise of
translators. Generally, the higher the fixation count, the more cognitive effort is
spent in the translation process. Doherty (2012) found fewer fixation counts for the
Jakobsen and Jensen (2008) found that reading during translation produced the most
fixations, followed by sight translation, reading for translation and reading only
for comprehension.
Fixation duration is the length of time a fixation lasts. A longer fixation duration
indicates more and deeper processing of a word. Fixation durations in different
areas of interest were collected by Koglin (2015) to compare different cognitive effort
(2015) found that for traditional translation, translators tended to have longer fixations
on the source text, while for the post-editing task they fixated longer on the target text. Since
there are always a great many fixations, each with a different duration, the fixation
duration used in this research to indicate cognitive effort was the average fixation duration
of all fixations. It is the result of dividing total gaze time on screen (or in a certain area
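The average fixation duration described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in the study: the `Fixation` record, the `"ST"`/`"TT"` labels and the sample values are hypothetical, and total gaze time is assumed here to be the sum of the individual fixation durations.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Fixation:
    """One fixation (hypothetical record layout, for illustration only)."""
    duration_ms: float  # length of the fixation in milliseconds
    aoi: str            # "ST" (source text) or "TT" (target text)


def average_fixation_duration(fixations: Iterable[Fixation],
                              aoi: Optional[str] = None) -> float:
    """Total gaze time divided by fixation count, optionally for one AOI."""
    selected = [f for f in fixations if aoi is None or f.aoi == aoi]
    if not selected:
        raise ValueError("no fixations to average")
    # Assumption: total gaze time is the sum of the fixation durations.
    total_gaze_time = sum(f.duration_ms for f in selected)
    return total_gaze_time / len(selected)


# Made-up sample values, not data from the experiment.
sample = [Fixation(200.0, "ST"), Fixation(400.0, "TT"), Fixation(600.0, "TT")]
print(average_fixation_duration(sample))            # whole screen
print(average_fixation_duration(sample, aoi="TT"))  # target-text AOI only
```

Restricting the same function to one area of interest gives the per-AOI averages, so a single routine covers both the whole-screen and the AOI-level measures.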
A strong connection has also been found between pupillary movement and mental
activity. Hyönä et al. (1995) studied pupil dilation in different interpreting tasks and
showed that pupillary responses could suitably indicate the cognitive load of mental
processing activities. O’Brien (2006) first introduced eye tracking into translation
process research and machine translation research. In her research, percentage change
in pupil dilation was adopted as an indicator to compare the cognitive effort needed for
computer-aided translation with different match types. Doherty, O’Brien and Carl (2010)
adopted gaze time, fixation count, average fixation duration and average pupil
dilation as indicators of cognitive effort spent on reading machine-generated
sentences.
Fixation counts (e.g. O’Brien, 2011; Doherty, 2012; Mesa, 2014), fixation
duration (e.g. Carl et al., 2011; Mesa, 2014) and pupil dilation (e.g. Hyönä, 1995;
Iqbal et al., 2005; Lourenço da Silva et al., 2015) are among the most commonly used
3.2.4 Summary
In summary, the immediacy assumption and eye-mind assumption put forward by Just and
Carpenter (1980) provided a fundamental basis for using eye movement data as
research also provided profound evidence to link visual and cognitive focus in
Based on what has been elaborated above, in this research, temporal and
cognitive effort will be compared between post-editing task and traditional translation
task. For temporal effort, processing speed of the participants is chosen as the
indicator. As for cognitive effort, eye movement data, including fixation counts,
The analytical framework of this research was drawn based on the theoretical basis
As seen in Figure 3-2, for traditional translation task, only source text will be
provided for translators, while for post-editing task, both source text and machine
translation output will be offered. Each participant will conduct both post-editing task
and cognitive effort in the translation or post-editing process will be collected and
discussed. Technical effort will not be taken into consideration here. For temporal
effort, total task time will be collected so as to calculate processing speed. For
cognitive effort, fixation counts, fixation duration and pupil dilation will be recorded
In addition, the main objective of this research is to test whether post-editing
saves cognitive effort compared with translation from scratch, by contrasting the eye
movement data collected in the post-editing process and in the translation process.
What’s more, participants are asked to stop post-editing as soon as they believe that
they have post-edited the machine translation output to a quality as good as human translation.
CHAPTER FOUR
METHODOLOGY
was used to record the moment-to-moment eye activities of the participants, thus to
conventional translation.
become a new way for college English learners to conduct daily translation work,
compared with traditional human translation. Three research questions are raised as
follows:
2) What influence will text type and the competence of translators have on
3) What is the difference in the allocation of cognitive effort to source text and
Based on the previous studies on post-editing effort presented in Chapter Three and
Hypothesis 1: Temporal and cognitive effort for post-editing is less than that for
translating from scratch.
spend less effort, in terms of time and cognition, than those who are less competent (in this
case, undergraduates). The reduction of time and cognitive effort varies among
translation process than during post-editing process, while cognitive effort required to
4.3 Participants
There are in total 30 participants (aged from 20 to 27) involved in the experiment,
including 24 female students and 6 male students. These participants are from
Foreign Studies) and Guangzhou University (School of Foreign Studies). All of them
are majoring in English and have taken translation courses for at least three years.
All the participants are Chinese native speakers and have English as their second
language. Among the 30 participants, 15 are 3rd or 4th-year undergraduates and 15 are
undergraduates do not. None of the participants have been trained in PE or have any
professional experience as post-editors. Yet, before the task, all of them are introduced
within-subjects design is adopted, which means that each participant is tested under
dependent variable we have chosen is pupil dilation which varies greatly among
4.4 Materials
language learners in their daily translation work, three kinds of text are chosen:
economic text, political text and literary text, for their being the most common text
As this was a within-subject study, each participant was asked to translate six
short texts (A1, A2, B1, B2, C1, C2; see footnote 2), three to translate from scratch and three to
post-edit. Materials chosen for translation and post-editing were around 450 words in total
with the consideration that participants might get tired or bored if the text was too
long, leading to a drop in motivation and a negative effect on the gaze data. Word
A1 and A2 are economic texts extracted and adapted from the same article of
China Business News (see footnote 3) to ensure that they are of comparable difficulty and style.
The same consideration goes to political texts (B1 and B2) and literary texts (C1 and
C2). Political texts are extracted and adapted from the 2016 Report on the Work of the
Government, literary texts from An Essay of Liang Shiqiu. A1 and A2 are about the
sales volume of some major real estate companies. B1 and B2 are about the development
It is worth noting here that, as Carl (2015) pointed out, machine translation is
far from capable of literary translation as far as style and cultural factors are
concerned. The literary texts chosen here are limited to those without much cultural
2 A, B and C represent different text types.
3 Which real estate company will be the first to enter the 300-billion club? [N]. China Business News. 2016, 3249 (A07)
For the two texts of each text type, one is translated by Google Translate,
a free online translation system, and the other is left untranslated, so that
participants either post-edit or translate them. A detailed profile of the source texts and the raw
to minimize the risk that observations (in part) had to do with a repeated presentation
sequence. Also, tasks are randomized to the participants with the RAND function
The default sequence of tasks and semi-randomized sequence are also shown in
Table 4-2 and Table 4-3. As shown in Table 4-3, each text will be post-edited fifteen
times and translated fifteen times. Besides, participants will conduct tasks following
an order of A1-A2-B1-B2-C1-C2. Following this order, for half of the texts,
participants first conduct translation task and then post-editing task, while for the
other half of the texts, first post-editing task then translation task. With this design,
practice effect caused by task order could be counterbalanced.

Participant  Default sequence    Semi-randomized sequence
T01          A1 B1 C1 A2 B2 C2   A1 B1 C1 A2 B2 C2
T02          A1 B1 C1 A2 B2 C2   A1 B1 C2 A2 B2 C1
T03          A1 B1 C1 A2 B2 C2   A1 B2 C1 A2 B1 C2
T04          A1 B1 C1 A2 B2 C2   A1 B2 C2 A2 B1 C1
T05          A1 B1 C1 A2 B2 C2   A2 B1 C1 A1 B2 C2
T06          A1 B1 C1 A2 B2 C2   A2 B1 C2 A1 B2 C1
T07          A1 B1 C1 A2 B2 C2   A2 B2 C1 A1 B1 C2
T08          A1 B1 C1 A2 B2 C2   A2 B2 C2 A1 B1 C1
T09          A1 B1 C1 A2 B2 C2   A1 B1 C1 A2 B2 C2
T10          A1 B1 C1 A2 B2 C2   A1 B1 C2 A2 B2 C1
T11          A1 B1 C1 A2 B2 C2   A1 B2 C1 A2 B1 C2
T12          A1 B1 C1 A2 B2 C2   A1 B2 C2 A2 B1 C1
T13          A1 B1 C1 A2 B2 C2   A2 B1 C1 A1 B2 C2
T14          A1 B1 C1 A2 B2 C2   A2 B1 C2 A1 B2 C1
T15          A1 B1 C1 A2 B2 C2   A2 B2 C1 A1 B1 C2
T16          A1 B1 C1 A2 B2 C2   A2 B2 C2 A1 B1 C1
T17          A1 B1 C1 A2 B2 C2   A1 B1 C1 A2 B2 C2
T18          A1 B1 C1 A2 B2 C2   A1 B1 C2 A2 B2 C1
T19          A1 B1 C1 A2 B2 C2   A1 B2 C1 A2 B1 C2
T20          A1 B1 C1 A2 B2 C2   A1 B2 C2 A2 B1 C1
T21          A1 B1 C1 A2 B2 C2   A2 B1 C1 A1 B2 C2
T22          A1 B1 C1 A2 B2 C2   A2 B1 C2 A1 B2 C1
T24          A1 B1 C1 A2 B2 C2   A2 B2 C2 A1 B1 C1
T25          A1 B1 C1 A2 B2 C2   A1 B1 C1 A2 B2 C2
T26          A1 B1 C1 A2 B2 C2   A1 B1 C2 A2 B2 C1
T27          A1 B1 C1 A2 B2 C2   A1 B2 C1 A2 B1 C2
T28          A1 B1 C1 A2 B2 C2   A2 B1 C2 A1 B2 C1
T29          A1 B1 C1 A2 B2 C2   A2 B2 C1 A1 B1 C2
T30          A1 B1 C1 A2 B2 C2   A2 B2 C2 A1 B1 C1

4.5 Equipment
Tobii TX300 eye tracker is used to collect data of eye activities, including gaze time,
fixations and pupil dilation. Tobii TX300 is a remote eye tracker, which, compared
with EyeLink 1000, is non-invasive, i.e. there’s no need for participants to wear any
unobtrusive capture of natural human behavior. This has increased the ecological
validity. Pupil dilation and fixation could be recorded at a sampling rate of 300 Hz,
which means that highly accurate and precise data could be collected and provide a
solid foundation for eye movement research. Tobii TX300 used in this study is
Eye movement data is recorded by the eye tracker unit (Figure 4-3). Figure 4-4
shows the scene when a participant conducts the translation or post-editing task.
As software Tobii Studio from Tobii TX300 does not support human-computer
designed to record user activity data in the process of translation, as well as reading,
writing, copying and editing. The software was developed by Arnt Lykke Jakobsen
and Lasse Schou, and programmed by Lasse Schou, Morten Lemvigh, Jakob
Elming and Michael Carl. Translog II could be connected to TX300 remote eye
tracker to record both gaze activities and keyboard activities. Translog II software
includes two parts: Translog II user and Translog II supervisor. Source texts (for both
tasks) and machine translation output (only for post-editing task) are presented on the
interface of Translog II user. Figure 4-5 shows the interface of Translog II in the
post-editing task.
4.6 Procedures
4.6.1 Environment
The experiment was carried out in a room without sunlight. Only artificial light was
used. Noises were also avoided so as not to disturb participants. As there was only one
eye tracking machine, participants were tested one by one, as arranged in advance by
the experimenter. All the participants were informed in advance that they were not
allowed to drink or eat anything that contains caffeine. Female participants were not allowed to wear
eye make-up. Caffeine intake and eye make-up might lead to invalid data.
4 It was advised that a double-spaced setting should be adopted so that fixations could be located more exactly. Yet this version
of Translog II didn’t support double spacing.
Before the task, participants were asked to answer a brief questionnaire which was
designed to make a record of their individual profile, including name, sex, age,
education and their personal attitude towards PE and human translation. They were
assured that this information would be kept confidential and used only for research
purposes. There were no warm-up tests for participants since they had to carry out six tasks,
which was already quite a heavy workload for them. The workload may also affect data quality.
(1) Guidelines. After the questionnaire was answered, a brief introduction to the
eye tracker was made to the participant so that they could know how this machine
worked. They were told that the quality of post-editing output should be comparable,
as follows:
- No time constraint
(2) Calibration. Before the task, Translog II should be connected to the remote
5 The first two guidelines were taken from O’Brien (2009) based on Wagner (1985).
Figure 4-6: Connecting Translog II to eye tracker
following the yellow dot that appeared on the screen with their eyes, usually focusing on
the center of the dot. If the calibration data was insufficient or the calibration was
unsuccessful, a recalibration would be done. The calibration results are shown in
Figure 4-7.
translation task begin. There are in total six independent tasks for each participant (the
tasks are established in advance). Before each task, calibration will be conducted to
ensure that eye-tracker could exactly record eye activities and guarantee the quality of
between the eye tracker and the participant should be within 50 to 80 cm, as suggested
by Tobii manual.
(3) Translation or PE tasks. After the calibration, participants started their tasks.
There were in total six short tasks which were semi-randomized. Participants
translated or post-edited the texts according to the requirements and saved the data
one by one. During the translation or post-editing task, participants were required to
stay in a static position and try to touch type so that they could focus on the screen as
much as possible.
(4) Data saving. After each task, data would be saved once and named the same
as the task file. Data saving was processed by the experimenter or by some capable
CHAPTER FIVE
participants wear, varying lighting conditions, and participants’ distance from the
monitor, and so forth (O’Brien, 2009; Hvelplund, 2011; Korpal, 2015). To minimize
were taken in this research: curtains were drawn so that no daylight was allowed to
enter the room; the same artificial light was lit during all experiments, day or night.
Participants were required to sit directly in front of the monitor. The distance between the
monitor of remote eye tracker and the participant was within 50 to 80 cm, as
As for the variables collected to reflect cognitive effort, some thresholds were set
based on the previous studies. For the average fixation duration, we followed the
threshold of Sjørup (2013) and Lourenço da Silva et al. (2015) as 180 milliseconds,
which means that data with an average fixation duration below 180 ms are discarded. Gaze time on
screen (GTS), total gaze time divided by total task time, was used to measure the
quality of eye-tracking data. A low GTS might indicate that, most of the time in the
that the eye tracker has lost track of the eyes. For GTS, the threshold was set as
We also set another threshold to guarantee the eye-tracking data quality: the
percentage of valid win gaze data proposed by Lourenço da Silva et al. (2015). There
are two areas of interest (AOI) in this research: the source text (ST) area and the target
text (TT) area. As the interface for translation and post-editing work in this research is
Translog-II User (see Figure 4-5), the AOIs are defined by default by the software with the
The percentage of valid win gaze data equals the “number of occurrences of
win=1 (gaze on ST) plus those of win=2 (gaze on TT)” divided by “the total number
of wins which also include win=0 (gaze ascribed to neither the ST AOI nor the TT
AOI)” (Lourenço da Silva et al., 2015: 150). The threshold for this was 40%. Detailed
information about “win” and the values of “win” is given in Table 5-1.
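The data-quality criteria above can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the screening script used in the study: the function names and sample values are hypothetical, the GTS cut-off is left as a required parameter, and values exactly at a threshold are assumed to pass.

```python
from typing import Sequence

MIN_MEAN_FIXATION_MS = 180.0   # average fixation duration threshold from the text
MIN_VALID_WIN_PERCENT = 40.0   # valid win gaze data threshold from the text


def gaze_time_on_screen(total_gaze_time_ms: float, total_task_time_ms: float) -> float:
    """GTS: total gaze time divided by total task time."""
    if total_task_time_ms <= 0:
        raise ValueError("task time must be positive")
    return total_gaze_time_ms / total_task_time_ms


def valid_win_percentage(win_values: Sequence[int]) -> float:
    """Share of gaze samples with win=1 (ST) or win=2 (TT) among all samples."""
    if not win_values:
        raise ValueError("no gaze samples")
    valid = sum(1 for w in win_values if w in (1, 2))
    return 100.0 * valid / len(win_values)


def passes_quality_checks(mean_fixation_ms: float, gts: float,
                          win_values: Sequence[int], gts_threshold: float) -> bool:
    """True when a recording meets all three criteria (inclusive at each threshold)."""
    return (mean_fixation_ms >= MIN_MEAN_FIXATION_MS
            and gts >= gts_threshold
            and valid_win_percentage(win_values) >= MIN_VALID_WIN_PERCENT)


# Made-up sample values; gts_threshold=0.3 is a placeholder, not the study's value.
wins = [1, 1, 2, 0, 0]
gts = gaze_time_on_screen(45_000, 60_000)
print(valid_win_percentage(wins), gts)
print(passes_quality_checks(420.0, gts, wins, gts_threshold=0.3))
```

Keeping the GTS cut-off as an explicit argument avoids hard-coding a value that this sketch cannot confirm.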
In this experiment, eye movement data was collected through the Translog-II software
installed on the host computer of the remote eye tracker, and all the data was stored in xml
files. All the keylogging data and eye-tracking data could be found in the xml files.
The eye-tracking data was extracted through derivations in Microsoft Excel from the
xml document. The data stored in the xml file was arranged as Figure 5-1. Terms in
Translog II collects and processes gaze data in three steps: (1) Check whether the
gaze is within the source or the target window; (2) Compute fixations based on a
variant of a certain algorithm; (3) Map gaze points and fixations to the closest character
on the screen. It is worth noting that step 3 is incompatible with Chinese and
Japanese Input Method Editors. Yet, the language pair studied in this research is
The log file should be opened and replayed in the Translog II Supervisor. After the
replay, the newly produced xml file should be saved and used for data collection.
translation, HT for short, and PE) and Text type (three levels: economic, political and
literary text) are within-subject factors and competence (two levels: postgraduate and
There are in total three independent variables in this research, as indicated above.
As for the dependent variables, since we intend to investigate the temporal and
cognitive aspects of PE and HT, dependent variables chosen for this research are ①
duration. Data collected from the experiment are total task time, fixation counts (in ST
In the following section, in order to save space in figures or tables, some of the
independent and dependent variables are abbreviated. The abbreviations and their
Table 5-2: Abbreviations used in tables and figures
Abbreviation   Corresponding Variable
UG             Undergraduate
PG             Postgraduate
HT             Human translation
PE             Post-editing
E              Economic text
P              Political text
L              Literary text
Cpt            Competence
Processing speed here is the number of words that could be processed by a participant
in one minute. It is derived by dividing “the number of source text words” by “total task
time”. Relations between processing speed and temporal effort can be described as: the
Figure 5-4, Table 5-2 and Table 5-3. The overall figures of processing speed for all
As is shown in Figure 5-4, for both undergraduates and postgraduates, the
average processing speed for translation is lower than that for post-editing,
irrespective of text types. When text type is taken into consideration, according to
Table 5-3 and Table 5-4, processing speed for post-editing is still higher than that for
translation (14.65 vs. 8.97 for economic text; 18.86 vs. 10.63 for political text; and
(shown in Table 5-5) indicate that the main effect of task (HT vs. PE) is highly
significant, F (1, 18) = 16.401, p < 0.01. Post-editing task is faster than translation
task. The main effect of text is also highly significant, F (1, 18) = 37.683, p < 0.01.
Literary text is processed the fastest, followed by political text and then economic
text.
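As a minimal sketch of how the processing speed reported in this section is obtained (words processed per minute, i.e. source-text word count divided by total task time), the following Python function may help. The function name and the sample figures are hypothetical, and task time is assumed to be recorded in seconds.

```python
def processing_speed(source_word_count: int, task_time_seconds: float) -> float:
    """Words processed per minute: source word count / task time in minutes."""
    if task_time_seconds <= 0:
        raise ValueError("task time must be positive")
    task_time_minutes = task_time_seconds / 60.0
    return source_word_count / task_time_minutes


# Made-up example: a 75-word source text finished in 300 seconds (5 minutes).
print(processing_speed(75, 300.0))  # 15.0 words per minute
```

A higher value means the same amount of text was handled in less time, which corresponds to less temporal effort.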
text and competence. The interaction among text, task and competence is also not
Hyönä et al. (1995) found that pupillary responses could suitably indicate the
cognitive load of mental processing activities. The pupil dilates as task
difficulty and processing load increase. The pupillary data collected in this research includes
pupil dilation for both left and right eyes. Nevertheless, as there is a high degree of
concordance for left and right eyes (Niehaus, Guldin & Meyer, 2001), pupil dilation
reported here is the average dilation for both eyes. The overall pupil dilation figures
indicated in Table 5-6. According to the results, the main effect of task is highly
significant, F (1, 18) = 20.560, p < 0.01, which indicates that irrespective of other
factors, pupil dilation tends to be larger when students are conducting the translation task
(2.96 mm) than when doing post-editing work (2.89 mm). The main effect of text is also
significant, F (1, 18) = 5.771, p < 0.05. Yet, the main effect of competence does not
prove significant (F < 1).
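The averaging of left-eye and right-eye pupil dilation described in this section can be sketched as follows. This is a hedged illustration only: the sample layout (pairs of left and right values in millimetres) is hypothetical, and falling back on the one available eye when the other is missing is an assumption of this sketch, not a step documented in the study.

```python
from statistics import mean
from typing import Iterable, Optional, Tuple

Sample = Tuple[Optional[float], Optional[float]]  # (left_mm, right_mm)


def mean_pupil_dilation(samples: Iterable[Sample]) -> float:
    """Average the two eyes per sample, then average over all usable samples."""
    per_sample = []
    for left, right in samples:
        eyes = [v for v in (left, right) if v is not None]
        if eyes:  # skip samples where both eyes are missing
            per_sample.append(sum(eyes) / len(eyes))
    if not per_sample:
        raise ValueError("no valid pupil samples")
    return mean(per_sample)


# Made-up samples: one complete, one with the right eye missing, one lost entirely.
print(mean_pupil_dilation([(3.0, 3.5), (2.75, None), (None, None)]))  # 3.0
```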
The interaction between text and competence is significant, F (1, 18) = 6.876, p <
0.05. Therefore, a simple effect test should be conducted in case the interaction
effect masks or distorts the main effect of competence. However, as our interest
lies in the interaction between task and competence, task and text or the interaction of
the three factors, none of which is statistically significant (p > 0.1), the competence *
In order to show more clearly what the main effect of task is and how it changes
when conducted by students with different competence in terms of different text type,
the post hoc comparisons are carried out and shown in Figure 5-5, 5-6 and 5-7.
The post hoc comparisons reveal that, irrespective of text types and competence,
pupil dilation for post-editing is significantly reduced compared with human translation,
indicating a reduction of cognitive effort. For literary and political
texts (Figure 5-6 and 5-7), pupil dilation of postgraduates, for both post-editing and
for postgraduates and undergraduates. Nevertheless, for economic text (see Figure
5-5), postgraduates expend more cognitive effort in translation than undergraduates,
undergraduates.
Figure 5-7: Estimated Marginal Means of Pupil Dilation of Literary Text
A fixation forms when one looks stably at a certain object. In translation process
research that collects eye movement data, fixation count is commonly used to indicate
the amount of cognitive effort (e.g. Hvelplund, 2011). To some extent, more fixation
counts indicate that more cognitive effort is spent in the translation process. For
fixation count, we consider not only the total fixation count but also the
fixations distributed to the ST area (i.e. win 1 or the ST AOI) and the TT area (i.e. win 2 or
the TT AOI) so as to see the allocation of cognitive effort between the two AOIs. The
fixation counts distributed in TT AOI and fixation counts distributed in both AOIs, are
attached in Appendix E.
Total fixation counts are the fixations in both source text and target text areas. According
to the ANOVA conducted on all fixation counts, the main effect of task is
highly significant, F (1, 18) = 10.720, p < 0.01. The overall mean fixation count for
post-editing is 590.18, 13.33% lower than that for human translation (680.97). It
indicates that cognitive effort spent in the process of post-editing is less than that
The main effect of text is also highly significant, F (1, 18) = 17.729, p < 0.01,
which indicates that cognitive effort for post-editing varies with the text type. Yet, the
interaction effect between Task and text is not significant, p > 0.1. Besides, the Task *
Competence interaction is also not statistically significant (for all interactions, p <
1).
Table 5-9: Descriptive Statistics of Total Fixation Counts for both
human translation and post-editing tasks (N=20)
Condition  Competence     Mean      SD
HT-E       UG          1009.55  468.03
           PG           663.89  211.23
           Total        854.00  406.46
HT-P       UG           781.36  259.01
           PG           653.78  343.45
           Total        723.95  298.69
HT-L       UG           547.55  122.70
           PG           600.00  377.61
           Total        571.15  262.07
PE-E       UG           778.27  363.90
           PG           756.78  399.57
           Total        768.60  370.19
PE-P       UG           646.09  258.55
           PG           524.33  202.17
           Total        591.30  237.18
PE-L       UG           466.82  173.62
           PG           506.44  189.89
           Total        484.65  177.36
cognitive effort spent (p < 1). However, from the post hoc comparisons (Figure 5-8,
5-9 and 5-10), for undergraduates, there’s always a reduction in fixation counts from
translation task to post-editing task, whatever the text type. For postgraduates, this is
not always the case. When post-editing economic text, postgraduates have more
fixation counts than when translating it. In addition, postgraduates tend to have fewer
fixation counts than undergraduates for economic and political texts, but more for
literary text.
Figure 5-8: Estimated Marginal Means of Total Fixation Counts of Economic Text
Figure 5-9: Estimated Marginal Means of Total Fixation Counts of Political Text
Figure 5-10: Estimated Marginal Means of Total Fixation Counts of Literary Text
Figure 5-11: Fixation Counts in win 1 (ST AOI) for translating or post-editing different
types of texts conducted by undergraduates and postgraduates
Note: figures in the red square frame are mean values for the three ( e.g. PE-L, PE-P and PE-E)
For this research, as the material is presented with Translog II interface which is
divided into two parts, the source text (for both translation and post-editing task) is
shown in the upper part of the interface and raw machine translation (only for
post-editing task) in the bottom part. The upper part is defined as ST AOI (term used
in the final xml file is “win = 1”). Fixation counts distributed to this area are
Results of the three-way ANOVA test conducted in terms of the fixation counts
in win 1 reveal that the number of fixation counts in the ST AOI differs significantly
between post-editing task and translation task (F (1, 18) = 42.677, p < 0.01).
Students, including postgraduates and undergraduates, look more into the source text
area during the translation process (313.17 for translation vs. 180.28 for post-editing).
The main effect of text also proves significant, F (1, 18) = 8.029, p < 0.05. For
post-editing task, the number of fixation counts in ST AOI of economic texts is the
largest (206.15), followed by literary texts (172.2) and then political texts (162.5). The
(1, 18) = 7.970, p < 0.05), will not be further discussed here.
The main effect of Competence is not significant (F < 1). Interaction effects
between Competence and Task, Task and Text, and among the three factors also do
not prove significant (p < 1).
As participants are required to translate or post-edit in the bottom part of the Translog
II interface where raw machine translation output is provided for PE task, fixation
counts in TT AOI (also win 2) are interpreted as cognitive effort spent when
Figure 5-12: Average Fixation Counts in win 2 (TT AOI) for translating or post-editing
different types of texts conducted by undergraduates and postgraduates
Note: figures in the red square frame are mean values for the three ( e.g. PE-L, PE-P and
PE-E)
The ANOVAs conducted in terms of the fixation counts in win 2 indicate that the
main effect of task is not significant (p < 1). Fixation counts distributed in TT AOI do
not vary much between PE and HT. Yet, the number of fixation counts in TT AOI for
post-editing task exceeds that for translation, which means translators look more into
The main effect of text is still highly significant, F (1, 19) = 15.490, p < 0.01.
boasts the largest (768.60) number of fixation counts in TT AOI, followed by political
texts (591.30) and by literary texts (484.65).
For a translator, longer fixation duration indicates deeper cognitive processing. In the
last section, fixation counts distributed to ST and TT AOI were discussed to infer the
well as indices of cognitive effort. Overall figures of fixation duration for all
As listed in Table 5-13, the main effect of task is highly significant (F (1, 18) = 22.370,
p < 0.01), which means there is great difference between post-editing and human
translation in terms of average fixation duration (520.72 ms for human translation vs.
435.61 ms for post-editing). More effort is expended when participants translate from
scratch, compared with post-editing. This result remains the same for postgraduates
(476.49 ms for human translation vs. 405.22 ms for post-editing) and undergraduates
(540.86 ms for human translation vs. 460.48 ms for post-editing).
Condition  Competence    Mean      SD
HT-E       UG          544.07   66.89
           PG          490.92   67.65
           Total       520.15   70.84
HT-P       UG          549.63   68.16
           PG          515.61   90.12
           Total       534.32   78.53
HT-L       UG          528.88   85.06
           PG          422.94  174.99
           Total       481.21  140.09
PE-E       UG          464.15   84.86
           PG          402.48   43.36
           Total       436.40   74.65
PE-P       UG          448.09   71.85
           PG          383.83   45.59
           Total       419.18   68.32
PE-L       UG          469.19   97.42
           PG          429.36  114.63
           Total       451.27  104.60
The main effect of Competence is marginally significant, F (1, 18) = 4.387, p =
0.051. It is shown clearly in Table 5-14, that for all tasks and all text types, the
The main effect of text is not significant (F < 1). Also, interactions between
competence and text, competence and task, task and text and among the three factors
Average fixation duration in ST AOI reflects the cognitive effort spent on understanding the
source text.
vary greatly in source text comprehension, as the main effect of task shows
statistically significant difference, F (1,18) = 30.254, p < 0.01. For human translation,
the average fixation duration of ST AOI is 424.59 ms, while for post-editing, the
average fixation duration of ST AOI is 348.54 ms, a reduction of 17.9%. It is the same
case for postgraduates and undergraduates, with a respective reduction of 20% and
16%.
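The percentage reductions quoted in this chapter follow the usual relative-difference formula, (baseline - reduced) / baseline. The short Python check below is only an illustrative sketch with a hypothetical function name; it recomputes two of the reported figures from the means given in the text.

```python
def percent_reduction(baseline: float, reduced: float) -> float:
    """Relative reduction from baseline to reduced, expressed in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - reduced) / baseline


# Average fixation duration in ST AOI: human translation vs. post-editing.
print(round(percent_reduction(424.59, 348.54), 1))  # 17.9
# Total fixation counts: human translation vs. post-editing.
print(round(percent_reduction(680.97, 590.18), 2))  # 13.33
```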
In addition, the interaction between task and text proves significant. The post hoc
comparisons are conducted and shown in Figure 5-13. For all text types, average
fixation duration in ST AOI for post-editing is shorter than that for human translation,
relatively saved. However, as is shown in the figure, the amount of effort saved varies
among different text types. Effort saved when translators post-edit economic and
political texts is similar. However, for literary texts, comparatively little cognitive
effort is saved.
Figure 5-13: Estimated Marginal Means of Average Fixation Duration in ST AOI
Average fixation duration in TT AOI reflects cognitive effort spent on target text
production.
indicate that human translation and post-editing vary greatly in target text production.
The main effect of task is highly significant, F (1,18) = 54.454, p < 0.01. For human
translation, the average fixation duration of TT AOI is 580.55 ms, while for
20.9%. Translators spend more cognitive effort on target text production when
Table 5-18: Descriptive Statistics of the average fixation duration for TT AOIs
Condition  Competence    Mean      SD
HT-E       UG          590.08   87.98
           PG          533.86   57.21
           Total       564.78   79.22
HT-P       UG          636.93  100.41
           PG          596.95  122.56
           Total       618.94  109.76
HT-L       UG          604.21  113.60
           PG          552.55  109.27
           Total       580.97  111.87
PE-E       UG          494.30   94.24
           PG          426.49   50.29
           Total       463.78   83.29
PE-P       UG          481.04   93.21
           PG          408.50   55.68
           Total       448.40   85.14
PE-L       UG          476.83   73.61
           PG          465.93  124.83
           Total       471.93   97.18
5-18, average fixation duration for undergraduates is a bit longer than postgraduates.
post-editing, compared with human translation. Fixation duration differences among
translation post-editing for Chinese-English language pair and for college English
learners, three research questions were raised and correspondingly based on previous
studies three hypotheses were proposed. The following part of this section will discuss
Hypothesis 1: Temporal and cognitive effort for post-editing is less than that for
For the first hypothesis, we assume that post-editing could save both time and
cognitive effort, compared with traditional human translation. Firstly, in terms of time
saving, results of ANOVA in terms of processing speed, since the dependent variable
Three-way ANOVA test shows that the main effect of task is highly significant (p <
post-editing and translation task. Together with the detailed figures for processing speed, it
is clear that post-editing does save time, that is, temporal effort as defined by Krings
(2001).
translation memory tool. Among the four match types O’Brien chose, the “no match” type
requires the translator to translate from scratch and the “MT match” type requires the translator to
edit the machine-translated output, which correspond closely to human translation and
post-editing. Results show that the processing speed for “MT match” is nearly twice
that for “no match” (O’Brien, 2006: 190). However, only four
participants took part in the experiment, as O’Brien described the research as
preliminary and novel in nature. García (2011) also reported similar conclusions,
although the variable he chose was total task time. García (2010) first reported a study
that showed no significant difference between post-editing and translating, though
post-editing was faster than manual translation. García (2011) then reported
another study, and this time a statistically significant difference was found. He
2011). Guerberof (2009) also compared machine translation with translation from
scratch and reported a higher speed for post-editing than manual translation (although
support that less cognitive effort is required in the post-editing process than in the translating
process. The indices of cognitive effort in this research are pupil dilation, total fixation
counts and average fixation duration in both AOIs. For pupil dilation, according to the
ANOVA result, the main effect of task is highly significant (p < 0.01), indicating that
pupil size for human translation is larger than that for post-editing. Thus, cognitive
post-editing. For total fixation duration, the ANOVA result also shows a significant
fixations in the translation process than in the post-editing process. In addition, the
ANOVA result for total fixation counts also shows that the number of fixations
Results of our experiment are consistent with the findings of Lourenço
da Silva et al. (2015), who investigated the processes of
post-editing and human translation from Portuguese to Chinese. Results for all
dependent variables concerning eye activity in both AOIs (including fixation counts
and average fixation duration) all proved significant, indicating a reduction of overall
cognitive effort for the post-editing task. As for pupil dilation, the study conducted by
O’Brien (2006) offered a similar finding: the percentage change in pupil dilation for
post-editing machine translation was lower than that for translating from scratch,
which suggested that translating from scratch required more cognitive effort.
The reasons for the reduction of temporal and cognitive effort in the post-editing
task may be as follows. First, compared with translation from scratch, translators are
provided with an already translated text in the post-editing task. In other words, the main
save translators from the effort of typing the whole translation text (target text).
although with deficiencies, translators do not have to re-understand the whole source
text. They only need to ponder the points that are not well translated by the machine
translation system. These two reasons are further supported in the discussion of
Hypothesis 3, which looks more specifically into source text
comprehension and target text production. As for the significant difference in
processing speed found in this research but not in García (2010), one
explanation is that different translation tools were used: Google Translator
Toolkit (translation memory) for García and free Google Translate for this research.
Besides, text type and text difficulty will also affect the result, let alone the
Hypothesis 2: Competent translators (in this case, postgraduates) will spend
less effort, with respect to time and cognition, than less competent ones (in this case,
undergraduates). The reduction of time and cognitive effort varies among different text
types.
Results of the experiment partially support this statement. For most dependent
variables (fixation counts in the two AOIs, pupil dilation and processing speed), ANOVA
results indicate that the main effect of competence is not statistically significant, although
some differences do exist. For average fixation duration of both AOIs, main
between task and competence is not significant. The post hoc comparison reveals that
undergraduates fixate longer on the screen than postgraduates, either in the translation
competence, though not significant. The impact is as follows: postgraduates carry out
the post-editing task faster than undergraduates; undergraduates always have larger pupil
experience influences post-editing performance. This study showed that the most
experienced translators completed the post-editing task the fastest, while the translator with
the least experience was the slowest. This result accords with our hypothesis. Balling
and Carl (2014) used the large resources in the CRITT database - a translation process
investigated. However, Balling and Carl (2014) claimed that translator experience
has a smaller influence than they had supposed. The possible explanation they proposed was
individual differences among translators. In this
research, the reason why only one variable shows a significant difference is that
possible that a certain undergraduate has done a lot of translation practice and
The second statement is that text type affects temporal and cognitive effort.
ANOVA results also support this statement. Results for text types are significant for
processing speed, pupil dilation and total fixation counts (p < 0.05). However, in this
research, the effort made to keep the texts for post-editing and translating at the same
level of difficulty is limited to texts of the same type. As Hvelplund (2011) noted, the
difficulty level of a text is hard to assess, let alone that of texts of different types, so
we cannot claim that the three kinds of source texts are of the same difficulty.
Therefore, no comparison among the three text types is drawn. We only examine
whether, for a given text type, effort is reduced in the post-editing task.
According to the post hoc comparisons, for all text types, undergraduates are faster in
post-editing than in the translation task. Postgraduates, however, are faster when
post-editing economic texts. Combined with the results for fixation counts and average
fixation duration, which show that postgraduates dealing with economic texts
have more fixations but shorter fixation durations, this abnormality can be explained.
It is possible that, as the economic text chosen in this research contains many figures,
postgraduates tend to recheck the correctness of these figures, which produces more
fixations and costs more time. However, the cognitive effort for post-editing economic
translation process than during post-editing process, while cognitive effort required to
After researching into the overall temporal and cognitive effort saving, we want
post-editing and translating from scratch. To this end, we collect fixation counts
distributed to ST AOI and TT AOI and the average fixation duration of ST AOI and
For the ST AOI, there are significantly more fixation counts for translation than
for post-editing (p < 0.01). Besides, average fixation duration of ST AOI for human
translation is also significantly higher than that for post-editing. All these indicate that
translators spend more cognitive effort on source text comprehension when translating
Nevertheless, the number of fixation counts in TT AOI for post-editing task exceeds
that for translation, which means translators look more into target text area in
longer for the translation task, which reveals that more cognitive effort is devoted to the TT
area in human translation. For target text production, human translation also requires
more cognitive effort than post-editing. We can conclude that post-editing saves cognitive
effort in both source text comprehension and target text production.
(2015), which also found a significant difference for the source text (in terms of fixation
counts and fixation duration in the ST AOI) and no significant difference for the target text
except for average fixation duration. Carl et al. (2011) also found a significant
difference in fixation duration on the source text: longer for human translation than for
post-editing. The reason for the larger number of fixations in the TT AOI for post-editing was
explained by Carl et al. (2011). That is, when dealing with a post-editing task,
translators usually first read the raw machine translation output before they
compare it with the source text; moreover, after correcting the deficiencies in the text
provided, they read it again and recheck its correctness. However, although the
read-and-check behavior leads to more fixations, the effort needed for correcting
is still less than the effort needed for reformulating and typing the target text when translating
from scratch.
comprehension and target text production between post-editing and translation from
scratch, we also want to explore the allocation of cognitive effort to source text
comprehension and target text production for post-editing and for translating from
scratch. A comparison between number of fixation counts in ST AOI and TT AOI for
translation task shows that translators fixate more on the target text area than on the source text
area. In the post-editing task, translators also look more at the target text. A comparison
between average fixation duration in ST AOI and TT AOI indicates that for both
the post-editing and the translation task, average fixation duration on the target text is longer than
on the source text. In other words, for both post-editing and translating from scratch,
more cognitive effort is allocated to target text production than source text
which found that for human translation, fixation duration on source text was longer
than that on target text; while for post-editing, fixation duration on source text was
shorter than that on target text. The touch-typing ability of the translators may explain this
difference. Participants in Koglin’s (2015) study were professional translators with
years of translation experience, while participants in this study are all students, most
of whom cannot touch-type. Therefore, they may focus more on the target text while
typing it. Carl et al. (2011) also obtained similar results.
5.4.4 Summary
To sum up, based on the discussion above, three conclusions can be drawn. First of all,
post-editing could save both time and cognitive effort, compared with traditional
human translation. Second, in most cases, translators with more translation experience
spend less temporal and cognitive effort on post-editing than on translating from scratch.
However, text type may influence this reduction. In other words, some texts
are suitable for post-editing and some are not. Last but not least, translators spend more
cognitive effort on both source text comprehension and target text production in the
translation process than in the post-editing process. For both human translation and
post-editing, more cognitive effort is allocated to target text production. Yet in this
noticing that all these conclusions are limited to the Chinese-English language pair
and to college English learners.
CHAPTER SIX
CONCLUSION
In Chapter One, three research questions were raised as follows: (1) In terms of
human translation temporally and cognitively different from each other? (2) What
influence will text type and the competence of translators have on cognitive effort
spent in post-editing? (3) What is the difference in the allocation of cognitive effort to
source text and target text between the human translation process and the post-editing
process?
Based on the data analysis and discussion in Chapter Five, the following
compared with human translation. In other words, translators tend to be faster when
post-editing than when translating from scratch. Besides, less cognitive
effort is required when translators post-edit a text. The reduction of cognitive effort is
shown by fewer fixations and shorter fixation durations on the text.
(2) The reduction in time and cognitive effort is subject to text types and
faster. Also, the cognitive effort spent by competent translators is less than that spent by
less capable translators. Text type also influences the reduction of effort. Since the
difficulty level of the different text types in this research is not guaranteed to be the same, no
conclusion concerning which text type is more suitable for post-editing could be
cognitive effort when translators post-edit all these three text types.
(3) The reason why post-editing could save temporal and cognitive effort is that
post-editing saves effort in both source text comprehension and target text production.
In post-editing process, translators focus more on the target text to check and correct
In all, considering the temporal and cognitive effort saving, post-editing should
6.2 Limitations
Although this research has drawn significant conclusions and can offer
experience and lessons for future research, it still has several limitations.
First, conclusions drawn from the study are restricted to the Chinese-English
Second, the conclusion we draw about text type can only tell that, for post-editing,
there are differences in the reduction of cognitive effort among different text types.
However, since the three kinds of texts chosen in this research were not verified to be
of the same difficulty, comparisons between different text types cannot be made.
Third, the results are also limited by the small sample size (although the sample is fairly large
for PE studies) and the short text length (given the restrictions of the eye-tracking
method).
Last, as the participants in this research are all students, the research results
Advances in machine translation systems and the demand for large-scale and
rapid information in the global world together push research on machine translation
and post-editing to the center of translation studies. In the future, more studies,
conducted. More specific research on the text types on the post-editing cognitive
REFERENCES
Allen, Jeffrey. 2003. Post-editing [A]. In Harold Somers (ed.), Computers and
Aziz, Wilker, Sheila Castilho, & Lucia Specia. 2012. PET: A Tool for Postediting and
Balling, Laura Winther & Michael Carl. 2014. Production Time across Languages and
Publishing, 239-268.
[J]. Journal of the Association for Information Science and Technology 2(4):
229-237.
Blatz, John, Erin Fitzgerald, George Foster, et al. 2004. Confidence Estimation for
Callison-Burch, Chris, Philipp Koehn, Christof Monz, et al. 2010. Findings of the
2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine
Callison-Burch, Chris, Philipp Koehn, Lucia Specia, et al. 2012. Findings of the 2012
Carl, Michael. 2009. Triangulating Product and Process Data: Quantifying Alignment
Units with Keystroke Data [A]. In Inger M. Mees, Fabio Alves & Susanne
Carl, Michael, Barbara Dragsted, Jakob Elming, et al. 2011. The Process of
Carl, Michael, Silke Gutermuth & Silvia Hansen-Schirra. 2015. Post-editing Machine
Doherty, Stephen, Sharon O'Brien & Michael Carl. 2010. Eye Tracking as an MT
Edmundson, Harold Parkins & David Glen Hays. 1958. Research Methodology for
Elming, Jakob, Laura Winther Balling & Michael Carl. 2014. Investigating User
Fiederer, Rebecca & Sharon O'Brien. 2009. Quality and Machine Translation: A
García, Ignacio. 2010. Is Machine Translation Ready Yet? [J]. Target 22(1): 7-21.
Green, Roy. 1982. The MT Errors Which Cause Most Trouble to Posteditors [J].
Green, Spence, Jeffrey Heer & Christopher D. Manning. 2013. The Efficacy of
Guerberof, Ana Arenas. 2009. Productivity and Quality in the Post-Editing of Outputs
from Translation Memories and Machine Translation [J]. Localization Focus 7(1):
11-21.
From a Quality and Productivity Perspective [A]. In Sharon O'Brien et al. (eds.),
Holmqvist, Kenneth, Marcus Nyström, Richard Andersson, et al. 2011. Eye Tracking:
Hu, Chang, Philip Resnik, Yakov Kronrod, et al. 2011. The Value of Monolingual
School.
Hyönä, Jukka, Jorma Tommola & Anna-Mari Alaja. 1995. Pupil dilation as a Measure
Iqbal, Shamsi T., Piotr D. Adamczyk, Xianjun Sam Zheng, et al. 2005. Towards an
Jakobsen, Arnt Lykke & Kristian T. H. Jensen. 2008. Eye Movement Behaviour
across Four Different Types of Reading Task [A]. In Susanne Göpferich, Arnt
Lykke Jakobsen & Inger M. Mees (eds.), Looking at Eyes: Eye-tracking Studies
Press, 103-124.
Just, Marcel Adam & Patricia A. Carpenter. 1976. Eye Fixations and Cognitive
Just, Marcel Adam & Patricia A. Carpenter. 1980. A Theory of Reading: From eye
Koehn, Philipp. 2010. Enabling Monolingual Translators: Post-editing vs. Options [A].
In Ron Kaplan et al. (eds.), Human Language Technologies: The 2010 Annual
537-545.
Koponen, Maarit & Leena Salmi. 2015. On the Correctness of Machine Translation: A
Mikołaj Deckert (eds.), Accessing audiovisual translation [C]. Łódź: Peter Lang,
199-212.
Lacruz, Isabel, Gregory M. Shreve & Erik Angelone. 2012. Average Pause Ratio as an
O’Brien, Michel Simard & Lucia Specia (eds.), Proceedings of the AMTA 2012
Translation, 21-30.
Lacruz, Isabel & Gregory M. Shreve. 2014. Pauses and Cognitive Effort in
(1986) 105-109.
Lourenço da Silva, Igor A., Márcia Schmaltz, Fabio Alves, et al. 2015. Translating
Exploratory Study of Key Logging and Eye Tracking [J]. Translation Spaces
4(1): 144-168.
Mesa Lao, Bartolomé. 2014. Gaze Behaviour on Source Texts: An Exploratory Study
Sharon O’Brien, Michel Simard & Lucia Specia (eds.), Proceedings of Machine
Eye Tracking Analysis [A]. In Susanne Göpferich, Arnt Lykke Jakobsen & Inger
Translation 25(3):197-215.
Plitt, Mirko & François Masselot. 2010. A Productivity Test of Statistical Machine
Slocum, Jonathan. 1985. A Survey of Machine Translation: Its History, Current Status,
and Future Prospects [J]. Computational linguistics 11(1): 1-17.
Smallwood, Jonathan & Jonathan W. Schooler. 2006. The Restless Mind [J].
Sousa, Sheila C. M., Wilker Aziz & Lucia Specia. 2011. Assessing the Post-editing
Specia, Lucia, Marc Turchi, Nicola Cancedda, et al. 2009. Estimating the
Specia, Lucia, Nicola Cancedda & Marc Dymetman. 2010. A Dataset for Assessing
Post-Editing Speed, and Some Other Factors [A]. In Laurie Gerber et al. (eds.),
332-339.
2-13.
崔启亮. 2014. 论机器翻译的译后编辑[J]. 中国翻译 (06): 68-73.
67-89.
界(01): 65-72.
(03): 129-135.
83-87.
学.
学.
APPENDICES
A1
业第一。
A2
企业 117 亿元。
B1
今年我国发展面临的困难更多更大、挑战更为严峻。不过我们有中国特色社会主义制度
和中国人民勤劳智慧,只要我们团结一致,就一定能够实现全年经济社会发展目标。
B2
中国的发展从来都是在应对挑战中前进的。经过多年的快速发展,我国物质基础雄厚,
经济潜力足。改革开放也不断注入新动力。任何艰难险阻都挡不住中国发展的步伐。
C1
时间即生命。没有人不爱惜他的生命,但很少人珍视他的时间。如果想在有生之年做一
点什么事,学一点什么学问,充实自己,使生命成为有意义,那么就不可浪费光阴。
C2
零碎的时间最可宝贵,但是也最容易丢弃。我们的时间往往于不知不觉中被荒废掉。那
些在“度周末”的美名之下把时间大量消耗的人,他是在“杀时间”,也是在杀他自己。
Appendix B: Raw Machine Translation Output
All these raw machine translation outputs were produced by Google Translate on
December 7, 2016.
A1
The first three quarters of this year, Hengda accumulated sales of about 280.5 billion yuan.
Cumulative sales area and sales price were 34.57 million square meters and 8115 yuan /
square meters, respectively, compared with 2015 increased by 106.0% and 5.8%. Total sales
A2
The first nine months of this year, Country Garden has achieved sales of 226.69 billion yuan,
up 43.7%. Operating income of 117.05 billion yuan, an increase of 20.5%. Last year, Country
Garden to achieve sales of 162.9 billion yuan during the year, leading the second enterprise
11.7 billion.
B1
This year China's development is facing more difficulties and challenges. However,
we have the socialist system with Chinese characteristics and the hard-working
wisdom of the Chinese people. As long as we unite as one, we will be able to achieve
B2
China's development has always been in response to challenges in the forward. After years of
rapid development, China has a solid material foundation and sufficient economic potential.
Reform and opening up also continue to inject new impetus. Any difficulties and obstacles are
C1
Time is life. No one does not care for his life, but few people cherish his time. If you
want to do something in your lifetime, learn a little knowledge, enrich yourself, make
C2
The most precious piece of time, but also the most likely to discard. Our time is often
unwittingly abandoned. Those who spend a lot of time under the fame of "weekend" are
Appendix C: Overall Figures of Processing Speed
Processing Speed
HT PE
Text-E Text-P Text-L Text-E Text-P Text-L
Undergraduates
P1 3.89 5.13 11.01 5.88 10.99 13.27
P2 8.00 9.16 9.04 13.49 15.26 20.60
P3 6.26 9.64 16.29 8.14 19.44 21.35
P4 / 8.14 15.03 / 15.53 29.81
P5 7.46 9.10 13.48 7.15 10.01 20.12
P6 5.48 9.52 12.15 11.06 9.38 21.86
P7 / 9.44 18.55 / 16.08 16.76
P8 11.19 9.71 13.84 13.03 11.41 14.50
P12 3.58 7.32 12.58 7.89 9.22 26.37
P15 6.53 9.02 13.48 9.13 15.71 10.13
P17 6.38 / 13.90 17.39 / 12.58
P18 7.25 / 14.47 17.61 / 16.77
P23 7.94 10.84 16.27 40.21 36.98 45.62
P25 10.67 8.56 15.56 24.09 12.28 22.31
P26 7.53 12.22 14.80 9.30 27.97 13.15
Average 7.09 9.06 14.03 14.18 16.17 20.35
Total 10.06 16.90
Postgraduates
P9 17.25 19.83 37.90 37.52 31.58 23.69
P10 18.07 16.49 20.49 31.09 33.01 66.13
P11 6.31 / 16.41 12.26 / 39.34
P13 8.57 9.37 10.94 10.05 16.90 11.02
P14 10.14 10.74 11.22 13.07 29.15 25.25
P16 12.70 14.61 8.87 11.84 30.93 21.26
P19 8.30 10.96 14.65 16.39 27.13 19.32
P20 8.16 11.85 11.82 8.51 12.07 12.78
P21 11.82 7.24 7.88 6.68 14.68 13.28
P22 / 12.20 17.96 / 22.25 20.57
P24 9.97 12.71 11.82 10.75 11.20 16.01
P27 / 16.33 21.35 / 26.06 48.96
P28 8.55 13.19 18.57 9.48 22.85 20.23
P29 / 8.34 7.87 / 22.25 18.23
P30 / 14.25 15.54 / 14.87 16.13
Average 10.89 12.72 15.55 15.24 22.49 24.81
Total 13.06 20.85
Note: “/” means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
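The Average rows of the table can be checked against the individual values. The Python sketch below assumes, since the thesis does not state it, that each Average is the plain mean of the cells that were not discarded. Under that assumption the undergraduate Text-E columns give 7.09 for HT and 14.18 for PE, matching the Average row. The list and function names are hypothetical.

```python
from statistics import mean

# Undergraduate Text-E processing speeds copied from the table above, in the
# table's participant order (P1 ... P26); None stands for a discarded "/" cell.
ht_text_e = [3.89, 8.00, 6.26, None, 7.46, 5.48, None, 11.19,
             3.58, 6.53, 6.38, 7.25, 7.94, 10.67, 7.53]
pe_text_e = [5.88, 13.49, 8.14, None, 7.15, 11.06, None, 13.03,
             7.89, 9.13, 17.39, 17.61, 40.21, 24.09, 9.30]


def column_mean(values):
    """Average the retained cells of one column, skipping discarded ones."""
    kept = [v for v in values if v is not None]
    return mean(kept)


ht_mean = column_mean(ht_text_e)
pe_mean = column_mean(pe_text_e)

print(f"HT Text-E: {ht_mean:.2f}  PE Text-E: {pe_mean:.2f}")
print(f"PE/HT ratio: {pe_mean / ht_mean:.2f}")
```

The ratio printed at the end is only a descriptive summary of this one column pair and says nothing about statistical significance.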
Appendix D: Overall Figures of Pupil Dilation
Pupil Dilation
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
Undergraduate
P1 2.66 2.67 2.76 2.62 2.64 2.77
P2 2.85 2.94 2.99 2.95 2.91 2.95
P3 2.66 2.71 2.75 2.64 2.59 2.65
P4 / 2.52 2.51 / 2.48 2.46
P5 2.71 2.62 2.64 2.64 2.66 2.58
P6 3.36 3.38 3.41 3.36 3.23 3.23
P7 / 3.31 3.15 / 3.20 3.18
P8 3.45 3.38 3.47 3.50 3.43 3.44
P12 3.00 2.99 2.92 2.91 2.82 2.87
P15 2.86 2.76 2.78 2.67 2.63 2.70
P17 3.03 / 3.07 3.00 / 3.04
P18 3.51 / 3.38 3.28 / 3.28
P23 2.87 2.89 2.80 2.74 2.76 2.72
P25 3.25 3.20 3.25 3.26 3.17 3.20
P26 2.94 2.93 3.00 2.91 2.92 2.99
Average 3.01 2.95 2.99 2.96 2.88 2.94
Total 2.98 2.93
Postgraduate
P9 2.70 2.69 2.64 2.53 2.48 2.50
P10 2.97 2.81 2.80 2.78 2.75 2.72
P11 3.80 / 3.54 3.57 / 3.46
P13 3.25 3.25 3.18 3.01 3.12 3.05
P14 3.08 3.10 3.11 3.14 3.05 3.04
P16 3.32 / 2.99 3.09 / 2.92
P19 3.35 3.27 3.16 3.04 3.09 3.10
P20 3.37 3.13 3.27 3.38 3.20 3.21
P21 2.35 2.35 2.35 2.28 2.29 2.30
P22 / 2.88 2.80 / 2.71 2.90
P24 2.83 2.79 2.72 2.87 2.77 2.71
P27 / 3.43 3.38 / 3.33 3.27
P28 3.03 2.94 2.92 2.96 2.92 2.93
P29 / 2.63 2.62 / 2.60 2.67
P30 / 3.09 3.07 / 2.91 2.98
Average 3.09 2.95 2.97 2.97 2.86 2.92
Total 3.01 2.92
Note: “/” means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
Appendix E: Overall Figures of Fixation Counts
Note: "/" means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
Fixation Counts in Win 1
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
Undergraduate
P1 496 478 154 301 190 424
P2 416 178 145 112 56 46
P3 608 377 178 311 134 141
P4 / 296 239 / 130 75
P5 510 265 207 340 271 101
P6 362 233 185 182 218 130
P7 / 201 111 / 128 157
P8 196 244 196 114 177 165
P12 874 514 347 346 319 158
P15 696 401 413 465 156 377
P17 353 / 236 129 / 195
P18 412 / 287 182 / 225
P23 419 318 277 69 85 87
P25 245 313 205 101 267 128
P26 251 150 91 166 66 119
Average 449.08 305.23 218.07 216.77 169.00 168.53
Postgraduate
P9 187 181 99 95 58 59
P10 394 / 116 171 / 156
P11 295 304 233 141 100 107
P13 89 274 316 143 94 151
P14 237 / 373 189 / 142
P16 249 149 96 132 89 101
P19 526 405 315 237 347 323
P20 433 667 1074 484 267 356
P21 / 62 113 / 143 46
P22 237 172 241 177 124 121
P24 / 120 113 / 63 50
P27 425 286 189 147 127 219
P28 / 286 327 / 57 184
P29 / 171 157 / 100 130
Average 294.82 247.69 256.93 179.64 128.77 151.73
Note: "/" means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
Fixation Counts in Win 2
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
Undergraduate
P1 924 798 347 853 460 2
P2 89 425 549 442 429 347
P3 608 379 281 779 389 328
P4 / 475 188 / 350 322
P5 404 441 322 590 460 288
P6 577 310 237 372 482 215
P7 / 326 178 / 424 399
P8 384 532 394 363 618 387
P12 1262 728 467 925 919 309
P15 442 446 218 654 487 525
P17 428 / 209 451 / 459
P18 487 / 261 425 / 352
P23 403 297 183 154 176 135
P25 340 429 276 232 466 267
P26 599 339 351 690 282 456
Average 534.38 455.77 297.40 533.08 457.08 319.40
Note: “/” means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
Appendix F: Overall Figures of Average Fixation Duration
P28 481.36 533.15 0.00 424.33 410.67 448.24
P29 / 525.76 553.20 / 332.45 388.91
P30 / 480.74 471.17 / 465.97 493.83
Average 496.58 514.36 452.88 408.15 389.56 443.54
Total 487.94 413.75
Note: "/" means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
Average Fixation Duration in ST AOI
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
Undergraduate
P1 491.47 397.72 364.57 407.25 326.68 718.92
P2 600.66 495.39 476.00 383.33 302.21 440.83
P3 455.02 395.24 377.38 344.84 381.43 343.70
P4 / 471.56 502.00 / 362.27 287.43
P5 404.15 522.13 454.86 429.42 389.05 420.81
P6 537.68 437.25 448.14 525.64 494.61 351.22
P7 / 473.48 391.23 / 330.41 335.76
P8 468.13 396.07 381.04 430.45 352.47 314.67
P12 440.49 360.28 321.10 335.59 314.57 268.01
P15 400.12 362.45 371.88 315.76 325.78 408.56
P17 492.86 / 356.26 313.57 / 327.95
P18 429.42 / 343.68 309.20 / 305.69
P23 451.09 453.83 443.68 306.35 354.86 368.47
P25 539.72 462.18 469.77 418.66 415.52 351.16
P26 380.94 340.17 322.25 275.59 267.68 252.99
Average 468.60 428.29 401.59 368.90 355.20 366.41
P30 / 493.27 388.95 / 338.23 368.31
Average 443.65 424.34 379.00 331.91 311.95 353.82
Note: "/" means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
Average Fixation Duration in TT AOI
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
Undergraduate
P1 763.03 738.30 858.25 623.92 674.33 401.50
P2 476.76 567.95 521.99 489.55 446.35 411.49
P3 561.06 711.43 660.60 476.40 421.01 447.52
P4 / 660.82 628.15 / 436.13 335.74
P5 619.17 625.78 605.50 464.37 541.14 522.62
P6 610.89 787.26 728.88 571.13 615.13 546.12
P7 / 701.25 598.89 / 501.27 461.84
P8 605.06 595.64 558.10 663.54 471.21 538.32
P12 500.07 487.39 450.41 390.53 368.47 363.38
P15 454.85 490.46 514.21 373.54 393.91 435.93
P17 463.13 / 474.41 377.96 / 485.12
P18 473.95 / 434.39 379.51 / 424.13
P23 631.89 593.34 540.68 390.99 447.72 446.95
P25 646.52 726.66 580.19 506.31 470.42 538.40
P26 621.53 682.06 627.55 487.01 441.73 592.93
Average 571.38 643.72 585.48 476.52 479.14 463.47
P30 / 471.37 529.85 / 503.10 555.18
Average 537.00 580.46 562.42 432.43 414.26 484.07
Note: “/” means that the data was unqualified and discarded. It is worth noticing that for a
participant, if data for one task is unqualified, data for other tasks should also be
discarded.
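The allocation of effort between source and target text can be illustrated with the Average rows of the ST AOI and TT AOI tables in this appendix. The Python sketch below is descriptive only. It assumes that the second Average row of each table belongs to the postgraduate group, and its dictionary names are hypothetical. It checks whether the average fixation duration in the TT AOI exceeds that in the ST AOI for every group, task and text type.

```python
# Group averages of average fixation duration (ms), copied from the Average
# rows of the ST AOI and TT AOI tables in this appendix.
# Order within each list: Text-E, Text-P, Text-L.
st_aoi = {
    ("Undergraduate", "HT"): [468.60, 428.29, 401.59],
    ("Undergraduate", "MTPE"): [368.90, 355.20, 366.41],
    ("Postgraduate", "HT"): [443.65, 424.34, 379.00],
    ("Postgraduate", "MTPE"): [331.91, 311.95, 353.82],
}
tt_aoi = {
    ("Undergraduate", "HT"): [571.38, 643.72, 585.48],
    ("Undergraduate", "MTPE"): [476.52, 479.14, 463.47],
    ("Postgraduate", "HT"): [537.00, 580.46, 562.42],
    ("Postgraduate", "MTPE"): [432.43, 414.26, 484.07],
}

# TT minus ST difference for every group, task and text type.
differences = {
    key: [round(tt - st, 2) for st, tt in zip(st_aoi[key], tt_aoi[key])]
    for key in st_aoi
}

for (group, task), diffs in differences.items():
    print(group, task, diffs)

tt_always_longer = all(d > 0 for diffs in differences.values() for d in diffs)
print("TT AOI longer than ST AOI in every cell:", tt_always_longer)
```

A positive difference in every cell is in line with the claim in the discussion that, for both tasks, more cognitive effort is allocated to target text production than to source text comprehension.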