
Classification No.: ________    Security Level: Public

UDC: __    Serial No.: 20141210057

Guangdong University of Foreign Studies Master's Thesis

An Empirical Investigation of Cognitive Effort Required for Machine Translation Post-editing Compared to Human Translation
(译后编辑与人工翻译过程中认知努力的对比实证研究)

Applicant: Zhou Bo

Supervisor: Professor Lu Zhi

Degree Category: Master of Arts

Discipline: Translation Studies

Thesis Submitted: March 17, 2017

Thesis Defended: May 25, 2017

Defense Committee: Professor Zeng Yantao (Chair), Associate Professor Liu Menglian, Lecturer Zou Bing

Degree-granting Institution: Guangdong University of Foreign Studies
独创性声明

本人郑重声明:所呈交的学位论文是本人在导师指导下进行的研究工作及取得的

研究成果。据我所知,除了文中特别加以标注和致谢的地方外,论文中不包含其他人

已经发表或撰写过的研究成果,也不包含为获得 广东外语外贸大学 或其他教育机

构的学位或证书而使用过的材料。与我一同工作的人对本研究所做的任何贡献均已在

论文中作了明确的说明并表示谢意。

作者签名: 签字日期: 年 月 日

Authorization for the Use of the Thesis

The author of this thesis fully understands the regulations of Guangdong University of Foreign Studies on retaining and using degree theses: the university has the right to retain the thesis, to submit copies of it on paper or disk to the relevant national departments or institutions, and to allow the thesis to be consulted and borrowed. The author authorizes Guangdong University of Foreign Studies to include all or part of the thesis in relevant databases for retrieval, and to preserve and compile it by photocopying, reduced-format printing, scanning or other means of reproduction.

Author's signature: ________    Supervisor's signature: ________

Date: ________    Date: ________
An Empirical Investigation of Cognitive Effort
Required for Machine Translation Post-editing
Compared to Human Translation

By Zhou Bo

Supervised by Professor Lu Zhi

Submitted
in Partial Fulfillment of the Requirements for
the Degree of Master of Arts
in Translation Studies

Guangdong University of Foreign Studies


May 2017

ACKNOWLEDGEMENT

I would like to extend my sincere appreciation to the professors and students who have helped me with this thesis.

My utmost gratitude goes to my supervisor, Professor Lu Zhi, who has provided me with much help and supported me throughout the writing of this thesis. Without his consistent guidance and insightful suggestions, this thesis could by no means have been completed.

I am also more than grateful to Associate Professor Ma Lijun of Guangzhou University of Chinese Medicine, without whom I could not have gained access to the indispensable eye-tracking equipment, a Tobii TX300, used in this research. Professor Ma offered me great encouragement and helped me tackle every difficulty I encountered during the experiment.

I also want to thank Dr. Sun Juan, who gave me much helpful advice. Besides, my beloved parents and my dearest friends Wu Lijuan, Wang Daozhu and Wang Ya have always been by my side, supporting me through the writing of this thesis. I would also like to thank the students who participated in my experiment.

Without all these distinguished and lovely people, this thesis could not have been completed.

ABSTRACT

The present study investigates, from a cognitive perspective and in comparison with traditional human translation, whether post-editing of machine translation is a viable way for college English learners to carry out Chinese-English translation. The method employed is eye-tracking. Three research questions are explored: a) how post-editing and human translation differ in the temporal and cognitive effort they require; b) how text type and translator competence affect the cognitive effort required for post-editing; c) how human translation and post-editing differ in the allocation of cognitive effort between source text and target text.

The research adopts a 2 (Task: post-editing vs. human translation) × 3 (Text: economic, political, literary) × 2 (Competence: undergraduate vs. postgraduate) mixed design. Task and text type are within-subject factors; translator competence is a between-subject factor. The participants comprise 15 undergraduates and 15 postgraduates. Each participant translates six short texts, three from scratch and three by post-editing machine output. The materials are presented in Translog II, and real-time eye movement data are collected with an eye-tracker. Pupil dilation, fixation count and fixation duration serve as proxies of cognitive effort; processing speed serves as an indicator of temporal effort.

The results show that (1) post-editing is processed significantly faster than human translation (p < 0.01); (2) fixation counts for post-editing are significantly fewer than for human translation, average fixation duration is significantly shorter, and pupil dilation is significantly smaller, all indicating that post-editing requires less cognitive effort than human translation (p < 0.01); (3) the main effect of text type is significant (p < 0.01), indicating that the cognitive effort required for post-editing varies with text type; (4) the main effect of competence is marginally significant (p = 0.051), with postgraduates requiring less cognitive effort for post-editing than undergraduates; (5) translators fixate on the source-text area more in human translation than in post-editing (p < 0.01), whereas no significant difference is found in fixation counts in the target-text area; fixation durations on both source text and target text are significantly longer in translation than in post-editing (p < 0.01); taken together, these differences in fixation counts and fixation duration indicate that translating from scratch consumes more cognitive effort than post-editing for both source-text comprehension and target-text production; (6) in both tasks, fixation counts are higher and fixation durations longer in the target-text area than in the source-text area.

Based on these results, post-editing can save temporal effort and increase productivity, and it saves cognitive effort in both source-text comprehension and target-text production. Post-editing therefore appears to be a viable alternative for college English learners translating from Chinese into English.

Key Words: Post-editing, Machine translation, Eye-tracking, Temporal effort, Cognitive effort

摘 要

本研究针对汉英语言对,从认知视角采用眼动的方法探究相较传统的人工翻

译而言,机器翻译的译后编辑是否能够成为大学英语学习者进行汉英翻译的新方

法。针对该研究目的,提出三个研究问题:①译后编辑和人工翻译过程中所花费

的时间和认知努力有什么不同;②被试水平及文本类型是否影响译后编辑过程中

所需的认知努力;③在翻译过程中,译后编辑和人工翻译在对原文理解和译文生

成的认知努力分配上有什么不同。

本实验为 2×3×2 的三因素混合实验设计,任务类型(人工翻译和译后编辑)

和文本类型(经济、政治和文学)为被试内变量,被试水平(本科生和研究生)

为被试间变量。实验招募 15 名本科生和 15 名研究生。每名被试需进行六个任务,

三个为人工翻译,三个为译后编辑。实验文本由 Translog II 呈现,翻译过程中的

实时眼动数据由眼动仪记录。实验收集瞳孔直径、注视点个数、注视时长以及任

务总时长用以分析翻译或译后编辑过程中时间及认知努力。

实验结果如下:①译后编辑的完成速度高于人工翻译,两者存在显著性差异

(p < 0.01);②译后编辑过程中注视点个数明显少于人工翻译;平均注视时长远

低于人工翻译;译后编辑时译者瞳孔直径明显小于人工翻译;表明译后编辑过程

中的认知努力小于人工翻译(p < 0.01);③文本类型的主效应显著(p < 0.01),

译后编辑时所消耗的认知努力随文本不同而变化;④被试水平主效应边缘显著(p

= 0.051),译后编辑中被试水平越高所消耗的认知努力越少;⑤人工翻译中译者

对原文的注视明显多于译后编辑中对原文的注视(p < 0.01),但在对译文的注视

上两者没有显著差异;对原文和译文的平均注视时长上,人工翻译均长于译后编

辑(p < 0.01),表明在原文理解和译文生成上人工翻译均耗费较多的认知努力。

⑥在人工翻译和译后编辑过程中,译者对译文的注视次数及注视时长均多于对原

文的注视。

研究结果表明,译后编辑可以缩短翻译时间提高翻译效率,减少译者在原文
理解及译文生成上的认知努力的消耗,是大学英语学习者进行汉英翻译的可行选

择。

关键词:译后编辑,机器翻译,眼动,时间努力,认知努力

CONTENTS

ACKNOWLEDGEMENT ............................................................................................. ii

ABSTRACT ..................................................................................................................iii

摘 要........................................................................................................................ ii

CONTENTS .................................................................................................................. iv

LIST OF ABBREVIATIONS....................................................................................... vii

LIST OF TABLES ......................................................................................................viii

LIST OF FIGURES ....................................................................................................... x

CHAPTER ONE INTRODUCTION .......................................................................... 1

1.1 Rationale ....................................................................................................... 1

1.2 Significance................................................................................................... 3

1.3 Research Objective and Research Questions ................................................ 4

1.4 Research Methodology and Data Collection ................................................ 5

1.5 Organization of the Thesis ............................................................................ 5

CHAPTER TWO LITERATURE REVIEW............................................................... 7

2.1 Historical Overview of PE Research ............................................................... 7

2.1.1 PE Research in the Late 1950s and Early 1960s................................... 7

2.1.2 PE Research in the 21st Century .......................................................... 8

2.2 Research on PE Effort .................................................................................... 10

2.2.1 PE Effort Research with Different Methods ....................................... 10

2.2.2 PE Effort Research Concerning Different Language Pairs ................. 12

2.3 PE Effort Research in China .......................................................................... 14

2.4 Summary ........................................................................................................ 15

CHAPTER THREE THEORETICAL FRAMEWORK ........................................... 17

3.1 Working Definition ........................................................................................ 17


3.1.1 PE ........................................................................................................ 17

3.1.2 Cognitive Effort .................................................................................. 18

3.2 Theoretical Basis ............................................................................................ 18

3.2.1 Immediacy Assumption and Eye-Mind Assumption .......................... 18

3.2.2 Krings’ Differentiation of PE Effort ................................................... 20

3.2.3 Eye Movement Data and Cognitive Effort ......................................... 22

3.2.4 Summary ............................................................................................. 24

3.3 Analytical Framework .................................................................................... 24

CHAPTER FOUR METHODOLOGY..................................................................... 27

4.1. Research Questions ....................................................................................... 27

4.2. Research Hypothesis ..................................................................................... 27

4.3. Participants .................................................................................................... 28

4.4 Materials ........................................................................................................ 29

4.5 Equipment ...................................................................................................... 31

4.6 Procedures ...................................................................................................... 33

4.6.1 Environment ........................................................................................ 33

4.6.2 Task Execution .................................................................................... 34

CHAPTER FIVE DATA ANALYSIS AND DISCUSSION ..................................... 37

5.1 Data Quality ................................................................................................... 37

5.2 Data Processing .............................................................................................. 38

5.3 Results of the Experiment .............................................................................. 40

5.3.1 Processing Speed ................................................................................ 41

5.3.2 Pupil Dilation ...................................................................................... 43

5.3.3 Fixation Count .................................................................................... 47

5.3.4 Average Fixation Duration .................................................................. 53

5.4 Discussion of the Results ............................................................................... 59

5.4.1 Discussion of Hypothesis 1 ................................................................. 59

5.4.2 Discussion of Hypothesis 2 ................................................................. 61


5.4.3 Discussion of Hypothesis 3 ................................................................. 63

5.4.4 Summary ............................................................................................. 65

CHAPTER SIX CONCLUSION .............................................................................. 67

6.1 Major Findings ............................................................................................... 67

6.2 Limitations ..................................................................................................... 68

6.3 Suggestions for Future Research ................................................................... 68

REFERENCES ............................................................................................................ 70

APPENDICES ............................................................................................................. 78

Appendix A: Source Texts ................................................................................... 78

Appendix B: Raw Machine Translation Output................................................... 79

Appendix C: Overall Figures of Processing Speed.............................................. 81

Appendix D: Overall Figures of Pupil Dilation ................................................... 83

Appendix E: Overall Figures of Fixation Counts ................................................ 85

Appendix F: Overall Figures of Average Fixation Duration................................ 91

LIST OF ABBREVIATIONS

AOI: Areas of Interest

GTS: Gaze Time on Screen

HT: Human Translation

IPE: Intelligent Post-editor

PE: Post-editing

PG: Postgraduate

ST AOI: Source Text Area of Interest

ST: Source Text

TT AOI: Target Text Area of Interest

TT: Target Text

UG: Undergraduate

LIST OF TABLES

Table 4-1: Word number for each text

Table 4-2: Default sequence of source texts

Table 4-3: Semi-randomized sequence of source texts

Table 5-1: Terms in the xml document (based on Translog II manual)

Table 5-2: Abbreviations used in tables and figures

Table 5-3: Descriptive Statistics of participants’ processing speed of translation Task

(N=20)

Table 5-4: Descriptive Statistics of participants’ processing speed of post-editing task

(N=20)

Table 5-5: Results of three-way ANOVA in terms of processing speed

Table 5-6: Results of three-way ANOVA in terms of pupil dilation

Table 5-7: Descriptive Statistics of participants’ average pupil dilation of translation

task (N=20)

Table 5-8: Descriptive Statistics of participants’ average pupil dilation of post-editing

task (N=20)

Table 5-9: Descriptive Statistics of total fixation counts for both human translation

and post-editing tasks (N=20)

Table 5-10: Results of three-way ANOVA in terms of total fixation counts

Table 5-11: Results of three-way ANOVA in terms of fixation counts in the ST AOI

Table 5-12: Results of three-way ANOVA in terms of fixation counts in the TT AOI

Table 5-13: Results of three-way ANOVA in terms of average fixation duration in

both AOIs

Table 5-14: Descriptive Statistics of the average fixation duration for all AOIs

Table 5-15: Results of three-way ANOVA in terms of average fixation duration in ST

AOI

Table 5-16: Descriptive Statistics of the average fixation duration for ST AOIs

Table 5-17: Results of three-way ANOVA in terms of average fixation duration in TT

AOI

Table 5-18: Descriptive Statistics of the average fixation duration for TT AOIs

LIST OF FIGURES

Figure 3-1: Differentiation of PE Effort (Krings, 2001)

Figure 4-1: Random assignment of the task to participants

Figure 4-2: TX300 monitor

Figure 4-3: Eye tracker unit

Figure 4-4: Participant conducting eye-tracking task

Figure 4-5: Screenshot of Translog II interface for PE

Figure 4-6: Connecting Translog II to eye tracker

Figure 4-7: Calibration Results

Figure 5-1: Eye-tracking data recorded in the xml doc.

Figure 5-2: Replaying of the xml file before step 4

Figure 5-3: Replaying of the xml file after step 4 (blue circles in the picture refer to

eye fixations)

Figure 5-4: Average processing speed for translating or post-editing different types of

texts conducted by undergraduate and postgraduate students

Figure 5-5: Estimated marginal means of pupil dilation of economic text

Figure 5-6: Estimated marginal means of pupil dilation of political text

Figure 5-7: Estimated marginal means of pupil dilation of literary text

Figure 5-8: Estimated marginal means of total fixation counts of economic text

Figure 5-9: Estimated marginal means of total fixation counts of political text

Figure 5-10: Estimated marginal means of total fixation counts of literary text

Figure 5-11: Average fixation counts in win 1 (ST AOI) for translating or post-editing

different types of texts conducted by undergraduates and postgraduates

Figure 5-12: Average fixation counts in win 2 (TT AOI) for translating or post-editing

different types of texts conducted by undergraduates and postgraduates


Figure 5-13: Estimated marginal means of average fixation duration in ST AOI

CHAPTER ONE

INTRODUCTION

This chapter gives a brief introduction to the research, covering its rationale, significance, objectives and questions, methodology and data collection. At the end of the chapter, a brief layout offers an overall picture of the thesis.

1.1 Rationale

In the information age, as an effect of globalization, the growing demand for information and global communication has created a huge need for machine translation, for personal as well as commercial use. Yet, despite the advantage of high speed, machine translation systems have long been criticized for the poor quality of their output. Given the complexity of these systems, rather than counting on improvements to machine translation itself, people have started to turn their eyes to post-editing.

Post-editing, defined by TAUS/CNGL as "the correction of machine-generated output to insure it meets a level of quality negotiated in advance between client and post-editor" (Carl, Gutermuth & Hansen-Schirra, 2015), is attracting attention not only from the translation industry but also from academia. In the past decade, a vast body of publications on post-editing has appeared, covering productivity, guidelines, quality evaluation, language pairs and cognitive effort, to name a few aspects.

Some studies have indicated that machine translation post-editing has an obvious advantage in increasing productivity. Plitt and Masselot (2010) measured the productivity of machine translation post-editing against traditional human translation. Their results showed a productivity increase for every participant, meaning that post-editing of statistical machine translation allowed translators to substantially increase their productivity. However, Carl (2011) did not find a significant difference in productivity between post-editing and human translation. Likewise, when investigating the role of professional experience in post-editing productivity, Guerberof (2014) found no significant difference in processing speed between experienced and non-experienced translators.

Previous large-scale studies on post-editing productivity indicate that no firm conclusion can yet be drawn that post-editing increases productivity relative to traditional human translation. The productivity gain appears to depend on many factors, such as the language pair and the experience of the translators. Koponen (2016: 136) points out that a productivity increase depends on certain "specific conditions", namely "sufficiently high quality machine translation which is currently achievable for certain language pairs and machine translation system geared toward the specific text type being translated".

Many language pairs have already been studied, for instance English-Danish, English-German and English-Spanish (Lao, 2014; Carl, 2011). The Chinese-English language pair, however, has rarely been studied. García (2010) carried out experiments with English-Chinese trainees to test the usability of the output of the translation memory system Google Translator Toolkit. Although García (2010) did demonstrate the usability of the raw translation, no significant difference in time saving was found between post-editing and human translation. Moreover, García considered only temporal effort. In the present research, both temporal and cognitive effort are taken into account to test the applicability of post-editing to Chinese-English translation.

Krings (2001) classifies post-editing effort into three categories: temporal, technical and cognitive. Temporal effort concerns time. Technical effort relates to the translator's technical operations, such as deletions and insertions. Both of these are easily observed. Cognitive effort, the most important of the three, influences temporal and technical effort and should be seriously considered. In addition, since post-editing studies in China have mainly been theoretical reviews or work on improving machine translation systems, few studies have been conducted from a cognitive perspective.
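As a hedged illustration of how Krings' technical effort (insertions and deletions) can be quantified, the sketch below computes a word-level edit distance between raw machine output and its post-edited version, a proxy commonly used in PE research. The sample sentences are invented, and this is not a measure employed in the present study.

```python
def levenshtein(a, b):
    """Minimum number of single-token insertions, deletions and
    substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, token_a in enumerate(a, 1):
        curr = [i]
        for j, token_b in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                          # deletion
                            curr[j - 1] + 1,                      # insertion
                            prev[j - 1] + (token_a != token_b)))  # substitution
        prev = curr
    return prev[-1]

# Invented example: two substitutions separate the raw MT output
# from the post-edited version.
mt_output = "the economy increase fast".split()
post_edited = "the economy grows rapidly".split()
print(levenshtein(mt_output, post_edited))  # prints 2
```

A higher edit distance suggests more technical operations were needed; it remains only a rough proxy, since it says nothing about the cognitive effort behind each operation.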

1.2 Significance

To bridge the gap in post-editing studies in China, this thesis reports an exploratory study of the applicability of post-editing to the Chinese-English language pair as performed by college English learners. Temporal and cognitive effort required for the translation and post-editing processes are compared, and an eye-tracker is used to collect data reflecting cognitive effort. The significance of the study is as follows.

First, post-editing has become a hot topic in both the translation industry and translation academia in the West, whereas studies in China have mainly focused on literature reviews and ways to improve machine translation systems. This research is among the first experimental translation process studies of post-editing in China and provides practical lessons for future studies on the topic.

Second, this research tests the applicability of post-editing to the Chinese-English language pair. Past studies have revealed that the applicability of post-editing varies with languages and text types, and many language pairs have already been examined. Although García (2009, 2010) conducted experiments on post-editing for the Chinese-English language pair, those studies addressed only the temporal aspect. The study reported here was carried out from a cognitive aspect: cognitive effort was compared between post-editing and human translation and proved to be less for post-editing.

Last but not least, the participants in this research are college English learners, while most previous studies were conducted with professional translators or post-editors, given their concern with the applicability of post-editing in the translation industry. This research shows that post-editing is not exclusive to professional translators. Although the sample is small, the results indicate that, for the Chinese-English language pair, post-editing is feasible for college English learners.

1.3 Research Objective and Research Questions

This thesis seeks to answer whether machine translation post-editing can become a new way for college English learners to conduct Chinese-English translation, compared with traditional human translation. The research objective is therefore to investigate, from a cognitive perspective, the applicability of machine translation post-editing in the Chinese-English language pair as performed by college English learners.

Three detailed research questions are put forward:

1) For the Chinese-English language pair, how do machine translation post-editing and traditional human translation differ temporally and cognitively?

2) What influence do text type and translator competence have on the cognitive effort spent in post-editing?

3) How does the allocation of cognitive effort between source text and target text differ between the human translation process and the post-editing process?

Question one is the basic question of this research, exploring whether post-editing is applicable to the Chinese-English language pair by comparing the temporal and cognitive effort of post-editing with that of human translation. Question two explores the influence of text type and translator competence on the cognitive effort spent in the post-editing process. Question three investigates further the allocation of cognitive effort between the source text and the target text, so as to learn more about the post-editing process and, if possible, to explore the reason for the reduction in cognitive effort.

1.4 Research Methodology and Data Collection

To answer the research questions, an empirical study is conducted with eye-tracking equipment, which records the moment-to-moment eye activities of the participants and thus allows the cognitive effort expended in PE and in conventional translation to be compared.

Thirty participants take part in the experiment, including 15 undergraduates and 15 postgraduates. Each participant completes six tasks, three to translate from scratch and three to post-edit. The six tasks are presented in a semi-randomized order to reduce carry-over effects. Data are collected by the Translog II software and saved in XML documents.
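A semi-randomized presentation order of this kind can be approximated with a balanced (Latin-square) rotation, so that each task occupies each serial position equally often across participants. The sketch below is illustrative only; the task labels are hypothetical placeholders, not the actual task names used in the experiment.

```python
# Six tasks: human translation (HT) and post-editing (PE) of economic,
# political and literary texts. Labels are hypothetical placeholders.
TASKS = ["HT-econ", "HT-pol", "HT-lit", "PE-econ", "PE-pol", "PE-lit"]

def latin_square_orders(tasks):
    """One rotated order per task: row k starts at task k, so every
    task appears exactly once in every serial position."""
    n = len(tasks)
    return [[tasks[(start + i) % n] for i in range(n)]
            for start in range(n)]

orders = latin_square_orders(TASKS)
# Participant p would receive orders[p % len(TASKS)], spreading
# carry-over effects evenly across the six tasks.
```

With 30 participants, each of the six rotated orders would be used by five participants, balancing the position of every task across the sample.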

After the experiment, four kinds of data are extracted or derived from the XML files: (i) total task time, (ii) fixation counts, (iii) average fixation duration and (iv) pupil dilation. The data are analyzed with SPSS 22.0.
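As a minimal sketch of this extraction step, the snippet below derives fixation count and average fixation duration from a simplified fixation log. The `Fix` element and `dur` attribute names are hypothetical stand-ins for illustration, not Translog II's actual XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature log: one <Fix> element per fixation,
# with its duration in milliseconds. Not the real Translog II schema.
SAMPLE_LOG = """<LogFile>
  <Fix dur="210"/>
  <Fix dur="185"/>
  <Fix dur="340"/>
</LogFile>"""

def fixation_stats(xml_text):
    """Return fixation count and mean fixation duration (ms)."""
    root = ET.fromstring(xml_text)
    durations = [int(fix.get("dur")) for fix in root.iter("Fix")]
    return {"fixation_count": len(durations),
            "mean_fixation_duration_ms": sum(durations) / len(durations)}

print(fixation_stats(SAMPLE_LOG))
# {'fixation_count': 3, 'mean_fixation_duration_ms': 245.0}
```

Per-participant aggregates of this kind can then be exported to a table for statistical analysis in SPSS.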

1.5 Organization of the Thesis

This thesis consists of six chapters.

Chapter One introduces the thesis, including the rationale and significance of the research, the research objectives and questions, and the methodology and data collection.

Chapter Two presents a brief review of studies on post-editing in the late 1950s and early 1960s as well as in the twenty-first century. Studies on post-editing in China are introduced in the last part of that chapter.

Chapter Three presents the theoretical and analytical frameworks of the research. Definitions of major terms and the theoretical basis of the research are elaborated there.

Chapter Four elaborates the methodology of the research, including the participants, materials, equipment and procedures of the experiment.

Chapter Five presents the data analysis and discussion. It also sets out the criteria applied to the eye movement data. The results of the experiment are reported, and the three research hypotheses are discussed in the light of those results.

Chapter Six concludes the thesis, briefly summarizing the major findings, noting the limitations of the study, and offering suggestions for future research.

CHAPTER TWO

LITERATURE REVIEW

This chapter presents a brief review of studies on post-editing (hereinafter referred to as PE) in the late 1950s and early 1960s as well as in the past decade: respectively, the "inception" of post-editing, in García's (2012) word, and the prosperous period since the work of Krings (finished in 1994) was translated and published in 2001. Much of the chapter is devoted to research on post-editing effort (hereinafter referred to as PE effort), including PE effort studies using different methods and PE effort studies on different language pairs, both closely related to this research. The chapter ends with a summary of the previous studies.

2.1 Historical Overview of PE Research

2.1.1 PE Research in the Late 1950s and Early 1960s

Although, with the development of computer technology, post-editing of machine translation has recently begun to attract attention and seemingly become a strong candidate to replace human translation, it is in fact not a novel issue. As one of the earliest envisioned uses of machine translation systems, PE is as old as machine translation itself. According to García (2012), post-editing of machine translation was a much-discussed topic in the late 1950s and early 1960s.

The earliest vision of the use of PE was put forward by Edmundson and Hays (1958). Their paper first introduced the methods then used at The RAND Corporation for research on the machine translation of scientific Russian, which comprised four components: text preparation, glossary development, translation, and analysis. Edmundson and Hays (1958: 11) noted that "translation" in their paper referred to the two-stage process of machine translation plus post-editing: the machine translation system would produce a rough translation, and "a post-editor works on this list, converting it into a smooth English version of the Russian original" (Edmundson & Hays, 1958: 11). Although Edmundson and Hays (1958) proposed the conduct of PE work, they regarded it merely as a simple revision step toward a better translation.

According to García (2012), Orr and Small (1967) were the first to take post-editing seriously as a research object. In their study, a reading comprehension test compared three English versions of the same Russian technical articles: translated by machine only, by machine plus post-editing, and by normal manual translation. The scores of the hand-translation group were higher than those of the post-edited group, which in turn were higher than those of the machine translation group, consistently indicating that manual translation exceeded post-edited translation, and post-edited translation exceeded raw machine translation (Orr & Small, 1967).

García (2012: 293) calls this the "inception" period of PE, yet he notes that, since the computer technology of the time was not adequate, the studies of this period were mainly empirical fundamentals established by theorists (2012: 299).

2.1.2 PE Research in the 21st Century

Between the inception period and the prosperous period (the past decade) there was another period, which García (2012) names the "latency" period, lasting from 1967 to 1999. In 1966, a report by the Automatic Language Processing Advisory Committee (ALPAC), Language and Machines: Computers in Translation and Linguistics, surveyed the state of development of computer applications in automatic language translation and computational linguistics and stated clearly that, compared with traditional translation, PE required more time, produced worse quality and was more difficult to perform. The ALPAC report, to some extent, cooled down machine translation and PE studies. During this period, PE was pursued mainly by institutions and enterprises such as Systran and METEO. Studies of this latency period will therefore not be elaborated here.

The past decade has witnessed prosperity in PE research, owing to the advent of translation memory tools and free online machine translation by the end of the 1990s and to advances in computer technology that have raised the quality of machine translation output. Moreover, with the increasing demand for information and global communication, the spotlight has again turned to machine translation and PE studies. Experimental studies reappeared in the late 1990s, and the field gained momentum when the work of Krings, finished in German in 1994, was translated and published in English in 2001 (García, 2012).

There is a vast body of publications on PE covering various topics: whether PE could increase productivity (e.g. Guerberof, 2009; García, 2010; Plitt & Masselot, 2010; Carl et al., 2011; Green & Manning, 2013; Zhechev, 2014); whether the output of machine translation could be used, and what the quality of PE output is (e.g. Blatz et al., 2004; Specia et al., 2009; Fiederer & O’Brien, 2009; Plitt & Masselot, 2010; Guerberof, 2014); whether it is feasible to post-edit without referring to the source text (e.g. Koehn, 2010; Callison-Burch et al., 2010; Hu et al., 2011; Mitchell et al., 2013; Koponen & Salmi, 2015); how much effort the post-editing process costs compared with human translation (e.g. Tatsumi, 2009; Specia et al., 2010; Sousa et al., 2011; Callison-Burch et al., 2012; Moran et al., 2014; Vieira, 2014); and new methods that could be applied to PE effort research (e.g. Specia, 2011; Carl et al., 2011; Lacruz et al., 2012; Elming et al., 2014; Lacruz et al., 2014).

Productivity increase and time saving have been demonstrated by many studies in support of PE in lieu of traditional human translation (O’Brien, 2011; Guerberof, 2009; Plitt & Masselot, 2010). However, there are also some conflicting findings. For instance, Carl (2011) found no significant difference in productivity between post-editing and human translation. When investigating the role of professional experience in post-editing productivity, Guerberof (2014) likewise found no significant differences in processing speed between experienced and non-experienced translators.

Previous large-scale studies on PE productivity indicate that productivity

increase appears to depend on many factors: language pair, experience of the

translators, text types, to name a few. Koponen (2016: 136) points out that

productivity increase depends on some “specific conditions” and these conditions

relate to “sufficiently high quality machine translation which is currently achievable

for certain language pairs and machine translation system geared toward the specific

text type being translated”.

2.2 Research on PE Effort

Research on PE effort forms a large portion of all research on PE. Krings (2001) categorized PE effort into three kinds: temporal effort, technical effort and cognitive effort. PE effort studies in the past decade have mainly focused on two or all three of these kinds based on this classification. Different researchers have developed different methods to measure post-editing effort, such as think-aloud protocols (TAPs), keystroke logging and eye tracking, and different language pairs have been studied. In China, however, most research on PE consists of literature reviews, and empirical studies, especially on PE effort, are rarely conducted.

2.2.1 PE Effort Research with Different Methods

Research on PE effort has been carried out with different methods, including TAPs, keystroke logging, eye tracking and screen recording.

Krings (2001) carried out a TAPs study of the mental processes involved in PE compared with traditional translation, studying PE from a psycholinguistic point of view and focusing on its cost and effort relative to conventional human translation. Normally, the TAPs method is not used alone to investigate PE effort; it usually supplements other methods such as screen recording, keystroke logging and eye tracking. Koglin (2015) used retrospective TAPs combined with eye tracking and keystroke logging to investigate the cognitive effort involved in PE and human translation. O’Brien (2006) also used retrospective protocols, the “retro eye cue method” in her own words: participants verbalized what they were thinking and doing at a particular moment while watching the replay of their gaze activity.

Some studies also have participants self-evaluate PE effort on a specific scale. In Specia’s (2010) study, professional translators were asked to rank sentences on a four-point scale according to the degree of PE effort each required. Callison-Burch, Koehn, Monz, et al. (2012) asked translators to evaluate and rank sentences after the translation or post-editing tasks on a five-point scale according to the degree of PE required, and found that the evaluations correlated closely with quality. Yet this kind of evaluation can only roughly predict PE effort, since it is somewhat subjective and depends heavily on the participants.

Keystroke logging has long been used to investigate the cognitive aspect of the translation process (O’Brien, 2006). Elming et al. (2014) point out that using key logging to track how changes are made is a better measure of actual technical effort. In their study, professional translators post-edited and translated with the CASMACAT workbench, a computer-aided tool, and keystroke data were collected and analyzed. Elming et al. (2014) found that post-editing led to a time saving of 25% compared with human translation and that the saving was largely related to the number of edits. This coincides with Tatsumi’s (2009) finding that PE time greatly depended on the number of edits carried out.

O’Brien (2006) conducted a preliminary investigation which showed that eye tracking could serve as a useful research methodology for investigating translators’ interaction with Translation Memory tools. In this study, she found that translators spent the most cognitive effort when there was no match for the source text. O’Brien (2008) further studied, with the eye tracking method, the cognitive effort translators required when dealing with different fuzzy-match values, and found that cognitive effort was not strictly inversely proportional to the fuzzy-match value. Carl (2011) conducted an eye tracking study comparing the processes of post-editing and translation from scratch; the results indicated that both processing speed and translation quality improved for post-editing compared with human translation.

In summary, as every research method has its own advantages and disadvantages, researchers prefer to combine several of them, which Carl (2009) calls “triangulation”, to reach relatively objective and generally applicable conclusions.

2.2.2 PE Effort Research Concerning Different Language Pairs

As Koponen (2016: 136) points out, increase in productivity is subject to certain “specific conditions”. These conditions largely depend on the language pair, which means the applicability of machine translation post-editing to a given language pair should be considered before PE is promoted.

Many language pairs have already been studied. Specia (2010) studied post-editing from English into Danish and Spanish and found positive results. Callison-Burch et al. (2012) tested four language pairs, English-German, English-French, English-Czech and English-Spanish, yet each in a single direction. Sousa (2011) assessed post-editing effort for English and Brazilian Portuguese and found that, when translating subtitles, PE was about 40% faster than conventional translation. A study by Tatsumi (2009) assessing the impact of text characteristics (sentence length and text structure) on PE speed was carried out for the English-Japanese language pair.

As for language pairs involving Chinese, studies are fewer than for other languages, especially English, which is included in almost every study (Tatsumi, 2009; Specia et al., 2010; Sousa et al., 2011; Callison-Burch et al., 2012). This may be because Chinese is a logographic language, as proposed by Lourenço da Silva et al. (2015).

Lourenço da Silva, Schmaltz, Alves et al. (2015) carried out exploratory research on the rarely studied Portuguese-Chinese language pair. They collected keystroke and eye data from professional translators to compare the processes of PE and human translation. The results indicated that PE and human translation required different cognitive effort for understanding the source text, and that technical effort also differed in text production. Lourenço da Silva et al. (2015) found that when the number of deletions exceeded the number of insertions, the cognitive effort in the PE process was greater.

García (2010) conducted a study to test whether the translations suggested by the Google Translator Toolkit for no-match sentences were suitable for use. Translation students were asked to translate from scratch and to post-edit, respectively, from English to Chinese, and the quality of the translation and post-editing output was marked. No significant difference was found in processing time, yet the results supported that the quality of PE output was comparable to that of human translation.

García (2011) conducted a further study that included factors such as language directionality, difficulty of the source text and translators’ performance level. Fourteen subjects translated and post-edited from English to Chinese and 21 subjects from Chinese to English. Task time was recorded, and the quality of the final translation was marked as the subjects’ course grade. Again, no significant difference was found in PE productivity, but the results indicated that, compared with translating from English to Chinese, subjects showed a greater productivity increase when translating from Chinese to English. García (2011: 229) concluded that “translating by post-editing works reasonably well with translation trainees”.

2.3 PE Effort Research in China

As Feng and Cui (冯全功 & 崔启亮, 2016: 68) point out, post-editing research in China lags far behind that in the West; in Translation Studies, PE studies began nearly twenty years later than in Western countries. PE studies in China mainly concern two aspects: introductory overviews of PE studies in China and abroad, and research on ways to improve machine translation systems.

In the 1990s, research on PE was mainly about ways to develop and improve the intelligent post-editor1 (IPE) (黄河燕 & 陈肇雄, 1995; 韩培新, 1998). PE studies by researchers in Translation Studies effectively began with the article published by Wei and Zhang in 2007, which introduced the basic concept of PE, elaborated its necessity, and described how to do PE and who the post-editor should be.

Luo and Li (2012), based on a translational corpus of automotive technical documentation, compared machine translation output with human translation output. The statistical results of this study supported that machine translation has made great progress in dealing with syntax. Luo and Li (2012) also stated that it was highly necessary to strengthen studies of machine translation error patterns.

Li and Zhu (2013) conducted further research based on Luo and Li (2012), proposing that secondary processing could be applied to these machine-translation error patterns in order to reduce the workload of post-editors. The error patterns identified by Li and Zhu (2013) provide a helpful foundation for the future development of IPE. Cui and Li (2015) also identified machine-translation error patterns through practical examples and summarized the characteristics of PE, though focusing on scientific and technical texts.

1 Here, post-editor refers to the machine system that performs post-editing. In this thesis, post-editor refers to the person who conducts post-editing work, unless otherwise noted.

Cui (2014) and Feng and Cui (2016) elaborated the focuses of PE studies in China and abroad and predicted trends for PE research. Wang (2013) focused on empirical studies of the computer-aided translation process outside China, mainly in the West. Feng and Zhang (2015) described the necessity of post-editor training in translation education and proposed that a post-editor training course could improve the competitiveness of graduates majoring in translation and satisfy the translation industry’s increasing demand for post-editors.

There are also some studies on post-editing conducted by student researchers. Huang (2016) explored strategies for improving post-editing productivity and the factors affecting it, and found that the quality of machine translation output significantly influenced post-editing productivity. Wang (2015) discussed the viability of post-editing by briefly introducing computer-aided translation and machine translation, and verified it with his own translation practice. Wang (2016) also researched the applicability of machine translation plus post-editing to non-technical texts.

In summary, most PE studies in China are theoretical reviews giving an overall picture of PE research, along with some studies on the development or improvement of machine translation systems or IPE. Process research on post-editing has achieved abundant results in the West, yet has barely been studied in China.

2.4 Summary

In conclusion, post-editing research has covered various topics, including post-editing efficiency, the usability of machine translation output, quality estimation of machine translation output, post-editing effort and so on. Post-editing research has become a hot issue in recent years, especially since the eye-tracking method was introduced into post-editing effort research. However, many studies have indicated that the applicability of post-editing is condition-specific: it is subject to text, language, machine translation system and so on.

Based on the literature review elaborated above, post-editing effort research on the Chinese-English language pair is scarce. Besides, even though researchers have proposed that text type might influence the feasibility of post-editing, few studies actually take this factor into account. In addition, almost all studies are conducted for the benefit of the translation industry, so the participants are mostly professional translators or post-editors; few studies test whether student translators could conduct translation work by post-editing. Therefore, to address this gap, this paper presents a study testing the viability of post-editing for the Chinese-English language pair. The participants in this experiment are all students, so as to test the usability of post-editing by college English learners. What’s more, three kinds of texts are included to investigate whether text type influences post-editing effort.

CHAPTER THREE

THEORETICAL FRAMEWORK

3.1 Working Definition

3.1.1 PE

PE has long been proposed and considered necessary to machine translation. Bar-Hillel stated in 1951 that “fully automatic” machine translation was “not achievable in the foreseeable future”, and that a “human brain” had to intervene in the process (1951: 230). Here, the “human brain” refers to the person who does the post-editing work. Bar-Hillel (1951: 231) believed that the task of post-editing was “to produce out of the raw output … a readable translation in a fraction of the time it would take a bilingual expert to produce a translation with the conventional procedure”.

PE used to be defined as the human correction of machine translation output. Allen (2003: 297) held that the task of post-editors was to “edit, modify and/or correct pre-translated text that has been processed by an MT system from a source language into (a) target language(s)”.

PE is defined by the TAUS/CNGL as “the correction of machine‐generated

output to insure it meets a level of quality negotiated in advance between client and

post-editor” (Carl et al., 2015: 146).

“Post-editing or postediting (PE for short) is the process of modifying or

correcting the original output of machine translation system under certain purpose,

including correcting the translation (language) errors or improving the accuracy or

readability of the MT output” (冯全功 & 崔启亮, 2016: 67).

In this paper, PE is defined as the correction of machine translation output to ensure it meets a level of quality set by the post-editor.

3.1.2 Cognitive Effort

As Krings (2001: 178) noted, “fully automatic high quality” machine translation output is not really available in the foreseeable future. Therefore, the amount of effort required in the post-editing process is the primary factor determining whether machine translation is worth the effort.

Krings (2001: 178) classified the effort expended in the post-editing process into three categories: a) temporal post-editing effort, b) cognitive post-editing effort, and c) technical post-editing effort.

Temporal post-editing effort is connected with time, while technical post-editing effort relates to the physical operations performed by post-editors, such as deletion, insertion and reconstruction of sentence structure. These two kinds of effort can be recorded or observed externally, whereas cognitive post-editing effort cannot be observed directly.

According to Krings (2001: 179), cognitive post-editing effort “involves the type and extent of those cognitive processes that must be activated in order to remedy a given deficiency in a machine translation”. It can neither be observed nor measured directly, and was defined by Krings from a psycholinguistic perspective. Krings (2001) emphasized that cognitive post-editing effort was the most important and most decisive variable of the three categories, affecting both temporal and technical effort.

3.2 Theoretical Basis

3.2.1 Immediacy Assumption and Eye-Mind Assumption

The fundamental theoretical and operational basis for this research is the “immediacy assumption” and the “eye-mind assumption” put forward by Just and Carpenter (1980) in their research on reading comprehension, where they sought to explain the distribution of fixations.

The immediacy assumption holds that the cognitive processing of a word is concurrent with the reader’s seeing it. In other words, when reading an article, a reader tries to process every word as soon as it is encountered, even at the risk of interpreting it erroneously. Just and Carpenter (1980) used the word “interpretation” to refer to the processing of words, consisting of encoding the word, finding a proper referent (if the word is polysemous) and determining the word’s status in the sentence and in the whole text. The immediacy assumption emphasizes that interpretation, at all levels, is carried out immediately, without any delay (Just & Carpenter, 1980).

The eye-mind assumption links eye fixation with cognitive processing. It posits that as long as a person is processing a word, s/he keeps looking at it: the eye fixation rests on that word until processing moves on to the next. Just and Carpenter (1980: 331) believed there was “no appreciable lag between what is being fixated and what is being processed”. Therefore, the word a reader fixates is the exact word s/he is processing, and the time spent on the word (gaze time or fixation duration) is the time spent processing it. The eye-mind assumption gave researchers a valuable window into what happens in the mind, which used to be a black box.

However, these two assumptions are not exactly right. Holmqvist (2011) found that thinking runs faster than eye movement, which conflicts somewhat with the immediacy assumption. Smallwood and Schooler (2006) pointed out that mind wandering may occur during a task: although the eyes fixate on a word or other object, the mind may drift away to something irrelevant during the fixation. What’s more, there is no obvious evidence showing that one is absent-minded; mind wandering often occurs without being noticed. In that case, when one fixates on a word, s/he is not necessarily processing it but may be thinking about something else, so the fixation time is not necessarily the mental processing time. The same goes for translation studies. When the translator focuses on a word in the source text, s/he may be considering the production of the target text and trying to find a proper referent for this word in the target language; or, when looking at the target area, s/he may be pondering a word of the source text. Thus fixations on the source text do not always mean source-text comprehension, and fixations on the target text do not necessarily mean target-text production.

Notwithstanding these weaknesses, the immediacy assumption and the eye-mind assumption still provide an appropriate basis for correlating eye fixations with the cognitive processing of the mind. Previous studies, especially in psychology, have shown firm links between eye movement and cognitive processing. Rayner (1998) concluded that fixations are firmly linked with cognitive processing during reading tasks and that eye movement can informatively reveal the mind. As for translation tasks, we can reasonably believe that most eye movement data reflect moment-to-moment mental activity, since Smallwood and Schooler (2006) noted that mind drifting mostly occurs when the task is easy, and neither translation nor post-editing is easy.

3.2.2 Krings’ Differentiation of PE Effort

As mentioned in section 3.1.2, Krings (2001) differentiated post-editing effort along three dimensions, time, operation and cognition, that is, temporal effort, technical effort and cognitive effort.

Temporal effort is the most frequently studied effort in PE research. It is the most important PE effort from an economic perspective as well as the most easily measured. As its name suggests, temporal effort concerns the time spent on post-editing work; usually, the time spent on the task indicates the amount of temporal effort consumed. In PE effort research, total task time is often combined with the source-text word count. Processing speed, derived by dividing the source-text word count by the total task time, is widely used to indicate task productivity, i.e., the number of words processed per minute (e.g. O’Brien, 2006; García, 2010). Processing speed, which also serves as the indicator of temporal effort in this research, is more informative than total task time alone.
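As an illustrative sketch of this calculation (the function name and the word counts and times below are hypothetical, not taken from the experiment), processing speed in words per minute can be computed as:

```python
def processing_speed(source_word_count: int, task_time_seconds: float) -> float:
    """Temporal-effort indicator: source-text words processed per minute."""
    if task_time_seconds <= 0:
        raise ValueError("task time must be positive")
    return source_word_count / (task_time_seconds / 60)

# Hypothetical tasks: a 300-word source text post-edited in 12 minutes
# versus translated from scratch in 20 minutes.
pe_speed = processing_speed(300, 12 * 60)  # 25.0 words per minute
ht_speed = processing_speed(300, 20 * 60)  # 15.0 words per minute
```

On these invented numbers, the post-editing task would show the higher processing speed, i.e., the lower temporal effort.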

[Diagram: post-editing effort branches into three categories: temporal post-editing effort, technical post-editing effort and cognitive post-editing effort.]

Figure 3-1: Differentiation of PE Effort (Krings, 2001)

Technical effort occurs as post-editors try to correct errors in the raw machine translation or to adjust the arrangement of the text. As most translation nowadays is conducted on computers, keyboard activities such as deletions, insertions and rearrangement are typical indicators of technical effort, which consists of “purely technical operations” (Krings, 2001: 179). Before Krings (2001) defined technical effort, studies had already found that machine-translation error types such as incorrect verb forms, incorrect prepositions and inappropriate word-for-word translation posed great difficulty for post-editing (Lavorel, 1982; Green, 1982), leading to an increase in post-editing effort. In response to this finding, optimization of text processing systems was proposed to reduce effort (Slocum, 1985; Vasconcellos, 1987). Here “post-editing effort” mostly referred to technical effort.

Cognitive effort was proposed by Krings from a psycholinguistic point of view. It concerns the cognitive processing that happens in the mind and cannot be observed overtly. The definition of cognitive effort was elaborated in detail in section 3.1.2 and will not be repeated here.

The relations among the three kinds of post-editing effort should be clarified. Although Krings differentiated them into three categories, they are not completely separate from one another. Temporal effort is the most easily measured, and technical effort can also be measured externally, whereas cognitive effort cannot be observed externally; yet cognitive effort is the most decisive of the three. Krings especially noted that even though technical effort is driven by cognitive activities, the two should be differentiated (2001: 179). Cognitive effort may be small while technical effort is great: a mistake in the raw machine translation output may be easy to recognize but take a lot of work to correct.

In this research, temporal and cognitive effort were considered in comparing post-editing with translation from scratch. The indicator for temporal effort was the processing speed of a task; the indicators for cognitive effort were the participants’ moment-to-moment eye movements, including fixation count, fixation duration and pupil dilation.

3.2.3 Eye Movement Data and Cognitive Effort

According to the immediacy and eye-mind assumptions, where the eye is looking reflects what is being processed, despite the potential weaknesses noted above. This supports a strong correlation between eye movement data and cognitive effort. Eye movement data such as fixations and pupil dilation have been researched and shown to reveal cognitive effort (Just & Carpenter, 1976; Just & Carpenter, 1980; Hyönä, Tommola & Alaja, 1995). With advances in technology, online eye movement data can be collected accurately by eye trackers; in this research, eye movement data were collected by the remote eye tracker Tobii TX300. The indicators of cognitive effort in this research were fixation count, fixation duration and pupil dilation, and their relations to cognitive effort are introduced in turn below.

Fixation count is the number of fixations formed during the translation or post-editing process; a fixation forms as the translator keeps looking at one word. Fixation count is often used to reflect the difficulty of the task and the expertise of the translator: generally, the more fixations, the more cognitive effort spent in the translation process. Doherty (2012) found fewer fixations for machine translation output of controlled language than of uncontrolled language. Jakobsen and Jensen (2008) found that reading during translation produced the most fixations, followed by sight translation, reading for translation and reading only for comprehension.

Fixation duration is the length of a fixation; a longer fixation duration indicates deeper processing of a word. Koglin (2015) collected fixation durations in different areas of interest to compare the cognitive effort required by PE and conventional translation of metaphors, and found that in traditional translation translators tended to fixate longer on the source text, while in the post-editing task they fixated longer on the target text. Since there are typically thousands of fixations, each with a different duration, the measure used in this research to indicate cognitive effort is the average fixation duration of all fixations: the total gaze time on screen (or in a certain area of interest) divided by the fixation count.
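The two fixation-based indicators can be sketched as follows (the fixation durations are invented sample values, not recorded data):

```python
# Hypothetical fixation durations (milliseconds) logged for one task.
fixations_ms = [180, 220, 250, 310, 190, 240]

fixation_count = len(fixations_ms)              # indicator 1: number of fixations
total_gaze_time_ms = sum(fixations_ms)          # total gaze time on screen
avg_fixation_duration_ms = total_gaze_time_ms / fixation_count  # indicator 2
```

In the actual study these values come from the Tobii TX300 recordings, per screen or per area of interest, rather than from a hand-written list.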

A strong connection has also been found between pupillary movement and mental activity. Hyönä et al. (1995) studied pupil dilation in different interpreting tasks and showed that pupillary responses can suitably indicate the cognitive load of mental processing. O’Brien (2006) first introduced eye tracking into translation process research and machine translation research; in her study, the percentage change in pupil dilation was adopted as an indicator to compare the cognitive effort needed for computer-aided translation with different match values. Doherty, O’Brien and Carl (2010) adopted gaze time, fixation count, average fixation duration and average pupil dilation as indicators of the cognitive effort spent on reading machine-generated sentences.
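A minimal sketch of the percentage-change measure, assuming a per-participant baseline pupil diameter recorded before the task (the function name and all numbers are hypothetical, not O’Brien’s exact procedure):

```python
def pupil_dilation_pct_change(baseline_mm: float, task_mean_mm: float) -> float:
    """Percentage change of mean pupil diameter during a task
    relative to the participant's resting baseline."""
    return (task_mean_mm - baseline_mm) / baseline_mm * 100

# Hypothetical participant: 3.2 mm pupil diameter at rest,
# 3.6 mm mean diameter during post-editing.
change = round(pupil_dilation_pct_change(3.2, 3.6), 1)  # 12.5 (percent)
```

A baseline is needed because absolute pupil size varies between individuals; the percentage change makes participants comparable.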

Fixation count (e.g. O’Brien, 2011; Doherty, 2012; Mesa, 2014), fixation duration (e.g. Carl et al., 2011; Mesa, 2014) and pupil dilation (e.g. Hyönä, 1995; Iqbal et al., 2005; Lourenço da Silva et al., 2015) are among the most commonly used indicators of cognitive effort in translation and PE studies.

3.2.4 Summary

In summary, the immediacy assumption and eye-mind assumption put forward by Just and Carpenter (1980) provide a fundamental basis for using eye movement data as indicators of the cognitive processing that happens inside the brain. A large body of research also provides solid evidence linking visual and cognitive focus in the translating and post-editing process. In addition, Krings (2001) differentiated post-editing effort into temporal, technical and cognitive effort.

Based on all that has been elaborated above, this research compares temporal and cognitive effort between the post-editing task and the traditional translation task. For temporal effort, the participants’ processing speed is chosen as the indicator; for cognitive effort, eye movement data, including fixation count, average fixation duration and pupil dilation, serve as the indices.

3.3 Analytical Framework

The analytical framework of this research was drawn from the theoretical basis above (see Figure 3-2). As seen in Figure 3-2, for the traditional translation task only the source text is provided to translators, while for the post-editing task both the source text and the machine translation output are offered. Each participant performs both the post-editing task and the translation task, so as to reduce the individual differences in eye movement data that could greatly affect the experimental results.


[Diagram: the translation or post-editing process comprises source-text comprehension and target-text production; the effort expended is measured as temporal effort (time, processing speed), technical effort (deletions, insertions, rearrangement, etc.; not discussed in this research) and cognitive effort (fixation counts, average fixation duration, pupil dilation) (Krings, 2001).]

Figure 3-2: Analytical Framework

Based on the differentiation of PE effort (Krings, 2001), indicators of temporal and cognitive effort in the translation or post-editing process will be collected and discussed; technical effort is not taken into consideration here. For temporal effort, the total task time will be collected in order to calculate processing speed. For cognitive effort, fixation count, fixation duration and pupil dilation will be recorded by the remote eye tracker Tobii TX300.

In addition, the main objective of this research is to test whether post-editing is cognitively less demanding than translation from scratch, by contrasting the eye movement data collected in the post-editing process and in the translation process. What’s more, participants are asked to stop post-editing as soon as they believe they have post-edited the machine translation output to the quality of a human translation; therefore, the quality of the final products will not be considered.

CHAPTER FOUR

METHODOLOGY

The methodology of this research is elaborated in this chapter. An eye-tracking experiment was used to record the moment-to-moment eye activities of the participants, so as to compare the temporal and cognitive effort expended in the processes of PE and conventional translation.

4.1. Research Questions

The research objective is to test the applicability of machine translation post-editing in the Chinese-English language pair performed by college English learners, so as to answer the question of whether machine translation post-editing could become a new way for college English learners to conduct daily translation work, compared with traditional human translation. Three research questions are raised as follows:

1) For the Chinese-English language pair, how do machine translation post-editing and traditional human translation differ from each other temporally and cognitively?

2) What influence do text type and translator competence have on the cognitive effort spent in post-editing?

3) How does the allocation of cognitive effort to the source text and the target text differ between the human translation process and the post-editing process?

4.2. Research Hypotheses

Based on the previous studies on post-editing effort presented in Chapter Three and on the research questions above, three hypotheses are proposed.

Hypothesis 1: Temporal and cognitive effort for post-editing is less than that for translating from scratch.

Hypothesis 2: Competent translators (here, postgraduates) will spend less effort, in terms of time and cognition, than less competent ones (here, undergraduates). The reduction of temporal and cognitive effort varies among text types.

Hypothesis 3: Translators spend more cognitive effort on the source text during the translation process than during the post-editing process, while the cognitive effort devoted to the target text is higher in the post-editing process.

4.3. Participants

In total, 30 participants (aged 20 to 27) were involved in the experiment, including 24 female and 6 male students. These participants are from Guangdong University of Foreign Studies (School of Interpreting and Translation Studies, School of English for International Business and Faculty of English Language and Culture), Guangzhou University of Chinese Medicine (School of Foreign Studies) and Guangzhou University (School of Foreign Studies). All of them major in English and have taken translation courses for at least three years. All the participants are native speakers of Chinese with English as their second language. Among the 30 participants, 15 are 3rd- or 4th-year undergraduates and 15 are MA students (postgraduates). The postgraduates have all passed TEM-8, while the undergraduates have not. None of the participants had been trained in PE or had any professional experience as post-editors; yet, before the task, all of them were introduced to the basic principles of post-editing in this experiment.

Although individual differences among participants are inevitable, a within-subjects design is adopted in this study, meaning that each participant is tested under each condition so as to reduce the influence of individual differences. In particular, as one chosen dependent variable is pupil dilation, which varies greatly among people, the within-subjects design could significantly reduce the negative influence of individual differences.

4.4 Materials

In this study, as we intend to test the applicability of PE performed by college language learners in their daily translation work, three kinds of text are chosen: economic, political and literary texts, these being the most common text types students might meet in their daily life.

As this was a within-subjects study, each participant was asked to work on six short texts (A1, A2, B1, B2, C1, C2), three to translate from scratch and three to post-edit. The materials chosen for translation and post-editing totaled around 450 words, in consideration of the fact that participants might get tired or bored if the texts were too long, leading to a drop in motivation and a negative effect on the gaze data. The word number for each text is shown in Table 4-1.

Table 4-1: Word number for each text


Text A1 A2 B1 B2 C1 C2 Mean
Word Number 72 72 75 75 75 77 74.33

A1 and A2 are economic texts extracted and adapted from the same article in China Business News to ensure that they are of comparable difficulty and style. The same consideration applies to the political texts (B1 and B2) and the literary texts (C1 and C2). The political texts are extracted and adapted from the 2016 Report on the Work of the Government, and the literary texts from an essay by Liang Shiqiu. A1 and A2 are about the sales volume of a major real estate company. B1 and B2 are about the development of China. C1 and C2 are expository passages on time saving.

It is worth noting here that, as Carl (2015) pointed out, machine translation is far from capable of literary translation as far as style and cultural factors are concerned. The literary texts chosen here are therefore limited to ones without many culture-specific items and without a highly distinctive writing style.

Notes: A, B and C represent different text types. The economic source article is 哪家房企先闯入3000亿俱乐部? [Which property developer will be first to enter the 300-billion club?] [N]. 第一财经日报 (China Business News), 2016, 3249 (A07).

Of the two texts of each type, one was pre-translated by Google Translate, the free online translation system, while the other was left untranslated, for participants to post-edit or translate respectively. A detailed profile of the source texts and the raw output of Google Translate is attached in Appendix A and Appendix B.

A semi-randomized presentation sequence of the source texts is adopted in order to minimize the risk that observations are (in part) due to a fixed presentation sequence. Tasks are also randomly assigned to the participants with the RAND function (written as "=RAND()") in Excel, as shown in Figure 4-1.

Figure 4-1: Random assignment of the task to participants
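The assignment in Figure 4-1 can be illustrated in code. The following is a hypothetical Python sketch, not the Excel procedure actually used: the function name and the even/odd counterbalancing rule are our own illustrative assumptions. It gives each participant three texts to translate and three to post-edit and shuffles the presentation order, so that across 30 participants each text is translated fifteen times and post-edited fifteen times.

```python
import random

TEXTS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def assign_tasks(n_participants=30, seed=0):
    """Illustrative counterbalanced assignment: half the participants
    translate A1/B1/C1 and post-edit A2/B2/C2; the other half do the
    reverse. The presentation order of the six texts is shuffled."""
    rng = random.Random(seed)
    assignments = []
    for p in range(n_participants):
        first, second = ("HT", "PE") if p % 2 == 0 else ("PE", "HT")
        task_for = {"A1": first, "B1": first, "C1": first,
                    "A2": second, "B2": second, "C2": second}
        order = TEXTS[:]
        rng.shuffle(order)  # semi-randomized presentation order
        assignments.append([(text, task_for[text]) for text in order])
    return assignments
```

With 30 participants, every (text, task) pair occurs exactly fifteen times, matching the design described above.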

The default sequence of tasks and the semi-randomized sequence are also shown in Table 4-2 and Table 4-3. As shown in Table 4-3, each text will be post-edited fifteen times and translated fifteen times. Besides, participants will conduct the tasks following an order of A1-A2-B1-B2-C1-C2. Following this order, for half of the texts participants first conduct the translation task and then the post-editing task, while for the other half of the texts they first conduct the post-editing task and then the translation task. With this design, the practice effect caused by task order could be counterbalanced.

Table 4-2: Default sequence of source texts
T01–T30: A1 B1 C1 A2 B2 C2 (identical for every participant)

Table 4-3: Semi-randomized sequence of source texts
T01 A1 B1 C1 A2 B2 C2
T02 A1 B1 C2 A2 B2 C1
T03 A1 B2 C1 A2 B1 C2
T04 A1 B2 C2 A2 B1 C1
T05 A2 B1 C1 A1 B2 C2
T06 A2 B1 C2 A1 B2 C1
T07 A2 B2 C1 A1 B1 C2
T08 A2 B2 C2 A1 B1 C1
T09 A1 B1 C1 A2 B2 C2
T10 A1 B1 C2 A2 B2 C1
T11 A1 B2 C1 A2 B1 C2
T12 A1 B2 C2 A2 B1 C1
T13 A2 B1 C1 A1 B2 C2
T14 A2 B1 C2 A1 B2 C1
T15 A2 B2 C1 A1 B1 C2
T16 A2 B2 C2 A1 B1 C1
T17 A1 B1 C1 A2 B2 C2
T18 A1 B1 C2 A2 B2 C1
T19 A1 B2 C1 A2 B1 C2
T20 A1 B2 C2 A2 B1 C1
T21 A2 B1 C1 A1 B2 C2
T22 A2 B1 C2 A1 B2 C1
T23 A2 B2 C1 A1 B1 C2
T24 A2 B2 C2 A1 B1 C1
T25 A1 B1 C1 A2 B2 C2
T26 A1 B1 C2 A2 B2 C1
T27 A1 B2 C1 A2 B1 C2
T28 A2 B1 C2 A1 B2 C1
T29 A2 B2 C1 A1 B1 C2
T30 A2 B2 C2 A1 B1 C1

4.5 Equipment

A Tobii TX300 eye tracker is used to collect data on eye activities, including gaze time, fixations and pupil dilation. The Tobii TX300 is a remote eye tracker which, compared
with the EyeLink 1000, is non-invasive, i.e. there is no need for participants to wear any head-mounted equipment. The large head movement box of the Tobii TX300 enables unobtrusive capture of natural human behavior, which increases ecological validity. Pupil dilation and fixations can be recorded at a sampling rate of 300 Hz, meaning that highly accurate and precise data can be collected, providing a solid foundation for eye movement research. The Tobii TX300 used in this study is provided by Guangzhou University of Chinese Medicine.

Components of Tobii TX300 are shown as follows:

Figure 4-2: TX300 monitor Figure 4-3: Eye tracker unit

Figure 4-4: Participant conducting eye-tracking task

Eye movement data is recorded by the eye tracker unit (Figure 4-3). Figure 4-4

shows the scene when a participant conducts the translation or post-editing task.

As Tobii Studio, the software accompanying the Tobii TX300, does not support recording human-computer interaction, we employ the interface of Translog II (version 2.0.1.222), software designed to record user activity data in the process of translation, as well as reading, writing, copying and editing. The software was designed by Arnt Lykke Jakobsen and Lasse Schou and programmed by Lasse Schou, Morten Lemvigh, Jakob Elming and Michael Carl. Translog II can be connected to the TX300 remote eye tracker to record both gaze activities and keyboard activities. The Translog II software includes two parts: Translog II User and Translog II Supervisor. The source texts (for both tasks) and the machine translation output (only for the post-editing task) are presented in the Translog II User interface. Figure 4-5 shows the Translog II interface in the post-editing task.

Figure 4-5: Screenshot of Translog-II User Interface for PE

Note: Source text (SimSun, size 18) in the top half of the interface; target text (Yu Gothic UI font, size 18) in the bottom half. (It was advised that a double-spaced setting be adopted so that fixations could be located more exactly, yet this version of Translog II did not support double spacing.)

4.6 Procedures

4.6.1 Environment

The experiment was carried out in a room without sunlight; only artificial light was used. Noise was also kept out so as not to disturb the participants. As there was only one eye-tracking machine, participants were tested one by one, as they had been informed in advance by the experimenter. All the participants were also informed in advance that they were not allowed to drink or eat anything containing caffeine, and female participants were asked not to wear eye make-up, since caffeine intake and eye make-up might lead to invalid data.

4.6.2 Task Execution

Before the task, participants were asked to answer a brief questionnaire which was

designed to make a record of their individual profile, including name, sex, age,

education and their personal attitude towards PE and human translation. They were

acknowledged that this information would be kept confidential and only for research

use. There were no warm-up tests for participants since they had to carry out six tasks

which was a quite had workload for them. The workload may also affect data quality.

(1) Guidelines. After the questionnaire was completed, a brief introduction to the eye tracker was given to the participants so that they knew how the machine worked. They were told that the quality of the post-editing output should be as comparable as possible to that of human translation. The PE guidelines for participants were as follows (the first two were taken from O'Brien (2009), based on Wagner (1985)):

- Do not hesitate too long over a problem

- Do not embark on time-consuming research

- No time constraint

- No online or offline dictionary allowed

(2) Calibration. Before the task, Translog II should be connected to the remote eye tracker (see Figure 4-6).
Figure 4-6: Connecting Translog II to eye tracker

After the connection, calibration was conducted. Participants were asked to follow the yellow dot that appeared on the screen with their eyes, focusing on the center of the dot. If the calibration data were insufficient or the calibration was unsuccessful, a recalibration was done. The calibration results looked like Figure 4-7.

Figure 4-7: Calibration Results

Only after the calibration is conducted and qualified does the post-editing or translation task begin. There are in total six independent tasks for each participant (the tasks are established in advance). Before each task, calibration is conducted to ensure that the eye tracker records eye activities exactly and to guarantee the quality of the eye movement data.

The participants were required to sit directly in front of the monitor. The distance between the eye tracker and the participant should be within 50 to 80 cm, as suggested by the Tobii manual.

(3) Translation or PE tasks. After calibration, participants started their tasks. There were in total six short tasks, presented in semi-randomized order. Participants translated or post-edited these tasks according to the requirements and saved the data one by one. During the translation or post-editing task, participants were required to stay in a static position and to try to touch type so that they could focus on the screen as much as possible.

(4) Data saving. After each task, the data were saved once and named after the task file. Data saving was carried out by the experimenter or by capable participants themselves. Data were saved as xml files.

CHAPTER FIVE

DATA ANALYSIS AND DISCUSSION

5.1 Data Quality

The quality of eye-tracking data is sensitive to several factors: the eye make-up participants wear, varying lighting conditions, the participants' distance from the monitor, and so forth (O'Brien, 2009; Hvelplund, 2011; Korpal, 2015). To minimize the effects of these potentially error-inducing factors, various measures were taken in this research: curtains were drawn so that no daylight entered the room, and the same artificial light was lit during all experiments, day or night. Participants were required to sit directly in front of the monitor, with the distance between the monitor of the remote eye tracker and the participant kept within 50 to 80 cm, as suggested by the Tobii TX300 manual.

As for the variables collected to reflect cognitive effort, thresholds were set based on previous studies. For average fixation duration, we followed the 180-millisecond threshold of Sjørup (2013) and Lourenço da Silva et al. (2015), which means that recordings with an average fixation duration below 180 ms were discarded. Gaze time on screen (GTS), total gaze time divided by total task time, was used to measure the quality of the eye-tracking data. A low GTS might indicate that, for most of the translation or post-editing process, the participant looked away from the screen or the eye tracker lost track of the eyes. For GTS, the threshold was set at 30% (Lourenço da Silva et al., 2015; Hvelplund, 2011).
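The two screening criteria can be expressed as a small filter. The sketch below is illustrative only; the function name and argument units are our own, not taken from the study's actual processing scripts.

```python
def passes_quality_checks(total_gaze_time_ms, total_task_time_ms,
                          avg_fixation_duration_ms):
    """Return True if a recording meets both data-quality thresholds:
    gaze-time-on-screen (GTS) of at least 30 % of the task time, and an
    average fixation duration of at least 180 ms."""
    gts = total_gaze_time_ms / total_task_time_ms
    return gts >= 0.30 and avg_fixation_duration_ms >= 180
```

A recording that fails either criterion is excluded from further analysis.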

We also set another threshold to guarantee eye-tracking data quality: the percentage of valid win gaze data proposed by Lourenço da Silva et al. (2015). There are two areas of interest (AOIs) in this research: the source text (ST) area and the target text (TT) area. As the interface for the translation and post-editing work in this research is Translog-II User (see Figure 4-5), the AOIs are set by default by the software, with the upper part of the interface as the ST AOI and the lower part as the TT AOI.

The percentage of valid win gaze data equals the "number of occurrences of win=1 (gaze on ST) plus those of win=2 (gaze on TT)" divided by "the total number of wins, which also include win=0 (gaze ascribed to neither the ST AOI nor the TT AOI)" (Lourenço da Silva et al., 2015: 150). The threshold for this was 40%. Detailed information about "win" and its values is given in Table 5-1.
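The valid-win measure amounts to counting how many gaze samples land inside either AOI. A minimal sketch (the function name is our own; the 0/1/2 coding follows the description above):

```python
def valid_win_percentage(wins):
    """Proportion of gaze samples that fall inside either AOI.

    `wins` is a sequence of window codes: 1 = gaze on the source text,
    2 = gaze on the target text, 0 = gaze on neither AOI. Recordings
    below the 40 % threshold would be discarded."""
    in_aoi = sum(1 for w in wins if w in (1, 2))
    return in_aoi / len(wins)
```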

5.2 Data Processing

As eye movement data in this experiment were collected through the Translog-II software installed on the computer driving the remote eye tracker, all the data, both keylogging and eye-tracking, were stored in xml files. The eye-tracking data were extracted from the xml documents through derivations in Microsoft Excel. The data stored in the xml file are arranged as in Figure 5-1, and the terms in the xml document are interpreted in Table 5-1.

Figure 5-1: Eye-tracking data recorded in the xml doc.

Translog II collects and processes gaze data in three steps: (1) check whether the gaze is within the source or the target window; (2) compute fixations based on a variant of a fixation-detection algorithm; (3) map gaze points and fixations onto the closest character on the screen. It is worth noting that step 3 is incompatible with the Chinese and Japanese Input Method Editors, and the language pair studied in this research is Chinese-English. Therefore, an additional step (4), offline gaze mapping, is needed: the log file is opened and replayed in Translog II Supervisor, and after the replay the newly produced xml file is saved and used for data collection.

Table 5-1: Terms in the xml document (based on the Translog II Manual)

Term                       Description
Eye Time="781"             Moment of a gaze with reference to the beginning of the recording
Cursor="95"                The position of the nearest character with reference to the first character of the text looked at
pr="2.467" pl="2.471"      Diameters of the right and left pupils, respectively
Yr="407" Xr="83"           Right eye coordinates, Y and X, with reference to screen-zero
Yl="406" Xl="75"           Left eye coordinates, Y and X, with reference to screen-zero
Win="1", "0", "2"          "Win" stands for the window the user looks at or types in: 1 = source window, 2 = target window, 0 = user looking at neither window
Fix Time X="208" Y="49"    Coordinates of a fixation, X and Y, with reference to screen-zero
Dur="297"                  Duration of the fixation
TT="789"                   Tracker Time: moment of gaze with reference to the beginning of the recording, measured by the eye tracker

Note: Screen-zero refers to the top left point of the screen.
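To illustrate how fields such as those in Table 5-1 can be pulled out of the log, here is a hypothetical sketch using Python's standard library. The element names and attribute spellings are assumptions modeled on Table 5-1, not the actual Translog II schema; in this study the extraction was in fact done via Microsoft Excel.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped after Table 5-1; element and attribute
# names are assumptions, not the real Translog II schema.
SAMPLE = """<LogFile>
  <Eye Time="781" Cursor="95" pr="2.467" pl="2.471"
       Yr="407" Xr="83" Yl="406" Xl="75" Win="1"/>
  <Fix Time="789" X="208" Y="49" Dur="297" Win="2"/>
</LogFile>"""

def extract_gaze(xml_text):
    """Collect gaze samples and fixations from a Translog-style log."""
    root = ET.fromstring(xml_text)
    samples = []
    for eye in root.iter("Eye"):
        # Average the right and left pupil diameters, as done in 5.3.2.
        pupil = (float(eye.get("pr")) + float(eye.get("pl"))) / 2
        samples.append({"time": int(eye.get("Time")),
                        "pupil": pupil,
                        "win": int(eye.get("Win"))})
    fixations = [{"dur": int(fix.get("Dur")), "win": int(fix.get("Win"))}
                 for fix in root.iter("Fix")]
    return samples, fixations
```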

Figure 5-2: Replaying of the xml file before step 4


Figure 5-3: Replaying of the xml file after step 4 (blue circles in
the picture refer to eye fixations)

5.3 Results of the Experiment

This is a 2×3×2 mixed-design experiment. Task (two levels: human translation, HT for short, and PE) and text type (three levels: economic, political and literary) are within-subject factors, and competence (two levels: postgraduate and undergraduate) is a between-subject factor.

There are in total three independent variables in this research, as indicated above. As for the dependent variables, since we intend to investigate the temporal and cognitive aspects of PE and HT, those chosen for this research are ① processing speed, ② pupil dilation, ③ fixation count and ④ average fixation duration. The data collected from the experiment are total task time, fixation counts (in the ST AOI, in the TT AOI and in both AOIs) and fixation durations.

In the following sections, to save space in figures and tables, some of the independent and dependent variables are abbreviated. The abbreviations and their corresponding variables are shown in Table 5-2.

Table 5-2: Abbreviations used in tables and figures
Abbreviation Corresponding Variables
UG Undergraduate
PG Postgraduate
HT Human translation
PE Post-editing
E Economic text
P Political text
L Literary text
Cpt Competence

5.3.1 Processing Speed

Processing speed here is the number of words a participant can process in one minute. It is derived by dividing the number of source text words by the total task time (in minutes). The relation between processing speed and temporal effort can be described as follows: the higher the processing speed, the less temporal effort is expended.
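The computation is straightforward; the following sketch (function name and sample numbers are illustrative, not drawn from the study's data) shows it:

```python
def processing_speed(source_word_count, task_time_seconds):
    """Words processed per minute: source-text words / task time (min)."""
    return source_word_count / (task_time_seconds / 60)

# A hypothetical 75-word text (the length of B1, B2 or C1) finished in
# five minutes yields 15 words per minute.
speed = processing_speed(75, 300)
```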

Average processing speeds for translating or post-editing different types of texts by undergraduate (UG) and postgraduate (PG) participants are shown in Figure 5-4, Table 5-3 and Table 5-4. The overall processing speed figures for all participants are attached in the appendices.

Figure 5-4: Average processing speed for translating or post-editing different types of texts conducted by undergraduates and postgraduates
As is shown in Figure 5-4, for both undergraduates and postgraduates, the

average processing speed for translation is lower than that for post-editing,

irrespective of text types. When text type is taken into consideration, according to

Table 5-3 and Table 5-4, processing speed for post-editing is still higher than that for

translation (14.65 vs. 8.97 for economic text; 18.86 vs. 10.63 for political text; and

21.85 vs. 14.69 for literary text).

Table 5-3: Descriptive Statistics of participants’ processing speed of


human translation task (N=20)
HT
Competence Economic Political Literary
Mean SD Mean SD Mean SD
UG 7.14 2.40 9.11 1.81 13.50 2.24
PG 11.20 3.85 12.49 3.77 16.14 9.07
Total 8.97 3.68 10.63 3.27 14.69 6.25

Table 5-4: Descriptive Statistics of participants’ processing speed of


post-editing task (N=20)
PE
Competence Economic Political Literary
Mean SD Mean SD Mean SD
UG 13.58 10.13 16.24 8.84 20.84 9.59
PG 15.95 10.89 22.06 8.56 23.08 16.88
Total 14.65 10.27 18.86 8.99 21.85 13.02

A three-way ANOVA was conducted on processing speed. The results (shown in Table 5-5) indicate that the main effect of task (HT vs. PE) is highly significant, F(1, 18) = 16.401, p < 0.01: the post-editing task is faster than the translation task. The main effect of text is also highly significant, F(1, 18) = 37.683, p < 0.01: literary text is processed the fastest, followed by political text and then economic text.

However, the main effect of competence is not significant (p > 0.1). No interaction effect is found between task and competence, between task and text, or between text and competence, and the interaction among text, task and competence is also not significant (F < 1).

Table 5-5: Results of three-way ANOVA in terms of processing speed

Source df Mean Square F p


Task 1 1467.269 16.401 .001*
Text 1 817.281 37.683 .000*
Cpt 1 346.935 1.537 .231
Task * Cpt 1 .099 .001 .974
Text * Cpt 1 2.984 .138 .715
Task * Text 1 11.828 .308 .586
Task * Text * Cpt 1 2.049 .053 .820
Note: * indicates that there is significant difference.
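The direction of the task main effect can be read off the marginal means. As a quick arithmetic check (using the Total rows of Tables 5-3 and 5-4; statistical software would normally do this collapsing), averaging processing speed over the three text types gives roughly 11.4 words per minute for HT versus 18.5 for PE:

```python
# Total-row means from Tables 5-3 and 5-4 (words per minute).
HT = {"economic": 8.97, "political": 10.63, "literary": 14.69}
PE = {"economic": 14.65, "political": 18.86, "literary": 21.85}

def grand_mean(cells):
    """Collapse the text-type cells into a single (unweighted) marginal mean."""
    return sum(cells.values()) / len(cells)
```

Note that this is an unweighted average over the text-type totals, offered only to make the direction of the effect concrete.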

5.3.2 Pupil Dilation

Hyönä et al. (1995) found that pupillary responses can suitably indicate the cognitive load of mental processing activities: the pupil dilates as difficulty and processing load increase. The pupillary data collected in this research include pupil dilation for both the left and the right eye. Nevertheless, as there is a high degree of concordance between the left and right eye (Niehaus, Guldin & Meyer, 2001), the pupil dilation reported here is the average dilation of both eyes. The overall pupil dilation figures for all participants are shown in Appendix D.

The results of the three-way ANOVA conducted on pupil dilation are shown in Table 5-6. According to the results, the main effect of task is highly significant, F(1, 18) = 20.560, p < 0.01, indicating that irrespective of the other factors, pupil dilation tends to be larger when students conduct the translation task (2.96 mm) than when they do post-editing work (2.89 mm). The main effect of text is also significant, F(1, 18) = 5.771, p < 0.05. Yet the main effect of competence does not prove significant (F < 1).

Table 5-6: Results of three-way ANOVA in terms of pupil dilation

Source df Mean Square F p

Task 1 .139 20.560 .000*


Text 1 .020 5.771 .027*
Cpt 1 .043 .084 .775
Task * Cpt 1 .005 .731 .404
Text * Cpt 1 .024 6.876 .017*
Task * Text 1 .000 .151 .702
Task * Text *Cpt 1 .005 1.891 .186
Note: * indicates that there is significant difference.

The interaction between text and competence is significant, F(1, 18) = 6.876, p < 0.05. A simple-effects test should therefore be conducted, lest the interaction mask or distort the main effect of competence. However, as our interest lies in the interactions between task and competence, between task and text, and among the three factors, which are all not statistically significant (p > 0.1), the competence × text interaction will not be further discussed here.

In order to show more clearly what the main effect of task is and how it changes for students of different competence and for different text types, post hoc comparisons were carried out; they are shown in Figures 5-5, 5-6 and 5-7.

The post hoc comparisons reveal that, irrespective of text type and competence, pupil dilation for post-editing is markedly reduced compared with human translation, indicating a considerable reduction of cognitive effort. For the literary and political texts (Figures 5-6 and 5-7), the pupil dilation of postgraduates, for both post-editing and traditional translation, is smaller than that of undergraduates; in other words, the cognitive effort expended by postgraduates is less than that expended by undergraduates. Besides, the amount of cognitive effort saved by post-editing is similar for postgraduates and undergraduates. Nevertheless, for the economic text (see Figure 5-5), postgraduates expend more cognitive effort in translation than undergraduates, but less in post-editing. The amount of cognitive effort saved from human translation to post-editing is thus larger for postgraduates than for undergraduates.

Figure 5-5: Estimated Marginal Means of Pupil Dilation of Economic Text

Figure 5-6: Estimated Marginal Means of Pupil Dilation of Political Text

Figure 5-7: Estimated Marginal Means of Pupil Dilation of Literary Text

Table 5-7: Descriptive Statistics of participants’ average pupil dilation of


human translation task (N=20)
HT
Competence Economic Political Literary
Mean SD Mean SD Mean SD
UG 2.97 0.27 2.95 0.27 2.98 0.28
PG 2.99 0.33 2.93 0.30 2.91 0.30
Total 2.98 0.29 2.94 0.28 2.95 0.29

Table 5-8: Descriptive Statistics of participants’ average pupil dilation of


post-editing task (N=20)
PE
Competence Economic Political Literary
Mean SD Mean SD Mean SD
UG 2.93 0.31 2.89 0.28 2.92 0.28
PG 2.89 0.33 2.85 0.31 2.84 0.30
Total 2.91 0.31 2.87 0.29 2.88 0.28

The descriptive statistics of participants' average pupil dilation for both the translation and post-editing tasks in Tables 5-7 and 5-8 also support what is seen in the post hoc comparison figures.

5.3.3 Fixation Count

A fixation forms when the eyes remain relatively still on a certain object. In translation process research that collects eye movement data, fixation count is commonly used to indicate the amount of cognitive effort (e.g. Hvelplund, 2011): to some extent, more fixations indicate that more cognitive effort is spent in the translation process. For fixation count, we consider not only the total fixation count but also the fixations distributed to the ST area (i.e. win 1, or the ST AOI) and the TT area (i.e. win 2, or the TT AOI), so as to see the allocation of cognitive effort between the two AOIs. The overall fixation count figures, including fixation counts in the ST AOI, in the TT AOI and in both AOIs, are attached in Appendix E.

5.3.3.1 Total Fixation Counts

Total fixation counts are the fixations in both the source text and the target text areas. According to the ANOVA conducted on total fixation counts, the main effect of task is highly significant, F(1, 18) = 10.720, p < 0.01. The overall fixation count for post-editing is 590.18, 13.33% less than that for human translation, which is 680.97. This indicates that less cognitive effort is spent in the process of post-editing than in translation from scratch.
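The 13.33% figure follows directly from the two overall means, a simple arithmetic check that is not part of the ANOVA itself (the function name below is ours):

```python
def percent_reduction(ht_value, pe_value):
    """Relative saving from human translation (HT) to post-editing (PE)."""
    return (ht_value - pe_value) / ht_value * 100

# Overall fixation counts reported above: 680.97 (HT) vs. 590.18 (PE).
saving = percent_reduction(680.97, 590.18)
```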

The main effect of text is also highly significant, F(1, 18) = 17.729, p < 0.01, which indicates that cognitive effort varies with text type. Yet the interaction effect between task and text is not significant (p > 0.1). Besides, the task × competence, text × competence and task × text × competence interactions are also not statistically significant (for all these interactions, p > 0.05).
Table 5-9: Descriptive Statistics of Total Fixation Counts for both
human translation and post-editing tasks (N=20)
Competence Mean SD
HT-E UG 1009.55 468.03
PG 663.89 211.23
Total 854.00 406.46
HT-P UG 781.36 259.01
PG 653.78 343.45
Total 723.95 298.69
HT-L UG 547.55 122.70
PG 600.00 377.61
Total 571.15 262.07
PE-E UG 778.27 363.90
PG 756.78 399.57
Total 768.60 370.19
PE-P UG 646.09 258.55
PG 524.33 202.17
Total 591.30 237.18
PE-L UG 466.82 173.62
PG 506.44 189.89
Total 484.65 177.36

Table 5-10: Results of three-way ANOVA in terms of total fixation counts

Source df Mean Square F p

Task 1 275031.980 10.720 .004*


Text 1 1464012.929 17.729 .001*
Cpt 1 226883.408 .684 .419
Task * Cpt 1 82987.980 3.235 .089
Text * Cpt 1 260981.729 3.161 .092
Task * Text 1 1594.813 .037 .851
Task * Text * Cpt 1 140533.213 3.220 .090
Note: * indicates that there is a significant difference.

There is no significant difference between undergraduates and postgraduates in the cognitive effort spent (p > 0.1). However, the post hoc comparisons (Figures 5-8, 5-9 and 5-10) show that for undergraduates there is always a reduction in fixation counts from the translation task to the post-editing task, whatever the text type. For postgraduates this is not always the case: when post-editing the economic text, postgraduates show more fixations than when translating it. In addition, postgraduates tend to have fewer fixations than undergraduates for the economic and political texts, but more for the literary text.

Figure 5-8: Estimated Marginal Means of Total Fixation Counts of Economic Text

Figure 5-9: Estimated Marginal Means of Total Fixation Counts of Political Text
Figure 5-10: Estimated Marginal Means of Total Fixation Counts of Literary Text

5.3.3.2 Fixation Counts in ST AOI

Figure 5-11: Fixation Counts in win 1 (ST AOI) for translating or post-editing different
types of texts conducted by undergraduates and postgraduates
Note: figures in the red square frame are mean values over the three text types (e.g. PE-L, PE-P and PE-E)

In this research, the material is presented in the Translog II interface, which is divided into two parts: the source text (for both the translation and the post-editing task) is shown in the upper part of the interface, and the raw machine translation (only for the post-editing task) in the bottom part. The upper part is defined as the ST AOI (the term used in the final xml file is "win = 1"). Fixation counts distributed to this area are interpreted as cognitive effort spent on source text understanding.

Results of the three-way ANOVA on fixation counts in win 1 reveal that the number of fixations in the ST AOI differs significantly between the post-editing and translation tasks (F (1, 18) = 42.677, p < 0.01). Students, both postgraduates and undergraduates, look at the source text area more during the translation process (313.17 fixations for translation vs. 180.28 for post-editing). The main effect of Text is also significant, F (1, 18) = 8.029, p < 0.05. For the post-editing task, the number of fixations in the ST AOI is largest for economic texts (206.15), followed by literary texts (172.2) and political texts (162.5). The interaction between Text and Competence, although statistically significant (F (1, 18) = 7.970, p < 0.05), will not be discussed further here.

Table 5-11: Results of three-way ANOVA in terms of


fixation counts in the ST AOI

Source df Mean Square F p.

Task 1 518219.055 42.677 .000*


Text 1 112194.000 8.029 .011*
Cpt 1 32346.600 .378 .546
Task * Cpt 1 1857.055 .153 .700
Text * Cpt 1 111367.500 7.970 .011*
Task * Text 1 38205.306 2.670 .120
Task * Text * Cpt 1 47505.506 3.320 .085
Note: * indicates that there is a significant difference.
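Each F value in these tables is the ratio of the effect's mean square to the error mean square. The sketch below illustrates that computation for a single factor with two levels, using invented fixation counts rather than the experiment's data:

```python
# Hypothetical fixation counts for two levels of Task (HT vs. PE).
ht = [310, 320, 305, 318]
pe = [178, 185, 176, 182]

def mean(xs):
    return sum(xs) / len(xs)

grand = mean(ht + pe)

# Between-group sum of squares (effect) and within-group (error).
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in (ht, pe))
ss_within = sum((x - mean(g)) ** 2 for g in (ht, pe) for x in g)

df_between = 2 - 1                  # k - 1 groups
df_within = len(ht) + len(pe) - 2   # N - k

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within    # analogous to the F column above

print(round(f_ratio, 2))
```

A full three-way ANOVA partitions the sums of squares over three factors and their interactions, but each F statistic is formed in this same way.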

The main effect of Competence is not significant (F < 1). The interaction effects between Competence and Task, between Task and Text, and among the three factors are also not significant (p > 0.05).

5.3.3.3 Fixation Counts in TT AOI

As participants are required to translate or post-edit in the bottom part of the Translog II interface, where the raw machine translation output is provided for the PE task, fixation counts in the TT AOI (win 2) are interpreted as cognitive effort spent on target text production.

Figure 5-12: Average Fixation Counts in win 2 (TT AOI) for translating or post-editing
different types of texts conducted by undergraduates and postgraduates
Note: figures in the red frame are mean values for the three tasks (i.e. PE-L, PE-P and PE-E)

The ANOVAs conducted on the fixation counts in win 2 indicate that the main effect of Task is not significant (p > 0.05): fixation counts in the TT AOI do not vary much between PE and HT. Nevertheless, the number of fixations in the TT AOI is higher for the post-editing task than for translation, which means translators look at the TT area more in PE than in HT.

The main effect of Text is still highly significant, F (1, 19) = 15.490, p < 0.01. For the post-editing task, regardless of participants' competence, economic texts show the largest number of fixations in the TT AOI (768.60), followed by political texts (591.30) and literary texts (484.65).

Table 5-12: Results of three-way ANOVA in terms of


fixation counts in the TT AOI

Source df Mean Square F p.

Task 1 38196.566 2.224 .153


Text 1 765643.010 15.490 .001*
Cpt 1 87895.168 .816 .378
Task * Cpt 1 60016.566 3.495 .078
Text * Cpt 1 31380.710 .635 .436
Task * Text 1 55411.692 3.247 .088
Task * Text * Cpt 1 24623.892 1.443 .245
Note: * indicates that there is a significant difference.

5.3.4 Average Fixation Duration

For a translator, longer fixation duration indicates deeper cognitive processing. In the last section, the fixation counts in the ST and TT AOIs were discussed to infer the allocation of cognitive effort. In this section, the average fixation duration in the ST and TT AOIs is discussed as a complement to the distribution of fixation counts and as a further index of cognitive effort. Overall fixation duration figures for all participants are attached in Appendix F.
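Average fixation duration is simply the total fixation time in an AOI divided by the number of fixations there. A toy computation (the durations are invented for illustration):

```python
# Hypothetical fixation durations (ms) recorded in one AOI.
durations_ms = [430, 510, 395, 620, 480]

# Average fixation duration: total fixation time / number of fixations.
avg_fixation_duration = sum(durations_ms) / len(durations_ms)
print(avg_fixation_duration)  # mean duration in ms
```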

5.3.4.1 Average Fixation Duration for all AOIs

As listed in Table 5-13, the main effect of Task is highly significant (F (1, 18) = 22.370, p < 0.01), which means there is a substantial difference between post-editing and human translation in average fixation duration (520.72 ms for human translation vs. 435.61 ms for post-editing). More effort is expended when participants translate from scratch than when they post-edit. This result holds for both postgraduates (476.49 ms for human translation vs. 405.22 ms for post-editing) and undergraduates (540.86 ms for human translation vs. 460.48 ms for post-editing).

Table 5-13: Results of three-way ANOVA in terms of


average fixation duration in both AOIs

Source df Mean Square F p.

Task 1 170764.744 22.370 .000*


Text 1 3249.338 .479 .498
Cpt 1 106246.972 4.387 .051*
Task * Cpt 1 617.498 .081 .779
Text * Cpt 1 1186.224 .175 .681
Task * Text 1 16393.720 3.590 .074
Task * Text * Cpt 1 6893.130 1.510 .235
Note: * indicates that there is a significant difference.

Table 5-14: Descriptive Statistics of the average fixation


duration for all AOIs

Competence Mean SD
HT-E UG 544.07 66.89
PG 490.92 67.65
Total 520.15 70.84
HT-P UG 549.63 68.16
PG 515.61 90.12
Total 534.32 78.53
HT-L UG 528.88 85.06
PG 422.94 174.99
Total 481.21 140.09
PE-E UG 464.15 84.86
PG 402.48 43.36
Total 436.40 74.65
PE-P UG 448.09 71.85
PG 383.83 45.59
Total 419.18 68.32
PE-L UG 469.19 97.42
PG 429.36 114.63
Total 451.27 104.60
The main effect of Competence is marginally significant, F (1, 18) = 4.387, p = 0.051. Table 5-14 shows clearly that, for all tasks and all text types, the average fixation duration of postgraduates is shorter than that of undergraduates. In other words, it is comparatively easier for postgraduates than for undergraduates to translate or post-edit texts of the same difficulty.

The main effect of Text is not significant (F < 1). The interactions between Competence and Text, Competence and Task, Task and Text, and among the three factors are also not significant (p > 0.05).

5.3.4.2 Average Fixation Duration in ST AOI

Average fixation duration in the ST AOI reflects cognitive effort expended on understanding the source text.

Table 5-15: Results of three-way ANOVA in terms of


average fixation duration in ST AOI

Source df Mean Square F p

Task 1 136328.750 30.254 .000*


Text 1 11761.387 2.845 .109
Cpt 1 34004.169 2.548 .128
Task * Cpt 1 3341.734 .742 .400
Text * Cpt 1 758.490 .183 .674
Task * Text 1 29821.225 7.662 .013*
Task * Text * Cpt 1 96.326 .025 .877
Note: * indicates that there is a significant difference.

Results of the three-way ANOVA indicate that human translation and post-editing differ greatly in source text comprehension, as the main effect of Task is statistically significant, F (1, 18) = 30.254, p < 0.01. For human translation, the average fixation duration in the ST AOI is 424.59 ms, while for post-editing it is 348.54 ms, a reduction of 17.9%. The same holds for postgraduates and undergraduates separately, with reductions of 20% and 16% respectively.
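The reported reduction follows directly from the two means:

```python
ht_st = 424.59  # mean fixation duration in ST AOI, human translation (ms)
pe_st = 348.54  # mean fixation duration in ST AOI, post-editing (ms)

# Percentage reduction of post-editing relative to human translation.
reduction_pct = (ht_st - pe_st) / ht_st * 100
print(round(reduction_pct, 1))  # ≈ 17.9
```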

Table 5-16: Descriptive Statistics of the average fixation


duration for ST AOIs
Competence Mean SD
HT-E UG 469.95 67.52
PG 434.75 88.99
Total 454.11 77.82
HT-P UG 420.25 58.54
PG 412.99 75.84
Total 416.98 65.11
HT-L UG 402.79 57.43
PG 375.55 58.53
Total 390.53 58.06
PE-E UG 379.35 72.18
PG 326.08 37.39
Total 355.38 63.80
PE-P UG 356.81 62.20
PG 313.24 34.33
Total 337.20 55.02
PE-L UG 385.39 125.21
PG 348.91 70.66
Total 368.98 103.44

In addition, the interaction between Task and Text proves significant. Post hoc comparisons were conducted and are shown in Figure 5-13. For all text types, the average fixation duration in the ST AOI is shorter for post-editing than for human translation; that is, post-editing saves cognitive effort on source text comprehension. However, as the figure shows, the amount of effort saved varies across text types: the effort saved when post-editing economic and political texts is similar, whereas for literary texts comparatively little cognitive effort is saved.
Figure 5-13: Estimated Marginal Means of Average Fixation Duration in ST AOI

5.3.4.3 Average Fixation Duration in TT AOI

Average fixation duration in TT AOI reflects cognitive effort spent on target text

production.

Table 5-17: Results of three-way ANOVA in terms of


average fixation duration in TT AOI

Source df Mean Square F p

Task 1 478398.131 54.454 .000*


Text 1 3717.809 .772 .391
Cpt 1 73816.737 2.559 .127
Task * Cpt 1 9.479 .001 .974
Text * Cpt 1 4675.437 .970 .338
Task * Text 1 145.887 .032 .859
Task * Text * Cpt 1 3391.398 .753 .397
Note: * indicates that there is a significant difference.

Results of the three-way ANOVA on average fixation duration in the TT AOI indicate that human translation and post-editing differ greatly in target text production. The main effect of Task is highly significant, F (1, 18) = 54.454, p < 0.01. For human translation, the average fixation duration in the TT AOI is 580.55 ms, while for post-editing it is 458.93 ms, a reduction of 20.9%. Translators spend more cognitive effort on target text production when translating from scratch than when post-editing machine translation output.

Table 5-18: Descriptive Statistics of the average fixation duration for TT AOIs

Competence Mean SD
HT-E UG 590.08 87.98
PG 533.86 57.21
Total 564.78 79.22
HT-P UG 636.93 100.41
PG 596.95 122.56
Total 618.94 109.76
HT-L UG 604.21 113.60
PG 552.55 109.27
Total 580.97 111.87
PE-E UG 494.30 94.24
PG 426.49 50.29
Total 463.78 83.29
PE-P UG 481.04 93.21
PG 408.50 55.68
Total 448.40 85.14
PE-L UG 476.83 73.61
PG 465.93 124.83
Total 471.93 97.18

Although no significant difference is found for Competence, Table 5-18 shows that the average fixation duration for undergraduates is somewhat longer than that for postgraduates. For postgraduates, average fixation duration is 20.8% shorter in post-editing than in human translation; for undergraduates, the reduction is 21.2%. Differences in fixation duration among the three text types are very small, as shown in Table 5-18.

5.4 Discussion of the Results

As presented in the previous chapter, in order to test the applicability of machine translation post-editing for the Chinese-English language pair and for college English learners, three research questions were raised, and three hypotheses were correspondingly proposed on the basis of previous studies. The remainder of this section discusses these hypotheses at length in combination with the experimental results.

5.4.1 Discussion of Hypothesis 1

Hypothesis 1: Temporal and cognitive effort for post-editing is less than that for

translating from scratch.

For the first hypothesis, we assume that post-editing could save both time and cognitive effort compared with traditional human translation. First, in terms of time saving, the ANOVA results for processing speed, the dependent variable chosen to indicate temporal effort, support the hypothesis. The three-way ANOVA shows that the main effect of Task is highly significant (p < 0.01): there is a significant difference in processing speed between the post-editing and translation tasks. Together with the detailed processing-speed figures, it is clear that post-editing does save time, that is, the temporal effort defined by Krings (2001).
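Processing speed as an index of temporal effort can be operationalized as source-text words per minute. A hypothetical computation (the figures are invented, not taken from the experiment):

```python
# Hypothetical task data for one participant and one text.
source_words = 120
ht_seconds = 900   # time to translate from scratch
pe_seconds = 600   # time to post-edit the MT output

# Processing speed in source words per minute for each task.
ht_speed = source_words / (ht_seconds / 60)
pe_speed = source_words / (pe_seconds / 60)

print(ht_speed, pe_speed)
```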

This result is in accordance with the findings of O'Brien (2006), who innovatively conducted an eye-tracking study to assess the "cognitive load", in her words, expended when translators translate different match types in Trados, the translation memory tool. Among the four match types O'Brien chose, the "no match" condition requires the translator to translate from scratch and the "MT match" condition requires the translator to edit machine-translated output, which corresponds exactly to human translation and post-editing. Results show that the processing speed for "MT match" is nearly twice that for "no match" (O'Brien, 2006: 190). However, only four participants took part in the experiment, as O'Brien stated that the research was preliminary and novel in nature. García (2011) reported similar conclusions, although the variable he chose was total task time. García (2010) first reported a study which showed no significant difference between post-editing and translating, though post-editing was indeed faster than manual translation. García (2011) then reported another study in which a statistically significant difference was found, and concluded that post-editing could be a viable alternative to human translation (García, 2011). Guerberof (2009) also compared machine translation post-editing with translation from scratch and reported a higher speed for post-editing than for manual translation (although not statistically significant), as well as a productivity gain of about 25%.

Second, in terms of overall cognitive effort, the ANOVA results also support the claim that less cognitive effort is required in the post-editing process than in the translation process. The indices of cognitive effort in this research are pupil dilation, total fixation counts and average fixation duration across both AOIs. For pupil dilation, the main effect of Task is highly significant (p < 0.01): pupil size during human translation is larger than during post-editing, so cognitive effort spent on human translation is significantly greater than that spent on post-editing. For average fixation duration, the ANOVA also shows a significant difference between tasks: translators tend to have longer fixations in the translation process than in the post-editing process. In addition, the ANOVA result for total fixation counts proves that the number of fixations for human translation is markedly higher than that for post-editing.

The results of our experiment are consistent with the findings of Lourenço da Silva et al. (2015), who investigated the processes of post-editing and human translation from Portuguese to Chinese. Results for all dependent variables concerning eye activities in both AOIs (including fixation counts and average fixation duration) proved significant, indicating a reduction of overall cognitive effort in the post-editing task. As for pupil dilation, the study by O'Brien (2006) offered a similar finding: the percentage change in pupil dilation for post-editing machine translation was lower than that for translating from scratch, which suggested that translating from scratch required more cognitive effort.

The reduction of temporal and cognitive effort in the post-editing task may be explained as follows. First, compared with translating from scratch, translators are provided with an already translated text in the post-editing task; their main job is to correct the deficiencies of the machine translation. Post-editing thus saves translators the effort of typing the whole target text. Second, as a translation of the source text, however deficient, is provided during post-editing, translators do not have to re-understand the whole source text; they only need to ponder those points that the machine translation system has not translated well. These two reasons are further justified in the discussion of hypothesis three, which looks more specifically into source text comprehension and target text production. As for the significant difference in processing speed found in this research but not in García (2010), one explanation is that different translation tools were used: Google Translator Toolkit (a translation memory tool) for García and the free Google Translate for this research. Besides, text type and text difficulty also affect the result, not to mention the improvement of machine translation systems over recent years.

5.4.2 Discussion of Hypothesis 2

Hypothesis 2: Competent translators (in this case, postgraduates) spend less temporal and cognitive effort than less competent ones (in this case, undergraduates). The reduction of temporal and cognitive effort varies among text types.

Hypothesis 2 consists of two statements, which will be discussed in turn. The first is that the competence of the translator affects temporal and cognitive effort. The results of the experiment partially support this statement. For most dependent variables (fixation counts in the two AOIs, pupil dilation and processing speed), the ANOVA results indicate that the main effect of Competence is not statistically significant, although some differences do exist. For average fixation duration across both AOIs, the main effect of Competence proves marginally significant (p = 0.051), and the interaction between Task and Competence is not significant. The post hoc comparison reveals that undergraduates fixate longer on the screen than postgraduates, in both the translation process and the post-editing process. For the other variables there is an effect of competence, though not a significant one: postgraduates carry out the post-editing task faster than undergraduates, and undergraduates always show larger pupil dilation than postgraduates when post-editing.

De Almeida and O'Brien (2010) conducted a study exploring how translation experience influences post-editing performance. It showed that the most experienced translators post-edited fastest, while the translator with the least experience was the slowest. This result is consistent with our hypothesis. Balling and Carl (2014) drew on the large resources of the CRITT database, a translation process research database, to conduct a large-scale analysis of data from 68 translators. However, they found that translator experience had a smaller influence than they had supposed, and proposed individual differences among translators as a possible explanation. In this research, the reason why only one variable shows a significant difference may be that, although participants are classified into undergraduates and postgraduates, translation experience is not uniform within either group. It is possible that a given undergraduate has done a great deal of translation practice and accumulated more experience than some postgraduates.

The second statement is that text type affects temporal and cognitive effort. The ANOVA results also support this statement: the results for text type are significant for processing speed, pupil dilation and total fixation counts (p < 0.05). However, in this research, the effort to keep the difficulty of the post-editing and translation texts at the same level was limited to texts of the same type. As Hvelplund (2011) noted, assessing difficulty level is hard even within one text type, let alone across different types, so we cannot claim that the three source texts are of the same difficulty. Therefore, no comparison among the three texts is drawn; we only examine whether, for a given text, there is a reduction of effort in the post-editing task. According to the post hoc comparisons, undergraduates are faster in post-editing than in translation for all text types. For postgraduates, however, the pattern differs when they post-edit the economic text. Combined with the results for fixation counts and average fixation duration, which show that postgraduates have more fixations but shorter fixation durations when dealing with the economic text, this anomaly can be explained: as the economic text chosen for this research contains many figures, postgraduates tend to recheck the correctness of these figures, which produces more fixations and costs more time. Nevertheless, the cognitive effort for post-editing the economic text is still reduced.

5.4.3 Discussion of Hypothesis 3

Hypothesis 3: Translators spend more cognitive effort on the source text during the translation process than during the post-editing process, while cognitive effort devoted to the target text is higher during the post-editing process.

Having examined the overall savings in temporal and cognitive effort, we further examine how cognitive effort is distributed in the processes of post-editing and translating from scratch. To this end, we collected the fixation counts in the ST AOI and the TT AOI as well as the average fixation duration in each, and conducted ANOVAs on these variables.

For the ST AOI, there are significantly more fixation counts for translation than for post-editing (p < 0.01). Besides, the average fixation duration in the ST AOI for human translation is also significantly longer than that for post-editing. Both results indicate that translators spend more cognitive effort on source text comprehension when translating from scratch. For fixation counts in the TT AOI, there is no statistically significant difference; nevertheless, the number of fixations in the TT AOI for post-editing exceeds that for translation, which means translators look at the target text area more in the post-editing task. The average fixation duration in the TT AOI is significantly longer for the translation task, revealing that more cognitive effort is devoted to the TT area in human translation. Thus, for target text production as well, human translation requires more cognitive effort than post-editing. We can conclude that post-editing saves cognitive effort in both source text comprehension and target text production.

This conclusion is consistent with the findings of Lourenço da Silva et al. (2015), who also found a significant difference for the source text (in fixation counts and fixation duration in the ST AOI) and no significant difference for the target text except in average fixation duration. Carl et al. (2011) likewise found a significant difference in fixation duration on the source text, which was longer for human translation than for post-editing. Carl et al. (2011) interpreted the larger number of fixations in the TT AOI for post-editing as follows: when post-editing, translators usually first read the raw machine translation output before comparing it with the source text; moreover, after correcting deficiencies in the provided text, they read it again to recheck its correctness. However, although this read-and-check behavior leads to more fixations, the effort needed for correcting is still less than that needed for reformulating and typing the target text when translating from scratch.

In addition to the differences between post-editing and translation from scratch in the cognitive effort required for source text comprehension and target text production, we also want to explore how cognitive effort is allocated between comprehension and production within each task. A comparison of the number of fixations in the ST AOI and the TT AOI for the translation task shows that translators fixate more on the target text area than on the source text area; the same holds for the post-editing task. A comparison of the average fixation durations in the ST AOI and the TT AOI indicates that, for both tasks, average fixation duration on the target text is longer than on the source text. In other words, for both post-editing and translating from scratch, more cognitive effort is allocated to target text production than to source text comprehension. This finding is partially consistent with Koglin (2015), who found that for human translation, fixation duration on the source text was longer than on the target text, while for post-editing it was shorter. The touch-typing ability of the translators may explain this difference: the participants in Koglin's (2015) study were professional translators with years of experience, while the participants in this study are all students, most of whom cannot touch type and may therefore focus more on the target text while typing it. Carl et al. (2011) obtained similar results.

5.4.4 Summary

To sum up, three conclusions can be drawn from the discussion above. First, post-editing saves both time and cognitive effort compared with traditional human translation. Second, in most cases, translators with more translation experience spend less temporal and cognitive effort on post-editing than those with less experience. However, text type may influence the reduction of effort; in other words, some texts are suited to post-editing and some are not. Last but not least, translators spend more cognitive effort on both source text comprehension and target text production in the translation process than in the post-editing process. For both human translation and post-editing, more cognitive effort is allocated to target text production, although in this research that distribution may be ascribed to the poor touch-typing ability of the student translators. It is worth noting that all these conclusions are limited to the Chinese-English language pair and to college English learners.

CHAPTER SIX

CONCLUSION

6.1 Major Findings

In Chapter One, three research questions were raised: (1) For the Chinese-English language pair, how do machine translation post-editing and traditional human translation differ temporally and cognitively? (2) What influence do text type and translator competence have on the cognitive effort spent in post-editing? (3) How does the allocation of cognitive effort to the source text and the target text differ between the human translation process and the post-editing process?

Based on the data analysis and discussion in Chapter Five, the following

conclusions are drawn:

(1) Post-editing of raw machine-translated output saves temporal effort compared with human translation; in other words, translators tend to be faster when post-editing than when translating from scratch. Besides, less cognitive effort is required when translators post-edit a text; the reduction of cognitive effort shows up as fewer fixations and shorter fixation durations on the text.

(2) The reduction in time and cognitive effort is subject to text type and translator competence. Typically, competent translators post-edit faster, and the cognitive effort they spend is less than that spent by less capable translators. Text type also influences the reduction of effort. Since the difficulty level of the different text types in this research is not guaranteed to be the same, no conclusion can be drawn about which text type is most suitable for post-editing. However, the results of the experiment indicate a reduction in cognitive effort when translators post-edit all three text types.

(3) Post-editing saves temporal and cognitive effort because it saves effort in both source text comprehension and target text production. In the post-editing process, translators focus more on the target text, checking and correcting the machine translation, whereas in the translation process they focus more on source text comprehension.

In all, considering the savings in temporal and cognitive effort, post-editing can be a viable alternative to human translation for college English learners.

6.2 Limitations

Although this research has drawn significant conclusions and can inform future research, it still has several limitations.

First, the conclusions are restricted to the Chinese-English language pair. Post-editing is language-specific: it is possible that cognitive effort is reduced for one language pair but increased for another.

Second, the conclusion about text type can only show that the reduction of cognitive effort in post-editing differs among text types. Since the three texts chosen for this research are not guaranteed to be of the same difficulty, no comparison between text types can be made.

Third, the results are limited by the small sample size (although fairly large for post-editing studies) and the short text length (given the restrictions of the eye-tracking method).

Last, as the participants are all students, the results cannot be extended to professional translators. These findings also require validation in further research with more participants and more text types.

6.3 Suggestions for Future Research

Advances in machine translation technology and the global demand for large-scale, rapid information together push research on machine translation and post-editing to the center of translation studies. In the future, more studies concerning the Chinese-English language pair, or the reverse direction, could be conducted. More specific research on the influence of text type on post-editing cognitive effort could also be carried out.

REFERENCES

Allen, Jeffrey. 2003. Post-editing [A]. In Harold Somers (ed.), Computers and

Translation: A Translator’s Guide [C]. Amsterdam: John Benjamins, 297-317.

Aziz, Wilker, Sheila Castilho, & Lucia Specia. 2012. PET: A Tool for Postediting and

Assessing Machine Translation [A]. In Nicoletta Calzolari et al. (eds.),

Proceedings of the 8th International Conference on Language Resources and

Evaluation [C]. European Language Resources Association (ELRA), 3982-3987.

Balling, Laura Winther & Michael Carl. 2014. Production Time across Languages and

Tasks: A Large-Scale Analysis Using the CRITT Translation Process Database

[A]. In John W. Schwieter & Aline Ferreira (eds.), The Development of

Translation Competence: Theories and Methodologies from Psycholinguistics

and Cognitive Science [C]. Newcastle upon Tyne: Cambridge Scholars

Publishing, 239-268.

Bar-Hillel, Yehoshua. 1951. The Present State of Research on Mechanical Translation

[J]. Journal of the Association for Information Science and Technology 2(4):

229-237.

Blatz, John, Erin Fitzgerald, George Foster, et al. 2004. Confidence Estimation for

Machine Translation [J]. Mental Imagery 33: 9-40.

Callison-Burch, Chris, Philipp Koehn, Christof Monz, et al. 2010. Findings of the

2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine

Translation [A]. In Chris Callison-Burch et al. (eds.), Joint Fifth Workshop on

Statistical Machine Translation and MetricsMATR [C]. Association for

Computational Linguistics, 33(2): 17-53.

Callison-Burch, Chris, Philipp Koehn, Lucia Specia, et al. 2012. Findings of the 2012

Joint Workshop on Statistical Machine Translation [A]. In Chris Callison-Burch


et al. (eds.), In Proceedings of the Seventh Workshop on Statistical Machine

Translation [C]. Association for Computational Linguistics, 10-51.

Carl, Michael. 2009. Triangulating Product and Process Data: Quantifying Alignment

Units with Keystroke Data [A]. In Inger M. Mees, Fabio Alves & Susanne

Göpferich (eds.), Methodology, Technology and Innovation in Translation

Process Research [C]. Samfundslitteratur, 225-247.

Carl, Michael, Barbara Dragsted, Jakob Elming, et al. 2011. The Process of

Post-editing: A Pilot Study [A]. In Bernadette Sharp et al. (eds.), Proceedings of

the 8th International NLPSC Workshop. Special Theme: Human-machine

Interaction in Translation [C]. Frederiksberg: Samfundslitteratur, 131-142.

Carl, Michael, Silke Gutermuth & Silvia Hansen-Schirra. 2015. Post-editing Machine

Translation Efficiency, Strategies, and Revision Process in Professional

Translation Settings [A]. In Aline Ferreira & John W. Schwieter (eds.),

Psycholinguistic and Cognitive Inquiries into Translation and Interpreting [C].

John Benjamins Publishing Company, 145-174.

De Almeida, Giselle & Sharon O’Brien. 2010. Analysing Post-editing Performance:

Correlations with Years of Translation Experience [A]. In Proceedings of the

14th Annual Conference of the European Association for Machine Translation

[C]. European Association for Machine Translation.

Doherty, Stephen, Sharon O'Brien & Michael Carl. 2010. Eye Tracking as an MT

Evaluation Technique [J]. Machine Translation 24(1): 1-13.

Doherty, Stephen. 2012. Investigating the Effects of Controlled Language on the

Reading and Comprehension of Machine Translated Texts: A Mixed-Methods

Approach [D]. Dublin City University.

Edmundson, Harold Parkins & David Glen Hays. 1958. Research Methodology for

Machine Translation [J]. Mechanical Translation 5(1): 8-15.

Elming, Jakob, Laura Winther Balling & Michael Carl. 2014. Investigating User

Behaviour in Post-Editing and Translation Using the CASMACAT Workbench


71
[A]. In Sharon O'Brien et al (eds.), Post-editing of Machine Translation:

Processes and Applications [C]. Cambridge Scholars Publishing, 147-169.

Fiederer, Rebecca & Sharon O'Brien. 2009. Quality and Machine Translation: A

Realistic Objective? [J]. The Journal of Specialised Translation 11: 52-74.

García, Ignacio. 2010. Is Machine Translation Ready Yet? [J]. Target 22(1): 7-21.

García, Ignacio. 2011. Translating by Post-Editing: Is It the Way Forward? [J].

Machine Translation 25(3): 217-237.

García, Ignacio. 2012. A Brief History of Postediting and of Research on Postediting

[J]. Revista Anglo Saxonica 3(3): 292-310.

Green, Roy. 1982. The MT Errors Which Cause Most Trouble to Posteditors [J].

Lawson (1986) 101-104.

Green, Spence, Jeffrey Heer & Christopher D. Manning. 2013. The Efficacy of

Human Post-Editing for Language Translation [A]. In Joseph A. Konstan, Ed Chi

& Kristina Höök (eds.), Proceedings of the SIGCHI Conference on Human

Factors in Computing Systems [C]. New York: ACM press, 439-448.

Guerberof, Ana Arenas. 2009. Productivity and Quality in the Post-Editing of Outputs

from Translation Memories and Machine Translation [J]. Localization Focus 7(1):

11-21.

Guerberof, Ana Arenas. 2014. The Role of Professional Experience in Post-Editing

From a Quality and Productivity Perspective [A]. In Sharon O'Brien et al. (eds.),

Post-editing of Machine Translation: Processes and Applications [C]. Cambridge

Scholars Publishing, 51-76.

Holmqvist, Kenneth, Marcus Nyström, Richard Andersson, et al. 2011. Eye Tracking:

A Comprehensive Guide to Methods and Measures [M]. Oxford University Press.

Hu, Chang, Philip Resnik, Yakov Kronrod, et al. 2011. The Value of Monolingual

Crowdsourcing in a Real-World Translation Scenario: Simulation Using Haitian

Creole Emergency SMS Messages [A]. In Chris Callison-Burch et al. (eds.),

Proceedings of the Sixth Workshop on Statistical Machine Translation [C].


72
Stroudsburg PA: Association for Computational Linguistics, 399-404.

Hvelplund, Kristian Tangsgaard. 2011. Allocation of Cognitive Resources in Translation: An Eye-tracking and Key-logging Study [D]. Copenhagen Business School.
Hyönä, Jukka, Jorma Tommola & Anna-Mari Alaja. 1995. Pupil Dilation as a Measure of Processing Load in Simultaneous Interpretation and Other Language Tasks [J]. Quarterly Journal of Experimental Psychology 48A(3): 598-612.
Iqbal, Shamsi T., Piotr D. Adamczyk, Xianjun Sam Zheng, et al. 2005. Towards an Index of Opportunity: Understanding Changes in Mental Workload during Task Execution [A]. In Joseph A. Konstan, Ed Chi & Kristina Höök (eds.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems [C]. New York: ACM Press, 311-320.
Jakobsen, Arnt Lykke & Kristian T. H. Jensen. 2008. Eye Movement Behaviour across Four Different Types of Reading Task [A]. In Susanne Göpferich, Arnt Lykke Jakobsen & Inger M. Mees (eds.), Looking at Eyes: Eye-tracking Studies of Reading and Translation Processing [C]. Copenhagen: Samfundslitteratur Press, 103-124.
Just, Marcel Adam & Patricia A. Carpenter. 1976. Eye Fixations and Cognitive Processes [J]. Cognitive Psychology 8(4): 441-480.
Just, Marcel Adam & Patricia A. Carpenter. 1980. A Theory of Reading: From Eye Fixations to Comprehension [J]. Psychological Review 87(4): 329-354.
Koehn, Philipp. 2010. Enabling Monolingual Translators: Post-editing vs. Options [A]. In Ron Kaplan et al. (eds.), Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics [C]. Stroudsburg, PA: Association for Computational Linguistics, 537-545.
Koglin, Arlene. 2015. An Empirical Investigation of Cognitive Effort Required to Post-Edit Machine Translated Metaphors Compared to the Translation of Metaphors [J]. Translation & Interpreting 7(1): 126-141.
Koponen, Maarit & Leena Salmi. 2015. On the Correctness of Machine Translation: A Machine Translation Post-Editing Task [J]. The Journal of Specialised Translation 23: 118-136.
Koponen, Maarit. 2016. Is Machine Translation Post-editing Worth the Effort? A Survey of Research into Post-editing and Effort [J]. The Journal of Specialised Translation 25: 131-148.
Korpal, Paweł. 2015. Eye-tracking in Translation and Interpreting Studies: The Growing Popularity and Methodological Problems [A]. In Łukasz Bogucki & Mikołaj Deckert (eds.), Accessing Audiovisual Translation [C]. Łódź: Peter Lang, 199-212.
Krings, Hans P. 2001. Repairing Texts: Empirical Investigations of Machine Translation Post-Editing Processes (ed. Geoffrey S. Koby) [M]. Kent, Ohio: Kent State University Press.
Lacruz, Isabel, Gregory M. Shreve & Erik Angelone. 2012. Average Pause Ratio as an Indicator of Cognitive Effort in Post-Editing: A Case Study [A]. In Sharon O'Brien, Michel Simard & Lucia Specia (eds.), Proceedings of the AMTA 2012 Workshop on Post-editing Technology and Practice [C]. Association for Machine Translation, 21-30.
Lacruz, Isabel & Gregory M. Shreve. 2014. Pauses and Cognitive Effort in Post-editing [A]. In Sharon O'Brien et al. (eds.), Post-editing of Machine Translation: Processes and Applications [C]. Cambridge Scholars Publishing, 246-272.
Lavorel, Bernard. 1982. Experience in English-French Post-Editing [J]. Lawson (1986), 105-109.
Lourenço da Silva, Igor A., Márcia Schmaltz, Fabio Alves, et al. 2015. Translating and Post-Editing in the Chinese-Portuguese Language Pair: Insights from an Exploratory Study of Key Logging and Eye Tracking [J]. Translation Spaces 4(1): 144-168.
Mesa-Lao, Bartolomé. 2014. Gaze Behaviour on Source Texts: An Exploratory Study Comparing Translation and Post-editing [A]. In Sharon O'Brien et al. (eds.), Post-editing of Machine Translation: Processes and Applications [C]. Cambridge Scholars Publishing, 219-245.
Mitchell, Linda, Johann Roturier & Sharon O'Brien. 2013. Community-based Post-editing of Machine-translated Content: Monolingual vs. Bilingual [A]. In Sharon O'Brien, Michel Simard & Lucia Specia (eds.), Proceedings of Machine Translation Summit XIV Workshop on Post-Editing Technology and Practice [C]. European Association for Machine Translation, 35-43.
O'Brien, Sharon. 2006. Eye-tracking and Translation Memory Matches [J]. Perspectives: Studies in Translatology 14(3): 185-205.
O'Brien, Sharon. 2008. Processing Fuzzy Matches in Translation Memory Tools: An Eye Tracking Analysis [A]. In Susanne Göpferich, Arnt Lykke Jakobsen & Inger M. Mees (eds.), Looking at Eyes: Eye-tracking Studies of Reading and Translation Processing [C]. Copenhagen: Samfundslitteratur Press, 79-101.
O'Brien, Sharon. 2011. Towards Predicting Post-Editing Productivity [J]. Machine Translation 25(3): 197-215.
Orr, David B. & Victor H. Small. 1967. Comprehensibility of Machine-aided Translations of Russian Scientific Documents [J]. Mechanical Translation and Computational Linguistics 10(1-2): 1-10.
Plitt, Mirko & François Masselot. 2010. A Productivity Test of Statistical Machine Translation Post-Editing in a Typical Localisation Context [J]. The Prague Bulletin of Mathematical Linguistics 93: 7-16.
Rayner, Keith. 1998. Eye Movements in Reading and Information Processing: 20 Years of Research [J]. Psychological Bulletin 124(3): 372-422.
Sjørup, Annette Camilla. 2013. Cognitive Effort in Metaphor Translation: An Eye-tracking and Key-logging Study [D]. Copenhagen Business School.
Slocum, Jonathan. 1985. A Survey of Machine Translation: Its History, Current Status, and Future Prospects [J]. Computational Linguistics 11(1): 1-17.
Smallwood, Jonathan & Jonathan W. Schooler. 2006. The Restless Mind [J]. Psychological Bulletin 132(6): 946-958.
Sousa, Sheila C. M., Wilker Aziz & Lucia Specia. 2011. Assessing the Post-editing Effort for Automatic and Semi-automatic Translations of DVD Subtitles [A]. In Galia Angelova et al. (eds.), Proceedings of the Recent Advances in Natural Language Processing Conference [C]. Hissar, Bulgaria: RANLP 2011 Organising Committee, 97-103.
Specia, Lucia, Marc Turchi, Nicola Cancedda, et al. 2009. Estimating the Sentence-level Quality of Machine Translation Systems [A]. In Lluís Màrquez & Harold Somers (eds.), Proceedings of the 13th Annual Conference of the European Association for Machine Translation [C]. European Association for Machine Translation, 28-37.
Specia, Lucia, Nicola Cancedda & Marc Dymetman. 2010. A Dataset for Assessing Machine Translation Evaluation Metrics [A]. In Proceedings of the Seventh Conference on International Language Resources and Evaluation [C]. European Language Resources Association.
Tatsumi, Midori. 2009. Correlation between Automatic Evaluation Metric Scores, Post-Editing Speed, and Some Other Factors [A]. In Laurie Gerber et al. (eds.), Proceedings of MT Summit XII [C]. Association for Machine Translation, 332-339.
Vasconcellos, Muriel. 1987. A Comparison of MT Post-editing and Traditional Revision [A]. In Proceedings of the 28th Annual Conference of the American Translators Association [C]. Medford, New Jersey: Learned Information, 409-415.
Zhechev, Ventsislav. 2014. Analysing the Post-Editing of Machine Translation at Autodesk [A]. In Sharon O'Brien et al. (eds.), Post-editing of Machine Translation: Processes and Applications [C]. Cambridge Scholars Publishing, 2-13.
Cui Qiliang. 2014. On Post-editing of Machine Translation [J]. Chinese Translators Journal (06): 68-73.
Cui Qiliang & Li Wen. 2015. A Study of Post-editing Error Types: Based on English-Chinese Machine Translation of Scientific Texts [J]. Chinese Science & Technology Translators Journal (04): 19-22.
Feng Quangong & Cui Qiliang. 2016. Post-editing Research: Focus Analysis and Development Trends [J]. Shanghai Journal of Translators (6): 67-89.
Feng Quangong & Zhang Huiyu. 2015. On the Training of Post-editors in the Context of the Global Language Service Industry [J]. Foreign Language World (01): 65-72.
Han Peixin. 1996. IPE: An Intelligent Post-editor [D]. Graduate School of the Chinese Academy of Sciences (Institute of Computing Technology).
Huang Heyan & Chen Zhaoxiong. 1995. The Design and Implementation Algorithms of an Intelligent Post-editor [J]. Journal of Software (03): 129-135.
Huang Jie. 2016. A Study of the Efficiency of English-Chinese Post-editing and Its Influencing Factors [D]. Beijing Foreign Studies University.
Li Mei & Zhu Ximing. 2013. New Explorations in English-Chinese Machine Translation with Automated Post-editing [J]. Chinese Translators Journal (04): 83-87.
Liu Yanmei, Ran Shiyang & Li Defeng. 2013. Eye-tracking in Translation Process Research: Applications and Prospects [J]. Journal of Foreign Languages (05): 59-66.
Luo Jimei & Li Mei. 2012. An Error Analysis of Machine Translation Output [J]. Chinese Translators Journal (05): 84-89.
Wang Shan. 2015. A Feasibility Report on Computer-aided Translation Combined with Post-editing [D]. Central China Normal University.
Wang Juan. 2013. Empirical Studies Abroad on the Translation Process under Computer-aided Conditions [J]. Shanghai Journal of Translators (03): 60-65.
Wang Ping. 2016. The Roles of Pre-editing and Post-editing in Machine Translation of Literary and Historical Texts [D]. Shandong Normal University.
Wei Changhong & Zhang Chunbai. 2007. Post-editing of Machine Translation [J]. Chinese Science & Technology Translators Journal (03): 22-24+9.

APPENDICES

Appendix A: Source Texts

A1

今年前三季度,恒大累计销售金额约达 2805 亿元。累计销售面积及销售均价分别为 3457

万平方米及 8115 元/平方米,比 2015 年分别增长 106.0%以及 5.8%。销售总额排名全行

业第一。

A2

今年前九个月,碧桂园已经实现销售金额 2256.9 亿元,同比上升 43.7%。实现营业收入

1170.5 亿元,同比增长 20.5%。去年,碧桂园实现年内销售额 1629 亿元,领先第二名

企业 117 亿元。

B1

今年我国发展面临的困难更多更大、挑战更为严峻。不过我们有中国特色社会主义制度

和中国人民勤劳智慧,只要我们团结一致,就一定能够实现全年经济社会发展目标。

B2

中国的发展从来都是在应对挑战中前进的。经过多年的快速发展,我国物质基础雄厚,

经济潜力足。改革开放也不断注入新动力。任何艰难险阻都挡不住中国发展的步伐。

C1

时间即生命。没有人不爱惜他的生命,但很少人珍视他的时间。如果想在有生之年做一

点什么事,学一点什么学问,充实自己,使生命成为有意义,那么就不可浪费光阴。

C2

零碎的时间最可宝贵,但是也最容易丢弃。我们的时间往往于不知不觉中被荒废掉。那

些在“度周末”的美名之下把时间大量消耗的人,他是在“杀时间”,也是在杀他自己。

Appendix B: Raw Machine Translation Output

All raw machine translation outputs below were produced by Google Translate on December 7, 2016.

A1

The first three quarters of this year, Hengda accumulated sales of about 280.5 billion yuan.

Cumulative sales area and sales price were 34.57 million square meters and 8115 yuan /

square meters, respectively, compared with 2015 increased by 106.0% and 5.8%. Total sales

ranked first in the industry.

A2

The first nine months of this year, Country Garden has achieved sales of 226.69 billion yuan,

up 43.7%. Operating income of 117.05 billion yuan, an increase of 20.5%. Last year, Country

Garden to achieve sales of 162.9 billion yuan during the year, leading the second enterprise

11.7 billion.

B1

This year China's development is facing more difficulties and challenges. However,

we have the socialist system with Chinese characteristics and the hard-working

wisdom of the Chinese people. As long as we unite as one, we will be able to achieve

the goal of economic and social development throughout the year.

B2

China's development has always been in response to challenges in the forward. After years of

rapid development, China has a solid material foundation and sufficient economic potential.

Reform and opening up also continue to inject new impetus. Any difficulties and obstacles are

unable to stop the pace of development in China.

C1

Time is life. No one does not care for his life, but few people cherish his time. If you

want to do something in your lifetime, learn a little knowledge, enrich yourself, make

life meaningful, then you can not waste time.

C2

The most precious piece of time, but also the most likely to discard. Our time is often

unwittingly abandoned. Those who spend a lot of time under the fame of "weekend" are

"killing time" and killing himself.

Appendix C: Overall Figures of Processing Speed

Processing Speed
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 3.89 5.13 11.01 5.88 10.99 13.27
P2 8.00 9.16 9.04 13.49 15.26 20.60
P3 6.26 9.64 16.29 8.14 19.44 21.35
P4 / 8.14 15.03 / 15.53 29.81
P5 7.46 9.10 13.48 7.15 10.01 20.12
P6 5.48 9.52 12.15 11.06 9.38 21.86
P7 / 9.44 18.55 / 16.08 16.76
Undergraduates P8 11.19 9.71 13.84 13.03 11.41 14.50
P12 3.58 7.32 12.58 7.89 9.22 26.37
P15 6.53 9.02 13.48 9.13 15.71 10.13
P17 6.38 / 13.90 17.39 / 12.58
P18 7.25 / 14.47 17.61 / 16.77
P23 7.94 10.84 16.27 40.21 36.98 45.62
P25 10.67 8.56 15.56 24.09 12.28 22.31
P26 7.53 12.22 14.80 9.30 27.97 13.15
Average 7.09 9.06 14.03 14.18 16.17 20.35
Total 10.06 16.90
P9 17.25 19.83 37.90 37.52 31.58 23.69
P10 18.07 16.49 20.49 31.09 33.01 66.13
P11 6.31 / 16.41 12.26 / 39.34
P13 8.57 9.37 10.94 10.05 16.90 11.02
P14 10.14 10.74 11.22 13.07 29.15 25.25
P16 12.70 14.61 8.87 11.84 30.93 21.26
Postgraduates
P19 8.30 10.96 14.65 16.39 27.13 19.32
P20 8.16 11.85 11.82 8.51 12.07 12.78
P21 11.82 7.24 7.88 6.68 14.68 13.28
P22 / 12.20 17.96 / 22.25 20.57
P24 9.97 12.71 11.82 10.75 11.20 16.01
P27 / 16.33 21.35 / 26.06 48.96

P28 8.55 13.19 18.57 9.48 22.85 20.23
P29 / 8.34 7.87 / 22.25 18.23
P30 / 14.25 15.54 / 14.87 16.13
Average 10.89 12.72 15.55 15.24 22.49 24.81
Total 13.06 20.85

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.
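The summary rows in this table can be reproduced directly from the figures above; a minimal Python sketch (not part of the original study materials), using the undergraduate HT averages as an example:

```python
# Reproduce the summary rows of the processing-speed table.
# The three per-text averages are the undergraduate HT figures from the table.
ht_averages = {"Text-E": 7.09, "Text-P": 9.06, "Text-L": 14.03}

# The "Total" row is the unweighted mean of the three text-type averages.
total = round(sum(ht_averages.values()) / len(ht_averages), 2)
print(total)  # 10.06, matching the "Total" cell for undergraduate HT
```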

Appendix D: Overall Figures of Pupil Dilation

Pupil Dilation
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 2.66 2.67 2.76 2.62 2.64 2.77
P2 2.85 2.94 2.99 2.95 2.91 2.95
P3 2.66 2.71 2.75 2.64 2.59 2.65
P4 / 2.52 2.51 / 2.48 2.46
P5 2.71 2.62 2.64 2.64 2.66 2.58
P6 3.36 3.38 3.41 3.36 3.23 3.23
P7 / 3.31 3.15 / 3.20 3.18
Undergraduate P8 3.45 3.38 3.47 3.50 3.43 3.44
P12 3.00 2.99 2.92 2.91 2.82 2.87
P15 2.86 2.76 2.78 2.67 2.63 2.70
P17 3.03 / 3.07 3.00 / 3.04
P18 3.51 / 3.38 3.28 / 3.28
P23 2.87 2.89 2.80 2.74 2.76 2.72
P25 3.25 3.20 3.25 3.26 3.17 3.20
P26 2.94 2.93 3.00 2.91 2.92 2.99
Average 3.01 2.95 2.99 2.96 2.88 2.94
Total 2.98 2.93
P9 2.70 2.69 2.64 2.53 2.48 2.50
P10 2.97 2.81 2.80 2.78 2.75 2.72
P11 3.80 / 3.54 3.57 / 3.46
P13 3.25 3.25 3.18 3.01 3.12 3.05
P14 3.08 3.10 3.11 3.14 3.05 3.04
P16 3.32 / 2.99 3.09 / 2.92
Postgraduate P19 3.35 3.27 3.16 3.04 3.09 3.10
P20 3.37 3.13 3.27 3.38 3.20 3.21
P21 2.35 2.35 2.35 2.28 2.29 2.30
P22 / 2.88 2.80 / 2.71 2.90
P24 2.83 2.79 2.72 2.87 2.77 2.71
P27 / 3.43 3.38 / 3.33 3.27
P28 3.03 2.94 2.92 2.96 2.92 2.93
P29 / 2.63 2.62 / 2.60 2.67
P30 / 3.09 3.07 / 2.91 2.98
Average 3.09 2.95 2.97 2.97 2.86 2.92
Total 3.01 2.92

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.

Appendix E: Overall Figures of Fixation Counts

Total Fixation Counts


HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 1420 1276 501 1154 650 426
P2 505 603 694 554 485 393
P3 1216 756 459 1090 523 469
P4 / 771 427 / 480 397
P5 914 706 529 930 731 389
P6 939 543 422 554 700 345
P7 / 527 289 / 552 556
Undergraduate P8 580 776 590 477 795 552
P12 2136 1242 814 1271 1238 467
P15 1138 847 631 1119 643 902
P17 781 / 445 580 / 654
P18 899 / 548 607 / 577
P23 822 615 460 223 261 222
P25 585 742 481 333 733 395
P26 850 489 442 856 348 575
Average 983.46 761.00 515.47 749.85 626.08 487.93

Total 753.31 621.29

P9 366 283 178 196 337 390


P10 444 490 395 349 354 227
P11 755 / 346 736 / 203
P13 772 712 621 757 506 510
P14 392 597 661 642 350 424
P16 567 / 992 717 / 523
Postgraduate P19 775 442 362 540 405 425
P20 937 713 680 1050 806 808
P21 712 1497 1499 1547 830 803
P22 / 289 244 / 490 231
P24 705 597 616 830 710 528
P27 / 305 252 / 327 140
P28 872 553 388 900 421 443
P29 / 678 708 / 457 413
P30 / 400 377 / 444 396
Average 663.36 581.23 554.60 751.27 495.15 430.93

Total 599.73 559.12

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.

Fixation Counts in Win 1
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 496 478 154 301 190 424
P2 416 178 145 112 56 46
P3 608 377 178 311 134 141
P4 / 296 239 / 130 75
P5 510 265 207 340 271 101
P6 362 233 185 182 218 130
P7 / 201 111 / 128 157
Undergraduate P8 196 244 196 114 177 165
P12 874 514 347 346 319 158
P15 696 401 413 465 156 377
P17 353 / 236 129 / 195
P18 412 / 287 182 / 225
P23 419 318 277 69 85 87
P25 245 313 205 101 267 128
P26 251 150 91 166 66 119
Average 449.08 305.23 218.07 216.77 169.00 168.53

Total 324.12 184.77

P9 171 143 92 60 105 131
P10 187 181 99 95 58 59
P11 394 / 116 171 / 156
P13 295 304 233 141 100 107
P14 89 274 316 143 94 151
P16 237 / 373 189 / 142
P19 249 149 96 132 89 101
Postgraduate
P20 526 405 315 237 347 323
P21 433 667 1074 484 267 356
P22 / 62 113 / 143 46
P24 237 172 241 177 124 121
P27 / 120 113 / 63 50
P28 425 286 189 147 127 219
P29 / 286 327 / 57 184
P30 / 171 157 / 100 130
Average 294.82 247.69 256.93 179.64 128.77 151.73

Total 266.48 153.38

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.

Fixation Counts in Win 2
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 924 798 347 853 460 2
P2 89 425 549 442 429 347
P3 608 379 281 779 389 328
P4 / 475 188 / 350 322
P5 404 441 322 590 460 288
P6 577 310 237 372 482 215
P7 / 326 178 / 424 399
Undergraduate P8 384 532 394 363 618 387
P12 1262 728 467 925 919 309
P15 442 446 218 654 487 525
P17 428 / 209 451 / 459
P18 487 / 261 425 / 352
P23 403 297 183 154 176 135
P25 340 429 276 232 466 267
P26 599 339 351 690 282 456
Average 534.38 455.77 297.40 533.08 457.08 319.40

Total 429.18 436.52

P9 195 140 86 136 232 259
P10 257 309 296 254 296 168
P11 361 / 230 565 / 47
P13 477 408 388 616 406 403
P14 303 323 345 499 256 273
P16 330 / 619 528 / 381
P19 526 293 266 408 316 324
Postgraduate
P20 411 308 365 813 459 485
P21 279 830 425 1063 563 447
P22 / 227 131 / 347 185
P24 468 425 375 653 586 407
P27 / 185 139 / 264 90
P28 447 267 199 753 294 224
P29 / 392 381 / 400 229
P30 / 229 220 / 344 266
Average 368.55 333.54 297.67 571.64 366.38 279.20

Total 333.25 405.74

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.
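As a consistency check, each participant's fixation counts in Win 1 and Win 2 should sum to the corresponding cells of the total fixation counts table at the start of this appendix; a small Python sketch (not part of the original study materials) with participant P1's rows:

```python
# Cross-check Appendix E: Win 1 and Win 2 fixation counts should add up
# to the total fixation counts. Values are participant P1's six cells
# (HT and MTPE, for Text-E, Text-P and Text-L), copied from the tables.
win1 = [496, 478, 154, 301, 190, 424]
win2 = [924, 798, 347, 853, 460, 2]
total = [1420, 1276, 501, 1154, 650, 426]

assert [a + b for a, b in zip(win1, win2)] == total
print("P1's window counts are consistent with the totals")
```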

Appendix F: Overall Figures of Average Fixation Duration

Average Fixation Duration in ST and TT AOIs


HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 668.17 610.72 706.50 567.41 572.71 717.43
P2 578.83 546.53 512.38 468.07 429.71 414.92
P3 508.04 553.75 550.77 438.86 410.87 416.31
P4 / 588.16 557.54 / 416.13 326.61
P5 499.19 586.87 546.55 451.59 484.76 496.19
P6 582.67 637.07 605.81 556.18 577.60 472.68
P7 / 614.38 519.13 / 461.65 426.24
Undergraduate P8 558.79 532.89 499.28 607.83 444.77 471.47
P12 475.69 434.79 395.29 375.57 354.58 331.11
P15 421.38 429.85 421.05 349.53 377.38 424.49
P17 476.56 / 411.75 363.64 / 438.26
P18 453.55 / 386.88 358.43 / 377.94
P23 539.73 521.20 482.27 364.80 417.48 416.19
P25 601.79 615.09 533.13 479.72 450.43 477.73
P26 550.48 577.19 564.69 446.01 408.72 522.58
Average 531.91 557.58 512.87 448.28 446.68 448.68
Total 534.12 447.88
P9 451.51 565.21 514.04 440.29 355.43 437.91
P10 444.53 486.04 434.22 351.22 343.66 271.05
P11 525.26 / 553.38 403.74 / 392.67
P13 489.17 529.69 472.84 433.11 451.08 691.91
P14 653.11 593.78 545.78 461.86 399.79 393.21
P16 518.83 / 461.73 463.54 / 377.30
Postgraduate
P19 526.70 668.05 601.26 328.17 324.82 457.01
P20 429.65 419.25 408.09 392.57 389.03 363.90
P21 452.52 371.56 352.04 382.66 339.40 363.92
P22 / 570.80 504.98 / 376.43 627.90
P24 489.77 473.75 478.20 408.10 440.64 437.11
P27 / 468.87 442.20 / 434.90 508.30

P28 481.36 533.15 0.00 424.33 410.67 448.24
P29 / 525.76 553.20 / 332.45 388.91
P30 / 480.74 471.17 / 465.97 493.83
Average 496.58 514.36 452.88 408.15 389.56 443.54
Total 487.94 413.75

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.

Average Fixation Duration in ST AOI
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 491.47 397.72 364.57 407.25 326.68 718.92
P2 600.66 495.39 476.00 383.33 302.21 440.83
P3 455.02 395.24 377.38 344.84 381.43 343.70
P4 / 471.56 502.00 / 362.27 287.43
P5 404.15 522.13 454.86 429.42 389.05 420.81
P6 537.68 437.25 448.14 525.64 494.61 351.22
P7 / 473.48 391.23 / 330.41 335.76
Undergraduate P8 468.13 396.07 381.04 430.45 352.47 314.67
P12 440.49 360.28 321.10 335.59 314.57 268.01
P15 400.12 362.45 371.88 315.76 325.78 408.56
P17 492.86 / 356.26 313.57 / 327.95
P18 429.42 / 343.68 309.20 / 305.69
P23 451.09 453.83 443.68 306.35 354.86 368.47
P25 539.72 462.18 469.77 418.66 415.52 351.16
P26 380.94 340.17 322.25 275.59 267.68 252.99
Average 468.60 428.29 401.59 368.90 355.20 366.41

Total 432.82 363.50

P9 391.74 437.55 383.12 331.90 314.76 346.51


P10 398.16 338.91 259.32 263.18 231.10 240.97
P11 516.95 / 357.07 349.30 / 388.40
P13 491.01 476.15 333.86 335.03 306.30 479.16
P14 651.22 452.04 448.94 361.22 335.18 295.54
P16 450.42 / 369.71 367.06 / 327.39
P19 388.62 566.47 448.01 310.89 334.78 348.34
Postgraduate
P20 380.68 371.00 361.90 281.83 313.37 302.05
P21 410.68 362.84 355.14 355.26 302.30 323.61
P22 / 474.35 379.35 / 329.40 391.59
P24 367.67 349.74 387.19 316.84 335.15 400.51
P27 / 363.97 355.31 / 286.37 349.22
P28 432.98 362.23 402.50 378.55 346.22 403.48
P29 / 467.91 454.70 / 282.14 342.18

P30 / 493.27 388.95 / 338.23 368.31
Average 443.65 424.34 379.00 331.91 311.95 353.82

Total 415.66 332.56

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.

Average Fixation Duration in TT AOI
HT MTPE
Text-E Text-P Text-L Text-E Text-P Text-L
P1 763.03 738.30 858.25 623.92 674.33 401.50
P2 476.76 567.95 521.99 489.55 446.35 411.49
P3 561.06 711.43 660.60 476.40 421.01 447.52
P4 / 660.82 628.15 / 436.13 335.74
P5 619.17 625.78 605.50 464.37 541.14 522.62
P6 610.89 787.26 728.88 571.13 615.13 546.12
P7 / 701.25 598.89 / 501.27 461.84
Undergraduate P8 605.06 595.64 558.10 663.54 471.21 538.32
P12 500.07 487.39 450.41 390.53 368.47 363.38
P15 454.85 490.46 514.21 373.54 393.91 435.93
P17 463.13 / 474.41 377.96 / 485.12
P18 473.95 / 434.39 379.51 / 424.13
P23 631.89 593.34 540.68 390.99 447.72 446.95
P25 646.52 726.66 580.19 506.31 470.42 538.40
P26 621.53 682.06 627.55 487.01 441.73 592.93
Average 571.38 643.72 585.48 476.52 479.14 463.47

Total 600.19 473.04

P9 503.92 695.60 654.09 488.10 373.84 484.13


P10 478.28 572.22 492.71 384.15 365.71 281.61
P11 534.33 / 652.40 420.22 / 406.81
P13 488.02 569.59 556.30 455.56 486.73 748.39
P14 653.67 714.02 634.48 490.71 423.51 447.22
P16 567.95 / 517.19 498.08 / 395.90
P19 592.06 719.71 656.57 333.76 322.01 490.89
Postgraduate
P20 492.32 482.69 447.96 424.86 446.22 405.10
P21 517.46 378.57 344.21 395.13 356.99 396.02
P22 / 597.14 613.34 / 395.81 686.66
P24 551.60 523.94 536.68 432.84 462.96 448.00
P27 / 536.91 512.84 / 470.34 596.68
P28 527.36 716.23 649.96 433.27 438.51 492.00
P29 / 567.97 637.75 / 339.62 426.45

P30 / 471.37 529.85 / 503.10 555.18
Average 537.00 580.46 562.42 432.43 414.26 484.07

Total 559.96 443.58

Note: “/” means that the data was unqualified and discarded. For any participant whose data for one task was unqualified, the data for all other tasks was discarded as well.

