Interacting with Computers 17 (2005) 121–146

www.elsevier.com/locate/intcom

Chinese character entry for mobile phones: a longitudinal investigation
Min Lin, Andrew Sears*
UMBC, Information Systems, 1000 Hilltop Circle, Baltimore 21250, USA
Received 7 January 2004; revised 9 August 2004; accepted 27 November 2004
Available online 28 December 2004

Abstract
The increasing popularity of Short Message Services (SMS) in China highlights the need for
effective and efficient methods for entering Chinese text on mobile phones. While stroke-based
methods have potential advantages over pronunciation-based solutions, usability issues have limited
the effectiveness of existing stroke-based methods. One significant usability challenge has been the
ambiguous stroke-to-key mapping rules that are typically employed. We proposed a new solution that
employs a combination of abstract symbols and example strokes to help users map strokes to keys
more effectively. A longitudinal experiment was used to evaluate character entry performance using
both objective and subjective measures for our new design as well as the existing solution. The results
confirmed that the new design allows for improved performance as well as higher satisfaction levels as
compared to the original design. Further, after approximately 1 h of experience with the stroke-based
method, novices were able to enter Chinese text at speeds comparable to that observed with the
pronunciation-based Pinyin method. Results showed that the new design provided users with a better
understanding of the system throughout the study, beginning with their first exposure to the keypad. By
utilizing a combination of abstract representations and concrete examples of the available strokes, the
new design reduced the ambiguity that typically exists regarding stroke-to-key mappings. In this way,
usability was improved without any changes to the underlying technologies. Our results demonstrate
that stroke-based solutions for Chinese character entry can be effective on mobile phones,
providing an alternative for the many individuals who can write Chinese but do not speak the
Mandarin dialect that serves as the basis for Pinyin. The improved solution could also be used with a
traditional numeric keypad to allow one-handed data entry for desktop or mobile computers.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Mobile computers; Mandarin dialect; Pinyin method

* Corresponding author. Tel.: +1 4104553883; fax: +1 4104551531.
E-mail address: asears@umbc.edu (A. Sears).

0953-5438/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.intcom.2004.11.003

1. Introduction

Short Message Service (SMS) extends the mobile phone’s traditional role as an
intermediary, allowing individuals to communicate either verbally or in writing depending
on their specific needs. SMS is an increasingly popular capability with mobile phones,
supporting business, personal, and entertainment-oriented communications. Unfortu-
nately, entering text on mobile phones is a significant challenge given the limited keypad
that is typically available.
Since SMS was first introduced in China, millions of individuals who read and write
Chinese have adopted it as a convenient way of communicating with other people who
have mobile phones. China’s biggest service provider, China Mobile, has estimated that
its customers sent 40 billion text messages in 2002. While the ability to send and
receive text messages in Chinese could prove useful for many people, the actual
experience of entering Chinese characters on mobile devices is not easy (Lin and Sears,
in press). The characteristics of the Chinese language make it a challenge to provide
effective and efficient input solutions using traditional keyboards or keypads initially
designed for English-speaking users. Unlike English, Chinese is an iconographic
language. In written Chinese, the minimum unit is a stroke but the minimum functional
unit is a character. Each stroke is written with a single action while most characters are
composed of two or more strokes. With the exception of several single-stroke characters,
individual strokes have no meaning in isolation while characters represent a complex
interweaving of sound, shape, and meaning. With more than 3000 commonly used
characters, a keyboard with approximately 4000 keys could be a good choice for
professional typists (Archer et al., 1988), but keypads available on mobile devices
provide far fewer options.
Chinese handwriting recognition applications, such as DragonPen™ and PenPower,
running on the Palm operating system highlight another possible solution for mobile
devices. The most attractive feature of such systems is arguably the natural interface.
With incremental recognition, these systems can quickly recognize characters using just
the first few strokes (Matić et al., 2002). However, just like any recognition system,
Chinese handwriting recognition software is not error free. Further, since many people
do not write strokes in the standard way, or in the standard order, many users will be
forced to adapt to the system, which presents additional opportunities for errors. Even
with these challenges, Chinese handwriting systems are gaining popularity for a variety
of applications. At the same time, handwriting systems require both hands for operation,
which leads to difficulties when one hand is occupied or when a person is on the move.
Solutions which allow for one-handed operation remain an important goal, especially
when designing for mobile phones. Entering Chinese characters using a standard 12-key
telephone keypad is a significant challenge that serves as the focus for the current
article.
Currently there are two primary methods for entering Chinese characters on mobile
phones. One is based on the way a character is pronounced while the second is based on
the strokes used to write the character. Pinyin, a pronunciation-based method, is the most
frequently used method for entering Chinese characters when using traditional computers
(Yuan, 1997). To use Pinyin, users must first translate the sound of the Chinese character
into a string of Roman letters (i.e. a Pinyin script). For example, the Chinese character “中”
(i.e. center) is translated into the Pinyin script “zhong.” Unfortunately, mapping from
Chinese characters to Pinyin scripts is a one-to-many process since a single character can
be pronounced in multiple ways depending on the context in which it is being used (i.e.
“ ” is pronounced “zhe” in “ ” while pronounced “zhao” in the word “ ”, resulting in
two different Pinyin scripts for the same character). Next, users must enter the Pinyin
script using any available input technique (e.g. a QWERTY keyboard when using a PC,
Multi-tap when using a mobile phone). In our example, this requires 5 keystrokes on a
QWERTY keyboard but 12 keystrokes when using Multi-tap on a mobile phone. Finally,
the mapping from Pinyin scripts back to Chinese characters is also one-to-many since
multiple Chinese characters can produce exactly the same Pinyin script (i.e. more than ten
different characters including each of the following “ ”, produce the Pinyin
script “zhong”). As a result, after entering the Pinyin script, users must scan through
multiple alternatives to find the desired character. Sometimes, this list of alternatives is
quite long.
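The keystroke counts quoted above for “zhong” can be checked with a short sketch. It assumes the standard letter layout of a 12-key telephone keypad and counts only letter presses; the function names are illustrative and are not part of any phone software. Real Multi-tap input also needs a pause or a “next” key between consecutive letters on the same key (here “o” and “n”), which this count ignores.

```python
# Standard letter layout of a 12-key telephone keypad (assumed here).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# For every letter: (key, number of presses needed under Multi-tap).
PRESSES = {
    letter: (key, position + 1)
    for key, letters in KEYPAD.items()
    for position, letter in enumerate(letters)
}


def multitap_sequence(script):
    """Return the key presses for each letter, e.g. 'z' -> '9999'."""
    return [PRESSES[ch][0] * PRESSES[ch][1] for ch in script.lower()]


def multitap_keystrokes(script):
    """Count letter presses only; same-key pauses are not counted."""
    return sum(PRESSES[ch][1] for ch in script.lower())


if __name__ == "__main__":
    script = "zhong"
    print("QWERTY keystrokes:", len(script))
    print("Multi-tap presses:", multitap_sequence(script))
    print("Multi-tap keystrokes:", multitap_keystrokes(script))
```

Under these assumptions the script is five keystrokes on a QWERTY keyboard and twelve under Multi-tap, before any keystrokes needed to select the character from the list of alternatives.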
Pinyin is popular primarily because it provides a natural mapping between Pinyin
scripts and the keys on the traditional QWERTY keyboard. However, the mapping between
Chinese characters and Pinyin scripts, which is one-to-many in both directions, is far less natural. As discussed below,
there are both cultural and technical problems that limit the effectiveness and accessibility
of Pinyin, especially when using mobile phones.

2. Pronunciation-based methods

Three steps are required when using pronunciation-based methods: translation, code
input, and search among alternatives. The first step involves translating the sound of the
character into some alternative form, such as a Pinyin script using Roman letters. Such
a conversion simplifies the process of entering the code, but Roman letters are foreign
to the Chinese culture. As a result, extra effort is required to learn the Roman letters
and to translate the sound of a Chinese character into Roman letters. As noted by
Sacher et al. (2001), using a foreign script to mediate the input of a native language is
“peculiar.” Furthermore, Pinyin was developed based on Mandarin, a northern dialect
of Chinese. At the same time, there are seven major and more than fifty minor dialects
in use in China. As a result, there are a large number of Chinese people who either
cannot speak Mandarin at all or speak Mandarin in a way that does not match the
pronunciation rules used to create Pinyin. For example, the Hong Kong Census and
Statistics Department reported that in the year 2001 only 0.9% of the population
(excluding those who had lost their vocal functionality or were younger than 5 years
old) spoke Mandarin as their primary language; while 89.2% spoke Cantonese and
5.5% spoke other Chinese dialects (Census and Statistics Department, 2001). Therefore,
individuals who are not native speakers of the Mandarin dialect may experience
significant difficulty translating Chinese characters into their corresponding Pinyin
scripts. Significantly reducing, or eliminating, the errors these individuals experience is
difficult given the challenges involved in changing long-term pronunciation habits.
Even for Mandarin speakers, typing Chinese using the Pinyin method is still not as natural
as typing English is for English-speaking users.
After translating the Chinese character into a Pinyin script, users must enter the script.
When a traditional keyboard is used, entering a script becomes a relatively simple task of
typing the appropriate characters. In contrast, when one uses a mobile phone, entering the
Pinyin script can be a significant challenge. On a typical mobile phone, Roman characters
are entered using the standard 12-key keypad where each key represents three or four
characters. Various techniques exist for solving this problem, including Multi-tap and T9®
(www.t9.com), but multiple studies confirm that basic text entry using mobile phone
keypads tends to be a slow process (James and Reischel, 2001; MacKenzie et al., 2001;
Silfverberg et al., 2000).
Finally, once the Pinyin script is entered, users must select the appropriate Chinese
character. This is a non-trivial problem since the mapping from Pinyin script back to
Chinese characters is one-to-many. Since Pinyin scripts represent the sounds produced
when speaking Chinese characters, and multiple Chinese characters produce the same
sounds, most Pinyin scripts are associated with more than one character. In fact, some
scripts correspond to more than one hundred different characters (Qiao et al., 1990). As a
result, even if the correct Pinyin script is entered without any errors, users must still select
the desired character from what may be a long list of alternatives. This is an error-prone
and time-consuming process. Interestingly, researchers have explored the use of eye-
tracking technologies (Wang et al., 2001) to speed the process of selecting the desired
character during PC-based interactions. While preliminary evaluations indicated that this
approach has the potential to improve interactions, it is unlikely to prove effective for
mobile devices. Widely used word-phrase association techniques can help shorten the list
of alternatives, but this also results in longer scripts which increases the likelihood of
mistakes and may make error correction more time consuming (i.e. more keystrokes for
correction, more difficult to identify errors).

3. Shape-based methods

Shape- or stroke-based solutions have the potential to provide advantages as compared
to pronunciation-based solutions. While pronunciation-based solutions can only be used
by individuals who can accurately translate Chinese into Pinyin scripts, anyone who can
write Chinese should be able to use naturally designed shape-based methods. Some people
who need to enter Chinese text may not know the Pinyin method or even
speak Mandarin, but they do know what a Chinese character looks like. As a result, stroke-based
methods have the potential to be used successfully by more people than
pronunciation-based systems. The mapping between character and shape is one-to-one
in both directions. While a single character may be pronounced in several different ways
depending on the current usage, and multiple characters can produce the same Pinyin
script, each character is written in exactly one way regardless of the context. Further, while
there are two versions of written Chinese (i.e. Simplified and Traditional Chinese) the
stroke set used to write the characters remains the same. Therefore, a single stroke-based
solution could be used for both versions of Chinese. The limited number of strokes and
the one-to-one mapping between Chinese characters and strokes make stroke-based
solutions a promising alternative for Chinese input on mobile phones as well as PCs.
The primary shape-based input method in mainland China is Wubi, which literally
means “five strokes”. Wubi uses radicals instead of strokes as the minimal input unit. A
radical is a structure unit which includes one or more strokes. In Wubi, radicals are
grouped into five major categories, with each category further divided into five sub-
groups. The twenty-five radical categories then are assigned to twenty-five keys (A–Y) on
a standard QWERTY keyboard. Each Chinese character is decomposed into radicals and
entered via a sequence of two to five keystrokes, resulting in the name of the technique:
“five strokes.” With extensive training, a highly skilled typist using the Wubi method is
able to enter characters at up to 200 cpm (characters per minute). The single most
significant challenge for new users is that characters are decomposed in a way that is not
necessarily the norm, making it necessary to learn the Wubi decomposition of each
character in addition to the standard decomposition that is used when writing the character.
This also makes it more likely that users will forget the Wubi decompositions.
It is difficult to implement a shape-based input method designed for traditional
keyboards on a miniature keypad without significant modification. Using Wubi as an
example, re-assigning the twenty-five radical categories to as few as nine keys is a
significant challenge. The advantages demonstrated when using Wubi with traditional
keyboards may not survive the transition and new challenges will be introduced. For
example, users must learn new rules for mapping radicals to keys, limiting the benefits
users may experience due to prior exposure to the Wubi solution.
In addition to the aforementioned Pinyin and Wubi methods, numerous other
pronunciation-based methods (e.g. Shuangpin; Beijing Golden Human Computer Co,
2004), shape-based methods (e.g. Zhengma; ZhongYi Info-tech Ltd, 2004), and mixed
methods (e.g. Ziranma; Beijing Nature Software Ltd, 2000) have been suggested as
faster alternatives. Unfortunately, the usability and efficiency of these methods have not
been described in the literature. Further, all of these methods share a common
characteristic: the way characters are represented is not natural. As a result, they share
the same problems as highlighted above for the Wubi method.
Stroke-based methods should be easy to learn because the decomposition of characters
into basic strokes is well defined and closely matches handwriting methods. An official
standard defines this process. However, the number of unique strokes used to write
Chinese characters still exceeds the number of keys found on typical mobile devices,
resulting in significant design challenges that have hindered the adoption of existing
stroke-based solutions (Lin and Sears, in press).
Major mobile phone manufacturers have implemented different stroke-based systems.
The main difference between these alternative systems is how the strokes are assigned to
the numeric keys. For example, the eZiText® technique from Zi Corporation groups the
strokes into eight sets and assigns each set to a single key. In contrast, the iTap® technique
from Motorola groups the strokes into nine sets, using all nine numeric keys. The
variability in these designer-defined groupings highlights the fact that there is no single,
universally accepted, set of rules for grouping these strokes. As a result, users may or may
not understand the rules that are employed. More importantly, when users fail to associate
each stroke with the proper key, data entry is slow and error prone. This is illustrated by
the data entry rate of only 2.68 cpm, combined with a key-selection error rate of 46%,
reported for first-time users of the iTap® solution (Lin and Sears, in press). While the
instructions do provide information that would allow users to select the proper key for
each stroke, a manual with more than 200 pages is very likely to be ignored by most users
(Nielsen et al., 1986). The technology itself provides few cues as to which key should be
pressed for each stroke. While the graphics on each key (i.e. legends) are designed to be
the interface between the user and system, they convey limited information about the
underlying rules that were used to group the strokes. Effective legends, which provide
insights into these rules, appear to be as critical as the underlying input methodology.

3.1. Current stroke input method

Few studies have focused on understanding the impact legend design can have on the
efficacy of a stroke-based input method. In this paper, we use an existing stroke input
method to examine the effect legend design has on Chinese character entry.
In the current implementation, individual strokes are grouped into nine categories and
assigned to the nine numeric keys on the mobile phone keypad (Fig. 1a). A Chinese character
is entered stroke-by-stroke using the standard, ‘official’ writing order for the individual
strokes. Each stroke is entered by pressing a single numeric key. Typically, it is not
necessary to enter all of the individual strokes before the desired character appears in the
list of alternatives. After each keystroke, a list of characters matching the strokes entered
by the user is updated at the bottom of the screen. To facilitate the selection of the correct
character, the list is sorted based on how frequently each character is used, as with
common word prediction programs. With the word-phrase associations, a list of characters
that frequently follow the most recently entered character is presented after each character
is entered. Take the character “ ” as an example: it is composed of four strokes (i.e. )
entered in that order, which corresponds to keystrokes 5, 3, 8, then 5. Using word-phrase
associations, once “ ” is entered, a list of characters that frequently follow this character is
presented (Fig. 2). An interactive demonstration is available on the web page
http://www.motorola.com/lexicus/iTAP_Demos.html.

Fig. 1. (a) The original keypad design; (b) the new keypad design.

Fig. 2. Demo screens showing how a user may enter the character “ ”. The strokes that have been entered are
displayed in the top-right corner of the display. The list of possible characters is updated at the bottom of the
screen after each stroke is entered.

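The incremental matching described above can be illustrated with a minimal sketch. It is not the commercial implementation: the lexicon, the character labels, the key sequences other than the 5, 3, 8, 5 example, and the frequency values are all hypothetical, and a simple prefix match stands in for whatever matching the phone actually performs.

```python
# Hypothetical lexicon: (character label, key sequence in writing order,
# relative frequency). Only the sequence "5385" comes from the text; every
# other entry is invented for illustration.
LEXICON = [
    ("char_A", "5385", 900),
    ("char_B", "53", 400),
    ("char_C", "5312", 650),
    ("char_D", "58", 300),
    ("char_E", "1253", 800),
]


def candidates(keys_so_far, lexicon=LEXICON, limit=5):
    """Return characters whose key sequence starts with the keys entered so
    far, most frequent first, truncated to the size of the on-screen list."""
    matches = [
        (label, freq)
        for label, sequence, freq in lexicon
        if sequence.startswith(keys_so_far)
    ]
    matches.sort(key=lambda match: -match[1])
    return [label for label, _ in matches[:limit]]


if __name__ == "__main__":
    entered = ""
    for key in "5385":
        entered += key
        print(entered, "->", candidates(entered))
```

With this toy lexicon the target character already heads the list after the first keystroke, which mirrors the observation that users rarely need to enter every stroke before selecting the character.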
Three kinds of key selection errors are observed when novice users complete character-
entry tasks using this stroke input solution. They may simply select the wrong key (e.g.
pressing key 4 for stroke , which is entered using key 5). At times, users separate a single
stroke into two or more pieces and attempt to enter each piece separately. For example,
they may try to enter the stroke using the key sequence 8 5 when the correct solution is a
single press of the key 3. Finally, users may try to enter two or more strokes using a single
key (e.g. pressing key 1 once to enter both and – in the character “ ”).

3.2. A new stroke-based solution

In the current design (Fig. 1a), a single stroke is used as the legend for each key to
represent the corresponding group of strokes. In an earlier study, we had participants use
the current design to input Chinese characters. Our results confirmed that text entry speeds
were problematic, but also revealed that when the desired stroke did not match any of
the legend symbols, the chance of selecting the correct key was only 46% (Lin and Sears,
in press). Clearly, the current design failed to convey the necessary information to allow
users to understand which strokes could be entered using each key. An analysis of the
current stroke groupings confirmed that, for each group, one unique
characteristic could be defined that was present in all of the strokes. Unfortunately, users
were not able to determine how strokes were grouped and errors were common.
Redesigning the legends to convey more information should help improve the text entry
performance without having to change the underlying algorithms. Building on our
observation that there was a unique characteristic that could be used to group strokes, we
developed a set of abstract symbols to reflect and highlight these common characteristics.
We also selected two example strokes to help clarify the diversity of strokes that could be
entered using each key in addition to the common characteristic the strokes share (Fig. 1b).
To begin evaluating our proposed solution, we conducted a simulator-based study with
four alternative designs including the original design, the original design with example
strokes added, our abstract symbols, and our abstract symbols with example strokes added
(Lin and Sears, in press). Through this between-groups study, the potential benefits of both
abstract symbols and examples were confirmed. The results indicated that abstract
symbols alone allowed users to avoid errors that involved mistakenly combining two or
more strokes (i.e. combination errors). In contrast, concrete stroke examples reduced the
number of errors that involved inappropriately separating a single stroke into two or more
fragments (i.e. separation errors). By combining abstract symbols and concrete stroke
examples, text entry performance almost tripled as compared to the original design.
Interestingly, combining abstract symbols and concrete stroke examples not only reduced
separation and combination errors, but also significantly decreased those errors where
users simply selected the wrong key. However, these results were based on limited
interactions with a simulator, not with real phones. The current study extends these results
by having participants complete text entry tasks using real phones over a span of 6 days.

4. Method

We compared the proposed design (abstract symbols with concrete examples) to the
current design used in commercially available mobile phones. Custom phones were
created for the proposed design, allowing users to complete data entry tasks with real
phones instead of simulators. To provide insights into changes that occur in satisfaction,
data entry rates, and the users’ understanding of the underlying technology as they gained
experience, we employed a longitudinal experimental design.

4.1. Participants

Participants had to have been born and raised in China and to have lived in the United States for no
more than four years. Twenty-four volunteers (twelve males and twelve females) from the
Baltimore-Washington area were recruited through a web-based system. Participants were
screened to ensure that they did not have any difficulty in writing Simplified Chinese or
understanding Mandarin Chinese. Study participants did not have any previous experience
using the stroke input method for Motorola mobile phones. Participants were randomly
assigned to two groups while ensuring that each group included six males and six females.
There were no significant differences between the two groups with regard to age or
education (Table 1). Participants were paid $60.00 as compensation for their participation
in this 6-day study.

Table 1
Age and education level of participants

Design     Age (mean/median)   Education level (number of people)
                               Bachelor’s degree   Master’s degree
Original   27.7/28             5                   7
New        26.2/26             7                   5

4.2. Apparatus

Two versions of the Motorola V60 mobile phone were used in this study. The first was
the standard commercial product available in China which uses the original keypad design
(Fig. 1a). The second was the same phone with a custom ‘abstract with examples’ keypad
design (Fig. 1b).
Two versions of a keypad simulator were implemented in Java™, each providing
participants with the same nine keys that are used to enter strokes on the mobile phones.
The first used the same graphics as were available on the commercial phone while the
second used the new graphics available on our custom keypad (Fig. 3). Unlike the phones,
the simulator does not provide any feedback indicating whether the keystrokes entered
were correct or incorrect. Similarly, users are not presented with a list of possible Chinese
characters when using the simulator. The simulator runs on a Gateway Solo Pro 9300
laptop.

4.3. Tasks

A set of 28 two-character Chinese words (Fig. 4) was developed such that every
character could be identified based solely on its pronunciation. These 28 words served as
the foundation for our simulator-based tasks. On the first day of the study, these 28 words
were introduced to participants by having them listen to an audio recording of the words
one at a time. For each word, participants were asked to write down the first character. This
pre-test was designed to ensure that the participants knew how to write each of the 28
characters.

Fig. 3. The two versions of legend design for the Motorola iTap® Chinese input method: (a) Motorola’s original
design and (b) proposed new design.

Fig. 4. The complete list of 28 words used in the pre-test and simulator tasks.

On each of the 6 days of the study, participants entered Chinese characters using the
mobile phone, completed some simulator-based interactions, and responded to a brief
survey. The text entry tasks were designed to provide insights into data entry rates and any
errors that may occur while the simulator-based interactions were used to gain insight into
the users’ understanding of how individual strokes map to keys. The questionnaire was
designed to provide insights into user perceptions of the data entry solution they were
using.
For the text entry tasks, users entered five sentences that were based on headlines from a
Chinese news website. Some headlines were used in their original form, but a few were
modified to ensure that every sentence contained exactly seventeen characters. The
sentences covered five categories: international events, economy, education, technology,
and sports with six sentences in each category providing a total of 30 sentences.
Participants entered five different sentences on each day of the study including one
sentence from each of the five categories. For each sentence, they listened to an audio
recording and then wrote the sentence on a sheet of paper. Next, the experimenter verified
that the sentence was correct or provided any necessary corrections. At that point,
participants entered the sentence using the phone they had been given. Participants were
instructed to balance time and errors as they would when using the phone under realistic
conditions. Over the 6 days of the study, each participant entered a total of 510 Chinese
characters using the mobile phone including 361 unique characters. Data from these
phone-based interactions served as the foundation for all text entry speed and character-
level error data reported below. To enter characters correctly, users had to overcome two
challenges. First, they had to determine the correct order to enter the strokes. Second, they
had to determine which key could be used to enter the desired stroke. Consequently, any
difficulty users encountered determining the proper order to enter the strokes is
incorporated in the entry speed and character-level failure results reported below.
Immediately after completing the text entry task, participants were asked to complete a
brief distracter task. The distracter task required participants to compute the sum of three
numbers ranging from 10 to 99 and was employed to allow the simulator-based
interactions to provide better insights into the users’ understanding of the text entry
mechanism they were using. The purpose of the distracter task was to briefly pull the
participants’ attention away from their interaction with phones, ensuring that the
participants did not continue to mentally review the mappings between strokes and keys
(by requiring them to focus on other, unrelated, information). As a result, their subsequent
responses during the simulator-based tasks provided a more effective representation of
their knowledge of the key-to-stroke mappings used by the system.
Next, the 28 two-character words used during the pre-test were presented, this time on
the simulator. The first characters from these words comprised the set of characters for
which data were collected. This set of characters was designed to include all possible
strokes at least twice, with the exception of two infrequently used strokes that each
exist in only one specific character (“ ” and “ ”, respectively). By design, these characters
were kept simple requiring an average of 5.2 strokes per character. By including every
stroke used to compose all Chinese characters, we believe this set of characters can be
considered representative, and will provide results that can be generalized to address the
challenges involved in entering all Chinese characters. The same audio recordings were
used, but the order in which the 28 words were presented was randomized. After each
word, participants had to select the keys they would use to enter the strokes required to
write the first character. No audio or visual feedback was provided, in order to measure
knowledge gained during their phone-based text entry tasks. Data from these simulator-
based interactions served as the foundation for the stroke-level accuracy results reported
below.
In addition to the objective measures gathered through both phone- and simulator-
based tasks, participants were instructed to complete a brief questionnaire providing
subjective assessments of their experiences interacting with the mobile phones. They were
asked to estimate the number of characters for which they made no errors, ignoring errors
that involved entering the correct strokes in an incorrect order, during the simulator-based
interactions. Participants also responded to questions regarding how easy the technique
was to use, how fast they could enter text, how many errors they made, and how likely they
would be to use this solution in the future if they had the opportunity. These questions all
used a nine-point Likert scale.
Finally, participants completed a post-test button-pressing task after completing all
other activities on their last day participating in this study. Participants entered a series of
24 predefined six-digit numbers as quickly as possible. This post-test was designed to
provide a measure of how quickly participants were able to press the buttons on the mobile
phone with the resulting values, referred to as keystroke speeds, used as covariates in our
statistical analyses.

4.4. Procedures

A pilot study with two participants lasted a total of 8 days. Results from this pilot study
indicated that 6 days would provide sufficient time for initial learning to occur such that
performance would become stable. We also concluded that a five-point scale was not
adequate for the satisfaction questionnaires, resulting in the nine-point scale used in the
actual study.

All participants in the main study completed tasks for 6 consecutive days, beginning on
a Monday and concluding on the following Saturday. A between-groups design was
employed such that each participant used a mobile phone and simulator that employed
either the standard keypad that is available on the commercial product or the new keypad
that uses both abstract symbols and concrete examples. As discussed above, all
participants completed a pre-test on their first day, a post-test on their last day, and a
collection of text entry tasks, simulation interactions, and questionnaires on each day of
the study. The thirty test sentences described above were assigned to each participant in a
random order while satisfying the following three conditions:

(1) In every session, participants entered one sentence from each of the five categories
defined above.
(2) Participants entered each of the thirty sentences exactly once during the study.
(3) Sentences were distributed across the 6 days of participation to ensure that every
sentence was entered by exactly two participants in each group (i.e. original keypad,
new keypad) on each of the 6 days.
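The three conditions above can be met with a simple rotation scheme. The sketch below is only an illustration and is not the procedure the authors used: it assumes six sentences per category (thirty sentences over five categories) and twelve participants per group, and the function name `assign_sentences` is hypothetical.

```python
import random

def assign_sentences(n_participants=12, n_categories=5, per_category=6, seed=0):
    """Return schedule[p][d] = list of (category, sentence) pairs for participant p on day d.

    Assumes the number of days equals the number of sentences per category,
    and that the participant count is a multiple of that number.
    """
    n_days = per_category
    if n_participants % per_category != 0:
        raise ValueError("participants must be a multiple of sentences per category")
    rng = random.Random(seed)
    schedule = [[[] for _ in range(n_days)] for _ in range(n_participants)]
    for cat in range(n_categories):
        # Each starting offset 0..5 is shared by the same number of participants.
        offsets = [p % per_category for p in range(n_participants)]
        rng.shuffle(offsets)
        # A random permutation of day shifts keeps the order unpredictable.
        day_shift = list(range(n_days))
        rng.shuffle(day_shift)
        for p in range(n_participants):
            for d in range(n_days):
                sentence = (offsets[p] + day_shift[d]) % per_category
                schedule[p][d].append((cat, sentence))
    return schedule

schedule = assign_sentences()
print(schedule[0][0])  # the five (category, sentence) pairs for participant 0 on day 1
```

Because each participant's sentence index within a category cycles through all six values across the six days, every sentence is entered exactly once per participant, and each sentence is shared by exactly two of the twelve participants on any given day.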

On the first day, participants received a brief one-page introduction to the input
technique they would be using as well as any relevant navigation operations they would
need to use. They were not given any information about the stroke-to-key mappings. Each
participant then practiced using the input technique for one minute to become familiar
with the basic interactions that are required. After completing the phone-based data entry
task, participants completed the distracter task and took a mandatory 5-min break. Next,
they completed the simulation-based interactions and completed the brief satisfaction
questionnaire.

4.5. Dependent variables

Effectiveness of the two keypad designs was measured with two primary dependent
variables.
Entry speed: This value provides a simple measure of how quickly participants could
enter the required text during the phone-based data entry tasks, measured in characters per
minute (cpm).
Character-level failures: This value represents the percentage of characters entered
incorrectly during the phone-based data entry tasks. Since the focus is on how many of the
characters were entered incorrectly, this was considered the most practical of the accuracy
measurements.
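As a minimal sketch, the two performance measures could be computed as follows. The function names are hypothetical, and the position-by-position comparison of target and entered text is an assumption; the source does not describe how incorrect characters were identified.

```python
def entry_speed_cpm(n_characters, elapsed_seconds):
    """Entry speed in characters per minute (cpm)."""
    return n_characters / (elapsed_seconds / 60.0)

def character_level_failure_rate(target, entered):
    """Percentage of characters entered incorrectly.

    Assumes the entered text has the same length as the target text and
    compares the two position by position.
    """
    if len(target) != len(entered):
        raise ValueError("this sketch assumes equal-length texts")
    wrong = sum(1 for t, e in zip(target, entered) if t != e)
    return 100.0 * wrong / len(target)

print(entry_speed_cpm(13, 100.0))                    # 13 characters in 100 s
print(character_level_failure_rate("abcd", "abxd"))  # one of four characters wrong
```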
In addition to performance measures, we were interested in how well participants
understood the system. Therefore, we also analyzed stroke-level accuracy as an indicator
of how well users understand the stroke-to-key mappings used by the system.
Stroke-level accuracy: This dependent variable is used only for the simulator-
interactions, not the data entry tasks. Since the purpose of the simulator interactions was to
gain insights into how well the participants understood the stroke-to-key mappings, our
focus was on stroke-level interactions or more specifically stroke-to-key mappings.
Stroke-level accuracy rates represent the percentage of strokes for which a participant

Fig. 5. A set of thirty-six unique strokes defined in this study.

selected the one correct key during the simulator-based interactions. There are thirty-six
unique strokes identified in this study (Fig. 5). Thirty-five strokes can be entered using one
and only one key. These strokes are included in this analysis. One stroke (i.e. “ ”) is
ambiguous because in Motorola’s implementation this stroke can be entered using the 6
key (as suggested by the graphic included on this key in Fig. 1a) or the 2 key. Since
selecting the 6 key is the correct answer when entering this stroke, but the 2 key would also
work, it is unclear how to evaluate the accuracy of the users’ model of the stroke-to-key
mapping. Therefore, we intentionally excluded this stroke when selecting the characters
used in the simulator task. Since some strokes are used more than once when entering the
twenty-eight characters, a stroke was only considered correct if the user selected the
correct stroke-to-key mapping for every instance of the stroke during the simulator task.
Since our goal is to assess the accuracy of stroke-to-key mappings, stroke-level accuracy is
assessed while ignoring the order in which the strokes were entered.
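The scoring rule described above, where a stroke counts as correct only if every instance of it was mapped to the correct key, can be sketched as follows. The stroke identifiers and key numbers in the example are placeholders rather than the actual mappings, and the sketch assumes each key press has already been matched to the stroke it was meant to enter.

```python
def stroke_level_accuracy(responses, correct_key):
    """Percentage of distinct strokes mapped correctly on every occurrence.

    responses:   iterable of (stroke_id, selected_key) pairs, in any order
    correct_key: dict mapping each stroke_id to its single correct key
    """
    always_correct = {}
    for stroke, key in responses:
        ok = key == correct_key[stroke]
        # One wrong key for any instance marks the whole stroke as incorrect.
        always_correct[stroke] = always_correct.get(stroke, True) and ok
    return 100.0 * sum(always_correct.values()) / len(always_correct)

# Placeholder data: stroke "B" is mapped correctly once and incorrectly once.
correct_key = {"A": 1, "B": 2, "C": 3}
responses = [("A", 1), ("A", 1), ("B", 2), ("B", 5), ("C", 3)]
print(stroke_level_accuracy(responses, correct_key))  # two of three strokes correct
```

Since the function only looks at which key was chosen for each stroke, the order of entry has no influence on the score, consistent with the definition above.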
Finally, we also evaluated the participants’ subjective ratings of the systems. We
collected measures of overall satisfaction with the solution they used as well as measures
of their interest in using the solution in the future.
Overall Satisfaction: After each trial, participants used a 9-point Likert scale to rate:
how easy it was to input the characters, how quickly they could input the characters, and
acceptability of the error rate. The average of these three ratings was used to assess overall
satisfaction.

Future Usage: Finally, each day the participants responded to a question regarding
their interest in using the input technique in the future. As with the satisfaction questions,
participants responded using a 9-point Likert scale.

5. Hypothesis

In our earlier, simulator-based study, we observed a high error rate with regard to the
participants’ ability to map strokes to individual keys (Lin and Sears, in press). We
expected similar patterns when using mobile phones, but this earlier study also suggested
that the new keypad design should allow for more accurate stroke-to-key mappings. Since
users of the new design should make fewer stroke-level errors and spend less time
correcting those errors that do occur, our first hypothesis (H1) is that the new design will
allow for faster data entry rates.

H1: The new keypad design (group 2) will result in faster data entry speed as compared to
the original keypad design (group 1).

Our participants did not have prior experience with the stroke-based character entry
technique used in the current study. As a result, participants are expected to gain useful
knowledge as they interact with the system. This will include gaining a better
understanding of the stroke-to-key mappings as well as the basic interactions required
to enter characters. H2 asserts that participants will complete tasks more quickly as they
gain experience.

H2: The entry speed will improve across trials for both groups.

Character-level failures only occur when users enter the correct strokes in the wrong order
or when they enter an incorrect set of strokes and these errors are not corrected. Our new
design does not provide any new support for entering strokes in the correct order, but should
reduce the number of stroke-to-key mapping errors. As a result, the number of instances where
character-level failures can occur should be reduced. Therefore H3 suggests that the new
design should result in fewer character-level failures as compared to the current solution.

H3: Group 2 will have a lower character-level failure rate than group 1.

As stated above, our participants begin with no prior experience with the stroke-based
character entry solution used in the current study. As a result, they are expected to gain
knowledge through their interactions, making the correction of errors easier. H4 states that
character-level failures will decrease with experience.

H4: The character-level failure rate will decrease across trials for both groups.

Our earlier simulator-based study demonstrated the benefits of the new design, including
a significant increase in stroke-level accuracy as compared to the current solution. H5 asserts
that these benefits will continue to be evident during the current study.

H5: Group 2 will have higher stroke-level accuracy rates than group 1.

As with H2 and H4, we assert that participants will gain useful knowledge as they interact
with the system and this will result in improved performance with practice.

H6: Stroke-level accuracy will increase across trials for both groups.

H1 and H3 suggest that participants interacting with the new design will complete their
tasks more quickly and more accurately than those using the current solution. H7 suggests
that this improved performance will also translate into higher satisfaction ratings.

H7: Group 2 will have a higher overall satisfaction score than group 1.

H8 asserts that, as performance improves with experience, satisfaction will also improve.

H8: The overall satisfaction score will increase across trials for both groups.

We suggest that interest in using the solution in the future is driven by both performance
and satisfaction. With the new design resulting in better performance and greater
satisfaction, interest in future usage is also expected to be higher.

H9: Group 2 will be more interested in future usage of the system than group 1.

As users gain experience, their understanding of how the system works will increase,
their performance will improve, and their satisfaction will increase. As a result, their desire
to use the system in the future is expected to increase with experience.

H10: The interest in future usage will increase across trials for both groups.

6. Results

To identify and correct any systematic differences between the groups with regard to the
rate at which they were able to press the buttons on the mobile phones, we compared the
keystroke speeds, as measured in the post-test, of the two groups of participants. A between-
groups t-test did not identify a significant difference between the groups [t(22) = 0.09, n.s.],
suggesting that the other speed-based results would be valid.
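For readers who want to see how such a comparison is computed, the sketch below implements a pooled-variance, between-groups t statistic. It assumes equal variances, and the keystroke speeds are made-up placeholder values, not data from this study. With twelve participants per group the degrees of freedom are 22, matching the t(22) reported above.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df

# Placeholder keystroke speeds (keys per second), invented for illustration only.
original_group = [2.1, 2.4, 1.9, 2.6, 2.2, 2.0, 2.5, 2.3, 2.1, 2.4, 2.2, 2.0]
new_group      = [2.2, 2.3, 2.0, 2.5, 2.1, 2.1, 2.4, 2.4, 2.0, 2.3, 2.3, 2.1]
t, df = pooled_t(original_group, new_group)
print(f"t({df}) = {t:.2f}")
```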

6.1. Entry speed

Means and standard deviations for data entry speed achieved when entering Chinese
sentences using the original and new keypad designs are reported in Table 2 and illustrated
in Fig. 6. Using keystroke speed as the covariate, an ANCOVA with repeated measures was
utilized to assess the effects of keypad design and trial on entry speed. The trial and keypad
design both had significant effects on data entry speed [F(5, 105) = 10.204, p < 0.001;
F(1, 21) = 5.241, p < 0.04, respectively]. A significant keystroke speed and character entry

Table 2
Means of character entry speed (unit: character per minute) using original and new legend designs (standard
deviations in parentheses)

Design     Trial 1       Trial 2       Trial 3       Trial 4       Trial 5       Trial 6
Original   2.37 (0.53)   3.96 (0.88)   4.90 (1.37)   5.65 (1.62)   6.54 (2.26)   6.27 (1.93)
New        3.13 (0.93)   5.07 (0.87)   5.95 (1.65)   6.09 (1.29)   7.02 (1.61)   7.80 (1.19)

speed correlation was also observed [F(5, 105) = 5.317, p < 0.001], indicating the natural
relationship where the faster one presses keys, the faster they are able to input characters. No
significant interaction between trial and keypad design was observed [F(5, 105) = 0.224,
n.s.]. As expected, entry speed increased as users gained experience and the new design
resulted in faster data entry rates. H1 and H2 were both supported.

6.2. Character-level failure

Means and standard deviations for the percentage of characters entered incorrectly when
using the original and the new keypad designs are reported in Table 3 and illustrated in
Fig. 7. An ANOVA with repeated measures was utilized to assess the effects of keypad
design and trial on character-level failure rates. Trial and keypad design both had a significant
effect on the percentage of characters entered incorrectly [F(5, 110) = 16.265, p < 0.001;

Fig. 6. Character entry speed for phone entry task (in cpm).

Table 3
Means of character-level failure rate (unit: %) using original and new legend designs (standard deviations in
parentheses)

Design     Trial 1       Trial 2     Trial 3     Trial 4     Trial 5     Trial 6
Original   12.6 (11.9)   9.1 (8.6)   5.4 (6.6)   4.0 (4.3)   2.9 (4.6)   2.8 (4.4)
New        4.0 (2.7)     2.4 (2.6)   0.9 (1.0)   0.7 (0.8)   1.0 (1.1)   0.2 (0.5)

Fig. 7. Percentage of character-level failures.

F(1, 22) = 6.67, p < 0.02, respectively] and a significant interaction between trial and keypad
design was observed [F(5, 110) = 3.831, p < 0.004]. This interaction indicates that the
benefits experienced as a result of increased experience differed between the two groups. As
illustrated in Fig. 7, those participants that used the original design experienced a larger
reduction in errors as they gained experience. This is due, in large part, to the substantially
higher error rates they experienced during the earlier trials which left more room for
improvement. Even with these larger gains, it is important to note that users of the original
design never reached the same level of accuracy as the users working with the new solution.
As expected, character-level failures decreased as users gained experience and the new
keypad design resulted in fewer errors than the original design. H3 and H4 were both
supported.

6.3. Stroke-level accuracy

Means and standard deviations for stroke-level accuracy rates are reported in Table 4
and illustrated in Fig. 8. An ANOVA with repeated measures was utilized to examine
the effects of keypad design and trial on stroke-level accuracy rates. Trial and keypad
design both had a significant effect [F(5, 110) = 18.595, p < 0.001; F(1, 22) = 30.013,
p < 0.001, respectively]. A significant interaction between trial and keypad design was

Table 4
Means of stroke-level accuracy (unit: %) using original and new legend designs (standard deviations in
parentheses)

Design     Trial 1       Trial 2       Trial 3       Trial 4       Trial 5       Trial 6
Original   48.4 (10.8)   53.2 (13.2)   60.0 (14.8)   64.6 (17.1)   63.0 (16.0)   66.0 (15.5)
New        77.5 (9.7)    80.1 (7.9)    83.8 (8.1)    84.0 (8.1)    85.4 (6.3)    85.9 (6.6)

Fig. 8. Stroke-level accuracy (note that trial zero results are from an earlier study (Lin and Sears, in press) where
users just looked at the keypad design and selected keys without any confirmation of correctness).

observed [F(5, 110) = 2.646, p < 0.03]. Again, this interaction highlights the fact that the
two groups of users experienced different benefits as a result of increased experience.
As illustrated in Fig. 8, it appears that users of the new design experienced greater
benefits during the early trials with their accuracy scores leveling off after
approximately three trials. In contrast, users of the original solution continued to
improve throughout the study, but more importantly they never reached the same level
of accuracy as the users of the new design. As expected, stroke-level accuracy
increased as users gained experience, and the new keypad design resulted in higher
stroke-level accuracy rates than the original design. H5 and H6 were both supported.

6.4. Overall satisfaction

Means and standard deviations for the overall satisfaction scores are reported in
Table 5 (see Fig. 9). An ANOVA with repeated measures was utilized to examine the
effects of keypad design and trial. A significant effect for trial was observed [F(5,
110) = 26.243, p < 0.001]. However, the effect of keypad design appeared to be non-
significant [F(1, 22) = 2.376, n.s.]. As expected, participants felt more satisfied as they
used the system longer, no matter which keypad design was used. H8 was supported but
H7 was not supported.

Table 5
Means of overall satisfaction score using original and new legend designs (standard deviations in parentheses)

Design     Trial 1     Trial 2     Trial 3     Trial 4     Trial 5     Trial 6
Original   4.8 (1.8)   4.3 (1.4)   4.1 (1.5)   3.6 (1.5)   3.5 (1.8)   3.1 (1.5)
New        4.1 (1.2)   3.7 (1.5)   3.2 (1.5)   3.0 (1.5)   2.4 (0.9)   2.0 (0.8)

Fig. 9. Subjective perceived difficulty using 9-point scale (lower scores indicate less difficulty).

6.5. Future usage

Means and standard deviations for the future usage scores are reported in Table 6
(see Fig. 10). An ANOVA with repeated measures was utilized to examine the effects of
keypad design and trial. A significant effect for trial was observed [F(5, 110) = 10.695, p <
0.001], but the effect of keypad design was not significant [F(1, 22) = 0.175, n.s.].
Table 6
Means of future usage score using original and new legend designs (standard deviations in parentheses)

Design     Trial 1     Trial 2     Trial 3     Trial 4     Trial 5     Trial 6
Original   3.9 (2.4)   4.3 (2.2)   3.5 (2.2)   3.6 (2.1)   3.4 (1.8)   3.1 (2.1)
New        4.3 (1.8)   3.9 (1.4)   3.3 (1.6)   3.3 (1.6)   2.7 (1.4)   2.1 (1.3)

Fig. 10. Scores of future usage using 9-point scale (higher score indicates less interest).

As expected, participants felt more interested in using the system as they used it longer. H10
was supported, but H9 was not supported.

7. Discussion

The new keypad design significantly increased data entry rates while decreasing the
number of characters entered incorrectly. By switching from the original design to the new
design, data entry speed increased by 32% and 68% of the errors were
eliminated for first-time users. With practice spanning 6 days, the data entry speeds obtained
with the original keypad design increased from 2.37 to 6.27 cpm. In contrast, the new design
resulted in data entry rates of 3.13 cpm in the first trial and 7.80 cpm after the same amount
of practice (Fig. 6). Even after six trials, the new design still allowed for significantly faster
data entry rates, with significantly fewer errors, when compared with the original design.
Further, the new design appears to be promising when compared to pronunciation-based
Pinyin entry method, which has been evaluated in an earlier study, producing a data entry
rate of 5.46 cpm (Lin and Sears, in press). A comparable data entry rate was reached on day
three when our participants used the new keypad design. At that point, the total interaction
time was approximately one hour. After the six trials, about 100 min of use, the entry speed
reached 7.80 cpm showing a 43% improvement compared to the Pinyin method.
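The percentages quoted in this paragraph follow directly from the trial means in Tables 2 and 3 and the Pinyin rate of 5.46 cpm. A short check, using a hypothetical helper named `pct_change`:

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

# Trial 1 entry speed, original vs. new design (Table 2).
speed_gain_first_use = pct_change(2.37, 3.13)
# Trial 1 character-level failure rate, original vs. new design (Table 3).
error_reduction_first_use = -pct_change(12.6, 4.0)
# Trial 6 speed with the new design vs. the Pinyin rate from the earlier study.
gain_over_pinyin = pct_change(5.46, 7.80)

print(round(speed_gain_first_use), round(error_reduction_first_use), round(gain_over_pinyin))
# prints: 32 68 43
```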
Significant interaction between group and trial was observed for character-level errors.
For the group with the original design, the character-level error rate continued to decrease
throughout the study, but began to level off after the fifth trial. In contrast, this error rate
began to level off after only three trials when participants used the new design. Participants
using the original design entered 12.6% of the characters incorrectly in trial one. Only after
four trials did the character-level error rate decrease to the level that was observed in trial
one with the new design (i.e. 4%). In other words, it took the users of the original design
more than an hour to reduce their error rate to the initial level of participants using the new
design (Fig. 7). By that time, the error rate for the participants using the new design was
under 1%. With practice, error rates stabilize at approximately 2.8% (mean of the last two
trials for the group using original design) and 0.7% (mean of the last four trials for the group
using the new design). With 75% fewer errors, users of the new design input Chinese text
with few errors. Not only did the new design provide an immediate benefit, reducing initial
error rates by 68%, but this benefit was maintained throughout the study. Our results
indicated that with approximately an hour of practice, a user with the new design was able to
enter Chinese text at a reasonable speed with negligible errors. Finally, on day three, the
error rates observed with the new keypad design (0.9%) were lower than the error rate
(1.5%) observed for the Pinyin method. Therefore, our data suggest that after three trials, which
are equivalent to about one hour of interaction time, participants using the new keypad
design were able to enter Chinese on a mobile phone faster and more accurately than users
interacting with the standard Pinyin method on a mobile phone.
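The stabilized error rates and the 75% figure in this paragraph can be reproduced from the Table 3 means. The sketch below simply averages the trials named in the text; the variable names are illustrative.

```python
from statistics import mean

# Character-level failure rates (%) by trial, copied from Table 3.
original = [12.6, 9.1, 5.4, 4.0, 2.9, 2.8]
new = [4.0, 2.4, 0.9, 0.7, 1.0, 0.2]

stable_original = mean(original[-2:])  # last two trials, about 2.85
stable_new = mean(new[-4:])            # last four trials, about 0.7
reduction = 100.0 * (1.0 - stable_new / stable_original)

print(f"{stable_original:.2f}% vs {stable_new:.2f}%, {reduction:.0f}% fewer errors")
```

The mean of the last two trials for the original design is 2.85%, which the text reports as approximately 2.8%; either value gives a reduction of about 75%.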
All of these results are affected by the users’ mental model of the system. As mentioned
above, one significant obstacle for stroke-based input solutions is the mapping of strokes
to keys. We suggest that an incorrect user mental model will lead to confusion, erroneous
input, time consuming error correction activities, reduced data entry rates, and frustration.

Stroke-level accuracy was defined as a method of assessing the accuracy of the users’
models of how the system functions. Based on this metric, it is clear that the group with the
new design had a better understanding of how the system worked as compared to the group
with the original design (Fig. 8). Even without interacting with the system, the new design
allowed users to understand about 60% of the mappings of strokes to keys while the original
design resulted in users understanding only 35% of these mappings (see Trial “0”, Fig. 8).
The gap was similar after one trial, but narrowed slightly by the fourth trial. Combining the
character-level error rates with stroke-level accuracy provides additional insights into what
is required for users to achieve nearly error-free performance with a stroke-based input
solution. For the purpose of this study, we define nearly error-free performance as situations
when participants are able to complete their tasks while making no more than one character-
level error in each of two consecutive trials. The average stroke-level accuracy when users
were able to achieve nearly error-free performance was 75%. Interestingly, given these
trends, it is unclear if users of the original design would ever reach such a threshold,
highlighting the possibility that these users may abandon this solution and seek other alternatives such
as Pinyin. In contrast, users of the new design were able to map almost 80% of the strokes
correctly after only one trial and quickly crossed the threshold where they had the potential
for error-free results. This finding supports the efficacy of the stroke-based solution since the
amount of practice required to pass this critical threshold can be completed in approximately
one-half of an hour.
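The definition of nearly error-free performance used above (no more than one character-level error in each of two consecutive trials) can be expressed as a small check over a participant's per-trial error counts. This is a sketch only: the function name and the example counts are hypothetical.

```python
def first_nearly_error_free_trial(errors_per_trial, max_errors=1, run_length=2):
    """Return the 1-based trial that completes the first qualifying run, or None.

    A run qualifies when `run_length` consecutive trials each contain
    at most `max_errors` character-level errors.
    """
    streak = 0
    for trial, errors in enumerate(errors_per_trial, start=1):
        streak = streak + 1 if errors <= max_errors else 0
        if streak >= run_length:
            return trial
    return None

# Invented error counts for two participants over six trials.
print(first_nearly_error_free_trial([4, 2, 1, 0, 1, 0]))  # 4
print(first_nearly_error_free_trial([3, 1, 2, 1, 3, 2]))  # None
```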
Trial had a significant effect for both overall satisfaction and future usage. For both
groups, satisfaction improved and their interest in future usage grew. Since we used 9-point
Likert scales, a score of three or less was considered a positive response and a score of
seven or more was considered a negative response. Figs. 11 and 12 show the number of
positive and negative responses after each trial for overall satisfaction and future usage,
respectively. Although the differences in overall satisfaction and the interest in future usage
were not significant, the data still provide valuable reference points. With experience, the
number of positive responses increased. For each trial, more positive responses were

Fig. 11. Positive and negative responses of overall satisfaction for two designs.

Fig. 12. Positive and negative responses of future usage for two designs.

reported for the new design than the original design. By the end of the study, eleven of
twelve participants using the new design provided positive responses regarding overall
satisfaction, while only two thirds of those using the original design provided positive
responses. None of the participants using the new design provided a negative satisfaction
response, whereas five did for the original design. The results therefore suggest that the
participants using the new design were more satisfied than those using the original design.
For future usage scores, the number of positive responses remained stable for the original
design, but interest grew in the group using the new design. Given the nature of the designs
being compared, and the high likelihood of transfer between conditions if participants used
both solutions, the current study employed a between-group design. As a result, participants
did not share a common point of reference, making comparisons of subjective scores more
difficult.
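The scoring behind Figs. 11 and 12 can be sketched as follows: the overall satisfaction score is the mean of the three 9-point ratings, and a score of three or less counts as positive while a score of seven or more counts as negative. The ratings below are invented for illustration, and the label "neutral" for the middle range is an assumption; the paper does not name that category.

```python
from collections import Counter

def overall_satisfaction(ease, speed, error_rate):
    """Mean of the three 9-point ratings (lower is better on this scale)."""
    return (ease + speed + error_rate) / 3.0

def classify(score):
    """Label a score using the thresholds stated in the text."""
    if score <= 3:
        return "positive"
    if score >= 7:
        return "negative"
    return "neutral"  # label assumed here; not named in the source

# Invented (ease, speed, error_rate) ratings for four participants in one trial.
ratings = [(2, 3, 2), (4, 5, 6), (8, 7, 9), (3, 3, 3)]
tally = Counter(classify(overall_satisfaction(*r)) for r in ratings)
print(dict(tally))
```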

8. Conclusion

This study compared two alternative keypad legend designs. The two designs used
exactly the same underlying Chinese text entry technology, but the new design allowed for
dramatic performance improvements. The entry speed and error rates are significantly
improved when using the new design. Stroke-level accuracy, a measure of the users’
understanding of the stroke-to-key mappings, was also significantly better with the new
design. An earlier study suggested that improved legend design could facilitate the process
of learning the stroke-to-key mappings required for stroke-based Chinese data entry
solutions (Lin and Sears, in press), and the present study solidifies these findings and
provides new insights. Clearly, users of the new design were faster and produced more
accurate results, but the stroke-level accuracy data may provide the most interesting results.
Even before using the system, users exposed to the new design had a better understanding of
how the system worked and this improved understanding was maintained throughout the
entire study. The analysis of stroke-level accuracy in conjunction with character-level errors

strongly suggests that users should be able to understand approximately 75% of the stroke-
to-key mappings to achieve nearly error-free results. This level of understanding was easily
attained with the new design, and we suggest that it is unlikely that users would invest
sufficient time to reach this level of proficiency with the original design. At the same time, it
seems feasible that under realistic usage conditions, users may never achieve nearly error-
free results with the original keypad design.
There is a natural trade-off between providing more informative legends and keeping the
graphics sufficiently simple and small such that they will fit on the increasingly small keys
provided on mobile phones. How to continue decreasing the size of the graphics, while
ensuring that sufficient information is conveyed to users, will continue to be a challenge
meriting further investigation.
The efficacy of this stroke-based solution, implemented on a 12-key keypad, suggests that
similar solutions may prove useful as an alternative in other contexts. For example, this
solution could be implemented using the numeric keypad of a standard PC keyboard for
quick, one-handed, Chinese input. Given the advantage that users do not have to know
character-to-Pinyin translation, stroke-based input solutions might also become attractive
alternative methods for PC users who do not speak Mandarin in the standard way. Even for
the users who are familiar with Pinyin, stroke-based solutions may still be a valuable
alternative. Such a solution could be implemented to allow one-handed input for the strokes
with the second hand dedicated to selecting the correct option from the alternative list. The
entry speed on a PC keyboard using the Pinyin method has been estimated to be about 20 words
per minute by a Microsoft research team (Microsoft Corp, 2003). Such a speed is achieved
by professional users with sentence-level prediction capability. Our study confirmed that
users could enter 7.8 cpm, with practice, using the new keypad design with word-level
prediction. However, our data entry rate was obtained by novice users using a mobile phone
keypad, which provides limited interaction capabilities, suggesting that a comparable
solution that took advantage of a full PC keyboard should allow for faster data entry rates.
Future studies may explore the efficacy of this solution with individuals living in China or
alternative implementations (e.g. for use with a standard PC or laptop computer). Additional
investigations of legend design may continue to focus on providing sufficient information to
support rapid learning while minimizing physical space requirements.
Our new legend design improves the users’ understanding of stroke-to-key mappings,
and thus substantially reduces key selection errors. A second category of errors, which occur
when the correct strokes are entered in the wrong order, was not explored in the current
study. For each Chinese character, there is a standard order in which the strokes should be
written. Existing stroke-based solutions, including the one used in the current study, use this
stroke order information to help disambiguate characters that are created using the same
strokes. Unfortunately, people do not necessarily know the correct order for every character,
or they may simply write the strokes in a different order because they find it more
convenient. This issue is beyond the scope of the current investigation because legends do
not deliver any information regarding the order in which strokes should be entered.
However, it is important to note that this type of error is not uncommon, making this an
important topic for future inquiry. More specifically, future studies may investigate the
specific order-related errors that occur. The challenge is to allow for greater flexibility with

regard to the order in which strokes are entered without simultaneously increasing the number of
alternative characters users must choose from.
While this study focused on keypad design, the results can generalize to other human–
computer interaction design scenarios. More specifically, our results should prove useful
whenever users are interacting with technologies that utilize multiple many-to-one
mappings. In these situations, users must learn the mappings to interact effectively with
the system. Our results confirm that carefully designed legends, which highlight the rules
that define the mappings, can allow users to develop more accurate and effective models of
how the system functions. More specifically, we found that it was useful to highlight the key
characteristics that cause items to be grouped together while simultaneously emphasizing
how diverse the items are that are mapped into each group.
There are many directions for future research that would build on the results reported
here. First, the participants for the current study were recruited from within the United
States. Though all participants were native Chinese-speakers, they were bilingual and living
in a non-Chinese environment at the time of the study. To begin addressing the question of
generalizability, we replicated this study in Beijing, China with native Chinese participants.
Analysis of the resulting data is encouraging. Both time and accuracy measures exhibit the
same patterns as those described above with the new solution resulting in a substantial
increase in data entry rates and a dramatic reduction in error rates during the first trial with
differences in performance becoming smaller with practice. For example, as highlighted in
Table 7, the data entry rates reported above are very close to those observed during our
follow-up study in Beijing as are the rates at which participants improved (i.e. % increase
between day 1 and day 6).
Additional studies with other groups of users may prove interesting. For example, future
studies comparing stroke- and pronunciation-based solutions may seek to recruit participants
who write Chinese, but do not speak Mandarin (which serves as the foundation for the
Pinyin technique). The specific design evaluated in this study was carefully constructed to
provide users with information that was considered important as they determined which key
to use for each stroke they had to enter. We believe that the current study validated our
approach (e.g. abstract symbol with example strokes), and demonstrated the efficacy of the
proposed solution, but future studies that explore alternative solutions (e.g. different abstract
symbols or different example strokes) could prove useful.

Table 7
Comparison of data entry rates, and rates of improvement, for the current US-based study and
our follow-up study in Beijing, China

                            Current study (US)      Follow-up study (China)
                            Original    New         Original    New
Data entry rate on day 1    2.4         3.1         2.5         3.0
% Improvement by day 6      64.6        49.2        66.8        55.9

Evaluating new solutions, such as the one presented in this paper, can pose significant
challenges. For example, once participants interact with one of the keypads, they gain
sufficient knowledge that they can no longer have an equivalent experience with the second
solution. As a result, a between-groups design was employed to eliminate any transfer
effects. Participants therefore had to make absolute rather than comparative judgments when
providing responses to the questionnaires, making the identification of significant differences
for the subjective responses more difficult. Since users tend to blame themselves rather than
the system when experiencing difficulties completing tasks that appear simple (Norman, 1988),
it is possible that participants may have been reluctant to provide negative evaluations even
when their experiences were less than encouraging.
Future research could also explore the application of the abstract-with-example design
paradigm to more general information organization tasks. For example, current file systems
all employ tree-like representations, but organizing files and folders continues to prove
challenging. If a folder simply represents a collection of files, this becomes yet another
many-to-one mapping. Can the abstract-with-example concept be used to define more
effective textual labels for folders? Could customized icons based on this concept provide
more effective hints as to which files are located in each folder? There are many situations
where users must learn many-to-one mappings to interact with information technologies.
Can a design such as ours, which utilizes an abstract-with-example design paradigm, provide
a generic and effective mechanism to help users learn these mappings?

Acknowledgements

This research was funded by Motorola, Inc. We gratefully acknowledge their support that
made this research possible as well as feedback on the experimental design and earlier
versions of this document. We would also like to thank Chris M. Law as well as the
anonymous reviewers for thoughtful comments which led to many improvements to this
paper.

References

Archer, N.P., Chan, M.W.L., Huang, S.J., Liu, R.T., 1988. A Chinese–English microcomputer system.
Communications of the ACM 31 (8), 977–982.
Beijing Golden Human Computer Co., 2004. Shuangpin Input Method—Introduction of Common Input Methods.
http://www.hongen.com/pc/pcketang/ime/imejssp1.htm. Last retrieved: 4/15/2004.
Beijing Nature Software Ltd., 2001. A Brief Introduction of Ziranma Input System 2000.
http://www.zrm.com.cn/intro.htm. Last retrieved: 4/15/2004.
Census and Statistics Department, Hong Kong Special Administration Region, PRC, 2001. Population of
Five-year-old and Above, Categorized by Primary Speaking Language in year 1991, 1996, and 2001.
http://www.info.gov.hk/censtatd/chinese/hkstat/fas/01c/cd0062001c_text.htm. Last retrieved: 4/15/2004.
James, C.L., Reischel, K.M., 2001. Text Input for Mobile Devices: Comparing Model Prediction to Actual
Performance. Proceedings of SIGCHI ’01, 365–371.
Lin, M., Sears, A., in press. Constructing Chinese characters: Keypad design for mobile phones. Behaviour and
Information Technology.
MacKenzie, I.S., Kober, H., Smith, D., Johns, T., Skepner, E., 2001. LetterWise: Prefix-based Disambiguation for
Mobile Text Input. Proceedings of UIST 2001, 111–120.
Matić, N.P., Platt, J.C., Wang, T., 2002. QuickStroke: an incremental on-line Chinese handwriting recognition
system. Proceedings of 16th International Conference Pattern Recognition, IEEE Press, vol. 3, pp. 435–439.
Microsoft Corp., 2003. Six Thousand Keys. http://research.microsoft.com/displayArticle.aspx?id=168. Last
retrieved: 4/15/2004.
Nielsen, J., Mack, R.L., Bergendorff, K.H., Grischkowsky, N.L., 1986. Integrated software usage in the professional
work environment: evidence from questionnaires and interviews, Proceedings of the Conference on Human
Factors and Computing Systems, 162. ACM Press.
Norman, D.A., 1988. The Psychology of Everyday Things. Basic Books Inc, New York.
Qiao, J., Qiao, Y., Qiao, S., 1990. Six-digit coding method. Communications of the ACM 33 (5), 491–494.
Sacher, H., Tng, T.-H., Loudon, G., 2001. Beyond translation: approaches to interactive products for Chinese
consumers. International Journal of Human–Computer Interaction 13 (1), 41–51.
Silfverberg, M., MacKenzie, I.S., Korhonen, P., 2000. Predicting text entry speed on mobile phones, Proceedings of
CHI 2000. ACM Press, pp. 9–16.
Wang, J., Zhai, S., Su, H., 2001. Chinese input with keyboard and eye-tracking—an anatomical study, Proceedings
of SIGCHI’01. ACM Press, pp. 349–356.
Yuan, C., 1997. Chinese Language Processing. Shanghai Education Publishing Company.
ZhongYi Info-tech Ltd, 2004. Zhengma Input Method. http://www.china-e.com.cn/zhcode. Last retrieved:
4/15/2004.
