
GETTING STARTED GUIDE: INTERNATIONALIZATION

Internationalization

This guide is an introduction to internationalization: what it is, why we do it, and how it is done.

When I talk to people encountering this term for the first time, I tell them about my cookie recipe. I will be the first to tell you that I am not a cook, but I do have a cookie recipe that is a very nice combination of butter and sugar and flour. What does this have to do with internationalization? Well, I can make this cookie batter into a wide variety of cookies. I can add oatmeal and raisins or chocolate chips or cinnamon and nutmeg. The results are (almost) always tasty cookies, but they are tailored to the preferences of the recipients. I think it is because I started with a good quality item that has been carefully designed to allow for many “localizations.”

Are you hungry for more? Here is what we’ve included in this guide to help you get started.

Many people think of software when they think of internationalization. But Tracy Russell takes us beyond that to give us a description of important internationalization principles that apply to content and design.

To someone new to internationalization, the subject of Unicode can easily be misunderstood. And for good reason: the word is misused in many ways. Richard Gillam, who wrote Unicode Demystified, has written an introduction to the topic that explains exactly what Unicode is and why its misuses are incorrect.

In addition to Unicode, some misunderstandings about internationalization in general persist. Andrea S. Vine serves up a dozen of these misconceptions and explains just what is wrong with each and why.

Most programmers have probably written a “Hello, World” program to learn a new programming environment. Donald A. DePalma takes a delightful look at the classic beginners’ program using a short Java fragment and shows us just how many ways it can fail the internationalization test.

Bill Hall, author of Globalization Handbook for the Microsoft .NET Platform (available at http://www.multilingual.com/eBooks), outlines various questions to be considered when designing software for a global audience. He then provides valuable information for project managers and programmers alike. His sidebar “Some Principles for Internationalization” is a worthwhile resource for beginners and experienced programmers alike.

Donna Parrish, Publisher

MultiLingual Computing & Technology

Editor-in-Chief, Publisher: Donna Parrish
Managing Editor: Laurel Wagers
Translation Department Editor: Jim Healey
Copy Editor: Cecilia Spence
Research: David Shadbolt
News: Kendra Gray, Becky Bennett
Illustrator: Doug Jones
Production: Sandy Compton
Cover photograph courtesy of Seattle Public Library

Editorial Board: Jeff Allen, Henri Broekmate, Bill Hall, Andres Heuberger, Chris Langewis, Ken Lunde, John O’Conner, Mandy Pet, Reinhard Schäler

Advertising Director: Jennifer Del Carlo
Advertising: Kevin Watson, Bonnie Merrell
Webmaster: Aric Spence
Assistants: Shannon Abromeit, Zabrielle Dillon

Advertising: advertising@multilingual.com, http://www.multilingual.com/advertising, 208-263-8178
Subscriptions, customer service, back issues: subscriptions@multilingual.com, http://www.multilingual.com/subscribe
Submissions: editor@multilingual.com. Editorial guidelines are available at http://www.multilingual.com/editorialWriter
Reprints: reprints@multilingual.com

This guide is published as a supplement to MultiLingual Computing & Technology, the magazine about language technology, localization, web globalization and international software development.
Authors

DONALD A. DEPALMA is cofounder and president of Common Sense Advisory and author of Business Without Borders: A Strategic Guide to Global Marketing. He can be reached at don@commonsenseadvisory.com

RICHARD GILLAM is a senior software developer at Language Analysis Systems and author of Unicode Demystified: A Practical Programmer’s Guide to the Encoding Standard. He can be reached at rtgillam@rtgillam.cnc.net

BILL HALL is a writer, teacher and consultant in internationalization, currently at Adobe Systems, and author of Globalization Handbook for the Microsoft .NET Platform. He can be reached at billhall@mlmassoc.com

TRACY RUSSELL is publishing services manager at the localization firm Wordbank. She can be reached at tracey_russell@wordbank.com

ANDREA S. VINE is an internationalization architect at Sun Microsystems and writes a blog at http://blogs.sun.com/i18ngal. She can be reached at andrea.vine@sun.com

MultiLingual Computing, Inc.
319 North First Avenue, Suite 2
Sandpoint, Idaho 83864-1495 USA
208-263-8178 • Fax: 208-263-6310
info@multilingual.com
http://www.multilingual.com

Get Ready to Go International
Tracy Russell

There is an old joke in which a traveler stops to ask for directions. The old man scratches his head and says, “Well, if I were you, I wouldn’t start from here.” Unfortunately, we often feel like saying this to some of our clients who present us with projects for localization that have clearly not been conceived with any understanding of the concept of internationalization.

So what exactly is internationalization in the context of localization? It is the process of engineering a product or developing a service so that it can be easily and efficiently localized without having to be rewritten, redesigned or reengineered to cope with different languages and regions.

In this introductory guide, we offer some guidelines for clients whose products and services will be marketed beyond their domestic market, and, as marketing communications specialists, we focus on the key elements of international communication: content and design. You will notice that a common thread running through all our advice is the need to consider internationalization earlier rather than later in the development of marketing materials to avoid unnecessary costs and delays.

How to say what you mean and mean what you say in any language. The golden rule of creating source text that will work effectively in any language and market is to keep it clear and simple and to avoid as many cultural references as possible. The source text should be well written, unambiguous and grammatically correct. It should conform to any in-house corporate guidelines for terminology and style to reinforce corporate branding but should also be acceptable to local markets from an idiomatic perspective.

The internationalization guidelines below for the creation of content are not mandatory, but they will help to ensure that the source text can be used internationally, will minimize localization cost and time, and allow the user to read and understand the text easily.

• Keep copy short and succinct.
• Write clearly and unambiguously.
• Decide upon an appropriate tone of voice and register for the target audience and stick to it.
• Develop and approve key messages and terminology first.
• Avoid clichés, cultural references and jargon because they are difficult to translate effectively.
• Do not use “street” language or words and phrases that will only be used by a minority of your target audience.
• Either avoid abbreviations and acronyms or write the terms out in full before using the abbreviations and acronyms.
• Avoid names based on abbreviations. Even when abbreviations are universally recognized, they can present pronunciation problems for different cultures.
• Avoid metaphors or names based on images. A bull market or a groundhog day will be meaningless to many cultures.
• Be aware that humor often does not travel beyond its culture of origin and can be very expensive to adapt.

Think International Before You Get Creative

Since the globalization process is often based on the adaptation of copy and design from an original marketing tool such as an English language website or an advertising campaign, the way in which the original design is created has a substantial impact on localization. When designing for an international marketplace, you have to consider both the cultural and the technical implications, that is, the suitability of the design for local markets and the suitability of the design for the localization process.

Designing for local markets is about considering how the message will be received. Is there any danger that the design could be regarded as culturally sensitive in any current or future international markets? Do the visual elements create a positive impression in these markets? Does the design communicate the intended meaning?

Designing for the localization process is about understanding the technicalities of design and how they can either promote or hinder the localization process. The agency responsible for localizing the design will be strongly reliant on the technical and visual design of the original in order to produce a consistent set of localized versions. The speed, efficiency and cost of design localization will also depend on whether the design has been fully internationalized and, therefore, does not require time-consuming language-specific manipulation.

Let us look at the two main cultural issues related to designing for international markets: color and imagery.

The color purple: death or royalty? Color can have a strong positive or negative representation in all cultures. Understanding the impact of color will help with the design, enabling you to emphasize or de-emphasize corporate colors for a global audience.

The color black, for example, signifies death in the West, but in China the color of death is white. Purple signifies bravery and royalty in the West, but is the color of mourning in Brazil. Red is commonly associated with danger in the West but is associated with weddings in China. Green and light blue are regarded as sacred colors in the Middle East, and saffron yellow is a sacred color for Buddhists.

This is not to say that sensitive colors cannot be used in designs for a global audience. It is useful, however, to consider the impact of color choice in the context of a multicultural target market at the earliest stages of the design process.

Beware of the dog. You also need to be aware of the suitability of images for a global audience and be prepared to offer different images depending on the target market. Examples of the types of images that can cause difficulties include people, animals, flags and icons.

People. Many cultures are extremely sensitive to ethnicity, dress and poses, particularly relating to women. A recent poster campaign for Lux, featuring Sarah Jessica Parker in a sleeveless dress, had to be hastily airbrushed to cover her arms for the Israeli market.

Animals. They conjure up different images in different cultures. Dogs are generally considered to be man’s best friend in the West, but Arab cultures find them “unclean” and offensive.

Flags. Flags are always best avoided because they are more political than cultural and do not clearly represent a specific language. Which language is represented, for instance, by the Swiss or Belgian flag?

Icons. Common cultural references such as mailboxes, rubbish bins and phone boxes are often used in website designs but are unlikely to be universally understood as each country has a different design.

Designing for the Localization Process

Once the text and design are culturally suitable, the next challenge is to ensure that the design is internationalized from a technical perspective. There are many issues to consider here, and, again, a key piece of advice is to consider all of these before the localization process begins, particularly if the design is for a multilingual website.

Leave plenty of room for text expansion. No two languages take up the same amount of space when laid out in a design. Individual words can expand by up to 300%, and design elements such as a text box can sometimes take up twice as much space as the English source. But one paragraph in a document might expand by 30%, and the next may not expand at all. Some languages such as Russian can expand up to 70%. Others such as Hebrew and Asia-Pacific languages may contract and take up less space.

The behavior of localized text thus provides a significant challenge in producing a design that can be internationalized and support a wide range of target languages.

In order to accommodate text expansion, the layout and design of the original must either allow space for such expansion to occur or for elements to be moved. Generally, text expansion is handled by expanding into empty areas of the page or by reducing the size, leading and tracking of the text (by as little as possible). In some cases, however, more deliberate action must be taken.

For example, headlines often do not translate easily, and five words can become eight or ten. Point size reduction is often the only option on a design where text expansion has not been taken into consideration. Interesting alignments and typographical emphasis on the different elements of a headline can, as a result, be difficult to reproduce.

Captions should not be crammed too tightly on a page, either to a graphic or to each other. A heavily labelled diagram needs plenty of space for text expansion.

Tables and forms are difficult to handle because of the use of unalterable colored backgrounds and lines that make expansion impossible.

Select your languages before you choose your fonts. Non-Western languages require different typefaces in order to accommodate the extra characters not supported by standard fonts. It is, therefore, a good idea to consider what languages will be required at the earliest possible stage of design before you decide on which font you are going to use. If you want to use a particular font across all your communications, remember that the font should be widely available in all mediums. For website localization, Arial and Times New Roman are probably the safest bets. If a browser cannot display the correct font, the result will be the nearest the browser can find on the machine, which may compromise the design.

Beware the use of corporate fonts. Unless they have been developed by large organizations with large budgets, they do not tend to support non-English characters. Font design and creation are highly specialized and expensive processes.

Where a font is used that does not support localized characters, the only option is to find the closest available match for the target character set, which means that your agencies and localization companies must also possess a copy of the chosen font. It is possible to produce customized versions of Western European fonts for non-Western languages, but it means that extra time will have to be built into the project. In short, failing to carefully consider font selection at the design stage can add time and cost to a project.

Designing for bidirectional and double-byte languages. Bidirectional languages such as Arabic and Hebrew may require a full re-working, as they must read from right to left. This means the production of reversed artwork and possibly a change of graphics. Consultation at the earliest possible stage of design will assist with speedy delivery of localized Hebrew and Arabic versions.

While some publishing packages readily support the direct input of Asia-Pacific character sets such as Chinese, Japanese and Korean, localized design can be produced so that the file can be run through the normal printing process without the need for specialist software. Again, our advice is to consult with the experts at the earliest possible stage of design to ensure the speedy delivery of localized Asia-Pacific versions.

Ask your printer for advice. It is always worth consulting your printer before finalizing the design, particularly when applying the same design across multiple language outputs. Printing costs can be greatly reduced by the use of a “fifth plate,” a technique that is used extensively in the packaging world, where an initial large print run containing all the color elements (photos/graphics) is produced. This “blank” printed output is then overprinted with the text elements in a smaller, language-specific print run, which allows for more print-on-demand flexibility. This technique only works where the language-specific elements can be printed in a separate color (or colors) without affecting the rest of the design. This typically means that the text is black.

Avoid turning text into graphics. The guiding principle is to avoid putting translatable text elements in separate graphics files unnecessarily, whether for online or offline communication. Once embedded into a graphics file, it is a manual process to extract the text, and this has to be done separately from the main text extraction. A better approach when designing for print is to place text in frames laid over the graphic. This means that text can be extracted in one simple process. Less switching between programs is needed during the typesetting process, and there are fewer files to deliver once localization is completed. It is easier to accommodate different graphics for different language versions as appropriate without involving translators and typesetters in the process.

Burning text into graphics for online communication means that it is displayed in a certain way and cannot be adjusted by user screen resolution preferences or the re-sizing of a window. This might be desirable from a design perspective, but it makes the localization process longer and more complex because there is no efficient way of automating text extraction and re-insertion into graphics, and you will probably need to involve a DTP/web graphics specialist. The same look and feel can often be achieved using plain HTML, particularly with the use of Cascading Style Sheets, and this results in a far more localization-friendly design.

Embedding text in graphics also locks your design into one that is only suitable for the English language. If text is already embedded, ensure that the design allows for language expansion. Otherwise, the only option for localized versions is to reduce the point size to make the text fit, and this can affect the legibility of the content.

Ready for Takeoff?

Preparing for international departures is not difficult, but it does involve getting a grip on a number of cultural issues before you even brief your creative agency. You need to work with a partner who understands internationalization and localization; otherwise, you could make costly mistakes. Never underestimate the potential sensitivity of any ethnic, religious or cultural group to what you say and to how you present your business visually.

As any business knows, a reputation can take a lifetime to earn and a moment to destroy. So why jeopardize your chances of international success just because your agency didn’t know that a beautiful young Chinese woman dressed in white is more likely to be on her way to a funeral than to her wedding? Ω

‘Hello, World’ as an Internationalization Wake-up Call
Donald A. DePalma

While eXtreme programming makes the headlines, the reality is that developers code by example, precedent or plagiarism. I have long contended that only one COBOL application was ever written from scratch, and every COBOL programmer since Admiral Grace Hopper merely cut and pasted her code. What happens when a new programmer fires up his or her favorite interactive development environment to build some code for international deployment? He or she will likely copy some code that he or she wrote before and adapt it to the project at hand.

Let’s leave the ancien régime of COBOL behind with an easy example, using the first code written by C-savvy programmers new to a language or development tool. The code most likely to be written displays the text “Hello, world.” Simple as they are, these few lines do not travel well beyond English. Consider the following Java fragment:

    package mine;
    public class Greetings
    {
        public static void main(String argv[]) throws Exception
        {
            MyDebug.trace("main");
            Date today = new Date();
            System.out.println(
                "Hello, world! Today's date is " + today.toString());
        }
    }

Experts will tell you that Java is internationalized right out of the box. While I’m not an expert on Java, I used to play one back in the 1990s when, as an analyst, I scored lots of coffee, coffee cups and other coffee paraphernalia from Java development tool vendors anxious to convince analysts that their Java was higher octane than their competitors’. I wear my coffee beans on my sleeve.

This short Java example demonstrates a number of problems. Some are merely bad hygiene, but others could have a costly impact on a development project. All the errors demonstrate that code which should be a personal pact between coder and computer can frequently appear before an end user’s wondering eyes, showcasing what we call the “code-content interdependence.” Now I’ll put on my dusty programmer hat and take a trip through my code sample.

Multiple Errors

Undeclared variable. Whoops! This is a simple matter of coder hygiene. I did not declare MyDebug. My bad. I learned Pascal from Andy van Dam. It’s no wonder he gave me an incomplete in that course way back in 1979. I do remember him saying something about programming not being the path that a talented linguist such as I should take. Did he diss me or what? Well, Brown University can forget about this alumnus donating any money when they come begging for their annual fund this year!

Locale-specific strings and code. Both
“Hello, world” and the code in which it is embedded are specific to English-speaking locales. I know that best practices dictate that I should isolate locale-specific items such as text, icons and formats in external, localizable resource files. Oh well, it’s not as if anyone will use this code in any deployed application, so what’s the harm of doing it just this once? I can always fix it later when I have more time, or whoever borrows this code can do it in his or her copious spare time.
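
For readers who want to see what isolating the text in a localizable resource might look like, here is a minimal sketch. It is an illustration under stated assumptions, not the author's code: the class name ExternalizedGreeting, the key greeting and the French wording are all invented for this example. In a real project the strings would typically live in files such as Greetings.properties and Greetings_fr.properties and be loaded with ResourceBundle.getBundle; here the file contents are inlined and parsed with PropertyResourceBundle so that the sketch runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class ExternalizedGreeting {
    // Stand-ins for the contents of two hypothetical resource files,
    // Greetings.properties and Greetings_fr.properties.
    static final String DEFAULT_PROPS = "greeting=Hello, world!\n";
    static final String FRENCH_PROPS = "greeting=Bonjour, le monde !\n";

    // Chooses the resource text for a locale, falling back to the default strings.
    static ResourceBundle bundleFor(Locale locale) {
        String source = "fr".equals(locale.getLanguage()) ? FRENCH_PROPS : DEFAULT_PROPS;
        try {
            return new PropertyResourceBundle(new StringReader(source));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The code only knows the key; the user-visible text lives in the resource.
    static String greeting(Locale locale) {
        return bundleFor(locale).getString("greeting");
    }

    public static void main(String[] args) {
        System.out.println(greeting(Locale.US));
        System.out.println(greeting(Locale.FRANCE));
    }
}
```

The point of the design is that a translator edits only the resource text and never touches the Java source, so a debugging label or an identifier cannot be translated by accident.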
Internal use only. The routine "main" is for me to debug this code, not something that my end users should see. Without any context, some translator somewhere downstream in the process might decide to translate this perfectly good English word into whatever the target language is. If my company uses an external agency, their translators will probably attempt to translate it when they localize the code. Either way, somehow I think that this one word could cost my company hundreds of dollars of time to research and resolve, especially as it propagates into code for different
international markets. Here’s what I’ll do: I’ll leave it as it is and buy a tool such as LingoPort’s Globalyzer or bring in some specialist service provider such as Basis or Symbio to extract all the quoted strings in an application and do all the right things with them. Or I could worry about them now. Oh well, let me stick a little Post-It note on my monitor so that I do it later. Maybe after I get back from coffee. That cuppa Sumatran sounds pretty good right about now.

Mixed message. The output string “Hello, world! Today’s date is April 9, 2005” is concatenated from text plus a date string function. This string isn’t too bad, especially since it’s my birthday. It’s a good thing that it’s a short sentence. Talented linguist that I am (note to self: “don’t donate anything to Brown this year or next!”), I know that more complex messages might get wrapped around a syntax tree. All us savvy linguists know that English favors a subject-verb-object (SVO) word order, while languages such as Japanese and Russian are more tolerant of structures such as SOV or OVS. Maybe some conditional code for locale would work here. Another sticky note to self: “Research sentence order and locales.” Maybe next week.

English methods. The “Date.toString” function produces an acceptable US date string that might not look good in Senegal. Where is Senegal? Hmm, Google “Senegal” and check out the results. Geez, 47 listings on my computer for Senegal. Weirdness personified. Why do I have listings for Senegal on my PC? Oh yeah, I was checking locale-sensitive methods, wasn’t I? Let’s search for time and date formats in Senegal. Wow, 609,000 hits, but thankfully none of them on my computer. That would really be wack. I better get a bit more specific with these search terms. You know what? I’m tired of this. Even if those French speakers prefer “day/month/year,” they’re not going to have a cow if they see “month/day/year,” are they?

Unusable documentation. Sooner or later someone will read my notes about how my little Java program violates internationalization best practices. What if the next programmer is not a teenager or someone else who grew up with American television? Would that programmer know that there was an actor who played a doctor and then appeared in commercials for a medicine saying “I’m not a doctor, but I used to play one on TV”? That to “diss” is to show disrespect? Or that being “wack” is bad? Or that Bart from the animated TV series The Simpsons is always telling concerned adults, “Don’t have a cow!” Whatever. Oh yeah, “whatever” means “I don’t really care what the answer is.”

Lessons Learned

I now have seven lines of code with six major bugs. I know that Dr. Van Dam would find more and would certainly object to my programming style. Hmm, if I had written this in a less internationally savvy language such as Pascal, SNOBOL or even C or C++, there would have been more problems.

Internationalization sure isn’t a walk in the park.
Because application code and user-visible content are so intertwined, you can cause a lot of damage just coding the way you do every day.
Get help.
I gotta get some more coffee. Ω
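
The “Mixed message” and “English methods” problems described above can be approached together. The following is a rough sketch, assuming the standard java.text classes, and is not presented as the author's fix: the class name LocalizedGreeting, the method names and both message patterns are made up for illustration. Instead of gluing text to Date.toString(), the whole sentence is one pattern with a placeholder, so a translator can move the date wherever the target language needs it, and the date itself is rendered by a locale-aware formatter.

```java
import java.text.DateFormat;
import java.text.MessageFormat;
import java.util.Date;
import java.util.Locale;

class LocalizedGreeting {
    // Formats the date using the conventions of the given locale
    // rather than the fixed layout of Date.toString().
    static String formatDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.LONG, locale).format(date);
    }

    // Fills a whole-sentence pattern; the position of {0} is up to the translator.
    static String greet(String pattern, Date date, Locale locale) {
        return new MessageFormat(pattern, locale).format(new Object[] { date });
    }

    public static void main(String[] args) {
        Date today = new Date();
        // Hypothetical patterns; in practice they would come from resource files.
        // Note that MessageFormat needs an apostrophe written twice.
        String english = "Hello, world! Today''s date is {0,date,long}.";
        String french = "Bonjour, le monde ! Nous sommes le {0,date,long}.";
        System.out.println(greet(english, today, Locale.US));
        System.out.println(greet(french, today, Locale.FRANCE));
    }
}
```

The exact date text depends on the locale data shipped with the Java runtime, so no sample output is shown here. What matters is that neither the sentence structure nor the date layout is fixed in the code.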

Unicode From 50,000 Feet
Richard Gillam

The computer industry has a strong tendency to latch onto buzzwords. I can remember when XML first started to get the attention of the industry. Before you knew it, everybody was trying to find some way to say his or her application “supported XML.” It didn’t matter whether there was anything about XML that was especially useful in the application’s problem domain. You needed a way to tie it to XML just the same.

Unicode has been another one of those perennials in the buzzword sweepstakes. Many developers are looking to find a way to say their product “supports Unicode” or “is based on Unicode,” but they often don’t really know or care what that means. Many developers seem to feel that “my program supports Unicode” and “my program is internationalized” are equivalent statements. This is not only wrong, but scary. They’re two very different things. It’s possible to support Unicode and still not be internationalized, and it’s equally possible to write an internationalized program that doesn’t have anything to do with Unicode.

Other authors in this guide will talk about what it means for a program to be internationalized, so let’s instead take a quick look at just what Unicode is and the problems it does solve. We’ll also skim lightly over the surface of Unicode’s main features.

Unicode is a character encoding standard. What does this mean? Computers don’t have any innate knowledge of text or characters or anything like that; all computers really understand are numbers (actually, it’s bit patterns, but let’s not go too deep here). If you want to represent text in software, you adopt a convention where each character you need to represent is assigned a number. You decide that in your application, anytime you see, say, the number 1 in a memory location you know is supposed to hold text, you interpret it as the letter A. When you see the number 2, it’s B and so on.

Of course, text is so common that rather than having each developer adopt his or her own convention for representing text with numbers, the industry issues standards, officially assigning numbers to characters. If two applications follow the same standard for representing text, they can pass text back and forth between each other, and they’ll both be able to do things with it properly.

The problem, of course, is that there are so many different standards. Most modern computing systems use the ASCII standard or something based on it. ASCII was published in the 1960s by what is now the American National Standards Institute (ANSI) and uses the values from 32 to 126 to represent the 26 uppercase and lowercase letters of the English alphabet, the 10 digits, and various punctuation marks and symbols. The values from 0 to 31 and the value 127 were reserved for various signals that controlled the communication protocol, and byte values from 128 to 255 weren’t used.

The problem is that ASCII only includes codes for the letters in the English alphabet. Speakers of other languages didn’t have codes for the letters of their alphabets. Since the byte values from 128 to 255 weren’t standardized by ASCII, various other standards sprang up that used these code values for the letters of other alphabets. Standards were put forth by computer vendors, national governments and so on.

Now there’s a plethora of character encoding standards out there, each of which defines code values for a single language or a small group of related languages. There are several problems with this. First, the standards are mutually incompatible. While you can usually count on the value 65 representing the capital letter A, the value 215 can represent many different characters, depending on the encoding standard. Second, encoded text often travels across media without any external indication of the encoding standard it follows. Software receiving a message of unknown encoding has to guess or simply assume, thus leading in many cases to mangled characters. The sending software intends for a particular numeric value to represent some character, and the receiving software interprets it as something totally different, leading to garbage. Third, mixing languages
cial documents that define conventions for in a single document often requires changing

from one encoding standard to another in the middle of the document, and there are often no mechanisms in the software for doing that.

Unicode was designed to solve these problems. The idea was to use a larger data type than a byte for each character and then give every character in every language its own unique numeric representation. This means you can mix languages freely in a document without the software being written to support mixing encodings, and you can send text from one system to another without worrying about it getting mangled on the other end (as long as the sending and receiving systems both support Unicode).

But it’s possible to write an internationalized application without using Unicode. You just have to keep track of which encoding the system is using to represent text in all the places where text appears and make it possible to use different encodings when necessary to represent different languages. In other words, you can do it, but it requires the application to do much more bookkeeping than is necessary with Unicode. Unicode allows you to process data and present a user interface in any language without having to switch from one encoding standard to another. It doesn’t make internationalization possible, but it does make it easier.

It should also be clear that Unicode doesn’t solve your internationalization problems. You still have to translate the text. You still have to remember to call number- and date-formatting routines that can produce different output for users of different languages. All Unicode makes possible is representing text in many different languages without keeping track of the encoding.

Although Unicode is unique among character encoding standards, it’s not just because it assigns numbers to more characters (more than 95,000 in the most recent version, including many that have no other standardized representations). Unicode is also unique in that it approaches the business of assigning numbers to characters with far more rigor than any other encoding standard has attempted. For many writing systems other than the Latin alphabet used by English, questions as to how to use numbers to represent it aren’t at all clear-cut. In many writing systems, the letters don’t march in a nice orderly fashion from the left-hand side of the page to the right. In some, they go from right to left. In some, they knot together in very complex ways. In some, they’re adorned with various accent, tone or vowel marks that attach to the letters in many different places. Straightening this out into a one-dimensional sequence of numbers is complex, and Unicode goes to much trouble to explain how this is done for various complex writing systems.

It’s also not always clear just when two different doodads are the same character and when they’re different. In many writing systems, the shape of a letter can change dramatically depending upon the letters around it. Unicode places much rigor around the process of deciding whether two different written squiggles get different numbers or the same number.

Unicode goes to more trouble to nail down the semantics of each character. The standard contains not just a big pile of code charts, but also a huge database of properties that define how software should treat different characters. Is the character a letter, a digit or a punctuation mark? If it’s a letter, is it uppercase or lowercase? Which character is its partner in the opposite case? If the character is a number, which numeric value does it represent? If it’s a diacritical mark, how does it attach to its base character? Is the character part of a right-to-left writing system? Does it join cursively to other characters? And so on and so on.

Unicode also includes many rules on how to do different things with encoded text. There are rules and guidelines for determining where line and word boundaries occur. There are rules and guidelines for converting to other encoding systems, for doing language-sensitive string comparison, for displaying various things on the screen, for implementing Unicode-based regular expressions or programming-language identifiers. And much, much more.

So, Unicode not only gives you codes for practically every character in practically every writing system used to write languages today, but it also provides you with a wealth of implementation know-how, and Unicode support libraries provide you with facilities for doing all kinds of text-related things. The Unicode standard sprawls across not only a huge 1,500-page book, but also a CD full of database files and many ancillary addenda and technical reports. It’s not all because it contains 95,000 character assignments; it’s because tremendous blood, sweat and tears have gone into just how to use those 95,000 assigned code values to do what you want to do in the language you want to do it in. Unicode is not just the largest collection of characters ever encoded in a single standard; it’s the most comprehensive collection of rules, guidelines and best practices for handling text in computer software ever compiled.

You could write an internationalized application without using Unicode, but why would you? Ω
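
Two of the points made in this article are easy to try for yourself. What follows is a minimal Python sketch, not anything taken from the Unicode standard itself. It decodes the byte values 65 and 215 under a handful of legacy 8-bit encodings (the five listed are chosen here only for illustration), and then looks up a few entries from the Unicode property database through Python’s standard unicodedata module. The helper names decode_in_each and describe are hypothetical.

```python
import unicodedata

# Legacy 8-bit encodings picked for illustration; all of them ship with CPython.
LEGACY_ENCODINGS = ["latin-1", "iso8859-5", "iso8859-7", "cp1251", "koi8-r"]


def decode_in_each(value: int) -> dict:
    """Interpret a single byte value under each legacy encoding."""
    raw = bytes([value])
    return {name: raw.decode(name) for name in LEGACY_ENCODINGS}


def describe(char: str) -> dict:
    """Look up a few Unicode Character Database properties for one character."""
    return {
        "name": unicodedata.name(char, "<unnamed>"),
        "category": unicodedata.category(char),      # e.g. Lu = uppercase letter
        "numeric": unicodedata.numeric(char, None),  # numeric value, or None
        "bidi": unicodedata.bidirectional(char),     # e.g. R = right-to-left
        "combining": unicodedata.combining(char),    # non-zero for combining marks
        "uppercase": char.upper(),                   # partner in the opposite case
    }


if __name__ == "__main__":
    print(decode_in_each(65))   # the same letter under every encoding listed
    print(decode_in_each(215))  # the answer depends on the encoding
    # Latin capital A, Arabic-Indic digit three, Hebrew alef, combining acute accent
    for ch in ["A", "\u0663", "\u05d0", "\u0301"]:
        print(describe(ch))
```

The value 65 decodes to A under every encoding in the list, while 215 does not decode to one agreed character, which is exactly the incompatibility described earlier. The second half answers, for single characters, a few of the questions the property database is meant to settle: letter or digit, numeric value, writing direction, and whether a mark attaches to a base character.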


Developing Software With Internationalization in Mind
Bill Hall

What are internationalization, localization and globalization?

Internationalization is the engineering and design aspect of creating a world-ready product. Internationalization work properly starts in the design phase and lasts until the product has been localized and released. Localization is the term most often used for the task of adapting a product to one particular target market. Localization includes translation of user-interface strings, adjusting culturally sensitive elements and any other task required to make the product usable in a particular world region or locale. A locale is typically identified by language and region identifiers, such as US English, Austrian German and so on. A product is globalized if it is both internationalized and localized. If we write G11N for globalization, then G11N = I18N + L10N.

Internationalization, in simplistic terms, is a job for programmers, and localization falls to the linguists. Internationalization and localization are only loosely related; a product can be fully internationalized without having been translated into another language.

Why is internationalization important?

If the software has been properly internationalized, localization (translation) can proceed quickly, efficiently and at a reasonable cost, and the product can be sold in other world regions. But if the product is not internationalized, the translation step can be hampered substantially as program bugs are detected, reported back to the development staff, fixed and returned to the translation and quality assurance (QA) teams. The results are delayed releases and missed sales opportunities that are seldom recouped.

Where does internationalization belong in the software development cycle?

Internationalization is a component of software engineering that should be applied to a product throughout the development cycle. Many development organizations naively believe that internationalization can be added at the last minute. Unfortunately, internationalization is not a coat of paint that is applied to the surface of the product, as is localization. It is much more like oil that lubricates the whole system. Internationalization errors tend to be found throughout the system and at every level. They can be as varied as incorrect uses of library and system calls, improper pointer arithmetic, embedded user interface items, inattention to rendering locale sensitive data correctly and erroneous assumptions about character encoding. Internationalization omissions and oversights can cause substantial rewrites of large blocks of code and brisk renegotiation with third-party suppliers of key modules.

What does internationalization cost?

Large companies that routinely develop with internationalization in mind find that development costs can increase by 10% to 20% or more since developers must be educated, internationalization phase checks must be added, and QA plans must be modified for the additional testing required. Fortunately, those costs can be amortized over multiple language releases, reducing the effective unit cost. But costs can easily double if internationalization is left to the last minute, if many errors are not found, and if awkward compromises are often necessary to meet release schedules.

What is a typical first-time internationalization experience?

It goes something like this. Let’s suppose the setting is the United States. An idea for a product is conceived; development begins apace; version 1.0 is released in English; and work immediately starts on a bug fix release simultaneously with the next version. Throughout, no thought is given to internationalization, mainly because no one on the staff really knows what it means. One day a Japanese company calls up to say that it wants the product. The product manager asks if English will be OK, and the Japanese company replies, “Absolutely not!” At this point, the entire development cycle becomes completely disrupted as company management tries to decide how to handle the situation.

Any number of paths can be followed, but usually the worst possible decision is made: a separate thread of development begins with the code branching to get the Japanese release out while US development proceeds toward its next version. Unless company behavior is modified, the cycle continues of an English release followed much later by releases in other languages one by one. In the meantime, the main development group continues to make the same internationalization mistakes release after release. The whole process is very expensive, maintenance and patches become difficult to provide, and the substantial benefits of simultaneous release are consistently missed.

How can a company avoid repeating this sad experience?

Many companies have recognized the futility of the approach I have just described and will take the time to merge the internationalized code into a single, worldwide base. From that point they follow a strategy

of developing code that is independent of language and locale along with supporting modules that manage language and locale specific issues. Locale-neutral escape mechanisms are developed that provide means for accessing supporting modules handling internationalization issues. As a simple example, a module might contain the user interface for a particular language, and the escape mechanism would be a call to get a string indexed by a number or a hash value. Another example could be the need to render display of data in a way suitable for a given world region (locale). In this case the escape might be to use the information provided by the operating or runtime system itself. Taking advantage of such support is one of the most powerful ways to internationalize a program with a minimum of effort. It is all a question of modular development and design.

What kinds of internationalization models are used, and what are the merits of each?

This is a complex topic that depends entirely on the platforms on which the application runs. However, all systems provide such guidelines, and these must be thoroughly studied. First-timers are often overwhelmed by what has to be learned, another reason why you start the internationalization effort early in the development cycle.

How does a developer learn about internationalization?

Unfortunately, it is nearly impossible to find schools or universities anywhere that are either interested in or qualified to teach internationalization as a part of a normal computer science education. It is a serious failure by those who otherwise do an excellent job of teaching the art and science of programming and programming languages. Companies are therefore forced into educating their staffs or drawing upon outside expertise. Localization companies and independent internationalization engineers often provide such services.

As a developer, do I have to be able to speak four languages in order to learn internationalization?

Internationalization is an engineering effort. An appreciation of the fact that cultural and linguistic differences exist and that software needs to compensate is enough. It also helps if the developer gives some thought as code is being written as to its possible effect on internationalization. Coding for internationalization is more a matter of attitude and mindset rather than linguistic expertise. The most important step that a developer can take is to learn about the National Language Support available on the platform on which he or she works. Such knowledge can be acquired through reading, training and learning to write small sample applications that exercise one or more of these functions.

Is there some kind of checklist for internationalization?

The problem with checklists is that one can find important exceptions for each rule. As stated above, internationalization is more a matter of attitude and programming skills. Ask yourself if you write C++, Perl, VB, Java or C# using a checklist. Most likely you don’t. But there are some principles. The accompanying chart shows some general rules culled from many sources. The statements are rather broad and need expert interpretation


Some Principles for Internationalization


The program design team considers internationalization from the beginning of the project.
Icons, cursors and bitmaps are generic, are culturally acceptable and do not contain text.
If ethnocentric graphics, colors or fonts are used, they can be replaced dynamically using locale-sensitive switch statements.
Menus, dialogs and web layouts can tolerate text expansion.
Development language strings are reviewed for meaning and spelling to reduce user confusion and lessen translation errors.
Strings are documented using comments to provide context for translators.
Strings or characters that should not be localized are clearly marked.
Shortcut-key combinations are accessible on all international keyboards.
International laws affecting design and operation are considered.
Third-party software used in the product is examined for internationalization support.
Consistent terminology is used in messages.
The product runs properly in its base language in all target locales.
Strings are not assembled by concatenation of fragments.
Source code does not contain hard-coded character constants, numeric constants, screen positions, filenames or pathnames that assume a particular language.
String buffers are large enough to handle translated words and phrases.
Program handles input of international data.
All language editions can deal with one another’s documents.
Program contains support for locale-specific hardware if required.
Program depends on operating or runtime system functions for sorting, character typing and string mapping.
Program takes advantage of generic text layout functions when available.
Program responds to changes in the user’s choice of international settings.
Program handles user keyboard layout changes.
Far East editions support input method editors (IME), vertical text and line-breaking rules.
All international editions of the program are compiled from one set of source files.
Localizable items are stored in resource files, message tables or message catalogues.
All language editions share a common file format.
Program’s internal character encoding is Unicode.
Program properly handles all characters in the program’s character set.
Program handles non-homogeneous network environments where machines are operating with different encodings.
Code processes all character sets correctly regardless of character widths.
Code supports Unicode and conversion between Unicode and any local code pages.
No assumptions are made that one character storage element represents one linguistic character.
Code uses generic data types and generic function prototypes if available in compiler.
Code does not use embedded font names or make assumptions about particular fonts being available.
Program displays and prints text using appropriate fonts.
Program meets international testing standards.
Text is translated and meets the standards of native speakers.
Dialog and forms are resized, and text is hyphenated appropriately.
Translated dialog boxes, toolbars, status bars and menus fit on the screen at different resolutions.
Menu and dialog-box keyboard assignments are unique.
User can type all supported characters into documents, dialog boxes and filenames.
User can successfully cut, paste, save and print text regardless of language.
Sorting and case conversion are culturally correct.
Application works correctly on localized editions of the target operating system.

These considerations apply specifically for web internationalization:
Make sure your presentation tier is ready to manage multiple character encodings correctly in a variety of browsers.
Check all your forms and other input fields for encoding compatibility.
Follow all the rules for internationally portable design as listed earlier.
Check your middle-tier components for internationalization compliance. Ensure that information about encoding and locale of data is passed correctly between presentation and backend tiers.
Validate databases to make sure that schemas, data types and table design are ready for a multi-locale environment.
Check database client calls for use of built-in National Language Support to return properly encoded and sorted record sets.
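
Two of the principles in this chart, storing localizable items in message catalogs and not assembling strings by concatenation, can be illustrated together. The sketch below is a minimal Python example under stated assumptions: the CATALOGS table, the message IDs, the get_string helper and the German wording are all hypothetical, and a real product would load its strings from resource files and would also need plural handling, which is not shown here.

```python
from string import Template

# Hypothetical catalogs keyed by message ID. They are hard-coded only to keep
# the sketch self-contained; a real product would read them from resource files.
CATALOGS = {
    "en": {
        "files.deleted": Template("Deleted $count files from $folder."),
        "app.title": Template("File Manager"),
    },
    "de": {
        # The translator is free to reorder the placeholders.
        "files.deleted": Template("$count Dateien wurden aus $folder gelöscht."),
    },
}

BASE_LANGUAGE = "en"  # assumed development language


def get_string(message_id: str, language: str, **values: object) -> str:
    """Return the whole sentence for message_id, falling back to the base language."""
    catalog = CATALOGS.get(language, {})
    template = catalog.get(message_id) or CATALOGS[BASE_LANGUAGE][message_id]
    return template.substitute(**values)


if __name__ == "__main__":
    print(get_string("files.deleted", "en", count=3, folder="Reports"))
    print(get_string("files.deleted", "de", count=3, folder="Berichte"))
    print(get_string("app.title", "de"))  # not translated yet, so the base string is used
```

Because each message is a complete sentence with named placeholders, a translator can move $count and $folder wherever the target language needs them. That freedom is lost as soon as the program glues fragments such as "Deleted " and " files from " together in code.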

and illustrations but should give you an idea. The base development language is assumed to be English.

We did not pay attention to internationalization during development. What do we do?

Get an audit by an experienced company or individual who knows about this topic. An audit can identify the issues and help you prioritize what needs to be done and how long it will take. The ideal audit will usually take three to five days or more and should involve several members of the development team. Be prepared to brief the auditing team on the main features of your product and have someone who knows the system very well show the auditors how the product works. Expect the auditors to do things you probably never thought about such as changing the system, user and input locales to see how well the program handles different keyboards, fonts, date and time formats, numbers, currency, calendars, sorting and the like. Have a machine ready with source code and build tools for viewing. If you have already isolated the user interface, make sure the auditors can read through strings, view dialogs and menus, and check clip art for everything from text in bitmaps to strings assembled by concatenation. Allow direct access to developers; it is especially useful for an auditor to sit with a programmer and view his or her particular contribution at runtime as well as examine the underlying

code. Keep in mind that the auditors are applying their experiences in internationalization to mentally run through the items listed above to see how well your product meets these criteria. Of course, the auditors will tailor their actions based on your program.

What are the deliverables from the internationalization audit?

The auditors will present you with a report detailing problems found during runtime, possible problems in the source code they examined (down to the procedure level with suggestions for repair), general recommendations about what needs to be done in the short and long term (if you are going to Western Europe first and the Far East or bidirectional languages later, you may be able to delay some work for a while), and an estimate of how long the internationalization task might take. Keep in mind that such estimates are often very difficult to quantify, as are all such determinations about software. It can happen that errors are buried so deep and so tangled that major architectural changes are required. In other instances, if developers have used good coding practices even without having considered internationalization, the work may be no more complex than getting strings and other user interface items out of code and fixing up problems with incorrect API use. It should be no surprise that well-written code is often very easily fixed to meet internationalization requirements and badly written code can be pure hell to rewrite. Internationalization engineers call this the good-code-bad-code phenomenon.

What do we do after reading the report?

Decide as soon as possible whether you will perform the work in-house, outsource it or do a combination. The best is to do the work in-house since the developers will learn the steps to coding internationalization as a natural part of coding. But even this approach will require some mentoring by experts, and for a while you may want to have an internationalization consultant on hand to identify, explain and help correct the code. You may also want to get two or three days of seminar training for all of the development staff in order to acquaint everyone with the basics. Seminars rarely teach technique, but they do raise awareness. If you decide to outsource the job, try to work out a method whereby the internationalization work can be done on the main development line. Otherwise, the organization will have to do one or more possibly large merges, and in the meantime, the development staff will continue to introduce internationalization errors that will have to be corrected again and again.

What do we do after release?

Resolve never to overlook internationalization when writing code. Add internationalization phase checks and QA to normal development and train any new engineers. Reinforce good programmer behavior and do the opposite with bad. Most of all, expect that internationalization will cost you something in extra development and allow for the time and expense. Remember that this cost will be amortized over several localizations and that the internationalization costs per release will be much lower. Enjoy the revenue received from abroad. Ω

This article is a revised and updated version of one that first appeared in the Getting Started Guide “Internationalization” in MultiLingual Computing & Technology #47 Volume 13 Issue 3.

12 Myths and Misconceptions About Internationalization
Andrea S. Vine

Myth #1: Making user interface elements localizable is enough.

If this were true, it would mean that the product is modified for every country where it is sold: Canada, the United Kingdom, Brazil, New Zealand, Greece and dozens more. Obviously no company localizes products for every single country it sells into, or localization groups would be much, much larger and their budgets would be significantly bigger. Instead, companies sell the English product all over the world, with the exception of a few large markets where localized product is sold. Even the localized products are sold into multiple countries.

For this reason, the locale of the user should be detected or determined in some way, even if the user must be asked explicitly. Numeric formats, text formats, dates and any other formatted data should appear in a style that is used in that locale.

Note that values must be handled carefully. For example, if someone in Germany asks for a price and the price is stored in US dollars, then there are two possible methods of conveying the value to that user:

The currency unit displayed is US dollars, but the numeric format of the actual value is that of Germany: USD 250 467,10

The value is converted from US dollars into German marks, and the value is displayed with the German mark currency symbol, in a German numeric format: DEM 528 450,47

Even with the value expressed in US dollars, the thousands separator is a blank, and the decimal is a comma. If the US thousands separator, the comma, is used, a German user might well be confused about the amount.

Formats should be locale sensitive, but value units should only change if there is a conversion.

Graphics are part of the universal product approach. They are so expensive to localize that no one usually bothers unless there’s embedded text (which should be avoided). Graphical images should be universally appropriate.

Myth #2: Translators choose the best phrase in the target language.

Many folks assume the people translating the product will always choose the best word for the context. The truth is that localizations run on tight schedules and low budgets. Translators usually translate text directly in message catalogs, rather than as they appear on the screen. They are not well versed in product functionality, and there is little time and expertise to perform thorough linguistic checks of the text in the context of the running software. They are usually paid by the word, so volume is their watchword. Imagine what happens to the translation in this situation.

Myth #3: The code is in Java, and therefore it’s internationalized.

Long before the advent of Java, there was internationalized code. How on earth did programmers manage this? The answer is that internationalization was always possible; it just took more effort. Java is written to make internationalization much easier. It is not impossible, however, to write Java code that is not internationalized. In fact, it’s pretty easy to write code that only supports English in the United States in Java. So, even Java must be carefully coded to support international data.

Myth #4: The product has full Unicode support and therefore is internationalized.

Like the Java myth, so goes the Unicode myth. Like Java, Unicode support can make handling international data much easier. But once again, code must be written to manage data in different languages, in different locales and, for the time being, in different charsets. Ha-ha (did that translate well?).

Myth #5: Administrative interfaces and log messages don’t need to be internationalized.

Admins are people too. In some markets, the admin interface must be localized. What was done in the past in localization is not necessarily what will be done in the future. Whether or not a product gets translated is a business decision, not a technical decision. Engineers need to enable business folks to make the decisions necessary to sell as much product as possible. This, in turn, makes the company more profitable, which raises the stock price (well, sometimes), and everyone benefits. Localization needs to be enabled throughout the product.

Log messages fall into a special category of messages. They are usually not localized directly, but may, in fact, be indirectly localized via a log viewer. When this is available, log messages need to be in a separate resource file in order to be localized. For this reason, log messages need to be localizable, but they need to be separated from other messages so that localization knows whether to translate them or not. If a message goes to both a log and the user interface and if the log messages are restricted to English, then the message going to the user interface should be retrieved from the localized resource file, and the message going to the log should come from the English resource file. English files should be shipped with all localized products.

Myth #6: The product uses open source, and so internationalization doesn’t apply.

A lot of folks use the excuse that they have no control over the open source, and so they can’t deal with internationalization issues in that code. Yet a product’s internationalization is only as good as its weakest component. The decision to use open source in the first place must take into account all the requirements for those components. If the open-source pieces don’t fulfill the customer requirements, then the decision to use them must include the coding effort required to internationalize them. Otherwise, customers will not get a product which works the way they expect it to, which in turn means they’re not going to buy it.

Myth #7: ISO-8859-1 is the standard encoding for HTML.

The HTML specification states that ISO-8859-1 is the default encoding, but it is not

the standard one. Using ISO-8859-1 means that only a limited number of languages can be represented on the web page. But it doesn't have to be that way.
In the HTML 4.0 specification, Unicode was made the reference character set. This means that all numeric character references are Unicode values, and that means that processors supporting HTML 4.0 also support Unicode. This allows the use of UTF-8 as the encoding for HTML pages with the confidence that browsers and web servers are able to handle it. And UTF-8 covers most of the world's languages.

Myth #8: All company employees speak English, so only English needs to be supported for internal tools.

Within a given company, there's usually a primary language chosen for business transactions within the company. For companies based in the United States, this is usually English. Internal tools, however, handle data that is beyond internal business. For example, they may include a customer database, a support log or a bug database. Data from customers is often in another language, using other locale formats.
Bugs in the product often relate to data that is not English. Trying to record a bug with Japanese data in an English system is a frustrating exercise. Often the person logging the bug gives up, and the bug isn't logged. If the person persists, then there may not be enough data to replicate the bug. Quality suffers, and the product team may never know why sales are slipping in non-English markets.
Internal tool teams should gather requirements just like product teams (assuming product teams are gathering requirements internationally).

Myth #9: This product has never been localized, so it doesn't need internationalization.

Internationalization is about data processing, not just user interface. English product is sold all over the world, so all data must be processed correctly. Even within the English-speaking markets, non-English data is processed (see Myths #1 and #8).
What was done in the past in localization is not necessarily what will be done in the future. Whether or not a product gets translated is a business decision, not a technical decision (see Myth #5). Engineers need to enable business folks to make the decisions necessary to sell as much product as possible.

Myth #10: Internationalization was added in the last release, so nothing more needs to be done.

Imagine if this were said about any program architecture or functionality (say, security, performance or the ability to print a page in a word processing program). It's ludicrous to assume anything in program code is complete as long as the code keeps changing. Internationalization is inherent in the program code; every time a new line is added, it must be taken into account. New requirements come in, making it necessary to add functionality, or possibly to change the architecture. Until all work on the product is discontinued, the internationalization is not done.

Myth #11: The product works in Japanese, so it's internationalized.

It's great that the product has been tested in Japanese; it uncovers a great many problems. But not all of them. Other languages, such as Arabic, Hindi and French, have different issues from Japanese. Even Chinese is different. There are issues with other locales as well. And the product may not work in a multilingual environment. So keep testing!

Myth #12: Internationalization is done by other engineers after the base product code is completed.

Just thinking through the engineering process, does it make sense for the same code to be worked on by more than one engineer? It may make sense for a second engineer to review the code, but rewrite it? Most companies would consider that a huge cost issue. After all, engineers are expensive. But if a company process is set up so that a separate group of engineers, or third-party vendor, internationalizes their product, that's exactly what happens. And it will have to happen for each and every release. Internationalization is an architecture and coding methodology, not an add-on functionality. Even if a company could afford to have post-release internationalization done, the quality is significantly lower. The internationalization engineers don't usually know the product as well as the product engineers, so they introduce bugs. In addition, the product itself is not architected for international use, and so may fail in providing useful functionality for markets around the world.
Internationalization is not something that someone else does. It's something that everyone should do. Ω

KNOWLEDGE FROM THE CORE

Historically speaking, opportunities for localization professionals to update their knowledge by meeting with peers to exchange ideas, experiences and information have been rare. The Localization Institute, with MultiLingual Computing, has filled this gap with events such as Localization World conferences. Other forums include the Institute's roundtables. The Management Roundtable has been held yearly for nine years, and the Project Managers' Roundtable for seven.
Internationalization as a standalone topic has received less attention. Generally considered the province of core development in sophisticated development enterprises and poorly understood in less-sophisticated endeavors, internationalization has suffered from a shortage of organized information exchange and standard-bearers duly unified by shared knowledge. With localization methodologies becoming increasingly well understood, the time has come to refocus attention on internationalization. Programming technology has moved from C and C++ into new language paradigms that contain new mysteries and methods. Emerging markets such as China have flexed their muscles by imposing new mandatory standards. Information has become fragmented, and it needs to be brought back together again in one knowledgebase.
From June 29 to July 1, 2005, The Localization Institute will offer an Internationalization Roundtable at the Granlibakken Conference Center near Lake Tahoe, California. Discussions will be advanced and technical in nature and will focus on issues that need to be considered by lead developers, code architects, VPs of development, experienced internationalization engineers, technically oriented localization managers and subject matter experts in any of the fields.
Topics include the state of internationalization technology in today's programming languages; architectural issues; assessment; GB 18030-2000 issues and certification; tools; and making the case for the value of internationalization to benefit a company.
For further information, see http://www.localizationinstitute.com

The Localization Institute
4513 Vernon Boulevard, Suite 11
Madison, WI 53705 USA
Phone 608.233.1790 Fax 608.441.6124
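
To make the encoding point in Myth #7 (and the Japanese-data problem in Myth #8) concrete, here is a minimal Python sketch. The sample strings and the helper name can_encode are illustrative assumptions, not part of any product discussed in this article. It checks which sample texts fit in ISO-8859-1 versus UTF-8, and shows that a numeric character reference resolves to a Unicode code point whatever encoding the page itself uses.

```python
import html

# Hypothetical sample data, chosen only for illustration.
samples = {
    "French": "café",
    "Japanese": "日本語",
    "Arabic": "عربي",
}


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text can be represented in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# ISO-8859-1 covers Western European text only; UTF-8 covers all of Unicode.
for language, text in samples.items():
    print(
        language,
        "ISO-8859-1:", can_encode(text, "iso-8859-1"),
        "UTF-8:", can_encode(text, "utf-8"),
    )

# Numeric character references name Unicode code points (here in hex),
# so they decode the same way regardless of the page's byte encoding.
decoded = html.unescape("&#x65E5;&#x672C;&#x8A9E;")
print(decoded == samples["Japanese"])
```

In this sketch only the French sample fits in ISO-8859-1, while all three fit in UTF-8, which is why a tool that stores or displays data as ISO-8859-1 cannot record a bug report containing Japanese text.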


Subscription Offer

This supplement introduces you to the magazine MultiLingual Computing & Technology. Published nine times a year, filled with news, technical developments and language information, it is widely recognized as a useful and informative publication for people who are interested in the role of language, technology and translation in our twenty-first-century world.

Translation

How are translation tools changing the art and science of communicating ideas and information between speakers of different languages? Translators are vital to the development of international and localized software. Those who specialize in technical documents, such as manuals for computer hardware and software, industrial equipment and medical products, use sophisticated tools along with professional expertise to translate complex text clearly and precisely. Translators and people who use translation services track new developments through articles and news items in MultiLingual Computing & Technology.

Language Technology

From multiple keyboard layouts and input methods to Unicode-enabled operating systems, language-specific encodings, systems that recognize your handwriting or your speech in any language: language technology is changing day by day. And this technology is also changing the way in which people communicate on a personal level; changing the requirements for international software; and changing how business is done all over the world.
MultiLingual Computing & Technology is your source for the best information and insight into these developments and how they will affect you and your business.

Global Web

Every Web site is a global Web site, and even a site designed for one country may require several languages to be effective. Experienced Web professionals explain how to create a site that works for users everywhere, how to attract those users to your site and how to keep it current.
Whether you use the Internet and World Wide Web for e-mail, for purchasing services, for promoting your business or for conducting fully international e-commerce, you'll benefit from the information and ideas in each issue of MultiLingual Computing & Technology.

Managing Content

How do you track all the words and the changes that occur in a multilingual Web site? How do you know who's doing what and where? How do you respond to customers and vendors in a prompt manner and in their own languages? The growing and changing field of content management and global management systems (CMS and GMS), customer relations management (CRM) and other management disciplines is increasingly important as systems become more complex. Leaders in the development of these systems explain how they work and how they work together.

Internationalization

Making software ready for the international market requires more than just a good idea. How does an international developer prepare a product for multiple locales? Will the pictures and colors you select for a user interface in France be suitable for users in Brazil? Elements such as date and currency formats sound like simple components, but developers who ignore the many international variants find that their products may be unusable. You'll find sound ideas and practical help in every issue.

Localization

How can you make your product look and feel as if it were built in another country for users of that language and culture? How do you choose a localization service vendor? Developers and localizers offer their ideas and relate their experiences with practical advice that will save you time and money in your localization projects.

And There's Much More

Authors with in-depth knowledge summarize changes in the language industry and explain its financial side, describe the challenges of computing in various languages, explain and update encoding schemes and evaluate software and systems. Other articles focus on particular countries or regions; translation and localization training programs; the uses of language technology in specific industries: a wide array of current topics from the world of multilingual computing.
Nine times a year, readers of MultiLingual Computing & Technology explore language technology and its applications, project management, basic elements and advanced ideas with the people and companies who are building the future.

To subscribe, use our secure on-line form at
www.multilingual.com/subscribe
Be sure to enter this on-line registration code: sup71