
Building a Bilingual (Chinese/English) Apparel eLexicon Prototype

Computing Department, Institute of Vocational Education (IVE) (Kwai Chung campus), Vocational Training Council (VTC) Hong Kong charlestang@computer.org http://charlestang.org

Charles Man-wing Tang

Practical Experience in Teaching CAT. Bilingual lexicography seeks to alert students to the importance of reference science (RS) in a knowledge-based society. The University of Hong Kong Library describes reference science as information resources that include both dictionaries and encyclopedias. Accordingly, a bilingual dictionary falls under the second meaning defined by the American Heritage Illustrated Encyclopedic Dictionary (1987): "A book listing words of a language with translations into another language." Tom McArthur (1992) added that electronic publishing takes several forms: online search; disks and tapes; and electronic distribution. This paper focuses on developing a bilingual electronic lexicon for the fashion and textile industry that could be integrated into a master lexicon system for computer-aided translation (CAT). The researcher was teaching Information Systems for Textile and Clothing to 60 final-year students at the Institute of Vocational Education (IVE), Vocational Training Council (VTC), in Hong Kong. Each student was given a source document containing some twenty bilingual terms of the industry. Every pair of English/Chinese terms was assigned a code for categorization. The students first had to input just the English term and the code in a spreadsheet file and send the softcopy to the researcher via WebCT. After proofreading, they added the Chinese term as version two. Putonghua and Cantonese pronunciations of the first Chinese character were then added for indexing. About 1,500 terms were gathered for information retrieval and elaboration. The data can also be converted to database and word processing (WP) formats for further processing. This resembles an information services production line, a field the researcher is keen on, and this method of information processing, the building of an eLexicon, could be applied to any industry.
what news among the merchants? (William Shakespeare: THE MERCHANT OF VENICE, Act III Scene I)

1. Introduction
In the second semester of the academic year 2008-9, the researcher, a lecturer in the Computing Department, was appointed to teach the final (4th) year class of the Higher Diploma in Fashion Business for the Department of Fashion and Textiles (the then Hong Kong Design Institute, HKDI) of IVE, VTC1. The name of the module was Information Systems for Textile and Clothing (ISTC). In it, the students had to learn the fundamental concepts of information systems, together with some hands-on laboratory work that would let them see the practical side of information technology as applied to their design profession, apparel. The researcher therefore chose the building of an Apparel eLexicon prototype, to give the students a grasp of how to build a trade glossary useful in their future jobs. This system in electronic form could be handy for the fashion and textile industry in Hong Kong, and in Greater China, because its terminology is bilingual (Chinese/English).

1 VTC is the largest vocational education and training semi-governmental body in the region.

1.1 Definition
Before continuing, it is necessary to clarify the meaning of all related terms.

bilingual All terms originate from English and are paired with their Chinese counterparts. For non-Chinese-speaking readers, it is necessary to know that the official Chinese language refers to that of the Han Chinese2. There has been only one written Chinese language since the Qin Dynasty3. However, since the communist regime4 took over China, the official written Chinese characters have been in simplified form, whereas the Nationalist Party government that retreated to Taiwan, together with Hong Kong, Macau, and the overseas Chinese communities all over the world, still use the traditional Chinese characters. Here, the traditional form was used5 in data processing instead of simplified Chinese6, as this research was carried out in Hong Kong7. Besides, although the modern official spoken Chinese language, Putonghua8, links up all Chinese peoples, Hong Kong citizens still use Cantonese9; this research therefore included both Putonghua and Cantonese pronunciations as indexes10 for easy searching11.
2 The majority of the Chinese peoples are Han.
3 221-206 BC.
4 1949 to now.
5 In the computer encoding BIG-5, developed by the Taiwanese.
6 In the encoding GB-2312-80, developed by the mainland Chinese.
7 As simplified Chinese characters stemmed from the traditional ones, many Chinese people all over the world know both forms to a certain extent.
8 It was officially enforced in 1956 by the Secretary of State of China. It has been called Mandarin since the Qing Dynasty (1616-1911) and is still so called in Taiwan.
9 It is the southern dialect of Guangdong, Guangxi and some other provinces, with around 0.1 billion speakers all over the world. Canton was the former name of Guangzhou, the capital of Guangdong province. See also Appendix II 4.6.
10 Details of the background to the Chinese languages can be found in the thesis for the researcher's MMS (Master of Management Science) degree (1991), CHINESE COMPUTING HISTORY AND CURRENT TRENDS, and in Appendix II: Some Facts about the Cantonese and Putonghua Pronunciation Systems.
11 Though the Cantonese Index serves the business purpose well, academics and the Chinese government intend Cantonese pronouncing facilities to serve the learning of Putonghua, so that all peoples of China can be strongly bonded by the official language (Qiao Yannong 1972 Preface).

apparel All things related to dress, clothing, vesture, garments, raiment, garb, costume, attire and habiliments12.

eLexicon The electronic form of a lexicon.

lexicon Some form of a dictionary.

prototype The primitive form of some creation.

1.2 Apparel eLexicon Design Plan


First, it is bilingual, that is, English/Chinese. Second, it is categorized: a single English capital letter stands for each category, for example, A for Accessories, D for Dress, S for Sewing, etc. There are seven steps in its development, as shown below. This resembles an information services production line of the kind the researcher worked on in the last two decades of the 20th century13.

Step #1: Preparation of source data
The teacher gathered the bilingual terms and assigned each student from the two classes of about 60 students a source data sheet for input. Each sheet was marked with the student's name for the record, to avoid duplication and to check for any missing data.

Step #2: Input just the English term with a code
Each student had to enter the category code along with the English term into a spreadsheet, and then send this version #1 to the teacher, who acted as the production supervisor, for centralization.

Step #3: Proofread, update and add in the Chinese terms
The teacher/supervisor saved the version #1 data file of each student/data-entry officer and printed it out with the file name added on the worksheet to identify who was responsible for which file. The printout was returned to the responsible officer, who would give it, along with the original data sheet, to a partner for proofreading, to check for any wrong entries. The proofread printout had to be signed by the partner with date and time, and returned to the owner for update. The data sheet owner then added the Chinese term paired with its English origin to the worksheet and named it version #2.
12 INTERNATIONAL (ENGLISH-CHINESE, ENGLISH THROUGH ENGLISH) DICTIONARY.
13 The researcher was the production supervisor of the DATABASE CREDIT MONITOR, a weekly publication on the credit status of Hong Kong companies, from the late 1970s to the early 1980s. He was also the information consultant supervising the HONG KONG COMPUTER DIRECTORY in 1993-1997 at the Hong Kong Productivity Council, providing marketing information on the computing companies in Hong Kong. Both were the only authoritative publications of their time.

Step #4: Proofread, update and add in the Putonghua pronunciation as an index
Again, version #2 had to be sent to the teacher for central saving and printing. The printout was returned to the responsible officer, whose partner did the second proofreading. Updates had to be made if there were any mistakes. The Putonghua pronunciation14 of the first character of the Chinese term for each entry was added by the same student. In this way, version #3 was created and sent to the teacher again.

Step #5: Proofread, update and add in the Cantonese pronunciation as another index
As before, version #3 was saved centrally by the teacher; but this time, the teacher/supervisor would proofread all individual worksheets himself, because a layman/data entry clerk/student might not be so proficient in Pinyin15. The hand-corrected printout was returned to the students for update. The Cantonese pronunciation of the first character of the Chinese term for each entry was added; version #4 was created and sent to the teacher.

Step #6: Final proofread, update and return to the teacher for consolidation
Version #4 was saved centrally by the teacher, who would proofread the Cantonese pronunciation in all individual worksheets himself. This is a tough job: special references had to be consulted because there are too many standards for Cantonese pronunciation, and none is the best16 from the perspective of user-friendliness. The hand-corrected printout was returned to the students for their update, to be named version #5. It was then sent to the teacher as the final individual version for consolidation.

Step #7: Consolidation, conversion, and embedment17 of data entries
Version #5 was saved centrally by the teacher, who would do the final check on all entries as the job of a production supervisor. He then integrated all the students' entries into a master spreadsheet file, which is good enough for data search in electronic format. However, for a higher level of data manipulation, a database version could be made. Besides, if the lexicon had to be published, all data entries could be sorted by various indexes and embedded in a word processing file for printing.
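The record that these seven steps build up, one field per version, can be pictured as a small data type. The following Python sketch is illustrative only: the class name `LexiconEntry`, the field names, the helper `version_reached`, and the sample entry are assumptions made here, not taken from the students' worksheets.

```python
from dataclasses import dataclass

# Category codes named in the design plan; the mapping is illustrative.
CATEGORY_CODES = {"A": "Accessories", "D": "Dress", "S": "Sewing"}


@dataclass
class LexiconEntry:
    """One bilingual entry; each field is filled in at a later version."""
    type_code: str             # version #1: single capital letter
    english_term: str          # version #1
    chinese_term: str = ""     # version #2
    putonghua_index: str = ""  # version #3: Pinyin of the first character
    cantonese_index: str = ""  # version #4: Cantonese symbol of the first character

    def version_reached(self) -> int:
        """Return the highest version (1-4) whose fields are all filled in."""
        stages = [self.chinese_term, self.putonghua_index, self.cantonese_index]
        version = 1
        for value in stages:
            if not value:
                break
            version += 1
        return version


# Hypothetical sample entry, filled in up to the Putonghua index.
entry = LexiconEntry("D", "skirt", "裙", "qun")
print(entry.version_reached())  # 3
```

A check of this kind would let the supervisor see at a glance which stage each entry has reached before consolidation.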

2. System Development
Here, system development refers to the development of this electronic apparel lexicon prototype. In current computing concepts, a system has six aspects: hardware, software, procedure, personnel, data/information, and networking. It is necessary to define each of them with respect to the development of the Apparel eLexicon Prototype.

14 It is officially known as Pinyin.
15 The researcher has studied Putonghua at many levels since 1975, and obtained the professional diploma in Putonghua teaching in 2008.
16 The Hong Kong Telephone Directory's Cantonese Index (see Appendix I) was adopted since it has been popular in Hong Kong for several decades. Yet its symbols cannot give the exact pronunciation. Other pronunciation standards are listed in Appendix II.
17 Object Linking and Embedding (OLE) started with Microsoft Windows 3.1. It is a method of combining information that is processed by different application programs.

2.1 Six Aspects of the Apparel eLexicon Prototype


hardware The development of this system was carried out mainly in the campus computer laboratory, where each workstation is a desktop connected to a LAN. However, the students could also use their own PCs at home or elsewhere to finish their work and send it to the mailbox of the lecturer's WebCT18 site.

Fig. 2.1.1 WebCT Mainpage of VTC


18 WebCT stands for Web Course Tools. It is a learning management platform that manages the delivery of e-Learning courses. The current version is WebCT 8 [Campus Edition].

software Besides the Internet facilities under the Microsoft Windows XP environment and the WebCT platform, students were given Microsoft Office 2003 so that they could make full use of Word and Excel for this research. Access was also reserved, to let the students convert the spreadsheet data into database format at a later stage if time allowed. Developing a lexicon database need not be too technical, because this Apparel eLexicon system targets small companies in the industry19. Assuming that every company has at least one Microsoft Windows-based PC with a business version of Microsoft Office that contains Word and Excel20, they could use this handy Apparel eLexicon system for reference on the terminology of their business in both Chinese and English, and could add more terms on their own.

procedure This is about building the electronic apparel lexicon prototype by group effort: preparing the source documents; inputting the English terms and codes to create version #1 of the individual data file in spreadsheet format; printing out for the 1st proofreading and update; appending the Chinese terms to create version #2; printing out for the 2nd proofreading and update; appending the Putonghua index to create version #3; printing out for the 3rd proofreading and update; appending the Cantonese index to create version #4; printing out for the 4th proofreading and update; creating version #5 of the individual file for consolidation into a grouped master spreadsheet file by append/merge; sorting in all ways and making a final check for all possible mistakes before converting the master file into the database file format; and sorting in all ways and embedding the results in the word processing file format for hardcopy publishing if necessary.

Fig. 2.1.2 Apparel eLexicon System Development Stages

19 Most companies in Hong Kong are SMBs (small- and medium-sized businesses).
20 Microsoft does not provide the Access database management system (DBMS) in the business version of Office, only in the higher-priced professional and enterprise versions.

personnel This project involved the lecturer and his two classes of 60 Higher Diploma in Fashion Business students at IVE, VTC. The teacher is from the Computing Department, and the students are from the Fashion and Textiles Department of the same campus. Besides the lecturer, the students and the institute, potential stakeholders include the apparel industry in Hong Kong, which can use this Apparel eLexicon system for reference.

data/information As each student would handle 20-30 entries of apparel terms, 60 students would yield around 1,500 entries as the initial data volume for this prototype. Ideally, there should be 10 times as many entries for real usage. This would be done in the coming semester with the four classes planned to take a similar subject with the lecturer.

networking Developing a database is not an easy task, because all frontline staff involved need to understand database concepts so as not to corrupt the data. In computing history, many firms developing database systems wasted much of their resources through mismanagement of data processing21. So this eLexicon system gathered the data entries from the individual students batch by batch22 via eMAIL23 from their standalone PCs, with caution. All data were then centralized and consolidated by the lecturer at this primitive level of the system.

2.2 Procedure Details


Step #1: Preparation of source data
As mentioned earlier, the teacher gathered the bilingual terms and assigned each student from the two classes of about 60 students a source data sheet for input. Each sheet was marked with the student's name for the record, to avoid duplication and to check for any missing data. Computing has three major functions working together: input, processing and output (I-P-O). Raw data is input into the computer; after processing, it is output as useful information. This information is stored for retrieval, and the retrieved information can, in turn, be reprocessed to give newer information. So the storage unit is considered to be both an input and an output device. However, many computing teachers teach students24 only the skills to process data or to develop the system, without telling them how to collect data, an important step in real business. As a result, most laymen overlook the importance of the source document.
21 Gates, Bill (1999). BUSINESS @ THE SPEED OF THOUGHT: USING A DIGITAL NERVOUS SYSTEM. NY: Warner Books Inc.
22 Batch processing is employed instead of interactive processing for the sake of simplicity and to avoid mistakes through greater data control with supervisor intervention.
23 This is how the researcher has spelled electronic mail for many years, to conform with other terms such as eBusiness, eCommerce, etc.
24 This practice is not up to university level.

Fig. 2.2.1 Basic Computing Functions Emphasizing on the Source Document/Raw Data Collection

Where do the bilingual apparel terms come from? This involves data gathering techniques. A database development company usually has a search team working on it; but as this project is just for educational practice, the researcher simply picked some specimens from published lexicon books in the field that could be bought in Hong Kong. Their entry formats were used as references in designing this prototype. They are listed below one by one for examination.

Reference #1: Clothing Glossary_Clothing Design Association

Fig. 2.2.2a Published Lexicon Books Used as References

Reference #2: Clothing Glossary_Man Kwong Publisher (left) #3: New Culture Publisher (right)

Fig. 2.2.2b Published Lexicon Books Used as References

Fig. 2.2.2c Published Lexicon Books Used as References

Reference #4: English-Japanese-Chinese: Textile and Apparel Dictionary

Fig. 2.2.2d Published Lexicon Books Used as References

Reference #5: Modern English Encyclopedia Handbook

Fig. 2.2.2e Published Lexicon Books Used as References

From the above five references, it can be seen that most of them simply list the English and Chinese terms in English alphabetical order; a few of them give a short Chinese explanation of some of the terms. It is not shown here, but in fact most of them are categorized. The researcher picked some pages from the Modern English Encyclopedic Handbook for testing25, and elaborated on its categorization. After consideration, five fields were set up for this Apparel eLexicon prototype: Type Code, English Term, Chinese Term, Putonghua Index and Cantonese Index. As this is just a test, each student was assigned only 20-30 entries. To simulate a real information production line, it is necessary to mark on the source document which student is responsible for which batch.
25 After examining all the references closely, it seems quite certain that some of them must have drawn on others to some extent.

Besides, since young people nowadays are not as responsible as earlier generations, the supervisor had to make a copy of the source documents in order to trace the validity of the data entered; otherwise the database might turn into a mess and be useless.

Step #2: Input just the English term with a code
Each student had to enter the category code along with the English term into a spreadsheet, and then send this version #1 to the teacher, who acted as the production supervisor, for centralization. The file management system (FMS) concept had to be taught first; otherwise the students26 might create inconsistent data files that could not be integrated into the master file easily, requiring greater effort at higher cost to trace and rectify errors afterwards. One FMS concept is to have a standardized file name for individual operators. Below are two samples. The file name of the student at the right is incorrect because it does not show the version; this causes confusion unless the file is opened for clarification. Standardized codes are also very important because they are used for tracing the category. Again, the student at the right typed the category name in full instead of a code, and left the rest of the code cells empty; this shows her carelessness. If a large database were being built in a sizable company, would the data be good enough to use if it were full of mistakes? Control at every stage of data processing is therefore a must for high-quality information production.

Fig. 2.2.3 Comparing Two Samples of Version #1 Typed by the Students
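The two controls described in Step #2, a standardized file name that shows the version and a single-letter category code in every row, lend themselves to an automatic check. The Python sketch below is only an illustration: the paper does not state the actual naming convention, so the pattern `<class>_<student>_v<version>.xls`, the function names, and the sample rows are all assumptions.

```python
import re

# Hypothetical convention: <class>_<student>_v<version>.xls, e.g. "A_ChanTM_v1.xls".
FILENAME_PATTERN = re.compile(r"([AB])_([A-Za-z]+)_v([1-5])\.xls")
VALID_CODES = {"A", "D", "S"}  # the codes named in Section 1.2


def parse_filename(name):
    """Return (class, student, version), or None if the name breaks the convention."""
    match = FILENAME_PATTERN.fullmatch(name)
    if match is None:
        return None
    cls, student, version = match.groups()
    return cls, student, int(version)


def rows_with_bad_codes(rows):
    """Return 1-based row numbers whose code is not a single valid letter."""
    return [n for n, (code, _term) in enumerate(rows, start=1)
            if code not in VALID_CODES]


print(parse_filename("A_ChanTM_v1.xls"))  # ('A', 'ChanTM', 1)
print(parse_filename("A_ChanTM.xls"))     # None: the version is missing
print(rows_with_bad_codes([("D", "skirt"), ("Dress", "gown"), ("", "hem")]))  # [2, 3]
```

Both mistakes made by the student at the right in Fig. 2.2.3, a file name without a version and a category typed in full, would be caught by checks of this kind before the file reaches the master file.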


26 The graduating students might well become frontline data processing staff if they could not find a design job.


Step #3: Proofread, update and add in the Chinese terms
The teacher/supervisor saved the version #1 data file of each student/data-entry officer and printed it out with the file name added on the worksheet to identify who was responsible for which file. The printout was returned to the responsible officer, who would give it, along with the original data sheet, to a partner for proofreading, to check for any wrong entries. The proofread printout had to be signed by the partner and returned to the owner for update. The data sheet owner then added the Chinese term paired with its English origin to the worksheet and named it version #2. Even in the early 1980s, computer printouts from a mainframe that served hundreds of users would show the user_id and the file name on the cover page of the fan-fold paper. Nowadays, though computer technologies are far more advanced than before, computing learners often forget to put the filename and the username in the header or the footer of the printed pages for identification. As a result, the students themselves cannot identify which pages are theirs while working in a computer laboratory where one or a few printers are shared among dozens of students. Below are the print preview samples of two students. Needless to say, the student at the right could hardly get her printout back for proofreading because there is no identification on the page.

Fig. 2.2.4 Comparing Two Samples of Version #2 Typed by the Students

On the other hand, having a counterpart do the proofreading is very important because, psychologically, most people forgive themselves and put the blame on others. Given this human weakness, we can seldom trace our own mistakes. Nowadays, though IT is more advanced and almost everybody can be a computer user, most people unfortunately overlook the importance of data control and proofreading. This is why there are so many mistakes in the business world27.
27 These include the accounting scandals at the turn of the century and the recent economic tsunami; they were all initiated by some irresponsible parties in the leading IT nation, the USA!


Fig. 2.2.4 Importance of Proofread by Partner to Avoid Errors

Step #4: Proofread, update and add in the Putonghua pronunciation as an index
Again, version #2 had to be sent to the teacher for central saving and printing. The printout was returned to the responsible officer, whose partner did the second proofreading. Updates had to be made if there were any mistakes. The Putonghua pronunciation of the first character of the Chinese term for each entry was added by the same student. In this way, version #3 was created and sent to the teacher again. After consideration, the designer/teacher decided that only the first Chinese character would be given Pinyin. A careful look at the publishing industry in mainland China shows that it uses Pinyin for book titles as an indexing/stock-taking method. A reference book used in this research (a Cantonese Pronouncing Dictionary) is shown below as an illustration.

Fig. 2.2.5 Using Pinyin as an Index for Book Title in All Books in the Mainland China (left). Pinyin Intonation Has to be Typed Using Special Character Tables (right)


Here, it can be seen that no intonation symbols are used. So the researcher also asked the students not to put in the intonation, for simplicity. This is very important for user-friendliness in typing search keys. Requiring intonations would only make end users reluctant to use the system (see the illustration in Fig. 2.2.5 above, at right).
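The two rules for the Putonghua Index, Pinyin for the first character only and no intonation symbols, can be checked mechanically. The sketch below is a rough Python illustration, not part of the original spreadsheet workflow; the function names are hypothetical, and the six-letter limit rests on the assumption that the longest Pinyin syllables (such as "zhuang") have six letters.

```python
import re
import unicodedata

# Assumption: a toneless Pinyin syllable is 1 to 6 lowercase ASCII letters.
SINGLE_SYLLABLE = re.compile(r"[a-z]{1,6}")


def strip_tones(pinyin):
    """Remove tone marks: decompose, drop combining marks, lowercase.

    Note: this also turns "ü" into "u", which a real system may not want.
    """
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch)).lower()


def is_valid_index(cell):
    """True if the cell looks like one toneless syllable (a rough check only)."""
    return SINGLE_SYLLABLE.fullmatch(cell) is not None


print(strip_tones("qún"))         # qun
print(is_valid_index("qun"))      # True
print(is_valid_index("qún"))      # False: tone mark present
print(is_valid_index("lian yi"))  # False: more than one syllable
```

The check is deliberately loose: it cannot tell whether the Pinyin is the right one for the character, and two short syllables typed without a space could slip through, so it would support the supervisor's proofreading rather than replace it.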

Fig. 2.2.6 Comparing Two Samples of Version #3 Typed by the Students

Step #5: Proofread, update and add in the Cantonese pronunciation as another index
Version #3 was saved centrally by the teacher; but this time, the teacher/supervisor would proofread all individual worksheets himself, because a layman/data entry clerk/student might not be so proficient in the standardized Putonghua pronunciation. The hand-corrected printout was returned to the students for update. The Cantonese pronunciation of the first character of the Chinese term for each entry was added; version #4 was created and sent to the teacher. The two samples of version #3 illustrated above show that, even though it was emphasized that only the first Chinese character needs the phonetic symbols, some careless students still gave Pinyin to more than one character (the 15th and the 16th entries of the student in Fig. 2.2.6 at right). Some also typed in the intonation. As the students are not real office frontline staff, the teacher/supervisor has to look closely at all entries in order to make a high-quality prototype of this Apparel eLexicon system. It was found that, even though the students had learnt Putonghua Pinyin in their curriculum, they might not have mastered the Pinyin for the Chinese characters well (entries #21-#23 in Fig. 2.2.6 at right were wrong). Some students, however, knew how to use an online system to get the corresponding Pinyin for the terms28. On the other hand, one might assume that, since Hong Kong citizens speak Cantonese, they should have no problem working out the Cantonese pronouncing symbols for the Chinese characters.
28 This is a URL found in a student's worksheet to facilitate Pinyin: http://www.putonghuaweb.com/.


In fact, the students faced two problems. First, over the last two decades, as more and more non-Cantonese-speaking migrants have come from the mainland to Hong Kong, many people in Hong Kong do not pronounce Cantonese well. Second, since Cantonese is just a dialect, the Chinese government does not enforce an official system of pronouncing symbols for it, so over half a dozen Cantonese pronouncing systems are prevalent29. To play safe, the researcher asked the students to stick to the system provided by the HONG KONG TELEPHONE DIRECTORY, which has been used in Hong Kong for at least 30 years. A copy of the Cantonese index was given to each student to aid input30.

Fig. 2.2.7 Comparing Two Samples of Version #4 Typed by the Students

Step #6: Final proofread, update and return to the teacher for consolidation
Version #4 was saved centrally by the teacher, who would proofread the Cantonese pronunciation in all individual worksheets himself. This is a tough job: special references had to be consulted because there are too many standards for Cantonese pronunciation, and none is the best. The hand-corrected printout was returned to the students for their update, to be named version #5. It was then sent to the teacher as the final individual version for consolidation. After processing, it was found that the students made more mistakes in the Cantonese pronunciation symbols than in the Pinyin. One of the reasons, besides those mentioned above, is that the telephone directory does not provide the full set of characters with corresponding pronunciation symbols. Even the researcher had to look up the directory index carefully and sort them out one by one. Characters not provided were added by hand by the researcher on his source pages (see Appendix I). Some students went online to get the Cantonese pronouncing symbols; unfortunately, these did not follow the system provided by the directory. This issue is examined in a later section of this paper in search of a better solution.
29 See Appendix II for details.
30 See a page shown in Appendix I.


Fig. 2.2.7 Sample of Version #5, the Finalized Version of a Student

Step #7: Consolidation, conversion, and embedment of data entries
Version #5 was saved centrally by the teacher, who would do the final check on all entries as the job of a production supervisor. He then integrated all the students' entries into a master spreadsheet file, which is good enough for data search in electronic format. However, for a higher level of data manipulation, a database version could be made. Besides, if the lexicon has to be published, all data entries could be sorted by various indexes and embedded in a word processing file for editing and printing. The figure below shows the centralized storage of the student files on the teacher/supervisor's workstation. Class A had 33 students, and Class B had 27 students31. Although Class A was larger, it had a better learning morale and performed better, with fewer mistakes in its files. Class atmosphere is a key success factor. Even when taught by the same lecturer, a class with one or more inattentive students can drag down the performance of the whole class. The comparisons shown in the previous steps reflect this: the weaker performers, from Class B, are always at the right side.

Fig. 2.2.8 60 Students from Two ISTC Classes Are Contributing to the Building of this Apparel eLexicon Prototype
31 Actually 59; one student deregistered.
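The optional database conversion mentioned in Step #7 can be sketched in a few lines. The paper reserves Microsoft Access for this purpose; the sketch below swaps in Python's built-in `sqlite3` module purely as a stand-in, and the table name, column names, and the two sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows in the five-field layout; the Cantonese index is left blank here.
rows = [
    ("D", "skirt", "裙", "qun", ""),
    ("A", "apron", "圍裙", "wei", ""),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lexicon ("
    "type_code TEXT, english_term TEXT, chinese_term TEXT, "
    "putonghua_index TEXT, cantonese_index TEXT)"
)
conn.executemany("INSERT INTO lexicon VALUES (?, ?, ?, ?, ?)", rows)

# Sorting by any index is just a query; the stored table is left untouched.
by_pinyin = [r[0] for r in conn.execute(
    "SELECT english_term FROM lexicon ORDER BY putonghua_index")]
by_english = [r[0] for r in conn.execute(
    "SELECT english_term FROM lexicon ORDER BY english_term")]
print(by_pinyin)   # ['skirt', 'apron']
print(by_english)  # ['apron', 'skirt']
```

Once the entries are in a table, each index (English, Putonghua, Cantonese, Type Code) becomes a different `ORDER BY` over the same data, which is the kind of higher-level manipulation the database version is meant to allow.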


3. An Apparel eLexicon Prototype in Spreadsheet Format


As said, developing a database is not an easy task, because all frontline staff involved need to understand database concepts so as not to corrupt the data. After appending all entries from the 60 students sequentially, from the first student of Class A to the last, and then those of Class B, 1,363 entries were integrated. During processing, many mistakes were found: some students did not hand in all versions (like the bottom-left sample in Fig. 3.0.0 below), some had messed up the filenames (the top two in the figure), some had typos of all sorts, etc. The supervisor corrected the significant mistakes during appending, for example, by adding or changing the category code, the Cantonese Index and the Putonghua Index.

Fig. 3.0.0 Four Students' Samples of the Data File Versions (the One at the Bottom-Right Being the Most Correct)

It is suggested not to rearrange the order of this master merged worksheet, so that the source can be traced easily in case of errors, but to copy it into other worksheets and sort those on any field to help detect mistakes. Below are some of the significant mistakes spotted on the master file at a glance. From these, it is noted that the proofreading by the students might not be so trustworthy.
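The practice described here, appending the student files in a fixed order and sorting only on copies, can be illustrated with a short Python sketch. The function names, the `seq` and `source` fields, and the sample file names and terms are hypothetical; the original work was done by hand in spreadsheet worksheets.

```python
def merge_sheets(sheets):
    """Append rows in hand-in order, tagging each with its source and sequence."""
    master = []
    for source, rows in sheets:
        for row in rows:
            master.append({"seq": len(master) + 1, "source": source, **row})
    return master


def sorted_copy(master, field):
    """Return a new list sorted on one field; the master list is not modified."""
    return sorted(master, key=lambda entry: entry[field])


# Hypothetical student files and entries.
sheets = [
    ("A_student01_v5", [{"english": "skirt"}, {"english": "apron"}]),
    ("B_student01_v5", [{"english": "button"}]),
]
master = merge_sheets(sheets)
by_english = sorted_copy(master, "english")
print([e["english"] for e in by_english])  # ['apron', 'button', 'skirt']
print([e["seq"] for e in by_english])      # [2, 3, 1]
print([e["english"] for e in master])      # ['skirt', 'apron', 'button']
```

Because every row carries its sequence number and source file, an error spotted in a sorted copy can be traced back to the student who typed it and corrected in the master.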


3.1 Master File Version #1


(A) Should not use capital letter in the leading character of the English term

Fig. 3.1.1 Mistake #1: The Leading Character of the English Term Should Not be in Capital Letter

(B) Should not have spacing in the Chinese term

Fig. 3.1.2 Mistake #2: The Chinese Term Should Not Have Spacing

(C) Should not have Intonation Symbols32 for the Putonghua Index

Fig. 3.1.3 Mistake #3: The Putonghua Index Should Not Have Intonation Symbols
32 There are only four intonations in Putonghua, with the symbols placed above the vowels. See also Appendix II 4.1 for further detail.


(D) Wrong Pinyin for the Putonghua Index

Fig. 3.1.4 Mistake #4: The Putonghua Index Has Wrong Pinyin

(E) Having an [Alt]+[Enter] control key typed at the Putonghua Index cell

Fig. 3.1.5a Mistake #5: The Putonghua Index Cell Has an [Alt]+[Enter] Typed (Hard to find if not sorted)

This kind of error was not easy to find. It was discovered only after the worksheet was sorted on the Putonghua Index and the sort sequence turned out to be wrong.

Fig. 3.1.5b Mistake #5: The Putonghua Index Cell Has an [Alt]+[Enter] Control Key Typed (Hard to find if not sorted)
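Several of the mistakes listed above, namely (A), (B), (C) and (E), are mechanical enough to be flagged by a short script before the human proofreading starts. The following is only a minimal sketch and not the procedure used in this project: the function name and the sample entries are invented for illustration, and it assumes that an [Alt]+[Enter] keystroke is stored in the cell as a line-break character. Mistake (D), wrong Pinyin, still needs a human reader or a reference dictionary, and a flagged capital letter may be a legitimate proper noun.

```python
import unicodedata

# Combining marks that carry the four Pinyin tones after NFD decomposition:
# grave, acute, macron and caron. The diaeresis of "ü" is deliberately excluded.
TONE_MARKS = {"\u0300", "\u0301", "\u0304", "\u030c"}


def find_mistakes(english, chinese, putonghua):
    """Return the labels (A, B, C, E) of the mistakes found in one entry."""
    mistakes = []
    if english[:1].isupper():                       # (A) leading capital letter
        mistakes.append("A")
    if any(ch.isspace() for ch in chinese):         # (B) spacing in the Chinese term
        mistakes.append("B")
    decomposed = unicodedata.normalize("NFD", putonghua)
    if any(ch in TONE_MARKS for ch in decomposed):  # (C) tone symbols in the index
        mistakes.append("C")
    if "\n" in putonghua or "\r" in putonghua:      # (E) embedded line break
        mistakes.append("E")
    return mistakes


# Hypothetical entries, not taken from the students' files.
print(find_mistakes("Button", "鈕 扣", "niǔ\n"))
print(find_mistakes("button", "鈕扣", "niu"))
```

Running such a check on every row of the merged worksheet would give the supervisor a list of candidate rows to inspect, instead of relying on sorting alone to reveal the hidden line breaks.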

3.2 Master File Version #2


At this stage, it is better to save the corrected master file as version #2 so that any remaining mistakes can be spotted. Each column has to be sorted and examined carefully.

(A) Further enhancement #1: More Categorization Codes were added for refinement

Since some of the entries share characteristics that allow further categorization, new codes were created for clarity. The Type Codes used are: A-Accessories, B-Bag, C-Cosmetics, D-Dress, H-Hat, M-Material, P-Personnel, S-Sewing, T-Tool, W-Washing and Y-Hosiery.

(B) Further enhancement #2: An English Term containing multiple terms would be split into multiple entries

Fig. 3.2.1 Further Enhancement #2: Multiple Terms to be Split into Multiple Entries

(C) Further enhancement #3: No spacing before the open bracket

Fig. 3.2.2 Further Enhancement #3: No Spacing before the Open Bracket

(D) Further enhancement #4: No duplicated English terms, only multiple Chinese terms

Fig. 3.2.3 Further Enhancement #4: No Duplicated English Terms, Only Multiple Chinese Terms

(E) Further enhancement #5: Turn the leading word of numeral in the English Term into Arabic number

Fig. 3.2.4 Further Enhancement #5: Turn the Leading Word of Numeral in the English Term into Arabic Number

(F) Further enhancement #6: Look up the sorted English Terms for any more duplication, and always make corrections in the Master_v2 worksheet

Use the [Find] function to help trace the entry.

Fig. 3.2.5 Further Enhancement #6: Look up the Sorted English Terms for Any More Duplication, and Always Make Corrections in the Master_v2 Worksheet
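Enhancements #2 to #5 are rule-based, so they can be sketched as small text-normalization functions. The code below is an illustrative sketch under stated assumptions, not the procedure followed in the project, where the corrections were made by hand in the spreadsheet: the function names are invented, the slash is assumed to be the separator between multiple terms, only the number words one to ten are handled, and the sample terms are made up.

```python
import re

# Assumption: only small number words need converting (enhancement #5).
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
                "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10"}
LEADING_WORD = re.compile(r"^([A-Za-z]+)(?=[ -])")


def normalise_term(term):
    """Apply enhancements #3 and #5 to one English term."""
    # Enhancement #3: no spacing before the open bracket.
    term = re.sub(r"\s+\(", "(", term.strip())
    # Enhancement #5: turn a leading numeral word into an Arabic number.
    match = LEADING_WORD.match(term)
    if match and match.group(1).lower() in NUMBER_WORDS:
        term = NUMBER_WORDS[match.group(1).lower()] + term[match.end():]
    return term


def split_terms(cell, separator="/"):
    """Enhancement #2: split a cell holding several terms into single terms."""
    return [normalise_term(part) for part in cell.split(separator) if part.strip()]


def merge_duplicates(entries):
    """Enhancement #4: keep one English term with all its Chinese terms."""
    merged = {}
    for english, chinese in entries:
        chinese_terms = merged.setdefault(english, [])
        if chinese not in chinese_terms:
            chinese_terms.append(chinese)
    return merged


print(split_terms("pants / trousers"))
print(normalise_term("three quarter sleeve"))
print(merge_duplicates([("lining", "裡布"), ("lining", "夾裡"), ("lining", "裡布")]))
```

Enhancement #6 remains a visual check on the sorted worksheet; a script of this kind only reduces the number of entries that have to be corrected by hand.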

3.3 Master File Version #3


Tracing duplicated entries and spotting errors are not easy tasks: although the electronic spreadsheet system assists with speedy rearrangement and instant updates, the human eye is still needed to spot the problems on screen. Finally, all updates were made and version #3 of the master file was established. Sorting was then carried out on every field.

(A) Sorting the Type Code

The master file (version #3) was copied to a new worksheet, which was sorted with the Type Code (category) as the primary sort key and the English Term as the secondary key.

Fig. 3.3.1a The Master_v3 Worksheet Retained the Original Order of Entries

Fig. 3.3.1b The Master_v3 Worksheet was Copied to the Sort-Type Worksheet for Sorting

Fig. 3.3.1c Sort-Type Worksheet was Sorted with Type as the Primary Key and English Term as Secondary Key

Fig. 3.3.1d Four Samples of the Sorted Sort-Type Worksheet with Type as the Primary Key and English Term as Secondary Key
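The same two-key sort can be expressed in a few lines of code. This is only a sketch with made-up entries; the field names are assumptions. Because `sorted` returns a new list, the master list keeps its original order, which mirrors the practice of copying Master_v3 to a separate Sort-Type worksheet before sorting.

```python
# Hypothetical entries using Type Codes from the list in Section 3.2.
master_v3 = [
    {"english": "thimble", "type": "T"},
    {"english": "belt", "type": "A"},
    {"english": "denim", "type": "M"},
    {"english": "button", "type": "A"},
]

# Primary key: Type Code; secondary key: English Term (case ignored).
sort_type = sorted(master_v3, key=lambda e: (e["type"], e["english"].casefold()))

for entry in sort_type:
    print(entry["type"], entry["english"])
```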

(B) Sorting the English Term

The ascending order of the English characters is as follows: 0 1 2 3 4 5 6 7 8 9 (space) ! " # $ % & ( ) * , . / : ; ? @ [ \ ] ^ _ ` { | } ~ + < = > A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [33]. In the second sample of Fig. 3.3.1d above, the position of Berlin among the other terms shows that the sort is not case-sensitive, which is in line with users' preference. Also in line with users' preference, a leading numeral word in an English term was converted into Arabic form for easy reference.

Fig. 3.3.2 The English Term Worksheet was Sorted in Ascending Order
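The effect of a case-insensitive sort on a term such as Berlin can be reproduced with a short sketch. Apart from Berlin, the sample terms are invented, and the key function used here does not reproduce the full spreadsheet ordering of punctuation quoted above; it only illustrates why Berlin lands between lower-case terms and why terms with a leading Arabic number come first.

```python
terms = ["cardigan", "Berlin", "2-piece suit", "apron"]

# A plain code-point sort places every capital letter before the lower-case ones.
case_sensitive = sorted(terms)

# Folding case before comparing gives the dictionary-like order seen in the worksheet.
case_insensitive = sorted(terms, key=str.casefold)

print(case_sensitive)
print(case_insensitive)
```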

(C) Sorting the Chinese Term

Fig. 3.3.3a The Chinese Term Worksheet was Sorted, Which Seemed to Have No Logical Base
[33] Information provided in the Help of Microsoft Excel.

In Fig. 3.3.3a above, the ascending order of the Chinese characters seems at first to go by stroke count, but on closer examination it does not appear to follow any logical order at all. Common sorting methods in Chinese dictionaries include stroke count, stroke form and pronunciation order [34]. None of these is used here, and neither is the order of the computing character encoding set [35]. The Chinese terms are therefore hard to trace by eye, so Putonghua and Cantonese Indexes were added for easy reference.

(D) Sorting the Putonghua Index

This index is quite handy if one knows the nationwide official Pinyin symbols.

Fig. 3.3.4 The Putonghua Index Worksheet was Sorted According to Pinyin

(E) Sorting the Cantonese Index

Though many Hong Kong citizens learn Putonghua, most of them have not mastered the official Pinyin symbols, so a Cantonese Index is needed. It is placed before the Putonghua Index because this Apparel eLexicon serves the business sector in Hong Kong first. However, there is another side to the coin. Though Cantonese is the mother tongue of Hong Kong citizens, too many Cantonese phonetic systems exist; the one chosen here is the popular one that has been adopted by the HONG KONG TELEPHONE DIRECTORY for at least thirty years. Yet even the students and the lecturer are not familiar with it, and its collection is limited, so general users would have much difficulty in using it. Further research is needed to solve this problem.
[34] Usually a Chinese-related dictionary provides several indexes: one is the Index of Phonetic Alphabet and another is the Index of Character Strokes (Beijing Language Institute (1987)).
[35] Is this a bug in Microsoft Office, or is the Chinese sorting sequence always overlooked in computing?

Fig. 3.3.5 The Cantonese Index Worksheet was Sorted According to the HONG KONG TELEPHONE DIRECTORY

4. An Apparel eLexicon Prototype Converted from Spreadsheet Format to Database Format

This Apparel eLexicon prototype system works mainly in the spreadsheet format, from the initial data entry by individual students to the merging of all entries produced by the group. Usually, a spreadsheet system of any brand can handle all the basic functions needed for this prototype, namely sorting and finding a term in the whole collection; adding new entries to the master file is an easy task too. All of this rests on one design concept: simple is the best. The system aims to serve small apparel shops that want to install just a basic electronic system to help them find the bilingual terms of their trade for use in their routine work. As mentioned earlier, there is one condition for using this eLexicon: it is assumed that every business firm has a personal computer with a basic Microsoft Office installed [36]. To test efficiency and effectiveness, the researcher converts the spreadsheet file into a database, using Microsoft Office 2003 for illustration [37].
[36] The researcher was a traditional computer professional who studied computing in the early 1980s, so he is not addicted to Microsoft. In the early days of computing some 30 years ago, software was free: it was invented by academics and shared freely. However, when he tried the free OpenOffice download offered by Sun Microsystems, he found that he still needed time to examine the system before recommending it to general users.
[37] Until now (2009), even many computing professionals prefer using Office 2003 instead of the 2007 version for simplicity and convenience.

4.1 Importing Spreadsheet File Data into the DBMS


First, open the database management system (DBMS), Microsoft Access 2003, and assign Apparel eLexicon Prototype as the database file name. Import external data from the Apparel eLexicon Prototype v3 spreadsheet file by choosing Microsoft Excel as the file type. Choose the worksheet "Master_v3" as the source of the external data, with column headings. Put the data in a new table, and do not use Type as the index.

Fig. 4.1a Getting the Apparel eLexicon Data from the Spreadsheet into the DBMS

Then choose English Term as the primary key, and keep the name Master_v3 for the newly created table in Access. Since a primary key is a field whose value uniquely identifies each record, it cannot be established if there are duplicated English terms; in that case Access prevents the setting of the primary key. It is then necessary to go back to the Master_v3 worksheet of the Excel file and correct all duplicated entries first. Finally, after all the fixes, there are 1,332 entries in the Apparel eLexicon Prototype ready for use and for expansion. As English Term is used as the primary key, all records are automatically sorted by it (see bottom-right of Fig. 4.1b below).

Fig. 4.1b The Apparel eLexicon Prototype Has 1,332 Data Entries Ready for Use and Expansion
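The behaviour described above, where duplicated English terms block the primary key, can be reproduced in any relational database. The sketch below uses Python's built-in `sqlite3` module rather than Microsoft Access, so it is an analogy and not the procedure used in this paper; the column names, helper functions and sample rows are assumptions made for illustration.

```python
import sqlite3
from collections import Counter

# Hypothetical rows: (English Term, Chinese Term, Type). "button" is duplicated.
rows = [("button", "鈕扣", "A"), ("denim", "牛仔布", "M"), ("button", "鈕", "A")]


def find_duplicates(entries):
    """List the English terms that occur more than once."""
    counts = Counter(english for english, *_ in entries)
    return sorted(term for term, n in counts.items() if n > 1)


def load(conn, entries):
    """Insert all entries in one transaction; report whether it succeeded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany("INSERT INTO master_v3 VALUES (?, ?, ?)", entries)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE master_v3 ("
    "english_term TEXT PRIMARY KEY, chinese_term TEXT, type TEXT)"
)

print(find_duplicates(rows))   # the terms to fix in the worksheet first
print(load(conn, rows))        # the primary key rejects the duplicated term
print(load(conn, rows[:2]))    # loading succeeds once the duplicate is removed
```

Unlike the Access datasheet, a SQLite table has no guaranteed display order, so a query would need an explicit `ORDER BY english_term` clause to list the records in primary-key order.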

4.2 Database Management


There are three aspects of database management: entering data, modifying or updating data, and presenting output reports. A DBMS must also make provision for adding new records. Besides, the main purpose of a DBMS is to make it possible to obtain meaningful information from the data in the database easily, with just a few clicks and some typing [38]. Here, different ways of making queries are illustrated.

(A) Query Sample #1: List ALL records whose English Term begins with "a", with the English Terms shown first in ascending order, followed by the Chinese Terms, the Cantonese Index, the Putonghua Index, and the Type code.

First, click on the Queries section. Choose the first option, Create query in Design view. Add the Master_v3 table; its fields will then be displayed in a pull-down box. Close the Show Table dialog. Click on the cells of the design grid to set the required fields, sort order and criteria.

Fig. 4.2a1 Query Sample #1: List ALL records beginning with "a", with the English Terms shown first in Ascending Order, followed by Chinese Terms, Cantonese Index, Putonghua Index, and Type.
[38] DICTIONARY OF COMPUTER AND INTERNET TERMS, pp. 127-8.

(A) Query Sample #1 (contd): When all requirements for the query have been set, click on the View option at the top-left corner below the File pull-down menu, and select Datasheet View to display the result. Those who are familiar with the structured query language (SQL) can choose SQL View to see how the SQL statements are written. Save the query with the label Query 1: Find ALL English Terms starting with a. Close the query, and its label is then shown.

Fig. 4.2a2 Query Sample #1 (contd)
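For readers who prefer to see the SQL, Query Sample #1 can be approximated as follows. The sketch runs against an in-memory SQLite database through Python's `sqlite3` module, not against Access, so the statement is not the one shown in the Access SQL View: SQLite uses `%` as the wildcard, whereas Access by default uses `*`. The column names and the three sample records, including their index values, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE master_v3 (english_term TEXT PRIMARY KEY, chinese_term TEXT, "
    "cantonese_index TEXT, putonghua_index TEXT, type TEXT)"
)
# Made-up sample records; the index spellings are placeholders only.
conn.executemany(
    "INSERT INTO master_v3 VALUES (?, ?, ?, ?, ?)",
    [
        ("button", "鈕扣", "nau", "niu", "A"),
        ("apron", "圍裙", "wai", "wei", "D"),
        ("acetate", "醋酸纖維", "cho", "cu", "M"),
    ],
)

# English Term first, in ascending order, restricted to terms beginning with "a".
query_1 = conn.execute(
    "SELECT english_term, chinese_term, cantonese_index, putonghua_index, type "
    "FROM master_v3 WHERE english_term LIKE 'a%' ORDER BY english_term ASC"
).fetchall()

for record in query_1:
    print(record)
```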

(B) Query Sample #2: List ALL records whose Chinese Term contains the character for cotton, with the Chinese Terms shown first in ascending order, followed by the English Terms, the Cantonese Index, the Putonghua Index, and the Type code. First, click on the Queries section. Choose the second option, Create query by using wizard. Select the fields accordingly and click Next. Name the query "Query 2_List ALL Chinese Terms relating to" followed by the character. Check the Modify the query design option before clicking Finish. Type the character, enclosed in asterisk wildcards (*), in the Criteria cell. All entries containing the character are then listed.

Fig. 4.2b Query Sample #2: List ALL records whose Chinese Term contains the character for cotton. Show Chinese Terms in Ascending Order first, followed by English Terms, Cantonese Index, Putonghua Index, and Type.
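A "contains" search of this kind corresponds to a wildcard on both sides of the search character. The sketch below again uses SQLite through Python rather than Access. It assumes that the search character is 棉 (cotton), and the column names, sample records and index spellings are invented. Note that SQLite sorts the Chinese terms by their encoded byte order, which is not a dictionary order either.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE master_v3 (english_term TEXT PRIMARY KEY, chinese_term TEXT, "
    "cantonese_index TEXT, putonghua_index TEXT, type TEXT)"
)
# Made-up sample records; the index spellings are placeholders only.
conn.executemany(
    "INSERT INTO master_v3 VALUES (?, ?, ?, ?, ?)",
    [
        ("padded jacket", "棉襖", "min", "mian", "D"),
        ("silk", "絲", "si", "si", "M"),
        ("cotton yarn", "棉紗", "min", "mian", "M"),
        ("cotton", "棉", "min", "mian", "M"),
    ],
)

search_character = "棉"  # assumption: the character for cotton
query_2 = conn.execute(
    "SELECT chinese_term, english_term, cantonese_index, putonghua_index, type "
    "FROM master_v3 WHERE chinese_term LIKE ? ORDER BY chinese_term ASC",
    ("%" + search_character + "%",),
).fetchall()

for record in query_2:
    print(record)
```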

(C) Query Sample #3: List ALL records with the Putonghua Index "da", sorted first by Type as the primary key in ascending order and then by English Term as the secondary key in ascending order, followed by the Chinese Terms, the Cantonese Index and the Putonghua Index. The procedure is the same, with just a small refinement of the query design. Even a layman who has practised with this system for some time could handle it easily.

Fig. 4.2c Query Sample #3: List ALL records with the Putonghua Index "da".
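The small refinement in Query Sample #3 is an exact match on the Putonghua Index combined with a two-key sort. As before, this is a sketch in SQLite through Python and not the Access query itself; the column names and the sample records, including their index spellings, are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE master_v3 (english_term TEXT PRIMARY KEY, chinese_term TEXT, "
    "cantonese_index TEXT, putonghua_index TEXT, type TEXT)"
)
# Made-up sample records; the index spellings are placeholders only.
conn.executemany(
    "INSERT INTO master_v3 VALUES (?, ?, ?, ?, ?)",
    [
        ("pleating", "打褶", "da", "da", "S"),
        ("zip", "拉鏈", "laai", "la", "A"),
        ("overcoat", "大衣", "daai", "da", "D"),
        ("basting", "打線", "da", "da", "S"),
    ],
)

# Type is the primary sort key and English Term the secondary key.
query_3 = conn.execute(
    "SELECT type, english_term, chinese_term, cantonese_index, putonghua_index "
    "FROM master_v3 WHERE putonghua_index = ? "
    "ORDER BY type ASC, english_term ASC",
    ("da",),
).fetchall()

for record in query_3:
    print(record)
```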

4.3 Adding New Records


New records can be added interactively to the master file of the spreadsheet or to the database. In practice, however, building an eLexicon for a trade needs group effort for efficiency. This will be the researcher's task in the next semester, beginning in September 2009, again using batch processing rather than interactive processing. For the time being, as the Apparel eLexicon Prototype was developed by the 60 graduates, each of them is eligible to obtain this primitive version via email upon request.

5. Embedding the Apparel eLexicon Prototype from the DBMS into a Word Processing (WP) File

Though the paperless society has been promoted for two decades, it has not been very successful because of human nature: flipping through traditional hardcopy books is still preferred to reading electronic pages [39]. This is why the publication business is still flourishing. The researcher therefore tries to turn the Apparel eLexicon database entries into a WP file using Microsoft Word 2003. Below is just an illustration using the Cantonese Index as the key.
[39] The success of Kindle, a portable reading device with a screen display introduced by Amazon in 2007, remains to be proven through 2010. By the way, people have not got used to a somewhat similar gadget, the tablet PC, over the past 10 years because of its inconvenience and lack of user-friendliness.

First, open the Apparel eLexicon database in Microsoft Access 2003. Go to the Reports object and choose Create report by using wizard. Select the fields accordingly, with the Cantonese Index first, followed by the Chinese Term, English Term, Type, and the Putonghua Index; then click Next. Add Cantonese Index as a grouping level and click Next. Choose the Chinese Term in Ascending order and click Next. Select the Outline 1 layout in Portrait orientation, with the field width adjusted, and click Next.

Fig. 5 To turn the Apparel eLexicon Database Entries into a WP file using the Microsoft Word 2003

Use the Bold Title Style and click Next. Type in Apparel eLexicon Prototype - Cantonese Index as the Report Name, and choose Preview the report before clicking Finish. Click the inverted triangle at the OfficeLink button, and choose Publish it with Microsoft Word.

Fig. 5 To turn the Apparel eLexicon Database Entries into a WP file using the Microsoft Word 2003 (contd)
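The logic of the report, grouping by the Cantonese Index and sorting by the Chinese Term within each group, can be sketched without Access as well. The code below is only an illustration: the function name, the field names, the line layout and the sample entries with their index spellings are assumptions, and the output is plain text rather than a formatted Word document.

```python
from itertools import groupby


def build_report(entries):
    """Group entries by Cantonese Index; sort by Chinese Term within a group."""
    ordered = sorted(entries, key=lambda e: (e["cantonese"], e["chinese"]))
    lines = []
    for index, group in groupby(ordered, key=lambda e: e["cantonese"]):
        lines.append(index)  # group heading
        for e in group:
            lines.append(
                f"  {e['chinese']} | {e['english']} | {e['type']} | {e['putonghua']}"
            )
    return lines


# Made-up sample entries; the index spellings are placeholders only.
entries = [
    {"cantonese": "min", "chinese": "棉紗", "english": "cotton yarn", "type": "M", "putonghua": "mian"},
    {"cantonese": "si", "chinese": "絲", "english": "silk", "type": "M", "putonghua": "si"},
    {"cantonese": "min", "chinese": "棉", "english": "cotton", "type": "M", "putonghua": "mian"},
]

print("\n".join(build_report(entries)))
```

The resulting lines could be saved as a text file and opened in any word processor, which is a cruder route than the Publish it with Microsoft Word option described in the text.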

As the researcher's workstation has both Microsoft Office 2003 and 2007 installed, the Microsoft Access Report system automatically generates the WP file in rich text format (rtf) for the higher version (Word 2007). The file can be opened in Word 2003 too.

Fig. 5 To turn the Apparel eLexicon Database Entries into a WP file using the Microsoft Word 2003 (contd)

There are 68 pages in the Word file, which corresponds to the page count of the Access Report. Note that the appearance of the two files is not the same. In this way, a hardcopy Apparel Lexicon could be prepared by merging. One interesting point should be highlighted here: even though most companies have installed Microsoft Office, they do not fully utilize all its facilities. In fact, even Word is a good tool for desktop publishing. In 1996, when the researcher was editing the HONG KONG COMPUTER DIRECTORY at the Hong Kong Productivity Council, a similar method of transferring data was used. Although only dBASE III+ was used to capture the data, which were then transferred into Word 3.1, it still worked.

Fig. 5 To turn the Apparel eLexicon Database Entries into a WP file using the Microsoft Word 2003 (contd)

6. Summary and Discussion

This research paper presents how to build an electronic bilingual lexicon for the textile and clothing industry (generally called apparel). The project was initiated for several reasons. Firstly, the researcher is a lecturer teaching computing, and he was assigned to teach the concepts of information systems in textile and clothing. Using such a project to guide graduating students, with their limited computing concepts, in developing an information system closely related to their industry is therefore quite pragmatic and meaningful. Secondly, the researcher is a graduate of the master of arts in computer-aided translation (MACAT) programme at the Chinese University of Hong Kong; he also earned a master's degree in management science researching Chinese computing, and has been continually learning Putonghua teaching. It is thus quite natural to apply all this knowledge in developing a Chinese/English cross-reference lexicon with Putonghua and Cantonese Indexes. Thirdly, the researcher has experience as a production supervisor in database systems dating back to 1979, and he knows well that this way of processing data is productive, efficient and effective for handling the lexicography of different trades. It also happened that the Translation Department of his alma mater was holding an international conference on the teaching of computer-aided translation, so the researcher submitted the paper proposal to his teacher, Professor Chan Sin-wai, for academic sharing. It is hoped that this paper and its related work (the Apparel eLexicon Prototype in spreadsheet, database, and word processing formats) will contribute to all learners and practitioners, especially in the apparel business. Besides, this kind of work (computerized system development) could be applied to any business, so a greater eLexicon covering all trades could be developed as time passes. In fact, this project is not highly technical, nor does it employ much academic theory.
It is meant to be developed by first-year university-equivalent students and used by general business people. So it employs just a handy Windows-based personal computer with an office suite that includes a spreadsheet program. If more functions are needed, a database management system should not be too difficult to add, and a word processor could help produce a well-ordered hard copy of the data entries. In short, "simple is the best" is the researcher's motto. He believes that most computer hardware and software today are created just for the sake of profit and may not be a necessity. One significance of this research paper is that the researcher draws on his roles as both teacher and production supervisor to share with the public how to teach college students to apply their information systems concepts in developing a lexicon of their trade by simulating an IT production line. The whole procedure is described, with all precautions highlighted. It is hoped that this product, the Apparel eLexicon Prototype, will be helpful in the teaching of computer-aided translation. Lastly, follow-up work will be done in the next semester, in September 2009, to upgrade the prototype to a more pragmatic level.

Appendix I: The Cantonese Pronunciation Index of the Hong Kong Telephone Directory

Appendix II: Some Facts about the Cantonese and Putonghua Pronunciation Systems

1 There are seven dialects in Chinese (Rao Bingcai 2000:718).

2.1 Chinese phonology has been established for over a thousand years (Wang Li 1998:1).

2.2 The 22 Putonghua initials can be placed in a poem (ibid, p.23), THE PEACE SONG: z y j n m x b d f l c r sh g s t p zh k q h ch

2.3 A monk (sharman) in the Tang Dynasty (618-907 AD) created 36 Chinese alphabets (ibid, pp74-76).

3.1 There was a set of 30 alphabets before the 36 alphabets of the Tang Dynasty. Traces can be found in the remains unearthed at the Dunhuang Cave (Wang et Pan 1992:44).

3.2 Before the Putonghua pronunciation standard was established in the early 20th century, Chinese used the Reverse-Sectional Method, in which the initial of a first sample character and the final of a second sample character were combined to form the correct pronunciation of a character, e.g. (dong) = (du) + (zong) (ibid, pp48-51).

3.3 Before the Putonghua pronunciation standard, the Cantonese pronunciation method was widely used. It had four tones that were different from the four Putonghua tones. These four tones, glossed as normal, rising, falling and rapid-subtle, were still taught in Hong Kong primary schools during the early 1960s. Modern Putonghua retains only the first three tones, with the first tone split into two, high-level and middle-rising, while Cantonese still has the last tone (ibid, pp71-74).

4.1 Tone symbols with signs were worked out by Yale University (Deng Yinglie 2002:7).

Putonghua
Tone   Pitch value   Description
1      55            high level
2      35            middle rising
3      214           falling rising
4      51            high falling
-      1             light

Cantonese
Tone   Pitch value   Description
1      53(55)        high level (falling)
2      35            middle rising
3      33            middle level
4      21            low falling
5      23            low rising
6      22            low level
7      55(5)         high level
8      33(3)         middle level
9      22(2)         low level

4.2 The transliteration of Putonghua: Putonghua, or Mandarin, is the standard Chinese that takes the Beijing spoken language as its criterion and the northern dialects as its basis. Pinyin, also called Hanyu Pinyin or the Chinese Phonetic Alphabet, is a reasonable spelling scheme for Putonghua which meets the common phonetic rules of Latin letters. It is now accepted by international communities and ISO as the norm for spelling Chinese proper names. It was adopted by the Chinese State Council on Nov. 1, 1957, and its popularization in China was approved by the 5th session of the 1st CNPC on Feb. 11, 1958 (ibid, p9).

4.3 The Yale Romanization System is the Chinese spelling scheme created by Yale University in the 1940s (ibid, p10).

4.4 The Wade-Giles spelling system was worked out in 1876 by Thomas Francis Wade, the ambassador of England to China at that time, using English spelling rules. Before Pinyin was publicized in China, it had been used in the Mainland for spelling Chinese proper names, and it is still used in Taiwan at present (ibid, p10).

4.5 Zhuyin Zimu, also called Chuyin Tzumu or the National Phonetic Alphabet, is formed by square letters which were worked out by Chinese, using Hanzi strokes to note the pronunciations of Hanzi. In the alphabet, initials and finals are drawn up separately. The Zhuyin Zimu was published in 1913 under the control of the Spelling Unification Association and went public under the order of the Education Ministry of the Northern Government of that time. In Mainland China, Pinyin is now used in place of Zhuyin Zimu in education, but Zhuyin Zimu is still used in Taiwan to note the pronunciation of Chinese characters (ibid, p10).

4.6 Cantonese, also named Baihua, is not only used widely in Guangdong and Guangxi, but has also spread to Hong Kong, Macao and the overseas Chinese. Its importance is therefore known to everyone (ibid, p21).

5 Professor Liu Kwok-fai of the City University of Hong Kong used the Cantonese pronunciation method of the famous early-20th-century Chinese linguist Zhao Yuanren to make the phonetic symbols for the famous International Phonetic Association (IPA) test sample, THE NORTH WIND AND THE SUN. However, most Cantonese speakers would not be able to read out the Chinese text from his phonetic symbols (Liu 2002:9-10).

6 CIYUAN (transliterated as THE ORIGIN OF THE CHINESE CHARACTERS) is one of the authoritative Chinese dictionaries among the Chinese communities, published since 1948 and updated periodically. One of its characteristics is that it gives both the Putonghua and the Cantonese pronunciation. On the Putonghua side, it uses the traditional methods, the Mandarin Pronouncing Symbols and their corresponding Romanized system. Its Cantonese counterpart is an authoritative standard that could be adopted for Cantonese indexing.

7.1 Cantonese started using IPA in 1941, through Wong Shek-ling. There have been around 10 Cantonese systems ever since, due to the laissez-faire policy of the Hong Kong Government (Ho Ying-hung 2003:1).

7.2 Cantonese has been established for over 2,000 years.

References
AMERICAN HERITAGE ILLUSTRATED ENCYCLOPEDIC DICTIONARY. Boston: Houghton Mifflin Company. 1987.
Beijing Foreign Languages Teaching and Research Publisher (1997) THE WARMTH MODERN CHINESE-ENGLISH DICTIONARY. Taipei: Warmth Publisher. (Written in Chinese.)
Beijing Languages Institute (1982) A CONCISE CHINESE-ENGLISH DICTIONARY. Beijing: The Commercial Press. (Written in Chinese.)
Beijing Languages Institute (1987) CHINESE-ENGLISH-FRENCH: A HANDBOOK OF CHINESE JOURNAL TERMINOLOGY. Hong Kong: Sanlian Book Store Hong Kong Branch. (Written in Chinese.)
Chan, Sin-wai (2002) TRANSLATION AND INFORMATION TECHNOLOGY. Hong Kong: The Chinese University of Hong Kong.
Chan, Sin-wai (2004) A DICTIONARY OF TRANSLATION TECHNOLOGY. Hong Kong: The Chinese University of Hong Kong.
Chan, Sin Wai et Pollard, David E. (2001) An Encyclopaedia of Translation (Chinese-English, English-Chinese). 2nd ed. Hong Kong: The Chinese University Press.
Chau, Siu-cheung (ed.) (1996) TRANSLATION AND LIVING. Hong Kong: Commercial Press. (Written in Chinese.)
Chau, Siu-cheung (ed.) (1997) ADVANCED TRANSLATION. Hong Kong: Commercial Press. (Written in Chinese.)
Chau, Siu-cheung (ed.) (2003) ELEMENTARY TRANSLATION. Hong Kong: Commercial Press. (Written in Chinese.)
Chau, Siu-cheung et Chan Yuk-jim (ed.) (2004) THE THEORY AND PRACTICE OF INTERPRETING. Hong Kong: Commercial Press. (Written in Chinese.)
Chik, Hon-man et Ng, Lam Sim-yuk (1989) CHINESE-ENGLISH DICTIONARY: CANTONESE IN YALE ROMANIZATION, MANDARIN IN PINYIN. Hong Kong: New Asia Yale-in-China Chinese Language Center, The Chinese University of Hong Kong.
Commercial Press (Hong Kong) Co. Ltd, The (1990) COMMERCIAL PRESS NEW PHRASES DICTIONARY.
Commercial Press (Hong Kong) Co. Ltd, The (1990) COMMERCIAL PRESS NEW WORDS DICTIONARY.
Commercial Press International Co. Ltd, The (2000) XINHUA DICTIONARY WITH ENGLISH TRANSLATION. Beijing.
Deng Yinglie (ed.) (2002) 20,000 HAN CHINESE DICTIONARY WITH EXPLANATIONS IN CHINESE, JAPANESE, KOREAN, VIETNAMESE, ENGLISH AND RUSSIAN PRONUNCIATION AND EXPLANATION. Shanghai: Shanghai Dictionary Publisher.
Downing, Douglas; Covington, Michael et al (2009) DICTIONARY OF COMPUTER AND INTERNET TERMS. (10th ed.) Hauppauge, NY: Barron's Educational Series, Inc.
Gates, Bill (1999) BUSINESS @ THE SPEED OF THOUGHT: USING A DIGITAL NERVOUS SYSTEM. NY: Warner Books Inc.
Halliday, M. A. K.; Teubert, W.; Yallop, C. et Čermáková, A. (2004) LEXICOLOGY AND CORPUS LINGUISTICS. London: Continuum.
Ho Ying-hung (2003) THE NINE TONES OF THE CANTONESE DIALECT. Hong Kong: The Future Cultural Publisher. (Written in Chinese.)
Hong Kong Credit Bureau (1977) DATABASE CREDIT MONITOR. (A weekly publication)
Hu, Run Sang (ed.) (1993) NEW ENGLISH PRACTICAL HANDBOOK. (6th printing) Taipei: Hsiung Feng Publisher. (Written in Chinese.)
Hua Tong Editorial Board (1995) THE CIYUAN DICTIONARY. (Written in Chinese.)
Hutchins, W. J. (1986) Machine translation: Past, Present, Future. Chichester: E. Horwood; New York: Halsted Press. (Free electronic version: http://ourworld.compuserve.com/homepages/WJHutchins/PPF-TOC.htm)
INTERNATIONAL (ENGLISH-CHINESE, ENGLISH THROUGH ENGLISH) DICTIONARY. Taipei: Kai Ming Book Store, 1974. Reprinted.
Liu, Godfrey K. (2002-3) STUDIES IN CANTONESE LINGUISTICS. 3 vols. Hong Kong: Lo Tat Cultural Publishing Company. (Bilingual publication.)
Liu, Jingzhi (ed.) (1991) TRANSLATORS HANDBOOK. Hong Kong: Commercial Press. (Written in Chinese.)
Lu, Jianming (2004) A RESEARCH ON THE CHINESE LANGUAGE USAGE IN THE 1980S. Beijing: Commercial Press. (Written in Chinese.)
Ma, Dong (ed.) (2006) THE CROSS-CULTURAL SHARING AND THE ANALYSIS OF LANGUAGE USAGE. Changsha, Hunan: Erlu Publisher. (Written in Chinese.)
Ma, Xianbin (2005) AN ANALYSIS ON THE WORD USAGE OF MODERN CHINESE. Beijing: Beijing University Press. (Written in Chinese.)
Man Kwong Publishing Co. (2007) Modern English Encyclopedia Handbook. 7th ed. Hong Kong. (Written in Chinese with English illustration.)
Manser, Martin H. et al (ed.) (2001) CONCISE ENGLISH-CHINESE/CHINESE-ENGLISH DICTIONARY. Hong Kong: Oxford University Press (China) Ltd.
McArthur, Tom (1997) LONGMAN LEXICON OF CONTEMPORARY ENGLISH. 22nd impression. Edinburgh Gate, Harlow: Addison Wesley Longman Ltd.
McArthur, Tom et McArthur, Feri (1992) THE OXFORD COMPANION TO THE ENGLISH LANGUAGE. Oxford: Oxford University Press.
PCCW (2009) THE HONG KONG COMMERCIAL TELEPHONE DIRECTORY.
Ping Jun (ed.) (2006) BASIC CHINESE KNOWLEDGE. Hong Kong: Commercial Press. (Written in Chinese.)
Putonghua Web (http://www.putonghuaweb.com/)
Qia Yannong (ed.) (1972) THE CHINESE DICTIONARY WITH CANTONESE AND MANDARIN PRONUNCIATION. (19th ed.) Hong Kong: Hong Kong Overseas Chinese Language Publisher. (Written in Chinese.)
Rao Bingcai (ed.) (2000) A HAKKA DIALECT PRONOUNCING DICTIONARY. Guangzhou: Guangdong People's Publisher. (Written in Chinese.)
Shelly, Gary B.; Cashman, Thomas J., et Vermaat, Misty E. (2002) MICROSOFT OFFICE: INTRODUCTORY CONCEPTS AND TECHNIQUES. Boston: Course Technology, Thomson Learning.
SYSTRAN Language Translation Software (with online trial version: http://systransoft.com)
Tang, M. C. (ed. 1993-1997) HONG KONG COMPUTER DIRECTORY. Hong Kong: Hong Kong Productivity Council. (A yearly publication)
Tang, M. C. (2003) BSC-SWOT/QFD-Sunzi strategic management model for a semi-governmental nonprofit educational organization in Hong Kong. (Doctoral dissertation, The International Management Centres (IMC)).
Tang, M. C. (2006) An attempt to compare the Chinese translations in Hong Kong, mainland China and Taiwan of the Microsoft Office Online's article "What is My Office Online?". (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C. (2006) A comparison on the macro- and micro-structures of four bilingual paperback dictionaries. (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C. (2006) Literature review on the traditional approach to translation before the modern era. (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C. (2006) A research proposal to an attempt to compare the Chinese translations in Hong Kong, mainland China and Taiwan of the Microsoft Office Online's article "What is My Office Online?". (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C. (2006) The trend of translation and translating. (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C. (2007) A comparison among three different approaches in computer translation: Corpus-, Example- and Statistical-based approaches. (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C. (2007) Tokenization, Segmentation, POS Tagging and Sentences Alignment on bilingual Texts (English Source / Chinese Target) in machine translation (MT). (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C., & Clubb, O. L. (1993) CHINESE COMPUTING: HISTORY AND CURRENT TRENDS. Hong Kong: Tamerind Publisher.
Tang, M. C., Chong, K. W., Lo, O. K., & Luo, H. D. (2006) Bilingual Electronic and Online Dictionaries Comparison. (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C., Deng, L. L., & Lee, A. I. (2007) An Evaluation on the Use of Translation Technology in Translation Practice. (Master's assignment, The Chinese University of Hong Kong).
Tang, M. C., & Ye, Qingfang (2008) Building an English Translation Corpus for a Local Chinese Newspaper in Hong Kong Using SDL Trados Synergy 2007 (Domain Six: Martial Arts (Wushu)) Project Report. (Master's final year project assignment, The Chinese University of Hong Kong).
Wang Li (1998) THE HAN CHINESE PHONETICS. Hong Kong: China Book Store (Hong Kong) Co. Ltd. (Written in Chinese.)
Wang, Shouming et Pan Wenguo (1992) THE HAN CHINESE PHONOLOGY. Shanghai: Huadong Normal University Publisher. (Written in Chinese.)
Wong Ling-shek (1941) CANTONESE PHONETICS COLLECTION. Hong Kong: China Book Store (Hong Kong) Co. Ltd. (Written in Chinese.)
Yu, Guangzhong (2006) AS THE LANGUAGE GURU SAID: CHINESE AND THE WESTERN. Hong Kong: Commercial Press. (Written in Chinese.)
