Telephone Survey
0), optimum sample sizes with repeated surveys and reduction in cost with interpretation:

First, we list the notation as below:
$M$ = number of prefix areas in the population (in Waksberg's analysis, a prefix area comprises the first eight digits of the 10-digit telephone numbering system),
$m$ = number of sample prefix areas desired,
$k+1$ = cluster size in sample,
$n$ = total sample size of residential numbers = $m(k+1)$,
$\rho$ = intraclass correlation within prefix areas,
$P_i$ = number of residential numbers in the $i$th prefix area,
$\pi = \sum_i P_i/(100M)$, i.e., the proportion of residential numbers in the population,
kM = number of residential units in the population,
$t$ = proportion of prefix areas with no residential numbers,
$\sigma^2(X)$ = sampling variance of the statistic being estimated,
$\sigma^2$ = population variance for the statistic,
$s$ = number of surveys to be carried out with the sample design,
$C_u$ = cost of an unproductive call, i.e., a call to a nonresidential number,
$C_p$ = cost of a productive call; includes both interviewing and processing costs.

Cluster Sampling with PPS and with Some Prefix Areas Having No Residential Telephone Numbers (that is, Some $P_i = 0$ and $t > 0$)

Assuming that $P_i > k+1$ in the remaining $(1-t)M$ areas, the expected total number of calls is the sum of the expected number of calls needed to select a sample of $m$ prefix areas with $P_i > 0$ and the expected number of calls within the selected areas, i.e.,
$$E(\text{total calls}) = m\pi^{-1} + \sum_i \{100k \cdot mP_i(100P_i\pi M)^{-1}\}, \quad i = 1, \ldots, (1-t)M,$$
$$= (m\pi^{-1})[1+(1-t)k] \qquad (1)$$
The associated expected cost is given by
$$C = m(k+1)C_p + \{m\pi^{-1}[1+(1-t)k] - m(k+1)\}C_u = m(k+1)[C_p + (1-\pi-t)C_u\pi^{-1}] + mtC_u\pi^{-1} \qquad (2)$$
All variables in the expression are positive real numbers, so that $\pi < 1-t$.
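The expected number of calls in (1) and the two forms of the expected cost in (2) can be checked numerically. The following is a minimal sketch, assuming (1) is read as $(m/\pi)[1+(1-t)k]$ and (2) as written above; the function names and the input values are illustrative assumptions, not figures from Waksberg's analysis.

```python
def expected_total_calls(m, k, pi, t):
    """Equation (1): expected calls = (m / pi) * [1 + (1 - t) * k]."""
    return (m / pi) * (1 + (1 - t) * k)


def expected_cost(m, k, pi, t, c_p, c_u):
    """First form of equation (2): productive calls cost c_p each,
    every remaining (unproductive) call costs c_u."""
    productive = m * (k + 1)
    unproductive = expected_total_calls(m, k, pi, t) - productive
    return productive * c_p + unproductive * c_u


def expected_cost_factored(m, k, pi, t, c_p, c_u):
    """Second form of equation (2), grouped by m(k+1) and m."""
    return m * (k + 1) * (c_p + (1 - pi - t) * c_u / pi) + m * t * c_u / pi


# Hypothetical inputs chosen only for illustration.
m, k, pi, t, c_p, c_u = 100, 5, 0.2, 0.65, 10.0, 1.0
calls = expected_total_calls(m, k, pi, t)
cost = expected_cost(m, k, pi, t, c_p, c_u)
```

With these hypothetical inputs the two forms of (2) agree, which is a quick way to confirm that the algebraic regrouping of the cost expression is consistent with (1).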
By introducing the sampling variance of the statistic being estimated, $\sigma^2(X) = \sigma^2(1+\rho k)[m(k+1)]^{-1}$, where $\rho$ is the intraclass correlation within clusters (or the measure of homogeneity of the cluster, using Hansen, Hurwitz and Madow's term), and expressing the expected cost as
$$C = mC' + mnC'', \quad \text{where } n = k+1 \qquad (3)$$
$$C' = tC_u\pi^{-1} \qquad (4)$$
$$C'' = C_p + (1-\pi-t)C_u\pi^{-1} \qquad (5)$$
the optimal values of $m$ and $k$ (and hence $n$) can be determined for a single survey. Note that from (3) onwards $n$ denotes the cluster size $k+1$ rather than the total sample size. To determine the optimal values of $m$ and $k$ is to minimize the function
$$F = \sigma^2(X) + \lambda(mC' + mnC'' - C) \qquad (6)$$
where $\lambda$ is the Lagrangian multiplier. Differentiating (6) partially with respect to $m$ and $n$ and setting the derivatives to zero, we get
$$\partial F/\partial m = -\sigma^2[1-\rho+\rho n]\,m^{-2}n^{-1} + \lambda(C' + nC'') = 0 \qquad (7)$$
$$\partial F/\partial n = \sigma^2\rho(mn)^{-1} - \sigma^2[1-\rho+\rho n](mn^2)^{-1} + \lambda mC'' = 0 \qquad (8)$$
Solving for $\lambda$ in (7) and substituting into (8), we can solve for $n = [C'(1-\rho)/(C''\rho)]^{1/2}$, that is,
$$\text{optimum } n = \text{optimum } (k+1) = \left[\frac{tC_u}{\pi C_p + (1-\pi-t)C_u}\cdot\frac{1-\rho}{\rho}\right]^{1/2} \qquad (9)$$

Comparing cluster sampling with simple random sampling (srs) with replacement, or srs in an infinite population (so that the finite population correction factor can be ignored), a sufficient condition for cluster sampling to be more efficient is $\rho < 0$ (but this is an unlikely outcome). If $\rho = 0$, cluster sampling and simple random sampling are equally good, while cluster sampling is worse than simple random sampling when $\rho > 0$.
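The optimum cluster size in (9) can be illustrated numerically. The sketch below assumes the cost components are read as $C' = tC_u/\pi$ and $C'' = C_p + (1-\pi-t)C_u/\pi$, and that for a fixed budget the variance is proportional to $(1-\rho+\rho n)(C'+nC'')/n$, which follows from substituting $m = C/(C'+nC'')$ into the variance expression. All function names and input values are hypothetical.

```python
import math


def cost_components(pi, t, c_p, c_u):
    """Return (C', C'') as in equations (4) and (5)."""
    c1 = t * c_u / pi
    c2 = c_p + (1 - pi - t) * c_u / pi
    return c1, c2


def optimum_cluster_size(pi, t, rho, c_p, c_u):
    """Equation (9): n = sqrt(C' (1 - rho) / (C'' rho)), requires 0 < rho < 1."""
    c1, c2 = cost_components(pi, t, c_p, c_u)
    return math.sqrt(c1 * (1 - rho) / (c2 * rho))


def relative_variance(n, pi, t, rho, c_p, c_u):
    """Variance for a fixed budget, up to the constant sigma^2 / C."""
    c1, c2 = cost_components(pi, t, c_p, c_u)
    return (1 - rho + rho * n) * (c1 + n * c2) / n


# Hypothetical inputs chosen only for illustration.
n_opt = optimum_cluster_size(pi=0.2, t=0.65, rho=0.05, c_p=10.0, c_u=1.0)
```

In practice the cluster size must be an integer, so one would compare the relative variance at the integers on either side of `n_opt`.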
$M$ = the number of PSU's in the population,
$I$ = the set of all PSU's in the population, where $I = \{1,2,3,\ldots,M\}$,
$m$ = the number of PSU's in the sample, which is randomly selected with replacement,
$i$ = the set of all sampled PSU's, $i = \{1,2,3,\ldots,m\}$,
$N$ = the number of phone numbers contained in each PSU (usually taken as 100),
$P_I$ = the proportion of auspicious phone numbers in the $I$th PSU,
$P^*_I$ = the proportion of residential phone numbers in the $I$th PSU,
$\bar{P} = (\sum P_I)/M$ and $\bar{P}^* = (\sum P^*_I)/M$,
$M'$ = the number of PSU's in the population that contain no auspicious phone numbers,
$L = M'/M$ = the proportion of PSU's with $P_I = 0$.

Expected Numbers of Phone Numbers Dialed

For the first stage, the total number of phone numbers dialed is fixed and is equal to $mc$. The expected number of auspicious phone numbers dialed in the first stage is then $\bar{P}mc$ and the expected number of residential numbers is $\bar{P}^*mc$. In the second stage, the expected number of phone numbers dialed is equal to $(1-L)mkc$ for total phone numbers, $\bar{P}mkc$ for auspicious numbers and $\bar{P}^*mkc$ for residential numbers. On dividing the last two results by $(1-L)mkc$, $\bar{P}/(1-L)$ is the fraction of auspicious phone numbers and $\bar{P}^*/(1-L)$ is the fraction of residential numbers. Note that these fractions are independent of $c$ but are sensitive to the definition of an auspicious phone number.

Distribution of the Proportion Auspicious

If the distribution of the proportions of auspicious numbers is known, we can make a better choice of the value of $c$ and of the definition of an auspicious phone number. Let $X$ denote the number of auspicious phone numbers found among the $c$ numbers that are dialed in a PSU and $P$ be the proportion of phone numbers in a PSU that are auspicious, and assume that the prior distribution of $P$, $h(p)$, is a Beta distribution with parameters $\alpha$ and $\beta$ and that the likelihood, $h(x \mid p)$, is a Binomial distribution. That is, $P \sim \text{Beta}(\alpha,\beta)$ for $0 < p < 1$
When $c \ge 4$ and the results of the first stage of sampling are already available, a Pearson's $\chi^2$ goodness-of-fit test with $(c-3)$ degrees of freedom can be used to test the degree of fit for $h(x)$. Using the data gathered by Valley Forge Information Service in March 1983, Potthoff obtained a Pearson $\chi^2$-value of 0.04 with 1 degree of freedom, indicating that the Beta distribution for $h(p)$ can be retained. On the other hand, the density of $p$ for a Type I PSU is given by the conditional density $h(p \mid x=1)$, which equals $(1-L)h(x=1 \mid p)h(p)/h(x=1)$, and this is also a Beta distribution, with parameters $(\alpha+1)$ and $(\beta+c-1)$, so that the conditional mean of $p$ given $x$, $\alpha$ and $\beta$ is
$$E(P \mid x=1,\alpha,\beta) = (\alpha+1)/(\alpha+\beta+c)$$
and the conditional variance is
$$\text{Var}(P \mid x=1,\alpha,\beta) = (\alpha+1)(\beta-1+c)\{(\alpha+\beta+c)^2(\alpha+\beta+1+c)\}^{-1}$$
Note that, when $\alpha$ and $\beta$ are fixed, the conditional variance decreases as $c$ rises beyond a certain value: there is a particular $c^*$ which gives the maximum value of the conditional variance, and $c^*$ is given by the only admissible root of a quadratic equation,
$$c^* = \{-(3e+g) + [(3e+g)^2 - 8(ef+2eg-fg)]^{1/2}\}/4$$
where $e=\beta-1$, $g=\alpha+\beta+1$ and $f=\alpha+\beta$. This may be interpreted as saying that a higher value of $c$ is associated with a lower value of the conditional variance, that is, with higher precision. It is a favourable effect that, in the first-stage dialing, a higher value of $c$ will lead both to improved chances of detecting errors and to less serious consequences of undetected errors (Potthoff, 1987a). Although the survey cost incurred is higher when $c$ rises, the trade-off is often a favourable one. Apart from the value of $c$, the definition of an auspicious phone number also has an effect on the survey cost. [Potthoff suggested that] the definition should be such that, if one were to process the same phone number twice for classification as auspicious or inauspicious, the classification would be the same on both occasions.
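The conditional mean, the conditional variance and the variance-maximising value $c^*$ can be evaluated directly. The following is a minimal sketch, assuming the conditional distribution is Beta with parameters $(\alpha+1)$ and $(\beta+c-1)$ and that $c^*$ is the larger root of $2c^2 + (3e+g)c + (ef+2eg-fg) = 0$ with $e=\beta-1$, $f=\alpha+\beta$, $g=\alpha+\beta+1$; the function names and the parameter values are hypothetical and are not Potthoff's estimates.

```python
import math


def conditional_mean(alpha, beta, c):
    """Mean of Beta(alpha + 1, beta + c - 1)."""
    return (alpha + 1) / (alpha + beta + c)


def conditional_variance(alpha, beta, c):
    """Variance of Beta(alpha + 1, beta + c - 1)."""
    a = alpha + 1
    b = beta + c - 1
    return a * b / ((a + b) ** 2 * (a + b + 1))


def variance_maximising_c(alpha, beta):
    """Larger root of 2c^2 + (3e + g)c + (ef + 2eg - fg) = 0.
    Raises ValueError if the discriminant is negative."""
    e = beta - 1
    f = alpha + beta
    g = alpha + beta + 1
    disc = (3 * e + g) ** 2 - 8 * (e * f + 2 * e * g - f * g)
    return (-(3 * e + g) + math.sqrt(disc)) / 4


# Hypothetical prior parameters chosen only for illustration.
alpha, beta = 1.0, 2.0
c_star = variance_maximising_c(alpha, beta)
variances = [conditional_variance(alpha, beta, c) for c in range(1, 6)]
```

Since $c$ is a count of dialed numbers, $c^*$ is best read as the threshold beyond which each additional first-stage call lowers the conditional variance; for the illustrative parameters above the variance falls steadily over $c = 1, \ldots, 5$.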
Failure to observe this last criterion will cause the probability structure of the model to be violated (Potthoff, 1987a). These criteria are in fact the criteria of 'unambiguity' and 'consistency'.

In sum, despite their high coverage of residential telephone numbers, random digit dialing methods appear to be more expensive to implement than directory-based sampling. Two-stage procedures suffer losses in precision due to clustered sample selection. Other alternatives can be used instead of Class-I and Class-II Samplings; they are referred to as synthetic approaches.

2.3.5 Type-III Sampling - Synthetic Approach

Two examples of the Synthetic Approach are list-assisted and dual frame sampling, which are thought to combine the complementary strengths of the directory-based and random digit dialing methods. Plus d-digit sampling is a list-assisted procedure in which a sample is selected from a directory and an integer is added to the last d digits of the selected numbers. In practice 'd' is usually chosen as '1' or '2'. This sampling procedure includes both listed and unlisted telephone numbers, and it therefore yields a higher proportion of productive calls than the random digit dialing design. However, the plus d-digit approach meets some theoretical problems. In order to achieve a nonzero chance of selection for unlisted numbers, it must be assumed that unlisted numbers are evenly mixed among listed numbers, but this assumption is hard to verify. Even granted this assumption, the probabilities of selection of telephone households are likely to be unequal, because the samples are drawn from a directory which may have frame deficiencies. One remedy for this theoretical drawback is to replace the last d digits by randomly selected digits. But as d increases, the proportion of productive calls decreases, since the procedure then approximates the naive random digit dialing method.
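The two list-assisted variants described above can be sketched in a few lines. This is a minimal illustration, assuming telephone numbers are held as digit strings and that the addition wraps around within the last d digits (the wrap-around rule is an assumption, not something specified in the text); the function names and the example numbers are hypothetical.

```python
import random


def plus_d_digit(number, d=1, increment=1):
    """Add an integer to the last d digits of a directory number.
    Wrap-around modulo 10**d is an assumption of this sketch."""
    head, tail = number[:-d], number[-d:]
    new_tail = (int(tail) + increment) % (10 ** d)
    return head + str(new_tail).zfill(d)


def randomise_last_d_digits(number, d, rng):
    """Replace the last d digits by randomly selected digits."""
    head = number[:-d]
    return head + str(rng.randrange(10 ** d)).zfill(d)


# Hypothetical directory sample used only for illustration.
directory_sample = ["25551239", "25557400", "25550199"]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
plus_one = [plus_d_digit(x, d=1) for x in directory_sample]
randomised = [randomise_last_d_digits(x, 2, rng) for x in directory_sample]
```

Keeping the leading digits fixed is what ties the generated numbers to working banks found in the directory; the larger d becomes, the weaker that tie and the closer the design is to unrestricted random digit dialing.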
On the other hand, we may stratify the prefix areas of the telephone numbers according to the auxiliary information provided by a commercial list, and then apply the random digit dialing design stratum by stratum. This approach may be referred to as list-assisted sampling based on telephone number samples. We have mentioned that there are at least some commercial lists available in Hong Kong, so this approach may be applicable. However, the quality of such lists may not be known, since the providers usually do not release information about data quality, so the quality of the frame lists is outside the control of the users. Another alternative is dual frame sampling, which selects a sample from a directory or commercial list frame while, simultaneously, a random digit dialing sample is selected from the frame of all possible telephone numbers. Numbers are called in both samples and interviews are attempted in all telephone households. It should be noted that a special estimator of the mean of a characteristic is constructed, and this estimator is composed of two parts: a total domain estimate and a directory domain estimate. Compared with the random digit dialing approach, the dual frame approach gains in precision (smaller variance of the poststratified estimator) and has lower nonsampling error (an advance letter can help to build rapport). Further, the survey cost is lower than that of the random digit dialing approach because of the higher proportion of productive numbers in the directory sample. The largest drawback of this approach is the need to use a special estimator. Further investigations both of alternative estimators and of the value of this approach in telephone surveys are suggested.
To conclude this section, considering only the theoretical aspects, since Hong Kong has high telephone penetration, a well-designed telephone numbering system and well-edited telephone directories, we recommend systematic or clustered telephone directory-based sampling as the first option of sampling strategy in conducting telephone surveys, because it is feasible and simple to implement with considerable representativeness. Empirical confirmation of this assertion will be presented in Part II. On the other hand, it is suggested that more experimental studies should be done in Hong Kong to test Potthoff's generalized sampling scheme; otherwise, we do not know the applicability of this scheme in Hong Kong. In November 1996, the SSRC of the University of Hong Kong is conducting a clustered RDD experiment to provide the necessary data for testing this kind of sampling scheme. Having presented the available probabilistic telephone sampling strategies, we are going to discuss computer-assisted telephone interviewing (CATI) and the present situation of adopting CATI in Hong Kong. First-hand data are collected for this purpose since there are no available data about CATI in Hong Kong. Survey errors inherent in telephone surveys will be dealt with in Chapter Three, which is the core theoretical part of this thesis, with particular reference to Hong Kong.

2.4 CATI in Hong Kong Market Research Industry

2.4.1 What is CATI? A General Introduction

With the rapid development of computer technology in recent years, computer-assisted telephone interviewing (CATI), which is considered a cost-effective and efficient means of data collection, has emerged as a product of statistical and computer science. These changes have arisen not only because of technological developments but also because of the increasing cheapness of computer hardware and the availability of portable microcomputers and microcomputers for home use.
(Stray, 1989) There are several related variants such as computer-assisted personal interviewing (CAPI), computer-assisted self-administered questionnaires (CASQ), computer-assisted panel research (CAPAR) and computer-assisted self interviewing (CASI) (see Nicholls, 1986 & 1988; Stray, 1989; Winter and Clayton, 1990). All of these make use of computer facilities when processing data, and these computerized, non-traditional data collection methods comprise the family of computer-assisted data entry (CADE). On the other hand, a recently developed channel for dialing IDD calls is the worldwide Internet computer network, which combines the telephone and computer systems to place international calls. In addition, the success of interactive videoconferencing (and even virtual conferencing) is a brilliant breakthrough for visual methods of communicating via telephone lines, which were originally a voice-based system. All of these suggest that CATI should have a bright future. This section presents some important features of CATI systems and the experiences in the U.S., the U.K. and Hong Kong. The assessment of data quality and operation costs incurred using CATI will be put under the Section of Telephone Survey Errors and Survey Costs in Chapter Three.

A generic definition of CATI is provided by Nicholls (1988): Computer-assisted telephone interviewing (CATI) employs interactive computing systems to assist interviewers and supervisors in performing the basic data-collection tasks of telephone interview surveys. This definition clearly excludes surveys that employ computers solely for data entry and processing. This misconception of CATI is, however, found in the research industry in Hong Kong, which reflects the rather weak knowledge of CATI there. Detailed research findings are presented in Section 2.4.4.
Stray (1989) suggests that in a typical CATI system a central computer downloads to a microcomputer the names and telephone numbers of the people to be interviewed, as well as details of the questionnaire to be used. The conception and types of CATI, however, depend largely and jointly on the ways in which CATI is used and the fields of application. CATI systems range from standalone computer workstations to networked computer systems (minicomputers or mainframe servers). The capabilities of a typical CATI system, according to Nicholls (1988), include sample management, online call scheduling and case management, online interviewing, online monitoring, automatic record keeping and preparation of data sets. Nicholls (1986) considers that online call scheduling affects both the interviewer productivity level and those measures of data quality that depend on the frequency and efficiency of calling. Of course, the degree of capability is largely related to the computer system connected. Nicholls (1988) listed three complementary as well as competing functions of CATI, with their derived benefits. These functions are
1. CATI as a means to facilitate or expedite telephone surveys;
2. CATI as a means to enhance and control survey data quality; and
3. CATI as a means to permit new types of surveys not possible with pen-and-pencil methods.
The benefits derived from function (1) are:
(1a) fast and simple methods for questionnaire setup,
(1b) direct support of interviewers in selecting respondents,
(1c) entry of responses in machine readable form to speed processing, and
(1d) simple methods of generating output files and administrative reports.
For function (2), the benefits derived are:
(2a) systematic control over the scheduling of calls and callbacks,
(2b) tailored wording of computer questions based on prior responses,
(2c) computer-controlled branching between questionnaire items and sections,
(2d) automatic range and consistency edit checking during the interview, and
(2e) careful monitoring of interviewer performance to ensure that intended procedures are followed.
For function (3), the derived benefits are the ability to:
(3a) randomize question sequences or question wording in complex factorial designs,
(3b) incorporate arithmetic calculations or logical checks not readily performed with paper methods,
(3c) utilize table look-up routines to match responses with lists of possible alternatives,
(3d) use data from prior interviews (or records) in dependent interviewing without disclosing that data to the interviewer in advance.
It should be noted that the third function may be treated as an extension of the second one; however, the first function is not necessarily consistent with the other two, since at the least the sophistication of the setups differs.

Concerning questionnaire design, there are three schools of CATI questionnaire design, which depend on the degree of sophistication of the CATI system. Item-based CATI is a forward-movement-only design, and items therefore have to be answered in sequence. Screen-based CATI displays several items at a time and uses the screen as the basic unit of questionnaire design, so it is especially suitable for sets of noncontingent questions on the same topic and for multiple answer questions. Although branching is computer controlled, it is still a forward movement design. Form-based CATI permits the interviewer to use cursor control keys to complete table-formatted questions in any order. It allows both forward and backward movement, toggling between the 'pages' of the questionnaire.
The last may be called the hybrid design, which permits item-based, screen-based and form-based display in different parts of the same questionnaire. Multiple windows can accommodate both item-based and form-based CATI concurrently on the same screen. Different schools of questionnaire design require the interviewers to have different knowledge and skills, so the training required differs. With the development of 'hypertext' and 'hypermedia' in computer programming architecture, forward and backward movement should become the dominant design. The trend in new designs of CATI systems seems to be prompted more by integration into broader systems than by enhancement of the CATI system per se. That is, CATI systems will be embedded as one of many functions performed by larger computer-based survey systems. The future of CATI appears bright, given the continuing marked success of computer technology and the drastically declining price of computer systems. However, the popularity of CATI still depends mostly on the level of knowledge of the research practitioners, the social definition of the telephone and the image of surveys among the general public. Further research on these aspects is important.

2.4.2 The Experiences in the United States

Telephone interviewing started in the United States and is a relatively new form of data collection in the U.K., Western Europe and Asia. On the experience of CATI in the U.S., Nicholls (1986, 1988) summarized as follows: The first CATI systems were developed by U.S. market research organizations in the early 1970s. Based on experiences in the first CATI survey, conducted by Chilton Research for AT&T in 1971, Nelson, Peyton and Bortner (1972) described "three distinct advantages" for cathode ray tube interviewing (as it was then called) in comparison with conventional data collection methods. These were: "accuracy, speed, and reduced costs". (Nicholls, 1986).
During this period, market research firms using CATI were also found in other countries outside the U.S., such as Australia, Canada, the Netherlands and West Germany. University survey research centers began independent development of CATI in the mid 1970s. The UCLA Center for Computer-Based Behavior Sciences led the way in implementing CATI. The Center's Director, Gerald Shure, coined the CATI name and acronym (cf. Gerald H. Shure and Robert J. Meeker, "A Mini-Computer System for Multiperson Computer-Assisted Telephone Interviewing" in Behavior Research Methods and Instrumentation, Vol.10, No.2, 1978, pp.196-202). Most importantly, the university research centers expanded CATI capabilities from non-probabilistic sampling to probabilistic sampling, and they also improved the range of call scheduling and callback routines in order to maintain high response rates and greater freedom of interviewer movement. U.S. governmental agencies demonstrated an early interest in CATI but did not begin acquiring their own CATI capabilities until the early 1980s. The U.S. Department of Agriculture Statistical Reporting Service and the U.S. Census Bureau both established staff teams for the implementation and testing of CATI surveys. It should be noted that a common government application of CATI is for case follow-up, so that mixed modes of data collection (e.g. CAPAR, longitudinal demographic surveys, and CAPI when respondents are not reachable by phone) may also be used if necessary.

The development of CATI in Hong Kong is similar to that described above for the United States. Although there has been no systematic study of the history of using CATI in Hong Kong, a survey on the current practice of CATI in Hong Kong research organizations has been conducted by myself, and the results are presented in Section 2.4.4.
2.4.3 The Experiences in the United Kingdom

The development of CATI in the United Kingdom might be expected to be as advanced as that in the United States. However, CATI in the United Kingdom has not been well documented, although we have searched different sources, for example the Journal of the Market Research Society (JMRS) (there are, however, some advertisements about CATI in its Research Plus Magazine) and the Internet via several search engines. Two pieces of related information were found from the Internet search and JMRS; they concern the use of computers in market research and in the transport industry in the UK.

Taylor Nelson AGB plc is a leading market research company and has offices in London, Paris, Madrid and Brussels. Tony Cowling (1996), the Group Chief Executive of Taylor Nelson AGB plc, reported that the group uses a wide range of technology to collect, analyse and disseminate information, including the use of barcode technology to collect data on consumer purchasing, and that it collects this by computer over the telephone so that it can provide rapid and accurate information for decision making. The group considered that the telephone is certainly an important tool for interviewing, and that when it is linked to a computer system it becomes even more powerful. The group has one of the largest Computer Aided Telephone Interview (CATI) systems in Europe. Some 250,000 interviews were completed in 1995, many of them international. Computers are, of course, also used for the analysis and manipulation of data, and some of the information the group provides is in electronic format.

Jones and Polak (1993) acknowledged that computer-assisted telephone interviewing is now well established and has quite quickly become the (market research) industry norm for telephone surveys, but that the use of computers for face-to-face interviews is still relatively limited in social and market research in the UK.
One exception is in the transport industry, where computer-based interviews are now the norm for more complex forms of conjoint analysis research and several consultants have developed proprietary software. Jones and Polak (1993) suggested two reasons why the take-up of computer-assisted/based interviewing in the UK has been relatively limited. The reasons are (1) for the largest and most complex forms of survey, the main impediments to using computers are probably lack of familiarity (both on the part of the executives and their field force), and (2) the cost of the hardware and the software, and the problems of distributing and retrieving computers in an industry where the same interviewer may work for several companies. Companies working in the transport sector have got around these problems by writing their own software and offering clients 'value-added' facilities using computers. For general household-based personal interviews, computers are still a less attractive proposition, except when using a dedicated field force with simpler machines. The authors also comment that there might be a role for The Market Research Society in encouraging an industry-wide transition to using computers for certain types of survey.

In providing a historical account of CATI in the UK, the coming section draws heavily on the paper by Collins (1983). It should be noted that the data cited are about thirteen years old, so they may not be applicable to the United Kingdom today. To 'explain' the phenomenon that, although CATI is well established in the U.S. as a data collection method in both market and social research, this is not the case in other English-speaking countries such as the United Kingdom, Collins (1983) stated that the U.K. environment is very different because (1) telephone interviewing is not widely used for interviews with private households; (2) it tends to be regarded as an inferior and limited substitute for face-to-face interviewing.
Further, collecting data with the aid of electronic machines is by no means equivalent to using CATI, and other means such as videotex and cable are more popular in some U.K. research agencies; and (3) telephone interviewing and CATI are after all two logically distinct methods, so they could grow separately. Collins (1983) pointed out two main negative pressures on computer-assisted data entry (CADE), which hence apply to CATI. First, the claim that computer assistance will be beneficial in survey data collection may be regarded as only a gimmick. A computer may be seen as a barrier between the survey practitioners and respondents. The history of relative underachievement of computer technology in the U.K. contributes to the growth of suspicion regarding CADE. Second, unfamiliarity with the telephone in the U.K. is widely believed to impose severe limitations on the kind of data that can be collected, and on the length of an interview. It should be noted that the second argument is entangled with ingrained prejudices about the social image of the telephone in the U.K.

On the current status of CATI in the U.K., Collins (1983) cited the survey results on CATI facilities in U.K. survey organizations. These results were published by the Market Research Development Fund (MRDF). The number of interviewing stations ranges from 30 down to a standalone computer. This shows that some agencies adopt centralized (or at least networked) CATI systems, but the popular systems are based on microcomputers.

Table 3 : CATI Facilities in UK Survey Organizations (as at May, 1983)
[Table listing, for each organisation, the number of stations and the base system. Source: Market Research Development Fund, reproduced from Collins (1983).]

Concerning future investment in CATI, Collins (1983) said that, according to the survey figures, as many as one in five of the 180 suppliers of computer software said they were "certain" or "very likely" to invest (or to invest further) in CATI. Nearly half of both suppliers and users saw it [CATI] as having only limited value for either large and complex surveys or small and very simple surveys. In addition, about 90% of both suppliers and users of market research expected the use of telephone interviewing to increase, but about 25% were personally opposed to this development. Three broad categories of constraints and reservations on the growth of telephone interviewing (and hence of CATI) were listed by Collins (1983):
1. Low Telephone Ownership: The penetration rate of residential telephone ownership in the U.K. in 1983 was about 73%, and these households contained 77% of the adult domestic population. There is an approximately 4% annual increase in telephone ownership, but 85% is the expected ceiling penetration rate. Non-ownership is increasingly concentrated among the lower socio-economic status groups, who are often the focal respondents for academic research.
2. Limitations on the amount and depth of data that can be collected in a telephone interview: The majority view is that questions in telephone interviews have to be simple closed questions about unambiguous, non-sensitive factual topics and items that need no visual aids. This is one of the social prejudices against using the telephone to collect data, and it must be overcome for CATI to have a chance of wide acceptance.
3.
Public opposition or resentment of the greater intrusiveness of the telephone: Many U.K. researchers anticipate not only a lower level of agreement to take part in telephone surveys but also higher post-interview resentment. The confusion with telephone direct selling is also detrimental to the survey industry.

Nevertheless, Collins (1983) reported a comparison survey of telephone versus face-to-face interviewing conducted by the Social and Community Planning Research (SCPR) Survey Methods Centre. Collins stated that the general impression is of encouraging similarity between the two sets of results. On the response rate, Collins (1983) reported that it was about 5% below that achieved in face-to-face interviewing (65% response rate), a pattern similar to that reported by American researchers. However, the gain of a "high" response rate was offset by a higher incidence of refusal to cooperate. This certainly supports the industry prejudice against telephone surveys. Collins' paper describes the telephone interviewing and CATI practices in the U.K. research industry, but it does not say much about the development of CATI in governmental agencies and university academic institutes, so caution is necessary when making inferences.

2.4.4 Study of CATI in Hong Kong Research Industry

In order to profile the current CATI practice in the market research industry in Hong Kong, we carried out a 'census' by telephone interviewing in April 1995 (for the questionnaire and covering letter, see Appendix X & XI). The frame was taken from the sections on Marketing Research & Analysis and Data Processing Service of the Hong Kong Commercial/Industrial Guide Yellow Pages 94/95 (CIGYP). In a 'pilot screening' implemented in December, 1992, 123 valid entries were checked to see whether they were actually doing market research or related business. The result was that 24.4% (30 cases) were confirmed as 'eligible cases'.
(see Table 4 and Table 5) Since the frame is copied from CIGYP, the problem of coverage depends entirely on the completeness of the said directory. We believe that this should include 'all' working companies, since no one wants to be 'unlisted' in the business directory if he or she does want to do business. As expected, only two out of thirty cases (6.7%) claimed to have been using CATI, so they are treated separately (that is, they were asked the 'long-form' questionnaire) and the results are presented in Section 2.4.5 (cf. Table 4). In what follows, we first present the statistics of the telephone interview.

Table 4 : USECATI - AlreadyUseCATI
Response | Frequency | Percent | Valid Percent
No | 28 | 22.8 | 93.3
Yes | 2 | 1.6 | 6.7
Others | 93 | 75.6 | MISSING
Total | 123 | 100.0 | 100.0
Notes: 1. Valid Cases = 30; 2. Others = ineligible cases

The majority of the entries listed in the directory are not doing market research or related business: 51.7% (60 cases) are doing other business and 22.4% (26 cases) are not yet registered for doing business, while only 25.9% (30 cases) are verified as doing market research or related business. (see Table 5) Of these 30 'working' organisations, 93.3% (28 cases) say that they have never used CATI, but 39.3% (11 cases) of these organisations claimed that their computer systems are networked. It must be noted that the logical development of CATI is from a networked computer system.

Table 5 : BUSINESS - MainBusiness
Response | Frequency | Percent | Valid Percent
MktRes Related | 30 | 24.4 | 25.9
Others | 60 | 48.8 | 51.7
NotRegistered | 26 | 21.1 | 22.4
NoResponse/NA | 7 | 5.7 | MISSING
Total | 123 | 100.0 | 100.0
Note: Valid Cases = 116; Missing Cases = 7

Of the 28 organisations which have not yet used CATI, 32.1% (9 cases) have not yet decided whether to use CATI in the future, 35.7% (10 cases) do not know if they may use CATI in the future and 32.1% (9 cases) will definitely not use CATI.
(see Table 6)

Table 6 : INTEND - IntendToUseCATI
Response          Frequency   Valid Percent
NotYetDecided         9           32.1
NotKnown             10           35.7
Never                 9           32.1
Total                28          100.0
Note: Valid Cases = 28

A multi-response item then asked why they had not used CATI so far. Excluding the non-applicable and non-respondent cases, only 26 cases answered this item; the ranked reasons (by valid percent) are tabulated in Table 7:

Table 7 : WHYNOT1 - WhyNotUseCATI
Response              Freq/Cases   Valid Percent   Rank
UseOtherMethods         15/26          57.7          1
NoNeedToUseCATI         10/26          38.5          2
NoIdeaAboutCATI          7/26          26.9          3
Costly                   5/26          19.2          4
NeedSoftware             3/26          11.5          5
NeedTimeTrainStaff       2/26           7.7          6=
NeedHardware             2/26           7.7          6=
NoMarketTrend            1/26           3.8          8

It is important to note that CATI may not currently be suitable for Hong Kong market research, since other data collection methods are more widely adopted by the market research companies. If CATI becomes one of the 'popular' means of collecting research data, personnel in the market research industry will need to invest time and money to set up CATI, including software and hardware. In addition, practitioners need to acquire the necessary knowledge of CATI, because nearly one-third of the responses show that they do not have any idea about CATI. Last but not least, the following tables show the result (Table 8), number of attempts (Table 9) and time spent (Table 10) when conducting the telephone interview.

Table 8 : RESULT - ResultOfPhoneInterview
Response          Frequency   Percent   Valid Percent
Success               28        22.8         65.1
Refusal                3         2.4          7.0
Chased                 4         3.3          9.3
NoContact              6         4.9         14.0
ChangeBusiness         1         0.8          2.3
FaxLine                1         0.8          2.3
Others                80        65.0       MISSING
Total                123       100.0        100.0
Notes: 1. Valid Cases = 43; 2. Others = ineligible cases (lines no longer subscribed, etc.)

The success rate is 65.1%, which is considered acceptable in most telephone surveys.
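The 'Percent' and 'Valid Percent' columns in these tables differ only in their denominators: the former divides by all 123 directory entries, the latter by the valid (non-missing) cases only. The following is a minimal sketch of that arithmetic, using the Table 5 frequencies quoted in the text (30, 60 and 26 valid cases, 7 missing); the function name and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
def percent_columns(valid_freqs, n_missing):
    """Return {response: (percent, valid_percent)} rounded to one decimal.

    valid_freqs: frequencies of the valid (non-missing) responses.
    n_missing:   number of missing or ineligible cases.
    """
    n_valid = sum(valid_freqs.values())
    n_total = n_valid + n_missing
    return {
        response: (
            round(100.0 * freq / n_total, 1),  # base: all entries
            round(100.0 * freq / n_valid, 1),  # base: valid cases only
        )
        for response, freq in valid_freqs.items()
    }


# Frequencies as quoted in the text for Table 5 (hypothetical variable name).
business = {"MktResRelated": 30, "Others": 60, "NotRegistered": 26}
columns = percent_columns(business, n_missing=7)
for response, (pct, valid_pct) in columns.items():
    print(f"{response:15s} {pct:6.1f} {valid_pct:6.1f}")
```

The same function applies to any of the frequency tables in this section once the missing count is known.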
Table 9 : ATTEMPT - NumberOfContact
Response     Frequency   Percent   Valid Percent
One              19        15.4         59.4
Two               9         7.3         28.1
Three             1         0.8          3.1
Four              1         0.8          3.1
Six               2         1.6          6.3
Others           91        74.0       MISSING
Total           123       100.0        100.0
Notes: 1. mean = 1.8, median = mode = 1.0, s.d. = [illegible]; 2. Valid Cases = 32 (2 of them finally verified ineligible); 3. Others = ineligible cases

Nearly 60% of the 'successful' cases needed only one contact. The low number of attempts actually benefits from the pilot screening conducted in December last year.

Table 10 : TIME - NearestMinutesOfInterview
Response     Frequency   Percent   Valid Percent
Five              5         4.1         17.2
Six              11         8.9         37.9
Seven             2         1.6          6.9
Eight             3         2.4         10.3
Ten               8         6.5         27.6
Others           94        76.4       MISSING
Total           123       100.0        100.0
Notes: 1. mean = 7, median = mode = 6.0, s.d. = 1.9; 2. Valid Cases = 29; 3. Others = ineligible cases (93 cases) and one case replied by fax

As the questionnaire is quite short, the interview takes around 6 to 7 minutes, with a standard deviation of 1.9 minutes.

2.4.5 Case Study

2.4.5.1 Research Companies

According to the telephone interview, only two research companies in Hong Kong claimed that they have been doing CATI so far. The following descriptions should reflect the current practice of CATI in Hong Kong. For confidentiality, we refer to these companies as A and B. Each is regarded as one of the few large market research companies in Hong Kong.

The history of using CATI is very short for both companies: less than three years for company A, and since 1996 for company B. The spokesmen said that the CATI systems they are implementing are just at the 'testing' stage, although some projects have been done by CATI. The computer configurations/systems used for CATI are mostly personal computers (PCs). Networking (LAN) is used in company B, so that its CATI system is 'centralized', but the spokesman for company A did not know whether the system is networked or not.
The recent projects done by CATI concern a 'house-product' survey for company A and an 'auditing survey' for company B. Around 15 part-time interviewers are engaged in the house-product survey, whereas 10 to 20 full-time interviewers are involved in the auditing survey. The two companies are quite similar in several features of their CATI surveys; namely, both use an 'item-based' questionnaire design; a 'semi-automated' CATI system to deal with unexpected situations during the interview (e.g. sudden interruption by the respondents or inconsistent responses given by the respondents); 'automatic scheduling of call-backs' by the system; determination of 'optimal timing' by the system; and 'on-line monitoring' via the computer and telephone system, as well as audio monitoring by cassette tapes.

Although both companies are only at the 'testing' stage of their CATI systems, on the experience so far they agreed that, compared with door-to-door interviewing, CATI in general has 'better data quality', 'lower interviewing cost (after the setup cost)' and 'more or less equal interviewer-effect'. In addition, the spokesman of company A said that CATI in general has a 'higher response rate' and 'shorter interviewing time' than door-to-door interviewing. These are opposite to the viewpoints expressed by company B.

Both companies are quite 'optimistic' about the prospects of CATI in the Hong Kong research industry. They consider that CATI can be further developed in the industry since it is a worthwhile tool for collecting data, but suitable software (CATI computer programming, knowledge workers, etc.) and hardware (computer system, etc.) are indispensable conditions for developing CATI.
The high cost of running CATI (particularly the initial hardware and software setup) is one of the hindrances to implementing it; however, the marginal fieldwork cost is low once the CATI system has been established.

2.4.5.2 The Census and Statistics Department

The Section of Labour Statistics Branch - Employment and Vacancy of the Hong Kong Government has been using CATI for surveys since 1990. The CATI system is managed by statisticians. There are about 60 microcomputers based on a VAX system, and the CATI is thus centralized. The Section uses CATI to collect data on employment and vacancies on a quarterly basis. The projects usually involve around 50 temporary but trained workers. The questionnaire design is item-based. The CATI system can automatically deal with some specified unexpected situations during the interviewing process, but it cannot itself determine the 'optimal' call-back time. Monitoring is done through the computer system.

Although the Section has not done thorough studies assessing the effects of CATI, the statistician who is in charge of the system commented that CATI in general is not worse than personal interviewing (PI): the response rate for the two modes of data collection is more or less equal, but CATI yields better data quality since the data can be cross-checked against computerized records. The interviewer effect is lower in CATI because of on-line monitoring. Apart from the initial setup cost, the interviewing cost per respondent is lower for CATI than for PI. The interviewing time depends largely on the type of questions asked, so they cannot judge whether CATI is longer or shorter. However, the statistician emphasized that the efficiency of CATI is basically topic-dependent and hence CATI should not be applied blindly.
2.4.5.3 The Academic Institutes

Among the seven government-funded universities in Hong Kong, three institutes, namely the University of Hong Kong (HKU), the Hong Kong University of Science and Technology (HKUST) and the Hong Kong Baptist University (HKBU), have been using CATI for survey data collection, with the very recent addition of CUHK this year. The table below shows the use of CATI facilities:

Table 11 : CATI facilities in HKU, HKUST & HKBU (as at June, 1996)
Organisation                        Year of intro'd   Base           [two further columns; headings illegible]
Soc. Sci. Res. Centre, HKU          [illegible]       PCs            [illegible]
Soc. Sci. Dept., HKUST              1994              computers      Yes    Yes
Sociology Dept., HKBU               [illegible]       PCs            Yes    Yes
Notes: * Centralized server but individual data files; ** Data files stored on file server of the network.

They apply CATI to survey topics ranging from attitudinal (privacy attitudes) to factual (determinants of marital timing of women). Two of the organizations employ part-time workers for CATI surveys; the other one replied 'don't know'. The questionnaire designs differ entirely, and include item-, screen- and form-based formats. The CATI systems can deal with unexpected situations during the interviewing process automatically, or at least semi-automatically, but none of the CATI systems can determine the 'optimal' call-back time. Two of the organizations monitor the interviewing process through the computer system, while the other relies on the presence of a supervisor in the interviewing room. On the comparison of CATI with PI, most of them do not know much, since they have not yet tested the effects of CATI.
However, some of them considered that in general CATI has a higher response rate, shorter interviewing time, better data quality, lower interviewing cost and lower interviewer effect. When asked about the future of CATI in Hong Kong, they optimistically expect that CATI will dominate PI in the future, particularly when CATI can be written in Chinese and survey practitioners (especially in the market research companies) agree more on the usefulness of CATI.

2.5 A Callback model for Telephone Surveys

2.5.1 Introduction

Potthoff et al (1993) proposed a simple callback model which is suitable for some telephone surveys and requires estimates of the two parameters of a beta distribution. It is well known that estimates may be biased if survey respondents who differ in their availability also differ in their average characteristics. For example, employed and unemployed persons differ in their availability for interviewing, and biased survey responses may result. Remedies for nonavailability include making enough callbacks (this is often costly or even infeasible if the survey period is short) and assigning higher weights to sample persons who were less available; examples are the Politz-Simmons "nights-at-home" weighting scheme in 1950 and the Simmons weighting scheme in 1954, which allows for callbacks. Potthoff et al (1993) commented that (a) under these schemes, the question used to obtain the nights-at-home information may be perceived as intrusive and may not yield accurate answers, and (b) these schemes do not fully eliminate bias, because of "Class 0", the group not at home on any night of the period. Potthoff et al (1993) thus proposed several callback models which can avert the above problems. We focus only on models dealing with telephone surveys, for which a two-parameter model may suffice, as they speculated and confirmed.
In the coming sections, we summarise the key elements of Potthoff's two-parameter callback model that is suitable for telephone surveys.

2.5.2 Assumptions and Basic Concepts of the model

To simplify the exposition of the model, several key assumptions and basic concepts are listed below:

(a) Refusals do not exist; that is, anyone who is available to respond will freely do so (this assumption can be relaxed if the model becomes more sophisticated).

(b) Let the population mean of a response variable Y depend, in part, on p, the probability that a particular population member is at home and available to respond when an interviewer calls. Assume that p is beta distributed across population members and write $P \sim \mathrm{Beta}(\alpha, \beta)$, or
$f(p) = p^{\alpha-1}(1-p)^{\beta-1} / B(\alpha, \beta)$ (15)
where $0 < p < 1$, $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)$ is the beta function and $\Gamma(\cdot)$ is the gamma function, with $\Gamma(k) = (k-1)!$ when k is a positive integer.

(c) The conditional expectation of Y given p is assumed to be linear (other functions of p could be used, although this may require higher-order moments), i.e.,
$E(Y \mid p) = a + bp$ (16).
At this point Y is treated as continuous; modifications for discrete Y's are possible.

(d) Each sample member not interviewed earlier is to receive C callbacks (C > 0) before interview attempts are stopped. Callbacks exclude the first attempt, so for any sample member there are up to (C+1) attempts. Let X (x = 0, 1, 2, ..., C) denote the number of callbacks made and assume that the distribution of X conditional on P = p is geometric, i.e.,
$f(x \mid p) = p(1-p)^{x}, \quad x = 0, 1, 2, \ldots, \infty$ (17)
if there is no censoring or truncation for x > C. The marginal distribution of the number of callbacks x is thus given by
$f(x) = \int_0^1 f(x \mid p)\, f(p)\, dp = \dfrac{\alpha\,\Gamma(\alpha+\beta)\,\Gamma(\beta+x)}{\Gamma(\alpha+\beta+x+1)\,\Gamma(\beta)}$, where $x = 0, 1, 2, \ldots$ (18).
This is essentially a geometric distribution for the callbacks, mixed over the beta distribution of p.

The goal is to estimate the parameter E(Y), taking the expectation of (16) and assuming (15); that is,
$E(Y) = E[E(Y \mid P)] = a + b\,\alpha(\alpha+\beta)^{-1}$ (19)
where $\bar{P} = \alpha(\alpha+\beta)^{-1}$.
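As a numerical check on (15) to (19), the sketch below evaluates the marginal callback distribution (18), compares it with a simulation that draws p from the beta distribution and then counts geometric callbacks as in (17), and computes E(Y) from (19). The parameter values (alpha = 2, beta = 1, a = 1, b = 2), the function names and the simulation size are assumptions chosen purely for illustration; they do not come from Potthoff et al (1993). Censoring at C callbacks is ignored here.

```python
import math
import random


def beta_geometric_pmf(x, alpha, beta):
    """Marginal P(X = x) from equation (18), evaluated on the log scale."""
    log_f = (
        math.log(alpha)
        + math.lgamma(alpha + beta)
        + math.lgamma(beta + x)
        - math.lgamma(alpha + beta + x + 1)
        - math.lgamma(beta)
    )
    return math.exp(log_f)


def mean_response(a, b, alpha, beta):
    """E(Y) = a + b * alpha / (alpha + beta), equation (19)."""
    return a + b * alpha / (alpha + beta)


def simulate_callbacks(alpha, beta, n, seed=0):
    """Empirical distribution of X: draw p ~ Beta(alpha, beta) as in (15),
    then count failed attempts before the first contact as in (17)."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        p = rng.betavariate(alpha, beta)
        x = 0
        while rng.random() >= p:  # each attempt fails with probability 1 - p
            x += 1
        counts[x] = counts.get(x, 0) + 1
    return {x: c / n for x, c in counts.items()}


ALPHA, BETA = 2.0, 1.0  # illustrative values only
empirical = simulate_callbacks(ALPHA, BETA, n=20000)
for x in range(4):
    print(x, round(beta_geometric_pmf(x, ALPHA, BETA), 4), round(empirical.get(x, 0.0), 4))
print("E(Y) =", mean_response(1.0, 2.0, ALPHA, BETA))
```

With these illustrative values, (18) gives f(0) = 2/3, so about two-thirds of sample members would be reached at the first attempt.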
It is reasonable to assume that, conditional on p, the response variable Y is independent of the number of callbacks; that is,
$f(y, x \mid p) = f(y \mid p)\, f(x \mid p) = f(y \mid p, x)\, f(x \mid p)$ (20).
Thus, assuming the order of integration may be changed, one obtains
$E[E(Y \mid P) \mid x] = E(Y \mid x)$ (21)
and this is not true in general without (20). Further, it is easy to show that $P \mid x \sim \mathrm{Beta}(\alpha+1, \beta+x)$; that is, under the beta-geometric model the conditional distribution of P given x is again beta, with expectation
$E(P \mid x) = (\alpha+1)(\alpha+\beta+x+1)^{-1}$ (22).
Using results (16) and (22), we get
$E(Y \mid x) = E[E(Y \mid P) \mid x] = E[a + bP \mid x] = a + b\,E[P \mid x] = a + b(\alpha+1)(\alpha+\beta+x+1)^{-1}$ (23).
With $\alpha$ and $\beta$ given (they may be estimated using MLEs and the analysis can then proceed using the estimates of $\alpha$ and $\beta$ in lieu of the true values, but this will be biased since it should use $E[(\alpha+1)(\alpha+\beta+x+1)^{-1}]$ for each value of x, at the very least; alternatively, a full Bayesian solution may be applied in this respect), one can apply the usual least squares formulas to a sample of (x, y) to estimate a and b in (23). Finally, using the estimates $\hat{a}$ of a and $\hat{b}$ of b resulting from (23),
$m_Y = \hat{a} + \bar{P}\,\hat{b}$ (24)
is the estimator of E(Y) in (19). In matrix notation, let E(y|x) = A'p where y and x are n by 1 vectors and n is the number of completed interviews; A is a 2 by n matrix in which the first column is a unit vector and the second column is the vector of the elements