Stat 322/332/362

Sampling and Experimental Design

Fall 2006 Lecture Notes

Authors: Changbao Wu, Jiahua Chen
Department of Statistics and Actuarial Science
University of Waterloo

Key Words: Analysis of variance; Blocking; Factorial designs; Observational
and experimental studies; Optimal allocation; Ratio estimation; Regression
estimation; Probability sampling designs; Randomization; Stratified sample
mean.


Contents

1 Basic Concepts and Notation 5
1.1 Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Parameters of interest . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Sample data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Survey design and experimental design . . . . . . . . . . . . . 8
1.5 Statistical analysis . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Simple Probability Samples 13
2.1 Probability sampling . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 SRSOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 SRSWR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Systematic sampling . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Cluster sampling . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.6 Sample size determination . . . . . . . . . . . . . . . . . . . . 18

3 Stratified Sampling 21
3.1 Stratified random sampling . . . . . . . . . . . . . . . . . . . . 22
3.2 Sample size allocation . . . . . . . . . . . . . . . . . . . . . . 24
3.3 A comparison to SRS . . . . . . . . . . . . . . . . . . . . . . . 25

4 Ratio and Regression Estimation 27
4.1 Ratio estimator . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.1 Ratio estimator . . . . . . . . . . . . . . . . . . . . . . 28
4.1.2 Ratio Estimator . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Regression estimator . . . . . . . . . . . . . . . . . . . . . . . 31

5 Survey Errors and Some Related Issues 33
5.1 Non-sampling errors . . . . . . . . . . . . . . . . . . . . . . . 33
5.2 Non-response . . . . . . . . . . . . . . . . . . . . . . . . . . . 34


5.3 Questionnaire design . . . . . . . . . . . . . . . . . . . . . 35
5.4 Telephone sampling and web surveys . . . . . . . . . . . . . 36

6 Experimental Design 39
6.1 Categories . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.2 Systematic Approach . . . . . . . . . . . . . . . . . . . . . . 41
6.3 Three fundamental principles . . . . . . . . . . . . . . . . . . 41

7 Completely Randomized Design 43
7.1 Comparing 2 treatments . . . . . . . . . . . . . . . . . . . . . 43
7.2 Hypothesis Test . . . . . . . . . . . . . . . . . . . . . . . . . 45
7.3 Randomization test . . . . . . . . . . . . . . . . . . . . . . . 49
7.4 One-Way ANOVA . . . . . . . . . . . . . . . . . . . . . . . . 51

8 Block and Two-Way Factorial 55
8.1 Paired comparison for two treatments . . . . . . . . . . . . . 55
8.2 Randomized blocks design . . . . . . . . . . . . . . . . . . . . 58
8.3 Two-way factorial design . . . . . . . . . . . . . . . . . . . . 63

9 Two-Level Factorial Design 67
9.1 The 2² design . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.2 The 2³ design . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

Chapter 1

Basic Concepts and Notation

This is an introductory course on two important areas in statistics: (1) survey sampling, and (2) design and analysis of experiments. More advanced topics will be covered in Stat-454: Sampling Theory and Practice and Stat-430: Experimental Design.

1.1 Population

Statisticians are preoccupied with the task of modeling random phenomena in the real world. The randomness, as most of us understand it, generally refers to the impossible task of accurately predicting the exact outcome of a quantity of interest in observational or experimental studies. For example, we do not know exactly how many students will take this course before the course change deadline has passed. Yet there are mathematical ways to quantify the randomness. If we obtain data on how many students completed Stat 231 successfully in the past three terms, and some prior information about the population, a binomial model can be very useful for the purpose of prediction. Stat 322/332/362 is another course in statistics that develops statistical tools for modeling and predicting random phenomena.

A random quantity can be conceptually regarded as a sample taken from some population through some indeterministic mechanism. Through the observation of these random quantities (sample data), we hope to draw conclusions about the unknown population. The general term "population" refers to a collection of "individuals", and associated with each "individual" are certain characteristics of interest. Two distinct types of populations are studied in this course.

A survey or finite population is a finite set of labeled individuals. This

set can hence be denoted as

U = {1, 2, ..., N},

where N is called the population size. Some examples of survey populations:

1. Population of all farms in the United States.

2. Population of business enterprises in the Greater Toronto area.

3. Population of university students in Ontario.

4. Population of Canada, i.e. all individuals residing in Canada.

The survey population in applications may change over time and/or location. It is obvious that the population of Canada is in constant change, for reasons such as birth/death/immigration. Some large scale ongoing surveys must take this change into consideration. In this course we treat the survey population as fixed. That is, we make believe that we have only a snapshot of a finite population, so that any changes during the period of our study are not a big concern. In sample surveys, our main object is to learn about some characteristics of the finite population under investigation.

In experimental design, we study an input-output process and are interested in learning how the output variable(s) is affected by the input variable(s). For instance, an agricultural engineer examines the effect of different types of fertilizers on the yield of tomatoes. In the tomato example, our random quantity is the yield. The population in experimental design is often regarded as infinite.

The difference between a finite and an infinite population is not always easy to understand/explain. Let us make it conceptually a bit harder. In the tomato example, suppose we only record whether the yield per plant exceeds 10 kg or not. The random quantity of interest then takes only 2 possible values: Yes/No. Does it imply that the corresponding population is finite? The answer is no. We note the conceptual population is not as simple as one consisting of two individuals with characteristics {Yes, No}. In this case, the experiment is not about selecting one of these two individuals; rather, a complex outcome is mapped to one of these two values. Hence, this population must contain infinitely many individuals.

Assume an engineer wants to investigate whether the temperature of a coin can alter the probability of its landing on a head. The number of possible outcomes of this experiment is two: {Head, Tail}. Is it a finite population? The answer is again negative.

For a survey population. then it is easy to see that N P = Y¯ . We must imagine a population with infinite number of heads and tails each representing an experimental config- uration under which the outcome will be observed. y. When this is the case.1. In summary. .2 Parameters of interest The interested characteristic(s) of a sample from a population is referred as study variable(s) or response variable(s). where M is the number of individ- uals in the population that possess certain attribute of interest. · · · . 2. 2. representing different groups or classes in the population. an “individual” in this case is understood as an “individual experiment configuration” which is practically infinite. − Y¯ )2 .2. the population under experimental design is an infinite set of all possible experiment configurations. i = 1. N −1 In other words. it is quite feasible for us to ignore the problem of estimating population proportions. PN 3. Tail}. 1. N . Let ( 1 if the ith individual possesses “A” yi = 0 otherwise where “A” represents the attribute of interest. we denote the value of the response variable as yi for the ith individual. Thus. Population variance: S 2 = (N − 1)−1 i=1 (yi 4. we may simply use the same techniques developed for population mean. the study variables are indicator variables or cate- gorical variables. PARAMETERS OF INTEREST 7 The experiment is not about how to select one of two individuals from a population consisting of {Head. The following population quantities are primary interest in sample survey applications: PN 1. S2 = P (1 − P ) . it is seen that the population proportion is a special case of population mean defined over an indicator variable. Population total: Y = i=1 yi . Population mean: Y¯ = N −1 N P i=1 yi . When the problem about proportions arises. In many applications. Population proportion: P = M/N .

µ3 and µ4 . or through carefully designed experiments. referred to as observational. Most sample data in survey sampling are observational while in experimental design they are experimental.4 Survey design and experimental design One of the objectives in survey sampling is to estimate the finite population quantities based on sample data. The most −1 P useful summary statistics from sample data are sample mean y ¯ = n i∈s yi 2 −1 P 2 and sample variance s = (n − 1) i∈s (yi − y ¯) . In the tomato-fertilizer example. µ2 . so results can be published in a timely fashion. µ1 . · · · . 1. i. the engineer wishes to examine if there are differences among the average yields of tomatoes. With a fixed budget. The µi ’s are the parameters of interest. we call any function of data not depending on unknown parameters as a statistic. Estimates based on sample surveys are often more accurate than the results based on a census. Knowing the exact unemployment rate for the year 2005 is not very helpful if it takes two years to complete the census. This is a little surprising. under four different types of fertilizers. Data can be collected more quickly. a census. counting or simple measurement. BASIC CONCEPTS AND NOTATION In experimental design. we are often interested in finding out the probability distributions of the study variable(s) and/or the related parameters. As a remark. 2. Why do we need sample survey? There are three main justifications for using sampling: 1. denoted by s: s = {1. 3. since the population is (at least hypothetically) infinite. performing a census is often impracticable. all population quantities such as mean or total can be determined exactly through a complete enumeration of the finite population. A census often . 2.3 Sample data A subset of the population with study variable(s) measured on each selected individuals is called a sample. 1. {yi . Data can be collected through direct reading. Sampling can provide reliable information at far less cost. 
In theory. i ∈ s} is also called sample or sample data. referred to as experimental. in statistics. n} and n is called the sample size. These parameters are in a rather abstract kingdom.e.8 CHAPTER 1.

requires a large administrative organization and involves many persons in the data collection. Biased measurement, wrong recording, and other types of errors can easily be injected into the census. In a sample, high quality data can be obtained through well trained personnel and follow-up studies on nonrespondents.

Survey design is the planning for both data collection and statistical analysis. Some crucial steps involve careful definitions of the following items.

1. Target population: The complete collection of individuals or elements we want to study.

2. Sampled population: The collection of all possible elements that might have been chosen in a sample, i.e. the population from which the sample was taken.

3. Observation unit: The unit we take measurements from. Observation units are usually the individual elements.

4. Sampling unit: The unit we actually sample. Sampling units can be the individual elements, or clusters of elements.

5. Sampling frame: The list of sampling units.

6. Population structure: The survey population may show certain specific structure; stratification and clustering are the two most common situations. Sometimes, due to administrative or geographical restrictions, the population is divided into a number of distinct strata or subpopulations U_j, j = 1, 2, ..., H, such that U_j ∩ U_k = ∅ for j ≠ k and U_1 ∪ U_2 ∪ ··· ∪ U_H = U. The number of elements in stratum U_j is often denoted as N_j, called the stratum size. We have N_1 + N_2 + ··· + N_H = N. Clustering occurs when no reliable list of the elements or individuals in the population is available, but groups of elements, called clusters, are easy to identify. For example, a list of all residents in a city may not exist, but a list of all households will be easy to construct. Here households are the clusters and individual residents are the elements.

7. Sampling design: The method of selecting a sample. There are two general types of sampling designs used in practice: probability sampling, and non-probability sampling,

which will be discussed in more detail in subsequent chapters. Nonprobability sampling includes (a) purposive or judgmental sampling, (b) a sample of convenience, (c) restrictive sampling, (d) quota sampling, and (e) a sample of volunteers. In probability sampling, unbiased estimates of population parameters can be constructed, and standard errors and confidence intervals can also be reported. Under nonprobability sampling, none of these are possible.

The planning and execution of a survey may involve some or all of the following steps:

1. A clear statement of objectives.

2. The population to be sampled.

3. The relevant data to be collected: define the study variable(s) and population quantities.

4. The required precision of estimates.

5. The population frame: define sampling units and construct the list of the sampling units.

6. The method of selecting the sample.

7. Organization of the field work.

8. Plans for handling non-response.

9. Summarizing and analyzing the data: estimation procedures and other statistical techniques to be employed.

10. Writing reports.

In any single sample survey, not all units in the population will be chosen. Despite the best effort in applications, the sampled population is usually not identical to the target population. If the difference is substantial, the conclusions based on the survey have to be interpreted carefully. It is important to notice that conclusions from a sample survey can only be applied to the sampled population.

A few additional remarks about the probability sampling plan. We always try hard to make sure that the chance for any single unit to be selected is positive. Most often, we also wish that each sampling unit has an equal probability of being included in the sample. If this is not the case,

then the sampling plan is often referred to as biased. If the sampling plan is biased and we know how it is biased, then we can try to accommodate this information in our analysis, and the conclusion can still be unbiased in a loose sense. In some applications, introducing a biased sampling plan enables us to make more efficient inference; thus a biased plan might be helpful. However, in most cases the bias is hard to model and hard to accommodate in the analysis, and if the resulting sample data set is analyzed without detailed knowledge of the selection bias, the final conclusion is biased. Such plans are to be avoided.

1.5 Statistical analysis

We will focus on the estimation of the population mean Ȳ or proportion P = M/N based on probability samples. In each case, we will construct (unbiased) estimators, estimate the variance of the estimator, and build confidence intervals using a point estimate and its estimated standard error. The basic elements of experimental design will be discussed in Chapter 6.


Chapter 2

Simple Probability Samples

2.1 Probability sampling

In probability sampling, each element (sampling unit) in the (study) population has a known, non-zero probability of being included in the sample. Since the sampling unit and the element are often the same, we will treat them as the same unless otherwise specified. Such a sampling plan can be specified through a probability measure defined over the set of all possible samples. The probability that element i is selected in the sample is called the inclusion probability, denoted by π_i = P(i ∈ s), i = 1, 2, ..., N. It is required that all π_i > 0. If π_i = 1, the element will be included in the sample with certainty.

Remark: Suppose π_j = 0 for j = 2. It implies that element 2 is virtually not in the population, because it will never be selected.

Example 2.1 Let N = 3 and U = {1, 2, 3}. All possible candidate samples are

s1 = {1}, s2 = {2}, s3 = {3}, s4 = {1, 2}, s5 = {1, 3}, s6 = {2, 3}, s7 = {1, 2, 3}.

A probability measure P(·) is given by

s      s1   s2   s3   s4   s5   s6   s7
P(s)  1/9  1/9  1/9  2/9  2/9  2/9    0

Selection of a sample based on the above probability measure can be done using a random number generator in Splus or R. The code in R is:

> sample(1:7, 1, prob = c(1, 1, 1, 2, 2, 2, 0)/9)

The output will be a number between 1 and 6, each occurring with the corresponding probability.
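The design in Example 2.1 can also be checked empirically. Below is an illustrative Python sketch (not part of the notes, which use Splus/R): it draws repeatedly from the probability measure and estimates the inclusion probabilities π_i, whose exact value here is 5/9 for each of i = 1, 2, 3.

```python
import random
from collections import Counter

# Candidate samples and the probability measure of Example 2.1.
samples = {1: {1}, 2: {2}, 3: {3}, 4: {1, 2}, 5: {1, 3}, 6: {2, 3}, 7: {1, 2, 3}}
weights = [1, 1, 1, 2, 2, 2, 0]          # numerators of P(s), all over 9

random.seed(1)
draws = random.choices(range(1, 8), weights=weights, k=90_000)

# Empirical inclusion probabilities pi_i = P(i in s).
counts = Counter(i for d in draws for i in samples[d])
pi_hat = {i: counts[i] / len(draws) for i in (1, 2, 3)}
```

Note that sample s7 has probability 0, so index 7 is never drawn, mirroring the remark about elements with π_j = 0 never being selected.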

Let ν(s) = the number of elements in s. We say a sampling design has fixed sample size n if ν(s) ≠ n implies P(s) = 0.

Example 2.2 Let U = {1, 2, 3} and s1, ..., s7 be defined as in Example 2.1. The following sampling design has a fixed sample size of n = 2:

s      s1   s2   s3   s4   s5   s6   s7
P(s)    0    0    0  1/3  1/3  1/3    0

Remark: Try to write R code for this sampling plan.

Remark: Do not get confused between elements and samples.

2.2 Simple random sampling without replacement

One of the simplest probability sampling designs (plans) is to select a sample of fixed size n with equal probability:

P(s) = 1/(N choose n) if ν(s) = n, and P(s) = 0 otherwise.

One way to select such a sample is Simple Random Sampling Without Replacement (SRSWOR): select the 1st element from U = {1, 2, ..., N} with probability 1/N; select the 2nd element from the remaining N − 1 elements with probability 1/(N − 1); and continue this until n elements are selected. It can be shown that under SRSWOR, P(s) = 1/(N choose n) if ν(s) = n, and P(s) = 0 otherwise. In practice, the scheme can be carried out using a table of random numbers or computer generated random numbers (such as sample(N, n) in Splus or R). In a strict scientific sense, neither of these methods truly provides a random sample: there have been examples where the outcomes of "random numbers" generated by a computer were predicted. For the purposes of sample surveys, however, generating pseudo random numbers is practical as well as effective.

Let {y_i, i ∈ s} be the sample data. Under probability sampling, unbiased estimates of commonly used population parameters can be constructed, and standard errors and confidence intervals should also be reported.

Result 2.1 Under SRSWOR, the sample mean ȳ is an unbiased estimator of Ȳ, i.e. E(ȳ) = Ȳ. ♦

Result 2.2 Under SRSWOR, the variance of ȳ is given by V(ȳ) = (1 − f) S²/n, where f = n/N is the sampling fraction. ♦

The factor 1 − f is called the finite population correction factor.

Result 2.3 Under SRSWOR, (1) the sample variance s² is an unbiased estimator of S²; (2) v(ȳ) = (1 − f)s²/n is an unbiased estimator of V(ȳ). ♦

Some remarks:

1. Ȳ is a population parameter, a constant but unknown; ȳ is a statistic, which should be viewed as a random variable before the sample is taken, and which is computable once the sample is taken.

2. V(ȳ) = (1 − f)S²/n is a constant but unknown (since S² is unknown!). V(ȳ) can be estimated by replacing S² with s².

3. When the sample size increases, both factors (1 − f) and S²/n decrease. The practical implication is that the precision of the statistical inference improves when we collect more information.

4. Suppose we have two finite populations with about the same population variances, S₁² ≈ S₂², but one has a much larger population size than the other, say N₁ >> N₂. Then the variances of the sample means from these two populations are approximately equal as long as n₁ ≈ n₂. To many, this outcome is quite counter-intuitive. Yet it is a well established result, and it has been verified in applications again and again. You need not be alarmed.

5. In some books, the population variance S² is defined slightly differently. The formulas can hence differ a little.

6. Confidence intervals: an approximate 1 − α CI for Ȳ is given by [ȳ − z_{α/2} SE(ȳ), ȳ + z_{α/2} SE(ȳ)], where SE(ȳ) is the estimated standard error of ȳ. When n is small, z_{α/2} might be replaced by t_{α/2}(n − 1), but the exact coverage probability of this CI is unknown for either choice.

7. The results on the estimation of Ȳ apply to two other parameters: the population total Y and the population proportion P = M/N.
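Results 2.1–2.3 can be verified exactly on a small population by enumerating all (N choose n) equally likely SRSWOR samples. A Python sketch with hypothetical numbers (illustration only):

```python
from itertools import combinations

y = [2.0, 5.0, 8.0, 9.0, 11.0]      # toy population, N = 5
N, n = len(y), 3
f = n / N
Y_bar = sum(y) / N
S2 = sum((yi - Y_bar) ** 2 for yi in y) / (N - 1)

# Every SRSWOR sample of size n has probability 1/(N choose n).
ybars, s2s = [], []
for s in combinations(y, n):
    ybar = sum(s) / n
    ybars.append(ybar)
    s2s.append(sum((yi - ybar) ** 2 for yi in s) / (n - 1))

E_ybar = sum(ybars) / len(ybars)                             # E(ybar)
V_ybar = sum((v - Y_bar) ** 2 for v in ybars) / len(ybars)   # V(ybar)
E_s2 = sum(s2s) / len(s2s)                                   # E(s^2)

assert abs(E_ybar - Y_bar) < 1e-12                # Result 2.1: E(ybar) = Ybar
assert abs(V_ybar - (1 - f) * S2 / n) < 1e-12     # Result 2.2: V(ybar) = (1-f)S^2/n
assert abs(E_s2 - S2) < 1e-12                     # Result 2.3 (1): E(s^2) = S^2
```

Because the design assigns equal probability to each of the 10 candidate samples, averaging over all of them computes the exact expectations, so the identities hold up to floating point error.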

2.3 Simple random sampling with replacement

Select the 1st element from {1, 2, ..., N} with equal probability; select the 2nd element, also from {1, 2, ..., N}, with equal probability; and repeat this n times. This sampling scheme is called simple random sampling with replacement (SRSWR). Under SRSWR, some elements in the population may be selected more than once. Yet each element in the population has the same probability of being sampled, and in this respect SRSWR has some similarities with SRSWOR.

Let y1, y2, ..., yn be the values for the n selected elements and ȳ = (1/n) Σ_{i=1}^n y_i.

Result 2.4 Under SRSWR, E(ȳ) = Ȳ and V(ȳ) = σ²/n, where σ² = (1/N) Σ_{i=1}^N (y_i − Ȳ)². ♦

SRSWOR is more efficient than SRSWR. When N is very large and n is small, SRSWOR and SRSWR will be very close to each other.

2.4 Systematic sampling

Suppose we want to take a sample of size n from the population U of size N, where the population elements are ordered in a sequence. Assume N = n × k. To take a systematic sample, choose a random number r between 1 and k; the elements numbered r, r + k, r + 2k, ..., r + (n − 1)k then form the sample. Here r is called the random starting point. Under systematic sampling where N = n × k, there are only k candidate samples s1, s2, ..., sk. Let ȳ(s_r) = (1/n) Σ_{i∈s_r} y_i.

Result 2.5 Under systematic sampling, E(ȳ) = Ȳ and V(ȳ) = (1/k) Σ_{r=1}^k [ȳ(s_r) − Ȳ]². ♦

Systematic sampling is often used in practice for two reasons: (1) it is sometimes easier to do systematic sampling than SRS, particularly so if a complete list of sampling units is not available; (2) systematic sampling is more efficient than SRS when there is a linear trend in the ordered population. Systematic sampling is also approximately the same as SRSWOR when the population is roughly in a random order.

Example 2.3 Suppose the population size is N = 12 and {y1, y2, ..., y12} = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24}. Here Ȳ = 13 and S² = 52. For a sample of size n = 4:

(i) Under SRSWOR, V(ȳ) = (1 − 1/3)S²/4 = 8.67.

(ii) Under systematic sampling with k = 3, there are three candidate samples, s1 = {2, 8, 14, 20}, s2 = {4, 10, 16, 22} and s3 = {6, 12, 18, 24}. The three sample means are ȳ(s1) = 11, ȳ(s2) = 13 and ȳ(s3) = 15, so that V(ȳ) = [(11 − 13)² + (13 − 13)² + (15 − 13)²]/3 = 2.67. ♦
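The numbers in Example 2.3 can be reproduced mechanically. A Python sketch (for illustration only; the course software is Splus/R):

```python
y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24]
N, n, k = 12, 4, 3

Y_bar = sum(y) / N                                    # 13
S2 = sum((yi - Y_bar) ** 2 for yi in y) / (N - 1)     # 52

# (i) SRSWOR with n = 4: V(ybar) = (1 - n/N) S^2 / n
V_srs = (1 - n / N) * S2 / n                          # 8.67

# (ii) Systematic sampling: the k = 3 candidate samples and their means
samples = [y[r::k] for r in range(k)]                 # {2,8,14,20}, {4,10,16,22}, {6,12,18,24}
means = [sum(s) / n for s in samples]                 # 11, 13, 15
V_sys = sum((m - Y_bar) ** 2 for m in means) / k      # 8/3 = 2.67
```

With the population in increasing order (a linear trend), the systematic plan's variance 2.67 is well below the SRSWOR variance 8.67, illustrating reason (2) above.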

The systematic sampling plan can thus be more efficient than SRSWOR when there is a linear trend in the ordered population.

There are two major problems associated with systematic sampling. The first is variance estimation: an unbiased variance estimator is not available. If the population can be viewed as being in a random order, the variance formula for SRSWOR can be borrowed; borrowing the variance formula from SRSWOR results in conservative statistical analysis. The other problem is that if the population is in a periodic or cyclical order, results from a systematic sample can be very unreliable.

2.5 Cluster sampling

In many practical situations the population elements are grouped into a number of clusters. A list of clusters can be constructed as the sampling frame, but a complete list of elements is often unavailable, or too expensive to construct. In this case it is necessary to use cluster sampling, where a random sample of clusters is taken and some or all elements in the selected clusters are observed. Cluster sampling is also preferable in terms of cost, because it is much cheaper, easier and quicker to collect data from adjoining elements than from elements chosen at random. On the other hand, cluster sampling is less informative and less efficient per element in the sample, due to similarities of elements within the same cluster. The loss of efficiency, however, can often be compensated for by increasing the overall sample size. Thus, in terms of unit cost, the cluster sampling plan is efficient.

We consider a simple situation where the cluster sizes M_i are all the same, i.e. M_i ≡ M. Suppose the population consists of N clusters, so that the population size (total number of elements) is NM. Let y_ij be the y value for the jth element in the ith cluster. The population mean (per element) is

Ȳ = 1/(NM) Σ_{i=1}^N Σ_{j=1}^M y_ij,

and the population variance (per element) is

S² = 1/(NM − 1) Σ_{i=1}^N Σ_{j=1}^M (y_ij − Ȳ)².

The mean for the ith cluster is Ȳ_i = (1/M) Σ_{j=1}^M y_ij, and the variance for the ith cluster is S_i² = 1/(M − 1) Σ_{j=1}^M (y_ij − Ȳ_i)².

One-stage cluster sampling: take n clusters (denoted by s) using simple random sampling without replacement, and observe all elements in the selected clusters. The sample mean (per element) is given by

ȳ = 1/(nM) Σ_{i∈s} Σ_{j=1}^M y_ij = (1/n) Σ_{i∈s} Ȳ_i.

Result 2.6 Under one-stage cluster sampling with clusters sampled using SRSWOR,

(i) E(ȳ) = Ȳ;

(ii) V(ȳ) = (1 − n/N) S_M²/n, where S_M² = 1/(N − 1) Σ_{i=1}^N (Ȳ_i − Ȳ)²;

(iii) v(ȳ) = (1 − n/N) (1/n) · 1/(n − 1) Σ_{i∈s} (Ȳ_i − ȳ)² is an unbiased estimator of V(ȳ). ♦

When the cluster sizes are not all equal, complications arise. When the M_i's are all known, simple solutions exist; otherwise a ratio type estimator will have to be used. It is also interesting to note that systematic sampling is a special case of one-stage cluster sampling.

2.6 Sample size determination

In planning a survey, one needs to know how big a sample to draw. The answer to this question depends on how accurate one wants the estimate to be. We assume the sampling scheme is SRSWOR.

1. Precision specified by absolute tolerable error

The surveyor can specify the margin of error e such that

P(|ȳ − Ȳ| > e) ≤ α

for a chosen value of α, usually taken as 0.05. Approximately, we have

e = z_{α/2} √(1 − n/N) · S/√n.

Solving for n, we have

n = z²_{α/2} S² / (e² + z²_{α/2} S²/N) = n₀ / (1 + n₀/N),

where n₀ = z²_{α/2} S²/e².

2. Precision specified by relative tolerable error

The precision is often specified by a relative tolerable error e:

P(|ȳ − Ȳ|/|Ȳ| > e) ≤ α.

The required n is given by

n = z²_{α/2} S² / (e² Ȳ² + z²_{α/2} S²/N) = n₀* / (1 + n₀*/N),

where n₀* = z²_{α/2} (CV)²/e², and CV = S/Ȳ is the coefficient of variation.

3. Sample size for estimating proportions

The absolute tolerable error is often used: P(|p − P| > e) ≤ α, and common choices of e and α are 3% and 0.05. Also note that S² ≈ P(1 − P), and 0 ≤ P ≤ 1 implies S² ≤ 1/4 approximately. The largest value of the required sample size n occurs at P = 1/2.

Sample size determination requires knowledge of S² or CV. There are two main ways to obtain information on these:

(a) Historical data. Quite often similar studies were conducted previously, and information from these studies can be used to get approximate values for S² or CV.

(b) A pilot survey. Use a small portion of the available resources to conduct a small scale pilot survey before the formal one, to obtain information about S² or CV.

Other methods are often ad hoc. For example, if a population has a range of 100, i.e. the largest value minus the smallest value is no more than 100, then a conventional estimate of S is 100/4. This method is applicable when, for instance, age is the study variable.
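The absolute-tolerable-error calculation can be sketched as follows. The population size N below is hypothetical, and z_{α/2} ≈ 1.96 corresponds to α = 0.05:

```python
import math

def sample_size(S2, e, N, z=1.96):
    """n0 = z^2 S^2 / e^2, then n = n0 / (1 + n0 / N), rounded up."""
    n0 = z ** 2 * S2 / e ** 2
    return math.ceil(n0 / (1 + n0 / N))

# Estimating a proportion with e = 3% and alpha = 0.05, using the
# conservative bound S^2 <= 1/4 (attained at P = 1/2):
n = sample_size(S2=0.25, e=0.03, N=10_000)    # n0 ~ 1067.1, so n = 965
```

Note how the finite population correction reduces the required n from about 1068 (for essentially infinite N) to 965 when N = 10,000.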


Chapter 3

Stratified Sampling

We mentioned in Section 1.4 that sometimes the population is naturally divided into a number of distinct non-overlapping subpopulations called strata, U_h, h = 1, 2, ..., H, such that U_h ∩ U_{h'} = ∅ for h ≠ h' and U_1 ∪ U_2 ∪ ··· ∪ U_H = U. The population is then said to have a stratified structure. Stratification may also be imposed by the surveyor for the purpose of better estimation. Let N_h be the hth stratum size, h = 1, 2, ..., H. We must have N_1 + N_2 + ··· + N_H = N. Let y_hj be the y value of the jth element in stratum h, j = 1, 2, ..., N_h, h = 1, 2, ..., H. Some related population quantities are:

1. The population mean Ȳ = (1/N) Σ_{h=1}^H Σ_{j=1}^{N_h} y_hj.

2. The population variance S² = 1/(N − 1) Σ_{h=1}^H Σ_{j=1}^{N_h} (y_hj − Ȳ)².

3. The hth stratum mean Ȳ_h = (1/N_h) Σ_{j=1}^{N_h} y_hj.

4. The hth stratum variance S_h² = 1/(N_h − 1) Σ_{j=1}^{N_h} (y_hj − Ȳ_h)².

It can be shown that

Ȳ = Σ_{h=1}^H W_h Ȳ_h,

where W_h = N_h/N is called the stratum weight, and

(N − 1)S² = Σ_{h=1}^H (N_h − 1)S_h² + Σ_{h=1}^H N_h (Ȳ_h − Ȳ)².

The second equality can alternatively be stated as

Total variation = Within-strata variation + Between-strata variation.
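This decomposition is an exact identity and can be checked numerically on a toy stratified population (hypothetical values; Python used purely for illustration):

```python
# Three strata with hypothetical y-values.
strata = [[2.0, 4.0, 6.0], [10.0, 12.0], [20.0, 22.0, 24.0, 26.0]]

ally = [y for stratum in strata for y in stratum]
N = len(ally)
Y_bar = sum(ally) / N

total = sum((y - Y_bar) ** 2 for y in ally)           # (N - 1) S^2
# sum_h (N_h - 1) S_h^2, i.e. squared deviations about each stratum mean
within = sum(sum((y - sum(h) / len(h)) ** 2 for y in h) for h in strata)
# sum_h N_h (Ybar_h - Ybar)^2
between = sum(len(h) * (sum(h) / len(h) - Y_bar) ** 2 for h in strata)

assert abs(total - (within + between)) < 1e-9         # exact identity
```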

This relationship is needed when we make comparisons between SRS and stratified sampling. For students who are still fresh on some facts from probability theory: you may relate the above decomposition to the following formula. Let X and Y be two random variables. We have

Var(Y) = Var{E(Y|X)} + E{Var(Y|X)}.

3.1 Stratified random sampling

To take a sample s with fixed sample size n from a stratified population, a decision has to be made first on how many elements are to be selected from each stratum. Let n_h > 0 be the number of elements drawn from stratum h, h = 1, 2, ..., H. It follows that n = n_1 + n_2 + ··· + n_H. Suppose a sample s_h of size n_h is taken from stratum h, and samples from different strata are independent of each other. The overall sample is therefore given by s = s_1 ∪ s_2 ∪ ··· ∪ s_H. If s_h is taken from the hth stratum using simple random sampling without replacement, the sampling scheme is termed Stratified Random Sampling. The main motivation for applying stratified simple random sampling is administrative convenience. It turns out, though, that estimation based on stratified simple random sampling is more efficient for the majority of populations met in applications.

Let y_hj, j ∈ s_h, be the observed values of the y variable. The sample mean and sample variance for stratum h are given by

ȳ_h = (1/n_h) Σ_{j∈s_h} y_hj  and  s_h² = 1/(n_h − 1) Σ_{j∈s_h} (y_hj − ȳ_h)².

Result 3.1 Under stratified random sampling,

(i) ȳ_st = Σ_{h=1}^H W_h ȳ_h is an unbiased estimator of Ȳ;

(ii) V(ȳ_st) = Σ_{h=1}^H W_h² (1 − f_h) S_h²/n_h, where f_h = n_h/N_h is the sampling fraction in the hth stratum;

(iii) v(ȳ_st) = Σ_{h=1}^H W_h² (1 − f_h) s_h²/n_h is an unbiased estimator of V(ȳ_st).

♦ The proof follows directly from the results for SRSWOR and the fact that s_1, s_2, ..., s_H are independent of each other. The results can also be easily modified to handle the estimation of the population total Y and the population proportion P.

Questions associated with stratified sampling include (i) Why use stratified sampling? (ii) How to stratify? and (iii) How to allocate the sample sizes to each stratum? We will address questions (ii) and (iii) in Sections 3.2 and 3.3. There are four main reasons to justify the use of stratified sampling:

(1) Administrative convenience. A survey at the national level can be greatly facilitated if officials associated with each province survey a portion of the sample from their own province.

(2) In addition to the estimates for the entire population, estimates for certain sub-populations are also required. For example, one might require estimates of the unemployment rate not only at the national level but for each province as well. Here provinces are the natural choice of strata.

(3) Protection from possible disproportional samples under probability sampling. For example, a random sample of 100 students from the University of Waterloo may contain only a few female students. In theory there shouldn't be any concern about this unusual case, but the results from the survey will be more acceptable to the public if, say, the sample consists of 50 male students and 50 female students.

(4) Increased accuracy of estimates. Stratified sampling can often provide more accurate estimates than SRS. This also relates to the other questions: how to stratify? and how to allocate the sample sizes? We will return to these questions in the next sections.

Stratified sampling is different from cluster sampling. In both cases the population is divided into subgroups: strata in the former and clusters in the latter. In cluster sampling only a portion of the clusters are sampled, while in stratified sampling every stratum is sampled. Usually, only a subset of the elements in a stratum is observed, while all elements in a sampled cluster are observed.

3.2 Sample size allocation

We consider two commonly used schemes for allocating the sample sizes to the strata: proportional allocation, and optimal allocation for a given total sample size n.

1. Proportional allocation

With no extra information except the stratum sizes N_h, we should allocate the stratum sample sizes proportional to the stratum sizes, i.e. n_h ∝ N_h. Under the restriction that n_1 + n_2 + · · · + n_H = n, the resulting allocation is given by

    n_h = n N_h/N = n W_h ,   h = 1, 2, · · · , H.

Result 3.2 Under stratified random sampling with proportional allocation,

    V_prop(ȳ_st) = (1 − n/N) (1/n) Σ_{h=1}^H W_h S_h² .  ♦

2. Optimal allocation (Neyman allocation)

When the total sample size n is fixed, an optimal allocation (n_1, n_2, · · · , n_H) can be found by minimizing V(ȳ_st) subject to the constraint n_1 + n_2 + · · · + n_H = n.

Result 3.3 In stratified random sampling, V(ȳ_st) is minimized for a fixed total sample size n if

    n_h = n W_h S_h / Σ_{h=1}^H W_h S_h = n N_h S_h / Σ_{h=1}^H N_h S_h ,   h = 1, 2, · · · , H,

and the minimum variance is given by

    V_min(ȳ_st) = (1/n) (Σ_{h=1}^H W_h S_h)² − (1/N) Σ_{h=1}^H W_h S_h² .  ♦

To carry out an optimal allocation, one requires knowledge of the S_h, h = 1, 2, · · · , H. Rough estimates of the S_h's will be good enough for a sample size allocation; in practice, one can gather this information from historical data or through a small scale pilot survey.
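The two allocation rules are easy to compare side by side. The stratum sizes N_h and standard deviations S_h below are hypothetical; in practice the resulting n_h would be rounded to integers.

```python
# Proportional vs. Neyman allocation of a fixed total sample size n.
# N_h and S_h are hypothetical; round n_h to integers in practice.
n = 100
N_h = [500, 300, 200]          # stratum sizes
S_h = [4.0, 10.0, 20.0]        # stratum standard deviations (rough guesses)

N = sum(N_h)
prop = [n * Nh / N for Nh in N_h]                          # n_h = n W_h

total = sum(Nh * Sh for Nh, Sh in zip(N_h, S_h))
neyman = [n * Nh * Sh / total for Nh, Sh in zip(N_h, S_h)] # n_h prop. to N_h S_h

print(prop)      # proportional: driven by stratum size only
print(neyman)    # Neyman: more variable strata receive more sample
```

The smallest stratum here receives the largest Neyman allocation because its standard deviation dominates the product N_h S_h.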

Another factor that may affect our decision on sample allocation is the unit cost per sampling unit: the cost of taking a sample from some strata can be higher than from other strata. The optimal allocation under this situation can be similarly derived, but it is not discussed in this course. To differentiate these two optimal schemes, the optimal allocation discussed above is also called the Neyman allocation.

3.3 A comparison to SRS

It is of interest to make a comparison between stratified random sampling and SRSWOR, both with a total sample size of n.

Result 3.4 Let ȳ_st be the stratified sample mean under proportional allocation and ȳ be the sample mean from SRSWOR. Treating (N_h − 1)/(N − 1) ≈ N_h/N, we have

    V(ȳ) − V_prop(ȳ_st) = (1 − n/N) (1/n) Σ_{h=1}^H W_h (Ȳ_h − Ȳ)² ≥ 0 .  ♦

It is now clear from Result 3.4 that when proportional allocation is used, stratified random sampling is more efficient than simple random sampling. The gain in efficiency depends on the between-strata variation. This also provides guidance on how to stratify: the optimal stratification under proportional allocation is the one which produces the largest possible differences between the stratum means. Such a stratification also maximizes the homogeneity of the y-values within each stratum. In practice, certain prior information or common knowledge can be used for stratification. For example, in surveying human populations, people with the same sex, age and income level are more likely to be similar to each other, so stratification by sex, age and/or sex-age group combinations will be a reasonable choice. In general, stratified random sampling is (almost) always more efficient than SRSWOR.
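Result 3.4 can be checked numerically on a small artificial population by computing both variances directly from the population values. The identity holds only up to the approximation (N_h − 1)/(N − 1) ≈ N_h/N, so the two sides agree approximately, not exactly; the population below is invented.

```python
# Numerical check of Result 3.4 on a tiny invented population:
# V(ybar_SRS) - V_prop(ybar_st) ~ (1 - n/N)(1/n) * sum_h W_h (Ybar_h - Ybar)^2.
strata = [[2.0, 3.0, 4.0, 3.0],
          [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]]
n = 4

pop = [y for s in strata for y in s]
N = len(pop)
Ybar = sum(pop) / N
S2 = sum((y - Ybar) ** 2 for y in pop) / (N - 1)
V_srs = (1 - n / N) * S2 / n                       # SRSWOR variance

W = [len(s) / N for s in strata]
Ybar_h = [sum(s) / len(s) for s in strata]
S2_h = [sum((y - yb) ** 2 for y in s) / (len(s) - 1)
        for s, yb in zip(strata, Ybar_h)]
V_prop = (1 - n / N) / n * sum(Wh * s2 for Wh, s2 in zip(W, S2_h))

between = (1 - n / N) / n * sum(Wh * (yb - Ybar) ** 2
                                for Wh, yb in zip(W, Ybar_h))
print(round(V_srs - V_prop, 4), round(between, 4))  # close, and both positive
```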


Chapter 4

Ratio and Regression
Estimation

Often in survey sampling, information on one (or more) covariate x (called
auxiliary variable) is available prior to sampling. Sometimes this auxiliary
information is complete, i.e. the value xi is known for every element i in the
population; sometimes only the population mean X̄ = N⁻¹ Σ_{i=1}^N x_i or the
total X = Σ_{i=1}^N x_i is known. When the auxiliary variable x is correlated with the
study variable y, this known auxiliary information can be useful for the new
survey study.
Example 4.1 In family expenditure surveys, the values on x(1) : the number
of people in the family and/or x(2) : the family income of previous year are
known for every element in the population. The study variable(s) is on current
year family expenditures such as expenses on clothing, food, furniture,
etc.
Example 4.2 In agriculture surveys, a complete list of farms with the area
(acreage) of each farm is available.
Example 4.3 Data from an earlier census provide various population totals that
can be used as auxiliary information for the planned surveys.
Auxiliary information can be used at the design stage. For instance,
a stratified sampling scheme might be chosen where stratification is done
by values of certain covariates such as sex, age and income levels. The
pps sampling design (inclusion probability proportional to a size measure)
is another sophisticated example.
In this chapter, we use auxiliary information explicitly at the estimation



stage by incorporating the known X̄ or X into the estimators through ratio
and regression estimation. The resulting estimators will be more efficient
than those discussed in previous chapters.

4.1 Ratio estimator
4.1.1 Ratio estimator under SRSWOR
Suppose y_i is approximately proportional to x_i, i.e. y_i ≈ β x_i for i = 1, 2, . . . , N.
It follows that Ȳ ≈ β X̄. Let R = Ȳ/X̄ = Y/X be the ratio of the two population
means or totals. Let ȳ and x̄ be the sample means under SRSWOR. It is
natural to use R̂ = ȳ/x̄ to estimate R. The ratio estimator for Ȳ is defined
as

    Ŷ̄_R = R̂ X̄ = (ȳ/x̄) X̄ .

One can expect that X̄/x̄ will be close to 1, so Ŷ̄_R will be close to ȳ. Why
is the ratio estimator often used? The following results will provide an answer.
Note that R = Ȳ/X̄ is the (unknown) population ratio, and R̂ = ȳ/x̄ is a
sample-based estimate of R.
Result 4.1 Under simple random sampling without replacement,

(i) Ŷ̄_R is an approximately unbiased estimator of Ȳ.

(ii) The variance of Ŷ̄_R can be approximated by

    V(Ŷ̄_R) ≈ (1 − n/N) (1/n) (1/(N − 1)) Σ_{i=1}^N (y_i − R x_i)² .

(iii) An approximately unbiased variance estimator is given by

    v(Ŷ̄_R) = (1 − n/N) (1/n) (1/(n − 1)) Σ_{i∈s} (y_i − R̂ x_i)² .
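A small numerical sketch of Result 4.1 follows; the population size, the known auxiliary mean X̄, and the sample values are all invented for illustration.

```python
# Ratio estimator of Ybar and its variance estimate (Result 4.1).
# N, the known mean X_bar, and the sample values are invented.
N, n = 1000, 5
X_bar = 19.5                              # known population mean of x
x = [18.0, 22.0, 19.0, 21.0, 20.0]
y = [36.5, 44.0, 38.0, 42.5, 40.0]        # roughly y ~ 2x

x_bar = sum(x) / n
y_bar = sum(y) / n
R_hat = y_bar / x_bar                     # sample ratio
Y_bar_ratio = R_hat * X_bar               # ratio estimate of Ybar

resid = [yi - R_hat * xi for xi, yi in zip(x, y)]
v_ratio = (1 - n / N) / n * sum(e ** 2 for e in resid) / (n - 1)

print(round(Y_bar_ratio, 4), round(v_ratio, 8))
```

Because y is nearly proportional to x, the residuals y_i − R̂ x_i are small and the estimated variance is much smaller than that of the plain sample mean would be.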


To see when the ratio estimator is better than the simple sample mean
ȳ, let us compare the two variances. Note that

    V(ȳ) = (1 − n/N) (1/n) S_Y² ,
and

    V(Ŷ̄_R) ≈ (1 − n/N) (1/n) (1/(N − 1)) Σ_{i=1}^N [(y_i − Ȳ) − R(x_i − X̄)]²
            = (1 − n/N) (1/n) [S_Y² + R² S_X² − 2R S_XY] ,

where S_Y² and S_X² are the population variances for the y and x variables, and S_XY = (N − 1)⁻¹ Σ_{i=1}^N (y_i − Ȳ)(x_i − X̄). The ratio estimator will have a smaller variance if and only if

    R² S_X² − 2R S_XY < 0 .

This condition can also be re-expressed as

    ρ > (1/2) · CV(X)/CV(Y) ,

where ρ = S_XY/[S_X S_Y], CV(X) = S_X/X̄ and CV(Y) = S_Y/Ȳ. In many practical situations CV(X) ≈ CV(Y), so we only require ρ > 0.5. This is usually the case. The conclusion is: if there is a strong correlation between y and x, the ratio estimator will perform better than the simple sample mean. A scatter plot of the data can visualize the relationship between y and x; if a straight line going through the origin is appropriate, the ratio estimator can provide an improved estimate.

There are other situations where we have to use a ratio type estimator. Under one-stage cluster sampling with clusters of unequal sizes M_i, where M_i is not known unless the ith cluster is selected in the sample, the population mean (per element) is indeed a ratio:

    Ȳ = Σ_{i=1}^N Σ_{j=1}^{M_i} y_ij / Σ_{i=1}^N M_i = [N⁻¹ Σ_{i=1}^N Y_i] / [N⁻¹ Σ_{i=1}^N M_i] .

A natural estimate for Ȳ would be Ŷ̄ = [n⁻¹ Σ_{i∈s} Y_i] / [n⁻¹ Σ_{i∈s} M_i], which is of the ratio form.

4.1.2 Ratio estimator under stratified random sampling

When the population has been stratified, a ratio estimator can be used in two different ways: (a) estimate R = Ȳ/X̄ by R̂ = ȳ_st/x̄_st, and Ȳ = R X̄ by R̂ X̄;

or (b) write Ȳ as Ȳ = Σ_{h=1}^H W_h Ȳ_h and estimate each Ȳ_h, the stratum mean, by a ratio estimator [ȳ_h/x̄_h] X̄_h, where the X̄_h's are the known strata means.

The combined ratio estimator of Ȳ is defined as

    Ŷ̄_Rc = (ȳ_st/x̄_st) X̄ ,

where ȳ_st = Σ_{h=1}^H W_h ȳ_h and x̄_st = Σ_{h=1}^H W_h x̄_h. The separate ratio estimator of Ȳ is defined as

    Ŷ̄_Rs = Σ_{h=1}^H W_h (ȳ_h/x̄_h) X̄_h .

In (a), only X̄ needs to be known; under (b), the stratum means X̄_h are required.

Result 4.2 Under stratified random sampling, the combined ratio estimator Ŷ̄_Rc is approximately unbiased for Ȳ, and its variance is given by

    V(Ŷ̄_Rc) ≈ Σ_{h=1}^H W_h² (1 − n_h/N_h) (1/n_h) (1/(N_h − 1)) Σ_{j=1}^{N_h} [(y_hj − Ȳ_h) − R(x_hj − X̄_h)]² ,

which can be estimated by

    v(Ŷ̄_Rc) = Σ_{h=1}^H W_h² (1 − n_h/N_h) (1/n_h) (1/(n_h − 1)) Σ_{j∈s_h} [(y_hj − ȳ_h) − R̂(x_hj − x̄_h)]² ,

where R = Ȳ/X̄ and R̂ = ȳ_st/x̄_st.  ♦

Result 4.3 Under stratified random sampling, the separate ratio estimator Ŷ̄_Rs is approximately unbiased for Ȳ, and its variance is given by

    V(Ŷ̄_Rs) ≈ Σ_{h=1}^H W_h² (1 − n_h/N_h) (1/n_h) (1/(N_h − 1)) Σ_{j=1}^{N_h} (y_hj − R_h x_hj)² ,

which can be estimated by

    v(Ŷ̄_Rs) = Σ_{h=1}^H W_h² (1 − n_h/N_h) (1/n_h) (1/(n_h − 1)) Σ_{j∈s_h} (y_hj − R̂_h x_hj)² ,

where R_h = Ȳ_h/X̄_h and R̂_h = ȳ_h/x̄_h.  ♦

One of the questions that needs to be addressed is how to make a choice between Ŷ̄_Rc and Ŷ̄_Rs.

First, it depends on what kind of auxiliary information is available. The separate ratio estimator requires the strata means X̄_h to be known; if only X̄ is known, the combined ratio estimator will have to be used.

Second, in terms of efficiency, the variance of Ŷ̄_Rc depends on the "residuals" e_hj = (y_hj − Ȳ_h) − R(x_hj − X̄_h), which is equivalent to fitting a single straight line across all the strata with a common slope, while for the separate ratio estimator this slope can be different for different strata. So in many situations Ŷ̄_Rs will perform better than Ŷ̄_Rc.

Third, the variance formula for the separate ratio estimator depends on the approximation to ȳ_h/x̄_h within each stratum. If the sample sizes within each stratum, n_h, are too small, the bias from using Ŷ̄_Rs can be large. The bias from using Ŷ̄_Rc, however, will be smaller, since the approximation is made to ȳ_st/x̄_st, and the pooled sample size n will usually be large.

4.2 Regression estimator

The study variable y is often linearly related to the auxiliary variable x, i.e. y_i ≈ β_0 + β_1 x_i, i = 1, 2, · · · , N. So roughly we have Ȳ ≈ β_0 + β_1 X̄ and ȳ ≈ β_0 + β_1 x̄. This leads to the regression type estimator of Ȳ: Ŷ̄ = ȳ + β_1 (X̄ − x̄). The β_1 is usually unknown and is estimated by the least squares estimator β̂_1 from the sample data. More formally, under SRSWOR, the regression estimator of Ȳ is defined as

    Ŷ̄_REG = ȳ + B̂ (X̄ − x̄) ,

where B̂ = Σ_{i∈s} (y_i − ȳ)(x_i − x̄) / Σ_{i∈s} (x_i − x̄)².

Result 4.4 Under SRSWOR, the regression estimator Ŷ̄_REG is approximately unbiased for Ȳ. Its approximate variance is given by

    V(Ŷ̄_REG) ≈ (1 − n/N) (1/n) (1/(N − 1)) Σ_{i=1}^N e_i² ,

where e_i = y_i − B_0 − B x_i, B = Σ_{i=1}^N (y_i − Ȳ)(x_i − X̄) / Σ_{i=1}^N (x_i − X̄)², and B_0 = Ȳ − B X̄. This variance can be estimated by

    v(Ŷ̄_REG) = (1 − n/N) (1/n) (1/(n − 1)) Σ_{i∈s} ê_i² ,

where ê_i = y_i − B̂_0 − B̂ x_i, B̂ = Σ_{i∈s} (y_i − ȳ)(x_i − x̄) / Σ_{i∈s} (x_i − x̄)², and B̂_0 = ȳ − B̂ x̄.  ♦

It can be shown that when n is large,

    V(Ŷ̄_REG) ≈ (1 − n/N) (1/n) S_Y² (1 − ρ²) ,

where ρ = S_XY/[S_X S_Y] is the population correlation coefficient between y and x. Since |ρ| ≤ 1, we have V(Ŷ̄_REG) ≤ V(ȳ) under SRSWOR: the regression estimator is always more efficient than the simple sample mean ȳ. It can also be shown that V(Ŷ̄_REG) ≤ V(Ŷ̄_R), so the regression estimator is preferred in most situations. Ratio estimators are still being used by many survey practitioners due to their simplicity: both estimators require only X̄ to be known to compute the estimates under SRSWOR. If a scatter plot of the data shows that a straight line going through the origin fits the data well, then the regression estimator and the ratio estimator will perform similarly. Under stratified sampling, a combined regression estimator and a separate regression estimator can be developed similarly.
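A sketch of the regression estimator under SRSWOR with invented data; X̄ plays the role of the known population mean of x.

```python
# Regression estimator of Ybar under SRSWOR (Result 4.4), invented data;
# X_bar is the assumed known population mean of x.
N, n = 500, 6
X_bar = 10.0
x = [8.0, 12.0, 9.0, 11.0, 10.0, 13.0]
y = [17.0, 25.5, 19.0, 23.0, 21.0, 27.5]

x_bar = sum(x) / n
y_bar = sum(y) / n
Sxx = sum((xi - x_bar) ** 2 for xi in x)
Sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
B_hat = Sxy / Sxx                         # least-squares slope
B0_hat = y_bar - B_hat * x_bar            # least-squares intercept

Y_bar_reg = y_bar + B_hat * (X_bar - x_bar)

e_hat = [yi - B0_hat - B_hat * xi for xi, yi in zip(x, y)]
v_reg = (1 - n / N) / n * sum(e ** 2 for e in e_hat) / (n - 1)
print(round(Y_bar_reg, 4), round(v_reg, 6))
```

Here x̄ exceeds X̄, so the estimator adjusts ȳ downward along the fitted line.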

Chapter 5

Survey Errors and Some Related Issues

A survey, especially a large scale survey, consists of a number of stages, from the initial planning to the ultimate publication of the results. Each stage may require considerable time and effort, and each stage comes with different sources of errors that affect the final reported estimates.

Survey errors can be broadly classified into sampling error and non-sampling error. The sampling error is the amount by which the estimate computed from the data would differ from the true value of the quantity for the sampled population. Under probability sampling, this error can be reduced and controlled through a carefully chosen design and through a reasonably large sample size. All other errors are called non-sampling errors. In this chapter we briefly overview the possible sources of non-sampling errors, with some discussion of how to identify and reduce this type of error in questionnaire design, telephone surveys and web surveys.

5.1 Non-sampling errors

Major sources of non-sampling errors may include some or all of the following:

1. Coverage error: the amount by which the quantity for the frame population differs from the quantity for the target population.

2. Non-response error: the amount by which the quantity for the sampled population differs from the quantity for the frame population.

3. Measurement error: in theory, we assume there is a true value y_i attached to the ith element. If the ith element is selected in the sample, the observed value of y is denoted by y_i*. The value y_i* may differ from y_i: the equipment for the measurement may not be accurate, the questionnaire may not be well designed, or the selected individuals may intentionally provide incorrect information. The measurement error is the amount by which the estimate computed from the y_i* differs from the one computed from the y_i.

4. Errors incurred from data management: steps such as data processing, coding, data entry and editing can all bring in errors. Well-trained staff members can reduce the error from data management.

Non-sampling errors are hard to control and are often left un-estimated or unacknowledged in reports of surveys.

5.2 Non-response

In large scale surveys, it is often the case that for each sampled element several or even many attributes are measured. Non-response, sometimes called missing data, occurs when the sampled element cannot be reached or refuses to respond. There are two types of non-response: unit non-response, where no information is available for the whole unit, and item non-response, where information on certain variables is not available.

Effect of non-response

Consider a single study variable y. The finite population can be conceptually divided into two strata: the respondent group and the non-respondent group, with stratum weights W_1 and W_2. Let Ȳ_1 and Ȳ_2 be the means of the two groups. It follows that Ȳ = W_1 Ȳ_1 + W_2 Ȳ_2, which is our original target parameter. Suppose we have data from the respondent group obtained by SRSWOR and ȳ_1 is the sample mean, but we have no data from the non-respondent group. If we use ȳ_1 to estimate Ȳ, the bias would be

    E(ȳ_1) − Ȳ = Ȳ_1 − Ȳ = W_2 (Ȳ_1 − Ȳ_2) .
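The bias formula E(ȳ_1) − Ȳ = W_2(Ȳ_1 − Ȳ_2) is easy to illustrate numerically; the non-response weight and the two group means below are invented.

```python
# Bias of the respondent-only mean: E(ybar_1) - Ybar = W2 * (Ybar1 - Ybar2).
# The non-response weight and the two group means are invented.
W2 = 0.3            # proportion of non-respondents
Ybar1 = 52.0        # respondent-group mean
Ybar2 = 40.0        # non-respondent-group mean

Ybar = (1 - W2) * Ybar1 + W2 * Ybar2    # true population mean
bias = Ybar1 - Ybar                     # bias from using ybar_1 for Ybar
print(round(bias, 6))
```

With 30% non-response and a 12-unit gap between the groups, the respondent mean overstates the population mean by 3.6 units, no matter how large the sample of respondents is.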

The bias depends on the proportion of non-respondents and the difference between the two means. If Ȳ_1 and Ȳ_2 are very close, and/or W_2 is very small, the bias will be negligible. On the other hand, if W_2 is not small, which is often the case in practical situations, and Ȳ_1 and Ȳ_2 differ substantially, the bias can be non-ignorable.

Dealing with non-response

Non-response rates can be reduced through careful planning of the survey and extra effort in the process of data collection. In the planning stage, the choice of data collection method (personal interview, telephone interview, mail inquiry, etc.), the selection, training and supervision of interviewers, the attitude of management toward non-response, and the design of the questionnaire are all important for the reduction of non-response. A carefully designed, well-tested questionnaire can reduce both measurement errors and the non-response rate. In the process of data collection, special efforts such as call-backs in telephone interviews and follow-ups in personal interviews or mail inquiries can reduce non-response dramatically. Other techniques include subsampling of non-respondents and randomized response for sensitive questions.

5.3 Questionnaire design

Measurements of the study variables on each selected element (individual) are often obtained by asking the respondent a number of pre-designed questions. Personal interviews, telephone interviews, mail surveys, and web surveys all use a questionnaire. Some general guidelines (Lohr, 2000) should be observed when one is writing a questionnaire:

1. Decide what you want to find out. This is the most important step in writing a questionnaire. The questions should be precise and they should elicit accurate answers.

2. Always test your questions before taking the survey. Try the questions on a very small sample of individuals from the target population and make sure that there are no misinterpretations of the questions to be asked. Questions that seem clear to you may not be clear to someone else. Think about different wordings, and think about the diversified backgrounds of the individuals selected.

3. Keep it simple and clear.

4. Use specific questions instead of general ones, if possible. This will promote clear and accurate answers to the questions being asked, and it ensures that accurate answers will most likely be obtained.

5. Decide whether to use open or closed questions. Answers to open questions are of free form, while for closed questions the respondents are forced to choose answer(s) from a pre-specified set of possible answers.

6. Avoid questions that prompt or motivate the respondent to say what you would like to hear. These leading (or loaded) questions can result in serious measurement error problems and bias.

7. Ask only one concept in each question.

8. Pay attention to question-order effects. If you ask more than one question, the order of these questions will play a role. If you ask closed questions with more than two possible answers, the order of these answers should also be considered: some respondents will simply choose the first one or the third one!

5.4 Telephone sampling and web surveys

The use of the telephone in survey data collection is both cost-effective and time-efficient. However, in addition to the issue of how to design the questions, there are several other unique features of telephone surveys.

1. The choice of a sampling frame: there is usually more than one list of telephone numbers available. Typically, not all the numbers in the list belong to the target population, and some members of the target population are not on the list. This situation differs from country to country and place to place. For household surveys, all business numbers should be excluded, and those without a phone will not be reached. Sometimes a phone can be shared by a group of people and sometimes a single person may have more than one number. Adjustment at the estimation stage is necessary to take these features into account.

2. Sample selection: with the difficulties arising from the chosen sampling frame, the selection of a probability sample requires special techniques. The way the numbers are selected, the time of making a call, the person who answers the phone, the way of handling not-reached numbers, etc., all have an impact on the selected sample.

There is an increasing tendency toward doing surveys through the web. Similar to telephone surveys, this is a cheap and quick way of gathering data. However, there are serious problems with this kind of survey. The sample data are obtained essentially from a group of volunteers who are interested in providing information through the web. It is very difficult to control and/or distinguish between the target population and the sampled population. Results from web surveys should always be interpreted with care. The future of web surveys is still uncertain.


Chapter 6

Basic Concepts of Experimental Design

Experimentation allows an investigator to find out what happens to the output variables when the settings of the input variables in a system are purposely changed. In a designed experiment, the values of the input variables are carefully chosen and controlled, and the output variables are regarded as random, in that their values will change over repeated experiments under the same setting of the input variables. We also assume that the setting of the input variables determines, in a way to be discovered, the distribution of the output variables. The population under study is the collection of all possible quantitative settings behind each setting of the experimental factors and is (at least conceptually) infinite. In survey sampling, by contrast, the surveyor passively investigates the characteristics of an output variable y: once a unit i is selected, there is a fixed (non-random) value y_i of the output to be obtained.

Example 6.1 (Tomato Planting) A gardener conducted an experiment to find out whether a change in the fertilizer mixture applied to his tomato plants would result in an improved yield. He had 11 plants set out in a single row; 5 were given the standard fertilizer mixture A, and the remaining 6 were fed a supposedly improved mixture B. The yields of tomatoes from each plant were measured upon maturity.

Example 6.2 (Hardness Testing) An experimenter wishes to determine whether or not four different tips produce different readings on a hardness testing machine. The machine operates by pressing the tip into a metal test coupon, and from the depth of the resulting depression, the hardness of the coupon can be determined.

Four observations for each tip are required.

Example 6.3 (Battery Manufacturing) An engineer wishes to find out the effects of plate material type and temperature on the life of a battery, and to see if there is a choice of material that would give uniformly long life regardless of temperature. He has three possible choices of plate materials, and three temperature levels – 15°F, 70°F, and 125°F – are used in the lab, since these temperature levels are consistent with the product end-use environment. Battery life is observed at various material-temperature combinations.

An experimental unit is a generic term that refers to a basic unit such as material, animal, person, plant, time period, or location, to which a treatment is applied. The input variables are referred to as factors, with different levels that can be controlled or set by the experimenter. For instance, a chemical reaction is controlled by temperature, pressure, concentration, duration, and so on. A treatment is a combination of factor levels; when there is only one factor, its levels are the treatments. The output variable in an experiment is also called the response. The process of choosing a treatment, applying it to an experimental unit, and obtaining the response is called an experimental run.

6.1 Five broad categories of experimental problems

1. Treatment comparisons. The main purpose is to compare several treatments and select the best ones. Normally, it implies that a product can be obtained in a number of different ways, and we want to know which one is the best by some standard.

2. Variable screening. The output is likely being influenced by a number of factors. Is it possible that only some of them are crucial and some of them can be dropped from consideration?

3. Response surface exploration. Suppose a few factors have been determined to have crucial influences on the output. We may then search for a simple mathematical relationship between the values of these factors and the output.

4. System optimization. The purpose of most (statistical) experiments is to find the best possible setting of the input variables. The output of an experiment can be analyzed to help us achieve this goal.

5. System robustness. Suppose the system is approximately optimized at two (or more) possible settings of the input variables. However, in mass production, it could be costly to control the input variables precisely, and the system deteriorates when the values of the input variables deviate from these settings. A setting is most robust if the system deteriorates least.

6.2 A systematic approach to the planning and implementation of experiments

Just like in survey sampling, it is very important to plan ahead. (Come to our statistical consulting centre before doing your experiment. It can be costly to redo the experiment.) A poor design may capture little information which no analysis can rescue. The basic principle is to obtain the information you need efficiently. The following five-step procedure is directly from Wu and Hamada (2000):

1. State objectives. What do you want to achieve? (This is usually from your future boss. It could be something you want to demonstrate, and you hope that the outcome will impress your future boss.)

2. Determine the response. What do you plan to observe? This is similar to the variable of interest in survey sampling.

3. Choose factors and levels. Factors may be quantitative or qualitative. How much fertilizer you use is a quantitative factor; what kind of fertilizer you use is a qualitative factor. To study the effect of factors, two or more levels of each factor are needed.

4. Work out an experimental plan.

5. Perform the experiment. Make sure you carry out the experiment as planned. If practical situations arise such that you have to alter the plan, be sure to record it in detail.

6.3 Three fundamental principles

There are three fundamental principles in experimental design, namely replication, randomization, and blocking.

1. Replication

When a treatment is applied to several experimental units, we call it replication. Even under the same treatment, the outcomes of the response variable will differ from unit to unit. This variation reflects the magnitude of experimental error. We define the treatment effect as the expected value (mathematical expectation in the language of probability theory) of the response variable (measured against some standard).

The treatment effect will be estimated based on the outcome of the experiment, and the variance of the estimate reduces when the number of replications, or replicates, increases. It is therefore important to increase the number of replicates if we intend to detect small treatment effects. For example, if you want to determine if a drug can reduce the breast cancer prevalence by 50%, you probably need only recruit 1,000 women, while to detect a reduction of 5% you may need to recruit 10,000 women. Remember: if you apply a treatment to one experimental unit but measure the response variable 5 times, you do not have 5 replicates; you have 5 repeated measurements. Repeated measurement helps to reduce the measurement error, not the experimental error.

2. Randomization

In applications, the experimental units are not identical despite our effort to make them alike. To prevent unwanted influence of subjective judgment, the units should be allocated to treatments in random order. The responses should also be measured in random order (if possible). Randomization provides protection against variables (factors) that are unknown to the experimenter but may impact the response.

3. Blocking

Some experimental units are known to be more similar to each other than others. Sometimes we may not have a single large group of alike units for the entire experiment, and several groups of units will have to be used. Units within a group are more homogeneous but may differ a lot from group to group. These groups are referred to as blocks. It is desirable to compare treatments within the same block, so that the block effects are eliminated in the comparison of the treatment effects. An effective blocking scheme removes the block-to-block variation, and applying the principle of blocking makes the experiment more efficient. Randomization can then be applied to the assignments of treatments to units within the blocks to further reduce (balance out) the influence of unknown variables. Here is the famous doctrine in experimental design: block what you can and randomize what you cannot.

In the following chapters some commonly used experimental designs are presented and these basic principles are applied.

Chapter 7

Completely Randomized Design

We consider experiments with a single factor. The goal is to compare the treatments. The tomato plant example is typical, where we wish to compare the effect of two fertilizer mixtures on the yield of tomatoes. We also assume the response variable y is quantitative.

7.1 Comparing 2 treatments

Suppose we want to compare the effects of two different treatments, and there are n experimental units available. We may allocate treatment 1 to n_1 units and treatment 2 to n_2 units, with n = n_1 + n_2. When the n experimental units are homogeneous, the allocation should be completely randomized to avoid possible influences of unknown factors. Once the observations are obtained, we have sample data as follows:

    y_11, y_12, . . . , y_1,n_1   and   y_21, y_22, . . . , y_2,n_2 .

A commonly used statistical model for a single factor experiment is

    y_ij = μ_i + e_ij ,   j = 1, . . . , n_i ,  i = 1, 2,        (7.1)

with μ_i being the expectation of y_ij, i.e. E(y_ij) = μ_i, and the e_ij being error terms resulting from repeated experiments, assumed independent and identically distributed as N(0, σ²).

The above model is, however, something we assume. It may be a good approximation to the real world problem under study; it can also be irrelevant to a particular experiment. For most experiments with a quantitative response variable, the above model works well. Note that the variance σ² is assumed to be the same for both treatments.

The statistical analysis of the experiment focuses on answering the question "Is there a significant difference between the two treatments?" and, if the answer is yes, trying to identify which treatment is preferable. If a larger value of μ_i means a better treatment, then the conclusion of the analysis can be used to decide which treatment to use in future applications. This is equivalent to testing one of the two types of statistical hypothesis:

(1) H_0: μ_1 = μ_2 versus H_1: μ_1 ≠ μ_2, and (2) H_0: μ_1 ≤ μ_2 versus H_1: μ_1 > μ_2.

It could certainly also be μ_1 > μ_2 versus μ_1 ≤ μ_2, but this problem is symmetric to the case of (2). The H_0 is referred to as the null hypothesis, and H_1 as the alternative hypothesis, or simply the alternative.

A key step in constructing the test is to first estimate the unknown means μ_1 and μ_2. Usually, we estimate them by

    μ̂_1 = ȳ_1· = (1/n_1) Σ_{j=1}^{n_1} y_1j   and   μ̂_2 = ȳ_2· = (1/n_2) Σ_{j=1}^{n_2} y_2j .

Under the assumed model, it is easy to verify that E(μ̂_i) = μ_i for i = 1, 2, so they are both unbiased estimators. Further, we have Var(μ̂_i) = σ²/n_i, i = 1, 2. It is now clear that the larger the sample size n_i, the smaller the variance of the point estimator μ̂_i. To have a good estimate of μ_1, we should make n_1 large; to have a good estimate of μ_2, we should make n_2 large. Replications reduce the experimental error and ensure better point estimates for the unknown parameters, and consequently a more reliable test for the hypothesis.

The common variance σ² can be estimated by

    s_p² = (1/(n_1 + n_2 − 2)) [ Σ_{j=1}^{n_1} (y_1j − ȳ_1·)² + Σ_{j=1}^{n_2} (y_2j − ȳ_2·)² ] .

This is also called the pooled variance estimator, as it uses the y values from both treatments. It can be shown that E(s_p²) = σ², i.e. s_p² is an unbiased estimator for σ². The test procedures are presented in the next section.

Finally, we may estimate µ1 − µ2 by µ̂1 − µ̂2 = ȳ1· − ȳ2·. With the assumed independence,

    Var(µ̂1 − µ̂2) = σ² (1/n1 + 1/n2).

This variance becomes small if both n1 and n2 are large. In practice, we often have limited resources, so that n = n1 + n2 has to be fixed. In this case we should make n1 = n2 (or as close as possible) to minimize the variance of µ̂1 − µ̂2.

7.2 Hypothesis test under normal models

A statistical hypothesis test is a decision-making process: one has to decide whether to reject the null hypothesis H0 based on the information from the sample data. This usually involves the following steps:

1. Start by assuming H0 is true, and then try to see whether the information from the sample data supports this claim.

2. Find a test statistic T = T(X1, ..., Xn). This is often related to the point estimators for the parameters of interest. The test statistic T needs to satisfy two crucial criteria: (i) the value of T is computable solely from the sample data; (ii) the sampling distribution of T is known if H0 is true.

3. Determine a critical (rejection) region {(X1, ..., Xn) : T(X1, ..., Xn) ∈ C} such that P(T ∈ C | H0) ≤ α for a prespecified α (usually α = 0.01, 0.05 or 0.10). The probability P(reject H0 | H0 is true) is called the type I error probability, and such a test is called an α level significance test.

4. Reach a final conclusion for the given sample data: if T ∈ C, reject H0; otherwise, we fail to reject H0.

We now elaborate the above general procedures through the following commonly used two-sample tests.

1. Two sided test

Suppose we wish to test H0: µ1 = µ2 versus H1: µ1 ≠ µ2. This is the so-called two sided test problem, since the alternative includes both possibilities, µ1 > µ2 and µ1 < µ2. An intuitive argument for the test is as follows: µ1 − µ2 can be estimated by ȳ1· − ȳ2·; if H0 is true, then µ1 − µ2 = 0.

Consequently we would expect ȳ1· − ȳ2· to also be close to, or at least not far away from, 0. If |ȳ1· − ȳ2·| > c for a certain constant c, we have evidence against H0 and therefore should reject H0. The c is determined by P(|ȳ1· − ȳ2·| > c | µ1 = µ2) = α for the given α (usually a small positive constant).

Under the assumed normal model (7.1), yij ~ N(µi, σ²) and all the yij's are independent of each other, so

    T = [ (ȳ1· − ȳ2·) − (µ1 − µ2) ] / [ σ √(1/n1 + 1/n2) ]

is distributed as N(0, 1). If H0: µ1 = µ2 is true, then

    T0 = (ȳ1· − ȳ2·) / [ σ √(1/n1 + 1/n2) ]

is also distributed as N(0, 1). Since P(|T0| > Zα/2 | H0) = α, we reject H0 if |T0| > Zα/2, where Zα/2 is the upper α/2 critical value (the 1 − α/2 quantile) of the N(0, 1) distribution.

The underlying logic of this decision rule is as follows. If H0 is true, the chance of observing a T0 such that |T0| > 1.96 is only 1 out of 20: P(|T0| > 1.96) = 0.05. If the T0 computed from the data is too large, say 2.4 or 5.5, or too small, say −3.4 or −5.5, we may start thinking: something must be wrong, because it is very unusual for a N(0, 1) random variable to take values as extreme as these. So what is wrong? The model could be wrong, the experiment could be poorly conducted, the computation could be wrong. However, if these possibilities can be ruled out, we may come to the conclusion that maybe the hypothesis H0: µ1 = µ2 is wrong! The data are not consistent with H0: we therefore reject this hypothesis.

Note that we could make a wrong decision in the process. The H0 may indeed be true and T0 distributed as N(0, 1); it just happened that we observed an extreme value of T0. In that case we still have to follow the rule and reject H0. The error rate, however, is controlled by α.

The above test cannot be used if the population variance σ² is unknown, since then the value of T0 cannot be computed from the sample data. In this case σ² has to be estimated by the pooled variance estimator s²p, and the test statistic becomes

    T0 = (ȳ1· − ȳ2·) / [ sp √(1/n1 + 1/n2) ],
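As a quick illustration, the pooled variance estimator and the statistic T0 can be computed with a few lines of Python. The two samples below are hypothetical, and the critical value 2.306 = t0.025(8) is taken from a t table:

```python
from math import sqrt

def pooled_t(y1, y2):
    """Two-sample t statistic using the pooled variance estimator s_p^2."""
    n1, n2 = len(y1), len(y2)
    ybar1 = sum(y1) / n1
    ybar2 = sum(y2) / n2
    ss1 = sum((y - ybar1) ** 2 for y in y1)
    ss2 = sum((y - ybar2) ** 2 for y in y2)
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)          # pooled variance estimator
    t0 = (ybar1 - ybar2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t0, sp2

# Hypothetical data: two treatments with n1 = n2 = 5 runs each.
y1 = [9.2, 10.1, 9.8, 10.4, 9.5]
y2 = [10.9, 11.3, 10.6, 11.0, 11.2]
t0, sp2 = pooled_t(y1, y2)
# Reject H0: mu1 = mu2 at alpha = 0.05 if |t0| > t_{0.025}(8) = 2.306.
```

For these (made-up) numbers |T0| far exceeds 2.306, so the two-sided test rejects H0 at the 5% level.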

which has a t-distribution with n1 + n2 − 2 degrees of freedom if H0 is true. We reject H0 if |T0| > tα/2(n1 + n2 − 2), where tα denotes the upper α critical value. The resulting test is the well-known two-sample t-test.

2. One sided test

It is often the case that the experiment is designed to dispute the claim H0: µ1 = µ2 in favor of the one sided alternative H1: µ1 > µ2. For instance, one may wish to claim that a certain new treatment is better than the old one. The test statistic T0 can also be used in this case, but we reject H0 only if T0 > tα(n1 + n2 − 2). The general decision rule should follow that evidence against H0 should be in favor of H1: a large negative value of T0 provides evidence against H0: µ1 = µ2, but it does not support the alternative H1: µ1 > µ2.

Sometimes a test for H0: µ1 ≤ µ2 versus H1: µ1 > µ2 may be of interest. We again reject H0 if T0 > tα(n1 + n2 − 2). It should be noted that, under H0: µ1 ≤ µ2, T0 is NOT distributed as t(n1 + n2 − 2): the term µ1 − µ2 does not vanish from T under H0, which only states µ1 − µ2 ≤ 0. We do, however, have

    P(T0 > tα(n1 + n2 − 2) | H0) ≤ P(T > tα(n1 + n2 − 2) | µ1 = µ2) = α,

so the type I error probability is still controlled by α.

Similarly, to test H0: µ1 = µ2 versus H1: µ1 < µ2, which is a symmetric situation to the foregoing one, we reject H0 if T0 < −tα(n1 + n2 − 2), since a large negative value of T0 = (ȳ1· − ȳ2·)/(sp √(1/n1 + 1/n2)) provides evidence against H0 in favor of this alternative.

3. A test for equal variances

Our previous tests assume σ²1 = σ²2, where σ²i = Var(yij) for i = 1, 2. This claim can itself be tested before we examine the means. Note that σ²1 and σ²2 can be estimated by the two sample variances

    s²1 = (1/(n1 − 1)) Σj (y1j − ȳ1·)²  and  s²2 = (1/(n2 − 1)) Σj (y2j − ȳ2·)².

To test H0: σ²1 = σ²2 versus H1: σ²1 ≠ σ²2, the ratio of s²1 and s²2 is used as the test statistic:

    F0 = s²1 / s²2.

Under the normal model (7.1), and if H0 is true, F0 is distributed as F(n1 − 1, n2 − 1). We reject H0 if F0 > Fα/2(n1 − 1, n2 − 1) or F0 < F1−α/2(n1 − 1, n2 − 1), where Fα denotes the upper α critical value. Note: most F distribution tables contain only values for high percentiles; values for low percentiles can be obtained using F1−α(n1, n2) = 1/Fα(n2, n1).

4. The p-value

We reject H0 if the test statistic has an extremely large or small observed value when compared to the known distribution of T0 under H0. The concept of "more extreme" is case dependent. Let Tobs be the observed value of the test statistic T0 computed from the sample data, and let T be a random variable following the distribution to which T0 is compared. The p-value is defined as

    p = P(T is more extreme than Tobs).

For the two sided t-test, p = P(|T| > |Tobs|), where T ~ t(n1 + n2 − 2); for the one sided t-test of H0: µ1 ≤ µ2 versus H1: µ1 > µ2, the p-value is computed as p = P(T > Tobs). The smaller the p-value, the stronger the evidence against H0. The H0 will have to be rejected whenever the p-value is smaller than or equal to α.

For instance, if n1 = n2 = 5 and α = 0.05, we reject H0: µ1 = µ2 whenever |T0| > t0.025(8) = 2.306. If we observed T0 = 2.400 or T0 = 5.999, we would reject H0 in both cases; however, the case of T0 = 5.999 would provide stronger evidence against H0 than that of T0 = 2.400, and this is exactly what the p-value quantifies.

Example 7.1 An engineer is interested in comparing the tension bond strength of portland cement mortar of a modified formulation to that of the standard formulation. The experimenter has collected 10 observations of strength under each of the two formulations. The data are summarized in the following table.

    Formulation   ni    ȳi·     s²i
    Standard      10   16.76   0.100
    Modified      10   17.92   0.061
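The variance-ratio test, including the reciprocal trick for lower percentiles, can be sketched in a few lines. The two samples below are hypothetical, and 4.03 = F0.025(9, 9) is taken from an F table:

```python
def variance_ratio(y1, y2):
    """F0 = s1^2 / s2^2 for testing H0: sigma1^2 = sigma2^2."""
    def s2(y):
        yb = sum(y) / len(y)
        return sum((v - yb) ** 2 for v in y) / (len(y) - 1)
    return s2(y1) / s2(y2)

# With n1 = n2 = 10 the two-sided alpha = 0.05 critical values are
# F_{0.025}(9, 9) = 4.03 (upper) and its reciprocal (lower), using
# F_{1-a}(n1, n2) = 1 / F_a(n2, n1).
upper = 4.03
lower = 1 / upper

# Hypothetical samples:
y1 = [16.8, 16.6, 16.9, 16.7, 16.5, 16.8, 16.9, 16.6, 16.7, 16.8]
y2 = [17.9, 18.0, 17.8, 18.1, 17.9, 17.8, 18.0, 17.9, 18.1, 17.8]
F0 = variance_ratio(y1, y2)
reject = F0 > upper or F0 < lower   # here F0 is well inside (lower, upper)
```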

A completely randomized design would choose 10 runs out of the sequence of a total number of 20 runs at random and assign the modified formulation to these runs, while the standard formulation is assigned to the remaining 10 runs. Let yij be the observed strength for the jth run under formulation i (= 1 or 2). We assume yij ~ N(µi, σ²i), and we would like first to test H0: σ²1 = σ²2.

The observed F statistic is F0 = s²1/s²2 = 1.6393. This is compared to F0.025(9, 9) = 4.03. Since F0 < 4.03 (and F0 > 1/4.03), we don't have enough evidence against H0, so σ²1 = σ²2 is a reasonable assumption. (Note: if F0 < 1, we would have to compare F0 to the lower critical value 1/F0.025(9, 9).)

The primary concern of the experimenter is to see whether the modified formulation produces improved strength. We therefore need to test H0: µ1 = µ2 against H1: µ1 < µ2. The pooled variance estimate is computed as

    s²p = [ (n1 − 1)s²1 + (n2 − 1)s²2 ] / (n1 + n2 − 2) = 0.0805.

The observed value of the T statistic is given by

    T0 = (16.76 − 17.92) / [ √0.0805 √(1/10 + 1/10) ] = −9.14.

Since T0 < −t0.05(18) = −1.734, we reject H0 in favor of H1: the modified formulation does improve the strength. The p-value of this test is given by P[t(18) < −9.14] < 0.0001.

7.3 Randomization test

The test in the previous section is based on the normal model (7.1). When the number of experimental runs n (the sample size) is not large, there is little opportunity for us to verify the validity of this model. If the normality assumption fails, how do we know that our analysis is still valid? There is certainly no definite answer to this. What we really need to examine is whether the statistical decision procedure we used keeps the type I error probability not larger than α.

One strategy for analyzing the data without the normality assumption is to take advantage of the randomization in our design. If there is no difference between the two treatments (as claimed in the null hypothesis), then it really does not matter which 5 y-values are labeled as outcomes of treatment 1: due to randomization, treatment 1 could have been applied to any 5 of the 10 experimental units.

Suppose n = 10 runs are performed in an experiment with n1 = n2 = 5. Let T = µ̂1 − µ̂2.
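The computations of Example 7.1 can be verified directly from the summary statistics:

```python
from math import sqrt

n1 = n2 = 10
ybar1, s1_sq = 16.76, 0.100   # standard formulation
ybar2, s2_sq = 17.92, 0.061   # modified formulation

F0 = s1_sq / s2_sq                                        # = 1.6393
sp2 = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)  # = 0.0805
T0 = (ybar1 - ybar2) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))   # = -9.14

# Critical values used in the text (from F and t tables):
# F_{0.025}(9, 9) = 4.03 and t_{0.05}(18) = 1.734.
```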

This statistic can be computed whenever we pick 5 y-values as y11, ..., y15 and the rest as y21, ..., y25. The current Tobs is just one of the (10 choose 5) = 252 possible outcomes {t1, t2, ..., t252}. Under the null hypothesis, the 252 possible T values are equally likely to occur, i.e. P(T = ti) = 1/252, i = 1, 2, ..., 252. The one we have observed, Tobs, is just an ordinary one; it should not be outstanding. If, however, it turns out that Tobs is one of the largest possible values of T (out of 252 possibilities), it sheds a lot of doubt on the validity of the null hypothesis.

Along this line of thinking, we define the p-value to be the proportion of the T values which are more extreme than Tobs. Once again, the definition of "more extreme than Tobs" depends on the null hypothesis you want to test. If you want to reject H0: µ1 = µ2 and would simply take the alternative as H1: µ1 ≠ µ2, "more extreme" means |T| ≥ |Tobs|; if we wish to test H0: µ1 = µ2 versus H1: µ1 > µ2, it means T ≥ Tobs. For the purpose of computing the proportion, a T value with |T| equal to |Tobs| is counted only as a half.

For example, suppose n1 = n2 = 2, so that T takes (4 choose 2) = 6 possible values, say {2, 3, 6, −2, −3, −6}, and suppose we observe Tobs = 3. Against the two sided alternative, we find that there are 2 + 2 × 0.5 = 3 T values as or more extreme than Tobs, so the proportion (p-value) is 3/6 = 0.50. If, however, we test against the one sided alternative H1: µ1 > µ2, the p-value of the randomization test is computed as 1.5/6 = 0.25.

Interestingly, the outcome of the randomization test is often very close to the outcome of the t-test discussed in the last section. Hence, the randomization adopted in the design of the experiment has not only reduced or eliminated the influence of possible unknown factors; it also enables us to analyze the data without strong model assumptions, and it justifies the use of the t-test even when the normality assumption is not entirely appropriate.
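The enumeration can be done by brute force. The sketch below uses a hypothetical 4-run data set chosen so that its six split statistics are exactly the values {2, 3, 6, −2, −3, −6} of the example above, with ties counted as one half:

```python
from itertools import combinations

def randomization_pvalues(y, n1, t_obs):
    """Enumerate all splits of y into groups of size n1 and len(y) - n1.
    Returns (two-sided, one-sided) randomization p-values for t_obs,
    counting ties with |t_obs| (or t_obs) as one half."""
    idx = range(len(y))
    two = one = total = 0.0
    for grp in combinations(idx, n1):
        g1 = [y[i] for i in grp]
        g2 = [y[i] for i in idx if i not in grp]
        t = sum(g1) / n1 - sum(g2) / (len(y) - n1)
        total += 1
        if abs(t) > abs(t_obs):      # two-sided: strictly more extreme
            two += 1
        elif abs(t) == abs(t_obs):   # tie counts as one half
            two += 0.5
        if t > t_obs:                # one-sided: T > t_obs
            one += 1
        elif t == t_obs:
            one += 0.5
    return two / total, one / total

# Hypothetical data whose 6 split statistics are {2, 3, 6, -2, -3, -6}:
y = [6, 1, -2, -3]
p_two, p_one = randomization_pvalues(y, 2, 3)   # Tobs = 3
```

With Tobs = 3 this reproduces p = 3/6 = 0.50 (two sided) and p = 1.5/6 = 0.25 (one sided).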

7.4 Comparing k (> 2) treatments: one-way ANOVA

Many single-factor experiments involve more than 2 treatments. Suppose there are k (> 2) treatments, and for each treatment i there are ni independent experimental runs. A design is called balanced if n1 = n2 = ... = nk = n. A completely randomized design would randomly assign n runs to treatment 1, n runs to treatment 2, etc. For a balanced single factor design the total number of runs is N = nk.

A normal model for the single factor experiment is

    yij = µi + eij,   i = 1, 2, ..., k,   j = 1, 2, ..., n,    (7.2)

where yij is the jth observation under treatment i, the µi = E(yij) are the fixed but unknown treatment means, and the eij are the random error components, assumed iid N(0, σ²). It is natural to estimate µi by

    µ̂i = ȳi· = (1/n) Σj yij.

Our primary interest is to test whether the treatment means are all the same, i.e. to test H0: µ1 = µ2 = ... = µk versus H1: µi ≠ µj for some (i, j). The appropriate procedure for testing H0 is the analysis of variance.

Decomposition of the total sum of squares: In cluster sampling we had an equality saying that the total variation is the sum of the within cluster variation and the between cluster variation. A similar decomposition holds here:

    Σi Σj (yij − ȳ··)² = n Σi (ȳi· − ȳ··)² + Σi Σj (yij − ȳi·)²,

where ȳ·· = Σi Σj yij/(nk) is the overall average. This equality is usually restated as

    SSTot = SSTrt + SSErr

using three terms of sums of squares: Total (Tot), Treatment (Trt) and Error (Err).

If µ1 = µ2 = ... = µk = µ, the estimated treatment means ȳ1·, ȳ2·, ..., ȳk· are iid random variates with mean µ and variance σ²/n. A combined estimator of the variance σ² is given by

    Σi Σj (yij − ȳi·)² / Σi (n − 1) = SSErr/(N − k).

Another estimator of σ² can be computed based on these means:

    n Σi (ȳi· − ȳ··)² / (k − 1) = SSTrt/(k − 1).

These two estimators are also called the mean squares, denoted by

    MSErr = SSErr/(N − k)  and  MSTrt = SSTrt/(k − 1).

The two numbers in the denominators, N − k and k − 1, are the degrees of freedom of the two MSs.

The F test: The test statistic we use is the ratio of the two estimators of σ²,

    F0 = MSTrt/MSErr = [SSTrt/(k − 1)] / [SSErr/(N − k)].

Under model (7.2), and if H0 is true, F0 is distributed as F(k − 1, N − k). When H0 is false, i.e. the µi's are not all equal, the estimated treatment means ȳ1·, ..., ȳk· will tend to differ from each other, and SSTrt will be large compared to SSErr. So we reject H0 if F0 is too large, i.e. if F0 > Fα(k − 1, N − k). The p-value is computed as

    p = P[F(k − 1, N − k) > F0].

The computational procedures can be summarized using an ANOVA table:

Table 7.2 Analysis of Variance for the F Test

    Source of     Sum of    Degrees of   Mean
    Variation     Squares   Freedom      Squares   F0
    Treatment     SSTrt     k − 1        MSTrt     MSTrt/MSErr
    Error         SSErr     N − k        MSErr
    Total         SSTot     N − 1
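For a balanced design, the decomposition and the F statistic translate directly into code. The data below are hypothetical:

```python
def one_way_anova(groups):
    """Balanced one-way ANOVA: returns (SSTrt, SSErr, SSTot, F0)."""
    k = len(groups)
    n = len(groups[0])
    N = n * k
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / n for g in groups]
    ss_trt = n * sum((m - grand) ** 2 for m in means)
    ss_err = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ss_tot = sum((y - grand) ** 2 for g in groups for y in g)
    F0 = (ss_trt / (k - 1)) / (ss_err / (N - k))
    return ss_trt, ss_err, ss_tot, F0

# Hypothetical data: k = 3 treatments, n = 4 runs each.
groups = [[10, 12, 11, 13], [14, 15, 13, 14], [9, 8, 10, 9]]
ss_trt, ss_err, ss_tot, F0 = one_way_anova(groups)
# The decomposition SSTot = SSTrt + SSErr holds exactly.
```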

Example 7.2 The cotton percentage in a synthetic fiber is the key factor that affects the tensile strength. An engineer uses five different levels of cotton percentage (15, 20, 25, 30, 35) and obtains five observations of the tensile strength at each level. The total number of observations is 25.

i) Describe a possible scenario under which the design is completely randomized.

ii) Complete an ANOVA table and test whether there is a difference among the five mean tensile strengths.

The estimated mean tensile strengths are ȳ1· = 9.8, ȳ2· = 15.4, ȳ3· = 17.6, ȳ4· = 21.6, ȳ5· = 10.8, and the overall mean is ȳ·· = 15.04. The total sum of squares is SSTot = 636.96.

    Source of     Sum of    Degrees of   Mean
    Variation     Squares   Freedom      Squares   F0
    Treatment     475.76        4        118.94    14.76
    Error         161.20       20          8.06
    Total         636.96       24

Note that F0.01(4, 20) = 4.43. Since F0 > 4.43, the p-value is less than 0.01: there is a clear difference among the mean tensile strengths.
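The entries of this ANOVA table can be reproduced from the reported treatment means and SSTot:

```python
means = [9.8, 15.4, 17.6, 21.6, 10.8]   # treatment means, n = 5 runs each
n, k = 5, 5
grand = sum(means) / k                  # overall mean = 15.04
ss_tot = 636.96                          # given in the example
ss_trt = n * sum((m - grand) ** 2 for m in means)   # = 475.76
ss_err = ss_tot - ss_trt                             # = 161.20
F0 = (ss_trt / (k - 1)) / (ss_err / (n * k - k))     # = 14.76
# F_{0.01}(4, 20) = 4.43 from an F table, so the p-value is below 0.01.
```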


Chapter 8

Randomized Blocks and Two-way Factorial Design

We have seen the important role of randomization in designed experiments. In general, randomization reduces or eliminates the influence of the factors not considered in the experiment. It also validates the statistical analysis under the normality assumptions.

In some applications, however, there exist factors which obviously have significant influence on the outcome, but whose effects we are not interested in investigating at the moment. Although randomization tends to balance their influence out, it is more appropriate if arrangements can be made to eliminate their influence altogether. For instance, experimental units often differ dramatically from one to another, and the treatment effects measured from the response variable are then overshadowed by the unit variations. The randomized blocks design is a powerful tool that can achieve this goal.

8.1 Paired comparison for two treatments

Consider an example where two kinds of materials, A and B, used for boys' shoes are compared. We would like to know which material is more durable. The experimenter recruited 10 boys for the experiment. Each boy wore a special pair of shoes: the sole of one shoe was made with A and the sole of the other with B. Whether the left or the right sole was made with A or B was determined by flipping a coin. The durability data were obtained as follows:

    Boy    1     2     3     4     5     6     7     8     9    10
    A    13.2   8.2  10.9  14.3  10.7   6.6   9.5  10.8   8.8  13.3
    B    14.0   8.8  11.2  14.2  11.8   6.4   9.8  11.3   9.3  13.6

If we blindly apply the analysis techniques that are suitable for completely randomized designs, we have ȳA = 10.63, ȳB = 11.04, s²A = 6.01, s²B = 6.34, and pooled variance s² = 6.18. The observed value of the two-sample T statistic is Tobs = 0.369 and the p-value is 0.72. There is no significant evidence of a difference between the two materials based on this test.

If we examine the data more closely, however, we find that (i) the durability measurements differ greatly from boy to boy, but (ii) comparing A and B for each of the ten boys, eight have a higher measurement from B than from A, and the two cases in which material A lasted longer have only small differences. If the two materials were equally durable, an outcome as or more extreme than this has probability of only 5.5% according to the binomial distribution. Thus it is statistically significant that the two materials have different durability. This "significant difference" was not detected by the usual T test because the differences between boys are so large that the difference between the two materials is not large enough to show up.

An important feature of this experiment has been ignored in the above analysis: the data are obtained in pairs. As materials A and B were both worn by the same boy for the same period of time, the observed difference of the response variable for each boy should reflect the difference in materials, not in boys.

A randomization test can also be used here to test the difference between the two materials. If there were no difference between the two materials, the random assignment of A and B to the left or right shoe should only have an effect on the sign associated with each difference. Tossing 10 coins can produce 2^10 = 1024 possible outcomes, and hence 1024 possible signed differences and 1024 possible averages of differences. Let T be the average difference; from the current data, Tobs = 0.41. We find that three of the 1024 possible averages are larger than 0.41, and four give the same value as 0.41. If we split the counts of the equal ones, we obtain a significance level of p = 5/1024 = 0.0049.
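The 1024-term sign-flip enumeration can be reproduced in a few lines. The ten differences B − A below are taken from the durability table; the code recovers the counts quoted above (three averages larger than 0.41 and four equal to it):

```python
from itertools import product

diffs = [0.8, 0.6, 0.3, -0.1, 1.1, -0.2, 0.3, 0.5, 0.5, 0.3]  # B - A per boy
t_obs = sum(diffs) / len(diffs)          # = 0.41

larger = equal = 0
for signs in product([1, -1], repeat=len(diffs)):
    t = sum(s * d for s, d in zip(signs, diffs)) / len(diffs)
    if t > t_obs + 1e-9:                 # strictly larger average
        larger += 1
    elif abs(t - t_obs) <= 1e-9:         # equal to the observed average
        equal += 1

p = (larger + equal / 2) / 2 ** len(diffs)   # = 5/1024, about 0.0049
```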

The T test for paired experiments: For paired experiments, observations obtained from different experimental units tend to have different mean values. Let y1j and y2j be the two observed values of y from the jth unit. A suitable model is as follows:

    yij = µi + βj + eij,   i = 1, 2,   j = 1, ..., n,    (8.1)

where the βj represent the effects due to the experimental units (the boys in the previous example), and they are not all the same. The usual two sample T test, which assumes yij = µi + eij, is no longer valid in this situation. The problem can be solved by working with the differences of the response variables, dj = y2j − y1j, which satisfy

    dj = τ + ej,   j = 1, ..., n,

where τ = µ2 − µ1 is the mean difference between the two treatments and the ej's are iid N(0, σ²τ). The statistical hypothesis is now formulated as H0: τ = 0, and the alternative is H1: τ ≠ 0 or H1: τ > 0. The two model parameters τ and σ²τ can be estimated by

    τ̂ = d̄ = (1/n) Σj dj  and  σ̂²τ = s²d = (1/(n − 1)) Σj (dj − d̄)².

It can be shown that under model (8.1),

    T = (τ̂ − τ) / (sd/√n)

has a t-distribution with n − 1 degrees of freedom. Under the null hypothesis, the observed value of T is computed as

    Tobs = τ̂ / (sd/√n).

For a one-sided test against the alternative τ > 0, we calculate the p-value by P(T > Tobs); for a two-sided test against the alternative τ ≠ 0, we compute the p-value P(|T| > |Tobs|), where T ~ t(n − 1).

Let us re-analyze the data set from the boys' shoes experiment. It is easy to find that d̄ = 0.41 and sd = 0.387, so

    Tobs = 0.41 / (0.387/√10) = 3.35.

Hence the one-sided test gives the p-value P(t(9) > 3.35) = 0.0042, and the two-sided test has p-value 0.0084. There is significant evidence that the two materials are different.
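A sketch of the paired t computation on the same ten differences B − A (the critical value 2.262 = t0.025(9) is from a t table):

```python
from math import sqrt

diffs = [0.8, 0.6, 0.3, -0.1, 1.1, -0.2, 0.3, 0.5, 0.5, 0.3]  # B - A per boy
n = len(diffs)
dbar = sum(diffs) / n                                   # = 0.41
sd = sqrt(sum((d - dbar) ** 2 for d in diffs) / (n - 1))  # about 0.387
t_obs = dbar / (sd / sqrt(n))                           # about 3.35
# Two-sided 5% test: reject H0: tau = 0 since t_obs > t_{0.025}(9) = 2.262.
```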

Remark: The p-values obtained using the randomization test and the t-test are again very close to each other.

Confidence interval for τ = µ2 − µ1: Since

    T = (τ̂ − τ) / (sd/√n)

has a t-distribution with n − 1 degrees of freedom, a confidence interval for τ can be easily constructed. Suppose we want a confidence interval with confidence 95% and there are 10 pairs of observations; then the confidence interval would be

    d̄ ± 2.262 sd/√10.

Note that the quantile used is the upper 2.5% critical value t0.025(9) = 2.262.

8.2 Randomized blocks design

The paired comparison of the previous section is a special case of blocking, which has important applications in many designed experiments. Broadly speaking, factors can be categorized into two types: those whose effects are of primary interest to the experimenter, and those (blocks) whose effects are desired to be eliminated. In the boys' shoes example, our primary interest is to see whether the two types of materials have a significant difference in durability. The effects of the individual boys are obviously large and cannot be ignored; this factor of boys has to be taken into account and is called a blocking factor. The corresponding effect is called the block effect.

In general, blocks are caused by the heterogeneity of the experimental units. When this heterogeneity is taken into account in the design, it becomes a blocking factor, and all treatments are compared within blocks. Within the same block, experimental units are homogeneous, and the between block variability is eliminated by treating blocks as an explicit factor. The block effects themselves are not of any interest to the experimenter; for example, a good fertilizer should work well over a variety of seeds.

An example of randomized blocks design: Suppose in the tomato plant example, four different types of fertilizers were examined, and three types of seeds, denoted by 1, 2 and 3, were used for the experimentation. The factor of fertilizers is of primary interest and has four levels, denoted by A, B, C, and D. The seed types are obviously
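The interval can be computed directly (2.262 = t0.025(9) from a t table; the differences are as before):

```python
from math import sqrt

diffs = [0.8, 0.6, 0.3, -0.1, 1.1, -0.2, 0.3, 0.5, 0.5, 0.3]  # B - A per boy
n = len(diffs)
dbar = sum(diffs) / n
sd = sqrt(sum((d - dbar) ** 2 for d in diffs) / (n - 1))
t_crit = 2.262                       # t_{0.025}(9) from a t table
half = t_crit * sd / sqrt(n)
ci = (dbar - half, dbar + half)      # 95% CI for tau = mu2 - mu1
```

The interval is roughly (0.13, 0.69); it excludes 0, consistent with the rejection of H0: τ = 0.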

important for the plant yield and are treated as blocks. The experimenter adopted a randomized blocks design by applying all four types of fertilizers to each seed. There are a = 4 treatment levels and b = 3 blocks in this example. For each fertilizer-seed combination, several replicates could be conducted; for the model to be considered here, we will assume that there is only one experimental run for each combination (the other situation will be considered later). The outcomes, plant yields, are obtained as follows (rows are seeds 1 to 3, columns are fertilizers A to D; values as recorded):

         A      B      C      D
    1   23.7   25.7   29.9   23.4
    2   30.7   25.5   32.8   18.7
    3   34.2   24.2   30.7   33.2

To limit the effect of earth conditions, these 12 plants should be randomly positioned, and the planting order for each seed is also randomized.

Let yij be the observed response for fertilizer i and seed j, i = 1, 2, ..., a and j = 1, 2, ..., b. The statistical model for this design is

    yij = µ + τi + βj + eij,   i = 1, ..., a,   j = 1, ..., b,    (8.2)

where µ is an overall mean, τi is the effect of the ith treatment (fertilizer), βj is the effect of the jth block (seed), and eij is the usual random error term, assumed iid N(0, σ²). Since the comparisons are relative, we can assume

    Σi τi = 0  and  Σj βj = 0.

The τi's are therefore termed the treatment effects, and the βj's are called the block effects. If we let µij = E(yij), the model implies that µij = µ + τi + βj. The treatment means are µi· = Σj µij/b = µ + τi, and the block means are µ·j = Σi µij/a = µ + βj.

We are interested in testing the equality of the treatment means. The hypotheses of interest are

    H0: µ1· = ... = µa·  versus  H1: µi· ≠ µj· for at least one pair (i, j).

These can also be alternatively expressed as H0: τ1 = ... = τa = 0 versus H1: τi ≠ 0 for at least one i.

Associated with model (8.2), we may write

    yij = ȳ·· + (ȳi· − ȳ··) + (ȳ·j − ȳ··) + (yij − ȳi· − ȳ·j + ȳ··),

where

    ȳi· = (1/b) Σj yij,   ȳ·j = (1/a) Σi yij,   and   ȳ·· = (1/(ab)) Σi Σj yij.

It is easy to see that Σi ȳi·/a = ȳ·· and Σj ȳ·j/b = ȳ··. The above decomposition implies that we can estimate µ by ȳ··, τi by ȳi· − ȳ··, and βj by ȳ·j − ȳ··.

The sum of squares for the treatment,

    SSTrt = b Σi (ȳi· − ȳ··)²,

represents the variation caused by the treatments; its size forms the basis for rejecting the hypothesis of no treatment effects. We can similarly define the block sum of squares,

    SSBlk = a Σj (ȳ·j − ȳ··)².

The size of SSBlk represents the variability due to the block effect. The goal of the randomized blocks design is to remove this effect and to isolate the source of variation due to the treatment effect; we are in general not concerned with testing the block effect. The quantity êij = yij − ȳi· − ȳ·j + ȳ·· is truly the residual that cannot be explained by the various effects. The sum of squares for the residuals represents the remaining sources of variation, not due to the treatment effect or the block effect, and is defined as

    SSErr = Σi Σj (yij − ȳi· − ȳ·j + ȳ··)².

Note that the experiment was designed in such a way that every block meets every treatment level exactly once.

Finally, the total sum of squares SSTot = Σi Σj (yij − ȳ··)² can be decomposed as

    SSTot = SSTrt + SSBlk + SSErr.

It is worthwhile to point out that this perfect decomposition is possible entirely because of the deliberate arrangement of the design: every level of the blocking factor meets every level of the treatment factor an equal number of times among the experimental units. A similar decomposition holds for the degrees of freedom:

    N − 1 = (a − 1) + (b − 1) + (a − 1)(b − 1),

where N = ab is the total number of observations.

Under model (8.2), it can be shown that SSTrt, SSBlk and SSErr are independent of each other. Further, if H0 (no treatment effect) is true,

    F0 = MSTrt/MSErr ~ F[a − 1, (a − 1)(b − 1)],

where MSTrt = SSTrt/(a − 1) and MSErr = SSErr/[(a − 1)(b − 1)] are the mean squares. When a treatment effect does exist, the value of SSTrt will be large compared to SSErr. We reject H0 if F0 > Fα[a − 1, (a − 1)(b − 1)]. Mathematically one can test the block effect using a similar approach, but this is usually not of interest.

Computations are summarized in the following analysis of variance table:

    Source of    Sum of    Degrees of       Mean
    variation    Squares   Freedom          Squares                     F0
    Treatment    SSTrt     a − 1            MSTrt = SSTrt/(a − 1)       MSTrt/MSErr
    Block        SSBlk     b − 1            MSBlk = SSBlk/(b − 1)
    Error        SSErr     (a − 1)(b − 1)   MSErr = SSErr/[(a − 1)(b − 1)]
    Total        SSTot     N − 1

This is the so-called two-way ANOVA table. Note that the F distribution has only been tabulated for selected values of α. The exact p-value, P[F(a − 1, (a − 1)(b − 1)) > F0], where F0 is the actual observed value, can be obtained using Splus or R: one simply types 1 - pf(F0, a-1, (a-1)*(b-1)) to get the p-value.
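A minimal implementation of the randomized blocks decomposition; the 3 × 2 data set below is hypothetical:

```python
def blocks_anova(y):
    """Randomized blocks ANOVA for an a x b table y (rows = treatments,
    columns = blocks, one observation per cell)."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    trt = [sum(row) / b for row in y]                          # ybar_i.
    blk = [sum(y[i][j] for i in range(a)) / a for j in range(b)]  # ybar_.j
    ss_trt = b * sum((m - grand) ** 2 for m in trt)
    ss_blk = a * sum((m - grand) ** 2 for m in blk)
    ss_tot = sum((y[i][j] - grand) ** 2 for i in range(a) for j in range(b))
    ss_err = ss_tot - ss_trt - ss_blk
    F0 = (ss_trt / (a - 1)) / (ss_err / ((a - 1) * (b - 1)))
    return ss_trt, ss_blk, ss_err, F0

# Hypothetical example: a = 3 treatments, b = 2 blocks.
y = [[10, 14], [12, 16], [11, 17]]
ss_trt, ss_blk, ss_err, F0 = blocks_anova(y)
```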

Let us complete the analysis of variance table and test whether a fertilizer effect exists for the data described at the beginning of the section. First compute the four treatment means ȳi· and the three block means; the block means are ȳ·1 = 24.95, ȳ·2 = 27.38, ȳ·3 = 31.95, so the overall mean is

    ȳ·· = (24.95 + 27.38 + 31.95)/3 = 28.09.

Then

    SSTot = Σi Σj y²ij − 12 ȳ··² = 248.69,
    SSTrt = 3 [ Σi ȳi·² − 4 ȳ··² ] = 67.27,
    SSBlk = 4 [ Σj ȳ·j² − 3 ȳ··² ] = 103.30,

and finally SSErr = SSTot − SSTrt − SSBlk = 78.12.

The analysis of variance table can now be constructed as follows:

    Source of    Sum of    Degrees of   Mean
    Variation    Squares   Freedom      Squares   F0
    Treatment     67.27        3         22.42    1.722
    Block        103.30        2         51.65
    Error         78.12        6         13.02
    Total        248.69       11

Since F0 < F0.05(3, 6) = 4.757, we do not have enough evidence to reject H0: there is no significant difference among the four types of fertilizers. The exact p-value can be found using Splus as 1-pf(1.722, 3, 6) = 0.2613.

Confidence intervals for individual effects: When H0 is rejected, i.e. treatment effects do exist, one may wish to estimate the treatment effects τi by τ̂i = ȳi· − ȳ··. To construct a 95% confidence interval for τi, we need to find the variance of τ̂i. The following model assumptions are crucial for the validity of this method:
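The arithmetic of this table is easy to check from the three sums of squares:

```python
ss_trt, ss_blk, ss_tot = 67.27, 103.30, 248.69
a, b = 4, 3
ss_err = ss_tot - ss_trt - ss_blk            # = 78.12
ms_trt = ss_trt / (a - 1)                     # = 22.42
ms_blk = ss_blk / (b - 1)                     # = 51.65
ms_err = ss_err / ((a - 1) * (b - 1))         # = 13.02
F0 = ms_trt / ms_err                          # = 1.722
# F_{0.05}(3, 6) = 4.757 from an F table, so H0 is not rejected;
# in Splus or R, 1 - pf(1.722, 3, 6) gives the p-value 0.2613.
```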

(i) The effects of the blocks and of the treatments are additive, i.e. µij = µ + τi + βj. This assumption can be invalid in some applications, as can be seen in the next section.

(ii) The variance σ² is common for all error terms. This is not always realistic either.

(iii) All observations are independent and normally distributed.

Under the above assumptions it can be shown that (τ̂i − τi)/SE(τ̂i) is distributed as t((a − 1)(b − 1)); note that σ² can be estimated by MSErr. A t confidence interval can then be constructed.

8.3 Two-way factorial design

The experiments we have discussed so far mainly investigate the effect of a single factor on a response. The tomato plant example investigated the factor of fertilizer; in the boys' shoes example, we were interested in the factor of different materials. In the randomized blocks design, the blocking factor comes into the picture, but our analysis still concentrated on a single factor.

Suppose in an experiment we are interested in the effects of two factors, A and B, and both factors are equally important. We assume factor A has a levels and B has b levels. A (balanced) two-way factorial design proposes to conduct the experiment at each treatment (combination of levels of A and B) with the same number of replicates.

A toxic agents example of two-way factorial design: In an experiment we consider two factors: poison, with 3 levels denoted by I, II and III, and treatment, with 4 levels denoted by A, B, C, and D. The response variable is the survival time. For each treatment combination, such as (I, A), (II, B), (III, C), four replicated experimental runs were conducted. The outcomes are summarized as follows:

                                    Treatment
    Poison        A                    B                    C                    D
    I      0.31 0.45 0.46 0.43  0.82 1.10 0.88 0.72  0.43 0.45 0.63 0.76  0.45 0.71 0.66 0.62
    II     0.36 0.29 0.40 0.23  0.92 0.61 0.49 1.24  0.44 0.35 0.31 0.40  0.56 1.02 0.71 0.38
    III    0.22 0.21 0.18 0.23  0.30 0.37 0.38 0.29  0.23 0.25 0.24 0.22  0.30 0.36 0.31 0.33

Both factors are of interest, and in addition the experimenter wishes to see whether there is an interaction between the two factors. The additive model (8.2) used for the randomized blocks design is no longer suitable for this case: the change of the treatment means from µ1· to µ2· may depend not only on the difference between τ1 and τ2 but also on the level of the other factor. This is reflected by the interaction terms γij.

The following statistical model is appropriate for this problem:

    yijk = µ + τi + βj + γij + eijk,    (8.3)

where i = 1, 2, ..., a, j = 1, 2, ..., b, and k = 1, 2, ..., n. In the example a = 3, b = 4, and n = 4, and the total number of observations is abn. The µ can be viewed as the overall mean, the τi's are the effects for factor A, the βj's are the effects for factor B, and the γij are the interactions. The eijk are the error terms and are assumed iid N(0, σ²). Since the comparisons are relative, we can define these parameters such that

    Σi τi = 0,   Σj βj = 0,   Σi γij = 0 for j = 1, ..., b,   and   Σj γij = 0 for i = 1, ..., a.

The key difference between model (8.2) and model (8.3) is not the number of replicates; it is the interaction terms γij. In order to have the capacity of estimating the γij, it is necessary to have several replicates at each treatment combination. Similar to the randomized blocks design, having an equal number of replicates for all treatment combinations will result in a simple statistical analysis and good efficiency in estimation and testing.

Analysis of variance for the two-way factorial design: Let µij = E(yijk) = µ + τi + βj + γij and

    ȳij· = (1/n) Σ_{k=1}^n yijk.

Then ȳij· is a natural estimator of µij. Further, let

    ȳi·· = (1/(bn)) Σ_{j=1}^b Σ_{k=1}^n yijk,    ȳ·j· = (1/(an)) Σ_{i=1}^a Σ_{k=1}^n yijk,
    ȳ··· = (1/(abn)) Σ_{i=1}^a Σ_{j=1}^b Σ_{k=1}^n yijk.

We have a similar but more sophisticated decomposition:

    yijk − ȳ··· = (ȳi·· − ȳ···) + (ȳ·j· − ȳ···) + (ȳij· − ȳi·· − ȳ·j· + ȳ···) + (yijk − ȳij·).

Due to the perfect balance in the number of replicates for each treatment combination, we again have a perfect decomposition of the sum of squares:

    SST = SSA + SSB + SSAB + SSE,

where

    SST  = Σ_{i=1}^a Σ_{j=1}^b Σ_{k=1}^n (yijk − ȳ···)²,
    SSA  = bn Σ_{i=1}^a (ȳi·· − ȳ···)²,
    SSB  = an Σ_{j=1}^b (ȳ·j· − ȳ···)²,
    SSAB = n Σ_{i=1}^a Σ_{j=1}^b (ȳij· − ȳi·· − ȳ·j· + ȳ···)²,
    SSE  = Σ_{i=1}^a Σ_{j=1}^b Σ_{k=1}^n (yijk − ȳij·)².

One can also compute SSE by subtracting the other sums of squares from the total sum of squares. The number of degrees of freedom associated with each sum of squares is
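The decomposition can be checked numerically. The following Python sketch computes each sum of squares from its definition on a small artificial data set (a = b = n = 2; the numbers are made up for illustration) and verifies that SST = SSA + SSB + SSAB + SSE:

```python
# y[i][j][k]: factor A level i, factor B level j, replicate k.
y = [[[3.0, 5.0], [6.0, 8.0]],
     [[2.0, 4.0], [9.0, 11.0]]]
a, b, n = len(y), len(y[0]), len(y[0][0])

ybar = sum(y[i][j][k] for i in range(a) for j in range(b) for k in range(n)) / (a * b * n)
yi = [sum(y[i][j][k] for j in range(b) for k in range(n)) / (b * n) for i in range(a)]
yj = [sum(y[i][j][k] for i in range(a) for k in range(n)) / (a * n) for j in range(b)]
yij = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]

SSA = b * n * sum((m - ybar) ** 2 for m in yi)
SSB = a * n * sum((m - ybar) ** 2 for m in yj)
SSAB = n * sum((yij[i][j] - yi[i] - yj[j] + ybar) ** 2
               for i in range(a) for j in range(b))
SSE = sum((y[i][j][k] - yij[i][j]) ** 2
          for i in range(a) for j in range(b) for k in range(n))
SST = sum((y[i][j][k] - ybar) ** 2
          for i in range(a) for j in range(b) for k in range(n))
assert abs(SST - (SSA + SSB + SSAB + SSE)) < 1e-9  # perfect decomposition
```

The identity holds exactly only because the design is balanced; with unequal cell sizes the cross terms in the decomposition no longer cancel.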

    Effect               A      B      AB            Error      Total
    Degrees of freedom   a−1    b−1    (a−1)(b−1)    ab(n−1)    abn−1

The decomposition of the degrees of freedom is as follows:

    abn − 1 = (a − 1) + (b − 1) + (a − 1)(b − 1) + ab(n − 1).

The mean squares are defined as the sums of squares divided by the corresponding degrees of freedom. The mean squares for each effect are compared to the mean square of error: the F statistic for testing the A effect is F0 = MSA/MSE, and similarly for the B effect and the AB interactions. The analysis of variance table is as follows:

    Source of    Sum of     Degrees of       Mean
    variation    Squares    Freedom          Square     F0
    A            SSA        a − 1            MSA        F0 = MSA/MSE
    B            SSB        b − 1            MSB        F0 = MSB/MSE
    AB           SSAB       (a − 1)(b − 1)   MSAB       F0 = MSAB/MSE
    Error        SSE        ab(n − 1)        MSE
    Total        SST        abn − 1

Numerical results for the toxic agents example: For the data presented earlier, one can complete the ANOVA table for this example as follows (values for the SS and MS are multiplied by 1000):

    Source of         Sum of     Degrees of    Mean
    variation         Squares    Freedom       Square    F0
    A (Poison)        1033.0     2             516.5     F0 = 23.2
    B (Treatment)     922.4      3             307.5     F0 = 13.8
    AB Interaction    250.1      6             41.7      F0 = 1.9
    Error             800.7      36            22.2
    Total             3006.2     47

The p-value for testing the poison effect is P[F(2, 36) > 23.2] < 0.001, and the p-value for testing the treatment effect is P[F(3, 36) > 13.8] < 0.001. We have very strong evidence that both main effects are present. The p-value for testing the interactions is P[F(6, 36) > 1.9] = 0.11, so there is no strong evidence that interactions exist.
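The F statistics in this table can be reproduced from the raw data in a few lines. A Python sketch (the survival times are re-entered from the table of the toxic agents example, and the sums of squares are computed exactly as defined above):

```python
# Rows: poisons I, II, III; within each row: treatments A, B, C, D (n = 4).
y = [
    [[0.31, 0.45, 0.46, 0.43], [0.82, 1.10, 0.88, 0.72], [0.43, 0.45, 0.63, 0.76], [0.45, 0.71, 0.66, 0.62]],
    [[0.36, 0.29, 0.40, 0.23], [0.92, 0.61, 0.49, 1.24], [0.44, 0.35, 0.31, 0.40], [0.56, 1.02, 0.71, 0.38]],
    [[0.22, 0.21, 0.18, 0.23], [0.30, 0.37, 0.38, 0.29], [0.23, 0.25, 0.24, 0.22], [0.30, 0.36, 0.31, 0.33]],
]
a, b, n = 3, 4, 4
g = sum(v for row in y for cell in row for v in cell) / (a * b * n)      # grand mean
ri = [sum(v for cell in row for v in cell) / (b * n) for row in y]       # poison means
cj = [sum(v for row in y for v in row[j]) / (a * n) for j in range(b)]   # treatment means
m = [[sum(cell) / n for cell in row] for row in y]                       # cell means

SSA = b * n * sum((x - g) ** 2 for x in ri)
SSB = a * n * sum((x - g) ** 2 for x in cj)
SSAB = n * sum((m[i][j] - ri[i] - cj[j] + g) ** 2 for i in range(a) for j in range(b))
SSE = sum((v - m[i][j]) ** 2 for i in range(a) for j in range(b) for v in y[i][j])
MSE = SSE / (a * b * (n - 1))                                            # 36 error df
F_A = (SSA / (a - 1)) / MSE
F_B = (SSB / (b - 1)) / MSE
F_AB = (SSAB / ((a - 1) * (b - 1))) / MSE
```

The resulting F statistics agree with the table to the precision shown: F0 = 23.2 for poison, 13.8 for treatment, and 1.9 for the interaction.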

Chapter 9

Two-Level Factorial Design

A general factorial design requires independent experimental runs for all possible treatment combinations. When four factors are under investigation and each factor has three levels, a single replicate of all treatments would involve 3 × 3 × 3 × 3 = 81 runs. A complete replicate of a design with k factors all at two levels requires at least 2 × 2 × · · · × 2 = 2^k observations and is called a 2^k factorial design.

Factorial designs with all factors at two levels are popular in practice for a number of reasons. First, they require relatively few runs: a design with three factors at two levels may have as few as 2³ = 8 runs. Second, designs at two levels are relatively simple and easy to analyze. Third, it is often the case at the early stage of a study that many potential factors are of interest; choosing only two levels for each of these factors and running a relatively small experiment helps to identify the influential factors for further thorough studies with the few important factors only. Lastly, the treatment effects estimated from the two-level design provide directions and guidance in the search for the best treatment settings, and will shed light on complicated situations. One may also conclude that such designs are most suitable for exploratory investigation.

9.1 The 2² design

Suppose there are two factors, A and B, each with two levels called "low" and "high". There are four treatment combinations, which can be represented using one of the following three systems of notation:

    Descriptive          (A, B)     Symbolic
    A low,  B low        (–, –)     (1)
    A high, B low        (+, –)     a
    A low,  B high       (–, +)     b
    A high, B high       (+, +)     ab

If there are n replicates for each of the four treatments, the total number of experimental runs is 4n. Let yijk be the observed values for the response variable, where i, j = 1, 2 index the levels of A and B (1 represents the "low" level and 2 means the "high" level) and k = 1, 2, ..., n indexes the replicates. We also use (1), a, b and ab to represent the totals of all n replicates taken at the corresponding treatment combinations.

Example 9.1 A chemical engineer is investigating the effect of the concentration of the reactant (factor A) and the amount of the catalyst (factor B) on the conversion (yield) in a chemical process. She chooses two levels for both factors, and the experiment is replicated three times for each treatment combination. The data are shown as follows:

                     Replicate
    Treatment    I     II    III    Total
    (–, –)       28    25    27     (1) = 80
    (+, –)       36    32    32     a = 100
    (–, +)       18    19    23     b = 60
    (+, +)       31    30    29     ab = 90

The totals (1), a, b and ab will be conveniently used in estimating the effects of the factors and in the construction of an ANOVA table. The average effect of factor A is defined as

    A = ȳ2·· − ȳ1·· = (a + ab)/(2n) − ((1) + b)/(2n) = [a + ab − (1) − b]/(2n).

The average effect of factor B is defined as

    B = ȳ·2· − ȳ·1· = (b + ab)/(2n) − ((1) + a)/(2n) = [b + ab − (1) − a]/(2n).

The interaction effect AB is defined as the average difference between the effect of A at the high level of B and the effect of A at the low level of B, i.e.

    AB = [(ȳ22· − ȳ12·) − (ȳ21· − ȳ11·)]/2 = [(1) + ab − a − b]/(2n).

These effects are computed using the so-called contrasts for each of the terms, namely

    Contrast(A)  = a + ab − (1) − b,
    Contrast(B)  = b + ab − (1) − a,
    Contrast(AB) = (1) + ab − a − b.

These contrasts can be identified easily using a table of algebraic signs as follows:

                  Factorial Effect
    Treatment   I    A    B    AB
    (1)         +    –    –    +
    a           +    +    –    –
    b           +    –    +    –
    ab          +    +    +    +

The column I represents the total of the entire experiment; the column AB is obtained by multiplying columns A and B. The contrast for each effect is a linear combination of the treatment totals using the plus or minus signs from the corresponding column. For the data presented in Example 9.1, the estimated average effects are

    A  = [90 + 100 − 60 − 80]/(2 × 3) = 8.33,
    B  = [90 + 60 − 100 − 80]/(2 × 3) = −5.00,
    AB = [90 + 80 − 100 − 60]/(2 × 3) = 1.67.

Further, these contrasts can also be used to compute the sums of squares for the analysis of variance:

    SSA  = [a + ab − (1) − b]²/(4n),
    SSB  = [b + ab − (1) − a]²/(4n),
    SSAB = [(1) + ab − a − b]²/(4n).

The total sum of squares is computed in the usual way,

    SST = Σ_{i=1}^2 Σ_{j=1}^2 Σ_{k=1}^n y²ijk − 4n ȳ²···,

and the error sum of squares is obtained by subtraction as

    SSE = SST − SSA − SSB − SSAB.
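These formulas are straightforward to program. A Python sketch for the data of Example 9.1 (n = 3; the replicate values are taken from the table above):

```python
# 2^2 design: effects and sums of squares from the treatment totals.
n = 3
reps = {"(1)": [28, 25, 27], "a": [36, 32, 32], "b": [18, 19, 23], "ab": [31, 30, 29]}
t = {name: sum(v) for name, v in reps.items()}   # totals (1)=80, a=100, b=60, ab=90

A = (t["a"] + t["ab"] - t["(1)"] - t["b"]) / (2 * n)
B = (t["b"] + t["ab"] - t["(1)"] - t["a"]) / (2 * n)
AB = (t["(1)"] + t["ab"] - t["a"] - t["b"]) / (2 * n)

SSA = (t["a"] + t["ab"] - t["(1)"] - t["b"]) ** 2 / (4 * n)
SSB = (t["b"] + t["ab"] - t["(1)"] - t["a"]) ** 2 / (4 * n)
SSAB = (t["(1)"] + t["ab"] - t["a"] - t["b"]) ** 2 / (4 * n)

all_y = [v for vs in reps.values() for v in vs]
SST = sum(v * v for v in all_y) - sum(all_y) ** 2 / (4 * n)  # 4n*ybar^2 = (total)^2/(4n)
SSE = SST - SSA - SSB - SSAB
```

This reproduces A = 8.33, B = −5.00 and AB = 1.67, together with SST = 323.00.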

The sums of squares can also be computed from the estimated effects, using

    SSA = nA²,   SSB = nB²,   and   SSAB = n(AB)².

The complete ANOVA table is as follows:

    Source of    Sum of     Degrees of    Mean
    Variation    Squares    Freedom       Square     F0
    A            208.33     1             208.33     F0 = 53.15
    B            75.00      1             75.00      F0 = 19.13
    AB           8.33       1             8.33       F0 = 2.13
    Error        31.34      8             3.92
    Total        323.00     11

Both main effects are statistically significant (p-value < 1%). The interaction between A and B is not significant (p-value = 0.183).

9.2 The 2³ design

When three factors A, B and C, each at two levels, are considered, there are 2³ = 8 treatment combinations. The notation (1), a, b, ab, etc., as in the 2² design, is extended here to represent the treatment combinations as well as the totals for the corresponding treatments:

    A    B    C    Total
    –    –    –    (1)
    +    –    –    a
    –    +    –    b
    +    +    –    ab
    –    –    +    c
    +    –    +    ac
    –    +    +    bc
    +    +    +    abc

We also need a quadruple index to represent the response: yijkl, where i, j, k = 1, 2 represent the "low" and "high" levels of the three factors and l = 1, 2, ..., n represents the n replicates for each of the treatment combinations. The total number of experimental runs is 8n. The three main effects for A, B and C are defined as

    A = ȳ2··· − ȳ1··· = [a + ab + ac + abc − (1) − b − c − bc]/(4n),

    B = ȳ·2·· − ȳ·1·· = [b + ab + bc + abc − (1) − a − c − ac]/(4n),

    C = ȳ··2· − ȳ··1· = [c + ac + bc + abc − (1) − a − b − ab]/(4n).

The AB interaction effect is defined as half the difference between the average A effects at the two levels of B (since both levels of C are included in "B high" and in "B low", we use half of this difference), i.e.

    AB = [(ȳ22·· − ȳ12··) − (ȳ21·· − ȳ11··)]/2 = [(1) + c + ab + abc − a − b − bc − ac]/(4n),

and similarly,

    AC = [(1) + b + ac + abc − a − c − ab − bc]/(4n),
    BC = [(1) + a + bc + abc − b − c − ab − ac]/(4n).

When three factors are under consideration, there will also be a three-way interaction ABC, which is defined as the average difference between the AB interaction at the two different levels of factor C, and is computed as

    ABC = [a + b + c + abc − (1) − ab − ac − bc]/(4n).

The corresponding contrasts for each of these effects can be computed easily using the following table of algebraic signs for the 2³ design:

                        Factorial Effect
    Treatment   I   A   B   AB   C   AC   BC   ABC
    (1)         +   –   –   +    –   +    +    –
    a           +   +   –   –    –   –    +    +
    b           +   –   +   –    –   +    –    +
    ab          +   +   +   +    –   –    –    –
    c           +   –   –   +    +   –    –    +
    ac          +   +   –   –    +   +    –    –
    bc          +   –   +   –    +   –    +    –
    abc         +   +   +   +    +   +    +    +
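One reason this table works so neatly is that any two distinct effect columns are orthogonal, so each of the seven effects is estimated from its own independent contrast of the same eight totals. A small Python sketch that builds the sign columns and verifies this:

```python
# Build the 2^3 sign columns: main effects from the level combinations,
# interaction columns as products of the main-effect columns.
from itertools import product

rows = list(product([-1, +1], repeat=3))   # all eight (A, B, C) level combinations
cols = {"A": [r[0] for r in rows], "B": [r[1] for r in rows], "C": [r[2] for r in rows]}
cols["AB"] = [r[0] * r[1] for r in rows]
cols["AC"] = [r[0] * r[2] for r in rows]
cols["BC"] = [r[1] * r[2] for r in rows]
cols["ABC"] = [r[0] * r[1] * r[2] for r in rows]

# Any two distinct columns have zero dot product; each column with itself gives 8.
names = list(cols)
for u in names:
    for v in names:
        dot = sum(x * y for x, y in zip(cols[u], cols[v]))
        assert dot == (8 if u == v else 0)
```

The ordering of the rows produced by itertools.product differs from the standard order of the table, but orthogonality does not depend on the row order.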

The columns for the interactions are obtained by multiplying the corresponding columns of the factors involved: AB = A × B, ABC = A × B × C, etc. The contrast for each effect is a linear combination of the treatment totals through the sign columns. It can also be shown that the sums of squares for the main effects and interactions can be computed as

    SS = (Contrast)²/(8n).

For instance,

    SSA = [a + ab + ac + abc − (1) − b − c − bc]²/(8n).

The total sum of squares is computed as

    SST = Σ Σ Σ Σ y²ijkl − 8n ȳ²····,

and the error sum of squares is obtained by subtraction:

    SSE = SST − SSA − SSB − SSC − SSAB − SSAC − SSBC − SSABC.

Example 9.2 A soft drink bottler is interested in obtaining more uniform fill heights in the bottles produced by his manufacturing process. Three control variables are considered for the filling process: the percent carbonation (A), the operating pressure in the filler (B), and the bottles produced per minute, or the line speed (C). The response variable is the deviation from the target fill height. The process engineer chooses two levels for each factor and conducts two replicates (n = 2) for each of the 8 treatment combinations. The data, with sign columns for the interactions, are presented in the following table:

                     Factorial Effect                  Replicate
    Treatment   I   A   B   AB   C   AC   BC   ABC     I    II    Total
    (1)         +   –   –   +    –   +    +    –       –3   –1    (1) = –4
    a           +   +   –   –    –   –    +    +       0    1     a = 1
    b           +   –   +   –    –   +    –    +       –1   0     b = –1
    ab          +   +   +   +    –   –    –    –       2    3     ab = 5
    c           +   –   –   +    +   –    –    +       –1   0     c = –1
    ac          +   +   –   –    +   +    –    –       2    1     ac = 3
    bc          +   –   +   –    +   –    +    –       1    1     bc = 2
    abc         +   +   +   +    +   +    +    +       6    5     abc = 11
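The contrasts, effects and sums of squares can then be computed mechanically from the sign columns. A Python sketch for Example 9.2 (the treatment totals are taken from the table above; the helper function effect() is only illustrative):

```python
# 2^3 design: effects and sums of squares from treatment totals via sign columns.
n = 2
order = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
totals = {"(1)": -4, "a": 1, "b": -1, "ab": 5, "c": -1, "ac": 3, "bc": 2, "abc": 11}
# Main-effect sign: +1 where the factor's letter appears in the label, -1 otherwise.
sign = {f: [1 if f in name else -1 for name in order] for f in "abc"}

def effect(factors):
    """Return (Effect, SS): Effect = Contrast/(4n), SS = Contrast^2/(8n).
    Interaction signs are products of the main-effect signs."""
    s = [1] * 8
    for f in factors:
        s = [si * fi for si, fi in zip(s, sign[f])]
    contrast = sum(si * totals[name] for si, name in zip(s, order))
    return contrast / (4 * n), contrast ** 2 / (8 * n)

A, SSA = effect("a")
B, SSB = effect("b")
C, SSC = effect("c")
AB, SSAB = effect("ab")
ABC, SSABC = effect("abc")
```

For example, effect("a") returns (3.0, 36.0): the main effect A and its sum of squares SSA.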

The main effects and interactions can be computed using

    Effect = Contrast/(4n).

For instance,

    A = [−(1) + a − b + ab − c + ac − bc + abc]/(4n)
      = [−(−4) + 1 − (−1) + 5 − (−1) + 3 − 2 + 11]/8
      = 3.00,

    BC = [(1) + a − b − ab − c − ac + bc + abc]/(4n)
       = [−4 + 1 − (−1) − 5 − (−1) − 3 + 2 + 11]/8
       = 0.50,

    ABC = [−(1) + a + b − ab + c − ac − bc + abc]/(4n)
        = [−(−4) + 1 + (−1) − 5 + (−1) − 3 − 2 + 11]/8
        = 0.50.

The sums of squares and analysis of variance are summarized in the following ANOVA table:

    Source of    Sum of     Degrees of    Mean
    variation    Squares    Freedom       Square    F0
    A            36.00      1             36.00     F0 = 57.60
    B            20.25      1             20.25     F0 = 32.40
    C            12.25      1             12.25     F0 = 19.60
    AB           2.25       1             2.25      F0 = 3.60
    AC           0.25       1             0.25      F0 = 0.40
    BC           1.00       1             1.00      F0 = 1.60
    ABC          1.00       1             1.00      F0 = 1.60
    Error        5.00       8             0.625
    Total        78.00      15

All the main effects are significant at the 1% level. None of the two-factor interactions or the three-factor interaction is significant at the 5% level.
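These conclusions can be double-checked against the commonly tabulated critical values of the F(1, 8) distribution, roughly 11.26 at the 1% level and 5.32 at the 5% level. A Python sketch using the sums of squares from the table above:

```python
# Each effect has 1 degree of freedom, so F0 = SS / MSE directly.
MSE = 5.00 / 8                                   # error mean square = 0.625
SS = {"A": 36.00, "B": 20.25, "C": 12.25,
      "AB": 2.25, "AC": 0.25, "BC": 1.00, "ABC": 1.00}
F0 = {name: ss / MSE for name, ss in SS.items()}

# Upper percentage points of F(1, 8), from standard F tables.
F_crit_01, F_crit_05 = 11.26, 5.32

main_sig = all(F0[e] > F_crit_01 for e in ["A", "B", "C"])          # significant at 1%
inter_sig = any(F0[e] > F_crit_05 for e in ["AB", "AC", "BC", "ABC"])  # any at 5%?
```

Here main_sig is True and inter_sig is False, in agreement with the conclusions above.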