Sampling Methods II
(Errors in Sample Survey)
Institute of Statistical Research and Training (ISRT)
University of Dhaka
Introduction
Traditional sampling theory for probability sampling assumes that every unit in the population is accessible and measurable. As a result, it is assumed that the value yi of the ith unit is the correct value for that unit.
The error in an estimate arises solely from the random sampling variation that is present when n of the units are measured instead of the complete population of N units.
These assumptions hold reasonably well in the simpler types of sur-
veys in which the measuring devices are accurate and the quality
of work is high.
In complex surveys, particularly when difficult problems of mea-
surement are involved, the assumptions may be far from true.
Sources of Errors
Failure to measure some of the units in the chosen sample. This
may occur by oversight or, with human populations, because of
failure to locate some individuals or their refusal to answer the
questions when located.
Errors of measurement on a unit. The measuring device may be biased or imprecise. With human populations the respondents may not possess accurate information or they may give biased answers.
Errors introduced in editing, coding and tabulating the results.
These sources of error necessitate a modification of the standard
theory of sampling.
The principal aims of such a modification are to provide guidance
about the allocation of resources between the reduction of random
sampling errors and the reduction of the other errors.
A further aim is to develop methods for computing standard errors and confidence limits that remain valid when the other errors are present.
Effects of Nonresponse
In the study of nonresponse it is convenient to think of the population as divided into two strata: the first consisting of units for which measurements would be obtained if the units happened to fall in the sample, the second of the units for which no measurements would be obtained.
Let N1 , N2 be the number of units in the two strata such that
N = N1 + N2 , the total number of units in the population.
Therefore, W1 = N1 /N , is the proportion of response in the pop-
ulation, and W2 = N2 /N , is the proportion of nonresponse in the
population.
Assume that the sample has been selected using simple random sampling (SRS).
At the end of data collection, we would have a simple random sample from stratum 1 but no data from stratum 2.
Hence the amount of bias in the sample mean is,
E(ȳ) − Ȳ = Ȳ1 − Ȳ
= Ȳ1 − (W1 Ȳ1 + W2 Ȳ2 )
= (1 − W1 )Ȳ1 − W2 Ȳ2
= W2 Ȳ1 − W2 Ȳ2
= W2 (Ȳ1 − Ȳ2 ),
where Ȳ1, Ȳ2 and Ȳ are the true means of stratum 1, stratum 2 and the population, respectively.
That is, the amount of bias is the product of the proportion of nonresponse and the difference between the stratum means.
Since the sample provides no information about Ȳ2, the size of the bias is unknown unless bounds can be placed on Ȳ2 from some source other than the sample data.
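The bias formula W2(Ȳ1 − Ȳ2) can be checked numerically. The following is a minimal sketch; the function name and the stratum values are hypothetical and chosen only for illustration.

```python
def nonresponse_bias(w2, ybar1, ybar2):
    """Bias of the respondent mean, W2 * (Ybar1 - Ybar2)."""
    return w2 * (ybar1 - ybar2)

# Hypothetical population values, chosen only for illustration.
w2 = 0.20                     # proportion of nonresponse, W2
ybar1, ybar2 = 50.0, 40.0     # stratum means: respondents, nonrespondents

# Population mean as the weighted average of the two stratum means.
ybar = (1 - w2) * ybar1 + w2 * ybar2
bias = nonresponse_bias(w2, ybar1, ybar2)

print(f"population mean = {ybar:.1f}, respondent mean = {ybar1:.1f}, bias = {bias:.1f}")
```

The bias equals the difference between the respondent stratum mean and the population mean, so it vanishes only when W2 = 0 or when the two stratum means coincide.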
With a continuous variate, the only bounds that can be assigned
with certainty are often so wide as to be useless.
Consequently, with continuous data, any sizable proportion of non-
response usually makes it impossible to assign useful confidence
limits to Ȳ from the sample results.
We are left in the position of relying on some guess about the size
of the bias, without data to substantiate the guess.
However, in sampling for proportions the situation is a little easier,
since the unknown proportion P2 in stratum 2 must lie between 0
and 1.
If W2 is known, these bounds for P2 enable us to construct confi-
dence limits for the population proportion P .
Suppose that a simple random sample of n units is drawn and that measurements are obtained for n1 of the units in the sample.
Assuming n1 is large enough, 95% confidence limits for P1 are given by
$$p_1 \pm 2\sqrt{p_1 q_1 / n_1},$$
where p1 is the sample proportion, q1 = 1 − p1, and the fpc is ignored.
When we try to derive a confidence statement about P, we are on safe ground if we assume P2 = 0 when finding P̂L and P2 = 1 when finding P̂U.
Thus we might take, for 95% limits,
$$\hat{P}_L = W_1\left(p_1 - 2\sqrt{p_1 q_1 / n_1}\right) + W_2(0)$$
$$\hat{P}_U = W_1\left(p_1 + 2\sqrt{p_1 q_1 / n_1}\right) + W_2(1)$$
It is easy to verify that these limits are conservative, that is,
Pr(P̂L ≤ P ≤ P̂U) > 0.95.
The limits can be narrowed a little by a more careful argument, since P2 cannot be 0 and 1 simultaneously, as was assumed in constructing the limits above.
The limits are distressingly wide unless W2 is very small. The rapid increase in the width of the confidence interval with increasing W2 is evident in the following Table (source: Table 13.2 of Cochran, 2006).
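The conservative limits depend only on n, p1 and W2, so their widths can be recomputed directly. The sketch below assumes n = 1000 and p1 = 50% (both are assumptions made here for illustration) and takes the number of respondents as n1 = W1 n; the function name is hypothetical.

```python
from math import sqrt

def conservative_limits(p1, n, w2):
    """Conservative 95% limits for P when W2 is known (fpc ignored).

    Assumes n1 = W1 * n respondents, P2 = 0 for the lower limit
    and P2 = 1 for the upper limit.
    """
    w1 = 1.0 - w2
    n1 = w1 * n
    half = 2.0 * sqrt(p1 * (1.0 - p1) / n1)
    lower = w1 * (p1 - half)          # + W2 * 0
    upper = w1 * (p1 + half) + w2     # + W2 * 1
    return lower, upper

for w2 in (0.0, 0.05, 0.10, 0.15, 0.20):
    lo, hi = conservative_limits(p1=0.5, n=1000, w2=w2)
    print(f"W2 = {w2:4.0%}: limits {100 * lo:5.1f}% to {100 * hi:5.1f}%, "
          f"half-width {50 * (hi - lo):4.1f}")
```

Even a nonresponse rate of 5% nearly doubles the half-width relative to complete response under these assumptions.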
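It is of interest to examine what values of n would be needed to give the same widths of confidence interval if W2 were zero.
This is easy when p1 is 0.5 because,
$$\begin{aligned}
\hat{P}_L &= W_1\left(p_1 - 2\sqrt{p_1 q_1/n_1}\right) + W_2(0) \\
&= (W_1 + W_2)p_1 - \left(2W_1\sqrt{p_1 q_1/n_1} + W_2 p_1\right) \\
&= p_1 - \left(2W_1\sqrt{p_1 q_1/n_1} + W_2 p_1\right)
\end{aligned}$$
$$\begin{aligned}
\hat{P}_U &= W_1\left(p_1 + 2\sqrt{p_1 q_1/n_1}\right) + W_2(1) \\
&= W_1\left(p_1 + 2\sqrt{p_1 q_1/n_1}\right) + W_2(2p_1) \\
&= (W_1 + W_2)p_1 + \left(2W_1\sqrt{p_1 q_1/n_1} + W_2 p_1\right) \\
&= p_1 + \left(2W_1\sqrt{p_1 q_1/n_1} + W_2 p_1\right)
\end{aligned}$$
so the limits are symmetric about p1, with half-width $2W_1\sqrt{p_1 q_1/n_1} + W_2 p_1$.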
For W2 = 5%, we observed in the previous Table that the half-width of the confidence interval is 5.6%.
Therefore, the equivalent sample size ne, assuming no nonresponse, is found from the equation
$$5.6 = 2\sqrt{50 \times 50 / n_e},$$
which gives ne ≈ 320.
For W2 = 10, 15 and 20%, the values of ne are 155, 90 and 60, respectively.
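The equivalent sample sizes can be reproduced approximately with a short calculation. This is a sketch under the assumptions n = 1000 and p1 = 50% (n is not stated with the figures above and is assumed here); the function names are hypothetical, and the results agree with the quoted values only after rounding.

```python
from math import sqrt

def half_width_pct(w2, n=1000, p1=0.5):
    """Half-width (in percent) of the conservative limits when p1 = 0.5."""
    w1 = 1.0 - w2
    n1 = w1 * n
    return 100.0 * (2.0 * w1 * sqrt(p1 * (1.0 - p1) / n1) + w2 * p1)

def equivalent_n(half_pct, p1=0.5):
    """n_e solving half_pct = 2 * sqrt((100 p1)(100 q1) / n_e)."""
    return 4.0 * (100.0 * p1) * (100.0 * (1.0 - p1)) / half_pct ** 2

for w2 in (0.05, 0.10, 0.15, 0.20):
    h = half_width_pct(w2)
    print(f"W2 = {w2:.0%}: half-width {h:.1f}, equivalent n about {equivalent_n(h):.0f}")
```

With 20% nonresponse, a sample of 1000 carries roughly the information of a complete-response sample of about 60 units, which is the point of the remark that follows.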
It is evidently worthwhile to devote a substantial proportion of the
resources to the reduction of nonresponse.
If the population nonresponse rate W2 is not known, as will usually be the case, conservative confidence limits can be calculated from the sample data by a method suggested by a student.
In calculating the lower limit, assume that all sample nonrespondents would have given a negative response.
In calculating the upper limit, assume that all sample nonrespondents would have given a positive response.
For example, suppose n = 1000, n1 = 800, and p1 = 10%, so that 80 sample members give a positive response and the sample nonresponse rate is 20%.
Then, in percents,
$$\hat{P}_L = 100(80/1000) - 2\sqrt{(8)(92)/1000} = 6.3\%$$
$$\hat{P}_U = 100(280/1000) + 2\sqrt{(28)(72)/1000} = 30.8\%$$
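The worked example above can be reproduced as follows. This is a minimal sketch; the function name is hypothetical and the fpc is ignored, as in the text.

```python
from math import sqrt

def limits_unknown_w2(n, n1, positives):
    """Conservative 95% limits for P when W2 is unknown.

    Lower limit: all n - n1 nonrespondents counted as negative.
    Upper limit: all n - n1 nonrespondents counted as positive.
    """
    n2 = n - n1
    p_low = positives / n
    p_high = (positives + n2) / n
    lower = p_low - 2.0 * sqrt(p_low * (1.0 - p_low) / n)
    upper = p_high + 2.0 * sqrt(p_high * (1.0 - p_high) / n)
    return lower, upper

lower, upper = limits_unknown_w2(n=1000, n1=800, positives=80)
print(f"{100 * lower:.1f}% to {100 * upper:.1f}%")
```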
If W2 is known from previous experience in the particular type of survey, Birnbaum and Sirken (1950) give a method of finding the sample size n that guarantees, with risk α, an absolute error in the sample proportion less than a specified amount d. The method does not require any advance knowledge of P1, P2 or P.
By taking the least favorable combination of the bias W2(P1 − P2) and the value P1, Birnbaum and Sirken show that a value of n that still guarantees an error less than d, with risk α, is approximately
$$n = \frac{t_\alpha^2}{4d(d - W_2)W_1} - 1,$$
where tα is the normal deviate corresponding to the risk α that the error exceeds d. Note that no value of n suffices if W2 > d. Moreover, if W2 = 0, then the formula reduces to the usual one.
Types of Nonresponse
Noncoverage: This is failure to locate or to visit some units in
the sample.
This is a problem with areal sampling units, in which the inter-
viewer must find and list all dwellings (according to some defini-
tion) in a city block.
It arises also from the use of incomplete lists.
Sometimes weather or poor transportation facilities make it im-
possible to reach certain units during the period of the survey.
Not-at-homes: This group contains persons who reside at home
but are temporarily away from the house.
Families in which both parents work and families without children
are harder to reach than families with very young children or with
old people confined to the house.
Unable to answer: The respondent may not have the informa-
tion wanted in certain questions or may be unwilling to give it.
Skillful wording and pretesting of the questionnaire are a safeguard.
The “hard core”: Persons who adamantly refuse to be inter-
viewed, who are incapacitated, or who are far from home during
the whole time available for field work constitute this sector.
It represents a source of bias that persists no matter how much
effort is put into completeness of returns.
More on these Errors
The detection and measurement of noncoverage are difficult. In areal sampling, one method is to revisit the primary units, making a careful listing that serves as a check.
In other cases, comparisons of counts of numbers of people or dwellings with those in another survey sometimes give a warning that some have been missed.
In regard to the not-at-homes, the problem is easier in surveys in
which any adult in the home is capable of answering the questions
than in those in which a single adult, chosen at random, is to be
interviewed.
Call-Backs
A standard technique is to specify the number of call-backs, or a minimum number, that must be made on any unit before abandoning it as "unable to contact".
A Mathematical Model of the Effects of Call-backs
Deming (1953) developed a useful and flexible mathematical model
for examining in more detail the consequences of different call-back
policies.
The population is divided into r classes, according to the proba-
bility that the respondent will be found at home.
Let,
$w_{ij}$ = probability that a respondent in the jth class will be reached on or before the ith call
$p_j$ = proportion of the population falling in the jth class
$\mu_j$ = item mean for the jth class
$\sigma_j^2$ = item variance for the jth class
For simplicity it is assumed that wij > 0 for all classes, although
the method is easily adapted to include persons impossible to
reach.
If ȳij is the mean for those in class j who were reached on or before
the ith call, it is also assumed that E (ȳij ) = µj .
The true population mean for the item is
$$\bar{\mu} = \sum_{j} p_j \mu_j.$$
However, after i calls the persons in the sample can be classified
into (r + 1) classes as follows: in the first class and interviewed, in
the second class and interviewed; and so on.
The (r + 1)th class consists of all those not yet interviewed after i
calls.
If the finite population correction (fpc) is ignored, the numbers falling in these (r + 1) classes are distributed according to the multinomial
$$\left[ w_{i1}p_1 + w_{i2}p_2 + \dots + w_{ir}p_r + \left(1 - \sum_{j=1}^{r} w_{ij}p_j\right) \right]^{n_0},$$
where n0 is the initial size of the sample.
It follows that the number ni who have been interviewed in the course of i calls is binomially distributed with number of trials n0 and probability of success $\sum_{j=1}^{r} w_{ij}p_j$.
Hence,
$$E(n_i) = \text{expected number of interviews in } i \text{ calls} = n_0 \sum_{j=1}^{r} w_{ij}p_j.$$
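For fixed ni, the numbers of interviews nij obtained (j = 1, 2, ..., r) follow a multinomial with probabilities $w_{ij}p_j / \sum_{j=1}^{r} w_{ij}p_j$.
It follows that
$$E(n_{ij} \mid n_i) = \frac{n_i w_{ij} p_j}{\sum_{j=1}^{r} w_{ij} p_j}.$$
Hence, if ȳi is the sample mean obtained after i calls,
$$\begin{aligned}
E(\bar{y}_i \mid n_i) &= E\left(\frac{\sum_j n_{ij}\bar{y}_{ij}}{n_i}\right) = E\left[E\left(\frac{\sum_j n_{ij}\bar{y}_{ij}}{n_i} \,\Big|\, n_{ij}\right)\right] \\
&= \frac{\sum_{j=1}^{r} n_i w_{ij} p_j \mu_j}{n_i \sum_{j=1}^{r} w_{ij} p_j} = \frac{\sum_{j=1}^{r} w_{ij} p_j \mu_j}{\sum_{j=1}^{r} w_{ij} p_j} = \bar{\mu}_i.
\end{aligned}$$
Since this result does not depend on ni, the unconditional mean of ȳi is also μ̄i. Therefore, the bias is (μ̄i − μ̄).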
The conditional variance of ȳi for given ni is found similarly to be
$$V(\bar{y}_i \mid n_i) = \frac{\sum_{j=1}^{r} w_{ij}p_j\left[\sigma_j^2 + (\mu_j - \bar{\mu}_i)^2\right]}{n_i \sum_{j=1}^{r} w_{ij}p_j}.$$
The unconditional variance, ignoring terms of order $1/n_i^2$, is given approximately by replacing ni in the above equation by its expected value:
$$V(\bar{y}_i \mid i) \approx \frac{\sum_{j=1}^{r} w_{ij}p_j\left[\sigma_j^2 + (\mu_j - \bar{\mu}_i)^2\right]}{n_0 \left(\sum_{j=1}^{r} w_{ij}p_j\right)^2}.$$
Finally, the mean square error of the estimate obtained after i calls is
$$\mathrm{MSE}(\bar{y}_i \mid i) = V(\bar{y}_i \mid i) + (\bar{\mu}_i - \bar{\mu})^2.$$
However, in the estimation, the cost of making i calls must also be considered.
The expected number of new interviews obtained at the kth call is
$$n_0 \sum_{j} \left(w_{kj} - w_{(k-1)j}\right)p_j.$$
Hence, if ck is the cost per completed interview at the kth call, the expected total cost of making i calls is n0 C(i), where
$$C(i) = c_1 \sum_j w_{1j}p_j + c_2 \sum_j (w_{2j} - w_{1j})p_j + \dots + c_i \sum_j (w_{ij} - w_{(i-1)j})p_j.$$
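The quantities in this model can be evaluated together for a candidate call-back policy. The sketch below is illustrative only: the two classes, their first-call probabilities, the costs, and the rule w_ij = 1 − (1 − w_1j)^i used to fill the table of w values are all assumptions, and the function name is hypothetical.

```python
def callback_summary(i, w, p, mu, sigma2, costs, n0):
    """Summaries after i calls; w[k-1][j] holds w_kj, costs[k-1] holds c_k."""
    r = len(p)
    wi = w[i - 1]
    reach = sum(wi[j] * p[j] for j in range(r))       # sum_j w_ij p_j
    mu_bar = sum(p[j] * mu[j] for j in range(r))      # true population mean
    mu_i = sum(wi[j] * p[j] * mu[j] for j in range(r)) / reach
    var = sum(wi[j] * p[j] * (sigma2[j] + (mu[j] - mu_i) ** 2)
              for j in range(r)) / (n0 * reach ** 2)
    cost = 0.0
    for k in range(1, i + 1):
        prev = w[k - 2] if k > 1 else [0.0] * r       # w_0j = 0
        cost += costs[k - 1] * sum((w[k - 1][j] - prev[j]) * p[j] for j in range(r))
    bias = mu_i - mu_bar
    return {
        "expected_n": n0 * reach,
        "mu_i": mu_i,
        "bias": bias,
        "variance": var,
        "mse": var + bias ** 2,
        "expected_cost": n0 * cost,
    }

# Hypothetical inputs: two classes, at home with probability 0.8 and 0.4.
first_call = [0.8, 0.4]
w = [[1 - (1 - pi) ** k for pi in first_call] for k in (1, 2, 3)]
p, mu, sigma2 = [0.5, 0.5], [10.0, 20.0], [4.0, 4.0]
costs = [1.0, 2.0, 2.0]

for i in (1, 2, 3):
    s = callback_summary(i, w, p, mu, sigma2, costs, n0=1000)
    print(i, {key: round(val, 3) for key, val in s.items()})
```

Comparing the MSE against the expected cost for successive values of i is one way to judge how many calls are worth making under a given set of assumptions.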
Hansen and Hurwitz Approach for Nonresponse
Hansen and Hurwitz (1946) suggested taking a random subsample of the persons who have not been reached after the first attempt to reach the persons in the sample has been made.
They recommended making a major effort to interview everyone in the subsample.
This technique was first developed for surveys in which the initial
attempt was made by mail, a subsample of persons who did not
return the completed questionnaire being approached by the more
expensive method of personal interview.
This method can be regarded as an application of the technique of
double sampling for stratification.
Suppose that a sample of $n'$ units has been selected, among which $n'_1$ units responded, so that there are $n'_2 = n' - n'_1$ units that did not respond.
Assume that, by intensive efforts, the data are later obtained from a random subsample of $n_2 = \nu_2 n'_2$ out of the $n'_2$. If $\nu_2 = 1/k$, then $n_2 = n'_2/k$.
In the framework of double sampling with two strata, Stratum 1 consists of those who would respond to a first attempt, with a measured sample of size $n_1 = n'_1$, so that $\nu_1 = 1$, and Stratum 2 consists of those who would respond to the second attempt, with $n_2 = n'_2/k$.
Therefore, the cost of taking the sample can be expressed as
$$c_0 n' + c_1 n'_1 + \frac{c_2 n'_2}{k},$$
where the c's are the costs per unit: c0 is the cost of making the first attempt, c1 is the cost of processing the results from the first attempt, and c2 is the cost of getting and processing the data in the second stratum.
If W1, W2 are the population proportions in the two strata, the expected cost is
$$C = c_0 n' + c_1 W_1 n' + \frac{c_2 W_2 n'}{k}. \qquad (1)$$
An estimator of Ȳ is given as
$$\bar{y}' = w_1\bar{y}_1 + w_2\bar{y}_2 = \frac{n'_1\bar{y}_1 + n'_2\bar{y}_2}{n'},$$
where $w_h = n'_h/n'$, and ȳ1 and ȳ2 are the means of the samples of sizes $n_1 = n'_1$ and $n_2 = n'_2/k$.
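In the framework of double sampling for stratification, the estimator $\bar{y}'$ is unbiased when responses are obtained from all of the selected random subsample of size $n_2 = n'_2/k$:
$$\begin{aligned}
E(\bar{y}') &= E(w_1\bar{y}_1 + w_2\bar{y}_2) \\
&= E_W\left[E(w_1\bar{y}_1 + w_2\bar{y}_2 \mid W_h = w_h)\right] \\
&= E_W\left[E_1E_2(w_1\bar{y}_1 + w_2\bar{y}_2 \mid W_h = w_h)\right] \\
&= E_W\left[(w_1\bar{Y}_1 + w_2\bar{Y}_2) \mid W_h = w_h\right] \\
&= W_1\bar{Y}_1 + W_2\bar{Y}_2 = \bar{Y}.
\end{aligned}$$
Again, using double sampling theory, we have
$$V(\bar{y}') = \left(\frac{1}{n'} - \frac{1}{N}\right)S^2 + \frac{(k-1)W_2S_2^2}{n'}.$$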
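Variance of $\bar{y}'$
We can write
$$V(\bar{y}') = E_W\left[E_1V_2(\bar{y}') + V_1E_2(\bar{y}')\right].$$
Now,
$$E_2(\bar{y}') = E_2(w_1\bar{y}_1 + w_2\bar{y}_2) = w_1\bar{y}_1 + w_2\bar{y}'_2.$$
Therefore,
$$\begin{aligned}
V_1(w_1\bar{y}_1 + w_2\bar{y}'_2) &= V_1(\bar{y}_{n'}) \quad [\bar{y}_{n'} \text{ is the mean from } n' \text{ units}] \\
&= \left(1 - \frac{n'}{N}\right)\frac{S^2}{n'} \\
&= \left(\frac{1}{n'} - \frac{1}{N}\right)S^2.
\end{aligned}$$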
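And,
$$\begin{aligned}
V_2(\bar{y}') &= V_2(w_1\bar{y}_1 + w_2\bar{y}_2) \\
&= w_2^2 V_2(\bar{y}_2) \\
&= \left(\frac{n'_2}{n'}\right)^2\left(1 - \frac{n_2}{n'_2}\right)\frac{S_2^2}{n_2} \\
&= \frac{n'_2}{{n'}^2}\left(\frac{n'_2 - n_2}{n_2}\right)S_2^2 \\
&= \frac{w_2}{n'}\left(\frac{kn_2 - n_2}{n_2}\right)S_2^2 \\
&= \frac{(k-1)w_2S_2^2}{n'}.
\end{aligned}$$
Finally, taking expectation over W we have
$$V(\bar{y}') = \left(\frac{1}{n'} - \frac{1}{N}\right)S^2 + \frac{(k-1)W_2S_2^2}{n'}.$$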
The quantities $n'$ and k are then chosen to minimize the product C(V + S²/N), where
$$V + \frac{S^2}{N} = \frac{S^2 - W_2S_2^2}{n'} + \frac{kW_2S_2^2}{n'}. \qquad (2)$$
By Cauchy's inequality, the optimum k is
$$k_{opt} = \sqrt{\frac{c_2(S^2 - W_2S_2^2)}{S_2^2(c_0 + c_1W_1)}}. \qquad (3)$$
The initial sample size $n'$ may be chosen either to minimize C for specified V or to minimize V for specified C, by solving for $n'$ from (2) or (1).
If V is specified,
$$n'_{opt} = \frac{N\left[S^2 + (k-1)W_2S_2^2\right]}{NV + S^2}. \qquad (4)$$
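Formulas (3) and (4) are straightforward to evaluate once planning values are available. The following is a sketch with hypothetical costs and variances; the function names are illustrative, and the expected cost takes the first-attempt count as W1 n′ and the subsample as W2 n′/k.

```python
from math import sqrt

def k_opt(c0, c1, c2, w2, s2, s2_2):
    """Optimum subsampling ratio k from (3); s2 = S^2, s2_2 = S_2^2."""
    w1 = 1.0 - w2
    return sqrt(c2 * (s2 - w2 * s2_2) / (s2_2 * (c0 + c1 * w1)))

def n_prime_for_variance(v, k, n_pop, w2, s2, s2_2):
    """Initial sample size n' from (4) for a specified variance V."""
    return n_pop * (s2 + (k - 1.0) * w2 * s2_2) / (n_pop * v + s2)

def variance(n_prime, k, n_pop, w2, s2, s2_2):
    """V(ybar') for given n' and k."""
    return (1.0 / n_prime - 1.0 / n_pop) * s2 + (k - 1.0) * w2 * s2_2 / n_prime

def expected_cost(n_prime, k, c0, c1, c2, w2):
    """Expected cost: c0 n' + c1 W1 n' + c2 W2 n' / k."""
    return n_prime * (c0 + c1 * (1.0 - w2) + c2 * w2 / k)

# Hypothetical planning values, for illustration only.
c0, c1, c2 = 0.5, 1.0, 8.0
w2, s2, s2_2, n_pop = 0.5, 100.0, 100.0, 10_000
k = k_opt(c0, c1, c2, w2, s2, s2_2)
n_prime = n_prime_for_variance(0.25, k, n_pop, w2, s2, s2_2)
print(f"k = {k:.2f}, n' = {n_prime:.0f}, "
      f"V = {variance(n_prime, k, n_pop, w2, s2, s2_2):.3f}, "
      f"cost = {expected_cost(n_prime, k, c0, c1, c2, w2):.0f}")
```

Substituting the computed n′ back into the variance formula returns the specified V, which is a useful check on planning inputs.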
The solutions require a knowledge of W2, which can often be estimated from previous experience.
They also involve S², whose value must be estimated in advance, as in any sample size determination problem, and S2², the variance in the nonresponse stratum, which may be harder to predict because it will probably not be the same as S².
If W2 is not well known, a satisfactory approximation is to work out the provisional value of $n'_{opt}$ from (3) and (4) for a range of assumed values of W2 between 0 and a safe upper limit.
Adjustments for Bias Without Call-backs (Politz-Simmons Adjustment)
An ingenious method of diminishing the biases present in the re-
sults of the first call was suggested by Hartley (1946) and developed
by Politz and Simmons (1949, 1950) and Simmons (1954).
Suppose that all calls are made during the evening on the six week-
nights.
The respondent is asked whether he was at home, at the time of
the interview, on each of the five preceding weeknights.
If the respondent states that he was at home t nights out of five,
the ratio (t + 1)/6 is taken as an estimate of the frequency π with
which he is at home during interview hours.
The results from the first call are sorted into six groups according to the value of t (0, 1, 2, 3, 4, 5).
In the tth group let nt be the number of interviews obtained and ȳt the item mean.
The Politz-Simmons estimate of the population mean µ is
$$\bar{y}_{PS} = \frac{\sum_{t=0}^{5} 6\,n_t\bar{y}_t/(t+1)}{\sum_{t=0}^{5} 6\,n_t/(t+1)}. \qquad (5)$$
This approach recognizes that the first call results are unduly
weighted with persons who are at home most of the time.
Since a person who is at home, on the average, a proportion π of
the time has a relative chance π of appearing in the sample, his
response should receive a weight 1/π.
The quantity 6/(t + 1) is used as an estimate of 1/π.
Thus ȳP S is less biased than the sample mean ȳ from the first call,
but its variance is greater, because an unweighted mean is replaced
by a weighted mean with estimated weights.
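The weighting in (5) can be sketched in a few lines. The group counts and means below are hypothetical, chosen so that the arithmetic is easy to follow; the function name is illustrative.

```python
def politz_simmons(n_t, ybar_t):
    """Politz-Simmons estimate from group counts n_t and means ybar_t, t = 0..5."""
    weights = [6.0 / (t + 1) for t in range(6)]      # estimates of 1/pi
    num = sum(w * n * y for w, n, y in zip(weights, n_t, ybar_t))
    den = sum(w * n for w, n in zip(weights, n_t))
    return num / den

# Hypothetical first-call results: more interviews among people usually at home.
n_t = [6, 12, 18, 24, 30, 36]
ybar_t = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

unweighted = sum(n * y for n, y in zip(n_t, ybar_t)) / sum(n_t)
weighted = politz_simmons(n_t, ybar_t)
print(f"unweighted mean = {unweighted:.2f}, Politz-Simmons estimate = {weighted:.2f}")
```

In this constructed example the item mean rises with t, so the unweighted first-call mean overstates the weighted estimate, which gives more weight to the people who are seldom at home.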
Suppose the population is divided into classes, people in the j th
class being at home a fraction πj of the time.
Note that the tth group (i.e. persons at home t nights out of the
preceding five) will contain persons from various classes.
Let njt , ȳjt be the number and the item mean for those in class j
and group t.
Then ȳPS may be written as
$$\bar{y}_{PS} = \frac{\sum_j\sum_t n_{jt}\bar{y}_{jt}\,\dfrac{6}{t+1}}{\sum_j\sum_t n_{jt}\,\dfrac{6}{t+1}} = N/D \text{ (say)}. \qquad (6)$$
This is a ratio type of estimate. In large samples its mean is approximately E(N)/E(D).
If n0 is the initial size of the sample (responses plus not-at-homes), nj is the number from class j who are interviewed, and pj is the proportion of the population in the jth class, the following assumptions are made:
(i) $n_j/n_0$ is a binomial estimate of $p_j\pi_j$
(ii) $E(n_{jt} \mid n_j) = n_j \dfrac{5!}{t!(5-t)!}\pi_j^t(1-\pi_j)^{5-t}$
(iii) $E(\bar{y}_{jt}) = \mu_j$, for any j and t.
For given j, using assumption (ii),
$$\begin{aligned}
E\left(\sum_{t=0}^{5}\frac{6}{t+1}n_{jt}\right) &= n_j\sum_{t=0}^{5}\frac{6}{t+1}\,\frac{5!}{t!(5-t)!}\pi_j^t(1-\pi_j)^{5-t} \\
&= n_j\sum_{t=0}^{5}\frac{6!}{(t+1)!(5-t)!}\pi_j^t(1-\pi_j)^{5-t} \\
&= \frac{n_j}{\pi_j}\sum_{t=0}^{5}\frac{6!}{(t+1)!(5-t)!}\pi_j^{t+1}(1-\pi_j)^{5-t} \\
&= \frac{n_j}{\pi_j}\left[6\pi_j(1-\pi_j)^5 + \dots + \pi_j^6\right] \\
&= \frac{n_j}{\pi_j}\left[(1-\pi_j)^6 + 6\pi_j(1-\pi_j)^5 + \dots + \pi_j^6 - (1-\pi_j)^6\right] \\
&= \frac{n_j}{\pi_j}\left[(\pi_j + 1 - \pi_j)^6 - (1-\pi_j)^6\right] \\
&= \frac{n_j}{\pi_j}\left[1 - (1-\pi_j)^6\right].
\end{aligned}$$
Hence
$$E(D) = \sum_{j=1}^{r}\frac{E(n_j)}{\pi_j}\left[1 - (1-\pi_j)^6\right] = n_0\sum_{j=1}^{r}p_j\left[1 - (1-\pi_j)^6\right],$$
using assumption (i).
Furthermore, since E(ȳjt) = µj for any j and t, this yields
$$E(\bar{y}_{PS}) = \bar{\mu}_{PS} = \frac{\sum_{j=1}^{r}\mu_jp_j\left[1 - (1-\pi_j)^6\right]}{\sum_{j=1}^{r}p_j\left[1 - (1-\pi_j)^6\right]}.$$
Since the true mean is $\bar{\mu} = \sum_{j=1}^{r}p_j\mu_j$, some bias remains in ȳPS.
In a certain sense, this estimate has the same bias as ȳ6, the sample mean given by the call-back method with a requirement that as many as six calls be made if necessary.
Recall that the call-back method, with a total of i calls, gives an unbiased estimate of
$$\bar{\mu}_i = \frac{\sum_{j=1}^{r}w_{ij}p_j\mu_j}{\sum_{j=1}^{r}w_{ij}p_j},$$
where wij is the probability that a person in class j who falls in the sample will be interviewed on or before the ith call.
Therefore, in this case, we have πj = w1j.
If at subsequent calls the probability of finding at home a person not previously reached remains at πj, then
$$w_{ij} = 1 - (1-\pi_j)^i,$$
so that μ̄PS = μ̄6.
However, with the call-back method the probability of an interview
at a later call may be greater than πj as a result of information
obtained by the interviewer at the first or earlier calls.
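The equality μ̄PS = μ̄6 under constant πj can be checked numerically by evaluating E(N)/E(D) directly from the binomial weights rather than from the closed form. The class proportions, at-home probabilities and means below are hypothetical, and the function names are illustrative.

```python
from math import comb

def expected_ps_weight(pi):
    """E[6/(t+1)] when t ~ Binomial(5, pi), as in assumption (ii)."""
    return sum(6 / (t + 1) * comb(5, t) * pi ** t * (1 - pi) ** (5 - t)
               for t in range(6))

def mu_politz_simmons(p, pi, mu):
    """Large-sample mean E(N)/E(D) of the Politz-Simmons estimate."""
    num = sum(pj * x * expected_ps_weight(x) * m for pj, x, m in zip(p, pi, mu))
    den = sum(pj * x * expected_ps_weight(x) for pj, x in zip(p, pi))
    return num / den

def mu_callbacks(i, p, pi, mu):
    """Mean after i calls, assuming w_ij = 1 - (1 - pi_j)^i."""
    w = [1 - (1 - x) ** i for x in pi]
    num = sum(wj * pj * m for wj, pj, m in zip(w, p, mu))
    den = sum(wj * pj for wj, pj in zip(w, p))
    return num / den

# Hypothetical classes: proportions, at-home probabilities, item means.
p, pi, mu = [0.3, 0.7], [0.9, 0.4], [10.0, 20.0]
true_mean = sum(pj * m for pj, m in zip(p, mu))

print(f"true mean      = {true_mean:.3f}")
print(f"first call     = {mu_callbacks(1, p, pi, mu):.3f}")
print(f"six calls      = {mu_callbacks(6, p, pi, mu):.3f}")
print(f"Politz-Simmons = {mu_politz_simmons(p, pi, mu):.3f}")
```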
Although the variance of ȳPS is complicated, with the usual approximation for a ratio estimate it may be expressed, following Deming (1953), as
$$V(\bar{y}_{PS}) = \frac{1}{n_0U}\left\{\sum_j \pi_jp_jB_j\left[\sigma_j^2 + (\mu_j - \bar{\mu}_{PS})^2\right] + (n_0 - 1)\sum_j(\pi_jp_j)^2(B_j - A_j)^2(\mu_j - \bar{\mu}_{PS})^2\right\},$$
where
$$U = 1 - \sum_j p_j(1-\pi_j)^6$$
$$A_j = \frac{1}{\pi_j}\left[1 - (1-\pi_j)^6\right]$$
$$B_j = \sum_{t=0}^{5}\left(\frac{6}{t+1}\right)^2\frac{5!}{t!(5-t)!}\pi_j^t(1-\pi_j)^{5-t}$$
Mechanism of Nonresponse
Most surveys have some residual nonresponse even after careful design and follow-up of respondents.
Let us define the random variable
$$R_i = \begin{cases} 1 & \text{if unit } i \text{ responds} \\ 0 & \text{if unit } i \text{ does not respond.} \end{cases}$$
After sampling, values of the response indicator variable are known for the units selected in the sample.
A value for yi is recorded if ri, the realization of Ri, is 1.
The probability that a unit selected in the sample will respond,
φi = P(Ri = 1),
is of course unknown but assumed positive.
Rosenbaum and Rubin (1983) named φi the propensity score for the ith unit.
Suppose that yi is a response of interest and that xi is a vector of
information known about unit i in the sample.
Also assume that φi is the response propensity associated with
unit i in the sample.
According to Little and Rubin (2002) missing data can be of three
types:
Missing Completely at Random (MCAR): If the response propensity φi does not depend on xi, yi or the survey design, the missing data are missing completely at random. In this case, the sample mean of the respondents ȳR is approximately unbiased for the population mean. Note that the MCAR mechanism is implicitly adopted when nonresponse is ignored.
Missing at Random (MAR): If φi depends on xi but not on yi, then the missing data are called missing at random. In this case, nonresponse depends only on the observed variables. This is sometimes termed ignorable nonresponse. Here, ignorable means that a model can explain the nonresponse mechanism and that the nonresponse can be ignored after the model accounts for it, not that the nonresponse can be completely ignored and complete-data methods used.
Not Missing at Random (NMAR): If the propensity φi depends on the value of a missing response variable and cannot be completely explained by the values in the observed data, then the nonresponse is known as not missing at random.
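The three mechanisms can be contrasted on a toy population. The sketch below uses the large-sample approximation that the respondent mean is close to Σφiyi / Σφi, and adjusts for x by averaging the respondent means within classes of x in proportion to class size; the population, the propensities and the function names are all hypothetical, and the class adjustment is one simple choice of model, not the only one.

```python
def expected_respondent_mean(y, phi):
    """Large-sample approximation: sum(phi_i * y_i) / sum(phi_i)."""
    return sum(f * v for f, v in zip(phi, y)) / sum(phi)

def class_adjusted_mean(x, y, phi):
    """Respondent means within classes of x, weighted by class share."""
    total = 0.0
    for g in set(x):
        idx = [i for i, label in enumerate(x) if label == g]
        group_mean = expected_respondent_mean([y[i] for i in idx],
                                              [phi[i] for i in idx])
        total += len(idx) / len(y) * group_mean
    return total

# Hypothetical population: x is an observed class label, y the item value.
x = ["a", "a", "b", "b"]
y = [10.0, 20.0, 30.0, 40.0]
pop_mean = sum(y) / len(y)

mechanisms = {
    "MCAR": [0.5, 0.5, 0.5, 0.5],                  # constant propensity
    "MAR": [0.8 if g == "a" else 0.4 for g in x],  # depends on x only
    "NMAR": [0.9, 0.7, 0.5, 0.3],                  # depends on y itself
}
for name, phi in mechanisms.items():
    print(f"{name}: unadjusted {expected_respondent_mean(y, phi):.2f}, "
          f"adjusted for x {class_adjusted_mean(x, y, phi):.2f}, "
          f"population mean {pop_mean:.2f}")
```

Under the MAR propensities the unadjusted mean is biased but the adjustment for x recovers the population mean, whereas under the NMAR propensities some bias remains after adjustment.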
Imputation
Imputation is defined as the process of estimating individual missing values in a data set.
A replacement value, often from another person in the survey who is similar to the item non-respondent on other variables, is imputed for the missing value.
When imputation is used, an additional variable should be created for the data set that indicates whether the response was measured or imputed.
A wide variety of imputation methods have been developed for assigning values for missing item responses.
Imputation techniques may be quite useful when the imputation for any missing value is done within a homogeneous imputation class.
Imputation Techniques
Deductive Imputation: This technique imputes values using the logical relations among the variables.
For example, suppose that binary information on the variables victim of any crime and victim of violent crime is collected along with other variables. If the value of the latter variable is missing for an individual who reported not being a victim of any crime, then, logically, it can be deduced that the respondent was not a victim of violent crime.
Again, suppose that a survey collects several pieces of information from a sample of mothers. Mothers were asked about their number of children 2, 4 and 6 years after their marriage. If the information is missing for year 4 but available for years 2 and 6, then sometimes it is possible to deduce the value for year 4 logically, for example when the reported number is the same in years 2 and 6.
Mean Imputation: Missing values can be replaced by the mean
of all the responding values for the variable. This can be done
based on the whole data set or separately for different categories of
respondents defined by combination of selected classification vari-
ables.
Hot-Deck Imputation: In this approach, the sample units are divided into classes so that units within a class can be assumed homogeneous. The value of one of the responding units in the class is substituted for each missing response. Often, the values for a set of related missing items are taken from the same donor, to preserve some of the multivariate relationships. There are alternatives in the choice of donor, such as sequential hot-deck imputation, random hot-deck imputation and nearest-neighbor hot-deck imputation (see Lohr, 2010).
Sequential hot-deck imputation procedures impute the value in the
same subgroup that was last read by the computer. One problem
with using the value on the previous “card” is that often the non-
respondents also tend to occur in clusters, so one person may be a
donor multiple times, in a way that the sampler cannot control.
Random hot-deck imputation chooses the donor randomly from the
persons in the cell with information on all the missing items. To
preserve multivariate relationships, usually values from the same
donor are used for all missing items of a person.
Nearest-neighbor hot-deck imputation chooses the donor based on some distance measure. The donor is the respondent closest to the person with the missing item, where closeness is measured by the distance function.
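Mean imputation and nearest-neighbor hot-deck imputation can be sketched as follows. The records, the variable names and the absolute-difference distance on a single auxiliary variable are assumptions made for illustration; an indicator of imputed values is kept alongside the data.

```python
def mean_impute(values):
    """Replace each missing value (None) by the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def nearest_neighbor_impute(records, target, aux):
    """Fill missing `target` from the respondent closest on `aux`.

    Distance is |aux difference|, an assumption of this sketch.
    A flag 'imputed' records which values were filled in.
    """
    donors = [r for r in records if r[target] is not None]
    completed = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = rec[target] is None
        if new["imputed"]:
            donor = min(donors, key=lambda d: abs(d[aux] - rec[aux]))
            new[target] = donor[target]
        completed.append(new)
    return completed

# Hypothetical records with one missing income.
records = [
    {"age": 25, "income": 30.0},
    {"age": 40, "income": 50.0},
    {"age": 42, "income": None},
    {"age": 60, "income": 100.0},
]
print(mean_impute([r["income"] for r in records]))
for row in nearest_neighbor_impute(records, target="income", aux="age"):
    print(row)
```

Here the mean of the observed incomes and the income of the nearest donor by age differ, which shows how the choice of method changes the imputed value.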
Regression Imputation: This imputation technique predicts the
missing value using a regression of the item of interest on variables
observed for all cases. A variation is stochastic regression imputa-
tion, in which the missing value is replaced by the predicted value
from the regression model plus a randomly generated error term.
Cold-Deck Imputation: In this technique, the imputed values
are chosen from a previous survey or other information, such as
from historical data. This approach is not guaranteed to eliminate
selection bias.
Multiple Imputation: In this approach, each missing value is imputed m (≥ 2) different times. Typically, the same stochastic model is used for each imputation. These create m different "data" sets with no missing values. Each of the m data sets is analyzed as if no imputation had been done; the different results give the analyst a measure of the additional variance due to imputation.
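Regression imputation, its stochastic variant, and the repetition used in multiple imputation can be sketched together. This is a simplified illustration, not a full multiple-imputation procedure: the data are hypothetical, the model is a simple linear regression on one auxiliary variable, normal errors are an assumption, and the regression coefficients are not redrawn between imputations.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept, slope and residual standard deviation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / (n - 2)) ** 0.5

def regression_impute(xs, ys, rng=None):
    """Fill missing y (None) with the fitted value; add noise if rng is given."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, s = fit_line([x for x, _ in observed], [y for _, y in observed])
    completed = []
    for x, y in zip(xs, ys):
        if y is None:
            noise = rng.gauss(0.0, s) if rng is not None else 0.0
            y = a + b * x + noise
        completed.append(y)
    return completed

def multiple_imputation_means(xs, ys, m=5, seed=0):
    """Mean of y in each of m stochastically imputed data sets."""
    rng = random.Random(seed)
    return [sum(c) / len(c) for c in (regression_impute(xs, ys, rng) for _ in range(m))]

# Hypothetical data with two missing y values.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, None, 8.2, 9.8, None]
means = multiple_imputation_means(xs, ys, m=5)
grand = sum(means) / len(means)
between = sum((v - grand) ** 2 for v in means) / (len(means) - 1)
print("means of the m completed data sets:", [round(v, 3) for v in means])
print("between-imputation variance of the mean:", round(between, 5))
```

The spread of the m estimates is what gives the analyst a measure of the additional variance due to imputation.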