In most surveys, missing data threaten the validity of the results of analysis. Missing data can be thrown away, ignored, or substituted through some procedure. When data are thrown away or ignored in generating estimates, nonresponse bias becomes a problem. This section examines the nonresponse bias that arises when the data are processed with the nonresponse observations excluded (Kalton, 1983).
For simplicity, suppose that a simple random sample (SRS) of size n is drawn from a population of size N and that the survey variable y contains missing data. The types and patterns of nonresponse, which are discussed later, are unimportant for this section. The population is further assumed to be divided into two groups, respondents and nonrespondents. In reality, this division is an oversimplification: for some units at least, chance plays a part in whether they respond or not. The simplified model is nonetheless appealing because its tractability leads to some informative results (Kalton, 1983).
Let R be the number of respondents and M be the number of nonrespondents (M for missing) in the population, with N = R + M; the corresponding sample quantities are r and m, with r + m = n. Let $\bar{R} = R/N$ and $\bar{M} = M/N$ be the proportions of respondents and nonrespondents in the population.
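The consequence of the two-group model can be checked numerically. Because the population mean is a weighted average of the respondent and nonrespondent means, the respondent mean differs from the population mean by M/N times the difference between the two group means. The sketch below verifies this identity on a synthetic population; the group sizes, means, and the function name `respondent_mean_bias` are assumptions made for illustration and do not come from the study.

```python
import random
from statistics import mean

def respondent_mean_bias(respondent_values, nonrespondent_values):
    """Return (direct bias, bias computed as (M/N) * (Y_r - Y_m))."""
    R = len(respondent_values)
    M = len(nonrespondent_values)
    N = R + M
    M_bar = M / N                                   # proportion of nonrespondents
    Y_r = mean(respondent_values)                   # respondent mean
    Y_m = mean(nonrespondent_values)                # nonrespondent mean
    Y = mean(list(respondent_values) + list(nonrespondent_values))
    return Y_r - Y, M_bar * (Y_r - Y_m)

# Hypothetical population: 700 respondents and 300 nonrespondents
# whose y-values are centred on different means.
rng = random.Random(0)
respondents = [rng.gauss(50, 10) for _ in range(700)]
nonrespondents = [rng.gauss(40, 10) for _ in range(300)]

direct, identity = respondent_mean_bias(respondents, nonrespondents)
print(f"direct bias: {direct:.4f}, identity: {identity:.4f}")
```

The two numbers agree up to floating-point error, which shows that the bias vanishes only when there are no nonrespondents or when the two groups have the same mean.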
Many procedures can be applied to reduce the nonresponse bias caused by missing data; one of these is imputation. In this study, imputation procedures are applied to fill in the nonresponse observations and to reduce the bias of the estimates. Briefly, imputation is the substitution of values for the nonresponse observations. Imputation procedures are discussed later.
In surveys, nonresponse observations follow a definite pattern. In this study, the terms missing data and nonresponse are used interchangeably. Nonresponse can follow one of three patterns. The missing data for a variable Y are “Missing Completely at Random” (MCAR) if the probability of having a missing value for Y is unrelated to the value of Y itself or to any other variable in the data set. Data that are MCAR reflect the highest degree of randomness and show no underlying reasons for missing observations that could bias research findings. With MCAR, the occurrence of missing data is unrelated to the other variables in the data set or to other systematic factors; missing data are randomly distributed across all cases.
The second pattern is Missing at Random (MAR). The missing data for a variable Y are considered MAR if the probability of missing data on Y is unrelated to the value of Y after controlling for other variables in the analysis. MAR data show some randomness in the pattern of data omission. The likelihood of a case having incomplete information on a variable can be explained by other variables in the data set, but the presence or absence of missing values on a variable is not related to the participants’ true status on that variable.
MCAR and MAR differ in how the missingness in Y relates to the other variables in the data set. Under MCAR, nonresponse is completely independent of the other variables: the missing values in Y are related neither to the observed values of Y nor to the other variables. Under MAR, the missingness in Y is related to the other variables, and these variables can explain the incomplete information on Y.
The last pattern, and arguably the worst of the three, arises when the probability of missing data on Y is related to the value of Y even after other variables are controlled for in the analysis. Such a case is termed NonIgnorable Nonresponse (NIN). NIN missing data have systematic, nonrandom factors underlying the occurrence of the missing values that are not apparent or otherwise measured. NIN missing data are the most problematic because they limit the generalizability of research findings and may bias parameter estimates such as means, standard deviations, correlation coefficients, or regression coefficients.
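The three patterns can be contrasted with a small simulation. The sketch below is illustrative only: the population, the binary covariate x, the missingness probabilities, and the class-weighted estimator (respondent means within each x-class, weighted by the class share among all units) are assumptions introduced here and are not the procedures used in this study. It reports how far the plain respondent mean and the class-weighted mean fall from the mean of the complete data.

```python
import random
from statistics import mean

def simulate_patterns(n=20000, seed=1):
    """Return {pattern: (respondent-mean error, class-adjusted-mean error)}."""
    rng = random.Random(seed)
    # Hypothetical population: x is a fully observed binary covariate and
    # y is the survey variable whose mean depends on x.
    units = []
    for _ in range(n):
        x = rng.random() < 0.5
        y = (60.0 if x else 40.0) + rng.gauss(0, 10)
        units.append((x, y))
    full_mean = mean(y for _, y in units)

    # Probability that y is missing under each pattern (illustrative values).
    miss_prob = {
        "MCAR": lambda x, y: 0.3,                    # same for every unit
        "MAR": lambda x, y: 0.5 if x else 0.1,       # depends only on x
        "NIN": lambda x, y: 0.5 if y > 50 else 0.1,  # depends on y itself
    }

    results = {}
    for name, prob in miss_prob.items():
        # A unit is observed when its random draw is at least its missing probability.
        observed = [(x, y) for x, y in units if rng.random() >= prob(x, y)]
        resp_mean = mean(y for _, y in observed)
        # Weight each x-class respondent mean by the class share among all units.
        adjusted = 0.0
        for cls in (False, True):
            share = sum(1 for x, _ in units if x == cls) / n
            adjusted += share * mean(y for x, y in observed if x == cls)
        results[name] = (resp_mean - full_mean, adjusted - full_mean)
    return results

for name, (raw_err, adj_err) in simulate_patterns().items():
    print(f"{name}: respondent mean error {raw_err:+.2f}, "
          f"class-adjusted error {adj_err:+.2f}")
```

Under these assumptions, the respondent mean is close to the complete-data mean for MCAR, is off for MAR until the covariate is taken into account, and stays off for NIN even after the adjustment, because the missingness depends on Y within each class.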
These patterns are an important assumption in imputation. For an imputation procedure to work and achieve statistically acceptable estimates, the pattern of nonresponse must satisfy either the MCAR or the MAR assumption. For this study, the researchers created nonresponse that follows the MCAR assumption.
Another important consideration in imputation is the type of nonresponse. While the patterns of nonresponse concern the relationship of the nonresponse variable to other variables, the types of nonresponse concern the way in which observations come to be missing. Kalton (1983) stressed the importance of differentiating the types of nonresponse: noncoverage, total (unit) nonresponse, item nonresponse, and partial nonresponse.
Noncoverage (NC) denotes the failure to include some units of the survey population in the sampling frame. As a consequence, units excluded from the frame have no chance of appearing in the sample. NC is not usually regarded as a type of nonresponse; however, Kalton (1983) loosely classifies it as one for convenience. NC occurs in surveys where the sampling frame fails to cover some units or the listing of units is incomplete.
Unit (or total) nonresponse (UN) takes place when no information is collected from a sampling unit. Its causes include failure to contact the respondent (not at home, moved, or unit not found), refusal to provide information, inability of the unit to cooperate (for example, because of illness or a language barrier), and lost questionnaires.
Item nonresponse (IN) emerges when the information collected from a unit is incomplete because some of the questions are left unanswered. Its causes include the following: the informant lacks the information needed to answer the question; the informant fails to make the effort required to establish the information by retrieving it from memory or by consulting records; the informant refuses to answer because the question is sensitive or embarrassing, or because he considers it irrelevant to his perception of the survey’s objectives; the interviewer fails to record an answer (for example, by skipping a question); or the response is subsequently rejected at an edit check on the grounds that it is inconsistent with other responses (which may include an inconsistency arising from a coding or punching error in the transfer of the response to the computer data file).
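The distinction between unit and item nonresponse can be made concrete at the level of a single survey record. The following is a minimal sketch that assumes each record is a dictionary in which a missing answer is stored as `None`; the function name `classify_nonresponse` and the item names are hypothetical. Noncoverage is not handled, since uncovered units never appear in the sample, and partial nonresponse is not handled either.

```python
def classify_nonresponse(record, items):
    """Classify one survey record by how many of the listed items were answered."""
    answered = [item for item in items if record.get(item) is not None]
    if not answered:
        return "unit nonresponse"      # no information collected at all
    if len(answered) < len(items):
        return "item nonresponse"      # some, but not all, items answered
    return "complete response"

# Hypothetical questionnaire items and records.
items = ["age", "income", "household_size"]
records = [
    {"age": 34, "income": 25000, "household_size": 4},
    {"age": 51, "income": None, "household_size": 2},
    {"age": None, "income": None, "household_size": None},
]
for record in records:
    print(classify_nonresponse(record, items))
```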