
S-CHAPTER14:

DATA PREPARATION

ARM12 1
THE DATA‑PREPARATION PROCESS (p.418)

The data preparation process is guided by the preliminary plan
of data analysis, formulated in the research design phase.
- It involves 7 steps (Fig 14.1, p.419):
(1) Questionnaire (Qnre) Checking
(2) Editing
(3) Coding
(4) Transcribing
(5) Data Cleaning
(6) Adjusting the Data Statistically
(7) Selecting a Data Analysis Strategy.

1. QUESTIONNAIRE CHECKING

A qnre may be unacceptable for several reasons:
(a) Part(s) of the qnre may be incomplete.
(b) The respondent didn't understand or follow the instructions.
(c) The responses show little variance.
Ex: A resp has checked only 4s on a series of 7-pt Likert scales.
(d) The returned qnre is physically incomplete.
Ex: Some pages are missing.
(e) The qnre is received after the stipulated date.
(f) Unqualified persons answered the qnre.
- If quota requirements are not met, conduct additional interviews
to meet the quota.
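Checks (a) and (c) above can be automated once the answers are in machine-readable form. A minimal sketch, assuming hypothetical respondent IDs, 7-pt ratings, `None` for an unanswered item, and an arbitrary variance threshold (none of these come from the text):

```python
from statistics import pvariance

# Hypothetical returned questionnaires; None marks an unanswered item.
questionnaires = {
    "R1": [4, 4, 4, 4, 4, 4],      # only 4s checked
    "R2": [2, 5, None, 7, 1, 3],   # one item left blank
    "R3": [1, 6, 3, 7, 2, 5],      # no problem detected
}

def check_questionnaire(answers, min_variance=0.1):
    """Return the reasons a questionnaire may be unacceptable."""
    problems = []
    answered = [a for a in answers if a is not None]
    if len(answered) < len(answers):
        problems.append("incomplete")
    # Variance close to zero means the same value was checked throughout.
    if len(answered) > 1 and pvariance(answered) < min_variance:
        problems.append("little variance")
    return problems

flags = {rid: check_questionnaire(ans) for rid, ans in questionnaires.items()}
print(flags)
```

A flagged qnre still needs a human decision; the sketch only marks candidates for review.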
2. EDITING

Editing is the review of the qnres to identify answers that are
illegible, incomplete, inconsistent, or ambiguous.
(a) Responses may be illegible if they have been poorly recorded.
(b) Qnres may be incomplete; some qns may remain unanswered.
(c) The data may be inconsistent.
Ex: A respondent reports low income but indicates
shopping in very expensive stores.
(d) Responses may be ambiguous.
Ex: A respondent circles both 2 & 3 on a 5-pt scale.
Treatment of Unsatisfactory Responses

Unsatisfactory responses are handled by:
(a) returning qnres to the field for collecting data again,
(b) assigning missing values, or
(c) discarding unsatisfactory responses.

3. CODING

Coding assigns a code (usually a number) to each alternative answer of each qn.
Ex: Sex may be coded as 1 for females & 2 for males.
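The coding example can be sketched as a small codebook lookup. The codes 1 and 2 follow the example; the code 9 for a missing or unrecognized answer and the function name are assumptions made for illustration:

```python
# Codebook from the example: 1 = female, 2 = male.
SEX_CODES = {"female": 1, "male": 2}
MISSING_CODE = 9  # assumed convention, not taken from the text

def code_answer(raw, codebook, missing=MISSING_CODE):
    """Map a raw answer to its numeric code, or to the missing code."""
    if raw is None:
        return missing
    return codebook.get(raw.strip().lower(), missing)

raw_answers = ["Female", "male", None, " FEMALE "]
coded = [code_answer(r, SEX_CODES) for r in raw_answers]
print(coded)  # [1, 2, 9, 1]
```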
4. TRANSCRIBING
Transcribing transfers the coded data from the qnres or coding sheets onto
(i) disks, (ii) magnetic tapes, or (iii) directly into computer memory.
5. DATA CLEANING (p.415)

Data cleaning involves consistency checks & treatment of missing responses.
1. Consistency Checks identify data that
(a) Are out of range: On a 5-pt scale (1 to 5), values of 0 or 6 are out of range.
(b) Are logically inconsistent: A respondent indicates that s/he pays bills
with a credit card although s/he doesn't have one.
(c) Have extreme values: A respondent circles only 1s on a 7-pt scale
on all attributes of a brand.
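Checks (a) and (b) can be written as simple rules over each record. A minimal sketch, assuming a hypothetical record layout with a 5-pt `rating` and two yes/no fields about credit cards (field names are illustrative only):

```python
# Hypothetical records; field names are assumptions for illustration.
records = [
    {"id": 1, "rating": 3, "has_card": True,  "pays_by_card": True},
    {"id": 2, "rating": 6, "has_card": True,  "pays_by_card": False},
    {"id": 3, "rating": 0, "has_card": False, "pays_by_card": False},
    {"id": 4, "rating": 5, "has_card": False, "pays_by_card": True},
]

def consistency_problems(rec, low=1, high=5):
    """Return the consistency checks that a record fails."""
    problems = []
    # (a) Out of range: the rating must lie on the 1..5 scale.
    if not low <= rec["rating"] <= high:
        problems.append("out of range")
    # (b) Logically inconsistent: pays by card without owning a card.
    if rec["pays_by_card"] and not rec["has_card"]:
        problems.append("logically inconsistent")
    return problems

# Keep only the records that fail at least one check.
flagged = {}
for rec in records:
    found = consistency_problems(rec)
    if found:
        flagged[rec["id"]] = found
print(flagged)
```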
2. Treatment of Missing Responses
The options for treatment of missing responses:
(a) Substitute a neutral value, typically the mean response to the variable.
(b) Substitute an imputed response: use the resp's answers to other qns
to estimate the missing answer.
Ex: Product usage may be related to family size among resps who provided
data on both; if product usage is missing but family size is given,
usage can be estimated from family size.
(c) Casewise deletion: cases (resps) with any missing responses are discarded.
→ smaller sample size, and bias if resps & non-resps differ in terms of the missing variable.
(d) Pairwise deletion: instead of discarding all cases with missing value(s),
each calculation uses only the cases with complete responses on the variables involved.
- There will be different sample sizes for different variables.
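Options (a), (c) and (d) can be compared on a small data set. A minimal sketch, assuming hypothetical variables and values with `None` marking a missing response; imputation (b) is left out because it needs a fitted relationship between variables:

```python
from statistics import mean

# Hypothetical data; None marks a missing response.
data = [
    {"usage": 4,    "family_size": 3,    "income": 50},
    {"usage": None, "family_size": 5,    "income": 70},
    {"usage": 2,    "family_size": None, "income": 30},
    {"usage": 6,    "family_size": 4,    "income": 90},
]
variables = ["usage", "family_size", "income"]

def observed(var):
    """All non-missing values of one variable."""
    return [row[var] for row in data if row[var] is not None]

# (a) Neutral value: replace a missing response with the variable's mean.
means = {v: mean(observed(v)) for v in variables}
mean_filled = [
    {v: (row[v] if row[v] is not None else means[v]) for v in variables}
    for row in data
]

# (c) Casewise deletion: keep only cases with no missing response at all.
complete_cases = [
    row for row in data if all(row[v] is not None for v in variables)
]

# (d) Pairwise deletion: each calculation uses the cases that are
# complete on the variables it involves, so n varies by pair.
def pairwise_n(a, b):
    return sum(1 for row in data
               if row[a] is not None and row[b] is not None)

print(len(complete_cases))                  # 2
print(pairwise_n("usage", "income"))        # 3
print(pairwise_n("usage", "family_size"))   # 2
```

Casewise deletion keeps 2 of the 4 cases here, while pairwise deletion keeps 3 cases for the usage/income pair.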
6. STATISTICALLY ADJUSTING THE DATA (p.428)

Statistical adjustment can enhance the quality of data analysis. There are 3 procedures:


1. Weighting
Each case (respondent) is given a weight to reflect its importance relative to
other cases, most often to make the sample more representative of a population.
Ex: To determine how an existing product can be modified.
- Put greater weight on the opinions of Heavy users of the product.
- So, assign: 4 to Heavy, 3 to Medium, 2 to Light and 1 to Non-users.
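The effect of these weights can be seen by comparing a weighted and an unweighted mean. A minimal sketch using the weights from the example; the ratings and group labels are hypothetical:

```python
# Weights from the example: Heavy 4, Medium 3, Light 2, Non-users 1.
WEIGHTS = {"heavy": 4, "medium": 3, "light": 2, "non": 1}

# Hypothetical (usage group, rating) pairs, one per respondent.
responses = [("heavy", 6), ("medium", 4), ("light", 3), ("non", 1)]

def weighted_mean(pairs, weights):
    """Mean rating where each case counts in proportion to its weight."""
    total_weight = sum(weights[group] for group, _ in pairs)
    weighted_sum = sum(weights[group] * rating for group, rating in pairs)
    return weighted_sum / total_weight

unweighted = sum(rating for _, rating in responses) / len(responses)
weighted = weighted_mean(responses, WEIGHTS)
print(unweighted)  # 3.5
print(weighted)    # 4.3
```

The weighted mean moves toward the Heavy users' rating because their opinions count four times as much as those of Non-users.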
2. Variable Re-specification
May involve transformation of data to
(a) modify existing variables.
Ex: Original variable was product usage with 10 response categories.
- These may be collapsed into 3 categories: Heavy, Medium, Light.
(b) create new variables that are composites of several other variables.
Ex: Standard of living, corporate image, etc.
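Both kinds of re-specification can be sketched in a few lines. The cut points (1-3, 4-7, 8-10) and the use of a simple average for the composite are assumptions for illustration; the text does not specify them:

```python
def usage_category(code):
    """Collapse a 10-category usage code (1..10) into 3 categories.

    The cut points are assumed, not taken from the text.
    """
    if not 1 <= code <= 10:
        raise ValueError(f"usage code out of range: {code}")
    if code <= 3:
        return "Light"
    if code <= 7:
        return "Medium"
    return "Heavy"

def composite_score(items):
    """New variable built as the average of several item ratings."""
    return sum(items) / len(items)

print([usage_category(c) for c in (1, 5, 10)])  # ['Light', 'Medium', 'Heavy']
print(composite_score([5, 6, 7]))               # 6.0
```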
3. Scale Transformation
Manipulation of scale values to ensure comparability with those of other scales.
Ex: Different scales may be employed for measuring different variables.
Image variables on a 7-pt Semantic differential scale,
Lifestyle variables on a 5-pt Likert scale, and
Attitude variables on a 100-pt Continuous rating scale.
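One common way to make such scales comparable is standardization: subtract the variable's mean and divide by its standard deviation, so every variable ends up with mean 0 and standard deviation 1. A minimal sketch; the choice of standardization and the sample values are assumptions, since the text does not name a specific transformation:

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw scores to z-scores: (value - mean) / sample std dev."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical scores from three respondents on two different scales.
image_7pt = [2, 4, 6]          # 7-pt semantic differential
attitude_100pt = [20, 50, 80]  # 100-pt continuous rating

z_image = standardize(image_7pt)
z_attitude = standardize(attitude_100pt)
print(z_image)     # [-1.0, 0.0, 1.0]
print(z_attitude)  # [-1.0, 0.0, 1.0]
```

After standardization the two variables are on the same footing even though the raw scales differ.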
7. SELECTING A DATA ANALYSIS STRATEGY (p.415)

Selecting a strategy takes several factors into account:


1. Consideration of the earlier steps of the MR process.
- Changes may be necessary due to additional information
generated in subsequent stages of the research process.
2. Known characteristics of the data.
- The research design may favor certain techniques.
Ex: ANOVA is suitable for analyzing experimental data.
3. Properties of the statistical techniques.
- Some techniques are appropriate for examining differences
in variables.
- Others for assessing the relationships between variables.
- Still others for making predictions.
4. Researcher's background and philosophy.
- Qualitative, quantitative or both.