
UNIT 7

Data Processing

1
Scales of Measurement
• Measurement involves the systematic application of
rules for assigning numbers to objects to represent the
quantities of a person’s attributes or traits (Urbina, 2004).
• A scale of measurement is simply a means by which
individuals can be distinguished from one another on a
variable of interest, whether that variable is a predictor
or a criterion.
• Four types of scales or levels of measurement exist:
– Nominal scale
– Ordinal scale
– Interval scale, and
– Ratio scale.

2
Scales of Measurement

Levels of measurement, from highest to lowest:
– Ratio
– Interval
– Ordinal
– Nominal
• Higher levels of measurement have all the properties of
lower levels of measurement. More powerful statistical
analysis can be performed with such data.
3
Nominal scale
• A nominal scale is one composed of two or
more mutually exclusive categories.
• Examples
a) Applicant sex: male or female
b) Applicant race: black, white, or other
c) Job title: sales manager, sales clerk, sales
representative, sales person, or other
d) Classification of trainee success: successful or
unsuccessful.

4
Nominal Scale
Basic comparisons: Identity
Examples: male-female, user-nonuser, occupations, uniform numbers
Measures of average: mode

Which of the following soft drinks do you like? Check all that apply.
☐ Coca-Cola    ☐ Mountain Dew    ☐ Seven Up
☐ Dr. Pepper   ☐ Pepsi           ☐ Sprite


5
Ordinal scale
• An ordinal scale is one that ranks objects, such as
individuals, from “high” to “low” on some variable of
interest.
• Ordinal scales are sometimes used in selection
research.
• In developing criteria measurements, for instance,
supervisors may be asked to rank their subordinates
with respect to some characteristic, like performance.
– See example next.

6
Ordinal Scale
Basic comparisons: Order
Examples: brand preference, social class, hardness of minerals, quality of lumber
Measures of average: median, mode

Rank the following soft drinks from 1 (least liked) to 6 (most liked):
___Coca-Cola    ___Mountain Dew    ___Seven Up
___Dr. Pepper   ___Pepsi           ___Sprite
7
Example of an ordinal scale: ranking of employees
• Below are listed the names of your ten subordinates. Read
over the list and then rank the individuals on their quality of
work completed in their jobs.
• By “quality of work completed,” we mean the minimum
amount of rework necessary to correct employee mistakes.
• You should give the subordinate you believe is
– the highest in quality of work performed a rank of “1,”
– the employee next highest in quality of work a “2,”
– the next a “3,” and so on until you give a “10” to the
employee who is the lowest in quality of work completed.
8
Employee rankings
Employee             Rank on quality of work completed
Abay Belete          4
Abebe Kebede         2
Belew Bisrat         1
Chala Emana          6
Chaltu Biru          7
Gidey G/egzer        3
Nardos Abebe         9
Nebiyu Abegaz        5
Tesfaye Bitew        8
Zebenay Eskinder     10
Note: 1 = highest quality; 10 = lowest quality.
9
Ordinal scale (continued)
• The ordinal scale provides us with more information than does a
nominal scale.
• Individuals are not only assigned a number representative of a
category, as on a nominal scale, but differences between the numbers
assigned yield additional information.
• Numerical differences indicate the relative position of individuals
for the variable on which they are ranked.
– For example, Belew produces better quality of work than Abebe.
• However, an ordinal scale does not provide information on the
magnitude of differences among the ranks.
• Thus, we don’t know whether a one-point difference between two people
is equal in size to a one-point difference between two other people.
• We can draw only “greater than” or “less than” conclusions with
ordinal data; we do not know the amount of difference that
separates individuals or objects being ranked.
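The "greater than / less than" reading of ordinal data can be sketched in a few lines of Python. This is only an illustration built on the ranks from the employee ranking slide; the helper name `ranks_higher` is an assumption made for the example.

```python
# Ranks from the employee ranking slide (1 = highest quality, 10 = lowest).
quality_rank = {
    "Abay Belete": 4,
    "Abebe Kebede": 2,
    "Belew Bisrat": 1,
    "Chala Emana": 6,
    "Chaltu Biru": 7,
    "Gidey G/egzer": 3,
    "Nardos Abebe": 9,
    "Nebiyu Abegaz": 5,
    "Tesfaye Bitew": 8,
    "Zebenay Eskinder": 10,
}


def ranks_higher(a, b):
    """Return True if employee a was ranked above employee b (smaller rank number)."""
    return quality_rank[a] < quality_rank[b]


# Ordering the employees is a valid operation on ordinal data.
best_to_worst = sorted(quality_rank, key=quality_rank.get)

# The gap between two ranks is NOT a measure of how much better one employee is:
# rank 1 vs. rank 2 may hide a tiny or a very large difference in quality.
print(ranks_higher("Belew Bisrat", "Abebe Kebede"))
print(best_to_worst)
```

Only the order of the ranks is used here; no arithmetic is performed on the rank numbers themselves.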
10
Interval Scale
• With an interval scale, differences between numbers take on
meaning.
• In addition to rank-order information, the scale uses constant
units of measurement, affording meaningful expression of
differences with respect to a characteristic.
• An interval scale has an arbitrary zero point, not an absolute one.
• Although an object being measured may be given a score of
zero, the score of zero is set by convention.
– E.g., IQ score, room temperature.
• Rating scales are frequently used as criterion measures
in selection studies.
– For example, many job performance measures
consist of performance appraisal ratings.
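Room temperature gives a quick way to see what an arbitrary zero point implies. The sketch below uses hypothetical readings and the standard Celsius-to-Fahrenheit conversion. Ratios of raw scores change with the unit, while ratios of differences do not.

```python
import math


def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32


# Hypothetical room temperatures in degrees Celsius.
morning_c, noon_c, evening_c = 10.0, 20.0, 30.0
morning_f, noon_f, evening_f = (
    celsius_to_fahrenheit(t) for t in (morning_c, noon_c, evening_c)
)

# Ratios of raw scores depend on where zero was placed by convention,
# so "twice as warm" is not a meaningful statement on an interval scale.
ratio_c = noon_c / morning_c  # 20 / 10
ratio_f = noon_f / morning_f  # 68 / 50

# Ratios of differences are the same in both units: intervals are comparable.
diff_ratio_c = (evening_c - noon_c) / (noon_c - morning_c)
diff_ratio_f = (evening_f - noon_f) / (noon_f - morning_f)

print(ratio_c, ratio_f, diff_ratio_c, diff_ratio_f)
```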
11
Interval Scale
Basic comparisons: Comparison of intervals
Examples: temperature, grade point avg., brand attitude
Measures of average: mean, median, mode

What is your overall opinion about each of these brands?
              unfavorable              favorable
Coca-Cola     1   2   3   4   5   6   7
Dr. Pepper    1   2   3   4   5   6   7
Pepsi         1   2   3   4   5   6   7
Sprite        1   2   3   4   5   6   7

12
Example: interval scale used in rating employee performance
1. Accuracy of work: the extent to which the employee
correctly completes job assignments
   1. Almost always makes errors, has very low accuracy
   2. Quite often makes errors
   3. Makes errors but equals job standards
   4. Makes few errors, has high accuracy
   5. Almost never makes errors, has very high accuracy
Comments: ________________________________

13
Example (continued)
2. Quantity of work: the extent to which the employee
produces a volume of work consistent with
established standards for the job
   1. Almost never meets standards
   2. Quite often does not meet standards
   3. Volume of work is satisfactory, equals job standards
   4. Quite often produces more than required
   5. Almost always exceeds standards, exceptionally productive
Comments: ________________________________
14
Example, interval scale
• Employee rating of job satisfaction:
1. Very dissatisfied
2. Dissatisfied
3. Satisfied
4. Very satisfied

15
Interval scale (continued)
• When the ratings shown above are treated as an interval
scale, the magnitude of the difference between rating
points is assumed to be the same.
• Thus, raters are expected to view the difference between
points 1 & 2 on the scale in the same way as they view
the difference between points 4 & 5.
• Most of the predictors (and criteria) we use in
selection can be measured with an interval scale, since
the underlying psychological construct (e.g. mental
ability) typically is normally distributed.
16
Ratio Scale
• As on the interval scale, differences between numerical values
on a ratio scale have meaning.
• In contrast, though, a ratio scale has an absolute zero
point.
• The presence of an absolute zero point permits us to
make statements about the ratio of one individual’s
scores to another based on the amount of the
characteristic being measured.
– Thus, if one worker produces 100 wire baskets in an
hour while another produces 50, we can state that the
second worker produces only half as much as the first.
• In general, numerical operations such as addition,
subtraction, multiplication and division can be applied to
ratio-scaled data.
17
Ratio Scale
Basic comparisons: Comparison of absolute magnitudes
Examples: units sold, # of purchases, age, income
Measures of average: mean*, median, mode

Divide 100 points among these soft drinks according to your
likelihood of purchasing each within the next week:
___Coca-Cola    ___Mountain Dew    ___Seven Up
___Dr. Pepper   ___Pepsi           ___Sprite
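The "measures of average" listed on the four scale summary slides can be collected into a small lookup. The sketch below is illustrative only: the mapping is taken from the slides, while the function name `averages_for` and the sample data are assumptions made for the example.

```python
import statistics

# Measures of average listed for each scale on the summary slides.
PERMISSIBLE_AVERAGES = {
    "nominal": ("mode",),
    "ordinal": ("median", "mode"),
    "interval": ("mean", "median", "mode"),
    "ratio": ("mean", "median", "mode"),
}

_FUNCTIONS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "mode": statistics.mode,
}


def averages_for(scale, values):
    """Compute only the averages that are meaningful for the given scale."""
    return {name: _FUNCTIONS[name](values) for name in PERMISSIBLE_AVERAGES[scale]}


# Hypothetical survey answers.
favourite_drink = ["Pepsi", "Sprite", "Pepsi", "Coca-Cola"]  # nominal
brand_attitude = [5, 6, 6, 7, 3]                             # 1-7 rating, treated as interval

print(averages_for("nominal", favourite_drink))
print(averages_for("interval", brand_attitude))
```

Asking for a mean of the nominal answers is simply not offered by the lookup, which mirrors the rule that higher scales allow more powerful analysis.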
18
Data processing

• Data processing (DP) is any process that
converts data into information or knowledge

19
Data Processing
In the context of data processing,
• data are defined as numbers or characters that represent
measurements from observable phenomena.
• Measured information is then logically deduced and/or
statistically calculated from multiple data.
• Information is defined as either a meaningful answer to a
query

20
Elements of Data Processing

• Data Entry
• Data Cleaning
• Data Coding
• Data Translation
• Data Summarization
• Data Aggregation
• Data Validation
• Data Tabulation
• Statistical Analysis
• Data Warehousing
• Data Mining

21
Editing of data

• Process of examining the raw data to detect
errors and omissions and to correct them
• It ensures
– Completeness
– Consistency
– Accuracy
– Homogeneity
• Types of editing: field editing and central editing


22
Field Editing

• Review of the reporting forms by the
investigator for completing/translating
individual responses
• This makes things easier for the tabulator
• The investigator should not correct errors of
omission by guessing

23
Central Editing

• Carried out when all the forms or schedules
have been completed
• Typical problems looked for at this stage:
– Entry at a wrong place
– Entries with wrong units
– Inappropriate & missing replies.
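A minimal sketch of how such editing checks might be automated is shown below. The field names, the plausible age range and the function `edit_record` are all hypothetical. The function only flags problems and never fills in a missing reply, in line with the rule against correcting omissions by guessing.

```python
# Hypothetical fields expected on every completed form.
REQUIRED_FIELDS = ("respondent_id", "age", "monthly_salary")


def edit_record(record):
    """Return a list of problems found in one completed form (empty if none)."""
    problems = []

    # Completeness: every required field must carry a reply.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing reply: {field}")

    # Accuracy: an implausible age suggests a wrong unit (e.g. months instead
    # of years) or an entry made in the wrong place. The range is an assumption.
    age = record.get("age")
    if isinstance(age, (int, float)) and not 15 <= age <= 100:
        problems.append("age outside plausible range")

    return problems


clean_form = {"respondent_id": 1, "age": 34, "monthly_salary": 1200}
faulty_form = {"respondent_id": 2, "age": 408, "monthly_salary": ""}

print(edit_record(clean_form))
print(edit_record(faulty_form))
```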

24
Coding

• Coding is a process of assigning symbols (alphabetic or
numeric) to the answers.
• This helps in recording responses in a limited number of classes or
categories
• Classes should be appropriate to the research problem
• Classes are exhaustive & mutually exclusive so the answer
can be placed in one & only one cell in a given category.
• Every class must be defined in terms of only one concept.
• Good coding ensures efficient analysis
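As an illustration, the four job satisfaction answers used in the earlier rating example can be coded with a small codebook. The extra "no reply" code is an assumption added here so that the classes stay exhaustive; the function name `code_answer` is likewise hypothetical.

```python
# Codebook for the job satisfaction item (codes as used on the rating slide).
SATISFACTION_CODES = {
    "very dissatisfied": 1,
    "dissatisfied": 2,
    "satisfied": 3,
    "very satisfied": 4,
}

# Hypothetical catch-all code so that every answer falls into exactly one class.
NO_REPLY = 9


def code_answer(answer):
    """Map a raw answer to its numeric code; unknown or empty answers get NO_REPLY."""
    key = (answer or "").strip().lower()
    return SATISFACTION_CODES.get(key, NO_REPLY)


raw_answers = ["Very satisfied", " dissatisfied ", "", None, "Satisfied"]
coded = [code_answer(a) for a in raw_answers]
print(coded)
```

Because each answer maps to one and only one code, the classes are mutually exclusive, and the catch-all code keeps them exhaustive.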

25
Classification
• Reduction of data into homogeneous groups on the
basis of some characteristics
• It helps in making comparisons and drawing
meaningful conclusions
• It can be done on the basis of attributes or on
the basis of numerical characteristics

26
Classification

According to attributes: sex, caste, education, land holding

According to numerical characteristics: height, weight, marks, income, etc.
27


Classification according to dichotomy
• Each class is divided into two subclasses
and only one attribute is studied
• Examples:
– Resident / Non-resident
– Employed / Unemployed
– Married / Unmarried

28
Manifold classification
• Each class is divided into a number of subclasses and
more than one attribute is studied
• Example: Industries
– Private
   – Large: profit making, loss making
   – Small: profit making, loss making
– Public
   – Large: profit making, loss making
   – Small: profit making, loss making
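A manifold classification like the industries tree can be tallied by treating each combination of attributes as one class. The sketch below uses a handful of hypothetical firms; only the three attributes (ownership, size, result) come from the slide.

```python
from collections import Counter

# Hypothetical firms, each described by (ownership, size, result).
firms = [
    ("Private", "Large", "Profit making"),
    ("Private", "Small", "Loss making"),
    ("Public", "Large", "Profit making"),
    ("Private", "Large", "Profit making"),
    ("Public", "Small", "Loss making"),
]

# Manifold classification: one class per combination of all three attributes.
class_counts = Counter(firms)

# Dichotomous classification on a single attribute, for comparison.
by_ownership = Counter(ownership for ownership, _, _ in firms)

for combination, frequency in sorted(class_counts.items()):
    print(" / ".join(combination), frequency)
print(dict(by_ownership))
```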

29
Classification
• When individual observations possess numerical
characteristics, such as height, weight, salary, marks, etc. they
are classified on the basis of intervals.
• The number of items in each class is called the frequency of
the class.
• Every class has two limits: an upper limit & a lower limit.
• The difference between these two limits is called the magnitude of
the class or class interval.

30
Example
• The following data refer to the monthly salary of 40 employees of an
organization. Tabulate the data using the exclusive method:

1060, 1310, 1255, 750, 1690, 945, 1200, 1060,
2125, 2120, 1190, 1120, 2130, 2240, 2190, 1370, 1440, 2560,
870, 2000, 1870, 1700, 1800, 2375, 1940, 2250, 1460, 1750,
1875, 1165, 2255, 1470, 2060, 2135, 2125, 1760, 1650, 1945,
2000, 2250
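One way to work this example is sketched below. Under the exclusive method the upper limit of a class is not counted in that class, so a salary of exactly 2000 falls into 2000-2500 rather than 1500-2000. The starting point (500) and class width (500) are assumptions chosen for illustration; the slide does not prescribe them.

```python
salaries = [
    1060, 1310, 1255, 750, 1690, 945, 1200, 1060,
    2125, 2120, 1190, 1120, 2130, 2240, 2190, 1370, 1440, 2560,
    870, 2000, 1870, 1700, 1800, 2375, 1940, 2250, 1460, 1750,
    1875, 1165, 2255, 1470, 2060, 2135, 2125, 1760, 1650, 1945,
    2000, 2250,
]


def tabulate_exclusive(values, start, width):
    """Count values into classes [lower, upper): the upper limit is excluded."""
    if min(values) < start:
        raise ValueError("start must not exceed the smallest value")
    top = max(values)
    table = {}
    lower = start
    while lower <= top:
        upper = lower + width
        table[(lower, upper)] = sum(lower <= v < upper for v in values)
        lower = upper
    return table


# Assumed class limits: 500-1000, 1000-1500, ... (width 500).
table = tabulate_exclusive(salaries, start=500, width=500)
for (lower, upper), frequency in table.items():
    print(f"{lower}-{upper}: {frequency}")
print("Total:", sum(table.values()))
```

With these assumed limits the frequencies are 3, 12, 10, 14 and 1, which add up to the 40 employees.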

31
Statistical Series

• A series is defined as a logical or systematic
arrangement of observations or items.
• When the things and attributes are counted,
measured or weighed and arranged in an
orderly manner, they form a series.

32
Statistical Series

Basis of arrangement and resulting series:
– Space: spatial series
– Time: time series
– Geography: geographical series
– Physical conditions (height, weight, etc.): conditions series
33
Discrete & Continuous series

34
Tables
• Tabulation is used to summarize and
condense the data
• It aids in the analysis of relationships and trends and in
other summarization
• One-way tables, two-way tables, three-way tables
and cross tabulations are some of the forms

35
Important Characteristics
• Clear & concise title
• Should be distinctly numbered to facilitate
easy reference.
• Should have captions (column headings) and
stubs (row headings) that are clear & brief.
• Units of measurement should be indicated
• Source should be mentioned.

36
Important Characteristics
• Explanatory footnotes if required
• Columns of the tables should be numbered
• Abbreviations should be used to the minimum possible extent
• Should be logical, clear, accurate & simple
• Data categories should be arranged in chronological,
geographic, alphabetical or magnitude order
• Must suit the needs & requirements of the research study.

37
Graphical Presentation
Two-Dimensional Diagram
[Bar chart: % share by head for Firm A and Firm B]

Heads      Raw Material   Labour   Overheads   Profits
Firm A     50             20       20          10
Firm B     32             36       20          12
38
Stacked Bars

[Stacked bar chart: one 100% bar each for Firm A and Firm B, divided into
Raw Material, Labour, Overheads and Profits]
39


Pictogram

[Pictogram: X-axis labelled Firm A, Y-axis labelled Firm B]
40
Pie Chart

Firm A
– Raw Material: 50%
– Labour: 20%
– Overheads: 20%
– Profits: 10%
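To draw a pie chart by hand, each percentage share is converted into a sector angle out of 360 degrees. The short sketch below does this for the Firm A figures; the function name `pie_angles` is an assumption made for the example.

```python
# Percentage shares for Firm A, as shown on the pie chart slide.
firm_a_shares = {"Raw Material": 50, "Labour": 20, "Overheads": 20, "Profits": 10}


def pie_angles(shares):
    """Convert shares into sector angles in degrees (the angles sum to 360)."""
    total = sum(shares.values())
    return {head: 360 * value / total for head, value in shares.items()}


angles = pie_angles(firm_a_shares)
for head, angle in angles.items():
    print(f"{head}: {angle} degrees")
```

Dividing by the total rather than by 100 means the same function also works for raw amounts that have not yet been converted to percentages.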

41
Line Chart

[Line chart: values by head (Raw Material, Labour, Overheads, Profits)
for Firm A and Firm B]
42
DATA ANALYSIS TECHNIQUES
• This phase calls for extracting findings from the
collected data.

• The researcher tabulates the data & develops
frequency distributions.
• He/she should use different statistical tools to analyze
his/her data.
• Statistics: numeric expressions describing the
characteristics of a given phenomenon.
• There are two broad categories of statistics:
descriptive & inferential.
43
• Descriptive statistics: a set of methods involving
the collection, presentation, characterization, and
summarization of a set of data by means of
numerical descriptors.

• Such numerical descriptors (statistics) include
– the mean, median and mode as measures of central
tendency (e.g., the typical output of a firm), and
– the variance, standard deviation and range as measures
of variability, or dispersion (e.g., the difference between
the largest and smallest output of a firm).
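These descriptors can be computed with Python's standard `statistics` module. The figures below are hypothetical daily outputs of a firm, and the data set is treated as the whole population, hence `pvariance` and `pstdev`. For a sample, `variance` and `stdev` would be used instead.

```python
import statistics

# Hypothetical daily output of a firm (units produced).
output = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(output)
median = statistics.median(output)
mode = statistics.mode(output)

# Measures of variability, treating the data as the whole population.
variance = statistics.pvariance(output)
std_dev = statistics.pstdev(output)
value_range = max(output) - min(output)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance}, std dev={std_dev}, range={value_range}")
```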

44
• Inferential statistics: the set of methods
that allow estimation or testing of a
characteristic or attribute of a population
(i.e., the entire set of values under
consideration) or
– the making of a judgment or decision
concerning a population based only upon
sample results.
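As a rough illustration of estimation, the sketch below builds a confidence interval for a population mean from sample results alone. The sample values are hypothetical. The interval uses a normal (z) approximation, which is a simplifying assumption; with a sample this small a t-based interval would normally be preferred.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


# Hypothetical sample of monthly salaries drawn from a larger workforce.
sample = [1060, 1310, 1255, 1690, 2125, 2240, 1870, 1460, 2060, 1760]

lower, upper = mean_confidence_interval(sample)
print(f"Sample mean: {statistics.mean(sample)}")
print(f"Approximate 95% interval for the population mean: {lower:.0f} to {upper:.0f}")
```

The interval is a statement about the population made only from the sample, which is the sense in which inferential statistics supports generalization.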

45
• It is mainly on the basis of inferential
analysis that the tasks of interpretation
and generalization are performed.

• Advanced statistical techniques related to
inference of the degree of association and
differences are very important
instruments of social research.

46
