ANALYSIS
by: Novianto Budi Kurniawan
WHAT IS CATEGORICAL DATA?
Categorical data is a collection of information that is divided into groups
Types of Categorical Data:
1. Nominal
This is a type of data used to name variables without providing any numerical value
(“labelled” or “named” data)
2. Ordinal
This is a data type with a set order or scale to it
Categorical data can take on numerical values (such as “1” indicating Yes and
“2” indicating No), but those numbers don’t have mathematical meaning. One
can neither add them together nor subtract them from each other.
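A minimal Python sketch of this point, assuming a handful of hypothetical survey answers coded as in the text (1 = Yes, 2 = No). The names `LABELS` and `responses` are illustrative only. Counting the codes is meaningful; averaging them is not.

```python
from collections import Counter

# Hypothetical responses, coded as in the text: 1 = Yes, 2 = No.
LABELS = {1: "Yes", 2: "No"}
responses = [1, 2, 1, 1, 2]

# Meaningful for categorical data: how often each category occurs.
counts = Counter(LABELS[code] for code in responses)

# The most frequent category (the mode) is also meaningful.
mode_label, mode_count = counts.most_common(1)[0]

# Not meaningful: the arithmetic mean of the codes. The value depends
# only on the arbitrary coding scheme, not on anything measured.
mean_of_codes = sum(responses) / len(responses)

print(dict(counts))
print("mode:", mode_label)
print("mean of codes (no real meaning):", mean_of_codes)
```

Recoding Yes/No as 0/1 or 10/20 would change the mean but leave the counts and the mode untouched, which is why only the latter are used for categorical data.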
WHAT IS CATEGORICAL DATA
The measurement scale for the response consists of a number of categories
CATEGORICAL DATA ANALYSIS
• Independent (Explanatory) Variable is Categorical (Nominal or Ordinal)
• Dependent (Response) Variable is Categorical (Nominal or Ordinal)
• Source: Data collection (Meta-analysis, Census, Survey, Observations, etc.)
• Special Cases: 2x2 (Each variable has 2 levels)
–Nominal/Nominal
–Nominal/Ordinal
–Ordinal/Ordinal
• Contingency Tables
CONTINGENCY TABLES
• A table showing the distribution of one variable in rows and another in
columns, used to study the association between the two variables.
• Tables representing all combinations of levels of explanatory and
response variables
• Numbers in the table represent counts of the cases in each cell
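As a rough illustration of how such a table of counts arises, the sketch below tallies raw cases into cells. The records and the level names ("treated", "control", "improved", "not improved") are hypothetical, and `cross_tabulate` is an illustrative helper, not something defined in these slides.

```python
from collections import Counter

# Hypothetical raw records: (explanatory level, response level) per case.
records = [
    ("treated", "improved"),
    ("treated", "improved"),
    ("treated", "not improved"),
    ("control", "improved"),
    ("control", "not improved"),
    ("control", "not improved"),
]

def cross_tabulate(pairs):
    """Count the cases falling in each (explanatory, response) cell."""
    return Counter(pairs)

cells = cross_tabulate(records)
col_levels = sorted({x for x, _ in records})
row_levels = sorted({y for _, y in records})

# Print the table: response levels in rows, explanatory levels in columns.
print("response".ljust(14) + "".join(c.ljust(10) for c in col_levels))
for r in row_levels:
    print(r.ljust(14) + "".join(str(cells[(c, r)]).ljust(10) for c in col_levels))
```

Every combination of levels gets a cell, and a combination that never occurs simply shows a count of 0.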
ANALYSIS OF CONTINGENCY TABLES
Tables as a technique of data description
What can a contingency table tell us?
• Comparison between groups
• Mutual relationship between 2 (or more) variables
• Explanatory Variable – Groups (typically based on demographics, exposure, or treatment)
• Response Variable – Outcome (typically presence or absence of a characteristic)
DISPLAYING CONTINGENCY TABLES
ANALYSIS OF CONTINGENCY TABLES
Bivariate analysis of categorical variables
Cross-tabulation
TWO-WAY CONTINGENCY TABLE ANALYSIS
2×2 table
TWO-WAY CONTINGENCY TABLE
Example:
A sample of 124 mice was divided into two groups: 84 receiving a standard dose of pathogenic bacteria followed by an antiserum, and a control group of 40 not receiving the antiserum. After 3 weeks, the numbers dead and alive in each group were counted.
          Antiserum   Control   Total
Dead      19          18        37
Alive     65          22        87
Total     84          40        124

          Antiserum   Control
% Dead    23          45
Association between mortality and treatment?
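The percentages in the mice example can be reproduced with a short sketch. The counts are those given in the example; the dictionary layout and the helper name `percent_dead` are illustrative choices.

```python
# Counts from the mice example, grouped by treatment.
table = {
    "antiserum": {"dead": 19, "alive": 65},
    "control": {"dead": 18, "alive": 22},
}

def percent_dead(group_counts):
    """Percentage of a group that died, out of that group's total."""
    total = sum(group_counts.values())
    return 100 * group_counts["dead"] / total

pct = {group: percent_dead(counts) for group, counts in table.items()}
difference = pct["control"] - pct["antiserum"]

for group, value in pct.items():
    print(f"{group}: {value:.0f}% dead")
print(f"difference: {difference:.0f} percentage points")
```

Percentaging within each treatment group is what makes the two groups comparable even though they differ in size (84 versus 40).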
CONSTRUCTION OF CONTINGENCY TABLES
Step 1: Determine which variable is independent and which is dependent.
Step 2: Calculate percentages within the categories of the independent variable.
CONSTRUCTION OF CONTINGENCY TABLES
INDEPENDENT – explanatory variable
Gender
CONSTRUCTION OF CONTINGENCY TABLE
Table 1. Relationship between Educational Level and Performance on Civil Service Examination
                                           Education
Performance on Civil Service Examination   High School or Less   More than High School   Total
CONSTRUCTION OF CONTINGENCY TABLE
Percentage Distribution for Data of Table 1
                                           Education
Performance on Civil Service Examination   High School or Less   More than High School
• This distinction determines how we interpret the results (%) and which coefficient of association/correlation we can use.
ANALYSIS OF CONTINGENCY TABLES
A manager would expect income to lead to job satisfaction: the higher the income, the higher would be the expected job satisfaction. Using the following data, do you agree?
Table 2. Cross-Tabulation of Income and Job Satisfaction
                   Income
Job Satisfaction   Low   Medium   High   Total
ANALYSIS OF CONTINGENCY TABLES - ORDINAL
Percentage Cross-Tabulation of Income-Job Satisfaction Relationship
                   Income
Job Satisfaction   Low         Medium      High
Low                50%         20%         13%
Medium             30%         53%         20%
High               20%         27%         67%
Total              100%        100%        100%
                   (n = 200)   (n = 150)   (n = 75)
ANALYSIS OF CONTINGENCY TABLES
Income-Job Satisfaction Relationship
• Compare the end categories of the independent (and dependent) variable rather than the intermediate ones; this gives a clearer understanding and interpretation of the table.
• Compare the percentage of those with low income who have high job satisfaction (20%) with the percentage of those with high income who have high job satisfaction (67%).
• Alternatively, compare the percentage of those with low income who express low job satisfaction (50%) with the percentage of those with high income who express low job satisfaction (13%).
• These comparisons show that those with high income indicated high job satisfaction more often than did those with low income (by 47 percentage points) and, conversely, that those with low income indicated low job satisfaction more often than did their counterparts with high income (by 37 percentage points).
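The two comparisons can be written out as a small sketch. The percentages are the ones from the income and job satisfaction table; `percentage_difference` and its parameter names are illustrative assumptions.

```python
# Column percentages from the income / job satisfaction table.
# Outer keys: income level; inner keys: job satisfaction level.
pct = {
    "low": {"low": 50, "medium": 30, "high": 20},
    "medium": {"low": 20, "medium": 53, "high": 27},
    "high": {"low": 13, "medium": 20, "high": 67},
}

def percentage_difference(table, satisfaction, from_income="low", to_income="high"):
    """Difference, in percentage points, between the two end income categories."""
    return table[to_income][satisfaction] - table[from_income][satisfaction]

high_sat_diff = percentage_difference(pct, "high")  # 67 - 20
low_sat_diff = percentage_difference(pct, "low")    # 13 - 50

print("high satisfaction, high vs low income:", high_sat_diff)
print("low satisfaction, high vs low income:", low_sat_diff)
```

The positive difference for high satisfaction and the negative difference for low satisfaction point the same way: satisfaction rises with income in this table.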
ANALYSIS OF CONTINGENCY TABLES - ORDINAL
A disgruntled official working in the personnel department is disturbed by the level of incompetence she perceives in the leadership of the organization. She is convinced that incompetence rises to the top, and she shares this belief with you as her coworker over lunch. She asked you to help her to substantiate her claim. Using the following data, do you agree with her judgement?
Table 3. Cross-Tabulation of Competence and Hierarchy
            Competence
Hierarchy   Low    Medium   High   Total
Low         113    60       27     200
Medium      31     91       38     160
High        8      8        24     40
Total       152    159      89     400
ANALYSIS OF CONTINGENCY TABLES - ORDINAL
Percentage Cross-Tabulation of Competence-Hierarchy Relationship
            Competence
Hierarchy   Low         Medium      High
Low         74%         38%         30%
Medium      21%         57%         43%
High        5%          5%          27%
Total       100%        100%        100%
            (n = 152)   (n = 159)   (n = 89)
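A sketch of how the percentage table follows from the counts in Table 3, percentaging within each competence level (the column variable). The variable names are illustrative. Unrounded values can differ slightly from the printed table: the medium-hierarchy share of the low-competence column works out to about 20.4%, shown above as 21%.

```python
# Counts from Table 3: rows are hierarchy levels, columns are competence levels.
LEVELS = ["Low", "Medium", "High"]
counts = {
    "Low": [113, 60, 27],
    "Medium": [31, 91, 38],
    "High": [8, 8, 24],
}

# Column totals: number of cases at each competence level.
col_totals = [sum(counts[h][j] for h in LEVELS) for j in range(len(LEVELS))]

# Percentage of each competence column that falls in each hierarchy row.
col_pct = {
    h: [100 * counts[h][j] / col_totals[j] for j in range(len(LEVELS))]
    for h in LEVELS
}

for h in LEVELS:
    print(h.ljust(8) + "  ".join(f"{p:5.1f}%" for p in col_pct[h]))
```

Reading across the top hierarchy row, the share in high positions is about 5% for low and medium competence and about 27% for high competence.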
TWO-WAY CONTINGENCY TABLE ANALYSIS
CASE STUDY
STATISTICAL CONTROL TABLE ANALYSIS
When a pair of variables were found to
be associated statistically, we inevitably
assumed that they were, in fact, related,
in the sense that changes in one could
be expected to lead to changes in the
other. Conversely, when the variables
were not associated statistically, we
STATISTICAL CONTROL TABLE ANALYSIS
Control table analysis is a technique for the multivariate (three or more variable) analysis of nominal and ordinal variables. Control table analysis is used to determine how a third, “control” variable may affect the association between an independent
STATISTICAL CONTROL TABLE ANALYSIS
The procedure by which the researcher
STATISTICAL CONTROL TABLE ANALYSIS
Example:
The Daily News, one of the leading newspapers in Central City, has recently published a series of troubling articles accusing the Central City government of favouritism in testing and hiring job applicants. The articles charge Central City officials with giving hiring preference to those whom they know, rather than to the most qualified applicants. One article quotes an unsuccessful job candidate: “Unless you know someone in Central City hall, you’re not going to get a pass on the civil service examination. And without the pass, you don’t make it onto the hire list. Check the list: most of the people on it have friends in city government. The key is to know
STATISTICAL CONTROL TABLE ANALYSIS
Example (cont … ):
To do so, he draws a random sample of 335 job applicants for analysis from the Central City central personnel department. He begins by cross-tabulating whether the applicant knew someone in Central City government (previous contact) with the information on whether she or he passed the civil service exam (test performance). Table 3 displays the cross-tabulation.
Table 3. Cross-Tabulation of Test Performance and Prior Contact
                   Prior Contact
Test Performance   No     Yes    Total
Fail               70     70     140
Pass               60     135    195
Total              130    205    335
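As a sketch of the next step, the counts above can be percentaged within each category of prior contact, the independent variable. The counts come from the cross-tabulation; `column_percentages` is an illustrative helper name.

```python
# Counts from the cross-tabulation of test performance and prior contact.
counts = {
    "No": {"Fail": 70, "Pass": 60},
    "Yes": {"Fail": 70, "Pass": 135},
}

def column_percentages(column):
    """Convert one column of counts into percentages of the column total."""
    total = sum(column.values())
    return {outcome: 100 * n / total for outcome, n in column.items()}

pct = {contact: column_percentages(col) for contact, col in counts.items()}
pass_diff = pct["Yes"]["Pass"] - pct["No"]["Pass"]

for contact in ("No", "Yes"):
    print(contact, {k: round(v, 1) for k, v in pct[contact].items()})
print(f"pass-rate difference: {pass_diff:.1f} percentage points")
```

On these counts, applicants with prior contact pass more often than those without, which is the bivariate association the rest of the example goes on to examine.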
STATISTICAL CONTROL TABLE ANALYSIS
Example (cont … ):
Percentage Cross-Tabulation of Test Performance and Prior Contact
                   Prior Contact
Test Performance   No     Yes
STATISTICAL CONTROL TABLE ANALYSIS
Table 3.A.
STATISTICAL CONTROL TABLE ANALYSIS
Table 3.B.
STATISTICAL CONTROL TABLE ANALYSIS
Example (cont … ):
Although prior contact with a Central City official appeared to affect test performance in the original bivariate cross-tabulation, the introduction of the control variable (education) made the relationship disappear. Thus, in this example, contact is not a cause of test performance. Instead, previous contact is a spurious variable: one that initially appears to be related to the dependent variable but whose effect
STATISTICAL CONTROL TABLE ANALYSIS
Example (cont … ):
Table 3.B. also demonstrates that, regardless of prior contact with a Central City official, the percentage of college graduates failing the examination (25%) is much smaller than the percentage of nongraduates who fail (67%). This finding indicates that it is education that leads to test performance. For both those who have and those who have not had prior contact, the higher the education, the better is the performance on
STATISTICAL CONTROL TABLE ANALYSIS
EXERCISES
MEASURES OF ASSOCIATION
Measures of association are statistics whose magnitude and sign (positive or negative) provide an indication of the extent and direction of relationship between two variables in a cross-tabulation. In contrast to the percentage difference, measures of association are calculated on the basis of, and take into account, all data in the contingency table. These statistics are designed to indicate where an actual relationship falls on the
MEASURES OF ASSOCIATION
Four conventions:
1. If the relationship between the two variables is perfect, the measure equals +1.0 (positive relationship) or -1.0 (negative relationship).
MEASURES OF ASSOCIATION
Example (cont…):
Percentaged Cross-Tabulation of Education and Seniority
            Education
Seniority   Low        High
Low         80%        40%
High        20%        60%
Total       100%       100%
            (n = 25)   (n = 25)
As education increased, employees were more likely to
have high seniority by a (percentage) difference of 60% –
20% = 40%.
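To contrast the percentage difference with a measure that uses every cell, here is a sketch that assumes the Goodman-Kruskal gamma, (Ns - Nd) / (Ns + Nd), where Ns and Nd are the numbers of concordant and discordant pairs. The function name `gamma` and the list-of-rows layout are illustrative choices. The cell counts are recovered from the percentages and n = 25 per education level (for example, 80% of 25 = 20).

```python
def gamma(table):
    """Goodman-Kruskal gamma for a table whose rows and columns are
    both listed from the lowest to the highest category."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair cell (i, j) with every cell in a lower row of the table.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:    # higher on both variables: concordant
                        concordant += table[i][j] * table[k][m]
                    elif m < j:  # higher on one, lower on the other: discordant
                        discordant += table[i][j] * table[k][m]
    # Assumes at least one concordant or discordant pair exists.
    return (concordant - discordant) / (concordant + discordant)

# Rows: seniority (low, high); columns: education (low, high).
seniority_by_education = [
    [20, 10],  # low seniority: 80% of 25, 40% of 25
    [5, 15],   # high seniority: 20% of 25, 60% of 25
]

print(round(gamma(seniority_by_education), 2))
```

Here Ns = 20 × 15 = 300 and Nd = 10 × 5 = 50, so gamma = 250 / 350, about 0.71. In line with the first convention, a table with all cases on the main diagonal gives +1.0 and one with all cases on the opposite diagonal gives -1.0.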
MEASURES OF ASSOCIATION
Example:
Table 5. Cross-Tabulation of Gender and Contribution
               Gender
Contribution   Male   Female   Total
1. Calculate Gamma!
2. What conclusion can you draw from this data?
THANK YOU
AND
04:10
9/28/20