
Workshop Objectives:

• To develop Data Analysis Skills

• Use of appropriate Statistical Techniques

• Use of various Statistical Packages to perform Statistical Analysis

• Understanding and Interpretation of Results


Types of Analysis:
First, we will learn to perform the analyses using SPSS.
Then we will replicate all of these analyses using STATA.
Finally, we will use AMOS for SEM (structural equation modeling).
Before we start analyzing data, let me introduce SPSS and its basic structure.
Descriptive Analysis:

• Descriptive Analysis for Qualitative Variables

• Descriptive Analysis for Quantitative Variables


Descriptive Analysis of Qualitative Data

Qualitative Data (Categorical Data)

• Tables: One-Way Table, Two-Way Table, N-Way Table
• Graphs: Bar Chart, Pie Chart, Clustered Bar Chart
• Numbers: Percentages
Descriptive Analysis for Quantitative Data
Quantitative Data (Numerical Data)

• Tables: Frequency Distribution, Stem and Leaf
• Graphs: Histogram, Box-Plot
• Numbers:
    Center: Mean, Median, Mode, Geometric Mean, Harmonic Mean, Trimmed Mean
    Important Points: Median, Quartiles, Percentiles
    Variation: Range, Inter-Quartile Range, Variance, Standard Deviation
    Distribution: Skewness, Kurtosis
Tabular Methods
Graphical Methods
[Figure: Histogram. Vertical axis: Frequency (0 to 250); horizontal axis from 15750 to 115750.]
Numerical Methods
Practice Session for Descriptive Analysis
• Import Customer’s Databas.xls into SPSS.
• Label the data properly.
• Make one-way tables for the variables Age, gender, OwnHome and
Married. Also make a pie chart and a bar chart for each of these
variables.
• Make two-way tables (gender by OwnHome and Married by
OwnHome). Also make a clustered bar chart for each pair of variables.
• Produce detailed numerical descriptive statistics for the
variable “Purchases” (Mean, Median, ...). Also make a
histogram, a stem-and-leaf plot and a box-plot for the variable
“Purchases”.
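
As a cross-check on the numerical measures requested above, here is a minimal sketch in Python using only the standard library. The `purchases` values and the `describe` helper are hypothetical and are not the workshop data. Quartiles follow Python's default method, which may differ slightly from the SPSS output.

```python
import statistics

# Hypothetical purchase amounts (illustrative only, not the workshop file).
purchases = [120.0, 85.5, 230.0, 99.0, 150.0, 175.5, 85.5, 310.0]


def describe(values):
    """Return a few of the center and variation measures listed earlier."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }


summary = describe(purchases)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```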
Inferential Analysis
Parametric & Non-Parametric Inference

• Normality + Equal Variances
• Normality + Unequal Variances
Comparing One Group
• Kinds of Research Questions
For the one-sample situation, the prime concern in research is
examining a measure of central tendency (location) for the
population of interest. The best-known measures of location
are the mean and the median. For a one-sample situation, we
might want to know whether the average waiting time in a doctor's
office is greater than one hour, whether the average growth of
roses is 4 inches or more with a certain fertilizer, or whether the
annual return is 10.2% for banks that exercised comprehensive
planning.
Comparing Two Groups
• Kinds of Research Questions

One of the most common tasks in research is to compare two populations
(groups). We might want to compare the income levels of two regions, the
nitrogen content of two lakes, or the effectiveness of two drugs.

The first question that arises is which aspects (parameters) of the
populations we should compare. We might consider comparing the averages,
the medians, the standard deviations, the distributional shapes
(histograms), or the maximum values. We choose the comparison parameter
based on our particular problem.

Perhaps the simplest comparison that we can make is between the means
of the two populations.
Comparing More Than Two Groups
• Kinds of Research Questions

One of the most common tasks in research is to compare several populations
(groups). We might want to compare the income levels of three regions, the
nitrogen content of four lakes, or the effectiveness of four drugs.

The first question that arises concerns which aspects (parameters) of the
populations we should compare. We might consider comparing the means,
medians, standard deviations, distributional shapes (histograms), or
maximum values. We choose the comparison parameter based on our particular
problem.

Perhaps the simplest comparison that we can make is to compare the means of
several populations.
One Sample t-test

• The one-sample t-test is used to compare one group to a
given standard on the basis of the arithmetic average (mean).
Assumptions of the One-sample t-test

• The data are continuous.

• The data follow the Normal distribution.

• The sample is a simple random sample from the population.
Hypotheses and Formulas

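\[ H_0: \mu = \mu_0, \qquad H_A: \mu \neq \mu_0 \]

\[ t = \frac{\bar{X} - \mu}{\sqrt{s^2 / n}} \]

where \(\mu\) takes its hypothesized value \(\mu_0\) under \(H_0\).

With
\[ df = n - 1 \]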
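
The t statistic and its degrees of freedom can be computed by hand as a check on the SPSS output. The sketch below uses only the Python standard library. The `sample` values, the standard `mu0 = 10.0`, and the function name `one_sample_t` are hypothetical. It returns the statistic and df only; a p-value would come from a t table or a statistics package.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for H0: mu = mu0, using the sample variance s^2."""
    n = len(sample)
    mean = statistics.mean(sample)
    s2 = statistics.variance(sample)  # divides by n - 1
    t = (mean - mu0) / math.sqrt(s2 / n)
    return t, n - 1


# Hypothetical measurements compared against a standard of 10.0.
sample = [9.8, 10.2, 10.4, 9.9, 10.3, 10.0]
t_stat, df = one_sample_t(sample, 10.0)
print(f"t = {t_stat:.4f}, df = {df}")
```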
Case Study
A manufacturer of high-performance automobiles
produces disc brakes that must measure 322 millimeters
in diameter. A quality control manager randomly selects
128 discs and measures their diameters.

We can use the One Sample T Test to determine whether or
not the mean diameter of the brakes in the sample
differs significantly from 322 millimeters.
SPSS Analytic Procedure
The Sign Test
The sign test is perhaps the oldest of all the nonparametric
procedures. This nonparametric test is based on the binomial
distribution. It assumes two mutually exclusive outcomes,
constant or stable probability of success or failure, and n
independent trials.

The terminology, sign test, reinforces the point that the data are
converted to a series of plus and minus signs. The test is based
on the number of plus signs that occur. Zero differences are
thrown out, and the sample size is reduced accordingly.
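
The plus-and-minus counting described above can be sketched with an exact binomial calculation. This is a minimal illustration, assuming a two-sided alternative and a hypothesized median `median0`. The function name `sign_test` and the `salaries` values are hypothetical, and the exact binomial p-value is one common way to carry out the test, not necessarily what SPSS reports.

```python
import math


def sign_test(sample, median0):
    """Two-sided exact sign test for H0: median = median0."""
    # Zero differences are thrown out, which reduces the sample size.
    diffs = [x - median0 for x in sample if x != median0]
    n = len(diffs)
    plus = sum(1 for d in diffs if d > 0)
    # Under H0 the number of plus signs follows Binomial(n, 0.5).
    k = min(plus, n - plus)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return plus, n, p_value


# Hypothetical salaries in thousands, tested against a median of 50.
salaries = [52, 48, 55, 50, 61, 47, 58, 53, 50, 56]
plus, n_used, p_value = sign_test(salaries, 50)
print(f"plus signs = {plus} of {n_used}, p = {p_value:.4f}")
```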
Assumptions of the Sign Test

• The data are continuous

• The distribution of these data is symmetric.

• The measurement scale is at least interval.


Hypotheses and Formulas

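\[ H_0: \tilde{\mu} = \tilde{\mu}_0 \]

\[ H_A: \tilde{\mu} \neq \tilde{\mu}_0, \quad \tilde{\mu} > \tilde{\mu}_0, \quad \tilde{\mu} < \tilde{\mu}_0 \]

\[ Z = \frac{w - \mu_w}{\sigma_w} \]

\[ \mu_w = \frac{n(n + 1)}{4}, \qquad \sigma_w = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}, \qquad w = \sum R^{+} \]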
Case Study
A researcher believes that the median salary of HR
managers is 50 thousand. To confirm this
hypothesis, he selects a random sample of 1207 HR
managers from different companies.

We can use the Sign Test to determine whether or not
the median salary is significantly different from 50
thousand.
SPSS Analytic Procedure
Paired Samples t-test
• Kinds of Research Questions

In the paired case, we take two measurements on the same individual at
different times, or we have one measurement on each individual of a
pair.
Examples of the first case are two insurance-claim adjusters assessing
the damage for the same 15 cases, and the evaluation of the improvement
in aerobic fitness for 15 subjects, where measurements are made at the
beginning of the fitness program and at the end of it.
An example of the second paired situation is the testing of the
effectiveness of two drugs, A and B, on 20 pairs of patients who have
been matched on physiological and psychological variables. One
patient in the pair receives drug A, and the other patient gets drug B.
Assumptions of the paired-sample t-test

• The data are continuous.

• The data, i.e., the differences for the matched pairs,
follow a Normal distribution.

• The sample of pairs is a simple random sample from
its population.
Hypotheses and Formulas

\[ H_0: \mu_d = 0, \qquad H_A: \mu_d \neq 0 \]

\[ t = \frac{\bar{X}_d - \mu_d}{\sqrt{s_d^2 / n}} \]

With
\[ df = n - 1 \]
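
The paired statistic is the one-sample t statistic applied to the within-pair differences. A minimal sketch using only the Python standard library follows. The `before` and `after` values are hypothetical, not the textbook data, and `paired_t` is an illustrative name.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for H0: mean difference = 0 on matched pairs."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s2_d = statistics.variance(diffs)  # sample variance of the differences
    t = (d_bar - 0.0) / math.sqrt(s2_d / n)
    return t, n - 1


# Hypothetical doses recorded before and after a treatment.
before = [10, 8, 7, 9, 6]
after = [7, 6, 7, 5, 4]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.4f}, df = {df}")
```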
Case Study
A researcher in behavioral medicine believes that stress often makes
asthma symptoms worse for people who suffer from this respiratory
disorder. Therefore, the researcher decides to study the effect of
relaxation training on the severity of their symptoms.

A sample of 5 patients is selected. During the week before
treatment, the investigator records the severity of their symptoms
by measuring how many doses of medication are needed for asthma
attacks. Then the patients receive relaxation training. For the week
following the training, the researcher once again records the number
of doses used by each patient.

Data from Gravetter and Wallnau (4th Ed.) p. 319.


SPSS Analytic Procedure
Wilcoxon Signed Rank test

• The Wilcoxon Signed Rank test is used to test whether the
median difference is zero when the populations are non-normal.
Assumptions of the Wilcoxon Signed Rank test

• The differences are continuous.

• The distribution of these differences is symmetric.

• The differences are mutually independent.

• The measurement scale is at least interval.


Hypotheses and Formulas

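\[ H_0: \tilde{\mu}_1 = \tilde{\mu}_2 \]

\[ H_1: \tilde{\mu}_1 \neq \tilde{\mu}_2 \]

\[ Z = \frac{w - \mu_w}{\sigma_w} \]

\[ \mu_w = \frac{n(n + 1)}{4}, \qquad \sigma_w = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}, \qquad w = \sum R^{+} \]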
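
The normal approximation above can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions: zero differences are dropped, tied absolute differences share their average rank, and no tie or continuity correction is applied, so results can differ from SPSS when there are many ties. The scores and function names are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (w, z, two-sided p) with w = sum of positive-difference ranks."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    w = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mu_w = n * (n + 1) / 4
    sigma_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu_w) / sigma_w
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p_value


# Hypothetical test scores before and after a new teaching method.
before = [60, 72, 55, 80, 66, 90, 75, 58]
after = [68, 75, 54, 92, 71, 88, 85, 64]
w, z, p_value = wilcoxon_signed_rank(before, after)
print(f"w = {w}, z = {z:.4f}, p = {p_value:.4f}")
```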
Case Study
An educationist wants to assess the effectiveness of a
new teaching method. For this, she selected 600
students and recorded their scores on a test of 150
marks. The scores were recorded before and after the
new teaching method.

The Wilcoxon Signed Rank test can be used to test
the effectiveness of the new teaching method.
SPSS Analytic Procedure
Independent Samples t-test
Equal Variances

The independent-samples t-test is used to compare two
groups on the basis of their averages.
Assumptions of the two-sample t-test

• The data are continuous.
• The data follow the Normal distribution.
• The variances of the two populations are equal.
• The two samples are independent.
• Both samples are simple random samples from their
respective populations.
Hypotheses and Formulas

\[ H_0: \mu_1 = \mu_2, \qquad H_A: \mu_1 \neq \mu_2 \]

\[ t = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]

With
\[ df = n_1 + n_2 - 2 \]
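
The pooled-variance statistic can be computed step by step, as in the minimal sketch below (Python standard library only). The two groups and the name `pooled_t` are hypothetical, and the hypothesized difference is taken as zero.

```python
import math
import statistics


def pooled_t(x1, x2):
    """Return (t, df) for H0: mu1 = mu2, assuming equal variances."""
    n1, n2 = len(x1), len(x2)
    s2_1 = statistics.variance(x1)
    s2_2 = statistics.variance(x2)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (statistics.mean(x1) - statistics.mean(x2)) / se
    return t, n1 + n2 - 2


# Hypothetical spending figures for two independent groups.
group_a = [12, 15, 11, 14, 13]
group_b = [10, 9, 12, 8, 11]
t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.4f}, df = {df}")
```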
Case Study
• An analyst at a department store wants to evaluate a
recent credit card promotion. To this end, 500
cardholders were randomly selected. Half received
an ad promoting a reduced interest rate on
purchases made over the next three months, and
half received a standard seasonal ad.
• We can use Independent-Samples T Test to compare
the spending of the two groups.
SPSS Analytic Procedure
Independent Samples t-test
Unequal Variances

• The independent-samples t-test is used to compare two
independent groups on the basis of their averages. This version of the
test does not require homogeneity of the variances.
Hypotheses and Formulas

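\[ H_0: \mu_1 = \mu_2, \qquad H_A: \mu_1 \neq \mu_2 \]

\[ t = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

With
\[ df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}} \]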
Case Study
• A researcher wishes to compare the
expenditure behavior of students. One of
the research questions is whether
expenditures differ by gender.
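
The unequal-variance t statistic and its approximate degrees of freedom can be sketched as follows (Python standard library only). The expenditure values and the name `welch_t` are hypothetical, and the hypothesized difference is taken as zero. Note that df is generally not an integer.

```python
import math
import statistics


def welch_t(x1, x2):
    """Return (t, df) for H0: mu1 = mu2 without assuming equal variances."""
    n1, n2 = len(x1), len(x2)
    v1 = statistics.variance(x1) / n1  # s1^2 / n1
    v2 = statistics.variance(x2) / n2  # s2^2 / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical expenditures for two independent groups of students.
group_1 = [12, 15, 11, 14, 13]
group_2 = [10, 4, 16, 8, 12, 10]
t_stat, df = welch_t(group_1, group_2)
print(f"t = {t_stat:.4f}, df = {df:.4f}")
```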
SPSS Analytic Procedure
Mann-Whitney Test
• Mann-Whitney Test is used to compare the
two independent groups on the basis of
medians. This test does not require the
assumption of normality.
Mann-Whitney U Test Assumptions
• The variable of interest is continuous. The measurement scale
is at least ordinal.

• The probability distributions of the two populations are
identical, except for location.

• The two samples are independent.

• Both samples are simple random samples from their
respective populations.
Hypotheses and Formulas
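\[ H_0: \tilde{\mu}_1 = \tilde{\mu}_2 \]

\[ H_A: \tilde{\mu}_1 \neq \tilde{\mu}_2 \]

\[ z = \frac{u - \mu_u}{\sigma_u} \]

\[ \mu_u = \frac{n_1 n_2}{2}, \qquad \sigma_u = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}, \qquad u = w - \frac{n_1 (n_1 + 1)}{2} \]

Here w is the sum of the ranks of the smaller sample (of size \(n_1\)).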
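
A minimal sketch of the z approximation above, using only the Python standard library. It assumes the ranks are assigned over the pooled data with average ranks for ties and applies no tie or continuity correction. The data and function names are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_z(sample_a, sample_b):
    """Return (w, u, z, two-sided p); w is the rank sum of the smaller sample."""
    small, large = sorted((list(sample_a), list(sample_b)), key=len)
    n1, n2 = len(small), len(large)
    ranks = average_ranks(small + large)
    w = sum(ranks[:n1])
    u = w - n1 * (n1 + 1) / 2
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu_u) / sigma_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, u, z, p_value


# Hypothetical measurements for two independent groups.
group_a = [1.1, 2.3, 3.8, 4.0]
group_b = [2.9, 4.5, 5.1, 6.2, 7.0]
w, u, z, p_value = mann_whitney_z(group_a, group_b)
print(f"w = {w}, u = {u}, z = {z:.4f}, p = {p_value:.4f}")
```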
Case Study
The data are birth weights of infants born to mothers with
different levels of prenatal care, giving two independent
samples for univariate analysis. The test data for the
Mann-Whitney U test are taken from Howell, David
C., Fundamental Statistics for the Behavioral Sciences,
3rd Edition, p. 385.
SPSS Analytic Procedure
One-Way Analysis of Variance
Equal Variances

• One-Way Analysis of Variance is used to
compare more than two groups on the basis
of their averages.
One-Way Analysis of Variance Assumptions

• The data are continuous.

• The data follow the Normal distribution; each group is
normally distributed.

• The variances of the populations are equal.

• The groups are independent.

• Each group is a simple random sample from its
population.
Hypotheses and Formulas
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k \]

\(H_A\): At least one pair of means is significantly different.

\[ F = \frac{MSG}{MSE} \]

MSG is the Mean Square of Groups and MSE is the Mean Square
Error.
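
The F ratio can be built up from the between-group and within-group sums of squares. A minimal sketch with the Python standard library follows. It assumes MSG = SSG / (k - 1) and MSE = SSE / (N - k), the usual definitions. The groups and the name `one_way_anova_f` are hypothetical.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, (df_between, df_within)) for k independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares (SSG).
    ssg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (error) sum of squares (SSE).
    sse = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    msg = ssg / (k - 1)
    mse = sse / (n_total - k)
    return msg / mse, (k - 1, n_total - k)


# Hypothetical popularity ratings for three age groups.
ratings = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
f_stat, dfs = one_way_anova_f(ratings)
print(f"F = {f_stat:.4f}, df = {dfs}")
```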
Example
• This is a hypothetical data file that concerns the
popularity of a TV channel. Using a prototype, the
marketing team has collected focus group data. One
of the questions of interest is whether the
popularity of the TV channel differs across age groups.
• This hypothesis can be tested using One-Way ANOVA.
SPSS Analytic Procedure
One-Way Analysis of Variance
Unequal Variances

• Welch ANOVA is used to compare more than two
groups on the basis of their averages. This test does not
require homogeneity of variances.
Welch Analysis of Variance Assumptions
• The data are continuous.

• The data follow the Normal distribution; each group is
normally distributed.

• The groups are independent.

• Each group is a simple random sample from its population.


Hypotheses and Formulas
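\[ H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k \]

\(H_A\): At least one pair of means is significantly different.

\[ F = \frac{\dfrac{1}{k - 1} \displaystyle\sum_{i=1}^{k} \frac{n_i}{s_i^2} \left( \bar{X}_{i.} - \bar{X}_{..} \right)^2}{1 + \dfrac{2(k - 2)}{k^2 - 1} \displaystyle\sum_{i=1}^{k} \left( 1 - \frac{n_i / s_i^2}{\sum_{i=1}^{k} n_i / s_i^2} \right)^2 \Big/ (n_i - 1)} \]

With
\[ df = \left[ \frac{3}{k^2 - 1} \sum_{i=1}^{k} \left( 1 - \frac{n_i / s_i^2}{\sum_{i=1}^{k} n_i / s_i^2} \right)^2 \Big/ (n_i - 1) \right]^{-1} \]

where \(\bar{X}_{..}\) is the weighted grand mean \(\sum_i (n_i / s_i^2) \bar{X}_{i.} \big/ \sum_i (n_i / s_i^2)\), and the numerator degrees of freedom are \(k - 1\).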
Case Study
• A sales manager evaluates two new training courses.
• Sixty employees, divided into three groups, all
receive standard training. In addition, group 2
receives technical training, and group 3 receives a
hands-on tutorial. Each employee was tested at the
end of the training course and their score recorded.
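
For a three-group design like this one, Welch's F and its denominator degrees of freedom can be sketched as below (Python standard library only). The scores are hypothetical, not the case-study data, and `welch_anova` is an illustrative name. Each group is weighted by n_i / s_i^2, and the grand mean is the weighted mean of the group means.

```python
import statistics


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's unequal-variance one-way ANOVA."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [statistics.mean(g) for g in groups]
    # Weight each group by n_i / s_i^2.
    weights = [len(g) / statistics.variance(g) for g in groups]
    w_total = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / w_total
    numerator = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    # Shared sum that appears in both the F denominator and df2.
    lam = sum(
        (1 - w / w_total) ** 2 / (n_i - 1) for w, n_i in zip(weights, sizes)
    )
    f_stat = numerator / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Hypothetical test scores for three training groups.
scores = [[4.0, 5.0, 6.0], [5.0, 7.0, 9.0], [8.0, 9.0, 10.0]]
f_stat, df1, df2 = welch_anova(scores)
print(f"F = {f_stat:.4f}, df1 = {df1}, df2 = {df2:.4f}")
```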
SPSS Analytic Procedure
Kruskal-Wallis Test

• The Kruskal-Wallis H-test is used to compare more
than two groups on the basis of their medians.
Kruskal-Wallis Test Assumptions
• The variable of interest is continuous. The measurement
scale is at least ordinal.

• The probability distributions of the populations are
identical, except for location.

• The groups are independent.

• All groups are simple random samples from their
respective populations.
Hypotheses and Formulas

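\[ H_0: \tilde{\mu}_1 = \tilde{\mu}_2 = \dots = \tilde{\mu}_k \]

\(H_A\): At least one pair of medians is significantly different.

\[ H = \frac{12}{N(N + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N + 1) \]

where \(R_i\) is the sum of the ranks in group i, \(n_i\) is the size of group i, and N is the total sample size.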
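
The H statistic can be computed directly from the pooled ranks. The sketch below uses only the Python standard library and applies no tie correction. The survival times and function names are hypothetical. Under the null hypothesis, H is usually compared against a chi-square distribution with k - 1 degrees of freedom, which is not computed here.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])  # rank sum of this group
        total += r_i ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical survival times for three tumor-size categories.
times = [[1.2, 2.5, 3.1], [4.0, 5.3, 6.8], [7.7, 8.1, 9.4]]
h_stat = kruskal_wallis_h(times)
print(f"H = {h_stat:.4f}")
```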
Case Study
A health scientist wishes to compare
survival experiences after breast cancer across
different pathological tumor size categories.

We can use the Kruskal-Wallis H-Test to determine
whether or not the median survival time of
the patients differs significantly across the
pathological tumor size categories.
SPSS Analytic Procedure
Model Building Techniques
