
31-Jan-20

Two-tail Study


Hypothesis Testing: Introduction


• Hypothesis Testing is a method for testing a claim or
hypothesis about a parameter in a population, using data
measured in a sample.
– In other words, it is a systematic way to test claims or ideas about a
group or population.

• Level of significance refers to a criterion of judgment
upon which a decision is made regarding the value stated in
a null hypothesis.
– The criterion is based on the probability of obtaining a statistic
measured in a sample if the value stated in the null hypothesis were
true.


Hypothesis Testing: Calculation


• Test statistic is a mathematical formula that allows researchers
to determine the likelihood of obtaining sample outcomes if the
null hypothesis were true. The value of the test statistic is used to
make a decision regarding the null hypothesis.

• A p value is the probability of obtaining a sample outcome
at least as extreme as the one observed, given that the value
stated in the null hypothesis is true.
– The p value for the sample outcome is compared to the level of
significance.
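
The comparison of a p value with the level of significance can be sketched as follows. This is a minimal illustration, assuming a two-tailed one-sample z test with a known population standard deviation; the function name and all numbers are hypothetical and do not come from the slides.

```python
import math

def two_tailed_p_value(sample_mean, mu0, sigma, n):
    """Return (z, p) for a two-tailed one-sample z test (sigma assumed known)."""
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # P(|Z| >= |z|) for a standard normal Z, written with the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

alpha = 0.05  # level of significance (illustrative choice)
z, p = two_tailed_p_value(sample_mean=52.0, mu0=50.0, sigma=8.0, n=64)
decision = "reject H0" if p <= alpha else "retain H0"
print(f"z = {z:.2f}, p = {p:.4f}, decision: {decision}")
```

With these made-up inputs the sample mean lies two standard errors above the hypothesized mean, so the p value falls just below 0.05 and the null hypothesis is rejected.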


Decision about Hypothesis


• Reject the null hypothesis
– The sample mean is associated with a low probability of
occurrence when the null hypothesis is true.

• Do not reject (Retain) the null hypothesis
– The sample mean is associated with a high probability of
occurrence when the null hypothesis is true.

• Note:
– A null hypothesis may be rejected, but it can never be accepted
based on a single test.
– In classical hypothesis testing, there is no way to determine whether
the null hypothesis is true.

Choose a Level of Significance


• Type I Error: false positives
– Type I error occurs when the sample results lead to the rejection of
null hypothesis when it is in fact true.
– Probability of type I error (α) is also called level of significance.

• Type II Error: false negatives
– Type II error occurs when, based on the sample results, null
hypothesis is not rejected when it is in fact false.
– Probability of type II error is denoted by β.
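
To make the relation between α and β concrete, the sketch below computes β for one assumed scenario: an upper-tailed one-sample z test with known standard deviation and a specific true mean. The scenario, the function name and the numbers are assumptions for illustration only.

```python
import math
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """beta for an upper-tailed one-sample z test of H0: mu = mu0."""
    se = sigma / math.sqrt(n)
    # H0 is rejected when the sample mean exceeds this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(sample mean <= cutoff) when the true mean is mu_true.
    return NormalDist(mu_true, se).cdf(cutoff)

beta = type_ii_error(mu0=50.0, mu_true=53.0, sigma=8.0, n=64, alpha=0.05)
print(f"beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Lowering α moves the cutoff further from the hypothesized mean, which raises β for the same sample size; this is the trade-off behind the choice of a level of significance.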


A Broad Classification of Hypothesis Tests

• Hypothesis Tests
– Tests of Association
– Tests of Differences: Distributions, Means, Proportions, Median/Rankings


Frequency Distribution

• In a frequency distribution, one variable is considered at a
time.
– A frequency distribution for a variable produces a table of
frequency counts, percentages, & cumulative percentages for all
values associated with that variable.

• Statistics Associated with Frequency Distribution
– Measures of Location
– Measures of Variability
– Measures of Shape
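
The frequency table described above (counts, percentages and cumulative percentages for one variable) could be built as in this sketch. The ratings are hypothetical data, and `frequency_table` is an illustrative helper rather than a function from any package.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, percent, cumulative percent), sorted by value."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        pct = 100.0 * counts[value] / n
        cumulative += pct
        rows.append((value, counts[value], pct, cumulative))
    return rows

ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # hypothetical 5-point ratings
table = frequency_table(ratings)
for value, count, pct, cum in table:
    print(f"{value:>5} {count:>5} {pct:>7.1f} {cum:>7.1f}")
```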


Measures of Location
• Mean
– Most commonly used measure of central tendency.
– Used when data is in interval or ratio scale.
• Median
– Middle value when data are arranged in ascending or descending
order. It is the 50th percentile.
– Used when data are on an ordinal scale; also applicable to interval or ratio scales.
• Mode
– The value that occurs most frequently & represents the highest
peak of the distribution.
– Mode is a good measure of location when the variable is inherently
categorical or has otherwise been grouped into categories.
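
As a quick illustration of the three measures of location, the sketch below uses Python's standard `statistics` module on hypothetical data. The single large score pulls the mean above the median, and the mode is also shown for a categorical variable.

```python
import statistics

scores = [1, 2, 2, 3, 4, 4, 4, 12]       # hypothetical interval-scale scores
brands = ["A", "B", "B", "C", "B", "A"]  # hypothetical categorical variable

mean = statistics.mean(scores)           # arithmetic average
median = statistics.median(scores)       # 50th percentile
mode = statistics.mode(scores)           # most frequent value
brand_mode = statistics.mode(brands)     # mode also works for categories

print(mean, median, mode, brand_mode)
```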


Measures of Variability
• Variability is a measure of the dispersion or spread of
scores in a distribution.
– Variability ranges from 0 to ∞.
• Range
• Interquartile Range
• Variance
– Mean squared deviation from the mean. The variance can never be
negative.
• Standard Deviation
– Square root of the variance.
• Coefficient of variation
– Ratio of SD to the mean expressed as a percentage & is a unitless
measure of relative variability.
– Can be used with ratio scale only.
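
The measures listed above can be computed as in this sketch on hypothetical ratio-scale data. It assumes the population form of the variance (mean squared deviation, dividing by n), which matches the definition given here; sample formulas divide by n − 1 instead. Quartile conventions also differ between packages, so the interquartile range may not match other software exactly.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical ratio-scale data

value_range = max(scores) - min(scores)
# Quartiles using the module's default ("exclusive") method.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1
variance = statistics.pvariance(scores)  # mean squared deviation from the mean
sd = statistics.pstdev(scores)           # square root of the variance
cv = 100 * sd / statistics.mean(scores)  # SD as a percentage of the mean

print(value_range, iqr, variance, sd, cv)
```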

Measures of Shape: Skewness


• Skewness: A skewed distribution is a distribution of scores
that includes outliers or scores that fall substantially above
or below most other scores in a data set.
– Tendency of deviations from mean to be larger in one direction than
in the other. It can be thought of as tendency for one tail of the
distribution to be heavier than other.

[Figure: a symmetric distribution, in which mean, median and mode coincide, next to a skewed distribution, in which mean, median and mode fall at different points.]

Measures of Shape: Skewness of distribution

• A positively skewed distribution is a distribution of scores
where a few outliers are substantially larger (toward the right
tail in a graph) than most other scores.

• A negatively skewed distribution is a distribution of scores
where a few outliers are substantially smaller (toward the left
tail in a graph) than most other scores.
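
The sign of skewness can be checked numerically. The sketch below assumes the moment-based coefficient (third central moment divided by the cubed standard deviation), which is one common definition; the slides do not state a formula, and the data are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]    # one large outlier
left_tailed = [-x for x in right_tailed]    # mirror image: one small outlier
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed), skewness(left_tailed), skewness(symmetric))
```

The large outlier gives a positive value, its mirror image a negative value, and the symmetric data a value of zero.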


Measures of Shape: Kurtosis


• Kurtosis
– Measure of the relative peakedness or flatness of the
curve defined by the frequency distribution.

• Kurtosis of a normal distribution is zero when it is measured
as excess kurtosis (kurtosis minus 3).
• If kurtosis is positive, the distribution is more peaked than a
normal distribution.
• A negative value means that the distribution is flatter than a
normal distribution.
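
A small numeric check of the sign convention, assuming the moment-based excess kurtosis (fourth central moment over the squared variance, minus 3). The formula choice and both data sets are illustrative assumptions.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3.0

flat = [1, 2, 3, 4, 5, 6]             # evenly spread scores
peaked = [0, 5, 5, 5, 5, 5, 5, 10]    # most scores at the centre, two far out

print(excess_kurtosis(flat), excess_kurtosis(peaked))
```

The evenly spread data yield a negative value and the centre-heavy data a positive one, in line with the interpretation above.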

2/8/2020

Business Research Method

Prof. Ravi Shekhar Kumar

XLRI- Xavier School of Management, Jamshedpur


ravishekhar@xlri.ac.in

Session-7

Cross-Tabulation

• While a frequency distribution describes one variable at a time, a
cross-tabulation describes two or more variables simultaneously.

• General rule is to compute percentages in the direction of the
independent variable, across the dependent variable.
• First table is more acceptable than second.
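
The percentage rule can be sketched as follows: within each level of the independent variable, the dependent-variable categories are expressed as percentages that sum to 100. The variables (gender as independent, usage as dependent) and the counts are hypothetical, and `column_percentages` is an illustrative helper.

```python
from collections import Counter

# Hypothetical respondents as (independent variable, dependent variable) pairs.
responses = (
    [("male", "light")] * 10 + [("male", "heavy")] * 30
    + [("female", "light")] * 25 + [("female", "heavy")] * 15
)

def column_percentages(pairs):
    """Percent of each dependent category within each level of the independent variable."""
    cell_counts = Counter(pairs)
    group_totals = Counter(ind for ind, _ in pairs)
    return {
        (ind, dep): 100.0 * count / group_totals[ind]
        for (ind, dep), count in cell_counts.items()
    }

pct = column_percentages(responses)
for (ind, dep), value in sorted(pct.items()):
    print(f"{ind:>6} {dep:>5} {value:6.1f}%")
```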

Statistics Associated with Cross-Tabulation


• Chi-Square Test for independence is a statistical
procedure to determine whether frequencies observed at
the combination of levels of two categorical variables are
similar to frequencies expected if the variables were independent.
– To determine whether a systematic association exists, the probability of
obtaining a value of chi-square as large as or larger than the one
calculated from the cross-tabulation is estimated.
– Null hypothesis (H0) of NO association between two variables will
be rejected only when the calculated value of the test statistic is greater
than the critical value of the chi-square distribution with the appropriate
degrees of freedom.
– An important characteristic of the chi-square statistic is the degrees of
freedom (df) associated with it: df = (r − 1) × (c − 1), where r is the
number of rows and c the number of columns.
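
The calculation can be sketched in plain Python as below. The observed counts are hypothetical, `chi_square_independence` is an illustrative helper, and the critical value 3.841 is the standard chi-square cutoff for df = 1 at the 0.05 level; other table sizes or significance levels need a different critical value.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under H0 of no association.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df

observed = [[10, 30], [25, 15]]   # hypothetical 2 x 2 cross-tabulation
chi2, df = chi_square_independence(observed)
CRITICAL_05_DF1 = 3.841           # chi-square critical value, df = 1, alpha = 0.05
reject_h0 = chi2 > CRITICAL_05_DF1
print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```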


Strength of Association in Cross-Tabulation


• phi coefficient is used as a measure of strength of
association in special case of a table with two rows & two
columns (a 2 x 2 table).

\[ \phi = \sqrt{\frac{\chi^2}{n}} \]
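
A one-line computation of the phi coefficient, as a sketch: the chi-square value and sample size below are hypothetical inputs standing in for the result of a 2 x 2 cross-tabulation.

```python
import math

def phi_coefficient(chi2, n):
    """Phi for a 2 x 2 table: square root of chi-square divided by sample size."""
    return math.sqrt(chi2 / n)

phi = phi_coefficient(chi2=12.8, n=80)  # hypothetical values
print(f"phi = {phi:.2f}")
```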

Strength of Association in Cross-Tabulation


• While phi coefficient is specific to a 2 x 2 table,
contingency coefficient (C) can be used to assess strength
of association in a table of any size. It also applies to
square tables.
\[ C = \sqrt{\frac{\chi^2}{\chi^2 + n}} \]

• Contingency coefficient varies between 0 & 1.


• Maximum value of contingency coefficient depends on size
of table (number of rows & number of columns). For this
reason, it should be used only to compare tables of same
size.
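
The dependence of the maximum on table size can be illustrated as follows. The bound sqrt((k − 1)/k) for a k x k table is a standard result that the slides do not state, so treat it as an added assumption; the chi-square value and sample size are hypothetical.

```python
import math

def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (chi2 + n)) for a table of any size."""
    return math.sqrt(chi2 / (chi2 + n))

def max_contingency_coefficient(k):
    """Upper bound of C for a square k x k table: sqrt((k - 1) / k)."""
    return math.sqrt((k - 1) / k)

c = contingency_coefficient(chi2=20.0, n=80)  # hypothetical values
print(f"C = {c:.3f}")
for k in (2, 3, 4):
    print(f"max C for a {k} x {k} table: {max_contingency_coefficient(k):.3f}")
```

Because the upper bound grows with k, the same value of C indicates a relatively stronger association in a 2 x 2 table than in a 4 x 4 table.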