QUANTITATIVE TECHNIQUES - II
Dr. Pritha Guha
Problem
• X: the number of defects in pages of a book.
• A random sample of size n = 60 pages is taken.
• The numbers of defects $X_1, X_2, \dots, X_{60}$ are classified as follows:

Number of observed defects (i)    Frequency ($O_i$)
0                                 32
1                                 15
2                                 9
3                                 4

• Can this data be modelled by a Poisson distribution?
Solution:
• Need to estimate $\lambda$: $\hat{\lambda} = \frac{1}{60}(0 \times 32 + 1 \times 15 + 2 \times 9 + 3 \times 4) = 0.75$
• $H_0$: The data follow a Poisson(0.75) distribution; $H_1$: The data do not follow this distribution

Number of observed defects (i)    Frequency ($O_i$)    $E_i$
0                                 32                   $60 \times \frac{e^{-0.75} \times 0.75^0}{0!} = 28.34199$
1                                 15                   $60 \times \frac{e^{-0.75} \times 0.75^1}{1!} = 21.25649$
2                                 9                    $60 \times \frac{e^{-0.75} \times 0.75^2}{2!} = 7.971186$
≥3                                4                    $60 - 28.34199 - 21.25649 - 7.971186 = 2.430334$
Total                             60                   60
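The estimate of $\lambda$ and the expected frequencies in the table above can be reproduced with a short R sketch. The variable names (`defects`, `observed`, `lambda_hat`, `expected`) are illustrative and not from the slides; the last class (≥3) is assigned the remaining frequency, as in the table.

```r
# Observed frequencies for 0, 1, 2 and 3 defects per page (from the table above)
defects  <- 0:3
observed <- c(32, 15, 9, 4)
n <- sum(observed)                          # 60 pages

# Estimate lambda by the sample mean number of defects per page
lambda_hat <- sum(defects * observed) / n

# Expected frequencies for 0, 1, 2 defects under Poisson(lambda_hat);
# the open-ended class ">= 3" takes whatever frequency remains
expected <- n * dpois(0:2, lambda_hat)
expected <- c(expected, n - sum(expected))

print(lambda_hat)
print(round(expected, 5))
```

Small differences in the last decimal places relative to the slide can arise because the slide subtracts rounded values.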
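Solution:
• No expected frequency should be less than 5, so combine the last two classes; then all $E_i > 5$.
• $d = k - m - 1 = 3 - 1 - 1 = 1$, where k is the number of classes and m is the number of estimated parameters.
• Suppose $\alpha = 0.05$.
• $\chi^2_{obs} = \frac{(32 - 28.34199)^2}{28.34199} + \frac{(15 - 21.25649)^2}{21.25649} + \frac{(13 - 10.40152)^2}{10.40152} = 2.962789$
• $\chi^2_{d;\alpha} = \chi^2_{1;0.05} = 3.841459$
• Conclusion: since 2.962789 < 3.841459, $H_0$ is not rejected at the 5% level; the data are consistent with a Poisson(0.75) distribution.

Number of observed defects (i)    Frequency ($O_i$)    $E_i$
0                                 32                   28.34199
1                                 15                   21.25649
≥2                                13                   10.40152
Total                             60                   60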
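As a check on the pooled-class calculation above, the following R sketch recomputes the statistic, the critical value and a p-value. The variable names are illustrative assumptions; the counts, $\hat{\lambda} = 0.75$ and $\alpha = 0.05$ are taken from the slides.

```r
# Observed frequencies after pooling: classes 0, 1 and ">= 2" (9 + 4 = 13)
observed   <- c(32, 15, 13)
n          <- sum(observed)
lambda_hat <- 0.75

# Class probabilities under Poisson(0.75); the last class gets the remainder
p <- dpois(0:1, lambda_hat)
p <- c(p, 1 - sum(p))
expected <- n * p

# Chi-square goodness-of-fit statistic
chi_obs <- sum((observed - expected)^2 / expected)

# Degrees of freedom: k classes, m = 1 estimated parameter
df <- length(observed) - 1 - 1

alpha   <- 0.05
crit    <- qchisq(1 - alpha, df)                  # upper-tail critical value
p_value <- pchisq(chi_obs, df, lower.tail = FALSE)
reject  <- chi_obs > crit

print(c(chi_obs = chi_obs, crit = crit, p_value = p_value))
print(reject)
```

The statistic should agree with the slide's 2.962789 to about four decimal places; the slide appears to work from rounded expected frequencies.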
Test for Goodness of Fit for a Normal Distribution
• Example from Text (Pg. 509)
• Data: X: Mileage of 50 cars (see Text, Table 1.7, Pg. 15 and data set GasMiles.csv)
• Does this data set come from a normal distribution?
Class                    $O_i$    $E_i$
(29.09492, 30.88864]     12       10
(30.88864, 31.35790]     6        10
(31.35790, 31.76210]     12       10
(31.76210, 32.23136]     9        10
(32.23136, 34.02508]     11       10

$\chi^2_{obs} = \frac{(12-10)^2}{10} + \frac{(6-10)^2}{10} + \frac{(12-10)^2}{10} + \frac{(9-10)^2}{10} + \frac{(11-10)^2}{10} = 2.6$
$d = k - m - 1 = 5 - 2 - 1 = 2$
In R
qqnorm(Mileage, ylab = "Mileage", col = "dark green", pch = 19)
qqline(Mileage, col = "red")
shapiro.test(Mileage)
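The chi-square calculation for the five mileage classes can be sketched in R from the class counts alone, without GasMiles.csv. The slides do not state a significance level for this example, so `alpha <- 0.05` below is an assumption, as are the variable names. The equal expected counts reflect that each class in the table has $E_i = 10$.

```r
# Observed class frequencies from the mileage table (n = 50 cars)
observed <- c(12, 6, 12, 9, 11)
n <- sum(observed)
k <- length(observed)

# Each class has expected frequency n / k = 10
expected <- rep(n / k, k)

# Chi-square goodness-of-fit statistic
chi_obs <- sum((observed - expected)^2 / expected)

# Two parameters (mean and standard deviation) are estimated, so m = 2
df <- k - 2 - 1

alpha <- 0.05                      # assumed level, not given in the slides
crit  <- qchisq(1 - alpha, df)
reject <- chi_obs > crit

print(c(chi_obs = chi_obs, df = df, crit = crit))
print(reject)
```

Under the assumed 5% level the statistic falls below the critical value, so normality would not be rejected by this test.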
TEST OF INDEPENDENCE
Problem
• A recent study of gender preferences among car shoppers found that men and women equally
favour automatic transmission cars.
• A marketing analyst doubts these results. He believes that a person's gender influences
whether he or she purchases an automatic transmission car.
• He collects data on 400 recent car purchases, cross-classified by gender and by transmission type of the car (automatic versus manual).
• The results are as follows:

          Automatic Cars    Manual Cars
Female    50                60
Male      120               170
• Does the sample data support the marketing analyst’s claim?
Contingency Table/ Cross Tabs

                Automatic Cars    Manual Cars    Row Total
Female          50                60             110
Male            120               170            290
Column Total    170               230            400
Chi-square Test for Independence: Set up
Null and alternative hypotheses:
• $H_0$: The two classifications are statistically independent
• $H_1$: The two classifications are statistically dependent

Computation:
• r = total number of rows, c = total number of columns
• $O_{ij}$ = cell frequency corresponding to row i and column j
• $r_i$ = row total for row i
• $c_j$ = column total for column j
• $\hat{E}_{ij} = \frac{r_i \times c_j}{n}$
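The expected cell frequencies can be computed for a whole table at once as the outer product of the row and column totals divided by n. The R sketch below applies this to the car-purchase counts from the problem; the object names (`observed`, `expected`) and the dimension labels are illustrative.

```r
# Observed counts from the car-purchase problem
observed <- matrix(c(50, 60,
                     120, 170),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Gender = c("Female", "Male"),
                                   Transmission = c("Automatic", "Manual")))

n <- sum(observed)                 # 400 purchases

# E_ij = (row total i) * (column total j) / n for every cell
expected <- outer(rowSums(observed), colSums(observed)) / n

print(expected)
```

The expected frequencies always reproduce the observed row and column totals, which is a quick sanity check on the arithmetic.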
Chi-square Test for Independence: Test Statistic
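
• Test statistic: $\chi^2_o = \sum_{\text{all cells}} \frac{(O_{ij} - \hat{E}_{ij})^2}{\hat{E}_{ij}}$
• Under $H_0$, $\chi^2_o \sim \chi^2_d$ (approximately), where $d = (r-1)(c-1)$
• Reject $H_0$ at level of significance $\alpha$ if
  • $\chi^2_o > \chi^2_{d;\alpha}$, or equivalently
  • p-value < $\alpha$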
Problem Continued: Null and Alternative
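• Does the sample data support the marketing analyst's claim? Consider a level of significance of 10%.

                Automatic Cars    Manual Cars    Row Total
Female          50                60             110
Male            120               170            290
Column Total    170               230            400

$H_0$: Gender and transmission type are statistically independent
$H_1$: Gender and transmission type are statistically dependent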
Problem Continued: Computation of 𝐸෠𝑖𝑗
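
• Female, Automatic Cars: $\hat{E}_{11} = \frac{r_1 c_1}{n} = \frac{110 \times 170}{400} = 46.75$
• Female, Manual Cars: $\hat{E}_{12} = \frac{r_1 c_2}{n} = \frac{110 \times 230}{400} = 63.25$
• Male, Automatic Cars: $\hat{E}_{21} = \frac{r_2 c_1}{n} = \frac{290 \times 170}{400} = 123.25$
• Male, Manual Cars: $\hat{E}_{22} = \frac{r_2 c_2}{n} = \frac{290 \times 230}{400} = 166.75$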
Problem Continued: Test Statistic
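Expected frequencies $\hat{E}_{ij}$:

          Automatic Cars    Manual Cars
Female    46.75             63.25
Male      123.25            166.75

• $\chi^2_o = \frac{(50 - 46.75)^2}{46.75} + \frac{(60 - 63.25)^2}{63.25} + \frac{(120 - 123.25)^2}{123.25} + \frac{(170 - 166.75)^2}{166.75} \approx 0.542$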
Problem Continued: Conclusion
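• $d = (r-1)(c-1) = (2-1)(2-1) = 1$
• $\alpha = 0.10$
• Since $\chi^2_o \approx 0.542 < \chi^2_{1;0.10} = 2.705543$, $H_0$ is not rejected at the 10% level; the sample does not support the analyst's claim that gender influences the choice of transmission.

In R
Con1 = table(Gender_Car$Gender, Gender_Car$CarTransmission)
chisq.test(Con1, correct = FALSE)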
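The `Gender_Car` data frame is not reproduced in the slides, so the following sketch builds the contingency table directly from the four cell counts and runs the same test. The construction of `Con1` from a matrix and the helper names `alpha`, `crit` and `reject` are assumptions made for illustration.

```r
# Contingency table built from the reported counts (stands in for table(...))
Con1 <- matrix(c(50, 60,
                 120, 170),
               nrow = 2, byrow = TRUE,
               dimnames = list(Gender = c("Female", "Male"),
                               Transmission = c("Automatic", "Manual")))

# Pearson chi-square test of independence without continuity correction
res <- chisq.test(Con1, correct = FALSE)

stat <- as.numeric(res$statistic)   # observed chi-square statistic
df   <- as.numeric(res$parameter)   # (r - 1)(c - 1)

alpha  <- 0.10                      # level of significance from the problem
crit   <- qchisq(1 - alpha, df)
reject <- stat > crit

print(res)
print(c(stat = stat, df = df, crit = crit))
print(reject)
```

Setting `correct = FALSE` turns off the Yates continuity correction that `chisq.test` applies to 2 x 2 tables by default, so the result matches the hand calculation.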
NON-PARAMETRIC TESTS
Why are non-parametric tests required?

• Comparing the means of two populations is often required.
• The comparison is straightforward if we assume that the samples come from normally distributed populations.
• If the sample size is large, we can claim that the sampling distribution of the sample mean is approximately normal, by the CLT.
• In some cases, the data are not normal and the sample size is too small to invoke the CLT.
• Thus, we require an alternative approach that does not depend on any assumption about the distribution of the data.
One-Sample Rank-Based Test: Wilcoxon Signed-Rank Test
(Small Sample)
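• Null hypothesis: $H_0: \tilde{\mu} = \tilde{\mu}_0$
• Alternative hypothesis: $H_1: \tilde{\mu} < \tilde{\mu}_0$, or $H_1: \tilde{\mu} > \tilde{\mu}_0$, or $H_1: \tilde{\mu} \neq \tilde{\mu}_0$
• Define: $d_i = X_i - \tilde{\mu}_0$
• Test statistic: $T^+$ = sum of the ranks of $|d_i|$ corresponding to positive values of $d_i$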
Problem
• A producer of breakfast cereals wants to verify that a filler machine is operating
correctly. The machine is supposed to fill one-pound boxes with 460 g, on average.
This is a little above the 453.6 g needed for one pound. When the contents are
weighed, it is found that 15 boxes yield the following measurements:
454.4, 470.8, 447.5, 453.2, 462.6, 445.0, 455.9, 458.2, 461.6, 457.3, 452.0, 464.3,
459.2, 453.5, 465.8
Does the data provide convincing statistical evidence that the true mean weight
differs from 460 g?
Problem Continued:
$H_0: \tilde{\mu} = 460$
$H_1: \tilde{\mu} \neq 460$
Test statistic: $T^+ = 13 + 4 + 2 + 7 + 9 = 35$

Weight of boxes    $d_i$     $|d_i|$    Rank ($|d_i|$)
454.4              -5.6      5.6        8
470.8              10.8      10.8       13
447.5              -12.5     12.5       14
453.2              -6.8      6.8        11
462.6              2.6       2.6        4
445.0              -15.0     15.0       15
455.9              -4.1      4.1        6
458.2              -1.8      1.8        3
461.6              1.6       1.6        2
457.3              -2.7      2.7        5
452.0              -8.0      8.0        12
464.3              4.3       4.3        7
459.2              -0.8      0.8        1
453.5              -6.5      6.5        10
465.8              5.8       5.8        9
In R
# a: numeric vector holding the 15 box weights
wilcox.test(a, mu = 460, alternative = "two.sided")
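The signed-rank statistic for the cereal-box data can be computed by hand in R and compared with the value reported by `wilcox.test`. The vector name `a` follows the slide's call; `d`, `r`, `t_plus` and `res` are illustrative names.

```r
# Weights of the 15 boxes, in grams (from the problem statement)
a <- c(454.4, 470.8, 447.5, 453.2, 462.6, 445.0, 455.9, 458.2,
       461.6, 457.3, 452.0, 464.3, 459.2, 453.5, 465.8)

# Differences from the hypothesised value and ranks of their absolute values
d <- a - 460
r <- rank(abs(d))

# T+ is the sum of the ranks attached to positive differences
t_plus <- sum(r[d > 0])

# Built-in test; its statistic V is the same sum of positive ranks
res <- wilcox.test(a, mu = 460, alternative = "two.sided")

print(t_plus)
print(res)
```

The p-value is well above 0.05, so at the 5% level the data do not give convincing evidence that the fill weight differs from 460 g.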
