
HARAMAYA UNIVERSITY

AFRICA CENTER OF EXCELLENCE FOR CLIMATE-SMART


AGRICULTURE AND BIODIVERSITY CONSERVATION

MASTER OF CLIMATE-SMART AGRICULTURE

COURSE NAME: STATISTICAL METHODS FOR CLIMATE RESEARCH


(Csag-5312)

GROUP ASSIGNMENT
BY
Name ID No
Samuel Lemma PGP/1062/14
Selam Kassahun PGP/1060/14
DEMISSE AMBAW PGP/1063/14
Siraj Shekmohammed PGP/1066/14

Submitted To: Kefelegn Kebede (PhD)

September 2022
Haramaya University, Haramaya
PART ONE
1) Data: Twins.JMP (Cochran, p. 198) In 193 pairs of Swedish twins (2), 56 were of type MM
(both male), 72 of type MF (one male, one female), and 65 of type FF. On the hypothesis that a
twin is equally likely to be a boy or a girl and that the sexes of the two members of a twin pair
are determined independently, the probabilities of MM, MF, & FF pairs are 1/4, 1/2, & 1/4,
respectively. Compute the value of chi-square and the significance probability.

Solution 1
Distributions
Twins

ff mf mm

Frequencies

Level Count Prob


ff 65 0.33679
mf 72 0.37306
mm 56 0.29016
Total 193 1.00000

N Missing 0
3 Levels Test Probabilities

Level Estim Prob Hypoth Prob


ff 0.33679 0.333333333
mf 0.37306 0.333333333
mm 0.29016 0.333333333
Test ChiSquare DF Prob>Chisq
Likelihood Ratio 2.0156 2 0.3650
Pearson 2.0000 2 0.3679
Hypotheses:
• H0: p1 = p2 = p3 = 1/3 (the three twin types are equally likely)
• HA: at least one pi differs from 1/3

Conclusion
• Since our Pearson Prob>Chisq value (i.e., 0.3679) is greater than α = 0.05, we fail to reject H0 &
conclude that there is no evidence that the three twin types differ from equal proportions.

Solution 2

Distributions

Gender

MM FM FF

Frequencies

Level Count Prob

MM 56 0.29016

FM 72 0.37306
FF 65 0.33679

Total 193 1.00000

N Missing0
3 Levels
Test Probabilities
Level Estim Prob Hypoth Prob
MM 0.29016 0.25000
FM 0.37306 0.50000
FF 0.33679 0.25000

Test ChiSquare DF Prob>Chisq


Likelihood Ratio 13.2477 2 0.0013*
Pearson 13.2798 2 0.0013*

Hypothesis:
• H0: p(MM) = 0.25, p(MF) = 0.5, p(FF) = 0.25 (the data came from a population having a
1/4:1/2:1/4 ratio of MM, MF, and FF pairs);
• HA: the data came from a population not having a 1/4:1/2:1/4 ratio of MM, MF, and FF pairs.

Conclusion
Since our Prob>Chisq value (i.e., 0.0013*) is less than α = 0.05, we reject H0 & conclude that
the data came from a population not having a 1/4:1/2:1/4 ratio of MM, MF, and FF pairs.
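The Pearson and likelihood-ratio statistics in the Test Probabilities output can be cross-checked outside JMP. The block below is a minimal sketch in plain Python, not the JMP procedure itself; the function names `chi2_sf` and `gof_test` are illustrative choices, and the p-value uses the closed-form chi-square tail for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j<df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over odd double factorials
    total = 0.0
    term = math.sqrt(2.0 * x / math.pi)
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def gof_test(observed, hypothesized):
    """Pearson and likelihood-ratio goodness-of-fit tests."""
    n = sum(observed)
    expected = [n * p for p in hypothesized]
    pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    lr = 2.0 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)
    df = len(observed) - 1
    return pearson, lr, df, chi2_sf(pearson, df), chi2_sf(lr, df)

# Twin data: MM, MF, FF counts against the 1/4 : 1/2 : 1/4 hypothesis
pearson, lr, df, p_pearson, p_lr = gof_test([56, 72, 65], [0.25, 0.5, 0.25])
print(f"Pearson = {pearson:.4f}, LR = {lr:.4f}, df = {df}")
print(f"p (Pearson) = {p_pearson:.4f}, p (LR) = {p_lr:.4f}")
```

Passing `[1/3, 1/3, 1/3]` as the hypothesized probabilities gives the equal-proportion test of Solution 1 instead.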

2) Flowers. jmp
In an experiment of yellow vs. green flowers, we expect a 3:1 ratio. We can test if what we
observe is statistically different from 3:1. Let's say we raise 100 flowers & observe the following
colors:
Hypothesis
H0: yellow = 0.75; green = 0.25 (the data came from a population having a 3:1 ratio of yellow
to green flowers);
HA: the data came from a population not having a 3:1 ratio of yellow to green flowers

Solution
Distributions
Seed color

Green Yellow

Frequencies

Level Count Prob

Green 16 0.16000

Yellow 84 0.84000
Total 100 1.00000

N Missing0
2 Levels
Test Probabilities

Level Estim Prob Hypoth Prob


Green 0.16000 0.25000
Yellow 0.84000 0.75000

Test ChiSquare DF Prob>Chisq


Likelihood Ratio 4.7580 1 0.0292*
Pearson 4.3200 1 0.0377*

Decision
Since our Pearson Prob>Chisq value (i.e., 0.0377) is less than α = 0.05, we reject H0 & conclude that the
data came from a population not having a 3:1 ratio of yellow to green flowers.

3) Data: Seed-Color1.JMP It is hypothesized that the ratio of yellow-smooth to yellow-wrinkled
to green-smooth to green-wrinkled seeds is 9:3:3:1. Using the data below, do the four seed
phenotypes appear to follow the hypothesized 9:3:3:1 ratio?

Ho: yellow-smooth = 9/16 (= 0.5625), yellow-wrinkled = 3/16 (= 0.1875), green-smooth =
3/16 (= 0.1875), green-wrinkled = 1/16 (= 0.0625); in other words, Ho: the data came from a
population having a 9:3:3:1 ratio;
HA: the data came from a population not having a 9:3:3:1 ratio (i.e., at least one of the
proportions above is not correct).

Yellow-Smooth Yellow-Wrinkled Green-Smooth Green-Wrinkled Total
152 39 53 6 250

Solution
Distributions
seed color

Frequencies

Level Count Prob
green smooth 53 0.21200
green wrinkled 6 0.02400
yellow smooth 152 0.60800
yellow wrinkled 39 0.15600
Total 250 1.00000

N Missing 0
4 Levels
Test Probabilities

Level Estim Prob Hypoth Prob
green smooth 0.21200 0.5625
green wrinkled 0.02400 0.1875
yellow smooth 0.60800 0.1875
yellow wrinkled 0.15600 0.0625

Test ChiSquare DF Prob>Chisq
Likelihood Ratio 300.8662 3 <.0001*
Pearson 360.9724 3 <.0001*

Hypothesis
• HO: p1 = 9/16, p2 = 3/16, p3 = 3/16, p4 = 1/16
• HA: the data came from a population not having a 9:3:3:1 ratio (i.e., at least
one of the proportions above is not correct).
Conclusion
• Since our Prob>Chisq value (i.e., <.0001*) is less than α = 0.05, we reject H0 & conclude
that the data came from a population not having a 9:3:3:1 ratio of yellow-smooth to
yellow-wrinkled to green-smooth to green-wrinkled seeds.
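One point worth checking in the output above: JMP lists the levels alphabetically (green smooth, green wrinkled, yellow smooth, yellow wrinkled), while the Hypoth Prob column reads 0.5625, 0.1875, 0.1875, 0.0625 in that same order, so each phenotype is not paired with the proportion stated in Ho. The sketch below recomputes the Pearson statistic with every phenotype keyed to its own 9:3:3:1 share. It is a hand check under that assumption, not a rerun of JMP, and `gof_by_label` is an illustrative name.

```python
import math

def gof_by_label(counts, ratio):
    """Pearson goodness-of-fit with expected counts keyed by category label."""
    n = sum(counts.values())
    total_ratio = sum(ratio.values())
    pearson = 0.0
    for label, observed in counts.items():
        expected = n * ratio[label] / total_ratio
        pearson += (observed - expected) ** 2 / expected
    return pearson, len(counts) - 1

counts = {"yellow smooth": 152, "yellow wrinkled": 39,
          "green smooth": 53, "green wrinkled": 6}
ratio = {"yellow smooth": 9, "yellow wrinkled": 3,
         "green smooth": 3, "green wrinkled": 1}

pearson, df = gof_by_label(counts, ratio)
# Closed-form upper tail of the chi-square distribution for df = 3
p_value = (math.erfc(math.sqrt(pearson / 2))
           + math.sqrt(2 * pearson / math.pi) * math.exp(-pearson / 2))
print(f"Pearson = {pearson:.4f}, df = {df}, p = {p_value:.4f}")
```

With this pairing the statistic is far smaller than the value in the JMP table, although it still exceeds the 5% critical value of 7.815 for 3 degrees of freedom, so the decision to reject H0 is unchanged.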

4) Data: Seed-Color3.JMP (Cochran, p. 196)

In crosses between 2 types of maize, a researcher found 4 distinct types of plants in the 2nd generation.
In a sample of 1301 plants, there were:

F1 F2 F3 F4 Total
773 231 238 59 1301

According to a simple type of Mendelian inheritance, the probabilities of obtaining these 4 types
of plants are 9/16, 3/16, 3/16, & 1/16, respectively. We select this as the null hypothesis.
Solution
Distributions
maize type

Frequencies

Level Count Prob
F1 773 0.59416
F2 231 0.17756
F3 238 0.18294
F4 59 0.04535
Total 1301 1.00000

N Missing 0
4 Levels
Test Probabilities

Level Estim Prob Hypoth Prob
F1 0.59416 0.5625
F2 0.17756 0.1875
F3 0.18294 0.1875
F4 0.04535 0.0625

Test ChiSquare DF Prob>Chisq
Likelihood Ratio 9.8952 3 0.0195*
Pearson 9.2714 3 0.0259*

Hypothesis
• HO: p1 = 9/16, p2 = 3/16, p3 = 3/16, p4 = 1/16 (the data came from a population having a
9:3:3:1 ratio)
• HA: the data came from a population not having a 9:3:3:1 ratio (i.e., at least
one of the proportions above is not correct).

Conclusion

• Since our Pearson Prob>Chisq value (i.e., 0.0259*) is less than α = 0.05, we reject H0 & conclude
that the data came from a population not having a 9:3:3:1 ratio.

5) Data: Answers2.JMP People sometimes say that “b” and “c” answers occur most frequently
on multiple choice tests. To see if there is any evidence of this, use the answers below from the
verbal section of a real SAT. (This SAT exam was selected randomly from The College Board,
10 SATs, New York: College Entrance Examination Board, 1988).

Solution
Distributions

Frequencies

Level Count Prob
A 12 0.14118
B 22 0.25882
C 19 0.22353
D 17 0.20000
E 15 0.17647
Total 85 1.00000

N Missing 0
5 Levels
Test Probabilities

Level Estim Prob Hypoth Prob
A 0.14118 0.2
B 0.25882 0.2
C 0.22353 0.2
D 0.20000 0.2
E 0.17647 0.2

Test ChiSquare DF Prob>Chisq
Likelihood Ratio 3.4568 4 0.4845
Pearson 3.4118 4 0.4914

Hypothesis
• H0: p1 = p2 = p3 = p4 = p5 = 0.2 (i.e., the five answer choices occur in equal
proportion);
• HA: at least one pi differs from that given under H0.

Conclusion
• Since our Pearson Prob>Chisq value (i.e., 0.4914) is greater than α = 0.05, we fail to reject H0 &
conclude that there is no evidence that the data differ from that given by H0.

6) Data: R-Number.JMP One crude check of a random number generator is to generate, say 100
random numbers, and check that 10% are b/n 0 and 1, 10% are b/n 1 and 2, 10% are b/n 2 and 3,
etc. People are very bad at picking random numbers themselves. To show that let’s try this
experiment. Everyone in class should pick a number, randomly, b/n 1 and 10 (inclusive). We’ll
then survey the class and record the distribution of chosen numbers in the table below. Test that
these numbers are randomly uniformly distributed across the 10 cells (e.g., 10% in each cell).
Random nr. 1 2 3 4 5 6 7 8 9 10 Total
Obs. Frequency 2 4 6 4 1 7 24 10 10 4 72

Solution
Distributions
random numbers

Frequencies

Level Count Prob
1 2 0.02778
2 4 0.05556
3 6 0.08333
4 4 0.05556
5 1 0.01389
6 7 0.09722
7 24 0.33333
8 10 0.13889
9 10 0.13889
10 4 0.05556
Total 72 1.00000

N Missing 0
10 Levels
Test Probabilities

Level Estim Prob Hypoth Prob
1 0.02778 0.1
2 0.05556 0.1
3 0.08333 0.1
4 0.05556 0.1
5 0.01389 0.1
6 0.09722 0.1
7 0.33333 0.1
8 0.13889 0.1
9 0.13889 0.1
10 0.05556 0.1

Test ChiSquare DF Prob>Chisq
Likelihood Ratio 45.1698 9 <.0001*
Pearson 54.9444 9 <.0001*

Hypothesis
• Ho: p1 = p2 = p3 = p4 = p5 = p6 = p7 = p8 = p9 = p10 = 0.1 (i.e., the populations have
categories of equal proportion);
• HA: at least one pi differs from that given under H0.
Conclusion
• Since the Prob>Chisq value (i.e., <.0001*) is less than α = 0.05, we reject H0 & conclude that
the chosen numbers are not uniformly distributed across the 10 cells (i.e., not every number
has a 10% chance of being picked). The number 7 alone was chosen by 33% of the class.

7) Data: Seed-Color2.JMP Mendel performed experiments with peas to test his genetic theory. He
predicted that 9/16 of his peas would be Round-Yellow peas, 3/16 would be Round-Green peas,
3/16 would be Wrinkled-Yellow peas, & 1/16 would be Wrinkled-Green peas. His data is shown
below. Do the data support the theory?

Round yellow peas 315
Round green peas 108
Wrinkled yellow peas 101
Wrinkled green peas 32

Ho: p1 = 9/16, p2 = 3/16, p3 = 3/16, p4 = 1/16; HA: at least one pi differs from that given under Ho
Solution
Distributions
Seed color

Frequencies

Level Count Prob


round green peas 108 0.19424
round yellow peas 315 0.56655
wrinkled green peas 32 0.05755
wrinkled yellow peas 101 0.18165
Total 556 1.00000

N Missing0 4 Levels
Test Probabilities

Level Estim Prob Hypoth Prob


round green peas 0.19424 0.1875
round yellow peas 0.56655 0.5625
wrinkled green peas 0.05755 0.1875
wrinkled yellow peas 0.18165 0.0625
Test ChiSquare DF Prob>Chisq
Likelihood Ratio 152.0839 3 <.0001*
Pearson 176.5276 3 <.0001*

Hypothesis
• Ho: p1 = 9/16, p2 = 3/16, p3 = 3/16, p4 = 1/16 (the data came from a population having
a 9:3:3:1 ratio of Round-Yellow peas, Round-Green peas, Wrinkled-Yellow peas, &
Wrinkled-Green peas);
• HA: the data came from a population not having a 9:3:3:1 ratio of Round-Yellow peas,
Round-Green peas, Wrinkled-Yellow peas, & Wrinkled-Green peas

CONCLUSION

• Since our Prob>Chisq value (i.e., <.0001*) is less than α = 0.05, we reject H0 & conclude
that the data came from a population not having a 9:3:3:1 ratio of Round-Yellow peas,
Round-Green peas, Wrinkled-Yellow peas, & Wrinkled-Green peas
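The Test Probabilities table for this problem assigns 0.1875 to wrinkled green peas and 0.0625 to wrinkled yellow peas, which is the reverse of the proportions stated in the problem (3/16 wrinkled yellow, 1/16 wrinkled green). The sketch below is a hand check, assuming the pairing given in the problem statement; it prints each category's contribution and compares the total with the standard 5% critical value for 3 degrees of freedom. The variable names are illustrative.

```python
# Observed counts and the 9:3:3:1 shares as stated in the problem
observed = {
    "round yellow": 315,
    "round green": 108,
    "wrinkled yellow": 101,
    "wrinkled green": 32,
}
share = {
    "round yellow": 9 / 16,
    "round green": 3 / 16,
    "wrinkled yellow": 3 / 16,
    "wrinkled green": 1 / 16,
}

n = sum(observed.values())
pearson = 0.0
for label, count in observed.items():
    expected = n * share[label]
    contribution = (count - expected) ** 2 / expected
    pearson += contribution
    print(f"{label:16s} observed={count:4d} expected={expected:7.2f} "
          f"contribution={contribution:.4f}")

CRITICAL_5PCT_DF3 = 7.815  # chi-square table value for alpha = 0.05, df = 3
reject = pearson > CRITICAL_5PCT_DF3
print(f"Pearson = {pearson:.4f}, reject H0 at 5%: {reject}")
```

Under this pairing the statistic falls well below 7.815, so the check would not reject the 9:3:3:1 hypothesis. Rerunning the JMP test with the hypothesized probabilities matched to the correct levels would settle which conclusion applies.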

8) Data: M&M.JMP According to the M&M website, the color distribution in milk chocolate is
30% brown, 20% yellow, 20% red, 10% green, 10% blue, & 10% orange. Get together with a
partner and combine your packages of M&Ms to test the hypothesis that this is the color
distribution.

Color Brown Yellow Red Green Blue Orange Total
Obs. Frequency 18 18 19 14 7 16 92

Solution
Distributions
color

Frequencies

Level Count Prob
blue 7 0.07609
brown 18 0.19565
green 14 0.15217
orange 16 0.17391
red 19 0.20652
yellow 18 0.19565
Total 92 1.00000

N Missing 0
6 Levels
Test Probabilities

Level Estim Prob Hypoth Prob
blue 0.07609 0.1
brown 0.19565 0.3
green 0.15217 0.1
orange 0.17391 0.1
red 0.20652 0.2
yellow 0.19565 0.2

Test ChiSquare DF Prob>Chisq
Likelihood Ratio 10.6783 5 0.0581
Pearson 11.4239 5 0.0436*

Hypothesis
• Ho: p1 = 30% brown, p2 = 20% yellow, p3 = 20% red, p4 = 10% green, p5 = 10% blue,
p6 = 10% orange (the data came from a population having a 3:2:2:1:1:1 ratio of brown,
yellow, red, green, blue and orange);
• HA: the data came from a population not having a 3:2:2:1:1:1 ratio of brown, yellow,
red, green, blue, and orange.

Conclusion
• Since our Pearson Prob>Chisq value (i.e., 0.0436*) is less than α = 0.05, we reject H0 & conclude
that the M&M color distribution in milk chocolate does not come from a 3:2:2:1:1:1
ratio of brown, yellow, red, green, blue and orange. The evidence is borderline, however: the
likelihood ratio test (p = 0.0581) does not reach significance at α = 0.05.

9) In rice, the green leafhopper is suspected to differ in feeding preference between an already
diseased plant and a healthy plant. The researcher, therefore, encloses a prescribed number of
green leafhoppers in a cage that holds an equal number of healthy and diseased rice plants. After
2 hours of caging, he then counts the number of insects found on diseased and on healthy plants.
Of 239 insects confined, 67 were found on the healthy plants and 172 on the diseased plants.
Does the observed ratio of 67:172 deviate significantly from the hypothesized no-preference
ratio of 1:1?
Solution
Distributions
plant status

diseasd health

Frequencies

Level Count Prob


diseasd 172 0.71967
health 67 0.28033
Total 239 1.00000

N Missing0
2 LevelsTest Probabilities

Level Estim Prob Hypoth Prob


diseasd 0.71967 0.5
health 0.28033 0.5
Test ChiSquare DF Prob>Chisq
Likelihood Ratio 47.7417 1 <.0001*
Pearson 46.1297 1 <.0001*

Hypothesis
• Ho: p1 = p2 = 0.5 (i.e., the insects are equally likely to be found on diseased and healthy plants; 1:1 ratio);
• HA: p1 ≠ p2 (the ratio differs from 1:1)

Conclusion
• Since our Pearson Prob>Chisq value (i.e., <.0001*) is less than α = 0.05, we reject H0 &
conclude that the observed ratio of 67:172 deviates significantly from the hypothesized
no-preference ratio of 1:1. The green leafhoppers were found more often on diseased plants
(72%) than on healthy plants (28%).

PART TWO

1) Descriptive statistics

Enter the above data in MS Excel and calculate the following:


A) minimum, maximum, mean, variance, coefficient of variation
B) Test whether the above data is normally distributed or not.
Solution
Distributions
yield

40 50 60 70 80 90 100 110

Compare Distributions
Show Distribution AICc BIC -2*LogLikelihood

[x] Normal 652.70797 657.31618 648.55212

Quantiles

100.0% maximum 108


99.5% 108
97.5% 103.85
90.0% 92.9
75.0% quartile 84
50.0% median 72.5
25.0% quartile 64
10.0% 56
2.5% 44.125
0.5% 42
0.0% minimum 42

Summary Statistics
Mean 73.9625
Std Dev 14.023889
Std Err Mean 1.5679185
Upper 95% Mean 77.083364
Lower 95% Mean 70.841636
N 80
Variance 196.66946
Skewness 0.069106
Kurtosis -0.418091
CV 18.96081
Minimum 42
Maximum 108

Fitted Normal Distribution

Parameter Estimate Std Error Lower 95% Upper 95%

Location μ 73.9625 1.5679185 70.889436 77.035564

Dispersion σ 14.023889 0.2988714 13.450177 14.364665

Measures
-2*LogLikelihood 648.55212
AICc 652.70797
BIC 657.31618

Goodness-of-Fit Test
W Prob<W
Shapiro-Wilk 0.9880738 0.6695

A2 Simulated p-
Value
Anderson-Darling 0.4274881 0.3024

A) Minimum, maximum, mean, variance, coefficient of variation

• minimum: 42
• maximum: 108
• mean: 73.9625
• variance: 196.66946
• coefficient of variation: 18.96081
• N: 80
B) Test whether the above data is normally distributed or not.
Hypothesis
Ho: the data is from the normal distribution
HA: the data is not from the normal distribution

Conclusion
• Since the Shapiro-Wilk p-value (i.e., 0.6695) is greater than α = 0.05, we fail to reject the null
hypothesis; there is no evidence that the data deviate from a normal distribution.
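The summary statistics reported in part A are tied together by a few identities: the variance is the square of the standard deviation, the coefficient of variation is the standard deviation as a percentage of the mean, and the standard error of the mean is the standard deviation divided by the square root of N. A minimal sketch that checks the JMP values against each other, using only the figures printed in the Summary Statistics table (the raw yield data are not reproduced here):

```python
import math

# Values copied from the JMP Summary Statistics table
n = 80
mean = 73.9625
std_dev = 14.023889

variance = std_dev ** 2                 # sample variance
cv_percent = 100 * std_dev / mean       # coefficient of variation, in %
std_err_mean = std_dev / math.sqrt(n)   # standard error of the mean

print(f"variance     = {variance:.5f}")
print(f"CV (%)       = {cv_percent:.5f}")
print(f"Std Err Mean = {std_err_mean:.7f}")
```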

2) One sample t-test


Solution

Distributions
amt

1.8 2 2.2 2.4 2.6

Quantiles
100.0% maximum 2.6
99.5% 2.6
97.5% 2.6
90.0% 2.6
75.0% quartile 2.525
50.0% median 2.35
25.0% quartile 2.05
10.0% 1.9
2.5% 1.9
0.5% 1.9
0.0% minimum 1.9
Summary Statistics
Mean 2.3
Std Dev 0.2607681
Std Err Mean 0.1064581
Upper 95% Mean 2.5736593
Lower 95% Mean 2.0263407
N 6
Test Mean
Hypothesized Value 2
Actual Estimate 2.3
DF 5
Std Dev 0.26077

t Test
Test Statistic 2.8180
Prob > |t| 0.0372*
Prob > t 0.0186*
Prob < t 0.9814


Hypothesis
Ho: µ = 2.0

Ha: µ ≠ 2.0

Conclusion

The two-sided p-value (Prob > |t| = 0.0372) is less than α = 0.05, so we reject Ho & conclude that the mean of the new
variety is not equal to 2.0.

The mean yield (2.3) is significantly higher than the hypothesized mean (2.0).

One sample t test


Distributions

yield

8 8.5 9 9.5

Quantiles

100.0% maximum 9.4


99.5% 9.4
97.5% 9.4
90.0% 9.4
75.0% quartile 9.25
50.0% median 8.55
25.0% quartile 8.025
10.0% 7.8
2.5% 7.8
0.5% 7.8
0.0% minimum 7.8

Summary Statistics

Mean 8.6

Std Dev 0.6228965

Std Err Mean 0.2542964

Upper 95% Mean 9.2536897

Lower 95% Mean 7.9463103

N 6

Test Mean

Hypothesized Value 8

Actual Estimate 8.6

DF 5

Std Dev 0.6229

t Test

Test Statistic 2.3595

Prob > |t| 0.0648

Prob > t 0.0324*

Prob < t 0.9676



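Hypothesis
Ho: µ = 8.0

Ha: µ ≠ 8.0

Conclusion

The two-sided p-value (Prob > |t| = 0.0648) is greater than α = 0.05, so we fail to reject Ho; we do not have
evidence that the mean of this year's yield differs from 8.0.

The mean yield (8.6) is higher than the hypothesized mean (8.0), but the difference is not significant in the
two-sided test. Only the one-sided test of Ha: µ > 8.0 (Prob > t = 0.0324) falls below 0.05.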
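The Test Statistic and Prob > |t| lines of both one-sample outputs can be recomputed from the summary statistics alone. The sketch below is an illustration in plain Python rather than JMP's own routine: the p-value is obtained by numerically integrating the Student t density with Simpson's rule, and the function names (`t_pdf`, `t_two_sided_p`, `one_sample_t`) are assumptions made for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density over [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    s = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (s * h / 3)

def one_sample_t(mean, std_dev, n, mu0):
    """t statistic, degrees of freedom and two-sided p-value from summary stats."""
    std_err = std_dev / math.sqrt(n)
    t = (mean - mu0) / std_err
    return t, n - 1, t_two_sided_p(t, n - 1)

# First data set (amt): mean 2.3, Std Dev 0.2607681, N 6, hypothesized value 2
t1, df1, p1 = one_sample_t(2.3, 0.2607681, 6, 2.0)
# Second data set (yield): mean 8.6, Std Dev 0.6228965, N 6, hypothesized value 8
t2, df2, p2 = one_sample_t(8.6, 0.6228965, 6, 8.0)
print(f"amt:   t = {t1:.4f}, df = {df1}, Prob > |t| = {p1:.4f}")
print(f"yield: t = {t2:.4f}, df = {df2}, Prob > |t| = {p2:.4f}")
```

Halving the two-sided value gives the one-sided probability in the direction of the observed difference, which is how Prob > t relates to Prob > |t| in the JMP tables when the statistic is positive.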

3) Paired sample t-test


Solution
Distributions

diff

-1 0 1 2 3

Quantiles

100.0% maximum 3.3


99.5% 3.3
97.5% 3.3
90.0% 3.3
75.0% quartile 2.975
50.0% median 1.75
25.0% quartile -0.05
10.0% -0.7
2.5% -0.7
0.5% -0.7
0.0% minimum -0.7
Summary Statistics

Mean 1.525
Std Dev 1.5163396
Std Err Mean 0.536107
Upper 95% Mean 2.7926916
Lower 95% Mean 0.2573084
N 8

Test Mean

Hypothesized Value 0
Actual Estimate 1.525
DF 7
Std Dev 1.51634

t Test

Test Statistic 2.8446

Prob > |t| 0.0249*

Prob > t 0.0124*

Prob < t 0.9876

Hypothesis

Ho: µd = 0
Ha: µd ≠ 0 (µd = mean of the paired differences)

Conclusion

P-value = 0.0249 is less than α = 0.05, so we reject Ho and conclude that the population means are not equal.
Thus, we would conclude that the two varieties are not equal. (The mean of the difference (1.5)
shows a significant increase.)

4) Independent sample t test


Solution
Oneway Analysis of Yield By oilseed
[Figure: plot of Yield (axis from 2.2 to 3.4) by oilseed group (cont, +grth)]

Oneway Anova
Summary of Fit

Rsquare 0.058055
Adj Rsquare 0.005725

Root Mean Square Error 0.360902

Mean of Response 2.84

Observations (or Sum Wgts) 20

Pooled t Test
+grth-cont
Assuming equal variances
Difference 0.17000 t Ratio 1.053283
Std Err Dif 0.16140 DF 18
Upper CL Dif 0.50909 Prob > |t| 0.3061
Lower CL Dif -0.16909 Prob > t 0.1531
Confidence 0.95 Prob < t 0.8469
Analysis of Variance

Source DF Sum of Squares Mean Square F Ratio Prob > F
oilseed 1 0.1445000 0.144500 1.1094 0.3061
Error 18 2.3445000 0.130250
C. Total 19 2.4890000

Means for Oneway Anova

Level Number Mean Std Error Lower 95% Upper 95%
cont 10 2.75500 0.11413 2.5152 2.9948
+grth 10 2.92500 0.11413 2.6852 3.1648

Our hypothesis

Ho: µ1 = µ2
Ha: µ1 ≠ µ2

Conclusion

P-value = 0.3061 is greater than α = 0.05, so we fail to reject Ho.

Thus, we do not have evidence that the treated and control means differ. (The mean of treated (2.93) is
higher than that of control (2.76), but the difference is not significant.)
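The pooled t ratio and the ANOVA F ratio in this output are two views of the same test: with two groups, F equals the square of t. A minimal sketch that rebuilds both from the group means, the group sizes and the Root Mean Square Error (which serves as the pooled standard deviation). The critical value 2.101 is the two-sided 5% point of the t distribution with 18 degrees of freedom, taken from standard tables.

```python
import math

# Values copied from the JMP output for the oilseed example
n_cont, mean_cont = 10, 2.755
n_grth, mean_grth = 10, 2.925
pooled_sd = 0.360902          # Root Mean Square Error

difference = mean_grth - mean_cont
std_err_diff = pooled_sd * math.sqrt(1 / n_cont + 1 / n_grth)
t_ratio = difference / std_err_diff
f_ratio = t_ratio ** 2        # equals the one-way ANOVA F for two groups
df_error = n_cont + n_grth - 2

T_CRIT_5PCT_DF18 = 2.101      # two-sided 5% critical value, 18 df
significant = abs(t_ratio) > T_CRIT_5PCT_DF18

print(f"difference = {difference:.3f}, Std Err Dif = {std_err_diff:.5f}")
print(f"t = {t_ratio:.4f} on {df_error} df, F = {f_ratio:.4f}")
print(f"significant at 5%: {significant}")
```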
5) Independent sample t-test
Solution
One-way Anova
Summary of Fit

Rsquare 0.343353

Adj Rsquare 0.296449

Root Mean Square Error 0.23619

Mean of Response 2.09375

Observations (or Sum Wgts) 16

Pooled t-Test
standard variety-new variety
Assuming equal variances

Difference -0.33000 t Ratio -2.70563

Std Err Dif 0.12197 DF 14

Upper CL Dif -0.06840 Prob > |t| 0.0171*

Lower CL Dif -0.59160 Prob > t 0.9915

Confidence 0.95 Prob < t 0.0085*



Analysis of Variance

Source DF Sum of Mean Square F Ratio Prob > F


Squares
Variety 1 0.4083750 0.408375 7.3204 0.0171*

Error 14 0.7810000 0.055786

C. Total 15 1.1893750

Means for Oneway Anova

Level Number Mean Std Error Lower 95% Upper 95%


new variety 6 2.30000 0.09642 2.0932 2.5068
standard variety 10 1.97000 0.07469 1.8098 2.1302

Hypothesis
• Ho: µ1 = µ2
• Ha: µ1 ≠ µ2 (µ1 = new variety, µ2 = standard variety).

Conclusion
• Since the p-value (Prob > |t| = 0.0171*) is less than α = 0.05, we reject Ho.
Thus, we would conclude that the means of the new variety and the standard variety are not equal.
(The mean of the new variety (2.30) is significantly higher than that of the standard variety (1.97).)

6) Independent sample t-test


Solution
Oneway Analysis of Yield By varity
[Figure: plot of Yield (axis from 3.5 to 5.5) by varity (A, B)]

Oneway Anova
Summary of Fit

Rsquare 0.276017

Adj Rsquare 0.23343

Root Mean Square Error 0.530039

Mean of Response 4.373684

Observations (or Sum Wgts) 19

Pooled t Test
B-A
Assuming equal variances

Difference 0.62000 t Ratio 2.545824


Std Err Dif 0.24354 DF 17

Upper CL Dif 1.13382 Prob > |t| 0.0209*

Lower CL Dif 0.10618 Prob > t 0.0104*

Confidence 0.95 Prob < t 0.9896

Analysis of Variance

Source DF Sum of Squares Mean Square F Ratio Prob > F
varity 1 1.8208421 1.82084 6.4812 0.0209*
Error 17 4.7760000 0.28094
C. Total 18 6.5968421

Means for One-way Anova

Level Number Mean Std Error Lower 95% Upper 95%
A 10 4.08000 0.16761 3.7264 4.4336
B 9 4.70000 0.17668 4.3272 5.0728

Hypothesis

Ho: µ1 = µ2
Ha: µ1 ≠ µ2
Conclusion
P-value = 0.0209 is less than α = 0.05, so we reject Ho; the population means are not equal.
Thus, we would conclude that variety A and variety B are not equal. (Mean of B (4.7) is
significantly higher than that of A (4.1))

7. Chi-square test
In an experiment of purple vs. green seeds, we might expect a 3:1 ratio. We can test if what we
observe is statistically different from 3:1. Let's say we raise 100 flowers and observe the
following colors:
Seed color Yellow Green Total
Observed frequency 112 46 158
 Write H0 & HA.
 Do the analysis & make your decision?

Solution
Distributions

seed colar

yellow green

Frequencies

Level Count Prob

yellow 112 0.70886

green 46 0.29114

Total 158 1.00000

N Missing0
2 Levels
Test Probabilities
Level Estim Prob Hypoth Prob

yellow 0.70886 0.75000

green 0.29114 0.25000

Test ChiSquare DF Prob>Chisq


Likelihood Ratio 1.3786 1 0.2403

Pearson 1.4262 1 0.2324

Hypothesis
• H0: yellow = 0.75; green = 0.25; in other words, H0: the data came from a population
having a 3:1 ratio of yellow to green flowers;
• HA: the data came from a population not having a 3:1 ratio of yellow to green flowers
Conclusion and interpretation
• Since our Pearson Prob>Chisq value (i.e., 0.2324) is greater than α = 0.05, we fail to reject H0 &
conclude that we don't have evidence that the data came from a population not having a 3:1 ratio of
yellow to green flowers.

8) Chi-square test
A parental cross is expected to produce progeny in categories Brown, Yellow, Red & Green
with a ratio of 9:3:3:1. The observed frequencies of the progeny were as given below:

Colour Brown Yellow Red Green
Obs. Frequency 150 50 42 8
 Write H0 & HA.
 Do the analysis & make your decision?
Solution
Distributions

sed colour

brown green red yellow

Frequencies

Level Count Prob

brown 150 0.60000

green 8 0.03200

red 50 0.20000

yellow 42 0.16800

Total 250 1.00000

N Missing0
4 Levels
Test Probabilities

Level Estim Prob Hypoth Prob

brown 0.60000 0.5625

green 0.03200 0.0625

red 0.20000 0.1875

yellow 0.16800 0.1875

Test ChiSquare DF Prob>Chisq


Likelihood Ratio 5.8801 3 0.1176
Pearson 5.0613 3 0.1674
Hypothesis
H0: p1 = 9/16; p2 = 3/16; p3 = 3/16; p4 = 1/16 (i.e., the population frequencies are given by the
ratio 9:3:3:1);

HA: at least one pi differs from that given under H0

Conclusion
• Since our Pearson Prob>Chisq p-value (i.e., 0.1674) is greater than α = 0.05, we fail to
reject H0 & conclude that we don't have evidence that the data differ from a population
having a ratio of 9:3:3:1.

9) Chi-square test
Seed colour
Petal Dark- Brown Light- White
colour brown brown
Purple 21 32 13 4
Mauve 55 102 70 8
White 14 59 112 10
 Write H0 & HA.
 Do the analysis & make your decision?
 Do correspondence analysis if necessary!

Solution
Hypothesis
H0: there is no relationship between Seed color & Petal color (independent);

HA: there is a relationship between Seed color & Petal color (Dependent).
Contingency Analysis of seed color by petal color
Mosaic Plot
[Figure: mosaic plot of seed color (DB, B, LB, W) by petal color (purple, mauve, white); Freq: freq]

Contingency Table
petal color by seed color

Count DB B LB W Total
Total %
Col %
Row %
purple 21 32 13 4 70
4.20 6.40 2.60 0.80 14.00
23.33 16.58 6.67 18.18
30.00 45.71 18.57 5.71
mauve 55 102 70 8 235
11.00 20.40 14.00 1.60 47.00
61.11 52.85 35.90 36.36
23.40 43.40 29.79 3.40
white 14 59 112 10 195
2.80 11.80 22.40 2.00 39.00
15.56 30.57 57.44 45.45
7.18 30.26 57.44 5.13
Total 90 193 195 22 500
18.00 38.60 39.00 4.40

Tests
N DF -LogLike RSquare (U)
500 6 30.679975 0.0520
Test ChiSquare Prob>ChiSq
Likelihood Ratio 61.360 <.0001*
Pearson 58.575 <.0001*

Correspondence Analysis
[Figure: correspondence plot of petal color (purple, mauve, white) and seed color (DB, B, LB, W) on axes c1 and c2]


Decision
Since the Pearson Prob>ChiSq value (<.0001*) is less than α = 0.05, we reject Ho & conclude
that there is a relationship between Seed color & Petal color.

Based on the result from correspondence analysis, Dark-brown seed and Purple petals are most
closely related, followed by Brown seed and Mauve petals. Light-brown seed plots close to White
petals (57.4% of white-petalled plants have light-brown seed), while White seed has a
low relationship with the others.
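The Pearson and likelihood-ratio statistics for a contingency table come from comparing each observed count with the count expected under independence (row total × column total / N). The sketch below recomputes both for the petal colour by seed colour table in plain Python. It is an illustrative check, not JMP's implementation; `independence_test` and `chi2_sf_even` are names chosen for this example, and the p-value helper only covers even degrees of freedom, which is enough for the 6 df here.

```python
import math

def independence_test(table):
    """Pearson and likelihood-ratio chi-square tests of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    pearson = 0.0
    lr = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            pearson += (obs - exp) ** 2 / exp
            if obs > 0:
                lr += 2 * obs * math.log(obs / exp)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return pearson, lr, df

def chi2_sf_even(x, df):
    """Upper-tail chi-square probability, valid for even df only."""
    half = x / 2
    term = total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Rows: purple, mauve, white petals; columns: DB, B, LB, W seed
table = [
    [21, 32, 13, 4],
    [55, 102, 70, 8],
    [14, 59, 112, 10],
]
pearson, lr, df = independence_test(table)
p_value = chi2_sf_even(pearson, df)
print(f"Pearson = {pearson:.3f}, LR = {lr:.3f}, df = {df}, p = {p_value:.2e}")
```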

10). Chi-square test


Contingency Analysis of type By storage type
Mosaic Plot
[Figure: mosaic plot of type (no of germinate, no of not germinate) by storage type (A, B, C, D, E); Freq: count]
Contingency Table
storage type By type

Count no of no of not Total


Total % germinate germinate
Col %
Row %
A 112 12 124
23.14 2.48 25.62
27.25 16.44
90.32 9.68
B 76 14 90
15.70 2.89 18.60
18.49 19.18
84.44 15.56
C 88 32 120
18.18 6.61 24.79
21.41 43.84
73.33 26.67
D 43 7 50
8.88 1.45 10.33
10.46 9.59
86.00 14.00
E 92 8 100
19.01 1.65 20.66
22.38 10.96
92.00 8.00
Total 411 73 484
84.92 15.08
Tests

N DF -LogLike RSquare (U)


484 4 9.2443179 0.0450

Test ChiSquare Prob>ChiSq


Likelihood Ratio 18.489 0.0010*
Pearson 19.379 0.0007*

Correspondence Analysis
[Figure: correspondence plot of storage type (A, B, C, D, E) and type (no of germinate, no of not germinate) on axes c1 and c2]

Hypothesis

H0: there is no relationship between germination status & storage method (independent);

HA: there is a relationship between germination status & storage method (dependent).
• Decision
• Since the Pearson Prob>Chisq value (i.e., 0.0007*) is less than α = 0.05, we reject H0 &
conclude that germination & storage type are dependent.
• The correspondence analysis suggests the following relationships. Germinated seed is
associated with storage types A, B, D & E far more than with storage type C, while storage
type C is more closely related to non-germinated seed than the other four storage types
(26.7% of the seed in C did not germinate, against 8.0% to 15.6% in the others).

11). Chi-square test

 Write H0 & HA.


 Do the analysis & make your decision?
 Do correspondence analysis if necessary!

Contingency Analysis of typ By hormon


Mosaic Plot
[Figure: mosaic plot of typ (rooting, not rooting) by hormon (A, B); Freq: frequency]
Contingency Table
hormon By typ

Count not rooting rooting Total


Total %
Col %
Row %
A 15 25 40
16.67 27.78 44.44
65.22 37.31
37.50 62.50
B 8 42 50
8.89 46.67 55.56
34.78 62.69
16.00 84.00
Total 23 67 90
25.56 74.44

Tests

N DF -LogLike RSquare (U)


90 1 2.7060744 0.0529

Test ChiSquare Prob>ChiSq


Likelihood Ratio 5.412 0.0200*
Pearson 5.399 0.0201*

Fisher's Exact Test Prob Alternative Hypothesis
Left 0.9949 Prob(typ=rooting) is greater for hormon=A than B
Right 0.0188* Prob(typ=rooting) is greater for hormon=B than A
2-Tail 0.0284* Prob(typ=rooting) is different across hormon
Correspondence Analysis
[Figure: correspondence plot of hormon (A, B) and typ (rooting, not rooting) on axes c1 and c2]

Singular Value Inertia Portion Cumulative
0.24494 0.05999 1.0000 1.0000

typ c1

not rooting 0.4180

rooting -0.1435

hormon c1

A 0.2738

B -0.2191
Hypothesis

H0: there is no relationship between rooting & hormone (independent);

HA: there is a relationship between rooting & hormone (dependent).

• Decision
• Since the Pearson Prob>Chisq value (i.e., 0.0201*) is less than α = 0.05, we reject H0 &
conclude that hormone and rooting are dependent.
• The correspondence analysis suggests that rooting is associated with hormone B more than
with hormone A (84.0% rooted with hormone B against 62.5% with hormone A).
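For a 2 x 2 table such as this one, the Fisher's Exact Test rows of the output can be reproduced from the hypergeometric distribution with the table margins held fixed. The sketch below uses only the standard library; `fisher_exact_2x2` is an illustrative name, and the two-tail rule used here (summing every table that is no more probable than the observed one) is a common convention that is assumed, not confirmed, to be the one JMP applies.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Left, right and two-tail Fisher probabilities for [[a, b], [c, d]].

    The tails are taken over the top-left cell with all margins fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    left = sum(prob(x) for x in range(lo, a + 1))
    right = sum(prob(x) for x in range(a, hi + 1))
    two_tail = sum(prob(x) for x in range(lo, hi + 1)
                   if prob(x) <= p_obs * (1 + 1e-9))
    return left, right, two_tail

# Rows: hormone A, hormone B; columns: not rooting, rooting
left, right, two_tail = fisher_exact_2x2(15, 25, 8, 42)
print(f"P(not rooting count in A <= 15) = {left:.4f}")
print(f"P(not rooting count in A >= 15) = {right:.4f}")
print(f"two-tail = {two_tail:.4f}")
```

A large count of non-rooting units under hormone A is the same event as a higher rooting rate under hormone B, so the upper tail computed here corresponds to the "Right" row of the JMP table.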

12) Chi-square test

 Write H0 & HA.


 Do the analysis & make your decision?
Solution
H0: p(white) = 0.36; p(brown) = 0.48; p(spotted) = 0.16 (i.e., the population frequencies are given by the ratio
9:12:4 of white to brown to spotted);

HA: at least one proportion differs significantly from that given under H0


Distributions
Color

Brown Spotted White

Frequencies
Level Count Prob

Brown 240 0.60000

Spotted 20 0.05000

White 140 0.35000

Total 400 1.00000

N Missing 0 3
Levels
Test Probabilities
Level Estim Prob Hypoth Prob
Brown 0.60000 0.48000
Spotted 0.05000 0.16000
White 0.35000 0.36000

Test ChiSquare DF Prob>Chisq


Likelihood Ratio 52.6950 2 <.0001*
Pearson 42.3611 2 <.0001*
Decision

Since our Prob>Chisq value (i.e., <.0001*) is less than α = 0.05, we reject H0 & conclude that
the proportions of the colors are not those expected under H0 (i.e., not 0.48 brown, 0.16
spotted, and 0.36 white).

13). Chi-square test

 Write H0 & HA.


 Do the analysis & make your decision?
 Do correspondence analysis if necessary
Contingency Analysis of group By mastities
Mosaic Plot
[Figure: mosaic plot of group (A, B, C) by mastities (mastiti, no mastiti); Freq: count]
Contingency Table
mastities By group

Count A B C Total
Total %
Col %
Row %
mastiti 36 29 10 75
12.00 9.67 3.33 25.00
37.50 21.97 13.89
48.00 38.67 13.33
no mastiti 60 103 62 225
20.00 34.33 20.67 75.00
62.50 78.03 86.11
26.67 45.78 27.56
Total 96 132 72 300
32.00 44.00 24.00

Tests

N DF -LogLike RSquare (U)


300 2 6.6775058 0.0208

Test ChiSquare Prob>ChiSq


Likelihood Ratio 13.355 0.0013*
Pearson 13.387 0.0012*
Correspondence Analysis
[Figure: correspondence plot of mastitis (mastiti, no mastiti) and group (A, B, C) on axes c1 and c2]

Hypothesis

• H0: there is no relationship between mastitis status & treatment group (independent);

• HA: there is a relationship between mastitis status & treatment group (dependent).

Decision
• Since the Pearson Prob>ChiSq value = 0.0012* is less than α = 0.05, we reject Ho &
conclude that there is a relationship between mastitis status & treatment group
(dependent).
• Based on the contingency table and the correspondence analysis, mastitis is most closely
associated with treatment group A: 37.5% of the cows in group A had mastitis, against 22.0%
in group B and 13.9% in group C.
14) One sample t test

Solution
Ho: μ = 4000kg
Ha: μ ≠ 4000kg

Distributions
milk yield

3000 3500 4000 4500


Quantiles

100.0% maximum 4700


99.5% 4700
97.5% 4700
90.0% 4660
75.0% quartile 4142.5
50.0% median 3790
25.0% quartile 3367.5
10.0% 3027
2.5% 3000
0.5% 3000
0.0% minimum 3000

Summary Statistics

Mean 3800
Std Dev 500.15553

Std Err Mean 158.16307

Upper 95% Mean 4157.7897

Lower 95% Mean 3442.2103

N 10
Test Mean

Hypothesized Value 4000

Actual Estimate 3800


DF 9
Std Dev 500.156
Sigma given 500

z Test
Test Statistic -1.2649

Prob > |z| 0.2059


Prob > z 0.8970
Prob < z 0.1030
Decision
Since the value Prob > |z| of 0.2059 is greater than α = 0.05, we fail to reject H0 & conclude that
we don't have evidence that the mean is significantly different from 4000 kg.
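Because a value of sigma (500) was supplied, JMP reports a z test here and not a t test. The z statistic and its three probabilities follow directly from the normal distribution, as the minimal sketch below shows. It uses only `math.erfc` from the standard library, and the function name `z_test` is illustrative.

```python
import math

def z_test(mean, mu0, sigma, n):
    """One-sample z test with known sigma; returns z and three p-values."""
    z = (mean - mu0) / (sigma / math.sqrt(n))
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # Prob > |z|
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))      # Prob > z
    p_lower = 1 - p_upper                            # Prob < z
    return z, p_two_sided, p_upper, p_lower

# Milk yield example: mean 3800, hypothesized 4000, sigma 500, N 10
z, p_two_sided, p_upper, p_lower = z_test(3800, 4000, 500, 10)
print(f"z = {z:.4f}")
print(f"Prob > |z| = {p_two_sided:.4f}")
print(f"Prob > z   = {p_upper:.4f}")
print(f"Prob < z   = {p_lower:.4f}")
```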

15) Paired t test

Distributions

-2 0 2 4 6 8 10 12

Quantiles

100.0% maximum 9
99.5% 9
97.5% 9
90.0% 9
75.0% quartile 6.5
50.0% median 2
25.0% quartile 0
10.0% -1
2.5% -1
0.5% -1
0.0% minimum -1
Summary Statistics

Mean 3.1111111
Std Dev 3.6552854
Std Err Mean 1.2184285
Upper 95% Mean 5.9208122
Lower 95% Mean 0.3014101
N 9
Test Mean

Hypothesized Value 0
Actual Estimate 3.11111
DF 8
Std Dev 3.65529

t Test
Test Statistic 2.5534
Prob > |t| 0.0340*
Prob > t 0.0170*
Prob < t 0.9830

Hypothesis

Ho: µ1 = µ2 (i.e., the mean difference is 0)
Ha: µ1 ≠ µ2

Conclusion

P-value = 0.0340 is less than α = 0.05, so we reject Ho; the population means are not equal.

Thus, we would conclude that milk production before treatment and after treatment are not equal.
(The mean of the difference (3.1) shows a significant increase in milk production.)
16) Independent sample t test

Solution
Oneway Analysis of FREQUENCY By GROUP
[Figure: plot of FREQUENCY (axis from 110 to 140) by GROUP (A, B)]

Oneway Anova
Summary of Fit

Rsquare 0.203236

Adj Rsquare 0.153438

Root Mean Square Error 7.350359

Mean of Response 127.0556

Observations (or Sum Wgts) 18

Pooled t Test
B-A
Assuming equal variances

Difference      7.000    t Ratio      2.020206
Std Err Dif     3.465    DF           16
Upper CL Dif   14.345    Prob > |t|   0.0604
Lower CL Dif   -0.345    Prob > t     0.0302*
Confidence      0.95     Prob < t     0.9698

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
GROUP       1         220.5000       220.500    4.0812     0.0604
Error      16         864.4444        54.028
C. Total   17        1084.9444

Means for Oneway Anova

Level Number Mean Std Error Lower 95% Upper 95%


A 9 123.556 2.4501 118.36 128.75
B 9 130.556 2.4501 125.36 135.75

Hypothesis
• Ho: µ1 = µ2
• Ha: µ1 ≠ µ2 (µ1 = mean of group A measurements, µ2 = mean of group B measurements).

Conclusion
• Since the p-value (Prob > F = 0.0604) is greater than α = 0.05, we fail to reject the null hypothesis.
We don’t have evidence that the mean measurements of groups A and B are different.
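
The pooled t ratio and the ANOVA F ratio above describe the same comparison: with two groups, F equals t squared. The following sketch recomputes both from the reported group means and the Error mean square. It assumes the tabulated two-sided 5% critical value of Student's t for 16 degrees of freedom (2.120), and the variable names are illustrative.

```python
import math

# Values copied from the Means for Oneway Anova and Analysis of Variance tables.
mean_a, mean_b = 123.556, 130.556
n_a, n_b = 9, 9
mse = 54.028                      # Error mean square = pooled variance estimate

# Tabulated two-sided 5% critical value of Student's t with 16 df
# (taken from a standard t table, not computed here).
T_CRIT_16DF = 2.120

std_err_diff = math.sqrt(mse * (1.0 / n_a + 1.0 / n_b))
t_ratio = (mean_b - mean_a) / std_err_diff    # difference is B - A, as in the output
f_ratio = t_ratio ** 2                        # two-group ANOVA: F = t^2
reject_h0 = abs(t_ratio) > T_CRIT_16DF

print(f"Std Err Dif = {std_err_diff:.3f}, t = {t_ratio:.4f}, F = {f_ratio:.4f}")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```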
PART THREE

Exercises for One-Way ANOVA: CRD


For the data sets given below answer the following questions:
• What is (are) the factor(s) in this experiment?
• What are the levels of the factor(s) in this experiment?
• Is there block effect(s), if yes give the block(s)?
• State the null & alternative hypotheses!
• Undertake ANOVA using JMP & based on the output you get interpret your results and give
your decision & recommendation?
• Which group do you recommend as best? Why?
• Give the compare and contrast of this model using the relative efficiency value in
comparison to RCBD/LSD depending on the presence of block(s)?

Data Set 1 Data Set 2


Treatments Treatments
1 2 3 4 5 1 2 3 4 5
4.12 6.42 5.64 5.80 6.93 1.06 2.21 1.82 3.40 2.47
12.31 9.98 10.68 12.14 17.35 5.15 3.99 4.34 6.57 7.68
9.36 13.87 9.82 12.22 10.18 3.68 5.94 3.91 6.61 4.09
11.32 9.31 10.66 12.52 9.65 4.66 3.66 4.33 6.76 3.83
12.02 16.81 8.84 9.88 11.67 5.01 7.40 3.42 5.44 4.84
13.48 9.99 10.26 11.27 13.37 5.74 4.00 4.13 6.14 5.68
11.23 12.94 11.66 11.40 12.15 4.61 5.47 4.83 6.20 5.07
10.19 12.75 12.36 13.15 15.63 4.10 5.37 5.18 7.07 6.82

Data Set 3 Data Set 4


Treatments Treatments
1 2 3 4 5 1 2 3 4 5
16.18 7.64 13.95 19.20 9.40 41.06 53.59 56.99 49.33 45.62
28.46 12.96 21.52 28.71 25.03 49.94 66.19 72.86 75.38 55.03
24.04 18.81 20.23 28.83 14.26 59.69 64.05 73.05 57.44 59.69
26.99 11.97 21.50 29.28 13.48 48.29 66.16 73.81 56.13 57.55
28.03 23.21 18.75 25.32 16.51 67.02 61.59 67.19 61.18 59.05
30.22 12.99 20.89 27.41 19.05 49.98 65.15 70.68 65.41 66.27
26.84 17.40 22.99 27.60 17.22 57.34 68.66 71.01 62.37 57.74
25.29 17.12 24.04 30.22 22.45 56.86 70.40 75.37 71.08 50.69

Solution
Data set 1
Response yield
Whole Model
Actual by Predicted Plot
[Plot of yield against yield Predicted; RMSE=2.811, RSq=0.07, PValue=0.5968]

Summary of Fit

RSquare 0.074132

RSquare Adj -0.03168

Root Mean Square Error 2.811003

Mean of Response 11.03325


Observations (or Sum Wgts) 40

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       4         22.14346       5.53587    0.7006     0.5968
Error      35        276.56081       7.90174
C. Total   39        298.70428

Effect Tests
Source   Nparm   DF   Sum of Squares   Mean Square   F Ratio   Prob > F
TRT          4    4        22.143465      5.535866    0.7006     0.5968

TRT
Leverage Plot
[Leverage plot for TRT; TRT Leverage, P=0.5968]

Least Squares Means Table

Level Least Sq Mean Std Error Mean

1 10.503750 0.99383962 10.5038

2 11.508750 0.99383962 11.5088

3 9.990000 0.99383962 9.9900

4 11.047500 0.99383962 11.0475

5 12.116250 0.99383962 12.1163
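
The sums of squares, F ratio, and treatment means reported for Data Set 1 can be recomputed directly from the raw observations. The following is a minimal Python sketch of a one-way (CRD) ANOVA, assuming the eight observations per treatment are read column by column from the Data Set 1 table. The function one_way_anova is a hypothetical helper, not a JMP or library routine, and it does not compute the p-value, which is taken from the JMP output.

```python
# Data Set 1: one list per treatment (8 observations each), copied from the table.
data_set_1 = {
    1: [4.12, 12.31, 9.36, 11.32, 12.02, 13.48, 11.23, 10.19],
    2: [6.42, 9.98, 13.87, 9.31, 16.81, 9.99, 12.94, 12.75],
    3: [5.64, 10.68, 9.82, 10.66, 8.84, 10.26, 11.66, 12.36],
    4: [5.80, 12.14, 12.22, 12.52, 9.88, 11.27, 11.40, 13.15],
    5: [6.93, 17.35, 10.18, 9.65, 11.67, 13.37, 12.15, 15.63],
}

def one_way_anova(groups):
    """Return one-way ANOVA quantities for a dict of {level: observations}."""
    all_values = [y for ys in groups.values() for y in ys]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    means = {level: sum(ys) / len(ys) for level, ys in groups.items()}

    # Between-treatment (Model) and within-treatment (Error) sums of squares.
    ss_model = sum(len(ys) * (means[level] - grand_mean) ** 2
                   for level, ys in groups.items())
    ss_error = sum((y - means[level]) ** 2
                   for level, ys in groups.items() for y in ys)

    df_model = len(groups) - 1
    df_error = n_total - len(groups)
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "means": means,
        "df_model": df_model,
        "df_error": df_error,
        "ss_model": ss_model,
        "ss_error": ss_error,
        "ms_model": ms_model,
        "ms_error": ms_error,
        "f_ratio": ms_model / ms_error,
    }

result = one_way_anova(data_set_1)
print(f"SS Model = {result['ss_model']:.5f}, SS Error = {result['ss_error']:.5f}")
print(f"F Ratio = {result['f_ratio']:.4f} "
      f"on {result['df_model']} and {result['df_error']} DF")
```

The same function can be applied to Data Sets 2 to 4 by entering their columns in the same way.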


Data set 2

Response yield
Whole Model
Actual by Predicted Plot
[Plot of yield against yield Predicted; RMSE=1.4048, RSq=0.23, PValue=0.0564]

Effect Summary

Source   LogWorth   PValue
TRT         1.248   0.05644

Residual by Predicted Plot
[Plot of yield residuals (axis from -4 to 3) against yield Predicted]

Summary of Fit

RSquare 0.225568

RSquare Adj 0.137061

Root Mean Square Error 1.404831

Mean of Response 4.817

Observations (or Sum Wgts) 40

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       4        20.119165       5.02979    2.5486     0.0564
Error      35        69.074275       1.97355
C. Total   39        89.193440
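
The Summary of Fit values for Data Set 2 follow from the sums of squares and degrees of freedom in the Analysis of Variance table. The short sketch below shows the arithmetic, assuming only the values printed in that table. The variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom from the Data Set 2 ANOVA table.
ss_model, df_model = 20.119165, 4
ss_error, df_error = 69.074275, 35
ss_total, df_total = 89.193440, 39

ms_model = ss_model / df_model
ms_error = ss_error / df_error

rsquare = ss_model / ss_total                            # share of variation explained
rsquare_adj = 1.0 - ms_error / (ss_total / df_total)     # penalized for model DF
rmse = math.sqrt(ms_error)                               # Root Mean Square Error
f_ratio = ms_model / ms_error

print(f"RSquare = {rsquare:.6f}, RSquare Adj = {rsquare_adj:.6f}")
print(f"RMSE = {rmse:.6f}, F Ratio = {f_ratio:.4f}")
```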

Effect Tests

Source   Nparm   DF   Sum of Squares   Mean Square   F Ratio   Prob > F
TRT          4    4        20.119165      5.029791    2.5486     0.0564

TRT
Leverage Plot
[Leverage plot for TRT; TRT Leverage, P=0.0564]

Least Squares Means Table

Level Least Sq Mean Std Error Mean

1 4.2512500 0.49668284 4.25125

2 4.7550000 0.49668284 4.75500

3 3.9950000 0.49668284 3.99500

4 6.0237500 0.49668284 6.02375

5 5.0600000 0.49668284 5.06000

Data set 3
Response yield
Whole Model
Actual by Predicted Plot
[Plot of yield against yield Predicted; RMSE=4.2162, RSq=0.58, PValue<.0001]

Residual by Predicted Plot
[Plot of yield residuals against yield Predicted]

Summary of Fit

RSquare 0.579513
RSquare Adj 0.531457
Root Mean Square Error 4.216234
Mean of Response 21.14975
Observations (or Sum Wgts) 40

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       4         857.4873       214.372   12.0592    <.0001*
Error      35         622.1820        17.777
C. Total   39        1479.6693

Effect Tests

Source   Nparm   DF   Sum of Squares   Mean Square   F Ratio   Prob > F
TRT          4    4        857.48729      214.3718   12.0592    <.0001*

TRT
Leverage Plot
[Leverage plot for TRT; TRT Leverage, P<.0001]

Least Squares Means Table

Level Least Sq Mean Std Error Mean

1 25.756250 1.4906638 25.7563

2 15.262500 1.4906638 15.2625

3 20.483750 1.4906638 20.4838

4 27.071250 1.4906638 27.0713

5 17.175000 1.4906638 17.1750

LSMeans Differences Tukey HSD

α=0.050 Q=2.87506
LSMean[i] By LSMean[j]

i  Statistic              1         2         3         4         5
1  Mean[i]-Mean[j]        0   10.4938    5.2725    -1.315   8.58125
   Std Err Dif            0   2.10812   2.10812   2.10812   2.10812
   Lower CL Dif           0   4.43278   -0.7885    -7.376   2.52028
   Upper CL Dif           0   16.5547   11.3335   4.74597   14.6422
2  Mean[i]-Mean[j]  -10.494         0   -5.2213   -11.809   -1.9125
   Std Err Dif      2.10812         0   2.10812   2.10812   2.10812
   Lower CL Dif     -16.555         0   -11.282    -17.87   -7.9735
   Upper CL Dif     -4.4328         0   0.83972   -5.7478   4.14847
3  Mean[i]-Mean[j]  -5.2725   5.22125         0   -6.5875   3.30875
   Std Err Dif      2.10812   2.10812         0   2.10812   2.10812
   Lower CL Dif     -11.333   -0.8397         0   -12.648   -2.7522
   Upper CL Dif     0.78847   11.2822         0   -0.5265   9.36972
4  Mean[i]-Mean[j]    1.315   11.8088    6.5875         0   9.89625
   Std Err Dif      2.10812   2.10812   2.10812         0   2.10812
   Lower CL Dif      -4.746   5.74778   0.52653         0   3.83528
   Upper CL Dif     7.37597   17.8697   12.6485         0   15.9572
5  Mean[i]-Mean[j]  -8.5813    1.9125   -3.3088   -9.8963         0
   Std Err Dif      2.10812   2.10812   2.10812   2.10812         0
   Lower CL Dif     -14.642   -4.1485   -9.3697   -15.957         0
   Upper CL Dif     -2.5203   7.97347   2.75222   -3.8353         0

Level   Letters   Least Sq Mean
4       A         27.071250
1       A B       25.756250
3         B C     20.483750
5           C     17.175000
2           C     15.262500

Levels not connected by the same letter are significantly different.
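
The confidence limits in the Tukey HSD table for Data Set 3 can be rebuilt from the least squares means, the Error mean square, and the multiplier Q printed by JMP. The sketch below is an illustration under those assumptions: it takes Q=2.87506 as given rather than deriving it from the studentized range distribution, and it flags a pair as significant when its interval excludes zero. The variable names are illustrative.

```python
import math
from itertools import combinations

# Least squares means, Error mean square, and group size for Data Set 3.
ls_means = {1: 25.75625, 2: 15.2625, 3: 20.48375, 4: 27.07125, 5: 17.175}
mse = 17.777
n_per_group = 8
q_jmp = 2.87506        # multiplier "Q" exactly as printed in the JMP output

std_err_diff = math.sqrt(2.0 * mse / n_per_group)
half_width = q_jmp * std_err_diff

significant_pairs = []
for i, j in combinations(sorted(ls_means), 2):
    diff = ls_means[i] - ls_means[j]
    lower, upper = diff - half_width, diff + half_width
    is_significant = lower > 0 or upper < 0      # interval excludes zero
    if is_significant:
        significant_pairs.append((i, j))
    print(f"{i} vs {j}: diff = {diff:8.4f}, CL = ({lower:8.4f}, {upper:8.4f})"
          f"{' *' if is_significant else ''}")

print("Significantly different pairs:", significant_pairs)
```

Pairs that are not flagged share at least one letter in the connecting letters report.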

Data set 4
Response yield
Whole Model
Actual by Predicted Plot
[Plot of yield against yield Predicted; RMSE=6.8385, RSq=0.45, PValue=0.0002]

Residual by Predicted Plot
[Plot of yield residuals (axis from -15 to 15) against yield Predicted]

Summary of Fit

RSquare 0.452224
RSquare Adj 0.389621
Root Mean Square Error 6.838489
Mean of Response 61.42225
Observations (or Sum Wgts) 40

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model       4        1351.2620       337.815    7.2237    0.0002*
Error      35        1636.7725        46.765
C. Total   39        2988.0345

Effect Tests

Source   Nparm   DF   Sum of Squares   Mean Square   F Ratio   Prob > F
TRT          4    4        1351.2620      337.8155    7.2237    0.0002*

TRT
Leverage Plot
[Leverage plot for TRT; TRT Leverage, P=0.0002]

Least Squares Means Table

Level   Least Sq Mean   Std Error   Mean
1       53.772500       2.4177709   53.7725
2       64.473750       2.4177709   64.4738
3       70.120000       2.4177709   70.1200
4       62.290000       2.4177709   62.2900
5       56.455000       2.4177709   56.4550

LSMeans Differences Tukey HSD

α=0.050 Q=2.87506
LSMean[i] By LSMean[j]

i  Statistic              1         2         3         4         5
1  Mean[i]-Mean[j]        0   -10.701   -16.348   -8.5175   -2.6825
   Std Err Dif            0   3.41924   3.41924   3.41924   3.41924
   Lower CL Dif           0   -20.532   -26.178   -18.348   -12.513
   Upper CL Dif           0   -0.8707    -6.517   1.31304   7.14804
2  Mean[i]-Mean[j]  10.7013         0   -5.6463   2.18375   8.01875
   Std Err Dif      3.41924         0   3.41924   3.41924   3.41924
   Lower CL Dif     0.87071         0   -15.477   -7.6468   -1.8118
   Upper CL Dif     20.5318         0   4.18429   12.0143   17.8493
3  Mean[i]-Mean[j]  16.3475   5.64625         0      7.83    13.665
   Std Err Dif      3.41924   3.41924         0   3.41924   3.41924
   Lower CL Dif     6.51696   -4.1843         0   -2.0005   3.83446
   Upper CL Dif      26.178   15.4768         0   17.6605   23.4955
4  Mean[i]-Mean[j]   8.5175   -2.1837     -7.83         0     5.835
   Std Err Dif      3.41924   3.41924   3.41924         0   3.41924
   Lower CL Dif      -1.313   -12.014   -17.661         0   -3.9955
   Upper CL Dif      18.348   7.64679   2.00054         0   15.6655
5  Mean[i]-Mean[j]   2.6825   -8.0187   -13.665    -5.835         0
   Std Err Dif      3.41924   3.41924   3.41924   3.41924         0
   Lower CL Dif      -7.148   -17.849   -23.496   -15.666         0
   Upper CL Dif      12.513   1.81179   -3.8345   3.99554         0

Level   Letters   Least Sq Mean
3       A         70.120000
2       A B       64.473750
4       A B C     62.290000
5         B C     56.455000
1           C     53.772500

Levels not connected by the same letter are significantly different.

For the data sets given above answer the following questions:
1. What is (are) the factor(s) in this experiment?
• Each of the four data sets above has one (1) factor: the treatment (TRT).
2. What are the levels of the factor(s) in this experiment?
• The treatment factor has five (5) levels (treatments 1 to 5) in each of the four data sets.
3. Is there block effect(s), if yes give the block(s)?
• There is no block effect in any of the four data sets, because the design is a CRD,
which has no blocks.
4. State the null & alternative hypotheses!
• H0: µ1 = µ2 = µ3 = µ4 = µ5 (all means are equal)
• HA: not all the means are equal (at least one mean differs from the others)

5. Undertake ANOVA using JMP & based on the output you get interpret
your results and give your decision & recommendation?
Result and interpretation
In Data set 1;
Since Prob > F = 0.5968 is greater than α = 0.05, we fail to reject the null
hypothesis and conclude that there is no evidence that the means of treatments
1, 2, 3, 4, and 5 differ.
In Data set 2;
Since Prob > F = 0.0564 is greater than α = 0.05, we fail to reject the null
hypothesis and conclude that there is no evidence at the 5% level that the means of
treatments 1, 2, 3, 4, and 5 differ.
In Data set 3;
Since Prob > F < .0001 is less than α = 0.05, we reject the null hypothesis
and conclude that the means of treatments 1, 2, 3, 4, and 5 are not all equal.
Based on the Tukey HSD multiple comparison procedure (MCP), treatment four (4) has the
highest mean (27.07) and does not differ significantly from treatment one (1) (25.76).
Treatments two (2) (15.26) and five (5) (17.18) have the lowest means and are significantly
lower than both treatments 4 and 1. Treatment three (3) (20.48) is intermediate and differs
significantly only from treatment 4.

In Data set 4;
Since Prob > F = 0.0002 is less than α = 0.05, we reject the null hypothesis
and conclude that the means of treatments 1, 2, 3, 4, and 5 are not all equal.
Based on the Tukey HSD multiple comparison procedure (MCP), treatment three (3) has the
highest mean (70.12) and does not differ significantly from treatments two (2) (64.47) and
four (4) (62.29). Treatments one (1) (53.77) and five (5) (56.46) have the lowest means:
treatment 3 is significantly higher than both, and treatment 2 is significantly higher
than treatment 1.
6. Which group do you recommend as best? Why?
Option 1:
Based on the ANOVA results, we recommend Data Set 3 and Data Set 4 because their
treatment means differ significantly (p-values <.0001 and 0.0002, respectively),
while Data Set 1 and Data Set 2 show no significant difference.
Option 2:
Based on the strength of the evidence, we recommend Data Set 3 because its p-value
(<.0001) is smaller than that of Data Set 4 (0.0002).

7. Give the compare and contrast of this model using the relative efficiency value
in comparison to RCBD/LSD depending on the presence of block(s)?
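Relative efficiency of RCBD compared with CRD, computed as
RE = [MScolumns + (t – 1) × MSerror] / (t × MSerror), where t is the number of treatments.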

Solution
Data set 1
Given:
MScolumns = 5.535866
Treatment = 5
MSerror = 7.90174
RE = [5.535866 + (5 – 1) × 7.90174] / (5 × 7.90174) = 0.9401
Data set 2
Given:
MScolumns = 5.029791
Treatment = 5
MSerror = 1.97355
RE = [5.029791 + (5 – 1) × 1.97355] / (5 × 1.97355) = 1.30972
Data set 3
Given:
MScolumns = 214.3718
Treatment = 5
MSerror = 17.777
RE = [214.3718 + (5 – 1) × 17.777] / (5 × 17.777) = 3.2118

Data set 4
Given:
MScolumns = 337.8155
Treatment = 5
MSerror = 46.765
RE = [337.8155 + (5 – 1) × 46.765] / (5 × 46.765) = 2.2447

Interpretation: the decision is based on the following rule.
o If RE > 1, then RCBD is more efficient than CRD
o If RE = 1, then both RCBD & CRD are equally efficient
o If RE < 1, then CRD is better
For data sets 2, 3, and 4 the relative efficiency is greater than one (RE > 1), which
leads to the decision that RCBD is more efficient than CRD for those data sets.
For data set 1, RE < 1, so CRD is better.
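
The relative efficiency arithmetic above can be checked with a few lines of Python. This is a minimal sketch, assuming the same expression used in the calculations, RE = [MScolumns + (t – 1) × MSerror] / (t × MSerror), and the mean squares listed under "Given" for each data set. The function name relative_efficiency is hypothetical.

```python
def relative_efficiency(ms_columns, ms_error, t):
    """Relative efficiency using the expression applied in the text above."""
    return (ms_columns + (t - 1) * ms_error) / (t * ms_error)

# (MScolumns, MSerror) as listed under "Given" for each data set; t = 5 treatments.
inputs = {
    "Data set 1": (5.535866, 7.90174),
    "Data set 2": (5.029791, 1.97355),
    "Data set 3": (214.3718, 17.777),
    "Data set 4": (337.8155, 46.765),
}

results = {name: relative_efficiency(ms_col, ms_err, t=5)
           for name, (ms_col, ms_err) in inputs.items()}

for name, value in results.items():
    if value > 1:
        verdict = "RCBD is more efficient than CRD"
    elif value < 1:
        verdict = "CRD is better"
    else:
        verdict = "RCBD and CRD are equally efficient"
    print(f"{name}: RE = {value:.4f} -> {verdict}")
```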
