
Chap. 6, page 1

The overall F-test is only one step in the comparison of several groups. In Chapter 5, we saw

how the pooled standard deviation could be used in confidence intervals and hypothesis tests for

the difference of any pair of group means. We can generalize this to confidence intervals and

hypothesis tests for any linear combination of group means.

γ = C1 µ1 + C2 µ2 + … + CI µI

Mice example:

Suppose I wanted to examine the difference in the mean lifetimes of the two control diets:

µ1 − µ2. Then C1 = 1, C2 = −1, C3 = 0, C4 = 0, C5 = 0, C6 = 0.

More complicated combinations are sometimes of interest: suppose I wanted to compare the

average of the two control diets to the average of the four reduced calorie diets: what are the

Ci ’s for it?

In the above two examples, the sum of the Ci ’s is 0 in each case. A linear combination of the

means in which the coefficients sum to 0 is called a contrast because it compares or contrasts

some means with others. Specific contrasts are often of interest in ANOVA, but we can create a

confidence interval for any linear combination of means; it does not need to be a contrast.

Which of the following linear combinations of means are contrasts?

a) (µ1 − µ3) + (µ4 − µ5)

b) (µ1 + µ2 + µ3 + µ4 + µ5)/5

c) 2µ1 − µ2 − µ3

d) (µ2 + µ3 + µ4 + µ5)/4 − µ1

e) µ1 + µ2 + µ5 − µ3 − µ4

f) (µ3 − µ2)/35 − (µ6 − µ3)/10


(This is from the mice example on p. 157 and compares the increase in mean life expectancy per calorie of going from N/R85 to N/R50 to the increase per calorie of going from N/R50 to N/R40.)
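As a quick check, here is a short Python sketch; the coefficient vectors are one reading of parts a)–f) above (order C1, …, C6), and a combination is a contrast exactly when its coefficients sum to 0.

```python
# A linear combination of group means is a contrast iff its
# coefficients sum to 0. Vectors below encode parts a)-f).
combos = {
    "a": [1, 0, -1, 1, -1, 0],                  # (mu1 - mu3) + (mu4 - mu5)
    "b": [1/5, 1/5, 1/5, 1/5, 1/5, 0],          # average of mu1..mu5
    "c": [2, -1, -1, 0, 0, 0],                  # 2 mu1 - mu2 - mu3
    "d": [-1, 1/4, 1/4, 1/4, 1/4, 0],           # avg(mu2..mu5) - mu1
    "e": [1, 1, -1, -1, 1, 0],                  # mu1 + mu2 + mu5 - mu3 - mu4
    "f": [0, -1/35, 1/35 + 1/10, 0, 0, -1/10],  # (mu3-mu2)/35 - (mu6-mu3)/10
}
is_contrast = {name: abs(sum(c)) < 1e-12 for name, c in combos.items()}
print(is_contrast)  # a, c, d, f are contrasts; b and e are not
```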

Estimate γ = C1 µ1 + … + CI µI by g = C1 Ȳ1 + … + CI ȲI.

SD(g) = σ √(C1²/n1 + C2²/n2 + … + CI²/nI)

We estimate σ by the pooled standard deviation sp. Plugging this estimate into SD(g) gives the estimated standard deviation of g, which is the standard error of g:

SE(g) = sp √(C1²/n1 + C2²/n2 + … + CI²/nI)

A 100(1−α)% confidence interval for γ is

g ± t_df(1 − α/2) SE(g)

where df is the degrees of freedom for sp (that is, n − I). A test of the hypothesis H0: γ = 0 is carried out using the test statistic t = g/SE(g).

Examples:

1. Compute a confidence interval for the difference between mean lifetimes for the laboratory control (N/N85) and the unrestricted controls (NP):

g = 32.691 − 27.402 = 5.289

SE(g) = 6.678 √(1/57 + 1/49) = 1.301


2. Compute an estimate of the contrast which is the average of the two control diets minus the average of the four reduced-calorie diets, along with a confidence interval.

g = (27.402 + 32.691)/2 − (42.297 + 42.886 + 39.686 + 45.117)/4 = −12.45

SE(g) = 6.678 √((1/2)²/49 + (1/2)²/57 + (1/4)²/71 + (1/4)²/56 + (1/4)²/56 + (1/4)²/60) = 0.7800

95% CI: −12.45 ± 1.967(0.7800), i.e. from −13.98 to −10.92.

We are 95% confident that the mean life expectancy of the two control diets is from 10.9 to 14.0 months less than the mean lifetime of the four restricted diets.
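The arithmetic in Example 2 can be reproduced with a short Python sketch; the group means and sample sizes are taken from the SPSS output later in these notes, and sp = 6.678 is the pooled standard deviation quoted above.

```python
from math import sqrt

# Group means and sample sizes as reported in the SPSS output
# in these notes (mice diet data); sp is the pooled SD.
means = {"NP": 27.402, "N/N85": 32.691, "N/R50": 42.297,
         "R/R50": 42.886, "N/R lopro": 39.686, "N/R40": 45.117}
ns = {"NP": 49, "N/N85": 57, "N/R50": 71,
      "R/R50": 56, "N/R lopro": 56, "N/R40": 60}
sp = 6.678

# Contrast: average of the two control diets minus the average
# of the four reduced-calorie diets.
coef = {"NP": 1/2, "N/N85": 1/2, "N/R50": -1/4,
        "R/R50": -1/4, "N/R lopro": -1/4, "N/R40": -1/4}

g = sum(coef[d] * means[d] for d in means)                # about -12.45
se = sp * sqrt(sum(coef[d] ** 2 / ns[d] for d in means))  # about 0.780
lo, hi = g - 1.967 * se, g + 1.967 * se                   # t_343(.975) = 1.967
```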

Simultaneous Inferences

Fishing Expeditions and Data Snooping: tests based on how the data turned out.

Say you would like to study a particular stream in Chile that is being used to dispose of waste

from pulp manufacture.

i) Measure the concentrations of contaminants in the water

ii) Measure different stream characteristics such as species diversity (plants and

animals), species richness (plants and animals), brood size for several species of fish,

asymmetry in fish, prevalence of bacterial infection (plants and animals) etc., in the

stream and in several other uncontaminated streams in the region.

iii) Keep measuring stream characteristics until you find a difference between the pulp waste stream and the others that is significant at the 5% level.

This example, and the one found in your text (which you should study until you understand it), illustrate the need for family-wise control of the alpha value, or confidence level.


If we form individual 95% confidence intervals for a set of linear combinations of means, then

we cannot be 95% confident that they all include the true parameters they’re estimating. The

actual confidence that a family of confidence intervals are simultaneously correct is called the

familywise confidence level.

Example: Say we conduct an experiment where we make 10 pairwise comparisons and control each of them at the 95% confidence level. What is the probability of at least one Type I error occurring in the experiment?

Let p = Pr(success) = 0.95, q = Pr(failure) = 0.05, where success means that we do not make a Type I error, and let x be a binomial random variable. Also, assume that each comparison is independent of every other. Then the probability of at least one Type I error is given by

Pr(x ≥ 1) = 1 − Pr(x = 0) = 1 − (10 choose 0) p^10 q^0 = 1 − (0.95)^10 = 1 − 0.5987 = 0.4013
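The same number falls out of a one-line computation (a sketch under the same independence assumption):

```python
# Probability of at least one Type I error across 10 independent
# comparisons, each controlled at the 95% confidence level.
k, p_success = 10, 0.95
p_at_least_one_error = 1 - p_success ** k
print(round(p_at_least_one_error, 4))  # 0.4013
```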

Hence, if we would like to control our probability of Type I error for the entire experiment, we must make some adjustments. Unfortunately, the above calculation depends on the comparisons being independent, which is probably not the case for most experiments, so we cannot calculate family-wise probabilities exactly. For this reason, there are several different versions of alpha-correction techniques.

The most common form of family-wise correction uses the Bonferroni inequality to create simultaneous confidence intervals with any desired familywise confidence level. To create 100(1−α)% simultaneous confidence intervals for k parameters, we make each confidence interval an individual 100(1 − α/k)% confidence interval.
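A minimal sketch of the Bonferroni adjustment, using the standard-library normal quantile as an approximation to the t multiplier (reasonable for the 343 within-group d.f. in the mice example; the exact multiplier would use the t distribution):

```python
from statistics import NormalDist

# Bonferroni: for a 95% familywise level over k two-sided intervals,
# run each interval at the individual 100(1 - alpha/k)% level.
alpha, k = 0.05, 15              # e.g. all 15 pairwise comparisons of 6 means
tail_prob = 1 - alpha / (2 * k)  # upper-tail point for each interval
multiplier = NormalDist().inv_cdf(tail_prob)  # about 2.93, vs 1.96 unadjusted
```

The adjusted multiplier (about 2.93) replaces the unadjusted 1.96, widening every interval so that all 15 hold simultaneously with at least 95% confidence.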

Bonferroni guarantees that the familywise confidence level is at least 1-α, but it can be overkill,

especially when k is large. There are several ways that have been developed for creating

simultaneous confidence intervals among means that can be less drastic.

• Planned comparisons: contrasts which the researcher decides are of interest before the

data are collected. We can control the familywise confidence level using the Bonferroni

inequality or one of the other methods listed below.

• Unplanned comparisons: contrasts which the researcher decides are of interest after

examining the data. These may be chosen from a larger set of contrasts which have been

examined or may be chosen after looking at the data to suggest contrasts of interest. The

confidence intervals must take into account that you actually (in the first case) or

essentially (in the second case) examined a large number of contrasts and picked out the

most “significant” one or ones.


In the specific case of all pairwise comparisons of group means, a number of procedures have

been developed to control the familywise error rate. The primary one is

• Tukey-Kramer procedure (for all planned or unplanned pairwise comparisons)

In the general case of contrasts (or any linear combinations of the means) which are not

necessarily pairwise comparisons, there are two main choices. These methods can also be used

for pairwise comparisons.

• Planned comparisons: Bonferroni

• Unplanned comparisons: Scheffe (can also be used for planned comparisons)

In all these cases, the confidence interval for a contrast γ always has the form:

g ± (multiplier) × SE(g)

The specific method used determines only the multiplier. If you have a legitimate choice between two or more procedures, you can choose the one with the smaller multiplier.

In SPSS, the standard errors of one or more contrasts can be calculated by selecting the Contrasts

button on the One-way ANOVA window. You will have to find the value of the appropriate

multiplier to create a confidence interval for the contrast.

Pairwise comparisons between all pairs of means can be obtained by clicking the Post Hoc

button in the One-Way ANOVA window. It will automatically give you confidence intervals for

the difference between each pair of means. There are a multitude of options there; the ones

corresponding to the ones mentioned here are:

Bonferroni (SPSS: Bonferroni): t_{n−I}(1 − α/2k), where k = (I choose 2) is the number of pairwise comparisons of means

Tukey-Kramer (SPSS: Tukey, not Tukey's-b): q_{I,n−I}(1 − α)/√2 (q is from Table A.5)


There are I = 6 groups and n − I = 343 d.f. within groups. There are (6 choose 2) = 6!/(2! 4!) = 15 pairwise comparisons. The coefficients or multipliers for 95% confidence intervals for the difference between each pair of means are:

1. LSD: t_343(.975) = 1.967 (approx. 1.984 using Table A.2 with 100 d.f.)

2. Bonferroni: t_343(1 − .05/(2·15)) = t_343(.99833)

3. Scheffe: √((I − 1) F_{I−1,n−I}(.95)) (approx. √(5(2.26)) = 3.36 using Table A.4 with df2 = 200)

4. Tukey-Kramer: q_{6,343}(.95)/√2 = 2.866 (approx. 4.10/√2 = 2.90 using Table A.5 with 120 d.f.)
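The table-based approximations quoted above can be checked in a couple of lines; q = 4.10 and F = 2.26 are the table values cited in the notes, so treat this as a sketch of the arithmetic only:

```python
from math import sqrt

# Tukey-Kramer multiplier: q / sqrt(2), with q = 4.10 from Table A.5
# (studentized range, 120 d.f. approximating 343 d.f.).
tukey_kramer = 4.10 / sqrt(2)   # about 2.90

# Scheffe multiplier: sqrt((I - 1) * F), with I = 6 groups and
# F = 2.26 from Table A.4 (df2 = 200 approximating 343).
scheffe = sqrt((6 - 1) * 2.26)  # about 3.36
```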

If I had just been interested in all pairwise comparisons a priori then I would use Tukey-Kramer.

If there were other pre-planned contrasts I were interested in in addition to all pairwise

comparisons, then I would either use Bonferroni (but I would have to increase k to reflect the

additional contrasts) or Scheffe, whichever were smaller. If there were additional unplanned

comparisons, then I would use Scheffe for all comparisons.


Post Hoc Tests

Multiple Comparisons

Tukey HSD

(I) Diet    (J) Diet    Mean Diff. (I-J)  Std. Error  Sig.   95% CI Lower  95% CI Upper
NP          N/N85       -5.289*           1.301       .001   -9.018        -1.561
            N/R50       -14.895*          1.240       .000   -18.450       -11.341
            R/R50       -15.484*          1.306       .000   -19.228       -11.740
            N/R lopro   -12.284*          1.306       .000   -16.028       -8.540
            N/R40       -17.715*          1.286       .000   -21.400       -14.029
N/N85       NP          5.289*            1.301       .001   1.561         9.018
            N/R50       -9.606*           1.188       .000   -13.010       -6.202
            R/R50       -10.194*          1.257       .000   -13.796       -6.593
            N/R lopro   -6.994*           1.257       .000   -10.596       -3.393
            N/R40       -12.425*          1.235       .000   -15.965       -8.885
N/R50       NP          14.895*           1.240       .000   11.341        18.450
            N/N85       9.606*            1.188       .000   6.202         13.010
            R/R50       -.589             1.194       .996   -4.009        2.832
            N/R lopro   2.611             1.194       .246   -.809         6.032
            N/R40       -2.819            1.171       .156   -6.176        .537
R/R50       NP          15.484*           1.306       .000   11.740        19.228
            N/N85       10.194*           1.257       .000   6.593         13.796
            N/R50       .589              1.194       .996   -2.832        4.009
            N/R lopro   3.200             1.262       .117   -.417         6.817
            N/R40       -2.231            1.241       .468   -5.787        1.325
N/R lopro   NP          12.284*           1.306       .000   8.540         16.028
            N/N85       6.994*            1.257       .000   3.393         10.596
            N/R50       -2.611            1.194       .246   -6.032        .809
            R/R50       -3.200            1.262       .117   -6.817        .417
            N/R40       -5.431*           1.241       .000   -8.987        -1.875
N/R40       NP          17.715*           1.286       .000   14.029        21.400
            N/N85       12.425*           1.235       .000   8.885         15.965
            N/R50       2.819             1.171       .156   -.537         6.176
            R/R50       2.231             1.241       .468   -1.325        5.787
            N/R lopro   5.431*            1.241       .000   1.875         8.987

*. The mean difference is significant at the .05 level.

Homogeneous Subsets

Months survived
Tukey HSD (a,b)

                       Subset for alpha = .05
Diet         N     1        2        3        4
NP          49     27.402
N/N85       57              32.691
N/R lopro   56                       39.686
N/R50       71                       42.297   42.297
R/R50       56                       42.886   42.886
N/R40       60                                45.117
Sig.               1.000    1.000    .108     .212

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 57.462.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
