© All Rights Reserved


Analysis of variance (ANOVA) extends the two-sample t-test to more than two samples. (Further methods are given in Chapter 8 of Business Statistics.)

As an example: using the two-sample t-test we have tested whether there was any difference between the size of invoices in a company's Leeds and Bradford stores. Using analysis of variance we can simultaneously investigate invoices from as many towns as we wish, assuming that sufficient data are available.

Problem: Why can't we just carry out repeated t-tests on pairs of the variables?

If many independent tests are carried out pairwise then the probability of being correct for the combined results is greatly reduced. For example, if we compare the average marks of two students at the end of a semester to see whether their mean scores are significantly different we would have, at the 5% level, a 0.95 probability of being correct. Comparing more students:

Students   Pairwise tests   P(all correct)      P(at least one incorrect)
2          1                0.95                0.05
3          3                0.95^3 = 0.857      0.143
n          n(n-1)/2         0.95^(n(n-1)/2)     1 - 0.95^(n(n-1)/2)
10         45               0.95^45 = 0.10      0.90
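The figures in this table can be checked with a few lines of Python. This is a minimal sketch, assuming Python 3.8 or later (for `math.comb`) and assuming, as the text does, that the pairwise tests are independent; the function name `pairwise_error_rates` is made up for illustration.

```python
from math import comb

def pairwise_error_rates(n_groups, alpha=0.05):
    """Return (number of pairwise tests, P(all correct), P(at least one incorrect))."""
    tests = comb(n_groups, 2)              # n(n-1)/2 distinct pairs
    p_all_correct = (1 - alpha) ** tests   # valid only if the tests are independent
    return tests, p_all_correct, 1 - p_all_correct

for n in (2, 3, 10):
    tests, ok, wrong = pairwise_error_rates(n)
    print(f"{n:>2} students: {tests:>2} tests, "
          f"P(all correct) = {ok:.3f}, P(at least one incorrect) = {wrong:.3f}")
```

With ten students the chance of every pairwise conclusion being correct falls to about one in ten, which is the motivation for a single simultaneous test.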

Solution: We therefore need methods of analysis that allow the variation between all n means to be tested simultaneously, giving an overall probability of 0.95 of being correct at the 5% level. This type of analysis is referred to as Analysis of Variance, or 'ANOVA' for short.

In general, ANOVA compares the variation between groups with the variation within samples by analysing their variances.

One-way ANOVA: Is there any difference between the average sales at various departmental stores within a company?

Two-way ANOVA: Is there any difference between the average sales at various stores within a company and/or the types of department? The overall variation is split 'two ways'.

One-way ANOVA

Total variation (SST) =
      variation between the groups, i.e. between the group means (SSG)
    + variation not due to difference between the group means (SSE)

Two-way ANOVA

Total variation (SST) =
      variation between the groups, i.e. between the group means (SSG)
    + variation not due to difference between the main group means (SSE1)

where SSE1 is split further:

SSE1 =
      variation between the block means, i.e. second group means (SSBl)
    + variation not due to difference between either set of group means (SSE)

where SST = Total Sum of Squares; SSG = Treatment Sum of Squares between the groups; SSBl = Blocks Sum of Squares; SSE = Sum of Squares of Errors.

(At this stage just think of 'sums of squares' as being a measure of variation.)

The method of measuring this variation is variance, which is standard deviation squared.

Total variance = between groups variance + variance due to the errors

It follows that:

Total sum of squares (SST) = Sum of squares between the groups (SSG) + Sum of squares due to the errors (SSE)

If we find any two of the three sums of squares then the other can be found by difference. In practice we calculate SST and SSG and then find SSE by difference.
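For reference, the one-way split can be written out explicitly. The subscript notation here is an assumption introduced for illustration and is not used elsewhere in these notes: $x_{ij}$ is the $j$-th value in group $i$, $n_i$ and $\bar{x}_i$ are the size and mean of group $i$, $k$ is the number of groups and $\bar{x}$ is the overall mean.

```latex
\[
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij}-\bar{x})^2}_{\text{SST}}
\;=\;
\underbrace{\sum_{i=1}^{k} n_i\,(\bar{x}_i-\bar{x})^2}_{\text{SSG}}
\;+\;
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij}-\bar{x}_i)^2}_{\text{SSE}}
\]
```

The first term on the right depends only on the group means, which is why SSG can be calculated directly from them and SSE obtained by subtraction.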

Since the method is much easier to understand with a numerical example, it will be explained in stages using theory and a numerical example simultaneously.

Example 1

One important factor in selecting software for word processing and database management systems is the time required to learn how to use a particular system. In order to evaluate three database management systems, a firm devised a test to see how many training hours were needed for five of its word processing operators to become proficient in each of the three systems.

System A    16    19    14    13    18    hours
System B    16    17    13    12    17    hours
System C    24    22    19    18    22    hours

Using a 5% significance level, is there any difference between the training time needed for the three systems?

In this case the 'groups' are the three database management systems. These account for some, but not all, of the total variance. Some, however, is not explained by the difference between them. The residual variance is referred to as that due to the errors.

Total variance = between systems variance + variance due to the errors.

It follows that:

Total sum of squares (SST) = Sum of squares between systems (SSSys) + Sum of squares of errors (SSE)

The 'square' for each case is $(x - \bar{x})^2$ where $x$ is the value for that case and $\bar{x}$ is the mean. The 'total sum of squares' is therefore $\sum (x - \bar{x})^2$. The classical method for calculating this sum is to tabulate the values; subtract the mean from each value; square the results; and finally sum the squares. The use of a statistical calculator is preferable!
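The classical tabulation can also be sketched in Python, using the fifteen training times from Example 1. The variable names are illustrative only.

```python
# Training hours from Example 1, all 15 values pooled together.
hours = [16, 19, 14, 13, 18,   # System A
         16, 17, 13, 12, 17,   # System B
         24, 22, 19, 18, 22]   # System C

mean = sum(hours) / len(hours)

# Tabulate each value, its deviation from the mean, and the squared deviation.
rows = [(x, x - mean, (x - mean) ** 2) for x in hours]
for x, deviation, square in rows:
    print(f"{x:>3} {deviation:>7.2f} {square:>7.2f}")

# The total sum of squares is the sum of the last column.
sst = sum(square for _, _, square in rows)
print(f"mean = {mean:.2f}, total sum of squares = {sst:.1f}")
```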

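In the lecture on summary statistics we saw that the standard deviation is calculated by:

\[ \sigma_n = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \quad \text{or} \quad \sigma_{n-1} = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}} \]

so that $\sum (x - \bar{x})^2 = n\sigma_n^2 = (n-1)\sigma_{n-1}^2$. Both methods give exactly the same value for the total sum of squares.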
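As a quick check that the two routes agree, the sketch below uses Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n - 1. The data are the fifteen training times from Example 1; the variable names are illustrative.

```python
import statistics

hours = [16, 19, 14, 13, 18,   # System A
         16, 17, 13, 12, 17,   # System B
         24, 22, 19, 18, 22]   # System C

n = len(hours)
sd_n = statistics.pstdev(hours)    # standard deviation with divisor n
sd_n1 = statistics.stdev(hours)    # standard deviation with divisor n - 1

# Both routes recover the same total sum of squares.
sst_from_n = n * sd_n ** 2
sst_from_n1 = (n - 1) * sd_n1 ** 2
print(f"n * sd_n^2 = {sst_from_n:.1f}, (n - 1) * sd_n1^2 = {sst_from_n1:.1f}")
```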

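TotalSS: Input all the data individually and output the values for $n$, $\bar{x}$ and $\sigma_n$ from the calculator in SD mode. Use these values to calculate $\sigma_n^2$ and $n\sigma_n^2$.

$n = 15$,  $\bar{x} = 17.33$,  $\sigma_n = 3.419$,  $\sigma_n^2 = 11.69$,  $n\sigma_n^2 = 175.3$ = SS Total

SSSys: Calculate $n$ and $\bar{x}$ for each of the three systems separately:

            n     mean
System A    5     16
System B    5     15
System C    5     21

Inputting the System means as frequency data (n = 5) gives the SS for Systems:

$n = 15$,  $\bar{x} = 17.33$,  $\sigma_n = 2.625$,  $\sigma_n^2 = 6.889$,  $n\sigma_n^2 = 103.3$ = SSSys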
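The between-systems sum of squares can be reproduced without a calculator. The sketch below weights each system mean's squared deviation from the overall mean by the group size, which is equivalent to entering the means as frequency data; the dictionary layout and names are assumptions for illustration.

```python
# Training hours from Example 1, grouped by system.
systems = {
    "A": [16, 19, 14, 13, 18],
    "B": [16, 17, 13, 12, 17],
    "C": [24, 22, 19, 18, 22],
}

all_hours = [x for group in systems.values() for x in group]
grand_mean = sum(all_hours) / len(all_hours)

# Each group mean counts once per observation in that group.
ss_sys = sum(
    len(group) * (sum(group) / len(group) - grand_mean) ** 2
    for group in systems.values()
)
print(f"SSSys = {ss_sys:.1f}")
```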

Below is the general format of an analysis of variance table. If you find it helpful then make use of it, otherwise just work with the numbers, as in the worked example that follows.

General ANOVA Table (for k groups, total sample size N)

Source            S.S.    d.f.                    M.S.S.               F
Between groups    SSG     k-1                     MSG = SSG/(k-1)      F = MSG/MSE
Errors            SSE     (N-1) - (k-1) = N-k     MSE = SSE/(N-k)
Total             SST     N-1

Method

Fill in the total sum of squares, SST, and the between groups sum of squares, SSG, after calculation; find the sum of squares due to the errors, SSE, by difference.

The degrees of freedom, d.f., for the total and the groups are one less than the total number of values and the number of groups respectively; find the error degrees of freedom by difference.

The mean sum of squares, M.S.S., is found in each case by dividing the sum of squares, SS, by the corresponding degrees of freedom.

The test statistic, F, is the ratio of the mean sum of squares due to the differences between the group means to that due to the errors.

Source             S.S.     d.f.            M.S.S.               F
Between systems    103.3    3 - 1 = 2       103.3/2 = 51.65      51.65/6.00 = 8.61
Errors             72.0     14 - 2 = 12     72.0/12 = 6.00
Total              175.3    15 - 1 = 14
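The whole table can be assembled in one pass. The following is a minimal sketch that follows the method described above (SST and SSG calculated directly, SSE and its degrees of freedom found by difference); the function name `one_way_anova` and the dictionary keys are made up for illustration.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of groups."""
    values = [x for group in groups for x in group]
    N, k = len(values), len(groups)
    grand_mean = sum(values) / N

    sst = sum((x - grand_mean) ** 2 for x in values)
    ssg = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    sse = sst - ssg                      # found by difference

    df_g = k - 1
    df_e = (N - 1) - (k - 1)             # found by difference
    msg, mse = ssg / df_g, sse / df_e
    return {"SSG": ssg, "SSE": sse, "SST": sst,
            "df_G": df_g, "df_E": df_e,
            "MSG": msg, "MSE": mse, "F": msg / mse}

result = one_way_anova([
    [16, 19, 14, 13, 18],   # System A
    [16, 17, 13, 12, 17],   # System B
    [24, 22, 19, 18, 22],   # System C
])
for name, value in result.items():
    print(f"{name:>5}: {value:.2f}")
```

The F value is then compared with the tabulated critical value for the same pair of degrees of freedom.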

The methodology for this hypothesis test is similar to that described last week.

The null hypothesis, H0, is that all the group means are equal. H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$ etc.

The alternative hypothesis, H1, is that at least two of the group means are different.

The significance level is as stated or 5% by default.

The critical value is from the F-tables, $F_{\alpha}(\nu_1, \nu_2)$, with the two degrees of freedom from the groups, $\nu_1$, and the errors, $\nu_2$.

The test statistic is the F-value calculated from the sample in the ANOVA table.

The conclusion is reached by comparing the test statistic with the critical value and rejecting the null hypothesis if the test statistic is the larger of the two.

Example 1 (cont.)

H0: $\mu_A = \mu_B = \mu_C$

H1: At least two of the means are different.

Critical value: F0.05 (2,12) = 3.89 (Degrees of freedom from 'between systems' and 'errors'.)

Test statistic: 8.61

Conclusion: T.S. > C.V. so reject H0. There is a difference between the mean learning times for at least two of the three database management systems.

We can calculate a critical difference, CD, which depends on the MSE, the sample sizes and the significance level, such that any difference between means which exceeds the CD is significant and any less than it is not.

The critical difference formula is:

\[ CD = t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

where t has the error degrees of freedom and one tail, and MSE is taken from the ANOVA table.

Example 1 (cont.)

\[ CD = t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = 1.78 \sqrt{6.00 \left( \frac{1}{5} + \frac{1}{5} \right)} = 2.76 \]

System C takes significantly longer to learn than Systems A and B, which are similar.
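The comparison of each pair of systems against the critical difference can be sketched as follows. The t value of 1.78 and the MSE of 6.00 are taken from the example; the system means (16, 15 and 21 hours) follow from the data, and the function name `critical_difference` is an assumption for illustration.

```python
from itertools import combinations
from math import sqrt

def critical_difference(t, mse, n1, n2):
    """CD = t * sqrt(MSE * (1/n1 + 1/n2))."""
    return t * sqrt(mse * (1 / n1 + 1 / n2))

system_means = {"A": 16.0, "B": 15.0, "C": 21.0}   # mean training hours
cd = critical_difference(t=1.78, mse=6.00, n1=5, n2=5)
print(f"CD = {cd:.2f}")

# A pair differs significantly when the gap between its means exceeds the CD.
verdicts = {}
for a, b in combinations(system_means, 2):
    gap = abs(system_means[a] - system_means[b])
    verdicts[(a, b)] = gap > cd
    label = "significant" if verdicts[(a, b)] else "not significant"
    print(f"{a} vs {b}: difference {gap:.1f} is {label}")
```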

Two-way ANOVA

In the above example it might have been reasonable to suggest that the five Operators might have different learning speeds and were therefore responsible for some of the variation in the time needed to master the three Systems. By extending the analysis from one-way ANOVA to two-way ANOVA we can find out whether Operator variability is a significant factor or whether the differences found previously were just due to the Systems.

Example 2

Operator    System A    System B    System C
1           16          16          24
2           19          17          22
3           14          13          19
4           13          12          18
5           18          17          22

Again we ask the same question: using a 5% level, is there any difference between the training time for the three systems? We can use the Operator variation just to explain some of the unexplained error, thereby reducing it (a 'blocked' design), or we can consider it in a similar manner to the System variation in the last example in order to see whether there is a difference between the Operators.

In the first case the 'groups' are the three database management systems and the 'blocks' being used to reduce the error are the different operators, who themselves may differ in speed of learning. In the second we have a second set of groups: the Operators.

Total variance = between systems variance + between operators variance + variance of errors.

So:

Total sum of squares (SST) = Sum of squares between systems (SSSys) + Sum of squares between operators (SSOps) + Sum of squares of errors (SSE)

In 2-way ANOVA we find SST, SSSys, SSOps and then find SSE by difference.

SSE = SST - SSSys - SSOps

We already have SST and SSSys from 1-way ANOVA but still need to find SSOps.

Operator    System A    System B    System C    Means
1           16          16          24          18.67
2           19          17          22          19.33
3           14          13          19          15.33
4           13          12          18          14.33
5           18          17          22          19.00
Means       16.00       15.00       21.00       17.33

SSOps: Inputting the Operator means as frequency data (n = 3) gives:

$n = 15$,  $\bar{x} = 17.33$,  $\sigma_n = 2.078$  and  $n\sigma_n^2 = 64.7$ = SSOps
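The same figure can be obtained directly from the data. In this sketch each operator mean's squared deviation from the overall mean is weighted by 3, the number of systems each operator was timed on; the names used are illustrative.

```python
# Hours from Example 2: one row per operator, columns are Systems A, B, C.
hours_by_operator = {
    1: [16, 16, 24],
    2: [19, 17, 22],
    3: [14, 13, 19],
    4: [13, 12, 18],
    5: [18, 17, 22],
}

values = [x for row in hours_by_operator.values() for x in row]
grand_mean = sum(values) / len(values)

operator_means = {op: sum(row) / len(row) for op, row in hours_by_operator.items()}

# Each operator mean counts once per observation (three systems per operator).
ss_ops = sum(
    len(hours_by_operator[op]) * (m - grand_mean) ** 2
    for op, m in operator_means.items()
)
print(f"SSOps = {ss_ops:.1f}")
```

Working from unrounded means gives 64.67, which agrees with the 64.7 quoted above to one decimal place.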

Source               S.S.     d.f.           M.S.S.               F
Between Systems      103.3    3 - 1 = 2      103.3/2 = 51.65      51.65/0.91 = 56.76
Between Operators    64.7     5 - 1 = 4      64.7/4 = 16.18       16.18/0.91 = 17.78
Errors               7.3      14 - 6 = 8     7.3/8 = 0.91
Total                175.3    15 - 1 = 14
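A minimal sketch of the two-way calculation is given below, assuming one observation per operator and system as in Example 2. The function name `two_way_anova` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
def two_way_anova(rows):
    """Two-way ANOVA without replication.

    rows: one list per block (operator), one entry per group (system).
    """
    b, k = len(rows), len(rows[0])
    values = [x for row in rows for x in row]
    N = b * k
    grand_mean = sum(values) / N

    sst = sum((x - grand_mean) ** 2 for x in values)
    group_means = [sum(row[j] for row in rows) / b for j in range(k)]
    block_means = [sum(row) / k for row in rows]
    ss_groups = b * sum((m - grand_mean) ** 2 for m in group_means)
    ss_blocks = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = sst - ss_groups - ss_blocks            # found by difference

    df_e = (N - 1) - (k - 1) - (b - 1)           # found by difference
    mse = sse / df_e
    return {"SS_groups": ss_groups, "SS_blocks": ss_blocks, "SSE": sse,
            "df_E": df_e, "MSE": mse,
            "F_groups": (ss_groups / (k - 1)) / mse,
            "F_blocks": (ss_blocks / (b - 1)) / mse}

table = two_way_anova([
    [16, 16, 24],   # Operator 1: Systems A, B, C
    [19, 17, 22],   # Operator 2
    [14, 13, 19],   # Operator 3
    [13, 12, 18],   # Operator 4
    [18, 17, 22],   # Operator 5
])
for name, value in table.items():
    print(f"{name:>9}: {value:.2f}")
```

Because this sketch carries full precision throughout, it gives F values of about 56.4 for Systems and 17.6 for Operators, slightly below the 56.76 and 17.78 in the table, which use the rounded MSE of 0.91. The conclusions are unaffected.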

H0: $\mu_A = \mu_B = \mu_C$

Critical value: F0.05 (2,8) = 4.46 (from Table 4)

Test statistic: 56.76 (Notice how the test statistic has increased with the use of the more powerful two-way ANOVA.)

Conclusion: T.S. > C.V. so reject H0. There is a difference between at least two of the mean times needed for training on the different systems.

Using CD of 1.12 (see overhead): C takes significantly longer to learn than both A and B.

H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$

Critical value: F0.05 (4,8) = 3.84 (from Table 4)

Test statistic: 17.78

Conclusion: T.S. > C.V. so reject H0. There is a difference between at least two of the Operators in the mean time needed for learning the systems.

Using CD of 1.45 calculated as previously (see overhead): Operators 3 and 4 are significantly quicker learners than Operators 1, 2 and 5.

Table 4    Critical values of F, α = 5%

Columns: ν1 (degrees of freedom for the groups). Rows: ν2 (degrees of freedom for the errors).

ν2 \ ν1       1        2        3        4        5        6        7        8        9
1        161.40   199.50   215.70   224.60   230.20   234.00   236.80   238.90   240.50
2         18.51    19.00    19.16    19.25    19.30    19.33    19.35    19.37    19.38
3         10.13     9.55     9.28     9.12     9.01     8.94     8.89     8.85     8.81
4          7.71     6.94     6.59     6.39     6.26     6.16     6.09     6.04     6.00
5          6.61     5.79     5.41     5.19     5.05     4.95     4.88     4.82     4.77
6          5.99     5.14     4.76     4.53     4.39     4.28     4.21     4.15     4.10
7          5.59     4.74     4.35     4.12     3.97     3.87     3.79     3.73     3.68
8          5.32     4.46     4.07     3.84     3.69     3.58     3.50     3.44     3.39
9          5.12     4.26     3.86     3.63     3.48     3.37     3.29     3.23     3.18
10         4.96     4.10     3.71     3.48     3.33     3.22     3.14     3.07     3.02
11         4.84     3.98     3.59     3.36     3.20     3.09     3.01     2.95     2.90
12         4.75     3.89     3.49     3.26     3.11     3.00     2.91     2.85     2.80
13         4.67     3.81     3.41     3.18     3.03     2.92     2.83     2.77     2.71
14         4.60     3.74     3.34     3.11     2.96     2.85     2.76     2.70     2.65
15         4.54     3.68     3.29     3.06     2.90     2.79     2.71     2.64     2.59
16         4.49     3.63     3.24     3.01     2.85     2.74     2.66     2.59     2.54
17         4.45     3.59     3.20     2.96     2.81     2.70     2.61     2.55     2.49
18         4.41     3.55     3.16     2.93     2.77     2.66     2.58     2.51     2.46
19         4.38     3.52     3.13     2.90     2.74     2.63     2.54     2.48     2.42
20         4.35     3.49     3.10     2.87     2.71     2.60     2.51     2.45     2.39
21         4.32     3.47     3.07     2.84     2.68     2.57     2.49     2.42     2.37
22         4.30     3.44     3.05     2.82     2.66     2.55     2.46     2.40     2.34
23         4.28     3.42     3.03     2.80     2.64     2.53     2.44     2.37     2.32
24         4.26     3.40     3.01     2.78     2.62     2.51     2.42     2.36     2.30
25         4.24     3.39     2.99     2.76     2.60     2.49     2.40     2.34     2.28
26         4.23     3.37     2.98     2.74     2.59     2.47     2.39     2.32     2.27
27         4.21     3.35     2.96     2.73     2.57     2.46     2.37     2.31     2.25
28         4.20     3.34     2.95     2.71     2.56     2.45     2.36     2.29     2.24
29         4.18     3.33     2.93     2.70     2.55     2.43     2.35     2.28     2.22
30         4.17     3.32     2.92     2.69     2.53     2.42     2.33     2.27     2.21
40         4.08     3.23     2.84     2.61     2.45     2.34     2.25     2.18     2.12
60         4.00     3.15     2.76     2.53     2.37     2.25     2.17     2.10     2.04
120        3.92     3.07     2.68     2.45     2.29     2.17     2.09     2.02     1.96
∞          3.84     3.00     2.60     2.37     2.21     2.10     2.01     1.94     1.88
