Uploaded by jermac17

Powerpoint on anova statistics.

**Engineering Experimentation II**

Lecture 7

Basic Statistics and ANOVA

ME 311, Mechanical Engineering

University of Kentucky

Summary of Lecture 6

Regression Model

Linear model coefficients

Model evaluation

Exploit contour and surface plots

Error bars for $2^2$ example

Single factor multiple level

Make sense of your data

Direct and indirect data analysis

Model Linearization

Curve fitting

Goodness of the fit

$R^2$ definition

Single factor example

Basic Statistical Concepts

Simple comparative experiments

The hypothesis testing framework

The two-sample t-test

Checking assumptions, validity

Comparing more than two factor levels…the analysis of variance

ANOVA decomposition of total variability

Statistical testing & analysis

Checking assumptions, model validity

Post-ANOVA testing of means

Sample size determination

Portland Cement Formulation (page 23)

Graphical View of the Data

Dot Diagram, Fig. 2-1, p. 24

Box Plots, Fig. 2-3, p. 26

The Hypothesis Testing Framework

Statistical hypothesis testing is a useful framework for many experimental situations

Origins of the methodology date from the early 1900s

We will use a procedure known as the two-sample t-test

The Hypothesis Testing Framework

Sampling from a normal distribution

Statistical hypotheses:

$$H_0: \mu_1 = \mu_2$$

$$H_1: \mu_1 \neq \mu_2$$

Estimation of Parameters

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \quad \text{estimates the population mean } \mu$$

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 \quad \text{estimates the variance } \sigma^2$$
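
The two estimators can be checked by hand on a small sample. The sketch below is only an illustration: the five measurements are hypothetical values, not the lecture data, and the function names are made up for this example.

```python
import math


def sample_mean(values):
    """Sample mean: (1/n) * sum of y_i."""
    return sum(values) / len(values)


def sample_variance(values):
    """Sample variance with n - 1 in the denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    y_bar = sample_mean(values)
    return sum((y - y_bar) ** 2 for y in values) / (n - 1)


# Hypothetical measurements, for illustration only (not the lecture data).
y = [16.5, 17.0, 16.8, 17.2, 16.5]
y_bar = sample_mean(y)
s2 = sample_variance(y)
s = math.sqrt(s2)
print(f"mean = {y_bar:.3f}, variance = {s2:.4f}, std dev = {s:.4f}")
```

For these values the mean is 16.8 and the sample variance is 0.38 / 4 = 0.095.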

Summary Statistics (pg. 36)

Modified Mortar (“New recipe”)

$$\bar{y}_1 = 16.76, \quad S_1^2 = 0.100, \quad S_1 = 0.316, \quad n_1 = 10$$

Unmodified Mortar (“Original recipe”)

$$\bar{y}_2 = 17.04, \quad S_2^2 = 0.061, \quad S_2 = 0.248, \quad n_2 = 10$$

How the Two-Sample t-Test Works:

Use the sample means to draw inferences about the population means

Difference in sample means:

$$\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$$

Standard deviation of the difference in sample means:

$$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}$$

This suggests a statistic:

$$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

How the Two-Sample t-Test Works:

Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$

The previous ratio becomes

$$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$

Pool the individual sample variances:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

How the Two-Sample t-Test Works:

The test statistic is

$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

Values of $t_0$ that are near zero are consistent with the null hypothesis

Values of $t_0$ that are very different from zero are consistent with the alternative hypothesis

$t_0$ is a “distance” measure: how far apart the averages are, expressed in standard deviation units

Notice the interpretation of $t_0$ as a signal-to-noise ratio

The Two-Sample (Pooled) t-Test

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$$

$$S_p = 0.284$$

$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = -2.20$$

The two sample means are a little over two standard deviations apart

Is this a "large" difference?
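
The pooled calculation can be reproduced from the summary statistics alone. This is a minimal sketch; the function name `pooled_t_statistic` is made up for this example, and the inputs are the means, variances, and sample sizes quoted on the summary statistics slide.

```python
import math


def pooled_t_statistic(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, t0, degrees of freedom) for the
    equal-variance two-sample t-test, computed from summary statistics."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t0 = (mean1 - mean2) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
    return sp2, t0, df


# Summary statistics from the slides (modified vs. unmodified mortar).
sp2, t0, df = pooled_t_statistic(16.76, 0.100, 10, 17.04, 0.061, 10)
print(f"Sp^2 = {sp2:.4f}, Sp = {math.sqrt(sp2):.3f}, t0 = {t0:.2f}, df = {df}")
```

Carrying full precision gives $t_0 \approx -2.21$; the slide's $-2.20$ comes from rounding $S_p$ to 0.284 before dividing.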

The Two-Sample (Pooled) t-Test

So far, we haven’t really done any “statistics”

We need an objective basis for deciding how large the test statistic $t_0$ really is

[Figure: reference distribution with $t_0 = -2.20$ marked]

In 1908, W. S. Gosset derived the reference distribution for $t_0$ … called the t distribution

Tables of the t distribution: text, page 606

The Two-Sample (Pooled) t-Test

A value of $t_0$ between $-2.101$ and $2.101$ is consistent with equality of means

A value of $t_0$ outside that range (below $-2.101$ or above $2.101$) indicates a significant difference in means

Could also use the P-value approach

[Figure: reference distribution with $t_0 = -2.20$ marked]

The Two-Sample (Pooled) t-Test

$t_0 = -2.20$

The P-value is the risk of wrongly rejecting the null hypothesis of equal means (it measures rareness of the event)

The P-value in our problem is P = 0.042
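
A two-sided P-value for the t distribution can be approximated without tables. The sketch below uses only the Python standard library and integrates the t density numerically with Simpson's rule; the function names and the choice of integration method are assumptions of this example, not something the slides specify. The 18 degrees of freedom come from $n_1 + n_2 - 2$.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def two_sided_p_value(t0, df, steps=2000):
    """P(|T| > |t0|), using Simpson's rule on the density over [0, |t0|]."""
    upper = abs(t0)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_area = total * h / 3  # P(0 < T < |t0|)
    return 1.0 - 2.0 * central_area


df = 18
p_crit = two_sided_p_value(2.101, df)  # tabulated cutoff from the slides
p_obs = two_sided_p_value(-2.20, df)   # observed statistic from the slides
print(f"P(|T| > 2.101) = {p_crit:.4f}")
print(f"P(|T| > 2.20)  = {p_obs:.4f}")
```

The cutoff $\pm 2.101$ should return a probability very close to 0.05, and the observed $t_0 = -2.20$ a value slightly below it, in line with the P-value quoted on the slide up to rounding of $t_0$.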

The Normal Probability Plot

Importance of the t-Test

Provides an objective framework for simple comparative experiments

Could be used to test all relevant hypotheses in a two-level factorial design, because all of these hypotheses involve the mean response at one “side” of the cube versus the mean response at the opposite “side” of the cube

What If There Are More Than Two Factor Levels?

The t-test does not directly apply

There are lots of practical situations where there are either more than two levels of interest, or there are several factors of simultaneous interest

The analysis of variance (ANOVA) is the appropriate analysis “engine” for these types of experiments (Chapter 3, textbook)

The ANOVA was developed by Fisher in the early 1920s, and initially applied to agricultural experiments

Used extensively today for industrial experiments

An Example (See pg. 60)

An engineer is interested in investigating the relationship between the RF power setting and the etch rate for this tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate.

The response variable is etch rate.

She is interested in a particular gas (C₂F₆) and gap (0.80 cm), and wants to test four levels of RF power: 160W, 180W, 200W, and 220W. She decided to test five wafers at each level of RF power.

The experimenter chooses 4 levels of RF power: 160W, 180W, 200W, and 220W

The experiment is replicated 5 times, with runs made in random order

An Example (See pg. 62)

Does changing the power change the mean etch rate?

Is there an optimum level for power?

The Analysis of Variance (Sec. 3-2, pg. 63)

In general, there will be $a$ levels of the factor, or $a$ treatments, and $n$ replicates of the experiment, run in random order…a completely randomized design (CRD)

$N = an$ total runs

We consider the fixed effects case…the random effects case will be discussed later

Objective is to test hypotheses about the equality of the $a$ treatment means

The Analysis of Variance

The name “analysis of variance” stems from a partitioning of the total variability in the response variable into components that are consistent with a model for the experiment

The basic single-factor ANOVA model is

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad \begin{cases} i = 1, 2, \ldots, a \\ j = 1, 2, \ldots, n \end{cases}$$

$\mu$ = an overall mean, $\tau_i$ = $i$th treatment effect, $\varepsilon_{ij}$ = experimental error, $NID(0, \sigma^2)$

Models for the Data

There are several ways to write a model for the data:

$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ is called the effects model

Let $\mu_i = \mu + \tau_i$, then $y_{ij} = \mu_i + \varepsilon_{ij}$ is called the means model

Regression models can also be employed
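
To see that the effects model and the means model describe the same data, one can simulate from the effects form and compare against $\mu_i = \mu + \tau_i$. The parameter values below ($\mu$, $\tau_i$, $\sigma$, the seed) are hypothetical and chosen only for illustration.

```python
import random


def simulate_single_factor(mu, tau, n, sigma, seed=0):
    """Draw y_ij = mu + tau_i + eps_ij with independent eps_ij ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [[mu + tau_i + rng.gauss(0.0, sigma) for _ in range(n)]
            for tau_i in tau]


# Hypothetical parameters, chosen only for illustration.
mu = 50.0
tau = [-5.0, 0.0, 5.0]        # effects model: treatment effects
mu_i = [mu + t for t in tau]  # means model: mu_i = mu + tau_i

data = simulate_single_factor(mu, tau, n=5, sigma=2.0)
for i, row in enumerate(data, start=1):
    avg = sum(row) / len(row)
    print(f"treatment {i}: mu_i = {mu_i[i - 1]:.1f}, sample average = {avg:.2f}")
```

Each treatment's sample average scatters around its $\mu_i$; with `sigma=0.0` the simulated values equal $\mu_i$ exactly.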

The Analysis of Variance

Total variability is measured by the total sum of squares:

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$

The basic ANOVA partitioning is:

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2$$

$$= n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$

$$SS_T = SS_{Treatments} + SS_E$$
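
The partitioning can be verified numerically. The sketch below computes the three sums of squares directly from their definitions for a balanced layout; the data are hypothetical (three treatments, four replicates), not the etch-rate data from the text.

```python
def anova_sums_of_squares(groups):
    """Return (SS_T, SS_Treatments, SS_E) for a balanced single-factor layout.

    groups[i][j] is observation j under treatment i (a treatments, n replicates).
    """
    a = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same n for every treatment")
    grand_mean = sum(sum(g) for g in groups) / (a * n)
    treatment_means = [sum(g) / n for g in groups]

    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
    ss_treat = n * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_error = sum((y - m) ** 2
                   for g, m in zip(groups, treatment_means) for y in g)
    return ss_total, ss_treat, ss_error


# Hypothetical data: a = 3 treatments, n = 4 replicates (not the lecture data).
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [20.0, 19.0, 21.0, 22.0],
]
ss_t, ss_tr, ss_e = anova_sums_of_squares(groups)
print(f"SS_T = {ss_t:.2f}, SS_Treatments = {ss_tr:.2f}, SS_E = {ss_e:.2f}")
```

For these values $SS_{Treatments} = 168$ and $SS_E = 15$, which add up to $SS_T = 183$ as the identity requires.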

The Analysis of Variance

$$SS_T = SS_{Treatments} + SS_E$$

A large value of $SS_{Treatments}$ reflects large differences in treatment means

A small value of $SS_{Treatments}$ likely indicates no differences in treatment means

Formal statistical hypotheses are:

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$$

$$H_1: \text{At least one mean is different}$$

The Analysis of Variance

While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared.

A mean square is a sum of squares divided by its degrees of freedom:

$$df_{Total} = df_{Treatments} + df_{Error}$$

$$an - 1 = (a - 1) + a(n - 1)$$

$$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}, \quad MS_E = \frac{SS_E}{a(n - 1)}$$

If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal.

If treatment means differ, the treatment mean square will be larger than the error mean square.

Analysis of Variance: Summarized

Computing…see text, pp. 66-70

The reference distribution for $F_0$ is the $F_{a-1,\,a(n-1)}$ distribution

Reject the null hypothesis (equal treatment means) if

$$F_0 > F_{\alpha,\,a-1,\,a(n-1)}$$
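
The test can be sketched in code by forming $F_0 = MS_{Treatments} / MS_E$ and computing the upper-tail probability $P(F > F_0)$; rejecting when that probability is below $\alpha$ is equivalent to the rule $F_0 > F_{\alpha,\,a-1,\,a(n-1)}$. The example uses only the standard library and evaluates the tail through the regularized incomplete beta function with Simpson's rule, which is a choice made for this sketch. The sums of squares passed in are hypothetical.

```python
import math


def f_upper_tail(f0, d1, d2, steps=2000):
    """P(F > f0) for an F(d1, d2) variable; this sketch requires d2 >= 2.

    Uses P(F > f0) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1 * f0), where I_x
    is the regularized incomplete beta function, integrated by Simpson's rule.
    """
    if d2 < 2:
        raise ValueError("this sketch assumes d2 >= 2")
    if f0 <= 0:
        return 1.0
    p, q = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f0)
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

    def integrand(t):
        if t <= 0.0:
            # Limit of t**(p-1) * (1-t)**(q-1) / B(p, q) as t -> 0 for p >= 1.
            return math.exp(-log_beta) if p == 1 else 0.0
        return math.exp((p - 1) * math.log(t)
                        + (q - 1) * math.log1p(-t) - log_beta)

    h = x / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3


def single_factor_f_test(ss_treat, ss_error, a, n):
    """Return (MS_Treatments, MS_E, F0, P-value) for a balanced design."""
    df_treat, df_error = a - 1, a * (n - 1)
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    f0 = ms_treat / ms_error
    return ms_treat, ms_error, f0, f_upper_tail(f0, df_treat, df_error)


# Hypothetical sums of squares for a = 3 treatments and n = 4 replicates.
ms_treat, ms_error, f0, p_value = single_factor_f_test(168.0, 15.0, a=3, n=4)
print(f"MS_Treatments = {ms_treat:.2f}, MS_E = {ms_error:.4f}, F0 = {f0:.2f}")
print(f"reject H0 at alpha = 0.05: {p_value < 0.05}")
```

Here $F_0 = 84 / (15/9) = 50.4$ on 2 and 9 degrees of freedom, far in the upper tail, so the hypothesis of equal treatment means would be rejected for this made-up data set.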

ANOVA Table: Example 3-1

The Reference Distribution:

ANOVA calculations are usually done via computer

Calculations can be done in Minitab, NCSS, Excel, Matlab, Scilab, etc.

Model Adequacy Checking in the ANOVA

Text reference, Section 3-4, pg. 75

Checking assumptions is important

Normality

Constant variance

Independence

Have we fit the right model?

Later we will talk about what to do if some of these assumptions are violated

Model Adequacy Checking in the ANOVA

Examination of residuals (see text, Sec. 3-4, pg. 75)

$$e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$$

NCSS generates the residuals

Residual plots are very useful

Normal probability plot of residuals
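
Residuals and the coordinates of a normal probability plot can be computed by hand. In this sketch the data are hypothetical, and the plotting position $(k - 0.5)/N$ is one common convention assumed here; the slides do not say which convention NCSS uses.

```python
from statistics import NormalDist


def residuals(groups):
    """Residuals e_ij = y_ij - (average of treatment i)."""
    out = []
    for g in groups:
        mean_i = sum(g) / len(g)
        out.append([y - mean_i for y in g])
    return out


def normal_plot_points(values):
    """Pair each sorted value with a standard normal quantile.

    The plotting position (k - 0.5) / N is an assumed convention.
    """
    ordered = sorted(values)
    total = len(ordered)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((k - 0.5) / total), e)
            for k, e in enumerate(ordered, start=1)]


# Hypothetical data: 3 treatments, 4 replicates (not the lecture data).
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [20.0, 19.0, 21.0, 22.0],
]
res = residuals(groups)
points = normal_plot_points([e for row in res for e in row])
for z, e in points:
    print(f"normal quantile {z:6.2f}   residual {e:6.2f}")
```

If the normality assumption is reasonable, plotting the residuals against the normal quantiles should give points that fall roughly along a straight line.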

Other Important Residual Plots

Post-ANOVA Comparison of Means

The analysis of variance tests the hypothesis of equal treatment means

Assume that residual analysis is satisfactory

If that hypothesis is rejected, we don’t know which specific means are different

Determining which specific means differ following an ANOVA is called the multiple comparisons problem

There are lots of ways to do this…see text, Section 3-5, pg. 87

We will use pairwise t-tests on means…sometimes called Fisher’s Least Significant Difference (or Fisher’s LSD) Method
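
A minimal sketch of the pairwise comparison follows. It assumes the usual balanced-design form of the least significant difference, $LSD = t_{\alpha/2,\,N-a}\sqrt{2\,MS_E/n}$, which the slides do not write out. The treatment means and $MS_E$ are hypothetical, and the critical value $t_{0.025,\,9} = 2.262$ is read from a t table rather than computed.

```python
import math
from itertools import combinations


def fisher_lsd(treatment_means, ms_error, n, t_crit):
    """Compare all pairs of treatment means against the LSD.

    LSD = t_crit * sqrt(2 * MS_E / n) for a balanced design; t_crit is the
    tabulated t value for alpha/2 and the error degrees of freedom.
    Returns the LSD and a list of (i, j, |difference|, significant?) tuples.
    """
    lsd = t_crit * math.sqrt(2 * ms_error / n)
    results = []
    for (i, mi), (j, mj) in combinations(enumerate(treatment_means, start=1), 2):
        diff = abs(mi - mj)
        results.append((i, j, diff, diff > lsd))
    return lsd, results


# Hypothetical inputs: a = 3 treatments, n = 4 replicates, MS_E = 15/9,
# error df = 9, so the tabulated t value for alpha = 0.05 is 2.262.
lsd, results = fisher_lsd([11.5, 14.5, 20.5], ms_error=15 / 9, n=4, t_crit=2.262)
print(f"LSD = {lsd:.3f}")
for i, j, diff, significant in results:
    print(f"means {i} vs {j}: |difference| = {diff:.1f}, significant = {significant}")
```

Any pair of means whose absolute difference exceeds the LSD is declared different; in this made-up example all three pairs exceed it.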

Two-Factor, Multiple levels Experiment

$a$ levels of factor A; $b$ levels of factor B; $n$ replicates

Extension of the ANOVA to Factorials

$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{...})^2 = bn\sum_{i=1}^{a} (\bar{y}_{i..} - \bar{y}_{...})^2 + an\sum_{j=1}^{b} (\bar{y}_{.j.} - \bar{y}_{...})^2$$

$$+\; n\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2$$

$$SS_T = SS_A + SS_B + SS_{AB} + SS_E$$

df breakdown:

$$abn - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1) + ab(n - 1)$$
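
The two-factor decomposition can be checked the same way as the single-factor one. The sketch below evaluates each term of the identity directly for a balanced design; the 2 x 2 data set with two replicates is hypothetical.

```python
def two_factor_sums_of_squares(y):
    """Sums of squares for a balanced two-factor factorial.

    y[i][j][k] is replicate k at level i of factor A and level j of factor B.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    grand = sum(v for row in y for cell in row for v in cell) / (a * b * n)
    cell_mean = [[sum(cell) / n for cell in row] for row in y]
    a_mean = [sum(cell_mean[i]) / b for i in range(a)]
    b_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]

    ss_a = b * n * sum((m - grand) ** 2 for m in a_mean)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum((cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_e = sum((v - cell_mean[i][j]) ** 2
               for i in range(a) for j in range(b) for v in y[i][j])
    ss_t = sum((v - grand) ** 2 for row in y for cell in row for v in cell)
    return {"A": ss_a, "B": ss_b, "AB": ss_ab, "E": ss_e, "T": ss_t}


# Hypothetical 2 x 2 design with n = 2 replicates (not the lecture data).
y = [
    [[10.0, 12.0], [20.0, 22.0]],
    [[15.0, 17.0], [35.0, 37.0]],
]
ss = two_factor_sums_of_squares(y)
a, b, n = 2, 2, 2
df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}
print(ss)
print(df, "total df =", a * b * n - 1)
```

For this data set $SS_A = 200$, $SS_B = 450$, $SS_{AB} = 50$, and $SS_E = 8$, which sum to $SS_T = 708$; the degrees of freedom 1 + 1 + 1 + 4 sum to $abn - 1 = 7$.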

ANOVA Table – Fixed Effects Case

NCSS and Minitab will perform the computations

Text gives details of manual computing: see pp. 169 & 170

Analysis of Variance Table

| Source Term | DF | Sum of Squares | Mean Square | F-Ratio | Prob Level | Power (Alpha=0.05) |
|---|---|---|---|---|---|---|
| A: C2 | 2 | 900801.2 | 450400.6 | 2563.41 | 0.000000* | 1.000000 |
| B: C3 | 2 | 420599.2 | 210299.6 | 1196.90 | 0.000000* | 1.000000 |
| AB | 4 | 809992.1 | 202498 | 1152.50 | 0.000000* | 1.000000 |
| S | 18 | 3162.667 | 175.7037 | | | |
| Total (Adjusted) | 26 | 2134555 | | | | |
| Total | 27 | | | | | |

* Term significant at alpha = 0.05

Means and Effects Section

| Term | Count | Mean | Standard Error | Effect |
|---|---|---|---|---|
| All | 27 | 478.2592 | | 478.2592 |
| A: C2 | | | | |
| 1 | 9 | 468.7778 | 4.418442 | -9.481482 |
| 2 | 9 | 706.5555 | 4.418442 | 228.2963 |
| 3 | 9 | 259.4445 | 4.418442 | -218.8148 |
| B: C3 | | | | |
| 1 | 9 | 305.4445 | 4.418442 | -172.8148 |
| 2 | 9 | 595.7778 | 4.418442 | 117.5185 |
| 3 | 9 | 533.5555 | 4.418442 | 55.2963 |
| AB: C2,C3 | | | | |
| 1,1 | 3 | 16.33333 | 7.652967 | -279.6296 |
| 1,2 | 3 | 796.6667 | 7.652967 | 210.3704 |
| 1,3 | 3 | 593.3333 | 7.652967 | 69.25926 |
| 2,1 | 3 | 538.6667 | 7.652967 | 4.925926 |
| 2,2 | 3 | 708 | 7.652967 | -116.0741 |
| 2,3 | 3 | 873 | 7.652967 | 111.1481 |
| 3,1 | 3 | 361.3333 | 7.652967 | 274.7037 |
| 3,2 | 3 | 282.6667 | 7.652967 | -94.2963 |
| 3,3 | 3 | 134.3333 | 7.652967 | -180.4074 |
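
The Analysis of Variance Table can be checked for internal consistency: each mean square should be its sum of squares divided by its DF, and each F-ratio should be the mean square divided by the error mean square. The sketch below copies DF and sums of squares from the table and treats the row labelled S as the error term. That is an assumption, but it is supported by the fact that its mean square reproduces the reported F-ratios.

```python
# (DF, sum of squares) copied from the Analysis of Variance Table.
rows = {
    "A: C2": (2, 900801.2),
    "B: C3": (2, 420599.2),
    "AB": (4, 809992.1),
    "S": (18, 3162.667),  # assumed to be the error term
}
reported_f = {"A: C2": 2563.41, "B: C3": 1196.90, "AB": 1152.50}

mean_square = {term: ss / df for term, (df, ss) in rows.items()}
ms_error = mean_square["S"]
f_ratio = {term: mean_square[term] / ms_error for term in reported_f}
total_ss = sum(ss for _, ss in rows.values())
total_df = sum(df for df, _ in rows.values())

for term, f0 in f_ratio.items():
    print(f"{term}: MS = {mean_square[term]:.1f}, "
          f"F = {f0:.2f} (reported {reported_f[term]:.2f})")
print(f"error MS = {ms_error:.4f}, total SS = {total_ss:.0f}, total DF = {total_df}")
```

The recomputed values agree with the table to the printed precision, and the sums of squares and DF add up to the Total (Adjusted) row.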

Factorials with More Than Two Factors

Basic procedure is similar to the two-factor case; all $abc \cdots kn$ treatment combinations are run in random order

ANOVA identity is also similar:

$$SS_T = SS_A + SS_B + \cdots + SS_{AB} + SS_{AC} + \cdots + SS_{ABC} + \cdots + SS_{AB \cdots K} + SS_E$$

Complete three-factor example in text, Example 5-5

Readings

Chapters 3 and 5
