Topic 2

Topic Overview

This topic will cover

• Basic Statistical Concepts (Montgomery 2-1, 2-2)

• Commonly Used Densities (Montgomery 2-3)

Basic Statistical Concepts

• Random Variable - Y

– Quantity (response) capable of taking on a set of values

– Discrete or continuous

– Described by a probability distribution (density f), with

  Σ Pr(Y = yi) = 1 (discrete)  or  ∫ f(y) dy = 1 (continuous)

• Numerical Summaries of a Variable

– Center - Mean: µ, E(·)

– Spread - Variance: σ², Var(·)

         Discrete                  Continuous
  µ  :   Σ y Pr(Y = y)             ∫ y f(y) dy
  σ² :   Σ (y − µ)² Pr(Y = y)      ∫ (y − µ)² f(y) dy

• Independence – Observations are statistically independent if the value of one observation does not influence the value of any other observation.

• Elementary Results of Numerical Summaries

– E(aY ± b) = aE(Y) ± b

– Var(aY ± b) = a² Var(Y)

– E(Y1 ± Y2) = E(Y1) ± E(Y2)

– Cov(Y1, Y2) = E(Y1 Y2) − E(Y1)E(Y2).

– If Y1 and Y2 are independent, then Cov(Y1, Y2) = 0.

– Var(Y) = E(Y²) − E(Y)² = E[(Y − E(Y))²]

– Var(Y1 ± Y2) = Var(Y1) + Var(Y2) ± 2Cov(Y1, Y2)

– E(Y1 × Y2) = E(Y1)E(Y2), if Y1, Y2 independent.

– However, E(Y1/Y2) ≠ E(Y1)/E(Y2), even when Y1, Y2 are independent.

Topic 2 Page 1

Common Sample Summaries

• Sample mean (Ȳ)

If Y1, . . . , Yn are independent with mean µ and variance σ²,

  E((1/n) Σ Yi) = (1/n) Σ E(Yi) = (1/n) nµ = µ

  Var((1/n) Σ Yi) = (1/n²) Σ Var(Yi) = (1/n²) nσ² = σ²/n

What is the distribution of Ȳ?

  If Yi Normal → Ȳ Normal

  If Yi other → Ȳ ≈ Normal

The Central Limit Theorem

If Y1, . . . , Yn are independent R.V.'s with mean µ and variance σ²,

  (Σ Yi − nµ) / √(nσ²) ∼ N(0, 1)
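The notes' code examples use SAS and R; purely as an illustration outside the course's tooling, the CLT statement above can be checked with a short Python standard-library simulation. Uniform(0, 1) draws are an arbitrary choice of a non-normal Yi; their mean is 1/2 and variance 1/12.

```python
import random
import statistics

random.seed(0)

n, reps = 50, 2000
mu, sigma2 = 0.5, 1 / 12   # mean and variance of Uniform(0, 1)

# Standardized sums (sum(Y) - n*mu) / sqrt(n * sigma^2), many replicates
z = [(sum(random.random() for _ in range(n)) - n * mu) / (n * sigma2) ** 0.5
     for _ in range(reps)]

# By the CLT these should look approximately like N(0, 1) draws
print(round(statistics.mean(z), 2), round(statistics.stdev(z), 2))
```

With n = 50 the standardized sum is already close to N(0, 1), even though each Yi is far from normal.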

• Sample variance (S² = (1/(n−1)) Σ (Yi − Ȳ)²)

  E(Yi − Ȳ) = E(Yi) − E(Ȳ) = 0

  Var(Yi − Ȳ) = Var(Yi) + Var(Ȳ) − 2Cov(Yi, Ȳ)
             = σ² + σ²/n − 2σ²/n
             = ((n − 1)/n) σ²

  E(S²) = (1/(n−1)) Σ Var(Yi − Ȳ)
        = (1/(n−1)) n ((n − 1)/n) σ²
        = σ²

• Sample standard deviation (S = √S²)

What is the distribution of S²?

If Yi Normal, then

  (n − 1)S²/σ² ∼ χ²(n−1),

where n − 1 is the degrees of freedom.
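As a sketch of the unbiasedness result above (E(S²) = σ²), the following Python standard-library simulation, an illustrative aside rather than the notes' SAS/R, averages S² over many small normal samples. The values µ = 10, σ = 2, n = 5 are arbitrary choices.

```python
import random
import statistics

random.seed(1)

mu, sigma = 10.0, 2.0   # arbitrary illustrative values
n, reps = 5, 4000

# Sample variance S^2 (n - 1 divisor) for many small normal samples
s2_values = [statistics.variance([random.gauss(mu, sigma) for _ in range(n)])
             for _ in range(reps)]

# Unbiasedness: the average S^2 should be near sigma^2 = 4,
# even though any single S^2 is quite variable when n = 5
print(round(statistics.mean(s2_values), 2))
```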


Setup

• Goal: Learn about the population from (randomly) drawn data/sample

• Model and Parameter: Assume the population (Y) follows a certain model (distribution) that depends on a set of unknown constants (parameters): Y ∼ f(y; θ)

Example: Y is the yield of a tomato plant

  Y ∼ N(µ, σ²)

  f(y) = (1/√(2πσ²)) exp(−(y − µ)²/(2σ²))

  θ = (µ, σ²)

Random sample/observations

• Random sample (conceptual)

  X1, X2, . . . , Xn ∼ f(x; θ)

• Random sample (realized/actual numbers)

  x1, x2, . . . , xn

Example: 0.0 4.9 -0.5 -1.2 2.1 2.8 1.2 0.8 0.9 -0.9

Populations/Samples

A parameter is the true value of some aspect of the population. (Examples: mean, median, variance, slope)

An estimator is a

• statistic that corresponds to a parameter.

• random variable (not based on any particular data).

An estimate is a particular value of the estimator, computed from the sample data. It is considered fixed, given the data.

Sampling from a population:

                                 Collection of possible values    Numerical Summary
  What you want to know          Population                       Parameter
  What you actually get to see   Sample                           Statistic

Estimator: θ̂ = g(Y1, . . . , Yn)

Estimate: θ̂ = g(y1, . . . , yn)


Example

Estimators for µ and σ²:

  µ̂ = Ȳ = (1/n) Σ Yi ;  σ̂² = S² = (1/(n−1)) Σ (Yi − Ȳ)²

Estimates (from the realized sample above):

  µ̂ = ȳ = 1.01 ;  σ̂² = s² = 3.49
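The two estimates above can be reproduced directly from the realized sample listed earlier. Python here is an illustrative stand-in for the SAS/R used in the notes; note that `statistics.variance` uses the n − 1 divisor, matching S².

```python
import statistics

# Realized sample from the notes (n = 10)
y = [0.0, 4.9, -0.5, -1.2, 2.1, 2.8, 1.2, 0.8, 0.9, -0.9]

mu_hat = statistics.mean(y)       # ybar
var_hat = statistics.variance(y)  # s^2, computed with the n - 1 divisor

print(round(mu_hat, 2))   # 1.01
print(round(var_hat, 2))  # 3.49
```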

Variance vs. Bias

[Figure: scatter of points illustrating random spread (variance) versus systematic offset (bias)]

While variance refers to random spread, bias refers to a systematic drift in an estimator. Most of the time, we will be concerned with the variance and bias of estimators, not of populations. (Thus, an estimate of a variance may itself be biased.) Bias and variance are inherent in a statistical method (and thus often subject to manipulation), not in the sampling.

An estimator θ̂ of θ is unbiased if E(θ̂) = θ.

  Unbiased          Biased
  sample mean       sample standard deviation
  sample variance   F-ratio

Degrees of Freedom of a sum is the number of elements in that sum that are independent (i.e., free to vary).

For example, if you are told that the sum of three elements equals five, you only need to know two of the three elements to know all of them.

General Result:

If Yi has variance σ² and SS = Σ (Yi − Ȳ)² with k degrees of freedom, then

  E(SS/k) = σ².


Sampling/Reference Distribution

Statistical inference/testing: making decisions in the presence of variability. Is the result of an experiment easily explained by chance variation, or is it “unusual”?

• “Unusual”: Is it unlikely if only chance variation?

• Need distribution of results assuming only chance variation (null distribution)

• Compare observed result with distribution of outcomes

• Example 1: t-test (comparing two means)

– Calculate observed t-test statistic

– t distribution summarizes outcomes under Null hypothesis

– Compare observed result with distribution

• Example 2: randomization test

– Chance variation due to randomization

– Generate all possible outcomes (each equally likely)

– Compare observed result with distribution of outcomes

Standard Normal Distribution

Take Zi ∼ N(0, 1) independent, i = 1, . . . , n.

So P(a < Zi < b) = ∫(a to b) f(z) dz, where

  f(z) = (1/√(2π)) e^(−z²/2)

[Figure: standard normal density on (−4, 4)]


The density of Z = (Z1, . . . , Zn) is

  f(z) = (2π)^(−n/2) e^(−(1/2) Σ zi²)

Normal with mean µ and variance σ²:

  X = µ + σZ ∼ N(µ, σ²)

Standardizing:

  Z = (X − µ)/σ ∼ N(0, 1)

Application

Used to model random observations.

  Y = µ(X) + eσ, where e ∼ N(0, 1).

  Given X = x, Y ∼ N(µ(x), σ²).

Multivariate Normal

  X = µ + AZ ∼ N(µ, AA′).

  Z = A⁻¹(X − µ) ∼ N(0, I).

Special case: A orthogonal, AA′ = I.
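A simulation sketch of X = µ + AZ: with µ = 0 and a hypothetical 2×2 matrix A (my choice for illustration, not from the notes), the sample variances and covariance of X should approach the entries of AA′. Python standard library only.

```python
import random
import statistics

random.seed(5)

# Hypothetical 2x2 matrix A chosen for illustration
A = [[2.0, 0.0],
     [1.0, 3.0]]
# A A' = [[4, 2], [2, 10]]: Var(X1) = 4, Var(X2) = 10, Cov(X1, X2) = 2

def draw_x():
    # X = A Z with Z a pair of independent N(0, 1) draws (mu = 0 here)
    z = (random.gauss(0, 1), random.gauss(0, 1))
    return (A[0][0] * z[0] + A[0][1] * z[1],
            A[1][0] * z[0] + A[1][1] * z[1])

reps = 20_000
xs = [draw_x() for _ in range(reps)]
x1 = [x[0] for x in xs]
x2 = [x[1] for x in xs]

m1, m2 = statistics.mean(x1), statistics.mean(x2)
v1 = statistics.variance(x1)
v2 = statistics.variance(x2)
cov = sum((a - m1) * (b - m2) for a, b in xs) / (reps - 1)

print(v1, v2, cov)  # near 4, 10, 2
```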

SAS Code

data normal;

qtl = probit(0.975); /* to get z value for 95% confidence interval */

pval = 2*(1-probnorm(2.3)); /* to get p-value of 2-sided z-score 2.3 */

run;

proc print data = normal;

run;

Obs qtl pval

1 1.95996 0.0214

R Code

> qtl = qnorm(0.975)

> qnorm(0.975)

[1] 1.959964

> 2*(1-pnorm(qtl))

[1] 0.05
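For comparison, the same two quantities from the SAS block can be computed with Python's standard library via `statistics.NormalDist` (an aside, not part of the notes' tooling).

```python
from statistics import NormalDist

std = NormalDist()  # standard normal N(0, 1)

qtl = std.inv_cdf(0.975)       # z value for a 95% confidence interval
pval = 2 * (1 - std.cdf(2.3))  # two-sided p-value for z-score 2.3

print(round(qtl, 5))   # 1.95996
print(round(pval, 4))  # 0.0214
```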


Chisquare distribution

Chisquare distribution on n degrees of freedom: sums of squares of normals; usually arises in variance estimates.

  S² = Z1² + · · · + Zn² ∼ χ²(n)

  Σ (Yi − Ȳ)²/σ² ∼ χ²(n−1)

[Figure: chisquare densities on 2, 5, 10 d.f.]

  E(S²) = n, Var(S²) = 2n

• For large n, χ²(n) ≈ N(n, 2n) by the CLT.

• In SAS, use probchi(q, df) for p-values and cinv(p, df) for quantiles.

• In R, use pchisq(q, df) for p-values and qchisq(p, df) for quantiles.

Student’s t Distribution

If X ∼ N(0, 1) and s² ∼ (1/d) χ²(d), independent, then

  t = X/s ∼ t(d).

Recipe: t(d) = N(0, 1) / √(indep χ²(d)/d).
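The recipe can be checked by simulation: build t(d) draws literally as N(0, 1)/√(χ²(d)/d), with the χ² draw formed as a sum of d squared normals. A Python stdlib sketch (illustrative only; d = 10 is an arbitrary choice), verifying the mean is near 0 and the variance near d/(d − 2) = 1.25.

```python
import random
import statistics

random.seed(2)

def chi2_draw(d):
    # chi-square(d): sum of d squared independent N(0, 1) draws
    return sum(random.gauss(0, 1) ** 2 for _ in range(d))

def t_draw(d):
    # Recipe: N(0, 1) divided by sqrt(independent chi-square(d) / d)
    return random.gauss(0, 1) / (chi2_draw(d) / d) ** 0.5

d, reps = 10, 4000
sample = [t_draw(d) for _ in range(reps)]

# t(d) has mean 0 and variance d/(d - 2) = 1.25 for d = 10
print(round(statistics.mean(sample), 2), round(statistics.variance(sample), 2))
```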


t densities on 2, 5, 10, ∞ d.f.

[Figure: t density curves on (−4, 4)]

small d ⇒ heavy tails

Handy theorem

Xi ∼ N(µ, σ²) independent. Let

  X̄ = (1/n) Σ Xi,  s² = (1/(n−1)) Σ (Xi − X̄)²

Then

  X̄ ∼ N(µ, σ²/n),  s² ∼ σ² (n − 1)⁻¹ χ²(n−1),  independent

Sample standardization

So, if Xi ∼ N(µ, σ²) independent, then

  t = √n (X̄ − µ)/s ∼ t(n−1)

• In SAS, use probt(q, df) for p-values and tinv(p, df) for quantiles.

• In R, use pt(q, df) for p-values and qt(p, df) for quantiles.

Fisher’s F Distribution

If SSN ∼ χ²(n) independent of SSD ∼ χ²(d), then

  F = ((1/n) SSN) / ((1/d) SSD) ≡ MSN / MSD ∼ F(n,d)

Notes

• 1/F(n,d) ∼ F(d,n)

• F(1,d) = t²(d)

• As d → ∞, F(n,d) → χ²(n)/n.

F(2,d) densities with d ∈ {2, 5, 10, ∞}

[Figure: F(2,d) density curves]

F(5,d) densities with d ∈ {2, 5, 10, ∞}

[Figure: F(5,d) density curves]

F(10,d) densities with d ∈ {2, 5, 10, ∞}

[Figure: F(10,d) density curves]

• In SAS, use probf(q, df1, df2) for p-values and finv(p, df1, df2) for quantiles.

• In R, use pf(q, df1, df2) for p-values and qf(p, df1, df2) for quantiles.

Noncentral Distributions

Noncentral chisquare

Xi ∼ N(ai, 1) independent.

  C = X1² + · · · + Xn² ∼ χ²(n)(φ), where φ = Σ ai²

Miracle: the distribution depends on the ai only through φ.

Arises: the mean square when µ ≠ 0 (i.e., under the alternative hypothesis).

• In SAS, use probchi(q, df, φ) for p-values and cinv(p, df, φ) for quantiles.

• In R, use pchisq(q, df, φ) for p-values and qchisq(p, df, φ) for quantiles.

Noncentral F

  F(n,d)(φ) = (χ²(n)(φ)/n) / (χ²(d)/d)

Arises: probability of a significant F-test under the alternative (i.e., the power of the F test).

• In SAS, use probf(q, df1, df2, φ) for p-values and finv(p, df1, df2, φ) for quantiles.

• In R, use pf(q, df1, df2, φ) for p-values.


Noncentral t

  t(d)(a1) = N(a1, 1) / √(χ²(d)/d)

Arises: the power of the t test.

• In SAS, use probt(q, df, a1) for p-values and tinv(p, df, a1) for quantiles.

• In R, use pt(q, df, a1) for p-values.

Doubly noncentral F

  F(n,d)(φn, φd) = (χ²(n)(φn)/n) / (χ²(d)(φd)/d)

For power when the error mean square is corrupted.

Noncentral distributions

Widely tabled. Watch the parameterization closely. See:

1. Encyclopedia of the Statistical Sciences

2. Johnson and Kotz’s books on distributions

3. the Web

Review on Finding p-values

Building block for p-values: cumulative distribution function (cdf)

• Let X be a random variable.

• The cdf for X is a function of x such that cdf(x) = P(X < x).

• Note: P(X > x) = 1 −P(X < x)

• Examples of functions which evaluate the cdf for different distributions: probnorm, probchi, probt, probf.


[Figure 1: Example of a cumulative distribution function cdf(x) (for a standard normal random variable)]

1-sided p-values

• Used in F-tests, and in t-tests with > or < alternatives (not ≠).

• Procedure: Get the test statistic u.

• If the alternative hypothesis is θ > 0, then usually the p-value is P(X > u).

• Example

  – If the alternative hypothesis is µ1 − µ2 > 0 and the t statistic is 2.5 with 14 degrees of freedom, the p-value is 1 − P(t(14) < 2.5) = 0.0127.

2-sided p-values

• Used in t-tests with a ≠ alternative.

• Procedure: Get the test statistic u.

• If the alternative hypothesis is θ ≠ 0, then usually the p-value is P(|X| > |u|) = 2(1 − cdf(|u|)).

• Example

  – If the alternative hypothesis is µ1 − µ2 ≠ 0 and the t statistic is 2.5 with 14 degrees of freedom, the p-value is 2(1 − P(t(14) < 2.5)) = 0.0255.
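Without SAS or R at hand, both example p-values can be approximated by Monte Carlo using the t recipe from earlier (a Python stdlib sketch for illustration, not the course's method; the exact values are 0.0127 and 0.0255).

```python
import random

random.seed(3)

def t_draw(d):
    # t on d d.f.: N(0, 1) over sqrt(independent chi-square(d) / d)
    num = random.gauss(0, 1)
    den = (sum(random.gauss(0, 1) ** 2 for _ in range(d)) / d) ** 0.5
    return num / den

reps = 100_000
draws = [t_draw(14) for _ in range(reps)]

one_sided = sum(x > 2.5 for x in draws) / reps       # approximates P(t14 > 2.5) = 0.0127
two_sided = sum(abs(x) > 2.5 for x in draws) / reps  # approximates 0.0255

print(one_sided, two_sided)
```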


[Figure 2: Two-sided p-value corresponds to the area in both shaded regions of the t(14) density]

• Notes

– Most of the time, SAS output will report the p-value associated with a particular

test. However, this will not always be true.

– Sometimes, a test statistic is so large that the p-value is reported as 0. This, of

course, is not true; you should report, in this case, that the p-value is too small

to be determined numerically.

Cutoff values/rejection regions

• Often, we report a cutoff value for a particular test at a particular α level.

• Test statistics that are larger (usually) than the cutoff value are in the rejection region, so called because being in the rejection region means that the null hypothesis is rejected.

• The main function for finding cutoffs is the quantile function, which takes a probability as its input. (For 1-sided tests, the input is usually 1 − α; for 2-sided tests, the input is usually 1 − α/2.)

• The quantile function is the inverse of the cdf.

• Examples of quantile functions in SAS include probit, cinv, tinv, and finv.

• Example

– Suppose you are running an F-test at level 0.05 with 3 and 35 degrees of freedom.


– The cutoff for this test is 2.874 (the 95% quantile of an F distribution with 3 and 35 degrees of freedom); thus, F-ratios greater than 2.874 will result in the null hypothesis being rejected.
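The 2.874 cutoff can be approximated by simulating the F(3, 35) distribution directly from its definition as a ratio of scaled chi-squares (a Python stdlib sketch for illustration; finv in SAS or qf in R gives the exact value).

```python
import random

random.seed(4)

def chi2_draw(d):
    # chi-square(d): sum of d squared independent N(0, 1) draws
    return sum(random.gauss(0, 1) ** 2 for _ in range(d))

def f_draw(n, d):
    # F(n, d) = (chi-square(n)/n) / (chi-square(d)/d)
    return (chi2_draw(n) / n) / (chi2_draw(d) / d)

reps = 50_000
draws = sorted(f_draw(3, 35) for _ in range(reps))

cutoff = draws[int(0.95 * reps)]  # empirical 95% quantile of F(3, 35)
print(round(cutoff, 2))           # near the exact value 2.874
```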
