3 views

Uploaded by Erick Diaz

MIT

- IE REVIEW QUESTIONS.pptx
- The Impact of Knowledge Management on the Function of Employee Performance Appraisals in Industrial Companies - Case Study
- Conditional Probability
- SeniorProblemSet_Feb16
- Efectos de Los Municipios en La Educacion
- 21 Only Stats
- courtney johnson - author evaluation essay
- document for website
- 4 1 1 howard r
- 2 November 7-11, 2016 ok.docx
- AP Psych Week 1
- Statistical Methods for Life Sciences
- statistical & non-statistical lesson
- Acknowledgement
- Content Mastery Bonus
- Math 103 05 Probability
- AWL 1000 families
- Investigation on workerspressure of Medical Industries in Tamilnadu
- lec20
- Lecture 1 Intro

You are on page 1of 46

650

Statistics for Applications

Chapter 1: Introduction

1/43

Goals

Goals:

▶ To give you a solid introduction to the mathematical theory

behind statistical methods;

▶ To provide theoretical guarantees for the statistical methods

that you may use for certain applications.

At the end of this class, you will be able to

1. From a real-life situation, formulate a statistical problem in

mathematical terms

2. Select appropriate statistical methods for your problem

3. Understand the implications and limitations of various

methods

2/43

Instructors

Associate Prof. of Applied Mathematics; IDSS; MIT Center

for Statistics and Data Science.

Instructor in Applied Mathematics; IDSS; MIT Center for

Statistics and Data Science.

3/43

Logistics

▶ Optional Recitation: TBD.

▶ Homework: weekly. Total 11, 10 best kept (30%).

▶ Midterm: Nov. 8, in class, 1 hours and 20 minutes (30 %).

Closed books closed notes. Cheatsheet.

▶ Final: TBD, 2 hours (40%). Open books, open notes.

4/43

Miscellaneous

notions of linear algebra (matrix, vector, multiplication,

orthogonality,…)

▶ Reading: There is no required textbook

▶ Slides are posted on course website

https://ocw.mit.edu/courses/mathematics/18-650-statistics-for-applications-fall-2016/lecture-slides

Attendance is still recommended.

5/43

Why statistics?

6/43

Not only in the press

Should be high enough for most floods Should not be

too expensive (high)

What is a fair premium?

44 showed no improvement. Is the drug effective?

8/43

Randomness

9/43

Randomness

RANDOMNESS

9/43

Randomness

RANDOMNESS

Associated questions:

▶ Notion of average (“fair premium”, …)

▶ Quantifying chance (“most of the floods”, …)

▶ Significance, variability, …

9/43

Probability

▶ Probability studies randomness (hence the prerequisite)

▶ Sometimes, the physical process is completely known: dice,

cards, roulette, fair coins, …

Examples

Rolling 1 die:

▶ Alice gets $1 if # of dots 3

▶ Bob gets $2 if # of dots 2

Who do you want to be: Alice or Bob?

Rolling 2 dice:

▶ Choose a number between 2 and 12

▶ Win $100 if you chose the sum of the 2 dice

Well known random process from physics: 1/6 chance of each side,

dice are independent. We can deduce the probability of outcomes,

and expected $ amounts. This is probability. 10/43

Probability

▶ Probability studies randomness (hence the prerequisite)

▶ Sometimes, the physical process is completely known: dice,

cards, roulette, fair coins, …

Examples

Rolling 1 die:

▶ Alice gets $1 if # of dots 3

▶ Bob gets $2 if # of dots 2

Who do you want to be: Alice or Bob?

Rolling 2 dice:

▶ Choose a number between 2 and 12

▶ Win $100 if you chose the sum of the 2 dice

Well known random process from physics: 1/6 chance of each side,

dice are independent. We can deduce the probability of outcomes,

and expected $ amounts. This is probability. 10/43

Probability

▶ Probability studies randomness (hence the prerequisite)

▶ Sometimes, the physical process is completely known: dice,

cards, roulette, fair coins, …

Examples

Rolling 1 die:

▶ Alice gets $1 if # of dots 3

▶ Bob gets $2 if # of dots 2

Who do you want to be: Alice or Bob?

Rolling 2 dice:

▶ Choose a number between 2 and 12

▶ Win $100 if you chose the sum of the 2 dice

Well known random process from physics: 1/6 chance of each side,

dice are independent. We can deduce the probability of outcomes,

and expected $ amounts. This is probability. 10/43

Probability

▶ Probability studies randomness (hence the prerequisite)

▶ Sometimes, the physical process is completely known: dice,

cards, roulette, fair coins, …

Examples

▶ Alice gets $1 if # of dots 3

▶ Bob gets $2 if # of dots 2

Who do you want to be: Alice or Bob?

Rolling 2 dice:

▶ Choose a number between 2 and 12

▶ Win $100 if you chose the sum of the 2 dice

Well known random process from physics: 1/6 chance of each side,

dice are independent. We can deduce the probability of outcomes,

and expected $ amounts. This is probability. 10/43

Statistics and modeling

parameters from data. This is statistics

▶ Sometimes real randomness (random student, biased coin,

measurement error, …)

▶ Sometimes deterministic but too complex phenomenon:

statistical modeling

Complicated process “=” Simple process + random noise

▶ (good) Modeling consists in choosing (plausible) simple

process and noise distribution.

11/43

Statistics vs. probability

effective. Then we can anticipate that for a study on

100 patients, in average 80 will be cured and at least

65 will be cured with 99.99% chances.

be able to) conclude that we are 95% confident that

for other studies the drug will be effective on between

69.88% and 86.11% of patients

13/43

18.650

▶ Understand mathematics behind statistical methods

▶ Justify quantitive statements given modeling assumptions

▶ Describe interesting mathematics arising in statistics

▶ Provide a math toolbox to extend to other models.

▶ Statistical thinking/modeling (applied stats, e.g. IDS.012)

▶ Implementation (computational stats, e.g. IDS.012)

▶ Laundry list of methods (boring stats, e.g. AP stats)

14/43

18.650

▶ Understand mathematics behind statistical methods

▶ Justify quantitive statements given modeling assumptions

▶ Describe interesting mathematics arising in statistics

▶ Provide a math toolbox to extend to other models.

▶ Statistical thinking/modeling (applied stats, e.g. IDS.012)

▶ Implementation (computational stats, e.g. IDS.012)

▶ Laundry list of methods (boring stats, e.g. AP stats)

14/43

Let’s do some statistics

15/43

Heuristics (1)

romantic reappearance later in life.”

the right when kissing.

▶ Let us design a statistical experiment and analyze its outcome.

▶ Observe n kissing couples times and collect the value of each

outcome (say 1 for RIGHT and 0 for LEFT);

▶ Estimate p with the proportion p̂ of RIGHT.

▶ Study: “Human behaviour: Adult persistence of head-turning

asymmetry” (Nature, 2003): n = 124, 80 to the right so

80

p̂ = = 64.5%

124

17/43

Heuristics (2)

▶ 64.5% is much larger than 50% so there seems to be a

preference for turning right.

▶ What if our data was RIGHT, RIGHT, LEFT (n = 3). That’s

66.7% to the right. Even better?

▶ Intuitively, we need a large enough sample size n to make a

call. How large?

the accuracy of this procedure?

18/43

Heuristics (3)

▶ For i = 1, . . . , n, define Ri = 1 if the ith couple turns to the

right RIGHT, Ri = 0 otherwise.

▶ The estimator of p is the sample average

∑ n

¯n = 1

p̂ = R Ri .

n

i=1

that describes/approximates well the experiment.

19/43

Heuristics (4)

observations Ri , i = 1, . . . , n in order to draw statistical

conclusions. Here are the assumptions we make:

20/43

Heuristics (5)

Let us discuss these assumptions.

perfect information about the conditions of kissing (including

what goes in the kissers’ mind), physics or sociology would

allow us to predict the outcome.

Ri → {0, 1}. They could still have a different parameter

Ri " Ber(pi ) for each couple but we don’t have enough

information with the data estimate the pi ’s accurately. So we

simply assume that our observations come from the same

process: pi = p for all i

locations and different times).

21/43

Two important tools: LLN & CLT

1∑

n

IP, a.s.

X̄n := Xi −−−−≥ µ.

n n-≥

i=1

∈ X̄n − µ (d)

n −−−≥ N (0, 1).

α n-≥

∈ (d)

(Equivalently, ¯ n − µ) −−

n (X −≥ N (0, α 2 ).)

n-≥

22/43

Consequences (1)

IP, a.s.

R̄n −−−−≥ p.

n-≥

▶ ¯n

Hence, when the size n of the experiment becomes large, R

is a good (say ”consistent”) estimator of p.

▶ The CLT refines this by quantifying how good this estimate is.

23/43

Consequences (2)

∈ R̄n − p

<n (x): cdf of n√ .

p(1 − p)

CLT: <n (x) ≤ <(x) when n becomes large. Hence, for all x > 0,

( ( ∈ ))

[ ] x n

IP |R̄n − p| 2 x ≤ 2 1 − < √ .

p(1 − p)

24/43

Consequences (3)

Consequences:

▶ ¯ n concentrates around p;

Approximation on how R

N (0, 1), then with probability ≤ 1 − o (if n is large enough !),

[ √ √ ]

qa/2 p(1 − p) qa/2 p(1 − p)

R̄n → p − ∈ ,p + ∈ .

n n

25/43

Consequences (4)

▶ Note that no matter the (unknown) value of p,

p(1 − p) 1/4.

▶ Hence, roughly with probability at least 1 − o,

[ ]

qa/2 qa/2

R̄n → p − ∈ , p + ∈ .

2 n 2 n

▶ In

[ other words, when n becomes

] large, the interval

qa/2 qa/2

R̄n − ∈ , R̄n + ∈ contains p with probability 2 1 − o.

2 n 2 n

▶ This interval is called an asymptotic confidence interval for p.

▶ In the kiss example, we get

[ 1.96 ]

0.645 ± ∈ = [0.56, 0.73]

2 124

If the extreme (n = 3 case) we would have [0.10, 1.23] but

CLT is not valid! Actually we can make exact computations!

26/43

Another useful tool: Hoeffding’s inequality

What if n is not so large ?

Hoeffding’s inequality (i.i.d. case)

Let n be a positive integer and X, X1 , . . . , Xn be i.i.d. r.v. such

that X → [a, b] a.s. (a < b are given numbers). Let µ = IE[X].

Then, for all λ > 0,

2n�2

−

IP[|X̄n − µ| 2 λ] 2e (b−a)2 .

Consequence:

▶ For o → (0, 1), with probability 2 1 − o,

√ √

log(2/o) ¯ n + log(2/o) .

R̄n − p R

2n 2n

▶ This holds even for small sample sizes n.

27/43

Review of different types of convergence (1)

deterministic).

▶ Almost surely (a.s.) convergence:

[{ }]

a.s.

Tn −−−≥ T iff IP θ : Tn (θ) −−−≥ T (θ) = 1.

n-≥ n-≥

▶ Convergence in probability:

IP

Tn −−−≥ T iff IP [|Tn − T | 2 λ] −−−≥ 0, ⇒λ > 0.

n-≥ n-≥

28/43

Review of different types of convergence (2)

▶ Convergence in Lp (p 2 1):

Lp

Tn −−−≥ T iff IE [|Tn − T |p ] −−−≥ 0.

n-≥ n-≥

▶ Convergence in distribution:

(d)

Tn −−−≥ T iff IP[Tn x] −−−≥ IP[T x],

n-≥ n-≥

Remark

These definitions extend to random vectors (i.e., random variables

in IRd for some d 2 2).

29/43

Review of different types of convergence (3)

(d)

(i) Tn −−−≥ T ;

n-≥

(ii) IE[f (Tn )] −−−≥ IE[f (T )], for all continuous and

n-≥

bounded function f ;

[ ] [ ]

(iii) IE eixTn −−−≥ IE eixT , for all x → IR.

n-≥

30/43

Review of different types of convergence (4)

Important properties

▶ If (Tn )n?1 converges a.s., then it also converges in probability,

and the two limits are equal a.s.

q p and in probability, and the limits are equal a.s.

distribution

▶ If f is a continuous function:

a.s./IP/(d) a.s./IP/(d)

Tn −−−−−−−≥ T ≈ f (Tn ) −−−−−−−≥ f (T ).

n-≥ n-≥

31/43

Review of different types of convergence (6)

One can add, multiply, ... limits almost surely and in probability. If

a.s./IP a.s./IP

Un −−−−≥ U and Vn −−−−≥ V , then:

n-≥ n-≥

a.s./IP

▶ Un + Vn −−−−≥ U + V ,

n-≥

a.s./IP

▶ Un Vn −−−−≥ U V ,

n-≥

Un a.s./IP U

▶ If in addition, V ̸= 0 a.s., then −−−−≥ .

Vn n-≥ V

distribution unless the pair (Un , Vn ) converges in distribution to

(U, V ).

33/43

Another example (1)

T1 , . . . , Tn .

▶ Mutually independent

▶ Exponential random variables with common parameter , > 0.

arrival times.

34/43

Another example (2)

completely justified (often the case with independence).

exponential distribution:

▶ The exponential distributions of T1 , . . . , Tn have the same

parameter: in average all the same inter-arrival time. True

̸ 11pm).

only for limited period (rush hour =

35/43

Another example (3)

▶ Density of T1 :

f (t) = ,e− t , ⇒t 2 0.

1

▶ IE[T1 ] = .

,

1

▶ Hence, a natural estimate of is

,

1∑

n

T̄n := Ti .

n

i=1

▶ A natural estimator of , is

ˆ := 1 .

,

T̄n

36/43

Another example (4)

▶ By the LLN’s,

a.s./IP 1

T̄n −−−−≥

n-≥ ,

▶ Hence,

ˆ−a.s./IP

, −−−≥ ,.

n-≥

▶ By the CLT,

( )

∈ 1 (d)

n T̄n − −−−≥ N (0, ,−2 ).

, n-≥

How does the CLT transfer to ,

confidence interval for , ?

37/43

The Delta method

∈ (d)

n(Zn − e) −−−≥ N (0, α 2 ),

n-≥

asymptotically normal around e).

Then,

▶ (g(Zn ))n?1 is also asymptotically normal;

▶ More precisely,

∈ (d)

n (g(Zn ) − g(e)) −−−≥ N (0, g ′ (e)2 α 2 ).

n-≥

38/43

Consequence of the Delta method (1)

∈ ( ) (d)

▶ n ,ˆ − , −−−≥ N (0, ,2 ).

n-≥

qa/2 ,

ˆ − ,|

|, ∈ .

n

[ ]

q , q ,

▶ Can , ˆ − a/2

∈ ,, ˆ + a/2

∈ be used as an asymptotic

n n

confidence interval for , ?

▶ No ! It depends on ,...

39/43

Consequence of the Delta method (2)

▶ In this case, we can solve for ,:

( ) ( )

qa/2 , qa/2 qa/2

ˆ

|, − ,| ∈ ∼≈ , 1 − ∈ , , 1+ ∈

ˆ

n n n

( )−1 ( )

qa/2 qa/2 −1

∼≈ , 1 + ∈

ˆ , , 1− ∈

ˆ .

n n

[ ( ) ( ) ]

qa/2 −1 qa/2 −1

Hence, , ˆ 1+ ∈ ,,ˆ 1− ∈ is an asymptotic

n n

confidence interval for ,.

40/43

Slutsky’s theorem

Slutsky’s theorem

Let (Xn ), (Yn ) be two sequences of r.v., such that:

(d)

(i) Xn −−−≥ X;

n-≥

IP

(ii) Yn −−−≥ c,

n-≥

where X is a r.v. and c is a given real number. Then,

(d)

(Xn , Yn ) −−−≥ (X, c).

n-≥

In particular,

(d)

Xn + Yn −−−≥ X + c,

n-≥

(d)

Xn Yn −−−≥ cX,

n-≥

...

41/43

Consequence of Slutsky’s theorem (1)

▶ Thanks to the Delta method, we know that

∈ ,ˆ − , (d)

n −−−≥ N (0, 1).

, n-≥

ˆ −−IP−≥ ,.

,

n-≥

▶ Hence, by Slutsky’s theorem,

∈ ,ˆ − , (d)

n −−−≥ N (0, 1).

ˆ

, n-≥

[ ]

ˆ

qa/2 , ˆ

qa/2 ,

ˆ − ∈ ,,

, ˆ+ ∈ .

n n

42/43

Consequence of Slutsky’s theorem (2)

Remark:

trick: “p(1 − p) 1/4”.

confidence interval

[ √ √ ]

qa/2 R¯ n (1 − R

¯ n ) qa/2 R¯ n (1 − R

¯n)

R¯ n − ∈ ¯n +

,R ∈ .

n n

43/43

MIT OpenCourseWare

https://ocw.mit.edu

Fall 2016

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.

- IE REVIEW QUESTIONS.pptxUploaded byxxxxxxxxx
- The Impact of Knowledge Management on the Function of Employee Performance Appraisals in Industrial Companies - Case StudyUploaded byijmit
- Conditional ProbabilityUploaded byKhubaib Ahmed
- SeniorProblemSet_Feb16Uploaded byLlosemi Ls
- Efectos de Los Municipios en La EducacionUploaded byAlfredo Nery Soto Guzman
- 21 Only StatsUploaded byThe Gazette
- courtney johnson - author evaluation essayUploaded byapi-461477908
- document for websiteUploaded byapi-381865105
- 4 1 1 howard rUploaded byapi-256165085
- 2 November 7-11, 2016 ok.docxUploaded byjun del rosario
- AP Psych Week 1Uploaded byDavid Park
- Statistical Methods for Life SciencesUploaded byLorrie Suchy
- statistical & non-statistical lessonUploaded byapi-250789865
- AcknowledgementUploaded byHanisah Jamaludin
- Content Mastery BonusUploaded bymandala admaja
- Math 103 05 ProbabilityUploaded byAnonymous hzr2fbc1zM
- AWL 1000 familiesUploaded byapi-19992748
- Investigation on workerspressure of Medical Industries in TamilnaduUploaded byMURUGESAN MURUGESAN.R
- lec20Uploaded bytamizhan
- Lecture 1 IntroUploaded byRammurtiRawat
- The Influence of Earnings Management and Information Asymmetry Towards Cost of Equity Capital Case IDX Property Sector Listed Period 2008-2011Uploaded byCandy Gloria MeiMei
- Meyer - Meaning in MusicUploaded bywmason88
- Johor Education DepartmentUploaded byNaavin Raj
- How to Write a Research PaperUploaded byAshraful Alam
- Correlation Parameters MDUploaded byrathanya
- 810007Uploaded bymansi
- Hasil Uji DeskriptifUploaded bybagussm
- Chapter 3 - Theoretical Framework and Research MethodologyUploaded byramki66666
- Data Management Grade 12 Textbook SlideshareUploaded byCarlos
- Studies in Statistical Inference, Sample Techniques and Demography, by R. Singh, J. Singh, F. SmarandacheUploaded byAnonymous 0U9j6BLllB

- Aplicación la estadística descriptiva e inferencial en el caso de la madurez lectora en la niñez.docxUploaded byErick Diaz
- ASEAN 50 Final IntroducccionUploaded byErick Diaz
- Tarea 2.docxUploaded byErick Diaz
- Cristana.docxUploaded byErick Diaz
- Tarea #1 de Econometria AvanzadaUploaded byErick Diaz
- Tipo de cambio de flotación sucia.docxUploaded byErick Diaz
- tarea 1.docxUploaded byErick Diaz
- Análisis-de-tesis.docxUploaded byErick Diaz
- Presentacion esdistica.docxUploaded byErick Diaz
- Clase Práctica 1 2019Uploaded byErick Diaz
- Rsn 920 Viana BarceloUploaded byErick Diaz
- Crear Un Mejor Sistema de Comercio MundialUploaded byErick Diaz
- 16993.pdfUploaded byErick Diaz
- DISCRIM.pdfUploaded byErick Diaz
- Encuesta Boliche v01 InstruccionesUploaded byErick Diaz
- regresion polinomial cuadratica.pdfUploaded byManuel Garza
- Articulo Regresión LinealUploaded byErick Diaz
- Brenes y Cruz 2016Uploaded byErick Diaz
- 7 Balanced Hospital PublicoUploaded byHéctor Huno
- Operación condorUploaded byErick Diaz
- El Artículo Científico Aspectos a Tener en CuentaUploaded byCarlos Garcia
- Hideki.docxUploaded byErick Diaz
- SDFFUploaded byAlexander Junior Sandoval Flores
- 926-3 Norma Sobre Gestion de Riesgo de Liquidez Refundido 30-11-18 (1)Uploaded byErick Diaz
- lafuncionlogicasiejerciciosresueltos-100302054254-phpapp01Uploaded byguz079
- b. 14balanced Socrecard en Un Hospital PublicoUploaded byErick Diaz
- art4Uploaded byroble-mati
- Ejercicio Tabla dinámica #1.docxUploaded byErick Diaz
- Guía Para El Primer ParcialUploaded byErick Diaz

- 8th grade project marking period 2Uploaded byapi-294921435
- MKXVIIIUploaded byRuben Cahuana
- Report Mission to Kazakhstan, April 2018Uploaded byKatarzyna Chimiak
- Ramayana - FACTS & EVIDENCES (K. Gopalakrishnan, July 2013)Uploaded bySrini Kalyanaraman
- TorqSeal Test ProcedureUploaded byRafiqKu
- Adherencia y Colonizacion de StreptococcusUploaded bynydiacastillom2268
- Of the Delicacy of Taste and PassionUploaded byAthar Khan
- Journal.pone.0123230Uploaded bymilky1792
- Symfony Reference 2.5Uploaded byniurkam
- Observational Study DesignUploaded byIfanda Ibnu Hidayat
- Disasters and ConflictsUploaded byDayal Ravyansh
- The Colonnade, November 18, 1955Uploaded byBobcat News
- Physiology of GridUploaded byantoshdyade
- The Gyula Farkas memorial competition in the context of the Hungarian scientific competitionsUploaded byAmanda Car
- ScorpionUploaded byHersh Hama
- 9808049Uploaded byJohn
- i h 114 Reservoir Flood EstimationUploaded bySudharsananPRS
- 101-Rules-of-ArabicUploaded byislamanalysis
- Kennett Comparative So PolicyUploaded bybeck827
- 2013 Essay - Kampshof, LiamUploaded bymurrali_raj2002
- Common Emitter CharacteristicsUploaded bya2kash
- UofT Undergrad Case Comp Training 2008-01-08Uploaded byvvikap
- Seminar Report of SSTUploaded bychaitanya
- Met by the AD7873 Resistive-Touch-Screen Controller ADCUploaded bysroaa
- MCQs+-+oligopolyUploaded bypalak32
- THEODOLITE TM-5-6675-312-14Uploaded byNayyar Mahmood
- 1666-2011Uploaded byAjay Bidari
- Lec - 6 CodesUploaded byIqbal Uddin Khan
- Combination of Variable Sol.Uploaded byMerve Eser
- 2014 - The Latin American School on Education and the Cognitive and Neural Sciences Goals and challenges.pdfUploaded byMauricio Ríos