Original Title: Biostatistics Notes

Uploaded by lauren smith


Instructor's Copy

Moore/Notz/Fligner

Chapter 11 Sampling Distributions

The material presented in this module forms the backbone for much of the material on statistical inference to follow; namely, confidence intervals and significance tests based on large samples. One cannot understand the meaning of a P-value, for example, without understanding sampling distributions. It may not be easy for you to grasp the idea that a statistic has its own probability distribution, and concepts such as the mean of a sample mean confuse many newcomers. But time invested here will yield rewards repeatedly as we continue to study.

It is in this module that we will formally introduce the distinction between a statistic and a parameter, for here we begin to turn toward inference from sample to population. A common population parameter of interest is the population mean μ. The sample mean x̄ is commonly used to estimate μ, so its sampling distribution is important.

In this module we examine several important results concerning this particular sampling distribution:

(1) the mean of the statistic x̄ is the same as the mean μ of the population we're sampling from;

(2) the standard deviation of the statistic x̄ is σ/√n;

(3) because of this result, the Law of Large Numbers states that the statistic x̄ has less variability (greater consistency) when based on a larger sample size;

(4) the Central Limit Theorem states that the sampling distribution of x̄ will be approximately Normal, no matter what population the sample is taken from, provided the sample size is large enough.
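These results can be checked empirically with a short simulation (a sketch, not part of the notes; the population here is an assumed right-skewed one, so nothing relies on Normality):

```python
# Sketch with a hypothetical population: simulate many SRSs from a
# right-skewed population and check results (1) and (2) empirically.
import random
import statistics

random.seed(0)

# Right-skewed (exponential-like) population with mean ~10.
population = [random.expovariate(1 / 10) for _ in range(100_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 50
xbars = [statistics.fmean(random.sample(population, n)) for _ in range(5_000)]

mean_of_xbars = statistics.fmean(xbars)  # result (1): close to mu
sd_of_xbars = statistics.pstdev(xbars)   # result (2): close to sigma / sqrt(n)
```

Plotting the `xbars` values as a histogram would also show the roughly Normal shape promised by result (4).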

The process of statistical inference involves using information from a sample to draw conclusions about a wider population.

Population: the entire group of individuals about which we want information. Ex. people with chronic pain (but we can't study all of them, so...).

Sample: the part of the population from which we actually collect information. Ex. a selected group of people to get data from.

Simple Random Sample (SRS): a sample of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected. An SRS does not favor any part of the population; it helps eliminate potential bias because the selection of samples is left to probability and chance.
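A minimal sketch of drawing an SRS in code (the sampling frame here is hypothetical): Python's `random.sample` chooses without replacement, giving every set of n individuals the same chance of being the sample selected.

```python
# Drawing an SRS of size n from a hypothetical sampling frame.
import random

random.seed(1)

population = [f"person_{i}" for i in range(500)]  # hypothetical frame
n = 10
srs = random.sample(population, n)  # n distinct individuals, chosen by chance
```

Because the choice is left entirely to chance, no part of the frame is favored.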

Parameter: a number that describes the population. In statistical practice, the value is not known because we cannot examine the entire population.

Statistic: a number that describes some characteristic of a sample; the value of a statistic can be computed from the sample data without making use of any unknown parameters. In practice, we often use a statistic to estimate an unknown parameter.

Statistics come from Samples: x̄ (x-bar) = sample mean; s = sample standard deviation

Parameters come from Populations: μ (mu) = population mean; σ (sigma) = population standard deviation

Sampling Variability

Sampling variability: the value of a statistic varies in repeated random sampling. Sampling variability is inevitable no matter how careful we are with random sampling. Think of a statistic as a random variable, because it takes on numerical values that describe the outcome of a random sampling process.

Chances are we will get many different statistics. How, then, can x̄ (x-bar) be an accurate estimate of the population parameter μ (mu), since many random samples will produce different x-bars?

One solution: increase the sample size.

Law of large numbers: draw observations at random from any population with finite mean μ (mu). As the number of observations drawn increases, the sample mean x̄ (x-bar) of the observed values gets closer and closer to the mean μ of the population (i.e., the statistic gets closer to the parameter). This is the case for any population and any distribution (whether Normal or skewed); if we use a large enough sample, we can get an estimate that is very close to the population parameter.

The problem with this is that we may not always be able to get a large enough sample size.
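The law of large numbers can be sketched with a small simulation (the population mean and spread here are assumed, not from the notes): the running sample mean drifts toward μ as the number of observations grows.

```python
# Sketch of the law of large numbers with a hypothetical population
# (e.g. heights with mean 64.5): the running mean approaches mu as n grows.
import random

random.seed(2)
mu = 64.5  # hypothetical population mean

draws = [random.gauss(mu, 2.5) for _ in range(10_000)]

def running_mean(values, n):
    """Mean of the first n observations."""
    return sum(values[:n]) / n

# Running means at a few checkpoints; the last one is typically very close to mu.
checkpoints = {n: running_mean(draws, n) for n in (10, 100, 1_000, 10_000)}
```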

How can we make inferences about a population based on findings from a smaller sample? Create a sampling distribution:

(1) take every possible sample of a certain size;

(2) calculate the mean for each sample;

(3) graph all the calculated means on a histogram.
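The three-step recipe can be carried out exactly on a toy population (hypothetical numbers), enumerating every possible sample rather than simulating:

```python
# Steps (1)-(3) on a tiny hypothetical population: every sample of size n,
# the mean of each, and a tally standing in for the histogram.
from itertools import combinations
from collections import Counter
from statistics import fmean

population = [2, 4, 6, 8, 10]  # toy population
n = 2

all_means = [fmean(sample) for sample in combinations(population, n)]  # (1)+(2)
histogram = Counter(all_means)                                         # (3)

mu = fmean(population)            # 6.0
mean_of_means = fmean(all_means)  # also 6.0: x-bar is an unbiased estimator
```

Even in this tiny example, the mean of all the sample means equals the population mean, previewing the unbiasedness result below.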

Population distribution: the distribution of values of the variable among all individuals in the population (i.e., each n within the population distribution represents one person, or one data point).

Sampling distribution: the distribution of values taken by a statistic in all possible samples of the same size from the same population (i.e., each n within the sampling distribution represents one sample).

There are actually three distinct distributions involved when we sample repeatedly and measure a variable of interest:

1) The population distribution gives the values of the variable for all the individuals in the population. In reality, it's rarely known.

2) The distribution of sample data shows the values of the variable for all the individuals in the sample. In reality, it's what's created by doing exploratory data analysis on collected data.

3) The sampling distribution shows the statistic values from all the possible samples of the same size from the population. In reality, it's what we really need to look at when making statistical inferences. Simulation: using software to imitate chance behavior (because we can't really take so many samples in real life).

When we choose many SRSs from a population, the sampling distribution of the sample mean is centered at the population mean μ (mu) and is less spread out than the population distribution. (Some sample means may be greater than the population mean; we are equally likely to get some sample means that are less than the population mean. Since there is no systematic tendency to under- or overestimate the parameter, the mean of all the sample means will ultimately equal the population mean, so we can say that x̄, the sample mean, is an unbiased estimator of the parameter.)

In the sampling distribution, most of the outliers seen in the population distribution get evened out; the spread of the sampling distribution is always smaller than that of the population distribution. In fact, as the sample size gets larger, the standard deviation of the sampling distribution gets smaller (this is because larger samples have a greater likelihood of catching outliers from both the high and low ends, thus evening them out). So we can calculate the standard deviation of the sampling distribution by dividing σ (the population standard deviation) by the square root of n.

Suppose that x̄ (x-bar) is the mean of an SRS of size n drawn from a large population with mean μ (mu) and standard deviation σ (sigma). Then:

The mean of the sampling distribution of x̄ is μ_x̄ = μ.

The standard deviation of the sampling distribution of x̄ is σ_x̄ = σ/√n.

These facts about the mean and standard deviation of x̄ (x-bar) are always true, no matter what shape the population distribution has (left-skewed, right-skewed, big spread, little spread)!
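As a quick worked check of the second formula (the numbers here are hypothetical): with σ = 20, quadrupling the sample size halves the spread of x̄.

```python
# sigma_xbar = sigma / sqrt(n): quadrupling n halves the standard deviation
# of the sampling distribution. Hypothetical sigma, for illustration only.
import math

sigma = 20.0  # hypothetical population standard deviation

def sd_of_xbar(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / math.sqrt(n)

se_25 = sd_of_xbar(sigma, 25)    # 20 / 5  = 4.0
se_100 = sd_of_xbar(sigma, 100)  # 20 / 10 = 2.0
```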

What this tells us: when we take a large enough random sample, we can trust that it will help estimate the true population mean accurately, because a large sample will give us a sample mean that is close to the parameter (law of large numbers), and a large sample size will also ensure a small sampling-distribution standard deviation, which means that all large samples will have means that are close to the population parameter. (Slide C.)

Why are all the sampling distributions seen so far Normal? Is it because the population distributions are Normal? What if the population distributions aren't Normal?

Most population distributions are not Normal. What is the shape of the sampling distribution of sample means when the population distribution isn't Normal?

As the sample size increases, the distribution of sample means changes its shape: it looks less like that of the population distribution and more like a Normal distribution! (This is true no matter what the population distribution looks like, so long as the population distribution has a meaningful, finite standard deviation and the sample size is large enough. If the population distribution already resembles a Normal distribution, then the sample size required to create a Normal sampling distribution will be far less than when the population distribution is very skewed. If we have a large enough sample size, then even a very skewed population distribution can have a very Normal sampling distribution = CLT.)

Central limit theorem (CLT): draw an SRS of size n from any population with mean μ (mu) and finite standard deviation σ (sigma). For large n, the sampling distribution of the sample mean x̄ (x-bar) is approximately Normal. (That is, averages are more Normal than individual observations.)

When the sample size is large, the sampling distribution of the sample mean x̄ is approximately Normal, with the mean of the sample means equal to μ (the population parameter) and a standard deviation of σ over the square root of n.

In statistical inference, we rely heavily on the Normal distribution, so the CLT is VERY IMPORTANT! (Slide D for practice.)

If the population distribution is Normal, then so is the sampling distribution of x̄. This is true no matter what the sample size n is.

If the population distribution is not Normal, the central limit theorem (CLT) tells us that the sampling distribution of x̄ (the sample means) will be approximately Normal in most cases if n ≥ 30.
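The n ≥ 30 guideline can be illustrated in code with a strongly skewed population (an assumed exponential one, not from the notes): the skewness of the x-bar values comes out far smaller than the population's, i.e. the sampling distribution is much closer to Normal.

```python
# Sketch of the CLT: sample means from a right-skewed (exponential-like)
# population, with n >= 30, have skewness close to 0 even though the
# population's skewness is about 2.
import random
import statistics

random.seed(3)

def skewness(values):
    """Mean cubed standardized deviation (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

population = [random.expovariate(1.0) for _ in range(200_000)]
n = 40  # comfortably above the n >= 30 guideline
xbars = [statistics.fmean(random.sample(population, n)) for _ in range(4_000)]

pop_skew = skewness(population)  # strongly skewed, roughly 2
xbar_skew = skewness(xbars)      # much closer to 0: roughly Normal
```

Averages really are more Normal than individual observations, which is the whole point of the theorem.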

Remember:

(1) Means of random samples are LESS VARIABLE than individual observations.

(2) Means of random samples are MORE NORMAL than individual observations.
