2020-2021 EDA 101 Lectures (Sampling Distribution and Point Estimates - Test of Hypothesis For Single Population)

Lecture Notes
EDA 101
Luciano M. Medrano Jr.
College of Engineering
President Ramon Magsaysay State University
January 4, 2021
Sampling Distribution and Point Estimates Test of Hypothesis for a Single Population
Sampling Distribution
Results from Chapter 3

Definition (Expected Value)
The mean or expected value of a discrete random variable X, denoted by $\mu_X$ or $E[X]$, is
defined as
$$\mu_X = E[X] = \sum_{\text{all } x} x\, f(x)$$
Definition (Variance)
The variance of a discrete random variable X, denoted by $\sigma_X^2$ or $V[X]$, is
$$\sigma_X^2 = V[X] = \sum_{\text{all } x} (x - \mu_X)^2 f(x)$$
SirMedz CoE - PRMSU


If X is a binomial random variable with parameters n and p, then
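$$\mu = E[X] = np, \qquad \sigma^2 = V[X] = np(1 - p)$$
For the r.v. X where n = 5, p = 0.75,
$$\mu = 5(0.75) = 3.75, \qquad \sigma^2 = 5(0.75)(0.25) = 0.9375$$
For the r.v. Y where n = 8, p = 5/6,
$$\mu = 8\left(\tfrac{5}{6}\right) = \tfrac{20}{3}, \qquad \sigma^2 = 8\left(\tfrac{5}{6}\right)\left(\tfrac{1}{6}\right) = \tfrac{10}{9}$$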


The pdf of a continuous uniform random variable X is
$$f(x) = \frac{1}{b - a}, \qquad a < x < b$$
Its cdf is
$$F(x) = \frac{x - a}{b - a}, \qquad a < x < b$$
The mean and variance of X are
$$\mu = \tfrac{1}{2}(a + b), \qquad \sigma^2 = \tfrac{1}{12}(b - a)^2$$


If $X \sim N(\mu, \sigma^2)$ and
$$Z = \frac{X - \mu}{\sigma} = \frac{X - E[X]}{\sqrt{V[X]}}$$
then:
The density function of Z is $f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}$.
Z is normally distributed with $\mu_Z = 0$ and $\sigma_Z^2 = 1$. Z is written $Z \sim N(0, 1)$.
The distribution of Z is called the standard normal distribution.
The value/score $x_i$ of X is equivalent to the standard score $z_i = \frac{x_i - \mu}{\sigma}$.
Conversely, the equivalent score of $z_i$ is $x_i = \mu + z_i \sigma$.
$P[x_1 < X < x_2] = P[z_1 < Z < z_2]$.


1 If $X_1, X_2, \ldots, X_n$ are independent random variables, then
$$E[c_1 X_1 + c_2 X_2 + \cdots + c_n X_n] = c_1 \mu_1 + c_2 \mu_2 + \cdots + c_n \mu_n$$
$$V[c_1 X_1 + c_2 X_2 + \cdots + c_n X_n] = c_1^2 \sigma_1^2 + c_2^2 \sigma_2^2 + \cdots + c_n^2 \sigma_n^2$$
2 If $X_1, X_2, \ldots, X_n$ are independent and identically distributed (same f(x), mean µ, and variance σ²) random variables, then
$$E[c_1 X_1 + c_2 X_2 + \cdots + c_n X_n] = (c_1 + c_2 + \cdots + c_n)\mu$$
$$V[c_1 X_1 + c_2 X_2 + \cdots + c_n X_n] = (c_1^2 + c_2^2 + \cdots + c_n^2)\sigma^2$$

Furthermore, if the random variables are normally distributed, then
$c_1 X_1 + c_2 X_2 + \cdots + c_n X_n$ is normally distributed.
Definition (Random Sample)

The independent and identically distributed random variables X1 , X2 , . . . , Xn constitute a
random sample of size n from a population described by the random variable X.
Definition (Sample Mean $\bar{X}$)

The random variable $\bar{X}$, the sample mean, is defined as
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
The distribution of $\bar{X}$ is called the sampling distribution of the mean.

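In the linear combination $c_1 X_1 + c_2 X_2 + \cdots + c_n X_n$, we set
$$c_1 = c_2 = \cdots = c_n = \frac{1}{n}$$
for the sample mean $\bar{X}$ so that
$$\mu_{\bar{X}} = E[\bar{X}] = E\left[\frac{1}{n}(X_1 + \cdots + X_n)\right] = \frac{1}{n}(\mu + \cdots + \mu) = \frac{1}{n}(n\mu) = \mu$$
$$\sigma_{\bar{X}}^2 = V[\bar{X}] = V\left[\frac{1}{n}(X_1 + \cdots + X_n)\right] = \frac{1}{n^2}(\sigma^2 + \cdots + \sigma^2) = \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n}$$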

Example
An electronics company manufactures resistors that have a mean resistance of 100 ohms and a
standard deviation of 10 ohms. The resistance follows a normal distribution. Find the
probability that a random sample of 25 resistors will have an average resistance of less than
95 ohms.
The distribution of $\bar{X}$ is normal with $E[\bar{X}] = \mu = 100$ and $V[\bar{X}] = \frac{\sigma^2}{n} = \frac{10^2}{25} = 4$.
$$P[\bar{X} < 95] = P\left[\frac{\bar{X} - E[\bar{X}]}{\sqrt{V[\bar{X}]}} < \frac{95 - 100}{\sqrt{4}}\right] = P[Z < -2.5] = 0.0062$$

If the distribution of resistance is normal with mean 100 ohms and standard deviation of 10 ohms, finding a
random sample of resistors with a sample mean less than 95 ohms is a rare event. If this actually happens, it
casts doubt as to whether the true mean is really 100 ohms or if the true standard deviation is really 10 ohms.
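
The resistor calculation can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100.0, 10.0, 25      # population mean, population sd, sample size

# Sampling distribution of the sample mean: N(mu, sigma^2 / n)
xbar_dist = NormalDist(mu, sigma / sqrt(n))

# P[Xbar < 95], which equals P[Z < -2.5]
p_below_95 = xbar_dist.cdf(95.0)
print(round(p_below_95, 4))
```

Working with the distribution of the sample mean directly avoids a separate standardization step; standardizing first and calling `NormalDist().cdf(-2.5)` gives the same probability.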

Central Limit Theorem

Let f(x) be a probability function of a random variable X with mean µ and variance σ². Let $\bar{X}$
be the mean of a random sample of size n from a population with distribution f(x). Then
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{as } n \to \infty$$
The above result implies the following:

1 $E[\bar{X}] = \mu = E[X]$. The expected value of $\bar{X}$ is the mean of the population.
2 $V[\bar{X}] = \sigma^2/n = V[X]/n$. The variance of the sample mean is a fraction 1/n of the
variance of the population.
3 The distribution of $\bar{X}$ is approximately normal when n is large.
4 As a rule of thumb, when n ≥ 30 the shape of the distribution of X is irrelevant; only its mean and variance matter.


The result of the Central Limit Theorem can also be stated as
$$\frac{\bar{X} - E[\bar{X}]}{\sqrt{V[\bar{X}]}} \sim N(0, 1)$$
or equivalently,
$$\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$$

Example
Suppose that a random variable X has a continuous uniform distribution
f (x) = 0.5 for 4 ≤ x ≤ 6
Find the distribution of the sample mean of a random sample of size 40.
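The distribution of $\bar{X}$ is approximately normal with mean and variance
$$\mu_{\bar{X}} = E[X] = \frac{a + b}{2} = \frac{4 + 6}{2} = 5$$
$$\sigma_{\bar{X}}^2 = \frac{V[X]}{n} = \frac{(b - a)^2 / 12}{40} = \frac{1}{120}$$
or
$$\bar{X} \sim N\left(5, \tfrac{1}{120}\right)$$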

Example
What is the probability that a random sample of size 40 has a mean value between 4.8 and 5.3?
$$P[4.8 < \bar{X} < 5.3] = P\left[\frac{4.8 - 5}{\sqrt{1/120}} < \frac{\bar{X} - E[\bar{X}]}{\sqrt{V[\bar{X}]}} < \frac{5.3 - 5}{\sqrt{1/120}}\right] = P[-2.19 < Z < 3.29] \approx 0.985$$
Note: If X is normally distributed, the sampling distribution of $\bar{X}$ is exactly normal regardless of the sample size n.
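
The uniform example can be reproduced in a few lines. This is a sketch under the normal approximation given by the Central Limit Theorem, assuming Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

a, b, n = 4.0, 6.0, 40               # uniform limits and sample size

mean_xbar = (a + b) / 2              # mean of the sample mean
var_xbar = ((b - a) ** 2 / 12) / n   # V[X] / n, which is 1/120 here

# Approximate sampling distribution of the sample mean
xbar_dist = NormalDist(mean_xbar, sqrt(var_xbar))

# P[4.8 < Xbar < 5.3] under the normal approximation
prob = xbar_dist.cdf(5.3) - xbar_dist.cdf(4.8)
print(round(prob, 4))
```

Because the code does not round the z-scores to two decimals, its result can differ from a table lookup in the third or fourth decimal place.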

Sampling from Two Independent Populations

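Suppose we draw
a random sample of size $n_1$ from one population with mean $\mu_1$ and variance $\sigma_1^2$
another random sample of size $n_2$ from a second population with mean $\mu_2$ and variance $\sigma_2^2$
Then
$$E[\bar{X}_1] = \mu_1, \qquad E[\bar{X}_2] = \mu_2$$
$$V[\bar{X}_1] = \frac{\sigma_1^2}{n_1}, \qquad V[\bar{X}_2] = \frac{\sigma_2^2}{n_2}$$
and
$$E[\bar{X}_1 - \bar{X}_2] = \mu_1 - \mu_2, \qquad V[\bar{X}_1 - \bar{X}_2] = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$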

If $n_1 \ge 30$ and $n_2 \ge 30$, the Central Limit Theorem guarantees that the sampling distributions
of $\bar{X}_1$ and $\bar{X}_2$ are approximately normal, and consequently,
$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
or
$$\frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$
Note: If both populations are normally distributed, the condition for the sample sizes n1 and n2 is relaxed.

Example
Two independent experiments are run in which two different types of paint are compared.
Thirty-six specimens are painted using type A, and the drying time, in hours, is recorded for
each. The same is done with type B. The population standard deviations are both known to be
1.0.
Assuming that the mean drying time is equal for the two types of paint, find
$P[\bar{X}_A - \bar{X}_B > 0.5]$.
Since $n_1 = n_2 = 36 > 30$, we can apply the Central Limit Theorem.
$$\bar{X}_A - \bar{X}_B \sim N\left(\mu_A - \mu_B,\ \frac{1.0^2}{36} + \frac{1.0^2}{36}\right)$$
Also, $\mu_A = \mu_B$ (the mean drying time is equal for the two types of paint). Thus,
$$P[\bar{X}_A - \bar{X}_B > 0.5] = P\left[Z > \frac{0.5 - 0}{\sqrt{\dfrac{1.0^2}{36} + \dfrac{1.0^2}{36}}}\right] = P\left[Z > \frac{0.5}{\sqrt{1/18}}\right] \approx 0.0170$$
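
The paint comparison follows the same pattern as a single-sample calculation, with the standard error of the difference in place of $\sigma/\sqrt{n}$. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` and illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

sigma_a = sigma_b = 1.0   # known population standard deviations
n_a = n_b = 36            # specimens per paint type
mean_diff = 0.0           # mu_A - mu_B, assumed equal in the example

# Standard error of Xbar_A - Xbar_B: sqrt(sigma_A^2/n_A + sigma_B^2/n_B)
se_diff = sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)

z = (0.5 - mean_diff) / se_diff
p_upper = 1 - NormalDist().cdf(z)    # P[Xbar_A - Xbar_B > 0.5]
print(round(z, 4), round(p_upper, 4))
```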

Let X be a Bernoulli random variable with distribution
$$f(x) = \begin{cases} p & x = 1 \\ q & x = 0 \end{cases}$$
where p is a probability value and q = 1 − p.
The mean and variance of X are
$$\mu = p, \qquad \sigma^2 = p(1 - p) = pq$$
For a random sample $X_1, X_2, \ldots, X_n$ of such variables, the sum $X_1 + X_2 + \cdots + X_n$ has a binomial distribution with mean and variance
$$\mu = np, \qquad \sigma^2 = np(1 - p) = npq$$

The sample mean
$$\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$
represents the proportion of success in n ‘trials’ with
$$E[\bar{X}] = \frac{1}{n}(np) = p$$
$$V[\bar{X}] = \frac{1}{n^2}\, np(1 - p) = \frac{pq}{n}$$

Results
If n ≥ 30,
$$\frac{\bar{X} - p}{\sqrt{pq/n}} \sim N(0, 1)$$
If n < 30 but np > 5 and nq > 5, the distribution of $\bar{X}$ is approximately normal.
If $n_1 \ge 30$ and $n_2 \ge 30$,
$$\frac{\bar{X}_1 - \bar{X}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}} \sim N(0, 1)$$
If $n_1 < 30$ and $n_2 < 30$ but $n_1 p_1 > 5$, $n_1 q_1 > 5$, $n_2 p_2 > 5$ and $n_2 q_2 > 5$, the distributions
of $\bar{X}_1$ and $\bar{X}_2$ are approximately normal.

Definition (Sample Variance $S^2$)

If $X_1, X_2, \ldots, X_n$ is a random sample from a population, the random variable $S^2$, the sample
variance, is defined as
$$S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
where $\bar{X}$ is the sample mean.
The distribution of $S^2$ is called the sampling distribution of the variance.
An alternate form of the sample variance is
$$S^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} X_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2\right]$$


Sample Computation: n = 5
        x_i     x_i^2
        1.3     1.69
        1.8     3.24
        1.4     1.96
        1.1     1.21
        1.8     3.24
Sum     7.4     11.34
$$s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right] = \frac{1}{4}\left[11.34 - \frac{1}{5}(7.4)^2\right] = 0.097$$
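
The same computation can be checked three ways: with the shortcut form, with the defining formula, and with a library routine. The sketch below assumes Python's standard-library `statistics.variance`, which uses the n − 1 divisor; the variable names are illustrative.

```python
from statistics import variance

data = [1.3, 1.8, 1.4, 1.1, 1.8]
n = len(data)

sum_x = sum(data)                     # 7.4
sum_x2 = sum(x * x for x in data)     # 11.34

# Shortcut form: [sum(x^2) - (sum x)^2 / n] / (n - 1)
s2_shortcut = (sum_x2 - sum_x**2 / n) / (n - 1)

# Defining form: sum((x - xbar)^2) / (n - 1)
xbar = sum_x / n
s2_definition = sum((x - xbar) ** 2 for x in data) / (n - 1)

# Library routine (sample variance with the n - 1 divisor)
s2_library = variance(data)

print(round(s2_shortcut, 3), round(s2_definition, 3), round(s2_library, 3))
```

The shortcut form is convenient for hand computation, but the defining form is usually less sensitive to rounding when the data values are large relative to their spread.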


Point Estimation
Probability Theory
The time X until recharge for a battery in a laptop computer under common conditions is
normally distributed with µ = 260 minutes and σ = 50 minutes. Find the probability that a
fully charged laptop lasts anywhere from 3 to 4 hours.
$$P[180 < X < 240] = \int_{180}^{240} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{1}{2 \cdot 50^2}(x - 260)^2}\, dx = 0.2898$$
We know the distribution (including µ and σ 2 ) of the battery life.

⇒ We compute the probability of a battery lasting a given number of minutes.
⇒ We interpret the probability to predict the number of units that will last the given number of
minutes. That is, if there are 80 units of a laptop computer, we expect 80(0.2898) = 23.184, or about 23, of them
to last between 180 and 240 minutes.
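
The battery probability does not require evaluating the integral by hand. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Time until recharge, in minutes: X ~ N(260, 50^2)
battery = NormalDist(mu=260, sigma=50)

# P[180 < X < 240], i.e. the battery lasts between 3 and 4 hours
p = battery.cdf(240) - battery.cdf(180)

# Expected number of units, out of 80, that fall in that range
expected_units = 80 * p
print(round(p, 4), round(expected_units, 2))
```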

Statistical Inference
1 The distribution of the time X until recharge for a battery in a laptop is unknown.
2 The mean and variance of X are unknown.
⇒ We measure the time x until recharge of a sample of laptop computers, say 25 units.
⇒ We use some formula (like $\bar{x}$) to estimate the mean life µ until recharge of a laptop battery.

Terminologies
Definition (Parameter)
A parameter is a quantity, θ, that is a property of an unknown probability distribution.
Definition (Statistic)
A statistic is a function of observable random variables.
Examples (sample size n = 3)

$$\frac{5X_1 + 2X_2 - 3X_3}{4}, \qquad \frac{X_1 + X_2 + X_3}{3}, \qquad \frac{2X_1 + X_2 + 2X_3}{5}$$
Our goal is to find out as much as possible about a parameter θ, using the information
contained within a sample.

Definition (Point Estimator)
A statistic that is used to estimate an unknown parameter θ is a point estimator of θ, denoted Θ̂.
When the observations are recorded, the statistic takes a value θ̂ called the point estimate.
Examples (Estimates of µ)
Let x1 = 1.5, x2 = 1.8, x3 = 1.4.
1 $\frac{1}{4}(5x_1 + 2x_2 - 3x_3) = \frac{6.9}{4} = 1.725$
2 $\frac{1}{3}(x_1 + x_2 + x_3) = \frac{4.7}{3} \approx 1.567$
3 $\frac{1}{5}(2x_1 + x_2 + 2x_3) = \frac{7.6}{5} = 1.52$
Point estimates can only be as good as the data set from which they are calculated.

General Concepts of Estimators
Definition (Unbiased Estimator)

The point estimator Θ̂ is an unbiased estimator of the unknown parameter θ if
$$E[\hat{\Theta}] = \theta$$
If the estimator is not unbiased, the quantity
$$E[\hat{\Theta}] - \theta$$
is called the bias of Θ̂.

Examples
1 $\hat{\mu}_1 = \frac{5X_1 + 2X_2 - 3X_3}{4}$: $E[\hat{\mu}_1] = \frac{5\mu + 2\mu - 3\mu}{4} = \mu$
2 $\hat{\mu}_2 = \frac{X_1 + X_2 + X_3}{3}$: $E[\hat{\mu}_2] = \frac{\mu + \mu + \mu}{3} = \mu$
3 $\hat{\mu}_3 = \frac{2X_1 + X_2 + 2X_3}{5}$: $E[\hat{\mu}_3] = \frac{2\mu + \mu + 2\mu}{5} = \mu$
All three estimators are unbiased for µ.
The following are unbiased estimators:

The sample mean $\bar{X}$ for µ.
The proportion P̂ = X0 /n for the population proportion p.
The sample variance $S^2$ for $\sigma^2$.

Variance of an Estimator

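$$V\left[\frac{5X_1 + 2X_2 - 3X_3}{4}\right] = \frac{1}{16}(25 + 4 + 9)\sigma^2 = \frac{19}{8}\sigma^2$$
$$V\left[\frac{X_1 + X_2 + X_3}{3}\right] = \frac{1}{9}(1 + 1 + 1)\sigma^2 = \frac{1}{3}\sigma^2$$
$$V\left[\frac{2X_1 + X_2 + 2X_3}{5}\right] = \frac{1}{25}(4 + 1 + 4)\sigma^2 = \frac{9}{25}\sigma^2$$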
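
For a linear estimator $c_1 X_1 + c_2 X_2 + c_3 X_3$ of an i.i.d. sample, the expected value is $(\sum c_i)\mu$ and the variance is $(\sum c_i^2)\sigma^2$, so both properties can be read off the coefficients. The sketch below checks the three estimators with exact fractions; the dictionary keys and variable names are illustrative labels, not notation from the lecture.

```python
from fractions import Fraction

# Coefficients c_i of each linear estimator c1*X1 + c2*X2 + c3*X3
estimators = {
    "mu_hat_1": [Fraction(5, 4), Fraction(2, 4), Fraction(-3, 4)],
    "mu_hat_2": [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
    "mu_hat_3": [Fraction(2, 5), Fraction(1, 5), Fraction(2, 5)],
}

# E[estimator] = (sum of c_i) * mu, so a sum of 1 means unbiased for mu
mean_factor = {name: sum(c) for name, c in estimators.items()}

# V[estimator] = (sum of c_i^2) * sigma^2
var_factor = {name: sum(ci**2 for ci in c) for name, c in estimators.items()}

for name in estimators:
    print(name, mean_factor[name], var_factor[name])
```

Among these three unbiased estimators, the sample mean (`mu_hat_2`) has the smallest variance factor.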

Minimum Variance Principle

[Figure comparing two estimators, Θ̂1 and Θ̂2]
If all unbiased estimators Θ̂ are considered, the one with the smallest variance is called the
minimum variance unbiased estimator.
The sample mean $\bar{X}$ is the minimum variance unbiased estimator of µ.

Definition (Standard Error)

The standard error of an estimator Θ̂ is
$$\operatorname{se}(\hat{\Theta}) = \sqrt{V[\hat{\Theta}]}$$
If the standard error of Θ̂ has an unknown parameter, an estimate of the unknown parameter is
used and the value is called standard error estimate.
Example (Standard Error of a Population Proportion)
$$\operatorname{se}(\hat{P}) = \sqrt{\frac{p(1 - p)}{n}} \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$


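Whenever the conditions of the Central Limit Theorem are satisfied,
$$\frac{\hat{\Theta} - E[\hat{\Theta}]}{\operatorname{se}(\hat{\Theta})} \sim N(0, 1)$$
where Θ̂ can be
1 $\bar{X}$
2 $\bar{X}_1 - \bar{X}_2$
3 $\hat{P}$
4 $\hat{P}_1 - \hat{P}_2$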

Hypothesis Testing
Definition (Statistical Inference)

Statistical inference consists of those methods by which one makes inferences or generalizations
about a population.
Classical inference is based strictly on information obtained from a random sample
selected from the population.
Bayesian inference utilizes prior subjective knowledge about the probability distribution of the
unknown parameters in conjunction with the information provided by the sample
data.

Areas of Statistical Inference

1 Estimation – A parameter θ is estimated using a statistic θ̂.
  point: A single value θ̂ estimates θ, with probability 0 or 1.
  interval: A range of plausible values of θ with confidence level 100(1 − α)%.
2 Test of Hypothesis – a methodology which allows an experimenter to assess the
plausibility or credibility of a specific statement or hypothesis.

Test of Hypothesis
Examples
1 A supplier claims that its products made from a graphite-epoxy composite material have a
tensile strength of 40. An experimenter may test this claim by collecting a random sample
of products and measuring their tensile strengths.
2 Immediately below the asphalt surface of a roadway is a layer of base material composed
of a crushed stone or gravel aggregate. The resilient modulus of this aggregate is a
measure of how the aggregate deforms when subjected to stress, and it is an important
property affecting the manner in which the roadway responds to loads. A construction
engineer has four different suppliers of this aggregate material who obtain their raw
materials from four different locations. The engineer would like to assess whether the
aggregates from the four different locations have different values of resilient modulus.

About the Hypothesis

The statistical hypothesis is either true or false but is never known with absolute certainty
unless the population is examined.
A random sample from the population is taken and the information contained in it
provides evidence that either supports or does not support the hypothesis.
Evidence that is inconsistent with the stated hypothesis leads to the rejection of the
hypothesis.
Risk is always present whenever a decision to reject or not reject a hypothesis is based on
a random sample.
A hypothesis-testing procedure should be developed with the probability of reaching a wrong
conclusion in mind.

A Simple Case
From a packet of 200 seeds that I planted, only 180 germinated.
It is reasonable to conclude that this evidence does not refute p = 0.93.
It also does not refute p = 0.90 or perhaps even p = 0.94.
Rejection of a hypothesis implies that the sample evidence refutes the hypothesis.
What is the risk of rejecting a hypothesis, say p = 0.94, when in fact the hypothesis is true?
$$P[X \le 180] = \sum_{x=0}^{180} \binom{200}{x} (0.94)^x (0.06)^{200 - x} = 0.018, \qquad \text{where } X \sim \text{bin}(200, 0.94)$$
In the light of the evidence, the risk of incorrectly rejecting a true hypothesis p = 0.94 is
0.018. Thus, the decision is to reject the hypothesis.
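
The binomial tail probability in the seed example can be summed exactly. A minimal sketch using only Python's standard-library `math.comb`; the variable names are illustrative.

```python
from math import comb

n, p, observed = 200, 0.94, 180   # seeds planted, hypothesized rate, seeds germinated

# P[X <= 180] for X ~ bin(200, 0.94): exact sum of the binomial pmf
risk = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(observed + 1))
print(round(risk, 3))
```

Changing `p` to 0.93 or 0.90 shows how the same evidence becomes less surprising under the other hypothesized values mentioned in the example.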
Structure of Hypothesis Testing

1 Evidence is collected with the purpose of rejecting a hypothesis H0 , called the null
hypothesis.
2 An alternative hypothesis H1 is logically formulated with the null hypothesis.
3 If the dataset provides evidence of a small risk of rejecting H0 , it is rejected in favor of the
alternative.
Examples
1 To support the claim that one kind of gauge is more accurate than another, the engineer
tests the (null) hypothesis that there is no difference in the accuracy of the two kinds of
gauges.
2 A medical researcher wishes to show strong evidence that coffee drinking increases the risk
of cancer by rejecting the (null) hypothesis of “there is no increase in cancer risk produced
by drinking coffee.”
One-sided Alternative Hypotheses

1 The mean pull-off force of a connector depends on cure time. An experiment is performed
to demonstrate that the pull-off force is below 25 newtons.
H0 : µ ≥ 25
H1 : µ < 25
2 A textile fiber manufacturer is investigating a new drapery yarn, which the company
claims has a mean thread elongation of (at least) 12 kilograms.
H0 : µ ≤ 12
H1 : µ > 12


3 An allergist wishes to test the hypothesis that at least 30% of the public is allergic to
some cheese products.
H0 : p ≤ 30%
H1 : p > 30%

Two-sided Alternative Hypotheses

1 The machine that produces metal cylinders is set to make cylinders with a diameter of 50
mm. Is it calibrated correctly?
H0 : µ = 50
H1 : µ ≠ 50
2 A random sample of 400 voters in a certain city are asked if they favor an additional 4%
gasoline sales tax to provide badly needed revenues for street repairs. If more than 220 but
fewer than 260 favor the sales tax, we shall conclude that 60% of the voters are for it.
H0 : p = 0.60
H1 : p ≠ 0.60

Decision Errors
                        Decision
State of H0     Reject H0               Do not reject H0
H0 is true      type I error (α)        correct decision
H0 is false     correct decision        type II error (β)

α is the level of significance, also called the size of the test:
$$\alpha = P[\text{reject } H_0 \mid H_0 \text{ is true}]$$
$$\beta = P[\text{fail to reject } H_0 \mid H_0 \text{ is false}]$$
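
As an illustration of α, the voter example from the two-sided hypotheses can be used: H0 : p = 0.60 is retained when more than 220 but fewer than 260 of the 400 voters favor the tax. The sketch below computes the probability of landing outside that range when H0 is true. It is an added illustration, not a computation from the lecture, and the helper `binom_pmf` is a hypothetical name.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability that a bin(n, p) variable equals x."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p0 = 400, 0.60

# "More than 220 but fewer than 260": counts 221 through 259 do not reject H0
do_not_reject = range(221, 260)

# alpha = P[reject H0 | H0 is true] = 1 - P[221 <= X <= 259] when p = 0.60
alpha = 1 - sum(binom_pmf(x, n, p0) for x in do_not_reject)
print(round(alpha, 4))
```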

Test Statistic
The “discrepancy” between the data set and the null hypothesis is measured through a test
statistic.
Example
A supplier claims that its products made from a graphite-epoxy composite material have a
tensile strength of 40. When the tensile strengths of 30 randomly selected products are
measured, a sample mean of x̄ = 38.518 and a sample standard deviation of s = 2.299 are
obtained.
The statistic
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
measures the discrepancy between x̄ = 38.518 and µ = 40.
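
Plugging the sample values into the statistic gives a concrete number. A minimal sketch with illustrative variable names; it computes only the statistic and its degrees of freedom.

```python
from math import sqrt

xbar, mu0, s, n = 38.518, 40.0, 2.299, 30   # sample mean, claimed mean, sample sd, sample size

# t = (xbar - mu0) / (s / sqrt(n))
t = (xbar - mu0) / (s / sqrt(n))
df = n - 1                                   # degrees of freedom, nu = n - 1
print(round(t, 4), df)
```

The negative sign shows that the sample mean lies below the claimed value; the magnitude expresses the gap in units of the estimated standard error.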

p-value
The p-value is a measure of the plausibility or credibility of the null hypothesis.
It is the observed level of significance.
It is the smallest level of significance that would lead to the rejection of H0.
It is a measure of the risk of rejecting H0 in favor of H1 when H0 is true.
It is the probability of obtaining the observed data set (or one even more extreme) when H0 is true.
$$p = P[H_0 \text{ is rejected} \mid H_0 \text{ is true}]$$

Example
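A supplier claims that its products made from a graphite-epoxy composite material have a
tensile strength of 40. A random sample of size 30 yields x̄ = 38.518 and s = 2.299.
[Figure: sampling distribution with ν = 29 degrees of freedom; the horizontal axis marks 38.518 and 40]
$$p = P[\bar{X} \le 38.518]$$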
Decision
Decision Rule
If the observed level of significance p is less than the specified level of significance α, the
decision is to reject H0 in favor of H1 , otherwise, the null hypothesis is not rejected.

Hypothesis Testing Procedure

A State H0 and H1
B Specify the level of significance α
C Determine the test statistic
D Compute value of the test statistic and the observed level of significance
E Decide whether or not H0 is rejected.
F Report your conclusion in the context of the problem.

Population Mean
Alternative Hypothesis H1 and p-value
Test for   Conditions   Statistic                              H1         p-value Computation
mean µ     σ known      z = (x̄ − µ0)/(σ/√n)                    µ ≠ µ0     2 × P[Z > |z|]
                                                               µ < µ0     P[Z < z]
                                                               µ > µ0     P[Z > z]
mean µ     σ unknown    t = (x̄ − µ0)/(s/√n), ν = n − 1         µ ≠ µ0     2 × P[T(ν) > |t|]
                                                               µ < µ0     P[T(ν) < t]
                                                               µ > µ0     P[T(ν) > t]

A manufacturer of sports equipment has developed a new synthetic fishing line that the company
claims has a mean breaking strength of 8 kilograms with a standard deviation of 0.5 kilogram. Test the
hypothesis that µ = 8 kilograms against the alternative that µ ≠ 8 kilograms if a random sample of 50
lines is tested and found to have a mean breaking strength of 7.8 kilograms. Use a 0.01 level of
significance.
A H0 : µ = 8, H1 : µ ≠ 8
B α = 0.01
C $Z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
D $z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \dfrac{7.8 - 8}{0.5 / \sqrt{50}} = -2.8284$
  $p = 2 \times P[Z > 2.8284] \approx 0.0047$
E Reject H0 since p < 0.01.
F Based on a random sample of size 50, there is sufficient evidence that the mean breaking strength
of a new synthetic fishing line is different from 8 kg with p = 0.0047.
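
Steps C through E of the fishing-line test can be sketched as a small function. This is a minimal illustration, assuming Python's standard-library `statistics.NormalDist`; the function name `z_test` and its `alternative` argument are hypothetical names introduced here, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p-value) for a test on a mean with sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":      # H1: mu != mu0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "less":         # H1: mu < mu0
        p = std_normal.cdf(z)
    elif alternative == "greater":      # H1: mu > mu0
        p = 1 - std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z, p

alpha = 0.01
z, p = z_test(xbar=7.8, mu0=8.0, sigma=0.5, n=50)
reject = p < alpha                      # decision rule: reject H0 when p < alpha
print(round(z, 4), round(p, 4), reject)
```

The three branches mirror the three rows of the σ-known case in the p-value table for the population mean.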
A supplier claims that its products made from a graphite-epoxy composite material have a
tensile strength of 40. When the tensile strengths of 30 randomly selected products are
measured, a sample mean of x̄ = 39.018 and a sample standard deviation of s = 2.299 are
obtained. Test the hypothesis against H1 : µ < 40 at the 5% level of significance.
A H0 : µ ≥ 40, H1 : µ < 40
B α = 0.05
C $T = \dfrac{\bar{X} - \mu_0}{S / \sqrt{n}}$
D $t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}} = \dfrac{39.018 - 40}{2.299 / \sqrt{30}} = -2.33955$
  $p = P[T(29) < -2.33955] = 0.0132$

F Conclusion: ?
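
The t statistic and its lower-tail p-value in step D can be checked without statistical tables. Python's standard library has no t distribution, so the sketch below integrates the t density numerically with Simpson's rule; `t_pdf` and `t_cdf` are hypothetical helper names, and a statistics package would normally be used instead.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P[T(df) <= x], via Simpson's rule on [0, |x|] plus symmetry about 0."""
    upper = abs(x)
    h = upper / steps                     # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

xbar, mu0, s, n = 39.018, 40.0, 2.299, 30
t = (xbar - mu0) / (s / sqrt(n))
p = t_cdf(t, n - 1)                       # lower-tail p-value for H1: mu < mu0
print(round(t, 5), round(p, 4))
```

The decision in step E then follows from comparing `p` with α = 0.05 using the decision rule.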
The sodium content of twenty 300-gram boxes of organic cornflakes was determined. The data
(in milligrams) are as follows: 131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72, 128.33,
128.24, 129.65, 130.14, 129.29, 128.71, 129.00, 129.39, 130.42, 129.53, 130.12, 129.78,
130.92. Can you support a claim that mean sodium content of this brand of cornflakes differs
from 130 milligrams? Use α = 0.05.
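
A starting point for this exercise is to compute the sample summaries and the test statistic from the raw data. The sketch below assumes Python's standard-library `statistics.mean` and `statistics.stdev` (which uses the n − 1 divisor) and stops at the t statistic, leaving the p-value and conclusion as the exercise.

```python
from math import sqrt
from statistics import mean, stdev

sodium = [
    131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72, 128.33,
    128.24, 129.65, 130.14, 129.29, 128.71, 129.00, 129.39, 130.42,
    129.53, 130.12, 129.78, 130.92,
]

n = len(sodium)
xbar = mean(sodium)       # sample mean
s = stdev(sodium)         # sample standard deviation (n - 1 divisor)
mu0 = 130.0               # hypothesized mean sodium content, in milligrams

# sigma is unknown, so the t statistic with nu = n - 1 applies
t = (xbar - mu0) / (s / sqrt(n))
print(n, round(xbar, 3), round(s, 3), round(t, 3))
```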