i. E(X) = μ = ∑ x·p(x)
ii. Var(X) = σ² = [∑ x²·p(x)] – μ² = E(X²) – [E(X)]²
iii. Var(aX + b) = a²·Var(X)
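These identities can be checked with a few lines of Python; the pmf values below are hypothetical, chosen only to illustrate the calculation:

```python
# Mean and variance of a discrete rv from its pmf (hypothetical values).
xs = [1, 2, 3, 4]
ps = [0.1, 0.2, 0.3, 0.4]

mu = sum(x * p for x, p in zip(xs, ps))        # E(X) = sum of x * p(x)
ex2 = sum(x**2 * p for x, p in zip(xs, ps))    # E(X^2)
var = ex2 - mu**2                              # Var(X) = E(X^2) - [E(X)]^2

# Check Var(aX + b) = a^2 Var(X) by transforming the support directly.
a, b = 3, 5
ys = [a * x + b for x in xs]
mu_y = sum(y * p for y, p in zip(ys, ps))
var_y = sum(y**2 * p for y, p in zip(ys, ps)) - mu_y**2
assert abs(var_y - a**2 * var) < 1e-9
```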
10. Bernoulli Trials
a. The n trials are identical.
b. The trials are independent (the outcome of any particular trial does not influence the outcome of any other trial).
c. Each trial has two possible outcomes: success or failure.
d. The probability of success, denoted by p, is the same for each trial (identical).
e. If the number of trials n is fixed in advance of the experiment, the experiment is called a binomial experiment.
11. Binomial
a. P(X ≤ x) = B(x; n, p) = ∑ b(y; n, p), summed from y = 0 to x
b. b(x; n, p) = C(n, x)·p^x·(1 – p)^(n – x), x = 0, 1, ..., n
c. E(X) = np, Var(X) = np(1 – p)
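A minimal Python sketch of the binomial pmf and cdf as defined above (parameter values here are arbitrary examples):

```python
from math import comb

def b_pmf(x, n, p):
    """Binomial pmf b(x; n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def B_cdf(x, n, p):
    """Binomial cdf B(x; n, p): sum of b(y; n, p) for y = 0, ..., x."""
    return sum(b_pmf(y, n, p) for y in range(x + 1))

n, p = 10, 0.5
mean, var = n * p, n * p * (1 - p)         # E(X) = np, Var(X) = np(1 - p)
assert abs(B_cdf(n, n, p) - 1.0) < 1e-12   # the pmf sums to 1
```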
12. Hypergeometric Distribution: If X is the number of successes in a completely random sample of size n drawn from a population consisting of M successes and
(N – M) failures, then the distribution of X is given by
a.
b. n = sample size (10 in socks example).
c. M = total number of successes in the population (34 in socks example).
d. N = total number of individuals in the population (50 in socks example).
e. We wish to obtain P(X=x) = h(x; n, M, N).
f. h(x; n, M, N) = C(M, x)·C(N – M, n – x)/C(N, n), for max(0, n – N + M) ≤ x ≤ min(n, M)
g. E(X) = n(M/N)
h. Var(X) = [(N – n)/(N – 1)]·n·(M/N)·(1 – M/N)
i. Connect to Binomial: let M/N = p
i. Notice that if we fix n and let N be sufficiently large, (N – n)/(N – 1) → 1 and Var(X) → np(1 – p), the variance of a binomial rv. This is why we can use a binomial model to approximate the hypergeometric when the population is large.
ii. (N – n)/(N – 1) is often called finite population correction factor
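The binomial approximation can be illustrated numerically. The sketch below uses the socks numbers from the notes (N = 50, M = 34, n = 10), then keeps p = M/N fixed while growing the population (the larger values are hypothetical):

```python
from math import comb

def h_pmf(x, n, M, N):
    """Hypergeometric pmf h(x; n, M, N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def b_pmf(x, n, p):
    """Binomial pmf, for comparison."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Socks example from the notes.
N, M, n = 50, 34, 10
mean = n * (M / N)
var = (N - n) / (N - 1) * n * (M / N) * (1 - M / N)

# Keep p = M/N fixed and let the population grow: the hypergeometric
# pmf approaches the binomial pmf with the same p.
p = M / N
big_N, big_M = 50000, 34000   # same p = 0.68, much larger population
diff = max(abs(h_pmf(x, n, big_M, big_N) - b_pmf(x, n, p)) for x in range(n + 1))
```

Here `diff` is tiny, which is the finite population correction factor at work: as N grows with n fixed, (N – n)/(N – 1) → 1.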
13. Geometric Model
a. A series of independent Bernoulli trials is performed until we get a success for the first time.
b. If X is the number of failures preceding the first success, P(X = n) = (1 – p)^n·p, n = 0, 1, 2, ...
14. A negative binomial rv X has two parameters, r and p; thus we denote its pmf by nb(x; r, p).
a. Geometric model is a simple model which can be used to model many things, such as, the first time we observe a head in coin tosses; the first time a girl is
born in the family; etc.
b. We can further generalize the model by letting X be the number of failures that precede the rth success. X can be any number in the set {0, 1, 2, ...}.
c. nb(x; r, p) = C(x + r – 1, r – 1)·p^r·(1 – p)^x, x = 0, 1, 2, ...
d. E(X) = r(1 – p)/p, Var(X) = r(1 – p)/p²
15. Poisson distribution
a. A very important application of the Poisson distribution arises in connection with the occurrence of events of some type over time. For instance, visits to a
particular website; some kind of pulses recorded by a counter; accidents in an industrial facility; customers going to a particular ATM machine; customers
coming to a particular store.
b. p(x; λ) = e^(–λ)·λ^x/x!, x = 0, 1, 2, ...
c. E(X) = Var(X) = λ
d. The number of events during a time interval of length t is a Poisson rv with λ=αt, if the following three assumptions are satisfied:
i. The probability of exactly one event occurring in a short time interval of length Δt is approximately α·Δt (proportional to the interval length, with a fixed rate α).
ii. The probability of more than one event (2 or more) occurring in a very short time interval is negligible.
iii. The number of events received during any time interval is independent of the number received prior to this time interval.
e. A typesetter on average makes one error in every 500 words typeset. A typical page contains 300 words. What is the probability that there will be no more than two errors in five pages? Use the Poisson approximation with λ = np = 1500·(1/500) = 3.
f. P(X ≤ 2) = e^(–3)·(1 + 3 + 3²/2!) ≈ 0.4232
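The typesetter calculation can be reproduced directly from the Poisson pmf:

```python
from math import exp, factorial

def pois_pmf(x, lam):
    """Poisson pmf p(x; lam) = e^(-lam) lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

# Five pages of 300 words = 1500 words; error rate p = 1/500, so lam = np = 3.
lam = 1500 * (1 / 500)
p_at_most_2 = sum(pois_pmf(x, lam) for x in range(3))   # P(X <= 2)
```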
16. Continuous Random Variables
a. P(X > a) = 1 - F(a)
b. P(a ≤ X ≤ b) = F(b) - F(a)
17. Uniform Distribution
a. f(x; A, B) = 1/(B – A) for A ≤ x ≤ B, and 0 otherwise.
b. Suppose a bus arrives equally likely at any time between 7:00 – 7:05 AM. What is the probability it arrives sometime between 7:00 – 7:02 AM?
c. P(0 ≤ X ≤ 2) = 2·(1/5) = 2/5, since the density is 1/5 over the 5-minute interval.
d. E(X) = (A + B)/2, Var(X) = (B – A)²/12
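The bus example can be sketched in Python, treating arrival time as minutes after 7:00 (a Uniform(0, 5) rv):

```python
# Uniform(A, B): f(x) = 1/(B - A) on [A, B], cdf is linear on the interval.
A, B = 0.0, 5.0   # minutes after 7:00 in the bus example

def unif_cdf(x, A, B):
    """P(X <= x) for a Uniform(A, B) rv."""
    x = min(max(x, A), B)
    return (x - A) / (B - A)

p_by_702 = unif_cdf(2.0, A, B)    # P(arrival in the first 2 minutes) = 2/5
mean = (A + B) / 2                # E(X) = (A + B)/2
var = (B - A) ** 2 / 12           # Var(X) = (B - A)^2 / 12
```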
18. PDF: The probability density function (pdf) of a continuous rv X is a function f(x) such that for any two numbers a and b with a ≤ b, P(a ≤ X ≤ b) = ∫ from a to b of f(x)dx.
19. Properties of the pdf
a. The graph of f(x) is often referred to as the density curve.
b. Note that 0 ≤ f(x) for all x.
c. Roughly speaking, f(x)dx can be treated as P(x ≤ X ≤ x + dx)!
d. The total area under the curve must be 1
e. Mean: E(X) = ∫ x·f(x)dx, integrated over all x
f. Variance: Var(X) = E[X²] – (E[X])²
20. The Exponential Distribution
a. f(x; λ) = λ·e^(–λx) for x ≥ 0, and 0 otherwise.
b. E(X) = 1/λ, Var(X) = 1/λ²
Bus example (memorylessness):
c. P(X ≥ s + t | X ≥ s) = P[(X ≥ s + t) ∩ (X ≥ s)]/P(X ≥ s) = P(X ≥ s + t)/P(X ≥ s) = e^(–λt) = P(X ≥ t)
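The memoryless property can be verified numerically from the exponential survival function; λ, s, and t below are arbitrary example values:

```python
from math import exp

def surv(t, lam):
    """P(X >= t) for an exponential rv: e^(-lam * t)."""
    return exp(-lam * t)

lam = 0.5
s, t = 2.0, 3.0
# P(X >= s + t | X >= s) = P(X >= s + t) / P(X >= s) = e^(-lam * t) = P(X >= t)
cond = surv(s + t, lam) / surv(s, lam)
assert abs(cond - surv(t, lam)) < 1e-12
```

Having already waited s time units tells us nothing about the remaining wait, which is why the exponential model is used for waiting times like the bus example.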
21. The Normal Distribution
a. Normal distribution is a bell-shaped, single peaked and symmetric distribution.
b. All normal models have the same shape and the same area within k standard deviations of the mean.
c. f(x; μ, σ) = (1/(σ√(2π)))·e^(–(x – μ)²/(2σ²))
d. About 68% of the scores lie within μ ± σ
e. About 95% of the scores lie within μ ± 2σ
f. About 99.7% of the scores lie within μ ± 3σ
g. Z = (X – μ) / σ
h. Binomial Approx: As we see, as n becomes larger and larger the pmf of the binomial becomes more bell-shaped and more symmetric. Thus a normal
distribution can be used to approximate the binomial when n is large.
i. The “+0.5” is called the continuity correction.
j. Za denotes the value on the z axis for which the area a under the z curve lies to the right of Za.
k. The area 1 – a lies to the left, so Za is the 100(1 – a)th percentile of the standard normal distribution.
2. Joint Distribution
a. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by p(x, y) = P(X=x, Y=y).
b. We must have p(x,y)≥0 and ∑(x)∑(y)p(x,y) =1
c. Remarks
i. In the continuous case, roughly speaking, f(x, y)dxdy can be treated as P(x ≤ X ≤ x + dx, y ≤ Y ≤ y + dy).
ii. As in the discrete case, fX(x) and fY(y) calculated from the joint distribution are automatically proper pdf’s.
iii. Marginal distributions are, in fact, the distributions of the marginal random variables when they are treated as univariate random variables.
iv. This example shows that different joint distributions may have the same marginal distributions.
v. We say two random variables X and Y are independent if and only if P(X=x, Y=y) = P(X=x) P(Y=y), for any x and y.
vi. More specifically, two random variables X and Y are said to be independent if for every pair x and y values,
1. p(x, y) = pX(x) pY(y), when X and Y are discrete;
2. f(x, y) = fX(x) fY(y), when X and Y are continuous.
vii. The random variables X1, X2, ..., Xn, are said to be independent if for every subset Xi1, Xi2, ..., Xik, of the variables (each pair, each triple, and so
on), the joint pmf or pdf of the subset is equal to the product of the marginal pmf’s or pdf’s.
d. Conditional
i. Using the marginal distributions, one can calculate the conditional distribution of one rv given the other.
ii. Let X and Y be two continuous rv’s with joint pdf f(x, y) and marginal X pdf fX(x). Then for any x value for which fX(x) > 0, the conditional probability density function of Y given that X = x is fY|X(y|x) = f(x, y)/fX(x).
iii. If X and Y are discrete, replacing pdf’s by pmf’s in this definition gives the conditional probability mass function of Y when X = x.
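For the discrete case, marginal and conditional pmf’s can be computed directly from a joint pmf table; the table below is a hypothetical example:

```python
# Hypothetical joint pmf p(x, y) for two binary rv's X and Y.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def p_X(x):
    """Marginal pmf of X: sum p(x, y) over y."""
    return sum(p for (xx, y), p in joint.items() if xx == x)

def p_Y_given_X(y, x):
    """Conditional pmf p_{Y|X}(y | x) = p(x, y) / p_X(x), for p_X(x) > 0."""
    return joint.get((x, y), 0.0) / p_X(x)

# The conditional pmf sums to 1 over y for each fixed x.
assert abs(p_Y_given_X(0, 1) + p_Y_given_X(1, 1) - 1.0) < 1e-12
```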
3. Covariance
a. A popular measure used to characterize the dependence of two rv’s is correlation. To calculate the correlation of two rv’s, we first have to calculate their covariance.
b. The covariance between two rv’s X and Y is
c. Cov(X, Y) = E[(X – μX)(Y – μY)]
d. Cov(X, Y) = E(XY) – E(X)E(Y)
e.
f. Correlation Coefficient
i. ρX,Y = Cov(X, Y)/(σX·σY)
g. If X and Y are independent, then ρX,Y = 0 (why?). But ρX,Y = 0 does NOT imply independence.
i. ρX,Y = 1 or –1 if and only if Y = aX + b for some numbers a and b with a ≠ 0.
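Covariance and correlation can be computed for any discrete joint pmf using the shortcut formula Cov(X, Y) = E(XY) – E(X)E(Y); the joint pmf below is a hypothetical example:

```python
# Hypothetical joint pmf for two binary rv's X and Y.
joint = {(0, 0): 0.2, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.5}

def E(g):
    """E[g(X, Y)] = sum of g(x, y) * p(x, y) over the support."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x, mu_y = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: x * y) - mu_x * mu_y          # Cov = E(XY) - E(X)E(Y)
sd_x = (E(lambda x, y: x**2) - mu_x**2) ** 0.5     # sigma_X
sd_y = (E(lambda x, y: y**2) - mu_y**2) ** 0.5     # sigma_Y
rho = cov / (sd_x * sd_y)                          # correlation coefficient
assert -1.0 <= rho <= 1.0
```

Here cov > 0, so X and Y in this table are positively correlated but not perfectly so (rho is well below 1, consistent with the remark that |rho| = 1 only for exact linear relationships).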