The Normal Probability Distribution
By Ibrahim Ishag
The normal distribution is the most widely known and used of all distributions. Because it approximates many natural phenomena so well, it has become a standard of reference for many probability problems.

Probability density function: an equation used to compute probabilities of a continuous random variable. A probability density function f(x) must satisfy the following two properties:
- f(x) ≥ 0 for every value of x.
- The total area under the graph of f(x) is 1.

The probability density function of the normal distribution is
f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²))
Where: x is the variable, μ is the mean, and σ is the standard deviation.

The distribution has two parameters, μ and σ. Note that the normal distribution is actually a family of distributions, since μ and σ determine the location and shape of the distribution.

Normal distribution curve
If a continuous random variable is normally distributed (that is, has a normal probability distribution), then a relative frequency histogram of the random variable has the shape of a normal curve (bell-shaped and symmetric).
[Figure: normal distribution curve]
Characteristics of the normal distribution:
- Symmetric about its mean (μ) and bell-shaped.
- The highest point of the curve occurs at x = μ.
- Continuous for all values of X between −∞ and ∞ (−∞ < X < ∞), so that every interval of real numbers has a probability other than zero.
- The total area under the curve is 1, that is, ∫ f(x) dx = 1, where the integral runs over all x from −∞ to ∞.
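The characteristics listed above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `normal_pdf` and `area_under_curve` and the parameter values μ = 100, σ = 16 are illustrative assumptions, and the area is only approximated with a simple midpoint rule over μ ± 8σ.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_curve(mu, sigma, width=8.0, steps=20000):
    """Midpoint-rule approximation of the area over mu - width*sigma .. mu + width*sigma."""
    lo = mu - width * sigma
    hi = mu + width * sigma
    dx = (hi - lo) / steps
    return sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(steps)) * dx

mu, sigma = 100.0, 16.0  # illustrative parameters (assumed for this sketch)

# Symmetry: the density is the same at equal distances either side of the mean.
symmetric = math.isclose(normal_pdf(mu - 5, mu, sigma), normal_pdf(mu + 5, mu, sigma))

# Peak: the density at the mean exceeds the density just to either side.
peak_at_mean = (normal_pdf(mu, mu, sigma) > normal_pdf(mu + 1, mu, sigma)
                and normal_pdf(mu, mu, sigma) > normal_pdf(mu - 1, mu, sigma))

# Total area: should be very close to 1.
total_area = area_under_curve(mu, sigma)

print(symmetric, peak_at_mean, round(total_area, 6))
```

Changing `mu` shifts the curve and changing `sigma` widens or narrows it, but the three checks should still hold for any σ > 0.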
The notation N(μ, σ²) means normally distributed with mean μ and variance σ². If we say X ∼ N(μ, σ²), we mean that X is distributed N(μ, σ²).
- About 2/3 of all cases fall within one standard deviation of the mean, that is, P(μ − σ ≤ X ≤ μ + σ) = 0.6826.
- About 95% of cases lie within 2 standard deviations of the mean, that is, P(μ − 2σ ≤ X ≤ μ + 2σ) = 0.9544.
- About 99.7% of cases lie within 3 standard deviations of the mean, that is, P(μ − 3σ ≤ X ≤ μ + 3σ) = 0.9974.

Why is the normal distribution useful?
- Many things actually are normally distributed, or very close to it. For example, height and intelligence are approximately normally distributed.
- The normal distribution is easy to work with mathematically.
- There is a very strong connection between the size of a sample N and the extent to which a sampling distribution approaches the normal form.

The standardized normal distribution:
As you might suspect from the formula for the normal density function, it would be difficult and tedious to do the calculus every time we had a new set of values for μ and σ. So instead, we usually work with the standardized normal distribution, where μ = 0 and σ = 1, i.e. N(0, 1).
That is, rather than directly solving a problem involving a normally distributed variable X with mean μ and standard deviation σ, an indirect approach is used: X is first converted to the standard normal variable Z = (X − μ)/σ, and the probability is then read from the standard normal table.

Properties of the standard normal distribution curve:
- Symmetric about its mean, μ = 0.
- The highest point of the curve occurs at z = 0.
- The area under the curve is 1.

The Empirical Rule:
- About 68% of the area under the graph is between −1 and 1.
- About 95% of the area under the graph is between −2 and 2.
- About 99.7% of the area under the graph is between −3 and 3.

Finding the Area Under the Standard Normal Curve
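Areas under the standard normal curve, including those quoted in the Empirical Rule, can also be computed directly instead of being read from a printed table. The sketch below assumes Python 3.8 or later and uses `statistics.NormalDist` from the standard library; the helper names `area_left`, `area_right` and `area_between` are made up for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution N(0, 1).
Z = NormalDist(mu=0.0, sigma=1.0)

def area_left(z):
    """Area under the standard normal curve to the left of z, P(Z < z)."""
    return Z.cdf(z)

def area_right(z):
    """Area to the right of z: 1 minus the area to the left."""
    return 1.0 - Z.cdf(z)

def area_between(a, b):
    """Area between a and b (a < b): P(Z < b) - P(Z < a)."""
    return Z.cdf(b) - Z.cdf(a)

# Empirical Rule: area within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(area_between(-k, k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

These values differ in the last digit from figures built out of a four-decimal table (for example 0.6826 from 2 × 0.8413 − 1), because each table entry is already rounded before the subtraction.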
Example: Find the area under the standard normal curve to the left of Z = −0.38.
Solution: Look in normal distribution table (B): P(Z < −0.38) = 0.3520.

Example: Find the area under the standard normal curve to the right of Z = 1.25.
Solution: Look in normal distribution table (A): P(Z ≥ 1.25) = 1 − P(Z < 1.25) = 1 − 0.8944 = 0.1056.
In general: area under the normal curve to the right of z₀ = 1 − area to the left of z₀.

Example: Find the area under the standard normal curve between Z = −1.02 and Z = 2.94.
Solution: P(−1.02 < Z < 2.94) = P(Z < 2.94) − P(Z < −1.02) = 0.9984 − 0.1539 = 0.8445.

[Normal distribution table (A)]
[Normal distribution table (B)]

Finding a Z-score
From a specified area to the left:
Example: Find the Z-score such that the area to the left of the Z-score is 0.68.
Solution: Find z such that P(Z < z) = 0.68. The closest table entry is 0.6808, so z ≈ 0.47.

From a specified area to the right:
Example: Find the Z-score such that the area to the right of the Z-score is 0.3021.
Solution: P(Z > z) = 0.3021, so P(Z < z) = 1 − 0.3021 = 0.6979, and z = 0.52.

Tutorials:
1. Find each of the following probabilities:
   (a) P(Z < −0.23)
   (b) P(Z > 1.93)
   (c) P(0.65 < Z < 2.10)
2. The weights of pennies minted after 1982 are approximately normally distributed with mean 2.46 grams and standard deviation 0.02 grams. Find the area under the normal curve (probability) between 2.44 and 2.49 grams.
3. The random variable X is normally distributed with mean 800 and standard deviation 40. Find the area between X1 = 778 and X2 = 834.
Solution: Here X ~ N(800, 1600). We convert X to Z (the standard normal distribution) using Z = (X − μ)/σ:
P(X1 ≤ X ≤ X2) = P(Z1 ≤ Z ≤ Z2)
Z1 = (X1 − μ)/σ = (778 − 800)/40 = −22/40 = −0.55
Z2 = (X2 − μ)/σ = (834 − 800)/40 = 34/40 = 0.85
Using the standard normal distribution tables:
P(−0.55 ≤ Z ≤ 0.85) = 0.8023 − 0.2912 = 0.5111
4. Suppose that the random variable X is normally distributed with μ = 100 and σ = 16. Find the value of X such that the area under the curve to its left is 0.04.
Solution: First, search the standard normal distribution tables for the z value whose area to the left is 0.04: Z = −1.75. Second, apply the formula Z = (x − μ)/σ:
−1.75 = (x − 100)/16, so x = 100 − 1.75 × 16 = 100 − 28 = 72.

The notation z_α denotes the z value such that the area under the standard normal curve to its right is α. The figure illustrates the notation.

Estimation of a population mean
The most fundamental point and interval estimation process involves the estimation of a population mean. Suppose it is of interest to estimate the population mean, μ, for a quantitative variable. Data collected from a simple random sample can be used to compute the sample mean, x̄, where the value of x̄ provides a point estimate of μ.

When the sample mean is used as a point estimate of the population mean, some error can be expected. The absolute value of the difference between the sample mean, x̄, and the population mean, μ, written |x̄ − μ|, is called the sampling error.

Statisticians have shown that the mean of the sampling distribution of x̄ is equal to the population mean, μ, and that its standard deviation, called the standard error, is given by σ/√n, where σ is the population standard deviation. For large sample sizes, the central limit theorem indicates that the sampling distribution of x̄ can be approximated by a normal probability distribution. As a matter of practice, statisticians usually consider samples of size 30 or more to be large.

In the large-sample case, a 95% confidence interval estimate for the population mean is given by x̄ ± 1.96σ/√n. When the population standard deviation, σ, is unknown, the sample standard deviation is used to estimate σ in the confidence interval formula. The quantity 1.96σ/√n is often called the margin of error for the estimate. By changing the constant from 1.96 to 1.645, a 90% confidence interval can be obtained. It should be noted from the formula for an interval estimate that a 90% confidence interval is narrower than a 95% confidence interval. In practice, a 95% confidence interval is the most widely used.

Confidence Interval    Z
80%                    1.282
85%                    1.440
90%                    1.645
95%                    1.960
99%                    2.576
99.5%                  2.807
99.9%                  3.291

Example: We measure the heights of 40 randomly chosen men and get a mean height of 175 cm. We also know that the standard deviation of men's heights is 20 cm. Find the 95% confidence interval for the true mean.
Solution: Use the Z value of 1.96 in the following formula for the confidence interval:
x̄ ± Z·s/√n
Where: x̄ is the mean, Z is the chosen Z-value from the table above, s is the standard deviation, and n is the number of observations.
And we have: 175 ± 1.960 × 20/√40, which is 175 cm ± 6.20 cm.
In other words: from 168.8 cm to 181.2 cm.

Example of Apple Orchard: There are hundreds of apples on the trees, so you randomly choose just 46 apples and get a mean of 86 and a standard deviation of 6.2. Find the 95% confidence interval for the mean of all the apples.
Solution: So let's calculate x̄ ± Z·s/√n. We know:
x̄ is the mean = 86
Z is the Z-value = 1.960 (from the table above for 95%)
s is the standard deviation = 6.2
n is the number of observations = 46
86 ± 1.960 × 6.2/√46 = 86 ± 1.79
So the true mean (of all the hundreds of apples) is likely to be between 84.21 and 87.79.

Example: A random sample of 30 students at a Florida college has the following grade point averages: 59.1, 65.0, 75.1, 79.2, 95.0, 99.8, 89.1, 65.2, 41.9, 55.2, 94.8, 84.1, 83.2, 74.0, 75.1, 76.2, 79.1, 80.1, 92.1, 74.2, 59.2, 64.0, 75.1, 78.2, 95.0, 97.8, 89.1, 64.2, 41.8, 57.2. What is the 90% confidence interval for the population mean?
Solution: Here n = 30, so the sample counts as large and the interval is x̄ ± 1.645·s/√n, with x̄ and s computed from the data. The procedure just described for developing interval estimates of a population mean is based on the use of a large sample. In the small-sample case, i.e., where the sample size n is less than 30, the t distribution is used when specifying the margin of error and constructing a confidence interval estimate.
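The large-sample interval x̄ ± Z·s/√n used in the examples above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a prescribed method: the function name `z_confidence_interval` is made up here, the Z value is obtained from `statistics.NormalDist().inv_cdf` (Python 3.8 or later) instead of the printed table, and the sample standard deviation stands in for σ in the grade point example.

```python
import math
from statistics import NormalDist, mean, stdev

def z_confidence_interval(xbar, s, n, confidence=0.95):
    """Large-sample interval xbar ± z * s / sqrt(n); returns (low, high, margin)."""
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * s / math.sqrt(n)
    return xbar - margin, xbar + margin, margin

# Heights example: n = 40, mean 175 cm, standard deviation 20 cm, 95% level.
h_lo, h_hi, h_margin = z_confidence_interval(175, 20, 40, 0.95)
print(f"heights: 175 ± {h_margin:.2f} -> ({h_lo:.2f}, {h_hi:.2f})")

# Apple orchard example: n = 46, mean 86, standard deviation 6.2, 95% level.
a_lo, a_hi, a_margin = z_confidence_interval(86, 6.2, 46, 0.95)
print(f"apples:  86 ± {a_margin:.2f} -> ({a_lo:.2f}, {a_hi:.2f})")

# Grade point example: 30 observations from the text, 90% level.
gpa = [59.1, 65.0, 75.1, 79.2, 95.0, 99.8, 89.1, 65.2, 41.9, 55.2,
       94.8, 84.1, 83.2, 74.0, 75.1, 76.2, 79.1, 80.1, 92.1, 74.2,
       59.2, 64.0, 75.1, 78.2, 95.0, 97.8, 89.1, 64.2, 41.8, 57.2]
g_mean = mean(gpa)
g_sd = stdev(gpa)  # sample standard deviation (divides by n - 1)
g_lo, g_hi, g_margin = z_confidence_interval(g_mean, g_sd, len(gpa), 0.90)
print(f"gpa:     {g_mean:.2f} ± {g_margin:.2f} -> ({g_lo:.2f}, {g_hi:.2f})")
```

The first two results should agree with the margins of 6.20 and 1.79 worked out by hand. For a sample smaller than 30, the Z value would be replaced by a value from the t distribution, which this sketch does not cover.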