THE ULTIMATE STATS STUDY GUIDE

AP Statistics Tutorial: Variables.

Univariate vs. Bivariate Data
Statistical data is often classified according to the number of variables being studied.

Univariate data. When we conduct a study that looks at only one variable, we say that we are working with univariate data. Suppose, for example, that we conducted a survey to estimate the average weight of high school students. Since we are only working with one variable (weight), we would be working with univariate data.

Bivariate data. When we conduct a study that examines the relationship between two variables, we are working with bivariate data. Suppose we conducted a study to see if there were a relationship between the height and weight of high school students. Since we are working with two variables (height and weight), we would be working with bivariate data.

AP Statistics: Measures of Central Tendency
Statisticians use summary measures to describe patterns of data. Measures of central tendency refer to the summary measures used to describe the most "typical" value in a set of values.

The Mean and the Median
The two most common measures of central tendency are the median and the mean, which can be illustrated with an example. Suppose we draw a sample of five women and measure their weights. They weigh 100 pounds, 100 pounds, 130 pounds, 140 pounds, and 150 pounds.

To find the median, we arrange the observations in order from smallest to largest value. If there is an odd number of observations, the median is the middle value. If there is an even number of observations, the median is the average of the two middle values. Thus, in the sample of five women, the median value would be 130 pounds, since 130 pounds is the middle weight.

The mean of a sample or a population is computed by adding all of the observations and dividing by the number of observations. Returning to the example of the five women, the mean weight would equal (100 + 100 + 130 + 140 + 150) / 5 = 620 / 5 = 124 pounds. In the general case, the mean can be calculated using one of the following equations:

Population mean = μ = ΣX / N
Sample mean = x = Σx / n

where ΣX is the sum of all the population observations, N is the number of population observations, Σx is the sum of all the sample observations, and n is the number of sample observations. When statisticians talk about the mean of a population, they use the Greek letter μ to refer to the mean score. When they talk about the mean of a sample, statisticians use the symbol x to refer to the mean score.

The Mean vs. the Median
As measures of central tendency, the mean and the median each have advantages and disadvantages. Some pros and cons of each measure are summarized below.

The median may be a better indicator of the most typical value if a set of scores has an outlier. An outlier is an extreme value that differs greatly from other values. However, when the sample size is large and does not include outliers, the mean score usually provides a better measure of central tendency.

To illustrate these points, consider the following example. Suppose we examine a sample of 10 households to estimate the typical family income. Nine of the households have incomes between $20,000 and $100,000; but the tenth household has an annual income of $1,000,000,000. That tenth household is an outlier. If we choose a measure to estimate the income of a typical household, the mean will greatly over-estimate family income (because of the outlier), while the median will not.

Effect of Changing Units
Sometimes, researchers change units (minutes to hours, feet to meters, etc.). Here is how measures of central tendency are affected when we change units.

If you add a constant to every value, the mean and median increase by the same constant. For example, suppose you have a set of scores with a mean equal to 5 and a median equal to 6. If you add 10 to every score, the new mean will be 5 + 10 = 15; and the new median will be 6 + 10 = 16. Suppose you multiply every value by a constant. Then, the mean and the median will also be multiplied by that constant. For example, assume that a set of scores has a mean of 5 and a median of 6. If you multiply each of these scores by 10, the new mean will be 5 * 10 = 50; and the new median will be 6 * 10 = 60.
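These rules are easy to verify with a short script. Below is a minimal Python sketch (using only the built-in statistics module) that reproduces the five-weight example and checks the effect of adding and multiplying by a constant; the variable names are ours, chosen for illustration.

import statistics

weights = [100, 100, 130, 140, 150]    # sample of five women (pounds)

mean_wt = statistics.mean(weights)      # (100+100+130+140+150)/5 = 124
median_wt = statistics.median(weights)  # middle value of the sorted list = 130
print(mean_wt, median_wt)               # 124 130

# Adding a constant shifts both the mean and the median by that constant.
shifted = [w + 10 for w in weights]
print(statistics.mean(shifted), statistics.median(shifted))   # 134 140

# Multiplying by a constant scales both the mean and the median by that constant.
scaled = [w * 10 for w in weights]
print(statistics.mean(scaled), statistics.median(scaled))     # 1240 1300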

AP Statistics Tutorial: Measures of Variability
Statisticians use summary measures to describe the amount of variability or spread in a set of data. The most common measures of variability are the range, the interquartile range (IQR), variance, and standard deviation.

The Range
The range is the difference between the largest and smallest values in a set of values. For example, consider the following numbers: 1, 3, 4, 5, 5, 6, 7, 11. For this set of numbers, the range would be 11 - 1 or 10.

The Interquartile Range (IQR)
The interquartile range (IQR) is the difference between the largest and smallest values in the middle 50% of a set of data. To compute an interquartile range from a set of data, first remove observations from the lower quartile. Then, remove observations from the upper quartile. Then, from the remaining observations, compute the difference between the largest and smallest values. For example, consider the following numbers: 1, 3, 4, 5, 5, 6, 7, 11. After we remove observations from the lower and upper quartiles, we are left with: 4, 5, 5, 6. The interquartile range (IQR) would be 6 - 4 = 2.
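The range and IQR calculations above can be scripted directly. The sketch below follows the "remove the lower and upper quartiles" recipe described in this section; note that software packages use slightly different quartile conventions, so treat this as an illustration of the idea rather than a universal definition.

data = sorted([1, 3, 4, 5, 5, 6, 7, 11])

# Range: largest value minus smallest value.
data_range = data[-1] - data[0]           # 11 - 1 = 10

# IQR: drop the lowest 25% and highest 25%, then take the range of what is left.
n = len(data)
cut = n // 4                              # observations to drop at each end
middle = data[cut : n - cut]              # [4, 5, 5, 6]
iqr = middle[-1] - middle[0]              # 6 - 4 = 2

print(data_range, iqr)                    # 10 2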

The Variance
In a population, variance is the average squared deviation from the population mean, as defined by the following formula:

σ2 = Σ ( Xi - μ )2 / N

where σ2 is the population variance, μ is the population mean, Xi is the ith element from the population, and N is the number of elements in the population.

The variance of a sample is defined by a slightly different formula, and uses slightly different notation:

s2 = Σ ( xi - x )2 / ( n - 1 )

where s2 is the sample variance, x is the sample mean, xi is the ith element from the sample, and n is the number of elements in the sample. Using this formula, the sample variance can be considered an unbiased estimate of the true population variance. Therefore, if you need to estimate an unknown population variance, based on data from a sample, this is the formula to use.

The Standard Deviation
The standard deviation is the square root of the variance. Thus, the standard deviation of a population is:

σ = sqrt [ σ2 ] = sqrt [ Σ ( Xi - μ )2 / N ]

where σ is the population standard deviation, σ2 is the population variance, μ is the population mean, Xi is the ith element from the population, and N is the number of elements in the population. And the standard deviation of a sample is:

s = sqrt [ s2 ] = sqrt [ Σ ( xi - x )2 / ( n - 1 ) ]

where s is the sample standard deviation, s2 is the sample variance, x is the sample mean, xi is the ith element from the sample, and n is the number of elements in the sample.

Effect of Changing Units
Sometimes, researchers change units (minutes to hours, feet to meters, etc.). Here is how measures of variability are affected when we change units.

If you add a constant to every value, the distance between values does not change. As a result, all of the measures of variability (range, interquartile range, standard deviation, and variance) remain the same. On the other hand, suppose you multiply every value by a constant. This has the effect of multiplying the range, interquartile range (IQR), and standard deviation by that constant. It has an even greater effect on the variance. It multiplies the variance by the square of the constant.
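A minimal sketch of these formulas and rules, again using only Python's built-in statistics module and the five-weight sample from earlier; the expected values in the comments were computed by hand from the formulas above.

import statistics

weights = [100, 100, 130, 140, 150]

pop_var = statistics.pvariance(weights)    # Σ(Xi - μ)2 / N      = 424
pop_sd  = statistics.pstdev(weights)       # sqrt(424)           ≈ 20.6
samp_var = statistics.variance(weights)    # Σ(xi - x)2 / (n-1)  = 530
samp_sd  = statistics.stdev(weights)       # sqrt(530)           ≈ 23.0

# Adding a constant leaves every measure of spread unchanged.
print(statistics.stdev([w + 10 for w in weights]))   # same as samp_sd

# Multiplying by a constant c multiplies the standard deviation by c
# and the variance by c squared.
scaled = [w * 10 for w in weights]
print(statistics.stdev(scaled))      # ≈ 230 (10 times the original)
print(statistics.variance(scaled))   # 53000 (100 times the original)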

AP Statistics Tutorial: Measures of Position
Statisticians often talk about the position of a value, relative to other values in a set of observations. The most common measures of position are quartiles, percentiles, and standard scores (aka, z-scores).

Percentiles

Assume that the elements in a data set are rank ordered from the smallest to the largest. The values that divide a rank-ordered set of elements into 100 equal parts are called percentiles. An element having a percentile rank of Pi would have a greater value than i percent of all the elements in the set. Thus, the observation at the 50th percentile would be denoted P50, and it would be greater than 50 percent of the observations in the set. An observation at the 50th percentile would correspond to the median value in the set.

Quartiles
Quartiles divide a rank-ordered data set into four equal parts. The values that divide each part are called the first, second, and third quartiles; and they are denoted by Q1, Q2, and Q3, respectively. Note the relationship between quartiles and percentiles. Q1 corresponds to P25, Q2 corresponds to P50, and Q3 corresponds to P75. Q2 is the median value in the set.

Standard Scores (z-Scores)
A standard score (aka, a z-score) indicates how many standard deviations an element is from the mean. A standard score can be calculated from the following formula:

z = (X - μ) / σ

where z is the z-score, X is the value of the element, μ is the mean of the population, and σ is the standard deviation. Here is how to interpret z-scores.
     

• A z-score less than 0 represents an element less than the mean.
• A z-score greater than 0 represents an element greater than the mean.
• A z-score equal to 0 represents an element equal to the mean.
• A z-score equal to 1 represents an element that is 1 standard deviation greater than the mean; a z-score equal to 2, 2 standard deviations greater than the mean; etc.
• A z-score equal to -1 represents an element that is 1 standard deviation less than the mean; a z-score equal to -2, 2 standard deviations less than the mean; etc.
• If the number of elements in the set is large, about 68% of the elements have a z-score between -1 and 1; about 95% have a z-score between -2 and 2; and about 99% have a z-score between -3 and 3.
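Computing a z-score is a one-line application of the formula. The sketch below uses the population mean and standard deviation of the earlier five-weight sample; the helper function name is ours.

import statistics

weights = [100, 100, 130, 140, 150]
mu = statistics.mean(weights)       # population mean μ = 124
sigma = statistics.pstdev(weights)  # population standard deviation σ ≈ 20.6

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(z_score(150, mu, sigma))   # ≈ +1.26 (about 1.3 SD above the mean)
print(z_score(124, mu, sigma))   # 0.0     (exactly at the mean)
print(z_score(100, mu, sigma))   # ≈ -1.17 (about 1.2 SD below the mean)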

AP Statistics Tutorial: Patterns in Data
Graphical displays are useful for seeing patterns in data. Patterns in data are commonly described in terms of: center, spread, shape, and unusual features.

Center
Graphically, the center of a distribution is located at the median of the distribution. This is the point in a graphic display where about half of the observations are on either side. In the chart to the right, the height of each column indicates the frequency of observations. Here, the observations are centered over 4.

Spread

The spread of a distribution refers to the variability of the data. If the observations cover a wide range, the spread is larger. If the observations are clustered around a single value, the spread is smaller. Consider the figures above. In the figure on the left, data values range from 3 to 7, whereas in the figure on the right, values range from 1 to 9. The figure on the right is more variable, so it has the greater spread.

Shape
The shape of a distribution is described by the following characteristics.
• Symmetry. When it is graphed, a symmetric distribution can be divided at the center so that each half is a mirror image of the other.
• Number of peaks. Distributions can have few or many peaks. Distributions with one clear peak are called unimodal, and distributions with two clear peaks are called bimodal. When a symmetric distribution has a single peak at the center, it is referred to as bell-shaped.
• Skewness. When they are displayed graphically, some distributions have many more observations on one side of the graph than the other. Distributions with most of their observations on the left (toward lower values) are said to be skewed right, and distributions with most of their observations on the right (toward higher values) are said to be skewed left.
• Uniform. When the observations in a set of data are equally spread across the range of the distribution, the distribution is called a uniform distribution. A uniform distribution has no clear peaks.

Unusual Features
Sometimes, statisticians refer to unusual features in a set of data. The two most common unusual features are gaps and outliers.
• Gaps. Gaps refer to areas of a distribution where there are no observations. The first figure below has a gap; there are no observations in the middle of the distribution.
• Outliers. Sometimes, distributions are characterized by extreme values that differ greatly from the other observations. These extreme values are called outliers. The second figure below illustrates a distribution with an outlier. Except for one lonely observation (the outlier on the extreme right), all of the observations fall between 0 and 4. As a "rule of thumb", an extreme value is often considered to be an outlier if it is at least 1.5 interquartile ranges below the first quartile (Q1), or at least 1.5 interquartile ranges above the third quartile (Q3).

The Difference Between Bar Charts and Histograms
Here is the main difference between bar charts and histograms. With bar charts, each column represents a group defined by a categorical variable; and with histograms, each column represents a group defined by a quantitative variable. One implication of this distinction: it is always appropriate to talk about the skewness of a histogram; that is, the tendency of the observations to fall more on the low end or the high end of the X axis. With bar charts, however, the X axis does not have a low end or a high end, because the labels on the X axis are categorical - not quantitative. As a result, it is less appropriate to comment on the skewness of a bar chart.

How to Interpret a Boxplot

Parallel Boxplots
With parallel boxplots (aka, side-by-side boxplots), data from two distributions are displayed on the same chart, using the same measurement scale. The boxplot to the right summarizes results from a medical study. The treatment group received an experimental drug to relieve cold symptoms, and the control group received a placebo. The boxplot shows the number of days each group continued to report symptoms.

[Parallel boxplots: days of cold symptoms for the control group and the treatment group, on a scale from 2 to 16 days]

Neither distribution has unusual features, such as gaps or outliers. Both distributions are skewed to the right, although the skew is more prominent in the treatment group. Patient response was slightly less variable in the treatment group than in the control group. In the treatment group, cold symptoms lasted 1 to 14 days (range = 13) versus 3 to 17 days (range = 14) for the control group. The median recovery time is more telling - about 5 days for the treatment group versus about 9 days for the control group. It appears that the drug had a positive effect on patient recovery.

The back-to-back stemplot on the right shows the amount of cash (in dollars) carried by a random sample of teenage boys and girls. The boys carried more cash than the girls - a median of $42 for the boys versus $36 for the girls. Both distributions were roughly bell-shaped, although there was more variation among the boys. And finally, there were neither gaps nor outliers in either group.

Double Bar Charts
A double bar chart is similar to a regular bar chart, except that it provides two pieces of information for each category rather than just one. Often, the charts are color-coded with a different colored bar representing each piece of information.

To the right, a double bar chart shows customer satisfaction ratings for different cars, broken out by gender. The blue rows represent males; the red rows, females. Both groups prefer the Japanese cars to the American cars, with Honda receiving the highest ratings and Ford receiving the lowest ratings. Moreover, both genders agree on the rank order in which the cars are rated. As a group, the men seem to be tougher raters; they gave lower ratings to each car than the women gave.

Correlation Coefficients
Correlation coefficients measure the strength of association between two variables. The most common correlation coefficient, called the Pearson product-moment correlation coefficient, measures the strength of the linear association between variables. In this tutorial, when we speak simply of a correlation coefficient, we are referring to the Pearson product-moment correlation.

How to Calculate a Correlation Coefficient
A formula for computing a sample correlation coefficient (r) is given below.

Sample correlation coefficient. The correlation r between two variables is:

r = [ 1 / (n - 1) ] * Σ { [ (xi - x) / sx ] * [ (yi - y) / sy ] }

where n is the number of observations in the sample, Σ is the summation symbol, xi is the x value for observation i, x is the mean x value, yi is the y value for observation i, y is the mean y value, sx is the sample standard deviation of x, and sy is the sample standard deviation of y.

A formula for computing a population correlation coefficient (ρ) is given below.

Population correlation coefficient. The correlation ρ between two variables is:

ρ = [ 1 / N ] * Σ { [ (Xi - μX) / σx ] * [ (Yi - μY) / σy ] }

where N is the number of observations in the population, Σ is the summation symbol, Xi is the X value for observation i, μX is the population mean for variable X, Yi is the Y value for observation i, μY is the population mean for variable Y, σx is the standard deviation of X, and σy is the standard deviation of Y.

Fortunately, you will rarely have to compute a correlation coefficient by hand. Many software packages (e.g., Excel) and most graphing calculators have a correlation function that will do the job for you.

Note: Sometimes, it is not clear whether a software package or a graphing calculator uses a population correlation coefficient or a sample correlation coefficient. For example, a casual user might not realize that Microsoft uses a population correlation coefficient (ρ) for the Pearson() function in its Excel software.
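The sample formula translates directly into code. Below is a minimal Python sketch with no external packages; the small x/y data set is invented purely for illustration.

import math

def sample_correlation(x, y):
    """Pearson sample correlation r, computed exactly as in the formula above."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sx = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))   # sample SD of x
    sy = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))   # sample SD of y
    return sum(((xi - x_bar) / sx) * ((yi - y_bar) / sy)
               for xi, yi in zip(x, y)) / (n - 1)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]                    # hypothetical paired observations
print(sample_correlation(x, y))        # ≈ 0.85, a fairly strong positive linear association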

How to Interpret a Correlation Coefficient
The sign and the absolute value of a correlation coefficient describe the direction and the magnitude of the relationship between two variables.
• The value of a correlation coefficient ranges between -1 and 1.
• The greater the absolute value of a correlation coefficient, the stronger the linear relationship.
• The strongest linear relationship is indicated by a correlation coefficient of -1 or 1.
• The weakest linear relationship is indicated by a correlation coefficient equal to 0.
• A positive correlation means that if one variable gets bigger, the other variable tends to get bigger.
• A negative correlation means that if one variable gets bigger, the other variable tends to get smaller.

Keep in mind that the Pearson product-moment correlation coefficient only measures linear relationships. Therefore, a correlation of 0 does not mean zero relationship between two variables; rather, it means zero linear relationship. (It is possible for two variables to have zero linear relationship and a strong curvilinear relationship at the same time.)

Several points are evident from the scatterplots.
• When the slope of the line in the plot is negative, the correlation is negative; and vice versa.
• The strongest correlations (r = 1.0 and r = -1.0) occur when data points fall exactly on a straight line.
• The correlation becomes weaker as the data points become more scattered.
• If the data points fall in a random pattern, the correlation is equal to zero.
• Correlation is affected by outliers. Compare the first scatterplot with the last scatterplot. The single outlier in the last plot greatly reduces the correlation (from 1.00 to 0.71).

AP Statistics Tutorial: Least Squares Linear Regression
In a cause and effect relationship, the independent variable is the cause, and the dependent variable is the effect. Least squares linear regression is a method for predicting the value of a dependent variable Y, based on the value of an independent variable X. In this tutorial, we focus on the case where there is only one independent variable. This is called simple regression (as opposed to multiple regression, which handles two or more independent variables).

Tip: The next lesson presents a simple regression example that shows how to apply the material covered in this lesson. Since this lesson is a little dense, you may benefit by also reading the next lesson.

Prerequisites for Regression
Simple linear regression is appropriate when the following conditions are satisfied.
• The dependent variable Y has a linear relationship to the independent variable X. To check this, make sure that the XY scatterplot is linear and that the residual plot shows a random pattern.
• For each value of X, the probability distribution of Y has the same standard deviation σ. When this condition is satisfied, the variability of the residuals will be relatively constant across all values of X, which is easily checked in a residual plot.
• The Y values are independent, as indicated by a random pattern on the residual plot.
• The Y values are roughly normally distributed (i.e., symmetric and unimodal). A little skewness is ok if the sample size is large. A histogram or a dotplot will show the shape of the distribution.

The Least Squares Regression Line

Linear regression finds the straight line, called the least squares regression line or LSRL, that best represents observations in a bivariate data set. Suppose Y is a dependent variable, and X is an independent variable. The population regression line is:

Y = Β0 + Β1X

where Β0 is a constant, Β1 is the regression coefficient, X is the value of the independent variable, and Y is the value of the dependent variable. Given a random sample of observations, the population regression line is estimated by:

ŷ = b0 + b1x

where b0 is a constant, b1 is the regression coefficient, x is the value of the independent variable, and ŷ is the predicted value of the dependent variable.

How to Define a Regression Line
Normally, you will use a computational tool - a software package (e.g., Excel) or a graphing calculator - to find b0 and b1. You enter the X and Y values into your program or calculator, and the tool solves for each parameter. In the unlikely event that you find yourself on a desert island without a computer or a graphing calculator, you can solve for b0 and b1 "by hand". Here are the equations.

b1 = Σ [ (xi - x)(yi - y) ] / Σ [ (xi - x)2 ]
b1 = r * (sy / sx)
b0 = y - b1 * x

where b0 is the constant in the regression equation, b1 is the regression coefficient, r is the correlation between x and y, xi is the X value of observation i, yi is the Y value of observation i, x is the mean of X, y is the mean of Y, sx is the standard deviation of X, and sy is the standard deviation of Y.

Properties of the Regression Line
When the regression parameters (b0 and b1) are defined as described above, the regression line has the following properties.
• The line minimizes the sum of squared differences between observed values (the y values) and predicted values (the ŷ values computed from the regression equation).
• The regression line passes through the mean of the X values (x) and the mean of the Y values (y).
• The regression constant (b0) is equal to the y intercept of the regression line.
• The regression coefficient (b1) is the average change in the dependent variable (Y) for a 1-unit change in the independent variable (X). It is the slope of the regression line.
The least squares regression line is the only straight line that has all of these properties.
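The "by hand" equations are straightforward to implement. The sketch below reuses the small made-up data set from the correlation sketch above and verifies that the fitted line passes through (x, y); the function name is ours.

def least_squares(x, y):
    """Return (b0, b1) for the least squares line ŷ = b0 + b1x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # b1 = Σ(xi - x)(yi - y) / Σ(xi - x)2
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    # b0 = y - b1 * x  (forces the line through the point of means)
    b0 = y_bar - b1 * x_bar
    return b0, b1

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
b0, b1 = least_squares(x, y)
print(b0, b1)          # b0 = 1.8, b1 = 0.8 (up to floating point), so ŷ = 1.8 + 0.8x
print(b0 + b1 * 3)     # ≈ 4.2, the mean of y: the line passes through (x, y)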

The Coefficient of Determination
The coefficient of determination (denoted by R2) is a key output of regression analysis. It is interpreted as the proportion of the variance in the dependent variable that is predictable from the independent variable. The formula for computing the coefficient of determination for a linear regression model with one independent variable is given below.

Coefficient of determination. The coefficient of determination (R2) for a linear regression model with one independent variable is:

R2 = { ( 1 / N ) * Σ [ (xi - x) * (yi - y) ] / (σx * σy) }2

where N is the number of observations used to fit the model, Σ is the summation symbol, xi is the x value for observation i, x is the mean x value, yi is the y value for observation i, y is the mean y value, σx is the standard deviation of x, and σy is the standard deviation of y.

The coefficient of determination ranges from 0 to 1.
• An R2 of 0 means that the dependent variable cannot be predicted from the independent variable.
• An R2 of 1 means the dependent variable can be predicted without error from the independent variable.
• An R2 between 0 and 1 indicates the extent to which the dependent variable is predictable. An R2 of 0.10 means that 10 percent of the variance in Y is predictable from X; an R2 of 0.20 means that 20 percent is predictable; and so on.

Standard Error
The standard error about the regression line (often denoted by SE) is a measure of the average amount that the regression equation over- or under-predicts. The higher the coefficient of determination, the lower the standard error; and the more accurate predictions are likely to be.

Warning: When you use a regression equation, do not use values for the independent variable that are outside the range of values used to create the equation. That is called extrapolation, and it can produce unreasonable estimates.

AP Statistics: Residuals, Outliers, and Influential Points
A linear regression model is not always appropriate for the data. You can assess the appropriateness of the model by examining residuals, outliers, and influential points.

Residuals
The difference between the observed value of the dependent variable (y) and the predicted value (ŷ) is called the residual (e). Each data point has one residual.

Residual = Observed value - Predicted value
e = y - ŷ

Both the sum and the mean of the residuals are equal to zero. That is, Σ e = 0 and e = 0.
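Continuing with the same made-up data, the sketch below computes each residual, checks that the residuals sum to zero, and computes R2 as the squared correlation between x and y, which is equivalent to the formula above for simple regression with one predictor.

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

# Fitted line from the previous sketch: ŷ = 1.8 + 0.8x
predicted = [1.8 + 0.8 * xi for xi in x]
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]   # e = y - ŷ

print([round(e, 2) for e in residuals])   # [-0.6, 0.6, 0.8, -1.0, 0.2]
print(round(sum(residuals), 10))          # 0.0 (up to floating point): residuals sum to zero

# R2 for simple regression is the squared correlation between x and y.
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
r_squared = sxy ** 2 / (sxx * syy)
print(r_squared)   # ≈ 0.73: about 73% of the variance in y is predictable from x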

Residual Plots
A residual plot is a graph that shows the residuals on the vertical axis and the independent variable on the horizontal axis. If the points in a residual plot are randomly dispersed around the horizontal axis, a linear regression model is appropriate for the data; otherwise, a non-linear model is more appropriate.

Below, the table on the left summarizes regression results from the example presented in a previous lesson, and the chart on the right displays those results as a residual plot. The residual plot shows a non-random pattern - negative residuals on the low end of the X axis and positive residuals on the high end. This indicates that a non-linear model will provide a much better fit to the data. Or it may be possible to "transform" the data to allow us to use a linear model. We discuss linear transformations in the next lesson.

Outliers
Data points that diverge from the overall pattern and have large residuals are called outliers. Outliers limit the fit of the regression equation to the data. This is illustrated in the scatterplots below. The coefficient of determination is bigger when the outlier is not present.

Influential Points
Influential points are data points with extreme values that greatly affect the slope of the regression line. The charts below compare regression statistics for a data set with and without an influential point. The chart on the right has a single influential point, located at the high end of the X axis (where x = 24). As a result of that single influential point, the slope of the regression line increases dramatically, from -2.5 to -1.6. Note that this influential point, unlike the outliers discussed above, did not reduce the coefficient of determination. In fact, the coefficient of determination was bigger when the influential point was present.

AP Statistics: Transformations to Achieve Linearity
When a residual plot reveals a data set to be nonlinear, it is often possible to "transform" the raw data to make it linear. This allows us to use linear regression techniques appropriately with nonlinear data.

What is a Transformation to Achieve Linearity?
Transforming a variable involves using a mathematical operation to change its measurement scale. Broadly speaking, there are two kinds of transformations.
• Linear transformation. A linear transformation preserves linear relationships between variables. Therefore, the correlation between x and y would be unchanged after a linear transformation. Examples of a linear transformation to variable x would be multiplying x by a constant, dividing x by a constant, or adding a constant to x.
• Nonlinear transformation. A nonlinear transformation changes (increases or decreases) linear relationships between variables and, thus, changes the correlation between variables. Examples of a nonlinear transformation of variable x would be taking the square root of x or the reciprocal of x.
In regression, a transformation to achieve linearity is a special kind of nonlinear transformation. It is a nonlinear transformation that increases the linear relationship between two variables.

Methods of Transforming Variables to Achieve Linearity

There are many ways to transform variables to achieve linearity for regression analysis. Some common methods are summarized below.

Method                       Transformation(s)                Regression equation       Predicted value (ŷ)
Standard linear regression   None                             y = b0 + b1x              ŷ = b0 + b1x
Exponential model            Dependent variable = log(y)      log(y) = b0 + b1x         ŷ = 10^(b0 + b1x)
Quadratic model              Dependent variable = sqrt(y)     sqrt(y) = b0 + b1x        ŷ = ( b0 + b1x )^2
Reciprocal model             Dependent variable = 1/y         1/y = b0 + b1x            ŷ = 1 / ( b0 + b1x )
Logarithmic model            Independent variable = log(x)    y = b0 + b1log(x)         ŷ = b0 + b1log(x)
Power model                  Dependent variable = log(y),     log(y) = b0 + b1log(x)    ŷ = 10^(b0 + b1log(x))
                             Independent variable = log(x)

Each row shows a different nonlinear transformation method. The second column shows the specific transformation applied to dependent and/or independent variables. The third column shows the regression equation used in the analysis. And the last column shows the "back transformation" equation used to restore the dependent variable to its original, non-transformed measurement scale.

In practice, these methods need to be tested on the data to which they are applied to be sure that they increase rather than decrease the linearity of the relationship. Testing the effect of a transformation method involves looking at residual plots and correlation coefficients, as described in the following sections.

Note: The logarithmic model and the power model require the ability to work with logarithms. Use a graphing calculator to obtain the log of a number or to transform back from the logarithm to the original number. If you need it, the Stat Trek glossary has a brief refresher on logarithms.

How to Perform a Transformation to Achieve Linearity
Transforming a data set to achieve linearity is a multi-step, trial-and-error process.
• Choose a transformation method (see above table).
• Transform the independent variable, dependent variable, or both.
• Plot the independent variable against the dependent variable, using the transformed data.
  • If the scatterplot is linear, proceed to the next step.
  • If the plot is not linear, return to Step 1 and try a different approach. Choose a different transformation method and/or transform a different variable.
• Conduct a regression analysis, using the transformed variables.
• Create a residual plot, based on regression results.
  • If the residual plot shows a random pattern, the transformation was successful. Congratulations!
  • If the plot pattern is non-random, return to Step 1 and try a different approach.
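The trial-and-error loop can be scripted. The sketch below applies the quadratic-model transformation (regress sqrt(y) on x) to the small data set used in the worked example that follows, and compares R2 before and after the transformation; the helper function is ours, not a standard library routine.

import math

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [2, 1, 6, 14, 15, 30, 40, 74, 75]    # raw data from the example below

def r_squared(x, y):
    """Coefficient of determination for a simple linear fit of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)

raw_fit = r_squared(x, y)                                     # ≈ 0.88
transformed_fit = r_squared(x, [math.sqrt(v) for v in y])     # ≈ 0.96

print(raw_fit, transformed_fit)   # the transformed model fits noticeably better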

A Transformation Example
Below, the table on the left shows data for independent and dependent variables - x and y. When we apply a linear regression to the raw data, the residual plot shows a non-random pattern (a U-shaped curve), which suggests that the data are nonlinear.

x    1    2    3    4    5    6    7    8    9
y    2    1    6    14   15   30   40   74   75

Suppose we repeat the analysis, using a quadratic model to transform the dependent variable. For a quadratic model, we use the square root of y, rather than y, as the dependent variable. The table below shows the data we analyzed.

x          1     2     3     4     5     6     7     8     9
sqrt(y)    1.41  1.00  2.45  3.74  3.87  5.48  6.32  8.60  8.66

The residual plot (above right) suggests that the transformation to achieve linearity was successful. The pattern of residuals is random, suggesting that the relationship between the independent variable (x) and the transformed dependent variable (square root of y) is linear. And the coefficient of determination was 0.96 with the transformed data versus only 0.88 with the raw data. The transformed data resulted in a better model.

The best transformation method (exponential model, quadratic model, reciprocal model, etc.) will depend on the nature of the original data. The only way to determine which method is best is to try each and compare the results (i.e., residual plots, correlation coefficients).

AP Statistics Tutorial: One-Way Tables
When a table presents data for one, and only one, categorical variable, it is called a one-way table. Like a bar chart, a one-way table displays categorical data in the form of frequency counts and/or relative frequencies. A one-way table is the tabular equivalent of a bar chart.

Frequency Tables
When a one-way table shows frequency counts for a particular category of a categorical variable, it is called a frequency table. Below, the bar chart and the frequency table display the same data, representing travel choices of 10 travel agency clients. Both show frequency counts.

32 0.When a one-way table shows relative frequencies for particular categories of a categorical variable.04 0.16 -133- . and the table on the right shows relative frequencies as a percentage. Each of the tables below summarizes data from the bar chart above.1 0. Yet. we might conclude that the three activities had roughly equal appeal.36 0. it is called a relative frequency table. Entries in the body of the table are called joint frequencies. Entries in the "Total" row and "Total" column are called marginal frequencies or the marginal distribution.20 T V Tot al 0. AP Statistics Tutorial: Two-Way Tables A common task in statistics is to look for a relationship between two categorical variables. The table on the left shows relative frequencies as a proportion.6 0. and little interest in dance among men. Both tables are relative frequency tables. Two-Way Frequency Tables Dan Spo T Tot ce rts V al Men Wom en Total 2 16 18 10 6 16 8 8 20 30 16 50 To the right.3 1. the two-way table shows the favorite leisure activities for 50 adults . The entries in the cells of a two-way table can be frequency counts or relative frequencies (just like a one-way table).0 2 0 16 .4 6 0 Wom 0. Two-Way Relative Frequency Tables Dan Spo ce rts Men 0. the table is a frequency table. Because entries in the table are frequency counts.12 en 6 0 Total 0. If we looked only at the marginal frequencies in the Total row.20 men and 30 women.32 0.1 0. the joint frequencies show a strong preference for dance among women. Two-Way Tables A two-way table (also called a contingency table) is a useful tool for examining relationships between categorical variables.

Below, the table on the left shows relative frequencies for rows, and the table on the right shows relative frequencies for columns. The relative frequencies in the body of the table are called conditional frequencies or the conditional distribution.

Relative Frequency of Row
          Dance   Sports   TV    Total
Men        0.10    0.50   0.40    1.00
Women      0.53    0.20   0.27    1.00
Total      0.36    0.32   0.32    1.00

Relative Frequency of Column
          Dance   Sports   TV    Total
Men        0.11    0.62   0.50    0.40
Women      0.89    0.38   0.50    0.60
Total      1.00    1.00   1.00    1.00

Each type of relative frequency table makes a different contribution to understanding the relationship between gender and preferences for leisure activities. For example, the "Relative Frequency for Rows" table most clearly shows the probability that each gender will prefer a particular leisure activity. For instance, it is easy to see that the probability that a man will prefer dance is 10%; the probability that a woman will prefer dance is 53%; the probability that a man will prefer sports is 50%; and so on. It shows that women have a strong preference for dance, while men seldom make dance their first choice. Men are most likely to prefer sports, but the degree of preference for sports over TV is not great.

Such relationships are often easier to detect when they are displayed graphically in a segmented bar chart. A segmented bar chart has one bar for each level of a categorical variable. Each bar is divided into "segments", such that the length of each segment indicates the proportion or percentage of observations in a second variable. The segmented bar chart on the right uses data from the "Relative Frequency for Rows" table above.

AP Statistics Tutorial: Data Collection Methods
To derive conclusions from data, we need to know how the data were collected; that is, we need to know the method(s) of data collection.

Methods of Data Collection

There are four main methods of data collection.
• Census. A census is a study that obtains data from every member of a population. In most studies, a census is not practical, because of the cost and/or time required.
• Sample survey. A sample survey is a study that obtains data from a subset of a population, in order to estimate population attributes.
• Experiment. An experiment is a controlled study in which the researcher attempts to understand cause-and-effect relationships. The study is "controlled" in the sense that the researcher controls (1) how subjects are assigned to groups and (2) which treatments each group receives. In the analysis phase, the researcher compares group scores on some dependent variable. Based on the analysis, the researcher draws a conclusion about whether the treatment (independent variable) had a causal effect on the dependent variable.
• Observational study. Like experiments, observational studies attempt to understand cause-and-effect relationships. However, unlike experiments, the researcher is not able to control (1) how subjects are assigned to groups and/or (2) which treatments each group receives.

Data Collection Methods: Pros and Cons
Each method of data collection has advantages and disadvantages.
• Resources. When the population is large, a sample survey has a big resource advantage over a census. A well-designed sample survey can provide very precise estimates of population parameters - quicker, cheaper, and with less manpower than a census.
• Generalizability. Generalizability refers to the appropriateness of applying findings from a study to a larger population. Generalizability requires random selection. If participants in a study are randomly selected from a larger population, it is appropriate to generalize study results to the larger population; if not, it is not appropriate to generalize. Observational studies do not feature random selection, so it is not appropriate to generalize from the results of an observational study to a larger population.
• Causal inference. Cause-and-effect relationships can be teased out when subjects are randomly assigned to groups. Therefore, experiments, which allow the researcher to control assignment of subjects to treatment groups, are the best method for investigating causal relationships.

AP Statistics Tutorial: Survey Sampling Methods
Sampling method refers to the way that observations are selected from a population to be in the sample for a sample survey.

Population Parameter vs. Sample Statistic
The reason for conducting a sample survey is to estimate the value of some attribute of a population.
• Population parameter. A population parameter is the true value of a population attribute.

• Sample statistic. A sample statistic is an estimate, based on sample data, of a population parameter.

Consider this example. A public opinion pollster wants to know the percentage of voters that favor a flat-rate income tax. The actual percentage of all the voters is a population parameter. The estimate of that percentage, based on sample data, is a sample statistic.

The quality of a sample statistic (i.e., accuracy, precision, representativeness) is strongly affected by the way that sample observations are chosen; that is, by the sampling method.

Probability vs. Non-Probability Samples
As a group, sampling methods fall into one of two categories.
• Probability samples. With probability sampling methods, each population element has a known (nonzero) chance of being chosen for the sample.
• Non-probability samples. With non-probability sampling methods, we do not know the probability that each population element will be chosen, and/or we cannot be sure that each population element has a non-zero chance of being chosen.

Non-probability sampling methods offer two potential advantages - convenience and cost. The main disadvantage is that non-probability sampling methods do not allow you to estimate the extent to which sample statistics are likely to differ from population parameters. Only probability sampling methods permit that kind of analysis. The key benefit of probability sampling methods is that they guarantee that the sample chosen is representative of the population. This ensures that the statistical conclusions will be valid.

Non-Probability Sampling Methods
Two of the main types of non-probability sampling methods are voluntary samples and convenience samples.
• Voluntary sample. A voluntary sample is made up of people who self-select into the survey. Often, these folks have a strong interest in the main topic of the survey. Suppose, for example, that a news show asks viewers to participate in an on-line poll. This would be a volunteer sample. The sample is chosen by the viewers, not by the survey administrator.
• Convenience sample. A convenience sample is made up of people who are easy to reach. Consider the following example. A pollster interviews shoppers at a local mall. If the mall was chosen because it was a convenient site from which to solicit survey participants and/or because it was close to the pollster's home or business, this would be a convenience sample.

Probability Sampling Methods
The main types of probability sampling methods are simple random sampling, stratified sampling, cluster sampling, multistage sampling, and systematic random sampling.
• Simple random sampling. Simple random sampling refers to any sampling method that has the following properties.

  • The population consists of N objects.
  • The sample consists of n objects.
  • If all possible samples of n objects are equally likely to occur, the sampling method is called simple random sampling.
There are many ways to obtain a simple random sample. One way would be the lottery method. Each of the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n numbers. Population members having the selected numbers are included in the sample.
• Stratified sampling. With stratified sampling, the population is divided into groups, and within each group a probability sample (often a simple random sample) is selected. In stratified sampling, the groups are called strata. As an example, suppose we conduct a national survey. We might divide the population into groups or strata, based on geography - north, east, south, and west. Then, within each stratum, we might randomly select survey respondents.
• Cluster sampling. With cluster sampling, every member of the population is assigned to one, and only one, group. Each group is called a cluster. A sample of clusters is chosen, using a probability method (often simple random sampling). Only individuals within sampled clusters are surveyed. Note the difference between cluster sampling and stratified sampling. With stratified sampling, the sample includes elements from each stratum. With cluster sampling, in contrast, the sample includes elements only from sampled clusters.
• Multistage sampling. With multistage sampling, we select a sample by using combinations of different sampling methods. For example, in Stage 1, we might use cluster sampling to choose clusters from a population. Then, in Stage 2, we might use simple random sampling to select a subset of elements from each chosen cluster for the final sample.
• Systematic random sampling. With systematic random sampling, we create a list of every member of the population. From the list, we randomly select the first sample element from the first k elements on the population list. Thereafter, we select every kth element on the list. This method is different from simple random sampling since every possible sample of n elements is not equally likely.
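Several of these probability sampling methods are easy to sketch with Python's standard random module. The population labels and strata below are invented purely for illustration.

import random

population = list(range(1, 101))    # members labeled 1..100
random.seed(42)                     # fixed seed so the sketch is repeatable

# Simple random sampling: every sample of size n is equally likely.
srs = random.sample(population, 10)

# Systematic random sampling: random start among the first k, then every kth member.
k = 10
start = random.randrange(k)
systematic = population[start::k]

# Stratified sampling: split into strata, then a simple random sample within each stratum.
strata = {"north": population[:50], "south": population[50:]}
stratified = [member for group in strata.values()
              for member in random.sample(group, 5)]

print(srs, systematic, stratified, sep="\n")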

AP Statistics Tutorial: Bias in Survey Sampling
In survey sampling, bias refers to the tendency of a sample statistic to systematically over- or under-estimate a population parameter.

Bias Due to Unrepresentative Samples
A good sample is representative. This means that each sample point represents the attributes of a known number of population elements. Bias often occurs when the survey sample does not accurately represent the population. The bias that results from an unrepresentative sample is called selection bias. Some common examples of selection bias are described below.
• Undercoverage. Undercoverage occurs when some members of the population are inadequately represented in the sample. A classic example of undercoverage is the Literary Digest voter survey, which predicted that Alfred Landon would beat Franklin Roosevelt in the 1936 presidential election. The survey sample suffered from undercoverage of low-income voters, who tended to be Democrats. How did this happen? The survey relied on a convenience sample, drawn from telephone directories and car registration lists. In 1936, people who owned cars and telephones tended to be more affluent. Undercoverage is often a problem with convenience samples.
• Nonresponse bias. Sometimes, individuals chosen for the sample are unwilling or unable to participate in the survey. Nonresponse bias is the bias that results when respondents differ in meaningful ways from nonrespondents. The Literary Digest survey illustrates this problem. Respondents tended to be Landon supporters, and nonrespondents, Roosevelt supporters. Since only 25% of the sampled voters actually completed the mail-in survey, survey results overestimated voter support for Alfred Landon. The Literary Digest experience illustrates a common problem with mail surveys. Response rate is often low, making mail surveys vulnerable to nonresponse bias.
• Voluntary response bias. Voluntary response bias occurs when sample members are self-selected volunteers, as in voluntary samples. An example would be call-in radio shows that solicit audience participation in surveys on controversial topics (abortion, affirmative action, gun control, etc.). The resulting sample tends to overrepresent individuals who have strong opinions.

Random sampling is a procedure for sampling from a population in which (a) the selection of a sample unit is based on chance and (b) every element of the population has a known, non-zero probability of being selected. Random sampling helps produce representative samples by eliminating voluntary response bias and guarding against undercoverage bias. All probability sampling methods rely on random sampling.

Bias Due to Measurement Error
A poor measurement process can also lead to bias. In survey research, the measurement process includes the environment in which the survey is conducted, the way that questions are asked, and the state of the survey respondent. Response bias refers to the bias that results from problems in the measurement process. Some examples of response bias are given below.
• Leading questions. The wording of the question may be loaded in some way to unduly favor one response over another. For example, a satisfaction survey may ask the respondent to indicate whether she is satisfied, dissatisfied, or very dissatisfied. By giving the respondent one response option to express satisfaction and two response options to express dissatisfaction, this survey question is biased toward getting a dissatisfied response.
• Social desirability. Most people like to present themselves in a favorable light, so they will be reluctant to admit to unsavory attitudes or illegal activities in a survey, particularly if survey results are not confidential. Instead, their responses may be biased toward what they believe is socially desirable.

Sampling Error and Survey Bias
A survey produces a sample statistic, which is used to estimate a population parameter. If you repeated a survey many times, using different samples each time, you would get a different sample statistic with each replication. And each of the different sample statistics would be an estimate for the same population parameter. The variability among statistics from different samples is called sampling error. If the statistic is unbiased, the average of all the statistics from all possible samples will equal the true population parameter, even though any individual statistic may differ from the population parameter.

Increasing the sample size tends to reduce the sampling error; that is, it makes the sample statistic less variable. However, increasing sample size does not affect survey bias. A large sample size cannot correct for the methodological problems (undercoverage, nonresponse bias, etc.) that produce survey bias.

The Literary Digest example discussed above illustrates this point. The sample size was very large - over 2 million surveys were completed; but the large sample size could not overcome problems with the sample - undercoverage and nonresponse bias.

AP Statistics Tutorial: Experiments
In an experiment, a researcher manipulates one or more variables, while holding all other variables constant. By noting how the manipulated variables affect a response variable, the researcher can test whether a causal relationship exists between the manipulated variables and the response variable.

Parts of an Experiment
All experiments have independent variables, dependent variables, and experimental units.
• Independent variable. An independent variable (also called a factor) is an explanatory variable manipulated by the experimenter. Each factor has two or more levels, i.e., different values of the factor. Combinations of factor levels are called treatments. The table below shows independent variables, dependent variables, levels, and treatments for a hypothetical experiment.
• Dependent variable. In the hypothetical experiment above, the researcher is looking at the effect of vitamins on health. The dependent variable in this experiment would be some measure of health (annual doctor bills, number of colds caught in a year, number of days hospitalized, etc.).
• Experimental units. The recipients of experimental treatments are called experimental units or subjects. The experimental units in an experiment could be anything - people, plants, animals, or even inanimate objects. In the hypothetical experiment above, the experimental units would probably be people (or lab animals). But in an experiment to measure the tensile strength of string, the experimental units might be pieces of string.

Characteristics of a Well-Designed Experiment

A well-designed experiment includes design features that allow researchers to eliminate extraneous variables as an explanation for the observed relationship between the independent variable(s) and the dependent variable. Some of these features are listed below.
• Control. Control refers to steps taken to reduce the effects of extraneous variables (i.e., variables other than the independent variable and the dependent variable). These extraneous variables are called lurking variables. Control involves making the experiment as similar as possible for subjects in each treatment condition. Three control strategies are control groups, placebos, and blinding.
  • Control group. A control group is a baseline group that receives no treatment or a neutral treatment. To assess treatment effects, the experimenter compares results in the treatment group to results in the control group.
  • Placebo. Often, subjects respond differently after they receive a treatment, even if the treatment is neutral. A neutral treatment that has no "real" effect on the dependent variable is called a placebo, and a subject's positive response to a placebo is called the placebo effect. To control for the placebo effect, researchers often administer a neutral treatment (i.e., a placebo) to the control group. The classic example is using a sugar pill in drug research. The drug is effective only if subjects who receive the drug have better outcomes than subjects who receive the sugar pill. In this way, subjects in the control and treatment groups experience the placebo effect equally.
  • Blinding. Of course, if subjects in the control group know that they are receiving a placebo, the placebo effect will be reduced or eliminated, and the placebo will not serve its intended control purpose. Blinding is the practice of not telling subjects whether they are receiving a placebo. Often, knowledge of which groups receive placebos is also kept from people who administer or evaluate the experiment. This practice is called double blinding. It prevents the experimenter from "spilling the beans" to subjects through subtle cues, and it assures that the analyst's evaluation is not tainted by awareness of actual treatment conditions.
• Randomization. Randomization refers to the practice of using chance methods (random number tables, flipping a coin, etc.) to assign subjects to treatments. In this way, the potential effects of lurking variables are distributed at chance levels (hopefully roughly evenly) across treatment conditions.
• Replication. Replication refers to the practice of assigning each treatment to many experimental subjects. In general, the more subjects in each treatment condition, the lower the variability of the dependent measures.

Confounding
Confounding occurs when the experimental controls do not allow the experimenter to reasonably eliminate plausible alternative explanations for an observed relationship between independent and dependent variables.

Consider this example. A drug manufacturer tests a new cold medicine with 200 volunteer subjects - 100 men and 100 women. The men receive the drug, and the women do not. At the end of the test period, the men report fewer colds.

This experiment implements no controls at all! As a result, many variables are confounded, and it is impossible to say whether the drug was effective. For example, gender is confounded with drug use. Perhaps men are less vulnerable to the particular cold virus circulating during the experiment, and the new medicine had no effect at all. Or perhaps the men experienced a placebo effect.

This experiment could be strengthened with a few controls. Women and men could be randomly assigned to treatments. One treatment could receive a placebo, with blinding. Then, if the treatment group (i.e., the group getting the medicine) had sufficiently fewer colds than the control group, it would be reasonable to conclude that the medicine was effective in preventing colds.

AP Statistics Tutorial: Experimental Design
The term experimental design refers to a plan for assigning subjects to treatment conditions. A good experimental design serves three purposes.
• Causation. It allows the experimenter to make causal inferences about the relationship between independent variables and a dependent variable.
• Control. It allows the experimenter to rule out alternative explanations due to the confounding effects of extraneous variables (i.e., variables other than the independent variables).
• Variability. It reduces variability within treatment conditions, which makes it easier to detect differences in treatment outcomes.

An Experimental Design Example
Consider the following hypothetical experiment. Acme Medicine is conducting an experiment to test a new vaccine, developed to immunize people against the common cold. To test the vaccine, Acme has 1000 volunteer subjects - 500 men and 500 women. The subjects range in age from 21 to 70.

In this lesson, we describe three experimental designs - a completely randomized design, a randomized block design, and a matched pairs design. And we show how each design might be applied by Acme Medicine to understand the effect of the vaccine, while ruling out confounding effects of other factors.

Completely Randomized Design
The completely randomized design is probably the simplest experimental design, in terms of data analysis and convenience. With this design, subjects are randomly assigned to treatments.

     Treatment
Placebo   Vaccine
  500       500

A completely randomized design layout for the Acme experiment is shown in the table above. In this design, the experimenter randomly assigned subjects to one of two treatment conditions: they received a placebo or they received the vaccine. The same number of subjects (500) were assigned to each treatment condition (although this is not required). The dependent variable is the number of colds reported in each treatment condition. If the vaccine is effective, subjects in the "vaccine" condition should report significantly fewer colds than subjects in the "placebo" condition.

A completely randomized design relies on randomization to control for the effects of extraneous variables. The experimenter assumes that, on average, extraneous factors will affect treatment conditions equally, so any significant differences between conditions can fairly be attributed to the independent variable.

Randomized Block Design

    Gender    Treatment
              Placebo    Vaccine
    Male      250        250
    Female    250        250

With a randomized block design, the experimenter divides subjects into subgroups called blocks, such that the variability within blocks is less than the variability between blocks. Then, subjects within each block are randomly assigned to treatment conditions. Because this design reduces variability and potential confounding, it produces a better estimate of treatment effects.

The table above shows a randomized block design for the Acme experiment. Subjects are assigned to blocks, based on gender. Then, within each block, subjects are randomly assigned to treatments. For this design, 250 men get the placebo, 250 men get the vaccine, 250 women get the placebo, and 250 women get the vaccine.

It is known that men and women are physiologically different and react differently to medication. This design ensures that each treatment condition has an equal proportion of men and women. As a result, differences between treatment conditions cannot be attributed to gender. This randomized block design removes gender as a potential source of variability and as a potential confounding variable.

In this Acme example, the randomized block design is an improvement over the completely randomized design. Both designs use randomization to implicitly guard against confounding. But only the randomized block design explicitly controls for gender.

Note 1: In some blocking designs, individual subjects may receive multiple treatments. This is called using the subject as his own control. Using the subject as his own control is desirable in some experiments (e.g., research on learning or fatigue). But it can also be a problem (e.g., medical studies where the medicine used in one treatment might interact with the medicine used in another treatment).

Note 2: Blocks perform a similar function in experimental design as strata perform in sampling. Both divide observations into subgroups. However, they are not the same. Blocking is associated with experimental design, and stratification is associated with survey sampling.
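The block-then-randomize idea can be sketched in a few lines of Python (an illustration, not from the original text). The subject labels are hypothetical, and each gender block is split evenly between placebo and vaccine, mirroring the 250/250/250/250 layout above.

    import random

    def assign_within_block(block):
        """Randomly split one block evenly between the two treatments."""
        members = list(block)
        random.shuffle(members)
        half = len(members) // 2
        return {"placebo": members[:half], "vaccine": members[half:]}

    males = ["M" + str(i) for i in range(1, 501)]      # 500 hypothetical male subjects
    females = ["F" + str(i) for i in range(1, 501)]    # 500 hypothetical female subjects

    design = {"male": assign_within_block(males),
              "female": assign_within_block(females)}

    for gender, groups in design.items():
        print(gender, {t: len(members) for t, members in groups.items()})
    # male {'placebo': 250, 'vaccine': 250}
    # female {'placebo': 250, 'vaccine': 250}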

Matched Pairs Design

    Pair    Treatment
            Placebo    Vaccine
    1       1          1
    2       1          1
    ...     ...        ...
    499     1          1
    500     1          1

A matched pairs design is a special case of the randomized block design. It is used when the experiment has only two treatment conditions, and subjects can be grouped into pairs, based on some blocking variable. Then, within each pair, subjects are randomly assigned to different treatments.

The table above shows a matched pairs design for the Acme experiment. The 1000 subjects are grouped into 500 matched pairs. Each pair is matched on gender and age. For example, Pair 1 might be two women, both age 21; Pair 2 might be two women, both age 22; and so on.

For the Acme example, the matched pairs design is an improvement over the completely randomized design and the randomized block design. Like the other designs, the matched pairs design uses randomization to control for confounding. However, unlike the others, this design explicitly controls for two potential lurking variables - age and gender.

AP Statistics Tutorial: Probability
The probability of an event refers to the likelihood that the event will occur.

How to Interpret Probability
Mathematically, the probability that an event will occur is expressed as a number between 0 and 1. Notationally, the probability of event A is represented by P(A).

• If P(A) equals zero, there is no chance that the event A will occur.
• If P(A) is close to zero, there is little likelihood that event A will occur.
• If P(A) is close to one, there is a strong chance that event A will occur.
• If P(A) equals one, event A will definitely occur.

The sum of the probabilities of all possible outcomes in a statistical experiment is equal to one. This means, for example, that if an experiment can have three possible outcomes (A, B, and C), then P(A) + P(B) + P(C) = 1.

How to Compute Probability: Equally Likely Outcomes
Sometimes, a statistical experiment can have n possible outcomes, each of which is equally likely. Suppose a subset of r outcomes are classified as "successful" outcomes. The probability that the experiment results in a successful outcome (S) is:

P(S) = ( Number of successful outcomes ) / ( Total number of equally likely outcomes ) = r / n

Consider the following experiment. An urn has 10 marbles. Two marbles are red, three are green, and five are blue. If an experimenter randomly selects 1 marble from the urn, what is the probability that it will be green? In this experiment, there are 10 equally likely outcomes, three of which are green marbles. Therefore, the probability of choosing a green marble is 3/10 or 0.30.

How to Compute Probability: Law of Large Numbers
One can also think about the probability of an event in terms of its long-run relative frequency. The relative frequency of an event is the number of times an event occurs, divided by the total number of trials.

P(A) = ( Frequency of Event A ) / ( Number of Trials )

For example, a merchant notices one day that 5 out of 50 visitors to her store make a purchase. The next day, 20 out of 50 visitors make a purchase. The two relative frequencies (5/50 or 0.10 and 20/50 or 0.40) differ. However, summing results over many visitors, she might find that the probability that a visitor makes a purchase gets closer and closer to 0.20.

The scatterplot (above right) shows the relative frequency as the number of trials (in this case, the number of visitors) increases. Over many trials, the relative frequency converges toward a stable value (0.20), which can be interpreted as the probability that a visitor to the store will make a purchase.

The idea that the relative frequency of an event will converge on the probability of the event, as the number of trials increases, is called the law of large numbers.
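The merchant example is easy to simulate. The sketch below (Python, with an assumed long-run purchase probability of 0.20) prints the relative frequency of purchases for larger and larger numbers of simulated visitors.

    import random

    random.seed(1)          # for a reproducible illustration
    p_purchase = 0.20       # assumed probability that a visitor makes a purchase

    for n in (50, 500, 5000, 50000):
        purchases = sum(1 for _ in range(n) if random.random() < p_purchase)
        print(n, round(purchases / n, 3))   # relative frequency settles near 0.20

With only 50 visitors the relative frequency can wander well away from 0.20, but as the number of trials grows it stabilizes, which is exactly what the law of large numbers describes.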

AP Statistics Tutorial: Rules of Probability
Often, we want to compute the probability of an event from the known probabilities of other events. This lesson covers some important rules that simplify those computations.

Definitions and Notation
Before discussing the rules of probability, we state the following definitions:

• Two events are mutually exclusive or disjoint if they cannot occur at the same time.
• The probability that Event A occurs, given that Event B has occurred, is called a conditional probability. The conditional probability of Event A, given Event B, is denoted by the symbol P(A|B).
• The complement of an event is the event not occurring. The probability that Event A will not occur is denoted by P(A').
• The probability that Events A and B both occur is the probability of the intersection of A and B. The probability of the intersection of Events A and B is denoted by P(A ∩ B). If Events A and B are mutually exclusive, P(A ∩ B) = 0.
• The probability that Events A or B occur is the probability of the union of A and B. The probability of the union of Events A and B is denoted by P(A ∪ B).
• If the occurrence of Event A changes the probability of Event B, then Events A and B are dependent. On the other hand, if the occurrence of Event A does not change the probability of Event B, then Events A and B are independent.

Rule of Subtraction
In a previous lesson, we learned two important properties of probability:

• The probability of an event ranges from 0 to 1.
• The sum of probabilities of all possible events equals 1.

The rule of subtraction follows directly from these properties.

Rule of Subtraction. The probability that event A will occur is equal to 1 minus the probability that event A will not occur:

P(A) = 1 - P(A')

Suppose, for example, the probability that Bill will graduate from college is 0.80. What is the probability that Bill will not graduate from college? Based on the rule of subtraction, the probability that Bill will not graduate is 1.00 - 0.80 or 0.20.

Rule of Multiplication
The rule of multiplication applies to the situation when we want to know the probability of the intersection of two events; that is, we want to know the probability that two events (Event A and Event B) both occur.

Rule of Multiplication. The probability that Events A and B both occur is equal to the probability that Event A occurs times the probability that Event B occurs, given that A has occurred:

P(A ∩ B) = P(A) P(B|A)

Example
An urn contains 6 red marbles and 4 black marbles. Two marbles are drawn without replacement from the urn. What is the probability that both of the marbles are black?

Solution: Let A = the event that the first marble is black, and let B = the event that the second marble is black. We know the following:

• In the beginning, there are 10 marbles in the urn, 4 of which are black. Therefore, P(A) = 4/10.
• After the first selection, there are 9 marbles in the urn, 3 of which are black. Therefore, P(B|A) = 3/9.

Therefore, based on the rule of multiplication, P(A ∩ B) = P(A) P(B|A) = (4/10) * (3/9) = 12/90 = 2/15, or about 0.13.

A probability distribution is a table or an equation that links each possible value that a random variable can assume with its probability of occurrence.

Discrete Probability Distributions
The probability distribution of a discrete random variable can always be represented by a table. For example, suppose you flip a coin two times. This simple exercise can have four possible outcomes: HH, HT, TH, and TT. Now, let the variable X represent the number of heads that result from the coin flips. The variable X can take on the values 0, 1, or 2, and X is a discrete random variable.

The table below shows the probabilities associated with each possible value of X. The probability of getting 0 heads is 0.25; 1 head, 0.50; and 2 heads, 0.25. Thus, the table is an example of a probability distribution for a discrete random variable.

    Number of heads, x    Probability, P(x)
    0                     0.25
    1                     0.50
    2                     0.25

Note: Given a probability distribution, you can find cumulative probabilities. For example, the probability of getting 1 or fewer heads [ P(X ≤ 1) ] is P(X = 0) + P(X = 1), which is equal to 0.25 + 0.50 or 0.75.

Continuous Probability Distributions
The probability distribution of a continuous random variable is represented by an equation, called the probability density function (pdf). All probability density functions satisfy the following conditions:

• The random variable Y is a function of X; that is, y = f(x).
• The value of y is greater than or equal to zero for all values of x.
• The total area under the curve of the function is equal to one.

The charts below show two continuous probability distributions. The chart on the left shows a probability density function described by the equation y = 1 over the range of 0 to 1 and y = 0 elsewhere. The chart on the right shows a probability density function described by the equation y = 1 - 0.5x over the range of 0 to 2 and y = 0 elsewhere. The area under the curve is equal to 1 for both charts.

The probability that a continuous random variable falls in the interval between a and b is equal to the area under the pdf curve between a and b. For example, in the first chart above, the shaded area shows the probability that the random variable X will fall between 0.6 and 1.0; that probability is 0.40. And in the second chart, the shaded area shows the probability of falling between 1.0 and 2.0; that probability is 0.25.

Note: With a continuous distribution, there are an infinite number of values between any two data points. As a result, the probability that a continuous random variable will assume a particular value is always zero.
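The two shaded-area probabilities just quoted can be checked with a quick numeric integration. This is only a sketch; the two density functions are the ones defined for the charts, and the midpoint rule is an arbitrary but adequate choice here.

    def area_under(pdf, a, b, steps=100000):
        """Approximate the area under pdf between a and b with the midpoint rule."""
        width = (b - a) / steps
        return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

    def uniform_pdf(x):
        return 1.0 if 0 <= x <= 1 else 0.0             # y = 1 on the interval [0, 1]

    def triangle_pdf(x):
        return 1.0 - 0.5 * x if 0 <= x <= 2 else 0.0   # y = 1 - 0.5x on [0, 2]

    print(round(area_under(uniform_pdf, 0.6, 1.0), 3))    # 0.4
    print(round(area_under(triangle_pdf, 1.0, 2.0), 3))   # 0.25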

AP Statistics Tutorial: Attributes of Random Variables
Just like variables from a data set, random variables are described by measures of central tendency (i.e., mean and median) and measures of variability (i.e., standard deviation and variance). This lesson shows how to compute these measures for discrete random variables.

Mean of a Discrete Random Variable
The mean of the discrete random variable X is also called the expected value of X. Notationally, the expected value of X is denoted by E(X). Use the following formula to compute the mean of a discrete random variable:

E(X) = μx = Σ [ xi * P(xi) ]

where xi is the value of the random variable for outcome i, μx is the mean of random variable X, and P(xi) is the probability that the random variable will be outcome i.

Example 1
In a recent little league softball game, each player went to bat 4 times. The number of hits made by each player is described by the following probability distribution.

    Number of hits, x    0      1      2      3      4
    Probability, P(x)    0.10   0.20   0.30   0.25   0.05

What is the mean of the probability distribution?

(A) 1.00   (B) 1.75   (C) 2.00   (D) 2.25   (E) None of the above

Solution
The correct answer is B. The mean is computed as follows:

E(X) = Σ [ xi * P(xi) ]
E(X) = 0*0.10 + 1*0.20 + 2*0.30 + 3*0.25 + 4*0.05 = 1.75

Median of a Discrete Random Variable

The median of a discrete random variable is the value of X for which P(X ≤ x) is greater than or equal to 0.5 and P(X ≥ x) is greater than or equal to 0.5.

In Example 1, the median is 2, because P(X ≤ 2) is equal to 0.60 and P(X ≥ 2) is equal to 0.60. The computations are shown below.

P(X ≤ 2) = P(x=0) + P(x=1) + P(x=2) = 0.10 + 0.20 + 0.30 = 0.60
P(X ≥ 2) = P(x=2) + P(x=3) + P(x=4) = 0.30 + 0.25 + 0.05 = 0.60

Variability of a Discrete Random Variable
The standard deviation of a discrete random variable (σ) is equal to the square root of the variance of a discrete random variable (σ²). The equation for computing the variance of a discrete random variable is shown below:

σ² = Σ [ xi - E(x) ]² * P(xi)

where xi is the value of the random variable for outcome i, P(xi) is the probability that the random variable will be outcome i, and E(x) is the expected value of the discrete random variable x.

Example 2
The number of adults living in homes on a randomly selected city block is described by the following probability distribution.

    Number of adults, x    1      2      3      4
    Probability, P(x)      0.25   0.50   0.15   0.10

What is the standard deviation of the probability distribution?

(A) 0.50   (B) 0.62   (C) 0.79   (D) 0.89   (E) 2.10

Solution
The correct answer is D. The solution has three parts. First, find the expected value; then, find the variance; then, find the standard deviation. Computations are shown below, beginning with the expected value.

E(X) = Σ [ xi * P(xi) ]
E(X) = 1*0.25 + 2*0.50 + 3*0.15 + 4*0.10 = 2.10

Now that we know the expected value, we find the variance.

σ² = Σ [ xi - E(x) ]² * P(xi)
σ² = (1 - 2.1)² * 0.25 + (2 - 2.1)² * 0.50 + (3 - 2.1)² * 0.15 + (4 - 2.1)² * 0.10
σ² = (1.21 * 0.25) + (0.01 * 0.50) + (0.81 * 0.15) + (3.61 * 0.10)
σ² = 0.3025 + 0.0050 + 0.1215 + 0.3610 = 0.79

And finally, the standard deviation is equal to the square root of the variance, so the standard deviation is sqrt(0.79) or 0.889.

AP Statistics: Combinations of Random Variables
Sometimes, it is necessary to add or subtract random variables. When this occurs, it is useful to know the mean and variance of the result. This lesson introduces some important equations, and the sample problems show how to apply those equations. Recommendation: Read the sample problems at the end of the lesson.

Sums and Differences of Random Variables: Effect on the Mean
Suppose you have two variables: X with a mean of μx and Y with a mean of μy. Then, the mean of the sum of these variables (μx+y) and the mean of the difference between these variables (μx-y) are given by the following equations:

μx+y = μx + μy     and     μx-y = μx - μy

The above equations for general variables also apply to random variables. If X and Y are random variables, then

E(X + Y) = E(X) + E(Y)     and     E(X - Y) = E(X) - E(Y)

where E(X) is the expected value (mean) of X, E(Y) is the expected value of Y, E(X + Y) is the expected value of X plus Y, and E(X - Y) is the expected value of X minus Y.

Independence of Random Variables
If two random variables, X and Y, are independent, they satisfy the following conditions:

• P(x|y) = P(x), for all values of X and Y.
• P(x ∩ y) = P(x) * P(y), for all values of X and Y.

The above conditions are equivalent. If either one is met, the other condition is also met, and X and Y are independent. If either condition is not met, X and Y are dependent.

Note: If X and Y are independent, then the correlation between X and Y is equal to zero.

Sums and Differences of Random Variables: Effect on Variance
Suppose X and Y are independent random variables. Then, the variance of (X + Y) and the variance of (X - Y) are described by the following equations:

Var(X + Y) = Var(X - Y) = Var(X) + Var(Y)

where Var(X + Y) is the variance of the sum of X and Y, Var(X - Y) is the variance of the difference between X and Y, Var(X) is the variance of X, and Var(Y) is the variance of Y.

Note: The standard deviation (SD) is always equal to the square root of the variance (Var). Thus,

SD(X + Y) = sqrt[ Var(X + Y) ]     and     SD(X - Y) = sqrt[ Var(X - Y) ]
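These variance rules are easy to verify by simulation. The sketch below (not from the original lesson) uses two arbitrary independent variables - a uniform variable and a fair die - and compares Var(X) + Var(Y) with the variance of the simulated sums.

    import random
    from statistics import variance

    random.seed(2)
    n = 100000
    x = [random.uniform(0, 10) for _ in range(n)]    # Var(X) = 100/12, about 8.33
    y = [random.randint(1, 6) for _ in range(n)]     # Var(Y) = 35/12, about 2.92

    print(round(variance(x) + variance(y), 2))                   # about 11.25
    print(round(variance([a + b for a, b in zip(x, y)]), 2))     # about the same value

The two printed numbers agree closely, which is what Var(X + Y) = Var(X) + Var(Y) predicts for independent variables; for dependent variables the second value would drift away from the first.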

AP Statistics: Linear Transformations of Variables
Sometimes, it is necessary to apply a linear transformation to a random variable. When this is done, it may be useful to know the mean and variance of the result.

Linear Transformations of Random Variables
A linear transformation is a change to a variable characterized by one or more of the following operations: adding a constant to the variable, subtracting a constant from the variable, multiplying the variable by a constant, and/or dividing the variable by a constant.

To illustrate, let X be a random variable, and let m and b be constants. Each of the following examples shows how a linear transformation of X defines a new random variable Y.

• Adding a constant: Y = X + b
• Subtracting a constant: Y = X - b
• Multiplying by a constant: Y = mX
• Dividing by a constant: Y = X/m
• Multiplying by a constant and adding a constant: Y = mX + b
• Dividing by a constant and subtracting a constant: Y = X/m - b

Note: Suppose X and Z are variables, and the correlation between X and Z is equal to r. If a new variable Y is created by applying a linear transformation to X, then the correlation between Y and Z will also equal r (when the multiplying constant is positive; a negative multiplier reverses the sign of the correlation).

How Linear Transformations Affect the Mean and Variance
Suppose a linear transformation is applied to the random variable X to create a new random variable Y = mX + b. Then, the mean and variance of the new random variable Y are defined by the following equations:

μY = m * μX + b     and     Var(Y) = m² * Var(X)

where m and b are constants, μX is the mean of X, μY is the mean of Y, Var(X) is the variance of X, and Var(Y) is the variance of Y.

Note: The standard deviation (SD) of the transformed variable is equal to the square root of the variance. That is, SD(Y) = sqrt[ Var(Y) ].

AP Statistics Tutorial: Simulation of Random Events
Simulation is a way to model random events, such that simulated outcomes closely match real-world outcomes. By observing simulated outcomes, researchers gain insight on the real world.

Why use simulation? Some situations do not lend themselves to precise mathematical treatment. Others may be difficult, time-consuming, or expensive to analyze. In these situations, simulation may approximate real-world results, yet require less time, effort, and/or money than other approaches.

How to Conduct a Simulation
A simulation is useful only if it closely mirrors real-world outcomes. The steps required to produce a useful simulation are presented below.

1. Describe the possible outcomes.
2. Link each outcome to one or more random numbers.
3. Choose a source of random numbers.
4. Choose a random number.
5. Based on the random number, note the "simulated" outcome.
6. Repeat steps 4 and 5 multiple times; preferably, until the outcomes show a stable pattern.
7. Analyze the simulated outcomes and report results.

Note: When it comes to choosing a source of random numbers (Step 3 above), you have many options. Flipping a coin and rolling dice are low-tech but effective. Tables of random numbers (often found in the appendices of statistics texts) are another option. And good random number generators can be found on the internet.

AP Statistics Tutorial: Binomial Distribution
To understand binomial distributions and binomial probability, it helps to understand binomial experiments and some associated notation, so we cover those topics first.

Binomial Experiment
A binomial experiment (in which each individual trial is also known as a Bernoulli trial) is a statistical experiment that has the following properties:

• The experiment consists of n repeated trials.
• Each trial can result in just two possible outcomes. We call one of these outcomes a success and the other, a failure.
• The probability of success, denoted by P, is the same on every trial.
• The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.

Consider the following statistical experiment. You flip a coin 2 times and count the number of times the coin lands on heads. This is a binomial experiment because:

• The experiment consists of repeated trials. We flip a coin 2 times.
• Each trial can result in just two possible outcomes - heads or tails.
• The probability of success is constant - 0.5 on every trial.
• The trials are independent; that is, getting heads on one trial does not affect whether we get heads on other trials.
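As a sketch of the seven simulation steps in code (an illustration, not part of the original guide), the fragment below simulates the two-coin-flip experiment just described: random numbers 0 and 1 stand for tails and heads, the outcome of interest is the number of heads, and the trial count of 10,000 is an arbitrary choice.

    import random
    from collections import Counter

    random.seed(3)
    counts = Counter()

    for _ in range(10000):                                  # steps 4-6: repeat many times
        flips = [random.randint(0, 1) for _ in range(2)]    # 1 represents heads
        counts[sum(flips)] += 1                             # step 5: record the simulated outcome

    for heads in sorted(counts):                            # step 7: analyze and report
        print(heads, counts[heads] / 10000)                 # frequencies near 0.25, 0.50, 0.25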

Notation
The following notation is helpful, when we talk about binomial probability.

• x: The number of successes that result from the binomial experiment.
• n: The number of trials in the binomial experiment.
• P: The probability of success on an individual trial.
• Q: The probability of failure on an individual trial. (This is equal to 1 - P.)
• nCr: The number of combinations of n things, taken r at a time.
• b(x; n, P): Binomial probability - the probability that an n-trial binomial experiment results in exactly x successes, when the probability of success on an individual trial is P.

Binomial Distribution
A binomial random variable is the number of successes x in n repeated trials of a binomial experiment. The probability distribution of a binomial random variable is called a binomial distribution (also known as a Bernoulli distribution).

Suppose we flip a coin two times and count the number of heads (successes). The binomial random variable is the number of heads, which can take on values of 0, 1, or 2. The binomial distribution is presented below.

    Number of heads    Probability
    0                  0.25
    1                  0.50
    2                  0.25

The binomial distribution has the following properties:

• The mean of the distribution (μx) is equal to n * P.
• The variance (σ²x) is n * P * ( 1 - P ).
• The standard deviation (σx) is sqrt[ n * P * ( 1 - P ) ].

Binomial Probability
The binomial probability refers to the probability that a binomial experiment results in exactly x successes. For example, in the above table, we see that the binomial probability of getting exactly one head in two coin flips is 0.50.

Given x, n, and P, we can compute the binomial probability based on the following formula.

Binomial Formula. Suppose a binomial experiment consists of n trials and results in x successes. If the probability of success on an individual trial is P, then the binomial probability is:

b(x; n, P) = nCx * P^x * (1 - P)^(n - x)

Cumulative Binomial Probability
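The binomial formula translates directly into a few lines of Python, shown below as a sketch using the standard-library function math.comb for nCx. A cumulative binomial probability (the probability that the number of successes falls within a range, rather than equaling one exact value) is then just a sum of these exact probabilities.

    from math import comb

    def binomial_probability(x, n, p):
        """P(exactly x successes in n trials), success probability p on each trial."""
        return comb(n, x) * p**x * (1 - p)**(n - x)

    for x in range(3):
        print(x, binomial_probability(x, 2, 0.5))        # 0.25, 0.5, 0.25 (the table above)

    # Cumulative probability of 1 or fewer heads in two flips, P(X <= 1):
    print(sum(binomial_probability(x, 2, 0.5) for x in range(2)))   # 0.75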

Negative Binomial Distribution
A negative binomial random variable is the number X of repeated trials needed to produce r successes in a negative binomial experiment. The probability distribution of a negative binomial random variable is called a negative binomial distribution. The negative binomial distribution is also known as the Pascal distribution.

Suppose we flip a coin repeatedly and count the number of heads (successes). If we continue flipping the coin until it has landed 2 times on heads, we are conducting a negative binomial experiment. The negative binomial random variable is the number of coin flips required to achieve 2 heads. In this example, the number of coin flips is a random variable that can take on any integer value between 2 and plus infinity. The negative binomial probability distribution for this example is presented below.

    Number of coin flips    Probability
    2                       0.25
    3                       0.25
    4                       0.1875
    5                       0.125
    6                       0.078125
    7 or more               0.109375

Negative Binomial Probability
The negative binomial probability refers to the probability that a negative binomial experiment results in r - 1 successes after trial x - 1 and r successes after trial x. For example, in the above table, we see that the negative binomial probability of getting the second head on the sixth flip of the coin is 0.078125.

Given x, r, and P, we can compute the negative binomial probability based on the following formula.

Negative Binomial Formula. Suppose a negative binomial experiment consists of x trials and results in r successes. If the probability of success on an individual trial is P, then the negative binomial probability is:

b*(x; r, P) = x-1Cr-1 * P^r * (1 - P)^(x - r)

The negative binomial distribution (for the number of trials X) has the following properties:

• The mean of the distribution is: μ = r / P.
• The variance is: σ² = rQ / P².

Geometric Distribution

The geometric distribution is a special case of the negative binomial distribution. It deals with the number of trials required for a single success. Thus, the geometric distribution is a negative binomial distribution where the number of successes (r) is equal to 1.

An example of a geometric distribution would be tossing a coin until it lands on heads. We might ask: What is the probability that the first head occurs on the third flip? That probability is referred to as a geometric probability and is denoted by g(x; P). The formula for geometric probability is given below.

Geometric Probability Formula. Suppose a negative binomial experiment consists of x trials and results in one success. If the probability of success on an individual trial is P, then the geometric probability is:

g(x; P) = P * Q^(x - 1)

The geometric distribution (for the number of trials) has the following properties:

• The mean of the distribution is: μ = 1 / P.
• The variance is: σ² = Q / P².

AP Statistics Tutorial: Normal Distribution
The normal distribution refers to a family of continuous probability distributions described by the normal equation.

The Normal Equation
The normal distribution is defined by the following equation.

Normal equation. The value of the random variable Y is:

Y = [ 1 / ( σ * sqrt(2π) ) ] * e^[ -(x - μ)² / (2σ²) ]

where X is a normal random variable, μ is the mean, σ is the standard deviation, π is approximately 3.14159, and e is approximately 2.71828.

The random variable X in the normal equation is called the normal random variable. The normal equation is the probability density function for the normal distribution.

The Normal Curve
The graph of the normal distribution depends on two factors - the mean and the standard deviation. The mean of the distribution determines the location of the center of the graph, and the standard deviation determines the height and width of the graph. When the standard deviation is large, the curve is short and wide; when the standard deviation is small, the curve is tall and narrow. All normal distributions look like a symmetric, bell-shaped curve, as shown below.

The curve on the left is shorter and wider than the curve on the right, because the curve on the left has a bigger standard deviation.

Probability and the Normal Curve
The normal distribution is a continuous probability distribution. This has several implications for probability.

• The total area under the normal curve is equal to 1.
• The probability that a normal random variable X equals any particular value is 0.
• The probability that X is greater than a equals the area under the normal curve bounded by a and plus infinity (as indicated by the non-shaded area in the figure below).
• The probability that X is less than a equals the area under the normal curve bounded by a and minus infinity (as indicated by the shaded area in the figure below).

Additionally, every normal curve (regardless of its mean or standard deviation) conforms to the following "rule":

• About 68% of the area under the curve falls within 1 standard deviation of the mean.
• About 95% of the area under the curve falls within 2 standard deviations of the mean.
• About 99.7% of the area under the curve falls within 3 standard deviations of the mean.

Collectively, these points are known as the empirical rule or the 68-95-99.7 rule. Clearly, given a normal distribution, most outcomes will be within 3 standard deviations of the mean.

To find the probability associated with a normal random variable, use a graphing calculator, an online normal distribution calculator, or a normal distribution table. In the examples below, we illustrate the use of Stat Trek's Normal Distribution Calculator, a free tool available on this site. In the next lesson, we demonstrate the use of normal distribution tables.

AP Statistics Tutorial: Standard Normal Distribution

Standard Normal Distribution
The standard normal distribution is a special case of the normal distribution. It is the distribution that occurs when a normal random variable has a mean of zero and a standard deviation of one.

. .0681 0.0343.0934 0.1151 . From the table (see above)..4 -1. Table rows show the whole number and tenths place of the z-score. These probabilities are easy to compute from a normal distribution table.0013 0. that we want to know the probability that a z-score will be greater than 3. we find that P(Z < -1.1020 0.00) = 0.01...0013 0. and P(Z < -1.0985 .0694 0. μ is the mean mean of X..09 0..9987.P(Z < -1.31 is 0.. a section of the standard normal table is reproduced below.40 and less than -1. Suppose.1112 0.0735 0.1075 0.9987 0. .. The probability that a standard normal random variable (z) is greater than a given value (a) is easy to find...04 0.3 with the column containing 0.. for example.03 0.1151. Or you may want to know the probability that a standard normal random variable lies between two given values.0764 0.μ) / σ where X is a normal random variable..0918 0.The normal random variable of a standard normal distribution is called a standard score or a z-score. .P(Z < a). .01 0.0793 0.2 .0808..0722 0..07 0..3 -1.1093 0..0012 0.. ..  Find P(a < Z < b). cross-reference the row of the table containing -1.0 ... Therefore. .1056 0.0 0.0808 = 0..0749 0.9988 0..9989 0. . Here's how.0010 0. The P(a < Z < b) = P(Z < b) . The Normal Distribution as a Model for Measurements 41 ..0. . you may be called upon to use or interpret standard normal distribution tables.. that is. you may not be interested in the probability that a standard normal random variable falls between minus infinity and a given value. -1.41 -133- .. .0708 0.1038 0.0011 0.00 0. Standard normal tables are commonly found in appendices of most statistics texts. .. 3..9989 0. In school or on the Advanced Placement Statistics Exam.9990 Of course. z -3.31. . .0011 0. For example..06 0. The table shows that the probability that a standard normal random variable will be less than -1. Every normal random variable X can be transformed into a z score via the following equation: z = (X .  Find P(Z > a).. Therefore.00.0838 0.. The cumulative probability (often from minus infinity to the z-score) appears in the cell of the table.0013 0.0853 0.31) = 0.20) = 0. You may want to know the probability that it lies between a given value and plus infinity. .P(Z < a). Table columns show the hundredths place.0885 0.1151 0.0011 0. The table shows the P(Z < a).0951.. For example. .20.20) . . .9987 = 0..0823 0.40) = 0.0. The P(Z > a) = 1 ..40 < Z < -1. suppose we want to know the probability that a z-score will be greater than -1.40) = 0. 0.9989 0.05 0. Standard Normal Distribution Table A standard normal distribution table shows a cumulative probability associated with a particular z-score.02 0. and σ is the standard deviation of X..9987 0.P(Z < 3. P(-1.0901 0. From the table (see above).9987 0..0968 0.1003 0. 0. P(Z < -1..0951.0012 0.20) = P(Z < -1. The probability that a standard normal random variables lies between two values is also easy to find.1131 0.. .... P(Z > 3.9990 0. .0010 ..0013..0778 0.00) = 1 .0808 0.08 0. To find the cumulative probability of a z-score equal to -1.0951 0.0869 0.9988 0.00) = 1 . we find that P(Z < 3...

The Normal Distribution as a Model for Measurements
Often, phenomena in the real world follow a normal (or near-normal) distribution. This allows researchers to use the normal distribution as a model for assessing probabilities associated with real-world phenomena. Typically, the analysis involves two steps.

• Transform raw data. Usually, the raw data are not in the form of z-scores. They need to be transformed into z-scores, using the transformation equation presented earlier: z = (X - μ) / σ.
• Find probability. Once the data have been transformed into z-scores, you can use standard normal distribution tables, online calculators (e.g., Stat Trek's free normal distribution calculator), or handheld graphing calculators to find probabilities associated with the z-scores.

The problem in the next section demonstrates the use of the normal distribution as a model for measurement.

AP Statistics Tutorial: Student's t Distribution
According to the central limit theorem, the sampling distribution of a statistic (like a sample mean) will follow a normal distribution, as long as the sample size is sufficiently large. Therefore, when we know the standard deviation of the population, we can compute a z-score and use the normal distribution to evaluate probabilities with the sample mean.

But sample sizes are sometimes small, and often we do not know the standard deviation of the population. When either of these problems occurs, statisticians rely on the distribution of the t statistic (also known as the t score), whose values are given by:

t = [ x - μ ] / [ s / sqrt( n ) ]

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size. The distribution of the t statistic is called the t distribution or the Student t distribution.

Degrees of Freedom
There are actually many different t distributions. The particular form of the t distribution is determined by its degrees of freedom. The degrees of freedom refers to the number of independent observations in a set of data.

When estimating a mean score or a proportion from a single sample, the number of independent observations is equal to the sample size minus one. Hence, the distribution of the t statistic from samples of size 8 would be described by a t distribution having 8 - 1 or 7 degrees of freedom. Similarly, a t distribution having 15 degrees of freedom would be used with a sample of size 16. For other applications, the degrees of freedom may be calculated differently. We will describe those computations as they come up.

Properties of the t Distribution
The t distribution has the following properties:

• The mean of the distribution is equal to 0.
• The variance is equal to v / ( v - 2 ), where v is the degrees of freedom (see the last section) and v > 2.
• The variance is always greater than 1, although it is close to 1 when there are many degrees of freedom. With infinite degrees of freedom, the t distribution is the same as the standard normal distribution.
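Computing a t statistic from raw data takes only a few lines. This is a sketch with made-up measurements and an assumed hypothesized population mean; the degrees of freedom come from the sample size, as described above.

    from math import sqrt
    from statistics import mean, stdev

    sample = [98.2, 99.1, 97.8, 100.4, 98.9, 99.7, 98.5, 99.3]   # hypothetical measurements
    mu = 98.6                                                    # hypothesized population mean

    n = len(sample)
    t = (mean(sample) - mu) / (stdev(sample) / sqrt(n))          # degrees of freedom = n - 1 = 7
    print(round(t, 2))                                           # about 1.31 for these made-up values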

When to Use the t Distribution
The t distribution can be used with any statistic having a bell-shaped distribution (i.e., approximately normal). The central limit theorem states that the sampling distribution of a statistic will be normal or nearly normal, if any of the following conditions apply.

• The population distribution is normal.
• The sampling distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
• The sampling distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
• The sample size is greater than 40, without outliers.

The t distribution should not be used with small samples from populations that are not approximately normal.

Probability and the Student t Distribution
When a sample of size n is drawn from a population having a normal (or nearly normal) distribution, the sample mean can be transformed into a t score, using the equation presented at the beginning of this lesson. We repeat that equation below:

t = [ x - μ ] / [ s / sqrt( n ) ]

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, n is the sample size, and degrees of freedom are equal to n - 1.

The t score produced by this transformation can be associated with a unique cumulative probability. This cumulative probability represents the likelihood of finding a sample mean less than or equal to x, given a random sample of size n. The easiest way to find the probability associated with a particular t score is to use the T Distribution Calculator, a free tool provided by Stat Trek.

Notation and t Scores
Statisticians use tα to represent the t-score associated with a cumulative probability (from minus infinity to t) of (1 - α). For example, suppose we were interested in the t-score having a cumulative probability of 0.95. In this example, α would be equal to (1 - 0.95) or 0.05. We would refer to the t-score as t0.05.

Of course, the value of t0.05 depends on the number of degrees of freedom. For example, with 2 degrees of freedom, t0.05 is equal to 2.92; but with 20 degrees of freedom, t0.05 is equal to 1.725.

Note: Because the t distribution is symmetric about a mean of zero, the following is true:

tα = -t1-α     and     t1-α = -tα

Thus, if t0.05 = 2.92, then t0.95 = -2.92.

AP Statistics Tutorial: Chi-Square Distribution

Suppose we conduct the following statistical experiment. We select a random sample of size n from a normal population, having a standard deviation equal to σ. We find that the standard deviation in our sample is equal to s. Given these data, we can compute a statistic, called chi-square, using the following equation:

Χ² = [ ( n - 1 ) * s² ] / σ²

If we repeated this experiment an infinite number of times, we could obtain a sampling distribution for the chi-square statistic. The chi-square distribution is defined by the following probability density function:

Y = Y0 * ( Χ² )^( v/2 - 1 ) * e^( -Χ² / 2 )

where Y0 is a constant that depends on the number of degrees of freedom, Χ² is the chi-square statistic, v = n - 1 is the number of degrees of freedom, and e is a constant equal to the base of the natural logarithm system (approximately 2.71828). Y0 is defined so that the area under the chi-square curve is equal to one.

In the figure above, the red curve shows the distribution of chi-square values computed from all possible samples of size 3, where degrees of freedom is n - 1 = 3 - 1 = 2. Similarly, the green curve shows the distribution for samples of size 5 (degrees of freedom equal to 4), and the blue curve, for samples of size 11 (degrees of freedom equal to 10).

The chi-square distribution has the following properties:

• The mean of the distribution is equal to the number of degrees of freedom: μ = v.
• The variance is equal to two times the number of degrees of freedom: σ² = 2 * v.
• When the degrees of freedom are greater than or equal to 2, the maximum value for Y occurs when Χ² = v - 2.
• As the degrees of freedom increase, the chi-square curve approaches a normal distribution.

Cumulative Probability and the Chi-Square Distribution
The chi-square distribution is constructed so that the total area under the curve is equal to 1. The area under the curve between 0 and a particular value of a chi-square statistic is the cumulative probability associated with that statistic. For example, in the figure below, the shaded area represents the cumulative probability for a chi-square equal to A.
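The chi-square statistic from the experiment above is a one-line computation. Here is a sketch under assumed numbers: a hypothetical sample and a known population standard deviation.

    from statistics import stdev

    population_sigma = 4.0                               # assumed population standard deviation
    sample = [12.1, 15.3, 9.8, 14.0, 11.6, 13.2, 10.9]   # hypothetical sample, n = 7

    n = len(sample)
    s = stdev(sample)
    chi_square = (n - 1) * s**2 / population_sigma**2    # degrees of freedom = n - 1 = 6
    print(round(chi_square, 2))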

Fortunately, we don't have to compute the area under the curve to find the probability. The easiest way to find the probability associated with a particular chi-square is to use the Chi-Square Distribution Calculator, a free tool provided by Stat Trek.

AP Statistics Tutorial: Sampling Distributions
Suppose that we draw all possible samples of size n from a given population. Suppose further that we compute a statistic (e.g., a mean, proportion, standard deviation) for each sample. The probability distribution of this statistic is called a sampling distribution.

Variability of a Sampling Distribution
The variability of a sampling distribution is measured by its variance or its standard deviation. The variability of a sampling distribution depends on three factors:

• N: The number of observations in the population.
• n: The number of observations in the sample.
• The way that the random sample is chosen.

If the population size is much larger than the sample size, then the sampling distribution has roughly the same sampling error, whether we sample with or without replacement. On the other hand, if the sample represents a significant fraction (say, 1/10) of the population size, the sampling error will be noticeably smaller, when we sample without replacement.

Central Limit Theorem
The central limit theorem states that the sampling distribution of any statistic will be normal or nearly normal, if the sample size is large enough. How large is "large enough"? As a rough rule of thumb, many statisticians say that a sample size of 30 is large enough. If you know something about the shape of the sample distribution, you can refine that rule. The sample size is large enough if any of the following conditions apply.

• The population distribution is normal.
• The sampling distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
• The sampling distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
• The sample size is greater than 40, without outliers.
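A quick simulation sketch makes the central limit theorem tangible; the skewed population below is an arbitrary choice, and only summary numbers are printed.

    import random
    from statistics import mean, stdev

    random.seed(4)
    population = [random.expovariate(1.0) for _ in range(100000)]   # a skewed population

    sample_means = [mean(random.sample(population, 40)) for _ in range(2000)]

    print(round(mean(population), 3), round(mean(sample_means), 3))        # both near 1.0
    print(round(stdev(population) / sqrt_40, 3) if (sqrt_40 := 40 ** 0.5) else None,
          round(stdev(sample_means), 3))                                   # similar values

The mean of the sample means lands close to the population mean, and the spread of the sample means is close to the population standard deviation divided by sqrt(40), even though the population itself is far from normal.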

The exact shape of any normal curve is totally determined by its mean and standard deviation. Therefore, if we know the mean and standard deviation of a statistic, we can find the mean and standard deviation of the sampling distribution of the statistic (assuming that the statistic came from a "large" sample).

Sampling Distribution of the Mean
Suppose we draw all possible samples of size n from a population of size N. Suppose further that we compute a mean score for each sample. In this way, we create a sampling distribution of the mean.

We know the following. The mean of the population (μ) is equal to the mean of the sampling distribution (μx). And the standard error of the sampling distribution (σx) is determined by the standard deviation of the population (σ), the population size, and the sample size. These relationships are shown in the equations below:

μx = μ     and     σx = σ * sqrt( 1/n - 1/N )

Note: When the population size is very large, the factor 1/N is approximately equal to zero, and the standard deviation formula reduces to: σx = σ / sqrt(n). You often see this formula in introductory statistics texts.

Therefore, we can specify the sampling distribution of the mean whenever two conditions are met:

• The population is normally distributed, or the sample size is sufficiently large.
• The population standard deviation σ is known.

Sampling Distribution of the Proportion
In a population of size N, suppose that the probability of the occurrence of an event (dubbed a "success") is P, and the probability of the event's non-occurrence (dubbed a "failure") is Q. From this population, suppose that we draw all possible samples of size n. And finally, within each sample, suppose that we determine the proportion of successes p and failures q. In this way, we create a sampling distribution of the proportion.

We find that the mean of the sampling distribution of the proportion (μp) is equal to the probability of success in the population (P). And the standard error of the sampling distribution (σp) is determined by the standard deviation of the population (σ), the population size, and the sample size. These relationships are shown in the equations below:

μp = P     and     σp = σ * sqrt( 1/n - 1/N ) = sqrt[ PQ/n - PQ/N ]

where σ = sqrt[ PQ ].

Note: When the population size is very large, the factor PQ/N is approximately equal to zero, and the standard deviation formula reduces to: σp = sqrt( PQ/n ). You often see this formula in intro statistics texts.

AP Statistics Tutorial: Difference Between Proportions
Many statistical applications involve comparisons between two independent sample proportions.

Difference Between Proportions: Theory
Suppose we have two populations with proportions equal to P1 and P2. Suppose further that we take all possible samples of size n1 and n2. And finally, suppose that the following assumptions are valid.

• The size of each population is large relative to the sample drawn from the population. That is, N1 is large relative to n1, and N2 is large relative to n2. (In this context, populations are considered to be large if they are at least 10 times bigger than their sample.)
• The samples from each population are big enough to justify using a normal distribution to model differences between proportions. The sample sizes will be big enough when the following conditions are met: n1P1 > 10, n1(1 - P1) > 10, n2P2 > 10, and n2(1 - P2) > 10.
• The samples are independent; that is, observations in population 1 are not affected by observations in population 2, and vice versa.

Given these assumptions, we know the following.

• The set of differences between sample proportions will be normally distributed. We know this from the central limit theorem.
• The expected value of the difference between all possible sample proportions is equal to the difference between population proportions. Thus, E(p1 - p2) = P1 - P2.
• The standard deviation of the difference between sample proportions (σd) is approximately equal to:

σd = sqrt{ [ P1(1 - P1) / n1 ] + [ P2(1 - P2) / n2 ] }

It is straightforward to derive the last bullet point, based on material covered in previous lessons. The derivation starts with a recognition that the variance of the difference between independent random variables is equal to the sum of the individual variances. Thus,

σ²d = σ²(p1 - p2) = σ²1 + σ²2

If the populations N1 and N2 are both large relative to n1 and n2, respectively, then

σ²1 = P1(1 - P1) / n1     and     σ²2 = P2(1 - P2) / n2

Therefore,

σ²d = [ P1(1 - P1) / n1 ] + [ P2(1 - P2) / n2 ]     and     σd = sqrt{ [ P1(1 - P1) / n1 ] + [ P2(1 - P2) / n2 ] }

AP Statistics Tutorial: Difference Between Means
Many statistical applications involve comparisons between two independent sample means.

Difference Between Means: Theory
Suppose we have two populations with means equal to μ1 and μ2. Suppose further that we take all possible samples of size n1 and n2. And finally, suppose that the following assumptions are valid.

• The size of each population is large relative to the sample drawn from the population. That is, N1 is large relative to n1, and N2 is large relative to n2. (In this context, populations are considered to be large if they are at least 10 times bigger than their sample.)
• The samples are independent; that is, observations in population 1 are not affected by observations in population 2, and vice versa.
• The set of differences between sample means is normally distributed. This will be true if each population is normal or if the sample sizes are large. (Based on the central limit theorem, sample sizes of 40 are large enough.)

Given these assumptions, we know the following.

• The expected value of the difference between all possible sample means is equal to the difference between population means. Thus, E(x1 - x2) = μd = μ1 - μ2.
• The standard deviation of the difference between sample means (σd) is approximately equal to:

σd = sqrt( σ1² / n1 + σ2² / n2 )

It is straightforward to derive the last bullet point, based on material covered in previous lessons. The derivation starts with a recognition that the variance of the difference between independent random variables is equal to the sum of the individual variances. Thus,

σ²d = σ²(x1 - x2) = σ²x1 + σ²x2

If the populations N1 and N2 are both large relative to n1 and n2, respectively, then

σ²x1 = σ1² / n1     and     σ²x2 = σ2² / n2

Therefore, σ²d = σ1² / n1 + σ2² / n2     and     σd = sqrt( σ1² / n1 + σ2² / n2 )

AP Statistics Tutorial: Estimation Problems
In statistics, estimation refers to the process by which one makes inferences about a population, based on information obtained from a sample.

Point Estimate vs. Interval Estimate
Statisticians use sample statistics to estimate population parameters. For example, sample means are used to estimate population means; sample proportions, to estimate population proportions. An estimate of a population parameter may be expressed in two ways:

• Point estimate. A point estimate of a population parameter is a single value of a statistic. For example, the sample mean x is a point estimate of the population mean μ. Similarly, the sample proportion p is a point estimate of the population proportion P.
• Interval estimate. An interval estimate is defined by two numbers, between which a population parameter is said to lie. For example, a < x < b is an interval estimate of the population mean μ. It indicates that the population mean is greater than a but less than b.

Confidence Intervals
Statisticians use a confidence interval to express the precision and uncertainty associated with a particular sampling method. A confidence interval consists of three parts.

• A confidence level.
• A statistic.
• A margin of error.

The confidence level describes the uncertainty of a sampling method. The statistic and the margin of error define an interval estimate that describes the precision of the method. The interval estimate of a confidence interval is defined by the sample statistic + margin of error.

For example, we might say that we are 95% confident that the true population mean falls within a specified range. This statement is a confidence interval. It means that if we used the same sampling method to select different samples and compute different interval estimates, the true population mean would fall within a range defined by the sample statistic + margin of error 95% of the time.

Confidence intervals are preferred to point estimates, because confidence intervals indicate (a) the precision of the estimate and (b) the uncertainty of the estimate.

Confidence Level
The probability part of a confidence interval is called a confidence level. The confidence level describes how strongly we believe that a particular sampling method will produce a confidence interval that includes the true population parameter.

Here is how to interpret a confidence level. Suppose we collected many different samples, and computed confidence intervals for each sample. Some confidence intervals would include the true population parameter; others would not. A 95% confidence level means that 95% of the intervals contain the true population parameter; a 90% confidence level means that 90% of the intervals contain the population parameter; and so on.

Margin of Error
In a confidence interval, the range of values above and below the sample statistic is called the margin of error.

For example, suppose the local newspaper conducts an election survey and reports that the independent candidate will receive 30% of the vote. The newspaper states that the survey had a 5% margin of error and a confidence level of 95%. These findings result in the following confidence interval: We are 95% confident that the independent candidate will receive between 25% and 35% of the vote.

Note: Many public opinion surveys report interval estimates, but not confidence intervals. They provide the margin of error, but not the confidence level. To clearly interpret survey results you need to know both! We are much more likely to accept survey findings if the confidence level is high (say, 95%) than if it is low (say, 50%).

AP Statistics Tutorial: Standard Error
To compute a confidence interval for a statistic, you need to know the standard deviation or the standard error of the statistic. This lesson describes how to find the standard deviation and standard error, and shows how the two measures are related.

Notation
The following notation is helpful, when we talk about the standard deviation and the standard error.

    Population parameter                            Sample statistic
    N: Number of observations in the population     n: Number of observations in the sample
    Ni: Number of observations in population i      ni: Number of observations in sample i
    P: Proportion of successes in population        p: Proportion of successes in sample
    Pi: Proportion of successes in population i     pi: Proportion of successes in sample i
    μ: Population mean                              x: Sample estimate of population mean
    μi: Mean of population i                        xi: Sample estimate of μi
    σ: Population standard deviation                s: Sample estimate of σ
    σp: Standard deviation of p                     SEp: Standard error of p
    σx: Standard deviation of x                     SEx: Standard error of x

Standard Deviation of Sample Estimates
Statisticians use sample statistics to estimate population parameters. Naturally, the value of a statistic may vary from one sample to the next. The variability of a statistic is measured by its standard deviation. The table below shows formulas for computing the standard deviation of statistics from simple random samples. These formulas are valid when the population size is much larger (at least 10 times larger) than the sample size.

    Statistic                                    Standard Deviation
    Sample mean, x                               σx = σ / sqrt( n )
    Sample proportion, p                         σp = sqrt [ P(1 - P) / n ]
    Difference between means, x1 - x2            σx1-x2 = sqrt [ σ1² / n1 + σ2² / n2 ]
    Difference between proportions, p1 - p2      σp1-p2 = sqrt [ P1(1-P1) / n1 + P2(1-P2) / n2 ]

Note: In order to compute the standard deviation of a sample statistic, you must know the value of one or more population parameters.

Standard Error of Sample Estimates
Sadly, the values of population parameters are often unknown, making it impossible to compute the standard deviation of a statistic. When this occurs, use the standard error. The standard error is computed from known sample statistics, and it provides an unbiased estimate of the standard deviation. The table below shows how to compute the standard error for simple random samples, assuming the population size is at least 10 times larger than the sample size.

    Statistic                                    Standard Error
    Sample mean, x                               SEx = s / sqrt( n )
    Sample proportion, p                         SEp = sqrt [ p(1 - p) / n ]
    Difference between means, x1 - x2            SEx1-x2 = sqrt [ s1² / n1 + s2² / n2 ]
    Difference between proportions, p1 - p2      SEp1-p2 = sqrt [ p1(1-p1) / n1 + p2(1-p2) / n2 ]
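As a small sketch (not part of the original tables), the two most common standard-error formulas can be coded directly; the score list and the proportion below are invented for illustration.

    from math import sqrt
    from statistics import stdev

    def se_mean(sample):
        """Standard error of a sample mean: s / sqrt(n)."""
        return stdev(sample) / sqrt(len(sample))

    def se_proportion(p, n):
        """Standard error of a sample proportion: sqrt[ p(1 - p) / n ]."""
        return sqrt(p * (1 - p) / n)

    scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 84]   # hypothetical sample of 10 scores
    print(round(se_mean(scores), 2))                     # about 2.96 for this sample
    print(round(se_proportion(0.30, 50), 3))             # about 0.065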

The equations for the standard error are identical to the equations for the standard deviation, except for one thing - the standard error equations use statistics where the standard deviation equations use parameters. Specifically, the standard error equations use p in place of P, and s in place of σ.

AP Statistics Tutorial: Margin of Error
In a confidence interval, the range of values above and below the sample statistic is called the margin of error.

For example, suppose we wanted to know the percentage of adults that exercise daily. We could devise a sample design to ensure that our sample estimate will not differ from the true population value by more than, say, 5 percent (the margin of error) 90 percent of the time (the confidence level).

How to Compute the Margin of Error
The margin of error can be defined by either of the following equations:

Margin of error = Critical value x Standard deviation of the statistic
Margin of error = Critical value x Standard error of the statistic

If you know the standard deviation of the statistic, use the first equation to compute the margin of error. Otherwise, use the second equation. Previously, we described how to compute the standard deviation and standard error.

How to Find the Critical Value
The critical value is a factor used to compute the margin of error. This section describes how to find the critical value, when the sampling distribution of the statistic is normal or nearly normal.

The central limit theorem states that the sampling distribution of a statistic will be normal or nearly normal, if any of the following conditions apply:

• The population distribution is normal.
• The sampling distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
• The sampling distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
• The sample size is greater than 40, without outliers.

When one of these conditions is satisfied, the critical value can be expressed as a t score or as a z score. To find the critical value, follow these steps:

• Compute alpha (α): α = 1 - (confidence level / 100)
• Find the critical probability (p*): p* = 1 - α/2

• To express the critical value as a z score, find the z score having a cumulative probability equal to the critical probability (p*).
• To express the critical value as a t score, follow these steps.
  o Find the degrees of freedom (DF). When estimating a mean score or a proportion from a single sample, DF is equal to the sample size minus one. For other applications, the degrees of freedom may be calculated differently. We will describe those computations as they come up.
  o The critical t score (t*) is the t score having degrees of freedom equal to DF and a cumulative probability equal to the critical probability (p*).

You can use the Normal Distribution Calculator to find the critical z score, and the t Distribution Calculator to find the critical t score. You can also use a graphing calculator or standard statistical tables (found in the appendix of most introductory statistics texts).

Should you express the critical value as a t score or as a z score? There are several ways to answer this question. As a practical matter, when the sample size is large (greater than 40), it doesn't make much difference. Both approaches yield similar results. Strictly speaking, when the population standard deviation is unknown or when the sample size is small, the t score is preferred. Nevertheless, many introductory statistics texts use the z score exclusively. On this web site, we provide sample problems that illustrate both approaches.

AP Statistics Tutorial: Confidence Intervals

Statisticians use a confidence interval to describe the amount of uncertainty associated with a sample estimate of a population parameter.

Confidence Interval Data Requirements

To express a confidence interval, you need three pieces of information.

• Confidence level
• Statistic
• Margin of error

Given these inputs, the range of the confidence interval is defined by the sample statistic + margin of error. And the uncertainty associated with the confidence interval is specified by the confidence level.

How to Interpret Confidence Intervals

Consider the following confidence interval: We are 90% confident that the population mean is greater than 100 and less than 200.

Some people think this means there is a 90% chance that the population mean falls between 100 and 200. This is incorrect. Like any population parameter, the population mean is a constant, not a random variable. It does not change. The probability that a constant falls within any given range is always 0.00 or 1.00.

The confidence level describes the uncertainty associated with a sampling method. Suppose we used the same sampling method to select different samples and to compute a different interval estimate for each sample. Some interval estimates would include the true population parameter and some would not. A 90% confidence level means that we would expect 90% of the interval estimates to include the population parameter; a 95% confidence level means that 95% of the intervals would include the parameter; and so on.
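As a rough illustration of the steps above, the sketch below uses scipy to find the critical value for a 95% confidence level and turns it into a margin of error. The sample values (n = 25, s = 8) are made up for the example.

```python
import math
from scipy.stats import norm, t

confidence_level = 95
n, s = 25, 8                                # hypothetical sample size and sample standard deviation

alpha = 1 - confidence_level / 100          # alpha = 0.05
p_star = 1 - alpha / 2                      # critical probability = 0.975

z_star = norm.ppf(p_star)                   # critical z score, about 1.96
t_star = t.ppf(p_star, df=n - 1)            # critical t score with DF = n - 1, about 2.06

standard_error = s / math.sqrt(n)           # standard error of the sample mean
margin_of_error = t_star * standard_error   # margin of error = critical value * SE
print(round(z_star, 3), round(t_star, 3), round(margin_of_error, 3))
```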

How to Construct a Confidence Interval

There are four steps to constructing a confidence interval.

• Identify a sample statistic. Choose the statistic (e.g., mean, proportion, standard deviation) that you will use to estimate a population parameter.
• Select a confidence level. As we noted in the previous section, the confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
• Find the margin of error. Often, the margin of error may be given. If you are working on a homework problem or a test question, however, the margin of error is not given; you will need to compute it. Previously, we described how to compute the margin of error, based on one of the following equations.

  Margin of error = Critical value * Standard deviation of statistic
  Margin of error = Critical value * Standard error of statistic

  For guidance, see how to compute the margin of error.
• Specify the confidence interval. The uncertainty is denoted by the confidence level. And the range of the confidence interval is defined by the following equation.

  Confidence interval = sample statistic + Margin of error

The sample problem in the next section applies the above four steps to construct a 95% confidence interval for a mean score. The next few lessons discuss this topic in greater detail.

AP Statistics Tutorial: Estimating a Proportion

This lesson describes how to construct a confidence interval for a sample proportion, p.

Estimation Requirements

The approach described in this lesson is valid whenever the following conditions are met:

• The sampling method is simple random sampling.
• The sample includes at least 10 successes and 10 failures. (Some texts say that 5 successes and 5 failures are enough.)

The Variability of the Sample Proportion

To construct a confidence interval for a sample proportion, we need to know the variability of the sample proportion. This means we need to know how to compute the standard deviation and/or the standard error of the sampling distribution.

• Suppose k possible samples of size n can be selected from the population. The standard deviation of the sampling distribution is the "average" deviation between the k sample proportions and the true population proportion, P. The standard deviation of the sample proportion σp is:

  σp = sqrt[ P * ( 1 - P ) / n ] * sqrt[ ( N - n ) / ( N - 1 ) ]

  where P is the population proportion, n is the sample size, and N is the population size.

• When the population size is much larger (at least 10 times larger) than the sample size, the standard deviation can be approximated by:

  σp = sqrt[ P * ( 1 - P ) / n ]

• When the true population proportion P is not known, the standard deviation of the sampling distribution cannot be calculated. Under these circumstances, use the standard error. The standard error (SE) provides an unbiased estimate of the standard deviation. It can be calculated from the equation below.

  SEp = sqrt[ p * ( 1 - p ) / n ] * sqrt[ ( N - n ) / ( N - 1 ) ]

  where p is the sample proportion, n is the sample size, and N is the population size.

• When the population size is at least 10 times larger than the sample size, the standard error can be approximated by:

  SEp = sqrt[ p * ( 1 - p ) / n ]

Alert

The Advanced Placement Statistics Examination only covers the "approximate" formulas for the standard deviation and standard error. However, students are expected to be aware of the limitations of these formulas; namely, the approximate formulas should only be used when the population size is at least 10 times larger than the sample size.

How to Find the Confidence Interval for a Proportion

Previously, we described how to construct confidence intervals. For convenience, we repeat the key steps below.

• Identify a sample statistic. Use the sample proportion to estimate the population proportion.
• Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
• Find the margin of error. Previously, we showed how to compute the margin of error.
• Specify the confidence interval. The range of the confidence interval is defined by the sample statistic + margin of error. And the uncertainty is denoted by the confidence level.

In the next section, we work through a problem that shows how to use this approach to construct a confidence interval for a proportion.
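The four steps above can be checked in software. A minimal sketch, using hypothetical survey results (73 successes out of 100) and the approximate standard error formula from this lesson:

```python
import math
from scipy.stats import norm

# Hypothetical data: 73 successes out of n = 100, 95% confidence level.
# Assumes a simple random sample and a population at least 10 times
# larger than the sample.
successes, n = 73, 100
confidence_level = 0.95

p = successes / n                                   # sample proportion (the statistic)
se = math.sqrt(p * (1 - p) / n)                     # approximate standard error
z_star = norm.ppf(1 - (1 - confidence_level) / 2)   # critical z score, about 1.96
margin_of_error = z_star * se

lower, upper = p - margin_of_error, p + margin_of_error
print(f"{p:.2f} +/- {margin_of_error:.3f}  ->  ({lower:.3f}, {upper:.3f})")
```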

AP Statistics Tutorial: Difference Between Proportions

This lesson describes how to construct a confidence interval for the difference between two sample proportions.

Estimation Requirements

The approach described in this lesson is valid whenever the following conditions are met:

• Both samples are simple random samples.
• The samples are independent.
• Each sample includes at least 10 successes and 10 failures. (Some texts say that 5 successes and 5 failures are enough.)

The Variability of the Difference Between Proportions

To construct a confidence interval for the difference between two sample proportions, we need to know about the sampling distribution of the difference. Specifically, we need to know how to compute the standard deviation or standard error of the sampling distribution.

• The standard deviation of the sampling distribution is the "average" deviation between all possible sample differences (p1 - p2) and the true population difference, (P1 - P2). The standard deviation of the difference between sample proportions σp1-p2 is:

  σp1-p2 = sqrt{ [P1 * (1 - P1) / n1] * [(N1 - n1) / (N1 - 1)] + [P2 * (1 - P2) / n2] * [(N2 - n2) / (N2 - 1)] }

  where P1 is the population proportion for sample 1, P2 is the population proportion for sample 2, n1 is the sample size from population 1, n2 is the sample size from population 2, N1 is the number of observations in population 1, and N2 is the number of observations in population 2.

• When each sample is small (less than 10% of its population), the standard deviation can be approximated by:

  σp1-p2 = sqrt{ [P1 * (1 - P1) / n1] + [P2 * (1 - P2) / n2] }

• When the population parameters (P1 and P2) are not known, the standard deviation of the sampling distribution cannot be calculated. Under these circumstances, use the standard error. The standard error (SE) provides an unbiased estimate of the standard deviation. It can be calculated from the equation below.

  SEp1-p2 = sqrt{ [p1 * (1 - p1) / n1] * [(N1 - n1) / (N1 - 1)] + [p2 * (1 - p2) / n2] * [(N2 - n2) / (N2 - 1)] }

  where p1 is the sample proportion for sample 1, and p2 is the sample proportion for sample 2.

• When each sample is small (less than 10% of its population), the standard error can be approximated by:

  SEp1-p2 = sqrt{ [p1 * (1 - p1) / n1] + [p2 * (1 - p2) / n2] }

Note: The Advanced Placement Statistics Examination only covers the "approximate" formulas for the standard deviation and standard error. However, students are expected to be aware of the limitations of these formulas; namely, they should only be used when each population is at least 10 times larger than its respective sample.
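A quick numerical illustration of the approximate standard error above, with made-up sample proportions; the 95% interval at the end uses the general confidence-interval recipe from the previous lessons.

```python
import math
from scipy.stats import norm

# Hypothetical samples: p1 = 0.52 from n1 = 400, p2 = 0.47 from n2 = 300.
# Each sample is assumed to be less than 10% of its population, so the
# approximate standard error formula applies.
p1, n1 = 0.52, 400
p2, n2 = 0.47, 300

se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2

# 95% confidence interval for P1 - P2: statistic +/- critical value * SE
z_star = norm.ppf(0.975)
print(f"{diff:.2f} +/- {z_star * se:.3f}")
```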

Alert

Some texts present a different, less general version of the approximate formulas, which appear below.

  σp1-p2 = sqrt[ P * (1 - P) ] * sqrt[ (1 / n1) + (1 / n2) ]     where P = P1 = P2
  SEp1-p2 = sqrt[ p * (1 - p) ] * sqrt[ (1 / n1) + (1 / n2) ]     where p = p1 = p2

These formulas are valid when the proportions are equal. Generally, these two formulas should be used only when the proportions from each group are equal and when each sample size is small (less than 10% of its population size).

How to Find the Confidence Interval for the Difference Between Proportions

Previously, we described how to construct confidence intervals. For convenience, we repeat the key steps below.

• Identify a sample statistic. Use the difference between sample proportions (p1 - p2) to estimate the difference between population proportions (P1 - P2).
• Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
• Find the margin of error. Previously, we showed how to compute the margin of error.
• Specify the confidence interval. The range of the confidence interval is defined by the sample statistic + margin of error. And the uncertainty is denoted by the confidence level.

In the next section, we work through a problem that shows how to use this approach to construct a confidence interval for the difference between proportions.

AP Statistics Tutorial: Estimating the Population Mean

This lesson describes how to construct a confidence interval for a sample mean, x.

Estimation Requirements

The approach described in this lesson is valid whenever the following conditions are met:

• The sampling method is simple random sampling.
• The sampling distribution is approximately normally distributed.

Generally, the sampling distribution will be approximately normally distributed if any of the following conditions apply.

• The population distribution is normal.
• The sampling distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
• The sampling distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
• The sample size is greater than 40, without outliers.

The Variability of the Sample Mean

To construct a confidence interval for a sample mean, we need to know the variability of the sample mean. This means we need to know how to compute the standard deviation or the standard error of the sampling distribution.

• Suppose k possible samples of size n can be selected from a population of size N. The standard deviation of the sampling distribution is the "average" deviation between the k sample means and the true population mean, μ. The standard deviation of the sample mean σx is:

  σx = σ * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }

  where σ is the standard deviation of the population, N is the population size, and n is the sample size.

• When the population size is much larger (at least 10 times larger) than the sample size, the standard deviation can be approximated by:

  σx = σ / sqrt( n )

• When the standard deviation of the population σ is unknown, the standard deviation of the sampling distribution cannot be calculated. Under these circumstances, use the standard error. The standard error (SE) provides an unbiased estimate of the standard deviation. It can be calculated from the equation below.

  SEx = s * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }

  where s is the standard deviation of the sample, N is the population size, and n is the sample size.

• When the population size is much larger (at least 10 times larger) than the sample size, the standard error can be approximated by:

  SEx = s / sqrt( n )

Note: In real-world analyses, the standard deviation of the population is seldom known. Therefore, the standard error is used more often than the standard deviation.

Alert

The Advanced Placement Statistics Examination only covers the "approximate" formulas for the standard deviation and standard error. However, students are expected to be aware of the limitations of these formulas; namely, the approximate formulas should only be used when the population size is at least 10 times larger than the sample size.

How to Find the Confidence Interval for a Mean

Previously, we described how to construct confidence intervals. For convenience, we repeat the key steps below.

• Identify a sample statistic. Use the sample mean to estimate the population mean, μ.
• Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
• Find the margin of error. Previously, we showed how to compute the margin of error.
• Specify the confidence interval. The range of the confidence interval is defined by the sample statistic + margin of error. And the uncertainty is denoted by the confidence level.

In the next section, we work through a problem that shows how to use this approach to construct a confidence interval to estimate a population mean.
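The steps above, applied to a hypothetical sample of weights, look like this in code. The numbers (n = 20, mean 180, s = 30) are made up; the sketch assumes a simple random sample from a roughly normal population, so a t critical value with DF = n - 1 is used.

```python
import math
from scipy.stats import t

n, sample_mean, s = 20, 180.0, 30.0
confidence_level = 0.95

se = s / math.sqrt(n)                               # standard error of the mean
t_star = t.ppf(1 - (1 - confidence_level) / 2, df=n - 1)
margin_of_error = t_star * se

print(f"{sample_mean} +/- {margin_of_error:.2f}")
print((sample_mean - margin_of_error, sample_mean + margin_of_error))
```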

AP Statistics Tutorial: Difference Between Means

This lesson describes how to construct a confidence interval for the difference between two means.

Estimation Requirements

The approach described in this lesson is valid whenever the following conditions are met:

• Both samples are simple random samples.
• The samples are independent.
• Each population is at least 10 times larger than its respective sample.
• The sampling distribution of the difference between means is approximately normally distributed.

Generally, the sampling distribution will be approximately normally distributed if each sample is described by at least one of the following statements.

• The population distribution is normal.
• The sampling distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
• The sampling distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
• The sample size is greater than 40, without outliers.

The Variability of the Difference Between Sample Means

To construct a confidence interval, we need to know the variability of the difference between sample means. This means we need to know how to compute the standard deviation and/or the standard error of the sampling distribution of the difference.

• If the population standard deviations are known, the standard deviation of the sampling distribution is:

  σx1-x2 = sqrt[ σ1^2 / n1 + σ2^2 / n2 ]

  where σ1 is the standard deviation of population 1, σ2 is the standard deviation of population 2, n1 is the size of sample 1, and n2 is the size of sample 2.

• When the standard deviation of either population is unknown, the standard deviation of the sampling distribution cannot be calculated. Under these circumstances, use the standard error. The standard error (SE) provides an unbiased estimate of the standard deviation. It can be calculated from the equation below.

  SEx1-x2 = sqrt[ s1^2 / n1 + s2^2 / n2 ]

  where s1 is the standard deviation of sample 1, s2 is the standard deviation of sample 2, n1 is the size of sample 1, and n2 is the size of sample 2.

Note: In real-world analyses, the standard deviation of the population is seldom known. Therefore, the standard error is used more often than the standard deviation.

Alert

Some texts present additional options for calculating standard deviations and standard errors, which should only be used under special circumstances. These formulas are described below.

• Pooled standard deviation. Use this formula when the population standard deviations are known and are equal.

  σx1-x2 = σd = σ * sqrt[ (1 / n1) + (1 / n2) ]     where σ = σ1 = σ2

• Pooled standard error. Use this formula when the population standard deviations are unknown, but assumed to be equal; and the sample sizes (n1) and (n2) are small (under 30).

  SEpooled = sqrt{ [ (n1 - 1) * s1^2 + (n2 - 1) * s2^2 ] / (n1 + n2 - 2) } * sqrt[ (1 / n1) + (1 / n2) ]     where σ1 = σ2

Remember, these two formulas should be used only when the various required underlying assumptions are justified.

How to Find the Confidence Interval for the Difference Between Means

Previously, we described how to construct confidence intervals. For convenience, we repeat the key steps below.

• Identify a sample statistic. Use the difference between sample means to estimate the difference between population means.
• Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
• Find the margin of error. Previously, we showed how to compute the margin of error, based on the critical value and standard deviation.

When the sample size is large, you can use a t score or a z score for the critical value. Since it does not require computing degrees of freedom, the z score is a little easier. When the sample sizes are small (less than 40), use a t score for the critical value. If you use a t score, you will need to compute degrees of freedom (DF). Here's how.

• The following formula is appropriate whenever a t score is used to analyze the difference between means.

  DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2 / n1)^2 / (n1 - 1) ] + [ (s2^2 / n2)^2 / (n2 - 1) ] }

• If you are working with a pooled standard error (see above), DF = n1 + n2 - 2.

The next section presents sample problems that illustrate the use of z scores and t scores as critical values.
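The degrees-of-freedom formula and the pooled standard error above are easy to get wrong by hand, so a small helper sketch can be useful. The function names (welch_df, pooled_se) and the sample values are my own choices for illustration; the formulas are the ones shown above.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom for the difference between means (unpooled)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def pooled_se(s1, n1, s2, n2):
    """Pooled standard error, assuming equal population standard deviations."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)

# Hypothetical samples: s1 = 100, n1 = 15 and s2 = 90, n2 = 20
print(round(welch_df(100, 15, 90, 20)))    # about 28
print(round(pooled_se(100, 15, 90, 20), 2))
```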

   The sampling method must be simple random sampling. we cannot compute the standard deviation of the difference between sample means.70 (B) 50 + 28.) (A) 50 + 1. use a t score as the critical value. •   Find standard deviation or standard error. In this analysis. The range of the confidence interval is defined by the sample statistic + margin of error. 60 . Select a confidence level. instead.950 = 50. Elsewhere on this site.15 students from school A and 20 students from school B. Test Your Understanding of This Lesson Problem 1: Small Samples Suppose that simple random samples of college freshman are selected from two universities . The sampling distribution should be approximately normally distributed. The samples must be independent. And the uncertainty is denoted by the confidence level. we can use the following four-step approach to construct a confidence interval.49 (C) 50 + 32. We are working with a 90% confidence level. the samples are independent. The problem states that test scores in each population are normally distributed. we show how to compute the margin of error when the sampling distribution is approximately normal. the confidence level is defined for us in the problem. The key steps are shown below. Specify the confidence interval. Find the margin of error.x2 = 1000 . the sample from school A has an average score of 1000 with a standard deviation of 100. Since responses from one sample did not affect responses from the other sample. The sample from school B has an average score of 950 with a standard deviation of 90. On a standardized test.  Identify a sample statistic. Since we do not know the standard deviation of the populations.60 -133- . This condition is satisfied. Thus. the problem statement says that we used simple random sampling. we choose the difference between sample means as the sample statistic. What is the 90% confidence interval for the difference in test scores at the two schools. Since the above requirements are satisfied. The approach that we used to solve this problem is valid when the following conditions are met. Since we are trying to estimate the difference between population means. we compute the standard error (SE). so the difference between test scores will also be normally distributed. x1 .74 (D) 50 + 55. assuming that test scores came from normal distributions in both schools? (Hint: Since the sample sizes are small.66 (E) None of the above Solution The correct answer is (D).

• Find standard deviation or standard error.

  SE = sqrt[ s1^2 / n1 + s2^2 / n2 ]
  SE = sqrt[ (100)^2 / 15 + (90)^2 / 20 ] = sqrt( 10000/15 + 8100/20 ) = sqrt( 666.67 + 405 ) = 32.74

• Find critical value. The critical value is a factor used to compute the margin of error. Because the sample sizes are small, we express the critical value as a t score rather than a z score. To find the critical value, we take these steps.

  o Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 90/100 = 0.10
  o Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.10/2 = 0.95
  o Find the degrees of freedom (df):

    DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2 / n1)^2 / (n1 - 1) ] + [ (s2^2 / n2)^2 / (n2 - 1) ] }
    DF = (100^2/15 + 90^2/20)^2 / { [ (100^2/15)^2 / 14 ] + [ (90^2/20)^2 / 19 ] }
    DF = (666.67 + 405)^2 / (31746.03 + 8632.89) = 1148477 / 40378.92 = 28.44

    Rounding off to the nearest whole number, we conclude that there are 28 degrees of freedom.

  o The critical value is the t score having 28 degrees of freedom and a cumulative probability equal to 0.95. From the t Distribution Calculator, we find that the critical value is 1.7.

• Compute margin of error (ME): ME = critical value * standard error = 1.7 * 32.74 = 55.66

• Specify the confidence interval. The range of the confidence interval is defined by the sample statistic + margin of error. And the uncertainty is denoted by the confidence level. Therefore, the 90% confidence interval is -5.66 to 105.66. That is, we are 90% confident that the true difference in population means is in the range defined by 50 + 55.66.

Problem 2: Large Samples

The local baseball team conducts a study to find the amount spent on refreshments at the ball park. Over the course of the season they gather simple random samples of 50 men and 100 women. For men, the average expenditure was \$20, with a standard deviation of \$3. For women, it was \$15, with a standard deviation of \$2.

What is the 99% confidence interval for the spending difference between men and women? Assume that the two populations are independent and normally distributed.

(A) \$5 + \$0.47 (B) \$5 + \$1.21 (C) \$5 + \$2.58 (D) \$5 + \$5.00 (E) None of the above

The problem states that test scores in each population are normally distributed.995. Since the above requirements are satisfied.  Identify a sample statistic. we show how to compute the margin of error when the sampling distribution is approximately normal. instead.21. SE = sqrt [ s21 / n1 + s22 / n2 ] SE = sqrt [(3)2 / 50 + (2)2 / 100] = sqrt (9/50 + 4/100) = sqrt(0.79 to \$6. so the difference between test scores will also be normally distributed. the confidence level is defined for us in the problem. The critical value is a factor used to compute the margin of error.18 + 0.Solution The correct answer is (B).α/2 = 1 . Therefore. Since we do not know the standard deviation of the populations. Because the sample sizes are large enough. Find the margin of error. we choose the difference between sample means as the sample statistic.99/100 = 0. The approach that we used to solve this problem is valid when the following conditions are met. And the uncertainty is denoted by the confidence level. we take these steps. Elsewhere on this site.58.62 -133- . the problem statement says that we used simple random sampling. To find the critical value.995 The critical value is the z score having a cumulative probability equal to 0. The key steps are shown below.21. we are 99% confident that men outspend women at the ballpark by about \$5 + \$1.01 Find the critical probability (p*): p* = 1 . In this analysis.47 • Find critical value. 62 . we express the critical value as a z score.x2 = \$20 . we find that the critical value is 2. Thus. • Compute margin of error (ME): ME = critical value * standard error = 2. we can use the following four-step approach to construct a confidence interval. The range of the confidence interval is defined by the sample statistic + margin of error.(confidence level / 100) = 1 . o o o Compute alpha (α): α = 1 .47 = 1. we cannot compute the standard deviation of the difference between sample means.\$15 = \$5. We are working with a 99% confidence level. The sampling distribution should be approximately normally distributed. •   Find standard deviation or standard error. The samples must be independent. x1 . we compute the standard error (SE).21  Specify the confidence interval. Select a confidence level. From the Normal Distribution Calculator.    The sampling method must be simple random sampling. This condition is satisfied.01/2 = 0.04) = 0. Again. That is. the 99% confidence interval is \$3.0.58 * 0. Since we are trying to estimate the difference between population means. the problem statement satisfies this condition.
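Both worked problems above can be reproduced in a few lines of code. The sketch below uses the sample statistics from the problem statements; small differences from the hand calculations come from rounding the critical values.

```python
import math
from scipy.stats import norm, t

def two_sample_ci(mean1, s1, n1, mean2, s2, n2, confidence, use_t):
    """Confidence interval for a difference between means (unpooled SE)."""
    diff = mean1 - mean2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    p_star = 1 - (1 - confidence) / 2
    if use_t:
        v1, v2 = s1**2 / n1, s2**2 / n2
        df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
        critical = t.ppf(p_star, df=round(df))
    else:
        critical = norm.ppf(p_star)
    return diff, critical * se

# Problem 1: small samples, 90% confidence, t critical value
print(two_sample_ci(1000, 100, 15, 950, 90, 20, 0.90, use_t=True))   # about (50, 55.7)

# Problem 2: large samples, 99% confidence, z critical value
print(two_sample_ci(20, 3, 50, 15, 2, 100, 0.99, use_t=False))       # about (5, 1.21)
```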

AP Statistics Tutorial: Mean Difference Between Matched Data Pairs

This lesson describes how to construct a confidence interval for the mean difference between matched data pairs.

Estimation Requirements

The approach described in this lesson is valid whenever the following conditions are met:

• The data set is a simple random sample of observations from the population of interest.
• Each element of the population includes measurements on two paired variables (e.g., x and y) such that the paired difference between x and y is: d = x - y.
• The sampling distribution of the mean difference between data pairs (d) is approximately normally distributed.

Generally, the sampling distribution will be approximately normally distributed if the sample is described by at least one of the following statements.

• The population distribution of paired differences (i.e., the variable d) is normal.
• The sample distribution of paired differences is symmetric, unimodal, without outliers, and the sample size is 15 or less.
• The sample distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
• The sample size is greater than 40, without outliers.

The Variability of the Mean Difference Between Matched Pairs

Suppose d is the mean difference between sample data pairs. To construct a confidence interval for d, we need to know how to compute the standard deviation and/or the standard error of the sampling distribution for d.

• The standard deviation of the mean difference σd is:

  σd = σd * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }

  where σd is the standard deviation of the population difference, N is the population size, and n is the sample size.

• When the population size is much larger (at least 10 times larger) than the sample size, the standard deviation can be approximated by:

  σd = σd / sqrt( n )

• When the standard deviation of the population σd is unknown, the standard deviation of the sampling distribution cannot be calculated. Under these circumstances, use the standard error. The standard error (SE) provides an unbiased estimate of the standard deviation. It can be calculated from the equation below.

  SEd = sd * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }

  where sd is the standard deviation of the sample difference, N is the population size, and n is the sample size.

• When the population size is much larger (at least 10 times larger) than the sample size, the standard error can be approximated by:

  SEd = sd / sqrt( n )

Note: In real-world analyses, the standard deviation of the population is seldom known. Therefore, the standard error is used more often than the standard deviation.

Alert

The Advanced Placement Statistics Examination only covers the "approximate" formulas for the standard deviation and standard error. However, students are expected to be aware of the limitations of these formulas; namely, the approximate formulas should only be used when the population size is at least 10 times larger than the sample size.

How to Find the Confidence Interval for Mean Difference With Paired Data

Previously, we described how to construct confidence intervals. For convenience, we repeat the key steps below.

• Identify a sample statistic. Use the mean difference between sample data pairs (d) to estimate the mean difference between population data pairs μd.
• Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
• Find the margin of error. Previously, we showed how to compute the margin of error, based on the critical value and standard deviation. When the sample size is large, you can use a t score or a z score for the critical value. Since it does not require computing degrees of freedom, the z score is a little easier. When the sample sizes are small (less than 40), use a t score for the critical value. If you use a t score, you will need to compute degrees of freedom (DF). In this case, the degrees of freedom is equal to the sample size minus one: DF = n - 1.
• Specify the confidence interval. The range of the confidence interval is defined by the sample statistic + margin of error. And the uncertainty is denoted by the confidence level.

AP Statistics Tutorial: Estimate Regression Slope

This lesson describes how to construct a confidence interval to estimate the slope of a regression line

  ŷ = b0 + b1x

where b0 is a constant, b1 is the slope (also called the regression coefficient), x is the value of the independent variable, and ŷ is the predicted value of the dependent variable.

Estimation Requirements

The approach described in this lesson is valid whenever the standard requirements for simple linear regression are met.

• The dependent variable Y has a linear relationship to the independent variable X.
• For each value of X, the probability distribution of Y has the same standard deviation σ.
• For any given value of X:
  o The Y values are independent.
  o The Y values are roughly normally distributed (i.e., symmetric and unimodal). A little skewness is ok if the sample size is large.

Previously, we described how to verify that regression requirements are met.

The Variability of the Slope Estimate

To construct a confidence interval for the slope of the regression line, we need to know the standard error of the sampling distribution of the slope. Many statistical software packages and some graphing calculators provide the standard error of the slope as a regression analysis output. The table below shows hypothetical output for the following regression equation: y = 76 + 35x.

  Predictor    Coef    SE Coef    T       P
  Constant     76      30         2.53    0.01
  X            35      20         1.75    0.04

In the output above, the standard error of the slope (in the X row) is equal to 20. In this example, the standard error is referred to as "SE Coef". However, other software packages might use a different label for the standard error. It might be "StDev", "Std Dev", "SE", or something else.

If you need to calculate the standard error of the slope (SE) by hand, use the following formula:

  SE = sb1 = sqrt[ Σ(yi - ŷi)^2 / (n - 2) ] / sqrt[ Σ(xi - x)^2 ]

where yi is the value of the dependent variable for observation i, ŷi is the estimated value of the dependent variable for observation i, xi is the observed value of the independent variable for observation i, x is the mean of the independent variable, and n is the number of observations.

How to Find the Confidence Interval for the Slope of a Regression Line

Previously, we described how to construct confidence intervals. The confidence interval for the slope uses the same general approach. Note, however, that the critical value is based on a t score with n - 2 degrees of freedom.

• Identify a sample statistic. The sample statistic is the regression slope b1 calculated from sample data. In the table above, the regression slope is 35.
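Turning the hypothetical output above into an interval is a one-liner once the critical t score is known. In the sketch below, the slope (35) and its standard error (20) come from the table; the number of observations n is an assumption I added, since the table does not report it.

```python
from scipy.stats import t

b1, se_b1, n = 35, 20, 101       # n = 101 is assumed for illustration
confidence_level = 0.99

df = n - 2                                           # slope CI uses n - 2 degrees of freedom
t_star = t.ppf(1 - (1 - confidence_level) / 2, df=df)
margin_of_error = t_star * se_b1

print(f"{b1} +/- {margin_of_error:.1f}")
```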

• Select a confidence level. The confidence level describes the uncertainty of a sampling method. Often, researchers choose 90%, 95%, or 99% confidence levels; but any percentage can be used.
• Find the margin of error. Previously, we showed how to compute the margin of error, based on the critical value and standard error. When calculating the margin of error for a regression slope, use a t score for the critical value, with degrees of freedom (DF) equal to n - 2.
• Specify the confidence interval. The range of the confidence interval is defined by the sample statistic + margin of error. And the uncertainty is denoted by the confidence level.

In the next section, we work through a problem that shows how to use this approach to construct a confidence interval for the slope of a regression line.

AP Statistics Tutorial: Tests of Significance

A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true. The best way to determine whether a statistical hypothesis is true would be to examine the entire population. Since that is often impractical, researchers typically examine a random sample from the population. If sample data are consistent with the statistical hypothesis, the hypothesis is accepted; if not, it is rejected.

There are two types of statistical hypotheses.

• Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance.
• Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is the hypothesis that sample observations are influenced by some non-random cause.

For example, suppose we wanted to determine whether a coin was fair and balanced. A null hypothesis might be that half the flips would result in Heads and half, in Tails. The alternative hypothesis might be that the number of Heads and Tails would be very different. Symbolically, these hypotheses would be expressed as

  H0: P = 0.5
  Ha: P ≠ 0.5

Suppose we flipped the coin 50 times, resulting in 40 Heads and 10 Tails. Given this result, we would be inclined to reject the null hypothesis and accept the alternative hypothesis.

Hypothesis Tests

Statisticians follow a formal process to determine whether to accept or reject a null hypothesis, based on sample data. This process, called hypothesis testing, consists of four steps.

• State the hypotheses. This involves stating the null and alternative hypotheses. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false.
• Formulate an analysis plan. The analysis plan describes how to use sample data to accept or reject the null hypothesis. The accept/reject decision often focuses around a single test statistic.

where the region of rejection is on both sides of the sampling distribution. If the test statistic supports the null hypothesis.  Type I error. Analyze sample data. assuming the null hypotheis is true. suppose the null hypothesis states that the mean is less than or equal to 10. Find the value of the test statistic (mean score. In practice. suppose the null hypothesis states that the mean is equal to 10. Complete other computations. If the P-value is less than the significance level. The Advanced Placement (AP) Statistics Exam uses P-values.67 -133- . we say that the hypothesis has been rejected at the α level of significance. The set of values outside the region of acceptance is called the region of rejection. otherwise.  Decision Rules The analysis plan includes decision rules for accepting or rejecting the null hypothesis. The region of acceptance is a range of values. proportion. For example.   P-value. and is often denoted by α. Region of acceptance. Some statistics texts use the P-value approach. The alternative hypothesis would be that the mean is less than 10 or greater than 10. and is often denoted by β. These approaches are equivalent. we reject the null hypothesis. the null hypothesis is rejected. This probability is also called alpha. The probability of not committing a Type II error is called the Power of the test. The probability of committing a Type II error is called Beta.with reference to a P-value or with reference to a region of acceptance. Type II error. A test of a statistical hypothesis. Apply the decision rule described in the analysis plan. In such cases. as required by the plan. The probability of committing a Type I error is called the significance level. One-Tailed and Two-Tailed Tests A test of a statistical hypothesis.  Decision Errors Two types of errors can result from a hypothesis test. The strength of evidence in support of a null hypothesis is measured by the P-value. is called a two-tailed test. others use the region of acceptance approach. accept the null hypothesis. The region of acceptance is defined so that the chance of making a Type I error is equal to the significance level. a set of numbers greater than 10. If the test statistic falls within the region of rejection. The region of rejection would 67 . The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a range of numbers located located on the right side of sampling distribution. z-score. t-score. The P-value is the probability of observing a test statistic as extreme as S. statisticians describe these decision rules in two ways . etc. the null hypothesis is accepted. Interpret results. For example. where the region of rejection is on only one side of the sampling distribution. A Type II error occurs when the researcher accepts a null hypothesis that is false.) described in the analysis plan. If the test statistic falls within the region of acceptance. reject the null hypothesis. that is. Suppose the test statistic is equal to S. A Type I error occurs when the researcher rejects a null hypothesis when it is true. so examples in this tutorial favor the P-value approach. is called a one-tailed test.

Effect Size To compute the power of the test.e. Increasing sample size. the greater the power of the test. Effect size = True value . the greater the power of the test. Significance level (α).   Sample size (n). less likely to make a Type II error. Increasing significance level. The effect size is the difference between the true value and the value specified in the null hypothesis. one offers an alternative view about the "true" value of the population parameter.Hypothesized value For example. If you increase the significance level.  Test Your Understanding of This Lesson Problem 1 Other things being equal. which of the following actions will reduce the power of a hypothesis test? I. II. That is.68 -133- . Factors That Affect Power The power of a hypothesis test is affected by three factors. the greater the power of the test. the greater the sample size. AP Statistics Tutorial: Power of a Hypothesis Test The probability of not committing a Type II error is called the power of a hypothesis test. The "true" value of the parameter being tested. the power of the test is increased. i. As a result. A researcher might ask: What is the probability of rejecting the null hypothesis if the true population mean is equal to 90? In this example. (A) I only (B) II only (C) III only (D) All of the above (E) None of the above 68 . the region of rejection would consist partly of numbers that were less than 10 and partly of numbers that were greater than 10. Increasing beta. Other things being equal. suppose the null hypothesis states that a population mean is equal to 100. III.. This means you are less likely to accept the null hypothesis when it is false. the probability of a Type II error. The greater the difference between the "true" value of a parameter and the value specified in the null hypothesis. the higher the power of the test.100.consist of a range of numbers located located on both sides of sampling distribution. assuming that the null hypothesis is false. the effect size would be 90 . that is. The higher the significance level. which equals -10. the greater the effect size. you are more likely to reject the null hypothesis. you reduce the region of acceptance. Hence.

the power of a test will get smaller as beta gets bigger. the test method involves a test statistic and a sampling distribution. if one is true. the other must be false.  State the hypotheses. and vice versa. As part of the analysis. The hypotheses are stated in such a way that they are mutually exclusive. analyzes sample data according to the plan. perform computations called for in the analysis plan. power is equal to one minus beta. by definition. Increasing sample size makes the hypothesis test more sensitive . The analysis plan describes how to use sample data to accept or reject the null hypothesis. •  Analyze sample data.05. use either of the following equations to compute the test statistic. chi-square.Parameter) / (Standard error of statistic) where Parameter is the value appearing in the null hypothesis. Since. A General Procedure for Conducting Hypothesis Tests All hypothesis tests are conducted the same way. •  Significance level. Test statistic = (Statistic .more likely to reject the null hypothesis when it is. z-score.Solution The correct answer is (C).69 -133- . etc. difference between means. Test method. Often. Typically. When the parameter in the null hypothesis involves categorical data. Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. but any value between 0 and 1 can be used. The researcher states a hypothesis to be tested. difference between proportions. Given a test statistic and its sampling distribution. Previously. in fact. AP Statistics Tutorial: How to Test Hypotheses This lesson describes a general procedure that can be used to test statistical hypotheses. Using sample data. you may need to compute the standard deviation or standard error of the statistic. or 0. proportion. Computed from sample data. That is. and Statistic is the point estimate of Parameter. thus increasing the power of the test. the test statistic might be a mean score. It should specify the following elements. When the null hypothesis involves a mean or proportion. researchers choose significance levels equal to 0. you may use a chi-square 69 . false. formulates an analysis plan. Formulate an analysis plan. • Test statistic. t-score. the null hypothesis is rejected. which makes the hypothesis test more likely to reject the null hypothesis. and accepts or rejects the null hypothesis. we presented common formulas for the standard deviation and standard error.01. Increasing the significance level reduces the region of acceptance.Parameter) / (Standard deviation of statistic) Test statistic = (Statistic .10. 0. If the test statistic probability is less than the significance level. based on results of the analysis. a researcher can assess probabilities associated with the test statistic.

when the following conditions are met:    The sampling method is simple random sampling. The test statistic is a z-score (z) defined by the following equation. Instructions for computing a chi-square test statistic are presented in the lesson on the chi-square goodness of fit test. It should specify the following elements. given the null hypothesis. assuming the null hypotheis is true. (3) analyze sample data.01. If the sample findings are unlikely. Often. That is. researchers choose significance levels equal to 0. and (4) interpret results. 0. this involves comparing the P-value to the significance level. State the Hypotheses Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. Typically. and rejecting the null hypothesis when the P-value is less than the significance level. The hypotheses are stated in such a way that they are mutually exclusive.  Analyze Sample Data Using sample data. Test method.05. The P-value is the probability of observing a sample statistic as extreme as the test statistic.P ) / n ] where P is the hypothesized value of population proportion in the null hypothesis. Formulate an Analysis Plan The analysis plan describes how to use sample data to accept or reject the null hypothesis. σ = sqrt[ P * ( 1 .statistic as the test statistic. (Some texts say that 5 successes and 5 failures are enough. Compute the standard deviation (σ) of the sampling distribution. Use the one-sample z-test to determine whether the hypothesized population proportion differs significantly from the observed sample proportion. 70 . • P-value. but any value between 0 and 1 can be used.  Significance level. find the test statistic and its associated P-Value.10. and n is the sample size. the researcher rejects the null hypothesis. the other must be false. (2) formulate an analysis plan.  Standard deviation. and vice versa. The sample includes at least 10 successes and 10 failures.) The population size is at least 10 times as big as the sample size.  Interpret the results. if one is true. AP Statistics Tutorial: Hypothesis Test for a Proportion This lesson explains how to conduct a hypothesis test of a proportion. or 0.70 -133- . This approach consists of four steps: (1) state the hypotheses.  Test statistic.

  z = (p - P) / σ

  where P is the hypothesized value of population proportion in the null hypothesis, p is the sample proportion, and σ is the standard deviation of the sampling distribution.

• P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a z-score, use the Normal Distribution Calculator to assess the probability associated with the z-score. (See sample problems at the end of this lesson for examples of how this is done.)

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.

Problem 1: Two-Tailed Test

The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are very satisfied with the service they receive. To test this claim, the local newspaper surveyed 100 customers, using simple random sampling. Among the sampled customers, 73 percent say they are very satisfied. Based on these findings, can we reject the CEO's hypothesis that 80% of the customers are very satisfied? Use a 0.05 level of significance.

Solution: The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

• State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.

  Null hypothesis: P = 0.80
  Alternative hypothesis: P ≠ 0.80

  Note that these hypotheses constitute a two-tailed test. The null hypothesis will be rejected if the sample proportion is too big or if it is too small.

• Formulate an analysis plan. For this analysis, the significance level is 0.05. The test method, shown in the next section, is a one-sample z-test.

• Analyze sample data. Using sample data, we calculate the standard deviation (σ) and compute the z-score test statistic (z).

  σ = sqrt[ P * ( 1 - P ) / n ] = sqrt[ (0.8 * 0.2) / 100 ] = sqrt(0.0016) = 0.04
  z = (p - P) / σ = (0.73 - 0.80) / 0.04 = -1.75

  where P is the hypothesized value of population proportion in the null hypothesis, p is the sample proportion, and n is the sample size.

  Since we have a two-tailed test, the P-value is the probability that the z-score is less than -1.75 or greater than 1.75. We use the Normal Distribution Calculator to find P(z < -1.75) = 0.04, and P(z > 1.75) = 0.04. Thus, the P-value = 0.04 + 0.04 = 0.08.

• Interpret results. Since the P-value (0.08) is greater than the significance level (0.05), we cannot reject the null hypothesis.
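The worked example above can be checked with a few lines of code, using the same hypothesized proportion, sample proportion, and sample size.

```python
import math
from scipy.stats import norm

# Check of the worked example: H0: P = 0.80, n = 100, observed p = 0.73,
# two-tailed test at the 0.05 significance level.
P0, p, n = 0.80, 0.73, 100

sigma = math.sqrt(P0 * (1 - P0) / n)       # 0.04
z = (p - P0) / sigma                       # -1.75
p_value = 2 * norm.cdf(-abs(z))            # two-tailed P-value, about 0.08

print(round(z, 2), round(p_value, 3))
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```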

Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the sample included at least 10 successes and 10 failures, and the population size was at least 10 times the sample size.

AP Statistics Tutorial: Hypothesis Test for Difference Between Proportions

This lesson explains how to conduct a hypothesis test to determine whether the difference between two proportions is significant. The test procedure, called the two-proportion z-test, is appropriate when the following conditions are met:

• The sampling method for each population is simple random sampling.
• The samples are independent.
• Each sample includes at least 10 successes and 10 failures. (Some texts say that 5 successes and 5 failures are enough.)
• Each population is at least 10 times as big as its sample.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.

The table below shows three sets of hypotheses. Each makes a statement about the difference d between two population proportions, P1 and P2. (In the table, the symbol ≠ means "not equal to".)

  Set    Null hypothesis    Alternative hypothesis    Number of tails
  1      P1 - P2 = 0        P1 - P2 ≠ 0               2
  2      P1 - P2 > 0        P1 - P2 < 0               1
  3      P1 - P2 < 0        P1 - P2 > 0               1

The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.

When the null hypothesis states that there is no difference between the two population proportions (i.e., d = 0), the null and alternative hypothesis for a two-tailed test are often stated in the following form.

  H0: P1 = P2
  Ha: P1 ≠ P2

Formulate an Analysis Plan

The P-value is the probability of observing a sample statistic as extreme as the test statistic. n1 is the size of sample 1.  P-value.73 -133- . when the following conditions are met:   The sampling method is simple random sampling. z = (p1 .10. Use the two-proportion z-test (described in the next section) to determine whether the hypothesized difference between population proportions differs significantly from the observed sample difference.p ) * [ (1/n1) + (1/n2) ] } where p is the pooled sample proportion.05.  Significance level. p = (p1 * n1 + p2 * n2) / (n1 + n2) where p1 is the sample proportion from population 1. and SE is the standard error of the sampling distribution. use the Normal Distribution Calculator to assess the probability associated with the z-score.  Standard error. The test statistic is a z-score (z) defined by the following equation. Often. 73 . SE = sqrt{ p * ( 1 . 0. p2 is the proportion from sample 2. Test method. Since the null hypothesis states that P1=P2. and n2 is the size of sample 2. p2 is the sample proportion from population 2. n1 is the size of sample 1. researchers choose significance levels equal to 0. AP Statistics Tutorial: Hypothesis Test of the Mean This lesson explains how to conduct a hypothesis test of a mean.  Test statistic. but any value between 0 and 1 can be used.The analysis plan describes how to use sample data to accept or reject the null hypothesis.01. complete the following computations to find the test statistic and its associated P-Value.p2) / SE where p1 is the proportion from sample 1. Since the test statistic is a z-score.) The analysis described above is a two-proportion z-test. or 0. The sample is drawn from a normal or near-normal population. we use a pooled sample proportion (p) to compute the standard error of the sampling distribution. It should specify the following elements. (See sample problems at the end of this lesson for examples of how this is done.  Pooled sample proportion. Compute the standard error (SE) of the sampling distribution difference between two proportions. and n2 is the size of sample 2.  Analyze Sample Data Using sample data.

and vice versa. State the Hypotheses Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. Each makes a statement about how the population mean μ is related to a specified value M. since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis. test statistic. That is. and the sample size is between 16 and 40.Generally. Test method. (In the table. degrees of freedom. (3) analyze sample data. Formulate an Analysis Plan The analysis plan describes how to use sample data to accept or reject the null hypothesis. The sampling distribution is symmetric. and (4) interpret results. The sample size is greater than 40.) Se Null t hypothesis 1 2 3 μ=M μ>M μ<M Alternative hypothesis μ≠M μ<M μ>M Number of tails 2 1 1 The first set of hypotheses (Set 1) is an example of a two-tailed test. without outliers.  Significance level. It should specify the following elements. unimodal. (2) formulate an analysis plan.10. the other must be false. or 0. The hypotheses are stated in such a way that they are mutually exclusive. without outliers. 74 . if one is true. 0. but any value between 0 and 1 can be used. unimodal. since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis.05. the sampling distribution will be approximately normally distributed if any of the following conditions apply. researchers choose significance levels equal to 0. This approach consists of four steps: (1) state the hypotheses. without outliers.     The population distribution is normal.  Analyze Sample Data Using sample data. The table below shows three sets of hypotheses.74 -133- . and the sample size is 15 or less. This involves finding the standard error. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests. Use the one-sample t-test to determine whether the hypothesized mean differs significantly from the observed sample mean. conduct a one-sample t-test. The sampling distribution is moderately skewed. Often.01. and the P-value associated with the test statistic. the symbol ≠ means " not equal to ".
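The one-sample t-test outlined in this lesson can also be run directly in software. A minimal sketch with made-up data; the sample values and the hypothesized mean of 300 are purely illustrative, and the test assumes a simple random sample from a roughly normal population.

```python
from scipy.stats import ttest_1samp

# Hypothetical sample of n = 10 measurements; H0: μ = 300, two-tailed test
# at the 0.05 significance level.
sample = [295, 310, 302, 288, 305, 299, 291, 308, 296, 300]

t_stat, p_value = ttest_1samp(sample, popmean=300)
print(round(t_stat, 2), round(p_value, 3))
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```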

The samples are independent. The sample data are slightly skewed. the sampling distribution will be approximately normal if any of the following conditions apply. The test procedure. The test statistic is a t-score (t) defined by the following equation. unimodal. Note: If you use this approach on an exam. is appropriate when the following conditions are met:     The sampling method for each sample is simple random sampling.μ) / SE where x is the sample mean. Generally. the researcher rejects the null hypothesis. given the degrees of freedom computed above. Compute the standard error (SE) of the sampling distribution. When the population size is much larger (at least 10 times larger) than the sample size. you may also want to mention why this approach is appropriate. The sample data are symmetric. 75 . Each sample is drawn from a normal or near-normal population. Test statistic. t = (x . and SE is the standard error. AP Statistics Tutorial: Hypothesis Test for the Difference Between Two Means This lesson explains how to conduct a hypothesis test for the difference between two means. the standard error can be approximated by: SE = s / sqrt( n )  Degrees of freedom. unimodal.75 -133- . called the two-sample t-test. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Specifically. use the t Distribution Calculator to assess the probability associated with the t-score. μ is the hypothesized population mean in the null hypothesis. Standard error. without outliers. given the null hypothesis. Since the test statistic is a t-score. the approach is appropriate because the sampling method was simple random sampling. Each population is at least 10 times larger than its respective sample. N is the population size.) Interpret Results If the sample findings are unlikely.1 ) ] } where s is the standard deviation of the sample. (See sample problems at the end of this lesson for examples of how this is done. and the sample size is 16 to 40. • • • The population distribution is normal. DF = n .   P-value.n/N ) * [ N / ( N . this involves comparing the P-value to the significance level. SE = s * sqrt{ ( 1/n ) * ( 1 .1. without outliers. and n is the sample size. and the sample size is 15 or less. The degrees of freedom (DF) is equal to the sample size (n) minus one. and the population was normally distributed. Thus. and rejecting the null hypothesis when the P-value is less than the significance level. Typically.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. Each makes a statement about the difference d between the mean of one population μ1 and the mean of another population μ2. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa. The table below shows three sets of null and alternative hypotheses. (In the table, the symbol ≠ means "not equal to".)

Set   Null hypothesis   Alternative hypothesis   Number of tails
1     μ1 - μ2 = d       μ1 - μ2 ≠ d               2
2     μ1 - μ2 > d       μ1 - μ2 < d               1
3     μ1 - μ2 < d       μ1 - μ2 > d               1

The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.

When the null hypothesis states that there is no difference between the two population means (i.e., d = 0), the null and alternative hypothesis are often stated in the following form.

H0: μ1 = μ2
Ha: μ1 ≠ μ2

Formulate an Analysis Plan

The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.

Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.

Test method. Use the two-sample t-test to determine whether the difference between means found in the sample is significantly different from the hypothesized difference between means.

Analyze Sample Data

Using sample data, find the standard error, degrees of freedom, test statistic, and the P-value associated with the test statistic.

Standard error. Compute the standard error (SE) of the sampling distribution.

SE = sqrt[ (s1²/n1) + (s2²/n2) ]

where s1 is the standard deviation of sample 1, s2 is the standard deviation of sample 2, n1 is the size of sample 1, and n2 is the size of sample 2.

Degrees of freedom. The degrees of freedom (DF) is:

DF = (s1²/n1 + s2²/n2)² / { [ (s1²/n1)² / (n1 - 1) ] + [ (s2²/n2)² / (n2 - 1) ] }

If DF does not compute to an integer, round it off to the nearest whole number. (Some texts suggest that the degrees of freedom can be approximated by the smaller of n1 - 1 and n2 - 1, but the above formula gives better results.)

Test statistic. The test statistic is a t-score (t) defined by the following equation.

t = [ (x1 - x2) - d ] / SE

where x1 is the mean of sample 1, x2 is the mean of sample 2, d is the hypothesized difference between population means, and SE is the standard error.

P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t-score, use the t Distribution Calculator to assess the probability associated with the t-score, having the degrees of freedom computed above. (See sample problems at the end of this lesson for examples of how this is done.)

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
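As a rough illustration, the sketch below applies the lesson's formulas to two hypothetical samples: it computes the standard error of the difference, the (rounded) degrees of freedom, the t statistic, and a two-tailed P-value.

```python
# Two-sample t-test for the difference between means (lesson formulas).
# Sample values are hypothetical, for illustration only.
import math
from statistics import mean, stdev
from scipy.stats import t as t_dist

sample1 = [78, 85, 92, 88, 75, 81, 90, 84]   # hypothetical group 1
sample2 = [72, 79, 83, 77, 80, 74, 76, 82]   # hypothetical group 2
d_hyp = 0                                    # hypothesized difference (H0: mu1 - mu2 = 0)

n1, n2 = len(sample1), len(sample2)
x1, x2 = mean(sample1), mean(sample2)
s1, s2 = stdev(sample1), stdev(sample2)

se = math.sqrt(s1**2 / n1 + s2**2 / n2)      # standard error of the difference

# Degrees of freedom (the formula above), rounded to the nearest whole number
df = (s1**2 / n1 + s2**2 / n2) ** 2 / (
    (s1**2 / n1) ** 2 / (n1 - 1) + (s2**2 / n2) ** 2 / (n2 - 1)
)
df = round(df)

t_stat = ((x1 - x2) - d_hyp) / se
p_value = 2 * t_dist.sf(abs(t_stat), df)     # two-tailed P-value

print(f"t = {t_stat:.3f}, df = {df}, P-value = {p_value:.4f}")
```

scipy.stats.ttest_ind with equal_var=False performs essentially the same calculation and can be used as a check.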

AP Statistics Tutorial: Hypothesis Test for Difference Between Matched Pairs

This lesson explains how to conduct a hypothesis test for the difference between paired means. The test procedure, called the matched-pairs t-test, is appropriate when the following conditions are met:

• The sampling method for each sample is simple random sampling.
• The test is conducted on paired data. (As a result, the data sets are not independent.)
• Each sample is drawn from a normal or near-normal population. Generally, the sampling distribution will be approximately normal if any of the following conditions apply.
  - The population distribution is normal.
  - The sample data are symmetric, unimodal, without outliers, and the sample size is 15 or less.
  - The sample data are slightly skewed, unimodal, without outliers, and the sample size is 16 to 40.
  - The sample size is greater than 40, without outliers.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses concern a new variable d, which is based on the difference between paired values from two data sets.

d = x1 - x2

where x1 is the value of variable x in the first data set, and x2 is the value of the variable from the second data set that is paired with x1.

Each hypothesis makes a statement about how the true difference in population values μd is related to some hypothesized value D. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa. The table below shows three sets of null and alternative hypotheses. (In the table, the symbol ≠ means "not equal to".)

Set   Null hypothesis   Alternative hypothesis   Number of tails
1     μd = D            μd ≠ D                    2
2     μd > D            μd < D                    1
3     μd < D            μd > D                    1

The first set of hypotheses (Set 1) is an example of a two-tailed test, since an extreme value on either side of the sampling distribution would cause a researcher to reject the null hypothesis. The other two sets of hypotheses (Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of the sampling distribution would cause a researcher to reject the null hypothesis.

Formulate an Analysis Plan

The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.

Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.

Test method. Use the matched-pairs t-test to determine whether the difference between sample means for paired data is significantly different from the hypothesized difference between population means.

Analyze Sample Data

Using sample data, find the standard deviation, standard error, degrees of freedom, test statistic, and the P-value associated with the test statistic.

Standard deviation. Compute the standard deviation (sd) of the differences computed from n matched pairs.

sd = sqrt [ Σ(di - d)² / (n - 1) ]

where di is the difference for pair i, d is the sample mean of the differences, and n is the number of paired values.

Standard error. Compute the standard error (SE) of the sampling distribution of d.

SE = sd * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }

where sd is the standard deviation of the sample difference, N is the population size, and n is the sample size. When the population size is much larger (at least 10 times larger) than the sample size, the standard error can be approximated by:

SE = sd / sqrt( n )

Degrees of freedom. The degrees of freedom (DF) is: DF = n - 1.

Test statistic. The test statistic is a t-score (t) defined by the following equation.

t = [ (x1 - x2) - D ] / SE = (d - D) / SE

where x1 is the mean of sample 1, x2 is the mean of sample 2, D is the hypothesized difference between population means, d is the mean difference between paired values in the sample, and SE is the standard error.

P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t-score, use the t Distribution Calculator to assess the probability associated with the t-score, having the degrees of freedom computed above. (See the sample problem at the end of this lesson for guidance on how this is done.)

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
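The sketch below works through the matched-pairs procedure on hypothetical before/after measurements: it forms the paired differences, then applies the standard deviation, standard error, and t-statistic formulas above.

```python
# Matched-pairs t-test on the differences d = x1 - x2 (lesson formulas).
# The paired values below are hypothetical, for illustration only.
import math
from statistics import mean, stdev
from scipy.stats import t as t_dist

before = [210, 195, 188, 220, 205, 199, 214, 208]   # hypothetical first measurements
after  = [202, 190, 185, 212, 199, 198, 210, 205]   # hypothetical paired measurements
D_hyp = 0                                           # hypothesized mean difference

diffs = [x1 - x2 for x1, x2 in zip(before, after)]
n = len(diffs)
d_bar = mean(diffs)            # mean of the paired differences
s_d = stdev(diffs)             # standard deviation of the differences
se = s_d / math.sqrt(n)        # standard error (population much larger than sample)
df = n - 1

t_stat = (d_bar - D_hyp) / se
p_value = 2 * t_dist.sf(abs(t_stat), df)   # two-tailed P-value

print(f"mean difference = {d_bar:.2f}, t = {t_stat:.3f}, df = {df}, P = {p_value:.4f}")
```

scipy.stats.ttest_rel(before, after) gives the same t statistic and two-tailed P-value.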

AP Statistics Tutorial: Chi-Square Goodness-of-Fit Test

This lesson explains how to conduct a chi-square goodness of fit test. The test is applied when you have one categorical variable from a single population. It is used to determine whether sample data are consistent with a hypothesized distribution.

For example, suppose a company printed baseball cards. It claimed that 30% of its cards were rookies, 60% veterans, and 10% All-Stars. We could gather a random sample of baseball cards and use a chi-square goodness of fit test to see whether our sample distribution differed significantly from the distribution claimed by the company. The sample problem at the end of the lesson considers this example.

The test procedure described in this lesson is appropriate when the following conditions are met:

• The sampling method is simple random sampling.
• The population is at least 10 times as large as the sample.
• The variable under study is categorical.
• The expected value for each level of the variable is at least 5.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa. For a chi-square goodness of fit test, the null hypothesis specifies the proportion of observations at each level of the categorical variable. The alternative hypothesis is that at least one of the specified proportions is not true. Typically, the hypotheses take the following form.

H0: The data are consistent with a specified distribution.
Ha: The data are not consistent with a specified distribution.

Formulate an Analysis Plan

The analysis plan describes how to use sample data to accept or reject the null hypothesis. The plan should specify the following elements.

Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.

Test method. Use the chi-square goodness of fit test to determine whether observed sample frequencies differ significantly from expected frequencies specified in the null hypothesis. The chi-square goodness of fit test is described in the next section, and demonstrated in the sample problem at the end of this lesson.

Analyze Sample Data

Using sample data, find the degrees of freedom, expected frequency counts, test statistic, and the P-value associated with the test statistic.

Degrees of freedom. The degrees of freedom (DF) is equal to the number of levels (k) of the categorical variable minus one: DF = k - 1.

Expected frequency counts. The expected frequency counts at each level of the categorical variable are equal to the sample size times the hypothesized proportion from the null hypothesis:

Ei = n * pi

where Ei is the expected frequency count for the ith level of the categorical variable, n is the total sample size, and pi is the hypothesized proportion of observations in level i.

Test statistic. The test statistic is a chi-square random variable (Χ²) defined by the following equation.

Χ² = Σ [ (Oi - Ei)² / Ei ]

where Oi is the observed frequency count for the ith level of the categorical variable, and Ei is the expected frequency count for the ith level of the categorical variable.

P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.

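To make the goodness-of-fit procedure concrete, the sketch below uses the claimed proportions from the baseball-card example (30% rookies, 60% veterans, 10% All-Stars) together with hypothetical observed counts; the counts are invented for illustration, not taken from the sample problem.

```python
# Chi-square goodness-of-fit test for the baseball-card example.
# Observed counts are hypothetical; the claimed proportions come from the lesson.
from scipy.stats import chi2

observed = {"rookie": 50, "veteran": 45, "all-star": 5}        # hypothetical sample of 100 cards
claimed = {"rookie": 0.30, "veteran": 0.60, "all-star": 0.10}  # null-hypothesis proportions

n = sum(observed.values())
expected = {level: n * p for level, p in claimed.items()}      # E_i = n * p_i

chi_sq = sum((observed[lv] - expected[lv]) ** 2 / expected[lv] for lv in observed)
df = len(observed) - 1                                         # DF = k - 1
p_value = chi2.sf(chi_sq, df)                                  # upper-tail probability

print(f"chi-square = {chi_sq:.2f}, df = {df}, P-value = {p_value:.4f}")
# Reject H0 if p_value is below the chosen significance level.
```

scipy.stats.chisquare(f_obs, f_exp) wraps the same computation in a single call.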
AP Statistics Tutorial: Chi-Square Test for Homogeneity

This lesson explains how to conduct a chi-square test of homogeneity. The test is applied to a single categorical variable from two different populations. It is used to determine whether frequency counts are distributed identically across different populations.

For example, in a survey of TV viewing preferences, we might ask respondents to identify their favorite program. We might ask the same question of two different populations, such as males and females. We could use a chi-square test for homogeneity to determine whether male viewing preferences differed significantly from female viewing preferences. The sample problem at the end of the lesson considers this example.

The test procedure described in this lesson is appropriate when the following conditions are met:

• For each population, the sampling method is simple random sampling.
• Each population is at least 10 times as large as its respective sample.
• The variable under study is categorical.
• If sample data are displayed in a contingency table (Populations x Category levels), the expected frequency count for each cell of the table is at least 5.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.

Suppose that data were sampled from r populations, and assume that the categorical variable had c levels. At any specified level of the categorical variable, the null hypothesis states that each population has the same proportion of observations. Thus, H0: Plevel 1 of population 1 = Plevel 1 of population 2 = . . . = Plevel 1 of population r H0: Plevel 2 of population 1 = Plevel 2 of population 2 = . . . = Plevel 2 of population r ... H0: Plevel c of population 1 = Plevel c of population 2 = . . . = Plevel c of population r The alternative hypothesis (Ha) is that at least one of the null hypothesis statements is false. Formulate an Analysis Plan The analysis plan describes how to use sample data to accept or reject the null hypothesis. The plan should specify the following elements.

Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used. Test method. Use the chi-square test for homogeneity to determine whether observed sample frequencies differ significantly from expected frequencies specified in the null hypothesis. The chi-square test for homogeneity is described in the next section.

Analyze Sample Data Using sample data from the contingency tables, find the degrees of freedom, expected frequency counts, test statistic, and the P-value associated with the test statistic. The analysis described in this section is illustrated in the sample problem at the end of this lesson.

Degrees of freedom. The degrees of freedom (DF) is equal to: DF = (r - 1) * (c - 1) where r is the number of populations, and c is the number of levels for the categorical variable.

Expected frequency counts. The expected frequency counts are computed separately for each population at each level of the categorical variable, according to the following formula. Er,c = (nr * nc) / n where Er,c is the expected frequency count for population r at level c of the categorical variable, nr is the total number of observations from population r, nc is the total number of observations at treatment level c, and n is the total sample size.
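As a quick illustration of this formula, the sketch below computes the expected counts for a small hypothetical 2 x 3 table of populations by category levels.

```python
# Computing expected counts E[r][c] = (row total * column total) / grand total
# for a hypothetical 2 x 3 contingency table (populations x category levels).
table = [
    [50, 30, 20],   # hypothetical counts for population 1
    [30, 45, 25],   # hypothetical counts for population 2
]

row_totals = [sum(row) for row in table]          # n_r
col_totals = [sum(col) for col in zip(*table)]    # n_c
grand_total = sum(row_totals)                     # n

expected = [[r_tot * c_tot / grand_total for c_tot in col_totals]
            for r_tot in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```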

Test statistic. The test statistic is a chi-square random variable (Χ²) defined by the following equation.

Χ² = Σ [ (Or,c - Er,c)² / Er,c ]

where Or,c is the observed frequency count in population r for level c of the categorical variable, and Er,c is the expected frequency count in population r for level c of the categorical variable.

P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
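The sketch below runs the full homogeneity test on a hypothetical populations-by-levels table, combining the expected-count, degrees-of-freedom, and test-statistic formulas above.

```python
# Chi-square test for homogeneity on a hypothetical populations-by-levels table.
from scipy.stats import chi2

table = [
    [60, 40, 25],   # hypothetical counts, population 1 (e.g., male viewers)
    [40, 55, 30],   # hypothetical counts, population 2 (e.g., female viewers)
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi_sq = 0.0
for r, row in enumerate(table):
    for c, observed in enumerate(row):
        expected = row_totals[r] * col_totals[c] / n   # E[r,c] = (n_r * n_c) / n
        chi_sq += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)            # DF = (r - 1)(c - 1)
p_value = chi2.sf(chi_sq, df)

print(f"chi-square = {chi_sq:.2f}, df = {df}, P-value = {p_value:.4f}")
```

scipy.stats.chi2_contingency(table) returns the same statistic, P-value, degrees of freedom, and expected counts in one call.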
AP Statistics Tutorial: Chi-Square Test for Independence

This lesson explains how to conduct a chi-square test for independence. The test is applied when you have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables.

For example, in an election survey, voters might be classified by gender (male or female) and voting preference (Democrat, Republican, or Independent). We could use a chi-square test for independence to determine whether gender is related to voting preference. The sample problem at the end of the lesson considers this example.

The test procedure described in this lesson is appropriate when the following conditions are met:

• The sampling method is simple random sampling.
• Each population is at least 10 times as large as its respective sample.
• The variables under study are each categorical.
• If sample data are displayed in a contingency table, the expected frequency count for each cell of the table is at least 5.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. State the Hypotheses Suppose that Variable A has r levels, and Variable B has c levels. The null hypothesis states that knowing the level of Variable A does not help you predict the level of Variable B. That is, the variables are independent. H0: Variable A and Variable B are independent. Ha: Variable A and Variable B are not independent. The alternative hypothesis is that knowing the level of Variable A can help you predict the level of Variable B. Note: Support for the alternative hypothesis suggests that the variables are related; but the relationship is not necessarily causal, in the sense that one variable "causes" the other. Formulate an Analysis Plan

The analysis plan describes how to use sample data to accept or reject the null hypothesis. The plan should specify the following elements.

Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used. Test method. Use the chi-square test for independence to determine whether there is a significant relationship between two categorical variables.

Analyze Sample Data Using sample data, find the degrees of freedom, expected frequencies, test statistic, and the P-value associated with the test statistic. The approach described in this section is illustrated in the sample problem at the end of this lesson.

Degrees of freedom. The degrees of freedom (DF) is equal to: DF = (r - 1) * (c - 1) where r is the number of levels for one categorical variable, and c is the number of levels for the other categorical variable.

Expected frequencies. The expected frequency counts are computed separately for each level of one categorical variable at each level of the other categorical variable. Compute r * c expected frequencies, according to the following formula. Er,c = (nr * nc) / n where Er,c is the expected frequency count for level r of Variable A and level c of Variable B, nr is the total number of sample observations at level r of Variable A, nc is the total number of sample observations at level c of Variable B, and n is the total sample size.

Test statistic. The test statistic is a chi-square random variable (Χ²) defined by the following equation. Χ² = Σ [ (Or,c - Er,c)² / Er,c ] where Or,c is the observed frequency count at level r of Variable A and level c of Variable B, and Er,c is the expected frequency count at level r of Variable A and level c of Variable B.

P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
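Because the arithmetic is identical to the homogeneity test, the independence test is usually run with a single library call. The sketch below uses SciPy's contingency-table helper on a hypothetical gender-by-voting-preference table; the counts are invented for illustration.

```python
# Chi-square test for independence using SciPy's contingency-table helper.
# The table below is a hypothetical gender-by-voting-preference cross-tabulation.
from scipy.stats import chi2_contingency

observed = [
    [200, 150, 50],   # hypothetical counts: male x (Democrat, Republican, Independent)
    [250, 300, 50],   # hypothetical counts: female x (Democrat, Republican, Independent)
]

chi_sq, p_value, df, expected = chi2_contingency(observed)

print(f"chi-square = {chi_sq:.2f}, df = {df}, P-value = {p_value:.4f}")
# 'expected' holds the E[r,c] = (n_r * n_c) / n counts used in the statistic.
```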

AP Statistics Tutorial: Hypothesis Test for Slope of Regression Line

This lesson describes how to conduct a hypothesis test to determine whether there is a significant linear relationship between an independent variable X and a dependent variable Y. The test focuses on the slope of the regression line

Y = Β0 + Β1X

where Β0 is a constant, Β1 is the slope (also called the regression coefficient), X is the value of the independent variable, and Y is the value of the dependent variable.

Test Requirements

The approach described in this lesson is valid whenever the standard requirements for simple linear regression are met.
• The dependent variable Y has a linear relationship to the independent variable X.
• For each value of X, the probability distribution of Y has the same standard deviation σ.
• For any given value of X,
  - The Y values are independent.
  - The Y values are roughly normally distributed (i.e., symmetric and unimodal). A little skewness is OK if the sample size is large.

Previously, we described how to verify that regression requirements are met. The test procedure consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. State the Hypotheses If there is a significant linear relationship between the independent variable X and the dependent variable Y, the slope will not equal zero. H0: Β1 = 0 Ha: Β1 ≠ 0 The null hypothesis states that the slope is equal to zero, and the alternative hypothesis states that the slope is not equal to zero. Formulate an Analysis Plan The analysis plan describes how to use sample data to accept or reject the null hypothesis. The plan should specify the following elements.

Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used. Test method. Use a linear regression t-test (described in the next section) to determine whether the slope of the regression line differs significantly from zero.

Analyze Sample Data

Using sample data, find the standard error of the slope, the slope of the regression line, the degrees of freedom, the test statistic, and the P-value associated with the test statistic. The approach described in this section is illustrated in the sample problem at the end of this lesson.

Standard error. Many statistical software packages and some graphing calculators provide the standard error of the slope as a regression analysis output. The table below shows hypothetical output for the following regression equation: y = 76 + 35x.

Predictor   Coef   SE Coef   T      P
Constant    76     30        2.53   0.01
X           35     20        1.75   0.04

In the output above, the standard error of the slope is the "SE Coef" entry for X, which is equal to 20. However, other software packages might use a different label for the standard error. It might be "StDev", "SE", "Std Dev", or something else. If you need to calculate the standard error of the slope (SE) by hand, use the following formula:

SE = sb1 = sqrt [ Σ(yi - ŷi)² / (n - 2) ] / sqrt [ Σ(xi - x)² ]

where yi is the value of the dependent variable for observation i, ŷi is the estimated value of the dependent variable for observation i, xi is the observed value of the independent variable for observation i, x is the mean of the independent variable, and n is the number of observations.

Slope. Like the standard error, the slope of the regression line will be provided by most statistics software packages. In the hypothetical output above, the slope is equal to 35.

Degrees of freedom. The degrees of freedom (DF) is equal to: DF = n - 2, where n is the number of observations in the sample.

Test statistic. The test statistic is a t-score (t) defined by the following equation.

t = b1 / SE

where b1 is the slope of the sample regression line, and SE is the standard error of the slope.

P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a t-score, use the t Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.

Note: If you use this approach on an exam, you may also want to mention that this approach is only appropriate when the standard requirements for simple linear regression are satisfied.
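The sketch below carries out the slope test on hypothetical (x, y) data, following the lesson's formulas for the slope, the standard error of the slope, DF = n - 2, and the t statistic.

```python
# t-test for the slope of a least-squares regression line (lesson formulas).
# The (x, y) pairs below are hypothetical, for illustration only.
import math
from scipy.stats import t as t_dist

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]      # hypothetical independent variable
y = [2.1, 4.3, 5.9, 8.2, 9.8, 12.4, 13.9, 16.2]   # hypothetical dependent variable
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates of the slope (b1) and intercept (b0)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the slope: sqrt[ sum of squared residuals / (n - 2) ] / sqrt(Sxx)
ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(ss_resid / (n - 2)) / math.sqrt(sxx)

df = n - 2
t_stat = b1 / se_slope                      # tests H0: B1 = 0
p_value = 2 * t_dist.sf(abs(t_stat), df)    # two-tailed P-value

print(f"slope = {b1:.3f}, SE = {se_slope:.3f}, t = {t_stat:.2f}, P-value = {p_value:.4f}")
```

scipy.stats.linregress(x, y) reports the same slope, standard error, and two-sided P-value.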