Statistics Chapters

Azad Balochistan________________________________________________________________Statistics
Statistics: Notes
Chapter 1
- Definitions
- Notes
- Generating Random Numbers on the TI-82
- Sampling Lab designed to expose the student to each of the five types of sampling
Chapter 2
- Definitions
- Creating Grouped Frequency Distributions
- Introduction to Statistics and Lists on the TI-82
- Creating Histograms, Box Plots, and Grouped Frequency Distributions on the TI-82
- Creating an Ogive on the TI-82
- Creating Pie Charts on the TI-82 using the PIE program
Chapter 3
- Definitions
- Measures of Central Tendency
- Measures of Variation
- Measures of Position
Chapter 4
- Definitions
- Introduction to Probability
- Addition and Multiplication Rules
- Conditional Probability
Chapter 5
- Definitions
- Probability Distributions
- Binomial Probabilities
- Other Distributions: Multinomial, Poisson, HyperGeometric
Chapter 6
- Definitions
- Introduction to Normal Probabilities
- Table - Standard Normal Probabilities
- Central Limit Theorem
- Approximating the Binomial with the Normal
Chapter 7
- Definitions
- Introduction to Estimation
- Estimating the Population Mean
- Table - Student's T Probabilities
- Estimating the Population Proportion
- Sample Size Determination
Chapter 8
- Definitions
- Introduction to Hypothesis Testing
- Determining the type of test
- Using confidence intervals to do hypothesis testing
- Steps to Hypothesis Testing
- Testing of Means
- Hypothesis test example: Does pi = 3.2?
- Testing of Proportions
- P-values
Chapter 1
Statistics: Introduction
Definitions
Statistics
Collection of methods for planning experiments, obtaining data, and then organizing,
summarizing, presenting, analyzing, interpreting, and drawing conclusions.
Variable
Characteristic or attribute that can assume different values
Random Variable
A variable whose values are determined by chance.
Population
All subjects possessing a common characteristic that is being studied.
Sample
A subgroup or subset of the population.
Parameter
Characteristic or measure obtained from a population.
Statistic (not to be confused with Statistics)
Characteristic or measure obtained from a sample.
Descriptive Statistics
Collection, organization, summarization, and presentation of data.
Inferential Statistics
Generalizing from samples to populations using probabilities. Performing hypothesis
testing, determining relationships between variables, and making predictions.
Qualitative Variables
Variables which assume non-numerical values.
Quantitative Variables
Variables which assume numerical values.
Discrete Variables
Variables which assume a finite or countable number of possible values. Usually
obtained by counting.
Continuous Variables
Variables which assume an infinite number of possible values. Usually obtained by
measurement.
Nominal Level
Level of measurement which classifies data into mutually exclusive, all inclusive
categories in which no order or ranking can be imposed on the data.
Ordinal Level
Level of measurement which classifies data into categories that can be ranked.
Precise differences between the ranks do not exist.
Interval Level
Level of measurement which classifies data that can be ranked and differences are
meaningful. However, there is no meaningful zero, so ratios are meaningless.
Ratio Level
Level of measurement which classifies data that can be ranked, differences are
meaningful, and there is a true zero. True ratios exist between the different units of
measure.
Random Sampling
Sampling in which the data is collected using chance methods or random numbers.
Systematic Sampling
Sampling in which data is obtained by selecting every kth object.
Convenience Sampling
Sampling in which readily available data is used.
Stratified Sampling
Sampling in which the population is divided into groups (called strata) according to some
characteristic. Each of these strata is then sampled using one of the other sampling
techniques.
Cluster Sampling
Sampling in which the population is divided into groups (usually geographically). Some
of these groups are randomly selected, and then all of the elements in those groups are
selected.
Statistics: Introduction
Population vs Sample
The population includes all objects of interest whereas the sample is only a portion of the
population. Parameters are associated with populations and statistics with samples. Parameters
are usually denoted using Greek letters (mu, sigma) while statistics are usually denoted using
Roman letters (x, s).
There are several reasons why we don't work with populations. They are usually large, and it is
often impossible to get data for every object we're studying. Sampling does not usually occur
without cost, and the more items surveyed, the larger the cost.
We compute statistics, and use them to estimate parameters. The computation is the first part of
the statistics course (Descriptive Statistics) and the estimation is the second part (Inferential
Statistics).
Discrete vs Continuous
Discrete variables are usually obtained by counting. There are a finite or countable number of
choices available with discrete data. You can't have 2.63 people in the room.
Continuous variables are usually obtained by measuring. Length, weight, and time are all
examples of continuous variables. Since continuous variables are real numbers, we usually round
them. This implies a boundary depending on the number of decimal places. For example: 64 is
really anything 63.5 <= x < 64.5. Likewise, if there are two decimal places, then 64.03 is really
anything 64.025 <= x < 64.035. Boundaries always have one more decimal place than the data
and end in a 5.
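The boundary rule above can be sketched in Python. This is only an illustration: the helper name `boundaries` is made up for these notes, and the value is passed as text so that the number of decimal places is known.

```python
from decimal import Decimal

def boundaries(value: str):
    """Return the (lower, upper) boundaries implied by a rounded value.

    The value is given as a string so trailing decimal places are preserved.
    """
    d = Decimal(value)
    # Number of decimal places in the recorded value: 0 for "64", 2 for "64.03".
    places = -d.as_tuple().exponent
    # Half of one unit in the last recorded place: 0.5, 0.005, ...
    half_unit = Decimal(5).scaleb(-(places + 1))
    return d - half_unit, d + half_unit

print(boundaries("64"))     # boundaries 63.5 and 64.5
print(boundaries("64.03"))  # boundaries 64.025 and 64.035
```

Notice that each boundary has one more decimal place than the value and ends in a 5, as stated above.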
Levels of Measurement
There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio. These go from
lowest level to highest level. Data is classified according to the highest level which it fits. Each
additional level adds something the previous level didn't have.
- Nominal is the lowest level. Only names are meaningful here.
- Ordinal adds an order to the names.
- Interval adds meaningful differences.
- Ratio adds a true zero so that ratios are meaningful.
Types of Sampling
There are five types of sampling: Random, Systematic, Convenience, Cluster, and Stratified.
- Random sampling is analogous to putting everyone's name into a hat and drawing out
several names. Each element in the population has an equal chance of being selected. While
this is the preferred way of sampling, it is often difficult to do. It requires that a complete
list of every element in the population be obtained. Computer generated lists are often
used with random sampling. You can generate random numbers using the TI-82
calculator.
- Systematic sampling is easier to do than random sampling. In systematic sampling, the
list of elements is "counted off". That is, every kth element is taken. This is similar to
lining everyone up and numbering off "1,2,3,4; 1,2,3,4; etc". When done numbering, all
people numbered 4 would be used.
- Convenience sampling is very easy to do, but it's probably the worst technique to use. In
convenience sampling, readily available data is used. That is, the first people the surveyor
runs into.
- Cluster sampling is accomplished by dividing the population into groups -- usually
geographically. These groups are called clusters or blocks. Some of the clusters are randomly
selected, and every element in the selected clusters is used.
- Stratified sampling also divides the population into groups called strata. However, this
time it is by some characteristic, not geographically. For instance, the population might
be separated into males and females. A sample is taken from each of these strata using
either random, systematic, or convenience sampling.
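The five types of sampling can be sketched in Python as follows. This is a minimal illustration, not part of the course materials: the function names, the numbered population, and the use of "odd or even" as the stratifying characteristic are all assumptions made for the example. The systematic version uses a random starting point, as the Sampling Lab does.

```python
import random

def simple_random(population, n, rng):
    """Random sampling: every element has the same chance of being chosen."""
    return rng.sample(population, n)

def systematic(population, k, rng):
    """Systematic sampling: random start in the first k, then every kth element."""
    start = rng.randrange(k)
    return population[start::k]

def convenience(population, n):
    """Convenience sampling: just take whatever comes first."""
    return population[:n]

def cluster(clusters, n_clusters, rng):
    """Cluster sampling: choose whole groups at random, keep every element in them."""
    chosen = rng.sample(clusters, n_clusters)
    return [item for group in chosen for item in group]

def stratified(population, key, n_per_stratum, rng):
    """Stratified sampling: group by a characteristic, then sample within each group."""
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

rng = random.Random(82)                 # fixed seed so the run can be repeated
people = list(range(1, 25))             # hypothetical population numbered 1-24
blocks = [people[i:i + 4] for i in range(0, len(people), 4)]  # six clusters of four

print(simple_random(people, 8, rng))
print(systematic(people, 6, rng))
print(convenience(people, 8))
print(cluster(blocks, 2, rng))
print(stratified(people, lambda x: x % 2, 3, rng))  # strata: odd vs. even numbers
```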
TI-82: Generating Random Numbers
You can generate random integers on the TI-82 calculator using the following sequence.
N is the number of different values that could occur and S is the smallest value.
   int (N*rand+S)
INT is found under the MATH menu (math num 4). RAND is also found under the
MATH menu (math prb 1).
   Simulate the rolling of a die (1-6): int (6*rand+1)
   Simulate the flipping of a coin (0-1): int (2*rand)
This works because the rand function returns a random number between 0 and 1
(including 0 but not including 1). When it is multiplied by N, it lies between 0 and
N, and after S is added it lies between S and S+N (including S but not S+N). Taking
the integer part then leaves one of the N whole numbers S, S+1, ..., S+N-1.
If you need random integers between two values A and B (inclusive), you can
generate them using the following formulas.
   N=B-A+1
   int (N*rand+A)
Notice it is B-A+1, not B-A. Everyone agrees there are 10 whole numbers from 1 to 10
(inclusive). But if you take 10-1, you get 9, not 10. Also, in the formula above, replace
the N by the actual number of different values.
Since the calculator remembers the last formula entered and evaluates it again when you
press ENTER, just press ENTER again to generate more random numbers. Each time you
press ENTER, you will get another random number.
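The same idea can be tried in Python, where random.random() plays the role of rand (it also returns a number from 0 up to, but not including, 1). The function names below are made up for this sketch.

```python
import math
import random

def ti_rand_int(n, s, rng=random):
    """Mimic int(N*rand+S): one of n consecutive whole numbers starting at s."""
    return math.floor(n * rng.random() + s)

def rand_between(a, b, rng=random):
    """Random whole number from a through b inclusive, using N = B - A + 1."""
    n = b - a + 1
    return ti_rand_int(n, a, rng)

print(ti_rand_int(6, 1))    # roll of a die: 1 through 6
print(ti_rand_int(2, 0))    # flip of a coin: 0 or 1
print(rand_between(1, 10))  # whole number from 1 through 10
```

Calling a function again gives another random number, just like pressing ENTER again on the calculator.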
Sampling Lab
The purpose of this laboratory exercise is to familiarize yourself with the different sampling
techniques.
You need one page from a movie listing (such as the one in TV Guide®). Note that if you
actually use TV Guide®, you need to use two facing pages. Pick a page with little extraneous
material, other than the listings, on it.
For the purposes of this sampling project, a movie is included on the page or in a cluster if the
running time for the movie falls on the page.
Random Sampling
Number each movie on the page. If there are a lot of movies, you may wish to number every
other or every third movie.
Generate a random sample of 8 numbers between 1 and the number of movies on the page.
Write down the # generated and the running time for the movie corresponding to that number.
Systematic Sampling
Generate a random number between 1 and 6. Beginning with the movie corresponding to that
number, and then taking every 6th movie thereafter, write the # of the movie and the running
length of the movie.
Convenience Sampling
Write down the running time of the first eight movies.
Stratified Sampling
On a separate piece of paper, write down the running times of all PG/PG13, R, and not-rated
(either NR or no rating given) movies in three columns -- ignore all other types (NC17, G, etc).
Split a sample of 8 proportionally to each type of movie (if R is 40%, then sample 40% of 8 =
3.2 -> 3 R movies). Use random sampling within each movie type. Record the running lengths of
the movies selected.
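The proportional split can be sketched in Python. The movie counts below are invented for illustration, and `proportional_allocation` is a hypothetical helper name. Be aware that the rounded counts do not always add up to the sample size, and that Python's round() sends an exact .5 to the nearest even number.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across the strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in stratum_sizes.items()
    }

# Hypothetical counts of movies on the page, by rating group (R is 40% of 20).
counts = {"PG/PG13": 9, "R": 8, "NR": 3}
allocation = proportional_allocation(counts, 8)
print(allocation)  # R gets 40% of 8 = 3.2, which rounds to 3
```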
Cluster Sampling
Divide the page into equal regions (clusters) so that each cluster has roughly 3 - 4 movies.
Randomly select 3 clusters, and record the running length of all movies in those clusters.
Chapter 2
Statistics: Frequency Distributions & Graphs
Definitions
Raw Data
Data collected in original form.
Frequency
The number of times a certain value or class of values occurs.
Frequency Distribution
The organization of raw data in table form with classes and frequencies.
Categorical Frequency Distribution
A frequency distribution in which the data is only nominal or ordinal.
Ungrouped Frequency Distribution
A frequency distribution of numerical data. The raw data is not grouped.
Grouped Frequency Distribution
A frequency distribution where several numbers are grouped into one class.
Class Limits
Separate one class in a grouped frequency distribution from another. The limits could
actually appear in the data and have gaps between the upper limit of one class and the
lower limit of the next.
Class Boundaries
Separate one class in a grouped frequency distribution from another. The boundaries have
one more decimal place than the raw data and therefore do not appear in the data. There
is no gap between the upper boundary of one class and the lower boundary of the next
class. The lower class boundary is found by subtracting 0.5 units from the lower class
limit and the upper class boundary is found by adding 0.5 units to the upper class limit.
Class Width
The difference between the upper and lower boundaries of any class. The class width is
also the difference between the lower limits of two consecutive classes or the upper limits
of two consecutive classes. It is not the difference between the upper and lower limits of
the same class.
Class Mark (Midpoint)
The number in the middle of the class. It is found by adding the upper and lower limits
and dividing by two. It can also be found by adding the upper and lower boundaries and
dividing by two.
Cumulative Frequency
The number of values less than the upper class boundary for the current class. This is a
running total of the frequencies.
Relative Frequency
The frequency divided by the total frequency. This gives the percent of values falling in
that class.
Cumulative Relative Frequency (Relative Cumulative Frequency)
The running total of the relative frequencies or the cumulative frequency divided by the
total frequency. Gives the percent of the values which are less than the upper class
boundary.
Histogram
A graph which displays the data by using vertical bars of various heights to represent
frequencies. The horizontal axis can be either the class boundaries, the class marks, or the
class limits.
Frequency Polygon
A line graph. The frequency is placed along the vertical axis and the class midpoints are
placed along the horizontal axis. These points are connected with lines.
Ogive
A frequency polygon of the cumulative frequency or the relative cumulative frequency.
The vertical axis is the cumulative frequency or relative cumulative frequency. The
horizontal axis is the class boundaries. The graph always starts at zero at the lowest class
boundary and will end up at the total frequency (for a cumulative frequency) or 1.00 (for
a relative cumulative frequency).
Pareto Chart
A bar graph for qualitative data with the bars arranged according to frequency.
Pie Chart
Graphical depiction of data as slices of a pie. The frequency determines the size of the
slice. The number of degrees in any slice is the relative frequency times 360 degrees.
Pictograph
A graph that uses pictures to represent data.
Stem and Leaf Plot
A data plot which uses part of the data value as the stem and the rest of the data value
(the leaf) to form groups or classes. This is very useful for sorting data quickly.
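Several of the definitions above are simple calculations on a column of frequencies. The following Python sketch shows them side by side; the function name and the three frequencies are made up for the example.

```python
from itertools import accumulate

def frequency_columns(frequencies):
    """Build the derived columns of a frequency distribution."""
    total = sum(frequencies)
    cumulative = list(accumulate(frequencies))  # running total of the frequencies
    return {
        "relative": [f / total for f in frequencies],
        "cumulative": cumulative,
        "relative_cumulative": [c / total for c in cumulative],
        # Pie chart slice: relative frequency times 360 degrees.
        "pie_degrees": [360 * f / total for f in frequencies],
    }

table = frequency_columns([4, 6, 10])  # hypothetical class frequencies
for name, column in table.items():
    print(name, column)
```

The last cumulative frequency equals the total frequency, the last relative cumulative frequency is 1, and the pie slices add up to 360 degrees.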
Statistics: Grouped Frequency Distributions
Guidelines for classes
1. There should be between 5 and 20 classes.
2. The class width should be an odd number. For whole-number data, this will guarantee that the
class midpoints are integers instead of decimals.
3. The classes must be mutually exclusive. This means that no data value can fall into two different
classes.
4. The classes must be all inclusive or exhaustive. This means that all data values must be included.
5. The classes must be continuous. There are no gaps in a frequency distribution. Classes that have
no values in them must be included (unless it's the first or last class, which is dropped).
6. The classes must be equal in width. The exception here is the first or last class. It is possible to
have a "below ..." or "... and above" class. This is often used with ages.
Creating a Grouped Frequency Distribution
1. Find the largest and smallest values.
2. Compute the Range = Maximum - Minimum
3. Select the number of classes desired. This is usually between 5 and 20.
4. Find the class width by dividing the range by the number of classes and rounding up. There are
two things to be careful of here. You must round up, not off. Normally 3.2 would round to be 3,
but in rounding up, it becomes 4. If the range divided by the number of classes gives an integer
value (no remainder), then you can either add one to the number of classes or add one to the
class width. Sometimes you're locked into a certain number of classes because of the
instructions. The Bluman text fails to mention the case when there is no remainder.
5. Pick a suitable starting point less than or equal to the minimum value. You will be able to cover:
"the class width times the number of classes" values. You need to cover one more value than
the range. Follow this rule and you'll be okay: The starting point plus the number of classes
times the class width must be greater than the maximum value. Your starting point is the lower
limit of the first class. Continue to add the class width to this lower limit to get the rest of the
lower limits.
6. To find the upper limit of the first class, subtract one from the lower limit of the second class.
Then continue to add the class width to this upper limit to find the rest of the upper limits.
7. Find the boundaries by subtracting 0.5 units from the lower limits and adding 0.5 units to the
upper limits. The boundaries are also half-way between the upper limit of one class and the
lower limit of the next class. Depending on what you're trying to accomplish, it may not be
necessary to find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you're trying to accomplish, it may not be
necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies.
It is possible to have the TI-82 calculator find the frequencies for you. You will have to find the
class width and class boundaries first.
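The steps above can be sketched in Python for whole-number data. This is one possible reading of the procedure, not a definitive implementation: it starts the first class at the minimum value, and when the range divides evenly it adds one to the class width (the notes also allow adding a class instead). The function name and the data are made up for the example.

```python
import math

def grouped_frequency_distribution(data, num_classes):
    """Return (class width, rows) for whole-number data."""
    low, high = min(data), max(data)
    data_range = high - low
    width = math.ceil(data_range / num_classes)  # round up, not off
    if data_range % num_classes == 0:
        width += 1  # no remainder: cover one more value than the range
    rows = []
    for i in range(num_classes):
        lower = low + i * width   # lower limits go up by the class width
        upper = lower + width - 1  # one less than the next lower limit
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "frequency": sum(lower <= x <= upper for x in data),
        })
    return width, rows

scores = [12, 15, 17, 21, 22, 22, 25, 28, 30, 31, 33, 36, 38, 41, 44]  # hypothetical
width, rows = grouped_frequency_distribution(scores, 5)
print("class width:", width)
for row in rows:
    print(row)
```

Here the range is 32, and 32 / 5 = 6.4 rounds up to a class width of 7, so the classes are 12-18, 19-25, 26-32, 33-39, and 40-46.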
TI-82: Lists and Statistics

There are two features of the TI-82 calculator that will be used: Lists and Statistics. The STAT
key is located at the top center of the calculator and the LIST key is obtained by 2nd STAT.
There are six lists that you can work with at any time on the calculator. Each set of data requires
a list. If you include frequencies for the frequency distribution, then it will require a list for the
data and a separate list for the frequencies. The lists are labeled L1, L2, L3, L4, L5, and L6 and
are accessed on the calculator by pressing "2nd 1", "2nd 2", etc.
STAT Key
STAT has two major categories, EDIT and CALC
STAT-EDIT
1. Edit - Use this to enter data into a list.
2. SortA( - This will sort a list in ascending order. This is useful if you want to find the frequencies
after you have already established the limits or boundaries. Since the data is sorted in order, you
just have to go through and count the number in each class. You don't need to do the tally. This
will replace the list you tell it to sort.
3. SortD( - This will sort a list in descending order. This will replace the list you tell it to sort.
4. ClrList - This will erase any existing lists.
STAT-CALC
1. 1-Var Stats - This is used when there is only one variable. It will handle both raw data and
frequency distributions.
2. 2-Var Stats - This is used when there are two variables, x and y. This won't happen until the
end of the semester.
3. Setup - You will need to check the setup before you find any other statistical values from this
menu. It allows you to specify which list(s) you put the data into and if necessary, which list
contains the frequencies.
4. Med-Med - A regression model that isn't used in this course.
5. LinReg(ax+b) - A regression model that will be used later in the course after we talk about two
variable statistics.
6. QuadReg - A regression model that isn't used in this course.
7. CubicReg - A regression model that isn't used in this course.
8. QuartReg - A regression model that isn't used in this course.
9. LinReg(a+bx) - A regression model that will be used later in the course after we talk about two
variable statistics. The book uses this model; however, we will use #5 instead.
10. LnReg - A regression model that isn't used in this course.
11. ExpReg - A regression model that isn't used in this course.
12. PwrReg - A regression model that isn't used in this course.
LIST Key
The LIST command has two major sections, OPS (Operations) and MATH.
LIST-OPS
1. SortA( - This will sort a list in ascending order. This command is equivalent to the
SortA( command under the STAT key.
2. SortD( - This will sort a list in descending order. This command is equivalent to the
SortD( command under the STAT key.
3. dim - This function will return the dimension of a list. The dimension of a list is the number of
elements in the list. This is also used as a command to set the dimension of a list.
4. Fill( - This command will fill a list with a constant. This is useful if you need to set an entire list to
all be one number.
5. seq - This function will generate a sequence of numbers according to the function specified as
the first argument. A list is returned, but you must save it to one of the six lists if you want to
use it for anything.
LIST-MATH
1. min( - Returns the minimum value in a list.
2. max( - Returns the maximum value in a list.
3. mean( - Returns the arithmetic mean of all numbers in the list. The mean is the sum of the list
divided by the dimension of the list.
4. median( - Returns the median of the list. The median is the middle number when the list is
sorted in ascending order. If the dimension is an even number, the median is the midpoint
between the two middle values when the list is sorted in ascending order.
5. sum - Returns the sum of the values in the list.
6. prod - Returns the product of the values in the list. If the product of a list is zero, then at least
one of the numbers is zero.
Other Keys
VARS
The VARS key can be used to retrieve the value of a statistic.
VARS Statistics
This will save a lot of retyping of values and allow you to use the full accuracy of the calculator
instead of losing digits when re-entering numbers.
Here are some common values you will be using:
Keystrokes Statistic
VARS 5 1 n, the sample size
VARS 5 1 x bar, the sample mean
VARS 5 1 Sx, the sample standard deviation
VARS 5 1 minX, the minimum value
VARS 5 1 maxX, the maximum value
There are other values under statistics which you will use. You may have to arrow to other
submenus first for some of them.
STORE
This key will save values. You may save a scalar value to a real variable (A-Z) or a list value to a
list (L1 - L6). You can use the STORE key to save a value to the dimension of a list to set its
size. You can use the STORE key to save a list generated by the sequence command to a list.
Mathematical Operations and Functions
Lists can be used as arguments of functions. If they are, the function is applied to each element in
the list. Mathematical operations can be performed on lists. For more information on lists, see the
Introduction to the TI-82.
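If you want to experiment away from the calculator, the list commands described above have rough Python counterparts. This is only a comparison sketch: the data values are made up, and unlike SortA( and SortD(, Python's sorted() returns a new list instead of replacing the original.

```python
import math
import statistics

data = [7, 3, 9, 3, 5]                   # hypothetical list, standing in for L1

ascending = sorted(data)                 # like SortA(
descending = sorted(data, reverse=True)  # like SortD(
size = len(data)                         # like dim
filled = [0] * size                      # like Fill( with the constant 0
squares = [x ** 2 for x in range(1, 6)]  # like seq: X squared for X from 1 to 5

smallest = min(data)                     # like min(
largest = max(data)                      # like max(
mean = statistics.mean(data)             # sum of the list divided by its dimension
median = statistics.median(data)         # middle value of the sorted list
total = sum(data)                        # like sum
product = math.prod(data)                # like prod

# A function or operation applied to a list acts on each element.
doubled = [2 * x for x in data]
sums = [a + b for a, b in zip(data, squares)]

print(ascending, descending, size, filled, squares)
print(smallest, largest, mean, median, total, product)
print(doubled, sums)
```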
Entering Data
Always start with a clean set of data. You don't want to mix data from one problem with data
from another problem. Before starting any new problem, you should clear out existing data.
STAT ClrList L1,L2,L3
Another way to clear the lists is to go into STAT EDIT, arrow to the top so that the list name is
highlighted. Then press the CLEAR key and ENTER.
You may only need to specify one list, but you can specify more than one, just separate them
with commas.
After the lists have been cleared, you can enter the new lists:
STAT Edit
Select the list that you want to use. The default will be L1. This will be fine for most things, but
do realize you can use any of the lists. Just be sure to check the setup later.
Type in each number separating them by enter. When you are done entering, press the QUIT key
(2nd MODE).
If you need to correct data, just go back to STAT EDIT without clearing the list first.
TI-82: Histograms, BoxPlots

You can use the calculator to draw histograms, box-plots, and compute the frequency of each
class.
See the instructions on using the calculator to do statistics and lists. This provides an overview as
well as some helpful advice for working with statistics on the calculator.
Histograms
1. Enter the data.
2. Determine the class width and the lower class boundary (not limit) of the first class using
the techniques for creating grouped frequency distributions.
3. Turn off any regular plots: Hit Y= and position the cursor over any equal sign which is in
inversed video (white on black) by arrowing left and then down if necessary. Hit enter
while the cursor is on the equal sign to toggle between displaying the function (equal sign
highlighted) and not displaying the function (equal sign not highlighted).
4. Press the STATPLOT key (2nd Y=)
5. Select a plot (usually plot 1) and hit enter
6. Turn the plot on by highlighting the ON and pressing enter.
7. Set the TYPE to histogram (last type)
8. Set the XLIST to the list you put the data into
9. Set the FREQ to 1.
10. Select WINDOW
11. Put the lower class boundary for the first class in XMIN
12. The XMAX value should be the lower class boundary for the first class plus the number
of classes times the class width.
13. The Class Width should be stored in XSCL
14. YMIN should be set to 0
15. YMAX should be at least the largest frequency in any class. This is difficult to know if
you're generating the histogram without first writing the table by hand. If the histogram
displayed doesn't fit on the screen, go back and change this number. A good initial guess
might be the sample size divided by the number of classes. You might round it up to a
nice number (multiple of 5) or add one or two so that the graph is completely shown on the
screen.
16. YSCL should be set based on the YMAX value. A factor of YMAX would be a good
choice (so if YMAX is 30, let YSCL be 5). If your YMAX is small (say under 10), you
might want to set it to 1. This will determine how many marks are placed along the
vertical axis.
17. Hit the GRAPH key.
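The WINDOW values in steps 11 through 16 are simple arithmetic, sketched here in Python. The function name is made up, the YMAX value is only the initial guess suggested above (rounded up to a multiple of 5), and the YSCL choice of 1 or 5 is one assumption that satisfies the advice in step 16.

```python
import math

def histogram_window(first_lower_boundary, class_width, num_classes, sample_size):
    """Suggest TI-82 WINDOW settings for a histogram."""
    guess = sample_size / num_classes   # initial guess for the tallest bar
    ymax = 5 * math.ceil(guess / 5)     # round up to a multiple of 5
    return {
        "Xmin": first_lower_boundary,
        "Xmax": first_lower_boundary + num_classes * class_width,
        "Xscl": class_width,
        "Ymin": 0,
        "Ymax": ymax,
        "Yscl": 1 if ymax < 10 else 5,  # 5 is a factor of every multiple of 5
    }

# Hypothetical example: first boundary 11.5, class width 7, 5 classes, 15 values.
window = histogram_window(11.5, 7, 5, 15)
print(window)
```

If the tallest bar still runs off the screen, increase Ymax and graph again.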
Finding the Frequency

1. Generate a histogram first
2. Hit the TRACE key
3. The "min" value is the lower class boundary
4. The "max" value is the upper class boundary
5. The "n" value is the frequency for that class.
6. Use the left and right arrow keys to get the values for all the classes.
Box Plots
1. Enter the data.
2. Turn off any regular plots: Hit Y= and position the cursor over any equal sign which is in
inversed video (white on black) by arrowing left and then down if necessary. Hit enter
while the cursor is on the equal sign to toggle between displaying the function (equal sign
highlighted) and not displaying the function (equal sign not highlighted).
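3. Press the STATPLOT key (2nd Y=)
4. Select a plot (usually plot 1) and hit enter
5. Turn the plot on by highlighting the ON and pressing enter.
6. Set the TYPE to box-plot (3rd type)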
7. Set the XLIST to the list you put the data into
8. Set the FREQ to 1.
9. Zoom to Statistics mode (ZOOM 9)
Hit the TRACE key with the box plot displayed to find the five numbers associated with it.
You may use the left and right arrow keys to find all five numbers. Note that the calculator uses
the quartiles instead of the hinges. The hinges and quartiles are the same unless the remainder
when the sample size is divided by four is three.
TI-82: Plotting an Ogive

The Ogive is a frequency polygon (line plot) graph of the cumulative frequency or the relative
cumulative frequency.
The horizontal axis is marked with the class boundaries and the vertical axis is the frequency. All
class boundaries are used -- there will be one more class boundary than the number of classes.
The following example assumes the class boundaries are in List 1 and the cumulative frequencies
are in List 2. You are free to use any two lists that you desire, but you should make the
appropriate adjustments in the instructions if you don't use List 1 and List 2.
1. Enter the class boundaries into List 1. Start with the lower boundary of the first class and end
with the upper boundary of the last class.
2. Enter the cumulative frequencies into List 2. Start with 0 for the first value because there is
nothing less than the first lower class boundary.
3. Turn off any regular plots: Hit Y= and position the cursor over any equal sign which is in inversed
video (white on black) by arrowing left and then down if necessary. Hit enter while the cursor is
on the equal sign to toggle between displaying the function (equal sign highlighted) and not
displaying the function (equal sign not highlighted).
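4. Press the STATPLOT key (2nd Y=)
5. Select a plot (usually plot 1) and hit enter
6. Turn the plot on by highlighting the ON and pressing enter.
7. Set the TYPE to LinePlot (2nd type)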
8. Set the XLIST to List 1
9. Set the YLIST to List 2
10. Set the MARKER to any of the three values
11. Select WINDOW
12. Put the lower class boundary for the first class in XMIN
13. The XMAX value should be the upper class boundary of the last class
14. The Class Width should be stored in XSCL
15. YMIN should be set to 0
16. YMAX should be set to the total frequency if using cumulative frequencies in List 2 and set to
1.00 if using relative cumulative frequencies in List 2.
17. YSCL should be set appropriately based on YMAX.
18. Hit the GRAPH key.
Relative Frequencies
There is no need to re-enter the data if you wish to use relative cumulative frequencies instead of
cumulative frequencies.
The following assumes that the cumulative frequencies are in List 2.
Replace the ### by the total frequency. You can't put "###" into the calculator.
L2 / ### STORE L2
This will replace the cumulative frequencies with the relative cumulative frequencies.
To replace relative cumulative frequencies with cumulative frequencies, change the division to
multiplication.
L2 * ### STORE L2
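The two lists needed for an ogive, and the conversion to relative cumulative frequencies, can be sketched in Python. The function name and the example frequencies are made up; the two returned lists correspond to what you would enter in List 1 and List 2.

```python
from itertools import accumulate

def ogive_points(first_lower_boundary, class_width, frequencies, relative=False):
    """Return (class boundaries, cumulative frequencies) for plotting an ogive."""
    # One more boundary than the number of classes.
    boundaries = [first_lower_boundary + i * class_width
                  for i in range(len(frequencies) + 1)]
    # Start with 0: nothing is less than the first lower class boundary.
    cumulative = [0] + list(accumulate(frequencies))
    if relative:
        total = cumulative[-1]
        cumulative = [c / total for c in cumulative]  # same as L2 / total
    return boundaries, cumulative

# Hypothetical distribution: first boundary 11.5, class width 7.
print(ogive_points(11.5, 7, [3, 4, 3, 3, 2]))
print(ogive_points(11.5, 7, [3, 4, 3, 3, 2], relative=True))
```

The last value is the total frequency for cumulative frequencies and 1 for relative cumulative frequencies.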
PIE Program
The TI-82 doesn't support pie charts directly as it does with scatterplots, box plots, and
histograms.
Place the frequencies or relative frequencies in List 1. If List 1 is empty or the sum of List 1 is
zero, then you are instructed to put the frequencies in List 1.
Turn off any graphs that may be on before running the PIE program. Otherwise, the graphs will
overlay the pie chart and it will take longer to draw.
The program will ask the user if they wish to place the labels on the graph. If the user enters 1 for
yes, then the values in List 1 will be placed in the graph. This is where the difference between
frequencies and relative frequencies appears.
This program will force the calculator into radian mode and turn the axes off, zoom standard and
then zoom square. It will then draw a circle and proceed to draw the lines which define the pie
graph.
To reset the graphing screen to normal when done viewing the pie chart, you need to:
1. DRAW CLRDRAW
2. WINDOW FORMAT AXESON
3. MODE DEGREE (depending on your use, leaving the calculator in Radian mode may be preferred)
Chapter 3
Statistics: Data Description
Definitions
Statistic
Characteristic or measure obtained from a sample
Parameter
Characteristic or measure obtained from a population
Mean
Sum of all the values divided by the number of values. This can either be a population
mean (denoted by mu) or a sample mean (denoted by x bar)
Median
The midpoint of the data after being ranked (sorted in ascending order). There are as
many numbers below the median as above the median.
Mode
The most frequent number
Skewed Distribution
The majority of the values lie together on one side with a very few values (the tail) to the
other side. In a positively skewed distribution, the tail is to the right and the mean is
larger than the median. In a negatively skewed distribution, the tail is to the left and the
mean is smaller than the median.
Symmetric Distribution
The data values are evenly distributed on both sides of the mean. In a symmetric
distribution, the mean is the median.
Weighted Mean
The mean when each value is multiplied by its weight and summed. This sum is divided
by the total of the weights.
Midrange
The mean of the highest and lowest values. (Max + Min) / 2
Range
The difference between the highest and lowest values. Max - Min
Population Variance
The average of the squares of the distances from the population mean. It is the sum of the
squares of the deviations from the mean divided by the population size. The units on the
variance are the units of the population squared.
Sample Variance
Unbiased estimator of a population variance. Instead of dividing by the population size,
the sum of the squares of the deviations from the sample mean is divided by one less than
the sample size. The units on the variance are the units of the population squared.
Standard Deviation
The square root of the variance. The population standard deviation is the square root of
the population variance and the sample standard deviation is the square root of the sample
variance. The sample standard deviation is not an unbiased estimator of the population
standard deviation. The units on the standard deviation are the same as the units of the
population/sample.
Coefficient of Variation
Standard deviation divided by the mean, expressed as a percentage. We won't work with
the Coefficient of Variation in this course.
Chebyshev's Theorem
The proportion of the values that fall within k standard deviations of the mean is at least
1 - 1/k^2, where k > 1. Chebyshev's theorem can be applied to any distribution regardless
of its shape.
Empirical or Normal Rule
Only valid when a distribution is bell-shaped (normal). Approximately 68% lies within 1
standard deviation of the mean; 95% within 2 standard deviations; and 99.7% within 3
standard deviations of the mean.
Standard Score or Z-Score
The value obtained by subtracting the mean and dividing by the standard deviation. When
all values are transformed to their standard scores, the new mean (for Z) will be zero and
the standard deviation will be one.
Percentile
The percent of the population which lies below that value. The data must be ranked to
find percentiles.
Quartile
Either the 25th, 50th, or 75th percentiles. The 50th percentile is also called the median.
Decile
Either the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, or 90th percentiles.
Lower Hinge
The median of the lower half of the numbers (up to and including the median). The lower
hinge is the first Quartile unless the remainder when dividing the sample size by four is 3.
Upper Hinge
The median of the upper half of the numbers (including the median). The upper hinge is
the 3rd Quartile unless the remainder when dividing the sample size by four is 3.
Box and Whiskers Plot (Box Plot)
A graphical representation of the minimum value, lower hinge, median, upper hinge, and
maximum. Some textbooks, and the TI-82 calculator, define the five values as the
minimum, first Quartile, median, third Quartile, and maximum.
Five Number Summary
Minimum value, lower hinge, median, upper hinge, and maximum.
InterQuartile Range (IQR)
The difference between the 3rd and 1st Quartiles.
Outlier
An extremely high or low value when compared to the rest of the values.
Mild Outliers
Values which lie between 1.5 and 3.0 times the InterQuartile Range below the 1st
Quartile or above the 3rd Quartile. Note, some texts use hinges instead of Quartiles.
Extreme Outliers
Values which lie more than 3.0 times the InterQuartile Range below the 1st Quartile or
above the 3rd Quartile. Note, some texts use hinges instead of Quartiles.
Stats: Measures of Central Tendency

The term "Average" is vague

Average could mean one of four things: the arithmetic mean, the median, the midrange, or the mode.
For this reason, it is better to specify which average you're talking about.
Mean
This is what people usually intend when they say "average"
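Population Mean: mu = (sum of x) / N, where N is the population size
Sample Mean: x bar = (sum of x) / n, where n is the sample size
Frequency Distribution: x bar = (sum of x * f) / (sum of f), where f is the frequency of each value x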
The mean of a frequency distribution is also the weighted mean.
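As a quick check of that last statement, here is a minimal Python sketch (Python is not part of these notes; the function names are made up for illustration). It computes the mean of five raw values and then the mean of the same values written as a frequency distribution, and the two should agree.

```python
from fractions import Fraction

def mean(values):
    """Sum of the values divided by the number of values."""
    return Fraction(sum(values), len(values))

def frequency_mean(values, freqs):
    """Weighted mean: each value times its frequency, summed, over the total frequency."""
    weighted_sum = sum(x * f for x, f in zip(values, freqs))
    return Fraction(weighted_sum, sum(freqs))

raw = [4, 5, 3, 6, 5]                          # illustrative sample
distinct, freqs = [3, 4, 5, 6], [1, 1, 2, 1]   # the same data as a frequency distribution

print(float(mean(raw)))                         # 4.6
print(float(frequency_mean(distinct, freqs)))   # 4.6
```

Fractions are used only so the arithmetic is exact; ordinary division would work just as well here.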
Median
The data must be ranked (sorted in ascending order) first. The median is the number in the
middle.
To find the depth of the median, there are several formulas that could be used, the one that we
will use is:
Depth of median = 0.5 * (n + 1)
Raw Data
The median is the number in the "depth of the median" position. If the sample size is even, the
depth of the median will be a decimal -- you need to find the midpoint between the numbers on
either side of the depth of the median.
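The depth rule for raw data can be sketched in Python as follows. This is only an illustration under the assumption that the data list is not empty; the function name is hypothetical.

```python
def median_by_depth(data):
    """Median of raw data located by its depth, 0.5 * (n + 1)."""
    ranked = sorted(data)
    n = len(ranked)
    depth = 0.5 * (n + 1)            # 1-based position in the ranked list
    below = ranked[int(depth) - 1]   # value at the whole-number part of the depth
    if depth == int(depth):          # odd n: the depth is a whole number
        return below
    above = ranked[int(depth)]       # even n: take the midpoint of the two neighbours
    return (below + above) / 2

print(median_by_depth([4, 5, 3, 6, 5]))   # 5
print(median_by_depth([4, 5, 3, 6]))      # 4.5
```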
Ungrouped Frequency Distribution
Find the cumulative frequencies for the data. The first value with a cumulative frequency greater
than or equal to the depth of the median is the median. If the depth of the median is exactly 0.5
more than the cumulative frequency of the previous class, then the median is the midpoint between
the two classes.
Grouped Frequency Distribution
This is the tough one.
Since the data is grouped, you have lost all original information. Some textbooks have you
simply take the midpoint of the class. This is an over-simplification which isn't the true value
(but much easier to do). The correct process is to interpolate.
Find out what proportion of the distance into the median class the median lies by dividing the
sample size by 2, subtracting the cumulative frequency of the previous class, and then dividing all
that by the frequency of the median class.
Multiply this proportion by the class width and add it to the lower boundary of the median class.
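Here is a rough Python sketch of the interpolation just described. The class boundaries and frequencies are hypothetical, chosen only so the arithmetic is easy to follow, and the function name is made up for this example.

```python
def grouped_median(lower_boundaries, class_width, freqs):
    """Interpolated median of a grouped frequency distribution."""
    half = sum(freqs) / 2                        # sample size divided by 2
    cumulative = 0                               # cumulative frequency of the previous classes
    for lower, f in zip(lower_boundaries, freqs):
        if cumulative + f >= half:               # this is the median class
            proportion = (half - cumulative) / f
            return lower + proportion * class_width
        cumulative += f
    raise ValueError("frequencies must not be empty")

# Hypothetical classes 9.5-19.5, 19.5-29.5, 29.5-39.5, 39.5-49.5
print(grouped_median([9.5, 19.5, 29.5, 39.5], 10, [4, 8, 5, 3]))   # 27.0
```

With these numbers the sample size is 20, so half is 10. The second class is the median class, the proportion is (10 - 4) / 8 = 0.75, and the median is 19.5 + 0.75 * 10 = 27.0.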
Mode
The mode is the most frequent data value. There may be no mode if no one value appears more
than any other. There may also be two modes (bimodal), three modes (trimodal), or more than
three modes (multi-modal).
For grouped frequency distributions, the modal class is the class with the largest frequency.
Midrange
The midrange is simply the midpoint between the highest and lowest values.
Summary
The Mean is used in computing other statistics (such as the variance) and does not exist for open
ended grouped frequency distributions (1). It is often not appropriate for skewed distributions
such as salary information.
The Median is the center number and is good for skewed distributions because it is resistant to
change.
The Mode is used to describe the most typical case. The mode can be used with nominal data
whereas the others can't. The mode may or may not exist and there may be more than one value
for the mode (2).
The Midrange is not used very often. It is a very rough estimate of the average and is greatly
affected by extreme values (even more so than the mean).
Property                     Mean     Median   Mode     Midrange
Always Exists                No (1)   Yes      No (2)   Yes
Uses all data values         Yes      No       No       No
Affected by extreme values   Yes      No       No       Yes
Using the TI-82

One can find the mean, median, and midrange using the list functions of the TI-82. You can also
find the measures of variation with the TI-82 calculator.
Stats: Measures of Variation
Range
The range is the simplest measure of variation to find. It is simply the highest value minus the
lowest value.
RANGE = MAXIMUM - MINIMUM
Since the range only uses the largest and smallest values, it is greatly affected by extreme values,
that is - it is not resistant to change.
Variance
"Average Deviation"
The range only involves the smallest and largest numbers, and it would be desirable to have a
statistic which involved all of the data values.
The first attempt one might make at this is something they might call the average deviation from
the mean and define it as:
average deviation = (sum of (x - mu)) / N
The problem is that this summation is always zero, because the positive and negative deviations
cancel exactly. So, the average deviation will always be zero. That is why the average deviation is
never used.
Population Variance
So, to keep it from being zero, the deviation from the mean is squared and called the "squared
deviation from the mean". This "average squared deviation from the mean" is called the variance.
sigma^2 = (sum of (x - mu)^2) / N
Unbiased Estimate of the Population Variance
One would expect the sample variance to simply be the population variance with the population
mean replaced by the sample mean. However, one of the major uses of statistics is to estimate
the corresponding parameter. That formula has the problem that, on average, the estimated value
is smaller than the parameter. To counteract this, the sum of the squares of the deviations is
divided by one less than the sample size.
s^2 = (sum of (x - x bar)^2) / (n - 1)
Standard Deviation
There is a problem with variances. Recall that the deviations were squared. That means that the
units were also squared. To get the units back the same as the original data values, the square
root must be taken.
sigma = sqrt(sigma^2) and s = sqrt(s^2)
The sample standard deviation is not an unbiased estimator of the population standard
deviation.
The calculator does not have a variance key on it. It does have a standard deviation key. You will
have to square the standard deviation to find the variance.
Sum of Squares (shortcuts)

The sum of the squares of the deviations from the mean is given a shortcut notation and several
alternative formulas.
SS(x) = sum of (x - x bar)^2
A little algebraic simplification returns:
SS(x) = (sum of x^2) - (sum of x)^2 / n
What's wrong with the first formula, you ask? Consider the following example. The last row of the
table contains the totals for the columns.
1. Total the data values: 23

2. Divide by the number of values to get the mean: 23/5 = 4.6
3. Subtract the mean from each value to get the numbers in the second column.
4. Square each number in the second column to get the values in the third column.
5. Total the numbers in the third column: 5.2
6. Divide this total by one less than the sample size to get the variance: 5.2 / 4 = 1.3
x     x - x bar          (x - x bar)^2
4     4 - 4.6 = -0.6     (-0.6)^2 = 0.36
5     5 - 4.6 =  0.4     ( 0.4)^2 = 0.16
3     3 - 4.6 = -1.6     (-1.6)^2 = 2.56
6     6 - 4.6 =  1.4     ( 1.4)^2 = 1.96
5     5 - 4.6 =  0.4     ( 0.4)^2 = 0.16
23    0.00 (Always)      5.2
Not too bad, you think. But this can get pretty bad if the sample mean doesn't happen to be a
"nice" number. Think about having a mean of 19/7 = 2.714285714285... Those
subtractions get nasty, and when you square them, they're really bad. Another problem with the
first formula is that it requires you to know the mean ahead of time. For a calculator, this would
mean that you have to save all of the numbers that were entered. The TI-82 does this, but most
scientific calculators don't.
Now, let's consider the shortcut formula. The only things that you need to find are the sum of the
values and the sum of the values squared. There is no subtraction and no decimals or fractions
until the end. The last row contains the sums of the columns, just like before.
1. Record each number in the first column and the square of each number in the second column.
2. Total the first column: 23
3. Total the second column: 111
4. Compute the sum of squares: 111 - 23*23/5 = 111 - 105.8 = 5.2
5. Divide the sum of squares by one less than the sample size to get the variance = 5.2 / 4 = 1.3
x x^2
4 16
5 25
3 9
6 36
5 25
23 111
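The two worked tables can be checked against each other with a short Python sketch (not part of the original notes). It computes the sum of squares both ways for the same five values, using exact fractions so that no rounding creeps in.

```python
from fractions import Fraction

data = [4, 5, 3, 6, 5]
n = len(data)
mean = Fraction(sum(data), n)                       # 23/5 = 4.6

# First formula: sum of the squared deviations from the mean
ss_definition = sum((x - mean) ** 2 for x in data)

# Shortcut formula: sum of x^2 minus (sum of x)^2 / n
ss_shortcut = sum(x * x for x in data) - Fraction(sum(data) ** 2, n)

variance = ss_shortcut / (n - 1)                    # divide by one less than the sample size

print(float(ss_definition), float(ss_shortcut))     # 5.2 5.2
print(float(variance))                              # 1.3
```

Notice that the shortcut only needs two running totals (23 and 111), which is why a calculator does not have to store every value to use it.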
Chebyshev's Theorem
The proportion of the values that fall within k standard deviations of the mean will be at least
1 - 1/k^2, where k is any number greater than 1.
"Within k standard deviations" means the interval from (mean - k * standard deviation) to
(mean + k * standard deviation).
Chebyshev's Theorem is true for any sample set, no matter what the distribution.
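To get a feel for the bound, here is a small Python sketch. The function names are made up for illustration, and the five data values are treated as a complete population (so the population standard deviation is used); that choice is an assumption of this example, not something the theorem requires.

```python
import statistics

def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def proportion_within(data, k):
    """Observed proportion of values within k standard deviations of the mean."""
    m = statistics.mean(data)
    s = statistics.pstdev(data)      # population standard deviation
    inside = [x for x in data if m - k * s <= x <= m + k * s]
    return len(inside) / len(data)

data = [4, 5, 3, 6, 5]
print(chebyshev_bound(2))                                   # 0.75
print(proportion_within(data, 2) >= chebyshev_bound(2))     # True
```

So with k = 2, at least 75% of the values must lie within two standard deviations of the mean, whatever the shape of the distribution.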
Empirical Rule
The empirical rule is only valid for bell-shaped (normal) distributions. The following statements
are true.
• Approximately 68% of the data values fall within one standard deviation of the mean.
• Approximately 95% of the data values fall within two standard deviations of the mean.
• Approximately 99.7% of the data values fall within three standard deviations of the mean.
The empirical rule will be revisited later in the chapter on normal probabilities.
Using the TI-82 to find these values

You may use the TI-82 to find the measures of central tendency and the measures of variation
using the list handling capabilities of the calculator.
Stats: Measures of Position
Standard Scores (z-scores)

The standard score is obtained by subtracting the mean and dividing the difference by the
standard deviation. The symbol is z, which is why it's also called a z-score.
z = (x - mu) / sigma for a population, or z = (x - x bar) / s for a sample
The mean of the standard scores is zero and the standard deviation is 1. This is the nice feature of
the standard score -- no matter what the original scale was, when the data is converted to its
standard score, the mean is zero and the standard deviation is 1.
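A minimal Python sketch of this property, assuming the data is a sample (so the sample standard deviation is used). The function name is hypothetical.

```python
import statistics

def z_scores(data):
    """Convert each value to its standard score using the sample mean and standard deviation."""
    m = statistics.mean(data)
    s = statistics.stdev(data)       # sample standard deviation
    return [(x - m) / s for x in data]

z = z_scores([4, 5, 3, 6, 5])
print([round(v, 3) for v in z])
# Up to floating-point rounding, the mean of z is 0 and its standard deviation is 1.
print(round(abs(statistics.mean(z)), 6), round(statistics.stdev(z), 6))
```

Changing the original scale (say, multiplying every value by 10) would leave the z-scores unchanged, which is the point of standardizing.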
Percentiles, Deciles, Quartiles

Percentiles (100 regions)
The kth percentile is the number which has k% of the values below it. The data must be ranked.
1. Rank the data

2. Find k% (k /100) of the sample size, n.
3. If this is an integer, add 0.5. If it isn't an integer, round up.
4. Find the number in this position. If your depth ends in 0.5, then take the midpoint between the
two numbers.
It is sometimes easier to count from the high end rather than counting from the low end. For
example, the 80th percentile is the number which has 80% below it and 20% above it. Rather than
counting 80% from the bottom, count 20% from the top.
Note: The 50th percentile is the median.
If you wish to find the percentile for a number (rather than locating the kth percentile), then
1. Take the number of values below the number

2. Add 0.5
3. Divide by the total number of values
4. Convert it to a percent
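Both procedures above can be sketched in Python. This is only an illustration: the function names are made up, k is assumed to be a whole number between 1 and 99, and the data (the numbers 1 through 20) is hypothetical.

```python
def kth_percentile(data, k):
    """Value at the k-th percentile (k a whole number, 0 < k < 100)."""
    ranked = sorted(data)                    # step 1: rank the data
    n = len(ranked)
    if (k * n) % 100 == 0:                   # step 2: k% of n is an integer ...
        depth = k * n // 100 + 0.5           # step 3: ... so add 0.5
    else:
        depth = (k * n + 99) // 100          # step 3: otherwise round up
    lower = ranked[int(depth) - 1]           # step 4: find the number in this position
    if depth == int(depth):
        return lower
    return (lower + ranked[int(depth)]) / 2  # depth ends in 0.5: take the midpoint

def percentile_of(data, x):
    """Percentile rank of x: (number of values below x + 0.5) / n, as a percent."""
    below = sum(1 for v in data if v < x)
    return 100 * (below + 0.5) / len(data)

scores = list(range(1, 21))
print(kth_percentile(scores, 50))   # 10.5
print(kth_percentile(scores, 80))   # 16.5
print(percentile_of(scores, 17))    # 82.5
```

The 50th percentile comes out as 10.5, which matches the median of the numbers 1 through 20.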
Deciles (10 regions)
The percentiles divide the data into 100 equal regions. The deciles divide the data into 10 equal
regions. The instructions are the same for finding a percentile, except instead of dividing by 100
in step 2, divide by 10.
Quartiles (4 regions)
The quartiles divide the data into 4 equal regions. Instead of dividing by 100 in step 2, divide by
4.
Note: The 2nd quartile is the same as the median. The 1st quartile is the 25th percentile, the 3rd
quartile is the 75th percentile.
The quartiles are commonly used (much more so than the percentiles or deciles). The TI-82
calculator will find the quartiles for you. Some textbooks include the quartiles in the five number
summary.
Hinges
The lower hinge is the median of the lower half of the data up to and including the median. The
upper hinge is the median of the upper half of the data, starting from and including the median.
The hinges are the same as the quartiles unless the remainder when dividing the sample size by
four is three (like 39 / 4 = 9 R 3).
The statement about the lower half or upper half including the median tends to be confusing to
some students. If the median is split between two values (which happens whenever the sample
size is even), the median isn't included in either since the median isn't actually part of the data.
Example 1: sample size of 20
The median will be in position 10.5. The lower half is positions 1 - 10 and the upper half is
positions 11 - 20. The lower hinge is the median of the lower half and would be in position 5.5.
The upper hinge is the median of the upper half and would be in position 5.5 starting with
original position 11 as position 1 -- this is the original position 15.5.
Example 2: sample size of 21
The median is in position 11. The lower half is positions 1 - 11 and the upper half is positions 11
- 21. The lower hinge is the median of the lower half and would be in position 6. The upper
hinge is the median of the upper half and would be in position 6 when starting at position 11 --
this is original position 16.
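The two examples can be reproduced with a short Python sketch. The data used is simply the numbers 1 through n, so each value equals its own position and the output can be read as a position; the function name is made up for this illustration.

```python
import statistics

def hinges(data):
    """Lower and upper hinges: medians of the lower and upper halves of the ranked data."""
    ranked = sorted(data)
    n = len(ranked)
    half = n // 2
    if n % 2 == 1:
        lower_half = ranked[:half + 1]   # odd n: the median belongs to both halves
        upper_half = ranked[half:]
    else:
        lower_half = ranked[:half]       # even n: the median is not a data value
        upper_half = ranked[half:]
    return statistics.median(lower_half), statistics.median(upper_half)

print(hinges(range(1, 21)))   # (5.5, 15.5)
print(hinges(range(1, 22)))   # (6, 16)
```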
Five Number Summary

The five number summary consists of the minimum value, lower hinge, median, upper hinge,
and maximum value. Some textbooks use the quartiles instead of the hinges.
Box and Whiskers Plot

A graphical representation of the five number summary. A box is drawn between the lower and
upper hinges with a line at the median. Whiskers (a single line, not a box) extend from the hinges
to lines at the minimum and maximum values.
Interquartile Range (IQR)

The interquartile range is the difference between the third and first quartiles. That's it: Q3 - Q1
Outliers
Outliers are extreme values. There are mild outliers and extreme outliers. The Bluman text does
not distinguish between mild outliers and extreme outliers and just treats either as an outlier.
Extreme Outliers
Extreme outliers are any data values which lie more than 3.0 times the interquartile range below
the first quartile or above the third quartile. x is an extreme outlier if ...
x < Q1 - 3 * IQR
or
x > Q3 + 3 * IQR
Mild Outliers
Mild outliers are any data values which lie between 1.5 times and 3.0 times the interquartile
range below the first quartile or above the third quartile. x is a mild outlier if ...
Q1 - 3 * IQR <= x < Q1 - 1.5 * IQR
or
Q3 + 1.5 * IQR < x <= Q3 + 3 * IQR
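As a rough Python sketch, the outlier rules amount to two pairs of fences measured from the quartiles. The quartile values below (Q1 = 10, Q3 = 20) are hypothetical, and the function name is made up for this example.

```python
def classify_outlier(x, q1, q3):
    """Label x as an extreme outlier, a mild outlier, or not an outlier."""
    iqr = q3 - q1
    if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
        return "extreme"
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "mild"
    return "not an outlier"

# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
# Mild fences are at -5 and 35; extreme fences are at -20 and 50.
for value in (25, 40, 55, -6):
    print(value, classify_outlier(value, 10, 20))
```

Here 25 is not an outlier, 40 and -6 are mild outliers, and 55 is an extreme outlier.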
Chapter 4
Stats: Probability
Definitions
Probability Experiment
Process which leads to well-defined results called outcomes
Outcome
The result of a single trial of a probability experiment
Sample Space
Set of all possible outcomes of a probability experiment
Event
One or more outcomes of a probability experiment
Classical Probability
Uses the sample space to determine the numerical probability that an event will happen.
Also called theoretical probability.
Equally Likely Events
Events which have the same probability of occurring.
Complement of an Event
All the outcomes in the sample space except those in the given event.
Empirical Probability
Uses a frequency distribution to determine the numerical probability. An empirical
probability is a relative frequency.
Subjective Probability
Uses probability values based on an educated guess or estimate. It employs opinions and
inexact information.
Mutually Exclusive Events

Two events which cannot happen at the same time.
Disjoint Events
Another name for mutually exclusive events.
Independent Events
Two events are independent if the occurrence of one does not affect the probability of the
other occurring.
Dependent Events
Two events are dependent if the first event affects the outcome or occurrence of the
second event in a way the probability is changed.
Conditional Probability
The probability of an event occurring given that another event has already occurred.
Bayes' Theorem
A formula which allows one to find the probability that an event occurred as the result of
a particular previous event.
Stats: Introduction to Probability
Sample Spaces
A sample space is the set of all possible outcomes. However, some sample spaces are better than
others.
Consider the experiment of flipping two coins. It is possible to get 0 heads, 1 head, or 2 heads.
Thus, the sample space could be {0, 1, 2}. Another way to look at it is { HH, HT, TH, TT }.
The second way is better because each outcome is as likely to occur as any other.
When writing the sample space, it is highly desirable to have events which are equally likely.
Another example is rolling two dice. The sums are { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }. However,
these sums aren't equally likely. The only way to get a sum of 2 is to roll a 1 on both dice, but you
can get a sum of 4 by rolling a 1-3, 2-2, or 3-1. The following table illustrates a better sample
space for the sum obtained when rolling two dice.
                      Second Die
First Die    1    2    3    4    5    6
    1        2    3    4    5    6    7
    2        3    4    5    6    7    8
    3        4    5    6    7    8    9
    4        5    6    7    8    9   10
    5        6    7    8    9   10   11
    6        7    8    9   10   11   12
Classical Probability
The above table lends itself to describing data another way -- using a probability distribution.
Let's consider the frequency distribution for the above sums.
Sum   Frequency   Relative Frequency
2     1           1/36
3     2           2/36
4     3           3/36
5     4           4/36
6     5           5/36
7     6           6/36
8     5           5/36
9     4           4/36
10    3           3/36
11    2           2/36
12    1           1/36
If just the first and last columns were written, we would have a probability distribution. The
relative frequency of a frequency distribution is the probability of the event occurring. This is
only true, however, if the events are equally likely.
This gives us the formula for classical probability. The probability of an event occurring is the
number in the event divided by the number in the sample space. Again, this is only true when the
events are equally likely. A classical probability is the relative frequency of each event in the
sample space when each event is equally likely.
P(E) = n(E) / n(S)
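The two-dice table can be rebuilt by brute force. The following Python sketch (not part of the original notes; the names are chosen for illustration) lists all 36 equally likely outcomes and applies P(E) = n(E) / n(S) to each sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) pairs
sample_space = list(product(range(1, 7), repeat=2))

# n(E) for each possible sum
sum_counts = Counter(first + second for first, second in sample_space)

def p_sum(total):
    """Classical probability of rolling the given sum: n(E) / n(S)."""
    return Fraction(sum_counts[total], len(sample_space))

print(p_sum(2))                                # 1/36
print(p_sum(7))                                # 1/6
print(sum(p_sum(s) for s in range(2, 13)))     # 1
```

Fraction reduces 6/36 to 1/6 automatically, and the last line confirms that the probabilities over the whole sample space add to 1.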
Empirical Probability
Empirical probability is based on observation. The empirical probability of an event is the
relative frequency of a frequency distribution based upon observation.
P(E) = f / n
Probability Rules
There are two rules which are very important.
All probabilities are between 0 and 1 inclusive

0 <= P(E) <= 1
The sum of all the probabilities in the sample space is 1
There are some other rules which are also important.
The probability of an event which cannot occur is 0.
The probability of any event which is not in the sample space is zero.
The probability of an event which must occur is 1.
The probability of the sample space is 1.
The probability of an event not occurring is one minus the probability of it occurring.
P(E') = 1 - P(E)
Stats: Probability Rules
"OR" or Unions
Mutually Exclusive Events
Two events are mutually exclusive if they cannot occur at the same time. Another word that
means mutually exclusive is disjoint.
If two events are disjoint, then the probability of them both occurring at the same time is 0.
Disjoint: P(A and B) = 0
If two events are mutually exclusive, then the probability of either occurring is the sum of the
probabilities of each occurring.
Specific Addition Rule
Only valid when the events are mutually exclusive.
P(A or B) = P(A) + P(B)

Example 1:
Given: P(A) = 0.20, P(B) = 0.70, A and B are disjoint
I like to use what's called a joint probability distribution. (Since disjoint means nothing in
common, joint is what they have in common -- so the values that go on the inside portion of the
table are the intersections or "and"s of each pair of events). "Marginal" is another word for totals
-- it's called marginal because they appear in the margins.
B B' Marginal
A 0.00 0.20 0.20
A' 0.70 0.10 0.80
Marginal 0.70 0.30 1.00

The values 0.20 and 0.70 are given in the problem, and the 0.00 follows because A and B are
disjoint. The grand total is always 1.00. The rest of the values are obtained by addition and
subtraction.
Non-Mutually Exclusive Events
In events which aren't mutually exclusive, there is some overlap. When P(A) and P(B) are added,
the probability of the intersection (and) is added twice. To compensate for that double addition,
the intersection needs to be subtracted.
General Addition Rule
Always valid.
P(A or B) = P(A) + P(B) - P(A and B)

Example 2:
Given P(A) = 0.20, P(B) = 0.70, P(A and B) = 0.15
B B' Marginal
A 0.15 0.05 0.20
A' 0.55 0.25 0.80
Marginal 0.70 0.30 1.00
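Filling in a joint probability table is just addition and subtraction, which the following Python sketch tries to show for Example 2. The function name and the dictionary layout are assumptions made for this illustration.

```python
from fractions import Fraction

def joint_table(p_a, p_b, p_a_and_b):
    """Four inside cells of the joint probability table, found by subtraction."""
    return {
        ("A", "B"): p_a_and_b,
        ("A", "B'"): p_a - p_a_and_b,
        ("A'", "B"): p_b - p_a_and_b,
        ("A'", "B'"): 1 - p_a - p_b + p_a_and_b,
    }

p_a, p_b, p_both = Fraction("0.20"), Fraction("0.70"), Fraction("0.15")
table = joint_table(p_a, p_b, p_both)
for cell, p in table.items():
    print(cell, float(p))            # 0.15, 0.05, 0.55, 0.25

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
print(float(p_a + p_b - p_both))     # 0.75
```

The same answer, 0.75, is one minus the A' and B' cell (1 - 0.25), since "A or B" is everything except "neither".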
Interpreting the table
Certain things can be determined from the joint probability distribution. Mutually exclusive
events will have a zero in the intersection (A and B) cell. All inclusive events will have a zero in
the cell opposite the intersection (A' and B'). All inclusive means that there is nothing outside of
those two events: P(A or B) = 1.
           B                               B'                              Marginal
A          A and B are Mutually Exclusive  .                               .
           if this value is 0
A'         .                               A and B are All Inclusive       .
                                           if this value is 0
Marginal   .                               .                               1.00
"AND" or Intersections
Independent Events
Two events are independent if the occurrence of one does not change the probability of the other
occurring.
An example would be rolling a 2 on a die and flipping a head on a coin. Rolling the 2 does not
affect the probability of flipping the head.
If events are independent, then the probability of them both occurring is the product of the
probabilities of each occurring.
Specific Multiplication Rule
Only valid for independent events
P(A and B) = P(A) * P(B)

Example 3:
P(A) = 0.20, P(B) = 0.70, A and B are independent.
B B' Marginal
A 0.14 0.06 0.20
A' 0.56 0.24 0.80
Marginal 0.70 0.30 1.00
The 0.14 is because the probability of A and B is the probability of A times the probability of B
or 0.20 * 0.70 = 0.14.
Dependent Events
If the occurrence of one event does affect the probability of the other occurring, then the events
are dependent.
The probability of event B occurring given that event A has already occurred is read "the
probability of B given A" and is written: P(B|A)
General Multiplication Rule
Always works.
P(A and B) = P(A) * P(B|A)

Example 4:
P(A) = 0.20, P(B) = 0.70, P(B|A) = 0.40
A good way to think of P(B|A) is that 40% of A is B. 40% of the 20% which was in event A is
8%, thus the intersection is 0.08.
B B' Marginal
A 0.08 0.12 0.20
A' 0.62 0.18 0.80
Marginal 0.70 0.30 1.00
Independence Revisited
The following four statements are equivalent
1. A and B are independent events

2. P(A and B) = P(A) * P(B)
3. P(A|B) = P(A)
4. P(B|A) = P(B)
The last two are because if two events are independent, the occurrence of one doesn't change the
probability of the occurrence of the other. This means that the probability of B occurring,
whether A has happened or not, is simply the probability of B occurring.
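As a small Python sketch, the second of these equivalent statements gives a direct test for independence. The numbers are the ones from Examples 3 and 4; the function name is made up for illustration.

```python
from fractions import Fraction

def is_independent(p_a, p_b, p_a_and_b):
    """A and B are independent exactly when P(A and B) = P(A) * P(B)."""
    return p_a_and_b == p_a * p_b

p_a, p_b = Fraction("0.20"), Fraction("0.70")

print(is_independent(p_a, p_b, Fraction("0.14")))   # True  (Example 3)
print(is_independent(p_a, p_b, Fraction("0.08")))   # False (Example 4)

# In Example 4, P(B|A) = P(A and B) / P(A) = 0.4, which is not P(B) = 0.7
print(float(Fraction("0.08") / p_a))                # 0.4
```

Exact fractions are used because comparing decimals such as 0.2 * 0.7 and 0.14 with ordinary floating-point arithmetic can fail due to rounding.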
Stats: Conditional Probability

Recall that the probability of an event occurring given that another event has already occurred is
called a conditional probability.
The probability that event B occurs, given that event A has already occurred is
P(B|A) = P(A and B) / P(A)
This formula comes from the general multiplication principle and a little bit of algebra.
Since we are given that event A has occurred, we have a reduced sample space. Instead of the
entire sample space S, we now have a sample space of A since we know A has occurred. So the
old rule about being the number in the event divided by the number in the sample space still
applies. It is the number in A and B (must be in A since A has occurred) divided by the number
in A. If you then divide the numerator and denominator of the right hand side by the number in the
sample space S, then you have the probability of A and B divided by the probability of A.
Examples
Example 1:
The question, "Do you smoke?" was asked of 100 people. Results are shown in the table.
         Yes   No    Total
Male     19    41    60
Female   12    28    40
Total    31    69    100
• What is the probability of a randomly selected individual being a male who smokes? This is just a
joint probability. The number of "Male and Smoke" divided by the total = 19/100 = 0.19
• What is the probability of a randomly selected individual being a male? This is the total for male
divided by the total = 60/100 = 0.60. Since no mention is made of smoking or not smoking, it
includes all the cases.
• What is the probability of a randomly selected individual smoking? Again, since no mention is
made of gender, this is a marginal probability, the total who smoke divided by the total = 31/100
= 0.31.
• What is the probability of a randomly selected male smoking? This time, you're told that you
have a male - think of stratified sampling. What is the probability that the male smokes? Well,
19 males smoke out of 60 males, so 19/60 = 0.31666...
• What is the probability that a randomly selected smoker is male? This time, you're told that you
have a smoker and asked to find the probability that the smoker is also male. There are 19 male
smokers out of 31 total smokers, so 19/31 = 0.6129 (approx)
After that last part, you have just worked a Bayes' Theorem problem. I know you didn't realize it
- that's the beauty of it. A Bayes' problem can be set up so it appears to be just another
conditional probability. In this class we will treat Bayes' problems as another conditional
probability and not involve the large messy formula given in the text (and every other text).
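The smoking questions all come down to "count in the event divided by count in the (possibly reduced) sample space". Here is one way that might be sketched in Python; the helper names and the way the table is stored are assumptions made for this illustration.

```python
from fractions import Fraction

# Counts from the "Do you smoke?" table
counts = {
    ("Male", "Yes"): 19, ("Male", "No"): 41,
    ("Female", "Yes"): 12, ("Female", "No"): 28,
}

def count(gender=None, smokes=None):
    """Number of people matching the given gender and/or smoking answer."""
    return sum(c for (g, s), c in counts.items()
               if gender in (None, g) and smokes in (None, s))

def probability(event, given=None):
    """P(event | given): count in both, divided by count in the reduced sample space."""
    given = given or {}
    return Fraction(count(**{**given, **event}), count(**given))

print(probability({"gender": "Male", "smokes": "Yes"}))           # 19/100 (joint)
print(probability({"smokes": "Yes"}))                             # 31/100 (marginal)
print(probability({"smokes": "Yes"}, given={"gender": "Male"}))   # 19/60
print(probability({"gender": "Male"}, given={"smokes": "Yes"}))   # 19/31
```

The only thing that changes between the last two lines is which total is used as the denominator: the 60 males or the 31 smokers.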
Example 2:
There are three major manufacturing companies that make a product: Aberations, Brochmailians,
and Chompieliens. Aberations has a 50% market share, and Brochmailians has a 30% market
share. 5% of Aberations' product is defective, 7% of Brochmailians' product is defective, and
10% of Chompieliens' product is defective.
This information can be placed into a joint probability distribution
Company         Good                    Defective             Total
Aberations      0.50 - 0.025 = 0.475    0.05(0.50) = 0.025    0.50
Brochmailians   0.30 - 0.021 = 0.279    0.07(0.30) = 0.021    0.30
Chompieliens    0.20 - 0.020 = 0.180    0.10(0.20) = 0.020    0.20
Total           0.934                   0.066                 1.00
The percent of the market share for Chompieliens wasn't given, but since the marginals must add
to be 1.00, they have a 20% market share.
Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This is because
they are conditional probabilities and the table is a joint probability table. These defective
probabilities are conditional upon which company was given. That is, the 7% is not P(Defective),
but P(Defective|Brochmailians). The joint probability P(Defective and Brochmailians) =
P(Defective|Brochmailians) * P(Brochmailians).
The "good" probabilities can be found by subtraction as shown above, or by multiplication using
conditional probabilities. If 7% of Brochmailians' product is defective, then 93% is good.
0.93(0.30)=0.279.
• What is the probability a randomly selected product is defective? P(Defective) = 0.066
• What is the probability that a defective product came from Brochmailians?
P(Brochmailians|Defective) = P(Brochmailians and Defective) / P(Defective) = 0.021/0.066 = 7/22
= 0.318 (approx).
• Are these events independent? No. If they were, then P(Brochmailians|Defective)=0.318 would
have to equal the P(Brochmailians)=0.30, but it doesn't. Also, the P(Aberations and
Defective)=0.025 would have to be P(Aberations)*P(Defective) = 0.50*0.066=0.033, and it
doesn't.
The second question asked above is a Bayes' problem. Again, my point is, you don't have to
know Bayes' formula just to work a Bayes' problem.
Bayes' Theorem
However, just for the sake of argument, let's say that you want to know what Bayes' formula is.
Let's use the same example, but shorten each event to its one letter initial, i.e., A, B, C, and D
instead of Aberations, Brochmailians, Chompieliens, and Defective.
P(D|B) is not a Bayes problem. This is given in the problem. Bayes' formula finds the reverse
conditional probability P(B|D).
It is based on the fact that the given event (D) is made up of three parts: the part of D in A, the
part of D in B, and the part of D in C.
                        P(B and D)
P(B|D) = -----------------------------------------
           P(A and D) + P(B and D) + P(C and D)
Inserting the multiplication rule for each of these joint probabilities gives
                        P(D|B)*P(B)
P(B|D) = -----------------------------------------
          P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C)
However, and I hope you agree, it is much easier to take the joint probability divided by the
marginal probability. The table does the adding for you and makes the problems doable without
having to memorize the formulas.
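For anyone who would rather let a computer do the adding, here is a minimal Python sketch of the same calculation. It builds the "Defective" column of the joint table from the market shares and defect rates, then divides the joint probability by the marginal; the variable names are made up for this illustration.

```python
from fractions import Fraction

# P(A), P(B), P(C): market shares (C's 20% follows because the shares total 1)
market_share = {"A": Fraction("0.50"), "B": Fraction("0.30"), "C": Fraction("0.20")}
# P(D|A), P(D|B), P(D|C): conditional defect rates
defect_rate = {"A": Fraction("0.05"), "B": Fraction("0.07"), "C": Fraction("0.10")}

# Joint probabilities P(company and D) = P(D|company) * P(company)
joint = {c: defect_rate[c] * market_share[c] for c in market_share}

p_defective = sum(joint.values())          # marginal P(D), the denominator of Bayes' formula
p_b_given_d = joint["B"] / p_defective     # P(B|D) = P(B and D) / P(D)

print(float(p_defective))   # 0.066
print(p_b_given_d)          # 7/22
```

The sum in the denominator is exactly the three-term sum in Bayes' formula, so the table approach and the formula are the same calculation.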
Stats: Conditional Probability
Recall that the probability of an event occurring given that another event has already occurred is
called a conditional probability.
The probability that event B occurs, given that event A has already occurred is
P(B|A) = P(A and B) / P(A)
This formula comes from the general multiplication principle and a little bit of algebra.
Since we are given that event A has occurred, we have a reduced sample space. Instead of the
entire sample space S, we now have a sample space of A since we know A has occurred. So the
old rule about being the number in the event divided by the number in the sample space still
applies. It is the number in A and B (must be in A since A has occurred) divided by the number
in A. If you then divided numerator and denominator of the right hand side by the number in the
sample space S, then you have the probability of A and B divided by the probability of A.
Examples
Example 1:
The question, "Do you smoke?" was asked of 100 people. Results are shown in the table.
. Yes No Total
Male 19 41 60
Female 12 28 40
Total 31 69 100
 What is the probability of a randomly selected individual being a male who smokes? This is just a
joint probability. The number of "Male and Smoke" divided by the total = 19/100 = 0.19
 What is the probability of a randomly selected individual being a male? This is the total for male
divided by the total = 60/100 = 0.60. Since no mention is made of smoking or not smoking, it
includes all the cases.
 What is the probability of a randomly selected individual smoking? Again, since no mention is
made of gender, this is a marginal probability, the total who smoke divided by the total = 31/100
= 0.31.
 What is the probability of a randomly selected male smoking? This time, you're told that you
have a male - think of stratified sampling. What is the probability that the male smokes? Well,
19 males smoke out of 60 males, so 19/60 = 0.31666...
 What is the probability that a randomly selected smoker is male? This time, you're told that you
have a smoker and asked to find the probability that the smoker is also male. There are 19 male
smokers out of 31 total smokers, so 19/31 = 0.6129 (approx)
After that last part, you have just worked a Bayes' Theorem problem. I know you didn't realize it
- that's the beauty of it. A Bayes' problem can be set up so it appears to be just another
conditional probability. In this class we will treat Bayes' problems as another conditional
probability and not involve the large messy formula given in the text (and every other text).
Example 2:
There are three major manufacturing companies that make a product: Aberations, Brochmailians,
and Chompielians. Aberations has a 50% market share, and Brochmailians has a 30% market
share. 5% of Aberations' product is defective, 7% of Brochmailians' product is defective, and
10% of Chompieliens' product is defective.
This information can be placed into a joint probability distribution:
Company         Good                     Defective             Total
Aberations      0.50 - 0.025 = 0.475     0.05(0.50) = 0.025    0.50
Brochmailians   0.30 - 0.021 = 0.279     0.07(0.30) = 0.021    0.30
Chompieliens    0.20 - 0.020 = 0.180     0.10(0.20) = 0.020    0.20
Total           0.934                    0.066                 1.00
The percent of the market share for Chompieliens wasn't given, but since the marginals must add
to be 1.00, they have a 20% market share.
Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This is because
they are conditional probabilities and the table is a joint probability table. These defective
probabilities are conditional upon which company was given. That is, the 7% is not P(Defective),
but P(Defective|Brochmailians). The joint probability P(Defective and Brochmailians) =
P(Defective|Brochmailians) * P(Brochmailians).
The "good" probabilities can be found by subtraction as shown above, or by multiplication using
conditional probabilities. If 7% of Brochmailians' product is defective, then 93% is good.
0.93(0.30)=0.279.
• What is the probability a randomly selected product is defective? P(Defective) = 0.066
• What is the probability that a defective product came from Brochmailians?
P(Brochmailians|Defective) = P(Brochmailians and Defective) / P(Defective) = 0.021/0.066 = 7/22 = 0.318 (approx).
• Are these events independent? No. If they were, then P(Brochmailians|Defective)=0.318 would
have to equal P(Brochmailians)=0.30, but it doesn't. Also, P(Aberations and
Defective)=0.025 would have to be P(Aberations)*P(Defective) = 0.50*0.066=0.033, and it
isn't.
The second question asked above is a Bayes' problem. Again, my point is, you don't have to
know Bayes' formula just to work a Bayes' problem.
Bayes' Theorem
However, just for the sake of argument, let's say that you want to know what Bayes' formula is.
Let's use the same example, but shorten each event to its one-letter initial, i.e., A, B, C, and D
instead of Aberations, Brochmailians, Chompieliens, and Defective.
P(D|B) is not a Bayes' problem. This is given in the problem. Bayes' formula finds the reverse
conditional probability P(B|D).
It is based on the fact that the given event (D) is made up of three parts: the part of D in A, the
part of D in B, and the part of D in C.
                       P(B and D)
P(B|D) = -----------------------------------------
           P(A and D) + P(B and D) + P(C and D)
Inserting the multiplication rule for each of these joint probabilities gives
                       P(D|B)*P(B)
P(B|D) = -----------------------------------------
          P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C)
However, and I hope you agree, it is much easier to take the joint probability divided by the
marginal probability. The table does the adding for you and makes the problems doable without
having to memorize the formulas.
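To see that the table really does the adding for you, here is a small sketch in Python (my choice of language; the dictionary names `share`, `defect_rate`, and `joint_defective` are made up for illustration). It builds the "Defective" column with the multiplication rule and then divides joint by marginal, which is all Bayes' formula is doing.

```python
from fractions import Fraction

# Market shares P(company) and defect rates P(Defective | company) from Example 2
share = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
defect_rate = {"A": Fraction(5, 100), "B": Fraction(7, 100), "C": Fraction(10, 100)}

# Multiplication rule: P(company and Defective) = P(Defective | company) * P(company)
joint_defective = {c: defect_rate[c] * share[c] for c in share}

# The marginal P(Defective) is the sum of the "Defective" column
p_defective = sum(joint_defective.values())

def posterior(company):
    """P(company | Defective) = joint probability / marginal probability."""
    return joint_defective[company] / p_defective

print(float(p_defective))   # P(Defective)
print(posterior("B"))       # P(B | D) as an exact fraction
```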
Chapter 5
Stats: Probability Distributions
Definitions
Random Variable
Variable whose values are determined by chance
Probability Distribution
The values a random variable can assume and the corresponding probabilities of each.
Expected Value
The theoretical mean of the variable.
Binomial Experiment
An experiment with a fixed number of independent trials. Each trial can only have two
outcomes, or outcomes which can be reduced to two outcomes. The probability of each
outcome must remain constant from trial to trial.
Binomial Distribution
The outcomes of a binomial experiment with their corresponding probabilities.
Multinomial Distribution
A probability distribution resulting from an experiment with a fixed number of
independent trials. Each trial has two or more mutually exclusive outcomes. The
probability of each outcome must remain constant from trial to trial.
Poisson Distribution
A probability distribution used when a density of items is distributed over a period of
time. The sample size needs to be large and the probability of success to be small.
Hypergeometric Distribution
A probability distribution of a variable with two outcomes when sampling is done
without replacement.
Stats: Probability Distributions

Probability Functions
A probability function is a function which assigns probabilities to the values of a random
variable.
• All the probabilities must be between 0 and 1 inclusive.
• The sum of the probabilities of the outcomes must be 1.
If these two conditions aren't met, then the function isn't a probability function. There is no
requirement that the values of the random variable only be between 0 and 1, only that the
probabilities be between 0 and 1.
Probability Distributions
A listing of all the values the random variable can assume with their corresponding probabilities
makes a probability distribution.
A note about random variables. A random variable does not mean that the values can be anything
(a random number). Random variables have a well defined set of outcomes and well defined
probabilities for the occurrence of each outcome. The random refers to the fact that the outcomes
happen by chance -- that is, you don't know which outcome will occur next.
Here's an example probability distribution that results from the rolling of a single fair die.
x 1 2 3 4 5 6 sum
p(x) 1/6 1/6 1/6 1/6 1/6 1/6 6/6=1
Mean, Variance, and Standard Deviation

Consider the following.
The definitions for population mean and variance used with an ungrouped frequency distribution
were:
mu = sum( x * f ) / N
sigma^2 = sum( (x - mu)^2 * f ) / N
Some of you might be confused by only dividing by N. Recall that this is the population
variance. The sample variance, which was the unbiased estimator for the population variance,
was the one divided by n-1.
Using algebra, this is equivalent to:
mu = sum( x * f/N )
sigma^2 = sum( x^2 * f/N ) - [ sum( x * f/N ) ]^2
Recall that a probability is a long term relative frequency. So every f/N can be replaced by p(x).
This simplifies to be:
mu = sum( x * p(x) )
sigma^2 = sum( x^2 * p(x) ) - [ sum( x * p(x) ) ]^2
What's even better is that the last portion of the variance is the mean squared. So, the two
formulas that we will be using are:
mu = sum( x * p(x) )
sigma^2 = sum( x^2 * p(x) ) - mu^2
Here's the example we were working on earlier.
x 1 2 3 4 5 6 sum
p(x) 1/6 1/6 1/6 1/6 1/6 1/6 6/6 = 1
x p(x) 1/6 2/6 3/6 4/6 5/6 6/6 21/6 = 3.5
x^2 p(x) 1/6 4/6 9/6 16/6 25/6 36/6 91/6 = 15.1667
The mean is 7/2 or 3.5

The variance is 91/6 - (7/2)^2 = 35/12 = 2.916666...
The standard deviation is the square root of the variance = 1.7078
Do not use rounded off values in the intermediate calculations. Only round off the final answer.
You can learn how to find the mean and variance of a probability distribution using lists with the
TI-82 or using the program called pdist.
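If you don't have the TI-82 handy, the same work can be sketched in Python (an assumption on my part; the course itself uses the calculator, and the variable name `dist` is just for illustration). The sketch checks the two probability-function conditions, then takes the sum of x p(x) for the mean and the sum of x^2 p(x) minus the mean squared for the variance, exactly as in the table above.

```python
from fractions import Fraction
from math import sqrt

# Probability distribution for one roll of a fair die: value -> probability
dist = {x: Fraction(1, 6) for x in range(1, 7)}

# A probability function needs every p(x) in [0, 1] and a total of 1
assert all(0 <= p <= 1 for p in dist.values())
assert sum(dist.values()) == 1

mean = sum(x * p for x, p in dist.items())                   # sum of x * p(x)
variance = sum(x**2 * p for x, p in dist.items()) - mean**2   # sum of x^2 * p(x), minus mean squared
std_dev = sqrt(variance)                                      # round only at the very end

print(mean, variance, round(std_dev, 4))
```

Keeping the values as fractions until the last line follows the advice above about not rounding intermediate calculations.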
Stats: Binomial Probabilities
Binomial Experiment
A binomial experiment is an experiment which satisfies these four conditions:
• A fixed number of trials
• Each trial is independent of the others
• There are only two outcomes
• The probability of each outcome remains constant from trial to trial.
These can be summarized as: An experiment with a fixed number of independent trials, each of
which can only have two possible outcomes.
The fact that each trial is independent actually means that the probabilities remain constant.
Examples of binomial experiments
• Tossing a coin 20 times to see how many tails occur.
• Asking 200 people if they watch ABC news.
• Rolling a die to see if a 5 appears.
Examples which aren't binomial experiments
• Rolling a die until a 6 appears (not a fixed number of trials)
• Asking 20 people how old they are (not two outcomes)
• Drawing 5 cards from a deck for a poker hand (done without replacement, so not independent)
Binomial Probability Function

Example:
What is the probability of rolling exactly two sixes in 6 rolls of a die?
There are five things you need to do to work a binomial story problem.
1. Define Success first. Success must be for a single trial. Success = "Rolling a 6 on a single die"
2. Define the probability of success (p): p = 1/6
3. Find the probability of failure: q = 5/6
4. Define the number of trials: n = 6
5. Define the number of successes out of those trials: x = 2
Anytime a six appears, it is a success (denoted S) and anytime something else appears, it is a
failure (denoted F). The ways you can get exactly 2 successes in 6 trials are given below. The
probability of each is written to the right of the way it could occur. Because the trials are
independent, the probability of the event (all six dice) is the product of each probability of each
outcome (die)
1 FFFFSS 5/6 * 5/6 * 5/6 * 5/6 * 1/6 * 1/6 = (1/6)^2 * (5/6)^4

2 FFFSFS 5/6 * 5/6 * 5/6 * 1/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
3 FFFSSF 5/6 * 5/6 * 5/6 * 1/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4
4 FFSFFS 5/6 * 5/6 * 1/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
5 FFSFSF 5/6 * 5/6 * 1/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4
6 FFSSFF 5/6 * 5/6 * 1/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
7 FSFFFS 5/6 * 1/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
8 FSFFSF 5/6 * 1/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4
9 FSFSFF 5/6 * 1/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
10 FSSFFF 5/6 * 1/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
11 SFFFFS 1/6 * 5/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4
12 SFFFSF 1/6 * 5/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

13 SFFSFF 1/6 * 5/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
14 SFSFFF 1/6 * 5/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
15 SSFFFF 1/6 * 1/6 * 5/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4
Notice that each of the 15 probabilities is exactly the same: (1/6)^2 * (5/6)^4.
Also, note that the 1/6 is the probability of success and you needed 2 successes. The 5/6 is the
probability of failure, and if 2 of the 6 trials were successes, then 4 of the 6 must be failures. Note
that 2 is the value of x and 4 is the value of n-x.
Further note that there are fifteen ways this can occur. This is the number of ways 2 successes
can occur in 6 trials without repetition and with order not being important, or a combination of 6
things, 2 at a time: 6C2 = 15.
The probability of getting exactly x successes in n trials, with the probability of success on a
single trial being p, is:
P(X=x) = nCx * p^x * q^(n-x)

Example:
A coin is tossed 10 times. What is the probability that exactly 6 heads will occur?
1. Success = "A head is flipped on a single coin"

2. p = 0.5
3. q = 0.5
4. n = 10
5. x=6
P(x=6) = 10C6 * 0.5^6 * 0.5^4 = 210 * 0.015625 * 0.0625 = 0.205078125
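Both binomial examples can be checked with a few lines of Python (my own sketch, not something from the text; the function name `binomial_pmf` is made up). It is a direct translation of nCx * p^x * q^(n-x), using `math.comb` for the combination.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * q^(n-x) for a binomial experiment."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

two_sixes = binomial_pmf(2, 6, 1 / 6)   # exactly two sixes in 6 rolls
six_heads = binomial_pmf(6, 10, 0.5)    # exactly 6 heads in 10 tosses

print(comb(6, 2))            # number of ways to place 2 successes in 6 trials
print(round(two_sixes, 4))
print(six_heads)
```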
Mean, Variance, and Standard Deviation

The mean, variance, and standard deviation of a binomial distribution are extremely easy to find.
mu = n * p
sigma^2 = n * p * q
sigma = sqrt( n * p * q )
Another way to remember the variance is mu * q (since the np is mu).

Example:
Find the mean, variance, and standard deviation for the number of sixes that appear when rolling
30 dice.
Success = "a six is rolled on a single die". p = 1/6, q = 5/6.
The mean is 30 * (1/6) = 5. The variance is 30 * (1/6) * (5/6) = 25/6. The standard deviation is
the square root of the variance = 2.041241452 (approx)
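As a quick check of the 30-dice example, here is a hedged Python sketch (the helper name `binomial_summary` is my own). It uses mean = n * p and variance = mean * q, and only takes the square root at the end.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a binomial distribution."""
    q = 1 - p
    mean = n * p
    variance = mean * q          # same as n * p * q
    return mean, variance, sqrt(variance)

mean, variance, std_dev = binomial_summary(30, 1 / 6)   # sixes in 30 dice
print(mean, variance, std_dev)
```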
Stats: Other Discrete Distributions
Multinomial Probabilities
A multinomial experiment is an extension of the binomial experiment. The difference is that in a
multinomial experiment, there are more than two possible outcomes. However, there are still a
fixed number of independent trials, and the probability of each outcome must remain constant
from trial to trial.
Instead of using a combination, as in the case of the binomial probability, the number of ways
the outcomes can occur is done using distinguishable permutations.
An example here will be much more useful than a formula.
The probability that a person will pass a College Algebra class is 0.55, the probability that a
person will withdraw before the class is completed is 0.40, and the probability that a person will
fail the class is 0.05. Find the probability that in a class of 30 students, exactly 16 pass, 12
withdraw, and 2 fail.
Outcome x p(outcome)
Pass 16 0.55
Withdraw 12 0.40
Fail 2 0.05
Total 30 1.00
The probability is found using this formula:
           30!
P = ---------------- * 0.55^16 * 0.40^12 * 0.05^2
    (16!) (12!) (2!)
You can do this on the TI-82.
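If you'd rather not key all those factorials into the calculator, the same computation can be sketched in Python (an assumption on my part; `multinomial_probability` is a made-up helper name). It counts the distinguishable permutations and multiplies by each probability raised to its count.

```python
from math import factorial, prod

def multinomial_probability(counts, probs):
    """Distinguishable permutations times the product of p^x for each outcome."""
    n = sum(counts)
    ways = factorial(n)
    for x in counts:
        ways //= factorial(x)     # builds n! / (x1! * x2! * ... * xk!)
    return ways * prod(p**x for p, x in zip(probs, counts))

# Pass, withdraw, fail counts and probabilities from the College Algebra example
p = multinomial_probability([16, 12, 2], [0.55, 0.40, 0.05])
print(p)
```

With only two outcomes the function reduces to the binomial formula, which is a handy sanity check.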
Poisson Probabilities
Named after the French mathematician Simeon Poisson, Poisson probabilities are useful when
there are a large number of independent trials with a small probability of success on a single trial
and the variables occur over a period of time. It can also be used when a density of items is
distributed over a given area or volume.
The probability of exactly x occurrences is
p(x; lambda) = e^(-lambda) * lambda^x / x!
Lambda in the formula is the mean number of occurrences. If you're approximating a binomial
probability using the Poisson, then lambda is the same as mu or n * p.
Example:
If there are 500 customers per eight-hour day in a check-out lane, what is the probability that
there will be exactly 3 in line during any five-minute period?
The expected value during any one five minute period would be 500 / 96 = 5.2083333. The 96 is
because there are 96 five-minute periods in eight hours. So, you expect about 5.2 customers in 5
minutes and want to know the probability of getting exactly 3.
p(3;500/96) = e^(-500/96) * (500/96)^3 / 3! = 0.1288 (approx)
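Here is the check-out lane example as a short Python sketch (my own illustration; the name `poisson_pmf` is not from the text). It evaluates e^(-lambda) * lambda^x / x! with lambda = 500/96, without rounding lambda first.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 500 / 96               # expected customers per five-minute period
print(round(poisson_pmf(3, lam), 4))
```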
Hypergeometric Probabilities
Hypergeometric experiments occur when the trials are not independent of each other and occur
due to sampling without replacement -- as in a five card poker hand.
Hypergeometric probabilities involve the multiplication of two combinations together and then
division by the total number of combinations.
Example:
What is the probability of selecting exactly 3 men and 4 women when 7 people are chosen at
random from a group of 7 men and 10 women?
The answer is (7C3 * 10C4) / 17C7 = (35 * 210) / 19448 = 7350/19448 = 0.3779 (approx)
Note that the numbers in the numerator add up to the numbers used in the combination in
the denominator: 7 + 10 = 17 and 3 + 4 = 7.
This can be extended to more than two groups and called an extended hypergeometric problem.
You can use the TI-82 to find hypergeometric probabilities.
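For those who want to see the arithmetic spelled out, here is a minimal Python sketch of the men-and-women example (my own helper, `hypergeometric`, with hypothetical parameter names). It multiplies two combinations and divides by the total number of combinations.

```python
from math import comb
from fractions import Fraction

def hypergeometric(men_chosen, women_chosen, men_total, women_total):
    """Probability of a given split when sampling without replacement."""
    chosen = men_chosen + women_chosen
    ways_wanted = comb(men_total, men_chosen) * comb(women_total, women_chosen)
    ways_total = comb(men_total + women_total, chosen)
    return Fraction(ways_wanted, ways_total)

p = hypergeometric(3, 4, 7, 10)
print(p, round(float(p), 4))
```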
Chapter 7
Stats: Normal Distribution
Definitions
Central Limit Theorem
Theorem which states that as the sample size increases, the sampling distribution of the sample
means will become approximately normally distributed.
Correction for Continuity
A correction applied to convert a discrete distribution to a continuous distribution.

Finite Population Correction Factor
A correction applied to the standard error of the means when the sample size is more than
5% of the population size and the sampling is done without replacement.
Sampling Distribution of the Sample Means
Distribution obtained by using the means computed from random samples of a specific
size.
Sampling Error
Difference which occurs between the sample statistic and the population parameter due to
the fact that the sample isn't a perfect representation of the population.
Standard Error of the Mean
The standard deviation of the sampling distribution of the sample means. It is equal to the
standard deviation of the population divided by the square root of the sample size.
Standard Normal Distribution
A normal distribution in which the mean is 0 and the standard deviation is 1. It is denoted
by z.
Z-score
Also known as z-value. A standardized score in which the mean is zero and the standard
deviation is 1. The Z score is used to represent the standard normal distribution.
Stats - Normal Distributions
Any Normal Distribution

• Bell-shaped
• Symmetric about mean
• Continuous
• Never touches the x-axis
• Total area under curve is 1.00
• Approximately 68% lies within 1 standard deviation of the mean, 95% within 2 standard
deviations, and 99.7% within 3 standard deviations of the mean. This is the Empirical Rule
mentioned earlier.
• Data values represented by x which has mean mu and standard deviation sigma.
• Probability function given by
f(x) = 1 / ( sigma * sqrt(2*pi) ) * e^( -(x - mu)^2 / (2 * sigma^2) )
Standard Normal Distribution

Same as a normal distribution, but also ...
• Mean is zero
• Variance is one
• Standard Deviation is one
• Data values represented by z.
• Probability function given by
f(z) = 1 / sqrt(2*pi) * e^( -z^2 / 2 )
Normal Probabilities
Comprehension of this table is vital to success in the course!
There is a table which must be used to look up standard normal probabilities. The z-score is
broken into two parts, the whole number and tenth are looked up along the left side and the
hundredth is looked up across the top. The value in the intersection of the row and column is the
area under the curve between zero and the z-score looked up.
Because of the symmetry of the normal distribution, look up the absolute value of any z-score.
Computing Normal Probabilities
There are several different situations that can arise when asked to find normal probabilities.
Situation                                          Instructions
Between zero and any number                        Look up the area in the table
Between two positives, or between two negatives    Look up both areas in the table and subtract the smaller from the larger
Between a negative and a positive                  Look up both areas in the table and add them together
Less than a negative, or greater than a positive   Look up the area in the table and subtract from 0.5000
Greater than a negative, or less than a positive   Look up the area in the table and add to 0.5000
This can be shortened into two rules.
1. If there is only one z-score given, use 0.5000 for the second area, otherwise look up both z-
scores in the table
2. If the two numbers are the same sign, then subtract; if they are different signs, then add. If
there is only one z-score, then use the inequality to determine the second sign (< is negative,
and > is positive).
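The add-or-subtract rules can be sketched in Python as a way to check your table work (this is my own illustration, not the course method; `table_area`, `between`, and `less_than` are made-up names, and `statistics.NormalDist` stands in for the printed table).

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

def table_area(z):
    """Area between 0 and z, the quantity printed in the body of the table."""
    return Z.cdf(abs(z)) - 0.5

def between(z_low, z_high):
    """P(z_low < Z < z_high): subtract for same signs, add for different signs."""
    if z_low * z_high >= 0:
        return abs(table_area(z_high) - table_area(z_low))
    return table_area(z_low) + table_area(z_high)

def less_than(z):
    """P(Z < z): 0.5 plus the table area for positive z, 0.5 minus it for negative z."""
    return 0.5 + table_area(z) if z >= 0 else 0.5 - table_area(z)

print(round(table_area(1.96), 4))
print(round(between(-1.0, 2.0), 4))
print(round(less_than(-1.5), 4))
```

Notice that `table_area` always looks up the absolute value of the z-score, just as you do with the printed table.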
Finding z-scores from probabilities
This is more difficult, and requires you to use the table inversely. You must look up the area
between zero and the value on the inside part of the table, and then read the z-score from the
outside. Finally, decide if the z-score should be positive or negative, based on whether it was on
the left side or the right side of the mean. Remember, z-scores can be negative, but areas or
probabilities cannot be.
Situation                                 Instructions
Area between 0 and a value                Look up the area in the table
                                          Make negative if on the left side
Area in one tail                          Subtract the area from 0.5000
                                          Look up the difference in the table
                                          Make negative if in the left tail
Area including one complete half          Subtract 0.5000 from the area
(less than a positive or greater than     Look up the difference in the table
a negative)                               Make negative if on the left side
Within z units of the mean                Divide the area by 2
                                          Look up the quotient in the table
                                          Use both the positive and negative z-scores
Two tails with equal area                 Subtract the area from 1.000
(more than z units from the mean)         Divide the difference by 2
                                          Look up the quotient in the table
                                          Use both the positive and negative z-scores
You become proficient with the table through practice, so work lots of the normal probability problems.
Standard Normal Probabilities

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
The values in the table are the areas between zero and the z-score. That is, P(0<Z<z-score)
Stats: Central Limit Theorem

Sampling Distribution of the Sample Means

Instead of working with individual scores, statisticians often work with means. What happens is
that several samples are taken, the mean is computed for each sample, and then the means are
used as the data, rather than individual scores being used. The result is a sampling distribution
of the sample means.
When all of the possible sample means are computed, then the following properties are true:
• The mean of the sample means will be the mean of the population.
• The variance of the sample means will be the variance of the population divided by the
sample size.
• The standard deviation of the sample means (known as the standard error of the mean)
will be smaller than the population standard deviation and will be equal to the standard
deviation of the population divided by the square root of the sample size.
• If the population has a normal distribution, then the sample means will have a normal
distribution.
• If the population is not normally distributed, but the sample size is sufficiently large, then
the sample means will have an approximately normal distribution. Some books define
sufficiently large as at least 30 and others as at least 31.
The formula for a z-score when working with the sample means is:
z = ( x bar - mu ) / ( sigma / sqrt(n) )
Finite Population Correction Factor

If the sample size is more than 5% of the population size and the sampling is done without
replacement, then a correction needs to be made to the standard error of the means.
In the following, N is the population size and n is the sample size. The adjustment is to multiply
the standard error by the square root of the quotient of the difference between the population and
sample sizes and one less than the population size:
corrected standard error = ( sigma / sqrt(n) ) * sqrt( (N - n) / (N - 1) )
For the most part, we will be ignoring this in class.
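Even though we mostly ignore the correction, it may help to see it written out. The Python sketch below is my own illustration (the function names and the sample numbers are hypothetical, chosen only so the arithmetic is easy to follow). It computes the standard error, applies the finite population correction only when the sample is more than 5% of the population, and then forms the z-score for a sample mean.

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the mean, with the finite population correction when
    the sample is more than 5% of a population of size N."""
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return se

def z_for_sample_mean(x_bar, mu, sigma, n, N=None):
    """z-score of a sample mean: (x_bar - mu) / standard error."""
    return (x_bar - mu) / standard_error(sigma, n, N)

# Hypothetical numbers chosen only for illustration
print(standard_error(10, 25))           # no correction
print(standard_error(10, 25, N=100))    # 25 is more than 5% of 100
print(z_for_sample_mean(52, 50, 10, 25))
```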
Stats: Normal Approximation to Binomial

Recall that according to the Central Limit Theorem, the sampling distribution of the sample mean
of any distribution will become approximately normal if the sample size is sufficiently large.
It turns out that the binomial distribution can be approximated using the normal distribution if np
and nq are both at least 5. Furthermore, recall that the mean of a binomial distribution is np and
the variance of the binomial distribution is npq.
Continuity Correction Factor

There is a problem with approximating the binomial with the normal. That problem arises
because the binomial distribution is a discrete distribution while the normal distribution is a
continuous distribution. The basic difference here is that with discrete values, we are talking
about heights but no widths, and with the continuous distribution we are talking about both
heights and widths.
The correction is to either add or subtract 0.5 of a unit from each discrete x-value. This fills in
the gaps to make it continuous. This is very similar to the expanding of limits to form boundaries
that we did with grouped frequency distributions.
Examples
Discrete    Continuous
x = 6       5.5 < x < 6.5
x > 6       x > 6.5
x >= 6      x > 5.5
x < 6       x < 5.5
x <= 6      x < 6.5
As you can see, whether or not the equal to is included makes a big difference in the discrete
distribution and the way the conversion is performed. However, for a continuous distribution,
equality makes no difference.
Steps to working a normal approximation to the binomial distribution
1. Identify success, the probability of success, the number of trials, and the desired number
of successes. Since this is a binomial problem, these are the same things which were
identified when working a binomial problem.
2. Convert the discrete x to a continuous x. Some people would argue that step 3 should be
done before this step, but go ahead and convert the x before you forget about it and miss
the problem.
3. Find the smaller of np or nq. If the smaller one is at least five, then the larger must also
be, so the approximation will be considered good. When you find np, you're actually
finding the mean, mu, so denote it as such.
4. Find the standard deviation, sigma = sqrt (npq). It might be easier to find the variance and
just stick the square root in the final calculation - that way you don't have to work with all
of the decimal places.
5. Compute the z-score using the standard formula for an individual score (not the one for a
sample mean).
6. Calculate the probability desired.
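The six steps can be sketched in Python for the "exactly x" case (this is my own illustration with a made-up function name, and it reuses the earlier coin example of 6 heads in 10 tosses so there is an exact answer to compare against). The comments point back to the step numbers above.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_exactly(x, n, p):
    """Approximate the binomial P(X = x) by the normal area from x-0.5 to x+0.5."""
    q = 1 - p
    if min(n * p, n * q) < 5:
        raise ValueError("np and nq should both be at least 5")
    mu = n * p                        # step 3: the mean
    sigma = sqrt(n * p * q)           # step 4: the standard deviation
    z_low = (x - 0.5 - mu) / sigma    # steps 2 and 5: continuity correction, then z
    z_high = (x + 0.5 - mu) / sigma
    Z = NormalDist()
    return Z.cdf(z_high) - Z.cdf(z_low)   # step 6

approx = normal_approx_exactly(6, 10, 0.5)   # 6 heads in 10 tosses
exact = comb(10, 6) * 0.5**10
print(round(approx, 4), round(exact, 4))
```

Here np and nq are both exactly 5, so this is right at the edge of where the approximation is considered good.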
Chapter 8
Stats: Estimation
Definitions
Confidence Interval
An interval estimate with a specific level of confidence
Confidence Level
The percent of the time the true mean will lie in the interval estimate given.
Consistent Estimator
An estimator which gets closer to the value of the parameter as the sample size increases.
Degrees of Freedom
The number of data values which are allowed to vary once a statistic has been
determined.
Estimator
A sample statistic which is used to estimate a population parameter. It must be unbiased,
consistent, and relatively efficient.
Interval Estimate
A range of values used to estimate a parameter.
Maximum Error of the Estimate
The maximum difference between the point estimate and the actual parameter. The
Maximum Error of the Estimate is one-half the width of the confidence interval for means and
proportions.
Point Estimate
A single value used to estimate a parameter.
Relatively Efficient Estimator
The estimator for a parameter with the smallest variance.
T distribution
A distribution used when the population variance is unknown.
Unbiased Estimator
An estimator whose expected value is equal to the parameter being estimated.
Stats: Introduction to Estimation
One area of concern in inferential statistics is the estimation of the population parameter from the
sample statistic. It is important to realize the order here. The sample statistic is calculated from
the sample data and the population parameter is inferred (or estimated) from this sample statistic.
Let me say that again: Statistics are calculated, parameters are estimated.
We talked about problems of obtaining the value of the parameter earlier in the course when we
talked about sampling techniques.
Another area of inferential statistics is sample size determination. That is, how large of a sample
should be taken to make an accurate estimation. In these cases, the statistics can't be used since
the sample hasn't been taken yet.
Point Estimates
There are two types of estimates we will find: Point Estimates and Interval Estimates. The point
estimate is the single best value.
A good estimator must satisfy three conditions:
• Unbiased: The expected value of the estimator must be equal to the value of the parameter
• Consistent: The value of the estimator approaches the value of the parameter as the sample size
increases
• Relatively Efficient: The estimator has the smallest variance of all estimators which could be
used
Confidence Intervals
The point estimate is going to be different from the population parameter because of
sampling error, and there is no way to know how close it is to the actual parameter. For this
reason, statisticians like to give an interval estimate, which is a range of values used to estimate
the parameter.
A confidence interval is an interval estimate with a specific level of confidence. A level of
confidence is the probability that the interval estimate will contain the parameter. The level of
confidence is 1 - alpha, so an area of 1 - alpha lies within the confidence interval.
Maximum Error of the Estimate
The maximum error of the estimate is denoted by E and is one-half the width of the confidence
interval. The basic confidence interval for a symmetric distribution is set up to be the point
estimate minus the maximum error of the estimate is less than the true population parameter
which is less than the point estimate plus the maximum error of the estimate:
point estimate - E < parameter < point estimate + E
This formula will work for means and proportions because they will use the Z or T distributions
which are symmetric. Later, we will talk about variances, which don't use a symmetric
distribution, and the formula will be different.
Area in Tails
Since the level of confidence is 1-alpha, the amount in the tails is alpha. There is a notation in
statistics which means the score which has the specified area in the right tail.
Examples:
• Z(0.05) = 1.645 (the Z-score which has 0.05 to the right, and 0.4500 between 0 and it)
• Z(0.10) = 1.282 (the Z-score which has 0.10 to the right, and 0.4000 between 0 and it).
As a shorthand notation, the () are usually dropped, and the probability written as a subscript.
The Greek letter alpha is used to represent the area in both tails for a confidence interval, and so
alpha/2 will be the area in one tail.
Here are some common values
Confidence   Area between     Area in one       z-score
Level        0 and z-score    tail (alpha/2)
50%          0.2500           0.2500            0.674
80%          0.4000           0.1000            1.282
90%          0.4500           0.0500            1.645
95%          0.4750           0.0250            1.960
98%          0.4900           0.0100            2.326
99%          0.4950           0.0050            2.576
Notice in the above table, that the area between 0 and the z-score is simply one-half of the
confidence level. So, if there is a confidence level which isn't given above, all you need to do to
find it is divide the confidence level by two, and then look up the area in the inside part of the Z-
table and look up the z-score on the outside.
Also notice - if you look at the Student's t table, the top row is a level of confidence, and
the bottom row is the z-score. In fact, this is where I got the extra digit of accuracy from.
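If you want a z-score for a confidence level that isn't in the table of common values, here is a hedged Python sketch (my own; the name `z_critical` is made up, and `NormalDist().inv_cdf` plays the role of reading the table inversely). It puts alpha/2 in the right tail and finds the score that cuts it off.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """z-score with area alpha/2 in the right tail, where alpha = 1 - confidence level."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.50, 0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  {z_critical(level):.3f}")
```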
Stats: Estimating the Mean
You are estimating the population mean, mu, not the sample mean, x bar.
Population Standard Deviation Known

If the population standard deviation, sigma, is known, then the mean has a normal (Z)
distribution.
The maximum error of the estimate is given by
E = z(alpha/2) * sigma / sqrt(n)
The Z here is the z-score obtained from the normal table, or the bottom of the t-table as
explained in the introduction to estimation. The z-score depends on the level of confidence, so
you may get in the habit of writing it next to the level of confidence.
Once you have computed E, I suggest you save it to the memory on your calculator. On the
TI-82, a good choice would be the letter E. The reason for this is that the limits for the confidence
interval are now found by subtracting and adding the maximum error of the estimate from/to the
sample mean:
x bar - E < mu < x bar + E
Student's t Distribution
When the population standard deviation is unknown, the mean has a Student's t distribution. The
Student's t distribution was created by William S. Gosset, who worked for the Guinness brewery
in Dublin, Ireland. The brewery wouldn't allow him to publish his work under his name, so he
used the pseudonym "Student".
The Student's t distribution is very similar to the standard normal distribution.
• It is symmetric about its mean
• It has a mean of zero
• It has a standard deviation and variance greater than 1.
• There are actually many t distributions, one for each degree of freedom
• As the sample size increases, the t distribution approaches the normal distribution.
• It is bell shaped.
• The t-scores can be negative or positive, but the probabilities are always positive.
Degrees of Freedom
A degree of freedom occurs for every data value which is allowed to vary once a statistic has
been fixed. For a single mean, there are n-1 degrees of freedom. This value will change
depending on the statistic being used.
Population Standard Deviation Unknown

If the population standard deviation, sigma, is unknown, then the mean has a Student's t (t)
distribution and the sample standard deviation is used instead of the population standard
deviation.
The maximum error of the estimate is given by the formula E = t * s / sqrt(n). The
t here is the t-score obtained from the Student's t table. The t-score is a function
of the level of confidence and the sample size (through the degrees of freedom, n - 1).
Once you have computed E, I suggest you save it to the memory on your
calculator. On the TI-82, a good choice would be the letter E. The reason for
this is that the limits for the confidence interval are now found by subtracting and adding the
maximum error of the estimate from/to the sample mean.
Notice the formula is the same as for a population mean when the population standard deviation
is known. The only thing that has changed is the formula for the maximum error of the estimate.
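Here is a rough Python sketch of the t-based interval. Python's standard library has no Student's t distribution, so this example simply copies a few two-tail 95% critical values from the Student's T Critical Values table in these notes. The dictionary T_95, the function name, and the sample numbers are all made up for illustration.

```python
from math import sqrt

# Two-tail 95% critical values copied from the t table, keyed by degrees of freedom.
# Only a handful of rows are included; this is not a general t-table lookup.
T_95 = {9: 2.262, 19: 2.093, 24: 2.064, 29: 2.045}

def t_interval_95(x_bar, s, n):
    """95% confidence interval for mu when sigma is unknown (hypothetical helper)."""
    t = T_95[n - 1]          # a single mean has n - 1 degrees of freedom
    e = t * s / sqrt(n)      # maximum error of the estimate, E
    return x_bar - e, x_bar + e

# Made-up sample: n = 25, x bar = 20.0, s = 4.0
low, high = t_interval_95(20.0, 4.0, 25)
print(low, "< mu <", high)
```

The only difference from the known-sigma case is that t replaces z and s replaces sigma.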
Student's T Critical Values

Conf. Level 50% 80% 90% 95% 98% 99%
One Tail 0.250 0.100 0.050 0.025 0.010 0.005
Two Tail 0.500 0.200 0.100 0.050 0.020 0.010
df = 1 1.000 3.078 6.314 12.706 31.821 63.657
2 0.816 1.886 2.920 4.303 6.965 9.925
3 0.765 1.638 2.353 3.182 4.541 5.841
4 0.741 1.533 2.132 2.776 3.747 4.604
5 0.727 1.476 2.015 2.571 3.365 4.032
6 0.718 1.440 1.943 2.447 3.143 3.707
7 0.711 1.415 1.895 2.365 2.998 3.499
8 0.706 1.397 1.860 2.306 2.896 3.355
9 0.703 1.383 1.833 2.262 2.821 3.250
10 0.700 1.372 1.812 2.228 2.764 3.169
11 0.697 1.363 1.796 2.201 2.718 3.106
12 0.695 1.356 1.782 2.179 2.681 3.055
13 0.694 1.350 1.771 2.160 2.650 3.012
14 0.692 1.345 1.761 2.145 2.624 2.977
15 0.691 1.341 1.753 2.131 2.602 2.947
16 0.690 1.337 1.746 2.120 2.583 2.921
17 0.689 1.333 1.740 2.110 2.567 2.898
18 0.688 1.330 1.734 2.101 2.552 2.878
19 0.688 1.328 1.729 2.093 2.539 2.861
20 0.687 1.325 1.725 2.086 2.528 2.845
21 0.686 1.323 1.721 2.080 2.518 2.831
22 0.686 1.321 1.717 2.074 2.508 2.819
23 0.685 1.319 1.714 2.069 2.500 2.807
24 0.685 1.318 1.711 2.064 2.492 2.797
25 0.684 1.316 1.708 2.060 2.485 2.787
26 0.684 1.315 1.706 2.056 2.479 2.779
27 0.684 1.314 1.703 2.052 2.473 2.771
28 0.683 1.313 1.701 2.048 2.467 2.763
29 0.683 1.311 1.699 2.045 2.462 2.756
30 0.683 1.310 1.697 2.042 2.457 2.750
40 0.681 1.303 1.684 2.021 2.423 2.704
50 0.679 1.299 1.676 2.009 2.403 2.678
60 0.679 1.296 1.671 2.000 2.390 2.660
70 0.678 1.294 1.667 1.994 2.381 2.648
80 0.678 1.292 1.664 1.990 2.374 2.639
90 0.677 1.291 1.662 1.987 2.368 2.632
100 0.677 1.290 1.660 1.984 2.364 2.626
z 0.674 1.282 1.645 1.960 2.326 2.576
The values in the table are the critical values for the given areas in the right tail or in both
tails.
Stats: Estimating the Proportion

You are estimating the population proportion, p.
All estimation done here is based on the fact that the normal can be used to approximate the
binomial distribution when np and nq are both at least 5. Thus, the p that we're talking about is
the probability of success on a single trial from the binomial experiment.
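Recall that the normal approximation to the binomial uses mu = np and sigma = sqrt(npq), so
z = (x - np) / sqrt(npq).
The best point estimate for p is p hat, the sample proportion: p hat = x / n.
If the formula for z is divided by n in both the numerator and the denominator, then the formula
for z becomes: z = (p hat - p) / sqrt(pq / n).
Solving this for p to come up with a confidence interval gives the maximum error of the
estimate as: E = z * sqrt(pq / n).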
This is not, however, the formula that we will use. The problem with estimation is that you don't
know the value of the parameter (in this case p), so you can't use it to estimate itself - if you
knew it, then there would be no problem to work out. So we will replace the parameter by the
statistic in the formula for the maximum error of the estimate.
The maximum error of the estimate is given by the formula E = z * sqrt(p hat * q hat / n),
where q hat = 1 - p hat.
The z here is the z-score obtained from the normal table, or the bottom of
the t-table as explained in the introduction to estimation. The z-score is a
function of the level of confidence, so you may get in the habit of writing it
next to the level of confidence.
When you're computing E, I suggest that you find the sample proportion, p hat, and save it to P
on the calculator. This way, you can find q as (1-p). Do NOT round the value for p hat and use
the rounded value in the calculations. This will lead to error. Once you have computed E, I
suggest you save it to the memory on your calculator. On the TI-82, a good choice would be the
letter E. The reason for this is that the limits for the confidence interval are now found by
subtracting and adding the maximum error of the estimate from/to the sample proportion.
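As a sketch of the same steps in Python (the function name and the counts x = 120, n = 400 are invented for the example), note that p hat is never rounded before it is used:

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Confidence interval for p from x successes in n trials (hypothetical helper)."""
    p_hat = x / n        # sample proportion, kept at full precision
    q_hat = 1 - p_hat
    # The normal approximation needs at least 5 successes and 5 failures
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation is not appropriate")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    e = z * sqrt(p_hat * q_hat / n)  # maximum error of the estimate, E
    return p_hat - e, p_hat + e

# Made-up sample: 120 successes in 400 trials
low, high = proportion_interval(120, 400)
print(round(low, 3), "< p <", round(high, 3))
```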
Stats: Sample Size Determination
The sample size determination formulas come from the formulas for the maximum error of the
estimates. The formula is solved for n. Be sure to round the answer obtained up to the next
whole number, not off to the nearest whole number. If you round off, then you will exceed your
maximum error of the estimate in some cases. By rounding up, you will have a smaller
maximum error of the estimate than allowed, but this is better than having a larger one than
desired.
Population Mean
Here is the formula for the sample size, which is obtained by solving
the maximum error of the estimate formula for the population mean
for n: n = (z * sigma / E)^2.
Population Proportion
Here is the formula for the sample size, which is obtained by solving
the maximum error of the estimate formula for the population
proportion for n: n = p * q * (z / E)^2. Some texts use p hat and q hat, but since the sample
hasn't been taken, there is no value for the sample proportion. p and q
are taken from a previous study, if one is available. If there is no
previous study or estimate available, then use 0.5 for p and q, as
these are the values which will give the largest sample size. It is better to have too large a
sample and come under the maximum error of the estimate than to have too small a
sample and exceed the maximum error of the estimate.
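The rounding rule is easy to get wrong, so here is a small Python sketch of both sample size formulas. The function names and the example inputs are made up; math.ceil does the "round up to the next whole number" step.

```python
from math import ceil
from statistics import NormalDist

def z_for(confidence):
    """z-score for a given level of confidence (hypothetical helper)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_mean(sigma, e, confidence=0.95):
    """n = (z * sigma / E)^2, rounded UP."""
    return ceil((z_for(confidence) * sigma / e) ** 2)

def sample_size_proportion(e, p=0.5, confidence=0.95):
    """n = p * q * (z / E)^2, rounded UP; p defaults to 0.5 when no prior estimate exists."""
    return ceil(p * (1 - p) * (z_for(confidence) / e) ** 2)

# Made-up inputs: sigma = 15 with E = 2, and a proportion with E = 0.03
print(sample_size_mean(15, 2))
print(sample_size_proportion(0.03))
print(sample_size_proportion(0.03, p=0.3))  # a prior estimate gives a smaller n
```

Using p = 0.5 always gives the largest n, so any other prior estimate can only shrink the required sample.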
Chapter 9
Stats: Hypothesis Testing
Definitions
Null Hypothesis ( H0 )
Statement of zero or no change. If the original claim includes equality (<=, =, or >=), it is
the null hypothesis. If the original claim does not include equality (<, not equal, >) then the
null hypothesis is the complement of the original claim. The null hypothesis always
includes the equal sign. The decision is based on the null hypothesis.
Alternative Hypothesis ( H1 or Ha )
Statement which is true if the null hypothesis is false. The type of test (left, right, or two-
tail) is based on the alternative hypothesis.
Type I error
Rejecting the null hypothesis when it is true (saying false when true). Usually the more
serious error.
Type II error
Failing to reject the null hypothesis when it is false (saying true when false).
alpha
Probability of committing a Type I error.
beta
Probability of committing a Type II error.
Test statistic
Sample statistic used to decide whether to reject or fail to reject the null hypothesis.
Critical region
Set of all values which would cause us to reject H0
Critical value(s)
The value(s) which separate the critical region from the non-critical region. The critical
values are determined independently of the sample statistics.
Significance level ( alpha )
The probability of rejecting the null hypothesis when it is true. alpha = 0.05 and alpha =
0.01 are common. If no level of significance is given, use alpha = 0.05. The level of
significance is the complement of the level of confidence in estimation.
Decision
A statement based upon the null hypothesis. It is either "reject the null hypothesis" or
"fail to reject the null hypothesis". We will never accept the null hypothesis.
Conclusion
A statement which indicates the level of evidence (sufficient or insufficient), at what
level of significance, and whether the original claim is rejected (null) or supported
(alternative).
Stats: Hypothesis Testing
Introduction
Be sure to read through the definitions for this section before trying to make sense out of the
following.
The first thing to do when given a claim is to write the claim mathematically (if possible), and
decide whether the given claim is the null or alternative hypothesis. If the given claim contains
equality, or a statement of no change from the given or accepted condition, then it is the null
hypothesis, otherwise, if it represents change, it is the alternative hypothesis.
The following example is not a mathematical example, but may help introduce the concept.
Example
"He's dead, Jim," said Dr. McCoy to Captain Kirk.
Mr. Spock, as the science officer, is put in charge of statistically determining the correctness of
Bones' statement and deciding the fate of the crew member (to vaporize or try to revive)
His first step is to arrive at the hypothesis to be tested.
Does the statement represent a change in previous condition?
• Yes, there is change, thus it is the alternative hypothesis, H1
• No, there is no change, therefore it is the null hypothesis, H0
The correct answer is that there is change. Dead represents a change from the accepted state of
alive. The null hypothesis always represents no change. Therefore, the hypotheses are:
• H0 : Patient is alive.
• H1 : Patient is not alive (dead).
States of nature are something that you, as a statistician have no control over. Either it is, or it
isn't. This represents the true nature of things.
Possible states of nature (Based on H0)
• Patient is alive (H0 true - H1 false)
• Patient is dead (H0 false - H1 true)
Decisions are something that you have control over. You may make a correct decision or an
incorrect decision. It depends on the state of nature as to whether your decision is correct or in
error.
Possible decisions (Based on H0) / conclusions (Based on claim)
• Reject H0 / "Sufficient evidence to say patient is dead"
• Fail to Reject H0 / "Insufficient evidence to say patient is dead"
There are four possibilities that can occur based on the two possible states of nature and the two
decisions which we can make.
Statisticians never accept the null hypothesis; we only fail to reject it. In other words, we'll say
that it isn't so, or that we don't have enough evidence to say that it isn't, but we'll never say that it is,
because someone else might come along with another sample which shows that it isn't and we
don't want to be wrong.
Statistically (double) speaking ...
                                      State of Nature
Decision             H0 True                           H0 False
Reject H0            Patient is alive,                 Patient is dead,
                     Sufficient evidence of death      Sufficient evidence of death
Fail to reject H0    Patient is alive,                 Patient is dead,
                     Insufficient evidence of death    Insufficient evidence of death
In English ...
                                      State of Nature
Decision             H0 True                           H0 False
Reject H0            Vaporize a live person            Vaporize a dead person
Fail to reject H0    Try to revive a live person       Try to revive a dead person
Were you right? ...
                                      State of Nature
Decision             H0 True                           H0 False
Reject H0            Type I Error (alpha)              Correct Assessment
Fail to reject H0    Correct Assessment                Type II Error (beta)
Which of the two errors is more serious? Type I or Type II ?
Since Type I is the more serious error (usually), that is the one we concentrate on. We usually
pick alpha to be very small (0.05, 0.01). Note: alpha is not a Type I error. Alpha is the
probability of committing a Type I error. Likewise beta is the probability of committing a Type II
error.
Conclusions
Conclusions are sentence answers which include whether there is enough evidence or not (based
on the decision), the level of significance, and whether the original claim is supported or
rejected.
Conclusions are based on the original claim, which may be the null or alternative hypotheses.
The decisions are always based on the null hypothesis.
Reject H0 ("SUFFICIENT")
    Original claim is H0 ("REJECT"): There is sufficient evidence at the alpha level of
    significance to reject the claim that (insert original claim here).
    Original claim is H1 ("SUPPORT"): There is sufficient evidence at the alpha level of
    significance to support the claim that (insert original claim here).
Fail to reject H0 ("INSUFFICIENT")
    Original claim is H0 ("REJECT"): There is insufficient evidence at the alpha level of
    significance to reject the claim that (insert original claim here).
    Original claim is H1 ("SUPPORT"): There is insufficient evidence at the alpha level of
    significance to support the claim that (insert original claim here).
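Since the conclusion is really a fill-in-the-blank sentence, it can be sketched as a small Python function. The function name and arguments are invented for this example. One assumption to flag: for the case where the claim is H1 and we fail to reject H0, the wording "insufficient evidence ... to support" follows the pattern of the other three cases and the definition of a conclusion given earlier.

```python
def conclusion(reject_h0, claim_is_null, alpha, claim):
    """Build the conclusion sentence from the decision and the type of claim."""
    # The decision picks the level of evidence
    evidence = "sufficient" if reject_h0 else "insufficient"
    # The original claim picks the verb: reject a null claim, support an alternative claim
    action = "reject" if claim_is_null else "support"
    return (f"There is {evidence} evidence at the {alpha} level of "
            f"significance to {action} the claim that {claim}.")

# Claim is the null hypothesis and we failed to reject H0
print(conclusion(False, True, 0.05, "pi is 3.2"))
```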
Stats: Type of Tests
This document will explain how to determine if the test is a left tail, right tail, or two-tail test.
The type of test is determined by the Alternative Hypothesis ( H1 )
Left Tailed Test

H1: parameter < value
Notice the inequality points to the left
Decision Rule: Reject H0 if t.s. < c.v.
Right Tailed Test

H1: parameter > value
Notice the inequality points to the right
Decision Rule: Reject H0 if t.s. > c.v.
Two Tailed Test

H1: parameter not equal to value
Another way to write not equal is < or >
Notice the inequality points to both sides
Decision Rule: Reject H0 if t.s. < c.v. (left) or t.s. > c.v. (right)
The decision rule can be summarized as follows:
Reject H0 if the test statistic falls in the critical region
(Reject H0 if the test statistic is more extreme than the critical value)
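The three decision rules can be sketched in Python like this. The function name and the tail labels "left", "right", and "two" are made up for the example, and the two-tail branch assumes a symmetric distribution (z or t), so the two critical values are the same number with opposite signs.

```python
def reject_h0(test_stat, crit, tail):
    """Return True if the test statistic falls in the critical region."""
    if tail == "left":
        return test_stat < crit            # t.s. < c.v.
    if tail == "right":
        return test_stat > crit            # t.s. > c.v.
    if tail == "two":
        # more extreme than either critical value (assumes symmetric c.v.s)
        return abs(test_stat) > abs(crit)
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Made-up test statistics compared against table values
print(reject_h0(-2.5, -1.645, "left"))   # in the left critical region
print(reject_h0(1.2, 1.645, "right"))    # not in the right critical region
print(reject_h0(-0.079, 2.093, "two"))   # between the two critical values
```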
Stats: Confidence Intervals as Tests
Using the confidence interval to perform a hypothesis test only works with a two-tailed test.
• If the hypothesized value of the parameter lies within the confidence interval with a
1 - alpha level of confidence, then the decision at an alpha level of significance is to fail to
reject the null hypothesis.
• If the hypothesized value of the parameter lies outside the confidence interval with a
1 - alpha level of confidence, then the decision at an alpha level of significance is to reject
the null hypothesis.
Sounds simple enough, right? It is.
However, it has a couple of problems.
• It only works with two-tail hypothesis tests.
• It requires that you compute the confidence interval first. This involves taking a z-score
or t-score and converting it into an x-score, which is more difficult than standardizing an
x-score.
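A minimal Python sketch of the confidence interval approach for a single mean follows. The function name and the sample numbers are invented; the critical value 2.064 is the two-tail 95% entry for 24 degrees of freedom from the t table, so the test is at alpha = 0.05.

```python
from math import sqrt

def ci_test(x_bar, s, n, t_crit, mu0):
    """Two-tail test of H0: mu = mu0 using a confidence interval (hypothetical helper)."""
    e = t_crit * s / sqrt(n)               # maximum error of the estimate
    low, high = x_bar - e, x_bar + e
    # Hypothesized value inside the interval: fail to reject; outside: reject
    decision = "fail to reject H0" if low <= mu0 <= high else "reject H0"
    return (low, high), decision

# Made-up sample: n = 25, x bar = 52, s = 5
print(ci_test(52, 5, 25, 2.064, mu0=50))
print(ci_test(52, 5, 25, 2.064, mu0=49))
```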
Stats: Hypothesis Testing Steps
Here are the steps to performing hypothesis testing

1. Write the original claim and identify whether it is the null hypothesis or the alternative
hypothesis.
2. Write the null and alternative hypotheses. Use the alternative hypothesis to identify the
type of test.
3. Write down all information from the problem.
4. Find the critical value using the tables.
5. Compute the test statistic.
6. Make a decision to reject or fail to reject the null hypothesis. A picture showing the
critical value and test statistic may be useful.
7. Write the conclusion.
Stats: Testing a Single Mean
You are testing mu; you are not testing x bar. If you knew the value of mu, then there would be
nothing to test.
All hypothesis testing is done under the assumption the null hypothesis is true!
I can't emphasize this enough. The values for all population parameters in the test statistics come
from the null hypothesis. This is true not only for means, but all of the testing we're going to be
doing.
Population Standard Deviation Known

If the population standard deviation, sigma, is known, then the sample
mean has a normal distribution, and you will be using the z-score formula for
sample means. The test statistic is the standard formula you've seen before:
z = (x bar - mu) / (sigma / sqrt(n)).
The critical value is obtained from the normal table, or the bottom line from
the t-table.
Population Standard Deviation Unknown

If the population standard deviation, sigma, is unknown, then the sample
mean has a Student's t distribution, and you will be using the t-score formula
for sample means. The test statistic is very similar to that for the z-score, except that sigma has
been replaced by s and z has been replaced by t: t = (x bar - mu) / (s / sqrt(n)).
The critical value is obtained from the t-table. The degrees of freedom for this test is n-1.
If you're performing a t-test where you found the statistics on the calculator (as opposed to being
given them in the problem), then use the VARS key to pull up the statistics in the calculation of
the test statistic. This will save you data entry and avoid round off errors.
General Pattern
Notice the general pattern of these test statistics is (observed - expected) / standard deviation.
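The pattern is the same for z and t, so one small Python function covers both. This is only a sketch: the function name and the numbers are made up, and mu0 stands for the value of mu taken from the null hypothesis.

```python
from math import sqrt

def mean_test_statistic(x_bar, mu0, sd, n):
    """(observed - expected) / standard deviation of the sample mean.

    Pass sigma for sd to get z (sigma known), or s to get t (sigma unknown).
    """
    return (x_bar - mu0) / (sd / sqrt(n))

# Made-up problem: H0: mu = 50, sigma = 6 known, n = 36, x bar = 52
z = mean_test_statistic(52, 50, 6, 36)
print(z)
# For a two-tail test at alpha = 0.05 the critical values are +/- 1.960
print("reject H0" if abs(z) > 1.960 else "fail to reject H0")
```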
Stats: Hypothesis Test: Pi = 3.2?
In 1897, legislation was introduced in Indiana which would make 3.2 the official value of pi for
the State. Now, that sounds ridiculous, but is it really?
Claim: Pi is 3.2.
To test the claim, we're going to generate a whole bunch of values for pi, and then test to see if
the mean is 3.2.
H0 : mu = 3.2 (original claim)

H1 : mu <> 3.2 (two tail test)
Procedure:
The area of the unit circle is pi. The area of the unit circle in the first quadrant is pi/4. The
calculator generates random numbers between 0 and 1. What we're going to do is generate two
random numbers which will simulate a randomly selected point in a unit square in the first
quadrant. If the point is within the circle, then the distance from (0,0) will be less than or equal to
1; if the point is outside the circle, the distance will be greater than 1.
Have the calculator generate a squared distance from zero (the square of the distance illustrates
the same properties as far as being less than 1 or greater than 1). Do this 25 times. Each time,
record whether the point is inside the circle (<1) or outside the circle (>1).
RAND^2 + RAND^2
Pi/4 is approximately equal to the ratio of the points inside the circle to the total number of
points. Therefore, pi will be 4 times the ratio of the points inside the circle to the total number of
points.
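If you would rather let a computer do the 25 repetitions, the procedure can be sketched in Python. This is not the TI-82 method itself: random.random() stands in for RAND, and the function name and the seed are arbitrary choices made for the example.

```python
import random

def estimate_pi(points=25):
    """Estimate pi from random points in the unit square (hypothetical helper)."""
    inside = 0
    for _ in range(points):
        # squared distance from (0,0), the same idea as RAND^2 + RAND^2
        if random.random() ** 2 + random.random() ** 2 <= 1:
            inside += 1
    # pi/4 is approximately inside / points, so pi is about 4 times that ratio
    return 4 * inside / points

random.seed(1)  # arbitrary seed so the run can be repeated
estimates = [estimate_pi() for _ in range(20)]
print(estimates)
```

With only 25 points per estimate, every value is a multiple of 4/25 = 0.16, so the individual estimates are quite coarse.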
This whole process is repeated several times, and the mean and standard deviation are recorded.
The hypothesis test is then conducted using the t-test to see if the true mean is 3.2 (based on the
sample mean).
Example:
20 values for pi were generated by generating 25 pairs of random numbers and checking to see if
they were inside or outside the circle as illustrated above.
3.68 3.20 3.04 2.56 3.36
3.36 3.36 3.52 3.04 3.20
3.52 3.36 3.04 2.72 3.36
3.52 2.88 2.88 3.68 2.60
The mean of the sample is 3.194, the standard deviation is 0.3384857923.
The test statistic t = (3.194 - 3.2) / (0.3384857923/sqrt(20)) = -0.0792730931
The critical value, with a 0.05 level of significance since none was stated, for a two-tail test
with 19 degrees of freedom is t = +/- 2.093.
Since the test statistic is not in the critical region, the decision is to fail to reject the null hypothesis.
There is insufficient evidence at the 0.05 level of significance to reject the claim that pi is 3.2.
Note the double speak, but it serves to illustrate the point. We would not dare to claim that pi
was 3.2, even though this sample seems to illustrate this. The sample doesn't provide enough
evidence to show it's not 3.2, but there may be another sample somewhere which does provide
enough evidence (let's hope so). So, we won't say it is 3.2, just that we don't have enough
evidence to prove it isn't 3.2.
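The arithmetic in this example can be checked with a few lines of Python. The 20 values are the ones listed above; the variable names are just for illustration, and the critical value 2.093 is read from the t table because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

values = [3.68, 3.20, 3.04, 2.56, 3.36,
          3.36, 3.36, 3.52, 3.04, 3.20,
          3.52, 3.36, 3.04, 2.72, 3.36,
          3.52, 2.88, 2.88, 3.68, 2.60]

n = len(values)
x_bar = mean(values)
s = stdev(values)                    # sample standard deviation (divides by n - 1)
t = (x_bar - 3.2) / (s / sqrt(n))    # mu = 3.2 comes from the null hypothesis
t_crit = 2.093                       # two-tail, alpha = 0.05, df = 19
reject = abs(t) > t_crit

print(x_bar, s, t)
print("reject H0" if reject else "fail to reject H0")
```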
Stats: Testing a Single Proportion

You are testing p; you are not testing p hat. If you knew the value of p, then there would be
nothing to test.
All hypothesis testing is done under the assumption the null hypothesis is true!
I can't emphasize this enough. The values for all population parameters in the test statistics come
from the null hypothesis. This is true not only for proportions, but all of the testing we're going
to be doing.
The sample proportion has an approximately normal distribution if np and
nq are both at least 5. Remember that we are approximating the binomial using
the normal, and that the p we're talking about is the probability of success on a
single trial. The test statistic is z = (p hat - p) / sqrt(pq / n), where p comes from the null
hypothesis and q = 1 - p.
The critical value is found from the normal table, or from the bottom row of
the t-table.
The steps involved in the hypothesis testing remain the same. The only thing that changes is the
formula for calculating the test statistic and perhaps the distribution which is used.
General Pattern
Notice the general pattern of these test statistics is (observed - expected) / standard deviation.
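Here is a sketch of the proportion test statistic in Python. The function name and the counts are made up, and p0 stands for the value of p from the null hypothesis, which is also what goes into the standard deviation.

```python
from math import sqrt

def proportion_test_statistic(x, n, p0):
    """z = (p hat - p) / sqrt(pq / n), with p and q taken from H0."""
    q0 = 1 - p0
    # The normal approximation to the binomial needs np and nq of at least 5
    if n * p0 < 5 or n * q0 < 5:
        raise ValueError("np and nq must both be at least 5")
    p_hat = x / n    # observed sample proportion, not rounded
    return (p_hat - p0) / sqrt(p0 * q0 / n)

# Made-up problem: H0: p <= 0.5, H1: p > 0.5, with 58 successes in 100 trials
z = proportion_test_statistic(58, 100, 0.5)
print(z)
# Right-tail test at alpha = 0.05: critical value 1.645 from the bottom row of the t table
print("reject H0" if z > 1.645 else "fail to reject H0")
```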
Stats: Probability Values
Classical Approach
The Classical Approach to hypothesis testing is to compare a test statistic and a critical value. It
is best used for distributions which give areas and require you to look up the critical value (like
the Student's t distribution) rather than distributions which have you look up a test statistic to
find an area (like the normal distribution).
The Classical Approach also has three different decision rules, depending on whether it is a left
tail, right tail, or two tail test.
One problem with the Classical Approach is that if a different level of significance is desired, a
different critical value must be read from the table.
P-Value Approach
The P-Value Approach, short for Probability Value, approaches hypothesis testing in a
different manner. Instead of comparing z-scores or t-scores as in the classical approach, you're
comparing probabilities, or areas.
The level of significance (alpha) is the area in the critical region. That is, the area in the tails to
the right or left of the critical values.
The p-value is the area to the right or left of the test statistic. If it is a two tail test, then look up
the probability in one tail and double it.
If the test statistic is in the critical region, then the p-value will be less than the level of
significance. It does not matter whether it is a left tail, right tail, or two tail test. This rule always
holds.
Reject the null hypothesis if the p-value is less than the level of significance.
You will fail to reject the null hypothesis if the p-value is greater than or equal to the level of
significance.
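For a test statistic that is a z-score, the p-value can be sketched in Python with the standard library's NormalDist. The function name and the tail labels are invented for the example, and the two z-scores are made-up inputs.

```python
from statistics import NormalDist

def p_value(z, tail):
    """Area beyond the test statistic z under the standard normal curve."""
    std = NormalDist()
    if tail == "left":
        return std.cdf(z)                    # area to the left
    if tail == "right":
        return 1 - std.cdf(z)                # area to the right
    if tail == "two":
        return 2 * (1 - std.cdf(abs(z)))     # one tail, doubled
    raise ValueError("tail must be 'left', 'right', or 'two'")

alpha = 0.05
for z, tail in [(1.6, "right"), (-2.5, "two")]:
    p = p_value(z, tail)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(z, tail, round(p, 4), decision)
```

The same single comparison, p-value against alpha, works for all three types of test.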
The p-value approach is best suited for the normal distribution when doing calculations by hand.
However, many statistical packages will give the p-value but not the critical value. This is
because it is easier for a computer or calculator to find the probability than it is to find the critical
value.
Another benefit of the p-value is that the statistician immediately knows at what level the testing
becomes significant. That is, with a p-value of 0.06, the null hypothesis would be rejected at a 0.10
level of significance, but you would fail to reject it at a 0.05 level of significance. Warning: Do not
decide on the level of significance after calculating the test statistic and finding the p-value.
Here is a proportion to help you keep the order straight. Any proportion equivalent to the
following statement is correct.
The test statistic is to the p-value as the critical value is to the level of significance.
