Key Terms
BSHM-BLOCK 1
ASSIGNMENT
POPULATION
A population is the full set of individuals or objects that make up a certain group of interest.
SAMPLE
A sample is a subset drawn from a larger population. If the sample is drawn in such a manner
that each member of the population has an equal chance of selection, the result is referred to as a
random sample.
BOX PLOT
Also called a box-and-whisker plot or box-whisker plot. A box plot is a graph that gives a quick
picture of the middle 50 percent of the data and shows how far the extreme values are from most of
the data. A box plot is constructed from five values: the minimum value, the first quartile,
the median, the third quartile, and the maximum value. These five values serve as reference points
for judging where the other data values fall.
Example:
The following data are the heights of 40 students in a statistics class:
59, 60, 61, 62, 62, 63, 63, 64, 64, 64, 65, 65, 65, 65, 65, 65, 65, 65, 65, 66, 66, 67, 67, 68, 68, 69, 70, 70,
70, 70, 70, 71, 71, 72, 72, 73, 74, 74, 75, 77.
Construct a box plot with the following properties:
• Minimum value = 59
• Maximum value = 77
• Q1: First quartile = 64.5
• Q2: Second quartile or median = 66
• Q3: Third quartile = 70
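The five values listed above can be checked with a short Python sketch. It assumes the median-of-halves convention for quartiles (the median of the lower half and of the upper half of the ordered data, leaving out the middle value when the count is odd). The function name five_number_summary is illustrative and does not come from any library.

```python
from statistics import median

# Heights of the 40 students from the example above.
heights = [
    59, 60, 61, 62, 62, 63, 63, 64, 64, 64,
    65, 65, 65, 65, 65, 65, 65, 65, 65, 66,
    66, 67, 67, 68, 68, 69, 70, 70, 70, 70,
    70, 71, 71, 72, 72, 73, 74, 74, 75, 77,
]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of halves."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]        # values below the middle position
    upper_half = ordered[(n + 1) // 2:]   # values above the middle position
    return (
        ordered[0],
        median(lower_half),
        median(ordered),
        median(upper_half),
        ordered[-1],
    )


summary = five_number_summary(heights)
print(summary)  # (59, 64.5, 66.0, 70.0, 77)
```

Other quartile conventions exist, so software that interpolates between data values may report a slightly different first or third quartile for the same data.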
FIRST QUARTILE
the value that is the median of the lower half of the ordered data set
Example:
Consider the following data:
1, 11.5, 6, 7.2, 4, 8, 9, 10, 6.8, 8.3, 2, 2, 10, 1
Ordered from smallest to largest:
1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5
The first quartile is the median of the lower half of the data. The 14 ordered values split into
seven values in the lower half (1, 1, 2, 2, 4, 6, 6.8) and seven values in the upper half
(7.2, 8, 8.3, 9, 10, 10, 11.5). Because the lower half has an odd number of values, its median,
the first quartile (Q1), is the middle value, or 2.
The quartiles of this data set are Q1 = 2, Q2 (the median) = 7, and Q3 = 9.
FREQUENCY
the number of times a value of the data occurs
FREQUENCY TABLE
a data representation in which grouped data are displayed along with the corresponding frequencies
HISTOGRAM
a graphical representation in x-y form of the distribution of data in a data set; x represents the data
and y represents the frequency, or relative frequency; the graph consists of contiguous rectangles
Example:
Jeff is the branch manager at a local bank. Recently, Jeff has been receiving customer feedback saying
that the wait to be served by a customer service representative is too long. Jeff
decides to observe and record the wait times of 20 customers. He then displays the recorded times
in a histogram with 5-second bins (5-second intervals).
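Binning data for a histogram can also be sketched in code. Because the individual wait times are not listed here, the following Python sketch instead bins the 40 student heights from the box plot example into intervals of width 5. The bin width, the bin edges (multiples of 5), and the helper name bin_start are assumptions made for illustration.

```python
from collections import Counter

# Heights of the 40 students from the box plot example.
heights = [
    59, 60, 61, 62, 62, 63, 63, 64, 64, 64,
    65, 65, 65, 65, 65, 65, 65, 65, 65, 66,
    66, 67, 67, 68, 68, 69, 70, 70, 70, 70,
    70, 71, 71, 72, 72, 73, 74, 74, 75, 77,
]

BIN_WIDTH = 5  # assumed bin width; each bin covers [start, start + 5)


def bin_start(value, width=BIN_WIDTH):
    """Return the lower edge of the bin that contains value."""
    return (value // width) * width


# Frequency of each bin, keyed by the bin's lower edge.
bin_counts = Counter(bin_start(h) for h in heights)

# Text histogram: one '#' per data value in the bin.
for start in sorted(bin_counts):
    label = f"{start}-{start + BIN_WIDTH - 1}"
    print(f"{label:>5} | {'#' * bin_counts[start]}")
```

Each row corresponds to one rectangle of the histogram, and the number of marks is the frequency of that interval.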
INTERQUARTILE RANGE
The interquartile range, or IQR, is the range of the middle 50 percent of the data values; the IQR is
found by subtracting the first quartile from the third quartile
Example:
Consider the ordered data set 1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5, for which
Q1 = 2 and Q3 = 9.
The interquartile range is a number that indicates the spread of the middle half, or the middle 50
percent, of the data. It is the difference between the third quartile (Q3) and the first quartile (Q1):
IQR = Q3 – Q1. The IQR for this data set is 9 – 2 = 7.
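The IQR calculation can be reproduced with a minimal Python sketch. It assumes the same median-of-halves convention for the quartiles and uses the 14 data values from the first quartile example.

```python
from statistics import median

data = [1, 11.5, 6, 7.2, 4, 8, 9, 10, 6.8, 8.3, 2, 2, 10, 1]
ordered = sorted(data)
half = len(ordered) // 2        # 14 values, so 7 in each half

q1 = median(ordered[:half])     # median of the lower half
q3 = median(ordered[-half:])    # median of the upper half
iqr = q3 - q1

print(q1, q3, iqr)  # 2 9 7
```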
INTERVAL
also called a class interval; an interval represents a range of data and is used when displaying large
data sets
Example:
temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850)
MEAN
a number that measures the central tendency of the data; a common name for mean is average.
The term mean is a shortened form of arithmetic mean.
Example:
1, 2, 4, 5
Start by adding the data:
1 + 2 + 4 + 5 = 12
There are 4 data points, so divide the sum by 4:
12 / 4 = 3
The mean is 3.
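The same arithmetic can be written as a short Python sketch, with the standard library's statistics.mean used as a cross-check. The variable names are illustrative.

```python
from statistics import mean

data = [1, 2, 4, 5]

total = sum(data)          # 1 + 2 + 4 + 5 = 12
count = len(data)          # 4 data points
by_hand = total / count    # 12 / 4

# The library function should agree with the hand calculation.
print(by_hand == mean(data))  # True
```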
MEDIAN
a number that separates ordered data into halves; half the values are the same number or smaller than
the median, and half the values are the same number or larger than the median
To find the median:
• Arrange the data points from smallest to largest.
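• If the number of data points is odd, the median is the middle data point in the list.
• If the number of data points is even, the median is the mean of the two middle data points in the list.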
Example:
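A minimal Python sketch of the median procedure follows. It covers both an odd and an even number of data points (for an even count, the median is the mean of the two middle values). The odd-sized list is the lower half of the first quartile example and the even-sized list is the data from the mean example; the function name median_by_hand is illustrative.

```python
def median_by_hand(values):
    """Return the median by sorting and picking the middle value(s)."""
    ordered = sorted(values)            # arrange from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]          # odd count: the middle data point
    # even count: the mean of the two middle data points
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_data = [1, 1, 2, 2, 4, 6, 6.8]   # 7 values
even_data = [1, 2, 4, 5]             # 4 values

print(median_by_hand(odd_data))    # 2
print(median_by_hand(even_data))   # 3.0
```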
MODE
The mode is the most commonly occurring data point in a dataset. The mode is useful when there are
a lot of repeated values in a dataset. There can be no mode, one mode, or multiple modes in a dataset.
Example:
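As a hedged illustration, the standard library function statistics.multimode returns every value that occurs most often. The sketch below applies it to the student heights from the box plot example (one mode) and to the ordered data from the first quartile example (several modes).

```python
from statistics import multimode

# Heights of the 40 students from the box plot example.
heights = [
    59, 60, 61, 62, 62, 63, 63, 64, 64, 64,
    65, 65, 65, 65, 65, 65, 65, 65, 65, 66,
    66, 67, 67, 68, 68, 69, 70, 70, 70, 70,
    70, 71, 71, 72, 72, 73, 74, 74, 75, 77,
]

# Ordered data from the first quartile example.
quartile_data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]

one_mode = multimode(heights)             # 65 occurs 9 times
several_modes = multimode(quartile_data)  # 1, 2 and 10 each occur twice

print(one_mode)        # [65]
print(several_modes)   # [1, 2, 10]
```

When every value occurs exactly once, multimode returns all of the values; such a data set is usually described as having no mode.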
RELATIVE FREQUENCY
the ratio of the number of times a value of the data occurs in the set of all outcomes to the number of
all outcomes
Example:
Your team has won 9 of the 12 games it has played. The frequency of winning is 9.
The relative frequency of winning is 9/12 = 0.75, or 75%.
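The calculation can be sketched in Python with collections.Counter. The list of results is built from the counts in the example (9 wins in 12 games); the label "not a win" for the remaining 3 games is an assumption, since the example does not say how those games ended.

```python
from collections import Counter

# 12 games: 9 wins, 3 games that were not wins (label assumed).
results = ["win"] * 9 + ["not a win"] * 3

frequency = Counter(results)   # how many times each outcome occurs
total = len(results)           # number of all outcomes

# Relative frequency = frequency of an outcome / number of all outcomes.
relative_frequency = {
    outcome: count / total for outcome, count in frequency.items()
}

print(frequency["win"], relative_frequency["win"])  # 9 0.75
```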
VARIANCE
the mean of the squared deviations from the mean, or the square of the standard deviation. The variance
reflects the variability of the data set: the more spread out the data values are, the larger the variance
STANDARD DEVIATION
a number that is equal to the square root of the variance and measures how far data values are from
their mean; notation: s for sample standard deviation and σ for population standard deviation
Example of Variance and Standard Deviation:
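A minimal Python sketch follows, using the data 1, 2, 4, 5 from the mean example. It computes the population variance (dividing by the number of data points) and the sample variance (dividing by one less than the number of data points, the usual convention for s), then takes square roots to get the standard deviations. The statistics module functions are used only as a cross-check.

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

data = [1, 2, 4, 5]
n = len(data)
data_mean = sum(data) / n                           # 3.0

# Squared deviations from the mean: 4, 1, 1, 4 (sum is 10).
squared_deviations = [(x - data_mean) ** 2 for x in data]

population_variance = sum(squared_deviations) / n        # 10 / 4 = 2.5
sample_variance = sum(squared_deviations) / (n - 1)      # 10 / 3

population_sd = sqrt(population_variance)   # sigma
sample_sd = sqrt(sample_variance)           # s

# Cross-check against the standard library.
print(abs(population_variance - pvariance(data)) < 1e-9)  # True
print(abs(sample_sd - stdev(data)) < 1e-9)                # True
```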
PERCENTILE
a number that divides ordered data into hundredths; percentiles may or may not be part of the data.
The median of the data is the second quartile and the 50th percentile.
The first and third quartiles are the 25th and the 75th percentiles, respectively.
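The link between percentiles and quartiles can be illustrated with statistics.quantiles from the Python standard library, applied to the student heights from the box plot example. This is a sketch under one assumption: the function's default "exclusive" method interpolates between data values, so its 25th percentile (64.25) is close to, but not the same as, the first quartile of 64.5 obtained by taking the median of the lower half.

```python
from statistics import median, quantiles

# Heights of the 40 students from the box plot example.
heights = [
    59, 60, 61, 62, 62, 63, 63, 64, 64, 64,
    65, 65, 65, 65, 65, 65, 65, 65, 65, 66,
    66, 67, 67, 68, 68, 69, 70, 70, 70, 70,
    70, 71, 71, 72, 72, 73, 74, 74, 75, 77,
]

# n=100 gives the 99 cut points that divide the data into hundredths.
percentiles = quantiles(heights, n=100)

p25 = percentiles[24]   # 25th percentile
p50 = percentiles[49]   # 50th percentile
p75 = percentiles[74]   # 75th percentile

print(p25, p50, p75)            # 64.25 66.0 70.0
print(p50 == median(heights))   # True: the median is the 50th percentile
```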
QUARTILES
the numbers that separate the data into quarters; quartiles may or may not be part of the data; the
second quartile is the median of the data