Staticus: Math 103 Lecture 9 Class Notes

Statistics – from the Latin staticus – “out of state” is the study of methods of collecting,
organizing, presenting, analyzing, and drawing conclusions about data, commonly in numerical form.
The three branches of statistics are: descriptive, inferential, and survey/sampling.
Descriptive Statistics: organizing, summarizing, graphing and presenting data

I. Organize data into frequency tables
a. class and frequency
b. extended table includes relative frequency, cumulative frequency, and cumulative
relative frequency, as well as class marks
II. Make charts or graphs
a. histogram and bar graphs
b. frequency curve or polygon
c. ogive
d. box & whisker or boxplot
e. circle or pie graph
f. stem & leaf
g. pictographs
h. scatter plots
i. line plots
III. Calculate measures
a. central tendency (mean, median, mode)
b. variation (range, standard deviation)
c. position (percentiles, quartiles)
I. Organize data into frequency tables

Frequency Table = an excellent device for making larger collections of data much more
intelligible. A frequency table is so named because it lists categories of scores along with their
corresponding frequencies. The frequency for a category or class is the number of original
scores that fall into that class. The columns of an extended frequency table generate various
graphs or charts. Extended frequency tables therefore become important prerequisites for
creating graphs and charts used in statistics.
Guidelines for frequency tables:
1. Class intervals should not overlap. Classes are mutually exclusive.
2. Classes should continue throughout the distribution with NO gaps. Include all classes.
3. All classes should have the same width.
4. Class widths should be “convenient” numbers.
5. Use 5-20 classes.
6. Make lower or upper limits multiples of the width.
An extended frequency table includes the following:
a. class intervals (lower and upper limits)
b. marks
c. frequency
d. cumulative frequency
e. relative frequency
f. cumulative relative frequency
Example Data Set: Dr. Brown’s Exam Scores

98 90 85 84 81 79 76 73 69 60
98 90 85 83 80 79 75 72 68 60
93 88 85 82 80 78 75 71 67 59
93 87 84 82 79 77 74 70 64 57
91 86 84 81 79 77 74 70 63 54
Note: Typically, you will have to rank data first; data does not usually come ordered!
The first thing to do with numerical data is to organize it into a frequency table. Each column of a
frequency table generates (is used to create) a particular graph or chart.
Extended Frequency Table of Dr. Brown’s Exam Scores

class | freq | cumulative freq. | relative freq. | cumulative relative freq. | mark | boundaries
50-54
55-59
60-64
65-69
70-74
75-79
80-84
85-89
90-94
95-99
100+
The width of each class is 5 (size of each class).

The lower limits are the smaller numbers of each class (50, 55, 60, 65, 70, etc.)
The upper limits are the larger numbers of each class (54, 59, 64, 69, 74, etc.)
Note: the class limits (either lower or upper) should be a multiple of the width.
The mark is the midpoint of each class.
Only the last class can be "open-ended."
There should be no "gaps" in organizing classes.
There should be no "overlap" in class numbers.
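The columns of the extended table can be filled in mechanically from the raw scores. The following is a minimal Python sketch, not part of the original notes. The function name `extended_frequency_table` and the dictionary keys are illustrative choices, and the class boundaries are assumed to lie half a unit outside the class limits (for example 49.5 to 54.5), a common convention that the notes do not spell out. The open-ended 100+ class is left out because no score reaches 100.

```python
# Dr. Brown's exam scores, copied from the example data set.
scores = [
    98, 90, 85, 84, 81, 79, 76, 73, 69, 60,
    98, 90, 85, 83, 80, 79, 75, 72, 68, 60,
    93, 88, 85, 82, 80, 78, 75, 71, 67, 59,
    93, 87, 84, 82, 79, 77, 74, 70, 64, 57,
    91, 86, 84, 81, 79, 77, 74, 70, 63, 54,
]


def extended_frequency_table(data, first_lower=50, width=5, n_classes=10):
    """Return one dict per class holding the extended-table columns."""
    n = len(data)
    rows = []
    cumulative = 0
    for k in range(n_classes):
        lower = first_lower + k * width
        upper = lower + width - 1  # limits are whole-number scores
        freq = sum(lower <= x <= upper for x in data)
        cumulative += freq
        rows.append({
            "class": f"{lower}-{upper}",
            "freq": freq,
            "cum_freq": cumulative,
            "rel_freq": freq / n,
            "cum_rel_freq": cumulative / n,
            "mark": (lower + upper) / 2,  # midpoint of the class
            # Assumed convention: boundaries sit 0.5 outside the limits.
            "boundaries": (lower - 0.5, upper + 0.5),
        })
    return rows


table = extended_frequency_table(scores)
for row in table:
    print(row["class"], row["freq"], row["cum_freq"],
          row["rel_freq"], row["cum_rel_freq"], row["mark"], row["boundaries"])
```

The frequency column should sum to the number of scores (50), and the last cumulative relative frequency should equal 1, which makes a quick check on the tally.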
II. Make charts or graphs

Histogram: a type of bar graph representing an entire set of data. It is helpful when you need to
discover or display the distribution of interval or ratio data. Histograms illustrate central
tendency, shape, and how the data is spread out or dispersed. A histogram is made up of the
following components:
1. a title, which identifies the population of concern
2. a vertical scale, which identifies the frequencies in the various classes
3. a horizontal scale, which identifies the variable. Values for class boundaries, class limits, or
class marks may be labeled along the axis.
Shapes of histograms: symmetrical, uniform, skewed, J-shaped, and bimodal.
Frequency Curve or Polygon: the horizontal axis uses marks. The vertical axis is either frequency or
relative frequency. Several sets of data can be depicted on the same graph.
Ogive: a cumulative frequency curve, always with a typical “upward” trend.
Box-&-Whisker = a representation of the data set that splits the distribution into four groups of
25%, often referred to as a quartile distribution. Several sets of data can be pictured side-by-side
using box-&-whisker plots, making data comparisons easier for the reader. The “key” points are:
1. 0% (or 10%)
2. 25%
3. 50%
4. 75%
5. 100% (or 90%)
III. Calculate Measures

AVERAGES:
Mode = the data value that occurs most frequently.
Ex: 6 7 8 9 9 10
Another ex: 6 3 2 3 3 5 3 2
If you cannot identify the ONE value that occurs most frequently, the data set has no mode.
Ex: 3 3 4 5 5 7
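The three mode examples can be checked with a short Python sketch. The helper name `mode_or_none` is hypothetical; it follows the convention stated in these notes that a data set with a tie for the most frequent value has no mode (other texts would call such a set bimodal).

```python
from collections import Counter


def mode_or_none(data):
    """Return the single most frequent value, or None when there is a tie."""
    counts = Counter(data)
    top = max(counts.values())
    winners = [value for value, count in counts.items() if count == top]
    return winners[0] if len(winners) == 1 else None


print(mode_or_none([6, 7, 8, 9, 9, 10]))        # 9
print(mode_or_none([6, 3, 2, 3, 3, 5, 3, 2]))   # 3
print(mode_or_none([3, 3, 4, 5, 5, 7]))         # None (3 and 5 tie)
```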
Median = middle score in ranked data.
Ex: 3 4 6 8 9 11 15 27 31
When there is an even number of data values, the median is halfway between the middle scores.
Ex: 3 5 6 7 9 10 10 12
The median need not be a member of the data set.
Midrange = the value halfway between the highest and lowest data value.
Ex: 6 7 8 9 9 10
The midrange need not be a member of the data set.
Midhinge = value halfway between the left hinge and right hinge of a box-&-whisker plot.
The midhinge need not be a member of the data set.
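The median and midrange examples can be verified the same way. This sketch uses `statistics.median` from the Python standard library; `midrange` is a small helper written here for illustration.

```python
from statistics import median


def midrange(data):
    """Value halfway between the lowest and highest data values."""
    return (min(data) + max(data)) / 2


odd_set = [3, 4, 6, 8, 9, 11, 15, 27, 31]
even_set = [3, 5, 6, 7, 9, 10, 10, 12]

print(median(odd_set))    # 9, the middle (5th) score
print(median(even_set))   # 8.0, halfway between 7 and 9
print(midrange([6, 7, 8, 9, 9, 10]))  # 8.0, halfway between 6 and 10
```

In the even-sized example the median 8 is not one of the data values, which illustrates that the median need not be a member of the data set.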

Mean = the value which is the sum of all data values divided by the number of pieces of data.
Ex: 6 3 8 5 3
Mean = (6 + 3 + 8 + 5 + 3)/5 = 5
Ex: 85 76 93 82 96
Mean = (85 + 76 + 93 + 82 + 96)/5 = 86.4
The mean need not be a member of the data set.
The mean is the most common measure of central tendency and is the statistic usually denoted
by the word “average.” The mean is the “balance point” of a distribution: the sum of the
distances to the right of the mean equals the sum of the distances to the left.
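The "balance point" property can be checked numerically on the first example (6 3 8 5 3). The variable names in this short sketch are illustrative.

```python
data = [6, 3, 8, 5, 3]
mean = sum(data) / len(data)

# Total distance of the values above the mean and of those below it.
right = sum(x - mean for x in data if x > mean)
left = sum(mean - x for x in data if x < mean)

print(mean)   # 5.0
print(right)  # 4.0  (6 is 1 above, 8 is 3 above)
print(left)   # 4.0  (each 3 is 2 below)
```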
Ex: There is a salary dispute between management and labor at Castellon Manufacturing. The labor
union claims that the average salary is only $3000/month. Management says the average salary is
$7300. You have been called in as a federal mediator. The first thing you need to do is to figure out
the average salary. Suppose there are only 10 employees and you can get their monthly salaries
from payroll. They are:
$3000, $3000, $3000, $3500, $4000, $4500, $6000, $6000, $15000, and $25000
Does the union’s claim of $3000 seem like the “average”?
Does management’s claim of $7300 seem like the “average”?
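Both sides may be quoting a legitimate "average," just different ones. The sketch below compares the three measures of central tendency. It takes the ninth salary as $15000, an assumption: that is the value that keeps the list in ascending order and makes the mean equal management's $7300 figure.

```python
from statistics import mean, median, mode

# Monthly salaries; the ninth value (15000) is an assumed reading of the list.
salaries = [3000, 3000, 3000, 3500, 4000, 4500, 6000, 6000, 15000, 25000]

print(mode(salaries))    # 3000, the union's "average"
print(median(salaries))  # 4250.0, halfway between 4000 and 4500
print(mean(salaries))    # 7300, management's "average"
```

The two largest salaries pull the mean well above what most employees earn, so the median is arguably the fairest summary for the mediator to use.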
Weighted Mean = Suppose one class of 20 students averaged 80% on a test, while another class of
30 students averaged 74%. What is the average for the combined group of students?
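One way to work this question: weight each class average by its class size, then divide by the total number of students. The helper name `weighted_mean` is hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


class_averages = [80, 74]   # percent
class_sizes = [20, 30]      # number of students

combined = weighted_mean(class_averages, class_sizes)
print(combined)  # (20*80 + 30*74) / 50 = 3820 / 50 = 76.4
```

The combined average (76.4%) is closer to 74% than to 80% because the larger class carries more weight. Simply averaging 80 and 74 would give 77, which is wrong.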
DISPERSION OR VARIATION
Range = the difference or distance between the highest and lowest data values.
Variance, σ^2 = sum of squared deviations from the mean divided by the number of data points
Standard Deviation, s = √variance = √[Σ(x – µ)^2 / n] for a population, or √[Σ(x – x̄)^2 / (n – 1)] for a sample
Note: as a rule of thumb for roughly bell-shaped distributions, the spread (range) of the data is about 6 standard deviations.
Standard deviation is usually rounded to 1-2 decimal places.
Ex: data: 1 3 5 6 6 9
s=
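The calculation for this data set can be laid out step by step. Because the notes give both the n and the n – 1 divisor, this sketch computes both versions and cross-checks them against the `pstdev` and `stdev` functions in Python's `statistics` module.

```python
import math
import statistics

data = [1, 3, 5, 6, 6, 9]
n = len(data)
mean = sum(data) / n                                  # 30 / 6 = 5.0

# Squared deviations from the mean: 16, 4, 0, 1, 1, 16
sum_of_squares = sum((x - mean) ** 2 for x in data)   # 38.0

population_sd = math.sqrt(sum_of_squares / n)         # divide by n
sample_sd = math.sqrt(sum_of_squares / (n - 1))       # divide by n - 1
data_range = max(data) - min(data)

print(round(population_sd, 2))  # 2.52
print(round(sample_sd, 2))      # 2.76
print(data_range)               # 8

# Cross-check against the standard library.
assert math.isclose(population_sd, statistics.pstdev(data))
assert math.isclose(sample_sd, statistics.stdev(data))
```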
POSITION
Quartiles = numbers that divide ranked data into fourths. A data set has 3 quartiles.
1st Quartile = a number such that at most 1/4 of the data are smaller in value, and at most 3/4
are larger.
2nd Quartile = median
3rd Quartile = a number such that at most 3/4 of the data are smaller in value, and at most 1/4
are larger.
Percentiles = numbers that divide ranked data into 100 parts. A data set has 99 percentiles.
Deciles = numbers that divide ranked data into 10 parts. A data set has 9 deciles.
Here’s an example using a small data set, which contains an odd number of values.
35 47 48 50 51 53 54 70 75
Split the data in half at the median (51), keeping the median in both halves because the count is odd, then find the median of each half: Q1 = 48 and Q3 = 54.
Interquartile range, IQR, Q3 – Q1 = 54–48 = 6
Here’s an example using a small data set, which contains an even number of values:
35 47 48 50 51 53 54 60 70 75
Split the data in half at the median (52), then find the median of each half: Q1 = 48 and Q3 = 60.
Interquartile range, IQR, Q3 – Q1 = 60–48 = 12
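Both quartile examples can be reproduced with a short sketch of the split-at-the-median method. The function name `hinges` is hypothetical. To match the first example (Q1 = 48, Q3 = 54), it assumes the median is counted in both halves when the number of values is odd. Other textbooks and software use different quartile rules and can give slightly different answers.

```python
from statistics import median


def hinges(data):
    """Return (Q1, median, Q3) using the median-of-each-half method."""
    ranked = sorted(data)
    n = len(ranked)
    half = (n + 1) // 2           # odd n: the median lands in both halves
    lower = ranked[:half]
    upper = ranked[n - half:]
    return median(lower), median(ranked), median(upper)


odd_set = [35, 47, 48, 50, 51, 53, 54, 70, 75]
even_set = [35, 47, 48, 50, 51, 53, 54, 60, 70, 75]

for data in (odd_set, even_set):
    q1, q2, q3 = hinges(data)
    iqr = q3 - q1
    midhinge = (q1 + q3) / 2      # halfway between the two hinges
    print(q1, q2, q3, iqr, midhinge)
# odd_set:  48 51 54 6 51.0
# even_set: 48 52.0 60 12 54.0
```

The last value printed for each set is the midhinge defined earlier in these notes, the value halfway between the left and right hinges.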
