e. relative frequency
f. cumulative relative frequency
The first thing to do with numerical data is to organize it into a frequency table. Each column of a
frequency table generates (is used to create) a particular graph or chart.
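As a minimal sketch of how the columns relate, the following builds an ungrouped frequency table with frequency, relative frequency, and cumulative relative frequency columns. The data set and the helper name `frequency_table` are hypothetical and are not taken from these notes.

```python
from collections import Counter

# Hypothetical data (an assumption for illustration): number of siblings
# reported by 10 students.
data = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]


def frequency_table(values):
    """Return rows of (value, frequency, relative freq., cumulative relative freq.)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]  # running total of frequencies so far
        rows.append((value, counts[value], counts[value] / n, cumulative / n))
    return rows


table = frequency_table(data)
print("value  freq  rel.freq  cum.rel.freq")
for value, freq, rel, cum_rel in table:
    print(f"{value:>5} {freq:>5} {rel:>9.2f} {cum_rel:>13.2f}")
```

The last cumulative relative frequency should always come out to 1.00, which is a quick check that no data value was missed.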
3. a horizontal scale, which identifies the variable. Values for class boundaries, class limits, or
class marks may be labeled along the axis.
Shapes of histograms: symmetrical, uniform, skewed, J-shaped, and bimodal.
Frequency Curve or Polygon: the horizontal axis uses class marks. The vertical axis is either frequency or
relative frequency. Several sets of data can be depicted on the same graph.
Box-&-Whisker = a representation of the data set that splits the distribution into four groups of
25% each, often referred to as the quartile distribution. Several sets of data can be pictured side-by-side
using box-&-whisker plots, making data comparisons easier for the reader. The “key” points are:
1. 0% (or 10%)
2. 25%
3. 50%
4. 75%
5. 100% (or 90%)
Ex: There is a salary dispute between management and labor at Castellon Manufacturing. The labor
union claims that the average salary is only $3000/month. Management says the average salary is
$7300. You have been called in as a federal mediator. The first thing you need to do is to figure out
the average salary. Suppose there are only 10 employees and you can get their monthly salaries
from payroll. They are:
$3000, $3000, $3000, $3500, $4000, $4500, $6000, $6000, $15000 and $25000
Does the union’s claim of $3000 seem like the “average”?
Does management’s claim of $7300 seem like the “average”?
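One way to compare the two claims is to compute the mean, median, and mode of the salaries. This is a sketch using Python's standard `statistics` module; it assumes the ninth salary is $15000, the value that keeps the list in ascending order and makes the mean match management's $7300 figure.

```python
import statistics

# Monthly salaries from the example. The ninth value is taken as 15000,
# which is an assumption (see the lead-in above).
salaries = [3000, 3000, 3000, 3500, 4000, 4500, 6000, 6000, 15000, 25000]

mean_salary = statistics.mean(salaries)      # arithmetic mean
median_salary = statistics.median(salaries)  # middle of the ranked data
mode_salary = statistics.mode(salaries)      # most frequent value

print("mean:  ", mean_salary)
print("median:", median_salary)
print("mode:  ", mode_salary)
```

Under that assumption, management's figure is the mean and the union's figure is the mode, while the median falls between the two. Each side is quoting a different kind of "average".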
Weighted Mean = a mean in which each value counts in proportion to its weight (here, the class size).
Ex: Suppose one class of 20 students averaged 80% on a test, while another class of
30 students averaged 74%. What is the average for the combined group of students?
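The combined average can be sketched as a weighted mean, with each class average weighted by its class size. The function name `weighted_mean` is illustrative only.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)


class_averages = [80, 74]  # percent, from the example
class_sizes = [20, 30]     # number of students in each class

combined = weighted_mean(class_averages, class_sizes)
print(f"combined average: {combined:.1f}%")  # (20*80 + 30*74) / 50
```

The result, 76.4%, sits closer to 74% than to 80% because the larger class carries more weight. Simply averaging 80 and 74 would give 77, which is not the combined average.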
DISPERSION OR VARIATION
Range = the difference or distance between the highest and lowest data values.
Variance, σ² = sum of squared deviations divided by the number of data points = Σ(x – µ)^2 / n
Standard Deviation = √variance: σ = √[ Σ(x – µ)^2 / n ] for a population, or s = √[ Σ(x – x̄)^2 / (n-1) ] for a sample (x̄ = sample mean)
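Note: as a rule of thumb for roughly bell-shaped distributions, virtually all of the data lie within 3 standard deviations of the mean, so the spread (range) of the data is about 6 standard deviations.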
Standard deviation is usually rounded to 1-2 decimal places.
Ex: data: 1 3 5 6 6 9
s=
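The example can be worked step by step as a sketch: find the mean, square each deviation from it, sum the squares, then divide by n-1 (sample) or n (population) and take the square root. The variable names are illustrative.

```python
import math

data = [1, 3, 5, 6, 6, 9]
n = len(data)

data_range = max(data) - min(data)  # highest value minus lowest value
mean = sum(data) / n
sum_sq = sum((x - mean) ** 2 for x in data)  # sum of squared deviations

sample_sd = math.sqrt(sum_sq / (n - 1))  # s, divides by n-1
population_sd = math.sqrt(sum_sq / n)    # sigma, divides by n

print("range:", data_range)
print("mean:", mean)
print("sum of squared deviations:", sum_sq)
print("s (sample):", round(sample_sd, 2))
print("sigma (population):", round(population_sd, 2))
```

Here the mean is 5 and the squared deviations sum to 38, so s = √(38/5) ≈ 2.76, while dividing by n instead gives σ = √(38/6) ≈ 2.52.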
POSITION
Quartiles = numbers that divide ranked data into fourths. A data set has 3 quartiles.
1st Quartile = a number such that at most 1/4 of the data are smaller in value, and at most 3/4
are larger.
2nd Quartile = median
3rd Quartile = a number such that at most 3/4 of the data are smaller in value, and at most 1/4
are larger.
Percentiles = numbers that divide ranked data into 100 parts. A data set has 99 percentiles.
Deciles = numbers that divide ranked data into 10 parts. A data set has 9 deciles.
Here’s an example using a small data set, which contains an odd number of values.
35 47 48 50 51 53 54 70 75
Split the data in half at the median (51), counting the median in both halves, then find the median of each half: Q1 = 48 and Q3 = 54.
Interquartile range: IQR = Q3 – Q1 = 54 – 48 = 6
Here’s an example using a small data set, which contains an even number of values:
35 47 48 50 51 53 54 60 70 75
Split the data in half at the median (52, midway between 51 and 53), then find the median of each half: Q1 = 48 and Q3 = 60.
Interquartile range: IQR = Q3 – Q1 = 60 – 48 = 12
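The split-at-the-median procedure used in both examples can be sketched as follows. The function name `quartiles_by_halves` is illustrative. It assumes that, for an odd number of values, the median is counted in both halves, which is the convention that reproduces Q1 = 48 and Q3 = 54 for the first data set. Other quartile conventions, including those built into statistics packages, can give slightly different values.

```python
import statistics


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) by taking the median of each half of the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    half = (n + 1) // 2         # odd n: the middle value lands in both halves
    lower = ranked[:half]       # lower half of the ranked data
    upper = ranked[n - half:]   # upper half of the ranked data
    return (
        statistics.median(lower),
        statistics.median(ranked),
        statistics.median(upper),
    )


odd_data = [35, 47, 48, 50, 51, 53, 54, 70, 75]
even_data = [35, 47, 48, 50, 51, 53, 54, 60, 70, 75]

for data in (odd_data, even_data):
    q1, q2, q3 = quartiles_by_halves(data)
    print(f"Q1={q1}  median={q2}  Q3={q3}  IQR={q3 - q1}")
```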