
Introduction to Summary Statistics

Introduction to Engineering Design © 2012 Project Lead The Way, Inc.


Statistics
• The collection, evaluation, and interpretation of
data

• Statistical analysis of measurements can help
verify the quality of a design or process
Summary Statistics
Central Tendency
• “Center” of a distribution
– Mean, median, mode
Variation
• Spread of values around the center
– Range, standard deviation, interquartile range
Distribution
• Summary of the frequency of values
– Frequency tables, histograms, normal distribution
Mean Central Tendency

• The mean is the sum of the values of a set
of data divided by the number of values in
that data set.

$$\mu = \frac{\sum x_i}{N}$$

μ = mean value
x_i = individual data value
Σ x_i = summation of all data values
N = # of data values in the data set
Mean Central Tendency

• Data Set
3 7 12 17 21 21 23 27 32 36 44
• Sum of the values = 243
• Number of values = 11
$$\text{Mean} = \mu = \frac{\sum x_i}{N} = \frac{243}{11} \approx 22.09$$
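The calculation above can be checked in a few lines of Python. This is a minimal sketch using only built-in functions; the variable names are illustrative and not part of the original slides.

```python
# Data set from the slide
data = [3, 7, 12, 17, 21, 21, 23, 27, 32, 36, 44]

total = sum(data)   # sum of all data values
n = len(data)       # N, the number of data values
mean = total / n    # mu = (sum of x_i) / N, kept unrounded

print(f"Sum = {total}, N = {n}, mean = {mean:.4f}")
```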
A Note about Rounding in Statistics
• General Rule: Don’t round until the final
answer
– If you are writing intermediate results you may
round the written values, but keep the unrounded
number in memory
• Mean – round to one more decimal place
than the original data
• Standard Deviation – round to one more
decimal place than the original data
Mean – Rounding
• Data Set
3 7 12 17 21 21 23 27 32 36 44
• Sum of the values = 243
• Number of values = 11
$$\text{Mean} = \mu = \frac{\sum x_i}{N} = \frac{243}{11} \approx 22.09$$

• Reported: Mean = 22.1
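The rounding rule can be sketched in Python as follows. The constant `DATA_DECIMALS` is a hypothetical name introduced here to record that the original data are whole numbers; the mean is kept at full precision and rounded only when it is reported.

```python
data = [3, 7, 12, 17, 21, 21, 23, 27, 32, 36, 44]

# Keep the unrounded value in memory for any further calculations.
mean = sum(data) / len(data)

# Assumption: the original data have 0 decimal places (whole numbers),
# so the reported mean gets one more decimal place than the data.
DATA_DECIMALS = 0
reported_mean = round(mean, DATA_DECIMALS + 1)

print(f"Reported: Mean = {reported_mean:.1f}")
```

Python's built-in `round` rounds exact ties to the nearest even digit, which can differ from hand rounding in tie cases; that does not affect this example.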


Mode Central Tendency

• Measure of central tendency
• The most frequently occurring value in a
set of data is the mode
• Symbol is M

Data Set:
27 17 12 7 21 44 23 3 36 32 21
Mode Central Tendency

• The most frequently occurring value in a
set of data is the mode

Data Set:
3 7 12 17 21 21 23 27 32 36 44

Mode = M = 21
Mode Central Tendency

• The most frequently occurring value in a
set of data is the mode
• Bimodal Data Set: Two numbers of equal
frequency stand out
• Multimodal Data Set: More than two
numbers of equal frequency stand out
Mode Central Tendency

Determine the mode of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
Mode = 63

Determine the mode of
48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55
Mode = 63 & 59 (bimodal)

Determine the mode of
48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55
Mode = 63, 59, & 48 (multimodal)
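The three mode exercises can be reproduced with a short Python sketch. The helper name `modes` is hypothetical; it returns every value that shares the highest frequency, so one result means a single mode, two means bimodal, and more than two means multimodal. Note that if every value occurred exactly once, this helper would return all of them, a case the slides do not cover.

```python
from collections import Counter

def modes(values):
    """Return, in ascending order, every value with the highest frequency."""
    counts = Counter(values)           # value -> number of occurrences
    highest = max(counts.values())     # the largest frequency in the set
    return sorted(v for v, c in counts.items() if c == highest)

single = modes([48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55])
bimodal = modes([48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55])
multimodal = modes([48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55])

print(single)      # [63]
print(bimodal)     # [59, 63]
print(multimodal)  # [48, 59, 63]
```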
Median Central Tendency

• Measure of central tendency
• The median is the value that occurs in the
middle of a set of data that has been
arranged in numerical order
• Symbol is $\tilde{x}$, pronounced “x-tilde”
Median Central Tendency

• The median is the value that occurs in the
middle of a set of data that has been
arranged in numerical order

Data Set:
27 17 12 7 21 44 23 3 36 32 21

Arranged in numerical order:
3 7 12 17 21 21 23 27 32 36 44
Median Central Tendency

• In a data set that contains an odd number of
values, the median is the single middle value,
so it is always one of the values in the set

Data Set:
3 7 12 17 21 21 23 27 32 36 44

Median = 21
Median Central Tendency

• For a data set that contains an even
number of values, the two middle values
are averaged, and the result is the
median

Data Set:
3 7 12 17 21 21 23 27 31 32 36 44

Middle of data set: 21 and 23
Median = (21 + 23) / 2 = 22
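Both median cases (odd and even number of values) can be sketched in Python. The function name `median` and the variable names are illustrative; the function sorts the data first and then either picks the middle value or averages the two middle values.

```python
def median(values):
    """Return the middle value of the sorted data (or the mean of the two middle values)."""
    ordered = sorted(values)      # arrange in numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: one middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_set = [27, 17, 12, 7, 21, 44, 23, 3, 36, 32, 21]         # 11 values
even_set = [3, 7, 12, 17, 21, 21, 23, 27, 31, 32, 36, 44]    # 12 values

odd_median = median(odd_set)
even_median = median(even_set)
print(odd_median, even_median)
```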
Range Variation

• Measure of data variation
• The range is the difference between the
largest and smallest values that occur in a
set of data
• Symbol is R

Data Set:
3 7 12 17 21 21 23 27 32 36 44
Range = R = maximum value – minimum value
R = 44 – 3 = 41
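As a quick check, the range can be computed with Python's built-in `max` and `min`. The variable names are illustrative; this is only a sketch of the formula above.

```python
data = [3, 7, 12, 17, 21, 21, 23, 27, 32, 36, 44]

# Range = maximum value - minimum value
data_range = max(data) - min(data)

print(f"R = {max(data)} - {min(data)} = {data_range}")
```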
Standard Deviation Variation

• Measure of data variation
• The standard deviation is a measure of
the spread of data values
– A larger standard deviation indicates a wider
spread in data values
Standard Deviation Variation

$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$

σ = standard deviation
xi = individual data value ( x1, x2, x3, …)
μ = mean
N = size of population
Standard Deviation Variation

$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$

Procedure

1. Calculate the mean, μ
2. Subtract the mean from each value and
then square each difference
3. Sum all squared differences
4. Divide the summation by the size of the
population (number of data values), N
5. Calculate the square root of the result
A Note about Rounding in Statistics, Again

• General Rule: Don’t round until the final
answer
– If you are writing intermediate results you may
round the written values, but keep the unrounded
number in memory
• Standard Deviation: Round to one more
decimal place than the original data
Standard Deviation

$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$

Calculate the standard deviation for the data array
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63

1. Calculate the mean

$$\mu = \frac{\sum x_i}{N} = \frac{524}{11} \approx 47.64$$

2. Subtract the mean from each data value and square each
difference, $(x_i - \mu)^2$. The values below use the unrounded
mean, μ = 47.6363…

(2 - μ)² = 2082.6777        (59 - μ)² = 129.1322
(5 - μ)² = 1817.8595        (60 - μ)² = 152.8595
(48 - μ)² = 0.1322          (62 - μ)² = 206.3140
(49 - μ)² = 1.8595          (63 - μ)² = 236.0413
(55 - μ)² = 54.2231         (63 - μ)² = 236.0413
(58 - μ)² = 107.4050
Standard Deviation Variation

3. Sum all squared differences

$$\sum (x_i - \mu)^2 = 2082.6777 + 1817.8595 + 0.1322 + 1.8595 + 54.2231 + 107.4050 + 129.1322 + 152.8595 + 206.3140 + 236.0413 + 236.0413 = 5{,}024.5455$$

Note that this is the sum of the unrounded squared differences.

4. Divide the summation by the number of data values

$$\frac{\sum (x_i - \mu)^2}{N} = \frac{5024.5455}{11} = 456.7769$$

5. Calculate the square root of the result

$$\sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \sqrt{456.7769} \approx 21.4$$
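The five steps of the worked example can be traced in Python. This is a sketch with illustrative variable names; it keeps every intermediate value unrounded and rounds only the final answer. `statistics.pstdev` (the population standard deviation in the standard library) is used only as a cross-check.

```python
import math
from statistics import pstdev

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]
n = len(data)                                     # N, size of the population

mean = sum(data) / n                              # step 1: mean
squared_diffs = [(x - mean) ** 2 for x in data]   # step 2: squared differences
sum_sq = sum(squared_diffs)                       # step 3: sum of squared differences
variance = sum_sq / n                             # step 4: divide by N
sigma = math.sqrt(variance)                       # step 5: square root

print(f"mean = {mean:.4f}")
print(f"sum of squared differences = {sum_sq:.4f}")
print(f"sum / N = {variance:.4f}")
print(f"standard deviation = {sigma:.1f}")

# Cross-check against the standard library's population standard deviation.
assert math.isclose(sigma, pstdev(data))
```

Dividing by N matches the population formula on the slides; a sample standard deviation would divide by N - 1 instead.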
Histogram Distribution
• A histogram is a common data distribution
chart that is used to show the frequency
with which specific values, or values within
ranges, occur in a set of data.
• An engineer might use a histogram to
show the variation of a dimension that
exists among a group of parts that are
intended to be identical.

[Figure: histogram of Frequency (0 to 5) versus Length (in.), with values from 0.745 to 0.760]
Histogram Distribution

• Large sets of data are often divided into a
limited number of groups. These groups
are called class intervals.

Class Intervals: -16 to -6, -5 to 5, 6 to 16
Histogram Distribution

• The number of data elements in each
class interval is shown by the frequency,
which is indicated along the Y-axis of the
graph.

[Figure: histogram with Frequency on the Y-axis and the class intervals -16 to -6, -5 to 5, and 6 to 16 on the X-axis]
Histogram Distribution

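Example

Data: 1, 7, 15, 4, 8, 8, 5, 12, 10

Arranged in numerical order: 1, 4, 5, 7, 8, 8, 10, 12, 15

Class interval    Boundaries
1 to 5            0.5 ≤ x < 5.5
6 to 10           5.5 ≤ x < 10.5
11 to 15          10.5 ≤ x < 15.5

[Figure: histogram of Frequency versus class interval, with boundaries at 0.5, 5.5, 10.5, and 15.5]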
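The frequencies for this example can be tallied with a short Python sketch. The boundaries 0.5, 5.5, 10.5, and 15.5 come from the slide; the variable names and the text rendering of the bars are illustrative only.

```python
data = [1, 7, 15, 4, 8, 8, 5, 12, 10]

# Class boundaries from the slide; each class is lower <= x < upper.
boundaries = [0.5, 5.5, 10.5, 15.5]

frequencies = []
for lower, upper in zip(boundaries, boundaries[1:]):
    # Count the data elements that fall inside this class interval.
    count = sum(1 for x in data if lower <= x < upper)
    frequencies.append(count)

# Print a simple text histogram, one row per class interval.
for (lower, upper), count in zip(zip(boundaries, boundaries[1:]), frequencies):
    print(f"{lower:>4} <= x < {upper:<4} | {'#' * count} ({count})")
```

Using half-unit boundaries means no whole-number data value can land exactly on a boundary, so every value belongs to exactly one class.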
Histogram Distribution

• The height of each bar in the chart
indicates the number of data elements, or
frequency of occurrence, within each
range.

1, 4, 5, 7, 8, 8, 10, 12, 15

[Figure: histogram of Frequency versus the class intervals 1 to 5, 6 to 10, and 11 to 15]
Histogram Distribution

[Figure: histogram of Frequency versus Length (in.), highlighting the class interval 0.7495 < x ≤ 0.7505; minimum = 0.745 in., maximum = 0.760 in.]
Dot Plot Distribution

0 3 -1 -3
3 2 1 0
-1 -1 2 1
0 1 -1 -2
1 2 1 0
-2 -4 0 0

[Figure: dot plot of the data above, with Frequency on the vertical axis and values from -6 to 6 on the horizontal axis]
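A dot plot of the 24 values in this data set can be approximated in text with Python. This sketch uses `collections.Counter` to count how often each value occurs and prints one dot per occurrence; the layout (values listed vertically, dots extending to the right) is an illustrative choice, not the slide's.

```python
from collections import Counter

# The 24 data values from the slide, read row by row.
data = [0, 3, -1, -3,
        3, 2, 1, 0,
        -1, -1, 2, 1,
        0, 1, -1, -2,
        1, 2, 1, 0,
        -2, -4, 0, 0]

counts = Counter(data)   # value -> frequency; missing values count as 0

# One row per axis value from -6 to 6, one dot per occurrence.
for value in range(-6, 7):
    print(f"{value:>3} | {'.' * counts[value]}")
```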
Normal Distribution Distribution

Bell-shaped curve

[Figure: bell-shaped curve of Frequency versus Data Elements, from -6 to 6]
