1: Introduction to Statistics and Data Analysis
EM 7: Engineering Data Analysis
Pamantasan ng Lungsod ng Valenzuela
The Challenge
• Advances in science and engineering occur in large part through the collection and analysis of data. Proper analysis of data can be challenging, because scientific data are subject to random variation.
• How can one draw conclusions from the results of an experiment when those results could have come out differently?
• The methods of statistics allow scientists and engineers to design valid experiments and to draw reliable conclusions from the data they produce.
Sampling
DEFINITION
• A population is the entire collection of objects or outcomes about which information is sought.
• A sample is a subset of a population, containing the objects or outcomes that are actually observed.
[Figure: a sample drawn from a population]
For example, suppose we wish to study the heights of students at PLV by measuring a sample of 100 students.
• How should we choose the 100 students to measure?
Sampling
Think of a lottery with 10,000 tickets from which 5 winners will be chosen. What is the fairest way to choose the winners?
A simple random sample of size n is a sample chosen by a method in which each collection of n population items is equally likely to comprise the sample, just as in a lottery.
22/01/2020
Sampling
EXAMPLE: A utility company wants to conduct a survey to measure the satisfaction level of its customers in a certain town. There are 10,000 customers in the town, and utility employees want to draw a sample of size 200 to interview personally. They obtain a list of all 10,000 customers, and number them from 1 to 10,000. They use a computer random number generator to generate 200 random integers between 1 and 10,000 and then contact the customers who correspond to those numbers. Is this a simple random sample?
Sampling
EXAMPLE: A quality engineer wants to inspect electronic microcircuits in order to obtain information on the proportion that are defective. She decides to draw a sample of 100 circuits from a day’s production. Each hour for 5 hours, she takes the 20 most recently produced circuits and tests them. Is this a simple random sample?
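The selection step in the utility-company example can be sketched in Python as follows. This is a minimal sketch, not the company's actual procedure: the seed and variable names are illustrative assumptions, and `random.sample` draws without replacement, so no customer number can appear twice (independently generated random integers could repeat).

```python
import random

POPULATION_SIZE = 10_000  # customers numbered 1..10,000 (from the example)
SAMPLE_SIZE = 200         # customers to interview (from the example)

rng = random.Random(2020)  # fixed seed (an assumption) so the run is repeatable

# Every set of 200 distinct customer numbers is equally likely to be chosen,
# which is what makes this a simple random sample.
customer_ids = range(1, POPULATION_SIZE + 1)
selected = rng.sample(customer_ids, SAMPLE_SIZE)

print(sorted(selected)[:10])  # the ten smallest selected customer numbers
```

By contrast, taking the 20 most recently produced circuits each hour does not give every collection of 100 circuits the same chance of being chosen.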
Sampling
EXAMPLE: A construction engineer has just received a shipment of 1000 concrete blocks, each weighing approximately 25 kilograms. The blocks have been delivered in a large pile. The engineer wishes to investigate the compressive strength of the blocks by measuring the strengths in a sample of blocks. What is the more appropriate method of selecting random samples?
DEFINITION
• A sample of convenience is a sample that is not drawn by a well-defined random method.
Sampling
For example, a quality inspector draws a random sample of 40 bolts from a large shipment, measures the length of each, and finds that 32 of them (80%) meet a length specification. By chance, a second inspector gets a few more good bolts, about 90% in her sample. The proportion of good bolts in the population is likely to be close to 80% or 90%, but it is not likely to be exactly equal to either value.
DEFINITION
• Sampling variation is the phenomenon that two or more different samples from the same population will differ from each other.
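Sampling variation can be illustrated with a small simulation. The sketch below assumes a hypothetical shipment of 10,000 bolts in which exactly 85% meet the specification; that percentage, the shipment size, and the seed are illustrative assumptions and do not come from the example.

```python
import random

rng = random.Random(1)  # fixed seed (an assumption) for repeatability

# Hypothetical population: 10,000 bolts, 85% of which meet the specification.
# True = meets the length specification, False = does not.
population = [True] * 8_500 + [False] * 1_500

def sample_proportion(sample_size):
    """Draw a simple random sample and return the proportion of good bolts."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Two inspectors each draw their own sample of 40 bolts.
first = sample_proportion(40)
second = sample_proportion(40)
print(f"Inspector 1: {first:.1%}, Inspector 2: {second:.1%}")
```

The two proportions will typically differ from each other and from the population value, even though both samples come from the same shipment.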
Summary Statistics
Sample Mean
The sample mean, also known as the “arithmetic mean” or the “average,” is the sum of the numbers in a sample, divided by how many there are.
DEFINITION
Let $X_1, \ldots, X_n$ be a sample. The sample mean is:
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
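As a quick check of the definition, the sample mean can be computed directly from the formula. The four data values below are made up for illustration.

```python
def sample_mean(xs):
    """Return the sum of the values divided by how many there are."""
    return sum(xs) / len(xs)

sample = [2, 4, 6, 8]  # hypothetical data, not from the slides
mean = sample_mean(sample)
print(mean)  # (2 + 4 + 6 + 8) / 4
```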
Sample Variance and Standard Deviation
The sample standard deviation is a quantity that measures the degree of spread in a sample. The square of the sample standard deviation is the sample variance.
DEFINITION
Let $X_1, \ldots, X_n$ be a sample. The sample variance is the quantity:
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
An equivalent formula can be used:
$$s^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)$$
Sample Variance and Standard Deviation
DEFINITION
Let $X_1, \ldots, X_n$ be a sample. The sample standard deviation is the quantity:
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}$$
An equivalent formula can be used:
$$s = \sqrt{\frac{1}{n-1} \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)}$$
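The two variance formulas can be compared numerically. The sketch below uses a small made-up sample (an assumption for illustration) and checks that the definition and the equivalent formula give the same result.

```python
import math

def sample_variance(xs):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Equivalent formula: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)

sample = [2, 4, 6, 8]  # hypothetical data, not from the slides
s2 = sample_variance(sample)
s2_alt = sample_variance_shortcut(sample)
s = math.sqrt(s2)  # the standard deviation is the square root of the variance

print(s2, s2_alt, s)
```

In floating-point arithmetic the equivalent formula can lose precision when the values are large relative to their spread, so the definition is usually the safer choice in code.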
Quartiles
The median divides the sample in half; quartiles divide it as nearly as possible into quarters. A sample has 3 quartiles.
Let n represent the sample size. The quartiles are found at the following positions in the sorted sample:
First quartile: $0.25(n + 1)$
Second quartile: $0.50(n + 1)$
Third quartile: $0.75(n + 1)$
Note that the second quartile is the same as the median.
Quartiles
Example: In the article “Evaluation of Low-Temperature Properties of HMA Mixtures” (P. Sebaaly, A. Lake, and J. Epps, Journal of Transportation Engineering, 2002:578-583), the following values of fracture stress (in MPa) were measured for a sample of 22 mixtures of hot-mixed asphalt (HMA).
30 75 79 80 80 105 126 138 149 179 191
223 232 236 240 242 245 247 254 274 384 470
Find the first and third quartiles.
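For the fracture-stress sample ($n = 22$), the quartile positions can be worked out as below. This sketch assumes the convention that, when a position is not an integer, the quartile is the average of the two sample values on either side of it; the notation $X_{(k)}$ for the $k$th smallest value is introduced here for illustration.

```latex
\begin{align*}
\text{Position of the first quartile: } & 0.25(22 + 1) = 5.75 \\
Q_1 &= \frac{X_{(5)} + X_{(6)}}{2} = \frac{80 + 105}{2} = 92.5 \\[4pt]
\text{Position of the third quartile: } & 0.75(22 + 1) = 17.25 \\
Q_3 &= \frac{X_{(17)} + X_{(18)}}{2} = \frac{245 + 247}{2} = 246
\end{align*}
```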
Percentiles
The pth percentile of a sample, for a number p between 0 and 100, divides the sample so that as nearly as possible p% of the sample values are less than the pth percentile and (100 - p)% are greater.
Let n represent the sample size.
pth percentile: $\frac{p}{100}(n + 1)$
Note that the 25th percentile is the 1st quartile, the median is the 50th percentile and 2nd quartile, and the 75th percentile is the 3rd quartile. If the quantity is an integer, the sample value in that position is the percentile; otherwise, take the average of the two sample values on either side.
Percentiles
Example: In the article “Evaluation of Low-Temperature Properties of HMA Mixtures” (P. Sebaaly, A. Lake, and J. Epps, Journal of Transportation Engineering, 2002:578-583), the following values of fracture stress (in MPa) were measured for a sample of 22 mixtures of hot-mixed asphalt (HMA).
30 75 79 80 80 105 126 138 149 179 191
223 232 236 240 242 245 247 254 274 384 470
Find the 65th percentile.
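The percentile rule can be sketched in Python using the fracture-stress data from the example. The function name and the error raised for out-of-range positions are assumptions made for this illustration; the slides do not say what to do when the position falls outside the sample.

```python
def percentile(sample, p):
    """Return the pth percentile using the (p/100)(n + 1) position rule."""
    data = sorted(sample)
    n = len(data)
    pos = p * (n + 1) / 100  # 1-based position in the sorted sample
    if pos < 1 or pos > n:
        raise ValueError("position falls outside the sample")
    lower = int(pos)  # whole-number part of the position
    if pos == lower:
        return data[lower - 1]  # integer position: take that sample value
    # Otherwise average the two sample values on either side of the position.
    return (data[lower - 1] + data[lower]) / 2

fracture_stress = [30, 75, 79, 80, 80, 105, 126, 138, 149, 179, 191,
                   223, 232, 236, 240, 242, 245, 247, 254, 274, 384, 470]

p65 = percentile(fracture_stress, 65)  # position 0.65 * 23 = 14.95
print(p65)  # average of the 14th and 15th values: (236 + 240) / 2 = 238.0
```

The same function gives the quartiles when called with p = 25, 50, and 75.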
Graphical Summaries
Stem-and-leaf Plot
Example: The table below shows data from a study of the bioactivity of a certain antifungal drug. The drug was applied to the skin of 48 subjects. After 3 hours, the amount of drug remaining in the skin was measured in units of ng/cm². The list has been sorted in numerical order.
3 4 4 7 7 8 9 9 12 12
15 16 16 17 17 18 20 20 21 21
22 22 22 23 24 25 26 26 26 26
27 33 34 34 35 36 36 37 38 40
40 41 41 51 53 55 55 74
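A stem-and-leaf plot for these 48 values can be produced with a few lines of Python. This is a minimal sketch that assumes the tens digit is used as the stem and the ones digit as the leaf; the function name is hypothetical, and other stem choices are possible.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem), listing ones digits (leaves) in order."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    # Include stems with no leaves so that gaps in the data stay visible.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}

drug_remaining = [
    3, 4, 4, 7, 7, 8, 9, 9, 12, 12,
    15, 16, 16, 17, 17, 18, 20, 20, 21, 21,
    22, 22, 22, 23, 24, 25, 26, 26, 26, 26,
    27, 33, 34, 34, 35, 36, 36, 37, 38, 40,
    40, 41, 41, 51, 53, 55, 55, 74,
]

plot = stem_and_leaf(drug_remaining)
for stem, leaves in plot.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row shows one stem followed by its leaves, so the row lengths give a rough picture of the shape of the data.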
Histogram
Class interval (g/gal)   Frequency   Relative frequency
1 ≤ x < 3                12          0.1935
3 ≤ x < 5                11          0.1774
5 ≤ x < 7                18          0.2903
7 ≤ x < 9                9           0.1452
9 ≤ x < 11               5           0.0806
11 ≤ x < 13              1           0.0161
13 ≤ x < 15              2           0.0323
15 ≤ x < 17              0           0.0000
17 ≤ x < 19              2           0.0323
19 ≤ x < 21              1           0.0161
21 ≤ x < 23              0           0.0000
23 ≤ x < 25              1           0.0161
To construct a histogram: (1) determine the number of classes to use and construct intervals of equal width; (2) compute the frequency and relative frequency for each class; and (3) draw a rectangle for each class. The heights of the rectangles may be set equal to the frequencies or to the relative frequencies.
Skewness
Skewness refers to the asymmetry of a histogram; a symmetric histogram has its right half a mirror image of its left half. A histogram skewed to the left, or negatively skewed, has a long left-hand tail. On the other hand, a histogram skewed to the right, or positively skewed, has a long right-hand tail.
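Steps (2) and (3) of the histogram construction can be sketched from the frequencies in the table. The raw measurements are not given, so this sketch starts from the class frequencies; the text-based bars and variable names are illustrative assumptions.

```python
# Lower endpoints of the 12 classes of width 2 g/gal: 1, 3, ..., 23.
class_starts = list(range(1, 25, 2))
# Frequencies copied from the table, one per class.
frequencies = [12, 11, 18, 9, 5, 1, 2, 0, 2, 1, 0, 1]

total = sum(frequencies)
# Relative frequency = class frequency divided by the total number of values.
relative = [round(f / total, 4) for f in frequencies]

# Draw one bar per class, with height (here, length) equal to the frequency.
for start, f, r in zip(class_starts, frequencies, relative):
    print(f"{start:>2} <= x < {start + 2:<2}  {f:>2}  {r:.4f}  {'*' * f}")
```

The bars thin out toward the larger classes, which is the long right-hand tail of a histogram that is skewed to the right.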
Histogram Modes