
5/30/2021

Note: These handouts are used for the GB training of Henry Harwin
Management Academy and for GB course reference.

Proprietary/Gopalakrishna/Nirmala
Pharma Consultancy Services

PROCESS MAP

What is a Process Map?

A picture of the process showing the process steps, the inputs and the outputs.

It is always produced as the first step of a Six Sigma project.

Levels of Mapping Detail

High Level Maps
○ Scope: entire process
○ Process steps: grouped into major activities
○ Inputs & Outputs: stated in generalized terms
○ Measurement: major inspection points noted

Detailed Maps
○ Scope: critical process steps
○ Process steps: identified by individual task/activity – shows alternate paths and activities
○ Inputs & Outputs: stated in terms of attributes, characteristics or variables
○ Measurement: all data collection, process measurement, test and inspection steps noted

Start at a high level; move to detailed maps as required.

Process Mapping Steps

1. Identify the process, the inputs and the outputs (customer requirement)
2. Identify all process steps
3. List vital output variables at each step
4. List vital input variables and classify process inputs as controlled [C] or uncontrolled [U]
5. Add process specifications for input variables
6. Start an initial assessment of the control plan

Do not forget to walk the process.

Example: Process Operations

Making Bread

Inputs: Flour, Yeast, Water, Energy, Equipment, Personnel, Other Ingredients
Outputs: Good flavor, Right texture, Color, Correct weight

Identify Major Process Steps

Include all process steps, including verification & rework. (1000 meter view)

Making Bread

Mixing
• Assess demand
• Measure flour, yeast, milk, butter, sugar, salt & water
• Prepare yeast mix
• Mix sugar, salt & butter
• Combine with flour
• Beat well

Kneading
• Turn out on pastry board
• Knead center out until stiffens
• Knead in until smooth & satiny
• Place in greased bowl

Rising
• Turn twice
• Let rise until doubled
• Punch down & rise
• Shape loaves & place on greased trays

Baking
• Pre-heating
• Racking
• Time/Temperature cycle
• Deliver loaves

60X,s

Standard Process Mapping Symbols


Cross Functional Flow Charts – Swim Lanes

PRIORITIZATION MATRIX


Prioritization Matrix: Identify the Vital Few from the Many

A matrix that details the importance of the outputs and the relationship between the inputs and those outputs.

The tool produces a list of inputs that the team considers to be the most important.

It starts the funnelling process.
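The scoring behind a prioritization matrix can be sketched in a few lines of Python. This is only an illustrative sketch: the importance ratings, the 0/1/3/9 relationship scale and every score below are assumed values for the bread example, not figures from the course. Each input's total is the sum of (output importance x relationship score), and the inputs are then ranked by total.

```python
# Hypothetical importance rating for each output (higher = more important).
output_importance = {"Good flavor": 9, "Right texture": 8, "Correct weight": 5}

# Assumed relationship scores: 0 = none, 1 = weak, 3 = moderate, 9 = strong.
relationships = {
    "Flour": {"Good flavor": 3, "Right texture": 9, "Correct weight": 9},
    "Yeast": {"Good flavor": 3, "Right texture": 9, "Correct weight": 1},
    "Water": {"Good flavor": 1, "Right texture": 3, "Correct weight": 3},
    "Energy": {"Good flavor": 1, "Right texture": 3, "Correct weight": 0},
}


def prioritize(importance, scores):
    """Return (input, weighted total) pairs, highest total first."""
    totals = {
        name: sum(importance[output] * score for output, score in row.items())
        for name, row in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = prioritize(output_importance, relationships)
for name, total in ranking:
    print(f"{name:8s} {total}")
```

With these assumed scores, Flour and Yeast come out on top, so they are the inputs that would be carried forward in the funnel.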

Process Map & Prioritization Matrix Exercise

Define the process step and the inputs/outputs for the process.

You have 10 minutes.

WHY DATA ???

Data Collection Plan

Define a Metric (CTQ)

Define Operational Definition of CTQ

Define How & by Whom Measurements will be done

Collection of Data

Graphical Representation of the Data

Operational Definition - Baseline Data Collection

Data Collection Plan

TYPES OF DATA

Exercise - Data

Sampling of Data from Population

Sampling works when...


● Each member of the population has an equal chance of being selected (unbiased)

● Selecting one member doesn’t influence the likelihood of another member being selected or not (independent)

● There aren’t any significant differences between those selected and those that weren’t (representative)

● You have a large enough sample to find what you’re looking for. If it’s a rare event or you want to be very precise, you’ll need a large sample (big enough)

Sampling Plan

● A good sampling plan will capture all relevant sources of noise variability, i.e. it will capture the process going wrong
○ Lot-to-lot, batch-to-batch
○ Different shifts, operators, machines or processes
● Sample size rule of thumb: 30
● Input variables do not always have to be measured for each sample
○ Example:
■ Samples are drawn for an output variable measurement every hour
■ Ambient humidity (input variable) is assumed to be constant over a 4 hour period

Sampling Strategies

● You can use one or more of the following sampling designs:
○ Simple Random Sample
○ Stratified Random Sampling
○ Cluster Sampling
○ Systematic Sampling
○ Subgroup Sampling
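Three of these designs can be sketched with Python's standard random module. This is a rough illustration under stated assumptions: the population of 120 units spread over shifts A, B and C is hypothetical, and the helper names are invented for this example.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 120 units, 40 produced on each of three shifts.
population = [(unit_id, ("A", "B", "C")[unit_id % 3]) for unit_id in range(120)]


def simple_random_sample(items, n):
    """Every unit has the same chance of being picked."""
    return random.sample(items, n)


def systematic_sample(items, n):
    """Pick every k-th unit after a random starting point."""
    step = len(items) // n
    start = random.randrange(step)
    return items[start::step][:n]


def stratified_sample(items, n, key):
    """Sample each stratum in proportion to its share of the population."""
    strata = {}
    for item in items:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(items))
        sample.extend(random.sample(members, share))
    return sample


srs = simple_random_sample(population, 30)
systematic = systematic_sample(population, 30)
stratified = stratified_sample(population, 30, key=lambda item: item[1])
print(len(srs), len(systematic), len(stratified))
```

The stratified version guarantees that every shift is represented, which a simple random sample does not.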

BASIC STATISTICS

INFERENTIAL STATISTICS


Three Things to Know about DATA

When you have a collection of data, there are three things to know:

Where is the middle? (location / central tendency)
How spread out is the data? (dispersion)
How are the data distributed? (shape of the distribution)

Three Measures of Central Tendency

Mean
Median
Mode

Mean
“Mean” is the statistical term for what most people call “average”. It is usually the best indicator of where the center is.

We also use this notation: X1 = 5, X2 = 3, X3 = 6, X4 = 2

Mean = (5 + 3 + 6 + 2) / 4 = 4

Mean

μ is the symbol we use for the true population mean.

Xbar (X̄) is the symbol we use for the sample mean. It is an estimate of the population mean, based on a sample.

Median
Set of data: 8, 13, 7, 10, 12, 11, 10, 9
Sorted data: 7, 8, 9, 10, 10, 11, 12, 13

Median is the middle value; there are as many samples above as there are below. Here the two middle values are both 10, so the median is 10.
The median is the physical middle. The mean is the statistical middle.

Mode
Mode is the most common value, the
peak of the distribution.
It is a very weak indicator of where
the center is.
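As a quick check, the three measures of central tendency can be computed with Python's standard statistics module. This is a minimal sketch using the eight-value data set from the Median slide (8, 13, 7, 10, 12, 11, 10, 9).

```python
import statistics

data = [8, 13, 7, 10, 12, 11, 10, 9]

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle of the sorted values
mode = statistics.mode(data)      # most common value

print(mean, median, mode)
```

For this data set all three measures equal 10, which is what we expect when the data are roughly symmetrical.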

Mean vs. Median

The mean is usually the best indicator of where the center of the data is. However, there are times when the median is better.

Mean vs. Median


Salaries of randomly selected employees:

            Case 1      Case 2
Worker 1    $30,000     $30,000
Worker 2    $35,000     $35,000
Worker 3    $40,000     $40,000
Worker 4    $45,000     $45,000
Worker 5    $50,000     President, $400,000

Find the mean and the median for both cases.
What effect does a single large value have on the mean? On the median?
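The salary exercise can be checked with a short Python sketch. It uses only the ten salary figures listed in the two cases; the variable names are illustrative.

```python
import statistics

case_1 = [30_000, 35_000, 40_000, 45_000, 50_000]
case_2 = [30_000, 35_000, 40_000, 45_000, 400_000]  # last value is the President

results = {}
for label, salaries in (("Case 1", case_1), ("Case 2", case_2)):
    results[label] = (statistics.mean(salaries), statistics.median(salaries))
    print(label, "mean:", results[label][0], "median:", results[label][1])
```

The single large salary moves the mean from $40,000 to $110,000, while the median stays at $40,000 in both cases.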


Three Things to Know

When you have a collection of data, there are three things to know:

Where is the middle? (location / central tendency)
How spread out is the data? (dispersion)
How are the data distributed? (shape of the distribution)

Measures of Dispersion (Spread)

Range
Variance
Standard deviation

Range
Set of data: 8, 13, 7, 10, 12, 11, 10, 9 (largest = 13, smallest = 7)

Definition: Range = largest minus the smallest = (13 - 7) = 6
For groups of 2-10 items, range is about as sophisticated as you need to get.
Easy to calculate.
Easily distorted by one unusually large or small datum (outlier).

Variance
Set of data: 8, 13, 7, 10, 12, 11, 10, 9

Calculate the mean: Xbar = (8 + 13 + 7 + 10 + 12 + 11 + 10 + 9) / 8 = 10

Plot the data points

Plot the data in order, relative to the mean.

[Figure: the data values (7 to 13) plotted in order against the mean line, Xbar]

In English we say “X bar”.
What does (Xi - Xbar) mean?

Mathematics
Calculate the deviation (Xi – Xbar). Fill the others in:

Xi           8   13    7   10   12   11   10    9
Xi - Xbar   -2    3

Mathematics
Square the differences, and fill in the (Xi - Xbar)² row. Fill the others in:

Xi             8   13    7   10   12   11   10    9
Xi - Xbar     -2    3   -3    0    2    1    0   -1
(Xi - Xbar)²   4    9

Mathematics
Sum the (Xi - Xbar)² row. We write it as Σ(Xi - Xbar)².
This is the “sum of the squares”, a measure of total variation.

Xi             8   13    7   10   12   11   10    9
Xi - Xbar     -2    3   -3    0    2    1    0   -1
(Xi - Xbar)²   4    9    9    0    4    1    0    1    Σ = 28

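Formulae

The formula for the mean is:
Xbar = ΣXi / n

The formula for the variance of a sample is:
s² = Σ(Xi - Xbar)² / (n - 1)

Dividing by n - 1 instead of n is a correction factor: the deviations are measured from the sample mean rather than the true population mean, so dividing by n would underestimate the variance, most noticeably for a small number of data points.

Can you see that variance is just a fancy “average squared deviation”?

The variance of our set of sample data =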

Standard Deviation

The formula for the variance of a sample is:
s² = Σ(Xi - Xbar)² / (n - 1)

The formula for the standard deviation of a sample is:
s = √( Σ(Xi - Xbar)² / (n - 1) )

The standard deviation is just the square root of the variance.
The standard deviation of our set of sample data =
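The hand calculation can be reproduced step by step in Python. This sketch follows the slides' own steps (mean, deviations, sum of squares, divide by n - 1, square root) for the data set 8, 13, 7, 10, 12, 11, 10, 9, and cross-checks the result against the standard statistics module.

```python
import math
import statistics

data = [8, 13, 7, 10, 12, 11, 10, 9]
n = len(data)

x_bar = sum(data) / n                          # mean
deviations = [x - x_bar for x in data]         # Xi - Xbar
sum_of_squares = sum(d ** 2 for d in deviations)
variance = sum_of_squares / (n - 1)            # sample variance, s squared
std_dev = math.sqrt(variance)                  # sample standard deviation, s

print(x_bar, sum_of_squares, variance, std_dev)

# Cross-check against the library's sample variance and standard deviation.
assert math.isclose(variance, statistics.variance(data))
assert math.isclose(std_dev, statistics.stdev(data))
```

For this data set the sum of squares is 28, so the variance is 28 / 7 = 4 and the standard deviation is 2.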

Standard Deviation
The standard deviation is the most common measure of spread for collections of data larger than 10 items. It also works fine for collections as small as n=2.
It is less easily “pulled” by one outlier than the range, although an extreme value still inflates it.

The symbol for the population standard deviation is σ (we say sigma), and the symbol for an estimate of standard deviation, based on a sample, is s.

Symbols

                     Roman letters for       Greek letters for
                     estimates based         true population
                     on a sample             parameters
Mean                 Xbar                    μ
Range                R
Standard deviation   s                       σ
Variance             s²                      σ²

Summary of Measures
Mean is usually the best measure of where the middle of the data is.
Median is another measure of where the middle is, and it is best when the data contains outliers, or is known to be non-normal.
Range is the maximum minus the minimum. It is easy to compute, and gives a good measure of “spread” for small groups of data.
Standard deviation and variance are more sophisticated measures of “spread”. They are less easily distorted by a single outlier than the range.
Mean, median, range, standard deviation, and variance apply regardless of how your data are distributed.

Exercise
For each data set, choose which measure of “middle” and “spread” is best.

Data Set                              Middle    Spread
2,5,3
3,4,6,1,4,5,7,2,4,1000,1,5,7,3
3,4,6,1,4,5,7,2,4,1,5,7,3,3,7,4
2,4,6,651

Three Things to Know

When you have a collection of data, there are three things to know:

Where is the middle? (location / central tendency)
How spread out is the data? (dispersion)
How are the data distributed? (shape of the distribution)

Distributions
A bag of marbles is sorted according to size.

[Figure: marbles sorted into stacks by size, from -2 mm to +2 mm around a nominal 1 cm]

Dotplot
This gives us a natural graph of the number of cases (frequency) vs. diameter of the marble. This is a dotplot.

[Figure: dotplot; vertical axis: number of cases; horizontal axis: size of marble]

Histogram
If we make a bar chart, with bars the length of the stacks of marbles, we have made a histogram.

[Figure: histogram; vertical axis: number of cases; horizontal axis: size of marble]

Distribution Curve
If we do an infinite number of measurements, and make our increments of size infinitesimal, we get a continuous distribution curve.

[Figure: continuous distribution curve; vertical axis: number of cases; horizontal axis: size of marble]

The number of cases that happen between any two points on the horizontal axis is approximately the area under the distribution curve, between those two points.

[Figure: distribution curve with Point 1 and Point 2 marked on the horizontal axis]

There are Many Distributions


● Normal, Gaussian, or “bell curve”
● F distribution
● T distribution
● Chi-square distribution
● Uniform distribution
● Weibull distribution
These are all mathematical models. If your data fits one of these
models, you can use the model to represent your data.


Normal Distribution
The Normal Distribution often occurs in nature.
It is a very useful model.

Properties of the Normal Distribution

The Normal Distribution is symmetrical. The left half is the exact mirror image of the right half.

The Mean, the Median, and the Mode all occur exactly in the middle of the curve.

Once you specify the mean and the standard deviation of the normal curve, the curve is completely known.

Properties of the Normal Distribution

The area under the whole curve is 1, and we calculate the area that lies within a given number of standard deviations of the mean.
In sample size calculations we use 1.96, which is close to 2, because 95% of the data lie within +/- 1.96 standard deviations.

About 68% of all cases occur within +/- 1 Standard Deviation of the Mean.

About 95% of all cases occur within +/- 2 Standard Deviations of the Mean.

About 99.7% of all cases occur within +/- 3 Standard Deviations of the Mean.

Summary: Normal Distribution

● For a Normal Distribution:
68% of the data is within +/- 1 standard deviation
95% of the data is within +/- 2 standard deviations
99.73% of the data is within +/- 3 standard deviations
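These percentages can be verified numerically. The sketch below uses the standard result that, for a normal distribution, the fraction of data within +/- k standard deviations of the mean is erf(k / √2); the function name is illustrative.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 2, 3):
    print(f"+/- {k} standard deviations: {fraction_within(k):.2%}")
```

The value 1.96 gives almost exactly 95%, which is why it is used in sample size calculations, while 2 gives slightly more than 95%.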


Testing for a Normal Distribution

● We can test whether a given data set can be described as “normal” with a Normal Probability Plot

● If a distribution is close to normal, the Normal Probability Plot will be close to a straight line

● Minitab makes the normal probability plot easy

Standard Normal Distribution – Z distribution

[Figure: three normal curves with different means μ1, μ2 and μ3; each mean maps to zero on the Z scale (Z1 = 0, Z2 = 0, Z3 = 0)]

THE NORMAL DISTRIBUTION

The Area Bounded By Std. Deviations Can Be Used To Estimate The Cumulative Probability Of A Certain “Event” Occurring

[Figure: normal curve of the probability of a sample value, horizontal axis marked from μ - 3σ to μ + 3σ; 68.26% of the area lies within μ ± σ, 95.44% within μ ± 2σ and 99.73% within μ ± 3σ]

THE NORMAL DISTRIBUTION

Important Because:
● Many Natural Phenomena Seem To Follow It
● Provides The Basis For Statistical Inference Because Of Its Relationship To The Central Limit Theorem (SPC Will Explain More)

Properties Of Normal Distribution:
● Symmetrical In Appearance (One Side Mirror Image of Another)
● Measures Of Central Tendency Are All Identical (Mean, Median, Mode)
● +/-1 Sigma – 68.26%
● +/-2 Sigma – 95.44%
● +/-3 Sigma – 99.73%

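THE NORMAL CURVE / BELL-SHAPED CURVE

Standard Normal Distribution

• Average (Mean) = 0
• Standard Deviation = 1

[Figure: standard normal curve, Z axis marked in standard deviations (σ) at -3, -2, -1, 0, 1, 2, 3 and extending to -∞ and +∞, with the areas 68.26%, 95.44% and 99.73%]

Characteristics

• 68.26% of data lie within +/- 1 standard deviation
• 95.44% of data lie within +/- 2 standard deviations
• 99.73% of data lie within +/- 3 standard deviations
• 99.9999998% of data lie within +/- 6 standard deviations (the often-quoted Six Sigma figure of 99.99966%, i.e. 3.4 defects per million, assumes a 1.5 standard deviation shift of the mean)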

Z VALUE - SCALE OF MEASURE

Z = A Unit of Measure, equivalent to the number of Standard Deviations that a value is Away from the Target Value

[Figure: a 6 Sigma process; the process mean sits at Z = 0, with the Lower Specification Limit (LSL) at Z = -6.0 and the Upper Specification Limit (USL) at Z = +6.0, i.e. 6.0σ on either side of the mean]
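The Z value is simple to compute. The sketch below measures Z from the process mean (which sits at Z = 0 in the figure) and uses hypothetical numbers: a process with mean 100 and standard deviation 2, with specification limits at 88 and 112. None of these values come from the course material.

```python
def z_value(x, mean, std_dev):
    """Number of standard deviations that x lies away from the mean."""
    return (x - mean) / std_dev


# Hypothetical process and specification limits, for illustration only.
process_mean, process_std_dev = 100.0, 2.0
lsl, usl = 88.0, 112.0

z_lsl = z_value(lsl, process_mean, process_std_dev)
z_usl = z_value(usl, process_mean, process_std_dev)
print(z_lsl, z_usl)
```

Here both limits are six standard deviations from the mean, which is the situation the 6 Sigma figure describes.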


STANDARD NORMAL DISTRIBUTION

[Figure: standard normal curve in standard deviation units from -3 to +3; area in each band: 34.1% between 0 and ±1, 13.6% between ±1 and ±2, 2.2% between ±2 and ±3, 0.135% beyond ±3]

INTRODUCTION TO MINITAB


You will see a series of windows:

Session Window: This reports the results of your calculations.

Project Manager: Gives you control over items in your project.

Data Window: This is where data is entered. It can be typed, pasted from other applications or generated internally by Minitab.

STAT>BASIC STATISTICS> GRAPHICAL SUMMARY

GRAPHS … histogram, box plot

NORMALITY TEST

NORMALITY TEST +/- 1 SIGMA = 68%

NORMALITY TEST + / - 2 Sigma = 95%

Box Plot

EXAMPLE

In a factory, the reactor cycle time is measured every day for 365 days in each of the years 2002, 2003 and 2004.

1) Calculate the average cycle time for each year

2) Calculate the median cycle time for each year

3) Calculate the range for each year

4) Calculate the standard deviation for each year

5) What % of the data is above a cycle time of 22 hrs?
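The five questions can be answered for one year of data with a small helper. This is a sketch only: the ten cycle times below are made-up stand-ins for the 365 daily readings, which are not reproduced in this handout, and the function name is hypothetical.

```python
import statistics


def summarize(cycle_times, limit):
    """Mean, median, range, sample standard deviation and % of values above limit."""
    above = sum(1 for t in cycle_times if t > limit)
    return {
        "mean": statistics.mean(cycle_times),
        "median": statistics.median(cycle_times),
        "range": max(cycle_times) - min(cycle_times),
        "std_dev": statistics.stdev(cycle_times),
        "pct_above": 100 * above / len(cycle_times),
    }


# Hypothetical cycle times in hours (NOT the course data).
cycle_times_2002 = [18.5, 20.0, 21.5, 22.5, 19.0, 23.0, 20.5, 21.0, 24.0, 20.0]

summary = summarize(cycle_times_2002, limit=22)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The same function would be called once per year, so that the three years can be compared side by side.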



THE NORMAL PROBABILITY DENSITY FUNCTION

● Minitab can help us verify the properties of the normal distribution.

● Go to Calc > Probability Distributions > Normal.

● You have 3 choices here:
1. Probability Density (gives the height of the curve at any value)
2. Cumulative Probability (area under the curve from -∞ to a given value)
3. Inverse Cumulative Probability (the value at which the area under the curve from -∞ equals a given probability)

● For a Normal Distribution you need to specify the Mean and Standard Deviation

● In Input Constant, give the value for which you want the calculations done
● If calculations need to be done for more than one value, give the values in a column in the worksheet and use the Input Column option
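For readers without Minitab, the same three calculations are available in Python's standard statistics module through the NormalDist class. This is an illustrative equivalent, not part of the Minitab procedure; the mean of 70 and standard deviation of 10 are chosen only as an example.

```python
from statistics import NormalDist

dist = NormalDist(mu=70, sigma=10)

density = dist.pdf(70)        # 1. probability density: height of the curve at 70
cumulative = dist.cdf(80)     # 2. cumulative probability: area from -infinity to 80
value = dist.inv_cdf(0.5)     # 3. inverse cumulative: value with 50% of the area below it

print(density, cumulative, value)
```

Note that the inverse cumulative probability undoes the cumulative probability: dist.inv_cdf(dist.cdf(80)) returns 80 again, up to rounding.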


THE NORMAL PROBABILITY DENSITY FUNCTION


● Let us now verify the properties of the normal distribution.
● Consider 2 sets of normally distributed data:
1. μ = 0, σ = 1
2. μ = 70, σ = 10

● What fraction of data points lie to the left of the mean in Case 1?
● Use Calc > Probability Distributions > Normal.
● Choose Cumulative Probability
● Specify Mean = 0 and Standard Deviation = 1
● Choose Input Constant and specify value = 0.
● Click OK

THE NORMAL PROBABILITY DENSITY FUNCTION

● Repeat the calculation for Case 2. What do the results tell us?

Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1
x     P( X <= x )
0     0.5

Cumulative Distribution Function
Normal with mean = 70 and standard deviation = 10
x     P( X <= x )
70    0.5

THE NORMAL PROBABILITY DENSITY FUNCTION

● What fraction of data points fall between μ-σ and μ+σ in Case 1?
Use Calc > Probability Distributions > Normal.
■ Choose Cumulative Probability
■ Specify Mean = 0 and Standard Deviation = 1
■ Choose Input Constant and specify value = -1.
■ Click OK
Use Calc > Probability Distributions > Normal.
■ Choose Cumulative Probability
■ Specify Mean = 0 and Standard Deviation = 1
■ Choose Input Constant and specify value = 1.
■ Click OK

THE NORMAL PROBABILITY DENSITY FUNCTION

Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1
x     P( X <= x )
-1    0.158655

Cumulative Distribution Function
Normal with mean = 0 and standard deviation = 1
x     P( X <= x )
1     0.841345

Total fraction within –1 and 1 = 0.84 – 0.16 = 0.68 approx.
Does this match our knowledge?
What are the results for Case 2?

THE NORMAL PROBABILITY DENSITY FUNCTION

● Find the fraction of data points within μ±2σ in both cases
1.
2.
● What about the fraction of data points within μ±3σ in both cases?
1.
2.
● Why are the results always the same in both cases?
● Does this match our knowledge of the Normal Distribution?
● What are the practical limits of a normally distributed process?
● What fraction of data points do these practical limits represent?
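One way to check your Minitab answers is to repeat the subtraction of two cumulative probabilities in Python. The sketch below uses the standard statistics.NormalDist class for the two cases (μ = 0, σ = 1 and μ = 70, σ = 10); the helper name is illustrative.

```python
from statistics import NormalDist

cases = {"Case 1": NormalDist(0, 1), "Case 2": NormalDist(70, 10)}


def fraction_within(dist, k):
    """Area under the curve between mean - k*sigma and mean + k*sigma."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


fractions = {
    label: [fraction_within(dist, k) for k in (1, 2, 3)]
    for label, dist in cases.items()
}
for label, values in fractions.items():
    print(label, [f"{v:.4f}" for v in values])
```

Both cases give the same fractions because the limits are expressed in standard deviations from the mean, so the answer does not depend on the particular mean or standard deviation.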


Non Normal Distribution

Although the normal distribution takes center stage in statistics, many processes follow a non normal distribution.

This can be due to the data naturally following a specific type of non normal distribution (for example, bacteria growth naturally follows an exponential distribution).

In other cases, your data collection methods or other methodologies may be at fault.

Types of Non Normal Distribution


1) Beta Distribution.
2) Exponential Distribution.
3) Gamma Distribution.
4) Inverse Gamma Distribution.
5) Log Normal Distribution.
6) Logistic Distribution.
7) Maxwell-Boltzmann Distribution.
8) Poisson Distribution.
9) Skewed Distribution.
10) Symmetric Distribution.
11) Uniform Distribution.
12) Unimodal Distribution.
13) Weibull Distribution.

[Figure: Beta distribution with different parameter values]

Reasons for the Non Normal Distribution


Many data sets naturally fit a non normal model. For example, the number of accidents tends to fit a Poisson distribution and lifetimes of products usually fit a Weibull distribution.

However, there may be times when your data is supposed to fit a normal distribution, but doesn’t. If this is the case, it’s time to take a close look at your data.

Outliers can cause your data to become skewed. The mean is especially sensitive to outliers. Try removing any extreme high or low values and testing your data again.

Insufficient data can cause a normal distribution to look completely scattered. For example, classroom test results are usually normally distributed.

An extreme example: if you choose three random students and plot the results on a graph, you won’t get a normal distribution. You might get a uniform distribution (e.g. 62, 62, 63) or you might get a skewed distribution (80, 92, 99). If you are in doubt about whether you have a sufficient sample size, collect more data.

Dealing with Non Normal Distributions

Many statistical tests assume that the data are normally distributed. You may still be able to run these tests on non-normal data if your sample size is large enough (usually over 20 items).

You can also choose to transform the data with a function, forcing it to fit a normal model.

However, if you have a very small sample, a sample that is skewed or one that naturally fits another distribution type, you may want to run a non parametric test.
A non parametric test is one that doesn’t assume the data fits a specific distribution type. Non parametric tests include the Wilcoxon signed rank test, the Mann-Whitney U Test and the Kruskal-Wallis test.

Chebyshev’s Rule

● Applies to any data set regardless of the frequency distribution of the data

● The rules are:
○ +/- 1 standard deviation, no useful information
○ +/- 2 standard deviations, at least 75% of the data
○ +/- 3 standard deviations, at least 88.9% of the data
○ +/- 4 standard deviations, at least 93.75% of the data

This is applicable for all kinds of distribution.

So, for any data set, at least about 89% of the data will fall within +/- 3 StDev of the mean.
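Chebyshev's rule guarantees that at least 1 - 1/k² of any data set lies within +/- k standard deviations of the mean (for k greater than 1). The sketch below computes that bound and compares it with what is actually observed in a strongly skewed data set borrowed from the earlier exercise (the one containing 1000); the function names are illustrative, and the population standard deviation is used for the comparison.

```python
import statistics


def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within +/- k standard deviations."""
    if k <= 1:
        return 0.0  # the rule gives no useful information for k <= 1
    return 1 - 1 / k ** 2


def observed_fraction(data, k):
    """Fraction of the data actually within +/- k standard deviations of the mean."""
    mean = statistics.mean(data)
    std_dev = statistics.pstdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * std_dev)
    return inside / len(data)


skewed = [3, 4, 6, 1, 4, 5, 7, 2, 4, 1000, 1, 5, 7, 3]

for k in (2, 3, 4):
    bound = chebyshev_lower_bound(k)
    seen = observed_fraction(skewed, k)
    print(f"k={k}: at least {bound:.2%}, observed {seen:.2%}")
```

The observed fraction is never below the bound, even for this skewed data, but the bound is much weaker than the 95% and 99.7% figures that hold for normal data.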


The Empirical Rule ... Fat pencil test


● This applies to data sets with frequency distributions that are mound-shaped and symmetrical
● The rules are:
+/- 1 standard deviation, approximately 68% of the data
+/- 2 standard deviations, approximately 95% of the data
+/- 3 standard deviations, approximately 99.7% of the data
Tip
Use Chebyshev’s rule for any data set
Use the Empirical rule for symmetrical and mound-shaped data sets
Use the properties of the normal distribution for normal data
We can apply the fat pencil test to the normality plot, and also check the percentage of data within + and – 1 std dev and + and – 2 std dev of the mean. If we find about 68% and 95% respectively, then we can treat the data set as normal.