
STATISTICAL APPROACH TO MACHINERY CONDITION MONITORING

Randall W. Blake
Predictive Maintenance Staff Manager
St. Johns River Power Park
Jacksonville, Florida

Randall W. Blake is the Predictive Maintenance
Staff Manager for St. Johns River Power Park in
Jacksonville, Florida. In this capacity, he is
responsible for the predictive maintenance program,
which includes vibration analysis, oil analysis,
and equipment reliability assessment. Randy has
been an SJRPP employee for eight years and has
twenty-two years of experience in electrical power
generation.
Abstract:
This paper shows how to apply straightforward statistical methods
to real world vibration problems. Statistical methods are
effective tools for improving production processes and reducing
unscheduled failures. Statistical tools can lend objectivity,
accuracy and focus to your vibration program.
Introduction:
All rotating or reciprocating machinery emits a unique pattern of
vibration characteristics. The pattern of vibration, or
vibration signature, represents the current mechanical condition
of the individual machine. As time passes, the equipment's
mechanical condition will change due to internal wear,
unbalance, misalignment, looseness, and related problems. These
changes in machinery condition will also affect the machine's
vibration signature. The purpose of a vibration monitoring and
analysis program is to detect changes in equipment signatures and
use this information to pinpoint equipment degradation and thus
schedule corrective maintenance or overhaul. The statistical
methods outlined in this paper are effective tools for monitoring
machinery condition, improving production processes and reducing
unscheduled failures.
Statistics:
Webster defines statistics as "the mathematics of the
collection, organization, and interpretation of numerical data,
esp. the analysis of population characteristics by inference from
sampling."
As Webster's definition points out, data collection by sampling,
interpretation of numerical data, and analysis of population
characteristics are all tools used in a vibration program. The
statistical tools outlined in this paper are additional tools to help
monitor machinery condition.
The following FACTORS are presented as an aid to help avoid
problems in the statistical methods presented in this paper.
Factor 1: DATA COLLECTION
Regardless of the type of data collection equipment used, CONSISTENCY
is of the utmost importance in data collection: consistency in
the data collection techniques, the location of data collection
points, the equipment operating parameters and, in some cases,
the ambient conditions.
It is highly desirable that the same personnel remain in the
program. It is especially important to have the same personnel
collect data. Consistency in personnel helps to ensure accurate
and consistent data collection and analyses.
Factor 2: DOCUMENTATION
Once data is collected, various statistical methods may be used
for analysis so that the data becomes a meaningful source of
information. The data recording methods will vary greatly but
the following points need to be considered no matter what method
is used.
First, the origin of the data must be clearly recorded. Data
whose origin is not clearly known becomes dead data. Quite
often, little useful information is obtained despite months
spent collecting data, because the date of collection or the
machine the data represents was omitted.
Secondly, data should be recorded in such a way that it can be
used easily. Since data is often used later to calculate the
mean, standard deviation, days to alarm, etc., it is better to
record the data in a manner which will facilitate these
computations.
Statistical Methods:
All of the statistical methods outlined in this paper can easily
be carried out with a simple calculator or with any number of
inexpensive computer software packages.
Method 1: HISTOGRAMS
The data obtained from a sample serve as the basis for decisions
on the population. The larger the sample size, the more
information we gain about the population. But an increase of
sample size also means an increase in the amount of data and it
becomes difficult to understand the population from these data,
even when they are arranged into tables or reports. In such
cases, we need a method which will enable us to understand the
population at a glance. A histogram answers that need. By
organizing a large amount of data into a histogram, we can understand the
population in an objective manner.
Making a Histogram
Let's assume, for this demonstration, that we want to set or
adjust the alarm level for one of the vibration parameters for
pump motor B1.
Table 1 shows the latest vibration data for the pump motor. Let
us make a histogram using the data set PAR#4.
Step 1: Obtain the largest and the smallest of the collected values and
calculate the range, R.
R = (the largest value) - (the smallest value)
The largest value = 0.048 in/sec.
The smallest value = 0.020 in/sec.
R = 0.048 - 0.020 = 0.028 in/sec.
Step 2: Determine the class interval so that the range, which
includes the maximum and the minimum values, is divided into intervals
of equal size. Obtain the number of intervals by dividing R by 0.001,
0.002, or 0.005 (or 0.1, 0.2, 0.5; 10, 20, 50; etc.) so as to obtain from
5 to 20 class intervals of equal size. When there are two possibilities,
use the narrower interval.
Here, R / 0.001 = 28, R / 0.002 = 14, and R / 0.005 = 5.6, so the class
interval can be either 0.002 or 0.005. Since 0.002 is the narrower of the
two intervals, we will use it for this example.
TABLE 1
Machine 1: PUMP MOTOR B1
..........................................
DATE TIME SPEED OVERALL PAR#1 PAR#2 PAR#3 PAR#4 PAR#5 PAR#6
---- ---- ----- ------- ----- ----- ----- ----- ----- -----
1 AFP B1 -MPA
06-NOV-91 11:32
03-DEC-91 09:48
13-DEC-91 13:27
02-JAN-92 16:01
29-JAN-92 10:55
11-FEB-92 14:12
13-FEB-92 10:18
09-MAR-92 08:49
27-MAR-92 08:48
29-APR-92 08:54
29-APR-92 13:58
30-APR-92 06:34
13-MAY-92 05:54
01-JUL-92 14:13
15-JUL-92 12:31
22-JUL-92 14:12
21-AUG-92 15:16
08-SEP-92 08:44
24-SEP-92 14:55
30-SEP-92 09:48
12-OCT-92 15:23
27-OCT-92 10:57
27-NOV-92 14:45
16-DEC-92 14:28
29-DEC-92 10:57
Step 3: Prepare a frequency table, as in Table 2, on which the class,
midpoint, frequency count, frequency, etc., can be recorded.
Step 4: Determine the boundaries of the intervals so that they
include the smallest and the largest of values, and write these down
on the frequency table.
Determine the lower boundary of the first class by subtracting 1/2 of
the class interval size from the smallest sample value. If the lower
boundary is 0 or less, use 0 for the lower boundary. The upper
boundary of the first class can be determined by adding 1/2 of the class
interval size to the smallest sample value.
Lower class boundary = 0.020 - 0.001 = 0.019
Upper class boundary = 0.020 + 0.001 = 0.021
Then keep adding the size of the interval to the previous value to
obtain the second boundary, the third, and so on, and make sure that
the last class includes the maximum value.
Boundaries for the first class = 0.0190 -- 0.0210
Boundaries for the second class = 0.0210 -- 0.0230
Step 5: Calculate the mid-point of each class, and write them on the
frequency table.
Mid-point of the first class = (0.0190 + 0.0210) / 2 = 0.02
Mid-point of the second class = (0.0210 + 0.0230) / 2 = 0.022
Step 6: Read the collected values one by one and record the
frequencies falling in each class.
Table 2 Frequency table

No.   Class            Mid-point   Frequency
                       of class
1     0.019 - 0.021    0.020       1
2     0.021 - 0.023    0.022       0
3     0.023 - 0.025    0.024       0
4     0.025 - 0.027    0.026       0
5     0.027 - 0.029    0.028       0
6     0.029 - 0.031    0.030       0
7     0.031 - 0.033    0.032       0
8     0.033 - 0.035    0.034       1
9     0.035 - 0.037    0.036       3
10    0.037 - 0.039    0.038       12
11    0.039 - 0.041    0.040       4
12    0.041 - 0.043    0.042       2
13    0.043 - 0.045    0.044       1
14    0.045 - 0.047    0.046       0
15    0.047 - 0.049    0.048       1

Step 7: Plotting the Histogram
On a sheet of squared paper or with one of the software programs,
label the horizontal axis scale based on the class interval. Mark the
left-hand vertical axis with the frequency scale. The height of the
vertical axis should be one unit of measure above the maximum
frequency. Draw a bar whose height corresponds to the
frequency in each class. (Refer to figure 1.)
Step 8: Mean and Standard Deviation
Standard Deviation (STD) measures the degree to which individual
values in the population vary from the Mean (average) of all values in
the population. The lower the STD, the less individual values vary
from the mean.
$$\text{Mean} = \frac{\sum x_i}{N} \qquad (1)$$
x_i = the i-th sample in the population
N = number of samples in the population
Mean = average of samples in the population
In a blank area of the
histogram, note the number of
sample points, the mean, and
the standard deviation.
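The class construction in Steps 1 through 6 can be sketched in Python as follows. This is a minimal illustration rather than the software used for the paper: the function names and the short `readings` list are hypothetical, and only the smallest value, largest value and class interval come from the PAR#4 example.

```python
import math

def build_classes(smallest, largest, width):
    """Return (lower, upper, midpoint) for each class, following Steps 2-5."""
    # Step 4: the first lower boundary is half a class width below the
    # smallest value, but never below zero.
    lower = max(smallest - width / 2, 0.0)
    # Use enough classes for the last one to contain the largest value.
    n_classes = math.ceil(round((largest - lower) / width, 9))
    classes = []
    for i in range(n_classes):
        lo = round(lower + i * width, 6)
        hi = round(lower + (i + 1) * width, 6)
        classes.append((lo, hi, round((lo + hi) / 2, 6)))
    return classes

def tally(values, classes, width):
    """Step 6: count how many values fall in each class."""
    counts = [0] * len(classes)
    first_lower = classes[0][0]
    for v in values:
        index = int(round((v - first_lower) / width, 9))
        index = min(max(index, 0), len(classes) - 1)  # clamp to a valid class
        counts[index] += 1
    return counts

# Smallest value, largest value and class interval from the PAR#4 example.
classes = build_classes(0.020, 0.048, 0.002)

# Hypothetical readings (in/sec), used only to exercise the tally.
readings = [0.020, 0.038, 0.038, 0.040, 0.048]
counts = tally(readings, classes, 0.002)
for (lo, hi, mid), count in zip(classes, counts):
    print(f"{lo:.3f} - {hi:.3f}  mid {mid:.3f}  freq {count}")
```

With the example inputs this yields 15 classes, the first running from 0.019 to 0.021 and the last from 0.047 to 0.049.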
Using the Histogram:
Now that the data has been
organized into a histogram,
objective decisions can be made as
to the alarm levels. Knowing the
mean and standard deviation of the
population, we can set or adjust
our alarm levels based on the
histogram and statistical data.
For example, draw a vertical line on the histogram that represents
the mean + 2 STD and another that represents the mean - 2 STD. If the
data are approximately normally distributed, about 95 percent of the
data falls between the two lines. Alarm levels could be set at +/- 1 STD
(68%), 2 STD (95%), or 3 STD (99.7%) from the mean for multiple alarm
levels, as shown in figure 1.
Level 1 = 0.03784 + (1 * 0.004531) = 0.04237 in/sec.
Level 2 = 0.03784 + (2 * 0.004531) = 0.04690 in/sec.
Level 3 = 0.03784 + (3 * 0.004531) = 0.05143 in/sec.
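As a quick check of the arithmetic, the alarm levels can be computed with a few lines of Python. The function name `alarm_levels` is hypothetical; the mean and standard deviation are the PAR#4 values quoted in the text.

```python
def alarm_levels(mean, std, multiples=(1, 2, 3)):
    """Return alarm levels at mean + k * std for each multiple k."""
    return [mean + k * std for k in multiples]

levels = alarm_levels(0.03784, 0.004531)
for k, level in enumerate(levels, start=1):
    print(f"Level {k} = {level:.5f} in/sec")
```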
Figure 1 Histogram (frequency versus vibration, in/sec.)
Using the histogram in this fashion also makes another important point
more obvious: OUTLIERS. Outliers are data points that, for one reason or
another, are in error. As shown in figure 1, data point 0.020 in/sec.
may be an outlier. If it is found to be an outlier, the histogram, mean,
standard deviation and alarm levels will need to be corrected.
Method 2: OUTLIERS
All data collection systems may produce corrupted data points. These
points may be caused by variations in process parameters, ambient
conditions or actual variations in the vibration levels. Errors of this
type should not be included as part of the analyses. Such points are
meaningless as test data. All data should be inspected for corrupted data
points as a continuing check in the data collection process.
The effects of these outliers will increase the random error of the
population. A test is needed to determine if a particular point from a
sample is an outlier. The test should consider two types of errors in
detecting outliers:
(1) Rejecting a good data point
(2) Not rejecting a bad data point
The probability of rejecting a good point is usually set at 5%. This
means that the chance of rejecting a good point is 1 in 20 (or less). For
larger populations (several hundred sample points), almost all corrupted
data points can be identified. For small populations (five or ten sample
points), corrupted data points are more difficult to identify.
A test commonly used to identify questionable data points as outliers is
Grubbs' Method (ref. 1).
Consider a sample of N measurements. The mean (M) and an
experimental standard deviation (S) are calculated using (1) & (2).
Suppose the i-th observation, x_i, is the questionable data point; then the
absolute statistic calculated is:
$$T_N = \frac{|x_i - M|}{S}$$
Using table 3, a value of T_N is obtained for the sample size (N) at
the 5% significance level. This limits the probability of rejecting a
good data point to 5%.
To test for outliers, compare the calculated T_N with the table T_N. If
the calculated T_N is larger than or equal to the table T_N, x_i is an
outlier. If the calculated T_N is smaller than the table T_N, x_i is not
an outlier.
Table 3 Rejection values for Grubbs' Method

Sample    5%          Sample    5%
size N    (1-side)    size N    (1-side)
3         1.15        20        2.56
4         1.46        21        2.58
5         1.67        22        2.60
6         1.82        23        2.62
7         1.94        24        2.64
8         2.03        25        2.66
9         2.11        30        2.75
10        2.18        35        2.82
11        2.23        40        2.87
12        2.29        45        2.92
13        2.33        50        2.96
14        2.37        60        3.03
15        2.41        70        3.09
16        2.44        80        3.14
17        2.47        90        3.19
18        2.50        100       3.21
19        2.53

Let's reconsider the data points in TABLE 1, PAR#4. The histogram in
figure 1 gives reason to suspect data point 0.020 in/sec. to be an
outlier. By using Grubbs' Method:
Mean (M) = 0.03784
Exp. STD (S) = 0.004531
Sample Size (N) = 25
T_N calculated = |0.020 - 0.03784| / 0.004531 = 3.937
From TABLE 3, T_N = 2.66. Since 3.937 is larger than 2.66, data point
0.020 in/sec. is an outlier according to Grubbs' Method (data point
0.048 in/sec. is also an outlier).
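The Grubbs comparison for the suspect point can be sketched as below. The dictionary holds only a few of the Table 3 rejection values, and the function names are hypothetical; the mean, standard deviation and sample size are those quoted for PAR#4.

```python
# One-sided 5% rejection values copied from Table 3 (subset of sample sizes).
GRUBBS_5PCT = {20: 2.56, 21: 2.58, 22: 2.60, 23: 2.62, 24: 2.64, 25: 2.66, 30: 2.75}

def grubbs_statistic(value, mean, std):
    """Absolute deviation of the suspect point, in standard deviations."""
    return abs(value - mean) / std

def is_outlier(value, mean, std, n, table=GRUBBS_5PCT):
    """True when the calculated statistic reaches the table value for n."""
    return grubbs_statistic(value, mean, std) >= table[n]

t_n = grubbs_statistic(0.020, 0.03784, 0.004531)
suspect_is_outlier = is_outlier(0.020, 0.03784, 0.004531, n=25)
print(f"T_N = {t_n:.3f}, outlier: {suspect_is_outlier}")
```

Once a point is rejected, the mean and standard deviation should be recalculated from the remaining points before any further point is tested.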
Figure 2 shows the corrected
histogram plot with possible
alarm levels.
Note: It's important to do the Grubbs' test and set alarm levels
before the vibration data begins to show signs of trending up.
Figure 2 Histogram (corrected; frequency versus vibration, in/sec.)
Method 3: SCATTER DIAGRAMS
It is often essential to study the relationship between two corresponding
sample data sets. For example, to what extent will the vibration of a
machine change with respect to time? To study two variables such as
the vibration of the machine and time, a scatter diagram can be used.
Making a Scatter Diagram
A scatter diagram can be made by following these steps:
Step 1: Collect paired data (x, y), or (time, vibration), on which
you want to study the relationship.
Step 2: Find the maximum and minimum values for both x and y. Make
the scales of the horizontal and vertical axes so that their lengths are
approximately equal. Keep the number of unit graduations to between 3
and 10 for each axis.
Step 3: Plot the data on section paper or with a computer software
program. Include all necessary information: 1. title of the diagram,
2. time interval, 3. number of points, 4. title and units of each
axis, 5. name of the person who made the diagram. (See figure 3)
In order to understand the strength of the relationship in quantitative
terms, it is useful to calculate the correlation coefficient.
The correlation coefficient, r, is in the range -1 to +1. When r is near
+1, it indicates a strong positive correlation between x and y. Likewise,
when r is near -1, it indicates a strong negative correlation.
Using the data collected in table 4, overall vibration amplitude and
cumulative days data, the correlation coefficient can be calculated (see
table 5).
TABLE 4 (x, y) DATA
Outliers (*)

Sample       Cumulative   Vibration
Date         Days         Amplitude
06-NOV-91    1            0.131
03-DEC-91    27           0.162
13-DEC-91    37           0.149
02-JAN-92    57           0.169
29-JAN-92    84           0.172
11-FEB-92    97           0.107 *
13-FEB-92    99           0.106 *
09-MAR-92    124          0.186
27-MAR-92    142          0.190
29-APR-92    175          0.178
29-APR-92    175          0.174
30-APR-92    176          0.172
13-MAY-92    189          0.222
01-JUL-92    238          0.197
15-JUL-92    252          0.154 *
22-JUL-92    259          0.168 *
21-AUG-92    289          0.148 *
08-SEP-92    307          0.167 *
24-SEP-92    323          0.176 *
30-SEP-92    329          0.246
12-OCT-92    341          0.253
27-OCT-92    356          0.246
27-NOV-92    387          0.178 *
16-DEC-92    406          0.256
29-DEC-92    419          0.232
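A minimal sketch of the correlation calculation follows. It assumes the standard product-moment formula, r = S(xy) / sqrt(S(xx) * S(yy)), and uses the 17 points of Table 4 that are not marked as outliers; the names `DATA` and `correlation` are hypothetical.

```python
import math

# Table 4 points with the starred outliers removed: (cumulative days, amplitude).
DATA = [
    (1, 0.131), (27, 0.162), (37, 0.149), (57, 0.169), (84, 0.172),
    (124, 0.186), (142, 0.190), (175, 0.178), (175, 0.174), (176, 0.172),
    (189, 0.222), (238, 0.197), (329, 0.246), (341, 0.253), (356, 0.246),
    (406, 0.256), (419, 0.232),
]

def correlation(points):
    """Product-moment correlation coefficient of (x, y) pairs."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    # Corrected sums of products and squares.
    s_xy = sum(x * y for x, y in points) - sum_x * sum_y / n
    s_xx = sum(x * x for x, _ in points) - sum_x ** 2 / n
    s_yy = sum(y * y for _, y in points) - sum_y ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

r = correlation(DATA)
print(f"r = {r:.3f}")
```

For these points r comes out near +0.93, a strong positive correlation between run time and vibration amplitude.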
TABLE 5

No.   x (Cumulative   y (Vibration   x*y        x^2       y^2
      Days)           Amplitude)
1     1               0.131          0.131      1         0.017161
2     27              0.162          4.374      729       0.026244
3     37              0.149          5.513      1369      0.022201
4     57              0.169          9.633      3249      0.028561
5     84              0.172          14.448     7056      0.029584
6     124             0.186          23.064     15376     0.034596
7     142             0.190          26.980     20164     0.036100
8     175             0.178          31.150     30625     0.031684
9     175             0.174          30.450     30625     0.030276
10    176             0.172          30.272     30976     0.029584
11    189             0.222          41.958     35721     0.049284
12    238             0.197          46.886     56644     0.038809
13    329             0.246          80.934     108241    0.060516
14    341             0.253          86.273     116281    0.064009
15    356             0.246          87.576     126736    0.060516
16    406             0.256          103.936    164836    0.065536
17    419             0.232          97.208     175561    0.053824
SUM   3276            3.335          720.786    924190    0.678485

Method 4: Regression Analysis
After establishing a strong correlation between x and y, a regression
analysis can be used to extrapolate how many days a machine can run before
reaching an alarm point. To carry out this analysis and determine days
to alarm, it is necessary to understand the relation between the vibration
amplitude and cumulative days quantitatively.
As previously determined, the scatter diagram in figure 3 shows a strong
correlation between vibration amplitude (y) and cumulative days (x). From
this diagram, it would seem that vibration and days have a straight-line
relation.
Such a straight line is called a linear regression line. Least
squares regression analysis is the most popular means of curve fitting.
For higher order fitting see attachment A.
$$y = \alpha + \beta x$$
y = dependent variable
x = independent variable
α = constant
β = regression coefficient
This quantitative way of grasping the relationship between x and y, by
seeking a regression line from x and y, is called regression analysis. Using
the data in Table 6, let's calculate the regression line.
TABLE 6
x̄ = mean of x = 192.7058
ȳ = mean of y = 0.196176
The regression line is expressed by
y = 0.144782 + 0.000266x. That is, for every day of run time, the
vibration will increase by 0.000266 in/sec.
Figure 4 shows the regression line as calculated above. The points on
the scatter diagram should be evenly distributed around the
regression line.
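The least squares calculation can be sketched as follows, assuming the usual formulas β = S(xy) / S(xx) and α = ȳ - β x̄, with the corrected sums S(xy) and S(xx) formed as in Attachment A. The names `DATA` and `fit_line` are hypothetical; the points are the 17 non-outlier pairs.

```python
# (cumulative days, vibration amplitude in/sec), outliers already removed.
DATA = [
    (1, 0.131), (27, 0.162), (37, 0.149), (57, 0.169), (84, 0.172),
    (124, 0.186), (142, 0.190), (175, 0.178), (175, 0.174), (176, 0.172),
    (189, 0.222), (238, 0.197), (329, 0.246), (341, 0.253), (356, 0.246),
    (406, 0.256), (419, 0.232),
]

def fit_line(points):
    """Least squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    s_xy = sum(x * y for x, y in points) - sum_x * sum_y / n
    s_xx = sum(x * x for x, _ in points) - sum_x ** 2 / n
    beta = s_xy / s_xx                      # regression coefficient (slope)
    alpha = sum_y / n - beta * sum_x / n    # constant (intercept)
    return alpha, beta

alpha, beta = fit_line(DATA)
print(f"y = {alpha:.6f} + {beta:.7f} x")
```

Carried to more digits the slope is about 0.0002667 in/sec per day, which the text quotes as 0.000266.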
Method 5: Goodness Of Fit (1)
There are two quantitative measures of goodness of fit. The deviation of
data points around the regression line can be characterized by the
standard error of estimate (SEE), the precision index of the residuals.
The smaller the residuals, the smaller the SEE and the better the fit.
$$SEE = \sqrt{\frac{\sum (y_i - y_{i,\mathrm{fit}})^2}{N - C}} \qquad (11)$$
Where C is the number of coefficients of the regression. For a linear
relation, C = 2. This same equation applies to higher order fits, where C
again indicates the number of coefficients of the regression.
Table 7 and equation 11 demonstrate the goodness of fit for the previous
regression calculations.
TABLE 7
$$SEE = \sqrt{\frac{0.003404}{15}} = 0.015064$$
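A short sketch of the SEE calculation for the linear fit is given below. The helper names are hypothetical, and the line is refitted inside the example so that the block is self-contained; C = 2 for a straight line.

```python
import math

# (cumulative days, vibration amplitude in/sec), outliers already removed.
DATA = [
    (1, 0.131), (27, 0.162), (37, 0.149), (57, 0.169), (84, 0.172),
    (124, 0.186), (142, 0.190), (175, 0.178), (175, 0.174), (176, 0.172),
    (189, 0.222), (238, 0.197), (329, 0.246), (341, 0.253), (356, 0.246),
    (406, 0.256), (419, 0.232),
]

def fit_line(points):
    """Least squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    s_xy = sum(x * y for x, y in points) - sum_x * sum_y / n
    s_xx = sum(x * x for x, _ in points) - sum_x ** 2 / n
    beta = s_xy / s_xx
    return sum_y / n - beta * sum_x / n, beta

def standard_error_of_estimate(points, alpha, beta, n_coefficients=2):
    """Square root of the residual sum of squares divided by (N - C)."""
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in points)
    return math.sqrt(sse / (len(points) - n_coefficients))

alpha, beta = fit_line(DATA)
see = standard_error_of_estimate(DATA, alpha, beta)
print(f"SEE = {see:.6f} in/sec")
```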
Method 6: Goodness Of Fit (2)
Another commonly used goodness of fit test is the coefficient of
determination. The fraction SSR/SST is called the coefficient of
determination and is represented by the symbol r², which varies from 0 to
+1. The closer r² is to 1, the better the fit.
$$r^2 = \frac{SSR}{SST}$$
$$SSR = \sum (y_{i,\mathrm{fit}} - \bar{y})^2$$
$$SSE = \sum (y_i - y_{i,\mathrm{fit}})^2$$
$$SST = SSR + SSE \qquad (15)$$
TABLE 8

No.   y_i,fit     y_i      SSR         SSE
1     0.145048    0.131    0.002614    0.000197
2     0.151983    0.162    0.001953    0.000100
3     0.154650    0.149    0.001724    0.000031
4     0.159984    0.169    0.001310    0.000081
5     0.167184    0.172    0.000841    0.000023
6     0.177852    0.186    0.000336    0.000066
7     0.182653    0.190    0.000183    0.000053
8     0.191454    0.178    0.000022    0.000181
9     0.191454    0.174    0.000022    0.000304
10    0.191721    0.172    0.000020    0.000388
11    0.195188    0.222    0.000001    0.000718
12    0.208256    0.197    0.000146    0.000126
13    0.232525    0.246    0.001321    0.000181
14    0.235726    0.253    0.001564    0.000298
15    0.239726    0.246    0.001897    0.000039
16    0.253061    0.256    0.003236    0.000008
17    0.256528    0.232    0.003642    0.000601
SUM               3.335    0.020832    0.003404

r² = 0.020832 / (0.020832 + 0.003404) = 0.8596
The preceding calculation shows goodness of fit r² to be 0.8596, which
indicates a good fit and is consistent with the strong correlation
found earlier.
Method 7: Days To Alarm Analysis
Once a regression line has been fitted to the data, goodness of fit has
been determined and alarm levels have been set, Days To Alarm can be
calculated. Days to alarm can be determined in two ways: by calculation
or by reading it from a graph.
Calculation Method:
Now that the constant and regression coefficient have been
established, days to alarm can be calculated as follows:
$$DTA = \left(\frac{Y_{alarm} - \alpha}{\beta}\right) - C_t \qquad (16)$$
DTA = Days To Alarm
C_t = Cumulative Time (Days)
Y_alarm = Alarm Point = 0.300 in/sec
For the example data, with C_t = 419 days:
DTA = ((0.300 - 0.144782) / 0.000266) - 419 = 165 days
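For exponential regression curve fits, where α and β are the constant
and coefficient of the fitted curve:
$$DTA = \left(\frac{\log Y_{alarm} - \log \alpha}{\log \beta}\right) - C_t$$
For power regression curve fits:
$$DTA = 10^{\left(\frac{\log Y_{alarm} - \log \alpha}{\beta}\right)} - C_t$$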
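The coefficient of determination from Method 6 and the three days-to-alarm formulas can be sketched together as below. The function names are hypothetical. The fitted values are the Table 8 entries, and the exponential and power forms assume the models y = α·β^x and y = α·x^β, which is what the logarithmic formulas imply.

```python
import math

# Measured amplitudes and fitted values from Table 8.
Y = [0.131, 0.162, 0.149, 0.169, 0.172, 0.186, 0.190, 0.178, 0.174,
     0.172, 0.222, 0.197, 0.246, 0.253, 0.246, 0.256, 0.232]
Y_FIT = [0.145048, 0.151983, 0.154650, 0.159984, 0.167184, 0.177852,
         0.182653, 0.191454, 0.191454, 0.191721, 0.195188, 0.208256,
         0.232525, 0.235726, 0.239726, 0.253061, 0.256528]

def r_squared(y_values, y_fit):
    """Coefficient of determination, SSR / (SSR + SSE)."""
    mean_y = sum(y_values) / len(y_values)
    ssr = sum((f - mean_y) ** 2 for f in y_fit)
    sse = sum((y - f) ** 2 for y, f in zip(y_values, y_fit))
    return ssr / (ssr + sse)

def dta_linear(y_alarm, alpha, beta, c_t):
    """Days to alarm for y = alpha + beta * x."""
    return (y_alarm - alpha) / beta - c_t

def dta_exponential(y_alarm, alpha, beta, c_t):
    """Days to alarm for y = alpha * beta ** x (assumed model)."""
    return (math.log10(y_alarm) - math.log10(alpha)) / math.log10(beta) - c_t

def dta_power(y_alarm, alpha, beta, c_t):
    """Days to alarm for y = alpha * x ** beta (assumed model)."""
    return 10 ** ((math.log10(y_alarm) - math.log10(alpha)) / beta) - c_t

r2 = r_squared(Y, Y_FIT)
dta = dta_linear(0.300, 0.144782, 0.000266, 419)
print(f"r^2 = {r2:.4f}, days to alarm = {dta:.0f}")
```

The linear result depends strongly on the slope, so a small change in the regression coefficient shifts the days-to-alarm estimate by several days.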
Graphical Method:
The days to alarm can simply be read from the scatter diagram with
a fitted regression line by following these steps (see figure 5).
step 1. Extend the regression line to the y axis alarm
intersect point.
step 2. Read the corresponding x axis, cumulative time
(days).
step 3. Subtract the present cumulative days from projected
cumulative days.
Method 8: Confidence Interval
Confidence Intervals are measurements of precision in estimating a
parameter, in this case days to alarm. A confidence interval around an
unknown parameter is an interval of numbers derived from sample data that
almost assuredly contains the parameter.
Managers are often interested in how far from the true value an estimated
days to alarm might deviate. For this example, 95% confidence level will
be used, that is to say, the calculated interval has a 95% chance of
containing the true parameter.
$$DTA_l = \left(\frac{0.3 - (0.144782 + (2 \times 0.015064))}{0.000266}\right) - 419 = 51$$
$$DTA_u = \left(\frac{0.3 - (0.144782 - (2 \times 0.015064))}{0.000266}\right) - 419 = 278$$
DTA_u = Days to Alarm upper limit
DTA_l = Days to Alarm lower limit
In an earlier example, the days to alarm was calculated at 165 days. We
can now state with 95% confidence, the days to alarm will be between 51
and 278 days.
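The interval can be reproduced with the short sketch below. The function name and the parameter `k` (the multiple of SEE, 2 for roughly 95%) are hypothetical; the numbers are those used in the example.

```python
def dta_interval(y_alarm, alpha, beta, see, c_t, k=2.0):
    """Lower and upper days-to-alarm limits, shifting the constant by k * SEE."""
    lower = (y_alarm - (alpha + k * see)) / beta - c_t
    upper = (y_alarm - (alpha - k * see)) / beta - c_t
    return lower, upper

lower, upper = dta_interval(0.3, 0.144782, 0.000266, 0.015064, 419)
print(f"Days to alarm between {lower:.0f} and {upper:.0f}")
```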
Method 9: Confidence Bands on the Regression Line
In Method 8, the confidence interval was calculated for a single point.
In much the same way, upper and lower Confidence Bands can be placed on a
regression line. (see figure 5.1)
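$$y_l = (\alpha - (2 \times SEE)) + \beta x$$
$$y_u = (\alpha + (2 \times SEE)) + \beta x$$
where y_l and y_u are the lower and upper confidence bands.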
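A minimal sketch of the band calculation at a given number of cumulative days follows. The function name is hypothetical, and the regression values are those of the worked example.

```python
def confidence_band(x, alpha, beta, see, k=2.0):
    """Return (lower, upper) band values around the regression line at x."""
    center = alpha + beta * x
    return center - k * see, center + k * see

# Band at the last sample (419 cumulative days).
band_low, band_high = confidence_band(419, 0.144782, 0.000266, 0.015064)
print(f"{band_low:.4f} to {band_high:.4f} in/sec")
```

Evaluating the function over a range of x values gives the two lines to draw parallel to the regression line.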
ATTACHMENT A
Higher Order Curve Fitting
Exponential Curve Fit:
For vibration data, the most often used curve fitting calculation will be
either Exponential or Linear regression. Use the goodness of fit test to
determine which curve fit calculation best fits the data. (see figure 6)
TABLE 9

x      x^2    log x    EXP fit    y    log y^2    log y    x*y
1
27
37
57
84
124
142
175
175
176
189
238
329
341
356
406
419
SUM

$$S(xy) = \sum (x \log y) - \frac{\sum x \sum \log y}{N}$$
$$S(xx) = \sum x^2 - \frac{(\sum x)^2}{N}$$
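An exponential fit built on these sums can be sketched as follows, assuming the model y = a·b^x so that log y is linear in x. The function name is hypothetical, and the sample points are synthetic values generated from a known curve, not plant data.

```python
import math

def fit_exponential(points):
    """Fit y = a * b**x by least squares on (x, log10 y); returns (a, b)."""
    n = len(points)
    logs = [(x, math.log10(y)) for x, y in points]
    sum_x = sum(x for x, _ in logs)
    sum_ly = sum(ly for _, ly in logs)
    s_xy = sum(x * ly for x, ly in logs) - sum_x * sum_ly / n
    s_xx = sum(x * x for x, _ in logs) - sum_x ** 2 / n
    slope = s_xy / s_xx                         # log10 of b
    intercept = sum_ly / n - slope * sum_x / n  # log10 of a
    return 10 ** intercept, 10 ** slope

# Synthetic points on y = 0.15 * 1.002**x, for illustration only.
points = [(x, 0.15 * 1.002 ** x) for x in (0, 50, 100, 200, 400)]
a, b = fit_exponential(points)
print(f"y = {a:.4f} * {b:.5f}^x")
```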
Power Regression Curve Fit:
Another curve fit that may be useful in analyzing vibration data is the
Power Regression fit.
$$S(xy) = \sum (\log x \log y) - \frac{\sum \log x \sum \log y}{N}$$
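A power fit follows the same pattern with both variables in logarithms. The sketch assumes the model y = a·x^b; the function name is hypothetical and the points are synthetic, generated from a known curve.

```python
import math

def fit_power(points):
    """Fit y = a * x**b by least squares on (log10 x, log10 y); returns (a, b)."""
    n = len(points)
    logs = [(math.log10(x), math.log10(y)) for x, y in points]
    sum_lx = sum(lx for lx, _ in logs)
    sum_ly = sum(ly for _, ly in logs)
    s_xy = sum(lx * ly for lx, ly in logs) - sum_lx * sum_ly / n
    s_xx = sum(lx * lx for lx, _ in logs) - sum_lx ** 2 / n
    b = s_xy / s_xx                               # exponent
    a = 10 ** (sum_ly / n - b * sum_lx / n)       # constant
    return a, b

# Synthetic points on y = 0.05 * x**0.25, for illustration only.
points = [(x, 0.05 * x ** 0.25) for x in (1, 10, 50, 100, 400)]
a, b = fit_power(points)
print(f"y = {a:.4f} * x^{b:.3f}")
```

Because log x is undefined at x = 0, cumulative days must start from a positive value for this fit.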
REFERENCES
1. Grubbs, F. E., "Procedures for Detecting Outlying Observations in
Samples," Technometrics, Vol. 11, No. 1, February 1969.
2. ANSI/ASME PTC 19.1-1985, Measurement Uncertainty, Part 1.
3. Dr. Robert B. Abernethy, "Test Measurement Accuracy," January 1989.
4. Hitoshi Kume, Statistical Methods for Quality Improvement, June
1990.