
A developed algorithm for the evaluation of calibration intervals for equipment devices in testing and calibration laboratories

A. M. Sadek (dr_amrsadek@hotmail.com)
National Institute for Standards

Hussain M. Alsalamah
Saudi Standards, Metrology, and Quality Organization

Research Article

Keywords:

Posted Date: January 31st, 2023

DOI: https://doi.org/10.21203/rs.3.rs-2523280/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract
In the present study, a method has been developed to evaluate the calibration interval of equipment devices based on their calibration records. Small-shift sensitive control charts, along with a linear predictive model, have been used to predict the date at which the performance of the equipment is expected to be out-of-tolerance. A Monte-Carlo algorithm has been implemented to simulate errors with random and systematic trend components. The responses of the cumulative sum (CUSUM), V-mask, and moving average (MA) control charts have been investigated and compared with the usual Shewhart control chart in various cases. The CUSUM control chart has shown optimum performance in assessing the systematic trend and a reasonable estimation of the calibration intervals. An oversensitivity to the systematic trend error, on the other hand, has been observed in the performance of the V-mask QC, which may lead to unnecessarily short calibration intervals. The MA control chart has shown a performance similar to the CUSUM; however, its window parameter needs to be adjusted based on the size of the calibration records.

1. Introduction
The calibration of equipment devices is one of the essential requirements in testing and calibration laboratories to ensure the accuracy and the metrological traceability of the reported results (ISO 17025, 2017). Furthermore, the calibration certificate provides the laboratory with vital information about the performance of the equipment. The laboratory is required to establish a calibration plan to monitor the performance of its equipment devices and to perform recalibrations when required. To this end, the calibration interval was introduced as the time period in which the equipment device is likely to remain within the prescribed limits after calibration (ILAC G24 - OIML D10, 2022).

Unfortunately, there is no standard procedure or specific criteria for the evaluation of the calibration interval. The factors that ought to be taken into account in the evaluation of the calibration interval are, nevertheless, included in the ILAC guide (ILAC G24 - OIML D10, 2022). Based on these factors, a general rule of thumb can be established that the calibration interval of equipment depends on the nature of the measurements being made and on how frequently the instrument is used: the higher the risk of the measurements and/or the more frequently the instrument is used, the shorter the calibration interval (Bucher, 2006). Furthermore, the cost of the calibrations should also be taken into account (Pashnina, 2018).

Several methods have been developed to estimate the calibration interval using the calibration records of the equipment devices. The laboratory establishes an initial calibration interval, which is then updated based on the calibration statuses. These methods can be classified into five categories (ILAC G24 - OIML D10, 2022):

1. The automatic-adjustment or staircase methods, in which the initial calibration interval is either kept unchanged, extended, or diminished based on the most recent calibration certificate.

A developed version of this method was established (NCSLI, 2010) that uses the last three calibration certificates with appropriate weighting factors to account for the most significant calibration certificate.

2. The in-use time method, which is based on specifying a time duration for which the equipment device may be used. After this time has elapsed, the equipment device is recalibrated.

A similar approach is the “standard-given” method, in which the calibration interval is specified by the corresponding standard procedure. The difference between the two approaches is that in the first one the calibration interval is based on the actual time during which the device was in use, while in the “standard-given” approach the calibration interval is set regardless of the time during which the equipment device was in use. The significant drawback of both of these methods is that they do not account for the systematic trend in the equipment device’s performance.

3. The in-service checking method, which does not require evaluating the calibration interval but rather performing frequent checks on the calibration status of the equipment.

These checks are conducted using a portable calibration device that is designed specifically to check the equipment. If the calibration status is out-of-tolerance, the equipment device is recalibrated. Indeed, this method may be more effective than evaluating the calibration interval. However, it suffers from the difficulty of designing, or the limited availability of, the portable calibration device. In addition, there is a concern regarding the calibration interval of the portable calibration device itself.

4. The “uncertainty-growing” method, which is based on the idea that the uncertainty of a measurement standard grows with time due to its instability (Lepek, 2001).

A similar method was developed by Wang et al. (Wang, et al., 2017), in which a grey prediction model is used to predict the uncertainty of the equipment based on the results stated in the calibration certificates.

5. The statistical quality control chart, which assesses the performance of the equipment over the time of use.

In fact, it is one of the most important tools that can be implemented to estimate the calibration interval of the equipment device (ILAC G24 - OIML D10, 2022). Although there are various types of control charts (Montgomery, 2005), there is no specified procedure to evaluate the calibration interval. Furthermore, it has been reported that the evaluation of the calibration interval with this method is suitable only for equipment with an instrumental drift (ILAC G24 - OIML D10, 2022).

The main objective of the present study is to develop a method that implements the control charts to assess the performance of the equipment device and to predict the out-of-tolerance date based on its calibration records.

2. An Overview of the Calibration Interval Determination Methods
2.1. Automatic adjustment method
In the automatic adjustment method (ILAC G24 - OIML D10, 2022), the initial calibration interval is extended or kept unchanged as long as the error stated in the calibration certificate does not exceed an appropriately defined percentage of the maximum permissible error. Otherwise, the calibration interval is reduced. In the previous version of the international guide (ILAC & OIML, 2007), a limit of 80% was pointed out as guidance; in the recent version, however, no specific limit is given.

The simplicity of implementation is one of the advantages of this method. Furthermore, setting the tolerance limits lower than the maximum permissible error would help prevent the occurrence of out-of-tolerance cases. Nevertheless, the method does not monitor or provide an assessment of the potential systematic trend that might affect the performance of the equipment.
2.2. Three-calibrations method
The three-calibrations method is an approach developed from the automatic adjustment method, in which the initial calibration interval is adjusted based on the last three calibration certificates (NCSLI, 2010). Each calibration certificate is weighted by a corresponding factor so that the calibration interval is evaluated as:

$$CI = CI_i \left( W_1 X_1 + W_2 X_2 + W_3 X_3 \right) \tag{1a}$$

where W1, W2 and W3 are weighting factors for the most recent, the penultimate, and the antepenultimate calibrations, respectively, and CIi is the initial calibration interval. The weighting factors are adjustable based on the desired reliability of the outcome. X1, X2 and X3 are multipliers that describe the calibration status of the equipment, as illustrated in Table 1.

Table 1
The values of the multipliers X1, X2 and X3 used in the three-calibration method for evaluating the calibration interval.

Value of X1, X2, X3    Calibration Status
1.0                    Calibration within the acceptance criterion
0.8                    Calibration outside the acceptance criterion up to one time
0.6                    Calibration outside the acceptance criterion up to twice
0.4                    Calibration outside the acceptance criterion up to four times
0.3                    Calibration outside the acceptance criterion more than four times

One of the advantages of this method is that it uses more than one calibration record to adjust the calibration interval of the equipment device. Indeed, Eq. (1a) can be extended to include more than three calibration records as:

$$CI = CI_i \sum_i W_i X_i \tag{1b}$$

In this case, Xi can be set using the values of Table 1, while Wi would require a technical criterion. It is common to correlate the Wi with the dates of the calibration records in such a way that the more recent the calibration, the higher the weighting factor (Lucas, 1976; de Oliveria & de Jesys, 2015). However, it seems that the calibration status of the equipment device should also be considered, so that the weighting factors are inversely proportional to the values of Xi. Furthermore, the factors listed in the international guide (ILAC G24 - OIML D10, 2022) should also be considered.

It should be emphasized that the equations used to evaluate the calibration interval are entirely empirical. Furthermore, both Xi and Wi are adjustable parameters that accept wide ranges of values. This can lead to large discrepancies in the evaluation of the calibration interval for the same equipment, even when using the same calibration records.

Figure 1 illustrates the variations that can result from setting different sets of the weighting factors W1, W2 and W3. A variation of more than 170% in the estimated calibration interval for the same equipment could be obtained, even though the calibration statuses of the equipment device were within the acceptance tolerance.
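As an illustration of Eqs. (1a)–(1b) together with Table 1, the minimal Python sketch below computes the adjusted calibration interval; the weighting factors used here (0.5, 0.3, 0.2) are a hypothetical choice for demonstration only, since, as noted above, the result varies strongly with this choice.

```python
def three_calibration_interval(ci_initial, multipliers, weights=(0.5, 0.3, 0.2)):
    """Eq. (1a)/(1b): weighted adjustment of the initial calibration interval.

    ci_initial  : initial calibration interval (any time unit)
    multipliers : X values from Table 1, most recent calibration first
    weights     : W values (hypothetical choice), most recent calibration first
    """
    return ci_initial * sum(w * x for w, x in zip(weights, multipliers))


# All three calibrations within the acceptance criterion (X1 = X2 = X3 = 1.0):
print(three_calibration_interval(24, (1.0, 1.0, 1.0)))  # 24.0 months (kept)
# Most recent calibration once outside the criterion (X1 = 0.8):
print(three_calibration_interval(24, (0.8, 1.0, 1.0)))  # 21.6 months (diminished)
```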

2.3. Schumacher method


Similar to the three-calibrations method, the initial calibration interval is updated based on the calibration records of the equipment device. However, the calibration status is described as either conform (C), damaged (A), or out-of-tolerance (F) (NCSLI, 2010). The initial calibration interval is either diminished (D), extended (E), maintained (P), or diminished to the minimum possible period (M), based on the results of the calibration records, as illustrated in Table 2.

Although the factor by which the calibration interval is updated changes with the initial calibration interval, it would be more effective to consider a constant value (e.g., the average) independently of the initial calibration interval. This assumption is reasonable, especially for the “diminish” and “minimize as possible” cases, where the factors tend to have a constant value over the durations of the initial calibration intervals. In this case, the previous calibration interval would be changed by the factors 1.21, 0.88, and 0.64 for the “extend”, “diminish”, and “minimize as possible” cases, respectively.

Table 2
Schumacher method for deciding whether to diminish or extend the calibration interval of an equipment device based on the previous calibration statuses. The entries give the updated CI action for each recent calibration status.

Calibration Records History    Recent Calibration Status
                               A    F    C
CCC                            P    D    E
FCC                            P    D    P
ACC                            P    D    E
CF                             M    M    P
CA                             M    M    P
FC                             P    M    P
FF                             M    M    P
FA                             M    M    P
AC                             P    D    P
AF                             M    M    P
AA                             M    M    P

A = Damaged, F = Out-of-tolerance, C = Conform,
D = Diminish the calibration interval,
E = Extend the calibration interval,
P = Maintain the calibration interval,
M = Diminish the calibration interval to the minimum possible period.

Although both the three-calibrations method and the Schumacher method are easy to implement in the laboratory, they do not provide an accurate assessment of the calibration status of the equipment. Moreover, they do not notify the laboratory when an out-of-tolerance or even a damage case is being approached. Instead, they update the calibration interval after these cases have already occurred. This may be considered a violation of the purpose of the intermediate check, which is intended to ensure that the calibration status of the equipment is maintained within the accepted tolerance between two calibrations. The reason for this drawback is that these methods do not assess or account for the trend in the performance of the equipment, and therefore they do not provide a future prediction of the performance of the equipment.

2.4. Poisson method


This method was developed to adjust the calibration interval for a set of nj equipment devices based on the number of out-of-tolerance cases COut recorded within the calibration records. The calibration interval is updated as (Huang, 2010):

$$I = -\frac{\ln(R)}{\lambda}, \qquad \lambda = \frac{\sum C_{Out}}{\sum_j n_j t_j} \tag{1}$$

where R is the desired level of confidence, usually set at 0.95, and tj is the initial calibration interval.

Figure 3 depicts the factors by which the calibration interval is updated as a function of the number of equipment devices nj and the out-of-tolerance cases COut.

As the number of out-of-tolerance cases increases, the calibration interval decreases. Indeed, the calibration interval is diminished when the out-of-tolerance cases exceed 5% of the total number of equipment devices. On the other hand, the extension of the calibration interval does not exceed 5 times the initial calibration interval, even when all the equipment devices are within the tolerance limits.
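As a minimal sketch of Eq. (1), the Python snippet below updates the calibration interval from the recorded out-of-tolerance cases; the device counts and intervals used in the example are hypothetical.

```python
import numpy as np


def poisson_interval(c_out, n_devices, t_initial, R=0.95):
    """Eq. (1) (Huang, 2010): updated calibration interval I = -ln(R) / lambda.

    c_out     : out-of-tolerance cases per group of devices
    n_devices : number of devices per group
    t_initial : initial calibration interval per group (same time unit as I)
    R         : desired level of confidence (reliability target)
    """
    lam = np.sum(c_out) / np.sum(np.asarray(n_devices) * np.asarray(t_initial))
    return -np.log(R) / lam


# Hypothetical example: 3 out-of-tolerance cases among 40 devices calibrated
# every 12 months gives an updated interval of about 8 months.
print(poisson_interval([3], [40], [12.0]))
```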

3. Small-Shift Sensitive QCs


3.1. Implementation of QCs in evaluating the calibration interval
It has been pointed out in the international guide (ILAC G24 - OIML D10, 2022) that the quality control (QC) chart is an efficient tool that can be implemented to evaluate the calibration interval. In addition, the QC chart can provide a reasonable analysis of the performance of the equipment device. However, the main challenges of applying this method to estimate the calibration intervals are:

i. There is no clear procedure on how to apply the QC chart to evaluate the calibration interval, nor on which of the various QC chart types should be selected for this purpose.

ii. The performance of the QC chart is crucially dependent on the data size (ISO 7870, 2013), which could be a challenge with small calibration records.

iii. It has been stated that this method is not suitable for equipment without a drift.

These issues may have arisen because most of the methods evaluate the equipment’s performance using the error values stated in the calibration records. However, for a small change in the performance of the equipment, or for a small data size, the error values may not provide a reliable assessment.

In the present study, it is proposed to use small-shift sensitive QCs in conjunction with linear regression analysis to assess the performance of the equipment device and predict the out-of-tolerance case. The cumulative sum (CUSUM), V-mask, and moving average (MA) QCs have been used in the present study to evaluate the performance of the equipment.

The error in the equipment’s result has been assumed to have random and systematic-trend components, i.e.,

$$e = \epsilon_{nrm} + \epsilon_{trnd} \tag{2}$$

The Monte-Carlo (JCGM-101, 2008) random generators have been utilized to simulate the two
components of error. A normal distribution of mean value μ and standard deviation σ has been assigned
to the random error component, i.e.,

$$\epsilon_{nrm} = N(\mu, \sigma) \tag{3}$$

where N refers to the normal distribution. The systematic drift of the equipment, on the other hand, was simulated assuming errors drawn from a uniform distribution and arranged in ascending order:

$$\epsilon_{trnd} = U(a^{-} = 0,\; a^{+} = \Sigma) \tag{4}$$

where a− and a+ set the lower and upper limits of the distribution, respectively, and the parameter Σ = Δσ is adjustable to control the strength of the systematic drift. With negligible Σ, the error component ϵnrm is dominant and the equipment device performs with no drift. With a considerable Σ, the systematic trend error starts from values close to zero and therefore has little influence on the error at the earlier observations; however, as the size of the calibration records increases, the equipment’s performance becomes increasingly affected by the imposed drift.
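A minimal Python sketch of Eqs. (2)–(4) is given below. Interpreting the “ascending order” of the uniform component as a sorted draw is an assumption of this sketch, and the seed and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=1)


def simulate_errors(n, mu=0.0, sigma=1.0, delta=1.0):
    """Simulate n calibration errors, Eqs. (2)-(4).

    The random component is N(mu, sigma); the systematic-trend component is
    drawn from U(0, Sigma) with Sigma = delta * sigma and sorted in ascending
    order so that it grows with the observation index.
    """
    eps_nrm = rng.normal(mu, sigma, n)
    eps_trnd = np.sort(rng.uniform(0.0, delta * sigma, n))
    return eps_nrm + eps_trnd


errors = simulate_errors(20, delta=2.0)  # 20 records with a noticeable drift
```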

3.2. Cumulative-sum QC chart


The cumulative sum (CUSUM) QC chart was introduced as a diagnostic and predictive statistical tool (ISO 3534-2, 2006). It has the capability to identify small shifts in a process (Sheu & Tai, 2006). In addition, it has the advantage that it can be applied to measurement results of single observations (i.e., replicate number n = 1).

The CUSUM control chart monitors the upper and lower cumulative sums of the error ei through the two statistics (Montgomery, 2005):

$$C_i^{+} = \begin{cases} 0, & i = 1 \\ \max\!\left(0,\; C_{i-1}^{+} + e_i - \tfrac{1}{2}\delta\sigma\right), & i > 1 \end{cases} \tag{5}$$

$$C_i^{-} = \begin{cases} 0, & i = 1 \\ \min\!\left(0,\; C_{i-1}^{-} + e_i + \tfrac{1}{2}\delta\sigma\right), & i > 1 \end{cases} \tag{6}$$

where the parameter δ determines the number of standard deviations from the target that makes a shift detectable.

Assuming an error with only the ϵnrm component (i.e., negligible Σ), the distributions of C+ and C− are skewed toward the positive and negative directions, respectively, as illustrated in Fig. 4. Furthermore, the two statistics are crucially dependent on the adjustable parameter δ, which controls the resolution of the CUSUM QC chart. This value is often set to 1 to detect shifts within 1σ of the process. As it increases, the sensitivity of the CUSUM QC chart to changes in the data decreases. Figure 5 illustrates the effect of the adjustable parameter δ on the values and standard deviations of the CUSUM statistics C+ and C−.

The resolution parameter δ should be set based on technical or statistical information about the strength of the systematic error (drift) in the equipment performance. However, it is clear that both C+ and C− start with a zero value regardless of the strength of the drift in the equipment device results. This implies that the CUSUM may not be able to detect the shift from the earlier observations. Therefore, Lucas and Crosier (Lucas & Crosier, 1982) proposed the fast initial response (FIR) technique to increase the sensitivity of the CUSUM QC chart. This method is based on setting the initial values of C+ and C− to nonzero values, typically a fraction of the standard error.
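A compact implementation of the tabular CUSUM of Eqs. (5)–(6), with an optional FIR head start, is sketched below; the decision limit h (commonly 4σ or 5σ) and all names are illustrative assumptions rather than values prescribed in the text.

```python
import numpy as np


def cusum(errors, sigma, delta=1.0, fir=0.0):
    """Upper and lower CUSUM statistics, Eqs. (5)-(6).

    errors : sequence of errors e_i (deviations from the target)
    sigma  : process standard deviation
    delta  : shift size, in units of sigma, to be detected
    fir    : head-start value for the fast initial response variant (0 = classic)
    """
    k = 0.5 * delta * sigma            # allowance (slack) value
    upper = [fir]                      # C_1^+  (zero in the classic scheme)
    lower = [-fir]                     # C_1^-
    for e in errors[1:]:
        upper.append(max(0.0, upper[-1] + e - k))
        lower.append(min(0.0, lower[-1] + e + k))
    return np.array(upper), np.array(lower)


# An out-of-control signal is typically declared when C+ exceeds +h or
# C- falls below -h, with h often chosen as 4*sigma or 5*sigma.
c_plus, c_minus = cusum(np.random.default_rng(0).normal(0, 1, 30), sigma=1.0)
```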

3.3. V-mask QC
The cumulative sum can also be interpreted through the V-mask QC chart. However, instead of the C+ and C− statistics, the cumulative sums of the errors, Sm = Σi ei, are plotted against the observation numbers, and the limits of the V-mask are placed on top of the last point. The results are in statistical control when all the Sm points lie within the arms of the V-mask; otherwise, the process is suspected to be out-of-control.

The vertex of the V-mask is placed at a distance d from the last evaluated Sm point. This distance is defined in terms of the probabilities of a type-I error (α), a type-II error (β), and the amount of shift in the process mean that is to be detected, δ, as (NIST, 2012):

$$d = \frac{2}{\delta^2}\ln\!\left(\frac{1-\beta}{\alpha}\right) \tag{7}$$

where the values of α and β are often set to 0.0027 and 0.01, respectively, to establish a ±3σ criterion (NIST, 2012). From the vertex, two arms are plotted with a slope of

$$k = \frac{1}{2}\delta\sigma \tag{8}$$

where σ and δ have the same meanings as in the CUSUM QC chart. The vertical rise distance of the arm at the last point then has the length

$$h = dk \tag{9}$$

When the V-mask is placed over the last data point, the mask indicates whether an out-of-control situation has occurred; the V-mask is then shifted to the first out-of-tolerance case. A comparison between the V-mask and CUSUM QC charts, assuming an error with only the ϵnrm component, is illustrated in Fig. 6.

Despite the high efficiency of the two QCs, recommendations against the V-mask have been made (Montgomery, 2005) due to the ambiguity associated with its parameters α and β. Furthermore, it was concluded that these parameters are time-dependent rather than constant (Lowry, et al., 1992). In contrast, recommendations for using the V-mask QC chart have also been made in the literature (Lucas, 1976; Edwards, 1980).
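The sketch below computes the V-mask geometry of Eqs. (7)–(9) and flags cumulative-sum points that fall outside the arms when the mask is placed on the last point; the geometric test is one straightforward reading of the construction described above, and the names are illustrative.

```python
import numpy as np


def v_mask_parameters(sigma, delta=1.0, alpha=0.0027, beta=0.01):
    """Lead distance d, arm slope k and rise h of Eqs. (7)-(9)."""
    d = (2.0 / delta**2) * np.log((1.0 - beta) / alpha)
    k = 0.5 * delta * sigma
    return d, k, d * k


def v_mask_flags(errors, sigma, **kwargs):
    """Boolean array marking cumulative-sum points outside the V-mask arms."""
    d, k, h = v_mask_parameters(sigma, **kwargs)
    s = np.cumsum(errors)                 # S_m values
    lag = np.arange(len(s))[::-1]         # observations behind the last point
    upper_arm = s[-1] + h + k * lag       # arm heights above/below the last point
    lower_arm = s[-1] - h - k * lag
    return (s > upper_arm) | (s < lower_arm)
```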

3.4. Moving average QC


The moving average (also called running or rolling average) control charts were also introduced to identify small shifts in a process (Allen, 2006). The moving average (MA) estimator has the additional advantage that it can be used as a data-noise filter (Smith, 1999).

For a set of errors ei, the MA evaluates the arithmetic mean of a fixed number of consecutive ei within a window of size ω. As the window moves, the oldest observation is excluded and the newest observation is included. The MA can be expressed as:

$$MA_n = \frac{1}{\omega}\sum_{i=n-\omega+1}^{n} e_i \tag{10}$$

Assuming an error with only the ϵnrm component, the distribution of the MA compared with the distribution of the error is illustrated in Fig. 7.

It is clear that the width of the error distribution shrinks when the error is subjected to the MA estimator. The Monte-Carlo algorithm has been implemented to determine the correlation between the window size ω and the population standard deviation of the error. Random error vectors of a standard normal distribution have been simulated via the Monte-Carlo generators, and each vector has been subjected to the MA estimator with a prespecified window size. The correlation between the population standard deviation of the MA, σMA, and the population standard deviation of the error, σe, as a function of the window size ω is illustrated in Fig. 8 and tabulated in Table 3.

Using this correlation, the limits of the MA control chart that correspond to the desired limits of ±σ can be obtained. Table 3 lists the factors required to evaluate the moving average limits as a function of the moving average window ω.

Table 3
Correlation between the moving average standard deviation and the error standard deviation as a function of the window size of the moving average. The simulations have been performed using the Monte-Carlo method (JCGM-101, 2008), assuming an error of normal distribution.

ω     σMA/σe        ω     σMA/σe
1     1.000         16    0.250
2     0.707         17    0.242
3     0.577         18    0.235
4     0.500         19    0.229
5     0.447         20    0.224
6     0.408         21    0.218
7     0.377         22    0.213
8     0.354         23    0.209
9     0.333         24    0.203
10    0.317         25    0.200
11    0.302         26    0.196
12    0.289         27    0.193
13    0.278         28    0.188
14    0.268         29    0.185
15    0.258         30    0.182

The MA window size ω plays a role similar to the resolution parameter δ in the CUSUM and V-mask QCs. The sensitivity of the MA QC chart can be controlled through the adjustable ω parameter: as ω increases, the control limits narrow and the sensitivity of the MA QC chart to small shifts increases.
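The following Monte-Carlo sketch reproduces the correlation of Table 3; for independent normal errors, the tabulated factors agree with the analytical relation σMA/σe = 1/√ω. The sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=2)


def ma_sigma_ratio(window, n_samples=200_000):
    """Monte-Carlo estimate of sigma_MA / sigma_e for standard-normal errors."""
    e = rng.standard_normal(n_samples)
    ma = np.convolve(e, np.ones(window) / window, mode="valid")  # moving average
    return ma.std(ddof=1)


for w in (2, 4, 8, 16):
    print(w, round(ma_sigma_ratio(w), 3), round(1.0 / np.sqrt(w), 3))
```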

3.5. Effect of the systematic trend error ϵtrnd


In this section, the ϵtrnd error component has been included in the simulation. The simulated error values have been subjected to the various QC charts, as depicted in Fig. 9. The effectiveness of the small-shift sensitive QCs is clear: the systematic trend can be observed in the charts’ graphical representation, and an out-of-control case has therefore been declared by these QCs. Since there is no significant dispersion in the results, no out-of-control case was identified by the Shewhart QC.

It has been found that the sensitivities of the CUSUM, V-mask and MA QCs are not identical. Therefore, the Monte-Carlo algorithm has been implemented to simulate the systematic trend error ϵtrnd with different Σ = Δσ. For an iteration size of 10³, the error has been simulated and subjected to the various QCs at each Δ value. The sensitivities of the QCs have been expressed in terms of the number of out-of-tolerance points detected at each Δ. The results are presented in Fig. 10.

In the MA QC, a window size of ω = 4 has been used, and the control limits have been estimated using Table 3. For the V-mask and CUSUM QCs, the control limits have been set at ±3σ with δ = 1.0, while the parameters α and β have been set to 0.0027 and 0.01, respectively, for the V-mask QC.

Indeed, it is not appropriate to evaluate this type of error using the Shewhart QC, since its performance is independent of the strength of the systematic error component; this fact was emphasized in the international standard ISO 7870 series (ISO 7870, 2013). Despite having different sensitivities, the CUSUM and MA QCs have shown similar performance over the systematic trend factor Δ: in both charts, the number of out-of-tolerance points increases with increasing Δ. The V-mask QC, on the other hand, has shown a very quick response at small Δ values and then attains a constant level at higher values.

The performance of the CUSUM and MA appears to be more useful for the aims of the current study, even though the performance of the V-mask may be sought for particular applications. Indeed, there would be no need to classify the equipment as out-of-control when the drift component ϵtrnd is negligible compared with the random component ϵnrm.

4. Evaluation of the Calibration Interval


4.1. Setting an initial calibration interval
An initial calibration interval should be set by the laboratory based on technical experience or scientific evidence. It is best to take into account as many of the factors stated in the international guide (ILAC G24 - OIML D10, 2022) as possible. A minimum of three calibration records must be available in order to implement the procedure outlined in the present work.
4.2. Prediction model
In order to estimate the calibration interval of an equipment device, the date at which the equipment performance is expected to be out-of-tolerance should be predicted. Of course, selecting the appropriate model depends on the function with which the systematic trend error evolves over time. However, it is reasonable to assume a linear model for a small data size (Subramani & Singh, 2014; Erick, et al., 2017; ISO Guide 35, 2017). Therefore, in the present study, the performance of the equipment device has been predicted using a linear model. The calibration records (error against time) are subjected to linear regression analysis:

$$y = a + bx \tag{11}$$

where x and y are the independent variable (date) and the dependent variable (error), respectively, and a and b are the regression model coefficients.
4.3. Maintaining or extending the initial calibration interval
In this section, it has been assumed that all the calibration records were within the tolerance limits. In this
case, the initial calibration interval is either maintained or extended following these steps:

i. The calibration records are subjected to linear regression analysis.

ii. The error of the equipment is predicted at the next calibration date.

iii. The actual calibration records and the predicted response of the equipment are subjected to the QCs.

iv. The preceding steps are continued until an out-of-tolerance case is found.

v. The calibration interval is evaluated as the duration between the date of the last calibration record and the date at which the out-of-tolerance case is detected (a minimal sketch of these steps is given below).
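A possible realization of steps (i)–(v) is sketched below in Python/NumPy, combining the linear model of Eq. (11) with the CUSUM statistics of Eqs. (5)–(6). The decision limit h = 4σ, the one-month prediction step, and all names are illustrative assumptions, not values specified in the study.

```python
import numpy as np


def predict_calibration_interval(dates, errors, sigma, tolerance,
                                 step_days=30, delta=1.0, h_factor=4.0,
                                 max_horizon_days=3650):
    """Extend a linear fit of the calibration records forward in time until the
    CUSUM chart (or the tolerance itself) signals an out-of-tolerance case.

    dates  : days elapsed since the first calibration (one value per record)
    errors : errors stated in the calibration certificates
    Returns the predicted calibration interval in days.
    """
    a, b = np.polynomial.polynomial.polyfit(dates, errors, 1)  # y = a + b*x
    k = 0.5 * delta * sigma
    h = h_factor * sigma
    series = list(errors)
    t = dates[-1]
    while t - dates[-1] < max_horizon_days:
        t += step_days
        series.append(a + b * t)              # predicted error at the next date
        c_plus = c_minus = 0.0                # CUSUM over records + predictions
        for e in series[1:]:
            c_plus = max(0.0, c_plus + e - k)
            c_minus = min(0.0, c_minus + e + k)
            if c_plus > h or c_minus < -h or abs(e) > tolerance:
                return t - dates[-1]          # interval since the last record
    return max_horizon_days
```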

It has been shown previously that the detection of the out-of-tolerance case is affected by the sensitivity of the QCs to the systematic trend error. However, in the evaluation of the calibration interval, the QCs are applied to both the calibration records and the predicted data. Therefore, the performance of the applied QCs is also affected by the predictive model. To illustrate this effect, an example of the evaluation of the calibration interval using calibration records simulated with a considerable systematic trend error ϵtrnd is depicted in Fig. 11.

In this example, the calibration records have been subjected to linear regression analysis. The responses of the equipment at the subsequent six-month steps (the initial calibration interval) have been predicted up until the detection of an out-of-tolerance case. Both the CUSUM and the V-mask QCs have predicted an out-of-tolerance case one year after the last calibration record, while the MA QC has estimated a calibration interval of 1.5 years.

Figure 12 shows the calibration intervals estimated using the CUSUM, V-mask and MA QCs while taking into account the influence of the linear regression analysis. The Monte-Carlo algorithm has been used to generate calibration records with various systematic trend errors of strength Σ = σΔ and an initial calibration interval of six months. The procedure outlined at the beginning of this section was applied to the data, and the calibration interval was computed.

The outcomes of Fig. 12 are in line with the conclusions derived from Fig. 10. Because the V-mask is more sensitive to the systematic trend error, it evaluated the narrower calibration intervals. On the other hand, the calibration intervals evaluated using the moving average and CUSUM QCs have shown comparable performance over the trend strength factor Δ. In fact, the similarity of the CUSUM and MA QCs supports the fact that they have a reasonable sensitivity to the systematic trend error. Furthermore, they can also be easily adopted in most laboratories because they are less complicated than the V-mask.

A general conclusion can be made that both the CUSUM and the MA QCs are more appropriate than the V-mask for providing adequate calibration intervals. Nevertheless, for the quick detection of a small systematic trend error, the V-mask QC may still be necessary.
4.4. Effect of the calibration records size
The size of the calibration records affects the accuracy of both the regression analysis and the QC. As the size of the calibration records grows, the model becomes more capable of capturing the performance of the equipment and provides more accurate predictions for the QCs. Therefore, the calibration interval of an equipment device should be updated whenever new data are added to the calibration records.

In the present section, the effect of the data size on the calibration interval has been investigated. Using the same simulation parameters as in the previous simulations and varying the calibration records size within N = [5 − 30], the calibration intervals evaluated using the CUSUM and MA QCs have been investigated. The results are presented in Fig. 13.

Assuming a considerable Δ = 1.0, the effect of the systematic error increases each time the equipment undergoes a new calibration (i.e., the size of the calibration records increases). As a result, the calibration intervals evaluated by both the MA and CUSUM QCs decrease as the size of the calibration records increases. It is important to note that when the size of the calibration records grows, the window of the MA QC has to be modified. Indeed, setting the window of the MA QC to a constant value will result in erroneous calibration intervals. The CUSUM, on the other hand, does not require any adjustments.

The discussion above leads to the conclusion that the CUSUM QC is the most suited to be employed in the calibration interval evaluation, for the reasons listed below:

i. The calibration intervals evaluated using this QC are intermediate between those evaluated by the MA and V-mask QCs.

ii. The CUSUM does not depend on the size of the calibration records, in contrast to the MA QC.

iii. Unlike in the V-mask QC, where the parameters α and β might also depend on the size of the calibration records, no such dependence arises for the CUSUM parameters.
4.5. Diminishing the calibration interval
In the previous sections, it has been assumed that the error values obtained from the calibration records are all within the tolerance of the QC, so that the initial calibration interval is either maintained or extended. Occasionally, however, the error stated in a calibration certificate might be out-of-tolerance. In this case, the initial calibration interval was overestimated in such a way that the out-of-tolerance case occurred before it could be predicted by the QC. Therefore, the initial calibration interval should be diminished to a reasonable period that allows the prediction of the out-of-tolerance case before its occurrence.

A criterion has been established to reduce the initial calibration interval to the time interval between the date of the last point that is within the tolerance and the point at which the fitted linear trend intersects the tolerance limit, as illustrated in Fig. 14.

In the above example, examining the performance of the equipment through a calibration every 6 months was too coarse a scale to assess the change in the equipment’s performance. In other words, significant changes in the performance of the equipment occurred without being predicted by the QC. Therefore, the established criterion proposed reducing the calibration interval to 2.6 months in order to track the drift in the equipment’s performance precisely and to predict the out-of-tolerance case before its occurrence.
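As a minimal sketch of this criterion, assuming a symmetric tolerance band of ±tolerance around zero error, the snippet below derives the reduced interval from the fitted linear trend of Eq. (11); the interface is illustrative.

```python
import numpy as np


def diminished_interval(dates, errors, tolerance):
    """Reduced calibration interval: time between the last in-tolerance record
    and the date at which the fitted linear trend crosses the tolerance limit."""
    a, b = np.polynomial.polynomial.polyfit(dates, errors, 1)   # y = a + b*x
    if b == 0:
        raise ValueError("no trend fitted; the criterion does not apply")
    limit = tolerance if b > 0 else -tolerance                  # limit being approached
    t_cross = (limit - a) / b                                   # crossing date
    t_last_ok = max(t for t, e in zip(dates, errors) if abs(e) <= tolerance)
    return t_cross - t_last_ok
```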

The implementation of this criterion may also call for carrying out the necessary maintenance on the equipment device, if required. Indeed, the interpretation of the out-of-tolerance case shall be made on a technical basis; the detection of many out-of-tolerance cases in the calibration records may require classifying the equipment as out-of-use.

4.6. Irregular initial calibration intervals


In the calibration records, the time interval between two successive calibrations may not always be the same, since the laboratory may occasionally set different initial calibration intervals. In the case that all the calibration results are within the tolerance limits, only the most recent calibration is considered in the evaluation of the calibration interval, as previously discussed in this study.

If, on the other hand, one of the calibration records is out-of-tolerance, the initial calibration interval could be estimated as the average of the time intervals between successive calibrations. The average of these calibration intervals is then diminished by the appropriate factor, as addressed in the previous section.
4.7. Monitoring the performance of the equipment during the calibration interval period
The evaluation of the calibration interval discussed in the present study assumes that the equipment device is kept safe so that its calibration status is not altered; in other words, neither the random nor the systematic trend errors are subject to change from outside influences. Therefore, the stability of the equipment’s performance shall be continuously examined.

The Shewhart QC would be useful for assessing the stability of the equipment device during the period of the calibration interval. In this case, there is no need to assess the deviation of the results from a reference value, but rather to ensure that the results produced by the equipment are stable and do not undergo significant changes.
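One way to realize this stability check is an individuals (Shewhart X) chart whose limits are derived from the average moving range of the intermediate-check results, as sketched below; this particular construction is an assumption of the sketch rather than a procedure specified in the study.

```python
import numpy as np


def shewhart_individuals_limits(results):
    """3-sigma limits of an individuals chart for intermediate-check results.

    The short-term standard deviation is estimated from the average moving
    range divided by the d2 constant for subgroups of size 2 (1.128).
    """
    x = np.asarray(results, dtype=float)
    sigma_hat = np.abs(np.diff(x)).mean() / 1.128
    center = x.mean()
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


lcl, center, ucl = shewhart_individuals_limits([10.02, 10.01, 10.03, 10.00, 10.02])
```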

5. General Discussion
Setting the calibration interval of the equipment is one of the requirements of ISO 17025 (ISO 17025, 2017). However, there is no universal procedure that can be implemented to evaluate the next calibration date. Various methods have been developed to establish criteria for the evaluation of the calibration interval based on the calibration records of the equipment (NCSLI, 2010; Huang, 2010). Most of these methods use the error stated in the calibration certificates to either extend or diminish the initial calibration interval.

In the present study, it has been argued that using the error values stated in the calibration certificates alone may not provide an accurate assessment of the calibration status of the equipment. Instead, it has been proposed to use small-drift sensitive quality control charts along with a linear regression model to assess the calibration status of the equipment and predict the out-of-tolerance case. The idea is based on fitting the error values with a linear model and predicting the out-of-control case using the QCs. In addition to the usual Shewhart chart, three types of small-drift sensitive QCs have been investigated.

The CUSUM QC has shown optimum performance in comparison with the V-mask and MA QCs. The calibration intervals evaluated by this QC are intermediate between the results of the other QCs. Furthermore, its parameters require no adjustment and do not depend on the size of the calibration records.

6. Conclusion
• Using the error values stated in the calibration records alone may not provide an accurate assessment of the calibration status of the equipment. Instead, subjecting the errors to a small-shift sensitive control chart provides a more precise assessment of its calibration status.

• The small-shift sensitive quality control charts can be used with a linear regression model to predict the out-of-tolerance case and evaluate the calibration interval of an equipment device.

• The CUSUM QC has shown optimum performance in its sensitivity to small shifts in the equipment’s performance and in the evaluation of the calibration interval.

• In contrast to the V-mask and MA QCs, the adjustment parameters of the CUSUM QC are independent of the size of the calibration records.

References
1. Allen, T. T., 2006. Introduction to engineering statistics and six sigma. USA: Springer.

2. Bucher, J. L., 2006. The quality calibration handbook: Developing and managing a calibration
program. USA: ASQ Quality Press.
3. de Oliveria, E. C. & de Jesys, V., 2015. Management of calibration intervals for temperature and static
pressure transmitters applied to the natural gas industry. J. Nat. Gas. Sci. Eng., Volume 24, pp. 178-
184.
4. Edwards, R., 1980. Internal analytical quality control using the Cusum charts and truncated V-mask
procedure. Annals of Clinical Biochemistry: An International journal of biochemistry and laboratory
medicine, 17(4), pp. 205-211.
5. Erick, O. O., Kahiri, J. & Erick, W. M., 2017. Impact of measurement errors on estimators of parameters
of a finite population with linear trend under systematic sampling. American Journal of Theoretical
and Applied Statistics, 6(6), pp. 270-277.
6. Huang, D., 2010. Calibration intervals by Bayesian approach: Information management, In:
Measurement system conference. s.l., s.n.
7. ILAC & OIML, 2007. ILAC-G24 & OIML D10: Guidelines for the determination of calibration intervals of measuring instruments, s.l.: ILAC & OIML.
8. ILAC G24 - OIML D10, 2022. Guidelines for the determination of recalibration intervals of measuring equipment, s.l.: ILAC & OIML.
9. ISO 17025, 2017. General requirements for the competence of testing and calibration laboratories, s.l.: ISO.
10. ISO 3534-2, 2006. Statistics - Vocabulary and symbols, Part 2: Applied Statistics, s.l.: s.n.
11. ISO 7870, 2013. Control charts - Part 2: Shewhart control charts, s.l.: ISO.
12. ISO Guide 35, 2017. Reference materials - Guidance for characterization and assessment of
homogeneity and stability, s.l.: ISO.
13. JCGM-101, 2008. JCGM 101: Evaluation of measurement data - Supplement 1 to the "Guide to the
expression of uncertainty in measurement" - Propagation of distributions using a Monte Carlo
method, s.l.: JCGM.
14. Lepek, A., 2001. Software for the prediction of measurement standards. s.l., NCSL International
Conference.
15. Lowry, C. A., Woodall, W. H., Champ, C. W. & Rigdon, S. E., 1992. A multivariate exponentially weighted moving average control chart. Technometrics, 34(1), pp. 46-53.
16. Lucas, J. M. & Crosier, R. B., 1982. Fast initial response CUSUM quality control schemes. Technometrics, 24(3), pp. 199-205.
17. Lucas, J. M., 1976. The design and use of V-mask control schemes. Journal of Quality and
Technology, 8(1), pp. 1-12.
18. Montgomery, D. C., 2005. Introduction to statistical quality control. 5th ed. USA: John Wiley & Sons,
Inc..
19. NCSLI, 2010. Establishment and adjustment of calibration intervals, Recommended Practice RP-1, s.l.: National Conference of Standards Laboratories International.

20. NIST, 2012. NIST/SEMATECH e-Handbook of Statistical Methods. USA: NIST.
21. Pashnina, N., 2018. Determination of optimal calibration intervals by balancing financial exposure
against measurement costs. Flow Measurement and Instrumentation, Volume 60, pp. 115-123.
22. Sheu, S.-H. & Tai, S.-H., 2006. Generally weighted moving average control chart for monitoring process variability. Int. J. Adv. Manuf. Technol., Volume 30, pp. 452-458.
23. Smith, S. W., 1999. The Scientist and Engineer's Guide to Digital Signal Processing. 2nd ed. San Diego: California Technical Publishing.
24. Subramani, J. & Singh, S., 2014. Estimation of population mean in the presence of linear trend.
Communications in Statistics - Theory and Methods, 43(12), pp. 3095-3166.
25. Wang, J., Zhang, Q. & Jiang, W., 2017. Optimization of calibration intervals for automatic test equipment. Measurement, Volume 103, pp. 87-92.

Figures

Figure 1

The calibration interval evaluated using the three-calibration method (NCSLI, 2010) for different ranges of the weighting factors W1, W2 and W3, assuming that the calibration status in each calibration process was within the accepted tolerance, i.e., X1 = X2 = X3 = 1. The initial (old) calibration interval was assumed to be 2 years.

Figure 2

Evaluation of calibration interval based on Schumacher method (NCSLI, 2010).

Figure 3

Calibration intervals evaluated following the Poisson method (Huang, 2010) as a function of the out-of-
tolerance cases.

Figure 4

Characteristics of the C+ and C− statistics of the cumulative sum (CUSUM) control charts.

Figure 5

The effect of the adjustable parameter δ on the CUSUM statistics C+ and C−. The lines represent the mean values, while the standard deviation is represented by the size of the bubbles.

Figure 6

The performances of the V-mask and CUSUM charts for an error with only the ϵnrm component. The adjustable parameter δ was set to 1 in both charts.

Figure 7

Distribution of the moving average of the errors compared to an error of a standard normal distribution N(μ = 0, σ = 1). The window of the moving average has been set to ω = 5.

Figure 8

Correlation between the population standard deviation of moving average and population standard
deviation of the error as a function of the moving average window size.

Figure 9

Comparison between the performances of various QC charts for an error with a random component ϵnrm and a systematic component ϵtrnd.

Figure 10

The performances of the various QCs for an error with a systematic component ϵtrnd with different trend-strength factors Δ.

Figure 11

Prediction of the out-of-tolerance case using various QC charts. The error was simulated with a
systematic error component. The calibration records against the duration were subjected to linear
regression analysis.

Figure 12

The calibration interval predicted by QCs for calibration records with various systematic error strengths.

Figure 13

Effect of the calibration records size on the calibration interval evaluated by the CUSUM and MA with the prediction of the linear model.

Figure 14

Example of reducing the initial calibration interval using the CUSUM QC.

