
Analytical data interpretation and statistical tools for procedure validation

CPD

Presented by
Rafeek Fahmi, PhD
Egyptian Drug Authority
25-8-2021
Session 1

Statistical estimation

Estimation
Normal distribution of population (unknown)
Analytical procedure distribution (estimated)

Mean: point estimator of μ
(Y − τ): point estimator of bias
SD: point estimator of σ

Statistical inference
(= estimation, prediction)
Estimation:
How? By calculation based on a confidence interval, using the t table.

Example of prediction

Errors in analytical validation

Experimental design:
Calibration standards
(regression equation)

Validation standards
(precision/accuracy)

Variance and SD

Normal distribution

• Confidence limit (of population) = μ ± z·σ
• x − μ = z·σ
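
The relation μ ± z·σ can be sketched numerically with Python's standard library. The values of μ, σ and the 95% coverage below are illustrative assumptions, not figures from the slides.

```python
from statistics import NormalDist

# Hypothetical population parameters (assumed for illustration only)
mu = 100.0      # population mean
sigma = 2.0     # population standard deviation
coverage = 0.95

# Two-sided z multiplier: the (1 + coverage) / 2 quantile of the standard normal
z = NormalDist().inv_cdf((1 + coverage) / 2)

lower = mu - z * sigma
upper = mu + z * sigma
print(f"z = {z:.3f}, limits = {lower:.2f} to {upper:.2f}")
```

A wider coverage (for example 0.99) gives a larger z and therefore wider limits.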
Estimation by using a confidence interval

Using the t table
(example of accuracy and precision estimation in USP 1210)
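
A minimal sketch of a t-based confidence interval for a mean is given here. The replicate results are invented for illustration, the 95% level is an assumption, and the small lookup table holds two-sided 95% values copied from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% t values from a standard t table (df -> t)
T_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306}

def mean_confidence_interval(values):
    """Return (lower, upper) 95% confidence limits for the mean."""
    n = len(values)
    t = T_95[n - 1]                           # degrees of freedom = n - 1
    half_width = t * stdev(values) / sqrt(n)  # t * SD / sqrt(n)
    centre = mean(values)
    return centre - half_width, centre + half_width

# Hypothetical replicate results (% of label claim), invented for illustration
results = [99.1, 100.4, 98.7, 101.2, 99.8, 100.6]
low, high = mean_confidence_interval(results)
print(f"mean = {mean(results):.2f}, 95% CI = {low:.2f} to {high:.2f}")
```

With few replicates the t value is large, so the interval is noticeably wider than the z-based limits would suggest.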

T table

We will discuss…

• Some statistical considerations in accuracy and precision
• Some statistical considerations in regression
• LOD, LOQ
• Statistical comparison of analytical methods
• Outliers

• Accuracy and precision statistics
Accuracy and precision statistics

Value from a reference standard or value from an
established procedure

Accuracy and precision statistics
Analytical procedure distribution (estimated)

Accuracy and precision statistics

Just an illustrative example.
The gray area represents the expected probability that a measurement will fall within −15% and +15% (an example of a limit in bioanalysis), based on the observed distribution of the method, characterized by the bias and precision estimated in the validation phase.

The unshaded distribution represents the true (unknown) distribution of the procedure under validation.

Note also that the estimated bias and precision of the procedure differ from the “true bias” and “true precision”.
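
The "gray area" idea can be sketched as the area of a normal curve between the two limits. The bias and standard deviation below are hypothetical validation estimates chosen only for illustration, and normality of the measurement error is assumed.

```python
from statistics import NormalDist

# Hypothetical validation estimates (assumed, in % relative to the true value)
estimated_bias = 3.0   # estimated bias
estimated_sd = 6.0     # estimated standard deviation (precision)
limit = 15.0           # acceptance limit of +/-15 %

# Distribution of the relative error implied by the estimates
error_dist = NormalDist(mu=estimated_bias, sigma=estimated_sd)

# Area between -limit and +limit, i.e. the shaded region
p_within = error_dist.cdf(limit) - error_dist.cdf(-limit)
print(f"Expected proportion within +/-{limit:.0f}%: {p_within:.3f}")
```

Because the inputs are estimates, the computed proportion is itself only an estimate of what the true (unknown) distribution would give.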
Accuracy Requirements

[Figure: accuracy (bias) and its confidence limits shown against the procedure limits]

Note: this confidence limit (CL) is not the one in Excel. Ref.: SFSTP

Accuracy and precision statistics

Accuracy and precision example in USP 1210

Some statistical considerations in accuracy and precision

T table

Accuracy results in the USP 1210 example
[Figure: accuracy (bias) and its confidence limits shown against the procedure limits]

Since the computed confidence interval, from −9.94 to −4.44 mg/g, falls entirely within the range from −15 to +15 mg/g, the bias criterion is satisfied.
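
The decision rule used here (the whole confidence interval must lie inside the procedure limits) can be written as a small check. The function name is a hypothetical helper; the first set of numbers is the one quoted on the slide, the second is invented to show a failing case.

```python
def interval_within_limits(ci_low, ci_high, limit_low, limit_high):
    """True when the whole confidence interval lies inside the procedure limits."""
    return limit_low <= ci_low and ci_high <= limit_high

# Values quoted on the slide (mg/g)
bias_ok = interval_within_limits(-9.94, -4.44, -15.0, 15.0)
print("Bias criterion satisfied:", bias_ok)

# A hypothetical interval that crosses the lower limit fails the criterion
bias_fail = interval_within_limits(-16.2, -4.44, -15.0, 15.0)
print("Bias criterion satisfied:", bias_fail)
```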


Precision in USP example

Chi-square table: 2 pages

Combined validation of accuracy and
precision
Prediction interval, tolerance interval

End of USP example
Accuracy and precision statistics
Confidence limits are limits within which we expect a given
population parameter, such as the mean, to lie.
Statistical tolerance limits are limits within which we expect a
stated proportion of the population to lie

Value from a reference standard or value from an
established procedure

Do accepted reference values or results from an established procedure exist for validation of accuracy?
○ If not, as stated in International Council for Harmonisation (ICH) Q2, accuracy may be inferred once precision, linearity, and specificity have been established.

Examples of accuracy profiles

Accuracy and precision statistics
The requirements in USP 1225
• Assessment of accuracy can be accomplished in a variety of ways, including evaluating the recovery of the analyte (percent recovery) across the range of the assay, or
• evaluating the linearity of the relationship between estimated and actual concentrations. The statistically preferred criterion is that the confidence interval for the slope be contained in an interval around 1.0, or alternatively, that the slope be close to 1.0.
In either case, the interval or the definition of closeness should be specified in the validation protocol.

The precision of an analytical procedure is determined by assaying a sufficient number of aliquots of a homogeneous sample to be able to calculate statistically valid estimates of standard deviation or relative standard deviation (coefficient of variation).
Assays in this context are independent analyses of samples that have been carried through the complete analytical procedure from sample preparation to final test result.
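
As a small illustration of the precision estimates named above, the standard deviation and relative standard deviation can be computed as follows. The six assay results are hypothetical, standing in for independent preparations of one homogeneous sample.

```python
from statistics import mean, stdev

# Hypothetical assay results for six independent preparations of one
# homogeneous sample (% of label claim); invented for illustration
assays = [99.5, 100.2, 98.9, 100.8, 99.7, 100.1]

avg = mean(assays)
sd = stdev(assays)       # sample standard deviation (n - 1 denominator)
rsd = 100 * sd / avg     # relative standard deviation, in %

print(f"mean = {avg:.2f}, SD = {sd:.3f}, RSD = {rsd:.2f}%")
```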
Some statistical considerations in regression

Regression equation

Actual random error and regression residual

The probability distribution of ε

USP 1225:
Data from the regression line itself may be
helpful to provide mathematical estimates of
the degree of linearity.
The correlation coefficient, y-intercept, slope
of the regression line, and residual sum of
squares should be submitted.
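
The four quantities listed (correlation coefficient, y-intercept, slope, residual sum of squares) can be obtained from an ordinary least-squares fit. This is a minimal sketch; `linear_fit` is a hypothetical helper and the calibration data are invented.

```python
from math import sqrt

def linear_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    # Residual sum of squares: squared vertical distances from the fitted line
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, r, rss

# Hypothetical calibration data (concentration vs. response), for illustration
conc = [2.0, 4.0, 6.0, 8.0, 10.0]
response = [20.3, 39.8, 60.5, 79.6, 100.1]

slope, intercept, r, rss = linear_fit(conc, response)
print(f"slope={slope:.4f} intercept={intercept:.4f} r={r:.5f} RSS={rss:.4f}")
```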

Predicted

Important in regression (Excel)

• ANOVA

Some statistical considerations in regression

Simple least squares regression assumes that the error in the x values is insignificant compared with that of the y values.
In addition, the error associated with the y values must be normally distributed.

• Normality is hard to test for statistically with only small data sets.
• If there is doubt about normality, it may be sufficient to replace single y values with averages of three or more for each value of x, as mean values tend to be normally distributed even where individual results are not.
The magnitude of the error in the y values should also be constant across the range of interest, i.e. the standard deviation should be constant.

Simple least squares regression gives equal weight to all points; this is not appropriate if some points are much less precise than others.
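
When precision is not constant, one common alternative is weighted least squares. The sketch below is illustrative only: the data are invented and the 1/x² weighting is an assumed scheme (often used when the standard deviation grows in proportion to concentration), not a recommendation from the slides.

```python
def weighted_linear_fit(x, y, weights):
    """Weighted least-squares fit of y = intercept + slope * x."""
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data whose scatter grows with concentration (illustration only)
conc = [1.0, 5.0, 10.0, 50.0, 100.0]
response = [1.1, 5.0, 10.3, 48.5, 103.0]

# Assumed weighting scheme: 1/x^2
weights = [1 / c ** 2 for c in conc]

print("unweighted:", weighted_linear_fit(conc, response, [1.0] * len(conc)))
print("weighted:  ", weighted_linear_fit(conc, response, weights))
```

With equal weights the function reduces to the ordinary least-squares fit, so the two printed lines show how much the imprecise high-concentration points pull the unweighted line.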

Some statistical considerations in linearity

Carrying out regression analysis using software

Some statistical considerations in linearity
Examples of residual plots

Residual plot problem and solution

Residual plot showing the change of variance of ε

Residual plot indicating a quadratic model: the straight-line model shows this (bell-shaped) residual plot

Residual plot problem and solution

Residual plot against y
Residual plot against y (outlier deleted)

Some statistical considerations in linearity

1. Linear Least Squares
• Ordinary Least Squares, OLS
• Weighted Least Squares, WLS
• Generalized Least Squares, GLS
2. Nonlinear Least Squares
• Nonlinear WLS by Linearization
• Nonlinear WLS by other Methods
• The Gradient or Steepest Descent Method
• Newton’s Method
• The Gauss-Newton Method
• The Levenberg-Marquardt Method

Some statistical considerations in linearity
kth-order polynomial, which takes the form:

y = β0 + β1·x + β2·x² + … + βk·x^k + ϵ

The error term ϵ serves as a reminder that the polynomial will typically provide an estimate rather than the exact value of the dataset for any given value of x.
Polynomial regression is non-linear in the sense that x is not linearly related to f(x, β); the equation itself is still linear in the coefficients β.
On the other hand, non-linear regression is non-linear both in the equation and in the relationship between x and f(x, β).

When comparing models, choose the one with the smallest maximal error.

Some statistical considerations in linearity
The correlation coefficient r
The coefficient r is a measure of correlation, not a measure of linearity.
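
A quick numerical illustration of this point: a deliberately curved (quadratic) data set, invented here for the purpose, still yields a correlation coefficient close to 1.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = list(range(1, 11))
y_curved = [xi ** 2 for xi in x]   # clearly curved (quadratic) response

r = pearson_r(x, y_curved)
print(f"r = {r:.4f}")              # high r despite obvious curvature
```

This is why a residual plot is a more informative check of linearity than r alone.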

Model parameters should be interpreted only within the sample range.
The range of the procedure is validated by verifying that the analytical procedure provides acceptable precision, accuracy, and linearity when applied to samples containing analyte at the extremes of the range as well as within the range.

How much confidence do we have that the estimated slope accurately approximates the true slope?
This requires statistical inference in the form of a confidence interval.
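
A confidence interval for the slope can be sketched as slope ± t × (standard error of the slope), with n − 2 degrees of freedom. The data are hypothetical, the 95% level is an assumption, and the t values are copied from a standard two-sided t table.

```python
from math import sqrt

# Two-sided 95% t values from a standard t table (df -> t)
T_95 = {3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306}

def slope_confidence_interval(x, y):
    """95% confidence interval for the slope of an OLS straight-line fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    half_width = T_95[n - 2] * se_slope    # df = n - 2 for a two-parameter line
    return slope - half_width, slope + half_width

# Hypothetical actual vs. estimated concentrations (illustration only)
actual = [50.0, 75.0, 100.0, 125.0, 150.0]
estimated = [49.6, 75.4, 99.1, 125.9, 149.5]

low, high = slope_confidence_interval(actual, estimated)
print(f"95% CI for slope: {low:.4f} to {high:.4f}")
```

For an accuracy assessment of estimated versus actual concentration, the interval would then be compared with the range around 1.0 specified in the validation protocol.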
LOD LOQ

Statistical comparison between 2 methods

Statistical comparison between 2 methods
Student t test

Statistical comparison

Statistical comparison of the results obtained by the proposed method and the reported one (both methods n = 5; t and F at P = 0.05)

Parameter          Reported HPLC method   Proposed method
Mean ± SD          99.84 ± 1.89           98.07 ± 1.55
RSD (%)            1.89                   1.58
Variance           3.57                   2.40
Student's t-test   1.621 (tabulated value 2.306, df = total n − 2)
F-value            1.49 (tabulated value 6.39, df = n − 1)

Since 1.621 is less than the tabulated 2.306, there is no significant difference between the 2 procedures.
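
The t and F statistics in this comparison can be approximately reproduced from the summary figures alone. This sketch assumes a pooled-variance two-sample t-test; because it starts from the rounded means and SDs, the t value may differ from the tabulated 1.621 in the third decimal.

```python
from math import sqrt

# Summary statistics quoted in the table (n = 5 for each method)
n1 = n2 = 5
mean1, sd1 = 99.84, 1.89   # reported HPLC method
mean2, sd2 = 98.07, 1.55   # proposed method

var1, var2 = sd1 ** 2, sd2 ** 2

# F ratio: larger variance over smaller variance
f_value = max(var1, var2) / min(var1, var2)

# Pooled-variance two-sample t statistic, df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_value = abs(mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

T_TABLE, F_TABLE = 2.306, 6.39   # tabulated values quoted on the slide
print(f"t = {t_value:.3f} (tabulated {T_TABLE}), F = {f_value:.2f} (tabulated {F_TABLE})")
print("No significant difference:", t_value < T_TABLE and f_value < F_TABLE)
```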
A two-sample t-test is used when the data of the two samples are statistically independent, while the paired t-test is used when the data are in the form of matched pairs.

Statistical comparison
Results of ANOVA (single factor) for comparison of the proposed methods and
the reported one for drug determination.

Source of variation   DF   SS      MS     F      F-critical
Between groups        3    12.70   4.23   1.70   3.34
Within groups         14   34.88   2.49
Total                 17   47.58

DF: degrees of freedom; SS: sum of squares; MS: mean square.
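
The mean squares and F ratio in the ANOVA table follow directly from the sums of squares and degrees of freedom. The short check below uses only the figures quoted in the table.

```python
# Sums of squares and degrees of freedom quoted in the ANOVA table
ss_between, df_between = 12.70, 3
ss_within, df_within = 34.88, 14

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_ratio = ms_between / ms_within

F_CRITICAL = 3.34                      # tabulated value quoted in the table
print(f"MS between = {ms_between:.2f}, MS within = {ms_within:.2f}, F = {f_ratio:.2f}")
print("Significant difference between methods:", f_ratio > F_CRITICAL)
```

Since the computed F is below the critical value, the methods do not differ significantly at the stated level.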

Statistical comparison

Outliers

Outliers
Factors to be considered when investigating an outlying result include:
• human error,
• instrumentation error,
• calculation error,
• product or component deficiency,
• precision and accuracy of the procedure,
• the USP or in-house Reference Standard,
• controls, process and analytical trends, and the specification limits.

If an assignable cause due to the analytical procedure can be identified, then retesting may be performed, and the data may be invalidated and eliminated from subsequent calculations.

Outliers
The N values comprising the set of observations under examination are arranged in ascending order:
x1 < x2 < … < xN
Testing the largest observation as an outlier:
(xN − x(N−1)) / (xN − x1)

Testing the smallest observation as an outlier:
(x2 − x1) / (xN − x1)

Testing the largest observation as an outlier avoiding the smallest observation:
(xN − x(N−1)) / (xN − x2)

Testing the smallest observation as an outlier avoiding the largest observation:
(x2 − x1) / (x(N−1) − x1)

Outliers

Outliers

(51.8 − 49.9) / (51.8 − 49.8) = 0.95

The critical value for three values, using a type I error rate of 0.05 and assuming a normal distribution, is 0.941.

Since the computed value of 0.95 exceeds 0.941, the measurement of vial 3 is identified as an outlier.
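
The worked example can be reproduced with a short sketch. It assumes the statistic on the slide is the Dixon-type ratio (gap to the nearest neighbour divided by the full spread) and that the three vial results are 49.8, 49.9 and 51.8, as implied by the arithmetic shown; `dixon_ratios` is a hypothetical helper name.

```python
def dixon_ratios(values):
    """Return Dixon-type ratios for the largest and the smallest observation."""
    x = sorted(values)
    spread = x[-1] - x[0]
    largest = (x[-1] - x[-2]) / spread   # gap above the second-largest value
    smallest = (x[1] - x[0]) / spread    # gap below the second-smallest value
    return largest, smallest

# The three values implied by the worked example on the slide
vials = [49.8, 49.9, 51.8]
CRITICAL_N3 = 0.941   # critical value quoted on the slide (n = 3, alpha = 0.05)

r_largest, r_smallest = dixon_ratios(vials)
print(f"ratio for largest value = {r_largest:.2f}")
print("Largest value flagged as outlier:", r_largest > CRITICAL_N3)
```

The critical value depends on the number of observations and the chosen error rate, so it has to be looked up again for any other sample size.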

Outliers

Thanks
