Analytical Data
CPD
tools for procedure validation
Presented by
Rafeek Fahmi, PhD
Egyptian Drug Authority
25-8-2021
Session 1
1
Statistical estimation
Estimation
Normal distribution of population
Analytical procedure distribution (Unknown)
(estimated)
Mean: point estimator of μ
(Ȳ − τ): point estimator of bias
SD: point estimator of σ
2
Statistical inference
(= estimation, prediction)
Estimation:
How? By calculation based on a confidence interval, using the t-table.
Example of prediction
3
Errors in analytical validation
Experimental design:
Calibration standards
(regression equation)
4
Variance and SD
5
Normal distribution
• Confidence limit (of population) = μ ± z·σ
• X − μ = z·σ
Estimation by using a confidence interval
Using the t-table
(example of accuracy and precision estimation in USP 1210)
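As a minimal sketch of estimation with a confidence interval, the following computes the point estimate and a two-sided 95% confidence interval for the mean and for the bias from five replicate results. The data, the reference value `tau`, and the function name are hypothetical; the critical value 2.776 is the tabulated two-sided 95% t value for 4 degrees of freedom.

```python
import math
import statistics

# Tabulated two-sided 95% t value for df = n - 1 = 4, read from a t-table.
T_CRIT_DF4 = 2.776

def mean_confidence_interval(values, t_crit):
    """Return (mean, lower, upper) for the mean of `values`."""
    n = len(values)
    mean = statistics.mean(values)      # point estimator of mu
    sd = statistics.stdev(values)       # sample SD, point estimator of sigma
    half_width = t_crit * sd / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

# Hypothetical replicate results and hypothetical reference value.
results = [99.2, 100.1, 99.8, 100.4, 99.5]
tau = 100.0

mean, lower, upper = mean_confidence_interval(results, T_CRIT_DF4)
bias = mean - tau                       # point estimator of bias
print(f"mean = {mean:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print(f"bias = {bias:.2f}, 95% CI = ({lower - tau:.2f}, {upper - tau:.2f})")
```

The interval for the bias is simply the interval for the mean shifted by `tau`, since the reference value is treated as known.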
7
T table
8
We will discuss…
9
• Accuracy and precision statistics
10
Accuracy and precision statistics
Value from a reference standard or value from an established procedure
11
Accuracy and precision statistics
Analytical procedure distribution
(estimated)
12
Accuracy and precision statistics
An illustrative example.
The gray area represents the expected probability that a measurement will fall within −15% and +15% (an example of a limit in bioanalysis), on the basis of the observed distribution of the method, characterized by the bias and precision estimated in the validation phase.
The unshaded distribution represents the true (unknown) distribution of the procedure under validation.
Note that the estimated bias and precision of the procedure differ from the "true bias" and "true precision".
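The gray-area probability can be sketched as follows, assuming the measurement error is normally distributed with the bias and precision estimated in validation. The bias (−3%) and SD (6%) are hypothetical values chosen for illustration; only the ±15% limits come from the slide.

```python
from statistics import NormalDist

def probability_within_limits(bias, sd, lower, upper):
    """Probability that a relative error falls between `lower` and `upper`,
    for a normal distribution centred on the estimated bias."""
    dist = NormalDist(mu=bias, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Hypothetical estimated bias and precision (in %), with the +/-15% limits.
p = probability_within_limits(bias=-3.0, sd=6.0, lower=-15.0, upper=15.0)
print(f"Expected proportion within limits: {p:.4f}")
```

Because the calculation uses estimated rather than true parameters, the result is itself only an estimate of the proportion of future results inside the limits.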
13
Accuracy Requirements
[Figure: accuracy (bias) with its confidence limits and the procedure limits]
14
Accuracy and precision statistics
Accuracy and precision example in USP 1210
15
Some Statistics considerations in
accuracy and precision
16
Some Statistics considerations in
accuracy and precision
17
T table
18
Accuracy results in example (USP 1210)
[Figure: accuracy (bias) with its confidence limits and the procedure limits]
Since the computed confidence interval from −9.94 to −4.44 mg/g
20
Chi-square table: 2 pages
21
Chi-square table: 2 pages
22
Combined validation of accuracy and
precision
Prediction interval Tolerance interval
End of USP example
23
Accuracy and precision statistics
Confidence limits are limits within which we expect a given
population parameter, such as the mean, to lie.
Statistical tolerance limits are limits within which we expect a
stated proportion of the population to lie
Value from a reference standard or value from an established procedure
24
Examples on accuracy profile
25
Accuracy and precision statistics
The requirements in USP 1225
• Assessment of accuracy can be accomplished in a variety of ways, including
evaluating the recovery of the analyte (percent recovery) across the range of
the assay, or
• evaluating the linearity of the relationship between estimated and actual
concentrations. The statistically preferred criterion is that the confidence
interval for the slope be contained in an interval around 1.0, or alternatively,
that the slope be close to 1.0.
In either case, the interval or the definition of closeness should be specified in the
validation protocol.
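The slope criterion can be sketched as follows: fit estimated against actual concentration by least squares, compute a two-sided 95% confidence interval for the slope, and check whether it lies inside a protocol interval around 1.0. The data and the 0.95 to 1.05 acceptance interval are hypothetical; 3.182 is the tabulated two-sided 95% t value for 3 degrees of freedom.

```python
import math

# Tabulated two-sided 95% t value for df = n - 2 = 3, read from a t-table.
T_CRIT_DF3 = 3.182

def slope_confidence_interval(x, y, t_crit):
    """Return (slope, lower, upper) for a simple least squares line."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares and standard error of the slope.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    half_width = t_crit * se_slope
    return slope, slope - half_width, slope + half_width

# Hypothetical actual and estimated concentrations (% of nominal).
actual = [50.0, 75.0, 100.0, 125.0, 150.0]
estimated = [49.6, 75.3, 99.8, 125.4, 149.9]

slope, lower, upper = slope_confidence_interval(actual, estimated, T_CRIT_DF3)
# Hypothetical protocol criterion: CI for the slope within 0.95 to 1.05.
meets_criterion = 0.95 <= lower and upper <= 1.05
print(f"slope = {slope:.4f}, 95% CI = ({lower:.4f}, {upper:.4f}), pass = {meets_criterion}")
```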
Regression equation
27
Actual random
error and
regression residual
28
The probability distribution of ϵ
USP 1225:
Data from the regression line itself may be
helpful to provide mathematical estimates of
the degree of linearity.
The correlation coefficient, y-intercept, slope
of the regression line, and residual sum of
squares should be submitted.
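The four quantities listed above can be computed from calibration data as in the following sketch. The concentrations and responses are hypothetical, and the function name is illustrative only.

```python
import math

def regression_summary(x, y):
    """Slope, y-intercept, correlation coefficient and residual sum of
    squares for a simple least squares line y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r = sxy / math.sqrt(sxx * syy)
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return {"slope": slope, "intercept": intercept, "r": r, "rss": rss}

# Hypothetical calibration standards: concentration and instrument response.
conc = [2.0, 4.0, 6.0, 8.0, 10.0]
response = [0.101, 0.198, 0.305, 0.397, 0.502]

summary = regression_summary(conc, response)
for name, value in summary.items():
    print(f"{name}: {value:.6g}")
```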
29
predicted
30
Important in regression (excel)
31
• ANOVA
32
Some Statistics considerations in regression
Least squares regression assumes that the error in the x values is insignificant compared with that of the y values.
In addition, the error associated with the y values must be normally distributed.
• Normality is hard to test for statistically with only small data sets.
• If there is doubt about the normality, it may be sufficient to replace single y values with averages of three or more for each value of x, as mean values tend to be normally distributed even where individual results are not.
The magnitude of the error in the y values should also be constant across the range of interest, i.e. the standard deviation should be constant.
Simple least squares regression gives equal weight to all points; this will not be appropriate if some points are much less precise than others.
33
Some Statistics considerations in linearity
34
Some Statistics considerations in linearity
Examples of residual plots
35
Residual plot problem and solution
Change of variance of ϵ
36
Residual plot problem and solution
[Figure panels: residual plot against y; residual plot against y (outlier deleted)]
37
Some Statistics considerations in linearity
2. Nonlinear Least Squares
Nonlinear WLS by Linearization
Nonlinear WLS by other Methods
The Gradient or Steepest Descent Method
Newton’s Method
The Gauss-Newton Method
The Levenberg-Marquardt Method
38
Some Statistics considerations in linearity
Kth order polynomial, which takes the form:
y = β0 + β1·x + β2·x² + … + βk·x^k + ϵ
The error ϵ serves as a reminder that the polynomial will typically provide an estimate rather than the exact value of the dataset for any given value of x.
Polynomial regression is non-linear in the sense that f(x, β) is not a linear function of x; the equation itself is still linear in the parameters β.
On the other hand, non-linear regression is non-linear both in the parameters β and in x.
39
When comparing models, choose the one with the smallest maximum error.
40
Some Statistics considerations in linearity
The correlation coefficient r
The coefficient r is a measure of correlation, not a measure of linearity.
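A short sketch of why r alone does not demonstrate linearity: data that follow a pure quadratic still give a correlation coefficient close to 1, while the residuals from the straight-line fit show a clear systematic pattern. The data set (y = x² for x = 1 to 10) is a hypothetical illustration.

```python
import math

def correlation_and_residuals(x, y):
    """Return r and the residuals of the least squares straight line."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r = sxy / math.sqrt(sxx * syy)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return r, residuals

# Hypothetical, clearly curved data: y = x squared.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

r, residuals = correlation_and_residuals(x, y)
print(f"r = {r:.4f}")
print("residuals:", [round(e, 1) for e in residuals])
```

The residuals are positive at both ends and negative in the middle, a U shape that a residual plot reveals even though r is high.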
41
Model parameters should be interpreted only within the sample range.
The range of the procedure is validated by verifying that the analytical procedure provides acceptable precision, accuracy, and linearity when applied to samples containing analyte at the extremes of the range as well as within the range.
43
Statistical comparison between 2 methods
44
Statistical comparison between 2 methods
Student t test
Statistical comparison
Parameter                                           Proposed method   method
Mean ± SD                                           99.84 ± 1.89      98.07 ± 1.55
RSD                                                 1.89              1.58
Variance                                            3.57              2.40
Student's t-test (2.306 from table, df total n−2)   1.621
F-value (6.39 from table, df n−1)                   1.49
Since 1.621 is less than the tabulated 2.306, there is no significant difference between the 2 procedures.
46
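The t and F values in the table can be reproduced approximately from the summary statistics, as in the sketch below. The sample size of 5 per method is an assumption inferred from the tabulated values (2.306 corresponds to 8 degrees of freedom and 6.39 to 4 and 4 degrees of freedom); it is not stated on the slide. Small differences from the slide's 1.621 come from rounding of the reported means and SDs.

```python
import math

def two_sample_t_and_f(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic and variance-ratio F statistic."""
    var1, var2 = sd1 ** 2, sd2 ** 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t = abs(mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    f = max(var1, var2) / min(var1, var2)   # larger variance in the numerator
    return t, f

# Summary statistics from the table; n = 5 per method is an assumption.
t_value, f_value = two_sample_t_and_f(99.84, 1.89, 5, 98.07, 1.55, 5)

T_TABLE = 2.306   # tabulated t, df = 8, as quoted in the table
F_TABLE = 6.39    # tabulated F, df = (4, 4), as quoted in the table
print(f"t = {t_value:.3f} (table {T_TABLE}), F = {f_value:.2f} (table {F_TABLE})")
print("significant difference in means:", t_value > T_TABLE)
print("significant difference in variances:", f_value > F_TABLE)
```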
The two-sample t-test is used when the data of two samples are statistically independent, while the paired t-test is used when the data are in the form of matched pairs.
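For matched pairs, the test works on the differences within each pair rather than on the two samples separately. A minimal sketch, with hypothetical results for five samples each measured by two methods, and 2.776 as the tabulated two-sided 95% t value for 4 degrees of freedom:

```python
import math
import statistics

# Tabulated two-sided 95% t value for df = n - 1 = 4, read from a t-table.
T_CRIT_DF4 = 2.776

def paired_t(a, b):
    """Paired t statistic: mean difference divided by its standard error."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / se_diff

# Hypothetical results: the same five samples measured by two methods.
method_a = [10.2, 11.5, 9.8, 12.1, 10.9]
method_b = [10.0, 11.1, 9.9, 11.8, 10.6]

t_paired = paired_t(method_a, method_b)
print(f"paired t = {t_paired:.3f}, significant: {abs(t_paired) > T_CRIT_DF4}")
```

Pairing removes the sample-to-sample variation from the comparison, which is why it is preferred when the same samples are analyzed by both methods.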
47
Statistical comparison
Results of ANOVA (single factor) for comparison of the proposed methods and
the reported one for drug determination.
Source of variation   DF*   SS**    MS***   F      F-critical
Between groups        3     12.70   4.23    1.70   3.34
Within groups         14    34.88   2.49
Total                 17    47.58
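The mean squares and F value in the ANOVA table follow directly from the sums of squares and degrees of freedom, as this short sketch shows. The input numbers are taken from the table; the function name is illustrative.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a single-factor ANOVA."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Values from the ANOVA table.
ms_between, ms_within, f_value = anova_f(12.70, 3, 34.88, 14)

F_CRITICAL = 3.34   # tabulated value quoted in the table
print(f"MS between = {ms_between:.2f}, MS within = {ms_within:.2f}, F = {f_value:.2f}")
print("significant difference between groups:", f_value > F_CRITICAL)
```

Since the calculated F is below F-critical, the table indicates no significant difference between the groups.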
48
Statistical comparison
49
Outliers
50
Outliers
Factors to be considered when investigating an outlying result include:
• human error,
• instrumentation error,
• calculation error,
• product or component deficiency,
• precision and accuracy of the procedure,
• the USP or in-house Reference Standard,
• controls, process and analytical trends, and the specification limits.
If an assignable cause due to the analytical procedure can be identified, then retesting may be performed, and the data may be invalidated and eliminated from subsequent calculations.
51
Outliers
The N values comprising the set of observations under examination are arranged in ascending order:
x1 < x2 < … < xN
Testing the largest observation as an outlier:
Testing the largest observation as an outlier avoiding the smallest observation:
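The formulas for these two cases did not survive on the slide. The sketch below assumes they are the Dixon-type ratios commonly written r10 and r11: the gap between the two largest values divided by the range, with the range measured either from the smallest observation or from the second smallest. The data are hypothetical, and the resulting ratio would be compared with a tabulated critical value for the given N.

```python
def dixon_ratios_largest(values):
    """Dixon-type ratios for testing the largest observation as an outlier.

    r10 uses the full range (xN - x1); r11 avoids the smallest
    observation by using (xN - x2) instead.
    """
    x = sorted(values)                 # ascending order: x1 <= x2 <= ... <= xN
    gap = x[-1] - x[-2]                # xN - x(N-1)
    r10 = gap / (x[-1] - x[0])
    r11 = gap / (x[-1] - x[1])
    return r10, r11

# Hypothetical set of five results with a suspiciously high value.
data = [99.6, 99.1, 103.2, 99.8, 99.4]

r10, r11 = dixon_ratios_largest(data)
print(f"r10 = {r10:.3f}, r11 = {r11:.3f}")
```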
52
Outliers
53
Outliers
0.941, the measurement of vial 3 is identified
as an outlier
54
Outliers
55
Thanks
56