Surrogate Test Method
Standard Practice for
Recognizing Surrogate Test Methods
AASHTO
American Association of State Highway and Transportation Officials
555 12th Street NW, Suite 1000
Washington, DC 20004
1. SCOPE
1.1. This practice describes a method to assess the capability of a surrogate test to be used as a
substitute for a standard test. The surrogate tests must demonstrate the capability to provide test
results similar to the standard test method with a minimum of additional variability. The test
results from the standard and the surrogate are compared and the error in their relationship is
quantified.
1.2. It may be desirable to use a surrogate test to monitor material quality during construction or
production, especially if the standard test method uses complex/expensive equipment, requires
time-consuming sample preparation and testing techniques, uses hazardous materials, or involves
destroying a portion of the material. This standard practice is intended to be applied for
construction quality control but may also be used for acceptance if specifically approved by the
Agency.
1.3. The procedure in this standard should be followed in the evaluation of potential surrogate test
methods and to set parameters for their use.
3. TERMINOLOGY
3.1. Definitions used herein are consistent with R 10; additional terms are provided as follows:
3.1.2. surrogate test-A test that has been statistically correlated with the results from a standard test
method but is performed in a different manner than the standard test method.
4.1. The procedure involves three parts: (1) comparing the testing repeatability and reproducibility of
the standard test to the repeatability and reproducibility of the surrogate test, (2) identifying the
equivalence limit (E) for the test, and (3) measuring the equivalence (equivalence upper confidence
limit (EUCL) and equivalence lower confidence limit (ELCL)) in the means and variance relationship
between the standard and surrogate tests. Every test method has inherent variability. In using a
surrogate test method, the variability of the surrogate test method adds a component of variability
to the standard test results. If that variability is minimal compared to the benefits of the surrogate
test, the surrogate test is acceptable for use. This procedure identifies and quantifies the added
variability, establishes a recommended limit to that variability, and performs a hypothesis test to
determine whether the difference between the standard and surrogate test results exceeds the
identified limit.
4.2. Step 1-Compare the testing repeatability and reproducibility of the standard test to the
repeatability and reproducibility of the surrogate test. Step 1 provides confidence that the surrogate
test can be performed with similar variability, either by comparing the repeatability and
reproducibility of the two methods or by comparing the standard and surrogate test methods using a
statistical F-test on the variances. R 9 Appendix X2 describes the F-test procedure; it is also a
standard statistical method available in common spreadsheet and statistical analysis programs.
4.2.1. Identify single-operator and multilaboratory precision from the standard test method, if available.
See Section 4.2.2 if the standard test method does not have a precision statement. Compare the
maximum ratio of the variances of the standard and surrogate procedures to an Fcritical of 1.84
(Fcritical from an F-test, assuming alpha = 0.05 (95 percent confidence) and numerator and
denominator values of n = 30).
Note 1-It is preferred to use this practice with a standard test method that has an established
precision statement.
4.2.1.1. Identify single-operator and multilaboratory precision of the surrogate test using ASTM C670.
4.2.1.2. If the repeatability standard deviation (Sr) and reproducibility standard deviation (SR) from the
standard test are greater than those from the surrogate test, report the repeatability standard
deviation (Sr) and reproducibility standard deviation (SR) from the surrogate test. Perform an F-test
to compare the variances as noted in Section 4.2.1, but note that the surrogate test has less variance.
4.2.1.3. If the repeatability standard deviation (Sr) and reproducibility standard deviation (SR) from the
standard test are less than the repeatability standard deviation (Sr) and reproducibility standard
deviation (SR) from the surrogate test, perform an F-test to statistically compare the repeatability and
reproducibility variances. Confirm that the repeatability and reproducibility variances of the
results obtained from the standard test and the surrogate test are statistically the same at 95 percent
confidence, assuming numerator and denominator values of n = 30.
4.2.2. If the standard test method does not have a precision and bias statement, compare the variability of
the tests by performing a statistical F-test on a minimum of 30 split samples that have been tested
using the surrogate and standard tests. Confirm that the variances (s^2, the square of the standard
deviation) of the results obtained from the standard test and the surrogate test are statistically the
same at 95 percent confidence.
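The variance comparison in Step 1 can be sketched as follows. This is a minimal illustration, not the R 9 Appendix X2 procedure itself: the function names are hypothetical, the Fcritical of 1.84 is simply the value quoted in Section 4.2.1, and the two example ratios reuse the values reported in Section X1.1.2.1.

```python
from statistics import variance

# Fcritical quoted in Section 4.2.1 (alpha = 0.05, n = 30); taken as given here.
F_CRITICAL = 1.84


def max_variance_ratio(s_standard, s_surrogate):
    """Return the larger variance divided by the smaller variance."""
    v_standard = s_standard ** 2
    v_surrogate = s_surrogate ** 2
    return max(v_standard, v_surrogate) / min(v_standard, v_surrogate)


def variances_similar(s_standard, s_surrogate, f_critical=F_CRITICAL):
    """True when the maximum variance ratio does not exceed Fcritical."""
    return max_variance_ratio(s_standard, s_surrogate) <= f_critical


def split_sample_ratio(standard_results, surrogate_results):
    """Section 4.2.2 case: variance ratio computed from raw split-sample results."""
    v_standard = variance(standard_results)
    v_surrogate = variance(surrogate_results)
    return max(v_standard, v_surrogate) / min(v_standard, v_surrogate)


# Values reported in Section X1.1.2.1
print(round(max_variance_ratio(1.45, 1.40), 2))  # repeatability ratio, 1.07
print(variances_similar(1.45, 1.40))             # True
print(variances_similar(3.4, 1.9))               # False (ratio is about 3.2)
```

The ratio is always formed with the larger variance in the numerator, so one critical value can be applied regardless of which method is more variable.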
4.3. Step 2-Identify E and n. The appropriate equivalence limit, E, shall be no more than two times
the reproducibility standard deviation of the standard test (2 * SR), unless another value is
specifically approved by the agency implementing the surrogate test. E should also be identified
considering the practical significance of the test values. Identify the minimum number of split
samples (n) using ASTM E2935 Section X2.1, with sigma = Sr for split samples, alpha = 0.05, and
E, for a power level of at least 90 percent.
Note 2-See Section X1.1.3 for an example of determining n using power curves.
4.4. Step 3-Perform a hypothesis test to identify whether the error in the relationship (means and slope)
between the standard and surrogate tests is within the equivalence limit (E) defined in
Section 4.3, using at least the minimum number of samples (n) identified in Section 4.3. Step 3 defines
the relationship between the standard and surrogate test results and provides a limit to the measure
of potential error between the two test results. The procedure is similar to simple regression in that
it evaluates the linear statistical relationship between the standard and surrogate test methods.
When performing this test, the residuals should be checked for normality.
4.4.1. Perform a means equivalence two one-sided test (TOST) using split samples in accordance with
Section 7 of ASTM E2935 and report the equivalence upper confidence limit (EUCL) and
equivalence lower confidence limit (ELCL) for the difference in means. Using the same data,
perform a paired t-test and an F-test as described in R 9 and report the p-values.
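The means equivalence check can be sketched as below. This is an illustration under stated assumptions rather than the ASTM E2935 Section 7 procedure: the function name and the t_crit parameter are hypothetical, the one-sided t quantile is looked up by the user (1.895 for alpha = 0.05 and 7 degrees of freedom), and the eight value pairs are taken from the excerpt of Table X1.4 on the assumption that its first two data columns are the standard and surrogate results. Eight pairs are far fewer than the minimum n required by Section 4.3 and are used only to keep the example short.

```python
from math import sqrt
from statistics import mean, stdev


def tost_paired(standard, surrogate, e_limit, t_crit):
    """Two one-sided tests on paired (split-sample) differences.

    t_crit is the one-sided Student t quantile for the chosen alpha and
    n - 1 degrees of freedom, supplied by the caller.
    Returns (lower limit, upper limit, equivalent?).
    """
    diffs = [y - x for x, y in zip(standard, surrogate)]
    n = len(diffs)
    d_bar = mean(diffs)
    std_err = stdev(diffs) / sqrt(n)
    lcl = d_bar - t_crit * std_err
    ucl = d_bar + t_crit * std_err
    # Equivalence is supported only if both limits fall inside (-E, +E).
    equivalent = (-e_limit < lcl) and (ucl < e_limit)
    return lcl, ucl, equivalent


# Illustrative pairs (assumed to be standard, surrogate) from the Table X1.4 excerpt
standard = [-36.04, -35.40, -35.38, -31.07, -30.97, -30.96, -23.54, -22.74]
surrogate = [-36.60, -36.01, -35.92, -30.89, -30.20, -30.95, -24.56, -22.39]

lcl, ucl, ok = tost_paired(standard, surrogate, e_limit=1.0, t_crit=1.895)
print(f"ELCL = {lcl:.2f}, EUCL = {ucl:.2f}, within +/-E: {ok}")
```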
4.4.2. Perform a slope equivalence test using split samples in accordance with ASTM E2935 Section 8
and report the equivalence upper confidence limit (EUCL) and equivalence lower confidence limit
(ELCL) for the slope.
4.4.3. Plot the dot plot for residuals, residuals versus x, and line slope comparison as noted in ASTM
E2935 Section 8.3 to evaluate any outliers or heteroscedasticity.
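The residuals used for the plots in Section 4.4.3 can be obtained from a straight-line fit, sketched below. This is an ordinary least-squares fit written for illustration; the fitting method and the confidence limits of ASTM E2935 Section 8 are not reproduced here, and the function name and sample data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, residuals), where each residual is the
    observed y minus the fitted y for the same x.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


# Hypothetical split-sample results: x = standard test, y = surrogate test
x = [-34.0, -31.5, -28.0, -25.5, -22.0]
y = [-34.4, -31.2, -28.3, -25.1, -22.2]

slope, intercept, residuals = fit_line(x, y)
for xi, ri in zip(x, residuals):
    # Residuals that grow or shrink with x suggest heteroscedasticity;
    # isolated large residuals suggest outliers.
    print(f"x = {xi:6.1f}  residual = {ri:+.3f}")
```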
5.1. A calibration and standardization frequency for the surrogate test shall be defined following the
requirements in R 61 and RP-1. Consider the frequency of calibration required for the standard test
method and any uncertainty growth in the surrogate test method. It is recommended to start with a
very high frequency (> 1 per 30 samples) and adjust the frequency if the results remain consistent.
5.1.1. The surrogate test should describe the range of values and types of materials that were used in
calibrating the test. The surrogate test should not be used outside the range of values used in
calibrating the surrogate test.
5.1.2. Recalibration should be performed if the material being tested varies greatly from the material that
was used to calibrate the surrogate test. If there is a concern of material changes, a quick one-test
check can be performed to confirm the relationship is the same. The process in Section 4 should be
repeated if the materials change outside the range previously considered or there is concern the
relationship is not remaining consistent over time.
6.1. Precision-Records of precision shall be maintained and made available on request. Records
should include total number of samples, sample type/description, and origin. Results provided
should include the mean and range of data values, and the results of the three-step analysis
performed in Section 4.
6.2. Bias-Bias should be addressed by any algorithm that was used to develop the surrogate test
value, so it should not be a consideration, unless the standard test method bias must be considered
in the referee testing. In that case, issues related to bias should be documented and made available
on request.
6.3. Limitations-Record of any limitations or conditions that increase variability of the test shall be
maintained and provided with the equipment.
7. KEYWORDS
APPENDIX
(Nonmandatory Information)
X1.1. Example 1: iCCL (incremental creep for cracking at low temperature) surrogate test that provides
continuous low-temperature PG similar to BBR using a DSR.
X1.1.1. For this example only, continuous low-temperature PG (LTPGc) of R 29 is considered. The
surrogate test for determining LTPGc is iCCL (incremental creep for cracking at low temperature).
X1.1.2. Step 1-Compare the testing repeatability and reproducibility of the standard test to the
repeatability and reproducibility of the surrogate test.
X1.1.2.1. Table X1.1 presents the repeatability and reproducibility coefficients of variation for the standard
and surrogate tests. The testing variability of the surrogate test is less than that of the standard test. A
comparison of the variances of the standard and surrogate tests provides a maximum F value of
1.45^2/1.40^2 = 1.07 for Sr and 3.4^2/1.9^2 = 3.2 for SR, as compared to an Fcritical of 1.84. Therefore, the
variances are statistically similar for repeatability and statistically different for
reproducibility. This is for information only since the variance of the surrogate is less.
The first set of values are the precision estimates for continuous low-temperature PG, calculated from the
precision estimates of m-value provided in T 313, according to ASTM D4460.
The second set of values (in parentheses) are the single-operator and multilaboratory coefficients of
variation of iCCL from the 2019 round robin study.
X1.1.3.1. E is limited to two times SR. E is identified using the d2s of the BBR SR: -22 * 9.62/100 = 2.11
degrees (in magnitude). Then divide 2.11 by 2 to get the ±E value. For simplicity, E is determined to be
±1 degree. Since PG grades are typically identified in increments of 6 degrees (e.g., -22, -28, -34),
±1 degree also appears practical.
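The arithmetic in Section X1.1.3.1 can be laid out as a worked equation. This sketch assumes that 9.62 is the d2s limit expressed in percent and that it is applied to the magnitude of a -22 degree grade; the unrounded product is 2.1164, which the text reports as 2.11.

```latex
\begin{aligned}
\text{d2s (degrees)} &= 22 \times \frac{9.62}{100} = 2.1164 \approx 2.11 \\
\pm E &= \pm\frac{2.1164}{2} \approx \pm 1.06 \;\longrightarrow\; \pm 1 \ \text{degree (rounded for simplicity)}
\end{aligned}
```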
X1.1.3.2. The minimum number of split samples, n, is determined by developing power curves for the
condition of sigma (Sr) = 0.6, alpha = 0.05, and E = 1 as shown in Table X1.2 and Figure X1.1.
X1.1.3.3. The minimum sample size is the value that provides a minimum 90 percent power at the value of
sigma. In Figure X1.1, the absolute difference of 0.6 (x-axis) provides a power of 90 percent
(y-axis) near a value of n = 20. Therefore, minimum n = 20. Note that a value of n greater than 20
is also acceptable.
[Figure X1.1: power curves, power (y-axis, 0.0 to 0.9) versus absolute difference (x-axis, 0.0 to 1.2),
including a curve labeled n = 50]
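The sample-size reading in Section X1.1.3.3 can be approximated numerically. The sketch below is not the ASTM E2935 Section X2.1 procedure: it uses a normal approximation to the power of a paired two one-sided test, the function names are hypothetical, and evaluating the power at a true absolute difference of 0.6 follows the way the example reads the power curve.

```python
from math import sqrt
from statistics import NormalDist


def tost_power(n, sigma, e_limit, true_diff, alpha=0.05):
    """Approximate TOST power for n paired samples (normal approximation)."""
    nd = NormalDist()
    std_err = sigma / sqrt(n)
    z = nd.inv_cdf(1 - alpha)
    power = (nd.cdf((e_limit - true_diff) / std_err - z)
             + nd.cdf((e_limit + true_diff) / std_err - z) - 1)
    return max(power, 0.0)


def min_samples(sigma, e_limit, true_diff, target=0.90, alpha=0.05, n_max=1000):
    """Smallest n whose approximate power reaches the target, or None."""
    for n in range(2, n_max + 1):
        if tost_power(n, sigma, e_limit, true_diff, alpha) >= target:
            return n
    return None


# Conditions from Section X1.1.3.2: sigma = 0.6, alpha = 0.05, E = 1
print(min_samples(sigma=0.6, e_limit=1.0, true_diff=0.6))  # 20 under this approximation
```

An exact calculation based on the t distribution would give slightly lower power at each n, so the approximation should be treated as a lower bound on the sample size rather than a replacement for the power curves.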
X1.1.4. Step 3-Perform a hypothesis test to identify whether the error in the relationship (means and slope)
between the standard and surrogate tests is within the defined equivalence limit (E) using at least
the minimum number of split samples (n) determined in Section X1.1.3.
X1.1.4.1. Table X1.3 shows a portion of the 60 samples from the data, and the results of the TOST analysis
comparing the means. Table X1.3 also shows the p-value for a standard F-test and paired t-test for
the same data. The results of the TOST analysis are within the ELCL and EUCL. The results of
the F-test and paired t-test also indicate that the variance and means are statistically similar.
n                              60    60
Degrees of freedom (f1, f2)    59    59
X1.1.4.2. Table X1.4 shows a portion of the 60 samples from the data, and the results of the analysis
comparing the slope. The results of the slope analysis are within the ELCL and EUCL.
 2   -36.04   -36.60   -8.27   -8.91   68.40   79.44   73.71   -36.34   -0.18   -35.98   -36.72
     -35.40   -36.01   -7.62   -8.33   58.11   69.33   63.47   -35.66   -0.24   -35.33   -36.01
 4   -35.38   -35.92   -7.61   -8.23   57.94   67.71   62.64   -35.65   -0.18   -35.32   -36.00
     -31.07   -30.89   -3.29   -3.21   10.84   10.28   10.56   -31.13    0.17   -30.99   -31.28
 8   -30.97   -30.20   -3.19   -2.51   10.19    6.31    8.02   -31.03    0.57   -30.89   -31.17
 9   -30.96   -30.95   -3.18   -3.27   10.14   10.67   10.41   -31.02    0.05   -30.88   -31.17
58   -23.54   -24.56    4.23    3.13   17.93    9.78   13.24   -23.26   -0.90   -23.44   -23.06
60   -22.74   -22.39    5.04    5.30   25.37   28.04   26.67   -22.42    0.02   -22.64   -22.19
R              R2
1.046          1.372
El 0.9 slope
0.808          1.672
LCL(theta)     UCL(theta)
0.786          0.829
ELCL(beta)     EUCL(beta)
1.002          1.092
ELCL >         EUCL <
0.9            1.1
LCL(a)         UCL(a)
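The slope summary values above appear to be linked by an angle transform, theta = arctan(slope), with the limits on theta converted back to slope limits by the tangent. This reading is inferred from the tabulated numbers and is not stated in the text; the variable names below are hypothetical, and small differences from the tabulated limits are expected because the theta values are rounded to three decimals.

```python
from math import atan, tan

# Values read from the slope summary of Table X1.4
slope_estimate = 1.046
lcl_theta, ucl_theta = 0.786, 0.829      # limits on the angle, in radians
slope_low, slope_high = 0.9, 1.1         # acceptance limits for the slope

theta = atan(slope_estimate)
print(round(theta, 3))                   # 0.808, matching the tabulated value

# Convert the angle limits back to slope limits
elcl_beta = tan(lcl_theta)
eucl_beta = tan(ucl_theta)
print(f"ELCL(beta) ~ {elcl_beta:.3f}, EUCL(beta) ~ {eucl_beta:.3f}")

# The slope is judged equivalent when both limits fall inside (0.9, 1.1)
slope_equivalent = slope_low < elcl_beta and eucl_beta < slope_high
print(slope_equivalent)
```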
X1.1.4.3. Figure X1.2 shows the graphs of the surrogate vs. standard tests, the dot plot of residuals, and the
slopes of the standard/surrogate relationship. The residuals were checked for normality both
statistically (Shapiro-Wilk p-value = 0.2576) and with a Q-Q plot as shown in Figure X1.3.
[Figure X1.2: surrogate (iCCL) versus standard (BBR) test results, dot plot of residuals, and slope comparison]
[Figure X1.3: normal Q-Q plot of the residuals]
Quantiles
100.0%   maximum     0.98
 99.5%               0.98
 97.5%               0.9695
 90.0%               0.798
 75.0%   quartile    0.29
 50.0%   median     -0.03
 25.0%   quartile   -0.2975
 10.0%              -0.65
  2.5%              -0.94425
  0.5%              -0.96
  0.0%   minimum    -0.96
Summary Statistics
Mean              0.0005
Std Dev           0.4908844
Std Err Mean      0.0633729
Upper 95% Mean    0.1273089
Lower 95% Mean   -0.126309
N                 60
X1.2.1. Figure X1.4 shows a low correlation between the standard and surrogate tests. Based on the
regression slope, this test would not be an acceptable surrogate under this standard.
[Figure X1.4: scatter plot of surrogate (y-axis, 0 to 0.9) versus standard (x-axis, 0 to 1.2) test results,
with fitted line y = 0.1986x + 0.2826 and R^2 = 0.0497]