JÉRÔME BOURSIER,*,‡ ANSELME KONATÉ,*,‡ GABRIELLA GOREA,* STÉPHANE REAUD,‡ EMMANUEL QUEMENER,*
FRÉDÉRIC OBERTI,*,‡ ISABELLE HUBERT–FOUCHARD,*,‡ NINA DIB,*,‡ and PAUL CALÈS*,‡
*University of Angers, IFR 132, HIFIH Laboratory (UPRES 3859), Angers; and ‡University Hospital, Hepatogastroenterology Department, Angers, France
Background & Aims: Fibroscan is a noninvasive device that assesses liver fibrosis by liver stiffness evaluation (LSE) with ultrasonographic elastometry. We evaluated LSE reproducibility and its influencing factors. Methods: LSE was performed by 4 experienced physicians (>100 LSEs) in 46 patients with chronic liver disease at 4 different anatomic sites. Additional LSEs were performed for ancillary aims, so that 534 LSEs were available. Results: Overall interobserver agreement for LSE results was considered excellent, with an intraclass correlation coefficient (Ric) of 0.93. Low LSE level, nonrecommended sites, LSE interquartile range >25%, and body mass index >25 independently decreased agreement. Thus, agreement was fair (Ric = 0.53) for LSE <9 kilopascals and excellent (Ric = 0.90) beyond. The best measurement site for LSE reproducibility was the median axillary line on the first intercostal space under the liver dullness upper limit, with the patient lying in dorsal decubitus. When LSE results were categorized into fibrosis Metavir stages, interobserver discordance was noticed in about 25% of the cases and was the highest for F2 and F3 stages and the lowest for F4. Intraobserver (Ric = 0.94), intersite (Ric = 0.92–0.98), and interequipment (Ric = 0.92) agreements for LSE results were excellent. Preliminary standard ultrasonography or probe pressure changes did not improve interobserver agreement. Conclusions: The best measurement site for LSE is the one generally used for liver biopsy. Reproducibility of LSE is globally excellent but is fair in patients with low liver stiffness. The fibrosis diagnosis by ultrasonographic elastometry in low stages or categorized into fibrosis Metavir stages must be interpreted with caution.

evaluate the reproducibility of liver stiffness measurement by transient elastography with Fibroscan. The secondary aims were to evaluate the following influencing factors: observer, device, liver stiffness value, anatomic conditions, body mass index (BMI), knowledge of previous measurement, location by standard ultrasonography, and probe pressure.

Patients and Methods
Patients
All patients included in the study were hospitalized in the Department of Hepato-Gastroenterology of the Angers University Hospital, France, for a clinical assessment of chronic liver disease. The only noninclusion criterion was ascites, which makes LSE impossible.7 The study protocol conformed to the Helsinki Declaration and was approved by the local Ethics Committee. All patients gave their informed consent before being included.

Observers
Four physicians specialized in hepatology performed the examinations, and each had already performed at least 100 LSEs before the study: judge A, 750; judge B, 109; judge C, 227; and judge D, 117 LSEs. In a reproducibility study, each patient should classically be examined by all the observers to guarantee the highest homogeneity. However, this design was not possible here because the duration of all LSEs performed by one judge was approximately 30–45 minutes.
Therefore, each patient was only examined by 2 judges. This needed 2 different judge pairs who consequently examined 2 different patient groups (Table 1); judges A and B examined a group of 22 patients (P1), and judges C and D examined a group of 24 patients (P2). Each patient was first examined by the most experienced judge (A for P1, C for P2) and second by
Table 1. Study Design: Assignment of Judges According to Patient and Observer Groups

                                 Observer group
Patient group   Patients (n)     First observer (O1)   Second observer (O2)
P1              22               Judge A               Judge B
P2              24               Judge C               Judge D

NOTE. Interobserver agreement was evaluated between O1 and O2.

Liver Stiffness Evaluation
Examination conditions. Examination conditions were those recommended by Echosens.7 The second observer was blinded to the previous results of the first observer.
Definitions. An LSE corresponded to all the measurements recorded during an examination. The LSE success rate (%) was calculated as the ratio of number of valid measurements/total number of measurements. The LSE result was expressed in kilopascals (kPa) and corresponded to the median of all the valid measurements performed within the examination. Metavir fibrosis stages were determined from the LSE result by applying liver stiffness cutoffs previously published by Foucher et al10 that were calculated in various causes of chronic liver disease as in our study. Interquartile range (IQR) is an index of intrinsic variability of LSE corresponding to the interval around the LSE result containing 50% of the valid measurements between the 25th and 75th percentiles. LSE IQR (kPa) was expressed when necessary as LSE IQR ratio (%): LSE IQR/LSE result. When 10 valid measurements were obtained, LSE was stopped, and the result was considered as reliable. However, an LSE could not exceed a total of 20 (valid and invalid) measurements. In a previous study, accuracy of LSE for the diagnosis of significant fibrosis or cirrhosis was similar whether 3, 5, or 10 valid measurements were recorded.11 Thus, as in previous studies,10,12 we defined an LSE result as acceptable when the LSE success rate was at least 30%, corresponding to 6 valid measurements according to our protocol.

apply a higher probe pressure on the skin and to perform another 15 measurements. High pressure was defined as a grade 9–12 on the 12-grade scale of recommended pressure. This procedure was performed only on the first site for which difficult LSE was observed.

Fibrosis Blood Tests
To determine which observers provided the reference LSE results, we evaluated the correlation between O1 or O2 LSE results and fibrosis blood tests taken as an independent fibrosis reference: APRI3 and FibroMeter.5

Statistics
Agreement was estimated with the intraclass correlation coefficient (Ric) for quantitative variables, kappa index for binary variables, and weighted kappa index (κw) for ordinal variables. Ric combines correlation and similarity but does not take into account chance-expected agreement unlike κ.14 Its meaning is Ric ≥0.87, excellent; 0.87 > Ric ≥0.71, good; 0.71 > Ric ≥0.50, fair; and Ric <0.5, poor agreement.15
We analysed the independent agreement predictors in the set of 368 LSEs performed by observers O1 and O2 in sites I–IV (Table 2). The statistical comparison of agreement required us to use observer discrepancy. Discrepancy was expressed in 2 ways, either crude difference (cΔLSE expressed in kPa): absolute value of (O2 LSE result – O1 LSE result) or relative difference (rΔLSE expressed in %): absolute value of [(O1 LSE result – O2 LSE result)/O2 LSE result]. Statistical software used was SPSS for Windows, version 11.5.1 (SPSS Inc, Chicago, IL).

Results
Patients
Forty-six patients were enrolled in the study, 22 in the patient group P1 and 24 in the group P2 (Table 1). Their characteristics at inclusion are summarized in Table 3. Globally, 534 LSEs were performed during the study (Table 2); 184 LSEs were performed on sites I–IV by each observer O1 and O2 (46
patients × 4 sites, Table 2): LSE results varied from 1.5–75 kPa (median, 11.9), and LSEs were acceptable for both O1 and O2 in 73.4% of the cases. O2 LSE result was considered as the reference LSE result because it was independently associated with the fibrosis blood tests (data not shown).

Liver Stiffness Evaluation Agreement
The material is the 184 LSEs performed by O1 and O2 observers on sites I–IV. Overall interobserver agreement was considered as excellent with Ric of 0.93 (Table 4; Figure 2). After conversion of LSE result into Metavir fibrosis stage,10 O1/O2 agreement was excellent (κw, 0.80), and fibrosis stages were concordant in around 75% of cases. O1/O2 discordance rate for Metavir fibrosis stages was high in F2 and F3, moderate in F0/1, and low in F4 (Figure 3).

Influencing Factors
Liver stiffness evaluation result. Interobserver agreement for LSE results increased linearly as a function of liver stiffness (Figure 4). Interobserver agreement was fair in reference LSE results <9 kPa with Ric of 0.53 and excellent when ≥9 kPa with Ric of 0.90.
Liver stiffness evaluation interquartile range. Interobserver agreement for LSE result was excellent when LSE IQR ratio was ≤25% and decreased in LSE IQR ratio >25% (Figure 5).
Body mass index. Interobserver agreement for LSE result was excellent when the BMI was <25 and decreased in BMI ≥25 (Figure 6).
Liver stiffness evaluation site. Interobserver agreement for LSE results was excellent in each site, but it was the highest in site IV and the lowest in site I (Table 5). LSE agreement was excellent between all sites (Table 5). LSE agreement was excellent between sites IV and II or III (Table 6), suggesting no influence of axillary line and intercostal space tested. LSE agreement between sites I and IV was lower than the previous ones, supporting again a lower reproducibility in lateral than in dorsal decubitus.
Screen withdrawal, additional ultrasonography. LSE agreements between LSE #4 and #5 or #6 were excellent (Ric = 0.99), suggesting no influence of knowledge of previous measurement on LSE result and no benefit of location by standard ultrasonography, respectively.
Liver stiffness evaluation repeatability. Agreement between LSE results of the first 10 and the last 10 valid measurements of LSE #7 was excellent (Ric = 0.94), suggesting an excellent repeatability and intraobserver agreement of LSE.
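
The agreement labels used in these results (excellent, good, fair) follow the Ric thresholds stated in the Statistics section. As a reading aid, a minimal Python sketch of that scale is given here. The function name `classify_ric` and the dictionary of example values are illustrative assumptions; the original analysis was run in SPSS, and this sketch only restates the published cutoffs.

```python
def classify_ric(ric: float) -> str:
    """Map an intraclass correlation coefficient (Ric) to an agreement label.

    Thresholds are those quoted in the Statistics section:
    >= 0.87 excellent, >= 0.71 good, >= 0.50 fair, otherwise poor.
    The function name is hypothetical, not taken from the article.
    """
    if ric >= 0.87:
        return "excellent"
    if ric >= 0.71:
        return "good"
    if ric >= 0.50:
        return "fair"
    return "poor"


# Ric values reported in the text, used here only as example inputs.
reported = {
    "overall interobserver": 0.93,
    "reference LSE < 9 kPa": 0.53,
    "reference LSE >= 9 kPa": 0.90,
    "repeatability (LSE #7)": 0.94,
}

for label, ric in reported.items():
    print(f"{label}: Ric = {ric:.2f} -> {classify_ric(ric)}")
```

Applied to the values above, the scale returns the same labels the text uses: fair below 9 kPa and excellent elsewhere.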
                    All                        P1                         P2                         Pᵃ
No. of patients     46                         22                         24                         -
Age (y)             34,45-50-62,83             34,44-50-61,74             34,45-51-64,83             .91
Male sex (%)        70                         59                         79                         .14
BMI (kg/m²)         16.9,21.5-23.8-26.4,34.6   17.0,20.8-22.5-25.1,34.6   16.9,23.1-24.9-28.5,33.5   .05
Cause (%)                                                                                            .13
  Alcohol           52                         59                         46
  Virus             26                         32                         21
  Others            22                         9                          33
AST (IU/L)          12,29-44-62,271            12,28-47-82,271            22,32-44-53,125            .67
Bilirubin (μmol/L)  3,8-11-26,329              3,8-10-43,329              4,7-11-18,136              .70
FibroMeter          0.0,0.15-0.46-0.99,1.0     0.06,0.28-0.46-0.99,1.0    0.0,0.11-0.41-0.99,1.0     .52
APRI score          0.15,0.32-0.54-1.15,3.28   0.15,0.32-0.61-1.43,3.28   0.16,0.31-0.49-1.07,2.50   .46

NOTE. Quantitative variables are expressed as minimum, first quartile-median-third quartile, maximum.
ᵃMann-Whitney test or Fisher test between the patient groups.
1266 BOURSIER ET AL CLINICAL GASTROENTEROLOGY AND HEPATOLOGY Vol. 6, No. 11
Patient groupᶜ   O1ᵃ            O2ᵃ            Pᵇ      Ric
Both             23.5 ± 23.6    23.1 ± 23.0    .64     .93
P1               33.6 ± 26.5    34.2 ± 26.4    .60     .93
P2               13.5 ± 14.5    12.2 ± 11.2    .12     .84
P valueᵈ         <.001          <.001          -       -

ᵃO1: judges A (for P1) and C (P2); O2: judges B (P1) and D (P2).
ᵇPaired comparison between O1 and O2 by Wilcoxon test (P1 and P2) or t test (all).
ᶜPatient groups: P1: judges A (O1) and B (O2); P2: judges C (O1) and D (O2).
ᵈUnpaired comparison between P1 and P2 by t test.
Figure 3. Discordance rate for Metavir fibrosis stage between observers O1 and O2 as a function of anatomic site and reference (O2) Metavir fibrosis stage. Discordance was defined by a difference of at least 1 fibrosis stage between O1 and O2 results.

Interequipment reproducibility. LSE agreement between the 2 Fibroscan devices was excellent (Ric = 0.92).
Probe pressure. Difficult LSEs were noted in 43 of the 368 LSEs performed on sites I–IV (11.7%), mainly on site I (86%). A high probe pressure did not significantly increase LSE success rate (P = .43) and did not alter LSE result (Ric = 0.96).
Judge pair. Overall interobserver agreement was excellent in patient group P1 and good in P2 (Table 4). However, LSE results were significantly lower in P2 than in P1 patients (Table 4), precluding a reliable comparison of reproducibility between both judge pairs. Therefore, this confounding factor, judge pair, was evaluated. On one hand, O1 LSE result (P < .001), as expected, and judge pair (P = .002) were independent predictors of O2 LSE result. On the other hand, O2 LSE result was the only independent predictor of O1 LSE result (P < .001), with no independent role for judge pair (P = .51).
Acceptable liver stiffness evaluation. The rate of acceptable LSE was significantly different between observers O1 and O2, 87.5% and 79.9%, respectively (P = .03). However, this difference did not influence interobserver reproducibility because all the previous results were similar when the statistical analysis was restricted to LSEs that were acceptable for both O1 and O2 observers (detailed data not shown).

Observer Discrepancy
Crude discrepancy. cΔLSE was independently associated with O2 LSE IQR at the first step (P < .001), O2 LSE result at the second step (P = .002), BMI at the third step (P = .006), and LSE site at the fourth step (P = .01). cΔLSE grew almost linearly as a function of O2 LSE result (Figure 7). Finally, cΔLSE was significantly different between the 4 LSE sites and the smallest in site IV (Table 5).
Relative discrepancy. rΔLSE was independently associated only with O2 LSE result (P = .02). rΔLSE, plotted against O2 LSE result, initially decreased and then reached a plateau around 20%–30% by 5 kPa (Figure 7). In addition, rΔLSE was significantly different between the 4 LSE sites (the smallest for site IV; Table 5).
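
The two discrepancy measures analysed here are defined in the Statistics section as the absolute value of (O2 LSE result – O1 LSE result) and the absolute value of [(O1 LSE result – O2 LSE result)/O2 LSE result]. The following Python sketch restates those definitions. The function names and the numeric inputs are hypothetical and serve only to show how the same crude gap translates into very different relative gaps at low and high stiffness.

```python
def crude_discrepancy(o1_kpa: float, o2_kpa: float) -> float:
    """Crude difference in kPa: |O2 - O1| (hypothetical helper name)."""
    return abs(o2_kpa - o1_kpa)


def relative_discrepancy(o1_kpa: float, o2_kpa: float) -> float:
    """Relative difference in %: |(O1 - O2) / O2| * 100, with O2 as reference."""
    if o2_kpa <= 0:
        raise ValueError("reference (O2) LSE result must be positive")
    return abs((o1_kpa - o2_kpa) / o2_kpa) * 100.0


# Hypothetical (O1, O2) pairs in kPa, not patient data from the study.
pairs = [(7.0, 5.0), (42.0, 40.0)]

for o1, o2 in pairs:
    crude = crude_discrepancy(o1, o2)
    relative = relative_discrepancy(o1, o2)
    print(f"O1 = {o1} kPa, O2 = {o2} kPa: crude = {crude:.1f} kPa, relative = {relative:.0f}%")
```

In both invented pairs the crude difference is 2 kPa, but the relative difference is much larger when the reference result is low, which matches the direction of the relative discrepancy curve described above.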

Figure 2. Correlation of LSE results between observers O1 and O2, providing an agreement of Ric of 0.93.

Figure 4. Interobserver agreement for LSE results as a function of different intervals of LSE result (reference: O2) and anatomic sites.
November 2008 REPRODUCIBILITY OF LIVER STIFFNESS MEASUREMENT 1267
Table 5. Influence of Measurement Site on Interobserver Agreement as a Function of Patient Group (Ric) and of Interobserver Difference

                           Measurement site
                           I            II          III         IV          All
Patient group
  P1                       0.82         0.96        0.93        0.98        0.93
  P2                       0.75         0.85        0.80        0.97        0.84
  Both                     0.86         0.95        0.91        0.98        0.93
Interobserver difference
  cΔLSEᵃ (kPa)             7.6 ± 10.4   4.4 ± 6.2   5.6 ± 8.0   2.5 ± 3.4   4.9 ± 7.4ᵇ
  rΔLSEᶜ (%)               34 ± 39      25 ± 21     41 ± 59     19 ± 18     30 ± 39ᵇ

ᵃCrude difference between O1 and O2 LSE results.
ᵇP = .03 (paired Friedman test between the 4 sites).
ᶜRelative difference between O1 and O2 LSE results.