
Arthritis Care & Research

Vol. 63, No. 7, July 2011, pp 929–936


DOI 10.1002/acr.20497
© 2011, American College of Rheumatology
SPECIAL ARTICLE

American College of Rheumatology Provisional Criteria for Defining Clinical Inactive Disease in
Select Categories of Juvenile Idiopathic Arthritis
CAROL A. WALLACE,1 EDWARD H. GIANNINI,2 BIN HUANG,2 LUKASZ ITERT,2 AND
NICOLINO RUPERTO,3 FOR THE CHILDHOOD ARTHRITIS AND RHEUMATOLOGY RESEARCH
ALLIANCE (CARRA), THE PEDIATRIC RHEUMATOLOGY COLLABORATIVE STUDY GROUP (PRCSG),
AND THE PAEDIATRIC RHEUMATOLOGY INTERNATIONAL TRIALS ORGANISATION (PRINTO)

Objective. To prospectively validate the preliminary criteria for clinical inactive disease (CID) in patients with select
categories of juvenile idiopathic arthritis (JIA).
Methods. We used the process for development of classification and response criteria recommended by the American
College of Rheumatology Quality of Care Committee. Patient-visit profiles were extracted from the phase III randomized
controlled trial of infliximab in polyarticular-course JIA (i.e., patients considered to resemble those with select categories
of JIA) and sent to an international group of expert physician raters. Using the physician ratings as the gold standard, the
sensitivity and specificity were calculated using the preliminary criteria. Modifications to the criteria were made, and
these were sent to a larger group of pediatric rheumatologists to determine quantitative, face, and content validity.
Results. Variables weighted heaviest by physicians when making their judgment were the number of joints with active
arthritis, erythrocyte sedimentation rate (ESR), physician’s global assessment, and duration of morning stiffness. Three
modifications were made: the definition of uveitis, the definition of abnormal ESR, and the addition of morning stiffness.
These changes did not alter the accuracy of the preliminary set.
Conclusion. The modified criteria, termed the “criteria for CID in select categories of JIA,” have excellent feasibility and
face, content, criterion, and discriminant validity to detect CID in select categories of JIA. The small changes made to the
preliminary criteria set did not alter the area under the receiver operating characteristic curve (0.954) or accuracy (91%),
but have increased face and content validity.
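
As a rough check on the accuracy figure quoted above, the agreement counts reported in Table 4 can be plugged into the standard definitions of sensitivity, specificity, and accuracy. The counts are 37 profiles rated ID by both the preliminary criteria and the predicted physician rating, 76 rated AD by the criteria but ID by the predicted rating, 744 rated AD by both, and none rated ID by the criteria but AD by the predicted rating. The sketch below is illustrative only: the function name `diagnostic_metrics` and its argument names are hypothetical and are not part of the SAS analysis used in the study. The area under the curve cannot be recovered from these four counts and is not computed here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, accuracy) for a 2x2 agreement table.

    "Positive" means inactive disease (ID). The predicted physician rating
    is treated as the gold standard, as in the article.
    tp: criteria ID, physician ID    fp: criteria ID, physician AD
    fn: criteria AD, physician ID    tn: criteria AD, physician AD
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy


# Counts taken from Table 4 (857 profiles entered the analysis).
sens, spec, acc = diagnostic_metrics(tp=37, fp=0, fn=76, tn=744)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
```

With the Table 4 counts this gives 37/113, 744/744, and 781/857, which round to the 33% sensitivity, 100% specificity, and 91% accuracy reported in the table footnote.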

This criteria set has been approved by the American College of Rheumatology (ACR) Board of Directors as Provisional. This
signifies that the criteria set has been quantitatively validated using patient data, but it has not undergone validation based on
an external data set. All ACR-approved criteria sets are expected to undergo intermittent updates.
As disclosed in the manuscript, these criteria were developed with partial financial support from industry sources. The industry
supporters were not involved in any stage of criteria development. As a courtesy, the authors sent copies of submitted manuscripts
to their industry supporters, but review and approval of the manuscripts were neither requested nor given.
Although current ACR practice is to decline requests for review of criteria that have been supported by industry, an exception
was made in this case due to prior ACR project support and because the ACR policy change took place after the industry support
was solicited and received by the investigators. ACR is an independent professional, medical and scientific society which does not
guarantee, warrant or endorse any commercial product or service. The ACR reviewed this manuscript on its merits and found the
criteria to be methodologically rigorous and clinically meaningful. The ACR received no compensation for its approval of these criteria.

Supported by Centocor through the investigator-initiated study program and by the National Institute of Arthritis and Musculoskeletal and Skin Diseases, NIH (grant P60-AR-44059).
1Carol A. Wallace, MD: Seattle Children’s Hospital and University of Washington School of Medicine, Seattle; 2Edward H. Giannini, MSc, DrPH, Bin Huang, PhD, Lukasz Itert, MS: Cincinnati Children’s Hospital Medical Center and University of Cincinnati College of Medicine, Cincinnati, Ohio; 3Nicolino Ruperto, MD, MPH: IRCCS G. Gaslini, Pediatria II Reumatologia, Paediatric Rheumatology International Trials Organisation, Genoa, Italy.
Dr. Ruperto has received consultant fees, speaking fees, and/or honoraria (less than $10,000 each) from Bristol-Myers Squibb and Roche.
Address correspondence to Carol A. Wallace, MD, University of Washington School of Medicine, Division of Rheumatology – Pediatrics, Seattle Children’s Hospital, 4800 Sand Point Way NE R-5420, Seattle, WA 98104. E-mail: cwallace@u.washington.edu.
Submitted for publication January 15, 2010; accepted in revised form November 17, 2010.

INTRODUCTION

Validated, clinically useful, and reliable criteria for defining disease states are crucial for monitoring of disease status in individual patients, development of standards of care, assessment of quality care, and as potential end points in clinical trials. The availability of new more effective therapies for children with juvenile idiopathic arthritis (JIA), with the potential for eliminating disease activity for extended periods, underscores the need for criteria defining inactive disease (ID), clinical remission on medication (CRM), and clinical remission off medications (CR) (1–5). There is as yet no indisputable gold standard or biomarker for determining whether a patient with JIA is in a state of ID. At present, physical examination and clinical laboratory criteria must be used to define this state, which is the reason why we refer to the current effort as defining criteria for clinical inactive disease (CID) rather than ID; the latter referring to both clinical and biologic quiescent disease. In the absence of a biologic marker for active or inactive JIA, aggregated expert judgment becomes necessary to determine criteria for clinical inactive JIA (6).

Synthesis of the preliminary criteria using the literature and expert opinion based upon Delphi and nominal group consensus formation approaches (7,8) has been described previously (9). Focus was placed on the polyarticular (rheumatoid factor positive [RF+] and RF−), extended oligoarticular, and systemic categories of JIA (Table 1). The preliminary criteria were used successfully to characterize disease patterns of activity in JIA in 2005, and retrospective validation studies were completed in 2006 using the Outcome Measures in Rheumatology Clinical Trials filter (10–12). These investigations found the criteria for ID to be quite feasible for use in the routine clinic, and because consensus formation was used to produce the preliminary set, face (clinical sensibility) and content (comprehensive) validity were considered to be high. Comparison to other criteria sets in the literature for describing remission in JIA (13–15) showed construct validity (convergent subtype) to be high. To date, the preliminary criteria have been utilized and cited in over 80 publications. Preliminary biologic evidence for the validation of the preliminary criteria has recently appeared in the literature. Knowlton and colleagues demonstrated that the disease states defined by the preliminary clinically-based criteria display differently expressed genes in peripheral blood mononuclear cells, using microarray data and hierarchical clustering analysis (16).

Table 1. Preliminary criteria for inactive disease in oligoarticular (persistent and extended), polyarticular (RF+ and −), and systemic JIA*

Inactive disease:
  No joints with active arthritis
  No fever, rash, serositis, splenomegaly, or generalized lymphadenopathy attributable to JIA
  No active uveitis to be defined
  ESR or CRP level within normal limits in the laboratory where tested. If both are tested, both must be normal
  Physician’s global assessment of disease activity score of best possible on the scale used

* All criteria must be met. RF = rheumatoid factor; JIA = juvenile idiopathic arthritis; ESR = erythrocyte sedimentation rate; CRP = C-reactive protein.

From its inception, this project has followed the recommended process of the Classification and Response Criteria Subcommittee of the American College of Rheumatology Committee on Quality Measures (17). The subcommittee recommends the use of large, high-quality data sets for prospective validation of preliminary response criteria sets. One source of such data sets for JIA is randomized controlled trials (RCTs) conducted for submission to regulatory agencies for drug approval. We used data from the phase III RCT of infliximab in patients with polyarticular-course juvenile rheumatoid arthritis. As with most existing RCTs, this limited diagnostic subset of patients is confined to those who resemble patients currently classifiable into the following categories of JIA: polyarthritis (both RF+ and RF−), extended oligoarthritis, and systemic arthritis (without currently active systemic features). We aimed to prospectively validate the criteria for CID in patients with polyarticular-course JIA. The methods and results of the multistep approach to prospectively validate the criteria for CID using data from the phase III RCT (2) of infliximab in polyarticular-course JIA are the subjects of this report. For this reason, this study was limited to using clinical trial data from subjects with polyarticular-course JIA without systemic features or uveitis. Because results of this exercise indicated that changes should be made to the original preliminary criteria to maximize validity, this report also includes results of the effort to estimate the quantitative content validity index (CVI) of the modified criteria overall, and face validity index (FVI) of each criterion and its respective critical value.

MATERIALS AND METHODS

Terminology associated with the description of performance characteristics of criteria is not standardized across various fields of research; the terms used here are those most commonly employed in rheumatology (10,18–24).

Approach to prospective validation of the preliminary criteria. A summary of the multistep process used for this exercise is given in Table 2.

Step 1: extraction of 60 patient profiles showing low or no disease activity from the 1,096 patient visits in the infliximab trial database. Sixty patient profiles were extracted from the 1,096 deidentified patient study visit records from the phase III prospective RCT of infliximab in JIA reported by Ruperto et al (2). This trial, completed in 2004, enrolled a total of 122 children ages 4–17 years with active polyarticular-course JIA not adequately controlled with methotrexate. To maximize the usefulness of this step, only patient-visit profiles demonstrating low or no clinically apparent disease were extracted from the database because profiles reflecting very active disease (AD)

would have been scored as AD by both physician raters and the preliminary criteria set. In order to not bias the physician raters into basing their judgments solely on those parameters that are elements of the preliminary criteria set, each profile contained 7 clinical assessments: duration of morning stiffness (DMS), visual analog scale (VAS) for pain, Childhood Health Assessment Questionnaire, parent assessment of overall well-being, the physician’s global assessment of overall disease activity (PGA), the number of joints with active arthritis and the number of joints with limitation of motion, and 4 laboratory assessments (erythrocyte sedimentation rate [ESR], hematocrit, white blood cells, and platelets). The profiles contained the actual raw data from the trial; no value of any variable was imputed. Figure 1 is an example of a patient-visit profile. Because of the inclusion/exclusion criteria used for the infliximab RCT, no subject had active systemic features or uveitis and all had disease for at least 6 months.

Table 2. Steps in the prospective validation process for criteria for defining inactive disease in select categories of juvenile idiopathic arthritis

1. Extraction of 60 patient-visit profiles showing low or no disease activity from the 1,096 patient visits in the infliximab trial database
2. Rating by 40 pediatric rheumatologists of disease state (active or inactive) of the 60 patient-visit profiles (referred to as survey 1)
3. Intraphysician agreement survey in which 20 of the original 60 patient-visit profiles were re-sent to the 40 physician raters (referred to as survey 2)
4. Regression analysis to derive a best-fit model of physician judgment to be applied to the remaining 1,036 patient profiles
5. Application of the best-fit model to predict how the remaining 1,036 profiles would have been scored by the physician raters
6. Calculation of agreement among the 1,036 patient visits between the predicted likelihood of the physicians’ score and by the preliminary criteria to assess sensitivity and specificity and area under the receiver operating characteristic curve
7. Modified criteria sent to 60 pediatric rheumatologists to estimate quantitative content and face validity indices, and final optimization (referred to as survey 3)
8. Final modification of the criteria is shown in Table 5

Figure 1. Example of a patient profile: this subject has systemic juvenile idiopathic arthritis (JIA) with polyarticular course and at this visit has the clinical characteristics shown in the table below. VAS = visual analog scale; C-HAQ = Childhood Health Assessment Questionnaire; ESR = erythrocyte sedimentation rate; WBCs = white blood cells.

Step 2: rating by 40 pediatric rheumatologists of the disease state (active or inactive) of the 60 patient profiles (referred to as survey 1). These patient-visit profiles were sent via computer link survey to 40 pediatric rheumatologists in 27 countries who were members of the Paediatric Rheumatology International Trials Organisation (PRINTO), the Childhood Arthritis and Rheumatology Research Alliance (CARRA), or the Pediatric Rheumatology Collaborative Study Group (PRCSG) and who had not participated in the development of the preliminary criteria, and were not currently using the criteria in a clinical trial. All physician raters were board certified in pediatric rheumatology and had a minimum of 10 years of postfellowship clinical experience. Physicians were asked to review and then score each patient profile as being in a state of AD or ID, or if the state was unable to be determined based upon the information provided.

Step 3: intraphysician agreement survey in which 20 of the original 60 patient profiles were re-sent to the 40 physician raters (referred to as survey 2). Because the physician ratings of the profiles were to be used as the “gold standard” for criteria validation, we thought it necessary to determine if intraphysician reliability was in the acceptable range. Twenty of the original 60 patient profiles were re-sent to the physicians 2 months after receipt of the initial ratings in order to assess intraphysician reliability. Intraphysician reliability was calculated using the unweighted kappa method as described by Fleiss (25). Interrater reliability of a finalized criteria set was not part of this exercise.

Step 4: regression analysis to derive a best-fit model of physician judgment to be applied to the remaining 1,036 patient profiles. We anticipated that very few of the 60 patient profiles would be judged by the 40 physician raters to be in a state of ID, using the 80% consensus agreement rule. Therefore, we used a series of binomial logistic regression analyses (GENMOD procedure in SAS) to develop a best-fit model of variables physicians weighted most heavily when making their judgment of disease state. In the regression analysis procedure, the physician ratings served as the dependent variable, and the 7 clinical and 4 laboratory assessments served as the independent (explanatory) variables. Because there are two analysis units, one for patient profiles and one for the physicians’ ratings nested within each patient profile, we used generalized estimating equations to account for the clustering of the 40 physicians’ ratings within each patient profile. All clinical and laboratory variables of the patient profiles were included in the initial multivariate logistic regression model. Stepwise and forward selection procedures were used to select the best subset of variables that predicted (showed the highest degree of correlation with) the patient state as judged by physician rating. The final logistic regression

models included only those variables that remained significant at the 0.05 level. This allowed for identification of those variables that influenced most heavily the physicians’ determination of the disease state of the patient.

Step 5: application of the best-fit model to predict how the remaining 1,036 profiles would have been scored by the physician raters. In step 5, the remaining 1,036 patient-visit profiles (1,096 minus the 60 profiles actually scored by physician raters) that were not directly rated by physicians were computer scored using the best-fit regression model from step 4. This allowed us to use all of the patient visits from the trial by predicting, with a considerable amount of precision, how each of the profiles would have been scored, had the physicians rated each one. The resulting predicted probabilities of patient status were compared to the ratings of disease status by the preliminary criteria using the Kruskal-Wallis test.

Step 6: calculation of agreement between how the 1,036 patient visits were scored by the best-fit model (physician likelihood ratings) and by the preliminary criteria to assess accuracy, sensitivity and specificity, and area under the receiver operating characteristic (ROC) curve. These metrics were calculated in the standardized manner, using the physician likelihood ratings as the gold standard.

Step 7: estimation of quantitative content and face validity and final optimization (referred to as survey 3). Results from steps 1 through 6 suggested that changes to the preliminary criteria would be necessary to optimize their agreement with physician judgment. Therefore, after modification of the preliminary criteria, a third online survey was sent that was designed to establish the quantitative CVI of the criteria set overall and the FVI of each criterion and its respective cut point/critical value in the set using methods described by Davies et al (26). Survey 3 was sent to 60 pediatric rheumatologists, 40 of whom had participated in the exercise above and 20 of whom had participated in the original consensus conference that led to the development of the preliminary criteria. The survey contained a cover e-mail that explained the purpose of the survey and definitions of content and face validity. A computer link supplied the modified criteria set and an online questionnaire that asked that the criteria be scored as a whole for content validity using a 4-point ordinal level scale (where 1 = irrelevant and should not be used to assess ID, 2 = unable to assess relevance without revision, 3 = relevant but needs minor alterations, and 4 = very relevant and succinct). A second question asked that face validity of each of the variables and its corresponding critical value (e.g., active joint count = 0) be scored using the same 4-point scale. A free-text box allowed physician raters to make comments about their replies.
From survey 3 data, the CVI was calculated as the percentage of physician raters who scored the revised criteria set as a whole as either a 3 or 4 on the 4-point scale described above. The FVI was calculated for each item and its corresponding cut point/critical value using the identical method. A CVI or FVI score of >0.80 is considered to have excellent content validity (26).

Step 8: final modification of the criteria set. Following the final survey, an additional modification to the criteria was made.

RESULTS

All 40 physician raters responded to the initial survey and each scored all 60 patient profiles (2,400 evaluations). Of all 2,400 physician evaluations, 1,744 (72.7%) were scored as AD, 374 (15.7%) were scored as ID, and the remaining 282 (11.7%) were scored as unable to determine. As expected, only 3 of the 60 patient-visit profiles met the 80% consensus agreement among physician raters to be classified as ID. The stepwise selection of the binomial logistic regression utilizing all 2,400 physician ratings resulted in a final best-fit model that included active joint count, PGA, ESR, and DMS. Testing different critical values for DMS (≥5 minutes and ≥15 minutes) and ESR (≤20 mm/hour and ≤25 mm/hour), the model with DMS of ≤15 minutes and ≤20 mm/hour for the ESR produced the best area under the ROC curve. Odds ratios for these variables are displayed in Table 3. Highest weight was placed on the active joint count, followed by the ESR of ≤20 mm/hour, the PGA, and finally, DMS. Overall, the best-fit model (preliminary criteria with addition of DMS of ≤15 minutes) resulted in an area under the ROC curve of 0.942, indicating excellent fit with physician ratings. Therefore, we were confident about using this model to predict how the remaining 1,036 patient visits (not scored by physicians) would have been scored if physician raters had done so.

Table 3. Generalized estimating equations estimates finding the best-fit model to physician ratings of inactive disease versus active disease*

                                                                           OR (95% CI)
No joints with active arthritis                                            121.3 (59.63–247.05)
ESR up to 110% the ULN for the test used†                                  13.2 (1.32–22.57)
Physician’s global assessment = best attainable score on the scale used    8.9 (4.59–17.42)
Duration of morning stiffness of ≤15 minutes                               2.7 (1.04–6.78)

* Area under the curve = 0.942. OR = odds ratio; 95% CI = 95% confidence interval; ESR = erythrocyte sedimentation rate; ULN = upper limit of normal.
† The ESR was used in the randomized controlled trial of infliximab in juvenile idiopathic arthritis; the C-reactive protein level was not assessed. The ESR criterion was later modified as shown in Table 5 and described in the text.

Thirty-seven (92.5%) of 40 physician raters responded to the survey designed to estimate intraphysician reliability in rating the patient-visit profiles, with a resulting kappa value of 0.7 (95% confidence interval 0.63–0.77), indicating “substantial agreement” (27).
Applying the best-fit model from the regression analysis to the remaining 1,036 patient profiles, the predicted likelihood of a physician’s rating was calculated as the inverse logit function of the linear combinations of the best-fit model. Using this method, a total of 744 profiles were

judged to be in AD and 113 in ID (179 patient profiles were scored as unable to determine). The abnormal items for profiles that were scored as “unable to be determined” included parent global assessment and joints with limited range of motion (70%), pain (65%), PGA and ESR (45%), and hematocrit (30%). Using the best-fit model (derived from the physician likelihood ratings) as the “gold standard,” the original preliminary criteria correctly classified 744 of the profiles as being in AD and 37 as being in ID, thus yielding an area under the curve of 0.954 and an accuracy of 91%. These results are shown in Table 4.

Table 4. Agreement between preliminary criteria and predicted physician ratings based on the best-fit model for disease activity among 1,036 patient profiles*

                                          Disease activity rating according to the predicted likelihood of
                                          physician rating (ID and AD are classified by likelihood of >0.8)
Disease activity rating according to
the preliminary criteria for ID           ID, no. patients        AD, no. patients
  ID                                      37                      0
  AD                                      76                      744
  Total                                   113                     744

* One hundred seventy-nine of the 1,036 patient profiles were scored as “unable to determine” and did not enter the analysis. Therefore, the total number of patients for the table is 857. Sensitivity = 33%; specificity = 100%; area under the curve = 0.954; accuracy = 91%. Results using the modified criteria (shown in Table 5) yielded the exact same results. ID = inactive disease; AD = active disease.

Based on the analyses described above, 3 proposed changes were made to the criteria set and presented to physicians for their opinion in an online survey (survey 3). The proposed changes were: 1) addition of the definition of inactive uveitis as developed by the Standardization of Uveitis Nomenclature (SUN) Working Group (28), 2) addition of a new criterion of DMS not in excess of 15 minutes, and 3) clarification of abnormal ESR. Forty-one (68%) of 60 physicians replied to this third online survey (survey 3). The CVI of the modified criteria set as a whole was 95%, with only 2 of the 41 respondents indicating a score below 3, indicating extremely high content validity. The FVI, shown in parentheses as a percentage, for each criterion and its critical value were as follows: active joint count = 0, no fever or rash, no active uveitis as defined by the SUN Working Group (all 100%), ESR up to 110% of the upper limit of normal (93%), and DMS of ≤15 minutes (95%). The majority of physicians expressed opinions along with their score. Use of the results of the physician rating survey and analysis and accommodation of opinions expressed in this final survey yielded the modified criteria shown in Table 5. Specifically, the final optimization specified changes to two individual elements of the original preliminary criteria, uveitis as defined by the SUN Working Group and more detail regarding the ESR, and included the addition of DMS. Physicians were accepting of the SUN Working Group’s definition of inactive uveitis, which was not available for the preliminary criteria. Pertaining to the fact that many different methodologies for determination of the ESR are used throughout the world, no specification should be made of an upper limit of normal. Further, ESR elevation associated with concurrent illness and not attributable to JIA should not be the sole criterion on which to exclude a patient from being classified as ID. Finally, results from the multivariate analyses and the final survey revealed that DMS in excess of 15 minutes is a clinically important indicator of AD. The modified criteria for CID are shown in Table 5.

Table 5. Criteria for defining clinical inactive disease in oligoarticular (persistent and extended), polyarticular (RF+ and −), and systemic JIA*

Inactive disease:
  No joints with active arthritis†
  No fever, rash, serositis, splenomegaly, or generalized lymphadenopathy attributable to JIA
  No active uveitis as defined by the SUN Working Group (28)‡
  ESR or CRP level within normal limits in the laboratory where tested or, if elevated, not attributable to JIA
  Physician’s global assessment of disease activity score of best possible on the scale used
  Duration of morning stiffness of ≤15 minutes

* All criteria must be met. Although this table contains criteria that refer to extraarticular manifestations of disease and uveitis, these were not part of this exercise because patients with systemic manifestations or uveitis were ineligible for enrollment into the randomized controlled trial. The uveitis and systemic criteria are shown here in order to present the entire set as it currently exists. RF = rheumatoid factor; JIA = juvenile idiopathic arthritis; ESR = erythrocyte sedimentation rate; CRP = C-reactive protein.
† The American College of Rheumatology defines a joint with active arthritis as a joint with swelling not due to bony enlargement or, if no swelling is present, limitation of motion accompanied by either pain on motion and/or tenderness. An isolated finding of pain on motion, tenderness, or limitation of motion on joint examination may be present only if explained by either prior damage attributable to arthritis that is now considered inactive or nonrheumatologic reasons, such as trauma.
‡ The Standardization of Uveitis Nomenclature (SUN) Working Group defines inactive anterior uveitis as “grade zero cells,” indicating <1 cell in field sizes of 1 mm by a 1-mm slit beam.

DISCUSSION

To our knowledge, this is the first report of prospective validation of the preliminary criteria for defining ID in JIA using prospectively collected RCT data done under Good Clinical Practice regulations. The multistep approach used in our analysis resulted in the modification of two of the criteria (definition of active uveitis and ESR) and the addition of another (DMS ≤15 minutes). This demonstrates the importance of prospective validation and underscores the fact that criteria sets continue to evolve and are never considered “final.”
The addition of DMS is the only new variable introduced to the criteria set. Comments from the final survey

suggest that physicians believe that DMS of a short dura- have been classified previously as pauciarticular onset–
tion (i.e., ≤15 minutes) can represent residua of previously polyarticular course. Additionally, some patients in the
active disease without current active disease. However, trial had experienced a systemic onset, but followed a
DMS of a longer duration was considered to be sufficient polyarticular course. Therefore, while results presented
cause by itself to classify the patient as being in a state of here are likely most applicable to patients with polyarthri-
AD. Despite these modifications, the revised criteria set tis, oligoarticular and systemic-onset patients were in-
yielded the same area under the curve as did the prelimi- cluded in the data set from the trial. Still, results of crite-
nary criteria when using the physician ratings of patient rion and content validity and reliability may have been
profiles as the gold standard. Several reasons can be pos- different if patients with other forms of JIA, particularly
tulated for this finding. First, all patients classified as those with active systemic features, had been included.
being in a state of ID by the preliminary criteria had a DMS Patients with juvenile psoriatic arthritis, those with en-
≤15 minutes. Next, the change in the criterion for inactive thesitis-related arthritis, and patients with uveitis were not
uveitis would not be expected to change the scoring, since included in this data set. Additional clinical trial data-
no patients had uveitis and the modification represents bases now in development will serve as the basis on which
only a further refinement of the definition of inactive eye to validate the criteria in other JIA disease categories as
involvement. well as permit further estimations of sensitivity and spec-
An important change from the preliminary criteria is ificity.
that the ESR may be elevated, and therefore the addition of Recently, the use of a 21-circle VAS rather than a 10-cm
the term “or, if elevated, not attributable to JIA.” No spec- line or 11-circle (0 –10 Likert-like scale) instrument to
ification of an upper limit of normal for the ESR is appro- assess PGA has become popular. The database used in this
priate due to the multitude of methods for its determina- exercise used the 11-circle scale, and therefore we were
tion, many of which have different limits of normality. unable to assess whether the 21-circle VAS (with a
Importantly, many children with new-onset active JIA and gradation of 0.5) would have allowed more patients to be
a flare of JIA do not have elevation of the ESR. For this classified as being in ID, and increase the sensitivity of the
reason, comments from the surveys repeatedly empha- criteria set. Other databases will need to be used to inves-
sized the fact that elevations of acute-phase reactants fre- tigate this possibility.
quently are caused by conditions unrelated to JIA, and that The reliability exercises described in this manuscript do
such an elevation should not, by itself, be the sole justifi- not estimate either inter- or intrarater reliability of the
cation for classifying a patient as having active JIA. The assessment of individual components of the criteria. Non-
option of finding another source for an elevated ESR may reliability is known to exist among physician raters when
introduce bias into the definition of ID and could allow for judging the specific clinical parameters in the current cri-
potential misinterpretation by physicians. However, these teria set. Interphysician reliability coefficients of an estab-
clinical interpretation issues are the same for the joint lished criteria set cannot be estimated using data from this
examination and the PGA and reinforce the appropriate- exercise. Agreement among physicians from the first sur-
ness of calling these criteria for CID, since a biomarker is vey helped us establish what the criteria should be, not the
not currently available for the state of ID. Recently, the reliability of an established criteria set. The analysis of
C-reactive protein (CRP) level has gained popularity as the intraphysician reliability is more useful in the conven-
acute-phase reactant of choice. Therefore, clinicians and tional sense. Physicians who were able to make a determi-
investigators should feel free to use either the ESR or CRP nation of patient status on the initial survey were quite
level as a laboratory measure of inflammation. reliable in their reassessment at a later date of the same
The approach described in this article has limitations, patient profile.
and additional validation in other data sets is appropriate. The convergent validity subtype of construct validity
When the patient profile survey was conducted, we used has been estimated in prior retrospective validation exer-
the term ID rather than CID. Results of ratings may have cises using existing instruments for describing remission
been different if CID had been used in the survey rather in JIA. Analyses of these subtypes of validity were not
than ID. We believe that the term “clinical inactive dis- performed in the current effort, as other remission criteria
ease” is best to use, as these criteria have not been shown used variables that were not collected during the inflix-
to identify biologically inactive disease. It is hoped that in imab trial. Therefore, further work is needed to establish
the near future, translational research will develop such a both convergent and divergent validity, although prelimi-
definition or biomarker. Because systemically ill and per- nary results from prior retrospective work are encouraging,
sistent oligoarthritis patients were not included in the RCT as published earlier (12).
of infliximab in JIA, the results shown here apply most Van Tuyl and colleagues recently have described the
directly to those with polyarticular-course JIA (RF⫹ or collaborative process between the American College of
RF⫺). The systemic criteria in the current set are those that Rheumatology and the European League Against Rheuma-
had high face validity to physicians who participated in tism to redefine remission in adults with rheumatoid ar-
synthesis of the preliminary set. The clinical trial used for thritis (30). Two forms of remission have been developed
this exercise used the former classification of juvenile for use in clinical trials and in the clinic (30,31). In con-
rheumatoid arthritis, as defined by the American College trast, the current efforts in pediatric rheumatology have
of Rheumatology (29). However, some patients included been focused on a single definition of CID, CRM, and CR.
in the current study did have an oligoarticular onset Our ultimate goal is to establish worldwide criteria for
that progressed to extended oligoarthritis, which would CID, CRM, and CR for JIA that can be easily used in clinical

care and research settings. Criteria for CID have been prospectively validated and modified as a result of the validation process. The modified criteria, while having the same sensitivity, specificity, and accuracy as the preliminary criteria, likely have greater face and content validity. Future efforts are necessary to 1) validate these criteria in additional prospectively collected data sets (32), 2) estimate the conditional probability of CRM, given CID and CR given CRM, and 3) establish the predictive validity of CRM and CR.

ACKNOWLEDGMENTS
The authors wish to acknowledge the important participation by the following members of the PRCSG, CARRA, and PRINTO: L. Abramson, S. Bowyer, B. Chalom, R. Cron, M. Elder, H. Gewanter, N. Ilowite, L. Jung, Y. Kimura, D. Kingsbury, C. Lindsley, D. Lovell, D. McCurdy, R. Moore, K. O’Neil, L. Rider, C. Rose, K. Schickler, D. Sherry, J. Soep, L. Stein, R. Vehe, L. Wagner-Weiner, L. Zemel (US); B. Lang, R. Laxer, R. Schneider, L. Tucker (Canada); S. Al-Mayouf (Saudi Arabia); B. Andersson-Gare (Sweden); T. Avcin (Slovenia); B. Flato (Norway); R. Cimaz, F. De Benedetti, F. Fantini, A. Martini, A. Ravelli (Italy); C. De Cunto, S. Garay (Argentina); C. Saad-Magalhaes, S. Oliveira (Brazil); J. de Inocencio (Spain); P. Dolezalova (Czech Republic); D. Foell, G. Horneff (Germany); J. Melo-Gomes (Portugal); S. Nielsen (Denmark); H. Ozdogan, S. Ozen (Turkey); P. Quartier (France); F. Kanakoudi-Tsakalidou (Greece); Y. Uziel (Israel); R. Vesely (Slovakia); and N. Wulffraat (The Netherlands).

AUTHOR CONTRIBUTIONS
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Wallace had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Wallace, Giannini, Ruperto.
Acquisition of data. Wallace, Giannini, Itert, Ruperto.
Analysis and interpretation of data. Wallace, Giannini, Huang, Ruperto.

ROLE OF THE STUDY SPONSOR
This study was supported by an unrestricted grant from Centocor. As disclosed in the manuscript, these criteria were developed with partial financial support from industry sources. The industry supporters were not involved in any stage of criteria development. As a courtesy, the authors sent copies of submitted manuscripts to their industry supporters, but review and approval of the manuscripts were neither requested nor given.

REFERENCES
1. Lovell DJ, Giannini EH, Reiff A, Cawkwell GD, Silverman ED, Nocton JJ, et al, for the Pediatric Rheumatology Collaborative Study Group. Etanercept in children with polyarticular juvenile rheumatoid arthritis. N Engl J Med 2000;342:763–9.
2. Ruperto N, Lovell DJ, Cuttica R, Wilkinson N, Woo P, Espada G, et al, for the Paediatric Rheumatology International Trials Organisation and the Pediatric Rheumatology Collaborative Study Group. A randomized, placebo-controlled trial of infliximab plus methotrexate for the treatment of polyarticular-course juvenile rheumatoid arthritis. Arthritis Rheum 2007;56:3096–106.
3. Lovell DJ, Reiff A, Ilowite NT, Wallace CA, Chon Y, Lin SL, et al, for the Pediatric Rheumatology Collaborative Study Group. Safety and efficacy of up to eight years of continuous etanercept therapy in patients with juvenile rheumatoid arthritis. Arthritis Rheum 2008;58:1496–504.
4. Lovell DJ, Ruperto N, Goodman S, Reiff A, Jung L, Jarosova K, et al, for the Paediatric Rheumatology International Trials Organisation and the Pediatric Rheumatology Collaborative Study Group. Adalimumab with or without methotrexate in juvenile rheumatoid arthritis. N Engl J Med 2008;359:810–20.
5. Ruperto N, Lovell DJ, Quartier P, Paz E, Rubio-Perez N, Silva CA, et al, for the Paediatric Rheumatology International Trials Organisation and the Pediatric Rheumatology Collaborative Study Group. Abatacept in children with juvenile idiopathic arthritis: a randomised, double-blind, placebo-controlled withdrawal trial. Lancet 2008;372:383–91.
6. Raine R, Sanderson C, Hutchings A, Carter S, Larkin K, Black N. An experimental study of determinants of group judgments in clinical guideline development. Lancet 2004;364:429–37.
7. Bowles N. The Delphi technique. Nurs Stand 1999;13:32–6.
8. Horton JN. Nominal group technique: a method of decision-making by committee. Anaesthesia 1980;35:811–4.
9. Wallace CA, Ruperto N, Giannini E. Preliminary criteria for clinical remission for select categories of juvenile idiopathic arthritis. J Rheumatol 2004;31:2290–4.
10. Boers M, Brooks P, Strand CV, Tugwell P. The OMERACT filter for outcome measures in rheumatology [editorial]. J Rheumatol 1998;25:198–9.
11. Wallace CA, Huang B, Bandeira M, Ravelli A, Giannini EH. Patterns of clinical remission in select categories of juvenile idiopathic arthritis. Arthritis Rheum 2005;52:3554–62.
12. Wallace CA, Ravelli A, Huang B, Giannini EH. Preliminary validation of clinical remission criteria using the OMERACT filter for select categories of juvenile idiopathic arthritis. J Rheumatol 2006;33:789–95.
13. Oen K. Long-term outcomes and predictors of outcomes for patients with juvenile idiopathic arthritis. Best Pract Res Clin Rheumatol 2002;16:347–60.
14. Fantini F, Gerloni V, Gattinara M, Cimaz R, Arnoldi C, Lupi E. Remission in juvenile chronic arthritis: a cohort study of 683 consecutive cases with a mean 10 year followup. J Rheumatol 2003;30:579–84.
15. Flato B, Lien G, Smerdel A, Vinje O, Dale K, Johnston V, et al. Prognostic factors in juvenile rheumatoid arthritis: a case-control study revealing early predictors and outcome after 14.9 years. J Rheumatol 2003;30:386–93.
16. Knowlton N, Jiang K, Frank MB, Aggarwal A, Wallace C, McKee R, et al. The meaning of clinical remission in polyarticular juvenile idiopathic arthritis: gene expression profiling in peripheral blood mononuclear cells identifies distinct disease states. Arthritis Rheum 2009;60:892–900.
17. Classification and Response Criteria Subcommittee of the American College of Rheumatology Committee on Quality Measures. Development of classification and response criteria for rheumatic diseases [editorial]. Arthritis Rheum 2006;55:348–52.
18. Bombardier C, Tugwell P. A methodological framework to develop and select indices for clinical trials: statistical and judgmental approaches. J Rheumatol 1982;9:753–7.
19. Felson DT, Anderson JJ, Boers M, Bombardier C, Furst D, Goldsmith C, et al. American College of Rheumatology preliminary definition of improvement in rheumatoid arthritis. Arthritis Rheum 1995;38:727–35.
20. Magni-Manzoni S, Ruperto N, Pistorio A, Sala E, Solari N, Palmisani E, et al. Development and validation of a preliminary definition of minimal disease activity in patients with juvenile idiopathic arthritis. Arthritis Rheum 2008;59:1120–7.
21. Ruperto N, Ravelli A, Oliveira S, Alessio M, Mihaylova D, Pasic S, et al, for the Pediatric Rheumatology International Trials Organization (PRINTO) and the Pediatric Rheumatology Collaborative Study Group (PRCSG). The Pediatric Rheumatology International Trials Organization/American College of Rheumatology provisional criteria for the evaluation of response to therapy in juvenile systemic lupus erythematosus: prospective validation of the definition of improvement. Arthritis Rheum 2006;55:355–63.
22. Giannini EH, Ruperto N, Ravelli A, Lovell DJ, Felson DT, Martini A. Preliminary definition of improvement in juvenile arthritis. Arthritis Rheum 1997;40:1202–9.
23. Rider LG, Giannini EH, Brunner HI, Ruperto N, James-Newton L, Reed AM, et al, for the International Myositis Assessment and Clinical Studies Group. International consensus on preliminary definitions of improvement in adult and juvenile myositis. Arthritis Rheum 2004;50:2281–90.
24. Khanna D, Lovell DJ, Giannini E, Clements PJ, Merkel PA, Seibold JR, et al. Development of a provisional core set of response measures for clinical trials of systemic sclerosis. Ann Rheum Dis 2008;67:703–9.
25. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull 1971;76:378–82.
26. Davies EH, Surtees R, DeVile C, Schoon I, Vellodi A. A severity scoring tool to assess the neurological features of neuronopathic Gaucher disease. J Inherit Metab Dis 2007;30:768–82.
27. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
28. Jabs DA, Nussenblatt RB, Rosenbaum JT. Standardization of uveitis nomenclature for reporting clinical data: results of the First International Workshop. Am J Ophthalmol 2005;140:509–16.
29. Cassidy JT, Levinson JE, Brewer EJ Jr. The development of classification criteria for children with juvenile rheumatoid arthritis. Bull Rheum Dis 1989;38:1–7.
30. Van Tuyl LH, Vlad SC, Felson DT, Wells G, Boers M. Defining remission in rheumatoid arthritis: results of an initial American College of Rheumatology/European League Against Rheumatism consensus conference. Arthritis Rheum 2009;61:704–10.
31. Felson DT, Smolen JS, Wells G, Zhang B, van Tuyl LH, Funovits J, et al. American College of Rheumatology/European League Against Rheumatism provisional definition of remission in rheumatoid arthritis for clinical trials. Arthritis Rheum 2011;63:573–86.
32. Foell D, Wulffraat N, Wedderburn LR, Wittkowski H, Frosch M, Gerss J, et al, for the Paediatric Rheumatology International Trials Organization (PRINTO). Methotrexate withdrawal at 6 vs 12 months in juvenile idiopathic arthritis in remission: a randomized clinical trial. JAMA 2010;303:1266–73.
