
ORIGINAL RESEARCH

PSYCHOMETRICS

Psychometric Evaluation of the Hypogonadism Impact of Symptoms Questionnaire Short Form (HIS-Q-SF)
Heather L. Gelhorn, PhD,1 Laurie J. Roberts, MPH,1 Nikhil Khandelwal, PhD,2 Dennis A. Revicki, PhD,1
Leonard R. DeRogatis, PhD,3 Adrian Dobs, MD, MHS,4 Zsolt Hepp, PharmD, MS,2 and
Michael G. Miller, PharmD2

ABSTRACT

Background: The Hypogonadism Impact of Symptoms Questionnaire Short Form (HIS-Q-SF) is a patient-reported outcome measurement designed to evaluate the symptoms of hypogonadism. The HIS-Q-SF is an abbreviated version including 17 items from the original 28-item HIS-Q.
Aim: To conduct item analyses and reduction, evaluate the psychometric properties of the HIS-Q-SF, and provide guidance on score interpretation.
Methods: A 12-week observational longitudinal study of hypogonadal men was conducted as part of the original HIS-Q psychometric evaluation. Participants completed the original HIS-Q every 2 weeks. Blood samples were collected to evaluate testosterone levels. Participants completed the Aging Male’s Symptoms Scale, the International Index of Erectile Function, the Short Form-12, and the PROMIS Sexual Activity, Satisfaction with Sex Life, Sleep Disturbance, and Applied Cognition Scales (baseline and weeks 6 and 12). Clinicians completed the Clinical Global Impression of Severity and Change scales and a clinical form.
Main Outcome Measures: Item performance was evaluated using descriptive statistics and Rasch analyses. Reliability (internal consistency and test-retest), validity (concurrent and known groups), and responsiveness were assessed.
Results: One hundred seventy-seven men participated (mean age = 54.1 years, range = 23–83). Similar to the full HIS-Q, the final abbreviated HIS-Q-SF instrument includes five domains (sexual, energy, sleep, cognition, and mood) with two sexual subdomains (libido and sexual function). For key domains, test-retest reliability was very good, and construct validity was good for all domains. Known-groups validity was demonstrated for all domain scores, subdomain scores, and total score based on the Clinical Global Impression–Severity. All domains and subdomains were responsive to change based on patient-rated anchor questions.
Clinical Implications: The HIS-Q-SF could be a useful tool in clinical practice, epidemiologic studies, and other academic research settings.
Strengths and Limitations: Careful consideration was given to the selection of the final HIS-Q-SF items based on quantitative data and clinical expert feedback. Overall, the reduced set of items demonstrated strong psychometric properties. Testosterone levels for the participating men were not as low as anticipated, which could have limited the ability to examine the relations between the HIS-Q-SF and testosterone levels. Further, the analyses used data collected through administration of the full HIS-Q, and future studies should administer the standalone HIS-Q-SF to replicate the psychometric analyses reported in the present study.
Conclusion: Similar to the original HIS-Q, the HIS-Q-SF has evidence supporting reliability, validity, and responsiveness. The short form includes a smaller set of items that might be more suitable for use in clinical practice or academic research settings.
Gelhorn HL, Roberts LJ, Khandelwal N, et al. Psychometric Evaluation of the Hypogonadism Impact of Symptoms Questionnaire Short Form (HIS-Q-SF). J Sex Med 2017;14:1046–1058.
Copyright © 2017, International Society for Sexual Medicine. Published by Elsevier Inc. All rights reserved.

Received January 10, 2017. Accepted May 26, 2017.
1 Evidera, Bethesda, MD, USA;
2 AbbVie, North Chicago, IL, USA;
3 Maryland Center for Sexual Health, Lutherville, MD, USA;
4 Johns Hopkins University, Baltimore, MD, USA
http://dx.doi.org/10.1016/j.jsxm.2017.05.013

J Sex Med 2017;14:1046–1058



Key Words: Hypogonadism; Patient-Reported Outcome (PRO); Hypogonadism Impact of Symptoms Questionnaire Short Form (HIS-Q-SF); Psychometric Properties; Reliability; Validity; Responsiveness

INTRODUCTION

Hypogonadism in men is often associated with a range of symptoms that can include poor libido, erectile dysfunction, irritability, fatigue, and psychological and relationship problems, among other symptoms.1,2 Many of these symptoms are difficult for clinicians to evaluate accurately and might be best assessed through patient-reported outcome (PRO) measurements. Recent research has suggested that androgens have a measurable impact on functioning and health-related quality of life.3,4 The Hypogonadism Impact of Symptoms Questionnaire (HIS-Q) is a recently developed 28-item PRO measurement designed to assess the full range of symptoms in men.5 This instrument comprehensively captures important hypogonadal symptoms but, because of its length, it might be less practical for use in clinical, epidemiologic, or other research settings. There is increasing interest in including PRO measurements in electronic medical records to document outcomes from the patients’ perspective.6 In addition, relatively brief and psychometrically sound measurements are needed for hypogonadism-related epidemiologic studies. Therefore, there is a need for a shorter instrument to practically assess changes in hypogonadism symptoms over time to measure how patients respond to treatment in real-world clinical settings.

The HIS-Q is a self-report questionnaire designed to assess changes in symptoms in hypogonadal men in response to testosterone replacement therapy (TRT).5,7 The 28-item instrument was developed primarily for use in clinical trial settings, in accordance with the Food and Drug Administration’s (FDA) PRO guidance for industry,8 and addresses the limitations of existing instruments. The developers sought to develop an abbreviated version of the original HIS-Q, the HIS-Q short form (HIS-Q-SF), for inclusion in clinical practice and academic settings (Appendix A, available online).9 Because the HIS-Q-SF was developed concurrently with the original HIS-Q, it is composed of a subset of items from the original version with the same instructions and recall period. The goal was to align the items and scoring between the two forms so that the use of the HIS-Q and HIS-Q-SF would be meaningful to clinicians in clinical practice and in evaluating and interpreting results from clinical trials (Appendix B, available online).

The two versions of the HIS-Q were developed based on a literature review, extensive qualitative work, and input from expert clinicians.5,9 The SF instrument contains 17 items assessing the following five domains and two subdomains: the sexual domain, including libido and sexual function subdomains; and additional domains for energy, sleep, cognition, and mood. The final 17 items in the HIS-Q-SF were initially selected based on qualitative research with patients to determine those most relevant for inclusion in a short form.9 The item selections for the HIS-Q-SF were further supported by input from clinical experts, the instrument development team, and prior research conducted in the development of the original HIS-Q. The final step in the development of the HIS-Q-SF, reported in the present study, was to perform a psychometric evaluation of the reduced item set.

METHODS

Aims
In the development of the original HIS-Q, a 12-week prospective, observational, longitudinal study was conducted.7 The present study used these existing HIS-Q data, and further analyses were conducted to confirm the item content, scoring algorithm, and psychometric properties (ie, reliability, validity, and responsiveness) of the items and scales composing the HIS-Q-SF.

Participants
Twenty US clinical sites with specialties in urology and sexual medicine participated in the study. Eligible participants, who signed informed consents, included men who were at least 18 years old; diagnosed with hypogonadism (a clinical diagnosis based on historical laboratory result[s] of serum total testosterone concentration < 300 ng/dL by at least one laboratory test before enrollment in the study); switching TRT treatments or on maintenance therapy (currently on treatment with no change) or treatment naïve (new to treatment); and able to understand English. Diagnostic tests for patients who were currently taking treatment were required to have been conducted before initiating treatment. Patients were excluded if they had non-stabilized depression (stabilized depression was defined as being on the same antidepressant medication at the same dosage for 3 months), severe psychiatric illness, or addictions; history of or current obstructive sleep apnea; a clinically significant medical condition; or were taking a concurrent medication that would affect hormonal balance or sexual functioning (eg, phosphodiesterase type 5 inhibitors) or interfere with participants’ participation in the study. The sample size for the study was driven by the estimation of the factor analytic models; the total of 177 participants and estimation of a factor model with 14 items yields more than 12 observations per item. Sample size recommendations for factor analysis indicate that there should be at least five subjects for each item included in the factor analysis.10,11 The sample size in this study exceeded these recommendations.

Procedures and Measurements
The study protocol and procedures used were reviewed and approved by the appropriate institutional review committee (Ethical and Independent Review Services, September 25, 2013, protocol 13110-01). All study staff members at every site were

Table 1. Study events schedule
Visit 1, Visit 2, Week Week Week Week Visit 3, Items,
Study events Mode Screening baseline week 2* 4† 6† 8† 10† week 12* n Concepts measured Interpretation guidelines
Investigator/site completed
Clinical form Paper U Patients’ clinical —
characteristics
Clinical Global Paper U U U Single Physician’s overall Higher scores indicate
Impression–Severity item impression of patient’s greater severity
(CGI-S) hypogonadism
symptoms (no
symptoms, very mild,
mild, moderate, severe,
very severe)
Clinical Global Paper U U Single Clinician’s perception of —
Impression–Change item change in symptoms
(CGI-C) between study visits
(much improved,
minimally improved, no
change, minimally worse,
much worse)
Serum testosterone Blood draw U U‡ U‡ — —
testing
Medical report form Paper Completed as needed to report any changes to patient’s treatment — —
or testosterone levels§
Patient completed
Hypogonadism Impact of U U U U U U U 53 Sexual, physical signs and Lower scores indicate better
Symptoms symptoms, energy, sleep, function, fewer
Questionnaire (HIS-Q) cognition, and mood symptoms
Aging Male’s Symptoms Electronic U U U 17 Sleep difficulty, low energy, Lower scores indicate better
Scale (AMS)12 physical symptoms, functioning
affects sexual functioning
and mood
International Index of Electronic U U U 15 Domains of sexual function Higher scores indicate less
Erectile Function (erectile dysfunction, dysfunction
(IIEF)13 orgasmic function, sexual
desire, intercourse
satisfaction, and overall
satisfaction)
PROMIS Interest in Electronic U U U 4 Sexual function, including Higher scores indicate more
Sexual Activity Scale14 desire in past 30 d interest in sexual activity
PROMIS Global Electronic U U U 7 Satisfaction with sex life in Higher scores indicate

Satisfaction with Sex past 30 d greater satisfaction
Life Scale14
PROMIS Sleep Electronic U U U 8 Perceptions of sleep quality, Higher scores indicate
Disturbance Scale15 sleep depth, and greater sleep disturbance
restoration associated
with sleep
PROMIS Applied Electronic U U U 8 Perceptions of cognitive Higher scores indicate
Cognition–Abilities functioning and changes better cognitive
(SF 8a)16 in cognitive abilities (eg, functioning
concentration, memory)
over 7-d period
Short Form-12 Health Electronic U U U 12 Functional health and well- Higher scores indicate
Survey (SF-12)17 being, physical and better health status
mental health, and health
utility during typical day
and over past 4 wk
Anchor questions Electronic U U U U U 9 Recall: past 14 d; response Higher scores indicate
options: frequency or better functioning,
severity Likert response outcomes
scale; sexual activity,
libido, erectile
functioning, overall sexual
function, tiredness,
mood, cognitive
functioning, sleep, and
overall hypogonadism
condition
Daily diary‖ Electronic Daily from baseline to day 28 12 Sexual activity, erectile Higher scores indicate
function, energy, sleep, greater frequency, more
cognition, and mood symptoms
*At visits 1, 2, and 3, patients attended a study visit; however, all patient-reported outcome measurements for visits 2 and 3 were completed by patients at home, before the visit, at their regularly scheduled time on the electronic patient-reported outcome device.
†Patients did not come to the sites for study-related visits during these weeks but completed study questionnaires on the electronic patient-reported outcome device at home at each of these time points.
‡Only patients new to treatment had study-related blood draws at visits 2 and 3.
§Sites reported any additional serum testosterone blood sample results that were independent of the study blood draws at each visit.
‖A subset of 60 patients completed a daily diary of sexual activity domain-related questions and questions related to the other domains (energy, sleep, cognitive, mood) from baseline to day 28.


trained using a standardized training protocol on the purpose and procedures for the study. Participants completed a total of three in-person study visits at baseline, week 2, and week 12. Participants also completed assessments at home on an electronic PRO device. The first set of assessments was completed during their first in-clinic visit at baseline and then at home every 2 weeks from week 2 to week 12. A summary of site and patient-completed study events is presented in Table 1, which includes brief descriptions of each instrument.12–17 Testosterone levels were assessed through blood draws from all participants at baseline and from participants who were beginning treatment (treatment naïve) or switching to a different treatment (switchers) at weeks 2 and 12.

Statistical Analyses
Sociodemographic and clinical variables were used to characterize the patient sample using descriptive statistics. Item-level descriptive statistics (mean, SD, median, range, frequency) were used to evaluate the performance of individual HIS-Q-SF items. Confirmatory factor analysis with specification of a six-factor model (consistent with the original HIS-Q) was used to confirm the factor structure of the instrument. Model fit was assessed with the comparative fit index and the root mean squared error of approximation. In general, a model is considered to have good fit and explain the data well if the comparative fit index is at least 0.90.18,19 The root mean squared error of approximation is a measurement of fit assessing the discrepancy between the predicted and observed data per degree of freedom; values of 0.01, 0.05, and 0.08 suggest excellent, good, and mediocre fit, respectively.20 Rasch analyses21 were used to evaluate individual item and subscale properties. The confirmatory factor and Rasch analyses were conducted using Mplus22 and RUMM 2030,23 respectively.

After item evaluation of the individual HIS-Q-SF items and development of the scoring algorithm, the psychometric properties of the HIS-Q-SF were evaluated. Internal consistency reliability of the HIS-Q-SF domains was evaluated using the Cronbach α coefficient24 at baseline, with values of at least 0.70 indicating a reliable (precise) instrument.25

Test-retest reliability was examined for all HIS-Q-SF domain scores to examine the stability of the HIS-Q-SF over time within a stable population. Stable subjects were defined as those with “no change” in patient-rated anchor questions for each HIS-Q-SF domain and the total score from baseline to week 2. Test-retest reliability was assessed using intraclass correlation coefficients (ICCs) and paired-sample t-tests among stable patients only. ICCs range from 0 to 1.0, with higher scores indicating a more stable instrument. The hypothesis was that there would be no significant differences in scale scores when there was no change in disease status. ICCs should be statistically significant and high (>0.70). An ICC of at least 0.7 indicates good test-retest reliability, 0.4 to 0.7 indicates moderate test-retest reliability, and lower than 0.4 indicates low test-retest reliability.25–27

Convergent and divergent validity of the HIS-Q-SF was assessed. Pearson product-moment and Spearman rank correlation coefficients were used to estimate the relation at baseline between all HIS-Q-SF domain scores and (i) the Aging Male’s Symptoms Scale (AMS) sexual, somato-vegetative, and psychological domain scores; (ii) the International Index of Erectile Function (IIEF) erectile dysfunction, orgasmic function, sexual desire, intercourse satisfaction, and overall satisfaction scores; (iii) PROMIS Sexual Activity score; (iv) PROMIS Global Satisfaction with Sex Life score; (v) PROMIS Sleep Disturbance score; (vi) PROMIS Applied Cognition score; (vii) the Short Form-12 Health Survey (SF-12) vitality and physical component summary score; and (viii) Clinical Global Impression–Severity (CGI-S) scores. Correlations between the HIS-Q-SF domain scores and testosterone (free and total) also were assessed for the total sample at baseline. The HIS-Q-SF scores were expected to be moderately (r = 0.30–0.50) to highly (r > 0.50) correlated with conceptually corresponding measurements and domains, demonstrating convergent validity28 (eg, HIS-Q sexual domains and PROMIS and AMS sexual scales, HIS-Q energy domain and AMS somato-vegetative scale and SF-12 vitality item, HIS-Q sleep domain and PROMIS sleep, HIS-Q cognition domain and PROMIS cognition, and HIS-Q mood domain and AMS psychological scale). Divergent validity was assessed by examining correlations among domains that were hypothesized to be unrelated (eg, PROMIS sleep disturbance and HIS-Q-SF sexual domain and subdomain scores), and small correlations (r < 0.30) were expected.28

To evaluate known-groups validity, domain scores on the HIS-Q-SF were analyzed by disease severity based on the CGI-S. Mean scores on the HIS-Q-SF domains were compared for each of the CGI-S severity levels (no symptoms or very mild, mild, moderate, and severe) using analysis of covariance at baseline and week 12 controlling for age, sex, and race. In addition, the known-groups validity of the HIS-Q-SF was examined using baseline measurements of total and free testosterone.

To evaluate responsiveness, the extent to which the instrument can detect true change in participants known to have changed in clinical status, patient-rated anchor questions were used to characterize change from baseline to week 6 and from baseline to week 12. Responsiveness analyses also were conducted using the CGI-S score changes from baseline to week 12.

Responder definitions were identified for the HIS-Q-SF domains, subdomains, and total score using anchor-based and distribution-based methods. Mean scores for each HIS-Q-SF domain and subdomain for participants who improved by one point on each concept-specific anchor question were used to establish anchor-based responder definitions. Fractions of the baseline SD (0.2, 0.3, and 0.5 SD) and the standard error of measurement29,30 were used to establish distribution-based responder definitions. Then, the anchor- and distribution-based definitions were triangulated to derive final responder definitions of clinically meaningful change for each of the HIS-Q-SF domain, subdomain, and total scores.
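The distribution-based quantities named above can be illustrated with a short sketch. It assumes the conventional formula for the standard error of measurement, SEM = SD × sqrt(1 − reliability), which the text cites but does not spell out. The function names and the input values (a baseline SD of 20 points and a reliability of 0.84) are hypothetical and chosen only for illustration.

```python
import math


def sd_thresholds(baseline_sd, fractions=(0.2, 0.3, 0.5)):
    """Return {fraction: fraction * baseline SD}, one threshold per fraction."""
    return {f: f * baseline_sd for f in fractions}


def standard_error_of_measurement(baseline_sd, reliability):
    """Conventional SEM = SD * sqrt(1 - reliability).

    The formula is an assumption here; the study cites the SEM
    without stating how it was computed.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return baseline_sd * math.sqrt(1.0 - reliability)


if __name__ == "__main__":
    sd = 20.0           # hypothetical baseline SD on a 0-100 scale
    reliability = 0.84  # hypothetical reliability estimate
    print(sd_thresholds(sd))
    print(standard_error_of_measurement(sd, reliability))
```

Such distribution-based values only bound the size of a change that exceeds measurement noise; in the approach described here they are triangulated with anchor-based estimates rather than used alone.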


A direct comparison between the original HIS-Q and the HIS-Q-SF total and domain scores was conducted. Pearson correlation coefficients were calculated for the total, domain, and subdomain scores of the two measurements. It was expected that the same domains across measurements would be very highly correlated (>0.80).

RESULTS

The participants were recruited from 20 clinical sites across the United States (number of patients per site: mean = 9.9, SD = 4.6). The final analysis sample included data from 177 men, including 89 men who were on the same TRT throughout the study (maintenance patients), 41 men who were switching from one form of TRT to another at the start of the study (switchers), and 47 men who were initiating TRT for the first time (treatment-naïve patients).

Demographics and Clinical Characteristics
The men participating in the study had a mean age of 54.1 years (range = 23–83), and most were white (74.0%). Most reported being involved in an intimate relationship (82.5%; Table 2). The average duration of hypogonadism diagnosis was 2.2 years (SD = 3.2). The mean baseline testosterone level of participants was 507.6 ng/dL (Table 3). The men participating in the study had moderate to severe levels of impairment in sexual functioning as measured by the AMS (overall mean = 11.7, SD = 4.3), and clinician ratings of hypogonadism severity for the participants indicated mild to moderate symptom severity at baseline.

Table 2. Sociodemographic characteristics
Characteristic                                          Total sample (N = 177)
Age
  Mean (SD)                                             54.1 (11.4)
  Median (range)                                        55.0 (23.0–83.0)
  Missing, n (%)                                        1 (0.6)
Race, n (%)*
  Black or African American                             32 (18.1)
  White                                                 131 (74.0)
  Other†                                                7 (4.0)
  Missing                                               7 (4.0)
Ethnicity, n (%)
  Hispanic or Latino                                    9 (5.1)
  Not Hispanic or Latino                                166 (93.8)
  Missing                                               2 (1.1)
Employment status, n (%)
  Employed fulltime                                     105 (59.3)
  Employed part-time                                    12 (6.8)
  Student                                               2 (1.1)
  Unemployed, disabled, retired                         52 (29.4)
  Other‡                                                5 (2.8)
  Missing                                               1 (0.6)
Education, n (%)
  Secondary, high school, some college, trade school    88 (49.7)
  College degree                                        58 (32.8)
  Postgraduate degree                                   30 (16.9)
  Missing                                               1 (0.6)
Currently in an intimate relationship, n (%)
  Yes                                                   146 (82.5)
  No                                                    30 (16.9)
  Missing                                               1 (0.6)
*Categories are not mutually exclusive.
†Other race: Asian (n = 3), Native Hawaiian or Pacific Islander (n = 1), Hispanic (n = 1), Haitian (n = 1), and Jamaican (n = 1).
‡Other employment: self-employed (n = 4), and sales (n = 1).

HIS-Q-SF Item Evaluation, Factor Structure, and Scoring of HIS-Q-SF
Overall, the individual item-level analyses demonstrated acceptable distribution of the HIS-Q-SF item responses across the response categories and good distributional characteristics.

After item evaluation, factor analysis was completed on the 17-item HIS-Q-SF. Because the original HIS-Q had a six-factor solution (including the libido subdomain, sexual function subdomain, and energy, sleep, cognition, and mood domains), this model also was examined for the HIS-Q-SF. This six-factor model showed acceptable factor loadings (0.500–1.004) and demonstrated acceptable fit to the data (comparative fit index = 0.993, root mean squared error of approximation = 0.086; eTable 1).

Rasch analyses were conducted using baseline data on each of the domains and subdomains identified through the factor analyses. Item performance was very good, with all items demonstrating fit to the Rasch model (P > .05 for all comparisons) and good distributions of item thresholds (b range: libido = −4.8 to 3.6, sexual function = −2.8 to 2.9, energy = −5.6 to 5.6, sleep = −1.3 to 1.8, cognition = −2.9 to 1.4, mood = −3.4 to 2.1). The item thresholds were well matched to the distributions of individuals within each domain. There were a few minor issues with disordered thresholds (ie, some item response categories did not precisely distinguish between participants with different levels of hypogonadism severity) for three items (“difficulty achieving erections,” “difficulty ejaculating,” and “feeling sad”).

The final HIS-Q-SF scoring includes each of the 14 ordinal response scale items and yields five domain scores (sexual, energy, sleep, cognition, and mood) and two sexual subdomain scores (libido and sexual function); a total score also can be calculated. Scores are scaled from 0 to 100, where higher scores indicate greater levels of dysfunction. The open-ended items (items 1–3) representing numerical response data were not included in the final scoring algorithm but could provide useful information on the frequency of sexual activity.

Psychometric Evaluation of HIS-Q-SF

Reliability
The internal consistency reliability of the HIS-Q-SF instrument was evaluated for each of the HIS-Q-SF scores at baseline.
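Internal consistency in this study is summarized with the Cronbach α coefficient. As a minimal sketch of how that coefficient is computed from item-level responses, the function below applies the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the summed score). The function name and the small response matrix are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`: one list of k item scores per respondent."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so that each tuple holds all responses to one item.
    item_columns = list(zip(*rows))
    sum_item_variances = sum(variance(col) for col in item_columns)
    # Variance of each respondent's summed score across the k items.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


if __name__ == "__main__":
    # Hypothetical responses: 5 respondents x 3 items (illustration only).
    responses = [
        [1, 2, 2],
        [2, 3, 3],
        [3, 3, 4],
        [4, 5, 4],
        [5, 5, 5],
    ]
    print(cronbach_alpha(responses))
```

Because α depends on the number of items as well as their intercorrelation, domains with very few items can show low values even when the items are reasonably related.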


Table 3. Baseline participant clinical characteristics (site reported)
Characteristic                                          Total sample (N = 177)
Time since initial diagnosis of hypogonadism (y)
  Mean (SD) [range]                                     2.2 (3.2) [0.0–20.6]
  Unknown, n (%)                                        1 (0.6)
Provider-reported hypogonadism etiology, n (%)
  Primary congenital                                    23 (13.0)
  Primary acquired                                      81 (45.8)
  Secondary congenital                                  0 (0.0)
  Secondary acquired                                    28 (15.8)
  Combined                                              7 (4.0)
  Unknown                                               38 (21.5)
Specific suspected etiology or diagnosis, n (%)
  Pituitary adenoma or disorder                         2 (1.1)
  Testicular trauma or disorder                         2 (1.1)
  Other*                                                19 (10.7)
  Unknown                                               154 (87.0)
Chief complaint or presenting symptom, n (%)
  Erectile dysfunction                                  46 (26.0)
  Low libido                                            45 (25.4)
  Tiredness                                             29 (16.4)
  Fatigue                                               48 (27.1)
  Other†                                                5 (2.8)
  Unknown                                               4 (2.3)
History of testosterone replacement medications, n (%)
  No history of testosterone replacement medications    52 (29.4)
  Buccal                                                2 (1.1)
  Topical                                               69 (39.0)
  Patch                                                 6 (3.4)
  Subcutaneous pellet                                   20 (11.3)
  Injection                                             69 (39.0)
  Missing                                               1 (0.6)
BMI‡ (calculated), mean (SD) [range]                    30.2 (5.2) [21.5–53.2]
Baseline serum total testosterone concentration (n = 172)§
  Concentration (ng/dL), mean (SD) [range]              507.6 (495.4) [19.7–4,160.0]
  Missing, n (%)                                        5 (2.8)
Baseline free testosterone concentration (n = 149)
  Concentration (ng/dL), mean (SD) [range]              15.0 (15.9) [0.5–126.0]
  Missing, n (%)                                        28 (15.8)
BMI = body mass index.
*Other suspected etiology: senescent (n = 13), obesity (n = 2), aging (n = 3), and testicular failure (n = 1).
†Other chief complaint: no symptom (n = 1), poor concentration (n = 1), low energy (n = 1), weakness (n = 1), and weight gain (n = 1).
‡BMI = (weight in pounds × 703)/(height in inches)².
§Baseline serum total testosterone concentration lower than 300 ng/dL in 68 patients.

Internal consistency reliability was acceptable for the sexual domain (0.82), energy domain (0.88), libido subdomain (0.78), sexual function subdomain (0.91), and total score (0.84). The mood domain demonstrated internal consistency that was slightly lower than the accepted threshold (Cronbach α = 0.64). The internal consistency reliability estimates of the sleep and cognition domains were lower at 0.39 and 0.44, respectively, and did not reach the acceptable target of a Cronbach α of at least 0.70.

Test-retest reliability was assessed for stable patients from baseline to week 2. Using the patient-completed anchor questions to identify stable patients, test-retest reliability for the sexual domain, libido subdomain, sexual function subdomain, energy domain, mood domain, and HIS-Q-SF total score was very good (ie, ICCs > 0.70). The sleep and cognition domains had moderate test-retest reliability (ICC = 0.67 and 0.61, respectively).

Validity
Overall, good convergent and divergent validity was demonstrated for all HIS-Q-SF domains and subdomains, as reflected by a pattern of statistically significant moderate to large correlations (r > 0.30) between each HIS-Q-SF domain score or subdomain score and the corresponding PRO or clinician rating at baseline. As expected, acceptable correlations (r > 0.30) were generally observed between HIS-Q-SF domains and PRO or clinical subscales measuring similar concepts, and smaller correlations were demonstrated between items that were less conceptually related (r ≤ 0.30). For example, the sexual domain and libido and sexual function subdomains were strongly correlated with the AMS sexual scale (r = 0.59, 0.39, 0.55; P < .0001 for all comparisons), the IIEF sexual desire domain (r = 0.49, 0.71, 0.35; P < .0001 for all comparisons), and the PROMIS Sexual Activity score (r = 0.57, 0.76, 0.40; P < .0001 for all comparisons). The sexual domain and sexual function subdomain also were strongly correlated with the PROMIS Global Satisfaction with Sex Life score (r = 0.59, 0.57; P < .0001). The energy domain was strongly correlated with the AMS somato-vegetative scale (r = 0.66; P < .0001) and the SF-12 vitality item (r = 0.62; P < .0001). The sleep domain was strongly correlated with the PROMIS Sleep Disturbance score (r = 0.66; P < .0001), the cognition domain was strongly correlated with the PROMIS Applied Cognition score (r = 0.65; P < .0001), and the mood domain was strongly correlated with the AMS psychological scale (r = 0.78; P < .0001). Divergent validity was demonstrated through low correlations among conceptually unrelated scales. In the total sample, the HIS-Q-SF sexual, energy, sleep, and cognition domains, the libido and sexual function subdomains, and the total score demonstrated notable (but small to moderate) correlations with testosterone levels at baseline (r = 0.16 to 0.38; P < .05 for all comparisons).

The known-groups validity was good for all HIS-Q-SF domains and subdomains based on clinician-rated impression


Table 4A. Known-groups validity: CGI-S symptom severity categories at baseline
                                 CGI-S symptom severity categories: n, LS mean (SE)
                                 No symptoms                                                     Overall F-test*   Pairwise comparison†
HIS-Q-SF score                   or very mild    Mild            Moderate        Severe          F       P value   (P value)
Sexual symptoms domain           34 31.3 (3.9)   36 41.5 (3.8)   73 49.8 (2.7)   30 61.8 (4.2)   10.53   <.0001    2‡, 3‖, 5‡
Libido subdomain                 34 34.2 (3.6)   36 43.1 (3.5)   73 42.8 (2.4)   30 48.8 (3.8)   2.72    .0459
Sexual function subdomain        35 31.0 (5.5)   36 42.4 (5.4)   72 55.1 (3.8)   30 70.6 (5.9)   9.37    <.0001    2‡, 3‖, 5‡
Energy symptoms domain           35 34.6 (4.2)   37 44.3 (4.1)   73 51.7 (2.9)   30 59.6 (4.5)   6.45    .0004     2‡, 3‡
Sleep symptoms domain            35 27.5 (3.5)   37 28.7 (3.4)   71 34.9 (2.5)   30 45.8 (3.8)   5.28    .0017     3‡, 5‡
Cognition symptoms domain        35 24.3 (2.9)   36 29.9 (2.8)   73 36.1 (2.0)   30 45.0 (3.1)   9.26    <.0001    2‡, 3‖, 5‡
Mood symptoms domain             35 20.2 (2.9)   37 27.5 (2.8)   72 31.1 (2.0)   30 37.5 (3.1)   6.01    .0006     2‡, 3‡
HIS-Q-SF total score (14 items)  34 27.6 (2.3)   34 35.6 (2.3)   70 41.7 (1.6)   30 51.6 (2.5)   18.06   <.0001    2‖, 3‖, 5§, 6‡
CGI-S = Clinical Global Impression–Severity; HIS-Q-SF = Hypogonadism Impact of Symptoms Questionnaire Short Form; LS = least squares; SE = standard error.
*General linear model (PROC GLM): 1 = no symptoms or very mild vs mild; 2 = no symptoms or very mild vs moderate; 3 = no symptoms or very mild vs severe; 4 = mild vs moderate; 5 = mild vs severe; 6 = moderate vs severe.
†Pairwise comparisons between LS means were performed using the Scheffe test adjusting for multiple comparisons.
‡P < .05; §P < .001; ‖P < .0001.

of severity of symptoms (P < .05 for all comparisons; Table 4A). All domains also discriminated between categories of total testosterone levels (P < .05; Table 4B), except the libido subdomain (P = .0536). The sexual symptoms domain, sexual function subdomain, energy domain, and sleep domain and the total score discriminated between free testosterone levels (P < .05 for all comparisons; eTable 2).

Responsiveness
Responsiveness of the instrument was very good for each of the domains, subdomains, and total score at each time point, in particular when assessed using patients’ reports of their condition. Changes in each of the patient-reported anchor questions were reflected by significant changes in the expected direction for all HIS-Q-SF scales from baseline to week 6 and from baseline to week 12 (P < .05 for all comparisons; Table 5). Although lower, responsiveness also was demonstrated using changes based on the CGI-S; all domains, except cognition and libido, showed significant changes in the expected direction from baseline to week 12 (P < .05 for all comparisons; Table 6).

Responder Definitions
Responder definitions were derived using anchor- and distribution-based methods. To obtain anchor-based estimates, the mean scores for participants who improved by one point on the anchor questions are reported in Table 5 for each domain. The anchor-based and distribution-based responder estimates are

Table 4B. Known-groups validity: total testosterone categories at baseline

Total testosterone categories

| HIS-Q-SF score | <300 ng/dL: N | LS mean (SE) | 300-500 ng/dL: N | LS mean (SE) | >500 ng/dL: N | LS mean (SE) | Overall F-test*: F | P value | Pairwise comparison† (P value) |
|---|---|---|---|---|---|---|---|---|---|
| Sexual symptoms domain | 68 | 54.3 (2.9) | 55 | 46.1 (3.2) | 46 | 37.4 (3.5) | 6.91 | .0013 | 2‡ |
| Libido subdomain | 68 | 46.9 (2.5) | 56 | 42.2 (2.8) | 45 | 37.2 (3.1) | 2.98 | .0536 | |
| Sexual function subdomain | 68 | 59.2 (4.1) | 54 | 50.2 (4.6) | 47 | 39.0 (4.9) | 4.98 | .0079 | 2‡ |
| Energy symptoms domain | 68 | 53.9 (3.1) | 56 | 46.7 (3.4) | 47 | 40.7 (3.7) | 3.90 | .0222 | 2‡ |
| Sleep symptoms domain | 67 | 38.1 (2.6) | 55 | 35.7 (2.9) | 47 | 26.6 (3.1) | 4.33 | .0147 | 2‡ |
| Cognition symptoms domain | 68 | 39.2 (2.1) | 55 | 32.3 (2.4) | 47 | 28.5 (2.6) | 5.55 | .0046 | 2‡ |
| Mood symptoms domain | 68 | 33.0 (2.1) | 55 | 28.5 (2.4) | 47 | 24.5 (2.6) | 3.28 | .0400 | 2‡ |
| HIS-Q-SF total score (14 items) | 67 | 45.2 (1.8) | 52 | 38.2 (2.0) | 45 | 32.6 (2.2) | 10.12 | <.0001 | 1‡, 2‖ |

HIS-Q-SF = Hypogonadism Impact of Symptoms Questionnaire Short Form; LS = least squares; SE = standard error.
*General linear model (PROC GLM).
†Pairwise comparisons between LS means were performed using the Scheffe test adjusting for multiple comparisons: 1 = <300 vs 300-500 ng/dL; 2 = <300 vs >500 ng/dL; 3 = 300-500 vs >500 ng/dL.
‡P < .05; §P < .001; ‖P < .0001.

J Sex Med 2017;14:1046-1058

Table 5. Responsiveness and anchor-based interpretation: HIS-Q-SF score change by concept-specific anchor question score change

Changes in sexual activity

| HIS-Q-SF score change | Interval | Decline (1): n | LS mean (SE) | Stable (0): n | LS mean (SE) | Improvement (1): n | LS mean (SE) | Improvement (2): n | LS mean (SE) | Overall F-test*: F | P value | Pairwise comparison† (P value) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sexual domain by sexual activity anchor | Baseline to week 6 | 30 | 10.5 (3.3) | 65 | 2.6 (2.2) | 42 | 13.7 (2.8) | 18 | 43.9 (4.2) | 38.31 | <.0001 | 1‡, 2‖, 3‖, 4‡, 5‖, 6‖ |
| Sexual domain by sexual activity anchor | Baseline to week 12 | 25 | 6.6 (3.7) | 62 | 3.3 (2.4) | 41 | 11.7 (2.9) | 20 | 34.5 (4.2) | 20.55 | <.0001 | 2‡, 3‖, 5‖, 6§ |
| Sexual domain by overall sexual function anchor | Baseline to week 6 | 31 | 11.9 (3.5) | 54 | 5.6 (2.7) | 41 | 10.7 (3.1) | 29 | 29.3 (3.7) | 22.36 | <.0001 | 1‡, 2‖, 3‖, 5‖, 6‡ |
| Sexual domain by overall sexual function anchor | Baseline to week 12 | 24 | 5.2 (3.9) | 63 | 1.3 (2.4) | 27 | 14.8 (3.7) | 34 | 25.2 (3.3) | 16.63 | <.0001 | 2‡, 3‖, 4‡, 5‖ |
| Libido subdomain by libido anchor | Baseline to week 6 | 39 | 9.9 (2.3) | 64 | 4.7 (1.8) | 42 | 12.8 (2.2) | 9 | 27.8 (4.7) | 26.79 | <.0001 | 1‖, 2‖, 3‖, 4‡, 5§, 6‡ |
| Libido subdomain by libido anchor | Baseline to week 12 | 34 | 7.4 (3.0) | 68 | 1.1 (2.1) | 33 | 9.1 (3.0) | 13 | 26.9 (4.8) | 14.17 | <.0001 | 2‡, 3‖, 5‖, 6‡ |
| Sexual function subdomain by erectile function anchor | Baseline to week 6 | 31 | 3.0 (6.0) | 79 | 5.7 (3.7) | 26 | 20.5 (6.5) | 19 | 27.6 (7.6) | 3.55 | .0160 | |
| Sexual function subdomain by erectile function anchor | Baseline to week 12 | 35 | 1.0 (5.0) | 66 | 8.3 (3.7) | 26 | 18.9 (5.9) | 21 | 31.4 (6.5) | 5.34 | .0016 | 2‡, 3‡ |
| Sexual function subdomain by overall sexual function anchor | Baseline to week 6 | 31 | 17.2 (5.3) | 54 | 8.3 (4.0) | 41 | 14.0 (4.6) | 29 | 38.2 (5.5) | 17.90 | <.0001 | 1‡, 2§, 3‖, 5§, 6‡ |
| Sexual function subdomain by overall sexual function anchor | Baseline to week 12 | 24 | 5.9 (5.6) | 63 | 1.2 (3.4) | 27 | 23.2 (5.2) | 34 | 34.6 (4.7) | 16.07 | <.0001 | 2‡, 3‖, 4‡, 5‖ |
| Energy symptom domain by tiredness anchor | Baseline to week 6 | 22 | 8.5 (4.4) | 70 | 2.7 (2.5) | 52 | 19.7 (2.9) | 11 | 42.1 (6.2) | 21.49 | <.0001 | 1‡, 2‖, 3‖, 4§, 5‖ |
| Energy symptom domain by tiredness anchor | Baseline to week 12 | 26 | 10.6 (3.9) | 56 | 6.3 (2.7) | 49 | 21.7 (2.9) | 16 | 49.2 (5.0) | 34.38 | <.0001 | 1§, 2‖, 3‖, 4‡, 5‖, 6‡ |
| Mood symptom domain by mood anchor | Baseline to week 6 | 46 | 12.3 (2.2) | 64 | 2.0 (1.9) | 33 | 9.3 (2.6) | 12 | 22.9 (4.3) | 24.83 | <.0001 | 1‡, 2‖, 3‖, 4‡, 5‖ |
| Mood symptom domain by mood anchor | Baseline to week 12 | 37 | 9.0 (2.5) | 59 | 1.3 (2.0) | 36 | 6.3 (2.5) | 16 | 25.5 (3.8) | 21.35 | <.0001 | 2§, 3‖, 5‖, 6§ |
| Cognition symptom domain by cognition anchor | Baseline to week 6 | 43 | 7.0 (2.5) | 69 | 0.0 (2.0) | 29 | 8.6 (3.1) | 14 | 21.4 (4.4) | 12.55 | <.0001 | 2‡, 3‖, 5§ |
| Cognition symptom domain by cognition anchor | Baseline to week 12 | 30 | 7.5 (3.2) | 75 | 0.2 (2.0) | 29 | 10.8 (3.3) | 14 | 18.8 (4.7) | 9.99 | <.0001 | 2‡, 3§, 4‡, 5‡ |
| Sleep symptom domain by sleep anchor | Baseline to week 6 | 51 | 8.8 (2.3) | 58 | 3.7 (2.1) | 39 | 10.9 (2.6) | 6 | 35.4 (6.7) | 19.85 | <.0001 | 1‡, 2‖, 3‖, 5§, 6‡ |
| Sleep symptom domain by sleep anchor | Baseline to week 12 | 34 | 8.8 (3.1) | 54 | 4.9 (2.5) | 48 | 9.6 (2.6) | 10 | 35.0 (5.7) | 16.90 | <.0001 | 1‡, 2§, 3‖, 5‖, 6‡ |
| HIS-Q-SF total score by overall hypogonadism anchor | Baseline to week 6 | 29 | 6.5 (2.2) | 54 | 0.9 (1.6) | 40 | 7.6 (1.9) | 28 | 20.4 (2.3) | 27.18 | <.0001 | 2‖, 3‖, 5‖, 6§ |
| HIS-Q-SF total score by overall hypogonadism anchor | Baseline to week 12 | 22 | 5.1 (2.7) | 62 | 1.8 (1.6) | 25 | 10.6 (2.5) | 34 | 19.1 (2.2) | 21.11 | <.0001 | 2§, 3‖, 4‡, 5‖ |

HIS-Q-SF = Hypogonadism Impact of Symptoms Questionnaire Short Form; LS = least squares; SE = standard error.
*General linear model (PROC GLM).
†Pairwise comparisons between LS means were performed using the Scheffe test adjusting for multiple comparisons: 1 = decline vs stable group; 2 = decline vs improvement (1) group; 3 = decline vs improvement (>1) group; 4 = stable vs improvement (1) group; 5 = stable vs improvement (>1) group; 6 = improvement (1) vs improvement (>1) group.
‡P < .05; §P < .001; ‖P < .0001.
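
The anchor-based estimates in Table 5 rest on grouping participants by how much their anchor response changed and then averaging the HIS-Q-SF score change within each group. A minimal sketch of that grouping follows. The function names, the sign convention (positive anchor change meaning improvement), and the records are assumptions made for illustration and are not study data.

```python
from collections import defaultdict
from statistics import mean


def classify_anchor_change(delta):
    """Map a change on the anchor item to the four groups named in Table 5.

    Assumption: a positive delta means the patient reported improvement.
    """
    if delta < 0:
        return "decline"
    if delta == 0:
        return "stable"
    return "improvement (1)" if delta == 1 else "improvement (>1)"


def mean_change_by_group(records):
    """records: iterable of (anchor_delta, score_change) pairs."""
    by_group = defaultdict(list)
    for anchor_delta, score_change in records:
        by_group[classify_anchor_change(anchor_delta)].append(score_change)
    return {group: mean(changes) for group, changes in by_group.items()}


# Hypothetical (anchor_delta, score_change) pairs, not study data.
records = [(-1, 5.0), (0, 1.0), (0, -1.0), (1, -10.0), (1, -14.0), (2, -30.0)]
summary = mean_change_by_group(records)

# Magnitude of change among one-point improvers: the anchor-based estimate.
anchor_estimate = abs(summary["improvement (1)"])
```

The one-point improvement group is the natural reference for a responder threshold because it represents the smallest improvement that patients themselves reported.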

Table 6. Responsiveness: HIS-Q-SF score change by CGI-S score change from baseline to week 12

CGI-S Score Change

| HIS-Q-SF score change | Decline (1): n | LS mean (SE) | Stable (0): n | LS mean (SE) | Improvement (1): n | LS mean (SE) | Overall F-test*: F | P value | Pairwise comparison† (P value) |
|---|---|---|---|---|---|---|---|---|---|
| Sexual domain | 13 | 6.9 (5.9) | 58 | 0.7 (2.8) | 73 | 15.9 (2.5) | 11.70 | <.0001 | 1§, 2‡ |
| Libido subdomain | 14 | 3.6 (5.0) | 57 | 0.7 (2.5) | 73 | 7.7 (2.2) | 3.21 | .0433 | 1* |
| Sexual subdomain | 14 | 7.7 (7.8) | 58 | 1.6 (3.8) | 72 | 22.9 (3.4) | 11.87 | <.0001 | 1§, 2‡ |
| Energy domain | 16 | 0.0 (6.6) | 57 | 9.6 (3.5) | 73 | 17.6 (3.1) | 3.49 | .0332 | |
| Sleep domain | 16 | 0.0 (5.2) | 57 | 1.3 (2.8) | 72 | 10.2 (2.5) | 3.56 | .0311 | |
| Cognition domain | 16 | 7.0 (4.7) | 57 | 0.9 (2.5) | 73 | 4.8 (2.2) | 2.79 | .0646 | |
| Mood domain | 16 | 6.8 (4.4) | 58 | 0.7 (2.3) | 72 | 4.9 (2.1) | 3.49 | .0332 | |
| Total score | 13 | 2.8 (3.9) | 54 | 1.8 (1.9) | 71 | 11.9 (1.7) | 11.03 | <.0001 | 1§, 2‡ |

CGI-S = Clinical Global Impression-Severity; HIS-Q-SF = Hypogonadism Impact of Symptoms Questionnaire Short Form; LS = least squares; SE = standard error.
*General linear model (PROC GLM).
†Pairwise comparisons between LS means were performed using the Scheffe test adjusting for multiple comparisons: 1 = improvement vs stable; 2 = improvement vs decline; 3 = stable vs decline.
‡P < .05; §P < .001; ‖P < .0001.

presented in Figure 1, as are the final responder definitions that were determined by triangulating across all estimates (responder definitions: sexual = 10.0; libido = 12.5; sexual function = 16.6; energy = 12.5; sleep = 12.5; cognition = 12.5; mood = 8.3; total = 7.1; Figure 1).

Comparison of Original HIS-Q and HIS-Q-SF
Correlational Analysis
Pearson correlation coefficients were calculated for the domain, subdomain, and total scores from the HIS-Q-SF and the original HIS-Q, a published instrument with demonstrated

Figure 1. Summary of anchor- and distribution-based estimates of clinically meaningful change for the Hypogonadism Impact of Symptoms Questionnaire Short Form. Vertical axis: clinically meaningful change estimates (0.00 to 20.00). Horizontal axis: HIS-Q domains and subdomains. Series: 0.2 standard deviation; 0.3 standard deviation; 0.5 standard deviation; standard error of measurement; anchor-based, week 2; anchor-based, week 6; anchor-based, week 12; triangulation.
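
The distribution-based series named in Figure 1 (fractions of a standard deviation and the standard error of measurement) can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the SD is taken from a list of baseline scores, the SEM uses the conventional formula SD × sqrt(1 − reliability), and the scores and reliability value are invented, not study data.

```python
from math import sqrt
from statistics import stdev


def distribution_based_estimates(baseline_scores, reliability):
    """Return the four distribution-based change estimates for one scale.

    Assumptions: sample SD of baseline scores; reliability is a coefficient
    between 0 and 1 (for example, test-retest or internal consistency).
    """
    sd = stdev(baseline_scores)
    return {
        "0.2 SD": 0.2 * sd,
        "0.3 SD": 0.3 * sd,
        "0.5 SD": 0.5 * sd,
        "SEM": sd * sqrt(1 - reliability),
    }


# Hypothetical baseline domain scores and reliability (illustration only).
estimates = distribution_based_estimates([10, 20, 30, 40, 50], reliability=0.75)
```

With a reliability of 0.75 the SEM equals half a standard deviation, which shows why the SEM and the 0.5 SD estimate often land close together for scales with moderate reliability. Triangulation then weighs these values against the anchor-based estimates to settle on a single responder definition per scale.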


reliability and validity.[7] Same domain, subdomain, and total score correlations were very high (r = 0.91-0.97). In particular, the sexual domain and total scores across the two versions of the HIS-Q were very highly correlated at 0.97 and 0.96, respectively.

DISCUSSION
The 17-item HIS-Q-SF is an abbreviated version of the original 28-item HIS-Q PRO instrument, which was designed to assess changes in hypogonadal symptoms in response to TRT. The present analysis provides strong support for the psychometric properties (reliability, validity, and responsiveness) of the reduced set of items in the HIS-Q-SF. When comparing the psychometric properties between the original and short forms, in general the measurements performed in a similar manner (for a comparison, see eTable 3), which would be expected because the HIS-Q-SF includes a subset of items from the original HIS-Q (Appendix B). However, there were a few notable differences. For internal consistency reliability measured at baseline, the HIS-Q-SF had higher Cronbach α values for the sexual domains and subdomains, but lower values for the sleep (0.39 vs 0.58), cognition (0.44 vs 0.65), and mood (0.64 vs 0.85) domains. This is not surprising because scales with larger numbers of items tend to have higher internal consistency reliability. The test-retest reliability of the two versions was very good and fairly consistent across domains.

The convergent validity was very consistent across the HIS-Q and HIS-Q-SF versions. The libido subdomain of the HIS-Q-SF did not distinguish between patients with different categories of total testosterone levels, which differs from the findings for the original HIS-Q. Other measurements of known-groups validity were generally very good and comparable between measurements. The responsiveness of the HIS-Q-SF and the HIS-Q also was highly similar. Overall, the HIS-Q-SF and original HIS-Q had highly similar psychometric characteristics.

All domains from the 28-item original HIS-Q were retained to closely align the HIS-Q and HIS-Q-SF measurements. Careful consideration was given to the selection of the final HIS-Q-SF items based on a combination of direct patient input, quantitative data, and clinical expert feedback. The authors acknowledge that “morning erections,” an item included in the original HIS-Q, is an important clinical symptom. However, in the qualitative research conducted to confirm the items in the HIS-Q-SF, including concept elicitation, patients did not raise this symptom as one of the most relevant or important, which would have been needed to support its inclusion in a short form. For those investigators specifically interested in the item on morning erections, the complete three-item libido scale from the longer HIS-Q could be included in the HIS-Q-SF with no substantive change in psychometric qualities, content coverage, or respondent burden (18 vs 17 items).

The final 17 items of the HIS-Q-SF fit on a single page and can easily be administered in clinical settings in less than 2 minutes. The original HIS-Q form is expected to be used in future clinical trials evaluating TRTs. Therefore, the alignment of items and domains will allow for meaningful comparisons to be made by individual clinicians. Although the results suggest that the correlations among the domains, subdomains, and total score for the HIS-Q and HIS-Q-SF were very high, to some extent these high values are expected because the items overlap and response data were drawn from the same dataset. Further research is necessary to estimate these correlations among separate administrations of the HIS-Q and HIS-Q-SF. Future research also could be conducted to establish a common score metric between the original HIS-Q and the HIS-Q-SF.

As evidenced by the decreased responsiveness of the HIS-Q-SF to clinician ratings, the results of this study also highlighted that changes in particular symptoms, such as libido, cognition, and sleep, are often not accurately detected or rated by a clinician; therefore, a brief PRO tool that can be used in clinical practice might be useful. This highlights the potential utility of the HIS-Q-SF because this measurement was developed and designed specifically for tracking the change of hypogonadal symptoms after the initiation of TRT. Unlike many other existing measurements that are often used in studies of these patients, the HIS-Q-SF provides a comprehensive and content-valid means of assessing the broad range of symptoms that affect this population specifically. In addition, the HIS-Q-SF provides numerical feedback on frequency of sexual activity and Likert-type feedback on symptoms from all relevant domains; this distinguishes the measurement from other available tools. The HIS-Q-SF also was developed in accord with the FDA guidelines for PROs, including multiple rounds of direct patient input.[8] The FDA was involved in the review of the various iterations of the HIS-Q. The developers believe that using the HIS-Q-SF will allow clinicians to monitor patients’ hypogonadal symptoms in response to treatment and over time.

There were several limitations that should be noted. The testosterone levels for the participating men were not as low as anticipated at baseline owing to the three categories of participants sought by the original research. This could have limited the authors’ ability to examine the relations between the HIS-Q-SF and testosterone levels. The participants in the present study were recruited through clinical sites, and the utility of the HIS-Q-SF among other samples of men, for example, those from the general population or men who have not sought treatment for symptoms of hypogonadism, is unknown and could be a focus of future research. The correlations that have been reported between scores on the full HIS-Q and the HIS-Q-SF are based on data from a single administration of the HIS-Q; future research should evaluate correlations between the measurements when each is administered as a standalone instrument in the same sample. Further, the analyses were conducted using data collected through administration of the full HIS-Q; future


studies should administer the standalone HIS-Q-SF (ie, only the 17-item form) and replicate the psychometric analyses reported in the present study.

CONCLUSIONS
The 17-item HIS-Q-SF is a brief assessment tool for the evaluation of hypogonadal symptoms. Like the original HIS-Q, the HIS-Q-SF demonstrated good reliability, validity, and responsiveness. The measurement could be a useful tool for application in clinical practice, epidemiologic studies, and other academic research settings.

Corresponding Author: Heather L. Gelhorn, PhD, Senior Research Scientist, Evidera, 7101 Wisconsin Avenue, Suite 1400, Bethesda, MD 20814, USA. Tel: 1-970-363-7333; Fax: 1-301-654-9864; E-mail: heather.gelhorn@evidera.com

Conflicts of Interest: This work was conducted by Evidera, an independent research organization. Dr Gelhorn, Ms Roberts, and Dr Revicki are employees of Evidera; Evidera received research study support from AbbVie. Mr Miller, Dr Khandelwal, and Mr Hepp are employees and stockholders of AbbVie. Dr Dobs and Dr DeRogatis have no conflicts of interest to declare.

Funding: None.

STATEMENT OF AUTHORSHIP
Category 1
(a) Conception and Design
Heather L. Gelhorn; Laurie J. Roberts; Dennis A. Revicki; Michael G. Miller
(b) Acquisition of Data
Heather L. Gelhorn; Dennis A. Revicki; Michael G. Miller
(c) Analysis and Interpretation of Data
Heather L. Gelhorn; Laurie J. Roberts; Dennis A. Revicki; Leonard R. DeRogatis; Adrian Dobs; Zsolt Hepp; Michael G. Miller

Category 2
(a) Drafting the Article
Heather L. Gelhorn; Laurie J. Roberts
(b) Revising It for Intellectual Content
Heather L. Gelhorn; Laurie J. Roberts; Nikhil Khandelwal; Dennis A. Revicki; Leonard R. DeRogatis; Adrian Dobs; Zsolt Hepp; Michael G. Miller

Category 3
(a) Final Approval of the Completed Article
Heather L. Gelhorn; Laurie J. Roberts; Nikhil Khandelwal; Dennis A. Revicki; Leonard R. DeRogatis; Adrian Dobs; Zsolt Hepp; Michael G. Miller

REFERENCES
1. Sato Y, Tanda H, Kato S, et al. Prevalence of major depressive disorder in self-referred patients in a late onset hypogonadism clinic. Int J Impot Res 2007;19:407-410.
2. Zitzmann M, Faber S, Nieschlag E. Association of specific symptoms and metabolic risks with serum testosterone in older men. J Clin Endocrinol Metab 2006;91:4335-4343.
3. Gooren LJ. Endocrine aspects of ageing in the male. Mol Cell Endocrinol 1998;145:153-159.
4. Morley JE. Testosterone replacement and the physiologic aspects of aging in men. Mayo Clin Proc 2000;75(Suppl):S83-S87.
5. Gelhorn HL, Vernon MK, Stewart KD, et al. Content validity of the Hypogonadism Impact of Symptoms Questionnaire (HIS-Q): a patient-reported outcome measure to evaluate symptoms of hypogonadism. Patient 2016;9:181-190.
6. Jensen RE, Snyder CF, Basch E, et al. All together now: findings from a PCORI workshop to align patient-reported outcomes in the electronic health record. J Comp Eff Res 2016;5:561-567.
7. Gelhorn H, Dashiell-Aje E, Miller M, et al. Psychometric evaluation of the Hypogonadism Impact of Symptoms Questionnaire (HIS-Q™). J Sex Med 2016;13:1737-1749.
8. Food and Drug Administration. Guidance for industry on patient-reported outcome measures: use in medical product development to support labeling claims. Fed Regist 2009;74:65132-65133.
9. Gelhorn H, Bodhani A, Wahala L, et al. A qualitative study to inform the development of the Hypogonadism Impact of Symptoms Questionnaire Short Form (HIS-Q SF). In press.
10. Hogarty KY, Hines CV, Kromrey JD, et al. The quality of factor solutions in exploratory factor analysis: the influence of sample size, communality, and overdetermination. Educ Psychol Meas 2005;65:202-226.
11. MacCallum RC, Widaman KF, Zhang S, et al. Sample size in factor analysis. Psychol Methods 1999;4:84-99.
12. Heinemann LAJ, Zimmerman T, Vermeulen A, et al. A new ‘Aging Males’ Symptoms’ (AMS) rating scale. Aging Male 1992;2:105-114.
13. Rosen RC, Riley A, Wagner G, et al. The International Index of Erectile Function (IIEF): a multidimensional scale for assessment of erectile dysfunction. Urology 1997;49:822-830.
14. Flynn KE, Jeffery DD, Keefe FJ, et al. Sexual functioning along the cancer continuum: focus group results from the Patient-Reported Outcomes Measurement Information System (PROMIS®). Psychooncology 2011;20:378-386.
15. Yu L, Buysse DJ, Germain A, et al. Development of short forms from the PROMIS sleep disturbance and sleep-related impairment item banks. Behav Sleep Med 2011;10:6-24.
16. Becker H, Stuifbergen A, Morrison J. Promising new approaches to assess cognitive functioning in people with multiple sclerosis. Int J MS Care 2012;14:71-76.
17. Ware JE, Kosinski M, Turner-Bowker DM, et al. How to score version 2 of the SF-12 Health Survey. Lincoln, RI: Quality Metrics; 2002.
18. Hu L-T, Bentler PM. Evaluating model fit. In: Hoyle RH, ed. Structural equation modelling: concepts, issues and applications. Thousand Oaks, CA: Sage; 1995. p. 77-99.


19. MacCallum RC, Browne MW, Sugawara HM. Power analysis and determination of sample size for covariance structure modeling. Psychol Methods 1996;1:130-149.
20. Browne MW, Cudeck R. Alternative ways of assessing model fit. In: Bollen KA, Long JS, eds. Testing structural equation models. Beverly Hills, CA: Sage; 1993. p. 136-162.
21. Jones PW, Chen WH, Wilcox TK, et al; EXACT-PRO Study Group. Characterizing and quantifying the symptomatic features of COPD exacerbations. Chest 2011;139:1388-1394.
22. Muthén LK, Muthén B. Mplus user’s guide. 3rd ed. Los Angeles, CA: Muthén & Muthén; 1998-2004.
23. RUMM 2030: Rasch unidimensional measurement models. Duncraig, Australia: RUMM Laboratory Pty Ltd; 2010.
24. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297-334.
25. Hays RD, Revicki DA. Reliability and validity, including responsiveness. In: Fayers PM, Hays RD, eds. Assessing quality of life in clinical trials. 2nd ed. New York: Oxford University Press; 2005. p. 25-39.
26. Nunnally JC, Bernstein IH. Psychometric theory. 3rd ed. New York: McGraw-Hill; 1994.
27. Leidy NK, Revicki DA, Geneste B. Recommendations for evaluating the validity of quality of life claims for labeling and promotion. Value Health 1999;2:113-127.
28. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
29. Wyrwich KW, Nienaber NA, Tierney WM, et al. Linking clinical relevance and statistical significance in evaluating intra-individual changes in health-related quality of life. Med Care 1999;37:469-478.
30. Wyrwich KW, Tierney WM, Wolinsky FD. Further evidence supporting an SEM-based criterion for identifying meaningful intra-individual changes in health-related quality of life. J Clin Epidemiol 1999;52:861-873.

SUPPLEMENTARY DATA
Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.jsxm.2017.05.013.
