
Psychophysiology, 36 (1999), 233–244. Cambridge University Press. Printed in the USA.

Copyright © 1999 Society for Psychophysiological Research

How many nights are enough? The short-term stability of sleep parameters in elderly insomniacs and normal sleepers

WILLIAM K. WOHLGEMUTH,a JACK D. EDINGER,a,b ANA I. FINS,c and ROBERT J. SULLIVAN, JR.b,d
a Duke Sleep Disorders Center, Duke University, Durham, NC, USA
b Durham VA Medical Center, Durham, NC, USA
c Department of Psychiatry, University of Miami, Miami, FL, USA
d Duke Center for the Study of Aging and Human Development, Duke University, Durham, NC, USA

Abstract
Temporal stability is an important fundamental quality when measuring sleep parameters, yet it has been infrequently
assessed. Generalizability theory was used to estimate the short-term temporal stability of five variables commonly used
to characterize insomnia: sleep onset latency, total sleep time, wake after sleep onset, time in bed, and sleep efficiency.
Estimates were calculated for 32 elderly primary insomniacs and 32 elderly normal sleepers, both in the lab and at home,
using both sleep logs and polysomnography (PSG). A week of recording using either PSG or sleep logs was typically
sufficient to achieve adequate stability (defined as G coefficient of at least 0.80) with some notable exceptions: (a) when
using log-derived measures with insomniacs, a 3-week average was necessary for wake after sleep onset and (b) more
than a 2-week average was necessary for sleep onset latency. Because of the substantial commitment involved in the
physiological recording of sleep, alternative forms of aggregation are considered with the intent of improving temporal
stability.
Descriptors: Night-to-night variability, Generalizability theory, Temporal stability, Sleep assessment

The study was supported by grants VA0009-1992-1994 from the Veteran’s Administration and R01MH48187 from the National Institutes of Health.
Address reprint requests to: Dr. William K. Wohlgemuth, Duke University Medical Center, Box 2908, Durham, NC 27710. E-mail: wkw@geri.duke.edu.

Sleep variables that are used for clinical decision making or research analyses should demonstrate the fundamental property of stability over time. Temporal stability is important for two reasons: (a) if sleep variables represent characteristic dispositional traits in individuals, then stability, by definition, is expected; and (b) the correlations among sleep measures are necessarily attenuated when using unstable measures. Each of these reasons is discussed in more detail below.

If sleep variables do not represent dispositional traits but instead represent fluctuating, transient states, then temporal stability from one night to the next would not be expected. The distinction between traits and states is made with some psychological variables. Traits, such as intelligence or personality, are hypothesized to be consistent from one day to the next or from one situation to the next. States, such as fear or anger, can be transient and may be present on one day or in one situation but gone the next. Some variables, such as anxiety, exhibit characteristics of both states and traits (Spielberger, Gorsuch, Lushene, Vagg, & Jacobs, 1983). That is, there is a stable trait level of anxiety within an individual, upon which an anxiety state may be superimposed.

Some biological variables have both state and trait characteristics. For example, a loud, startling noise or soothing music may cause a transient increase or decrease in blood pressure, perhaps raising it to hypertensive levels or decreasing it to normotensive levels. However, the blood pressure will recover to its characteristic level after the auditory stimulus ceases. It would not be proper to begin or end an antihypertensive regimen based on one transient reading. Multiple readings are needed so that transient factors will cancel each other out, and an accurate blood pressure measurement can be attained.

With respect to sleep, if, for example, it is hypothesized that individuals have a stable, inherent biological need for a particular amount of sleep each night, then total sleep time (TST) should be similar from one night to the next. If, however, it is hypothesized that TST represents a transient state and is purely dependent on fluctuating daily/nightly circumstances, then TST should not be similar from night to night. Perhaps, like anxiety or blood pressure, there exist both trait and state components of TST such that transient factors may be superimposed on trait levels, leading to a greater or lesser sleep need on any given night. If both state and trait components of TST exist, it is important to measure each adequately.

A second reason for studying the temporal stability of sleep is a well-known principle from psychometrics: validity coefficients
(correlations) are limited by the reliability of the variables used in calculating the correlation (Nunnally & Bernstein, 1994). As the reliability of a variable decreases, the observed correlation between that variable and others is necessarily attenuated. To the extent that measures of sleep are unstable, empirical relationships between sleep variables and other variables may be underestimated. If TST, for example, cannot predict itself well, how can TST be expected to relate to another variable with which it may have a plausible theoretical relationship?

The sleep of insomniacs frequently is quite variable from night to night (Coates et al., 1982; Edinger, Marsh, McCall, Erwin, & Lininger, 1991). In fact, some researchers have claimed that this night-to-night variability is one of the critical elements leading to the self-definition of insomnia (Frankel, Coursey, Buchbinder, & Snyder, 1976). Night-to-night variability, however, is problematic when measuring variables used to empirically document insomnia. Large fluctuations in TST or sleep onset latency (SOL), for example, indicate that assessment of just one night may not adequately represent an insomniac’s sleep. That is, variables obtained from one night may not be reproducible on the next. If an insomniac sleeps for 5 hr one night but 9 hr the next, which amount is presumed to be more accurate? It is important to determine, then, how many nights need to be averaged to achieve stable, representative, and reproducible estimates of sleep parameters.

The recognition of the instability of sleep measures is not limited to insomniacs. Researchers (Clausen, Sersen, & Lidsky, 1974; Moses, Lubin, Naitoh, & Johnson, 1972) also have questioned the reproducibility of sleep parameters in normal sleepers. Moses et al. (1972) pointed out that in the studies of the consistency of sleep characteristics, the similarity of group means from one night to the next is typically noted. Group means, however, do not indicate the relative fluctuation of the individuals within that group. A better assessment of individual subject stability from night to night is a stability coefficient, which indicates changes in the rank ordering of individuals from one night to the next. Both Clausen et al. (1974) and Moses et al. (1972) recommended that more attention be paid to the reliability of sleep measures that are thought to represent traits.

In the present analysis, we focused on five variables that are particularly relevant in characterizing and defining the nature of an insomnia complaint (American Sleep Disorders Association, 1990) and are commonly reported outcome variables following an insomnia treatment (Morin, Culbert, & Schwartz, 1994; Murtagh & Greenwood, 1995). These variables are SOL, TST, wake after sleep onset (WASO), time in bed (TIB), and sleep efficiency (SE%).

In previous research, the impact of night-to-night variability of sleep on the measurement process has been addressed infrequently. Relatively few quantitative estimates of stability are available for polysomnographically (PSG) derived measures and even fewer from sleep log measures. We present here a brief review of the literature concerning the stability of sleep parameters (SOL, TST, WASO, TIB, and SE%). Stability estimates, using both objective (PSG) and subjective (sleep log) methods from both home-based and laboratory evaluations in normal sleepers and insomniacs, also are reviewed.

Assessment Using PSG
In no studies to date have stability coefficients been estimated from laboratory-based, PSG-derived sleep measures recorded from insomniacs. In contrast to the lack of laboratory PSG data for insomniacs, several studies have been conducted with noncomplaining normal sleepers. These published reports indicate that one night of recording is not representative or reproducible. Bootzin et al. (1995), who studied 30 elderly subjects, found stability coefficients of PSG-derived SOL, WASO, TST, and SE% of 0, .39, .46, and .48, respectively. Moses et al. (1972), using a controlled sleep period of 10:30 p.m. to 6:00 a.m. in 17–21-yr-old Navy recruits, found the stability of PSG-derived SOL, WASO, and TST to be .20, 0, and .13, respectively. Clausen et al. (1974) found the stability of PSG-derived TST of 10 young adults to be .31. Coates, Rosekind, Strossen, Thoresen, and Kirmil-Gray (1979) studied eight purported normal sleepers (subjects completed an insomnia treatment program 1 yr earlier) in both the laboratory and at home. They applied generalizability theory to three nights of PSG recording to estimate variance components for subjects, occasions, raters, and the interactions among those components. Generalizability coefficients (used to estimate stability) of SOL, WASO, TST, and SE% based on lab recordings were .24, .87, .52, and .75, respectively, and based on home recordings were .39, .22, .52, and .23, respectively.

With regard to home-based PSG, Coates et al. (1982) assessed the stability of sleep parameters in 12 insomniacs and 12 normal sleepers ranging in age from 20 to 60 yr. The stability of SOL and WASO for normal sleepers was .58 and .72, respectively, and for insomniacs was .70 and .67, respectively.

Assessment Using Sleep Logs
Stability estimates generated from laboratory-based sleep logs have not been reported for either insomniacs or normal sleepers. Coates et al. (1982) conducted analyses to assess the stability of measurements derived from sleep logs. The stability of home-recorded SOL and WASO for normal sleepers was .58 and .37, respectively, and for insomniacs was .93 and .64, respectively.

As is evident from these published reports, none of these variables is representative when recorded for only one night. The stability coefficients range from a low of 0 to a high of .93, with most values being less than .70. Coates et al. (1979, 1982) reported the highest coefficients; however, they used relatively small sample sizes (8 and 12, respectively) and the widest age range. Additionally, potential participants with periodic limb movement disorder (PLMD) and restless legs syndrome (RLS) were excluded, which may have made the sample more homogeneous and less subject to error variance. None of the reported studies tested the same sleepers in both the laboratory and the home setting to assess the effects of environmental factors on stability measures. Coates et al. (1982) reported stability coefficients for insomniacs but only in the home setting. Investigation of the stability of laboratory measures of insomniacs has never been reported, and research on the stability of home measures of normal subjects is incomplete. The current study was undertaken to address the following specific questions: (a) Which assessment technique (PSG or logs) provides more stable measures? (b) Which setting (home or lab) provides more stable measures? (c) How many nights need to be averaged to achieve adequate stability? These questions are considered for both insomniacs and normal sleepers.

Statistical Model
Reliability is defined as the proportion of true variance in the observed variance of a measure. For instance, if the stability coefficient is .80, then 80% of the observed variance is true score variance and 20% is error variance (Nunnally & Bernstein, 1994). Several types of reliability exist (e.g., interrater, internal consistency, and temporal stability). In the present analysis, we focused on temporal stability, which is that proportion of observed variance
that is stable over time. Generalizability theory (Brennan, 1983; Cronbach, Gleser, Nanda, & Rajaratnam, 1972; Shavelson & Webb, 1991; Shavelson, Webb, & Rowley, 1989) is the most comprehensive theory of reliability and applies the random effects analysis of variance (ANOVA) model to the study of multiple sources of error. Sources of error are called facets (similar to factors in experimental research) and levels within facets are called conditions. The average score an individual would get over all conditions of a facet is called a universe score, which is analogous to a true score in classical reliability theory.

In sleep research, there are many potential sources of error that may contribute to the variability among subjects in an overnight PSG. This study will focus on the error due to night-to-night fluctuations of an individual’s sleep. The statistical model used in the current study is (for explicit calculation of the variance components for this design, see Shavelson & Webb, 1991; see also Llabre et al., 1988, for a useful appendix)

$$\sigma^2(\mathrm{TST}_{pn}) = \sigma^2_p + \sigma^2_n + \sigma^2_{pn,e},$$

where $\sigma^2(\mathrm{TST}_{pn})$ represents the observed variance in total sleep time and p, n, and e represent persons, nights, and error, respectively. The observed variance can be broken down into its components; $\sigma^2_p$, $\sigma^2_n$, and $\sigma^2_{pn,e}$ represent the variance components, which, because they are independent of each other, can be summed to equal the observed variance. This is a one-facet design with nights as the facet. Each of the variance components can be interpreted with substantive meaning. For example, if the researcher is studying the stability of TST, the variance component for persons ($\sigma^2_p$) quantifies individual differences in TST and is an estimate of true error-free variance. In the present analysis, $\sigma^2_p$ is the portion of the variance that is stable from one night to the next. This quantity is estimated by averaging TST across all of the nights for each subject in the design. The variance component for nights ($\sigma^2_n$) represents systematic variance that may impact measurement of TST from night to night. This quantity is estimated by averaging TST across all subjects on each night. $\sigma^2_n$ reflects the extent to which there is variability among the group means from one night to the next. As such, it will detect any variance due to a systematic first night effect. Finally, the residual variance component ($\sigma^2_{pn,e}$) reflects the change in relative standing among subjects’ TST from one night to the next. The subscript for the residual contains both the interaction of persons and nights and random error because these terms are confounded by the design and cannot be estimated separately. If TST represents a stable trait, there should not be much change in the relative ordering of a group of subjects from night to night. These variance components are useful in determining which facet or interaction of facets is contributing to the instability of the measurement.

The G coefficient can be constructed from these variance components so that the proportion of true variance relative to observed variance can be determined. The formula for the G coefficient is

$$G = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_{pn,e}}{n'_n}}.$$

The $n'_n$ in the denominator represents the number of nights that will be averaged together for a particular variable. For example, to estimate the stability of TST based on 1 week of recording, $n'_n$ would be 7. It is apparent that as $n'_n$ increases, the error term ($\sigma^2_{pn,e}/n'_n$) decreases, leading to an overall increase in the G coefficient. By increasing $n'_n$, it becomes possible to estimate the number of nights required to sufficiently decrease error so that stability reaches the desired level. With this formula, calculation of the G coefficients is straightforward once the variance components are obtained. For example, if we want to know the stability of TST and if $\sigma^2_p = 85$, $\sigma^2_{pn,e} = 180$, and $n'_n$ is 1, then the G coefficient would be 85/[85 + (180/1)] = .32. The number of nights in the denominator ($n'_n$) can be increased until the desired stability is reached. For 1 week of recording, the G coefficient is 85/[85 + (180/7)] = .77.

Although ANOVA methodology is used in a generalizability analysis, the typical interpretation of useful variance and nuisance variance from experimental research is reversed. As is evident from the G coefficient ratio, the researcher is most interested in maximizing variability due to individual differences ($\sigma^2_p$) and minimizing any systematic variability ($\sigma^2_n$) or relative fluctuations over time ($\sigma^2_{pn,e}$). In traditional experimental research designs, the opposite is true. That is, the researcher wants to maximize systematic sources of variance or interactions while minimizing differences among individuals.

Method

Subjects
Thirty-two insomniacs and 32 noncomplaining normal sleepers were recruited for the study. Insomniacs complained of difficulty initiating sleep (n = 3), problems maintaining sleep (n = 12), a combination of these problems (n = 14), or chronic poor sleep quality with daytime fatigue (n = 3) for greater than 6 months. Normal sleepers were matched for age and gender and had no sleep complaints. Using the SCID-P for DSM-III-R (1987), all subjects were screened to rule out any history or current symptoms of psychiatric disorders. A medical history was taken and a physical exam was conducted, including a thyroid blood test to rule out any current medical disorder that could cause sleep problems. Subjects were required to be free of any psychotropic medication during the study and for at least 2 weeks prior to beginning the protocol. Seven of the insomniacs reported using sedative hypnotics, but only three of these insomniacs reported using sedative hypnotics more than once a week. Any subject with an apnea/hypopnea index greater than 15 events/hr on the first study night was excluded. Subjects with PLMD were not excluded from the sample. As reported by Edinger et al. (1997), the insomniacs and normal sleepers did not differ in either periodic limb movements (PLMs) per hour or PLM-related arousals per hour. Table 1 includes the demographic characteristics of the sample.

Table 1. Subject Demographics

Subject           M/F      Age (years)   Education (years)
Insomniacs        16/16    67.7 (4.8)    13.9 (3.1)
Normal Sleepers   16/16    67.5 (5.7)    14.4 (2.8)

Note: Values are M (SD).

Polysomnography
All home PSG (HPSG) and laboratory PSG (LPSG) studies were conducted using the Oxford Medilog 9000 (Oxford Medical, Clearwater, FL) recorders. Research at the Duke Sleep Disorders Center
has shown that the Medilog produces technically acceptable recordings (Hoelscher et al., 1987). Other research has suggested that manually reduced data from the Medilog and standard laboratory sleep monitoring equipment (i.e., polygraph) produce comparable measures of such parameters as SOL, WASO, TST, TIB, SE%, sleep stage architecture, REM latency, and REM activity (Ancoli-Israel, Kripke, Mason, & Messin, 1981; Edinger et al., 1989; Edinger, Marsh, McCall, Erwin, & Lininger, 1990; McCall, Erwin, Edinger, Krystal, & Marsh, 1992; Sewitch & Kupfer, 1985).

All subjects underwent three consecutive nights of HPSG and LPSG. A standard monitoring montage, including two electroencephalogram channels (C3–M2, Oz–Cz), one chin electromyogram (EMG) channel, two electrooculogram channels to monitor eye movements (left eye–M1, right eye–M2), two channels to monitor anterior tibialis EMG (right and left legs), and one channel to monitor air flow (oral/nasal thermistor), was used for all HPSG and LPSG studies.

All Medilog recordings were scored directly on the screen of the Medilog scanner by experienced polysomnographers using standard scoring criteria (Rechtschaffen & Kales, 1968). This scoring procedure was used because it allows for both rapid scanning of sleep data and screen-by-screen editing with more difficult studies (such as those with more leg movements throughout the night). Moreover, previous research has shown that screen scoring produces estimates of standard sleep parameters that are comparable to those obtained from conventional epoch-by-epoch scoring of records on paper (Hoelscher, McCall, Powell, Marsh, & Erwin, 1989). Scorers were blind to the dates on which the studies were performed, the type of subject from whom the study was obtained, and the location of the study (home vs. laboratory). In addition, scoring reliability checks were conducted with a small subset (n = 20) of the PSGs. Scorer 1 randomly selected and blindly scored 10 records previously scored by Scorers 2 and 3, with overall agreement rates of 91% and 93%, respectively.

Sleep Logs
Each morning following a PSG, subjects were required to complete a standard sleep log based on their subjective self-report of the previous night’s sleep. From the sleep log it was possible to derive subjective measures of SOL, WASO, TST, TIB, and SE%.

Procedure
After giving consent, subjects who met inclusion criteria were scheduled for three consecutive nights of LPSG and another three consecutive nights of HPSG monitoring. The study order was randomly determined so that one-half of the subjects in each group (normals and insomniacs) underwent LPSG first whereas the other half underwent HPSG first. All sleep studies were scheduled so the home and laboratory PSGs were separated by at least 4 but not more than 30 intervening days. During both LPSG and HPSG studies, subjects were instructed to maintain customary home bedtimes and waketimes. In addition, all HPSG studies were scheduled for nights when subjects planned to have no overnight houseguests. Subjects were instructed to abstain from alcoholic beverages and to not consume caffeinated substances after 6:00 p.m. on study nights.

Results

Means and standard deviations averaged over the three nights in each setting (lab and home) are reported in Table 2. Figures 1 and 2 chart the proportion of total variance ($\sigma^2_p + \sigma^2_n + \sigma^2_{pn,e}$) attributable to true individual differences ($\sigma^2_p$). For example, when considering SOL for normals at home the variance components are $\sigma^2_p = 84.6$, $\sigma^2_n = 7.2$, and $\sigma^2_{pn,e} = 178.3$ (see Footnote 1). Thus the total variance is 84.6 + 7.2 + 178.3 = 270.1, and the proportion of $\sigma^2_p$ in $\sigma^2_p + \sigma^2_n + \sigma^2_{pn,e}$ is 84.6/(84.6 + 7.2 + 178.3) = .31. Higher proportions indicate greater stability. Inspection of the systematic variance components ($\sigma^2_n$) indicated that variability due to nights represents a very small proportion of the overall observed variance. Only TIB for insomniacs at home contributed greater than 5% to the total observed variance. Thus, systematic sources of variance, which include the first night effect, did not appear to adversely affect the measurement stability of these five variables.

Footnote 1. The complete tables of the components of variance for each variable (SOL, WASO, TST, TIB, and SE%) for each type of sleeper (normal and insomniac) in each setting (home and lab) are not presented here but are available on request from the first author.

Table 2. Means (± SDs) Averaged over Three Nights

                          Home                                  Laboratory
Variables    PSG                 Logs                PSG                 Logs
Normals
  SOL        21.0 ± 12.0 (31)    23.0 ± 21.1 (27)    19.0 ± 15.1 (32)    22.1 ± 16.8 (32)
  WASO       91.0 ± 43.6 (31)    65.0 ± 44.6 (26)    68.2 ± 32.9 (32)    49.3 ± 44.1 (28)
  TST        365.2 ± 46.3 (31)   386.6 ± 67.5 (26)   356.6 ± 43.7 (32)   370.5 ± 53.6 (28)
  TIB        473.3 ± 61.5 (31)   466.6 ± 62.3 (28)   440.8 ± 48.9 (32)   446.2 ± 46.5 (32)
  SE%        77.6 ± 7.2 (31)     81.7 ± 10.6 (26)    81.0 ± 6.5 (32)     83.9 ± 10.9 (28)
Insomniacs
  SOL        28.7 ± 24.0 (31)    33.4 ± 20.4 (27)    28.7 ± 25.7 (30)    43.9 ± 36.1 (29)
  WASO       83.8 ± 51.7 (31)    84.0 ± 38.7 (25)    76.5 ± 33.5 (30)    74.5 ± 41.3 (22)
  TST        353.7 ± 52.9 (31)   355.9 ± 77.5 (23)   351.7 ± 52.2 (30)   333.5 ± 69.2 (21)
  TIB        460.6 ± 43.7 (31)   469.6 ± 47.9 (27)   451.1 ± 46.5 (30)   455.0 ± 48.4 (30)
  SE%        77.1 ± 10.1 (31)    74.5 ± 11.0 (23)    77.9 ± 8.2 (30)     73.4 ± 11.4 (21)

Note: Sample sizes given in parentheses.

Random sources of variance ($\sigma^2_{pn,e}$), however, constituted a large part of the observed variance for each of the five variables analyzed. In fact, in 31 of 40 variance component decompositions, the random variance component comprised more than half of the observed variance. This indicates substantial random fluctuation in the rank ordering of subjects from one night to the next. Random fluctuations from night to night were limited to neither insomniacs nor normal sleepers, the home setting nor laboratory, PSG nor sleep logs. Although some settings and methods provided more stable measures than did others, no setting or technique provided measures that were stable enough to require only one night of recording.

PSG Versus Logs
Inspection of Figure 1 indicates that in normal sleepers the stability of sleep logs was greater (the white bars are higher than the black
bars) than that of PSG when assessing WASO, TST, TIB, and SE%. The proportion of $\sigma^2_p$ for sleep logs was greater than variance due to persons for PSG regardless of whether the recording took place at home or in the lab. However, when assessing SOL in normal sleepers, PSG was more stable.

Figure 1. $\sigma^2_p / (\sigma^2_p + \sigma^2_n + \sigma^2_{pn,e})$ for normal sleepers using PSG and logs at home and in the lab.

For insomniacs (see Figure 2), the picture is less clear. When assessing WASO and SE%, PSG was more stable regardless of the setting. For TST and TIB, PSG was more stable only in the lab. Sleep logs for TST and TIB were more stable at home. For SOL, PSG was more stable at home but logs were more stable in the lab.

Figure 2. $\sigma^2_p / (\sigma^2_p + \sigma^2_n + \sigma^2_{pn,e})$ for insomniacs using PSG and logs at home and in the lab.

Home Versus Lab
In normal sleepers, the stability of SOL and WASO was greater in the lab regardless of method (both black and white bars for SOL-L are higher than their respective bars for SOL-H); however, TST, TIB, and SE% were more stable at home regardless of the method. For insomniacs, SOL and WASO were, as with normal sleepers, more stable in the lab. However, unlike for normal sleepers, TIB was more stable in the lab. The stability of TST and SE% for insomniacs depended on the method of assessment. For TST, PSG was more stable in the lab but logs were more stable at home. For SE%, PSG was more stable at home but logs were more stable in the lab.

Simultaneous graphic representations of stability coefficients for the same sleepers in different settings are shown in Figures 3–7. The stability coefficients for up to 2 weeks for each of the five sleep variables (TST, SOL, WASO, TIB, and SE%) are represented for each method (sleep logs and PSG) separately. Each one of the four lines on the figure represents a unique sleeper–setting combination and documents the expected increase in the stability coefficient by averaging more nights. The numbers represent estimates of expected G coefficients. These estimates are extrapolated from the variance components calculated from the sample of three nights for each subject in each setting. Subjects were not actually recorded for 14 nights.

Although PSG or sleep logs may be slightly more stable for a particular variable in a particular sleeper–setting combination, the overall pattern of stability coefficients for PSG versus sleep logs
does not indicate the superiority of one method over another. Figures 3–7 indicate that the stability coefficients of PSG-derived measures are not systematically greater than log-derived measures but, instead, are comparable, with a few exceptions. Specifically, when considering all sleeper–setting combinations, the stability coefficient of SOL for both the sleep logs and PSG based on one night of recording fell into the .25–.30 range. For WASO, using both the sleep logs and PSG, coefficients clustered around .50, with the exception of insomniacs completing sleep logs at home, for which the coefficient was .10. For TST, the coefficients for both the sleep logs and PSG clustered around .35, except for normal sleepers completing sleep logs at home, for which the coefficient was .60. For TIB, sleep log coefficients were higher than PSG coefficients in normal sleepers but were comparable in insomniacs. Coefficients for SE% clustered around .40 for PSG. Coefficients derived from sleep logs were similar to those from PSG (around .40) in normal sleepers, but the SE% for insomniacs using sleep logs was lower than that for PSG (around .20).

Figure 3. Sleep onset latency using PSG and logs.

Number of Nights for Adequate Stability
Unrepresentative measures based on a single night of recording do not indicate that reproducible measures are unattainable. To obtain stable measures, however, requires averaging each variable over multiple nights so that random influences are reduced to only a minor portion of the observed variance. Figures 3–7 indicate how many nights need to be averaged to achieve adequate stability (although the selection of an adequate threshold value for a stability coefficient is arbitrary, coefficients of .80 are generally recommended for research purposes; see Nunnally & Bernstein, 1994). In general, the stability curves for PSG-derived variables are more tightly clustered together than are the log-derived measures. Thus, for PSG-derived measures, the relative proportion of true score variance and error variance is similar among all four sleeper–setting combinations. That is, the G coefficients are all quite similar. Two exceptions to the clustering of PSG-derived stability coefficients are TIB when insomniacs are recorded at home and SE% when normal sleepers are recorded in the lab. For sleep logs, the lack of clustering indicates that the relative proportion of true score to error variance (i.e., G coefficient) is contingent on the type of sleeper and the setting.

Figure 4. Wake after sleep onset using PSG and logs.

For each of the five variables being considered, the greatest gain in stability in nearly all cases was made in the first few nights. The gain in stability to be made from averaging more nights diminished after approximately five to seven nights. For example, when considering SOL derived from PSG, the stability coefficient for each subject–setting combination increased by approximately .35 when averaging over 7 nights versus 1 night, but averaging over 14 nights only increased the coefficient by another .10. Thus, for SOL not much gain is expected by extending the assessment process beyond the first week.

With regard to the measures derived from the PSG, a coefficient of .80 is possible for all subject–setting combinations with 5 nights of recording for WASO, 10 nights for SOL, and 11 nights for TST. TIB required 4 nights for all combinations except for insomniacs at home, where 16 nights of recording were necessary. SE% required 6 nights, except for normals in the lab, who required 11 nights.

Because the stability curves are not closely clustered together when using measures derived from sleep logs, the decision about
Figure 5. Total sleep time using PSG and logs.

the number of nights is more complicated; it depends on the type of sleeper and the setting. For example, after averaging a week of records, the stability of SOL for insomniacs in the lab was nearly .80, but the stability coefficient was only about .50 for normal sleepers at home. As with the PSG-derived measures, in most instances, there were diminishing gains after averaging about a week of log-derived measures.

Because subjects typically record sleep logs in 1-week epochs, it is most useful to consider sleep log measures in weekly units. For WASO, TST, TIB, and SE%, averaging a week of recording was sufficient for adequate stability (i.e., stability coefficients ≥ .80) for all sleeper–setting combinations, with two exceptions: both insomniacs and normals at home. Insomniacs in the home setting required more than a week of averaging for SOL, WASO, TIB, and SE%. For WASO, 21 and 36 nights were needed for stability coefficients of .70 and .80, respectively. For TIB and SE%, 10 and 12 nights, respectively, were needed for a coefficient of .80. Normals in the home setting required more than a week of recording for TST and SE%. For TST and SE%, 10 and 14 nights, respectively, were needed for a coefficient of .80.

Averaging only 1 week of data is not sufficient for SOL. In the laboratory setting, 2 weeks of recording were sufficient to reach adequate stability for both normal sleepers and insomniacs, but not in the home setting. For normal sleepers at home, 17 and 29 nights were needed for stability coefficients of .70 and .80, respectively. For insomniacs at home, 10 and 17 nights were needed for stability coefficients of .70 and .80, respectively.
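
The night counts quoted above behave as expected when error variance shrinks in proportion to the number of nights averaged. The sketch below assumes a one-facet (persons × nights) design, in which the projected coefficient takes the Spearman-Brown form; this section does not show the formula the authors used, so the functions `g_for_nights` and `nights_required` are illustrative names, and the single-night coefficient of .10 is a hypothetical value chosen for the example, not one reported here.

```python
def g_for_nights(g_one_night: float, n_nights: float) -> float:
    """Projected G coefficient for the mean of n_nights nights.

    Assumes a one-facet persons x nights design, so error variance
    is divided by the number of nights averaged (Spearman-Brown form).
    """
    return n_nights * g_one_night / (1.0 + (n_nights - 1.0) * g_one_night)


def nights_required(g_one_night: float, g_target: float) -> float:
    """Number of nights (possibly fractional) needed to reach g_target."""
    if not (0.0 < g_one_night < 1.0 and 0.0 < g_target < 1.0):
        raise ValueError("coefficients must lie strictly between 0 and 1")
    return (g_target * (1.0 - g_one_night)) / (g_one_night * (1.0 - g_target))


if __name__ == "__main__":
    g1 = 0.10  # hypothetical single-night coefficient, for illustration only
    for target in (0.70, 0.80):
        nights = nights_required(g1, target)
        print(f"target {target:.2f}: about {nights:.1f} nights")
```

With the illustrative value of .10, the required averages come to 21 nights for .70 and 36 nights for .80, the same pair quoted above for log-derived WASO among insomniacs at home. This is a consistency check on the arithmetic rather than a statement about the coefficient the authors estimated.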

Figure 6. Time in bed using PSG and logs.

To summarize, most measures derived from sleep logs have a stability of .80 when averaging over a 2-week assessment. However, when using log-derived WASO for insomniacs, 3 weeks needed to be averaged for a coefficient of .70 and more than 5 weeks were needed for a coefficient of .80. In addition, when using log-derived SOL for insomniacs, about 2.5 weeks needed to be averaged for a coefficient of .80. Therefore, if a researcher were studying insomnia using SOL and WASO derived from a sleep log, a minimum of 3 weeks of logs should be completed. If the researcher were not interested in SOL among normal sleepers in their homes, 2 weeks of recording would be sufficient to reach a stability coefficient of .80 for all variables. However, if SOL were an important variable in the study (i.e., comparison of SOL between normal sleepers and insomniacs at home), 3 weeks would be required for a coefficient of .70 and 4 weeks would be required for a coefficient of .80.

Discussion

The present finding that one night of recording is not sufficient for adequate stability is not new (Bootzin et al., 1995; Clausen et al., 1974; Coates et al., 1979, 1982; Moses et al., 1972). Furthermore, the instability of the measures reported in this study is not due to a systematic error facet, such as the first night effect. Instead, the instability is due to random, unpredictable fluctuations from one night to the next. Multiple nights of recording are needed to overcome the instability due to random error. The decision about the number of nights to record sleep depends on the type of sleeper and the setting where the recording is made, especially when using log-derived measures. A particular research question and study population should guide the investigator in choosing which stability coefficients are most important. For example, will the data

Figure 7. Sleep efficiency using PSG and logs.

collection take place at home or in the lab? Will insomniacs, normal sleepers, or both be included in the sample? What will be the primary outcome measure(s)? The number of nights required for adequate stability depends on the answers to these questions.

The comparability of the stability coefficients using two methods of assessment (PSG and sleep logs) does not provide evidence that one method or another leads to superior stability. Instead, the generally low coefficients, regardless of the method used, indicate that the particular variable being studied, but not the assessment method, is influenced by random factors. That is, PSG, by providing “hard” biological measures, is not necessarily more stable than the “soft” self-report measures. However, the PSG data were more consistent among subject–setting combinations than were the sleep log data, as indicated by the closely clustered G coefficient curves for the PSG data and the more divergent G coefficient curves for the sleep-log data. This finding implies that the random night-to-night variability of measures based on bioelectric signals (i.e., PSG) is similar among elderly insomniacs and normal sleepers in both the home and lab settings. Furthermore, the random night-to-night variability of measures based on the process of subjectively recalling, interpreting, and reporting a prior night’s sleep (i.e., sleep logs) does depend on the specific sleeper–setting combination, with measures from insomniacs having more random variability.

Sleep measures must have adequate stability because quantitative relationships among variables are limited by the stability coefficient. Some of the equivocal or null results reported in the sleep literature may be partially due to the use of unstable measures. For example, Stepanski (1994), in a summary of the insomnia treatment literature, mentioned that positive outcomes have been found using sleep logs, but these results have not been corroborated by PSG measures. Perhaps not enough nights of PSG have been recorded and averaged to allow sensitive detection of positive outcomes.
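
The statement that relationships among variables are limited by the stability coefficient can be illustrated with the classical attenuation formula from test theory, in which an observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below treats the stability (G) coefficient as the reliability, which is an assumption made for illustration; the function names and the numeric inputs are hypothetical and do not come from the study.

```python
import math


def attenuated_correlation(true_r: float, stability_x: float, stability_y: float) -> float:
    """Correlation expected between two measures after attenuation.

    Classical test theory: r_observed = r_true * sqrt(rel_x * rel_y).
    The stability coefficients are used here as the reliabilities
    (an assumption for this illustration).
    """
    for value in (stability_x, stability_y):
        if not 0.0 <= value <= 1.0:
            raise ValueError("stability coefficients must lie in [0, 1]")
    return true_r * math.sqrt(stability_x * stability_y)


def max_observable_correlation(stability_x: float, stability_y: float) -> float:
    """Upper bound on the observed correlation (true correlation of 1)."""
    return attenuated_correlation(1.0, stability_x, stability_y)


if __name__ == "__main__":
    # Hypothetical values: a one-night measure versus a multi-night average.
    for label, stability in (("unstable measure", 0.25), ("stable average", 0.80)):
        bound = max_observable_correlation(stability, 1.0)
        print(f"{label}: observed correlation cannot exceed {bound:.2f}")
```

Under these assumptions, a treatment effect or association of a given true size appears smaller when the outcome is measured with a low stability coefficient, which is one way an insufficient number of PSG nights could mask positive outcomes.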

When interpreting these results, several limitations should be considered. The present results were obtained from elderly sleepers, who typically have more fragmented sleep than younger sleepers. Such fragmentation may increase night-to-night variation in this subgroup, and therefore the stability estimates may not be representative of the entire population. Also, because these recordings were made using ambulatory equipment, subjects were less restricted than those using traditional PSG recording devices. The fewer constraints imposed by the ambulatory methodology may have increased nightly variability such that these results cannot be generalized to data obtained from the traditional laboratory setting. Also, subjects were recruited as research volunteers and may not be representative of the clinical insomniac population. Therapeutic change for patients who spontaneously present to a clinic for treatment is greater than that for patients who are actively recruited for participation (Murtagh & Greenwood, 1995). These data are based on estimates from three nights of recording. Subsequent research is needed to empirically verify these estimates by averaging actual data collected over several weeks.

To better understand the instability of these sleep measures, it is useful to speculate on the possible sources of this random night-to-night variation. Webb (1988) presented a theoretical behavioral model through which predictions of sleep function could be made. Webb hypothesized that sleep measures are a function of three dispositional constructs: sleep demand, circadian tendencies, and behavioral facilitators and inhibitors. Sleep demand is related to amount of wakefulness before sleep. Circadian tendencies are related to the timing of sleep within the 24-hr day. Behavioral facilitators and inhibitors are individual behaviors that increase or decrease the likelihood of sleep. Webb (1988, p. 493) stated that “sleep behavior may be delayed, interrupted or terminated, or facilitated by a vast range of voluntary or involuntary, consciously or unconsciously determined responses.” Furthermore, “From these considerations, it follows that the prediction of sleep behaviors is a formidable but sensible process” (p. 495). Based on Webb’s analysis, on any given night for any given individual, there probably are numerous contributions to the random variability of sleep measures. Thus, measures from one night probably do not predict the next night’s sleep very well. A few nights of averaging are necessary so that most of the random state factors involved in a sleep assessment are canceled out and sleep traits can emerge.

The night-to-night variability of sleep may be one of the crucial elements in the discomfort experienced by insomniacs. The present investigation was designed to determine the number of nights necessary to minimize this night-to-night variability so that stable measures can be obtained. As such, we neglected the assessment of the frustrating nightly variability experienced by insomniacs. If night-to-night variability is a trait of insomniacs, then, as with other trait measures, this nightly variability should be reproducible. Night-to-night variability requires a somewhat different, yet appropriate, operationalization. The most obvious candidate to represent this nightly variability of a measure over several nights is the standard deviation of that measure. To assess the nightly variability of SOL, for instance, the standard deviation could be calculated over three nights. If this nightly variability were reproducible, it would be expected that the next three nights would produce a similar standard deviation. Thus, the standard deviations of SOL (not the raw scores) from the two consecutive sets of three nights would become the raw data for a generalizability analysis (as performed in the present study) to calculate variance components and G coefficients. Unfortunately, the present data set does not include enough nights to allow the investigation of the stability of standard deviations. However, future research should focus on this important issue.

It may be possible to increase the reproducibility of one night of recording (either PSG or sleep logs) by exerting more control over the assessment process and reducing random factors. Some experimental control was exerted in the present study by banning alcohol and the use of caffeine after 6:00 p.m. on study nights. Also, subjects were required not to take sleeping pills for at least 2 weeks prior to data collection. However, many other random possibilities exist. For example, we did not assess whether the study participants had bed partners at home. Those with regular bed partners may have had more night-to-night variation (because of partner movement, snoring, etc.) and less stability in sleep parameters than those without bed partners. In the current data set, no subjects had bed partners while they were in the lab. Furthermore, several sleep parameters were less stable in the lab, which provides evidence that having a bed partner does not systematically decrease the stability of sleep. It may not be desirable to control all aspects of subjects’ sleep so that random variability is greatly reduced. If such experimental control is exerted, the sleeping conditions may be too contrived to be ecologically valid.

The necessity for recording and averaging multiple nights of sleep to produce stable measures is most problematic when using PSG. PSG is a technically demanding, labor intensive, and costly procedure from the experimenter’s perspective and is usually aversive from the subject’s perspective when recording over multiple nights. Each additional night of PSG recording represents substantial commitment from both experimenter and subject. However, keeping sleep logs is not technically demanding and is relatively inexpensive; thus, keeping additional logs entails only a longer time commitment. An alternative to either PSG or logs may be actigraphy, which is less obtrusive and less demanding than PSG and can provide objective corroboration to the subjective impressions that are reported by participants. Future investigations should include an assessment of the stability of actigraphy measures or other alternatives (e.g., Nightcap).

It is not clear how to resolve the problem of requiring multiple nights of recording using a costly procedure (PSG) to obtain adequate stability; however, one method, which is quite common in psychological testing (Spielberger et al., 1983), entails aggregating over a set of items that measure a common construct. This procedure effectively cancels out interitem variance so that a representative, reproducible assessment of the underlying construct is made. Buysse, Reynolds, Monk, Berman, and Kupfer (1989) used such a technique in developing the Pittsburgh Sleep Quality Index. Individual items are combined into subcomponents, and the subcomponents are combined into a global scale. The stability of the global scale is greater than the stability of any subcomponent scale. The increase in stability from the component to the global scale is a demonstration of the usefulness of aggregation. Latent constructs may exist (such as sleep quality) within a set of daily recorded sleep variables, and by aggregating across several sleep variables (i.e., items), a representative, reproducible assessment of the latent construct can be made. Such a procedure could make multiple replications of the measurement process unnecessary. Other investigators (e.g., Edinger et al., 1996) have used a factor analytic procedure to generate latent constructs from a set of PSG variables. The factors are meaningful combinations of separate PSG variables (i.e., items) that through aggregation may be more stable than any of the separate variables. Perhaps one night of PSG would be sufficiently stable if individual variables were combined into theoretically meaningful composites.
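
The standard-deviation approach proposed earlier in this Discussion (compute the SD of a measure over consecutive three-night blocks, then submit those SDs to a generalizability analysis) can be sketched as follows. This is a minimal illustration that assumes a one-facet persons × occasions design estimated from ANOVA mean squares; the function names `block_sds` and `g_coefficient` and all nightly SOL values are hypothetical, and the code is not the authors' analysis.

```python
import statistics


def block_sds(nightly_values, block_size=3):
    """Return the SD of each consecutive, complete block of nights."""
    blocks = [
        nightly_values[i:i + block_size]
        for i in range(0, len(nightly_values), block_size)
    ]
    return [statistics.stdev(b) for b in blocks if len(b) == block_size]


def g_coefficient(scores, n_averaged=1):
    """G coefficient from a persons x occasions table (one value per cell).

    scores: list of rows, one row per person, one column per occasion.
    n_averaged: number of occasions averaged in the projected design.
    """
    n_p, n_o = len(scores), len(scores[0])
    if n_p < 2 or n_o < 2:
        raise ValueError("need at least two persons and two occasions")
    grand = statistics.fmean([v for row in scores for v in row])
    person_means = [statistics.fmean(row) for row in scores]
    occasion_means = [
        statistics.fmean([row[j] for row in scores]) for j in range(n_o)
    ]

    ss_person = n_o * sum((m - grand) ** 2 for m in person_means)
    ss_resid = sum(
        (scores[i][j] - person_means[i] - occasion_means[j] + grand) ** 2
        for i in range(n_p)
        for j in range(n_o)
    )
    ms_person = ss_person / (n_p - 1)
    ms_resid = ss_resid / ((n_p - 1) * (n_o - 1))

    # Person (true score) variance; negative estimates are set to zero.
    var_person = max((ms_person - ms_resid) / n_o, 0.0)
    var_error = ms_resid / n_averaged
    if var_person + var_error == 0.0:
        raise ValueError("no variance in the data")
    return var_person / (var_person + var_error)


if __name__ == "__main__":
    # Hypothetical SOL values (minutes) for four subjects over six nights.
    sol_by_subject = [
        [12, 15, 10, 14, 11, 16],
        [45, 20, 70, 30, 65, 25],
        [25, 30, 22, 28, 35, 24],
        [60, 15, 40, 55, 10, 35],
    ]
    # Each subject contributes two SDs, one per three-night block.
    sd_table = [block_sds(nights) for nights in sol_by_subject]
    print("G for a single three-night SD:", round(g_coefficient(sd_table), 2))
```

The SDs, not the raw SOL scores, form the persons × occasions table, so the resulting coefficient describes how reproducible a subject's nightly variability is from one three-night block to the next.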

REFERENCES

American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders, 3rd edition, revised. Washington, DC: Author.
American Sleep Disorders Association. (1990). International classification of sleep disorders. Rochester, MN: Author.
Ancoli-Israel, S., Kripke, D. F., Mason, W., & Messin, S. (1981). Comparisons of home sleep recordings and polysomnography in older adults with sleep disorders. Sleep, 4, 283–291.
Bootzin, R. R., Bell, I. R., Halbisch, R., Kuo T. F., Wyatt, J. K., Rider, S. P., & Manbers, R. (1995). Night-to-night variability in measures of sleep and sleep disorders: A six night PSG study. Sleep Research, 24, 121.
Brennan, R. L. (1983). Elements of generalizability theory. Iowa City: ACT Publications.
Buysse, D. J., Reynolds, C. F., III, Monk, T. M., Berman, S. R., & Kupfer, D. J. (1989). The Pittsburgh sleep quality index: A new instrument for psychiatric practice and research. Psychiatry Research, 28, 193–213.
Clausen, J., Sersen, E. A., & Lidsky, A. (1974). Variability of sleep measures in normal subjects. Psychophysiology, 11, 509–516.
Coates, T. J., Killen, J. D., George, J., Marchini, E., Silverman, S., & Thoresen, C. (1982). Estimating sleep parameters: A multitrait-multimethod analysis. Journal of Consulting and Clinical Psychology, 50, 345–352.
Coates, T. J., Rosekind, M. R., Strossen, R. J., Thoresen, C. E., & Kirmil-Gray, K. (1979). Sleep recordings in the laboratory and home: A comparative analysis. Psychophysiology, 16, 339–346.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley.
Edinger, J. D., Fins, A. I., Sullivan, R. J., Jr., Marsh, G. R., Dailey, D. S., Hope, T. V., Young, M., Shaw, E., Carlson, D., & Vasilas, D. (1996). Laboratory versus home-based polysomnography in the comparison of insomniacs and normal sleepers. Sleep Research, 25, 236.
Edinger, J. D., Fins, A. I., Sullivan, R. J., Jr., Marsh, G. R., Dailey, D. S., Hope, T. V., Young, M., Shaw, E., Carlson, D., & Vasilas, D. (1997). Sleep in the laboratory and sleep at home: Comparison of older insomniacs and normal sleepers. Sleep, 20, 1119–26.
Edinger, J. D., Hoelscher, T. J., Webb, M. D., Marsh, G. R., Radtke, R. A., & Erwin C. W. (1989). Polysomnographic assessment of DIMS: Empirical evaluation of its diagnostic value. Sleep, 12, 315–322.
Edinger, J. D., Marsh, G. R., McCall, W. V., Erwin, W. C., & Lininger, A. W. (1990). Daytime functioning and night-time sleep before, during, and after a 146 hour tennis match. Sleep, 13, 526–532.
Edinger, J. D., Marsh, G. R., McCall, W. V., Erwin, C. W., & Lininger, A. W. (1991). Sleep variability across consecutive nights of home monitoring in older mixed DIMS patients. Sleep, 14, 13–17.
Frankel, B. L., Coursey, R. D., Buchbinder, R., & Snyder, F. (1976). Recorded and reported sleep in chronic primary insomnia. Archives of General Psychiatry, 33, 615–623.
Hoelscher, T. J., Erwin, C. W., Marsh, G. R., Webb, M. D., Radtke, R. A., & Lininger, A. (1987). Ambulatory sleep monitoring with the Oxford Medilog 9000: Technical acceptability, patient acceptance, and clinical indications. Sleep, 10, 606–607.
Hoelscher, T. J., McCall, W. V., Powell, J., Marsh, G. R., & Erwin, C. W. (1989). Two methods for scoring sleep with the Oxford Medilog 9000: Comparison to conventional paper scoring. Sleep, 12, 133–139.
Llabre, M. M., Ironson, G. H., Spitzer, S. B., Gellman, M. D., Weidler, D. J., & Schneiderman, N. (1988). How many blood pressure measurements are enough? An application of generalizability theory to the study of blood pressure reliability. Psychophysiology, 25, 97–106.
McCall, W. V., Erwin, C. W., Edinger, J. D., Krystal, A. D., & Marsh, G. R. (1992). Ambulatory polysomnography: Technical aspects and normative data. Journal of Clinical Neurophysiology, 9, 68–77.
Morin, C. M., Culbert, J. P., & Schwartz, S. M. (1994). Nonpharmacological interventions for insomnia: A meta-analysis of treatment efficacy. American Journal of Psychiatry, 151, 1172–1180.
Moses, J., Lubin, A., Naitoh, P., & Johnson, L. C. (1972). Reliability of sleep measures. Psychophysiology, 9, 78–82.
Murtagh, D. R. R., & Greenwood, K. M. (1995). Identifying effective psychological treatments for insomnia: A meta-analysis. Journal of Consulting and Clinical Psychology, 63, 79–89.
Nunnally, J. C., & Bernstein, I. (1994). Psychometric theory. New York: McGraw-Hill.
Rechtshaffen, A., & Kales A. (1968). A manual of standardized terminology, techniques, and scoring systems of sleep stages of human subjects. Los Angeles: UCLA Brain Information Service/Brain Research Institute.
Sewitch, D. E., & Kupfer, D. J. (1985). A comparison of Telediagnostic and Medilog systems for recording normal sleep in the home environment. Psychophysiology, 22, 718.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park, CA: Sage.
Shavelson, R. J., Webb, N. M., & Rowley, G. L. (1989). Generalizability theory. American Psychologist, 44, 922–932.
Spielberger, C. D., Gorsuch, R. L., Lushene, P. R., Vagg, P. R., & Jacobs, G. A. (1983). Manual for the State-Trait Anxiety Inventory. Palo Alto, CA: Consulting Psychologists Press.
Stepanski, E. J. (1994). Behavioral therapy for insomnia. In M. H. Kryger, T. Roth, & W. C. Dement (Eds.), Principles and practice of sleep medicine (pp. 535–541). Philadelphia: W. B. Saunders.
Webb, W. B. (1988). An objective behavioral model of sleep. Sleep, 11, 488–496.

(Received August 19, 1997; Accepted August 27, 1998)
