Performing T-Tests To Compare Autocorrelated Time Series Data Collected From Direct-Reading Instruments

Journal of Occupational and Environmental Hygiene
ISSN: 1545-9624 (Print) 1545-9632 (Online) Journal homepage: http://www.tandfonline.com/loi/uoeh20
Patrick O’Shaughnessy & Joseph E. Cavanaugh
To cite this article: Patrick O’Shaughnessy & Joseph E. Cavanaugh (2015) Performing
T-tests to Compare Autocorrelated Time Series Data Collected from Direct-Reading
Instruments, Journal of Occupational and Environmental Hygiene, 12:11, 743-752, DOI:
10.1080/15459624.2015.1044603
To link to this article: http://dx.doi.org/10.1080/15459624.2015.1044603
Accepted author version posted online: 26 May 2015.
Published online: 09 Oct 2015.

Copyright © 2015 JOEH, LLC
Patrick O’Shaughnessy¹ and Joseph E. Cavanaugh²
¹ Department of Occupational and Environmental Health, College of Public Health, The University of Iowa, Iowa City, Iowa
² Department of Biostatistics, College of Public Health, The University of Iowa, Iowa City, Iowa
Industrial hygienists now commonly use direct-reading instruments to evaluate hazards in the workplace. The stored values over time from these instruments constitute a time series of measurements that are often autocorrelated. Given the need to statistically compare two occupational scenarios using values from a direct-reading instrument, a t-test must consider measurement autocorrelation or the resulting test will have a largely inflated type-1 error probability (false rejection of the null hypothesis). A method is described for both the one-sample and two-sample cases which properly adjusts for autocorrelation. This method involves the computation of an “equivalent sample size” that effectively decreases the actual sample size when determining the standard error of the mean for the time series. An example is provided for the one-sample case, and an example is given where a two-sample t-test is conducted for two autocorrelated time series comprised of lognormally distributed measurements.

Keywords: autocorrelation, direct-reading, t test, time series

Address correspondence to Patrick O’Shaughnessy, Occupational and Environmental Health, College of Public Health, 100 CPHB, S320, Iowa City, IA 52242; e-mail: patrick-oshaughnessy@uiowa.edu

INTRODUCTION

With a wide variety of direct-reading instruments available to the industrial hygienist, these instruments are now commonly used to provide measurements as part of a study to compare two or more conditions. In such cases, a t-test or an analysis of variance (ANOVA) might be applied to determine the statistical significance of the difference(s) in mean levels of those conditions. Proper application of those statistical techniques relies on the premise that the samples for each condition contain observations that have been randomly obtained from a population and are independent and normally distributed. However, as the sampling rate increases, the time between successive measurements decreases, and the likelihood of a subsequent measurement being related in magnitude to a previous measurement increases. This “autocorrelation” of one measurement to a previous measurement, or multiple previous measurements, results in a sample containing measurements that are no longer independent, which therefore violates a fundamental assumption required for proper use of a t-test or an ANOVA.

The issue of autocorrelation, or “serial correlation,” in successive measurements made in occupational settings has been discussed in the literature. For the situation in which a contaminant is generated in a perfectly-mixed ventilated room, Roach (1977) explained that serial correlation will decrease exponentially from perfect correlation between successive measurements, $r = 1$, as the sample interval, $\Delta t$, increases relative to the room air changes per hour, $N$.(1) Or, to state the relationship in terms of the room residence time, $\tau = 1/N$:

$$r = e^{-\Delta t/\tau}. \tag{1}$$

In this case, contaminant concentrations between successive samples become less randomly varied as the time between samples decreases and therefore become more serially correlated. Roach also mentioned that the instrument itself may have a “time constant” which does not allow it to instantaneously react to an actual change in concentration and therefore adds to the serial correlation between measurements.

Serial correlation in measurements made by industrial hygienists has been described over the ensuing decades, but primarily in the context of successive grab samples(2,3) or day-to-day samples(4-6) rather than those obtained from direct-reading instruments. A review of recent literature on this subject resulted in only one article by Klein-Entink et al. who described the use of statistical techniques for the analysis of time series to achieve valid comparisons between the means of contiguous measurements made at the same location with a direct-reading instrument separated by some change, or intervention, in a process producing airborne contaminants.(7) That article provides methods, and realistic examples, associated with the development of autoregressive-integrated-moving average (ARIMA) models, which are then used to compensate for autocorrelation when performing a comparison of the means of adjacent time series. ARIMA models were principally developed

by Box et al.(8) and are explained in most texts on time series analysis techniques. However, correct development of ARIMA models requires advanced training and experience, as they necessitate the consideration and assessment of multiple potentially acceptable model forms.

The methods described here are restricted to cases of a one- or two-sample t-test of autocorrelated data. Few explanations were found of methods to perform comparisons of multiple independent time series with autocorrelated data analogous to an analysis of variance (ANOVA). Shumway and Stoffer described a method for testing the equality of means of a collection of time series and included the associated R code required to perform that analysis.(9) Their method is developed using Fourier transforms and frequency domain techniques. A similar ANOVA method that focuses on time series that can be modeled as AR(1) processes is described by Sutradhar et al.(10)

In their article on methods to harmonize strategies for measuring engineered nano-sized objects, Brouwer et al. emphasized the need to account for autocorrelation in time-series data obtained from direct-reading instruments.(11) They also recognized that many in the field of occupational health research do not have experience with the development of ARIMA models. In that regard, the purpose of this paper is to provide an alternative method for comparing the mean levels of data sets comprised of autocorrelated time series that fit within the structure of commonly applied statistical methods. To that end, this article will first present a general method and then provide an example of its use from data obtained in an occupational setting.

Analysis of an Ideal AR(1) Time Series

A brief introduction to time series analysis is necessary for understanding the methods to be described. An overview of this subject in the context of occupational health is also given by George et al.(4) A time series of measurements, $y$, from a direct-reading instrument will undoubtedly be affected by random perturbations in the level of the contaminant measured. A simple model of such a series is:

$$y_t = \mu + \varepsilon_t, \tag{2}$$

where $y_t$ represents the value of the measurement time series $y$ taken at a discrete point in time, $t$, which is acquired over a uniformly spaced sample interval, $\Delta t$; $\mu$ is the mean level of the measurements; and $\varepsilon_t$ represents the random component of the measurement and, over all sample episodes, is a series of normally-distributed values with a mean of 0 and constant variance, $\sigma^2$. If expressing the measurements as deviations from the mean, as is common in time series analysis, this equation results in $y_t - \mu = \varepsilon_t$. Conventional testing and estimation procedures for $\mu$ are based on the assumption that the $\varepsilon_t$ values are serially uncorrelated.

If autocorrelation is present in the time series, the simplest, and most intuitive, adjustment to that model is to include a term in which a previous measurement, $y_{t-1}$, influences the present measurement, $y_t$, by some constant multiple, $\phi$, resulting in:

$$y_t = (1 - \phi)\mu + \phi y_{t-1} + \varepsilon_t. \tag{3}$$

This equation models what is referred to as a first-order autoregressive, AR(1), process.(8) Such a process, therefore, is one in which a value displayed by a direct-reading instrument is influenced by both the current condition occurring at the sample time, $t$, as well as the condition when a measurement was taken during the most recent previous sample time, $t - 1$, from that of the current measurement. Therefore, $t$ indicates the order of measurement without regard to the magnitude of the sample interval, $\Delta t$, so that $y_t$ indicates the $t$th measurement of a time series.(12) Theoretically, the value of $\phi$ in Eq. (3) has values in the range $-1 < \phi < 1$. However, the autocorrelation that exists in a time series of airborne contaminant concentrations is typically such that any two measurements are similar to each other ($\phi \to 1$) rather than very dissimilar to each other ($\phi \to -1$); therefore, the methods described here are restricted to $\phi$ in the range $0 < \phi < 1$.(4,13)

Rappaport suggested the use of the AR(1) process for modeling the autocorrelation of a time series of contaminant measurements when means to obtain a more definitive model are not available.(5) No additional guidance could be found in the literature to verify this claim. However, a simplified statistical method is described here to verify the application of the AR(1) model to a measurement time series. Given the estimate of $r$ provided by Eq. (1), a general rule to enhance the prospect that the resulting time series will constitute an AR(1) process would be to limit the sample interval so that $\Delta t/\tau > 0.1$, which corresponds to $r < 0.9$. This rule will not eliminate autocorrelation, as $r$ may still be high, but will aid in choosing a sample rate that is not so fast as to overly complicate the resulting time series beyond that which can be modeled as an AR(1) process. For example, given air changes per hour ($N$) between 2 and 12, the minimum sample interval would be 3 min and 0.5 min, respectively.

Checking for Autocorrelation

Before proceeding with the methods described here, the measurements obtained over time with a real-time instrument should first be checked to determine whether they constitute a time series consisting of first-order autocorrelated data. When performing linear regression analysis, it is assumed that the set of differences, $e$, between the predicted value of $y$ for a given $x$, $\hat{y}$, and the measured value of $y$, will be normally distributed and independent of each other, i.e., not autocorrelated. The Durbin-Watson (D-W) test was designed to test the null hypothesis that the set of differences, $e = y - \hat{y}$, are not autocorrelated. This test can also be used to test for autocorrelation in a time series of $n$ measurements by first regressing the measurements, $y_t$, on their corresponding time index, $t$. In the absence of a linear trend in the series, the autocorrelation in successive measured values, $y_t$, is approximately the same as the autocorrelation in successive differences, $e_t = y_t - \hat{y}_t$. The test involves

computing the D-W statistic, $\hat{d}$, using:

$$\hat{d} = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}, \tag{4}$$

and comparing $\hat{d}$ to a table of critical values for determining whether the autocorrelation in the data is statistically significant.(14) Given the use of $t$ in Eqs. (2) and (3) as a discrete (integer) indication of a sample made at some point in time, note that $t$ is likewise used as an integer index in Eq. (4). The D-W statistic lies between 0 and 4. A value of $\hat{d}$ near 2 is consistent with the null hypothesis of no first-order autocorrelation, while $\hat{d} \to 0$ indicates positive autocorrelation (the typical case) and $\hat{d} \to 4$ indicates negative autocorrelation between successive values in the time series. Therefore, both a lower and an upper critical value are provided in tables established for the D-W statistic.

Checking Time Series Stationarity

Regarding the extremes of the value for $\phi$ in Eq. (3), one can see that $\phi = 0$ will revert Eq. (3) back to Eq. (2), which represents a randomly fluctuating series about the mean level $\mu$. In that case, the best estimate of a future value of $y$ is $\mu$. However, if $\phi = 1$, the term containing $\mu$ drops out and leaves $y_t$ dependent on the same random fluctuation, but in this case about the previous measurement, $y_{t-1}$. This condition is known as a “random walk” as the series meanders randomly among successive values of $y$. In that case, the best estimate of a future value of $y$ is the previous value of $y$. If successive sample sets of, say, 50 samples each out of 1000 total sampled from a random walk series are statistically analyzed, the 20 resulting means and variances will not be similar. Such a time series is therefore referred to as “nonstationary.” If a time series is to be adequately modeled by Eq. (3), then the series must be “stationary,” which is a series that oscillates about a constant overall mean and with a constant overall variance. Furthermore, all AR(1) series with $|\phi| < 1$ will be stationary.(8)

The standard check to determine whether a time series is stationary is to first plot the autocorrelation function (ACF). This is typically presented as a bar chart showing the magnitude of the correlation between $y_t$ and the successive “lags,” $y_{t-1}, y_{t-2}, \ldots, y_{t-k}$, where $k$ indicates the number of lags into the past from the present time, $t$. From inspection of Eq. (3) it can be seen that $y_t$ is correlated with $y_{t-1}$, but, after shifting the time index for both by one period, then $y_{t-1}$ is correlated with $y_{t-2}$. Therefore, to some lesser extent, $y_t$ is also correlated with $y_{t-2}$, and, by extension, $y_t$ will be likewise correlated with all other $y_{t-k}$. Box and Jenkins demonstrated that the correlation for each lag, $\rho_k$, is related to $\phi$ by $\rho_k = \phi^k$.(8) Given that $0 < \phi < 1$, the sample autocorrelation function for a series defined as an AR(1) process will rapidly decay from a starting value of $\hat{\rho}_1 = \hat{\phi}$ for lag 1, where $\hat{\rho}$ and $\hat{\phi}$ are the natural estimates of $\rho$ and $\phi$, respectively. In fact, the decay in $\hat{\rho}_k$ for a measured series of data obtained in a well-mixed room is explained by Eq. (1).(3) The sample ACF is given by George et al. as:(4)

$$\hat{\rho}_k = \frac{\sum_{t=1}^{n-k} (y_t - \bar{y})(y_{t+k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2}. \tag{5}$$

Given that $\hat{\rho}_k \approx \hat{\phi}^k$, then the $\hat{\rho}_k$ values calculated by Eq. (5) should diminish in a similar manner with an increase in $k$. Computing the $\hat{\rho}_k$ values can be accomplished with a spreadsheet and used to verify that the AR(1) model is appropriate for the data series by noting whether they decay approximately in an exponential manner.

Determining the AR(1) Coefficient

This article focuses on the need for statistical methods to compare two occupational environments which have been measured with the use of direct-reading instruments. As will be explained, the development of a time series model for those data series is not necessary for that goal. However, determining $\phi$ in Eq. (3) can aid in evaluating the efficacy of using the AR(1) model to represent a time series. For example, once $\phi$ is estimated, Eq. (3) can be solved for $\varepsilon_t$ and that residual series can be assessed with the D-W statistic to determine that it does not contain serial correlation. In addition, approximate normality of the residual series can be gauged using any available techniques (e.g., QQ plots, goodness-of-fit tests).

If software capable of time series analysis is used to estimate $\phi$ in Eq. (3), the method applied involves an initial estimate of $\phi$ followed by successive iterations during which its value is adjusted until the mean squared error of the residuals is not reduced further by some minimal amount. That approach is needed for complex ARIMA models as it also involves the computation of “backforecasts” to fill in missing data when the autocorrelation is significant for many lags in the past. This iterative method of estimation is referred to as “unconditional least squares.” However, an inspection of Eq. (3) reveals that the model of an AR(1) process has the form of a linear equation with intercept $= (1 - \phi)\mu$ and slope $= \phi$. Therefore, an estimate of $\phi$ can be obtained by simple linear regression of $y_t$ on $y_{t-1}$. This simple method of estimation is referred to as “conditional least squares.”

METHOD DEVELOPMENT

The Effect of Autocorrelation

Explanations of the effect of autocorrelation on the outcome of statistical tests are provided by George et al.(4) and Francis et al.(6) For the one-sample case where it is desired to test the null hypothesis, $H_0: \mu = C$, where $C$ is some constant value, say, an occupational exposure limit, recall that the test involves an estimate of the standard deviation of the sampling distribution of the mean, otherwise known as the standard error. The sampling distribution is the distribution of the means of multiple samples taken from the same population. As an outcome of the Central Limit Theorem, the sampling distribution is approximately normal with a standard error, $\sigma_{\bar{y}} = \sigma_y/\sqrt{n}$. The natural estimate of $\sigma_{\bar{y}}$ is $s_{\bar{y}} = s_y/\sqrt{n}$, where $s_y$ denotes the sample standard deviation. A one-sample t-test

to accept or reject $H_0$ will then involve the use of $s_{\bar{y}}$ to calculate the test statistic, $\hat{t}$:

$$\hat{t} = \frac{\bar{y} - C}{s_{\bar{y}}}. \tag{6}$$

When serial autocorrelation is present in a time series of measurements, the sample averages vary to a greater extent than estimated by $s_{\bar{y}}$ for that time series. This effect can be seen in Figure 1, which compares a data series resulting from Eq. (2) (Figure 1A) and one resulting from Eq. (3) (Figure 1B) with $\phi = 0.8$, and both with identical values for $\mu$ and $\varepsilon_t$. The bars in each plot represent the average of every five successive data values. It is apparent from this figure that computing $s_{\bar{y}} = s_y/\sqrt{n}$ will underestimate the true variability in the sample means of the autocorrelated time series, with the result that the test statistic will be inflated and $H_0$ will have a greater likelihood of being rejected. Autocorrelation therefore inflates the probability of a Type-1 error (false rejection) relative to its declared value.

Compensating for Autocorrelation: The One-Sample T-test

A method for performing a t-test with data that constitutes an autocorrelated time series is proposed here that is related to the commonly used t-test and will therefore be familiar to most involved in occupational health research. One approach could involve applying a correction factor to $s_{\bar{y}}$ before its application in Eq. (6) to correctly inflate its value to provide a more accurate estimate of the standard error of an autocorrelated data series. In fact, Spear et al. derived just such an expression that, when applied to $s_{\bar{y}}$, provides an accurate estimate of the standard error of a data time series, $\sigma_{a\bar{y}}$, when autocorrelation is present.(3) Wilks referred to that term as a “variance inflation factor.”(15) Regardless, Spear et al. did not extend their variance inflation correction method to its use in a t-test of autocorrelated data, but were instead interested in the relative difference in the variance of an entire series of measurements relative to the variance obtained from a collection of short-term samples of the same measurements. However, an associated method has been described in papers primarily from the field of climatology.(13,16,17) Rather than apply a variance inflation correction, Zwiers and von Storch(13) first determined an “equivalent sample size,” $n_e$:

$$n_e = n\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\hat{\rho}_k\right]^{-1}, \tag{7}$$

that effectively reduces the true sample size to a value that produces an improved estimate of the standard error:

$$s_{a\bar{y}} = \frac{s_y}{\sqrt{n_e}}. \tag{8}$$

The expression given in Eq. (7) has a closed-form solution as a function of $\hat{\rho}_1$ and $n$, which is less computationally difficult to solve for $n_e$:

$$n_e = \frac{n^2(\hat{\rho}_1 - 1)^2}{n - n\hat{\rho}_1^2 + 2\hat{\rho}_1\left(\hat{\rho}_1^{\,n} - 1\right)}. \tag{9}$$

As $n$ increases, $\hat{\rho}_1^{\,n} \to 0$, so that Eq. (9) can be further reduced to:

$$n_e \approx \frac{n^2(\hat{\rho}_1 - 1)^2}{n - n\hat{\rho}_1^2 - 2\hat{\rho}_1} \tag{10}$$

for measurement time series such as those obtained in work settings that will often exceed $n = 50$. Likewise, as $n \to \infty$, Eq. (7) and Eq. (10) further reduce to:(17)

$$n_e \approx n\,\frac{1 - \hat{\rho}_1}{1 + \hat{\rho}_1}. \tag{11}$$

The values for $\hat{\rho}_1$ can be computed using Eq. (5). Given that $\hat{\rho}_1 = \hat{\phi}$ and that $\hat{\phi}$ is restricted to the range $0 < \hat{\phi} < 1$, then it is clear from Eq. (11) that $n_e < n$, and $n_e \to n$ as $\hat{\phi} \to 0$.(17)

For an AR(1) process this approach provides a method for computing $s_{a\bar{y}}$, from which one-sample t-tests and confidence intervals about a mean level can be performed that compensate for the bias introduced by autocorrelation in the data time series. To implement this approach, $\hat{\rho}_1$ is calculated using Eq. (5), then all other $\hat{\rho}_k \approx \hat{\rho}_1^{\,k}$ are applied in Eq. (7) to solve for $n_e$. The equivalent sample size, $n_e$, is then applied to Eq. (8) and the resulting sample estimate, $s_{a\bar{y}}$, is substituted for $s_{\bar{y}}$ in Eq. (6) to obtain the test statistic, $\hat{t}$, which is compared to the table t-value with $n_e - 1$ degrees of freedom and stated significance level.

Compensating for Autocorrelation: The Two-Sample T-test

By extension of the one-sample case, Zwiers and von Storch provided a related method for performing a two-sample t-test.(13) Given two data time series, $x$ and $y$, with sample sizes $m$ and $n$, respectively, the test statistic that compensates for autocorrelation in each series is computed using

$$\hat{t} = \frac{\bar{y} - \bar{x}}{s_p\sqrt{\dfrac{1}{n_e} + \dfrac{1}{m_e}}}. \tag{12}$$

Here, $n_e$ and $m_e$ are calculated from a pooled estimate of the lag-1 correlation coefficient, $\hat{\rho}_{p1}$:

$$\hat{\rho}_{p1} = \frac{\sum_{t=2}^{m} (x_t - \bar{x})(x_{t-1} - \bar{x}) + \sum_{t=2}^{n} (y_t - \bar{y})(y_{t-1} - \bar{y})}{\sum_{t=1}^{m} (x_t - \bar{x})^2 + \sum_{t=1}^{n} (y_t - \bar{y})^2} \tag{13}$$

followed by applying $\hat{\rho}_k \approx \hat{\rho}_{p1}^{\,k}$ to Eq. (7). Likewise, the test statistic given in Eq. (12) also includes a pooled estimate of the standard deviation, $s_p$:

$$s_p = \left[\frac{\sum_{t=1}^{m} (x_t - \bar{x})^2 + \sum_{t=1}^{n} (y_t - \bar{y})^2}{m + n - 2}\right]^{1/2}. \tag{14}$$
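The calculations in Eqs. (5) through (14) can also be scripted rather than built in a spreadsheet. The following is a minimal Python sketch, not the authors' implementation; the function names are hypothetical, only the standard library is used, and the two-sample denominator is assumed to take the usual pooled form $s_p\sqrt{1/n_e + 1/m_e}$. The lag-1 coefficient is assumed to lie in the range $0 < \hat{\rho}_1 < 1$ to which the method is restricted.

```python
import math

def lag1_autocorr(y):
    """Lag-1 sample autocorrelation, Eq. (5) with k = 1."""
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t + 1] - ybar) for t in range(n - 1))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den

def n_equiv_sum(n, rho1):
    """Equivalent sample size from Eq. (7), with rho_k approximated by rho1**k."""
    total = sum((1 - k / n) * rho1 ** k for k in range(1, n))
    return n / (1 + 2 * total)

def n_equiv_closed(n, rho1):
    """Closed-form equivalent sample size, Eq. (9). Undefined at rho1 = 1."""
    return n ** 2 * (rho1 - 1) ** 2 / (n - n * rho1 ** 2 + 2 * rho1 * (rho1 ** n - 1))

def n_equiv_large_n(n, rho1):
    """Large-n approximation, Eq. (11)."""
    return n * (1 - rho1) / (1 + rho1)

def one_sample_t(y, c):
    """Adjusted one-sample statistic: Eq. (6) using the standard error of Eq. (8)."""
    n = len(y)
    ybar = sum(y) / n
    s_y = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
    n_e = n_equiv_closed(n, lag1_autocorr(y))
    t_hat = (ybar - c) / (s_y / math.sqrt(n_e))
    return t_hat, n_e  # compare t_hat with a table value at n_e - 1 degrees of freedom

def two_sample_t(x, y):
    """Adjusted two-sample statistic, Eqs. (12)-(14)."""
    m, n = len(x), len(y)
    xbar, ybar = sum(x) / m, sum(y) / n
    ss = sum((v - xbar) ** 2 for v in x) + sum((v - ybar) ** 2 for v in y)
    cross = (sum((x[t] - xbar) * (x[t - 1] - xbar) for t in range(1, m))
             + sum((y[t] - ybar) * (y[t - 1] - ybar) for t in range(1, n)))
    rho_p1 = cross / ss                    # pooled lag-1 coefficient, Eq. (13)
    s_p = math.sqrt(ss / (m + n - 2))      # pooled standard deviation, Eq. (14)
    m_e = n_equiv_closed(m, rho_p1)
    n_e = n_equiv_closed(n, rho_p1)
    t_hat = (ybar - xbar) / (s_p * math.sqrt(1 / n_e + 1 / m_e))  # Eq. (12)
    return t_hat, n_e, m_e  # degrees of freedom: n_e + m_e - 2
```

For $n = 240$ and a lag-1 coefficient of 0.800, `n_equiv_closed` gives about 27.2 and `n_equiv_large_n` about 26.7, so the three forms of the equivalent sample size agree closely for series of this length.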

FIGURE 1. A randomly varying time series (A) and a series with autocorrelation (B). The bars represent the average of every five data values.
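The comparison shown in Figure 1 can be reproduced numerically. The sketch below is illustrative only: it generates one independent series from Eq. (2) and one AR(1) series from Eq. (3) that share the same noise values, then compares the spread of their five-point averages. The values of `n`, `mu`, `sigma`, the seed, and the starting value of the AR(1) series are assumptions chosen for illustration; only $\phi = 0.8$ and the block length of five come from the text.

```python
import random
import statistics

def block_mean_spread(n=1000, mu=10.0, phi=0.8, sigma=1.0, block=5, seed=1):
    """Return the SD of block means for an independent series and an AR(1) series."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(n)]   # shared random component
    white = [mu + e for e in eps]                     # Eq. (2)
    ar1 = [mu + eps[0]]                               # assumed starting value
    for t in range(1, n):
        ar1.append((1 - phi) * mu + phi * ar1[t - 1] + eps[t])  # Eq. (3)

    def sd_of_block_means(series):
        means = [statistics.fmean(series[i:i + block])
                 for i in range(0, n - block + 1, block)]
        return statistics.stdev(means)

    return sd_of_block_means(white), sd_of_block_means(ar1)

sd_white, sd_ar1 = block_mean_spread()
print(f"SD of 5-point means, independent series: {sd_white:.3f}")
print(f"SD of 5-point means, AR(1) series:       {sd_ar1:.3f}")
```

The averages of the autocorrelated series scatter more widely than those of the independent series, which is the effect that $s_y/\sqrt{n}$ fails to capture.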
FIGURE 2. Relationship between the normally distributed series, ln(ξt) (A), the normally distributed, autocorrelated series, ln(zt) (B), and the lognormally distributed, autocorrelated series, zt (C), when μg = 5 and φ = 0.7 applied to Eq. (16).
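The three series in Figure 2 can be generated with a short script. This is a sketch under stated assumptions: the log-scale standard deviation `sigma`, the seed, the series length, and the starting value are hypothetical, while $\mu_g = 5$ and $\phi = 0.7$ follow the caption. The series is built on the log scale and then exponentiated, and the final loop checks that the result satisfies the multiplicative form of the lognormal AR(1) model.

```python
import math
import random

def simulate_lognormal_ar1(n, mu_g, phi, sigma, seed=0):
    """Return ln(xi_t), ln(z_t), and z_t for a lognormal AR(1) series."""
    rng = random.Random(seed)
    ln_xi = [rng.gauss(0.0, sigma) for _ in range(n)]   # normal noise (plot A)
    ln_z = [math.log(mu_g) + ln_xi[0]]                  # assumed starting value
    for t in range(1, n):
        # AR(1) model applied to the log-transformed values (plot B)
        ln_z.append((1 - phi) * math.log(mu_g) + phi * ln_z[t - 1] + ln_xi[t])
    z = [math.exp(v) for v in ln_z]                     # back-transformed series (plot C)
    return ln_xi, ln_z, z

ln_xi, ln_z, z = simulate_lognormal_ar1(n=200, mu_g=5.0, phi=0.7, sigma=0.3)

# Check the multiplicative form: z_t = mu_g**(1 - phi) * z_{t-1}**phi * xi_t
for t in range(1, len(z)):
    rhs = 5.0 ** (1 - 0.7) * z[t - 1] ** 0.7 * math.exp(ln_xi[t])
    assert math.isclose(z[t], rhs, rel_tol=1e-9)
```

Because the autocorrelation is carried by ln(zt), the lag-1 coefficient for the adjustment is computed on the log-transformed series.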

FIGURE 3. Computer-generated AR(1) time series representing aerosol concentrations measured over a 200-min time period.
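A series similar to the one in Figure 3 can be simulated and screened for first-order autocorrelation as follows. This is a sketch, not the spreadsheet used by the authors: the seed and starting value are assumptions, so the numbers it prints will differ from the values reported in the text. It computes the D-W statistic of Eq. (4) on residuals from a regression on the time index, then the sample ACF of Eq. (5) alongside the decay expected for an AR(1) process.

```python
import random

def simulate_ar1(n, mu, phi, sigma=1.0, seed=0):
    """Generate an AR(1) series from Eq. (3)."""
    rng = random.Random(seed)
    y = [mu + rng.gauss(0.0, sigma)]  # assumed starting value
    for _ in range(1, n):
        y.append((1 - phi) * mu + phi * y[-1] + rng.gauss(0.0, sigma))
    return y

def durbin_watson_vs_time(y):
    """D-W statistic, Eq. (4), for residuals of a linear regression of y on t."""
    n = len(y)
    t = range(1, n + 1)
    t_bar = (n + 1) / 2
    y_bar = sum(y) / n
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
             / sum((ti - t_bar) ** 2 for ti in t))
    intercept = y_bar - slope * t_bar
    e = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    return sum((e[i] - e[i - 1]) ** 2 for i in range(1, n)) / sum(v * v for v in e)

def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 1..max_lag, Eq. (5)."""
    n = len(y)
    y_bar = sum(y) / n
    den = sum((v - y_bar) ** 2 for v in y)
    return [sum((y[t] - y_bar) * (y[t + k] - y_bar) for t in range(n - k)) / den
            for k in range(1, max_lag + 1)]

y = simulate_ar1(n=200, mu=9.9, phi=0.7)
d = durbin_watson_vs_time(y)
acf = sample_acf(y, 7)
expected = [acf[0] ** k for k in range(1, 8)]  # ideal AR(1) decay from lag 1

print(f"D-W statistic: {d:.3f}")
for k, (r, ex) in enumerate(zip(acf, expected), start=1):
    print(f"lag {k}: ACF = {r:.3f}, expected = {ex:.3f}")
```

A D-W statistic well below 2 and an ACF that tracks the expected curve are the two signals used here to support treating the series as an AR(1) process.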
Incorporating Lognormally Distributed Data

That measurements of airborne contaminants in occupational settings are often lognormally distributed has been recognized for decades.(18,19) Prior to performing a t-test on randomly sampled data that are lognormally distributed, the common practice is to first log-transform the values to achieve normally distributed data sets that therefore meet the normality assumption of the t-test. If the data are also autocorrelated, the transformation will not remove the autocorrelation. As shown in Figure 2, the log-transform (plot B) of a lognormally distributed, autocorrelated series (plot C) still retains autocorrelation because the noise series, $\xi_t$, analogous to $\varepsilon_t$ in Eq. (3), is lognormally distributed. When $\xi_t$ is log-transformed, $\ln(\xi_t)$, a normally distributed series results (Figure 2, plot A).

A lognormally distributed, autocorrelated data series, $z_t$, must be modeled in a way analogous to that for normally distributed, autocorrelated data by directly applying the log-transformed values to Eq. (3):(20)

$$\ln(z_t) = (1 - \phi)\ln\mu_g + \phi\ln(z_{t-1}) + \ln(\xi_t), \tag{15}$$

where $\mu_g$ is the geometric mean of all $z_t$. Note that exponentiating both sides of Eq. (15) does not result in an equation with the same form as Eq. (3). Rather, the solution is:

$$z_t = \mu_g^{(1-\phi)}\, z_{t-1}^{\phi}\, \xi_t. \tag{16}$$

However, as was the case for a normally distributed, autocorrelated data series, the form of the mathematical model that best describes the data is not important when applying the methods described here to compensate for autocorrelation when performing a t-test. Here, the lag-1 autocorrelation coefficient, $\hat{\rho}_1$, is computed for the normally distributed, autocorrelated series, $\ln(z_t)$, using Eq. (5) with $k = 1$ for the one-sample case, or Eq. (13) for the two-sample case, in order to adjust sample size using Eq. (7).

ONE-SAMPLE T-TEST EXAMPLE

A set of measurements was created to demonstrate the application of the one-sample t-test adjustment for autocorrelated data described above. Using the model for an
FIGURE 4. Autocorrelation function (ACF) for the first seven lags. The “expected” curve indicates the magnitude of the ACF if it decayed in an
ideal manner from its value at lag-1 for an AR(1) process.

FIGURE 5. Aerosol concentrations recorded with a 1-min sample interval over 4 hr in two locations in a swine building. Station 1 was set near
the intake end and Station 3 near the exhaust end of the building.
AR(1) process given in Eq. (3), a series of 200 normally-distributed random numbers was generated ($\mu = 0$, $\sigma = 1$) and applied to the model given $\mu = 9.9$ and $\phi = 0.7$ (Figure 3). In this hypothetical case, the measurements represent aerosol concentrations in units of mg/m³ for which the occupational exposure limit (OEL) is 10 mg/m³. A one-sided, single-sample t-test is therefore applied to test whether the concentrations are significantly less than 10 mg/m³ at the 5% significance level ($\alpha = 0.05$).

Initial steps involved procedures to determine whether the data values were autocorrelated. First, a spreadsheet was developed to perform a calculation of the D-W statistic. The spreadsheet (Excel, Microsoft Corp., Redmond, WA) was used to perform a linear regression of the data values on time. The resulting slope and intercept were used to develop a column of predicted measurements, $\hat{y}$, from which a column of residuals, $e_t = y_t - \hat{y}_t$, was created. Calculation of the D-W statistic was performed with a formula that used the SUMXMY2() function to compute the numerator of Eq. (4) and the SUMSQ() function to compute the denominator, yielding a result of 0.658. The table of critical values for the D-W statistic is segregated by sample size and the number of coefficients in the regression model (given as “k” in the tables, but not to be confused with the $k$ used here to indicate the number of lags). Given the simple linear regression performed, there were 2 coefficients (intercept and slope). Therefore, the $\alpha = 0.05$ table was consulted for $n = 200$, $k = 2$, which resulted in a lower critical value of 1.758. Given that the test statistic is less than the lower critical value, the null hypothesis that the data are not autocorrelated was rejected.

A spreadsheet was also developed to calculate the autocorrelation function (ACF) for the first 7 lags ($\hat{\rho}_1, \ldots, \hat{\rho}_7$) with the use of Eq. (5). The resulting bar graph is given in Figure 4. A curve is also applied to the graph to indicate the expected decrease in the ACF given the relationship between the ACF value for the first lag and all other lags, $\hat{\rho}_k \approx \hat{\rho}_1^{\,k}$. The relatively close association between the bar chart and the curve provides reasonable assurance that the measurements constitute an AR(1) process. Furthermore, if the ACF does not decay appreciably, the time series may be nonstationary and the methods described here cannot be applied.

Given assurance that the time series contains first-order autocorrelated values, the methods described here, which were
FIGURE 6. Calculated autocorrelation function (bars) and theoretical decay in the function (solid line) for the data series obtained at Station
1 and Station 3.

formulated under the assumption that the time series constitutes an AR(1) process, can be applied. A spreadsheet was developed to compute $n_e$ using Eq. (7). That equation requires calculations of $\hat{\rho}_k$ for all 200 lags. However, as noted above, each $\hat{\rho}_k$ can be estimated given $\hat{\rho}_k \approx \hat{\rho}_1^{\,k}$, which therefore only requires the initial calculation of $\hat{\rho}_1$ using Eq. (5), and which had already been evaluated when determining the ACF. For this case, $n_e = 42$. The test statistic, $\hat{t}$, was then calculated using Eq. (6) after substituting $s_{\bar{y}}$ with $s_{a\bar{y}}$, resulting in $\hat{t} = -1.259$. The critical t-value is $t_{0.05,41} = 1.683$. Since $|-1.259| < 1.683$, we fail to reject the null hypothesis. Note that the p-value for this test can be computed in Excel using the T.DIST.2T() function, which resulted in 0.215. By comparison, if this test had been conducted using a standard one-sample t-test, a p-value = 0.003 would result, which would indicate that the mean of the data series was significantly less than the OEL.

TWO-SAMPLE T-TEST EXAMPLE

Peters et al. described the spatial distribution of particles and gases in a large swine building during winter, spring, and summer sampling episodes.(21) During each sampling episode, area samplers consisting of aerosol photometers (Model pDR-1200, Thermo-Electron Corp., Waltham, MA) were placed in the building to record aerosol concentration levels at several locations throughout the building. One of the sampling stations (Station 1) was placed near the air inlet to the tunnel-ventilated building, and another station (Station 3) was placed near the exhaust end of the building. Therefore, aerosol concentrations measured at Station 3 were expected to be higher than those measured at Station 1. To provide an example of a t-test that utilizes the methods described here, the mean levels of aerosol concentrations recorded at the two stationary sites sampled during the winter were compared.

During that study the photometers were set to record with a 10-sec sampling interval. However, wintertime ventilation flow rates were low, estimated to produce at most 6 air changes per hour, or $\tau = 10$ min. Therefore, with regard to the sample-interval guideline described previously, a spreadsheet was used to average every 6 measurements to produce a series with a 1-min sample interval. The averaged 1-min measurements made at each station over a 4-hr sample period ($n = 240$) are given in Figure 5. As shown in that figure, Station 3 concentrations were, in general, higher than those measured at Station 1, but there were also considerable fluctuations in both measurement series over time.

Probability plots of the measurements at each station indicated that the data for both series were approximately lognor-

it can be concluded that both series contain autocorrelated measurements.

The autocorrelation function (ACF) was then determined for the first 10 lags for each series. Figure 6 gives the ACF for each series as well as a curve showing expected values if successive correlations follow $\hat{\rho}_k \approx \hat{\rho}_1^{\,k}$. The ACF for Station 3 decays almost exactly as expected, and the decay for Station 1 is not as pronounced as expected but shows a decline representative of an AR(1) series and therefore justifies the use of this method for that series as well. A secondary check on the suitability of the AR(1) model for these series was conducted by first performing a regression analysis of $y_t$ vs. $y_{t-1}$ to determine an estimate of $\phi$ from the slope of the resulting linear equation (as defined in Eq. (3)). For each time point, the fitted values from the model were then subtracted from the data values to obtain a series of residuals. The D-W test was then applied to the residual series with resulting values of $\hat{d} = 2.145$ and 2.086 for Stations 1 and 3, respectively. These values of the test statistic, both close to 2, indicate that the residual series are not significantly autocorrelated, and therefore meet the definition of the random component of the AR(1) model, $\varepsilon_t$.

To perform the t-test, a spreadsheet was developed to compute the pooled estimate of the lag-1 correlation coefficient, $\hat{\rho}_{p1}$ (Eq. (13)), and the pooled estimate of the standard deviation, $s_p$ (Eq. (14)), obtaining values of 0.800 and 0.285, respectively. With $\hat{\rho}_{p1}$, the equivalent sample size was calculated for each data series. Given that both series have the same original sample size ($n = 240$) and $\hat{\rho}_{p1}$ is applied to obtain the equivalent sample size for both series, the same value of $n_e = m_e = 27$ was obtained for both the Station 1 and Station 3 data series. Given these values, the calculated t-value obtained using Eq. (12) was 2.251. This t-value can be compared to the table t-value of 2.007 for $n_e + m_e - 2 = 52$ degrees of freedom, which indicates a significant difference in the means of the log-transformed data series (p-value = 0.029). By comparison, a p-value = $2.02 \times 10^{-19}$ would result if the two series were compared without considering autocorrelation.

CONCLUSION

A method is presented for performing a t-test on the mean of a single sample or to compare the means of two independent samples comprised of autocorrelated data series. This method utilizes a traditional t-test while compensating for the inflation in the Type-1 error probability resulting from performing t-tests with autocorrelated data by computing an “equivalent sample size” applied to the equation needed to obtain a calculated t-value to compare against the table
mally distributed; therefore both series were log-transformed t-value. This method relies on the assumption that the data
and subsequent analyses were performed using the log- series have properties that can be adequately described as an
transformed time series for each station. The D-W statistic AR(1) process (Eq. (3)). The general steps involved when
was calculated using Eq. (4), resulting in d̂ = 0.709 and 0.367 conducting this t-test include the following.
for Stations 1 and 3, respectively. The D-W table value for
n = 240 and α = 0.05 is 1.78 for the lower limit. Given 1. Evaluating the data series for autocorrelation by deter-
that both calculated values were less than the lower limit mining the significance of the test statistic, d̂, (Eq. (4))

and visually assessing the nature of the autocorrelation function (Eq. (5)).
2. Computing the equivalent sample size, ne, using Eq. (7) where ρ̂k is solved using Eq. (5) for the one-sample case or Eqs. (10) and (11) for the two-sample case.
3. Determining the test statistic, t̂, using Eq. (6) where sȳ is replaced with saȳ solved using Eq. (8) for the one-sample case, or determining t̂ using Eq. (9) for the two-sample case.
4. Testing for significance by comparing the resulting t̂ value to a critical value obtained from a t-table for a stated α level and ne − 1 degrees of freedom for the one-sample case, and ne + me − 2 for the two-sample case.

Spreadsheets developed to perform single-sample and two-sample t-tests are available upon request from the corresponding author.

ACKNOWLEDGMENT

We would like to thank the anonymous reviewer for his/her thorough review of this article which resulted in many meaningful changes to enhance the clarity of our explanations. That reviewer was also responsible for determining the closed form expression of Eq. (7) that we present in Eq. (9), for which we are very grateful. We would also like to thank Mr. Craig Taylor and Dr. Kim Anderson for acquiring the data displayed in Figure 5.

REFERENCES

1. Roach, S.A.: A most rational basis for air sampling programmes. Ann. Occup. Hyg. 20:65–84 (1977).
2. Kumagai, S., I. Matsunaga, and Y. Kusaka: Autocorrelation of short-term and daily average exposure levels in workplaces. Am. Ind. Hyg. Assoc. J. 54:341–350 (1993).
3. Spear, E.C., S. Selvin, and M. Francis: The influence of averaging time on the distribution of exposures. Am. Ind. Hyg. Assoc. J. 47:365–368 (1986).
4. George, D.K., M.R. Flynn, and R.L. Harris: Autocorrelation of interday exposures at an automobile assembly plant. Am. Ind. Hyg. Assoc. J. 56:1187–1194 (1995).
5. Rappaport, S.M.: Assessment of long-term exposures to toxic substances in air. Ann. Occup. Hyg. 35(1):61–121 (1991).
6. Francis, M., S. Selvin, R. Spear, and S. Rappaport: The effect of autocorrelation on the estimation of workers’ daily exposures. Am. Ind. Hyg. Assoc. J. 50:37–43 (1989).
7. Klein-Entink, R.H., W. Fransman, and D.H. Brouwer: How to statistically analyze nano exposure measurement results: Using an ARIMA time series approach. J. Nanopart. Res. 13(12):6991–7004 (2011).
8. Box, G.E.P., G.M. Jenkins, and G.C. Reinsel: Time Series Analysis: Forecasting and Control. Englewood Cliffs, NJ: Prentice Hall, 1994.
9. Shumway, R.H., and D.S. Stoffer: Time Series Analysis and its Applications: With R Examples. New York: Springer, 2011.
10. Sutradhar, B.C., I.B. MacNeill, and H.F. Sahrmann: Time series valued experimental designs: one-way analysis of variance with autocorrelated errors. In Time Series and Econometric Modeling, I.B. MacNeill and G. J. Umphrey (eds.). Dordrecht: D. Reidel Pub. Co., 1987. pp. 113–129.
11. Brouwer, D., M. Berges, M.A. Virji, W. Fransman, D. Bello, L. Hodson: Harmonization of measurement strategies for exposure to manufactured nano-objects; report of a workshop. Ann. Occup. Hyg. 56(1):1–9 (2012).
12. Vandaele, W.: Applied Time Series and Box-Jenkins Models. New York: Academic Press, 1983. p. 3.
13. Zwiers, F.W., and H. von Storch: Taking serial correlation into account in tests of the mean. J. Climate 8:336–351 (1995).
14. Cummins, C.: “Critical Values for the Durbin-Watson Test.” [Online] Available at http://web.stanford.edu/~clint/bench/dwcrit.htm (accessed June 25, 2014).
15. Wilks, D.S.: Resampling hypothesis tests for autocorrelated fields. J. Climate 10(1):65–82 (1997).
16. Thiebaux, H.J., and F.W. Zwiers: The interpretation and estimation of effective sample size. J. Clim. Appl. Meteorol. 23:800–811 (1984).
17. Lee, J., and R. Lund: Equivalent sample sizes in time series regressions. J. Statist. Comput. Simulation. 78(4):285–297 (2008).
18. Esmen, N.A., and Y. Hammad: Lognormality of environmental sampling data. Environ. Sci. Technol. A 12:29–41 (1977).
19. National Institute for Occupational Safety and Health (NIOSH): Occupational Exposure Sampling Strategy Manual (DHEW/NIOSH Pub. no. 77-173). Cincinnati, OH: National Institute for Occupational Safety and Health, 1977.
20. Male, L.M.: An experimental method for predicting plant yield response to pollution time series. Atmos. Environ. 16(9):2247–2252 (1982).
21. Peters, T.M., T.R. Anthony, C. Taylor, R. Altmaier, K. Anderson, and P. T. O’Shaughnessy: Distribution of particle and gas concentrations in Swine gestation confined animal feeding operations. Ann. Occup. Hyg. 56(9):1080–1090 (2012).

APPENDIX

Notation Definitions

d̂: the Durbin-Watson statistic
εt: the normally distributed random component of a measurement at time t
et: the difference yt − ŷt at time t
ξt: the lognormally distributed random component of a measurement at time t
k: the number of lags (sample periods) from the time, t
me: the equivalent sample size of a series with m total measurements
ne: the equivalent sample size of a series with n total measurements
n: the total number of measurements in a time series of measurements
N: room air changes per hour
φ: the autocorrelation coefficient
φ̂: the natural estimate of φ
r: Pearson correlation coefficient
ρk: the correlation of values of a time series with the values from that time series lagged by k time units
ρ̂k: the natural estimate of ρk
ρ̂p1: the pooled estimate of the lag-1 correlation coefficient for two time series
σy: the standard deviation of the time series
σȳ: the standard error of the time series
sy: the natural estimate of σy
sȳ: the natural estimate of σȳ
saȳ: the estimate of sȳ for an autocorrelated time series
sp: the pooled estimate of the standard deviation of two time series
t: a discrete point in time at which a sample was taken
Δt: time interval between consecutive samples
t̂: the t-test test statistic
τ: room residence time
μ: the mean of a time series
μg: the geometric mean of a lognormally distributed time series
y: a time series of measurements
yt: the value of a time series at time t
ŷ: the predicted value of y resulting from linear regression analysis at time t
ȳ: the natural estimate of μ
zt: the value of a lognormally distributed time series at time t
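
As an illustration of the equivalent-sample-size step in the two-sample example, the short Python sketch below recomputes ne from the reported values n = 240 and ρ̂p1 = 0.800. Eq. (7) is not reproduced in this part of the article, so the sketch assumes the commonly used form ne = n / [1 + 2 Σ (1 − k/n) ρk], summed over k = 1 to n − 1, with ρk = ρ1^k for an AR(1) process (compare reference 13). The function name `equivalent_sample_size` is hypothetical and is not taken from the authors' spreadsheets.

```python
def equivalent_sample_size(n: int, rho1: float) -> float:
    """Equivalent sample size of n autocorrelated measurements.

    Assumed form (not copied from the article's Eq. (7)):
        n_e = n / (1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho_k)
    with rho_k = rho1**k, the AR(1) autocorrelation at lag k.
    """
    weighted_sum = sum((1 - k / n) * rho1 ** k for k in range(1, n))
    return n / (1 + 2 * weighted_sum)


# Values reported for the two-sample example.
n_e = equivalent_sample_size(n=240, rho1=0.800)
m_e = n_e  # same n and same pooled lag-1 estimate for both stations

# Degrees of freedom for the two-sample comparison, n_e + m_e - 2.
df = round(n_e) + round(m_e) - 2
print(f"n_e = {n_e:.2f} (rounded {round(n_e)}), df = {df}")
```

Under these assumptions the rounded result agrees with the reported ne = me = 27 and 52 degrees of freedom. With ρ1 = 0 the function returns n, which is the ordinary t-test case.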

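
The autocorrelation check in step 1 can be sketched in the same way. The code below assumes that Eq. (4) is the standard Durbin-Watson form (sum of squared successive differences divided by the sum of squares) and that φ is estimated as the least-squares slope of yt against yt−1; both are assumptions, since neither equation appears in this part of the article. The series is simulated with φ = 0.8 purely for illustration and is not the swine-building data, and the helper names `durbin_watson` and `lag1_fit` are hypothetical.

```python
import random


def durbin_watson(e):
    """Sum of squared successive differences over the sum of squares."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v * v for v in e)
    return num / den


def lag1_fit(y):
    """Least-squares fit of y[t] on y[t-1]; returns (slope, residuals)."""
    x, z = y[:-1], y[1:]
    x_bar, z_bar = sum(x) / len(x), sum(z) / len(z)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    residuals = [b - (intercept + slope * a) for a, b in zip(x, z)]
    return slope, residuals


# Simulated AR(1) series with phi = 0.8 (illustrative data only).
rng = random.Random(0)
y, prev = [], 0.0
for _ in range(290):
    prev = 0.8 * prev + rng.gauss(0.0, 1.0)
    y.append(prev)
y = y[50:]  # discard burn-in values, leaving n = 240

# D-W statistic of the series (as deviations from its mean).
y_bar = sum(y) / len(y)
d_series = durbin_watson([v - y_bar for v in y])

# D-W statistic of the residuals after removing the lag-1 dependence.
phi_hat, resid = lag1_fit(y)
d_resid = durbin_watson(resid)

print(f"d (series) = {d_series:.3f}")
print(f"phi_hat = {phi_hat:.3f}, d (residuals) = {d_resid:.3f}")
```

For a strongly autocorrelated series the first statistic falls well below the lower table limit of 1.78 quoted in the example, while the residual statistic lies near 2, which mirrors the pattern reported for Stations 1 and 3.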