
A Note on the Fourier Series Model for Analysing Line Transect Data

Author(s): Stephen T. Buckland


Source: Biometrics, Vol. 38, No. 2 (Jun., 1982), pp. 469-477
Published by: International Biometric Society
Stable URL: http://www.jstor.org/stable/2530461
Accessed: 28-04-2017 16:20 UTC




This content downloaded from 131.221.9.20 on Fri, 28 Apr 2017 16:20:31 UTC
All use subject to http://about.jstor.org/terms
BIOMETRICS 38, 469-477

June, 1982

A Note on the Fourier Series Model for


Analysing Line Transect Data

Stephen T. Buckland

Department of Statistics, University of Aberdeen, Old Aberdeen AB9 2UB, Scotland

SUMMARY

The Fourier series model offers a powerful procedure for the estimation of animal population
density from line transect data. The estimate is reliable over a wide range of detection functions. In
contrast, analytic confidence intervals yield, at best, 90% confidence for nominal 95% intervals.
Three solutions, one using Monte Carlo techniques, another making direct use of replicate lines and
the third based on the jackknife method, are discussed and compared.

1. Introduction

The Fourier series model for analysing perpendicular distance data from line transect
sampling was proposed by Crain et al. (1979), and was examined further by Burnham,
Anderson and Laake (1980), who recommended it as a general model. This model
requires the Fourier series to be truncated at some point: when too many terms are
incorporated, over-parameterization occurs, but with too few terms, the Fourier series
approximation is inadequate. Burnham et al. (1980) suggested a stopping rule which
depends on the sample data. Hence, the number of terms selected is a random variable.
The variances quoted by Burnham et al. are conditional on the value indicated by the
stopping rule, so the confidence intervals may be too narrow. We show that the actual
confidence level is at most 90% for nominal 95% coverage, even for 'well-behaved'
detection functions. Furthermore, the length of the confidence interval is very dependent
on the number of terms incorporated. We conclude that the most reliable solution of the
three discussed is a Monte Carlo interval, as defined in an unpublished report (Technical
Report No. 1, Department of Statistics, University of Aberdeen, 1981). This technique
has been used in capture-recapture by Buckland (1980).

2. The Model and its Assumptions

The following notation is used: D, average density of objects; L, length of transect line;
n, total number of objects seen; xi, perpendicular distance of the ith observed object from
the transect line, i = 1, ..., n; w*, perpendicular distance beyond which all observations
are discarded in the analysis; g(x), the detection function, i.e. the conditional probability
that an object is observed, given that it is at perpendicular distance x from the line; f(x),
the probability density function of the observed perpendicular distances.
For a discussion of the assumptions and their relative importance, see Burnham et al.
(1980). It is not necessary to assume that objects are distributed randomly provided there
is a sufficient number of randomly positioned transect lines (or a randomly positioned grid
of parallel lines). This number may be small if deviation from randomness is slight.

Key words: Line transect sampling; Fourier series model; Confidence intervals; Jackknife; Monte
Carlo methods; Bootstrap.



The probability density function, f(x), is approximated by a Fourier series with m
cosine terms:

$$ f(x) \simeq \frac{1}{w^*} + \sum_{k=1}^{m} a_k \cos\left(\frac{k\pi x}{w^*}\right). $$

The coefficients, a_k, are estimated by

$$ \hat{a}_k = \frac{2}{n w^*} \sum_{i=1}^{n} \cos\left(\frac{k\pi x_i}{w^*}\right), \qquad k = 1, \ldots, m. $$

Hence,

$$ \hat{f}(0) = \frac{1}{w^*} + \sum_{k=1}^{m} \hat{a}_k $$

and

$$ \hat{D} = \frac{n\hat{f}(0)}{2L} \quad\text{estimates}\quad D = \frac{E(n)f(0)}{2L}. $$

The stopping rule states that the first value of m should be chosen such that

$$ \frac{1}{w^*}\left(\frac{2}{n+1}\right)^{1/2} \ge |\hat{a}_{m+1}|. $$
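The estimator and the stopping rule above can be sketched in code (a minimal sketch, not the authors' implementation; the cap of six cosine terms follows the reformulation used in Section 5):

```python
import math

def fourier_density_estimate(x, w, L, max_terms=6):
    """Estimate f(0) and D from perpendicular distances x, truncation
    point w (w* in the text) and total line length L."""
    n = len(x)
    # a-hat_k = (2 / (n w*)) * sum_i cos(k pi x_i / w*), k = 1, ..., max_terms + 1
    a = [2.0 / (n * w) * sum(math.cos(k * math.pi * xi / w) for xi in x)
         for k in range(1, max_terms + 2)]
    # Stopping rule: first m with (1/w*) * sqrt(2/(n+1)) >= |a-hat_{m+1}|
    thresh = (1.0 / w) * math.sqrt(2.0 / (n + 1))
    m = max_terms
    for cand in range(1, max_terms + 1):
        if thresh >= abs(a[cand]):           # a[cand] is a-hat_{cand+1}
            m = cand
            break
    f0 = 1.0 / w + sum(a[:m])                # f-hat(0) = 1/w* + sum of a-hat_k
    D = n * f0 / (2.0 * L)                   # D-hat = n f-hat(0) / (2L)
    return D, f0, m
```

Because the number of terms m is chosen from the data, it is a random variable, which is the source of the conditional-variance problem discussed below.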

An approximate 95% confidence interval for D is

$$ \hat{D} \pm 1.960\,\{\mathrm{var}_{\mathrm{est}}(\hat{D})\}^{1/2}, $$

where

$$ \mathrm{var}_{\mathrm{est}}(\hat{D}) = \hat{D}^2\left[\frac{\mathrm{var}_{\mathrm{est}}(n)}{n^2} + \frac{\mathrm{var}_{\mathrm{est}}\{\hat{f}(0)\}}{\{\hat{f}(0)\}^2}\right] $$

and

$$ \mathrm{var}_{\mathrm{est}}\{\hat{f}(0)\} = \sum_{j=1}^{m}\sum_{k=1}^{m} \mathrm{cov}_{\mathrm{est}}(\hat{a}_k, \hat{a}_j). $$

Now,

$$ \mathrm{var}_{\mathrm{est}}(\hat{a}_k) = \frac{1}{n-1}\left\{\frac{1}{w^*}\left(\hat{a}_{2k} + \frac{2}{w^*}\right) - \hat{a}_k^2\right\}, \qquad k \ge 1, $$

$$ \mathrm{cov}_{\mathrm{est}}(\hat{a}_k, \hat{a}_j) = \frac{1}{n-1}\left\{\frac{1}{w^*}(\hat{a}_{k+j} + \hat{a}_{k-j}) - \hat{a}_k\hat{a}_j\right\}, \qquad k > j \ge 1, $$

and

$$ \mathrm{cov}_{\mathrm{est}}(\hat{a}_k, \hat{a}_k) = \mathrm{var}_{\mathrm{est}}(\hat{a}_k). $$

When objects are randomly distributed, var(n) = E(n) if the finite population correction is
ignored; thence varest(n) = n. Otherwise varest(n) must be estimated, for example from
replicate lines (see Burnham et al., 1980, p. 54).
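The analytic variance and interval can be sketched as follows (a hedged sketch; taking â_0 = 2/w* lets one covariance expression cover the k = j case, consistent with the variance formula of this section):

```python
import math

def analytic_interval(x, w, L, m, var_n=None, z=1.960):
    """Approximate 95% CI for D, conditional on the number of terms m."""
    n = len(x)
    def a_hat(k):
        if k == 0:
            return 2.0 / w                   # a-hat_0 = 2/w*, covers the k = j case
        return 2.0 / (n * w) * sum(math.cos(k * math.pi * xi / w) for xi in x)
    def cov_est(k, j):
        # (1/(n-1)) { (a-hat_{k+j} + a-hat_{k-j}) / w* - a-hat_k a-hat_j }
        return ((a_hat(k + j) + a_hat(abs(k - j))) / w
                - a_hat(k) * a_hat(j)) / (n - 1)
    f0 = 1.0 / w + sum(a_hat(k) for k in range(1, m + 1))
    var_f0 = sum(cov_est(k, j)
                 for k in range(1, m + 1) for j in range(1, m + 1))
    if var_n is None:
        var_n = n                            # randomly distributed objects
    D = n * f0 / (2.0 * L)
    var_D = D * D * (var_n / n ** 2 + var_f0 / f0 ** 2)
    half = z * math.sqrt(var_D)
    return D, (D - half, D + half)
```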

3. Direct Estimation

If a large sample of randomly positioned transects (or a randomly positioned grid of many
parallel transects) is taken, the Fourier series estimate, Di, may be found for each transect


by application of the stopping rule to that transect. The sample variance of these
estimates, weighted for line length if this varies, may be used to find an estimated variance
of D̂, var_est(D̂), that is not conditional on m, and hence a confidence interval for D:

$$ \hat{D} \pm t_{R-1}\{\mathrm{var}_{\mathrm{est}}(\hat{D})\}^{1/2}, $$

where R is the number of replicate lines and t_{R-1} is the appropriate value from
Student's t distribution with R - 1 df.
This approach is restricted to studies that have many transects, each with a respectable
sample size. Burnham et al. (1980, p. 52) suggested that the method may not be reliable if
some or all of the transects have sample sizes of less than 25.
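A minimal sketch of the direct approach (the weighting shown, a length-weighted mean with a length-weighted variance, is one plausible reading of the text, not necessarily the exact scheme of Burnham et al.):

```python
import math

def direct_interval(D_lines, lengths, t_crit):
    """CI for D from R replicate-line estimates; t_crit is t_{R-1}."""
    R = len(D_lines)
    L = sum(lengths)
    # Length-weighted mean of the per-line Fourier series estimates
    D = sum(l * d for l, d in zip(lengths, D_lines)) / L
    # Length-weighted sample variance of the estimates about the mean
    var_D = sum(l * (d - D) ** 2
                for l, d in zip(lengths, D_lines)) / (L * (R - 1))
    half = t_crit * math.sqrt(var_D)
    return D, (D - half, D + half)
```

With equal line lengths this reduces to the ordinary sample mean and variance of the R estimates.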

4. Jackknife Estimation

The jackknife method is described by Burnham et al. (1980, pp. 53-54). Suppose data arise
from R replicate lines. If all data from Line i are excluded, a Fourier series estimate, D(i),
may be calculated from the remaining data, i = 1, ..., R. The stopping rule for m should
be applied separately for each estimate. Pseudovalues are then calculated as

$$ \hat{D}_i = \frac{L\hat{D} - (L - l_i)\hat{D}_{(i)}}{l_i}, \qquad i = 1, \ldots, R, $$

where D̂ is the estimate based on the complete data set and l_i is the length of the ith
replicate line. A confidence interval is now given by

$$ \hat{D} \pm t_{R-1}\{\mathrm{var}_{\mathrm{est}}(\hat{D})\}^{1/2}, $$

where

$$ \mathrm{var}_{\mathrm{est}}(\hat{D}) = \frac{\sum_{i=1}^{R} l_i (\hat{D}_i - \hat{D})^2}{L(R-1)}. \qquad (1) $$

This method does not require the sample size from each replicate line to be large.
However, if the number of replicate lines is small, interval length is highly variable. In
other words, the jackknife method performs well when there is a reasonably large sample
of randomly positioned transects.
Equation (1) is slightly different from the procedure in Burnham et al. (1980), where D̂_J
was recommended as an estimate of D, and

$$ \mathrm{var}_{\mathrm{est}}(\hat{D}_J) = \frac{\sum_{i=1}^{R} l_i (\hat{D}_i - \hat{D}_J)^2}{L(R-1)}, \qquad (2) $$

where

$$ \hat{D}_J = \frac{\sum_{i=1}^{R} l_i \hat{D}_i}{L}. $$

However, simulations show that (1) gives better confidence interval coverage than (2) (see
§6).
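The jackknife computation of this section can be sketched as follows (names such as D_loo, standing for the leave-one-line-out estimates D̂_(i), are illustrative):

```python
import math

def jackknife_interval(D_full, D_loo, lengths, t_crit):
    """CI for D from the full-data estimate and leave-one-out estimates."""
    R = len(D_loo)
    L = sum(lengths)
    # Pseudovalues: {L D-hat - (L - l_i) D-hat_(i)} / l_i
    pseudo = [(L * D_full - (L - l) * d) / l
              for l, d in zip(lengths, D_loo)]
    # Equation (1): length-weighted variance about the full-data estimate
    var_D = sum(l * (p - D_full) ** 2
                for l, p in zip(lengths, pseudo)) / (L * (R - 1))
    half = t_crit * math.sqrt(var_D)
    return D_full, (D_full - half, D_full + half)
```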

5. Monte Carlo Intervals

Monte Carlo intervals, which are discussed in detail in the 1981 technical report by
Buckland, are an extension of Bootstrap Method 2 (Efron, 1979) to robust confidence
interval estimation. Three types of Monte Carlo intervals can be defined; Types 1 and 2,
respectively, provide Methods 1 and 2 considered here. Method 2 is recommended if the

This content downloaded from 131.221.9.20 on Fri, 28 Apr 2017 16:20:31 UTC
All use subject to http://about.jstor.org/terms
472 Biometrics, June 1982

sample size is less than 50, and Method 1 performs better for larger samples; unlike the
direct and jackknife approaches, neither Monte Carlo method requires replicate lines, and
Method 2 is robust when the sample size is small.
The Fourier series model generally provides a reliable estimator, D̂, of D, and so it is
proposed that Monte Carlo intervals are generated from D̂. Both Method 1 and Method 2
allow for the variability of m, and Method 2 is more robust against skewness than both
Method 1 and the analytic method.
The Fourier series approximation to the density f(x) may be reformulated as

$$ f(x) \simeq \frac{1}{w^*} + \sum_{k=1}^{6} b_k a_k \cos\left(\frac{k\pi x}{w^*}\right), $$

where the parameter b_k = 0 or 1, k = 1, 2, ..., 6. This model is no longer linear in the
parameters. The stopping rule is now a device to estimate the b_k:

$$ \hat{b}_1 = \cdots = \hat{b}_m = 1, \qquad \hat{b}_{m+1} = \cdots = \hat{b}_6 = 0. $$

Note that this approach has wider applications: whenever there is a choice between
models, and that choice is made on the basis of the observed data, a similar problem
arises; for example, in capture-recapture, where goodness-of-fit tests may be used to
estimate the b values, and in multiple regression, where a stepwise procedure yields b̂
values.

Method 1

The sample of n perpendicular distances is used to estimate f(x) by a Fourier series. From
this estimated density, generate, say, 1000 sets of n observations, using a pseudo-random
number generator, and use each set in turn to estimate f(x), and hence f(0), by f̂_i(0) say,
i = 1, ..., 1000. Calculate the sample variance, var_est{f̂(0)}, of these estimates, and pro-
ceed as with the analytic method:

$$ \mathrm{var}_{\mathrm{est}}(\hat{D}) = \hat{D}^2\left[\frac{\mathrm{var}_{\mathrm{est}}(n)}{n^2} + \frac{\mathrm{var}_{\mathrm{est}}\{\hat{f}(0)\}}{\{\hat{f}(0)\}^2}\right], $$

which yields D̂ ± 1.960{var_est(D̂)}^{1/2} as an approximate 95% confidence interval for D.

Note that the stopping rule to select m is applied separately to each simulated data set.
If the value of m indicated by the real data set is used in the calculation of each f̂_i(0), then
Monte Carlo Method 1 will simply duplicate the analytic method, apart from Monte Carlo
variation.
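A sketch of Method 1 under stated assumptions: resampling from the fitted Fourier density is done here by rejection sampling, with the fitted density clipped at zero and at most six cosine terms; these are implementation choices, not prescriptions from the paper. The stopping rule is re-applied to each simulated data set, as the text requires.

```python
import math
import random

def mc_method1_interval(x, w, L, B=1000, z=1.960, seed=0):
    """Monte Carlo Method 1: parametric resampling from the fitted density."""
    rng = random.Random(seed)
    n = len(x)

    def fit(sample):
        """Coefficients a-hat_1..a-hat_m chosen by the stopping rule."""
        ns = len(sample)
        a = [2.0 / (ns * w) * sum(math.cos(k * math.pi * s / w) for s in sample)
             for k in range(1, 8)]            # a-hat_1 .. a-hat_7
        thresh = (1.0 / w) * math.sqrt(2.0 / (ns + 1))
        m = 6
        for cand in range(1, 7):
            if thresh >= abs(a[cand]):        # a[cand] is a-hat_{cand+1}
                m = cand
                break
        return a[:m]

    def f_hat(coeffs, u):
        return 1.0 / w + sum(c * math.cos((k + 1) * math.pi * u / w)
                             for k, c in enumerate(coeffs))

    coeffs = fit(x)
    f0 = 1.0 / w + sum(coeffs)
    # Envelope for rejection sampling from the fitted density (clipped at 0)
    fmax = max(f_hat(coeffs, i * w / 200.0) for i in range(201))
    f0_reps = []
    for _ in range(B):
        sample = []
        while len(sample) < n:
            u = rng.uniform(0.0, w)
            if rng.uniform(0.0, fmax) <= max(f_hat(coeffs, u), 0.0):
                sample.append(u)
        f0_reps.append(1.0 / w + sum(fit(sample)))  # stopping rule re-applied
    mean = sum(f0_reps) / B
    var_f0 = sum((v - mean) ** 2 for v in f0_reps) / (B - 1)
    D = n * f0 / (2.0 * L)
    var_D = D * D * (1.0 / n + var_f0 / f0 ** 2)    # var_est(n) = n
    half = z * math.sqrt(var_D)
    return D, (D - half, D + half)
```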

Method 2

Here, a slightly different procedure is necessary. Suppose objects are randomly distri-
buted. The sample size will have a Poisson distribution with parameter E(n), which may
be estimated from the observed sample size n. A Poisson random deviate, n_1, is now
generated, and that number of observations is simulated by the use of the estimated
density f̂(x). The estimates f̂_1(0) and D̂_1 = n_1 f̂_1(0)/(2L) are calculated from these simulated
observations. Repeat, say, 1000 times, arrange the D̂_i values in ascending order, and use
D̂_(25) and D̂_(976), respectively, as approximate lower and upper 95% confidence limits.
Again, the stopping rule is applied to each simulated data set.
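Method 2 can be sketched as follows (a sketch, not the authors' code: nonparametric resampling of the observed distances stands in for simulation from the fitted density, and the Poisson deviates are generated with Knuth's multiplication method; the small-sample guard is an implementation choice):

```python
import math
import random

def mc_method2_interval(x, w, L, B=1000, seed=0):
    """Monte Carlo Method 2: Poisson sample sizes and percentile limits."""
    rng = random.Random(seed)
    n = len(x)

    def poisson(lam):
        # Knuth's multiplication method; adequate for moderate lam
        limit, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                return k
            k += 1

    def f0_hat(sample):
        ns = len(sample)
        a = [2.0 / (ns * w) * sum(math.cos(k * math.pi * s / w) for s in sample)
             for k in range(1, 8)]
        thresh = (1.0 / w) * math.sqrt(2.0 / (ns + 1))
        m = 6
        for cand in range(1, 7):
            if thresh >= abs(a[cand]):        # stopping rule, per data set
                m = cand
                break
        return 1.0 / w + sum(a[:m])

    reps = []
    for _ in range(B):
        n_b = max(poisson(n), 2)              # guard against tiny deviates
        sample = [rng.choice(x) for _ in range(n_b)]
        reps.append(n_b * f0_hat(sample) / (2.0 * L))
    reps.sort()
    # For B = 1000 these indices give the 25th and 976th ordered values
    lo = reps[max(0, int(0.025 * B) - 1)]
    hi = reps[min(B - 1, int(0.976 * B) - 1)]
    return lo, hi
```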
If var_est(n) > n, the assumption that objects are randomly distributed may be relaxed by


assuming that the sample size has, say, a negative binomial distribution:

$$ p(n) = \binom{n+s-1}{s-1} p^s (1-p)^n. $$

Estimate E(n) = s(1 - p)/p from n, and var(n) = s(1 - p)/p² from
replicate lines, and so obtain p̂ = n/var_est(n) and ŝ = np̂/(1 - p̂). Fit the
distribution using an iterative maximum likelihood procedure, which requires
gamma functions if s is not an integer; generate random deviates from the fitted
distribution, and proceed as before.

6. Simulation Results

To assess the usefulness of the confidence intervals, the true detection function was
assumed to be an exponential power series (Pollock, 1978):

$$ g(x) = \exp\{-(x/a)^b\}, \qquad x,\ a \text{ and } b \text{ all positive}. $$

Now a is a scale parameter and, without loss of generality, may be set equal to unity.
Values of b considered were 1.0 (the negative exponential), 1.5, 2.0 (the half-normal),
3.0 and 5.0; these cover a range of plausible shapes for the detection function. To analyse any
simulated data, a truncation point, w*, must be selected. The expected proportion of
observations discarded is

$$ \frac{\Gamma\{1/b, (w^*)^b\}}{\Gamma(1/b)}, \quad b > 1.0; \qquad \exp(-w^*), \quad b = 1.0, $$

where $\Gamma(c, u) = \int_u^\infty y^{c-1} \exp(-y)\,dy$ and $\Gamma(c) = \Gamma(c, 0)$.
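The discard proportion can be checked numerically (a sketch; simple trapezoidal quadrature replaces the incomplete gamma function, with the upper limit chosen where the integrand is negligible):

```python
import math

def discard_proportion(b, w):
    """Expected proportion of observations beyond w* when
    g(x) = exp{-(x/a)**b} with a = 1."""
    def integral(lo, hi, steps=20000):
        # trapezoidal rule for the integral of exp(-x**b) over [lo, hi]
        h = (hi - lo) / steps
        total = 0.5 * (math.exp(-lo ** b) + math.exp(-hi ** b))
        total += sum(math.exp(-(lo + i * h) ** b) for i in range(1, steps))
        return total * h
    upper = 40.0 ** (1.0 / b)   # exp(-upper**b) is negligible beyond here
    return integral(w, upper) / integral(0.0, upper)
```

For b = 1.0 this reproduces exp(-w*), and for the half-normal (b = 2.0) it reproduces erfc(w*), as the formula above requires.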


The jackknife method was assessed with a half-normal detection function
assumed and with w* chosen such that the expected proportion of discarded observations
was 2½%. Two populations of randomly distributed objects were simulated, one giving an
average sample size of about 50 and the other of 100. The sample was divided between
10 transects of equal length in each case, so that the average sample size per transect was
roughly five for the first population and 10 for the second. The results are given in
Table 1. Equation (2) with R = 10 leads to roughly 88% confidence

Table 1
Estimated actual confidence levels of the jackknife method with
10 replicate lines, calculated from 5000 simulations (nominal
level is 95%)

Equation used      Average       Estimated actual         Standard
for calculation    sample size   confidence level (%)     error (%)
of intervals

(1)                 52            96.4                     0.3
                    104           96.2                     0.3
(2)                 52            87.9                     0.5
                    104           88.2                     0.5


coverage; this indicates that (1), which leads to roughly 96% confidence on average, is
superior. The interval length calculated from either (1) or (2) is subject to additional
variation which is substantial when sample size R is small, as here. Although average
coverage is satisfactory with (1), the jackknife method will be comparable with the Monte
Carlo method only when the number of replicate lines is large.
To illustrate the effect of sample size on the Monte Carlo confidence intervals, the
half-normal detection function (b = 2.0), with w* chosen such that the expected propor-
tion of discarded observations was 2½%, was again adopted. For this set of simulations
only, the observed sample size was predetermined. Results appear in Table 2. Any
difference between Methods 1 and 2 is largely a result of the robustness of Method 2
against skewness; it does not produce an interval that is symmetric about D̂. The
lengths of the intervals for Methods 1 and 2 are comparable, though for small sample
sizes, Method 2 tends to produce an interval shifted to the right of the symmetric interval.
Method 3 is the analytic method, which is conditional on m, and it generally produces a
shorter interval than either Method 1 or 2. For sample sizes 30, 40, 75, 100 and 200, just
a single simulation was carried out; however, to obtain the results in the final column of
Table 2, 11 simulated data sets were used. For sample size 50, simulations were carried
out until each value of m from 1 to 6 had occurred 11 times. The simulation correspond-
ing to the first occurrence of each value is presented in Table 2, and the final column in
the Table was calculated from that and the other 10 simulations. The value m = 1
occurred considerably more frequently than any other, followed by 2 and 3. The m-values
4, 5 and 6 were rare, and the average was approximately 1.5 terms. These results indicate
that the lengths of the Monte Carlo intervals are not strongly dependent on the value of m
indicated by the data. The shape of the estimated detection function is more important,
since this dictates the values of m that are likely to occur in the simulations. The analytic
method produces considerably shorter intervals than the Monte Carlo methods for m = 1,
and considerably longer intervals for m = 5 or 6. There is a marked trend of increasing
interval length with increasing m.
The next set of simulations enables the actual confidence levels of the three methods to
be estimated. Simulations showed that sample size has no discernible effect on actual
confidence levels for the analytic method (although sample sizes below 25 were not
investigated), and the density corresponding to this set of simulations was chosen so that
the average sample size was roughly 50. Two truncation points were investigated for each
value of b (i.e. each underlying shape of detection function), one corresponding to a 1%
discard rate of observations and the other to a 5% discard rate. The simulations suggested
that the choice of discard rate (1% or 5%) is irrelevant for the analytic method if b ≥ 2.0
and for the Monte Carlo methods if b ≥ 1.5. The actual value of b also has no significant
effect on actual confidence levels, except when b = 1.0 for all methods and also when
b = 1.5 for the analytic method. Of the values considered, only b = 1.0 (the negative
exponential) gives a detection function with f'(0) ≠ 0. This detection function seldom
occurs in practice, and because the Fourier series with no sine terms has zero slope at
x = 0, the model does not perform well when b = 1.0, since estimation of f(0) is crucial.
Table 3 summarizes the results of the simulations. Since considerably less computer
time is required to assess the analytic method, it was examined more rigorously; this is
reflected in the comparatively small standard errors for the estimated actual confidence
levels. For given b and w*, 2100 analytic intervals were generated by simulation. The
proportion, p, of these that covered the true density, D = 3.0, was found and the associated
standard error, {p(1 - p)/2100}^{1/2}, was calculated. For a subset of 100 simulations, Monte
Carlo intervals were also generated. Estimates may be found as before.


Table 2
Confidence intervals for density, D. Data were generated from a half-normal detection
function; D̂ is the Fourier series estimate. The penultimate column is the ratio of interval
lengths, Method 3 : Method 2, for that data set; the final column is the average of this ratio
calculated from 11 sets of intervals.

Sample    Fourier series          95% confidence intervals for D                      Ratio of    Average ratio
size, n   terms, m        D̂      Method 1       Method 2       Method 3              lengths     from 11 sets

30        1               3.42    (1.36, 5.48)   (1.38, 5.86)   (2.08, 4.75)          0.60        0.71
40        1               2.97    (1.29, 4.64)   (1.37, 4.73)   (1.91, 4.02)          0.63        0.78
50        1               2.76    (1.37, 4.15)   (1.35, 4.18)   (1.83, 3.70)          0.66        0.71
50        2               3.27    (1.50, 5.05)   (1.86, 5.46)   (1.94, 4.60)          0.74        0.97
50        3               1.76    (0.51, 3.00)   (0.81, 3.30)   (0.48, 3.04)          1.03        1.05
50        4               4.85    (2.31, 7.38)   (2.59, 7.42)   (2.71, 6.98)          0.88        1.19
50        5               2.44    (0.91, 3.97)   (1.55, 4.81)   (0.53, 4.35)          1.17        1.27
50        6               2.06    (0.63, 3.49)   (0.91, 3.77)   (0.45, 3.67)          1.13        1.32
75        1               2.71    (1.65, 3.78)   (1.68, 3.85)   (1.97, 3.46)          0.69        0.79
100       2               2.60    (1.52, 3.69)   (1.43, 3.59)   (1.73, 3.47)          0.81        0.72
200       1               2.90    (2.17, 3.63)   (2.08, 3.63)   (2.42, 3.38)          0.62        0.77

Method 1 is Monte Carlo interval Type 1, Method 2 is Monte Carlo interval Type 2, and
Method 3 is the analytic method.

The correlation between the Monte Carlo methods and the analytic method can be
utilized to improve precision. Let Σ_s denote summation over the subset of simulations for
which both analytic and Monte Carlo intervals were found, and let Σ_a denote summation
over all simulations for which an analytic interval was found.


Table 3
Estimated actual confidence levels for density, D, for a nominal level of
95% (standard errors in brackets). Data were generated with an exponen-
tial power series model assumed for the detection function (see text)

Parameter   Percentage of            Analytic         Monte Carlo method
b           observations discarded   method           1               2

1.0         1                        67.10 (1.03)     74.84 (2.46)    79.68 (2.85)
            5                        67.14 (1.02)     79.60 (2.63)    86.40 (3.17)
1.5         1                        78.52 (0.90)  }
            5                        86.57 (0.74)  }  94.65 (0.64)    96.93 (0.56)
2.0, 3.0    1 and 5                  89.88 (0.27)  }
and 5.0

(The braces indicate that the Monte Carlo entries span the rows with b ≥ 1.5.)

Definitions:
n_s, number of simulations for which both analytic and Monte Carlo intervals were found;
n_a, number of simulations for which an analytic interval was found;
y = 0 if the Monte Carlo interval includes D, y = 1 otherwise;
x = 0 if the analytic interval includes D, x = 1 otherwise;
r = Σ_s y / Σ_s x;
x̄ = Σ_a x / n_a.

The estimated actual confidence level for the relevant Monte Carlo method is given by

$$ 100(1 - r\bar{x})\%, $$

with approximate standard error

$$ 100\,\bar{x}\left\{\frac{\sum_s (y - rx)^2}{n_s(n_s - 1)\,\bar{x}_s^2}\right\}^{1/2}, \qquad \bar{x}_s = \frac{\sum_s x}{n_s}. $$

This formula is similar to that given by Cochran (1977, p. 155), with the finite population
correction ignored. The procedure was carried out for each relevant category in Table 3.
As expected, all methods perform badly when the detection function is the negative
exponential. For more plausible detection functions, the analytic method achieves at most
90% confidence for nominal 95% coverage. Monte Carlo Method 2 is conservative,
apparently because the stopping rule proposed by Burnham et al. (1980) is conservative
when the variation in m is allowed for. The estimated actual confidence level for Method
1 does not differ significantly from 95%, the conservative stopping rule being coun-
teracted by the effect of skewness. As sample size increases, skewness decreases, and
Method 1 will also become conservative. Average sample size for these simulations was
approximately 50.


Kronmal and Tarter (1968) derived the following stopping rule for m: select the first
value of m such that

$$ \frac{1}{w^*}\left(\frac{2}{n+1}\right)^{1/2} \ge |\hat{a}_{m+1}| + |\hat{a}_{m+2}|. $$

Burnham et al. (1980, p. 138) argued that a_m is of order 1/m², so that a_{m+2} << a_{m+1}. They
therefore took a_{m+2} = 0. To assess whether this makes the stopping rule conservative, a
number of simulations was run. In half of these, a_{m+2} was approximated by â_{m+2} in the
stopping rule, and for the rest, a_{m+2} was assumed to be zero. No significant difference
was detected; this supports the conclusion of Burnham et al. (1980).
It is clear from these results that further investigation of the stopping rule is required;
this was acknowledged by Burnham et al. (1980, p. 138). Construction of a rule that is
more conservative is not sufficient if analyses are still conditional on m; if the variability in
the actual confidence level is high, it is not satisfactory to adjust the rule until the average
actual confidence level is 95%. We have found that m = 1 gives substantially below 95%
confidence, and that large values of m give substantially greater than 95% confidence.
The extensive simulations carried out on the half-normal detection function indicate that
the analytic method with m = 3 leads to reliable confidence interval estimation. When the
detection function is not the half-normal but is similar in shape, this ad hoc value of m
may permit quick estimation that is more reliable than that obtained with the stopping
rule for m. Further investigation of detection functions that arise in practice may prove
valuable.

ACKNOWLEDGEMENTS

I am very grateful to Kenneth P. Burnham for his constructive criticisms and comments on
an earlier draft, and to the referees for their useful suggestions.


REFERENCES

Buckland, S. T. (1980). A modified analysis of the Jolly-Seber capture-recapture model. Biometrics


36, 419-435.
Burnham, K. P., Anderson, D. R. and Laake, J. L. (1980). Estimation of density from line transect
sampling of biological populations. Wildlife Monograph No. 72. Supplement to Journal of
Wildlife Management 44.
Cochran, W. G. (1977). Sampling Techniques, 3rd edition. New York: Wiley.
Crain, B. R., Burnham, K. P., Anderson, D. R. and Laake, J. L. (1979). Nonparametric estimation
of population density for line transect sampling using Fourier series. Biometrical Journal 21,
731-748.
Efron, B. (1979). The 1977 Reitz Lecture. Bootstrap methods: another look at the jackknife.
Annals of Statistics 7, 1-26.
Kronmal, R. and Tarter, M. (1968). The estimation of probability densities and cumulatives by
Fourier series methods. Journal of the American Statistical Association 63, 925-952.
Pollock, K. H. (1978). A family of density estimators for line transect sampling. Biometrics 34,
475-478.

Received February 1981; revised June 1981

