ABSTRACT
WALDEN, A.T. and WHITE, R.E. 1984, On Errors of Fit and Accuracy in Matching Synthetic
Seismograms and Seismic Traces, Geophysical Prospecting 32, 871-891.
A synthetic seismogram that closely resembles a seismic trace recorded at a well may not
be at all reliable for, say, stratigraphic interpretation around the well. The most accurate
synthetic seismogram is, in general, not the one that displays the smallest errors of fit to the
trace but the one that best estimates the noise on the trace. If the match is confined to a short
interval of interest or if the seismic reflection wavelet is allowed to be unduly long, there is
considerable danger of forcing a spurious fit that treats the noise on the trace as part of the
seismic reflection signal instead of making a genuine match with the signal itself. This paper
outlines tests that allow an objective and quantitative evaluation of the accuracy of any
match and illustrates their application with practical examples.
The accuracy of estimation is summarized by the normalized mean square error (NMSE)
in the estimated reflection signal, which is shown to be
$$\mathrm{NMSE} = (1/n)(P_N/P_S),$$
where $P_S/P_N$ is the signal-to-noise power ratio and n is the spectral smoothing factor. That is,
the accuracy varies directly with the ratio of the power in the signal (taken to be the synthetic)
to that in the noise on the seismic trace, and the smoothing acts to improve the accuracy of
the predicted signal. The construction of confidence intervals for the NMSE is discussed.
Guidelines for the choice of the spectral smoothing factor n are given.
The variation of wavelet shape due to different realizations of the noise component is
illustrated, and the use of confidence intervals on wavelet phase is recommended.
Tests are described for examining the normality and stationarity of the errors of fit and
their independence of the estimated reflection signal.
* Paper read at the 45th meeting of the European Association of Exploration Geophysicists,
Oslo, June 1983, revision received January 1984.
** Geophysical Research and Technical Services, BP Exploration Co. Ltd, Britannic House,
Moor Lane, London EC2Y 9BU, England.
1. INTRODUCTION
Application of partial coherence analysis to matching synthetic seismograms and
seismic traces (White 1980) assumes that the recorded seismic trace is a filtered
version of the broadband synthetic seismogram computed from the well log, plus
some additive noise. Different components of the broadband synthetic seismogram
may be differently filtered. The mathematical expression of this picture of the trace is
$$y(t) = \sum_{i=1}^{q} h_i(t) * x_i(t) + u(t),$$
where * denotes convolution, $x_i(t)$ is the ith component of the broadband synthetic
reflection spike sequence, $h_i(t)$ is the ith wavelet, and u(t) is noise. In the model each
input component $x_i(t)$ is considered to be a distinct part of the reflection coefficient
series, and to be uncontaminated by noise. Suppose q = 2; then, for example, input
channel 1 might be attenuated primaries plus internal multiples, and channel 2
surface multiples.
The noise u(t) is assumed to be
a. stationary and random,
b. statistically independent of the other components of the trace,
c. normally (Gaussian) distributed with zero mean.
The constraints of stationarity and random noise imply that its mean, variance,
correlation, and spectral characteristics do not change with time. Small departures
from normality (Gaussianity) are not critical to the estimation procedure, but larger
departures could be important.
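The model above can be sketched numerically. The following is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the noise is drawn as white Gaussian samples for simplicity, whereas the model only requires it to be stationary, independent of the signal and zero-mean Gaussian.

```python
import random

def convolve(h, x):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            out[i + j] += hv * xv
    return out

def synth_trace(wavelets, inputs, noise_std=0.0, seed=0):
    """y(t) = sum_i h_i(t) * x_i(t) + u(t) for q = len(wavelets) channels.

    Assumption: u(t) is white Gaussian noise with standard deviation noise_std.
    """
    rng = random.Random(seed)
    length = max(len(h) + len(x) - 1 for h, x in zip(wavelets, inputs))
    y = [0.0] * length
    for h, x in zip(wavelets, inputs):
        for k, v in enumerate(convolve(h, x)):
            y[k] += v
    return [v + rng.gauss(0.0, noise_std) for v in y]

# One-channel case (q = 1): a two-point wavelet acting on a short spike sequence.
trace = synth_trace([[1.0, -1.0]], [[1.0, 0.0, 2.0]], noise_std=0.1)
```

Setting `noise_std=0.0` returns the noise-free filtered synthetic, which is convenient for checking the convolution on its own.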
The reflection sequence is not assumed to be white (it is calculated explicitly
from the sonic log), and the individual input components $x_i(t)$ can be either non-stationary or non-random or both, and correlated (to a limited degree).
The wavelets are not assumed to be minimum phase, but from physical considerations each is expected to have a zero d.c. component. Although this fact plays no
part in the formulation of the matching procedure, it does provide a useful check on
the quality of estimated wavelets.
Since most practical applications concern one-channel matches, the methods for
assessing the quality of a match are now discussed in terms of the one-channel case
(q = 1), which allows greater simplicity of notation and interpretation. The techniques are easily extended to two or more channels.
2. INFERENCE FROM THE GOODNESS-OF-FIT
ANALYSIS OF TRACE M A T CH IN G
length T gives a frequency separation of 1/T between independent spectral components, there are b/(1/T) such independent components within the spectral
window, and smoothing can be usefully envisaged as the averaging of this number of
adjacent independent spectral ordinates. Hence the smoothing factor n, henceforth
called simply smoothing, is equal to bT, the bandwidth of the smoothing window
multiplied by the data gate length T.
The two spectral windows employed by us, the Papoulis (Papoulis 1973) and
Daniell (e.g. Bloomfield 1976) windows, are illustrated in fig. 1. For the Papoulis
window, the relation of smoothing n to the data gate length T and total width L of
the taper applied to the correlations is
$$n = 3.400\,T/L.$$
The lag window length L is just twice the maximum lag in the taper, i.e. twice the
lag at which the taper drops to zero, and it should be chosen long enough to enclose
the wavelet-dominated portions of the cross-correlations between the trace and
broadband synthetic seismogram. For the Daniell window, which is a tapered sinc
function in the time domain, the width L of the main lobe of the sinc function is
related to the smoothing by
n = 2T/L.
The choice of smoothing n is discussed in section 3.
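The two relations between smoothing, gate length and lag window length can be collected in a small helper. This is only a sketch: the function name is hypothetical, and the Papoulis constant is taken as printed in the text above.

```python
def smoothing_factor(gate_length, lag_window_length, window="papoulis"):
    """Smoothing n from data gate length T and lag window length L.

    T and L must be in the same units (e.g. both in ms).
    Constants as quoted in the text: n = 3.400 T/L (Papoulis), n = 2 T/L (Daniell).
    """
    constants = {"papoulis": 3.400, "daniell": 2.0}
    if window not in constants:
        raise ValueError("window must be 'papoulis' or 'daniell'")
    return constants[window] * gate_length / lag_window_length

# Same gate and lag window, different window type: the smoothing differs,
# which is why the window type has to be stated along with T and L.
n_papoulis = smoothing_factor(750.0, 100.0, "papoulis")
n_daniell = smoothing_factor(750.0, 100.0, "daniell")
```

For a 750 ms gate and a 100 ms lag window the two windows give smoothings of about 25.5 and 15 respectively.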
The goodness-of-fit as a function of frequency can be measured by the estimated
signal-to-noise power ratio of the match at frequency f:
A.T. W A L D E N A N D R.E. W H I T E
is
$$E\left\{\int \hat\Phi_{xx}(f)\,|\hat H(f) - H(f)|^2\,df\right\} = P_N/n, \tag{1}$$
and the normalized mean square error in the signal estimate is simply the inverse:
$$\mathrm{NMSE} = (1/n)(P_N/P_S) = \Lambda^{-2}. \tag{2}$$
$\Lambda^2$ is a measure of the accuracy of estimation and has the form that one would
intuitively expect. The accuracy varies directly with the ratio of the power in the
signal to that in the noise on the trace and directly as the smoothing n.
Now $\lambda = 2BT(P_S/P_N) = 2BT(1/n)(nP_S/P_N) = (2BT/n)\Lambda^2 = \nu_1\Lambda^2$.
Hence a $100(1-\alpha)\%$ confidence interval for $\Lambda^2$ is given by
$$\Lambda_1^2 \le \Lambda^2 \le \Lambda_2^2,$$
where $\Lambda_1^2$ and $\Lambda_2^2$ are defined by
$$\alpha/2 = \Pr\{F_{\nu_1,\nu_2,\nu_1\Lambda_1^2} \ge (\nu_2/\nu_1)(\hat P_S/\hat P_N)\}$$
and
$$\alpha/2 = \Pr\{F_{\nu_1,\nu_2,\nu_1\Lambda_2^2} \le (\nu_2/\nu_1)(\hat P_S/\hat P_N)\},$$
so that
$$\Lambda_2^{-2} \le \mathrm{NMSE} \le \Lambda_1^{-2}.$$
Methods for finding $\Pr\{F_{\nu_1,\nu_2,\lambda} \le f^*\}$ are discussed in appendix B.
For a white reflection sequence, $\hat\Phi_{xx}(f) \approx \mathrm{const} = s^2$, say, so that the mean
power in the errors of prediction is given by
$$s^2\,E\left\{\int |\hat H(f) - H(f)|^2\,df\right\},$$
i.e. $s^2$ multiplied by the expected energy of the errors in the wavelet, and the signal
power is
$$s^2 \int |H(f)|^2\,df.$$
Hence, if the reflection sequence is white the NMSE is the NMSE in the estimate of
the wavelet itself, since $s^2$ is eliminated by the normalization.
The theory associated with the NMSE in the signal estimate, outlined above, is
derived under the assumption that spectral bias errors are negligible; this is the case
when the smoothing is less than, or equal to, the optimal choice, the value of which
is considered in section 3. Bias error should not prove to be a practical problem
since one should always tend to err on the side of too little smoothing, rather than
too much. Poor centering of the cross-correlation between the trace and the synthetic seismogram can also cause bias, termed misalignment bias, and the practical procedure includes an automatic scan of alignment to ensure proper centering.
Examples of the confidence intervals obtained for the NMSE in the signal estimate from some real data analyses are given in table 1. The NMSE estimates found
from substituting the estimated value $\hat P_S/\hat P_N$ in equation (2) are also given, and the
skewness of the distribution about this (biased) point estimate can be appreciated. A
useful value is the maximum of the confidence interval; with 95% confidence this
value is the maximum likely NMSE in the signal estimate. Note that even though
well 1 and well 5 both have $\hat P_S/\hat P_N$ ratios of approximately 1, the different smoothings
used lead to very different quality assessments (16% compared with 42%).

Table 1. NMSE results for some synthetic seismogram studies.

Well     Type of smoothing     Estimated P_S/P_N ratio    90% NMSE interval and
         and factor            and 90% conf. level        point estimate
                               (brackets)                 (brackets), %
Well 1   Daniell, n = 13.2     1.0  (0.23)                5.4, 16   (7.6)
Well 2   Papoulis, n = 13.6    0.65 (0.30)                6.4, 46   (11)
Well 3   Daniell, n = 13.2     1.80 (0.23)                3.0, 8.0  (4.2)
Well 4   Daniell, n = 13.2     2.10 (0.22)                2.7, 6.7  (3.6)
Well 5   Daniell, n = 9.4      0.9  (0.35)                7.7, 42   (12)
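The point estimates quoted for table 1 follow directly from equation (2), NMSE = (1/n)(P_N/P_S). The short sketch below recomputes them from the smoothing and the estimated signal-to-noise power ratio of each well; the function name and the dictionary layout are illustrative only.

```python
def nmse_point_estimate(n, snr_power_ratio):
    """Point estimate of the NMSE from equation (2): (1/n) * (P_N / P_S)."""
    return 1.0 / (n * snr_power_ratio)

# (smoothing n, estimated P_S/P_N ratio) for each well in table 1.
table1 = {
    "Well 1": (13.2, 1.0),
    "Well 2": (13.6, 0.65),
    "Well 3": (13.2, 1.80),
    "Well 4": (13.2, 2.10),
    "Well 5": (9.4, 0.9),
}

nmse_percent = {
    well: 100.0 * nmse_point_estimate(n, ratio)
    for well, (n, ratio) in table1.items()
}
```

Wells 1 and 5 have nearly the same ratio, but the smaller smoothing used for well 5 gives a point estimate of about 12% rather than 7.6%, in line with the comparison made in the text.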
3. SMOOTHING
The previous section showed that the distribution of the $\hat P_S/\hat P_N$ ratio is a function of
smoothing through $\nu_1$ and $\nu_2$. For fixed T and L the number of degrees of freedom
associated with the window (n) depends on window type (Daniell or Papoulis) and,
hence, the magnitude of the signal-to-noise ratio obtained from matching will vary
from one smoothing window to another. The specification of a matching analysis by
means of T and L is therefore incomplete if the window type is not also stated.
For statistical purposes the bandwidth of the spectral window is given by
b = n/T. The smoothing n should be chosen to give a reasonable bandwidth to the
spectral window which, as a rule of thumb, should be somewhat less than half the
trace bandwidth. If it is larger than this, the spectral estimates tend to be badly
distorted by the smoothing. An unduly large bias from smoothing is called
oversmoothing.
While different spectral windows produce similar random errors of estimation
for a given smoothing n, they still differ in the distortions and biases they introduce
into the estimates. Consider the estimate of spectral power at zero frequency. Any
power near zero frequency in the spectrum being smoothed is attenuated by the
drop-off of the main lobe of the Papoulis window, but remains undiminished within
the bandwidth of the nearly rectangular Daniell window. Hence, a larger d.c. component will be associated with the use of a Daniell window. In tests for well 1 (table
1), for example, the average d.c. level per sample of the wavelet with n = 24 was only
about 1% of the peak magnitude wavelet value for Papoulis smoothing, but nearly
10% for Daniell smoothing. Such large values should not arise if the spectra are not
oversmoothed, but some oversmoothing becomes inevitable if the match is not good
and cannot be extended over a segment of more than 500 ms. The Daniell window
is therefore best suited to quick preliminary analyses and searches, and the Papoulis window is
generally better for a final estimation.
The smoothing is an important factor in the matching procedure. Since the
chosen value n is never more than an educated guess, it is advisable to vary the
smoothing over a small range about the chosen value, and examine the relative time
alignment of the broadband synthetic seismogram and the seismic trace for these
values. Consecutive, or large, jumps in this timing are indicative of unstable estimation. Usually there is a reasonable range of satisfactory values for n over which
the error in the wavelet estimate is close to the best attainable; outside of this range
the match will be either a forced fit or a strongly biased one. This behavior is
illustrated in figs 2a and 2b. In each case the attenuated primaries trace from well 1
was convolved with a wavelet, and to the result was added random noise filtered by
a wavelet with a power spectrum very similar to that of the observed residual trace
from the original synthetic study. Scaling was carried out to give a specified S/N
ratio. For each of several choices of smoothing the wavelet was estimated by least-squares matching (White 1980). The estimated wavelet and the input (known)
wavelet were then compared and the relative error, or normalized error energy
(NEE), calculated for each smoothing. The plots of NEE for S/N ratios of 0.75 and
1.0 are shown as the heavy lines in figs 2a and 2b, respectively. It is clear that the
bias error associated with oversmoothing increases dramatically, while the random
error associated with undersmoothing increases more slowly; notice that the penalty
incurred by undersmoothing is substantially greater for the lower S/N ratio. It is
therefore very dangerous to go into matching without understanding the role of
smoothing.

Fig. 2a. Normalized error energy in wavelet estimate, and value of AIC parameter, as a
function of smoothing n; S/N = 0.75.
[Fig. 2b. Same quantities as fig. 2a for S/N = 1.0; axes: NEE %, AIC parameter, smoothing n, with the sequential F-test choice marked.]
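The comparison between the estimated and the known wavelet can be sketched as follows. The text does not spell out the normalization, so this example assumes that the error energy is divided by the energy of the known wavelet; the function name is hypothetical.

```python
def normalized_error_energy(estimated, known):
    """Relative error energy between an estimated and a known wavelet.

    Assumption: NEE = sum (estimated - known)^2 / sum known^2, with both
    wavelets sampled on the same time grid and already aligned and scaled.
    """
    if len(estimated) != len(known):
        raise ValueError("wavelets must have the same number of samples")
    error_energy = sum((e - k) ** 2 for e, k in zip(estimated, known))
    reference_energy = sum(k ** 2 for k in known)
    return error_energy / reference_energy

known = [0.0, 1.0, -2.0, 1.0, 0.0]
estimate = [0.1, 0.9, -1.8, 1.1, -0.1]
nee_percent = 100.0 * normalized_error_energy(estimate, known)
```

Repeating this for wavelets estimated with several values of the smoothing n would give one NEE value per smoothing, which is the kind of curve described for figs 2a and 2b.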
There are statistical criteria for assisting in the choice of lag window length. Two
such criteria are the Akaike Information Criterion (AIC) and the sequential F-test,
examined in detail in Bunch (1984). The optimum length corresponds to a
minimum of the AIC parameter, or to the crossing of a
preset confidence level for the sequential F-test. Examples of the AIC plot for well 1
and well 2 are given in figs 3a and 3b. For well 1 there is a clear minimum, and it is
seen that for n < 33.6 there is no shift in the timing alignment of the synthetic
seismogram and the seismic trace, and estimates around the minimum, n ≈ 28, are
by this test stable. However, for well 2 the AIC plot behaves very poorly, the peak in
the AIC plot at n ≈ 14 coinciding with a large timing alignment jump; the best
smoothing suggested by the plot is n in the range 10 to 12.
In general, even where the timing alignment is well behaved, there is a clear
tendency for these automatic methods to select lag window lengths (in ms) which are
too short from bandwidth considerations, i.e. they oversmooth. This behavior is
understandable since these automatic methods do not take account of the biases
from spectral estimation, and therefore do not penalize short operators sufficiently
hard. For example, for well 1 the bandwidth of the seismic trace was 45 Hz, while
the bandwidth of the smoothing window, given by b = n/T, was some 33 Hz for
both automatic methods. The smoothing actually chosen in the well 1 study corresponded to a bandwidth of 17.6 Hz, a much more reasonable value. In figs 2a and
2b the AIC plots (shown as dashed lines) have been superimposed on the NEE
(normalized error energy) curves, and the smoothing chosen by the sequential F-test
(90% level) has also been marked. For both S/N levels, the two automatic methods
select a smoothing which is too large, i.e. they oversmooth.
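The bandwidth rule of thumb can be checked mechanically. The sketch below uses b = n/T and flags a smoothing as suspect when the window bandwidth reaches half the trace bandwidth; the function names and the exact cut-off of one half are assumptions (the text asks for "somewhat less than half"), and the 0.75 s gate is the value implied by n = 13.2 and b = 17.6 Hz.

```python
def smoothing_bandwidth_hz(n, gate_length_s):
    """Bandwidth b = n / T of the spectral window, with T in seconds."""
    return n / gate_length_s

def is_oversmoothed(n, gate_length_s, trace_bandwidth_hz, max_fraction=0.5):
    """True when the window bandwidth reaches max_fraction of the trace bandwidth."""
    return smoothing_bandwidth_hz(n, gate_length_s) >= max_fraction * trace_bandwidth_hz

# Well 1: trace bandwidth 45 Hz, gate length assumed to be 0.75 s.
chosen = is_oversmoothed(13.2, 0.75, 45.0)      # b = 17.6 Hz
automatic = is_oversmoothed(24.75, 0.75, 45.0)  # b = 33 Hz
```

With these numbers the smoothing actually chosen passes the check, while a smoothing equivalent to the 33 Hz bandwidth selected by the automatic methods does not.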
To summarize the topic of smoothing, it is recommended that in any analysis
one should
1. investigate the effects of varying the smoothing n before deciding on a suitable
value (cf. tests for deconvolution parameters which vary the operator length);
2. state the smoothing factor and the type of window employed, since this information, together with T, completely identifies the windowing procedure;
3. adopt a consistent approach, as needed in, say, comparisons of $\hat P_S/\hat P_N$ ratios, by
applying the same type of smoothing window (say a Papoulis window) in all final
analyses;
4. look for any variations in the relative time alignment of the broadband synthetic
seismogram and the seismic trace as the smoothing is varied, since these are
[Fig. 3. AIC parameter and time alignment (ms) as functions of smoothing n, for (a) well 1 and (b) well 2.]
4. WAVELET SIMILARITY AND PHASE ERRORS
The attenuated primaries trace from well 1 was convolved with the wavelet of fig.
4a, and to the result was added colored noise produced as described in section 3.
Scaling was then carried out to give a S/N ratio of 1.0 over the 750 ms gate. The
wavelet was estimated by least-squares matching with the same parameter values as
for well 1 in table 1; it is shown in fig. 4b. The procedure was repeated with a
different realization of random noise, and the estimated wavelet is shown in fig. 4c.

[Fig. 4. (a) Wavelet used in simulation (true S/N = 1); (b) wavelet estimated from 1st simulation; (c) wavelet estimated from 2nd simulation.]
The wavelet of fig. 4b is less symmetric than that of fig. 4c. It is natural to look
at the phases of the two wavelets to see if they differ significantly (in fact, the error
energy from matching is on average partitioned equally between the phase and
relative amplitude errors). A $100(1-\alpha)\%$ confidence interval for phase $\theta(f)$ is given
by Jenkins and Watts (1968, p. 434) as
$$\hat\theta(f) \pm \arcsin\left\{\left[\frac{2}{2n-2}\,F_{2,2n-2;1-\alpha}\,\frac{1-\hat\gamma^2(f)}{\hat\gamma^2(f)}\right]^{1/2}\right\},$$
where $\hat\gamma^2(f)$ is the estimated coherence for this one-channel case, and $F_{2,2n-2;1-\alpha}$ is
the $100(1-\alpha)\%$ point of the $F_{2,2n-2}$ distribution. Since $|\sin(x)| \le 1$ it follows that
the coherence must exceed a threshold for application of this formula. For a 90%
interval, with n = 13.2, as here, $F_{2,24;0.9}$ is found to be 2.534 and thus it is required
that $\hat\gamma^2 > 0.172$. For frequencies f such that $\hat\gamma^2(f) > 0.172$, the 90% confidence
interval for phase for the wavelet of fig. 4b has been plotted in fig. 5a. Also marked
on the diagram is the known phase of the wavelet used in the simulation, i.e. that in
fig. 4a. It can be seen that four out of five intervals include the true value. This
analysis is repeated in fig. 5b for the wavelet of fig. 4c, and this time all five intervals
include the true phase values. (The formula of Jenkins and Watts 1968 given above
is not the only one possible; a full discussion of the estimation of confidence intervals on gain and phase of frequency response functions is given in Walden 1984.)

Fig. 5. 90% confidence intervals for phase from estimated wavelets and known phase of input
(simulation) wavelet for (a) first simulation and (b) second simulation.
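The coherence threshold quoted above can be reproduced with a few lines. The sketch assumes the interval has the arcsine form sin^2(half-width) = [2/(2n - 2)] F (1 - coherence)/coherence, which is consistent with the quoted figures (n = 13.2, F = 2.534, threshold 0.172); the function names are hypothetical and the F percentage point is supplied by the caller rather than computed.

```python
import math

def coherence_threshold(n, f_crit):
    """Smallest estimated coherence for which the phase interval exists."""
    c = 2.0 * f_crit / (2.0 * n - 2.0)
    return c / (1.0 + c)

def phase_halfwidth(coherence, n, f_crit):
    """Half-width (radians) of the phase confidence interval, or None.

    None is returned when the coherence is too low for the arcsine to exist.
    """
    sin_sq = (2.0 / (2.0 * n - 2.0)) * f_crit * (1.0 - coherence) / coherence
    if sin_sq > 1.0:
        return None
    return math.asin(math.sqrt(sin_sq))

threshold = coherence_threshold(13.2, 2.534)
halfwidth_deg = math.degrees(phase_halfwidth(0.5, 13.2, 2.534))
```

At a coherence of 0.5 the half-width comes out at somewhat under 30 degrees, which gives a feel for how wide the phase intervals are at an S/N ratio near 1.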
The confidence intervals calculated for the phase indicate that there is nothing
anomalous about the phase of the wavelet of fig. 4c; indeed, the phase estimates for
each wavelet, figs 4b and 4c, are consistent with the phases of the simulation wavelet
4a, even though 4b and 4c look so different.
Before one can look at wavelets and call them different, one has to have some
idea of the range of variation likely from the estimation procedure. One way of
doing this would be to display a large number of wavelets estimated by simulations
like those used to produce figs 4b and 4c. This has been done on a small scale, and
the results are displayed in fig. 6. The top and bottom wavelets are the input
(known) wavelet, and the intermediate wavelets are 10 independent estimates for
different realizations of the noise component. It is interesting to convolve each of
these wavelets with the attenuated primaries trace and then compare these filtered
synthetics; this has been done in fig. 7, and one now has to look much closer to see
dissimilarities. Plots of the phase and its confidence range, and of amplitude too if
desired, are obviously much more concise than the displays of figs 6 and 7; they are
quantitative, and they can pinpoint the cause of any difference precisely. Another
possibility for comparing wavelet shapes would be to compute the mean square
difference between the two wavelets (after appropriate scaling) and devise some test
related to (2) of section 2, but it would be less diagnostic than the use of confidence
intervals.
[Fig. 6. Input simulation wavelet (top and bottom) and independent estimates (between).]
[Fig. 7. Filtered ... (caption truncated); axis: time in seconds.]
5. TESTING THE MODEL
Having estimated a wavelet, a final ideal stage in the analysis would be to check the
validity of the model assumptions about the nature of the noise, detailed in
section 1.
Properties of the noise are tested by examining the residuals from the match
defined by
$$\hat u(t) = y(t) - \hat h(t) * x(t).$$
We use two methods to test, in a univariate sense, that the amplitude distribution of the residuals is normal (Gaussian): one graphical, known as Q-Q plotting
(Wilk and Gnanadesikan 1968), and one numerical, employing a statistic called the
Cramér-von Mises (CVM) statistic (Stephens 1974, 1976). The latter appears to be
the best quantitative goodness-of-fit test for this purpose.
The Q-Q plot emphasizes visually any departures from normality, especially in
the tails of the distribution which is where they are most likely to occur. A sample
from a normal distribution will plot as a straight line, and systematic deviations
from a straight line indicate non-normality. The Q-Q plot in fig. 8 closely approximates a straight line, the deviations for large values being due to only about three
points; hence normality is indicated. Of course, any sample shows fluctuations
about a straight line, and it is for this reason that a quantitative test is useful to sort
out borderline cases.

[Fig. 8. Q-Q plot: ordered residuals, standardised to variance of 1, against corresponding Gaussian quantiles.]
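Both normality checks are easy to prototype. The sketch below is an illustration under stated assumptions rather than the procedure used in the paper: the plotting positions (i - 0.5)/N and the small-sample factor (1 + 0.5/N) for the case of estimated mean and variance are choices made here and should be checked against Stephens (1974) before use; the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Pairs (Gaussian quantile, ordered standardised residual) for a Q-Q plot."""
    count = len(residuals)
    m, s = mean(residuals), stdev(residuals)
    ordered = sorted((r - m) / s for r in residuals)
    std_normal = NormalDist()
    # Assumed plotting positions: (i - 0.5) / N for i = 1..N.
    quantiles = [std_normal.inv_cdf((i - 0.5) / count) for i in range(1, count + 1)]
    return list(zip(quantiles, ordered))

def cramer_von_mises(residuals):
    """Cramér-von Mises statistic for normality with mean and variance estimated."""
    count = len(residuals)
    m, s = mean(residuals), stdev(residuals)
    std_normal = NormalDist()
    z = sorted(std_normal.cdf((r - m) / s) for r in residuals)
    w_sq = sum((zi - (2 * i - 1) / (2 * count)) ** 2
               for i, zi in enumerate(z, start=1)) + 1.0 / (12 * count)
    # Assumed small-sample modification; the result is compared with tabulated
    # percentage points, which are not reproduced here.
    return w_sq * (1.0 + 0.5 / count)
```

A sample close to Gaussian gives points near a straight line and a small statistic; a strongly skewed sample bends the Q-Q plot and inflates the statistic.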
Fig. 9. Residual trace and moving statistics from matching the synthetic at well 6 to (a) line A
seismic trace and (b) line B seismic trace.
Fig. 10. Two examples of scatter plots and moving statistics of residuals against estimated
filtered synthetic seismograms from matching.
wells. In the scatter plot of fig. 10a there is a clear tendency towards positive
residuals coincident with negative values of estimated filtered synthetic seismograms
and negative residuals coincident with positive values of estimated filtered synthetic
seismograms. Such notable behavior is indicative of a poorly fitting model and
arises even though the estimated signal and the noise are orthogonal. The moving
statistics emphasize the trend. In contrast, the scatter plot behavior in fig. 10b is
quite satisfactory, and the moving statistics are really quite parallel and horizontal.
The residuals from matching have been used as approximations to the true
unknown errors in order to assess the properties of the latter. Simulations were used
to gauge
a. the correctness of making inference about the true errors from the residuals, and
b. the utility of the methods used for examining the estimated noise or residuals (i.e.,
the utility of qualitative and quantitative tests for normality, displays of moving
statistics to assess stationarity, and scatter plots and displays of their moving
statistics to assess correlation between residuals and estimated filtered broadband
synthetic seismograms).
These simulations gave no cause for concern with respect to (a), but showed that
it is usually difficult to separate sampling fluctuations from real trends, especially
where no quantitative test is forthcoming. This difficulty is often due to durations of
passable matches between seismic data and synthetics being too short to allow
simple clear-cut answers. However, when more distinctive behavior, such as seen in
fig. 10, does occur, it can be very useful.
Results showed that in matching real data the assumption that the noise is
normally distributed was almost always clearly supported by the results of the two
tests for normality applied to the residuals from the match. These tests are valid
when the error series is stationary.
6. SUMMARY
The main statistical points to consider when attempting a match are:
a. Selection and investigation of the right spectral smoothing according to the
guidelines in section 3 and knowledge of the likely spectral content of the seismic
wavelet from the recording and processing parameters.
b. The measures of accuracy can be helpful when scanning for a good match over
several traces and different time gates. The interpretation of the results of such
scans demands a careful geophysical assessment; for example, any shift of the
best fit trace from the well location has to be reconciled with likely navigational
errors and, when matching migrated data, with the possible consequences of
incorrect migration velocities. This paper has dealt solely with the statistical
accuracy of matching and has made no attempt to cover all its practical ramifications.
c. The so-called optimal smoothing criteria, the AIC and sequential F-tests, should
be treated with caution. The criteria consistently underestimate the physical
wavelet length, or equivalently oversmooth the seismic spectra.
d. If the $\hat P_S/\hat P_N$ ratio exceeds the 90% confidence level for detection then it is concluded that a valid detection has been achieved. The size of the S/N ratio can be
used to quantify the accuracy of the estimate, as detailed in section 2.
e. Phase effects can be deceptive in wavelet estimation. Plotting confidence intervals
on phase gives one a better appreciation of the magnitude of phase uncertainty.
Confidence intervals on amplitude (gain) may also prove useful.
f. Check the residuals from the match for normality, approximately constant
variance over time, and lack of correlation with the estimated filtered broadband
synthetic seismogram. If any of these assumptions are violated, downgrade the
reliability of the estimate. Of course, if it is geophysically desirable or possible to
select a substantially different matching gate then this problem may perhaps be
effectively overcome.
ACKNOWLEDGMENTS
We thank Dr P.N.S. O'Brien for helpful comments on a BP report on which this
paper is based, and the Chairman and Board of Directors of the British Petroleum
Company plc for permission to publish the work.
APPENDIX A
THE DISTRIBUTION OF THE ESTIMATED S/N POWER RATIO
Fourier transform of the single-channel convolutional model
$$y(t) = h(t) * x(t) + u(t)$$
gives
$$Y(f) = H(f)X(f) + U(f),$$
...
$$\hat\Phi_{yy.x}(f) = W(f) * [(1/T)\,|Y(f) - \hat H(f)X(f)|^2],$$
where W(f) is the spectral window employed. It is assumed that H(f) varies little
across this window. Then the least-squares estimate of H(f) is
$$\ldots \qquad \hat\Phi_{yy}(f) = |\hat H(f)|^2\hat\Phi_{xx}(f) + \hat\Phi_{yy.x}(f), \tag{A2}$$
where $\hat\Phi_{xx}(f)$ and $\hat\Phi_{yy}(f)$ are the smoothed autospectra of x(t) and y(t) and $\hat\Phi_{xy}(f)$
their smoothed cross-spectrum. By writing
$$U(f) = \hat U(f) + \Delta H(f)X(f), \qquad \Delta H(f) = \hat H(f) - H(f),$$
and making use of (A1) (orthogonality of input and residuals) one can relate the
estimated residual spectrum to the sample noise power spectrum $\hat\Phi_{uu}(f)$:
$$\hat\Phi_{uu}(f) = \hat\Phi_{yy.x}(f) + |\Delta H(f)|^2\hat\Phi_{xx}(f). \tag{A3}$$
$\ldots = \nu$,
it follows that
$$E\{|\hat H(f) - H(f)|^2\hat\Phi_{xx}(f)\} = (1/n)E\{\hat\Phi_{uu}(f)\} = \Phi_{uu}(f)/n.$$
Equation (1) of section 2, namely,
$$E\left\{\int \hat\Phi_{xx}(f)\,|\hat H(f) - H(f)|^2\,df\right\} = P_N/n,$$
follows on integrating over frequency. This equation also gives the negative bias that
results from estimating the total noise power by integrating $\hat\Phi_{yy.x}(f)$.
The formation of equations and distributions for estimators obtained through
matching has required assumptions about the noise u(t) and the smoothness of H(f).
What of the input x(t)? In matching this is supplied and it can be treated as a
known driving function. In particular, there is no necessity to regard x(t) as stochastic and the use of the notation $\hat\Phi_{xx}(f)$ here does not imply that $\hat\Phi_{xx}(f)$ has a
statistical distribution; it simply denotes a known smoothed autospectrum. It may
be convenient for purposes other than the matching to treat x(t) as stochastic and a
brief indication of how matching can be linked to this approach is given after the
other derivations that are the aim of this appendix.
The distribution of the estimated signal spectrum $|\hat H(f)|^2\hat\Phi_{xx}(f)$ follows from the
chi-squared decomposition of (A2). The reasoning parallels that related to (A3), but
now the distribution of the sum is noncentral chi-squared (Johnson and Kotz 1970,
Chap. 28) because any realization of the Gaussian noise has the same signal
h(t) * x(t) added to it. The noncentrality comes entirely from the signal spectrum,
from the fixed component H(f) in $\hat H(f)$. To convert the quantities in (A2) to standardized chi-squared variables, it is multiplied by $2n/\Phi_{uu}(f)$ as before and the 2n
degrees of freedom associated with $\hat\Phi_{yy}(f)$ split into 2 for $|\hat H(f)|$ and (2n - 2) for
$\hat\Phi_{yy.x}(f)$.
That is,
$$\frac{2n\,|\hat H(f)|^2\hat\Phi_{xx}(f)}{\Phi_{uu}(f)}$$
has a $\chi^2_{2,\lambda}$ distribution, where the noncentrality parameter is
$$\lambda = \frac{2n\,|H(f)|^2\hat\Phi_{xx}(f)}{\Phi_{uu}(f)}.$$
The value of $\lambda$ comes by definition from setting the random components to zero.
Alternatively it can be derived from
$$E\{|\hat H(f)|^2\hat\Phi_{xx}(f)\} = \Phi_{yy}(f) - E\{\hat\Phi_{yy.x}(f)\} = |H(f)|^2\hat\Phi_{xx}(f) + \Phi_{uu}(f)/n$$
...
$$P_y = P_S + P_N.$$
...
$$\lambda = \nu P_S/P_N$$
... $(\nu\hat P_S)/P_N$ ... $\int |\hat H(f)|^2\hat\Phi_{xx}(f)\,df$ to zero or from
$$E\{(\nu\hat P_S)/P_N\} = (\nu/P_N)(P_S + [P_N/n]) = E\{\chi^2_{\nu_1,\lambda}\} = \nu_1 + \lambda.$$
Thus the ratio $(\nu_2\hat P_S)/(\nu_1\hat P_N)$ has the noncentral F-distribution $F_{\nu_1,\nu_2,\lambda}$ as stated in
section 2.
The variance of a $\chi^2_{\nu,\lambda}$ variable is $4\lambda + 2\nu$ ... containing $\hat\Phi_{ss}(f) = |\hat H(f)|^2\hat\Phi_{xx}(f)$ gives ...
to order 1/n. Exactly the same result can be derived for the large-sample variance of
$\hat\Phi_{ss}(f)$ from the expressions for spectral variances and covariances given by
Goodman (1957), which are founded on a fully stochastic model for both x(t) and
y(t). In a similar way, the $\chi^2_{2n,\lambda}$ distribution for $\hat\Phi_{yy}(f)$ leads to
...
and it is only after adding the signal sampling variance $\Phi_{ss}^2(f)/n$ that one obtains the
standard expression for the variance of a power spectral estimate
$$\mathrm{var}\{\hat\Phi_{yy}(f)\} = \Phi_{yy}^2(f)/n. \tag{A7}$$
Equations (A4) and (A6) can be termed noise sampling variances whereas the total
stochastic variances (A5) and (A7) include a variance that arises from treating the
signal also as a sample from a stochastic process. In matching, the assumption of a
stochastic signal is superfluous since the distribution theory for the estimators can
be developed from expectations that range solely over the postulated ensemble of
noise samples. The smoothed spectrum $\hat\Phi_{xx}(f)$ in this theory is analogous to the
sums of squares and products matrix in regression theory and its appearance does
not imply that x(t) is stochastic, although it does restrict the complexity of H(f).
Even if a stochastic signal is assumed, the replacement of $\hat\Phi_{xx}(f)$ by its population
value would introduce an unnecessary approximation using an unknown quantity.
Note too that in stochastic models of seismic traces containing a common signal,
(A6) is a more appropriate measure of power spectral variance than the standard
expression (A7) when only one specimen of signal is being considered.
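APPENDIX B
The noncentral F-distribution can be closely approximated by a standard central
F-distribution (which is extensively tabulated and available in most algorithm
libraries) using a result of Patnaik (1949), viz.
$$\Pr\{F_{\nu_1,\nu_2,\lambda} \le f^*\} \approx \Pr\{F_{\nu_3,\nu_2} \le f^{**}\},$$
where
$$f^* = (\nu_1 + \lambda)f^{**}/\nu_1$$
and
$$\nu_3 = (\nu_1 + \lambda)^2/(\nu_1 + 2\lambda).$$
With $\lambda = \nu_1\Lambda^2$ (section 2),
$$f^* = (1 + [\lambda/\nu_1])f^{**} = (1 + \Lambda^2)f^{**}$$
and
$$\nu_3 = \frac{\nu_1(1 + [\lambda/\nu_1])^2}{1 + [2\lambda/\nu_1]} = \frac{\nu_1(1 + \Lambda^2)^2}{1 + 2\Lambda^2}.$$
...
$$\Pr\{F_{\nu_1,\nu_2,\nu_1\Lambda_1^2} \ge (\nu_2/\nu_1)(\hat P_S/\hat P_N)\} = \ldots$$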
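The bookkeeping in the Patnaik approximation can be sketched as below. The function only rescales the argument and adjusts the numerator degrees of freedom; evaluating the central F probability itself is left to whatever statistical library is at hand. The function name is hypothetical.

```python
def patnaik_central_f(f_star, nu1, noncentrality):
    """Map a noncentral F probability onto an approximating central F.

    Pr{F(nu1, nu2, lambda) <= f_star} is approximated by
    Pr{F(nu3, nu2) <= f_double_star}, with
        f_double_star = f_star / (1 + lambda / nu1)
        nu3 = (nu1 + lambda)^2 / (nu1 + 2 lambda).
    Returns (f_double_star, nu3); nu2 is unchanged.
    """
    scale = 1.0 + noncentrality / nu1
    nu3 = (nu1 + noncentrality) ** 2 / (nu1 + 2.0 * noncentrality)
    return f_star / scale, nu3

f_double_star, nu3 = patnaik_central_f(3.0, 4.0, 4.0)
```

With a noncentrality of zero the function returns its inputs unchanged, as it should, since the noncentral distribution then reduces to the central one.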
REFERENCES
BLOOMFIELD, P. 1976, Fourier Analysis of Time Series: An Introduction, Wiley, New York.
BUNCH, A.W.H. 1984, Predicting the optimal least squares filter using adaptations of standard statistical theory, BP Report Ext. 25628.
CLEVELAND, W. and KLEINER, B. 1975, A graphical technique for enhancing scatterplots with moving statistics, Technometrics 17, 447-454.
GOODMAN, N.R. 1957, On the joint estimation of the spectra, cospectrum and quadrature spectrum of a two-dimensional stationary Gaussian process, Scientific Paper No. 10, Engineering Statistics Laboratory, New York University, also University of Princeton thesis.
JENKINS, G.M. and WATTS, D.G. 1968, Spectral Analysis and its Applications, Holden-Day, San Francisco.
JOHNSON, N.L. and KOTZ, S. 1970, Continuous Univariate Distributions-2, Wiley, New York.
PAPOULIS, A. 1973, Minimum-bias windows for high-resolution spectral estimates, IEEE Transactions on Information Theory IT-19, 9-12.
PATNAIK, P. 1949, The non-central χ² and F-distributions and their applications, Biometrika 36, 202-232.
STEPHENS, M. 1974, EDF statistics for goodness-of-fit and some comparisons, Journal of the American Statistical Association 69, 730-737.
STEPHENS, M. 1976, Asymptotic results for goodness-of-fit statistics with unknown parameters, Annals of Statistics 4, 357-369.
WALDEN, A.T. 1984, Confidence intervals on gain and phase of frequency response functions, BP Report Ext. 25477.
WHITE, R.E. 1980, Partial coherence matching of synthetic seismograms with seismic traces, Geophysical Prospecting 28, 333-358.
WILK, M.B. and GNANADESIKAN, R. 1968, Probability plotting methods for the analysis of data, Biometrika 55, 1-17.