Yangkang Chen∗
Accepted 2017 September 26. Received 2017 September 25; in original form 2017 August 21
SUMMARY
Effective and efficient arrival picking plays an important role in microseismic and earthquake data processing and imaging. Widely used arrival picking algorithms based on the short-term-average/long-term-average ratio (STA/LTA) are sensitive to moderate-to-strong random ambient noise. To make state-of-the-art arrival picking approaches effective, microseismic data must first be pre-processed, for example by removing a sufficient amount of noise, and only then analysed by arrival pickers. To overcome the noise issue in arrival picking for weak microseismic or earthquake events, I leverage machine learning techniques to help recognize seismic waveforms in microseismic or earthquake data. Because supervised machine learning algorithms depend on a large volume of well-designed training data, I utilize an unsupervised machine learning algorithm to cluster the time samples into two groups, that is, waveform points and non-waveform points. The fuzzy clustering algorithm is demonstrated to be effective for this purpose. A group of synthetic, real microseismic and earthquake data sets with different levels of complexity shows that the proposed method is much more robust than the state-of-the-art STA/LTA method in picking microseismic events, even in the case of moderately strong background noise.
Key words: Inverse theory; Time-series analysis; Earthquake source observations.
© The Author 2017. Published by Oxford University Press on behalf of The Royal Astronomical Society.
Unsupervised machine learning 89
data present in the entire shot record, which allows us to adjust the trace-by-trace picks and to discard picks associated with bad or dead traces. Velis et al. (2015) utilized pattern recognition techniques to detect waveforms from microseismic data and used a reduced-rank filtering approach to improve the SNR of the waveforms.

Coppens (1985) proposed a fully automatic method to pick first arrivals that makes use of the delay-time method in order to compute static corrections at each shot position. From the automatically picked first arrivals on common-offset gathers, the delay times,

(Forghani-Arani et al. 2013). The multichannel denoising methods rely on a fairly dense spatial sampling of the data. For most microseismic monitoring projects, where the number of spatial geophones is not large enough, the multichannel denoising methods are not applicable or cannot obtain acceptable results.

Han & van der Baan (2015) applied the ensemble empirical mode decomposition (EEMD) method combined with an adaptive interval thresholding strategy to denoise microseismic data. Empirical mode decomposition (EMD) was developed by Huang et al. (1998) to
I organize the paper as follows: I first give a brief introduction to the concept of clustering analysis, and then I formulate the microseismic event picking problem as a clustering problem. Next, I introduce the iterative solver for the fuzzy clustering problem. I then use a group of examples with comprehensive analysis and discussion to demonstrate the performance of the proposed method and compare it with the state-of-the-art STA/LTA method. Finally, I draw some key conclusions at the end of the paper.

THEORY

Clustering analysis

Clustering analysis is a type of unsupervised machine learning approach. The target of clustering analysis is to group the input data into several clusters according only to the inherent features of the input data set. The number of groups can be defined in advance according to the purpose of a specific problem. Simply speaking, each cluster after clustering analysis is a collection of objects that share some similarity that distinguishes them from objects in the other clusters. Clustering is driven only by the choice of input features (or attributes) and the number of desired clusters, and thus is much more flexible than supervised machine learning techniques, where a large amount of training data is required.

An important component of a clustering algorithm is the distance measure between data points. If the components of the data instance vectors are all in the same physical units, then the simple Euclidean distance metric may be sufficient to successfully group similar data instances. The following Minkowski metric is a common way of measuring distance for N-dimensional data:

$$ d(x_i, x_j) = \|x_i - x_j\|_p, \qquad (1) $$

where $d(x_i, x_j)$ denotes the distance between $x_i$ and $x_j$, and $x_i$ denotes the ith N-dimensional data point. $x_i$ and $x_j$ are both vectors. $\|\cdot\|_p$ denotes the p-norm of the input vector. The Euclidean distance is a special case where p = 2, while the Manhattan metric has p = 1. However, there are no general theoretical guidelines for selecting a measure for any given application.

Microseismic event picking as a clustering problem

A microseismic record can be classified into waveform and non-waveform components. The first index of the waveform components can be treated as the arrival of the microseismic event. The essence of the arrival picking problem is thus turned into a classification problem given a group of data points. When a group of training data is given together with predefined data features, the classification problem can be viewed as a supervised classification problem (e.g. binary classification). If one instead wants to classify the microseismic record using only the data itself, the problem becomes a classic clustering analysis problem.

The most important factor that affects the performance of the clustering analysis is the selected feature vector. In the algorithm, I propose three features to construct the feature vector, which are mean, power, and STA/LTA. All these features are measured in the time domain.

The three features are defined as follows:

(i) Mean M

$$ M(i) = \frac{1}{N} \sum_{j=i-w}^{i+w} d(j) \qquad (2) $$

(ii) Power E

$$ E(i) = \sum_{j=i-w}^{i+w} d^2(j) \qquad (3) $$
Figure 2. Predefined features of (a) clean data and (b) noisy data for clustering.
(iii) STA/LTA R

$$ \mathrm{STA}(i) = \frac{1}{N_{\mathrm{STA}}} \sum_{j=i-N_{\mathrm{STA}}}^{i} d(j) $$

$$ \mathrm{LTA}(i) = \frac{1}{N_{\mathrm{LTA}}} \sum_{j=i-N_{\mathrm{LTA}}}^{i} d(j) $$

$$ R(i) = \mathrm{STA}(i)/\mathrm{LTA}(i) \qquad (4) $$

w denotes half of the window length. d(i) denotes the input seismic data. $N_{\mathrm{STA}}$ and $N_{\mathrm{LTA}}$ denote the short-term and long-term periods, respectively.

Fig. 1 shows a simple example demonstrating the extracted features. Fig. 1 contains two synthetic data sets, one clean and one noisy. The noisy data contain a large amount of noise that almost buries the effective signals. The SNR of the noisy data is −3.64 dB. The red circles in panels (a) and (b) are the arrival indices picked using the presented algorithm. In both cases, the presented algorithm picks the arrivals successfully. Fig. 2 shows the three aforementioned
92 Y. Chen
Figure 3. Calculated membership values of (a) clean data and (b) noisy data that define different clusters.
features used for clustering for clean data (Fig. 2a) and noisy data (Fig. 2b).

Fuzzy clustering

Fuzzy clustering belongs to the type of overlapping clustering, and uses fuzzy sets to cluster data, so that each point may belong to two or more clusters with different degrees of membership. In this case, data will be associated with an appropriate membership value.

Fuzzy c-means is a method of clustering which allows one piece of data to belong to two or more clusters (Dunn 1973; Bezdek 1981). It is based on minimization of the following objective function

$$ J = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{i,j}^{m} \|x_i - c_j\|^2, \quad 1 \le m \le \infty, \qquad (5) $$
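As an illustration only, the feature definitions in eqs (2)–(4) and the minimization of eq. (5) might be sketched in Python with NumPy as follows. The function names `window_features`, `fuzzy_c_means` and `objective` are hypothetical. The clipping of windows at the trace edges, the guard against a vanishing LTA, the random initial memberships and the tolerance are assumptions of this sketch. The centre update is the standard fuzzy c-means one (membership-weighted mean), assumed here rather than taken from the text above.

```python
import numpy as np


def window_features(d, w, n_sta, n_lta):
    """Return an (n, 3) array holding [mean, power, STA/LTA] for every sample of d."""
    d = np.asarray(d, dtype=float)
    n = d.size
    feats = np.zeros((n, 3))
    for i in range(n):
        # Centred window of half-length w, clipped at the trace edges (assumption).
        seg = d[max(0, i - w):min(n, i + w + 1)]
        feats[i, 0] = seg.mean()            # eq. (2), N taken as the window length
        feats[i, 1] = np.sum(seg ** 2)      # eq. (3)
        # Trailing windows ending at sample i, as in eq. (4).
        sta = d[max(0, i - n_sta):i + 1].mean()
        lta = d[max(0, i - n_lta):i + 1].mean()
        # Guard against a vanishing LTA (assumption, not part of eq. 4).
        feats[i, 2] = sta / lta if abs(lta) > 1e-12 else 0.0
    return feats


def fuzzy_c_means(x, c=2, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimise eq. (5) by alternating centre and membership updates.

    Returns (u, centres): u has shape (n_points, c), each row summing to one.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))         # random initial memberships (assumption)
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        um = u ** m
        # Standard FCM centre update: membership-weighted mean of the points.
        centres = (um.T @ x) / um.sum(axis=0)[:, None]
        dist = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)      # avoid division by zero at a centre
        inv = dist ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)   # membership update
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    return u, centres


def objective(x, u, centres, m=2.0):
    """Evaluate J in eq. (5) for given memberships and centres."""
    x = np.asarray(x, dtype=float)
    dist2 = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return float(((u ** m) * dist2).sum())
```

A trace would typically be processed as `fuzzy_c_means(window_features(trace, w, n_sta, n_lta), c=2, m=2.0)`. Whether the features should be rescaled first, and whether the trace should be rectified before the STA/LTA is computed, are choices this sketch leaves open.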
$$ = \frac{\|x_i - c_j\|^{-\frac{2}{m-1}}}{\sum_{k=1}^{C} \left( \frac{1}{\|x_i - c_k\|} \right)^{\frac{2}{m-1}}} = \frac{\|x_i - c_j\|^{-\frac{2}{m-1}}}{\sum_{k=1}^{C} \|x_i - c_k\|^{-\frac{2}{m-1}}}. \qquad (10) $$

for easier implementation. The above iteration terminates either when the maximum number of iterations (e.g. 100) is reached or when it has converged (e.g. $\|U_{k+1} - U_k\| < \epsilon$). The obtained membership values $u_{i,j}$ are then used to detect the microseismic event.

Fig. 3 shows the membership values calculated by iterative estimation for the data shown in Fig. 1. Fig. 3(a) corresponds to the clean data and Fig. 3(b) corresponds to the noisy data. It is clear from Fig. 3 that a microseismic arrival exists when the membership value jumps from one to zero (or from zero to one). Via this criterion, one can automatically detect the microseismic event. In the DISCUSSION section, I give a brief discussion on the implementation and parameter selection for the aforementioned algorithm.

EXAMPLES

To numerically evaluate the performance, I use the arrival picking error metric, which is defined as follows:

$$ \mathrm{Error} = \sum_{h=1}^{H} |I(h) - \hat{I}(h)|, \qquad (11) $$

where Error denotes the picking error measured in samples. I(h) denotes the index corresponding to the exact first arrival for the hth trace in a multichannel 2-D microseismic record and $\hat{I}(h)$ denotes the picked arrival (index). The exact arrival is found by applying the proposed method to clean synthetic data.
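The membership-jump criterion and the error metric of eq. (11) could be sketched as below. This is a minimal illustration, not the author's implementation: the function names are hypothetical, and treating a crossing of the 0.5 level as the "jump" between zero and one is an assumption of the sketch.

```python
from typing import Optional, Sequence

import numpy as np


def pick_from_membership(u: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    """Return the first sample index at which the membership crosses `threshold`.

    Works for a zero-to-one or a one-to-zero jump; returns None if the
    membership never crosses the threshold (assumed convention).
    """
    above = np.asarray(u, dtype=float) > threshold
    # Positions k where sample k and sample k + 1 lie on different sides.
    jumps = np.flatnonzero(above[1:] != above[:-1])
    return int(jumps[0]) + 1 if jumps.size else None


def picking_error(true_idx: Sequence[int], picked_idx: Sequence[int]) -> int:
    """Sum over traces of |I(h) - I_hat(h)|, in samples, as in eq. (11)."""
    true_arr = np.asarray(true_idx)
    picked_arr = np.asarray(picked_idx)
    return int(np.abs(true_arr - picked_arr).sum())
```

Because the two columns of a two-cluster membership matrix sum to one, applying `pick_from_membership` to either column gives the same index.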
I first use a multichannel synthetic data set to demonstrate the performance of the proposed method. The microseismic data set is simulated from the two-layer velocity model shown in Fig. 4. The blue inverted triangles in the first layer denote the 50 evenly spaced geophones for recording the microseismic signals. The blue dots in the second layer denote the two microseismic sources. The two sources are generated during the hydraulic fracturing process. I use the acoustic wave to simulate the recorded data, as shown in Fig. 5. Figs 5(a) and (b) show the clean microseismic data. The 50 spatial traces correspond to recorded data from 50 geophones. The red circles in Fig. 5(a) correspond to the picked first arrivals for this clean data set using the proposed clustering method. The blue circles in Fig. 5(b) correspond to the picked first arrivals using the STA/LTA method. From this clean data test, it is clear that both the clustering method and the STA/LTA method work well in detecting the first arrivals when the SNR is very high. As a comparison, I then conduct an experiment for noisy data. I simulate the noisy data by adding some random noise with SNR = 0.24 dB. The
arrival picking results are shown in Fig. 6. From Fig. 7(a), one can see clearly that the red circles, which correspond to the proposed method, successfully pick all the first arrivals without any mistake. In contrast, most arrivals picked by the traditional STA/LTA method, indicated by the blue circles in Fig. 7(b), are incorrect. The noisy data example demonstrates that, for a noisy data set, the proposed method is robust enough to obtain acceptable arrival picking results while the STA/LTA method cannot perform well. In other words, the proposed method is not sensitive to noise while the STA/LTA method is very sensitive to ambient random noise.

Comparing the arrival picking performance of the two methods on raw noisy data is not entirely fair. In practice, the noisy data are usually denoised first and then passed to the arrival picker. To compare the performance of the two methods on denoised data, I apply a multichannel denoising operator to the noisy data shown in Fig. 6 to remove most of the noise. The multichannel denoising operator I use is the damped multichannel singular spectrum analysis (DMSSA) algorithm, proposed by Huang et al. (2016). It is worth noting that since for this example, the spatial sampling is dense and the number of spatial traces is relatively high (50 in this case), one
method. There are 40 receivers for this data set and the red circles indicate the picked arrival indices. Fig. 11(b) shows the result from the STA/LTA method. The performance of the proposed method is very close to excellent except for one picking mistake, as shown in the fourth trace in Fig. 11(a). Because the quality of this data set is relatively high, the STA/LTA method obtains accurate picks in some traces, but for most traces it fails to pick the accurate arrival. Note that in this test, I apply the two methods directly to the raw microseismic data, so the performance of the two methods demonstrates their relative robustness to noise. It is worth mentioning that the STA/LTA method is usually implemented after an initial denoising step is done on the raw data, where the SNR becomes much better and more tolerable for the STA/LTA method.

I extract the 20th trace in Fig. 11 for a single-trace comparison in Fig. 12. Figs 12(a) and (b) show the results from the two methods, where one can more clearly see the 1-D waveform signals and the difference in picked time indices. The three features, namely, Mean, Power, and STA/LTA, are shown in Fig. 13 for the selected single trace. It is very clear that the mean and power features are relatively insensitive to noise, since before 0.35 s both mean and power are almost zero. The STA/LTA, however, is much more sensitive to
Figure 15. Arrival picking results for the second real surface microseismic data with extremely strong background noise using the presented method.
Figure 16. Arrival picking results for the second real surface microseismic data with extremely strong background noise using the STA/LTA method.
ambient noise, as one can observe many spurious peaks in the STA/LTA curve. The integrated clustering analysis over the two less noise-sensitive features (i.e. mean and power) and a more noise-sensitive feature (i.e. STA/LTA) through the fuzzy c-means framework accounts for the anti-noise superiority of the presented method over the traditional STA/LTA method. The two groups of membership values of the real single microseismic trace are shown in Fig. 14. A distinct zero-to-one jump around 168 ms can be observed in the top panel of Fig. 14.

I then show a more complicated real microseismic event. Fig. 15 shows the original record with the horizontal components H1 and H2, and the vertical component, respectively. Twelve three-component geophones are used to record the signals from hydraulic fracturing. The microseismic record is noisy and the amplitude of the events is weak, so not all signals are immediately detectable. I apply the proposed clustering method and the STA/LTA method to the data set and show the results in Figs 15 and 16, respectively. This data set is much noisier than the previous example. However, the proposed method is still very robust in detecting those arrivals, which appear spatially coherent, as indicated by the red circles in Fig. 15. The results from the STA/LTA method, however, are scattered, as indicated by the blue circles in Fig. 16. Without denoising, the STA/LTA method can hardly detect the microseismic events accurately, while the proposed method works properly even on severely corrupted data.

The last real data example is an earthquake data stack profile. Fig. 17 shows Professor Peter Shearer's stacks over many earthquakes at a constant epicentral distance (offset angle) (Shearer 1991a,b). The data in Fig. 17 are much improved over the raw data because different earthquake records have been stacked. Different seismic phases can
DISCUSSIONS

The implementation of the proposed clustering method is quite straightforward and it is easy to control the performance of the algorithm. The four steps shown in the THEORY section are almost the whole framework of the algorithm. Once the feature vectors are fixed (e.g. mean, power, and STA/LTA), the only parameters that need to be defined are the exponent parameter m for the membership matrix U (see eq. 5), which is usually fixed as m = 2, and the number of clusters C, which is fixed as C = 2. In other words, the method is fully automatic and requires almost no human intervention.

CONCLUSIONS

Microseismic and earthquake data may contain strong, disruptive background noise that may heavily affect the waveform arrival picking, and could further result in an unconvincing tomographic model that is based on the arrival picking. I have introduced an effective and intelligent arrival picking algorithm that is based on clustering analysis. The three features (Mean, Power, STA/LTA) fed into the intelligent clustering algorithm make the clustering engine capable of detecting arrivals in extremely noisy environments without the need to pre-process the data. The presented arrival picking algorithm is an unsupervised machine learning technique that can be applied
to an arbitrarily large amount of microseismic (and earthquake) data. Unlike the supervised machine learning techniques, the proposed method does not require a reasonably large volume of training data and thus can be fairly flexible. The fuzzy clustering algorithm has been shown to be an effective clustering analysis method in the presented framework. The comparison between the presented algorithm with the state-of-the-art STA/LTA method shows very promising performance, especially in the low-SNR data set based on a combination of several synthetic and real data examples with

Forghani-Arani, F., Willis, M., Haines, S.S., Batzle, M., Behura, J. & Davidson, M., 2013. An effective noise-suppression technique for surface microseismic data, Geophysics, 78(6), KS85–KS95.
Gan, S., Wang, S., Chen, Y. & Chen, X., 2016a. Simultaneous-source separation using iterative seislet-frame thresholding, IEEE Geosci. Remote Sens. Lett., 13, 197–201.
Gan, S., Wang, S., Chen, Y., Chen, X. & Xiang, K., 2016b. Separation of simultaneous sources using a structural-oriented median filter in the flattened dimension, Comput. Geosci., 86, 46–54.
Gelchinsky, B. & Shtivelman, V., 1983. Automatic picking of first arrivals
Liu, W., Cao, S. & Chen, Y., 2016a. Seismic time-frequency analysis via empirical wavelet transform, IEEE Geosci. Remote Sens. Lett., 13, 28–32.
Liu, W., Cao, S. & Chen, Y., 2016b. Applications of variational mode decomposition in seismic time-frequency analysis, Geophysics, 81, V365–V378.
Liu, W., Cao, S., Wang, Z., Kong, X. & Chen, Y., 2017. Spectral decomposition for hydrocarbon detection based on VMD and Teager-Kaiser energy, IEEE Geosci. Remote Sens. Lett., 14(4), 539–543.
Maxwell, S.C., Rutledge, J., Jones, R. & Fehler, M., 2010. Petroleum reservoir characterization using downhole microseismic monitoring, Geo-
Song, F. & Toksöz, M.N., 2011. Full-waveform based complete moment tensor inversion and source parameter estimation from downhole microseismic data for hydrofracture monitoring, Geophysics, 76(6), WC103–WC116.
Song, F., Kuleli, H.S., Toksöz, M.N., Ay, E. & Zhang, H., 2010. An improved method for hydrofracture-induced microseismic event detection and phase picking, Geophysics, 75(6), A47–A52.
Vaezi, Y. & Van der Baan, M., 2015. Comparison of the STA/LTA and power spectral density methods for microseismic event detection, Geophys. J. Int., 203(3), 1896–1908.