
GEOPHYSICS, VOL. 86, NO. 1 (JANUARY-FEBRUARY 2021); P. M1–M15, 21 FIGS., 7 TABLES.

10.1190/GEO2019-0332.1

Building large-scale density model via a deep-learning-based data-driven method

Zhaoqi Gao1, Chuang Li1, Bing Zhang1, Xiudi Jiang2, Zhibin Pan3, Jinghuai Gao1, and Zongben Xu4

ABSTRACT

As a rock-physics parameter, density plays a crucial role in lithology interpretation, reservoir evaluation, and description. However, density can hardly be directly inverted from seismic data, especially for large-scale structures; thus, additional information is needed to build such a large-scale model. Usually, well-log data can be used to build a large-scale density model through extrapolation; however, this approach can only work well for simple cases, and it loses effectiveness when the medium is laterally heterogeneous. We have adopted a deep-learning-based method to build a large-scale density model based on seismic and well-log data. The long short-term memory network is used to learn the relation between seismic data and large-scale density. In addition to the data pairs directly obtained from well logs, many velocity and density models randomly generated based on the statistical distributions of well logs are also used to generate several pairs of seismic data and the corresponding large-scale density. This can greatly enlarge the size and diversity of the training data set and consequently leads to a significant improvement of the proposed method in dealing with a heterogeneous medium even though only a few well logs are available. Our method is applied to synthetic and field data examples to verify its performance and compare it with the well extrapolation method, and the results clearly display that the proposed method can work well even though only a few well logs are available. Especially in the field data example, the large-scale density model built by the proposed method is improved by 11.9666 dB in peak signal-to-noise ratio and 0.6740 in structural similarity compared with that of the well extrapolation method.

INTRODUCTION

Estimating high-fidelity models for subsurface parameters is crucial for the exploration of oil and gas. Among several parameters, density plays a key role in lithology interpretation, reservoir evaluation, and description. Consequently, building a density model is important in geophysics. However, it is well known that large-scale density can hardly be directly inverted using the information carried by seismic waves because the amplitude of the scattered wavefield corresponding to the density perturbation decreases quickly with the increase of the scattering angle (Tarantola, 1986; Forgues and Lambaré, 1997; Virieux and Operto, 2009). Thus, additional information is needed for building a large-scale density model.

By incorporating well-log data, one simple and widely used approach can be used to build a large-scale density model. This approach is realized by extrapolating the density from well locations to other places using some extrapolation method, and usually the horizons of the subsurface medium are needed to guarantee a laterally reasonable extrapolated model. However, this approach has two weaknesses: (1) picking accurate horizons remains a challenge, and (2) well-log data have high vertical resolution but poor horizontal continuity. Thus, the extrapolated density model is correct at the well position but unreliable for places that are far away from the wells. This weakness becomes more serious when the subsurface medium is laterally heterogeneous and only a few well logs are available.

Manuscript received by the Editor 21 May 2019; revised manuscript received 30 July 2020; published ahead of production 4 October 2020; published online
16 December 2020.
1Xi'an Jiaotong University, School of Information and Communications Engineering, Xi'an, Shaanxi 710049, China and Xi'an Jiaotong University, National Engineering Laboratory for Offshore Oil Exploration, Xi'an, Shaanxi 710049, China. E-mail: zq_gao@xjtu.edu.cn (corresponding author); chli0409@126.com; xawslhh@163.com; jhgao@mail.xjtu.edu.cn.
2CNOOC Research Institute, Beijing 100028, China. E-mail: jiangxd2@cnooc.com.cn.
3Xi'an Jiaotong University, School of Information and Communications Engineering, Xi'an, Shaanxi 710049, China. E-mail: zbpan@mail.xjtu.edu.cn.
4Xi'an Jiaotong University, School of Mathematics and Statistics, Xi'an, Shaanxi 710049, China. E-mail: zbxu@mail.xjtu.edu.cn.
© 2021 Society of Exploration Geophysicists. All rights reserved.


Another commonly used approach builds a large-scale density model from the large-scale velocity model by using some empirical relation between velocity and density, such as the Gardner relation (Gardner et al., 1974) and the generalized Gardner relation (Ursenbach, 2005). However, this approach also faces two problems: (1) it is challenging to build an accurate large-scale velocity model, and (2) it is nontrivial to determine the parameters of an empirical relation because they are lithology dependent, meaning that an empirical relation with fixed parameters is unreliable and will introduce artifacts into the density model. As a result, more advanced techniques are required for building a large-scale density model.

As data-driven machine learning algorithms, neural networks, which are inspired by the biological neural networks that constitute animal brains, enable us to perform tasks by learning from examples. Neural networks have been applied to geophysical problems for a long time (van der Baan and Jutten, 2000; Poulton, 2002). Recently, Maiti and Tiwari (2010) propose a method, set in a Bayesian neural network framework, that uses a hybrid Monte Carlo simulation scheme to identify facies changes from complex well-log data; the neural network is optimized by a hybrid genetic algorithm and particle swarm optimization method. Ardjmandpour et al. (2011) propose a forward modeling and an inversion method based on neural networks. Wit et al. (2013) use a mixture density network to obtain 1D marginal posterior probability density functions, which provide a quantitative description of the individual earth parameters. Kahrizi and Hashemi (2014) propose to use the neuron curve to find the optimal layer size and the minimum size of the training set of the neural network and apply it to find the first-break picks of seismic data. Konaté et al. (2015) propose the self-organizing map neural network and use it for the classification of metamorphic rocks from log data. Keynejad et al. (2017) compare the simultaneous prestack inversion and neural network methods in creating 3D Poisson's ratio models built upon low-frequency initial models; the comparison shows that the neural-network-based method is notably more successful than the extrapolation-based method beyond the logged sections in the wells and away from the control wells. Chen et al. (2018) use the pseudo-back-propagation neural network method to invert the gravity anomalies of multidensity interfaces. The latest variant of neural networks, that is, deep learning, has also been used to solve geophysical problems, including but not limited to seismic inversion, velocity model building, and seismic facies analysis (Lewis and Vigh, 2017; Araya-Polo et al., 2018; Qian et al., 2018; Gao et al., 2019, 2020a, 2020b), and it promises breakthroughs.

Recently, the applications of neural networks in reservoir characterization have also been investigated. Ahmadi et al. (2013) propose to implement a soft sensor on the basis of a feed-forward neural network to forecast the permeability of a reservoir. Chaki et al. (2015) propose a preprocessing scheme to improve the prediction of the sand fraction from seismic attributes using neural networks. Cersósimo et al. (2016) use a neural network to predict lateral variations of seismic velocity, density, thickness, and gamma rays. Boateng et al. (2017) propose a porosity estimation method based on Caianiello neural networks and Levenberg-Marquardt optimization, and they demonstrate that this method is robust. Motaz and Ghassan (2018) propose a petrophysical property estimation method using a deep network called a recurrent neural network (RNN), and they demonstrate that this method can build a density model from seismic data through a learned nonlinear relation between them.

In this paper, we propose a deep-learning-based data-driven method to build a large-scale density model. Our basic idea is to build a mapping from seismic data to a large-scale density model using end-to-end deep learning. To build such a nonlinear relation, the deep architecture called the long short-term memory (LSTM) network is adopted. To train the LSTM network and improve its applicability in complex cases (e.g., in a laterally heterogeneous medium), we propose the following method to construct the data set for network training and validation based on seismic data and a few well logs. First, pairs of seismic traces and the corresponding large-scale density directly obtained from well logs are used to form the data set. Because the limited wells are sparsely distributed in space, this data set is insufficient to guarantee the generalization ability of the deep network, especially for a laterally heterogeneous medium. To overcome this shortcoming, we randomly generate many velocity and density models according to the statistical distributions of well-log data, based on which several pairs of synthetic seismic data and the corresponding large-scale density model are generated to significantly enlarge the size and diversity of the data set. The proposed method has two important characteristics: (1) it avoids the requirements of large-scale velocity building and horizon picking, and (2) it has the potential to handle complex cases even if only limited well logs are available. We test the deep-learning-based data-driven method using synthetic and field data examples. The numerical results clearly demonstrate two facts: (1) the randomly generated data set is indeed the key to improving the performance of the proposed method in dealing with a strongly heterogeneous medium, and (2) the proposed method performs significantly better than the commonly used methods.

This paper is organized as follows. We first present the basic concept of the LSTM network. Then, a detailed description of the proposed method is provided. Finally, numerical examples are presented to verify the effectiveness of the proposed method.

THE LSTM NETWORK

LSTM, first proposed by Hochreiter and Schmidhuber (1997), is a kind of RNN that is powerful for processing time-series data. Different from a traditional RNN, LSTM has a unique architecture, as shown in Figure 1. The key component of LSTM is the cell state, represented by the horizontal line running through the top of a unit. It gives LSTM continuous gradient flow that prevents back-propagated errors from vanishing or exploding; consequently, LSTM can work well for tasks that require memories of events that happened thousands or even millions of discrete time steps earlier.

For each unit of LSTM, the relation between its input and output can be summarized as follows:
\[
\begin{cases}
f_t = \sigma_g (W_f x_t + U_f h_{t-1} + b_f), \\
i_t = \sigma_g (W_i x_t + U_i h_{t-1} + b_i), \\
o_t = \sigma_g (W_o x_t + U_o h_{t-1} + b_o), \\
c_t = f_t \circ c_{t-1} + i_t \circ \sigma_c (W_c x_t + U_c h_{t-1} + b_c), \\
h_t = o_t \circ \sigma_h (c_t),
\end{cases} \tag{1}
\]


where x_t ∈ R^d is the input vector; f_t ∈ R^h is the forget gate's activation vector; i_t ∈ R^h is the input gate's activation vector; o_t ∈ R^h is the output gate's activation vector; h_t ∈ R^h is the output vector; "∘" denotes element-wise multiplication; c_t ∈ R^h is the cell state vector; σ is the activation function; and W ∈ R^{h×d}, U ∈ R^{h×h}, and b ∈ R^h are the learnable weights and biases, which can be obtained through network training with the gradient calculated through back propagation (Rumelhart et al., 1986). Herein, σ_g(x) = 1/(1 + e^{−x}) is the sigmoid function, and σ_c and σ_h are the tanh function tanh(x) = (e^x − e^{−x})/(e^x + e^{−x}).
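To make the unit update concrete, the following NumPy sketch implements one step of equation 1; the parameter dictionary P and its contents are illustrative assumptions rather than the trained values used in this paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM unit update following equation 1.

    x_t: input vector (d,); h_prev, c_prev: previous output and cell
    state (h,); P: dict of weights W_* (h, d), U_* (h, h), biases b_* (h,).
    """
    f_t = sigmoid(P["Wf"] @ x_t + P["Uf"] @ h_prev + P["bf"])  # forget gate
    i_t = sigmoid(P["Wi"] @ x_t + P["Ui"] @ h_prev + P["bi"])  # input gate
    o_t = sigmoid(P["Wo"] @ x_t + P["Uo"] @ h_prev + P["bo"])  # output gate
    # new cell state: keep part of the old state, add the tanh candidate
    c_t = f_t * c_prev + i_t * np.tanh(P["Wc"] @ x_t + P["Uc"] @ h_prev + P["bc"])
    h_t = o_t * np.tanh(c_t)  # sigma_h is also the tanh function
    return h_t, c_t
```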
METHODOLOGY

Although large-scale density can hardly be inverted from seismic data (Forgues and Lambaré, 1997), it is still reasonable to believe that a relation between them exists because of two facts: (1) the traveltime of seismic data is related to the large-scale velocity model of the subsurface, and (2) the large-scale velocity and density are nonlinearly related (Gardner et al., 1974). As a consequence, it is possible to obtain large-scale density from seismic data given a sufficiently accurate nonlinear relation between them. In this work, we propose to establish such a nonlinear relation using deep learning in a data-driven framework, with the aim of providing a more advanced and robust large-scale density model building method that not only works for simple cases but is also effective in complex cases, such as a strongly heterogeneous medium. Considering that seismic data and well-log density are temporally dynamic, meaning that each data point is not isolated but depends on the data points before and after it, the deep learning architecture called LSTM is suitable for our method because of its effectiveness in processing time-series data, as introduced above. We call the proposed method the deep-learning-based method, and its workflow is shown in Figure 2. The several important parts of the proposed method are explained in detail as follows.

The deep learning architecture of the proposed method

The basic ingredient of a successful deep learning application is the network architecture. Herein, we use the deep learning architecture shown in Figure 3. The input of the deep network is the seismic data. Considering the temporal and spatial correlation, at each time step, we use the seismic data within a sliding window (marked by the green rectangles in Figure 3) as the input rather than just inputting a single data point. This brings two obvious advantages. First, the correlation of seismic data enables deep learning to build a more sophisticated nonlinear relation. Second, by taking the seismic data within a window into consideration, a nonlinear relation with better noise resistance can be expected. In application, the size of the sliding window in the time direction should be chosen based on the wavelength of the seismic wavelet, whereas its size in the lateral direction can be fixed as a constant. Once the seismic data are input into the deep network, the LSTM unit at each time step captures the information of the input and outputs a vector, which is then used as the input of the fully connected neural network (FCNN) to output a density value for the current time step.

Figure 1. The architecture of the LSTM network. Symbol "A" represents the unit of LSTM, which contains several operations and gates. The terms x_t, t = 1, 2, ..., N_t and h_t, t = 1, 2, ..., N_t are the input and output of a unit, respectively.

Figure 2. Proposed workflow for building a large-scale density model based on deep learning.

Figure 3. The deep learning architecture used in the proposed method. Symbol "A" represents the unit of LSTM, whose detailed structure can be found in Figure 1. Symbol "B" represents the FCNN, which is used to convert the output of "A" into density at each time step.


Based on the deep learning architecture, the output density at time t can be expressed as

\[
\rho_t = \sigma_\rho (W_\rho h_t + b_\rho), \tag{2}
\]

where W_ρ ∈ R^h and b_ρ ∈ R are the weights and bias of the FCNN layer, respectively; h_t is the output of the LSTM unit at time t; and σ_ρ is the activation function of the FCNN layer, for which the rectified linear unit function σ_ρ(x) = max(0, x) is used in our work. It is worth noting from equation 2 that the weights and bias of the FCNN layer do not change with time t. In other words, we use shared weights and bias for all time steps with the aim of reducing the free parameters of the deep learning architecture. In summary, the learnable parameters of the deep learning architecture are

\[
\Theta = \{ W_f, U_f, b_f, W_i, U_i, b_i, W_o, U_o, b_o, W_c, U_c, b_c, W_\rho, b_\rho \}. \tag{3}
\]
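The following Keras (TensorFlow) sketch is one plausible realization of equations 1-3; the input window length of 50 and the hidden size of 128 anticipate the settings listed in Table 1, and the exact implementation used by the authors may differ. TimeDistributed applies the same Dense weights W_ρ and b_ρ at every time step, which is exactly the shared-weight property noted above.

```python
import tensorflow as tf

def build_model(d=50, h=128):
    """LSTM over the per-time-step seismic windows plus a shared
    (time-distributed) FCNN head: rho_t = max(0, W_rho h_t + b_rho)."""
    return tf.keras.Sequential([
        # input shape: (time steps, flattened window samples)
        tf.keras.layers.LSTM(h, return_sequences=True, input_shape=(None, d)),
        tf.keras.layers.TimeDistributed(
            tf.keras.layers.Dense(1, activation="relu")),
    ])

model = build_model()  # outputs one density value per time step
```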
Constructing data sets for training and validation

For all supervised deep learning applications, multiple pairs of data are crucial for training the deep network, and massive data sets are often required for more complex data analysis applications. For our purposes, we need pairs of seismic data and large-scale density. Herein, we use the well-log data to construct such data sets. The well-log data are converted from the depth to the time domain before use to make sure that the seismic data and well logs are in the same domain. Based on the well logs, we construct two kinds of data sets, whose details are introduced next.

For the first kind of data set, we directly obtain large-scale density traces from well logs by applying a filter to extract the low-frequency trend of the density curves, and then we associate them with the corresponding seismic data to form several data pairs. However, training the deep learning architecture on this data set alone is problematic because the number of available wells is usually limited in practice, which seriously hinders the generalization ability of deep learning, especially for a laterally heterogeneous medium.
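The filter used to extract the low-frequency trend is not specified here; the sketch below uses a zero-phase low-pass Butterworth filter from SciPy as one reasonable choice, and the cutoff frequency is an assumption.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def large_scale_trend(density_log, dt=0.002, f_cut=4.0):
    """Low-frequency trend of a time-domain density log.

    A fourth-order, zero-phase low-pass Butterworth filter is applied;
    the cutoff f_cut (Hz) controls what counts as 'large scale'.
    """
    # normalized cutoff = f_cut / Nyquist, with Nyquist = 0.5 / dt
    b, a = butter(4, f_cut * 2.0 * dt)
    return filtfilt(b, a, density_log)
```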
The second kind of data set supplements the first kind to improve the generalization ability of deep learning. Herein, our basic idea is to randomly generate several models and calculate their corresponding seismic data to form many data pairs, with the aim of including different geologic structures (such as a salt-like structure) in the data set. This can be realized through the following two steps.

1) Obtain the statistical distributions of the well-log data (compressional wave [P-wave] velocity and density). As shown in Figure 4, we first equally divide the entire well-log curve into several sections based on its value. Specifically, the maximum value v_max and the minimum value v_min of the well-log curve are obtained, and then the value range [v_min, v_max] is divided into several sections with the lower and upper bounds of each section being

\[
\begin{cases}
s_l^{\mathrm{lower}} = v_{\min} + (l - 1) \times \dfrac{v_{\max} - v_{\min}}{L}, \\[1ex]
s_l^{\mathrm{upper}} = v_{\min} + l \times \dfrac{v_{\max} - v_{\min}}{L},
\end{cases} \tag{4}
\]

where s_l^lower and s_l^upper are the lower and upper bounds of section s_l, l = 1, 2, ..., L, respectively, and L is the total number of sections. Then, a probability is assigned to each section according to the number of data points within it:

\[
P(s_l) = \frac{N_{s_l}}{N_T}, \tag{5}
\]

where N_{s_l} is the number of data points within the section s_l and N_T is the total number of data points of the whole well-log curve. In this step, choosing a reasonable value of L (the number of sections) is the key to obtaining the probability of the well log: a too-small L leads to low accuracy, whereas a too-large L makes the calculation of the probability computationally expensive. A reasonable criterion is to choose L as small as possible while maintaining a certain accuracy. In practice, the algorithm summarized in Algorithm 1 can be adopted to determine the optimal value of L. In the algorithm, P^(L) is the probability obtained using equations 4 and 5 with L sections, and Φ_L(·) is an operator that maps a probability with L sections to a probability with L_max sections using interpolation. This operator guarantees that Φ_L(P^(L)) and P^(L_max) have the same dimensionality; consequently, their L2 distance can be measured. Herein, we empirically set ε = 2% to balance accuracy and efficiency, and L_max is chosen as 100.

Algorithm 1. Determine the optimal value of L.
Input: error threshold ε and the maximum number of sections L_max.
Output: optimal value L*.
1: Calculate the probability P^(L_max).
2: Set L = 2 and calculate the probability P^(2) and Error(2) = ||Φ_2(P^(2)) − P^(L_max)||_2^2.
3: while Error(L)/Error(2) > ε do
4:   Set L = L + 1.
5:   Calculate P^(L) and Error(L) = ||Φ_L(P^(L)) − P^(L_max)||_2^2.
6: end while
7: return L* = L.
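A compact NumPy sketch of equations 4 and 5 and of Algorithm 1 follows; linear interpolation is used for the operator Φ_L, which is our assumption because the paper does not specify the interpolation scheme.

```python
import numpy as np

def section_probability(log, L):
    """Equations 4 and 5: split [v_min, v_max] into L equal sections
    and assign each section the fraction of samples falling in it."""
    counts, _ = np.histogram(log, bins=L, range=(log.min(), log.max()))
    return counts / log.size

def optimal_L(log, eps=0.02, L_max=100):
    """Algorithm 1: smallest L whose interpolated probability is close
    to the L_max reference (relative squared L2 error below eps)."""
    p_ref = section_probability(log, L_max)
    grid = np.linspace(0.0, 1.0, L_max)

    def error(L):
        # Phi_L: map the L-section probability onto the L_max grid
        p_up = np.interp(grid, np.linspace(0.0, 1.0, L),
                         section_probability(log, L))
        return float(np.sum((p_up - p_ref) ** 2))

    err2 = error(2)
    L = 2
    while error(L) / err2 > eps:
        L += 1
    return L
```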
2) Generate random models and the corresponding synthetic seismic data. Based on the probability defined above, we first generate several 1D P-wave velocity and density models. Each 1D model has several layers (ranging randomly from four to seven), and the thickness of each layer is randomly assigned under the premise that the whole model has the same length as the well-log curve. The value (P-wave velocity or density) of each layer is randomly generated according to the calculated probability; in this paper, this is realized by using the built-in function randsrc of the MATLAB software. Then, we generate a 2D P-wave velocity model and a 2D density model by interpolating these 1D models in the lateral direction. Next, we obtain the seismic data of the 2D model in a trace-by-trace manner using the convolution model (Robinson, 1967):

\[
s_m = W r_m =
\begin{bmatrix}
w_1 & & & \\
w_2 & w_1 & & \\
\vdots & w_2 & \ddots & \\
w_L & \vdots & \ddots & w_1 \\
 & w_L & \ddots & w_2 \\
 & & \ddots & \vdots \\
 & & & w_L
\end{bmatrix}
\begin{bmatrix}
r_{m,1} \\ r_{m,2} \\ \vdots \\ r_{m,N_t}
\end{bmatrix}, \tag{6}
\]


where s_m is the seismic data of the mth trace, w = [w_1, w_2, ..., w_L]^T is the seismic wavelet, and r_m = [r_{m,1}, r_{m,2}, ..., r_{m,N_t}]^T is the reflectivity vector of the mth trace, of which each element is

\[
r_{m,i} =
\begin{cases}
\dfrac{v_{m,i+1}\,\rho_{m,i+1} - v_{m,i}\,\rho_{m,i}}{v_{m,i+1}\,\rho_{m,i+1} + v_{m,i}\,\rho_{m,i}}, & i = 1, 2, \ldots, N_t - 1, \\[1ex]
0, & i = N_t,
\end{cases} \tag{7}
\]

where v_{m,i} and ρ_{m,i} are the ith elements of the P-wave velocity and density, respectively, corresponding to the mth trace. Apart from the seismic data, the large-scale density of each trace of the 2D model is derived by applying a filter to the corresponding trace of the 2D density model.
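The random-model and synthetic-data generation of step 2 can be sketched as follows; NumPy's random generator stands in for MATLAB's randsrc, and the layer-boundary sampling is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_1d_model(bin_centers, prob, n_samples):
    """One layered 1D model: four to seven layers with random
    thicknesses; each layer value is drawn from the well-log
    section probability of equation 5."""
    n_layers = int(rng.integers(4, 8))
    edges = np.sort(rng.choice(np.arange(1, n_samples),
                               size=n_layers - 1, replace=False))
    values = rng.choice(bin_centers, size=n_layers, p=prob)
    model = np.empty(n_samples)
    for val, lo, hi in zip(values, np.r_[0, edges], np.r_[edges, n_samples]):
        model[lo:hi] = val
    return model

def synthetic_trace(v, rho, wavelet):
    """Equations 6 and 7: acoustic reflectivity from v and rho,
    then convolution with the wavelet (the Toeplitz matrix W)."""
    imp = v * rho
    r = np.zeros_like(imp, dtype=float)
    r[:-1] = (imp[1:] - imp[:-1]) / (imp[1:] + imp[:-1])
    return np.convolve(wavelet, r)[: r.size]
```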
After finishing these two steps, we obtain plentiful data pairs of seismic traces and the corresponding large-scale density, which are then used to train the deep learning architecture together with the data pairs directly obtained from well logs. It is worth noting that, although a continuous-to-discrete conversion is used during construction of the random data set, the seismic traces and large-scale density models within the random data set are all continuous.

Figure 4. Illustration of how to obtain the probability of a well-log curve: (a) the well-log curve and (b) the corresponding probability. To calculate the probability, we equally divide the well-log curve into several sections based on its value.

Figure 5. The modified Marmousi II model: (a) P-wave velocity and (b) density. This model is from the top portion of the Marmousi II model with the water layer removed. In addition, the original depth-domain model is converted to the time domain. The whole model has 1600 traces, and each trace has a length of 1.2 s. This model has three pseudo wells, which are located at CDPs of 100, 400, and 1100. It is worth noting that these three wells offer us no information about the middle part of the model, which is complex in structure because faults and anomalies are present.

Training of the deep learning architecture

Training the supervised deep learning architecture to learn the nonlinear relation between seismic data and large-scale density is the key to building the large-scale density model. This is realized through a network training procedure, which obtains the optimal parameter set by minimizing a loss function that measures the quality of the learnable parameter set. Considering that the input and output of the deep learning architecture in our application are all continuous, meaning that it is a regression problem, the following loss function is used for network training:
\[
C(\Theta) = \frac{1}{N_1} \sum_{(x^{(n)},\, \rho^{(n)}) \in \Gamma_1} \sum_t \left( \rho_t^{(n)} - \hat{\rho}_t^{(n)}\big(x_t^{(n)}, \Theta\big) \right)^2
+ \frac{\lambda}{N_2} \sum_{(x^{(n)},\, \rho^{(n)}) \in \Gamma_2} \sum_t \left( \rho_t^{(n)} - \hat{\rho}_t^{(n)}\big(x_t^{(n)}, \Theta\big) \right)^2, \tag{8}
\]

where Γ_1 is the data set directly obtained from well logs, with size N_1; Γ_2 is the randomly generated data set, with size N_2; (x_t^(n), ρ_t^(n)) is the nth input seismic datum and its corresponding true large-scale density at time step t; ρ̂_t^(n)(x_t^(n), Θ) is the large-scale density estimated by the current deep network with x_t^(n) as the input; and λ is a parameter that balances the two terms in the loss function. In the early stage of training, a large λ is chosen to make the second term dominate the training procedure, leading to a good generalization ability of the trained deep network. Then, λ is gradually reduced to emphasize the importance of the first term, with the aim of guaranteeing the estimation accuracy of deep learning on the well-log data. The loss function shown in equation 8 can be minimized using a gradient-based local optimization technique with the descent direction calculated through back propagation (Rumelhart et al., 1986).
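In TensorFlow, the two-term loss of equation 8 can be sketched as below; averaging jointly over samples and time steps is a simplification of the normalization in equation 8, and the λ decay schedule is left to the caller.

```python
import tensorflow as tf

def loss_fn(model, x_well, rho_well, x_rand, rho_rand, lam):
    """Equation 8: misfit over the well-log pairs (Gamma_1) plus
    lam times the misfit over the random pairs (Gamma_2).
    lam starts large and is gradually reduced during training."""
    term_well = tf.reduce_mean(tf.square(rho_well - model(x_well)))
    term_rand = tf.reduce_mean(tf.square(rho_rand - model(x_rand)))
    return term_well + lam * term_rand
```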


Implementation of the proposed method

For the implementation of the proposed method, we use TensorFlow for training our deep learning architecture. TensorFlow is a free open-source software library for dataflow and differentiable programming across a range of tasks, and it can automatically and correctly map the flow of gradients back to individual weights and biases during back propagation (Abadi et al., 2015). We train the deep learning architecture using the Adam optimizer (Kingma and Ba, 2014) with a fixed learning rate of 0.001 and a fixed batch size of 32. A new minibatch is created before each iteration using random shuffling to ensure that there is no bias in learning. Before iteration, all weights and biases are initialized. Specifically, all weights are randomly initialized based on a normal distribution whose mean and variance are 0 and 0.01, respectively, and all biases are initialized to a constant of 0.1. The entire training procedure contains 100 epochs, and the early stopping strategy (Nielsen, 2015) is adopted to prevent overfitting.
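A minimal Keras sketch of this training setup is given below; the early-stopping patience is our assumption because the paper does not state it, and the initializers would be passed to the layer constructors (e.g., kernel_initializer=w_init).

```python
import tensorflow as tf

# Initialization as described: weights ~ N(0, 0.01), i.e., stddev 0.1;
# biases set to the constant 0.1.
w_init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.1)
b_init = tf.keras.initializers.Constant(0.1)

def train(model, x_train, y_train, x_val, y_val):
    """Adam with a fixed learning rate of 0.001, batch size 32 with
    reshuffling every epoch, 100 epochs, early stopping on the
    validation loss."""
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="mse")
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=10, restore_best_weights=True)
    return model.fit(x_train, y_train,
                     validation_data=(x_val, y_val),
                     batch_size=32, epochs=100, shuffle=True,
                     callbacks=[early_stop])
```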
The advantage of the proposed deep-learning-based method

The proposed deep-learning-based method provides a general framework to obtain a large-scale density model based on seismic data and well-log data. The proposed method uses a machine learning algorithm, that is, deep learning, to learn a nonlinear relation between seismic data and large-scale density. Such an approach does not require the large-scale velocity or picked horizons of the subsurface, which are prerequisites of the well extrapolation method and the empirical-relation-based method. Thus, the proposed method can be more robust. More importantly, a large number of randomly generated data pairs, which cover a variety of subsurface structures, are incorporated into the training data set, giving the proposed deep-learning-based method the potential to handle complex cases.

SYNTHETIC EXAMPLES

In this section, the performance of the proposed deep-learning-based method is assessed through synthetic experiments based on a modified Marmousi II model. The true P-wave velocity and density models are shown in Figure 5a and 5b, respectively. This model is generated from the top portion of the original Marmousi II model, and the water layer is removed. In addition, the original depth-domain model is converted to the time domain. The whole modified Marmousi II model has 1600 traces in the lateral direction, each of which has a length of 1.2 s. We set the sampling interval as 0.002 s; consequently, each trace has 601 points in time. This model has three pseudo wells, which are located at common depth points (CDPs) of 100, 400, and 1100. The P-wave velocity and density curves of these wells are shown in Figure 6. It is worth noting that the three pseudo wells offer us nearly no information about the middle part of the model, which is complex in structure because faults and anomalies are present. As a consequence, this test is suitable for investigating the performance of different methods in dealing with complex cases. Based on the P-wave velocity and density models, the observed seismic data, as shown in Figure 7, are obtained for all 1600 traces using the convolution model shown in equation 6. A Ricker wavelet with a peak frequency of 10 Hz is used as the source in this test. Based on the modified Marmousi II model, we conduct two experiments. First, we use the proposed deep-learning-based method to build a large-scale density model and compare its performance with the well extrapolation method. Second, we conduct an uncertainty analysis to assess how Gaussian noise in the seismic data influences the performance of the proposed method. Detailed information on these experiments is given in the following sections.

Figure 6. The P-wave velocity (the blue lines) and density (the red lines) of the pseudo wells: (a-c) correspond to wells 1–3, respectively.

Comparison of different methods in building a large-scale density model

In this experiment, we apply the well extrapolation method and the proposed method to build the large-scale density model of the modified Marmousi II model and compare their performance. In addition, to more clearly display the importance of the randomly generated data set to the proposed method, we also test its performance with and without the randomly generated data set. The well extrapolation method is implemented based on the three pseudo wells.

Figure 7. The observed seismic data, which are calculated using the convolution model based on the P-wave velocity and density models shown in Figure 5. Herein, the Ricker wavelet with a peak frequency of 10 Hz is used.


We pick the horizons of the subsurface based on the seismic data to constrain the horizontal extrapolation and obtain more reasonable results. For the implementation of the proposed method, we follow the workflow shown in Figure 2. Table 1 summarizes the detailed deep architecture. We empirically set the size of the sliding window as 10 × 5. In other words, for each time step, the 10 adjacent data points in time and 5 adjacent data points in the lateral direction from the seismic data are used as the input of the LSTM unit.

Table 1. The detailed deep architecture used in the synthetic examples.

Parameter                                                 Value
Dimension of the input vector x_t, t = 1, 2, ..., N_t     50
Dimension of the output vector h_t, t = 1, 2, ..., N_t    128
Number of time steps (number of LSTM units) N_t           121
Layers of LSTM                                            1
Dimension of W_f                                          128 × 50
Dimension of U_f                                          128 × 128
Dimension of b_f                                          128
Dimension of W_i                                          128 × 50
Dimension of U_i                                          128 × 128
Dimension of b_i                                          128
Dimension of W_o                                          128 × 50
Dimension of U_o                                          128 × 128
Dimension of b_o                                          128
Dimension of W_c                                          128 × 50
Dimension of U_c                                          128 × 128
Dimension of b_c                                          128
Dimension of W_ρ                                          1 × 128
Dimension of b_ρ                                          1

Figure 8. The probabilities calculated from the three pseudo wells shown in Figure 6 using the method illustrated in Figure 4: (a and b) correspond to the P-wave velocity and density, respectively.

Figure 9. Part of the randomly generated 2D models: (a) P-wave velocity and (b) density. Here, (a and b) share the same structure but with randomly assigned values. This part of the 2D model is generated from nine evenly distributed and randomly generated 1D models using interpolation. These models correspond to the synthetic examples.

Figure 10. The large-scale density models: (a) the true model generated from Figure 5b using smoothing, (b) the model built by well extrapolation, (c) the model built by the proposed method without the randomly generated data set, and (d) the model built by the proposed method with the randomly generated data set.
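The sliding-window input described above can be sketched as follows; the edge padding is our assumption, but the 10 × 5 window and the five-sample stride (0.01 s at a 0.002 s sampling interval) follow the text, and a 601-sample trace indeed yields 121 LSTM time steps.

```python
import numpy as np

def lstm_inputs(section, cdp, win_t=10, win_x=5, stride=5):
    """Per-time-step input vectors for one output trace: a win_t x win_x
    window of the seismic section centered laterally on trace `cdp`,
    flattened, advancing `stride` samples per LSTM step."""
    n_t = section.shape[0]
    half = win_x // 2
    padded = np.pad(section, ((0, win_t), (half, half)), mode="edge")
    steps = [padded[t0:t0 + win_t, cdp:cdp + win_x].ravel()
             for t0 in range(0, n_t, stride)]
    return np.stack(steps)  # shape (121, 50) for a 601-sample trace
```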


It is worth noting that the number of time steps of LSTM (the number of LSTM units) is set as 121 here, meaning that the time step of the sliding window is 0.01 s even though the sampling interval of the data is 0.002 s. This mechanism significantly reduces the number of LSTM units and consequently improves efficiency during network training. It is reasonable to use this mechanism in our application because the seismic data and large-scale density are band limited, so no information is lost as long as the Nyquist sampling theorem is satisfied. In addition, the dimension of the output vector of LSTM at each time step is 128.

The key step is to construct the data set for training the deep learning architecture. As introduced above, part of the data set is directly obtained from the three pseudo wells, whereas the rest comes from the randomly generated models. To generate such random models, we first calculate the probabilities of the P-wave velocity and density from the three pseudo wells (see Figure 8). Then, 57 different 1D P-wave velocity and density models are randomly generated based on the calculated probabilities following the method introduced above. Next, 2D P-wave velocity and density models are generated by interpolating these 1D models in the lateral direction. By setting the interval between different 1D models as 200 traces, we obtain a 2D P-wave velocity model as well as a 2D density model, both of which have 11,201 traces, after interpolation. Figure 9 displays part of the randomly generated 2D models. Finally, the 2D models together with their corresponding synthetic seismic data are used to generate an additional data set, which is then used for network training. The large-scale density models built by the different methods are shown in Figures 10 and 11. To quantitatively measure the quality of these models, we calculate their corresponding peak signal-to-noise ratio (PS/N) and structural similarity (SSIM), and the results are shown in Table 2.
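The PS/N and SSIM evaluation can be sketched with scikit-image as below; this library choice and the data-range convention are our assumptions and may differ from the authors' implementation.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(true_model, built_model):
    """PS/N (dB) and SSIM between the true and built large-scale
    density models, using the true model's value range."""
    dr = float(true_model.max() - true_model.min())
    psnr = peak_signal_noise_ratio(true_model, built_model, data_range=dr)
    ssim = structural_similarity(true_model, built_model, data_range=dr)
    return psnr, ssim
```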
Figure 11. A detailed comparison of the large-scale density models: (a-d) correspond to the models shown in Figure 10a-10d, respectively. The density structures inside the red marked areas are quite different from those of the true model.

Table 2. Comparison of large-scale density models built by different methods. These models correspond to the noise-free synthetic example based on the modified Marmousi II model. The bold values indicate that the proposed method with the randomly generated data set has the best performance.

Method      Well extrapolation   Proposed method without randomly generated data set   Proposed method with randomly generated data set
PS/N (dB)   32.8276              39.5746                                               49.2250
SSIM        0.9708               0.9982                                                0.9996

Figures 10b and 11b display the large-scale density model built by the well extrapolation method, and its corresponding PS/N and SSIM are shown in Table 2. It is clear from the results that this model is quite different from the true large-scale density model (Figure 10a) in density structures and values. This model can depict neither the density anomalies around CDP 250 and CDP 800 nor the discontinuity in the middle of the model. This is because the well extrapolation has a strong dependence on the well logs and can only offer reliable results for the places around the wells, and the wells used in this experiment offer no information about these density anomalies.

Figures 10c and 11c display the large-scale density model built by the proposed method without the randomly generated data set. Compared to the model built by well extrapolation, this model is more consistent with the true model, as the larger PS/N and SSIM values confirm. Specifically, this model shows that a low-density anomaly is present around CDP 250 and 0.6 s in time. In addition, this model displays some high-density anomalies in the deep part, which is consistent with the true model. This implies that the performance of the proposed method is better than that of the well extrapolation method, even though only three wells are used in the network training. However, the built large-scale density model also has some incorrect structures; for example, the density values inside the red dashed area (Figure 11c) are lower than those in the true model (Figure 11a). This is because the data set obtained directly from the three pseudo wells is insufficient for deep learning to establish a nonlinear relation that can be generalized to a variety of subsurface structures.

Figures 10d and 11d display the large-scale density model built by the proposed method with the randomly generated data set. Compared to the model shown in Figure 10c, this model has a significant improvement in quality. Specifically, this model has a very high PS/N value (49.2250 dB), which is almost 10 dB larger than that without the randomly generated data set, and also a high SSIM value (0.9996). In addition, the overall density structures and values are consistent with those of the true model, even for some small-size


density anomalies, whose accurate depiction is considered to be challenging, especially when no adequate well is available. These results imply the following: (1) the randomly generated data set is crucial for the performance of the proposed deep-learning-based method, especially for its implementation in complex cases, and (2) the proposed method works well in this synthetic experiment even though only three wells are available, and it achieves superior performance over the well extrapolation method.

Uncertainty analysis on Gaussian noise

The proposed method builds the large-scale density model from seismic data. In the previous experiment, we assumed noise-free seismic data. Herein, we conduct an uncertainty analysis to investigate the sensitivity of the proposed method to Gaussian noise. Three different levels of Gaussian noise with signal-to-noise ratios (S/Ns) of 10, 5, and 0 dB are added to the seismic data to generate three noisy data sets (see Figure 12). It is clear from Figure 12 that, with increasing energy of the Gaussian noise, the effective signal contained in the seismic data is seriously contaminated, especially when the S/N is 0 dB, which makes building a large-scale density model from the seismic data challenging. Apart from the seismic data, the other experimental conditions used here are the same as those used above. The randomly generated data set is used during training of the proposed deep learning architecture in this experiment. The results are shown in Figure 13 and Table 3.
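The noise contamination can be sketched as follows; the paper does not state its exact noise recipe, so the standard power-based S/N definition used here is an assumption.

```python
import numpy as np

def add_gaussian_noise(data, snr_db, seed=0):
    """Add white Gaussian noise so that 10*log10(P_signal / P_noise)
    equals snr_db."""
    rng = np.random.default_rng(seed)
    p_signal = np.mean(data ** 2)
    sigma = np.sqrt(p_signal / 10.0 ** (snr_db / 10.0))
    return data + rng.normal(0.0, sigma, data.shape)

# e.g., noisy = add_gaussian_noise(seismic_section, 10.0)  # also 5 and 0 dB
```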
Figure 12. The noisy seismic data: (a-c) correspond to the S/Ns of 10, 5, and 0 dB, respectively; (d-f) are the comparisons of the noise-free data (the red lines) and the noisy data (the blue dashed lines), which correspond to (a-c), respectively. The data are from CDP 400.

Figure 13b displays the large-scale density model built from the noisy data with the S/N of 10 dB. In this case, the built model is comparable to the noise-free case (Figure 13a) because both have consistent density structures. In addition, their corresponding PS/N and SSIM values are very close to each other. Figure 13c displays the large-scale density model built from the noisy data with the S/N of 5 dB. It is clear from this model that, with the increase of the Gaussian noise, some artifacts appear in the model. This phenomenon can be observed more clearly in Figure 13d, which corresponds to the case with the S/N of 0 dB.


Specifically, three obvious artifacts marked by the red dashed rectangles can be observed, and the PS/N value of this model is 4.8015 dB lower than that of the noise-free case, which indicates the reduction in quality. This is likely because the strong noise destroys the lateral continuity of the seismic data; considering that the proposed deep-learning-based method builds the large-scale density model in a trace-by-trace manner, this discontinuity is definitely harmful.

Although strong noise can decrease the quality of the built large-scale density model, it is worth noting that the proposed method can still accurately depict the complex density structures of the middle part of the model, even for the case with an S/N of 0 dB. To be specific, the positions and shapes of the low- and high-density anomalies can be accurately characterized. In addition, even for the extreme case (S/N of 0 dB), the large-scale density model built by the proposed method is still significantly better than that of the well extrapolation method. This suggests that the proposed method can tolerate a wide range of Gaussian noise levels.

Table 3. Comparison of large-scale density models built by the proposed method when Gaussian noise is present in the seismic data. These models correspond to the synthetic example based on the modified Marmousi II model. The bold values indicate that the proposed method has the best performance when the S/N of the seismic data is 10 dB.

S/N of seismic data (dB)   10        5         0
PS/N (dB)                  48.3452   47.5767   44.4235
SSIM                       0.9995    0.9995    0.9992

Figure 13. The large-scale density models: (a) corresponds to the noise-free data, which is the same as that shown in Figure 10d, and (b-d) correspond to the noisy data with S/Ns of 10, 5, and 0 dB, respectively. The red dashed rectangles indicate the noodle-like artifacts.
FIELD DATA EXAMPLES

In this section, the performance of the proposed deep-learning-based method is investigated in field data examples, which are conducted based on the 2D poststack seismic data shown in Figure 14. This field data set has 810 traces in the lateral direction, and each trace has a length of 4.0 s. Herein, we only use the part of the data that ranges from 1.8 to 2.4 s in time. The sampling interval of these data is 0.002 s; thus, each trace has 301 samples. This part of the data is quite complex in structure because it contains faults and anomalies. As a consequence, building its large-scale density model is challenging. These data have five wells, which are located at CDPs of 115, 163, 294, 560, and 677. Figure 15 displays the P-wave velocity and density curves of the five wells. Different from the other wells, well 2 (Figure 15g) has a high-density anomaly from approximately 2.15 to 2.2 s. To further complicate the density model building, we do not use well 2 for training but for testing, meaning that no information about the density anomaly is provided to the proposed method. In the following, we conduct two different experiments: (1) we apply the well extrapolation method and the proposed method to build the large-scale density model of this field data set and compare their performance, and (2) we use two different well groups to train the deep learning architecture to investigate the influence of the training data set on the performance of the proposed method for uncertainty analysis.

Comparison of different methods in building a large-scale density model

Herein, the well extrapolation method and the proposed deep-learning-based method are used to build large-scale density models. The well extrapolation method uses four wells located at CDPs of 115, 294, 560, and 677 to obtain the large-scale density model for all traces using extrapolation constrained by the picked horizons. The same wells are used to construct the training data set for the proposed method. Specifically, four pairs of seismic data and the corresponding large-scale density from the well logs form the first part of the training data set, and the remaining data pairs are constructed based on the 2D random models, which are generated based on the probabilities shown in Figure 16. The 2D random models have a total of 16,200 traces, which are generated from 163 randomly generated 1D P-wave velocity and density models using lateral interpolation. Figure 17 displays part of the 2D random models. The wavelet used to generate the synthetic seismic data of the 2D random models is extracted from the observed seismic data. Table 4 displays the detailed information of the deep architecture used in this example. The same as in the synthetic example, we set the size of the sliding window as 10 × 5, and the dimension of the output vector of LSTM at each time step is fixed at 128. Herein, the number of time steps of LSTM is 151, meaning that the time step of the sliding window is 0.004 s. Figure 18 displays the large-scale density models built by the well extrapolation method and the


proposed method, and a detailed comparison of them is shown in Figure 19. Table 5 presents quantitative measurements of the built models.

Figures 18a and 19b display the large-scale density model built by the well extrapolation method. A detailed comparison of this model and the true large-scale density model corresponding to well 2 is shown in Figure 19d. It is clear from these results that the built model is quite different from the true density model in its structures. This is also verified by the SSIM value of the built model, which is only 0.2515. Specifically, the model built by well extrapolation cannot offer a correct and reasonable depiction of the high-density anomaly in well 2. This displays the strong limitation of the well extrapolation method: it can hardly work for complex cases without being given sufficient well logs.

Figure 14. The 2D poststack field data. These data have 810 traces, and each trace has a length of 4.0 s. Herein, we only choose part of the data (ranging from 1.8 to 2.4 s) in the example. This data set has five wells, which are located at CDPs of 115, 163, 294, 560, and 677.

Figure 15. The P-wave velocity (the blue lines) and density (the red lines) of the five wells: (a-e) the P-wave velocity curves corresponding to wells 1–5, respectively, and (f-j) the density curves corresponding to wells 1–5, respectively.


Figures 18b and 19c display the large-scale density model built by the proposed method. This model is quite different from that built by the well extrapolation method in its structures, as compared in Figure 19. More importantly, the comparison of density curves shown in Figure 19d indicates that the model built by the proposed method is consistent with the true density model obtained from the well log. Specifically, the model built by the proposed method can clearly depict the structure and the value of the high-density anomaly in well 2. This consistency of the built model and the true density model is also verified by the PS/N and SSIM values in Table 5, which are 47.0095 dB and 0.9255, respectively. It is worth noting that the information of well 2 is not used in training but in testing; consequently, the above results are evidence that the trained deep learning architecture has good generalization ability in this experiment.
Table 4. The detailed deep architecture used in the field data examples.

Parameter                                                 Value
Dimension of the input vector x_t, t = 1, 2, ..., N_t     50
Dimension of the output vector h_t, t = 1, 2, ..., N_t    128
Number of time steps (number of LSTM units) N_t           151
Layers of LSTM                                            1
Dimension of W_f                                          128 × 50
Dimension of U_f                                          128 × 128
Dimension of b_f                                          128
Dimension of W_i                                          128 × 50
Dimension of U_i                                          128 × 128
Dimension of b_i                                          128
Dimension of W_o                                          128 × 50
Dimension of U_o                                          128 × 128
Dimension of b_o                                          128
Dimension of W_c                                          128 × 50
Dimension of U_c                                          128 × 128
Dimension of b_c                                          128
Dimension of W_ρ                                          1 × 128
Dimension of b_ρ                                          1

Figure 16. The probabilities calculated from the five wells shown in Figure 15: (a and b) correspond to the P-wave velocity and density, respectively.

Figure 17. Part of the randomly generated 2D models: (a) P-wave velocity and (b) density. Here, (a and b) share the same structure but with randomly assigned values. This part of the 2D model is generated from nine evenly distributed and randomly generated 1D models using interpolation. These models correspond to the field data examples.

Figure 18. The large-scale density models: (a) the model built by the well extrapolation method and (b) the model built by the proposed deep-learning-based method.

Uncertainty analysis on the training data set

In the preceding experiment, we use the well group that includes four wells (wells 1 and 3–5) to train our deep learning architecture. Herein, we investigate how different well combinations used in network training influence the performance of the proposed method. In


addition to the well group used above (well group 1), we use another well group (well group 2), which only contains three wells (wells 3–5), to train the deep network and build large-scale density models of the field data. It is worth noting that the three wells in well group 2 are all far away from well 2 in the lateral direction, meaning that less information can be used here. Apart from the well group, the other experimental conditions are the same as those used in the previous experiment. Figures 20 and 21 display the built models based on the different well groups, whereas Table 6 shows their quantitative comparison.

It is clear from the above results that the large-scale density models built based on the two different well groups are comparable to each other, meaning that the structures and density values of the built models are consistent. Specifically, in both cases, the model built by the proposed method can accurately depict the density anomaly of well 2. This observation is also verified by the PS/N and SSIM values shown in Table 6, which are close for the two cases. This could be because, in the proposed method, the majority of the data pairs in the training data set are randomly generated rather than directly obtained from the well logs. This key strategy enables the training data set to cover a wide variety of subsurface structures even though the well logs themselves do not contain such structures, leading to a significantly improved performance of the proposed deep-learning-based method. Although the proposed method does not seem to be very sensitive to the choice of wells, we still have to emphasize that sufficient well logs are essential to the successful application of the proposed method, and we should use the available well logs as much as possible to construct the training data set to increase the performance of the proposed method.

Figure 19. Comparison of different large-scale density models of well 2: (a) the observed seismic data around well 2; (b and c) the large-scale density models around well 2, which are built by well extrapolation and the proposed method, respectively; and (d) the comparison of different large-scale density curves of well 2.

Table 5. A comparison of the built large-scale density models of well 2 in the field data example. The bold values indicate that the proposed method performs significantly better than the well extrapolation method.

Method      Well extrapolation   Proposed method
PS/N (dB)   35.0429              47.0095
SSIM        0.2515               0.9255

DISCUSSION

In the preceding sections, the deep-learning-based method has been successfully applied to build large-scale density models, and its advantages over the well extrapolation method have been clearly demonstrated through theoretical analysis and numerical examples. In addition, uncertainty analyses on Gaussian noise in the seismic data and on the choice of wells in network training have been conducted to verify the robustness of the proposed method. The main advantages of the proposed method can be briefly summarized as follows. First, it builds a large-scale density model from seismic data using a nonlinear relation between them described by a deep learning architecture that is established based on the LSTM network. This approach avoids the requirements of a large-scale velocity model and picked horizons, which are essential elements of some other methods and weaken them because large-scale velocity model building and horizon picking are known to be challenging. Second, the proposed method uses randomly generated models to greatly enlarge the size and diversity of the training data set instead of just using the limited well logs for network training. This strategy significantly improves the performance of the proposed method in dealing with complex cases. Below, we address some important aspects of the proposed method.

Figure 20. The large-scale density models: (a and b) the models built by the proposed deep-learning-based method using well groups 1 and 2 for network training, respectively.


Figure 21. Comparison of different large-scale density models of well 2: (a) the observed seismic data around well 2; (b and c) the large-scale density models around well 2, which are built by the proposed method using well groups 1 and 2 for network training, respectively; and (d) the comparison of different large-scale density curves of well 2.

Table 6. A comparison of the large-scale density models of well 2 in the field data example. The models are built by the proposed method using different groups of wells in network training. The bold values indicate that group 1 leads to a better PS/N value, whereas group 2 leads to a better SSIM value.

Training data set   Well group 1   Well group 2
Wells               Wells 1, 3–5   Wells 3–5
PS/N (dB)           47.0095        44.7418
SSIM                0.9255         0.9513

Potential limitations

The proposed deep-learning-based method builds a large-scale density model from seismic data based on the nonlinear relation between them. Consequently, its effectiveness has two prerequisite conditions. First, sufficient well logs are required by the proposed method to build a reliable nonlinear relation. In other words, the proposed method is not applicable to cases in which no well log is available. Second, the well-tie condition between the seismic data and the well logs is required by the proposed method to ensure that a reasonable nonlinear relation can be built.

Extension to 3D cases

The extension of the proposed method to 3D cases is straightforward because it uses a trace-by-trace strategy to build the large-scale density model, so its applications in 2D and 3D cases are basically the same. However, one should pay attention to the following two aspects to ensure a successful 3D implementation. First, as introduced in the 2D cases, we use a sliding window to extract the seismic data as the input of the LSTM unit for each time step; this sliding window should be replaced by a sliding cube in 3D cases to fully exploit the spatial correlation of 3D seismic data. Second, during random model generation, we should generate 3D random models in 3D cases, and these models can also be generated from several 1D random models through 3D interpolation.
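The sliding-cube input could be sketched as a direct extension of the 2D window function; the lateral cube size and the padding are assumptions, since the paper does not implement the 3D case.

```python
import numpy as np

def lstm_inputs_3d(cube, iline, xline, win_t=10, win_x=5, win_y=5, stride=5):
    """3D analogue of the 2D sliding window: a win_t x win_x x win_y
    cube of samples around the target trace, flattened into one input
    vector per LSTM time step."""
    n_t = cube.shape[0]
    hx, hy = win_x // 2, win_y // 2
    padded = np.pad(cube, ((0, win_t), (hx, hx), (hy, hy)), mode="edge")
    return np.stack([
        padded[t0:t0 + win_t, iline:iline + win_x, xline:xline + win_y].ravel()
        for t0 in range(0, n_t, stride)
    ])
```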

Application to other reservoir


parameters
In this work, the deep-learning-based method is
used to build a large-scale density model based on
seismic data. The basic idea behind this work also
can be applied to other reservoir parameters, such
as the P-wave impedance and quality factor Q.
Next, we will use Q as an example to discuss
how it can be obtained by using deep learning.
The Q factor describes the intrinsic attenuation
of subsurface media, which leads to the viscous
Figure 21. Comparison of different large-scale density models of well 2: (a) the observed effects of the seismic waveform, meaning ampli-
seismic data around well 2; (b and c) the large-scale density models around well 2, which tude reduction and phase dispersion (Causse et al.,
are built by the proposed method using well groups 1 and 2 for network training, respec-
tively; and (d) the comparison of different large-scale density curves of well 2. 1999). An accurate Q model is not only key for
mitigating the viscous effects in acoustic full-
waveform inversion (Virieux and Operto, 2009;
Agudo et al., 2018) and seismic deconvolution (Margrave et al.,
Table 6. A comparison of the large-scale density models of
well 2 in the field data example. The models are built by the 2011; Wang et al., 2013; Sui and Ma, 2019), but it is also crucial
proposed method using different groups of wells in network for the identification of gas reservoirs. Because Q influences the am-
training. The bold values indicate that the group 1 can lead plitude and phase of seismic data, it is reasonable to believe that a
Figure 21. Comparison of different large-scale density models of well 2: (a) the observed
seismic data around well 2; (b and c) the large-scale density models around well 2, which
are built by the proposed method using well groups 1 and 2 for network training, respec-
tively; and (d) the comparison of different large-scale density curves of well 2.

Table 6. A comparison of the large-scale density models of
well 2 in the field data example. The models are built by the
proposed method using different groups of wells in network
training. Group 1 yields the better PS/N value, whereas group 2
yields the better SSIM value.

Training data set    Well group 1    Well group 2
Wells                Wells 1, 3–5    Wells 3–5
PS/N (dB)            47.0095         44.7418
SSIM                 0.9255          0.9513
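For completeness, PS/N and SSIM scores such as those in Table 6 can be computed as sketched below. The peak convention in psnr() is an assumption, because the paper does not spell out its exact definition, and SSIM here comes from scikit-image:

import numpy as np
from skimage.metrics import structural_similarity

def psnr(reference, estimate):
    # Peak signal-to-noise ratio in dB; the peak is taken from the
    # reference model (one common convention, assumed here).
    mse = np.mean((reference - estimate) ** 2)
    return 20.0 * np.log10(np.abs(reference).max() / np.sqrt(mse))

ref = np.random.rand(200, 300)                 # stand-in reference model
est = ref + 0.01 * np.random.randn(200, 300)   # stand-in built model
print(psnr(ref, est))
print(structural_similarity(ref, est, data_range=ref.max() - ref.min()))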

CONCLUSION

Building an accurate large-scale density model is of great importance
in exploration geophysics. We propose a deep-learning-based method to
build such a model. This method builds the large-scale density model
from seismic data based on the nonlinear relation between them described
by a deep learning architecture, and the well logs are used to construct
a training data set that enables deep learning to learn such a nonlinear
relation. The proposed method has two important characteristics. First,
the LSTM network is used to make full use of the dynamic nature of
seismic data and well logs. Second, and more important, the random
models, which are generated based on the probabilities calculated from
well logs, are used to greatly enlarge the size and diversity of the
training data set, which significantly improves the ability of the
proposed method in handling complex cases. Synthetic examples based on
the modified Marmousi II model and field data examples display that the
proposed method can build reasonable large-scale density models even
though the well logs used in network training are limited, and they
display superior performance: the built large-scale density model is
11.9666 dB and 0.6740 higher, respectively, in PS/N and SSIM than that
of the well extrapolation method in the field data example.


In addition, uncertainty analyses of the proposed method on Gaussian
noise in seismic data and on the choice of wells used in network
training were conducted to demonstrate its robustness. In the future,
we will extend the current work to 3D applications and also investigate
its implementation in building models for other reservoir parameters.

ACKNOWLEDGMENTS

The authors gratefully thank the editors and three anonymous reviewers
for their comments, which have greatly helped to improve the quality of
this paper. The authors also thank the National Natural Science
Foundation of China under grant no. 41804113, the National Postdoctoral
Program for Innovative Talents under grant no. BX201700193, the National
Key R&D Program of the Ministry of Science and Technology of China under
grant nos. 2018YFC1504200 and 2018YFC0603501, and the National Science
and Technology Major Project under grant nos. 2016ZX05024-001-007 and
2017ZX05069 for their financial support.

DATA AND MATERIALS AVAILABILITY

Data related to the synthetic examples are available and can be obtained
by contacting the corresponding author. Data related to the field data
examples are confidential and cannot be released.

REFERENCES

Abadi, M., A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, 2015, TensorFlow: Large-scale machine learning on heterogeneous systems. (Software available from tensorflow.org).
Agudo, Ò. C., N. V. da Silva, M. Warner, T. Kalinicheva, and J. Morgan, 2018, Addressing viscous effects in acoustic full-waveform inversion: Geophysics, 83, no. 6, R611–R628, doi: 10.1190/geo2018-0027.1.
Ahmadi, M. A., S. Zendehboudi, A. Lohi, A. Elkamel, and I. Chatzis, 2013, Reservoir permeability prediction by neural networks combined with hybrid genetic algorithm and particle swarm optimization: Geophysical Prospecting, 61, 582–598, doi: 10.1111/j.1365-2478.2012.01080.x.
Araya-Polo, M., J. Jennings, A. Adler, and T. Dahlke, 2018, Deep-learning tomography: The Leading Edge, 37, 58–66, doi: 10.1190/tle37010058.1.
Ardjmandpour, N., C. Pain, J. Singer, J. Saunders, E. Aristodemou, and J. Carter, 2011, Artificial neural network forward modelling and inversion of electrokinetic logging data: Geophysical Prospecting, 59, 721–748, doi: 10.1111/j.1365-2478.2010.00935.x.
Biswas, R., M. K. Sen, V. Das, and T. Mukerji, 2019, Prestack and poststack inversion using a physics-guided convolutional neural network: Interpretation, 7, no. 3, SE161–SE174, doi: 10.1190/INT-2018-0236.1.
Boateng, C. D., L.-Y. Fu, W. Yu, and G. Xizhu, 2017, Porosity inversion by Caianiello neural networks with Levenberg-Marquardt optimization: Interpretation, 5, no. 3, SL33–SL42, doi: 10.1190/INT-2016-0119.1.
Causse, E., R. Mittet, and B. Ursin, 1999, Preconditioning of full-waveform inversion in viscoacoustic media: Geophysics, 64, 130–145, doi: 10.1190/1.1444510.
Cersósimo, D. S., C. L. Ravazzoli, and R. G. Martínez, 2016, Prediction of lateral variations in reservoir properties throughout an interpreted seismic horizon using an artificial neural network: The Leading Edge, 35, 265–269, doi: 10.1190/tle35030265.1.
Chaki, S., A. Routray, and W. K. Mohanty, 2015, A novel preprocessing scheme to improve the prediction of sand fraction from seismic attributes using neural networks: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 8, 1808–1820, doi: 10.1109/JSTARS.2015.2404808.
Chen, X., Y. Du, Z. Liu, W. Zhao, and X. Chen, 2018, Inversion of density interfaces using the pseudo-backpropagation neural network method: Pure and Applied Geophysics, 175, 4427–4447, doi: 10.1007/s00024-018-1889-7.
Forgues, E., and G. Lambaré, 1997, Parameterization study for acoustic and elastic ray plus Born inversion: Journal of Seismic Exploration, 6, 253–277.
Gao, Z., S. Hu, C. Li, H. Chen, J. Gao, and Z. Xu, 2020a, Reflectivity inversion of nonstationary seismic data with deep learning based data correction: 82nd Annual International Conference and Exhibition, EAGE, Extended Abstracts, doi: 10.3997/2214-4609.202011195.
Gao, Z., C. Li, T. Yang, Z. Pan, J. Gao, and Z. Xu, 2020b, OMMDE-Net: A deep learning-based global optimization method for seismic inversion: IEEE Geoscience and Remote Sensing Letters, Early Access, doi: 10.1109/LGRS.2020.2973266.
Gao, Z., Z. Pan, C. Zuo, J. Gao, and Z. Xu, 2019, An optimized deep network representation of multimutation differential evolution and its application in seismic inversion: IEEE Transactions on Geoscience and Remote Sensing, 57, 4720–4734, doi: 10.1109/TGRS.2019.2892567.
Gardner, G. H. F., L. W. Gardner, and A. R. Gregory, 1974, Formation velocity and density–the diagnostic basics for stratigraphic traps: Geophysics, 39, 770–780, doi: 10.1190/1.1440465.
Hochreiter, S., and J. Schmidhuber, 1997, Long short-term memory: Neural Computation, 9, 1735–1780, doi: 10.1162/neco.1997.9.8.1735.
Kahrizi, A., and H. Hashemi, 2014, Neuron curve as a tool for performance evaluation of MLP and RBF architecture in first break picking of seismic data: Journal of Applied Geophysics, 108, 159–166, doi: 10.1016/j.jappgeo.2014.06.012.
Keynejad, S., M. L. Sbar, and R. A. Johnson, 2017, Comparison of model-based generalized regression neural network and prestack inversion in predicting Poisson's ratio in Heidrun Field, North Sea: The Leading Edge, 36, 938–946, doi: 10.1190/tle36110938.1.
Kingma, D. P., and J. Ba, 2014, Adam: A method for stochastic optimization: arXiv preprint arXiv:1412.6980.
Konaté, A. A., H. Pan, S. Fang, S. Asim, Y. Z. Yao, C. Deng, and N. Khan, 2015, Capability of self-organizing map neural network in geophysical log data classification: Case study from the CCSD-MH: Journal of Applied Geophysics, 118, 37–46, doi: 10.1016/j.jappgeo.2015.04.004.
Lewis, W., and D. Vigh, 2017, Deep learning prior models from seismic images for full-waveform inversion: 87th Annual International Meeting, SEG, Expanded Abstracts, 1512–1517, doi: 10.1190/segam2017-17627643.1.
Maiti, S., and R. K. Tiwari, 2010, Automatic discriminations among geophysical signals via the Bayesian neural networks approach: Geophysics, 75, no. 1, E67–E78, doi: 10.1190/1.3298501.
Margrave, G. F., M. P. Lamoureux, and D. C. Henley, 2011, Gabor deconvolution: Estimating reflectivity by nonstationary deconvolution of seismic data: Geophysics, 76, no. 3, W15–W30, doi: 10.1190/1.3560167.
Motaz, A., and A. Ghassan, 2018, Petrophysical-property estimation from seismic data using recurrent neural networks: 88th Annual International Meeting, SEG, Expanded Abstracts, 2141–2146, doi: 10.1190/segam2018-2995752.1.
Nielsen, M. A., 2015, Neural networks and deep learning: Determination Press.
Poulton, M. M., 2002, Neural networks as an intelligence amplification tool: A review of applications: Geophysics, 67, 979–993, doi: 10.1190/1.1484539.
Qian, F., M. Yin, X.-Y. Liu, Y.-J. Wang, C. Lu, and G.-M. Hu, 2018, Unsupervised seismic facies analysis via deep convolutional autoencoders: Geophysics, 83, no. 3, A39–A43, doi: 10.1190/geo2017-0524.1.
Robinson, E. A., 1967, Predictive decomposition of time series with application to seismic exploration: Geophysics, 32, 418–484, doi: 10.1190/1.1439873.
Rumelhart, D. E., G. E. Hinton, and R. J. Williams, 1986, Learning representations by back-propagating errors: Nature, 323, 533–536, doi: 10.1038/323533a0.
Sui, Y., and J. Ma, 2019, A nonstationary sparse spike deconvolution with anelastic attenuation: Geophysics, 84, no. 2, R221–R234, doi: 10.1190/geo2017-0846.1.
Tarantola, A., 1986, A strategy for nonlinear elastic inversion of seismic reflection data: Geophysics, 51, 1893–1903, doi: 10.1190/1.1442046.
Ursenbach, C. P., 2005, Generalized Gardner relations: 75th Annual International Meeting, SEG, Expanded Abstracts, 1885–1888, doi: 10.1190/1.1817057.
van der Baan, M., and C. Jutten, 2000, Neural networks in geophysical applications: Geophysics, 65, 1032–1047, doi: 10.1190/1.1444797.
Virieux, J., and S. Operto, 2009, An overview of full-waveform inversion in exploration geophysics: Geophysics, 74, no. 6, WCC1–WCC26, doi: 10.1190/1.3238367.
Wang, L., J. Gao, W. Zhao, and X. Jiang, 2013, Enhancing resolution of nonstationary seismic data by molecular-Gabor transform: Geophysics, 78, no. 1, V31–V41, doi: 10.1190/geo2011-0450.1.
Wit, R. W. L. D., A. P. Valentine, and J. Trampert, 2013, Bayesian inference of Earth's radial seismic structure from body-wave traveltimes using neural networks: Geophysical Journal International, 195, 408–422, doi: 10.1093/gji/ggt220.

Biographies and photographs of the authors are not available.
