IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 14, NO. 12, DECEMBER 2017 2395
Prediction of Subsurface NMR T2 Distributions in a
Shale Petroleum System Using Variational
Autoencoder-Based Neural Networks
Hao Li and Siddharth Misra
Abstract: Nuclear magnetic resonance (NMR) is used in geological characterization to investigate the internal structure of geomaterials filled with fluids containing ¹H and ¹³C nuclei. Subsurface NMR measurements are generally acquired as well logs that provide information about fluid mobility and fluid-filled pore size distribution. Acquisition of subsurface NMR logs is limited by operational and instrumentation challenges. We implement a variational autoencoder (VAE) for improved training of a neural network (NN) to generate the NMR-T2 distributions along a 300-ft depth interval in a shale petroleum system at 11 000-ft depth below sea level. Subsurface mineral and kerogen volume fractions, fluid saturations, and T2 distributions acquired at 460 discrete depth points were used as the training data set. The trained VAE-NN successfully predicts the T2 distributions for 100 discrete depths at an R² of 0.75 and a normalized root-mean-square deviation of 15%.

Index Terms: Machine learning, nuclear magnetic resonance (NMR).

Manuscript received June 26, 2017; revised August 10, 2017; accepted October 16, 2017. Date of publication November 16, 2017; date of current version December 4, 2017. (Corresponding author: Siddharth Misra.)
H. Li is with the Mewbourne School of Petroleum and Geological Engineering, University of Oklahoma, Norman, OK 73019-0390 USA (e-mail: haoli@ou.edu).
S. Misra is with the Petroleum and Geological Engineering Department, University of Oklahoma, Norman, OK 73019-0390 USA (e-mail: misra@ou.edu).
Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LGRS.2017.2766130
1545-598X © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

A nuclear magnetic resonance (NMR) logging tool is sensitive to the mobility of the pore-filling fluid phases, the bound fluid volume, and the pore structure of subsurface hydrocarbon-bearing formations. Acquiring an NMR log is more expensive than acquiring other conventional logs and requires infrastructure that is harder to deploy. Previous studies aim to predict NMR-derived physical properties of the subsurface, such as saturation and permeability [1]. Prediction of the entire T2 distribution, spanning relaxation times of 0.3 to 3000 ms, from conventional logs without core data is a novel task. A machine-learning method is implemented to extract the complex relationships of the T2 distribution with mineral volume fraction and fluid saturation logs, so that synthetic NMR T2 distributions can be generated in shales in the absence of an NMR logging tool.

II. STATE OF THE ART

Surface-based deep sensing measurements, borehole-based near-wellbore measurements (logs), and laboratory measurements of geological core samples extracted from wellbores are interpreted using empirical and mechanistic models to quantify the physical properties of the subsurface. An NMR logging tool is deployed in a borehole primarily to acquire the T2 distributions of the subsurface earth formations, which can be further processed to obtain the physical properties of the formations, such as pore size distribution (PSD), fluid-filled porosity, bound fluid saturations, and permeability [1].

The use of logging tools, tool physics models, geophysical models, and inversion- and machine-learning-based data interpretation techniques for subsurface characterization has been evolving with the advancements in sensor physics and computational methods. Well logs are being interpreted using neural networks (NNs), a machine-learning approach, to obtain subsurface physical properties. For example, Wong et al. [2] used well logs to classify a formation into different lithofacies, followed by the estimation of porosity and permeability using a genetic NN. Lithology determination from well logs was performed by Chang et al. [3] in Ordovician rock units using a fuzzy associative memory NN.

Deep learning (DL) methods are advanced machine-learning techniques for data processing, information retrieval, pattern recognition, and diagnostics. DL algorithms are increasingly being adopted in remote sensing applications. High-resolution satellite images are classified based on the embedded scenes, object detection, and land-use patterns using deep belief networks [4] and deep convolutional NNs [5]. Autoencoder-based active learning techniques have also been used on hyperspectral data for classification by selecting the most informative samples for training and by effective texture extraction and decontamination of speckle noise in the data [6]. To the best of our knowledge, there are no predictive model applications of DL methods, especially those based on autoencoders and convolutional NNs, to enhance subsurface characterization using well logs. This letter proposes a novel DL-assisted method to characterize the NMR-T2 distribution response of fluid-filled porous subsurface formations by processing conventional and easy-to-acquire subsurface logs in the absence of an NMR logging tool.

III. THEORY AND METHODS

A. Nuclear Magnetic Resonance for Characterization

NMR response is used as a diagnostic technique to map the fluid-filled pores in a geomaterial for purposes of reservoir characterization. In the subsurface, NMR response is primarily generated by the relaxation of magnetically excited ¹H and ¹³C nuclei of pore-filling fluids. NMR response is quantified as
a T2 distribution that comprises the bulk fluid relaxation, surface relaxation, and diffusion relaxation signatures. Surface relaxation occurs at the fluid–solid interface and is affected by mineralogy and lithology. Bulk relaxation is affected by the fluid type, the hydrogen content, and its mobility. Consequently, the NMR T2 signal is affected by both fluid and matrix compositions. Therefore, the NMR T2 distribution can be successfully predicted when the formation fluid saturations and matrix mineral volume fractions are known.

B. VAE-Based Neural Network Training and Test Processes

An autoencoder is an NN that reproduces its input as its output by implementing an encoder NN followed by a decoder NN [7]. On the encoder side, it has latent layers of lower dimensions compared to the preceding layers. The encoder projects the input data to the lower dimension latent layer; the decoder then decodes the latent vector to reconstruct the input. With this bottleneck structure [Fig. 1 (step 1)], an autoencoder learns to extract the input signal's dominant features and characteristics. A good reconstruction indicates that the input signals were properly represented in the lower dimension latent layer during the encoding procedure. A variational autoencoder (VAE) is a specific form of autoencoder, wherein latent vectors are constrained to follow a Gaussian distribution [8]. This constraint adds uncertainty to the projected latent variable to limit the amount of information that can pass through the latent layer, i.e., it discretizes the information flow.

Fig. 1. Training process schematic.

We construct the VAE with three hidden layers. The input and output layers of the VAE are the 64-dimensional measured and reconstructed NMR T2, respectively. The three hidden layers have 16, 2, and 16 neurons, respectively. A sigmoid activation is implemented in the output layer and a rectified linear unit activation in the remaining four layers. Simple backpropagation and stochastic gradient descent algorithms optimize the weights and biases of neurons during the training process.

A two-step training process (Fig. 1) is implemented prior to the testing phase (Fig. 2). In the first step, the VAE is trained to generalize, learn, and memorize the dominant features of the T2 distributions in the training data. The encoder projects the T2 data onto a 2-D latent space. A higher dimensional latent layer is required when the training data set has several distinct features. After the VAE is trained, the trained decoder (the second half of the VAE) is frozen and connected to a three-layer NN to associate the formation fluid saturations and mineral content with the dominant NMR T2 features in the latent space, as extracted in the first step of the training. The three-layer NN comprises an input layer that takes the ten mineral content and fluid saturation logs as input, a hidden layer with eight neurons, and an output layer with two neurons. The output of the three-layer NN is fed to the decoder. The trained decoder is frozen to keep its "memory" of the step 1 training with T2 data. After the training process is complete, the trained VAE-NN generates the T2 distribution in the shale petroleum system (SPS) by processing conventional logs (see Fig. 2).

Fig. 2. Testing process schematic.

C. VAE Architecture and Theory

The VAE architecture encodes the training data input X into the latent vector z, which is constrained to follow a Gaussian distribution [9]. During the training phase, the VAE weights are altered to minimize two loss functions. The first loss function, $\|X - f(z)\|^2$, where f(z) is the output reconstructed from z, enables the VAE to learn to minimize the difference between inputs and outputs. The second loss function, the Kullback–Leibler divergence, forces the encoder to generate a latent vector that follows a Gaussian distribution by measuring the relative entropy between the prior and the approximate posterior (hidden representation) probability density functions.

During the data projection step of encoding, the VAE learns to arrange the training data with similar features at nearby locations in the latent space to effectively reduce the loss. The T2 distribution in the latent space represents the most important trending features of the T2 distribution of the SPS. Once the VAE is trained to memorize and generalize the dominant features in the T2 distributions, any new input T2 distribution is projected to a location in the latent space where there already exists a projection of a similar T2 distribution that was fed during the prior training phase. For accurate prediction of the NMR response using the mineral volume fractions and fluid saturations, the VAE-NN needs to project a unique combination of mineral contents and fluid saturations close to the T2 projection memorized during the VAE training.

IV. DATA ACQUIRED FROM THE SHALE PETROLEUM SYSTEM

A set of well logs was retrieved from the 300-ft long intersection of a well with the SPS, which comprises a top black shale, a middle sandy siltstone, and a bottom black shale. Variation in formation mineral compositions leads to changes in the pore structure, grain texture, and surface relaxivity. These characteristics, along with fluid saturations and their distribution in the pore network, govern the NMR T2 distribution response of the formation.
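The layer sizes given in Section III-B can be traced with a small forward-pass sketch of the testing path: ten logs enter the three-layer NN, two latent coordinates come out, and the frozen decoder expands them to 64 T2 bins. This is only an illustration under stated assumptions. The weights are random placeholders rather than trained values, the activations of the three-layer NN (ReLU hidden layer, linear output) are assumptions because the letter does not state them, and the helper names `dense`, `random_layer`, and `predict_t2` are hypothetical.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b), element by element."""
    return [activation(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def identity(v):
    return v


def random_layer(n_in, n_out, rng):
    """Random placeholder weights and zero biases; a trained model would load real values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)

# Step-2 network: 10 input logs -> 8 hidden neurons -> 2 latent coordinates.
nn_hidden = random_layer(10, 8, rng)
nn_output = random_layer(8, 2, rng)

# Decoder half of the VAE (frozen after step 1): 2 latent -> 16 hidden -> 64 T2 bins.
dec_hidden = random_layer(2, 16, rng)
dec_output = random_layer(16, 64, rng)


def predict_t2(logs):
    """Testing path: logs -> three-layer NN -> latent vector -> decoder -> T2 bins."""
    h = dense(logs, *nn_hidden, relu)        # hidden activation is an assumption
    z = dense(h, *nn_output, identity)       # linear latent output is an assumption
    d = dense(z, *dec_hidden, relu)          # ReLU in the decoder hidden layer
    return dense(d, *dec_output, sigmoid)    # sigmoid output layer, as in the letter


# Hypothetical standardized values for the ten mineral and fluid saturation logs.
logs = [rng.gauss(0.0, 1.0) for _ in range(10)]
t2 = predict_t2(logs)
print(len(t2))  # 64
```

In the second training step only the `nn_hidden` and `nn_output` parameters would be updated, while the decoder parameters stay fixed.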
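The two loss terms named in Section III-C can be written out for a single sample as a short sketch. It assumes a diagonal Gaussian approximate posterior parameterized by a mean and a log-variance, a standard normal prior, and an unweighted sum of the two terms. The letter does not state the relative weighting, so `kl_weight` is a hypothetical parameter, and all numeric values are toy inputs rather than data from the study.

```python
import math
import random


def reconstruction_loss(x, x_hat):
    """Squared Euclidean distance ||x - x_hat||^2 between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def kl_divergence(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def vae_loss(x, x_hat, mu, log_var, kl_weight=1.0):
    """Sum of both terms; kl_weight = 1.0 is an assumption, not a value from the letter."""
    return reconstruction_loss(x, x_hat) + kl_weight * kl_divergence(mu, log_var)


rng = random.Random(0)
mu, log_var = [0.5, -0.2], [0.0, -1.0]   # hypothetical encoder outputs for a 2-D latent space
z = sample_latent(mu, log_var, rng)      # latent vector that would be passed to the decoder
x = [0.10, 0.20, 0.05]                   # toy 3-bin input (the letter uses 64 T2 bins)
x_hat = [0.12, 0.18, 0.05]               # hypothetical decoder output
print(vae_loss(x, x_hat, mu, log_var))
```

The KL term is zero only when the encoder outputs zero mean and unit variance, which is what pulls the latent projections toward the Gaussian prior.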
Fig. 3. (a) Comparison of predicted NMR T2 (red curve) and measured NMR T2 (blue curve) distributions. (b) R² of T2 distribution predictions for 100 depths from the testing data set of the SPS. In (a), the x-axis is T2 relaxation time from 0.3 to 3000 ms; the y-axis is normalized NMR signal amplitude, approximately ranging from 0 to 0.25.
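The per-depth accuracy measures reported with Fig. 3 can be illustrated with a minimal sketch. The letter does not give its formulas, so two assumptions are made here: R² is taken as the usual coefficient of determination, and NRMSD is taken as the root-mean-square deviation divided by the range of the measured amplitudes. The function names and the five-bin toy curves are hypothetical.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def nrmsd(measured, predicted):
    """RMSD normalized by the range of the measured values (an assumed normalization)."""
    rmsd = math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))
    return rmsd / (max(measured) - min(measured))


# Toy single-peak curves on five bins; the letter uses 64 bins per depth.
measured = [0.00, 0.10, 0.20, 0.10, 0.00]
predicted = [0.01, 0.09, 0.18, 0.12, 0.00]

print(r_squared(measured, predicted))
print(nrmsd(measured, predicted))
```

Evaluating both functions once per test depth and averaging would give overall figures comparable in kind to those quoted in the letter, subject to the assumed definitions.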
Formation mineral content, fluid saturation, and T2 distribution logs were acquired at 597 depth points along the 300-ft length of the well intersection with the SPS. The 28 unrepresentative depths, where T2 distributions had three peaks, were screened out so that the VAE-NN learns from the most representative depths. Following that, all input logs were normalized to have zero mean and unit variance to facilitate convergence of the training process. T2 distribution predictions are performed using seven mineral content logs, namely, kerogen, calcite, dolomite, illite, chlorite, quartz, and feldspar contents, and three fluid saturation logs, namely, bound water, free water, and oil logs. These ten logs were inverted from resistivity, neutron, density, gamma ray, and dielectric logs. The NMR T2 distribution, mineral content, and fluid saturation logs are split randomly into testing and training data sets. Data from 460 out of 597 depths were used as training data, and the remaining 100 depths were used as testing data.

V. RESULTS

Testing data, comprising mineral contents, fluid saturations, and NMR T2 distributions, from 100 randomly selected depths out of the 597 depths across the SPS, are used to test the accuracy of the trained VAE-NN. In Fig. 3(a), especially for NMR data with a single peak, the trained VAE performs at a high accuracy, with an R² of 0.8 and a normalized root-mean-square deviation (NRMSD) of 14%. For NMR data with two peaks, the prediction accuracy is lower than that of the single-peak cases. The predictive model had limited access to NMR T2 distributions with two peaks during the training phase, which led to the poor quality of prediction for T2 distributions with two peaks. Less than one-third of the T2 data in the training data set have two peaks. During the testing phase, the R² of the overall prediction is around 0.75 [Fig. 3(b)] and the NRMSD is around 15%. In Fig. 3, 30% of predictions have R² < 0.6 because of the noise in well logs, uncertainty in inversion-derived logs, and insufficient training data. NN prediction is a data-driven method that will not summarize a log-T2 relationship accurately if there are not enough training data having that specific log-T2 relationship.

VI. DISCUSSION

The entire work is based on a data set acquired in a single well that intersects a shale reservoir comprising seven distinct lithologies. High prediction accuracy for the 100 randomly selected test depths indicates that the predictive model can be applied to a reservoir unit comprising one of the seven lithologies or their combinations. Moreover, the predictive model is applicable only in formations having unimodal or bimodal PSD. The predictive model cannot be generalized to other shale reservoirs. Prediction performance for depths exhibiting unimodal T2 is high; therefore, the unimodal synthetic T2 can be used to estimate petrophysical properties, such as PSD, permeability, and residual saturation. Such petrophysical estimations derived from bimodal synthetic T2 will be error prone due to lower prediction performance at the depths having bimodal PSD.

VII. CONCLUSION

A novel predictive model is built using a VAE-NN to predict the NMR T2 distribution in an SPS using cheaper and easy-to-acquire mineral content and fluid saturation logs. Due to the limited availability of T2 distributions with two peaks in the data set, the VAE-NN has relatively lower accuracy in predicting bimodal T2 distributions. The overall prediction performance of the VAE-NN in the entire SPS has an R² of 0.75 and an NRMSD of 15%.

REFERENCES

[1] H. Li and S. Misra, "Prediction of subsurface NMR T2 distribution from formation-mineral composition using variational autoencoder," in Proc. SEG Tech. Program Expanded Abstracts, 2017, pp. 3350–3354.
[2] P. M. Wong, T. D. Gedeon, and I. J. Taggart, "An improved technique in porosity prediction: A neural network approach," IEEE Trans. Geosci. Remote Sens., vol. 33, no. 4, pp. 971–980, Jul. 1995.
[3] H.-C. Chang, H.-C. Chen, and J.-H. Fang, "Lithology determination from well logs with fuzzy associative memory neural network," IEEE Trans. Geosci. Remote Sens., vol. 35, no. 3, pp. 773–780, May 1997.
[4] Q. Zou, L. Ni, T. Zhang, and Q. Wang, "Deep learning based feature selection for remote sensing scene classification," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 11, pp. 2321–2325, Nov. 2015.
[5] F. P. S. Luus, B. P. Salmon, F. van den Bergh, and B. T. J. Maharaj, "Multiview deep learning for land-use classification," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 12, pp. 2448–2452, Dec. 2015.
[6] J. Geng, J. Fan, H. Wang, X. Ma, B. Li, and F. Chen, "High-resolution SAR image classification via deep convolutional autoencoders," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 11, pp. 2351–2355, Nov. 2015.
[7] I. Goodfellow et al., "Generative adversarial nets," in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 2672–2680.
[8] D. P. Kingma and M. Welling, "Auto-encoding variational Bayes." Unpublished paper, 2013. [Online]. Available: https://arxiv.org/abs/1312.6114
[9] C. Doersch, "Tutorial on variational autoencoders." Unpublished paper, 2016. [Online]. Available: https://arxiv.org/abs/1606.05908