
This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 1

A Supervised Descent Learning Technique for Solving Directional Electromagnetic Logging-While-Drilling Inverse Problems

Yanyan Hu, Rui Guo, Yuchen Jin, Xuqing Wu, Maokun Li, Aria Abubakar, and Jiefu Chen, Member, IEEE

Manuscript received December 19, 2019; revised March 3, 2020; accepted March 29, 2020. (Corresponding author: Jiefu Chen.)
Yanyan Hu, Yuchen Jin, and Jiefu Chen are with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX 77004 USA (e-mail: chenjiefu@gmail.com).
Rui Guo and Maokun Li are with Tsinghua University, Beijing 100084, China.
Xuqing Wu is with the Department of Computer Information Systems, University of Houston, Houston, TX 77004 USA.
Aria Abubakar is with Schlumberger, Houston, TX 77478 USA.
Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2020.2986000

Abstract— In this article, a new scheme based on the supervised descent method (SDM) for solving directional electromagnetic logging-while-drilling (LWD) inverse problems is proposed. The SDM offers a new perspective for combining the classical gradient-based inversion and machine-learning-based inversion schemes. It iteratively learns a set of descent directions in the offline training process, where the training model set is generated in advance according to the prior information, and then updates the models with the learned descent directions as well as the data residuals in the prediction stage. This results in great flexibility to incorporate prior information, the capability of skipping local minima, and accelerated convergence. The generalization ability of the SDM to interrogate new models that are not contained in the training model set is also explored. By utilizing real-time information obtained from the logging process, the learned descent directions can be slightly revised with a higher efficiency to get closer to the true model. In addition, we probe the sensitivity of the SDM by adding different levels of random noise to the measurements. Numerical examples demonstrate that SDM-based inversion can achieve a higher resolution, faster convergence, and higher robustness than conventional schemes such as Occam’s inversion.

Index Terms— Descent directions learning, directional electromagnetic logging-while-drilling (LWD), inverse problem, machine learning, supervised descent method (SDM), well logging.

I. INTRODUCTION

ACCURATE inversion of directional electromagnetic (also known as azimuthal resistivity) logging-while-drilling (LWD) measurements plays an essential role in geosteering, a technique to actively adjust the borehole position on the fly to reach geological targets [1]. It is also crucial for applications including reservoir mapping, landing fault detection, and salt edge detection. However, the inversion of directional electromagnetic LWD measurements faces all the typical difficulties in solving inverse problems [2], such as nonlinearity, ill-posedness, and nonuniqueness, especially when using the newly emerged ultradeep azimuthal resistivity LWD tools, which have a large depth of investigation and require more inversion parameters to describe the subsurface structure [3].

A variety of optimization methods have been applied to interpret directional electromagnetic LWD measurements, among which deterministic methods are prevalent owing to their straightforward implementation and fast convergence through iteratively minimizing the discrepancy between the measurements and the forward model responses [4], [5]. Regularization is used to reduce the ill-posedness of the problem and stabilize the optimization by imposing extra constraints [6]. Many algorithms, including the gradient descent method and the Gauss–Newton method [7], [8], are widely used to solve this inversion problem. However, for complicated logging inverse problems with high nonlinearity, deterministic methods consider only the local properties of the objective function and are prone to being trapped in local minima. Moreover, deterministic schemes can incorporate prior information only through regularization, whereas much of the prior knowledge drawn from the experience of geologists and geophysicists is difficult to express in rigorous mathematical form [9].

As an alternative, statistical methods governed by Bayes’ theorem can overcome the local minima problem through sampling approaches such as Markov chain Monte Carlo (MCMC) [10], which compute the posterior distribution from the likelihood function and the prior probability [11]. However, statistical inversion is often time-consuming, which makes it difficult to extend to high-dimensional scenarios and to meet the real-time requirements of geosteering. Hybrid Monte Carlo [12], nonparametric sampling [13], and parallel multiple-chain delayed rejection adaptive Metropolis (DRAM) MCMC [14] have been adopted to improve the sampling efficiency, but a fast inversion scheme is still necessary.

Deep neural networks (DNNs) have been applied to solve inverse problems, acting as a surrogate mapping from the data space to the model space [15]–[21]. Furthermore, the conventional iterative scheme can be incorporated into the deep learning framework [22], [23] to make the learning

0196-2892 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: University of Houston. Downloaded on June 03,2020 at 09:45:03 UTC from IEEE Xplore. Restrictions apply.

process more reasonable. The DNN can extract the implicit mapping between the measurements and the models through the training process and then use the learned connections to guide the inversion of similar new measurements so that it converges to a realistic model. This scheme provides a new perspective for introducing prior information to improve the accuracy or accelerate the convergence, beyond adopting regularization, especially when prior knowledge is difficult to describe in rigorous form. In addition, considerable effort has been made to integrate the so-called physical loss derived from the system constraints, and modifications of the forward model or loss function have been introduced to guide the DNN [24]–[28]. This is also inspiring, because physical modeling or constraints help in understanding the physics and in guiding the model reconstruction, compared with a mere “black box” mapping between the data and model spaces [15]. Another promising characteristic of deep learning is its high efficiency: the time-consuming training needs to be done only once, and the prediction is rapid because the well-trained network is used directly.

However, despite the high accuracy and efficiency of the data- or physics-driven DNN, it has its own limitations. For the inversion of electromagnetic LWD measurements, it is difficult to describe the subsurface structure with a limited number of parameters when extending to high-dimensional scenarios, and the more parameters there are, the more expensive the training of the network becomes. In addition, an end-to-end implementation often implies limited generalization ability.

In this article, we study the application of an efficient and robust supervised descent method (SDM) for solving directional electromagnetic LWD inverse problems. Further improvement is also explored according to the characteristics of LWD and geosteering applications. Because the essence of geophysical inversion is to reconstruct the subsurface structures using not only the measurements but also geological/geophysical prior knowledge, the SDM combines the machine learning scheme with the advantages of the conventional methods. The SDM was first proposed in [29] to solve optimization problems in computer vision. It has since been applied to 1-D transient electromagnetic (TEM) inverse problems [30] as well as first-arrival travel-time tomography [31]. The rest of this article is organized as follows. Section II introduces directional electromagnetic LWD and the basic methodology of SDM-based inversion. Section III details the implementation of inversion using the SDM as well as improvement strategies for LWD and geosteering applications. In Section IV, numerical examples are given to validate the performance of the SDM, the effectiveness of the improvement strategies, and the generalization ability. The efficiency and sensitivity analyses are also presented. Section V concludes this article.

II. THEORY AND METHODOLOGY

A. Directional Electromagnetic LWD

Directional electromagnetic LWD tools are widely used in directional drilling and geosteering [1], [32]. Equipped with tilted or transverse antennas, the tool can distinguish the direction of the boundary crossing as well as the formation anisotropy [33], [34]. The recently developed ultradeep azimuthal resistivity LWD tool can reach over 100 ft from the borehole owing to its very low working frequency and the long spacings between the transmitting and receiving antennas [35]–[38]. The new generation of directional electromagnetic LWD tools expands the application of geosteering to the reservoir scale and, at the same time, raises greater challenges for subsurface structure inversion and interpretation.

A directional electromagnetic LWD tool measures the earth’s electrical properties by propagating electromagnetic waves into the formation using transmitting antennas, capturing the electromagnetic signals with receiving antennas, and transforming them into signal attenuations and phase shifts [39]. To reconstruct the resistivity profile surrounding the wellbore, the interpretation of multichannel measurements is indispensable. Since tool responses can be synthesized by simulation using a physical forward model $F(\cdot)$, the interpretation can be viewed as an inverse problem, $F^{-1}(\cdot)$, in which we need to reconstruct the formation properties from the observed measurements.

Generally speaking, two different approaches, model-based and pixel-based, are used to describe the subsurface structures. In the model-based approach, the inversion domain is divided into several regions described by geometric models based on prior information, and the resistivity is assumed to be homogeneous in each region. Under this condition, the parameters to be inverted are the resistivity of each subregion and the position of each boundary. In contrast, in pixel-based inversion, we divide the inversion domain into uniform or nonuniform small cells, and the only parameters to be inverted are the resistivities of the cells. Overall, model-based inversion, with fewer unknowns, is often an overdetermined problem but with strong nonlinearity. Moreover, in this scheme, a model needs to be assumed in advance, which may lead to model unconformity. Another drawback of model-based inversion is that its capability to be extended to 2-D/3-D cases is very limited. On the other hand, pixel-based inversion has more unknowns and is usually an underdetermined problem, so regularization must be added to decrease its ill-posedness. However, one advantage of pixel-based inversion is that it can be easily extended to 2-D/3-D cases and has generalization ability to a degree. In this article, we focus on the pixel-based inversion of directional electromagnetic LWD measurements.

B. SDM Theory

The inversion of directional electromagnetic LWD data can be regarded as a nonlinear data-fitting process that aims at obtaining a detailed estimate of the subsurface properties by minimizing the following target function:

$$L(m) = \| F(m) - d_{\mathrm{obs}} \|_2^2 \tag{1}$$

where $d_{\mathrm{obs}}$ is the observed data acquired in a logging survey and $F(m)$ is the synthetic data generated from the forward model $F(\cdot)$ with the estimated resistivity model $m$. That is


HU et al.: SUPERVISED DESCENT LEARNING TECHNIQUE 3

to say, the misfit between the observed data and the synthetic data is represented by (1). The model $m$ that minimizes the misfit is considered the solution of the LWD inverse problem.

According to Newton’s method, an iterative scheme can be adopted to update $m$ so that $L(m)$ tends to decrease. Taking the second-order Taylor expansion of $L(m)$ with respect to $m$ and assuming that $L(m)$ is twice differentiable, we have

$$L(m_0 + \Delta m) \approx L(m_0) + J_L(m_0)^T \Delta m + \frac{1}{2} \Delta m^T H_L(m_0) \Delta m \tag{2}$$

where $m_0$ is the value of $m$ in the last iteration, $J_L(m_0)$ is the Jacobian matrix of $L$ at $m_0$, and $H_L(m_0)$ is the Hessian matrix of $L$ at $m_0$. The minimum of (2) is reached when the gradient of $L$ with respect to $\Delta m$ is zero, that is

$$\Delta m = -H_L(m)^{-1} J_L(m) = -2 H_L(m)^{-1} J_F(m)^T \left( F(m) - d_{\mathrm{obs}} \right) \tag{3}$$

where $J_F$ is the Jacobian matrix of $F(\cdot)$. For nonlinear $F(\cdot)$, $m$ needs to be updated iteratively with (3) to reach at least a local minimum of $L(m)$. However, it is time- and memory-consuming to compute the Hessian matrix, not to mention the inversion of the Hessian matrix in each iteration. The overhead is still large even with the Gauss–Newton method, in which the Hessian matrix is approximated with $J_F^T J_F$.

Thus, instead of computing the Hessian matrix or Jacobian matrix, we define $K = -2 H_L(m)^{-1} J_F(m)^T$ as the update direction operator, and $K$ is learned in each iteration from a training model set generated according to prior information during the offline training stage. Then, (3) can be simplified to

$$\Delta m = K \left( F(m) - d_{\mathrm{obs}} \right) = K \Delta d \tag{4}$$

where $\Delta d = F(m) - d_{\mathrm{obs}}$ represents the data error between the synthetic and observed data. In the $k$th iteration, an average descent direction $K_k$ can be obtained by minimizing the following cost function:

$$L_k = \sum_{n=1}^{N} \left\| \Delta m_k^n - K_k \Delta d_k^n \right\|_2^2 \tag{5}$$

where $N$ is the number of training samples, and

$$\Delta m_k^n = m_n^* - m_k^n, \qquad \Delta d_k^n = d_n^* - d_k^n = F(m_n^*) - F(m_k^n) \tag{6}$$

in which $m_n^*$ is the $n$th training model, $m_k^n$ is the updated model of the $n$th training model in the $k$th iteration, and $d_n^*$ and $d_k^n$ are the synthetic data of $m_n^*$ and $m_k^n$, respectively. Through the training procedure, a sequence of coefficient matrices $K_1, K_2, \ldots, K_k$ is computed and preserved for the online prediction stage, during which the model update term $\Delta m$ can be obtained directly from (4) by iteratively applying the series of $K_k$.

III. INVERSION WITH SDM

A. Offline Training Framework

To describe complicated subsurface structures, the total number of pixels needed is usually large, which introduces artificial fluctuation into the inverted resistivity model even when the training data misfit is small. This is mainly because of the ill-posedness of the problem. As mentioned before, regularization is adopted to decrease the ill-posedness and stabilize the inversion, which helps suppress the artificial fluctuation [40]. Therefore, the loss function in (5) can be modified as

$$L_k^r = \sum_{n=1}^{N} \left\| \Delta m_k^n - K_k \Delta d_k^n \right\|_2^2 + \lambda \left\| r(K_k) \right\|_2^2 \tag{7}$$

where $\lambda$ is the Lagrange multiplier that balances the regularization and $r(K_k)$ is the regularization function. The regularization function is supposed to impose constraints on the model parameter $m$, which also constrains the computation of $K_k$. Here, we employ the smoothest constraint regularization [40] and define $r(K_k)$ as

$$r(K_k) = \nabla \left( m_k + K_k \Delta d_k \right) \tag{8}$$

where $\nabla$ is the differential matrix

$$\nabla = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -1 & 1 & \cdots & 0 \\ 0 & -1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & -1 \end{bmatrix} \tag{9}$$

which means that the gradient of the resistivity between adjacent pixels is minimized. To minimize $L_k^r$, (7) can be rewritten in matrix form

$$L_k^r = \left\| \Delta M_k - \Delta D_k K_k^T \right\|_F^2 + \lambda \left\| \left( M_k + \Delta D_k K_k^T \right) \nabla \right\|_2^2 \tag{10}$$

where $F$ represents the Frobenius norm, and

$$\Delta M_k = \begin{bmatrix} (\Delta m_k^1)^T \\ (\Delta m_k^2)^T \\ \vdots \\ (\Delta m_k^N)^T \end{bmatrix}, \qquad \Delta D_k = \begin{bmatrix} (\Delta d_k^1)^T \\ (\Delta d_k^2)^T \\ \vdots \\ (\Delta d_k^N)^T \end{bmatrix}. \tag{11}$$

Following the least-squares criterion, we get

$$K_k^T = \left( \Delta D_k^T \Delta D_k \right)^{-1} \Delta D_k^T \left( \Delta M_k - \lambda M_k \nabla \nabla^T \right) G(\lambda) \tag{12}$$

where

$$G(\lambda) = \left( I + \lambda \nabla \nabla^T \right)^{-1}. \tag{13}$$

Then, $K_k^T$ can be utilized to update the model as

$$M_{k+1} = M_k + \Delta D_k K_k^T. \tag{14}$$

The same process is repeated in the $(k+1)$th iteration until the maximum number of iterations is reached, or the relative


Fig. 1. Offline training flowchart of the SDM.

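model and data misfit are small enough, where the relative model misfit is defined as

$$\mathrm{rms}_M = \frac{1}{N} \sum_{n=1}^{N} \frac{\left\| \Delta m_k^n \right\|_2}{\left\| m_n^* \right\|_2} \tag{15}$$

and the relative data misfit is defined as

$$\mathrm{rms}_D = \frac{1}{N} \sum_{n=1}^{N} \frac{\left\| \Delta d_k^n \right\|_2}{\left\| d_n^* \right\|_2}. \tag{16}$$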
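
One training iteration, i.e., the regularized least-squares fit in (12) and (13), the model update in (14), and the misfits in (15) and (16), can be sketched in NumPy as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, a toy linear operator stands in for the LWD forward solver, the difference matrix is taken as $P \times (P-1)$ for a model of length $P$, and $\Delta D_k^T \Delta D_k$ is assumed to be invertible.

```python
import numpy as np


def difference_matrix(p):
    """P x (P-1) matrix as in (9): column j holds +1 in row j and -1 in row j+1."""
    grad = np.zeros((p, p - 1))
    idx = np.arange(p - 1)
    grad[idx, idx] = 1.0
    grad[idx + 1, idx] = -1.0
    return grad


def relative_misfit(current, reference):
    """(15)/(16): mean over samples of ||reference_n - current_n||_2 / ||reference_n||_2."""
    num = np.linalg.norm(reference - current, axis=1)
    den = np.linalg.norm(reference, axis=1)
    return float(np.mean(num / den))


def train_step(M, M_star, forward, lam):
    """One offline training iteration; rows of M and M_star are models."""
    p = M.shape[1]
    dM = M_star - M                            # Delta M_k, see (6) and (11)
    dD = forward(M_star) - forward(M)          # Delta D_k, see (6) and (11)
    grad = difference_matrix(p)
    gg = grad @ grad.T
    G = np.linalg.inv(np.eye(p) + lam * gg)    # G(lambda), (13)
    rhs = dD.T @ (dM - lam * M @ gg)
    # K_k^T from (12); assumes dD^T dD is invertible (no damping applied here)
    Kt = np.linalg.solve(dD.T @ dD, rhs) @ G
    return Kt, M + dD @ Kt                     # updated models, (14)


# Toy setup (hypothetical): a linear operator replaces the real forward model.
rng = np.random.default_rng(0)
P = 6
A = np.eye(P) + 0.05 * rng.standard_normal((P, P))
forward = lambda X: X @ A.T                    # rows in, rows out
M_star = rng.uniform(1.0, 10.0, size=(50, P))  # training models
M0 = np.full_like(M_star, 5.0)                 # same initial model for every sample

rms_d = relative_misfit(forward(M0), forward(M_star))
Kt, M1 = train_step(M0, M_star, forward, lam=0.1 * rms_d)
print(relative_misfit(M0, M_star), "->", relative_misfit(M1, M_star))
```

In the toy call, $\lambda$ is set to one-tenth of the relative data misfit, which mirrors the setting reported in Section IV-A; any other schedule could be passed instead.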

During the training stage, $\lambda$ is set according to the relative data misfit $\mathrm{rms}_D$ in each iteration. Tying $\lambda$ to $\mathrm{rms}_D$ in this way guarantees fast convergence in the first steps and the influence of $\mathrm{rms}_D$ in the last steps. Many other settings are also feasible. From our experience of solving the logging inverse problem with the SDM, $\mathrm{rms}_M$ and $\mathrm{rms}_D$ decrease very fast, and in general, a maximum of 20 iterations is sufficient. Overall, the flowchart of the offline training stage is shown in Fig. 1.

B. Online Prediction Framework

During the online prediction stage, the same initial model is applied as in the training stage, and the model parameters can be directly updated by iteratively applying the learned descent directions $K_k$ and the data residual [31]. We can also impose constraints on the model parameters to further stabilize the prediction process. Thus, the objective function to be optimized during the $k$th iteration is formulated as

$$P_k^r(m_k) = \left\| m_k - m_{k-1} - K_{k-1} \Delta d_{k-1} \right\|_2^2 + \alpha \left\| \nabla^T m_k \right\|_2^2 \tag{17}$$

where $\| \nabla^T m_k \|_2^2$ is the regularization term and $\alpha$ is the regularization coefficient, whose setting is similar to that of $\lambda$ in (10). The corresponding solution of (17) is

$$m_k = G(\alpha) \left( K_{k-1} \Delta d_{k-1} + m_{k-1} \right) \tag{18}$$

where $G(\alpha)$ is defined in (13) and $\Delta d_{k-1} = d_{\mathrm{obs}} - F(m_{k-1})$, in which $d_{\mathrm{obs}}$ is the measurement to be fit.

The flowchart of online prediction is shown in Fig. 2. The SDM avoids computing the derivatives of the cost function, and therefore, it achieves high efficiency during the prediction stage. In each iteration of the prediction, if we denote the lengths of the model vector and the data vector by $M$ and $L$, respectively, the complexity of computing $K_{k-1} \Delta d_{k-1}$ is $O(ML)$, which can be taken as the overall complexity of updating $m_k$, since $G(\alpha)$ is quite sparse. Before that, the forward modeling is executed once to compute the data error $\Delta d_{k-1}$. Therefore, the total time complexity of an $I$-step prediction is $O(IML)$ plus $I$ forward modeling runs.

Fig. 2. Online prediction flowchart of the SDM.

Note that not all the $K$s are required during the prediction procedure, because the $K$s learned in the last few iterations may lose their directionality when the updated model gets close to the true model [31]. Therefore, the relative data misfit $\mathrm{rms}_D$ is monitored during each iteration. The iteration is terminated if it reaches the maximum number of steps or if $\mathrm{rms}_D$ no longer decreases. Moreover, the prediction needs to be stopped if $\mathrm{rms}_D$ diverges in the first few iterations, which may occur when the testing model is quite different from the training models, indicating that the prior information is not accurate or relevant enough. This is also useful for guiding our direction in solving the problem.

In addition, the Gauss–Newton method considers only local properties (the first- or second-order derivatives) and is easily trapped in local minima.
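
The prediction loop of this subsection, i.e., the update (18) together with the stopping rule on the relative data misfit, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: `predict`, `smoothing_operator`, and `alpha_of_rms` are hypothetical names, each entry of `K_list` is the (untransposed) matrix $K_k$ acting on a data-residual vector, and a toy linear forward operator is used so that the exact descent direction is known.

```python
import numpy as np


def smoothing_operator(p, alpha):
    """G(alpha) = (I + alpha * grad grad^T)^-1 as in (13); grad is the P x (P-1) difference matrix."""
    grad = np.zeros((p, p - 1))
    idx = np.arange(p - 1)
    grad[idx, idx] = 1.0
    grad[idx + 1, idx] = -1.0
    return np.linalg.inv(np.eye(p) + alpha * grad @ grad.T)


def predict(d_obs, m_init, K_list, forward, alpha_of_rms):
    """Apply the learned K_k with (18); stop once the relative data misfit stops decreasing."""
    m = np.asarray(m_init, dtype=float)
    d_syn = forward(m)
    rms = np.linalg.norm(d_obs - d_syn) / np.linalg.norm(d_obs)
    history = [rms]
    for K in K_list:
        G = smoothing_operator(m.size, alpha_of_rms(rms))
        m_new = G @ (K @ (d_obs - d_syn) + m)      # update (18)
        d_new = forward(m_new)                     # one forward run per step
        rms_new = np.linalg.norm(d_obs - d_new) / np.linalg.norm(d_obs)
        if rms_new >= rms:                         # misfit no longer decreases: keep previous model
            break
        m, d_syn, rms = m_new, d_new, rms_new
        history.append(rms)
    return m, history


# Toy check (hypothetical): for a linear forward operator A, K = A^-1 is the exact direction.
rng = np.random.default_rng(1)
P = 5
A = np.eye(P) + 0.05 * rng.standard_normal((P, P))
forward = lambda m: A @ m
m_true = rng.uniform(1.0, 10.0, size=P)
m_init = np.full(P, 5.0)
K_list = [np.linalg.inv(A)] * 3
m_hat, history = predict(forward(m_true), m_init, K_list, forward, lambda rms: 0.0)
```

Passing `lambda rms: 0.1 * rms` as `alpha_of_rms` would mirror the setting of $\alpha$ reported in Section IV-A. The early exit also covers the divergence case described above: if the first update already increases the misfit, the initial model is returned unchanged.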


Compared with the Gauss–Newton method, the training and testing scheme endows the SDM with the ability to skip the local minima that are not included in the training set during the prediction stage. A detailed explanation can be found in [30].

C. Training Model Set Design

The SDM possesses great flexibility and capability of incorporating prior information, which is achieved by extracting from the training models certain pattern features that are difficult to describe or formulate in a rigorous form. When the prior information is reasonable enough, including more specific pattern features in the training models leads to more accurate predictions. On the other hand, the generalization ability is limited in this circumstance owing to the relatively fixed pattern of the training models.

The generalization ability of the SDM to interpolate new structures that are not included in the training model set is achieved by weighting the descent directions with the data residual in the online prediction stage. To further enhance the generalization ability as well as the robustness of the online prediction, more general models rather than specific ones are recommended when setting up the training model set.

Moreover, besides adopting more general models, the diversity of the training set can be built up by incorporating models of different patterns. For example, suppose we already know that the true model has five layers at most of the logging points but not at a small portion of them. Then, we can add three-layer or four-layer models to a training model set that primarily contains five-layer models. The descent direction learned from this mixed training model set can reflect multiple patterns, which means that by tuning the sizes of the different patterns in the training model set, we can adjust the likelihood of the prior knowledge and hence obtain more flexibility to incorporate various types of prior information. This is also helpful if accurate prior information is unavailable.

D. Real-Time Feedback and Update

Considering the practice of the LWD procedure, real-time feedback from downhole can be utilized to update the descent direction $K$. The resistivity along the trajectory is assumed to be known after drilling through a certain region. Therefore, whenever a new layer is encountered, $K$ can be updated according to the real-time feedback. In practice, the situation could be more complicated, and the inversion result can be a good reference for setting the parameters. With the new and accurate resistivity information, two strategies can be applied. In the first, a large number of new training models is resampled according to the updated prior information, and a newly computed $K_{\mathrm{replace}}$ is applied to invert the next logging point. We call this the $K$ replacing method. In the second, named the $K$ updating method, only a small number of models is regenerated according to the updated prior information. Then, a descent direction $K_{\mathrm{revise}}$ can be computed quickly and used to update the original $K$, which can be described as

$$K_{\mathrm{update}} = \gamma K + (1 - \gamma) K_{\mathrm{revise}} \tag{19}$$

where $\gamma$ is the weighting coefficient and $0 < \gamma < 1$. We suggest keeping $\gamma$ close to 1 so that $K$ is only slightly revised.

Note that the $K$ replacing method can theoretically give a more accurate prediction, because $K_{\mathrm{replace}}$ extracts the more specific pattern updated by the new and accurate feedback. However, it is time-consuming to retrain a new $K_{\mathrm{replace}}$ whenever new feedback is received. In addition, there is a risk that the retrained descent directions have a limited scope of application if the new feedback has overly strong local properties. In contrast, slightly revising the original $K$ may not give as accurate a prediction as the $K$ replacing method, but it takes little time to compute $K_{\mathrm{update}}$, which can better reflect current pattern features while retaining the generalization ability of the original descent directions. That is also why we suggest keeping $\gamma$ close to 1. Hence, the $K$ updating method is more desirable considering the overall benefit.

In the $K$ updating method, supposing that the number of resampled training models for one update is $N'$ ($N'$ is small, usually smaller than 20), we assume that the resistivities of the layers above the newly encountered layer are already known, so we fix them and resample the updating training model set. Then, we can acquire $K_{\mathrm{revise}}$ by minimizing (5) based on this resampled training model set. A specific example is illustrated in Section IV-A3. The Levenberg–Marquardt algorithm is used to stabilize the solution when $\Delta D_k$ becomes singular, so

$$K_{k\_\mathrm{revise}}^T = \left( \Delta D_k^T \Delta D_k + \mu I \right)^{-1} \Delta D_k^T \Delta M_k \tag{20}$$

where $\mu$ is the damping factor, set to be proportional to the maximal eigenvalue of $\Delta D_k$, and $\Delta D_k$ and $\Delta M_k$ are defined in (11). Finally, $K_{\mathrm{revise}}$ can be obtained by assembling the series of $K_{k\_\mathrm{revise}}^T$.

IV. NUMERICAL EXAMPLES

In this section, we first validate the effectiveness of inversion with the SDM using synthetic data and compare it with traditional Occam’s inversion [40]; the improvement obtained by adopting the $K$ updating strategy is also shown. Then, the generalization ability of the SDM is demonstrated by training with a model set consisting of three-layer models and testing with four- or five-layer models. In addition, the boundary of applicability is explored to an extent. Next, the efficiency comparison between the SDM and Occam’s inversion is given. Finally, the robustness of the SDM is validated by sensitivity analysis. Note that we add random noise to the synthetic measurements, in which the noise level of the attenuation is 0.08 dB and that of the phase shift is 0.5°.

All examples discussed in this article are based on 1-D modeling, i.e., the transmitters are treated as point magnetic dipole sources, and the formations are assumed to be layered media. As analytical solutions are available in this circumstance [41], generating one training sample takes less than 1 ms on average on a four-core i7 desktop with parallel operation.

A. Validation of the Algorithm

With the pixel-based method, the parameters we need to invert are only resistivities. We discretized the resistivity


TABLE I
SETUP FOR PIXEL-BASED INVERSION IN THE NUMERICAL EXPERIMENT (THREE-LAYER CASE)

between ±100 ft vertically with respect to the LWD tool using nonuniform grids. The pixel thickness is 3 ft when the distance from the pixel to the tool is less than 30 ft, while for the remaining pixels, the thickness increases by a ratio of 1.05. Assuming that, from prior knowledge, we already know the structure information as well as a rough estimate of the resistivities, we randomly sampled 1000 models as $m^*_{\mathrm{train}}$ to learn the descent directions. Then, another set of random models $m^*_{\mathrm{test}}$ was generated to test the inversion performance. Note that the initial parameters should be the same in the training and testing stages. Here, we adopted two scenarios, three-layer and five-layer model sets, to validate our algorithm. We set $\lambda$ and $\alpha$ to one-tenth of the relative data misfit in both scenarios.
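
The nonuniform grid described in this paragraph could be generated along the following lines. The text does not spell out every detail, so this sketch makes several assumptions: the grid is symmetric about the tool at depth 0, the first cell beyond 30 ft is already enlarged by the factor 1.05, and the outermost cell is truncated at 100 ft. The function name and its parameters are hypothetical.

```python
def pixel_edges(half_extent=100.0, base=3.0, uniform_to=30.0, ratio=1.05):
    """Return ascending pixel boundaries (ft) relative to the tool at 0."""
    edges = [0.0]
    thickness = base
    while edges[-1] < half_extent:
        if edges[-1] >= uniform_to:
            thickness *= ratio          # geometric growth beyond the uniform zone
        # assumption: clip the outermost cell so the grid ends exactly at half_extent
        edges.append(min(edges[-1] + thickness, half_extent))
    upper = edges[1:]
    # assumption: mirror the upper half to obtain a grid symmetric about the tool
    return [-e for e in reversed(upper)] + edges


edges = pixel_edges()
thicknesses = [b - a for a, b in zip(edges, edges[1:])]
print(len(thicknesses), "pixels between", edges[0], "and", edges[-1], "ft")
```

Each entry of `thicknesses` corresponds to one resistivity unknown in the pixel-based inversion, so the length of this list gives the size of the model vector under these assumptions.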
1) Three-Layer Examples: Training with the model set consisting of three-layer models and testing with random models and a continuous earth model.

In this example, the training and testing setup is listed in Table I, where $r_1$, $r_2$, and $r_3$ are the resistivities of the layers from top to bottom, $z$ is the position of the upper boundary, and $t$ is the thickness of the middle layer. In the training stage, we set the maximum number of iterations to 15. In each iteration, the relative model misfit $\mathrm{rms}_M$ and data misfit $\mathrm{rms}_D$ were computed according to (15) and (16), respectively, and are shown in Fig. 3(a) and (b), from which we can see that both the model and data misfits converge within 15 iterations.

Fig. 3. Training relative misfits of a three-layer case. (a) Relative model misfit. (b) Relative data misfit.

During the online prediction, as described previously, the iteration stops when it reaches the maximum step or when the relative data misfit is less than the preset level. The inverted models of the SDM are plotted in Fig. 4 with blue solid lines. Compared with the true models drawn in red dotted lines, the result is promising: the reconstructed models reflect the structure of the true ones well in terms of the thickness of the conductive layer, the positions of the boundaries, and the values of the resistivities.

Meanwhile, as a comparison, we inverted the same synthetic measurements with Occam’s inversion and plotted the reconstructed models in Fig. 4 with green dashed lines. In Occam’s inversion, the Jacobian matrix is needed and is computed with the finite-difference approach. It is reasonable to say that these results reflect the changes in the subsurface structure. However, the layer boundaries, especially the upper boundaries, are not as clear as those inverted by the SDM, and the values of the inverted resistivities are also less accurate. One reason is that we added noise to the data. The other is mainly that the prior information that can be incorporated by only adding regularization in Occam’s inversion is limited, while the learned descent directions in the SDM carry more boundary and resistivity information from the training model set to reconstruct models. Furthermore, as shown in Fig. 4(b), the second boundary of the true model is close to the logging tool, which increases the difficulty of inversion. The entire part of the resistivities reconstructed by Occam’s inversion below the second boundary fluctuates heavily, while the SDM handles this scenario much better. This is partially because of the prior information, which constrains the flatness of each layer, and partially due to the smoothness regularization added during both the training and prediction stages.

We then tested on a continuous earth model with fixed true vertical depth (TVD). This continuous earth model is a pseudo-2-D model stitched from 1-D models at 80 logging points. It is a three-layer model with a conductive layer embedded in a less conductive background, as shown in Fig. 5(a), where we also plot the nonuniform meshes used in the experiment. First, we learned the descent directions from the training model set constructed according to prior information (also shown in Table I) and then applied these descent directions to the synthetic data generated from Fig. 5(a). This explains why the SDM can achieve high efficiency: within the same logging survey, training needs to be done only once. The inverted model by the SDM is shown in Fig. 5(b), and the corresponding reconstructed models with Occam’s


Fig. 4. Inversion of three-layer random models. (a)–(d) Four different profiles.

inversion are shown in Fig. 5(c). For this continuous model,


the position of the upper boundary in the left-hand-side part
is hard to reconstruct, because the middle conductive layer
has a shielding effect on the upper less conductive one. It is
difficult to formulate this kind of prior information or describe
it in a rigorous form to perform as a regularization function.
That explains why Occam’s inversion does not get a good
performance in this area. However, by properly constructing
the training models, SDM inversion can embody this prior
knowledge and achieve a higher resolution than Occam’s
inversion.
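The two-stage procedure used in this example (learning descent directions offline from models sampled according to the prior, then updating every logging point online with the stored directions and the data residuals) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: `forward` is a toy, mildly nonlinear operator standing in for the LWD forward solver, the directions are fitted by ridge-regularized least squares, the relative misfit is taken as a ratio of Euclidean norms, and the sizes, prior range, and number of steps are placeholders rather than the settings of Table I.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MODEL, N_DATA = 3, 8                   # toy sizes, not the paper's
A = rng.normal(size=(N_DATA, N_MODEL))   # toy sensitivity matrix (assumed)


def forward(models):
    """Toy, mildly nonlinear forward operator standing in for the LWD solver."""
    linear = models @ A.T
    return linear + 0.1 * np.tanh(linear)


def relative_misfit(estimate, reference):
    """Relative misfit ||estimate - reference|| / ||reference|| (assumed definition)."""
    return np.linalg.norm(estimate - reference) / np.linalg.norm(reference)


def train_sdm(true_models, m0, n_steps=15, ridge=1e-6):
    """Offline stage: learn one descent matrix per step from the training set."""
    data = forward(true_models)
    current = np.tile(m0, (len(true_models), 1))
    directions = []
    misfits = [relative_misfit(current, true_models)]
    for _ in range(n_steps):
        d_res = data - forward(current)   # data residuals, (n_train, N_DATA)
        m_res = true_models - current     # model residuals, (n_train, N_MODEL)
        # Ridge least squares: min_K ||m_res - d_res K||^2 + ridge * ||K||^2
        gram = d_res.T @ d_res + ridge * np.eye(N_DATA)
        K = np.linalg.solve(gram, d_res.T @ m_res)
        directions.append(K)
        current = current + d_res @ K
        misfits.append(relative_misfit(current, true_models))
    return directions, misfits


def predict_sdm(observed, m0, directions):
    """Online stage: update with stored directions and data residuals only."""
    current = np.tile(m0, (len(observed), 1))
    for K in directions:
        current = current + (observed - forward(current)) @ K
    return current


# Training set sampled from an assumed prior range; the 80 test models stand in
# for the 80 logging points of the continuous model.
train_models = rng.uniform(0.0, 2.0, size=(500, N_MODEL))
m0 = train_models.mean(axis=0)
directions, train_misfits = train_sdm(train_models, m0)
test_models = rng.uniform(0.0, 2.0, size=(80, N_MODEL))
reconstructed = predict_sdm(forward(test_models), m0, directions)
print("training relative model misfit, first and last:", train_misfits[0], train_misfits[-1])
print("test relative model misfit:", relative_misfit(reconstructed, test_models))
```

In this sketch no Jacobian is evaluated during `predict_sdm`; each step costs one forward evaluation and one matrix product per logging point, which is the source of the online speedup discussed in the text. The training stage is run once and its directions are reused for all 80 points.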
2) Five-Layer Examples: Training with the model set con-
sisting of five-layer models and testing with random models
and continuous earth model.
In this example, the corresponding training and testing setup
is listed in Table II, where r, z, and t represent the resistivities, the position of the first boundary, and the thickness of each layer, respectively. The maximum step is set as 15, which is reasonable according to the training relative model misfit rms_M and relative data misfit rms_D shown in Fig. 6.

Fig. 5. Inversion results of a three-layer continuous model by different methods. (a) True model. (b) Reconstructed model by SDM. (c) Reconstructed model by Occam’s inversion.

The inverted results are illustrated in Fig. 7, where the red dotted, blue solid, and green dashed lines represent the true model, the model reconstructed by SDM, and the model reconstructed by Occam’s inversion, respectively. From Fig. 7, we can conclude that the structural features are better captured by SDM inversion, in which the boundaries as well as the values of the resistivities agree well with the true models, especially when two boundaries are close to each other, while models reconstructed by Occam’s inversion still have vague or even indistinguishable boundaries. This example further validates not only the effectiveness of SDM inversion but also its advantages compared with Occam’s inversion.

Here, we also tested on a five-layer continuous synthetic model shown in Fig. 8(a) (the nonuniform meshes are drawn similarly), in which there are 81 logging points in total. It is also a pseudo-2-D model stitched from 1-D models, and the training model set is sampled according to Table II. Fig. 8(b) and (c) shows the models reconstructed by SDM and Occam’s inversion. From these results, both SDM and Occam’s inversion perform well in the first half of the logging points


and provide relatively accurate inversions. In the middle parts, SDM can still invert the top layer with a clear boundary, but Occam’s inversion cannot, because of the shielding effect explained before. In the last several logging points, with the logging tool going deeper, neither SDM nor Occam’s inversion performs well. For SDM, this is mainly because the pattern of the models in this part is different from that in the training model set, which means the prior information is not accurate enough in this circumstance. It is reasonable that the SDM does not give results as good as when the prior information is accurate. For Occam’s inversion, it will be difficult to improve the performance of the models in the right-hand-side part of Fig. 8(c). However, for SDM, we do have strategies to improve the reconstructed models, as aforementioned.

TABLE II
SETTINGS FOR PIXEL-BASED INVERSION IN THE NUMERICAL EXPERIMENT (FIVE-LAYER CASE)

Fig. 6. Training relative misfits of five-layer case. (a) Relative model misfit. (b) Relative data misfit.

Fig. 7. Inversion of five-layer random models. (a)–(d) Four different profiles.

3) Update of K With Real-Time Feedback Information: First, as illustrated in Section III-C, we added four-layer models to the training model set to increase its diversity, which would enhance the learned descent directions’ ability to interpolate multiple patterns of structures that are very different from the training set or that change over a large range. We set the ratio of five-layer models to four-layer models as 1:1, totaling 1000 samples. Fig. 9(a) shows the inversion result of employing this strategy. We can see that the four-layer part can be distinguished even though the boundary is not clear enough. This result validates the effectiveness of the training model set design strategy.

The second strategy is to update the descent directions with real-time feedback from downhole, as described in Section III-D. Here, we set γ as 0.9 and resampled 16 training models (one half five-layer models and the other half four-layer models) to learn $K_{k\_revise}^{T}$ in (20). To be specific, as shown in Fig. 9(a), when the logging tool enters the fourth layer, we assume that we already know that the resistivities of the first three layers are 90, 2, and 80 Ω·m. Therefore, we fix parameters r1, r2, and r3 in Table II as 90, 2, and 80, respectively, and resample the updating training model set. Then, $K_{update}$ can be computed according to (19) and used for the following inversion. We can update the directions whenever we know there is a new layer.
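The feedback update described above can be sketched as follows. Equations (19) and (20) are not reproduced in this section, so the form used here is an assumption: the revised direction is fitted by ridge least squares on the resampled set, and the updated direction is taken as a convex blend that puts weight γ on the offline direction and 1 − γ on the revision. The helper names, the toy linear operator `G`, the sampling range of the free resistivities, and the way four-layer models are imitated are all hypothetical placeholders.

```python
import numpy as np


def resample_models(n_models, known, free_low, free_high, rng):
    """Sample five-parameter resistivity models with the known layers held fixed."""
    free = rng.uniform(free_low, free_high, size=(n_models, len(free_low)))
    models = np.hstack([np.tile(known, (n_models, 1)), free])
    # Second half imitates four-layer models by merging the last two layers
    # (a simplification; the actual parameterization follows Table II).
    models[n_models // 2:, -1] = models[n_models // 2:, -2]
    return models


def learn_direction(model_residuals, data_residuals, ridge=1e-6):
    """Ridge least-squares fit of a descent matrix on the resampled set."""
    gram = data_residuals.T @ data_residuals + ridge * np.eye(data_residuals.shape[1])
    return np.linalg.solve(gram, data_residuals.T @ model_residuals)


def blend_directions(K_old, K_revise, gamma=0.9):
    """Assumed update: weight gamma on the offline K, 1 - gamma on the revision."""
    return gamma * K_old + (1.0 - gamma) * K_revise


rng = np.random.default_rng(1)
known = np.array([90.0, 2.0, 80.0])            # r1, r2, r3 as stated in the text
models = resample_models(16, known, [1.0, 1.0], [100.0, 100.0], rng)

# Toy linear forward operator and initial model, both placeholders.
G = rng.normal(size=(6, 5))
m0 = np.array([90.0, 2.0, 80.0, 50.0, 50.0])
data_residuals = (models - m0) @ G.T
K_revise = learn_direction(models - m0, data_residuals)
K_old = np.zeros_like(K_revise)                # placeholder for the offline K
K_update = blend_directions(K_old, K_revise, gamma=0.9)
```

Because only 16 models are resampled and a single small linear system is solved, this step is cheap compared with offline training, which is consistent with the short update time reported in the efficiency comparison.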


The improved reconstructed model is shown in Fig. 9(b), in which the boundaries in the right-hand-side part are clearer than those in Fig. 9(a), especially in the four-layer region.

Fig. 8. Inversion results of the five-layer continuous model from different methods. (a) True model. (b) Reconstructed model from SDM. (c) Reconstructed model from Occam’s inversion.

Fig. 9. Improved inversion results of five-layer continuous model. (a) Reconstructed model by training model set design. (b) Reconstructed model by updating K with real-time feedback.

B. Generalization Ability Analysis

To test the generalization ability of the SDM, we use the descent directions learned from a training model set consisting of only three-layer models to invert the synthetic data generated from four- or five-layer models. This scenario also demonstrates the application of the SDM where the prior information is absent or limited. Fig. 10 shows the reconstructed models with blue solid lines and the true models with red dotted lines, from which we can see that the results reasonably show the layers, although there is some inaccuracy in the resistivities. According to the previous experience of training model set design, it is plausible to add other multiple-layer models into the training set to improve the reconstructed models even though we do not know the prior information. For now, further research is needed to explore the boundary of the generalization ability in this inverse problem.

Moreover, we tested the situation where the prior is not accurate enough, or the pattern of the test sample is quite different from that of the training samples, to explore the applicative boundary of the SDM. The results are shown in Fig. 11. The examples we demonstrated in Section IV-A1 follow the pattern high resistivity-low resistivity-high resistivity. The trained K can extract the feature of this pattern and give us an accurate inversion when the testing pattern is similar. However, when we apply the same K to invert the measurements generated from the true model shown in Fig. 11(a), whose pattern is low resistivity-high resistivity-low resistivity, exactly opposite to the training samples, it fails [shown as the blue solid line in Fig. 11(a)]. The corresponding relative data misfit, shown in Fig. 11(c), diverges in the beginning iterations, as mentioned previously. However, it does not mean that SDM cannot deal


with the latter pattern. Instead, we resample a training set with the pattern low resistivity-high resistivity-low resistivity, train a new K based on this new model set, and apply it to invert the measurements generated from the true model shown in Fig. 11(a). We obtain a fairly accurate inversion, shown in Fig. 11(b).

Fig. 10. Generalization ability demonstration: reconstructed four- or five-layer models with training on the three-layer model set. (a)–(d) Four different profiles.

Fig. 11. Demonstration of the possible applicative boundary of the SDM. (a) Failed inversion example when the prior is quite inaccurate or the testing pattern is opposite to the training pattern. (b) Successful inversion for the same measurements generated from true model in (a) with accurate prior. (c) Diverged relative model misfit at the beginning iterations corresponding to inverting model in (a).

In general, the SDM does not seem to be able to handle the situation where the testing sample is completely different from the training model set. However, the generalization ability enables the SDM to interpolate new structures, such as a different number of layers, that are not included in the training model set. Therefore, although SDM has generalization ability to an extent, an accurate prior is essentially the key to achieving a successful inversion.

C. Efficiency Comparison With Occam’s Inversion

The online prediction time of the SDM and the inversion time of Occam’s inversion are listed in Table III. We can see that for the same scenario, the SDM spends much less time reconstructing a continuous model than Occam’s inversion.

TABLE III
TIME COMPARISON BETWEEN SDM AND OCCAM’S INVERSION WITH RESPECT TO THREE- AND FIVE-LAYER CONTINUOUS MODELS

For SDM, the offline training stage takes most of the time, but once the training is finished in advance, the inversion is much faster, because no derivative computation is needed at all during the online prediction stage when the learned descent directions are used. Occam’s inversion may be faster if we compute the Jacobian matrix using the adjoint method, but it is still slower than directly retrieving the descent directions from memory, as SDM does. The online prediction time of the SDM shown in Table III corresponds to the inversion result in Figs. 8(b) or 9(a) without updating the descent direction $K_{update}$. However, we did monitor the time of computing $K_{update}$, which is about 3 s when we set the number of resampled models as 16. Therefore, the total time of the SDM with updating $K_{update}$ is still much less than that of Occam’s inversion.

SDM accelerates the inversion much like other machine-learning-based methods do, i.e., by sacrificing offline time for online computational speed. It is more suitable when the amount of data to be inverted is large, because time-consuming training is done only once and then applied to


the whole data set. However, the traditional gradient-descent method may outperform SDM in terms of time consumption when there are only a few data to be inverted. Nevertheless, we can pretrain a series of Ks for basic structures and store them for online prediction if we have a better understanding of the generalization ability of the SDM, which will enhance the competitiveness of the SDM even when the amount of data is small.

D. Sensitivity Analysis

To validate the robustness of the SDM, we increased the noise level of the synthetic data during the prediction stage. The noise levels of attenuation and phase shift were increased from (0.08 dB, 0.5°) to (0.12 dB, 0.7°) (called level 1) and (0.25 dB, 1.5°) (called level 2), respectively, where the new noise levels are already very strong in terms of logging measurements. Here, we used the measurements generated from the three-layer continuous model described in the previous example and added noise at levels 1 and 2 to the synthetic measurements to perform the inversion. Fig. 12 shows the corresponding reconstructed models, where Fig. 12(a) and (b) corresponds to noise levels 1 and 2, respectively. As shown in Fig. 12, clear boundaries can still be distinguished in most of the regions even though the noise is stronger. More importantly, the left-hand-side parts of the upper boundaries are relatively well reconstructed. In brief, the experiment demonstrates the robustness of the SDM even when the noise is strong.

Fig. 12. Inversion of synthetic data with different noise levels. (a) Noise level is level 1. (b) Noise level is level 2.

V. CONCLUSION

In this article, we present an SDM to solve the LWD inverse problems. By combining the advantages of conventional methods with a machine learning scheme, the SDM achieves great flexibility in incorporating prior information, the capability of skipping local minima, and accelerated convergence.

The SDM is divided into two stages: offline training and online prediction. It iteratively learns a set of descent directions in the offline training process, where the training model set is generated in advance according to the prior information. During the online prediction stage, we adopt the same initial model as that in the training stage and apply the learned descent directions and data residuals directly to update the models. The direct application of the learned descent directions avoids the computation of derivatives in the online prediction stage, which can accelerate the convergence. Meanwhile, the adoption of data residuals can provide guidance from physical forward modeling, endowing SDM with the generalization ability.

Numerical examples demonstrate that SDM-based inversion can achieve higher resolution and faster convergence than conventional Occam’s inversion.

REFERENCES
[1] Q. Li, D. Omeragic, L. Chou, L. Yang, and K. Duong, “New directional electromagnetic tool for proactive geosteering and accurate formation evaluation while drilling,” in Proc. SPWLA, New Orleans, LA, USA, 2005, pp. 1–16.
[2] D. Li, D. R. Wilton, D. R. Jackson, H. Wang, and J. Chen, “Accelerated computation of triaxial induction tool response for arbitrarily deviated wells in planar-stratified transversely isotropic formations,” IEEE Geosci. Remote Sens. Lett., vol. 15, no. 6, pp. 902–906, Jun. 2018.
[3] R. Beer et al., “Geosteering and/or reservoir characterization the prowess of new-generation LWD tools,” in Proc. 51st Annu. Logging Symp., Perth, WA, Australia, 2010, pp. 1–5.
[4] T. M. Habashy and A. Abubakar, “A general framework for constraint minimization for the inversion of electromagnetic measurements,” Prog. Electromagn. Res., vol. 46, pp. 265–312, 2004.
[5] O. Arikan, “Regularized inversion of a two-dimensional integral equation with applications in borehole induction measurements,” Radio Sci., vol. 29, no. 3, pp. 519–538, May 1994.
[6] A. N. Tikhonov and V. Y. Arsenin, Solutions of Ill-Posed Problems. Washington, DC, USA: Winston, 1977.
[7] K. Levenberg, “A method for the solution of certain non-linear problems in least squares,” Quart. Appl. Math., vol. 2, no. 2, pp. 164–168, Jul. 1944.
[8] D. W. Marquardt, “An algorithm for least-squares estimation of nonlinear parameters,” J. Soc. Ind. Appl. Math., vol. 11, no. 2, pp. 431–441, Jun. 1963.
[9] A. Abubakar, T. M. Habashy, M. Li, and J. Liu, “Inversion algorithms for large-scale geophysical electromagnetic measurements,” Inverse Problems, vol. 25, no. 12, Dec. 2009, Art. no. 123012.
[10] B. A. Berg, Markov Chain Monte Carlo Simulations and Their Statistical Analysis. Singapore: World Scientific, 2004.
[11] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2005.
[12] Q. Shen, X. Wu, J. Chen, Z. Han, and Y. Huang, “Solving geosteering inverse problems by stochastic hybrid Monte Carlo method,” J. Petroleum Sci. Eng., vol. 161, pp. 9–16, Feb. 2018.


[13] Q. Shen, J. Chen, and H. Wang, “Data-driven interpretation of ultradeep azimuthal propagation resistivity measurements: Transdimensional stochastic inversion and uncertainty quantification,” Petrophysics SPWLA J. Formation Eval. Reservoir Description, vol. 59, no. 6, pp. 786–798, 2018.
[14] H. Lu, Q. Shen, J. Chen, X. Wu, and X. Fu, “Parallel multiple-chain DRAM MCMC for large-scale geosteering inversion and uncertainty quantification,” J. Petroleum Sci. Eng., vol. 174, pp. 189–200, Mar. 2019.
[15] A. Lucas, M. Iliadis, R. Molina, and A. K. Katsaggelos, “Using deep neural networks for inverse problems in imaging: Beyond analytical methods,” IEEE Signal Process. Mag., vol. 35, no. 1, pp. 20–36, Jan. 2018.
[16] I. Malkiel, M. Mrejen, A. Nagler, U. Arieli, L. Wolf, and H. Suchowski, “Deep learning for the design of nano-photonic structures,” in Proc. IEEE Int. Conf. Comput. Photography (ICCP), May 2018, pp. 1–14.
[17] Z. Wei and X. Chen, “Deep-learning schemes for full-wave nonlinear inverse scattering problems,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 4, pp. 1849–1860, Apr. 2019.
[18] J. Sun, Z. Niu, K. A. Innanen, J. Li, and D. O. Trad, “A theory-guided deep learning formulation of seismic waveform inversion,” in Proc. SEG Tech. Program Expanded Abstr., Aug. 2019, pp. 2343–2347.
[19] W. Wang, F. Yang, and J. Ma, “Velocity model building with a modified fully convolutional network,” in Proc. SEG Tech. Program Expanded Abstr., Aug. 2018, pp. 2086–2090.
[20] Y. Xu et al., “Borehole resistivity measurement modeling using machine-learning techniques,” Petrophysics, vol. 59, no. 6, pp. 778–785, 2018.
[21] W. Lewis and D. Vigh, “Deep learning prior models from seismic images for full-waveform inversion,” in Proc. SEG Tech. Program Expanded Abstracts, 2017, pp. 1512–1517.
[22] J. Adler and O. Öktem, “Solving ill-posed inverse problems using iterative deep neural networks,” Inverse Problems, vol. 33, no. 12, Dec. 2017, Art. no. 124007.
[23] L. Huang, M. Polanco, and T. E. Clee, “Initial experiments on improving seismic data inversion with deep learning,” in Proc. New York Sci. Data Summit (NYSDS), Aug. 2018, pp. 1–3.
[24] Z. Wei, D. Liu, and X. Chen, “Dominant-current deep learning scheme for electrical impedance tomography,” IEEE Trans. Biomed. Eng., vol. 66, no. 9, pp. 2546–2555, Sep. 2019.
[25] Z. Wei and X. Chen, “Physics-inspired convolutional neural network for solving full-wave inverse scattering problems,” IEEE Trans. Antennas Propag., vol. 67, no. 9, pp. 6138–6148, Sep. 2019.
[26] R. Zhang, Y. Liu, and H. Sun, “Physics-guided convolutional neural network (PhyCNN) for data-driven seismic response modeling,” 2019, arXiv:1909.08118. [Online]. Available: http://arxiv.org/abs/1909.08118
[27] Y. Jin, X. Wu, J. Chen, and Y. Huang, “Using a physics-driven deep neural network to solve inverse problems for LWD azimuthal resistivity measurements,” in Proc. SPWLA, The Woodlands, Texas, USA, 2019, p. 13.
[28] A. Johnston, R. Garg, G. Carneiro, I. Reid, and A. van den Hengel, “Scaling CNNs for high resolution volumetric reconstruction from a single image,” in Proc. IEEE Int. Conf. Comput. Vis. Workshops (ICCVW), Oct. 2017, pp. 930–939.
[29] X. Xiong and F. De la Torre, “Supervised descent method and its applications to face alignment,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 532–539.
[30] R. Guo, M. Li, G. Fang, F. Yang, S. Xu, and A. Abubakar, “Application of supervised descent method to transient electromagnetic data inversion,” Geophysics, vol. 84, no. 4, pp. E225–E237, Jul. 2019.
[31] R. Guo, M. Li, F. Yang, S. Xu, and A. Abubakar, “First arrival traveltime tomography using supervised descent learning technique,” Inverse Problems, vol. 35, no. 10, Oct. 2019, Art. no. 105008.
[32] M. S. Bittar, “Electromagnetic wave resistivity tool having a tilted antenna for geosteering within a desired payzone,” U.S. Patent 6 476 609, Nov. 5, 2002.
[33] D. Omeragic et al., “Sensitivities of directional electromagnetic measurements for well placement and formation evaluation while drilling,” in Proc. SEG Tech. Program Expanded Abstr., Jan. 2006, pp. 1630–1634.
[34] S. Li, J. Chen, and T. L. J. Binford, “Using new LWD measurements to evaluate formation resistivity anisotropy at any dip angle,” in Proc. SPWLA, Abu Dhabi, United Arab Emirates, 2014, p. 16.
[35] J. Seydoux et al., “Full 3D deep directional resistivity measurements optimize well placement and provide reservoir-scale imaging while drilling,” in Proc. SPWLA, Abu Dhabi, United Arab Emirates, 2014, p. 14.
[36] U. Ezioba and J.-M. Denichou, “Mapping-While-Drilling system improves well placement and field development,” J. Petroleum Technol., vol. 66, no. 8, pp. 32–35, Aug. 2014.
[37] H. Wang, Q. Shen, and J. Chen, “Sensitivity study and uncertainty quantification of azimuthal propagation resistivity measurements,” in Proc. SPWLA, London, U.K., 2018, p. 15.
[38] H.-H. Wu, C. Golla, T. Parker, N. Clegg, and L. Monteilhet, “A new ultra-deep azimuthal electromagnetic LWD sensor for reservoir insight,” in Proc. SPWLA, London, U.K., 2018, p. 14.
[39] P. F. Rodney and M. M. Wisler, “Electromagnetic wave resistivity MWD tool,” SPE Drilling Eng., vol. 1, no. 5, pp. 337–346, Oct. 1986.
[40] S. C. Constable, R. L. Parker, and C. G. Constable, “Occam’s inversion: A practical algorithm for generating smooth models from electromagnetic sounding data,” Geophysics, vol. 52, no. 3, pp. 289–300, Mar. 1987.
[41] L. Zhong, J. Li, A. Bhardwaj, L. Shen, and R. Liu, “Computation of triaxial induction logging tools in layered anisotropic dipping formations,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 4, pp. 1148–1163, May 2008.

Yanyan Hu received the B.S. degree in electronic engineering and the M.S. degree in signal and information processing from Xidian University, Xi’an, China, in 2015 and 2018, respectively. She is pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX, USA.
Her research interests include scientific machine learning and deep learning for forward and inverse modeling.

Rui Guo received the B.S. degree in telecommunication engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 2014, and the M.S. degree in electronic engineering from the Institute of Electronics (IECAS), Chinese Academy of Sciences, Beijing, in 2017. He is pursuing the Ph.D. degree in electronic engineering with Tsinghua University, Beijing.
Since 2017, he has been a Research Student with the Beijing National Research Center for Information Science and Technology, Tsinghua University. He was a Summer Intern in software engineering with the Beijing Geoscience Center (BGC), Schlumberger, Beijing, in 2018, and an Intern in data science with the Schlumberger Digital Foundation Center (SDFC), Houston, TX, USA, in 2019. His thesis research is on the theory and algorithms of machine-learning-based inversion especially in magnetotellurics (MT), microwave imaging, and seismic tomography.
Mr. Guo received the Outstanding Graduates Award from IECAS in 2017 and the Best Student Paper Award at the PhotonIcs and Electromagnetics Research Symposium (PIERS) in 2019.

Yuchen Jin received the B.S. degree in electronic information from the Huazhong University of Science and Technology, Wuhan, China, in 2017. He is pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX, USA, co-advised by Dr. Jiefu Chen and Dr. Xuqing Wu.
His research interests include machine learning, inverse problem, optimization, seismic processing, and signal processing.


Xuqing Wu received the Ph.D. degree in computer science from the University of Houston, Houston, TX, USA, in 2011.
In 2015, he was a Data Scientist and Software Engineer of Energy and IT industry. He is an Assistant Professor of computer information systems with the College of Technology, University of Houston. His research interests include scientific machine learning, probabilistic modeling, and subsurface sensing.

Maokun Li received the B.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 2002, and the M.S. and Ph.D. degrees in electrical engineering from the University of Illinois at Urbana–Champaign, Champaign, IL, USA, in 2004 and 2007, respectively.
After graduation, he worked as a Senior Research Scientist at Schlumberger-Doll Research, Cambridge, MA, USA. In 2014, he joined the Department of Electronic Engineering, Tsinghua University. He has published one book chapter, 50 journal articles, 120 conference proceedings, and 3 patent applications. His research interests include fast algorithms in computational electromagnetics and their applications in antenna modeling, electromagnetic compatibility analysis, and inverse problems.
Dr. Li was also among the recipients of the China National 1000 Plan in 2014 and the 2017 IEEE Ulrich L. Rohde Innovative Conference Paper Award. He also serves as an Associate Editor for the IEEE JOURNAL ON MULTISCALE AND MULTIPHYSICS COMPUTATIONAL TECHNIQUES and Applied Computational Electromagnetic Society Journal and a Guest Editor for the Special Issue on Electromagnetic Inverse Problems for Sensing and Imaging in the IEEE Antennas and Propagation Magazine.

Aria Abubakar was born in Bandung, Indonesia, on August 21, 1974. He received the M.Sc. degree (cum laude) in electrical engineering and the Ph.D. degree (cum laude) in technical sciences from the Delft University of Technology, Delft, The Netherlands, in 1997 and 2000, respectively.
In 1996, he was a Research Student with Shell Research B.V., Amsterdam, The Netherlands. He was a Summer Intern with Schlumberger-Doll Research, Ridgefield, CT, USA, in 1999. From September 2000 to February 2003, he was with the Laboratory of Electromagnetic Research and Section of Applied Geophysics, Delft University of Technology. He is a Head of Data Science and Scientific Advisor with Schlumberger. His main research activity includes solving forward and inverse problems in acoustics, electromagnetics, and elastodynamics.
Dr. Abubakar received the Best 1997 Master’s Thesis Award in electrical engineering from the Delft University of Technology.

Jiefu Chen (Member, IEEE) received the B.S. degree in engineering mechanics and the M.S. degree in dynamics and control from the Dalian University of Technology, Dalian, China, in 2003 and 2006, respectively, and the Ph.D. degree in electrical engineering from Duke University, Durham, NC, USA, in 2010.
From 2011 to 2015, he was a Staff Scientist with the Advantage Research and Development Center, Weatherford International Ltd., Houston, TX, USA. Since September 2015, he has been an Assistant Professor with the Department of Electrical and Computer Engineering, University of Houston, Houston. His research interests include computational electromagnetics, inverse problems, machine learning for scientific computing, oilfield data analytics, seismic data processing, underground and underwater wireless communication, and well logging.
