Computers & Geosciences: Guofeng Liu, Xiaohong Meng, Zhaoxi Chen

Computers & Geosciences 48 (2012) 86–92
Contents lists available at SciVerse ScienceDirect
Computers & Geosciences

journal homepage: www.elsevier.com/locate/cageo
3D magnetic inversion based on probability tomography and its GPU implement
Guofeng Liu*, Xiaohong Meng, Zhaoxi Chen
School of Geophysics and Information Technology, China University of Geosciences, Beijing 100083, China
Article info

Article history:
Received 20 March 2012
Received in revised form 21 May 2012
Accepted 22 May 2012
Available online 31 May 2012

Keywords:
Magnetic inversion
Probability tomography
GPU
Optimization

Abstract

Three-dimensional (3D) magnetic inversion methods fall into two types according to the inversion result. The first determines a 3D susceptibility distribution that reproduces a given magnetic anomaly. The second images the source distribution in a purely probabilistic sense, and its results are equivalent physical parameters between −1 and +1. The second method is easier and more stable, but obtaining the susceptibility directly is often more desirable for recognizing a particular lithology. Furthermore, it is difficult to add an external geological constraint to the second method to reduce the nonuniqueness of magnetic inversion. Herein, we propose an iterative method that inverts for the susceptibility based on the second method. The proposed method obtains the susceptibility perturbation by multiplying a susceptibility increment by the probability tomography result of the misfit between the observed data and the forward data computed for a given susceptibility model. We present a graphic processing unit (GPU) scheme to tackle the intensive computation. The forward and probability functions are computed in parallel on the GPU. With reasonable parallel strategies and three key optimization steps (memory optimization, execution configuration optimization and instruction optimization), the 3D magnetic inversion on a Tesla C2050 GPU is far more efficient than serial code on a 2.5 GHz CPU, with a 60-fold increase in speed, especially for large volumes of data. We design a synthetic model with two prismatic susceptibility anomalies. The inversion result for this model demonstrates the effectiveness of the proposed inversion method.

© 2012 Elsevier Ltd. All rights reserved.
1. Introduction

Magnetic surveying has been used extensively over the years, resulting in a large volume of data. Magnetic survey data have been used for mapping geological structures in tectonic studies, resource exploration and engineering investigations, especially in the reconnaissance stage of these applications. However, when used in detailed prospecting, magnetic surveying requires an inversion approach to outline the distribution of underground susceptibility.

The inversion of magnetic data for 3D source distribution is increasingly common. Depending on the type of inversion result, there are two types of magnetic inversion. The first inverts for the susceptibility directly (Li and Oldenburg, 1996; Pilkington, 1997; Boulanger and Chouteau, 2001; Portniaguine and Zhdanov, 2002). Some extended methods, such as limiting the physical property to a certain range or using a sparseness constraint, have been developed to overcome the inherent nonuniqueness of the source distribution and the intensive computation (Oldenburg et al., 1993; Li and Oldenburg, 2003; Zhdanov, 2002; Vogel, 2002; Fullagar et al., 2008; Pilkington, 2009). The second kind of magnetic inversion is probability tomography, which images the source distribution in a purely probabilistic sense without external constraints. Probability tomography was first developed for the analysis of self-potential data (Patella, 1997a,b) and then extended to geoelectric and electromagnetic methods (Mauriello et al., 1998; Mauriello and Patella, 1999a,b). Gravity and magnetic probability tomography imaging have since been developed as well (Mauriello and Patella, 2001, 2005, 2008; Chianese and Lapenna, 2007; Guo et al., 2011a,b).

Compared with methods that invert for the susceptibility directly, probability tomography of magnetic data is simple, stable and easy to perform. The results, however, are values between −1 and +1 that represent a susceptibility excess or deficit relative to the surrounding background susceptibility; they are not the actual susceptibility. It is also difficult to add a geological constraint to the imaging process to overcome the inherent nonuniqueness of the source distribution and improve the image resolution. In this paper, we propose an iterative method that inverts for the susceptibility using the probability tomography of the misfit between the observed magnetic data and the forward magnetic data for a given susceptibility model.
Driven by the insatiable market demand for real-time, high-definition 3D images, the programmable NVIDIA Graphic Processing Unit (GPU) has been developed as a co-processor of the central processing unit (CPU) for high-performance computing. Compute Unified Device Architecture (CUDA) is a parallel programming model and software environment provided by NVIDIA. It is designed to overcome the challenge of using a traditional, general-purpose GPU while maintaining a shallow learning curve for programmers familiar with standard programming languages such as C. Presently, the computation ability of the T10 series NVIDIA GPU C2050 reaches 1.3 TFlops, yet the card is only the size of a normal video card and consumes just 300 W of power (NVIDIA, 2010, 2011). GPUs have been used successfully in many geophysical parallel computing applications (Zhang et al., 2009; Wang et al., 2010; Shi et al., 2011); these studies discuss the main issues involved, such as floating-point errors and the micro-structure of the graphic processing unit.

Besides the intensive computation of 3D magnetic inversion itself, current magnetic surveys, especially airborne surveys, are characterized by extremely large volumes of data. Both problems demand substantial parallel computing power. In this study, we developed GPU/CPU heterogeneous parallel computing to tackle this computing challenge.

* Corresponding author. Tel./fax: +8601082321331.
E-mail address: liugf@cugb.edu.cn (G. Liu).
0098-3004/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.cageo.2012.05.025

2. Theory of 3D magnetic inversion based on probability tomography

2.1. The total magnetic field function

In a reference rectangular coordinate system with a horizontal (x–y) plane and the z axis positive downwards, the subsurface is represented by a 3D array of rectangular cells, each with a variable susceptibility. To simplify the calculation, the magnetic effect of each cell is approximated by the effect of a dipole located at its center. For the qth cell with susceptibility $\kappa_q$, the magnetic anomaly it causes at a receiving point p on the ground surface is (Fig. 1)

$$\Delta T_q = \frac{\kappa_q T}{4\pi} B_q(x,y,z) \tag{1}$$

The total magnetic field at p is the sum of the effects of all underground cells:

$$\Delta T_a = \sum_{q=1}^{Q} \frac{\kappa_q T}{4\pi} B_q(x,y,z) \tag{2}$$

where $\kappa_q$ is the susceptibility of the qth rectangular cell, T is the strength of the inducing geomagnetic field, and

$$B_q(x,y,z) = \frac{1}{r^5}\left[ a(x_q-x)^2 + b(y_q-y)^2 + c(x_q-x)(z_q-z) + d(y_q-y)(z_q-z) + e(x_q-x)(y_q-y) + f(z_q-z)^2 \right] \tag{3}$$

where $r = \sqrt{(x_q-x)^2 + (y_q-y)^2 + (z_q-z)^2}$ is the distance between the center of the cell and the receiving point. Let $I_0$ and $A_0'$ be the inclination and declination of the geomagnetic field, and $I$ and $A'$ those of the total magnetization of the sphere. The coefficients a to f are then $a = 2lL - nN - mM$, $b = 2mM - lL - nN$, $c = 3(lN + nL)$, $d = 3(mN + nM)$, $e = 3(lM + mL)$ and $f = 2nN - mM - lL$, where $L = \cos I \cos A'$, $M = \cos I \sin A'$, $N = \sin I$, $l = \cos I_0 \cos A_0'$, $m = \cos I_0 \sin A_0'$ and $n = \sin I_0$ (Guo et al., 2011a).

2.2. The occurrence probability function of the qth rectangular cell

We define the total power $\Lambda$ associated with $\Delta T_a$ on the surface S as

$$\Lambda = \int_S \Delta T_a^2 \, dS \tag{4}$$

Substituting from (2), (4) becomes

$$\Lambda = \sum_{q=1}^{Q} \frac{\kappa_q T}{4\pi} \int_S \Delta T_a \, B_q(x,y,z) \, dS \tag{5}$$

As defined by Mauriello and Patella (2005, 2008), the occurrence probability function of a generic qth rectangular cell, valid for 3D tomography, is

$$\eta_{T_a}(q) = \frac{\int_S \Delta T_a B_q \, dS}{\sqrt{\int_S \Delta T_a^2 \, dS \int_S B_q^2 \, dS}} \tag{6}$$

According to Schwarz's inequality, we can write

$$\int_S \Delta T_a^2 \, dS \cdot \int_S B_q^2 \, dS \ge \left[ \int_S \Delta T_a B_q \, dS \right]^2 \tag{7}$$

so the results of the 3D probability tomography satisfy the condition

$$-1 \le \eta_{T_a}(q) \le +1$$

Positive values of $\eta_{T_a}(q)$ indicate a concentration of susceptibility at the qth cell relative to the surrounding background susceptibility, while negative values indicate a susceptibility deficit.

2.3. 3D magnetic imaging based on probability tomography

Assuming an initial susceptibility model $\kappa_{q,1}$, the total magnetic field at the receiving point p is

$$\Delta T_{a,1} = \sum_{q=1}^{Q} \frac{\kappa_{q,1} T}{4\pi} B_q(x,y,z) \tag{8}$$

The misfit between the observed magnetic field $\Delta T_a$ and $\Delta T_{a,1}$ is $\Delta T_a - \Delta T_{a,1}$. Its probability tomography result for the generic qth cell satisfies $-1 \le \eta_{(\Delta T_a - \Delta T_{a,1})}(q) \le 1$ and indicates where the model $\kappa_{q,1}$ falls short of or exceeds the real susceptibility $\kappa_q$ at the qth rectangular cell. The probability tomography result can be multiplied by a susceptibility value $\delta\kappa$ to convert the probability value to a susceptibility perturbation, which is then added to $\kappa_{q,1}$ to bring it closer to $\kappa_q$. Because it is difficult to choose a suitable value of $\delta\kappa$ at the outset, we adopt the following iterative procedure:

$$\kappa_{q,i+1} = \kappa_{q,i} + \eta_{(\Delta T_a - \Delta T_{a,i})}(q) \, \delta\kappa \tag{9}$$

where $\Delta T_{a,i}$ is the ith forward total magnetic field computed with the susceptibility model $\kappa_i$. A flowchart of this inversion routine is shown in Fig. 2.
Fig. 1. Model with a rectangular cell q and receiving point p.

Fig. 2. Flowchart of the 3D magnetic inversion developed in the present study.

3. Accelerating the 3D magnetic imaging with GPU/CPU heterogeneous parallel computing

The GPU is a cluster of streaming multiprocessors integrated on a graphics card. It has an extremely high calculation speed when processing a large volume of graphics data. Currently, there are over 30 streaming multiprocessors integrated on the graphics card. Over the past several years, the GPU has been developed to process intensive calculations, especially high-density parallel calculations; its peak computational throughput is more than 10 times that of a CPU (NVIDIA, 2010).

The following are the primary hardware differences between a CPU and a GPU with respect to parallel programming:

(1) Different number of processing threads: A system equipped with four quad-core CPUs can process only 16 threads concurrently. By comparison, the smallest executable unit of parallelism on a GPU comprises 32 threads (a warp). All NVIDIA GPUs can support at least 768 concurrently active threads per multiprocessor, and some GPUs support 1024 or more active threads per multiprocessor. GPUs that have 30 multiprocessors can support more than 30,000 active threads. The thread structure of a GPU is defined by the grid size, which is the number of blocks per grid, and the block size, which is the number of threads per block.

(2) Different degrees of difficulty for controlling threads: Threads on a CPU are generally heavyweight entities. The operating system must swap threads on and off CPU execution channels to provide multithreading capability, and controlling the switch is difficult. Generally speaking, one CPU parallel program can have only one parallel kernel. On a GPU, the calculating grain is lightweight. In a typical system, thousands of threads are queued up for work; if the GPU must wait on one warp of threads, it simply begins executing work on another. Moreover, a program can have more than one GPU parallel kernel, each using a different parallel strategy.

(3) Different RAM model: Any CPU procedure can access memory freely. A GPU, on the other hand, has various virtual or physical RAMs, and the type of RAM can be chosen according to the needs of the calculation.

In GPU/CPU heterogeneous computing, the CPU is responsible for the serial part of the program, and the GPU is responsible for the parallel part. Thus, when using a GPU, the first thing to design is the parallel strategy.

3.1. 3D magnetic inversion parallel strategy using a GPU

In every iteration of the 3D magnetic inversion presented herein, there are two computationally intensive modules: the magnetic field forward computation using Eq. (2) and the probability tomography computation using Eq. (6). These two modules are the ones parallelized on the GPU.

In the forward part, every receiving point on the surface is calculated independently, so each GPU thread calculates one receiving point. Assuming that there are nx and ny receiving points in the x and y directions on the surface, a two-dimensional (2D) thread structure can be defined (Fig. 3). For each block, a 2D thread number (e.g., 16, 16) defined as dimblock indicates that each block has 16 threads in each of the x and y directions. Then, according to nx and ny, the grid size is defined as ((nx + dimblock.x − 1)/dimblock.x, (ny + dimblock.y − 1)/dimblock.y).

Fig. 3. Thread structure of GPU parallel magnetic forward computing, where blockDim.x and blockDim.y are the numbers of threads in a block in the x and y directions, respectively, blockIdx.x and blockIdx.y are the ordinal numbers of each block, and threadIdx.x and threadIdx.y are the ordinal numbers of a thread in a block. The index of each thread in the x and y directions is blockIdx.x*blockDim.x + threadIdx.x and blockIdx.y*blockDim.y + threadIdx.y.

The CPU code written in C for this forward computation is

    void imaging(float *imaging, int nx, int ny, ...)
    {
        int i, j;
        ...
        for (i = 0; i < nx; i++) {
            for (j = 0; j < ny; j++) {
                magnetic[i][j] = ...;  /* right-hand side of Eq. (2) */
            }
        }
        ...
    }

The GPU code written in CUDA C for this forward computation is

    __global__ void forward_kernel(float *d_magnetic, ...)
    {
        /* global index of the receiving point handled by this thread */
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        int j = blockIdx.y * blockDim.y + threadIdx.y;

        /* threads that fall outside the nx-by-ny survey do nothing */
        if (i < nx && j < ny) {
            d_magnetic[j * nx + i] = ...;  /* right-hand side of Eq. (2) */
        }
    }

For the probability tomography GPU computing, every GPU thread calculates one imaging point underground. If there are nx × ny × nz imaging rectangular cells in the x, y and z directions,
a 3D thread structure is defined (Fig. 4). Every block is responsible for one imaging point on the surface, and the threads in that block calculate the underground rectangular cells beneath it whose susceptibility is to be inverted. A 1D thread number for a block (e.g., 256) is assigned, and then, according to nx and ny, the 2D grid size is defined as (nx, ny).

Fig. 4. Thread structure of GPU parallel magnetic probability tomography computing.

The CPU code written in C for this probability tomography computation is

    void imaging(float *imaging, int nx, int ny, ...)
    {
        int i, j, k;
        ...
        for (i = 0; i < nx; i++) {
            for (j = 0; j < ny; j++) {
                for (k = 0; k < nz; k++) {
                    image[i][j][k] = ...;  /* right-hand side of Eq. (6) */
                }
            }
        }
        ...
    }

The GPU code written in CUDA C for this probability tomography computation is

    __global__ void imaging_kernel(float *magnetic, ...)
    {
        ...
        const int i = blockIdx.x;   /* surface position in x */
        const int j = blockIdx.y;   /* surface position in y */
        const int k = threadIdx.x;  /* first depth cell of this thread */

        /* each thread strides through the depth cells of column (i, j) */
        for (int l = k; l < nz; l = l + blockDim.x) {
            magnetic[l * ny * nx + j * nx + i] = ...;  /* right-hand side of Eq. (6) */
        }
        ...
    }

The flowchart of the magnetic inversion for the designed parallel strategies of the GPU is shown in Fig. 5.

Fig. 5. Flowchart of the parallel magnetic inversion for the GPU.

3.2. Key optimization steps of 3D magnetic inversion

Developing the calculating procedure for the GPU is easy; however, optimization of the procedure must be considered carefully. In this study, we optimized the GPU code in three respects: memory optimization, execution configuration optimization and instruction optimization.

Memory optimization is the most important for performance. The goal is to maximize the use of the hardware by maximizing bandwidth, which is accomplished by using as much fast memory and as little slow-access memory as possible. GPUs have several memory spaces (Fig. 6), whose different characteristics reflect their distinct usages in applications. These memory spaces include global, local, shared, constant and texture memory, and registers, as shown in Fig. 6. Taking the Tesla C2050 GPU as an example, it has 3 GB of global memory, which can be both written and read, but its access speed is slow because there is no cache. The constant memory is 64 KB of read-only memory, with an access speed four times faster than that of the global memory. The shared memory is only 16 KB in size and lets the threads in a block carry out part of their data communication internally. Texture memory is another read-only memory. Its size is identical to that of global memory, but its access speed is faster than that of the global memory. Combining these memory characteristics with the needs of magnetic forward computing and imaging, the global memory is used to save the forward computing and imaging results, the constant memory is used to transfer function parameters, and the texture memory is used to transfer some read-only data, such as the susceptibility model in the forward computing.

Fig. 6. The various memory spaces on a GPU. Constant and texture memory are read-only; the others are read-write.

Another key to good performance is to keep the multiprocessors on the GPU as busy as possible. Occupancy is the ratio of the number of active warps per multiprocessor to the maximum
number of possible active warps. Low occupancy always interferes with the ability to hide memory latency, resulting in performance degradation. The dimension and size of blocks per grid and the dimension and size of threads per block are both important factors. When choosing the first execution configuration parameter, i.e., the number of blocks per grid, the primary concern is keeping the entire GPU busy. The number of blocks in a grid should be larger than the number of multiprocessors so that all multiprocessors have at least one block to execute. The second execution configuration parameter is the number of threads per block. Because the warp is the smallest executable unit of parallelism on a GPU, it is better if the thread number per block is a multiple of 32 threads (a warp). In our processing, it is easy to make the block number larger than the number of multiprocessors, so we were more concerned with the effect of the thread number per block. For example, Fig. 7 shows the relationship between the thread number per block and occupancy for the probability tomography part of the 3D magnetic inversion. When the block size is 128, all threads are active.

Fig. 7. The occupancy of the 3D magnetic probability tomography for different thread numbers in a block. The inversion grid size is 200 × 200 × 500. The occupancy is calculated by the CUDA tool Cudaprof.

Awareness of how instructions are executed often permits low-level optimizations that can be useful, especially in code that is run frequently. Use of single-precision floats, and of shift operations to avoid expensive division and modulo calculations, is encouraged. CUDA provides two types of run-time mathematical operations that can be distinguished by their names: some have names with prepended underscores, whereas others do not (e.g., __functionName() versus functionName()). Functions such as __functionName() map directly to the hardware level; they are faster but provide somewhat lower accuracy (e.g., __sinf(x) and __expf(x)). Functions such as functionName(), which correspond to the built-in functions of the C language, are slower but have higher accuracy (e.g., sinf(x) and expf(x)). When the loss of accuracy is tolerable, the first type of function can be used to raise the calculating efficiency.

After optimization, the run time of the CPU code for the forward and probability tomography modules was compared with that of the GPU code for different inversion grids. An NVIDIA Tesla C2050 GPU was used as the testing environment. This GPU has 3 GB of global memory and is connected through PCI-E 16X to an AMD CPU running at 2.5 GHz with 8 GB of DDR2 memory. The programming environment of this GPU is CUDA 3.1, the operating system is Linux 2.4.21, and the compiler is GCC 3.2.3. The forward module comparison is shown in Fig. 8, and the probability tomography module comparison is shown in Fig. 9. Clearly, GPU computing greatly improves the efficiency, especially for the inversion of a large volume of data.

Fig. 8. GPU and CPU computing time for the forward part of the 3D magnetic inversion for different receiving point sizes, nx × ny.

Fig. 9. GPU and CPU computing time for the probability tomography part of the 3D magnetic inversion for different inversion grid sizes, nx × ny × nz.

Table 1
Parameters of the model.

             Length in X (m)   Length in Y (m)   Depth in Z (m)   Susceptibility (SI)
Anomaly 1    350–550           550–750           100–200          0.01
Anomaly 2    750–950           550–750           100–200          0.01
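The execution-configuration arithmetic discussed in Sections 3.1 and 3.2 can be checked on the host with a few lines of plain C. The sketch below is illustrative only: the helper names are hypothetical and none of it is taken from the authors' code. It shows the rounded-up grid size used for the forward kernel, the rounding of a block size to a multiple of the 32-thread warp, the occupancy ratio as defined in the text, and how many depth cells one thread visits in a block-stride loop like the one in the imaging kernel.

```c
#include <assert.h>

/* Blocks needed to cover n items with `block` threads per block,
   i.e. the rounded-up division (n + block - 1) / block. */
unsigned blocks_needed(unsigned n, unsigned block)
{
    assert(block > 0);
    return (n + block - 1) / block;
}

/* Smallest multiple of the 32-thread warp size that is >= threads. */
unsigned round_up_to_warp(unsigned threads)
{
    return blocks_needed(threads, 32u) * 32u;
}

/* Occupancy as defined in the text: active warps per multiprocessor
   divided by the maximum number of possible active warps. */
double occupancy(unsigned active_warps, unsigned max_warps)
{
    assert(max_warps > 0);
    return (double)active_warps / (double)max_warps;
}

/* Number of depth cells visited by thread k when a block of `block`
   threads strides through nz cells (l = k, k + block, k + 2*block, ...). */
unsigned cells_for_thread(unsigned k, unsigned nz, unsigned block)
{
    unsigned count = 0;
    assert(block > 0);
    for (unsigned l = k; l < nz; l += block)
        ++count;
    return count;
}
```

For a 200 × 200 set of receiving points and 16 × 16 blocks, `blocks_needed(200, 16)` gives 13 blocks per direction, so some threads in the last block fall outside the survey and must be masked by a bounds check. With nz = 500 and 128 threads per block, each thread handles 3 or 4 depth cells.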
4. Model test

We designed a 3D susceptibility model to test our method. The model is 1200 m × 1200 m in length and width and 500 m in depth, and the inversion grid is 10 m × 10 m × 10 m. There are two prismatic susceptibility anomalies in this model; their positions and susceptibility values are listed in Table 1. The contour of the magnetic field is shown in Fig. 10. Both the geomagnetic field and the total magnetization of the sphere have an inclination of 90° and a declination of 0°, and the inducing field has a strength of 50,000 nT.

Fig. 10. Magnetic field contour of the model. A–A′ is a profile across the magnetic field peak.

After seven iterations, we obtain the inversion result. The quality control of the inversion result along profile A–A′ is shown in Fig. 11. In this comparison, two anomalies can be distinguished against the background. To display the inversion result for all lines, a 3D display comparing the original model and the inversion result is shown in Fig. 12. After cutting off susceptibility values below 0.003, the result successfully outlines the magnetic source distribution at nearly the same location as in the designed model. These results support the rationale of the approach. Together with the great improvement in computing time shown in Figs. 8 and 9, we believe the method of this paper can provide a new choice for 3D magnetic inversion.

Fig. 11. The quality control profile A–A′. (a) The observed magnetic data along A–A′, (b) inverted and observed magnetic data along A–A′, (c) the model under A–A′ and (d) the inverted model under A–A′.

Fig. 12. 3D visualization of the model (a) and the inversion result (b). In (b), values < 0.003 are cut off.

5. Conclusion

We developed a new 3D magnetic inversion method that inverts for the susceptibility distribution directly based on the probability tomography function. This method is easier and more stable than general methods that involve large matrix computation. We also implemented this new inversion method on an NVIDIA Tesla C2050 GPU. With fine-grained parallel strategies and three key optimization steps, the inversion efficiency was greatly improved, especially for large volumes of data. The effectiveness of the method was confirmed with a model test.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant Nos. 41104083 and 41074095, the SinoProb-01 project and the Fundamental Research Funds for the Central Universities.

References

Boulanger, O., Chouteau, M., 2001. Constraints in 3D gravity inversion. Geophysical Prospecting 49, 265–280.
Chianese, D., Lapenna, V., 2007. Magnetic probability tomography for environmental purposes: test measurements and field applications. Journal of Geophysics and Engineering 4, 63–74.
Fullagar, P.K., Pears, G.A., McMonnies, B., 2008. Constrained inversion of geologic surfaces: pushing the boundaries. The Leading Edge 27, 98–105.
Guo, L.H., Shi, L., Meng, X.H., 2011a. 3D correlation imaging of magnetic total field anomaly and its vertical gradient. Journal of Geophysics and Engineering 8, 287–293.
Guo, L.H., Meng, X.H., Shi, L., 2011b. 3D correlation imaging of the vertical gradient of gravity anomaly. Journal of Geophysics and Engineering 8, 6–12.
Li, Y., Oldenburg, D.W., 1996. 3-D inversion of magnetic data. Geophysics 61, 394–408.
Li, Y., Oldenburg, D.W., 2003. Fast inversion of large-scale magnetic data using wavelet transforms and a logarithmic barrier method. Geophysical Journal International 152, 251–265.
Mauriello, P., Monna, D., Patella, D., 1998. 3D geoelectric tomography and archeological application. Geophysical Prospecting 46, 543–570.
Mauriello, P., Patella, D., 1999a. Resistivity anomaly imaging by probability tomography. Geophysical Prospecting 47, 411–429.
Mauriello, P., Patella, D., 1999b. Principles of probability tomography for natural-source electromagnetic induction fields. Geophysics 64, 1403–1417.
Mauriello, P., Patella, D., 2001. Gravity probability tomography: a new tool for buried mass distribution imaging. Geophysical Prospecting 49, 1–12.
Mauriello, P., Patella, D., 2005. Localization of magnetic sources underground by a data adaptive tomographic scanner. arXiv:physics/0511192v2, 1–5.
Mauriello, P., Patella, D., 2008. Localization of magnetic sources underground by a probability tomography approach. Progress in Electromagnetic Research M 3, 27–56.
NVIDIA Corporation, 2010. CUDA C Programming Best Practices Guides 3.2. <http://nvidia.com/object/cuda_develop.html>.
NVIDIA Corporation, 2011. NVIDIA GPU Computing Developer Homepage, NVidia Inc. <http://developer.nvidia.com/object/gpucomputing.html> (accessed 12.01.10).
Oldenburg, D.W., McGillivray, P.R., Ellis, R.G., 1993. Generalized subspace methods for large-scale inverse problems. Geophysical Journal International 114, 12–20.
Patella, D., 1997a. Introduction to ground surface self-potential tomography. Geophysical Prospecting 45, 653–681.
Patella, D., 1997b. Self-potential global tomography including topographic effects. Geophysical Prospecting 45, 843–863.
Pilkington, M., 1997. 3-D magnetic imaging using conjugate gradients. Geophysics 62, 1132–1142.
Pilkington, M., 2009. 3D magnetic data-space inversion with sparseness constraints. Geophysics 74, L7–L15.
Portniaguine, O., Zhdanov, M.S., 2002. 3-D magnetic inversion with data compression and image focusing. Geophysics 67, 1532–1541.
Shi, X.H., Li, C., Wang, S.H., et al., 2011. Computing prestack Kirchhoff time migration on general purpose GPU. Computers and Geosciences 37, 1702–1710.
Vogel, C.R., 2002. Computational methods for inverse problems. Society of Industrial and Applied Mathematics, pp. 37–140.
Wang, X., Gao, X., Yao, 2010. Accelerating POCS interpolation of 3D irregular seismic data with graphics processing units. Computers and Geosciences 36, 1292–1300.
Zhang, J.H., Wang, S.Q., Yao, Z.X., 2009. Accelerating 3D fourier migration with graphics processing units. Geophysics 74, WCA129–WCA139.
Zhdanov, M.S., 2002. Geophysical Inverse Theory and Regularization Problems. Elsevier Science, pp. 76–98.
