
Accelerated Image Reconstruction for Nonlinear

Diffractive Imaging
Yanting Ma, Student Member, IEEE, Hassan Mansour, Senior Member, IEEE, Dehong Liu, Senior Member, IEEE,
Petros T. Boufounos, Senior Member, IEEE, and Ulugbek S. Kamilov, Member, IEEE

Abstract—The problem of reconstructing an object from the measurements of the light it scatters is common in numerous imaging applications. While the most popular formulations of the problem are based on linearizing the object-light relationship, there is an increased interest in considering nonlinear formulations that can account for multiple light scattering. In this paper, we propose an image reconstruction method, called CISOR, for nonlinear diffractive imaging, based on a nonconvex optimization formulation with total variation (TV) regularization. The nonconvex solver used in CISOR is our new variant of the fast iterative shrinkage/thresholding algorithm (FISTA). We provide a fast and memory-efficient implementation of the new FISTA variant and prove that it reliably converges for our nonconvex optimization problem. In addition, we systematically compare our method with other state-of-the-art methods on simulated as well as experimentally measured data in both 2D and 3D settings.

Index Terms—Diffraction tomography, proximal gradient method, total variation regularization, nonconvex optimization.

Fig. 1. Comparison between the linear method with the first Born approximation (FB, first row), the iterative linearization method (IL, second row), and our proposed nonlinear method (CISOR, third row). Each column represents one contrast value (0.2%, 8.2%, 16.2%, 24%, 32.2%), as indicated at the bottom of the images on the third row. CISOR is stable for all tested contrast values, whereas FB and IL fail for large contrast.

I. INTRODUCTION
Estimation of the spatial permittivity distribution of an object from scattered wave measurements is ubiquitous in numerous applications. Conventional methods usually rely on linearizing the relationship between the permittivity and the wave measurements. For example, the first Born approximation [1] and the Rytov approximation [2] are linearization techniques commonly adopted in diffraction tomography [3]–[8]. Other imaging systems that are based on a linear forward model include optical projection tomography (OPT), optical coherence tomography (OCT), digital holography, and subsurface radar [9]–[18]. One attractive aspect of linear methods is that the inverse problem can be formulated as a convex optimization problem and solved by various efficient convex solvers [19]–[22]. However, linear models are highly inaccurate when the physical size of the object is large compared to the wavelength of the incident wave or when the permittivity contrast of the object with respect to the background is high [23]. Therefore, in order to be able to image strongly scattering objects such as human tissue [24], nonlinear formulations that can model multiple scattering need to be considered. The challenge is then to develop fast, memory-efficient, and reliable inverse methods that can account for the nonlinearity. Note that the nonlinearity in this work refers to the fundamental relationship between the scattered wave and the permittivity contrast, rather than that introduced by the limitations of sensing systems such as missing phase.

A standard way of solving inverse problems is via optimization, where a sequence of estimates is generated by iteratively minimizing a cost function. For ill-posed problems, the cost function usually consists of a quadratic data-fidelity term and a regularization term, which incorporates prior information such as transform-domain sparsity to mitigate the ill-posedness. Total variation (TV) [25] is one of the most commonly used regularizers in image processing, as it captures the piece-wise smoothness of natural objects. The challenge of such a formulation for nonlinear diffractive imaging is that the data-fidelity term is nonconvex due to the nonlinearity and that the TV regularizer is nondifferentiable. For such nonsmooth and nonconvex problems, the proximal gradient method, also known as the iterative shrinkage/thresholding algorithm (ISTA) [26]–[28], is a natural choice. ISTA is easy to implement and is proved to converge under some technical conditions. However, its convergence speed is slow. FISTA [22] is an accelerated variant of ISTA, which is proved to have the optimal worst-case convergence rate for convex problems. Unfortunately, its convergence analysis for nonconvex problems has not been established.

Y. Ma is with the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27606. This work was completed while Y. Ma was with MERL.
H. Mansour, D. Liu, and P. T. Boufounos are with Mitsubishi Electric Research Laboratories (MERL), 201 Broadway, Cambridge, MA 02139.
U. S. Kamilov (email: kamilov@wustl.edu) is with the Computational Imaging Group (CIG), Washington University in St. Louis, St. Louis, MO 63130.

A. Contributions

In this paper, we propose a new image reconstruction method called Convergent Inverse Scattering using Optimization and Regularization (CISOR) for fast and memory-efficient nonlinear diffractive imaging. The key contributions of this paper are as follows:
• A novel nonconvex formulation that precisely models the fundamental nonlinear relationship between the scattered wave and the permittivity contrast, while enabling fast and memory-efficient implementation, as well as rigorous convergence analysis.
• A new relaxed variant of FISTA for solving our nonconvex problem with rigorous convergence analysis. Our new variant of FISTA may be of interest on its own as a general nonconvex solver.
• Extension of the proposed formulation and algorithm, as well as the convergence analysis, to the 3D vectorial case, which makes our method applicable in a broad range of engineering and scientific areas.

Fig. 2. Visual representation of the measurement scenario considered in this paper. An object with a real permittivity contrast f(r) is illuminated with an input wave u^in(r), which interacts with the object and results in the scattered wave u^sc at the sensor domain Γ ⊂ R^2. The complex scattered wave is captured at the sensor and the algorithm proposed here is used for estimating the contrast f.

B. Related Work

Many methods that attempt to integrate the nonlinearity of wave scattering have been proposed in the literature. The iterative linearization (IL) method [29], [30] iteratively computes the forward model using the current estimated permittivity, and estimates the permittivity using the field from the previously computed forward model. Hence, each sub-problem at each iteration is a linear problem. Contrast source inversion (CSI) [31]–[33] defines an auxiliary variable called the contrast source, which is the product of the permittivity contrast and the field. CSI alternates between estimating the contrast source and the permittivity. Hybrid methods (HM) [34]–[36] combine IL and CSI, aiming to benefit from each of their advantages. A comprehensive comparison of these three methods can be found in the review paper [35]. Recently, the idea of neural network unfolding has inspired a class of methods that update the estimates using error backpropagation [37]–[41]. While such methods can in principle model the precise nonlinearity, in practice, the accuracy may be limited by the availability of memory to store the iterates needed to perform unfolding.

Figure 1 provides a visual comparison of the reconstructed images obtained by the first Born (FB) linear approximation [1], the iterative linearization (IL) method [29], [30], and our proposed nonlinear method CISOR; the detailed experimental setup will be presented in Section IV. We can see that when the contrast of the object is small, all methods achieve similar reconstruction quality. As the contrast value increases, the performance of the linear method degrades significantly and the iterative linearization method only succeeds up to a certain level, whereas our nonlinear method CISOR provides reliable reconstruction for all tested contrast values.

While we were concluding this manuscript, we became aware of very recent related work in [42], which considered a similar problem as in this paper. Our work differs from [42] in the following aspects: (i) Only the 2D case has been considered in [42], whereas our work extends to the 3D vectorial case. (ii) FISTA [22], which does not have a convergence guarantee for nonconvex problems, is applied in [42] as the nonconvex solver, whereas our method applies our new variant of FISTA with rigorous convergence analysis established here.

II. PROBLEM FORMULATION

The problem of inverse scattering based on the scalar theory of diffraction [1], [43] is described as follows and illustrated in Figure 2; the formulation for the 3D vectorial case is presented in Appendix VI-A. Let d = 2 or 3, and suppose that an object is placed within a bounded domain Ω ⊂ R^d. The object is illuminated by an incident wave u^in, and the scattered wave u^sc is measured by the sensors placed in a sensing region Γ ⊂ R^d. Let u denote the total field, which satisfies u(r) = u^in(r) + u^sc(r), ∀ r ∈ R^d. The scalar Lippmann-Schwinger equation [1] establishes the relationship between the wave and the permittivity contrast:

    u(r) = u^in(r) + k^2 ∫_Ω g(r − r′) u(r′) f(r′) dr′,   ∀ r ∈ R^d.

In the above, f(r) = (ε(r) − ε_b) is the permittivity contrast, where ε(r) is the permittivity of the object, ε_b is the permittivity of the background, and k = 2π/λ is the wavenumber in vacuum. We assume that f is real, or in other words, the object is lossless. The free-space Green's function is defined as follows:

    g(r) = −(j/4) H_0^(1)(k_b ‖r‖)        if d = 2,
    g(r) = e^{j k_b ‖r‖} / (4π ‖r‖)       if d = 3,                         (1)

where ‖·‖ denotes the Euclidean norm, H_0^(1) is the Hankel function of the first kind, and k_b = k√ε_b is the wavenumber of the background medium. The corresponding discrete system is then

    y = H diag(u) f + e,                                                    (2)
    u = u^in + G diag(f) u,                                                 (3)

where f ∈ R^N, u ∈ C^N, and u^in ∈ C^N are N uniformly spaced samples of f(r), u(r), and u^in(r) on Ω, respectively, and y ∈ C^M is the measured scattered wave at the sensors with measurement error e ∈ C^M.
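For intuition, the discrete model (2)-(3) can be evaluated with a few lines of linear algebra. The following is a minimal dense-matrix sketch under hypothetical variable names and small problem sizes; the operator it inverts is denoted A in (4) below, and the actual implementation applies it matrix-free via FFTs rather than as an explicit matrix.

```python
import numpy as np

def forward_model(f, u_in, G, H):
    """Sketch of the discrete scattering model (2)-(3).

    f    : (N,) real permittivity contrast samples on the image grid
    u_in : (N,) complex samples of the incident field
    G    : (N, N) discretized Green's function on the image domain
    H    : (M, N) discretized Green's function from image to sensors
    Returns the total field u and the noise-free data y = H diag(u) f.
    """
    N = f.size
    A = np.eye(N) - G @ np.diag(f)   # this operator is denoted A in (4)
    u = np.linalg.solve(A, u_in)     # total field, from (3)
    y = H @ (u * f)                  # scattered measurements, from (2) with e = 0
    return u, y
```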

For a vector a ∈ R^N, diag(a) ∈ R^{N×N} is a diagonal matrix with a on the diagonal. The matrix H ∈ C^{M×N} is the discretization of the Green's function g(r − r′) with r ∈ Γ and r′ ∈ Ω, whereas G ∈ C^{N×N} is the discretization of the Green's function with r, r′ ∈ Ω. The inverse scattering problem is then to estimate f given y, H, G, and u^in. Define

    A := I − G diag(f).                                                     (4)

Note that (2) and (3) define a nonlinear inverse problem, because u depends on f through u = A^{-1} u^in, where A is defined in (4). In this paper, the linear method refers to the formulation where u = u^in, and the iterative linearization method replaces u in (2) with the estimate û computed from (3) using the current estimate f̂ and assumes that û is a constant with respect to f.

To estimate f from the nonlinear inverse problem (2) and (3), we consider the following nonconvex optimization formulation. Define

    Z(f) := H diag(u) f,                                                    (5)

which is the (clean) scattered wave from the object with permittivity contrast f. Moreover, let C ⊂ R^N be a bounded convex set that contains all possible values that f may take. Then f is estimated by minimizing the following composite cost function with a nonconvex data-fidelity term D(f) and a convex regularization term R(f):

    f* = arg min_{f ∈ R^N} { F(f) := D(f) + R(f) },                          (6)

where

    D(f) = (1/2) ‖y − Z(f)‖_2^2,                                            (7)
    R(f) = τ Σ_{n=1}^{N} √( Σ_{d=1}^{2} |[D_d f]_n|^2 ) + χ_C(f).            (8)

In (8), D_d is the discrete gradient operator in the dth dimension, hence the first term in R(f) is the total variation (TV) cost, and the parameter τ > 0 controls the contribution of the TV cost to the total cost. The second term χ_C(·) is defined as

    χ_C(f) := 0 if f ∈ C,  and  χ_C(f) := ∞ if f ∉ C.

Note that D(·) is differentiable if A is non-singular, and R(·) is proper, convex, and lower semi-continuous if C is convex and closed.

III. PROPOSED METHOD

As mentioned in Section I, for a cost function like (6), the class of proximal gradient methods, including ISTA [26]–[28] and FISTA [22], can be applied. However, ISTA is empirically slow and FISTA has only been proved to converge for convex problems. A variant of FISTA has been proposed in [44] for nonconvex optimization with convergence guarantees. This algorithm computes two estimates, one from ISTA and one from FISTA, at each iteration, and selects the one with the lower objective function value as the final estimate at the current iteration. Therefore, both the gradient and the objective function need to be evaluated at two different points at each iteration. While such extra computation may be insignificant in some applications, it can be prohibitive in the inverse scattering problem, where additional evaluations of the gradient and the objective function require the computation of the entire forward model. Another accelerated proximal gradient method that is proved to converge for nonconvex problems is proposed in [45], which is not directly related to FISTA for non-smooth objective functions like (6).

Fig. 3. Empirical convergence speed of relaxed FISTA with various α values tested on experimentally measured data. The plot shows the cost function value versus iteration for α = 0 (ISTA), α = 0.88, α = 0.96, and α = 1 (FISTA).

A. Relaxed FISTA

We now introduce our new variant of FISTA to solve (6). Starting with some initialization f_0 ∈ R^N and setting s_1 = f_0, t_0 = 1, α ∈ [0, 1), for k ≥ 1, the proposed algorithm proceeds as follows:

    f_k = prox_{γR}( s_k − γ ∇D(s_k) )                                      (9)
    t_{k+1} = ( √(4 t_k^2 + 1) + 1 ) / 2                                    (10)
    s_{k+1} = f_k + α ( (t_k − 1) / t_{k+1} ) ( f_k − f_{k−1} ),            (11)

where the choice of the step-size γ to ensure convergence will be discussed in Section III-B. Notice that the algorithm (9)-(11) is equivalent to ISTA when α = 0 and is equivalent to FISTA when α = 1. For this reason, we call it relaxed FISTA. Figure 3 shows that the empirical convergence speed of relaxed FISTA improves as α increases from 0 to 1. The plot was obtained by using the experimentally measured scattered microwave data collected by the Fresnel Institute [46]. Our theoretical analysis of relaxed FISTA in Section III-B establishes convergence for any α ∈ [0, 1) with an appropriate choice of the step-size γ.

The two main elements of relaxed FISTA are the computation of the gradient ∇D and the proximal mapping prox_{γR}. Given ∇D(s_k), the proximal mapping in (9) can be efficiently computed [47], [48]. The following proposition provides an explicit formula for ∇D, which enables fast and memory-efficient computation of ∇D.

Proposition 1. Let Z(f) be defined in (5) and w = Z(f) − y. Then

    ∇D(f) = Re{ diag(u)^H ( H^H w + G^H v ) },                              (12)

where u and v are obtained from the linear systems

    A u = u^in,   and   A^H v = diag(f) H^H w.                              (13)

Proof. See Appendix VI-B1.

In the above, u and v can be efficiently computed by the conjugate gradient method. Note that the formulation for computing the gradient in Proposition 1 is also known as the adjoint state method [49]. In our implementation, A is an operator rather than an explicit matrix, and the convolution with the Green's function in A is computed using the fast Fourier transform (FFT) algorithm. Note that the formula for the gradient defined in (12) is for the scalar field case. The formula for the 3D vectorial case is provided in (30) in Appendix VI-A.
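To make the iteration concrete, the sketch below implements (9)-(11) with the gradient computed through the adjoint-state formula (12)-(13). It is a minimal dense-matrix illustration with hypothetical variable names and slightly simplified index bookkeeping: the proximal operator prox_R is passed in as a black box (in CISOR it is the TV-plus-indicator proximal of (8), computed as in [47], [48]), dense solves stand in for the FFT-based conjugate gradient solver, and no stopping criterion or step-size tuning is included.

```python
import numpy as np

def grad_D(f, y, u_in, G, H):
    """Adjoint-state gradient of D(f) = 0.5 ||y - Z(f)||^2, cf. (12)-(13)."""
    A = np.eye(f.size) - G @ np.diag(f)
    u = np.linalg.solve(A, u_in)                            # A u = u_in
    w = H @ (u * f) - y                                     # w = Z(f) - y
    v = np.linalg.solve(A.conj().T, f * (H.conj().T @ w))   # A^H v = diag(f) H^H w
    return np.real(np.conj(u) * (H.conj().T @ w + G.conj().T @ v))

def relaxed_fista(y, u_in, G, H, prox_R, gamma, alpha=0.96, num_iter=100):
    """Relaxed FISTA (9)-(11); alpha = 0 gives ISTA and alpha = 1 gives FISTA."""
    f_prev = np.zeros(u_in.size)          # f_0
    s = f_prev.copy()                     # s_1 = f_0
    t = 1.0                               # t_0 = 1
    for _ in range(num_iter):
        f = prox_R(s - gamma * grad_D(s, y, u_in, G, H), gamma)   # (9)
        t_next = 0.5 * (np.sqrt(4.0 * t**2 + 1.0) + 1.0)          # (10)
        s = f + alpha * ((t - 1.0) / t_next) * (f - f_prev)       # (11)
        f_prev, t = f, t_next
    return f
```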

B. Theoretical Analysis

We now provide the convergence analysis of relaxed FISTA applied to our nonconvex optimization problem (6). The following proposition shows that the data-fidelity term (7) has a Lipschitz gradient on a bounded domain. Note that a Lipschitz gradient of the smooth term in a composite cost function like (6) is essential to prove the convergence of relaxed FISTA, which we will establish in Proposition 3.

Proposition 2. Suppose that U ⊂ R^N is bounded. Assume that ‖u^in‖ < ∞ and the matrix A defined in (4) is non-singular for all s ∈ U. Then D(s) has a Lipschitz gradient on U. That is, there exists an L ∈ (0, ∞) such that

    ‖∇D(s_1) − ∇D(s_2)‖ ≤ L ‖s_1 − s_2‖,   ∀ s_1, s_2 ∈ U.                  (14)

Proof. See Appendix VI-B2.

Notice that all f_k obtained from (9) are within the bounded set C, and each s_{k+1} obtained from (11) is a linear combination of f_k and f_{k−1}, where the weight α (t_k − 1)/t_{k+1} ∈ [0, 1) since α ∈ [0, 1) and (t_k − 1)/t_{k+1} ≤ 1 by (10). Hence, the set that covers all possible values of {f_k}_{k≥0} and {s_k}_{k≥1} is bounded. Using this fact, we have the following convergence guarantee for relaxed FISTA applied to solve (6).

Proposition 3. Let U in Proposition 2 be the bounded set that covers all possible values of {f_k}_{k≥0} and {s_k}_{k≥1} obtained from (9) and (11), and let L be the corresponding Lipschitz constant defined in (14). Choose 0 < γ ≤ (1 − α^2)/(2L) for any fixed α ∈ [0, 1). Define the gradient mapping as

    G_γ(f) := ( f − prox_{γR}( f − γ ∇D(f) ) ) / γ,   ∀ f ∈ R^N.             (15)

Then relaxed FISTA achieves the stationary points of the cost function F defined in (6) in the sense that the gradient mapping norm satisfies

    lim_{k→∞} ‖G_γ(f_k)‖ = 0.                                               (16)

Moreover, let F* denote the global minimum of F, whose existence we assume. Then for any K > 0,

    min_{k∈{1,...,K}} ‖G_γ(f_k)‖^2 ≤ 2L (3 + γL)^2 ( F(f_0) − F* ) / ( K γL (1 − γL) ).   (17)

Proof. See Appendix VI-B3.

Note that for any f ∈ R^N, G_γ(f) = 0 implies 0 ∈ ∂F(f), where ∂F(f) is the limiting subdifferential [50] of F defined in (6) at f, hence f is a stationary point of F.

Relaxed FISTA can be used as a general nonconvex solver for any cost function that has a smooth nonconvex term with a Lipschitz gradient and a nonsmooth convex term whose proximal mapping can be easily computed. The convergence analysis of relaxed FISTA does not require the estimates to be constrained on a bounded domain. The condition of a bounded U in the statement of Proposition 3 is to ensure that the gradient of the specific function D(f) defined in (7) is Lipschitz, as discussed in Proposition 2.

Fig. 4. Comparison of different reconstruction methods for various contrast levels tested on simulated data. The plot shows the reconstructed SNR in dB versus the contrast value for CISOR, CSI, IL, and FB.

IV. EXPERIMENTAL RESULTS

We now compare our method CISOR with several state-of-the-art methods, including iterative linearization (IL) [29], [30], contrast source inversion (CSI) [31]–[33], and SEAGLE [40], as well as a conventional linear method, the first Born approximation (FB) [1]. All algorithms use additive total variation regularization. In our implementation, CSI uses the Polak-Ribière conjugate gradient. CISOR uses the relaxed FISTA defined in Section III-A with α = 0.96 and a fixed step-size γ, which is manually tuned. The other methods use the standard FISTA [22], also with manually tuned and fixed step-sizes.

Comparison on simulated data. The wavelength of the incident wave in this experiment is 7.49 cm. Define the contrast of an object with permittivity contrast distribution f as max(|f|). We consider the Shepp-Logan phantom and change its contrast to the desired value to obtain the ground truth f_true. We then solve the Lippmann-Schwinger equation to generate the scattered waves that are then used as measurements. The center of the image is the origin and the physical size of the image is set to 120 cm × 120 cm. Two linear detectors are placed on two opposite sides of the image at a distance of 95.9 cm from the origin. Each detector has 169 sensors with a spacing of 3.84 cm. The transmitters are placed on a line 48.0 cm to the left of the left detector, and they are spaced uniformly in azimuth with respect to the origin within a range of [−60°, 60°] at every 5°. The reconstructed SNR, defined as 20 log_10( ‖f_true‖ / ‖f̂ − f_true‖ ), is used as the comparison criterion. The size of the reconstructed images f̂ is 128 × 128 pixels. For each contrast value and each algorithm, we run the algorithm with five different regularization parameter values and select the result that yields the highest reconstructed SNR.

Figure 4 shows that as the contrast increases, the reconstructed SNR of FB and IL decreases, whereas that of CSI and CISOR is more stable. While it is possible to further improve the reconstructed SNR for CSI by running more iterations, CSI is known to be slow, as shown by the comparisons in [35].
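For completeness, the reconstructed SNR used above is a one-line computation; a minimal sketch with hypothetical array names:

```python
import numpy as np

def reconstructed_snr_db(f_true, f_hat):
    """Reconstructed SNR, 20 log10(||f_true|| / ||f_hat - f_true||), as reported in Fig. 4."""
    return 20.0 * np.log10(np.linalg.norm(f_true) / np.linalg.norm(f_hat - f_true))
```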

Fig. 5. Reconstructed images obtained by different algorithms from 2D experimentally measured data. The first and second rows use the FoamDielExtTM and the FoamDielIntTM objects, respectively. From left to right: ground truth, reconstructed images by CISOR, SEAGLE, IL, CSI, and FB. The color-map for FB is different from the rest, because FB significantly underestimated the contrast value. The size of the reconstructed objects is 128 × 128 pixels.

Fig. 6. Reconstructed images obtained by CISOR, IL, and CSI from 3D experimentally measured data for the TwoSpheres object. From left to right: ground truth, reconstructed image slices by CISOR, IL, and CSI, and the reconstructed contrast distribution along the dashed lines shown in the image slices in the first column. From top to bottom: image slices parallel to the x-y plane with z = 0 mm, parallel to the x-z plane with y = 0 mm, and parallel to the y-z plane with x = −25 mm. The size of the reconstructed objects is 32 × 32 × 32 pixels for a 150 × 150 × 150 mm cube centered at (0, 0, 0).

A visual comparison between FB, IL, and CISOR at several contrast levels has been presented in Figure 1 at the beginning of the paper.

Comparison on experimental data. We test our method in both 2D and 3D settings using the public dataset provided by the Fresnel Institute [46], [51]. Two objects for the 2D setting, FoamDielExtTM and FoamDielIntTM, and two objects for the 3D setting, TwoSpheres and TwoCubes, are considered.

In the 2D setting, the objects are placed within a 150 mm × 150 mm square region centered at the origin of the coordinate system. The number of transmitters is 8 and the number of receivers is 360 for all objects. The transmitters and the receivers are placed on a circle centered at the origin with radius 1.67 m and are spaced uniformly in azimuth. Only one transmitter is turned on at a time and only 241 receivers are active for each transmitter. That is, the 119 receivers that are closest to a transmitter are inactive for that transmitter. While the dataset contains multiple frequency measurements, we only use the ones corresponding to 3 GHz, hence the wavelength of the incident wave is 99.9 mm. The pixel size of the reconstructed images is 1.2 mm.
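To make the 2D acquisition geometry concrete, the following sketch generates antenna coordinates for the setup described above; the function name and the convention of measuring azimuth from the x-axis are illustrative assumptions, not part of the dataset specification.

```python
import numpy as np

def circle_positions(n_antennas, radius_m=1.67):
    """Antenna positions spaced uniformly in azimuth on a circle around the origin."""
    azimuth = 2.0 * np.pi * np.arange(n_antennas) / n_antennas
    return radius_m * np.stack([np.cos(azimuth), np.sin(azimuth)], axis=1)

tx_positions = circle_positions(8)     # 8 transmitters, one active at a time
rx_positions = circle_positions(360)   # 360 receivers, 241 active per transmitter
```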

Fig. 7. Reconstructed images obtained by CISOR, IL, and CSI from 3D experimentally measured data for the TwoCubes object. From left to right: ground truth, reconstructed image slices by CISOR, IL, and CSI, and the reconstructed contrast distribution along the dashed lines shown in the image slices in the first column. From top to bottom: image slices parallel to the x-y plane with z = 33 mm, parallel to the x-z plane with y = −17 mm, and parallel to the y-z plane with x = 17 mm. The size of the reconstructed objects is 32 × 32 × 32 pixels for a 100 × 100 × 100 mm cube centered at (0, 0, 50) mm.

In the 3D setting, the transmitters are located on a sphere with radius 1.769 m. The azimuthal angle θ ranges from 20° to 340° with a step of 40°, and the polar angle φ ranges from 30° to 150° with a step of 15°. The receivers are only placed on a circle with radius 1.769 m in the azimuthal plane, with azimuthal angle ranging from 0° to 350° with a step of 10°. Only the receivers that are more than 50° away from a transmitter are active for that transmitter. A visual representation of this setup is shown in Figure 8 in the Appendix. We use the data corresponding to 4 GHz for the TwoSpheres object and 6 GHz for the TwoCubes object, hence the wavelength of the incident wave is 74.9 mm and 50.0 mm, respectively. The pixel size is 4.7 mm for the TwoSpheres and 3.1 mm for the TwoCubes.

Figure 5 provides a visual comparison of the reconstructed images obtained by different algorithms for the 2D data. For each object and each algorithm, we run the algorithm with five different regularization parameter values and select the result that has the best visual quality. Figure 5 shows that all nonlinear methods, CISOR, SEAGLE, IL, and CSI, obtained reasonable reconstruction results in terms of both the contrast value and the shape of the object, whereas the linear method FB significantly underestimated the contrast value and failed to capture the shape. These results demonstrate that the proposed method is competitive with several state-of-the-art methods. Figures 6 and 7 present the results for the TwoSpheres and the TwoCubes objects, respectively. Again, the results show that CISOR is competitive with the state-of-the-art methods for the 3D vectorial setting as well.

V. CONCLUSION

In this paper, we proposed a nonconvex formulation for nonlinear diffractive imaging based on the scalar theory of diffraction, and further extended our formulation to the 3D vectorial field setting. The nonconvex optimization problem was solved by our new variant of FISTA. We provided an explicit formula for fast computation of the gradient at each FISTA iteration and proved that the algorithm converges for our nonconvex problem. Numerical results demonstrated that the proposed method is competitive with several state-of-the-art methods. The advantages of CISOR over other methods are mainly in the following two aspects. First, CISOR is more memory-efficient due to the explicit formula for the gradient as provided in Proposition 1. The formula allows using the conjugate gradient method to accurately compute the gradient without the need of storing all the iterates, which was required in SEAGLE [41]. Second, CISOR enjoys rigorous theoretical convergence analysis as established in Proposition 3, while other methods only have reported empirical convergence performance.

VI. APPENDIX

A. 3D Diffractive Imaging

1) Problem Formulation: The measurement scenario for the 3D case follows that in [51] and is illustrated in Figure 8. The fundamental object-wave relationship in the case of electromagnetic waves obeys Maxwell's equations. For a time-harmonic electromagnetic field and under the Silver-Müller radiation condition, it can be shown that the solution to Maxwell's equations is equivalent to that of the following integral equation [52]:

    E(r) = E^in(r) + (k^2 I + ∇∇·) ∫_Ω g(r − r′) f(r′) E(r′) dr′,            (18)

which holds for all r ∈ R^3. In (18), E(r) ∈ C^3 is the electric field at spatial location r, which is the sum of the incident field E^in(r) ∈ C^3 and the scattered field E^sc(r) ∈ C^3. The scalar permittivity contrast distribution is defined as f(r) = (ε(r) − ε_b), where ε(r) is the permittivity of the object, ε_b is the permittivity of the background, and k = 2π/λ is the wavenumber in vacuum. We assume that f(r) is real. The 3D free-space scalar Green's function g(r) is defined in (1).

Fig. 8. The measurement scenario for the 3D case considered in this paper. The object is placed within a bounded image domain Ω. The transmitter antennas (Tx) are placed on a sphere and are linearly polarized. The arrows in the figure define the polarization direction. The receiver antennas (Rx) are placed in the sensor domain Γ within the x-y (azimuth) plane, and are linearly polarized along the z direction.

As illustrated in Figure 8, the measurements in our problem are the scattered field measured in the sensor region Γ,

    E^sc(r) = (k^2 I + ∇∇·) ∫_Ω g(r − r′) f(r′) E(r′) dr′,   ∀ r ∈ Γ,        (19)

where the total field E in (19) is obtained by evaluating (18) at r ∈ Ω. In the experimental setup considered in this paper, Γ ∩ Ω = ∅, hence the Green's function is non-singular within the integral region in (19). Therefore, we can conveniently move the gradient-divergence operator ∇∇· inside the integral. Then (19) becomes

    E^sc(r) = ∫_Ω G(r − r′) f(r′) E(r′) dr′,   ∀ r ∈ Γ,                      (20)

where G(r − r′) = (k^2 I + ∇∇) g(r − r′) is the dyadic Green's function in free space, which has an explicit form:

    G(r − r′) = k^2 [ ( 3/(k^2 d^2) − 3j/(kd) − 1 ) ((r − r′)/d) ⊗ ((r − r′)/d)
                      + ( 1 + j/(kd) − 1/(k^2 d^2) ) I ] g(r − r′),          (21)

where ⊗ denotes the Kronecker product, d = ‖r − r′‖, and I is the unit dyadic.

2) Discretization: To obtain a discrete system for (18) and (20), we define the image domain Ω as a cube and uniformly sample Ω on a rectangular grid with sampling step δ in all three dimensions. Let the center of Ω be the origin and let r_{l,s,t} := ((l − J/2 − 0.5)δ, (s − J/2 − 0.5)δ, (t − J/2 − 0.5)δ) for l, s, t = 1, . . . , J. To simplify the notation, we use a one-to-one mapping (l, s, t) ↦ n and denote the samples by r_n for n = 1, . . . , N, where N = J^3. Assume that M detectors are placed at r_m for m = 1, . . . , M. Note that the detectors do not have to be placed on a regular grid, because the dyadic Green's function (21) can be evaluated at any spatial points, whereas the rectangular grid on Ω is important in order for discrete differentiation to be easily defined. Using these definitions, the discrete system that corresponds to (20) and (18) with r ∈ Ω can then be written as

    E^sc(r_m) = δ^3 Σ_{n=1}^{N} G(r_m − r_n) f(r_n) E(r_n),                  (22)
    E(r_n) = E^in(r_n) + (k^2 + ∇∇·) B(r_n),                                 (23)

where m = 1, . . . , M, n = 1, . . . , N, and

    B(r_n) = δ^3 Σ_{k=1}^{N} g(r_n − r_k) f(r_k) E(r_k),   n = 1, . . . , N.

Let us organize E_n and E^in_n for n = 1, . . . , N into column vectors as follows:

    ũ = [ E^(1); E^(2); E^(3) ],   ũ^in = [ E^in,(1); E^in,(2); E^in,(3) ],   (24)

where E^(i) ∈ C^N is a column vector whose nth coordinate is E_n^(i); similar notation applies to E^in,(i) ∈ C^N. Then the discretized inverse scattering problem is defined as follows:

    ỹ = H̃ ( I_3 ⊗ diag(f̃) ) ũ + ẽ,                                          (25)
    ũ = ũ^in + (k^2 I + D) ( I_3 ⊗ ( G̃ diag(f̃) ) ) ũ,                        (26)

where I_p is a p × p identity matrix and we drop the subscript p if the dimension is clear from the context, f̃ ∈ R^N is the vectorized permittivity contrast distribution, which is assumed to be real in this work, D ∈ R^{3N×3N} is the matrix representation of the gradient-divergence operator ∇∇·, ỹ ∈ C^M is the scattered wave measurement vector with measurement noise ẽ ∈ C^M, and G̃ ∈ C^{N×N} is the matrix representation of the convolution operator induced by the 3D free-space Green's function (1). Note that according to the polarization of the receiver antennas in the experimental setup we consider in this paper, the ideal noise-free measurements ỹ_m − ẽ_m are E^sc,(3)_m for m = 1, . . . , M, which are the scattered waves E^sc along the z-dimension measured at M different locations in the sensor region Γ. Therefore, H̃ ∈ C^{M×3N} is the matrix representation of the convolution operator induced by the third row of the dyadic Green's function (21). Define

    Ã := I − (k^2 I + D) ( I_3 ⊗ ( G̃ diag(f̃) ) ).                            (27)

Similar to the scalar field case, we can see that the measurement vector ỹ is nonlinear in the unknown f̃, because ũ depends on f̃ according to ũ = Ã^{-1} ũ^in, which follows from (26).
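As with the scalar case, (25)-(27) can be checked numerically on a small grid. The sketch below is a dense-matrix illustration with hypothetical variable names; in practice G̃, D, and H̃ are applied as operators (FFT-based convolutions and finite differences) rather than stored matrices.

```python
import numpy as np

def forward_model_3d(f, u_in, G3, H3, D, k):
    """Dense sketch of the 3D vectorial model (25)-(26).

    f    : (N,) real contrast samples on the cubic grid
    u_in : (3N,) stacked incident field [E^in,(1); E^in,(2); E^in,(3)]
    G3   : (N, N) matrix of the scalar Green's function convolution
    H3   : (M, 3N) matrix induced by the third row of the dyadic Green's function
    D    : (3N, 3N) matrix of the discrete gradient-divergence operator
    k    : wavenumber in vacuum
    """
    N = f.size
    I3N = np.eye(3 * N)
    A_tilde = I3N - (k**2 * I3N + D) @ np.kron(np.eye(3), G3 @ np.diag(f))   # (27)
    u = np.linalg.solve(A_tilde, u_in)                  # total field, from (26)
    y = H3 @ (np.kron(np.eye(3), np.diag(f)) @ u)       # noise-free data, from (25)
    return u, y
```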

Next, we define the discrete gradient-divergence operator ∇∇·, for which the matrix representation is D in (26). For a scalar function f of the spatial location r_{l,s,t}, denote f(r_{l,s,t}) by f_{l,s,t}. Following the finite difference rule,

    ∂^2/∂x^2 f_{l,s,t} := ( f_{l−1,s,t} − 2 f_{l,s,t} + f_{l+1,s,t} ) / δ^2,
    ∂^2/∂x∂y f_{l,s,t} := ( f_{l−1,s−1,t} − f_{l−1,s+1,t} + f_{l+1,s+1,t} − f_{l+1,s−1,t} ) / (4δ^2).

The definitions of ∂^2/∂x∂z and ∂^2/∂y∂z are similar to that of ∂^2/∂x∂y. Moreover, for a vector E_{m,n,p} ∈ C^3, we use E^(i)_{m,n,p} to denote its ith coordinate. Then we have that the first coordinate of E_{m,n,p} is

    E^(1)_{m,n,p} = E^in,(1)_{m,n,p} + k^2 B^(1)_{m,n,p}
                    + ∂/∂x_1 ( ∂/∂x_1 B^(1)_{m,n,p} + ∂/∂x_2 B^(2)_{m,n,p} + ∂/∂x_3 B^(3)_{m,n,p} )
                  = E^in,(1)_{m,n,p} + k^2 B^(1)_{m,n,p}
                    + ( B^(1)_{m+1,n,p} − 2 B^(1)_{m,n,p} + B^(1)_{m−1,n,p} ) / δ^2
                    + ( B^(2)_{m−1,n−1,p} − B^(2)_{m−1,n+1,p} − B^(2)_{m+1,n−1,p} + B^(2)_{m+1,n+1,p} ) / (4δ^2)
                    + ( B^(3)_{m−1,n,p−1} − B^(3)_{m−1,n,p+1} − B^(3)_{m+1,n,p−1} + B^(3)_{m+1,n,p+1} ) / (4δ^2).

The second and third coordinates of E_{m,n,p} can be obtained in a similar way.

3) CISOR for 3D: The data-fidelity term D(·) in the objective function (6) is now defined as D(f̃) := (1/2) ‖ỹ − Z̃(f̃)‖^2, where Z̃(f̃) := H̃ ( I_3 ⊗ diag(f̃) ) ũ. Let w̃ = Z̃(f̃) − ỹ, and

    g = diag(ũ)^H [ H̃^H w̃ + ( I_3 ⊗ G̃^H ) ( k^2 I + D^H ) ṽ ],              (28)

where ũ and ṽ are obtained from the linear systems

    Ã ũ = ũ^in,   and   Ã^H ṽ = ( I_3 ⊗ diag(f̃) ) H̃^H w̃,                    (29)

where Ã is defined in (27). Then the gradient of D can be written as

    ∇D(f̃) = Re{ Σ_{i=1}^{3} g^(i) }                                          (30)

with g^(i) = ( g_{(i−1)N+1}, . . . , g_{iN} ) ∈ C^N for i = 1, 2, 3.

B. Proofs for Theoretical Results

1) Proof for Proposition 1: The gradient of D(·) defined in (7) is Re{ J_Z^H w }, where J_Z is the Jacobian matrix of Z(·) defined in (5),

    J_Z = [ ∂Z_m/∂f_n ],   m = 1, . . . , M,   n = 1, . . . , N.

Recall that A = (I − G diag(f)) and u = A^{-1} u^in, hence both A and u are functions of f. We omit such dependencies in our notation for brevity. Following the chain rule of differentiation,

    ∂Z_m/∂f_n = ∂/∂f_n [ Σ_{i=1}^{N} H_{m,i} f_i u_i ]
              = H_{m,n} u_n + Σ_{i=1}^{N} H_{m,i} f_i ∂u_i/∂f_n.

Using the definition w = Z(f) − y and summing over m = 1, . . . , M,

    [∇D(f)]_n = Σ_{m=1}^{M} (∂Z_m/∂f_n) conj(w_m)
              = Σ_{m=1}^{M} ( u_n H_{m,n} conj(w_m) + Σ_{i=1}^{N} (∂u_i/∂f_n) f_i H_{m,i} conj(w_m) )
              = u_n conj( [H^H w]_n ) + Σ_{i=1}^{N} (∂u_i/∂f_n) f_i conj( [H^H w]_i ),   (31)

where conj(a) denotes the complex conjugate of a ∈ C. Label the two terms in (31) as T_1 and T_2; then

    T_1 = conj( [ diag(u)^H H^H w ]_n ),                                     (32)

and

    T_2 (a)= ( (∂A^{-1}/∂f_n) u^in )^H diag(f) H^H w
        (b)= −(u^in)^H A^{-H} (∂A/∂f_n)^H A^{-H} diag(f) H^H w
        (c)= −u^H (∂A/∂f_n)^H v
        (d)= [ diag(u)^H G^H v ]_n.                                          (33)

In the above, step (a) holds by plugging in u_i = [A^{-1} u^in]_i. Step (b) uses the identity

    ∂A^{-1}/∂f_n = −A^{-1} (∂A/∂f_n) A^{-1},                                 (34)

which follows by differentiating both sides of A A^{-1} = I:

    (∂A/∂f_n) A^{-1} + A (∂A^{-1}/∂f_n) = 0.

From step (b) to step (c), we used the fact that u = A^{-1} u^in and defined v := A^{-H} diag(f) H^H w, which matches (13). Finally, step (d) follows by plugging in A = I − G diag(f). Combining (31), (32), and (33), we have obtained the expression in (12).
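As a numerical sanity check on (12), the adjoint-state gradient can be compared against a centered finite difference of D(f) on a small random instance. The following self-contained sketch uses hypothetical dense matrices in place of the FFT-based operators; the printed discrepancy should be small.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 20, 15
G = 0.1 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
H = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
u_in = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = rng.standard_normal(M) + 1j * rng.standard_normal(M)
f = rng.standard_normal(N)

def D(f):
    """Data-fidelity D(f) = 0.5 ||y - H diag(u) f||^2 with u = A^{-1} u_in, cf. (7)."""
    u = np.linalg.solve(np.eye(N) - G @ np.diag(f), u_in)
    r = H @ (u * f) - y
    return 0.5 * np.vdot(r, r).real

def grad_D(f):
    """Adjoint-state gradient, formula (12) with the linear systems (13)."""
    A = np.eye(N) - G @ np.diag(f)
    u = np.linalg.solve(A, u_in)
    w = H @ (u * f) - y
    v = np.linalg.solve(A.conj().T, f * (H.conj().T @ w))
    return np.real(np.conj(u) * (H.conj().T @ w + G.conj().T @ v))

eps, n = 1e-6, 3
e_n = np.zeros(N); e_n[n] = 1.0
fd = (D(f + eps * e_n) - D(f - eps * e_n)) / (2.0 * eps)
print(abs(fd - grad_D(f)[n]))   # should be close to zero
```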

Note that the extension to the 3D vectorial field case (30) is straightforward. We highlight the differences from the scalar field case in the following. Let H̃ = [ H̃^(1) | H̃^(2) | H̃^(3) ], where H̃^(i) ∈ C^{M×N} for i = 1, 2, 3. Following the chain rule of differentiation, we have

    ∂Z̃_m/∂f̃_n = Σ_{k=1}^{3} H̃^(k)_{m,n} ũ^(k)_n + Σ_{k=1}^{3} Σ_{i=1}^{N} (∂ũ^(k)_i/∂f̃_n) H̃^(k)_{m,i} f̃_i.

Using the definition of w̃ and summing over m = 1, . . . , M,

    [∇D(f̃)]_n = Σ_{m=1}^{M} (∂Z̃_m/∂f̃_n) conj(w̃_m)
              = Σ_{k=1}^{3} ũ^(k)_n conj( [ (H̃^(k))^H w̃ ]_n )
                + Σ_{k=1}^{3} Σ_{i=1}^{N} (∂ũ^(k)_i/∂f̃_n) f̃_i conj( [ (H̃^(k))^H w̃ ]_i ).   (35)

Label the two terms in (35) as T_1 and T_2; then

    T_1 = Σ_{k=1}^{3} conj( [ diag(ũ^(k))^H (H̃^(k))^H w̃ ]_n ),               (36)

and

    T_2 (a)= Σ_{k=1}^{3} ( [ (∂Ã^{-1}/∂f̃_n) ũ^in ]^(k) )^H diag(f̃) (H̃^(k))^H w̃
        (b)= Σ_{k=1}^{3} ( [ −Ã^{-1} (∂Ã/∂f̃_n) ũ ]^(k) )^H diag(f̃) (H̃^(k))^H w̃
        (c)= −ũ^H (∂Ã/∂f̃_n)^H ṽ
        (d)= Σ_{k=1}^{3} [ diag(ũ^(k))^H G̃^H [ (k^2 I + D^H) ṽ ]^(k) ]_n.     (37)

In the above, step (a) follows from ũ^(k)_i = [Ã^{-1} ũ^in]^(k)_i. Step (b) follows from (34). In step (c), we used the fact that Σ_k (x^(k))^H diag(f̃) (H̃^(k))^H w̃ = x^H (I_3 ⊗ diag(f̃)) H̃^H w̃ for a stacked vector x, and defined ṽ := Ã^{-H} (I_3 ⊗ diag(f̃)) H̃^H w̃, which matches (29). Finally, step (d) follows by plugging in the definition of Ã in (27). Combining (35), (36), and (37), we have obtained the expression in (30).

2) Proof for Proposition 2: For a vector a that is a function of s, a_i denotes the value of a evaluated at s_i. Using this notation, we have

    ‖∇D(s_1) − ∇D(s_2)‖ ≤ ‖diag(u_1)^H H^H w_1 − diag(u_2)^H H^H w_2‖
                          + ‖diag(u_1)^H G^H v_1 − diag(u_2)^H G^H v_2‖.

Label the two terms on the right-hand side as T_1 and T_2. We will prove T_1 ≤ L_1 ‖s_1 − s_2‖ for some L_1 ∈ (0, ∞); T_2 ≤ L_2 ‖s_1 − s_2‖ can be proved in a similar way for some L_2 ∈ (0, ∞). The result (14) is then obtained by letting L = L_1 + L_2. We have

    T_1 ≤ ‖diag(u_1)^H H^H w_1 − diag(u_2)^H H^H w_1‖ + ‖diag(u_2)^H H^H w_1 − diag(u_2)^H H^H w_2‖
        ≤ ‖u_1 − u_2‖ ‖H‖_op ‖w_1‖ + ‖A_2^{-1}‖_op ‖u^in‖ ‖H‖_op ‖w_1 − w_2‖,

where ‖·‖_op denotes the operator norm and the last inequality uses the fact that ‖diag(d)‖_op = max_{n∈[N]} |d_n| ≤ ‖d‖. We now bound ‖u_1 − u_2‖ and ‖w_1 − w_2‖:

    ‖u_1 − u_2‖ = ‖A_1^{-1} u^in − A_2^{-1} u^in‖ ≤ ‖A_1^{-1} − A_2^{-1}‖_op ‖u^in‖
                = ‖A_1^{-1} (A_2 − A_1) A_2^{-1}‖_op ‖u^in‖
                ≤ ‖A_1^{-1}‖_op ‖G‖_op ‖s_1 − s_2‖ ‖A_2^{-1}‖_op ‖u^in‖,

    ‖w_1 − w_2‖ ≤ ‖H diag(s_1) u_1 − H diag(s_1) u_2‖ + ‖H diag(s_1) u_2 − H diag(s_2) u_2‖
                ≤ ‖H‖_op ‖s_1‖ ‖u_1 − u_2‖ + ‖H‖_op ‖s_1 − s_2‖ ‖A_2^{-1}‖_op ‖u^in‖.

Then the result T_1 ≤ L_1 ‖s_1 − s_2‖ follows by noticing that ‖s_1‖, ‖u^in‖, ‖G‖_op, ‖H‖_op, and ‖A_i^{-1}‖_op for i = 1, 2 are bounded, and the fact that ‖w_1‖ ≤ ‖y‖ + ‖H‖_op ‖s_1‖ ‖A_1^{-1}‖_op ‖u^in‖ < ∞. The Lipschitz property of the gradient (30) in the 3D case can be proved in a similar way.

3) Proof for Proposition 3: First, we show that the Lipschitz gradient condition (14) implies that for all x, y ∈ U,

    D(y) − D(x) − ⟨∇D(x), y − x⟩ ≥ −(L/2) ‖x − y‖^2.                         (38)

Define a function h : R → R as h(λ) := D(x + λ(y − x)); then h′(λ) = ⟨∇D(x + λ(y − x)), y − x⟩. Notice that h(1) = D(y) and h(0) = D(x). Using the equality h(1) = h(0) + ∫_0^1 h′(λ) dλ, we have that

    D(y) = D(x) + ∫_0^1 ⟨∇D(x + λ(y − x)), y − x⟩ dλ
         = D(x) + ∫_0^1 ⟨∇D(x), y − x⟩ dλ + ∫_0^1 ⟨∇D(x + λ(y − x)) − ∇D(x), y − x⟩ dλ
      (a)≥ D(x) − ∫_0^1 λ L ‖y − x‖^2 dλ + ⟨∇D(x), y − x⟩
         = D(x) − (L/2) ‖y − x‖^2 + ⟨∇D(x), y − x⟩,

where step (a) uses the Cauchy-Schwarz inequality and the Lipschitz gradient condition (14).

Next, by (9), we have that for all x ∈ U, t ≥ 0,

    R(x) ≥ R(f_t) + ⟨ (s_t − f_t)/γ − ∇D(s_t), x − f_t ⟩.                    (39)

Let y = f_k, x = f_{k+1} in (38) and x = f_k, t = k + 1 in (39). Adding (38) and (39), we have

    F(f_{k+1}) − F(f_k) ≤ ⟨∇D(f_{k+1}) − ∇D(s_{k+1}), f_{k+1} − f_k⟩
                          + (1/γ) ⟨s_{k+1} − f_{k+1}, f_{k+1} − f_k⟩ + (L/2) ‖f_{k+1} − f_k‖^2
      (a)≤ (L/2) ‖s_{k+1} − f_{k+1}‖^2 + (L/2) ‖f_{k+1} − f_k‖^2 + (1/(2γ)) ‖s_{k+1} − f_k‖^2
           − (1/(2γ)) ‖s_{k+1} − f_{k+1}‖^2 − (1/(2γ)) ‖f_{k+1} − f_k‖^2 + (L/2) ‖f_{k+1} − f_k‖^2
      (b)≤ ( 1/(2γ) − L ) ‖f_k − f_{k−1}‖^2 − ( 1/(2γ) − L ) ‖f_{k+1} − f_k‖^2
           − ( 1/(2γ) − L/2 ) ‖s_{k+1} − f_{k+1}‖^2.

In the above, step (a) uses Cauchy-Schwarz, Proposition 2, as well as the facts that 2ab ≤ a^2 + b^2 and 2⟨a − b, b − c⟩ = ‖a − c‖^2 − ‖a − b‖^2 − ‖b − c‖^2. Step (b) uses the condition in the proposition statement that γ ≤ (1 − α^2)/(2L) and (11), which implies ‖s_{k+1} − f_k‖ ≤ α (t_k − 1)/t_{k+1} ‖f_k − f_{k−1}‖, where we notice that (t_k − 1)/t_{k+1} ≤ 1 by (10) and α < 1 by our assumption. Summing both sides from k = 0 to K − 1:

    ( 1/(2γ) − L/2 ) Σ_{k=0}^{K−1} ‖s_{k+1} − f_{k+1}‖^2 ≤ F(f_0) − F(f_K)
        + ( 1/(2γ) − L ) ( ‖f_0 − f_{−1}‖^2 − ‖f_K − f_{K−1}‖^2 ) ≤ F(f_0) − F*,

where F* is the global minimum. The last step follows by letting f_{−1} = f_0, which satisfies (11) for the initialization s_1 = f_0, and the fact that F* ≤ F(f_K).

Recall the definition of the gradient mapping G_γ in (15); we have G_γ(s_k) = (s_k − f_k)/γ. Therefore,

    Σ_{k=1}^{K} ‖G_γ(s_k)‖^2 ≤ 2L ( F(f_0) − F* ) / ( γL (1 − γL) ),          (40)

which implies that lim_{K→∞} Σ_{k=1}^{K} ‖G_γ(s_k)‖^2 < ∞, hence

    lim_{k→∞} ‖G_γ(s_k)‖ = 0.                                                (41)

Note that (41) establishes that the gradient mapping acting on the sequence {s_k}_{k≥0} generated by relaxed FISTA converges to 0. In the following, we show that the gradient mapping acting on {f_k}_{k≥0} converges to 0 as well, which is the desired result stated in (16). By (9) and (15), we have G_γ(s_k) = (1/γ)(s_k − f_k). Therefore, (41) implies that lim_{k→∞} ‖s_k − f_k‖ = 0. Moreover,

    ‖G_γ(f_k) − G_γ(s_k)‖ (a)≤ (1/γ) ‖s_k − f_k‖
         + (1/γ) ‖prox_{γR}( f_k − γ∇D(f_k) ) − prox_{γR}( s_k − γ∇D(s_k) )‖
      (b)≤ (1/γ) ‖s_k − f_k‖ + (1/γ) ‖f_k − γ∇D(f_k) − ( s_k − γ∇D(s_k) )‖
      (c)≤ (1/γ) ‖s_k − f_k‖ + (1/γ) ‖s_k − f_k‖ + L ‖s_k − f_k‖,            (42)

where step (a) follows by (15) and the triangle inequality, step (b) follows by the well-known property that proximal operators of convex functions are 1-Lipschitz, and step (c) follows by the triangle inequality and the result that ∇D is L-Lipschitz as we established in Proposition 2. Taking the limit as k → ∞, we have lim_{k→∞} ‖G_γ(f_k) − G_γ(s_k)‖ = 0; hence, lim_{k→∞} G_γ(f_k) = lim_{k→∞} G_γ(s_k) = 0.

Moreover, by (42),

    ‖G_γ(f_k)‖ − ‖G_γ(s_k)‖ ≤ ‖G_γ(f_k) − G_γ(s_k)‖ ≤ (2 + γL) ‖G_γ(s_k)‖;

hence, ‖G_γ(f_k)‖^2 ≤ (3 + γL)^2 ‖G_γ(s_k)‖^2 for all k ≥ 0. Then by (40),

    Σ_{k=1}^{K} ‖G_γ(f_k)‖^2 ≤ (3 + γL)^2 · 2L ( F(f_0) − F* ) / ( γL (1 − γL) ),

which implies

    K · min_{k∈{1,...,K}} ‖G_γ(f_k)‖^2 ≤ (3 + γL)^2 · 2L ( F(f_0) − F* ) / ( γL (1 − γL) ),

hence

    min_{k∈{1,...,K}} ‖G_γ(f_k)‖^2 ≤ (3 + γL)^2 · 2L ( F(f_0) − F* ) / ( K γL (1 − γL) ).

REFERENCES

[1] M. Born and E. Wolf, Principles of Optics, 7th ed. Cambridge Univ. Press, 2003, ch. Scattering from inhomogeneous media, pp. 695–734.
[2] A. J. Devaney, "Inverse-scattering theory within the Rytov approximation," Opt. Lett., vol. 6, no. 8, pp. 374–376, August 1981.
[3] M. M. Bronstein, A. M. Bronstein, M. Zibulevsky, and H. Azhari, "Reconstruction in diffraction ultrasound tomography using nonuniform FFT," IEEE Trans. Med. Imag., vol. 21, no. 11, pp. 1395–1401, November 2002.
[4] J. W. Lim, K. R. Lee, K. H. Jin, S. Shin, S. E. Lee, Y. K. Park, and J. C. Ye, "Comparative study of iterative reconstruction algorithms for missing cone problems in optical diffraction tomography," Opt. Express, vol. 23, no. 13, pp. 16 933–16 948, June 2015.
[5] Y. Sung and R. R. Dasari, "Deterministic regularization of three-dimensional optical diffraction tomography," J. Opt. Soc. Am. A, vol. 28, no. 8, pp. 1554–1561, August 2011.
[6] V. Lauer, "New approach to optical diffraction tomography yielding a vector equation of diffraction tomography and a novel tomographic microscope," J. Microsc., vol. 205, no. 2, pp. 165–176, 2002.
[7] Y. Sung, W. Choi, C. Fang-Yen, K. Badizadegan, R. R. Dasari, and M. S. Feld, "Optical diffraction tomography for high resolution live cell imaging," Opt. Express, vol. 17, no. 1, pp. 266–277, December 2009.
[8] T. Kim, R. Zhou, M. Mir, S. Babacan, P. Carney, L. Goddard, and G. Popescu, "Supplementary information: White-light diffraction tomography of unlabelled live cells," Nat. Photonics, vol. 8, pp. 256–263, March 2014.
[9] J. Sharpe, U. Ahlgren, P. Perry, B. Hill, A. Ross, J. Hecksher-Sørensen, R. Baldock, and D. Davidson, "Optical projection tomography as a tool for 3D microscopy and gene expression studies," Science, vol. 296, no. 5567, pp. 541–545, April 2002.
[10] W. Choi, C. Fang-Yen, K. Badizadegan, S. Oh, N. Lue, R. R. Dasari, and M. S. Feld, "Tomographic phase microscopy: Supplementary material," Nat. Methods, vol. 4, no. 9, pp. 1–18, September 2007.
[11] T. S. Ralston, D. L. Marks, P. S. Carney, and S. A. Boppart, "Inverse scattering for optical coherence tomography," J. Opt. Soc. Am. A, vol. 23, no. 5, pp. 1027–1037, May 2006.
[12] B. J. Davis, S. C. Schlachter, D. L. Marks, T. S. Ralston, S. A. Boppart, and P. S. Carney, "Nonparaxial vector-field modeling of optical coherence tomography and interferometric synthetic aperture microscopy," J. Opt. Soc. Am. A, vol. 24, no. 9, pp. 2527–2542, September 2007.
[13] D. J. Brady, K. Choi, D. L. Marks, R. Horisaki, and S. Lim, "Compressive holography," Opt. Express, vol. 17, no. 15, pp. 13 040–13 049, 2009.
[14] L. Tian, N. Loomis, J. A. Dominguez-Caballero, and G. Barbastathis, "Quantitative measurement of size and three-dimensional position of fast-moving bubbles in air-water mixture flows using digital holography," Appl. Opt., vol. 49, no. 9, pp. 1549–1554, March 2010.
[15] W. Chen, L. Tian, S. Rehman, Z. Zhang, H. P. Lee, and G. Barbastathis, "Empirical concentration bounds for compressive holographic bubble imaging based on a Mie scattering model," Opt. Express, vol. 23, no. 4, February 2015.
[16] H. M. Jol, Ed., Ground Penetrating Radar: Theory and Applications. Amsterdam: Elsevier, 2009.
[17] M. Leigsnering, F. Ahmad, M. Amin, and A. Zoubir, "Multipath exploitation in through-the-wall radar imaging using sparse reconstruction," IEEE Trans. Aerosp. Electron. Syst., vol. 50, no. 2, pp. 920–939, April 2014.
[18] D. Liu, U. S. Kamilov, and P. T. Boufounos, "Compressive tomographic radar imaging with total variation regularization," in Proc. IEEE 4th International Workshop on Compressed Sensing Theory and its Applications to Radar, Sonar, and Remote Sensing (CoSeRa 2016), Aachen, Germany, September 19-22, 2016, pp. 120–123.
[19] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Univ. Press, 2004.
[20] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed. Springer, 2006.
[21] J. M. Bioucas-Dias and M. A. T. Figueiredo, "A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration," IEEE Trans. Image Process., vol. 16, no. 12, pp. 2992–3004, December 2007.
[22] A. Beck and M. Teboulle, "A fast iterative shrinkage-thresholding algorithm for linear inverse problems," SIAM J. Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
[23] B. Chen and J. J. Stamnes, "Validity of diffraction tomography based on the first Born and the first Rytov approximations," Appl. Opt., vol. 37, no. 14, pp. 2996–3006, May 1998.
[24] V. Ntziachristos, "Going deeper than microscopy: the optical imaging frontier in biology," Nat. Methods, vol. 7, no. 8, pp. 603–614, August 2010.
[25] L. I. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, no. 1–4, pp. 259–268, November 1992.
[26] M. A. T. Figueiredo and R. D. Nowak, "An EM algorithm for wavelet-based image restoration," IEEE Trans. Image Process., vol. 12, no. 8, pp. 906–916, August 2003.

[27] I. Daubechies, M. Defrise, and C. D. Mol, “An iterative thresholding al-


gorithm for linear inverse problems with a sparsity constraint,” Commun.
Pure Appl. Math., vol. 57, no. 11, pp. 1413–1457, November 2004.
[28] J. Bect, L. Blanc-Féraud, G. Aubert, and A. Chambolle, “A ℓ1-unified
variational framework for image restoration,” in Proc. ECCV, Springer,
Ed., vol. 3024, New York, 2004, pp. 1–13.
[29] K. Belkebir, P. C. Chaumet, and A. Sentenac, “Superresolution in total
internal reflection tomography,” J. Opt. Soc. Am. A, vol. 22, no. 9, pp.
1889–1897, September 2005.
[30] P. C. Chaumet and K. Belkebir, “Three-dimensional reconstruction from
real data using a conjugate gradient-coupled dipole method,” Inv. Probl.,
vol. 25, no. 2, p. 024003, 2009.
[31] P. M. van den Berg and R. E. Kleinman, “A contrast source inversion
method,” Inv. Probl., vol. 13, no. 6, pp. 1607–1620, December 1997.
[32] A. Abubakar, P. M. van den Berg, and T. M. Habashy, “Application of
the multiplicative regularized contrast source inversion method on TM- and
TE-polarized experimental Fresnel data,” Inv. Probl., vol. 21, no. 6, pp.
S5–S14, 2005.
[33] M. T. Bevacqua, L. Crocco, L. Di Donato, and T. Isernia, “Non-linear
inverse scattering via sparsity regularized contrast source inversion,”
IEEE Trans. Comp. Imag., 2017.
[34] K. Belkebir and A. Sentenac, “High-resolution optical diffraction mi-
croscopy,” J. Opt. Soc. Am. A, vol. 20, no. 7, pp. 1223–1229, July 2003.
[35] E. Mudry, P. C. Chaumet, K. Belkebir, and A. Sentenac, “Electromag-
netic wave imaging of three-dimensional targets using a hybrid iterative
inversion method,” Inv. Probl., vol. 28, no. 6, p. 065007, April 2012.
[36] T. Zhang, C. Godavarthi, P. C. Chaumet, G. Maire, H. Giovannini,
A. Talneau, M. Allain, K. Belkebir, and A. Sentenac, “Far-field diffrac-
tion microscopy at λ/10 resolution,” Optica, vol. 3, no. 6, pp. 609–612,
June 2016.
[37] U. S. Kamilov, I. N. Papadopoulos, M. H. Shoreh, A. Goy, C. Vonesch,
M. Unser, and D. Psaltis, “Learning approach to optical tomography,”
Optica, vol. 2, no. 6, pp. 517–522, June 2015.
[38] ——, “Optical tomographic image reconstruction based on beam prop-
agation and sparse regularization,” IEEE Trans. Comp. Imag., vol. 2,
no. 1, pp. 59–70,, March 2016.
[39] U. S. Kamilov, D. Liu, H. Mansour, and P. T. Boufounos, “A recursive
Born approach to nonlinear inverse scattering,” IEEE Signal Process.
Lett., vol. 23, no. 8, pp. 1052–1056, August 2016.
[40] H.-Y. Liu, U. S. Kamilov, D. Liu, H. Mansour, and P. T. Boufounos,
“Compressive imaging with iterative forward models,” in Proc. IEEE
Int. Conf. Acoustics, Speech and Signal Process. (ICASSP 2017), New
Orleans, LA, USA, March 5-9, 2017, pp. 6025–6029.
[41] H.-Y. Liu, D. Liu, H. Mansour, P. T. Boufounos, L. Waller, and
U. S. Kamilov, “SEAGLE: Sparsity-driven image reconstruction under
multiple scattering,” May 2017, arXiv:1705.04281 [cs.CV].
[42] E. Soubies, T.-A. Pham, and M. Unser, “Efficient inversion of multiple-
scattering model for optical diffraction tomography,” Opt. Express,
vol. 25, no. 18, pp. 21 786–21 800, Sep 2017.
[43] J. W. Goodman, Introduction to Fourier Optics, 2nd ed. McGraw-Hill,
1996.
[44] H. Li and Z. Lin, “Accelerated proximal gradient methods for nonconvex
programming,” in Proc. Advances in Neural Information Processing
Systems 28, Montreal, Canada, December 7-12 2015.
[45] S. Ghadimi and G. Lan, “Accelerated gradient methods for nonconvex
nonlinear and stochastic programming,” Math. Program. Ser. A, vol. 156,
no. 1, pp. 59–99, March 2016.
[46] J.-M. Geffrin, P. Sabouroux, and C. Eyraud, “Free space experimental
scattering database continuation: experimental set-up and measurement
precision,” Inv. Probl., vol. 21, no. 6, pp. S117–S130, 2005.
[47] A. Beck and M. Teboulle, “Fast gradient-based algorithm for constrained
total variation image denoising and deblurring problems,” IEEE Trans.
Image Process., vol. 18, no. 11, pp. 2419–2434, November 2009.
[48] U. S. Kamilov, “A parallel proximal algorithm for anisotropic total
variation minimization,” IEEE Trans. Image Process., vol. 26, no. 2,
pp. 539–548, February 2017.
[49] R.-E. Plessix, “A review of the adjoint-state method for computing the
gradient of a functional with geophysical applications,” Geophysical
Journal International, vol. 167, no. 2, pp. 495–503, 2006.
[50] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis. Springer
Science & Business Media, 2009.
[51] J.-M. Geffrin and P. Sabouroux, “Continuing with the Fresnel database:
experimental setup and improvements in 3D scattering measurements,”
Inv. Probl., vol. 25, no. 2, p. 024001, 2009.
[52] D. Colton and R. Kress, Inverse Acoustic and Electromagnetic Scattering
Theory. Springer Science & Business Media, 1992, vol. 93.
