Optimization Problems
Keiji Matsumoto
Quantum Computation Group, National Institute of Informatics,
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430,
e-mail:keiji@nii.ac.jp
February 6, 2018
Abstract
The perspective gf (λ1 , λ2 ) := λ2 f (λ1 /λ2 ) of a convex function f is a positively
homogeneous convex function from R≥ × R≥ to R. Its operator version has been
proposed and studied by several authors [3][4] for operator convex f . This
paper characterizes the quantity as the solution to a minimization problem
and to a maximization problem, when the underlying Hilbert space is finite
dimensional. From these characterizations, new proofs of known properties of
the quantity are given, together with an explicit representation of the quantity
when the arguments are not full-rank. In addition, when the overlap of the
supports of the two arguments is one-dimensional, the quantity can be defined
for any convex, but not necessarily operator convex, function f .
1 Introduction
The perspective gf (λ1 , λ2 ) := λ2 f (λ1 /λ2 ) of a convex function f is a positively
homogeneous convex function from R≥ × R≥ to R. Its operator version has been
proposed and studied by several authors [3][4] for operator convex f . This
paper characterizes the quantity as the solution to a certain minimization
problem and to a maximization problem, when the underlying Hilbert space is
finite dimensional. From these characterizations, new proofs of known properties
of the quantity are given, together with an explicit representation of the
quantity when the arguments are not full-rank. In addition, when the overlap of
the supports of the two arguments is one-dimensional, the quantity can be
defined for any convex, but not necessarily operator convex, function f .
In this paper, we define the operator perspective function as the solution to a
minimization problem, and later show that this definition coincides with the one
in [3][4], since its properties, representations, and extensions are easier to
prove from this new definition. Another motivation for this definition is the
link to the maximal non-commutative f-divergence, which is defined by a
minimization problem and coincides with tr gf (ρ, σ) when f is operator convex.
Here, a summary of the new results is given. When f is operator convex and
ρ ≯ 0, σ ≯ 0,
Here fˆ (r) := rf (1/r), X > 0. Also, ρ̃ and σ̃ are the largest self-adjoint
operators supported on supp ρ ∩ supp σ with ρ̃ ≤ ρ, σ̃ ≤ σ. (ρ̃ is in fact the
Schur complement ρ11 − ρ12 ρ22^{-1} ρ21 , where the block components are defined
according to the decomposition supp σ + ker σ.) This formula is in fact valid for
any convex, but not necessarily operator convex, function f provided
supp ρ ∩ supp σ is one-dimensional.
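In particular, if ρ and σ are rank-1,
\[
g_f(|\psi\rangle\langle\psi|, |\varphi\rangle\langle\varphi|) =
\begin{cases}
f(r)\,|\varphi\rangle\langle\varphi|, & \text{if } |\psi\rangle\langle\psi| = r\,|\varphi\rangle\langle\varphi| \text{ for some } r \ge 0,\\
f(0)\,|\varphi\rangle\langle\varphi| + \hat f(0)\,|\psi\rangle\langle\psi|, & \text{otherwise.}
\end{cases}
\]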
In fact, it turns out that gf is the largest convex function satisfying the above.
Also, we show the dual representation,
where
Wf (H) := {(Λ1 , Λ2 ) ; pΛ1 + qΛ2 ≤ gf (p, q) I, ∀p, q ≥ 0} . (2)
Here, Λ ≥ 0 (Λ ∈ B (B (H))) means Λ (X) ≥ 0 for any X ∈ B≥ (H), and I ∈
B (B (H)) is the identity map. This is a non-commutative version of the
representation of gf (λ1 , λ2 ) as the pointwise supremum of linear functions.
Using this representation as the pointwise supremum of linear functions, we
show continuity properties of gf . Also, as a by-product of the proof of the
dual representation, a necessary and sufficient condition for operator
convexity is given (51).
In the end, we investigate the properties of the generalized maximal
f-divergence Df,A (ρ∥σ). This quantity coincides with the maximal f-divergence
if A = I and with tr Agf (ρ, σ) if f is operator convex.
positive elements, and strictly positive elements are denoted by Bsa (H), B≥ (H),
and B> (H), respectively.
Also, for each positive operator X, denote by supp X its support, and
by πX the projection onto supp X. The projection onto the space K is denoted
by πK . The orthogonal complement of the projector π is denoted by π ⊥ . For each
X ∈ B (H), X −1 denotes the Moore-Penrose generalized inverse of X.
R≥ and R> denote the non-negative and the strictly positive half-line,
respectively. As in [8], we suppose that h is a map from Rn to R ∪ {±∞}. Instead
of saying that h is not defined on a certain set, we say that h (r) = ∞ on that
set. The effective domain of h, denoted by dom h, is the set of all r with
h (r) < ∞. h is said to be convex iff its epigraph, i.e., the set
epi h := {(r, λ) ; λ ≥ h (r)}, is convex. A convex function h is proper iff h is
nowhere −∞ and not ∞ everywhere, and is closed iff the set {r; λ ≥ h (r)} is
closed for any λ, or equivalently, iff its epigraph is closed (Theorem 7.1 of
[8]), or equivalently, iff it is lower semi-continuous.
Given a convex function h, its closure cl h is the greatest closed, or
equivalently, lower semi-continuous (not necessarily finite) function majorized
by h. The name comes from the fact that epi (cl h) = cl (epi h). cl h coincides
with h except perhaps at the relative boundary points of its effective domain.
If h is proper and convex, so is cl h (Theorem 7.4, [8]).
In the paper, unless otherwise mentioned, f satisfies
(FC) f is a proper closed convex function with (0, ∞) ⊂ dom f
For each f with (FC), its perspective function gf is
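\[
g_f(\lambda_1, \lambda_2) :=
\begin{cases}
\lambda_2 f(\lambda_1/\lambda_2), & \text{if } \lambda_1 \in \operatorname{dom} f,\ \lambda_2 > 0,\\
\lim_{\lambda_2 \downarrow 0} \lambda_2 f(\lambda_1/\lambda_2), & \text{if } \lambda_1 \in \operatorname{dom} f,\ \lambda_2 = 0,\\
0, & \text{if } \lambda_1 = \lambda_2 = 0,\\
\infty, & \text{otherwise.}
\end{cases}
\tag{3}
\]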
Also, for each r ≥ 0, define
fˆ (r) := gf (1, r) .
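The case split in (3) can be mirrored in a few lines of code. The following is a minimal sketch, not part of the paper: the function names `perspective` and `f_hat`, the example f(r) = |r − 1|, and the numerical stand-in `tiny` for the limit λ2 ↓ 0 are all assumptions made for illustration.

```python
import math

def perspective(f, lam1, lam2, tiny=1e-12):
    """Scalar perspective g_f(lam1, lam2), following the case split in (3)."""
    if lam1 < 0 or lam2 < 0:
        return math.inf                      # outside R>= x R>=
    if lam1 == 0 and lam2 == 0:
        return 0.0
    if lam2 > 0:
        return lam2 * f(lam1 / lam2)
    # lam2 == 0: crude numerical stand-in for the limit lam2 -> 0+ (assumption)
    return tiny * f(lam1 / tiny)

def f_hat(f, r):
    """f_hat(r) := g_f(1, r)."""
    return perspective(f, 1.0, r)

f = lambda r: abs(r - 1.0)                   # illustrative convex f, not from the paper

print(perspective(f, 2.0, 4.0))              # g_f(2, 4)
print(perspective(f, 6.0, 12.0))             # positive homogeneity: 3 * g_f(2, 4)
print(f_hat(f, 0.0))                         # approximates the limit value f_hat(0)
```

For f(r) = r^2 the same helper gives f_hat(2) = 2 · (1/2)^2 = 1/2, in line with fˆ (r) = r f (1/r).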
for any element A ∈ Bsa (H). Also, A + ∞ := ∞ and A − ∞ := −∞ for any
A ∈ Bsa (H).
Suppose ρ, σ ∈ B≥ (H). Then a simultaneous decomposition of {ρ, σ} is
{sx , px , qx }x∈X , where sx ∈ B≥ (H), px ∈ R≥ , qx ∈ R≥ and |X | < ∞, such that
\[
\rho = \sum_{x\in\mathcal{X}} p_x s_x, \qquad \sigma = \sum_{x\in\mathcal{X}} q_x s_x .
\]
provided the RHS exists in Bsa (H). If for any simultaneous decomposition
gf (px , qx ) = ∞, ∃ x ∈ X ,
gf (ρ, σ) := ∞.
If neither of the above is true, define
gf (ρ, σ) := −∞.
It is easy to see
∀ (p, q) ∈ R2≥ , ∀ρ ≥ 0, gf (pρ, qρ) = gf (p, q) ρ. (4)
Also,
gf (ρ ⊕ 0, σ ⊕ 0) = gf (ρ, σ) ⊕ 0, (5)
since any simultaneous decomposition {sx , px , qx } of {ρ ⊕ 0, σ ⊕ 0} is in the form
of sx = s′x ⊕ 0.
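One concrete simultaneous decomposition, used later in the paper for full-rank σ, comes from the spectral decomposition of σ^{-1/2} ρ σ^{-1/2}. The sketch below builds it with NumPy; the helper name `spectral_decomposition` and the 2×2 example matrices are hypothetical and only serve to illustrate the definition.

```python
import numpy as np

def spectral_decomposition(rho, sigma):
    """Return (s, p, q) with rho = sum_x p_x s_x and sigma = sum_x q_x s_x.

    Assumes sigma is strictly positive, so that sigma^{-1/2} exists.
    """
    w, u = np.linalg.eigh(sigma)
    sig_half = (u * np.sqrt(w)) @ u.conj().T        # sigma^{1/2}
    sig_inv_half = (u / np.sqrt(w)) @ u.conj().T    # sigma^{-1/2}
    lam, v = np.linalg.eigh(sig_inv_half @ rho @ sig_inv_half)
    # s_x = sigma^{1/2} P_x sigma^{1/2}, with P_x the rank-1 eigenprojections
    s = [sig_half @ np.outer(v[:, k], v[:, k].conj()) @ sig_half
         for k in range(len(lam))]
    p = lam                      # p_x: eigenvalues of sigma^{-1/2} rho sigma^{-1/2}
    q = np.ones_like(lam)        # q_x = 1 for every x
    return s, p, q

rho = np.array([[2.0, 1.0], [1.0, 1.0]])     # illustrative positive matrices
sigma = np.array([[1.0, 0.0], [0.0, 2.0]])
s, p, q = spectral_decomposition(rho, sigma)
print(np.allclose(sum(pk * sk for pk, sk in zip(p, s)), rho))
print(np.allclose(sum(qk * sk for qk, sk in zip(q, s)), sigma))
```

Each s_x here is positive and rank-1, and the index set has as many elements as the dimension, so the requirement |X| < ∞ is met.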
Remark 2.1 Since
\[
\sum_{x\in\mathcal{X}} g_f(p_x, q_x)\, s_x = \sum_{x\in\mathcal{X}} g_f(p'_x, q'_x)\, s'_x ,
\]
3 Properties
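Lemma 3.1 Let |I| < ∞ and consider {ρi , σi }, i ∈ I. If gf (ρi , σi ) > −∞ for
all i ∈ I,
\[
g_f\Big(\sum_{i\in I}\Lambda_i(\rho_i), \sum_{i\in I}\Lambda_i(\sigma_i)\Big) \le \sum_{i\in I}\Lambda_i\big(g_f(\rho_i,\sigma_i)\big).
\]
\[
\begin{aligned}
& g_f\Big(\sum_{i\in I}\Lambda_i(\rho_i), \sum_{i\in I}\Lambda_i(\sigma_i)\Big)\\
&\le \inf\Big\{\sum_{(i,x)\in\mathcal{X}\times I} g_f(p_{i,x}, q_{i,x})\,\Lambda_i(s_{i,x}) \,;\ \{s_{i,x}, p_{i,x}, q_{i,x}\}_{x\in\mathcal{X}},\ i\in I\Big\}\\
&= \sum_{i\in I}\Lambda_i\Big(\inf\Big\{\sum_{x\in\mathcal{X}} g_f(p_{i,x}, q_{i,x})\, s_{i,x} \,;\ \{s_{i,x}, p_{i,x}, q_{i,x}\}_{x\in\mathcal{X}}\Big\}\Big)\\
&= \sum_{i\in I}\Lambda_i\big(g_f(\rho_i,\sigma_i)\big),
\end{aligned}
\]
where in the third line {si,x , pi,x , qi,x }x∈X ranges over all the simultaneous
decompositions of {ρi , σi } for each i ∈ I.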
(iii) Suppose f (0) ≤ 0 and gf (ρ, σ) > −∞. Then for all σ ′ ∈ B≥ (H),
(iii); With I := {1, 2} , Λ1 = Λ2 := I, ρ1 := ρ, ρ2 := 0, σ1 := σ and σ2 := σ ′ ,
Theorem 3.3
(i) If C −1 exists,
\[
g_f\big(C\rho C^\dagger, C\sigma C^\dagger\big) = C\, g_f(\rho,\sigma)\, C^\dagger . \tag{8}
\]
(ii) Suppose ρi , σi ∈ B (Hi ) (i ∈ {1, 2}) satisfy gf (ρi , σi ) > −∞ (∀i ∈ {1, 2}) and
gf (ρ1 ⊕ ρ2 , σ1 ⊕ σ2 ) > −∞, then
Accordingly,
Proof. (i); Applying (ii) with Λ (X) := CXC † , we have "≤". The opposite
inequality is proved by applying (ii) with Λ (X) := C −1 XC −1† .
(ii); Letting the Λi be the conjugations by the isometric embeddings of Hi into
H1 ⊕ H2 , we obtain "≤".
Pi gf (ρ1 ⊕ ρ2 , σ1 ⊕ σ2 ) Pi ≥ gf (ρi , σi ) ,
Pi gf (ρ1 ⊕ ρ2 , σ1 ⊕ σ2 ) Pi = gf (ρi , σi ) ,
and
\[
\sum_{i\in I} P_i\, g_f(\rho_1\oplus\rho_2, \sigma_1\oplus\sigma_2)\, P_i = g_f(\rho_1,\sigma_1) \oplus g_f(\rho_2,\sigma_2).
\]
This identity and the inequality (11) are both true only if
Pi {gf (ρ1 ⊕ ρ2 , σ1 ⊕ σ2 )} Pj = 0 for all i ≠ j. Thus the LHS of the above
identity equals gf (ρ1 ⊕ ρ2 , σ1 ⊕ σ2 ), and we have the assertion.
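(iii); Let {sx , px , qx }x∈X be a simultaneous decomposition of {ρ1 ⊕ ρ2 , σ1 ⊕ 0}.
Let X0 := {x; πH2 sx πH2 = 0}. Then qx ≠ 0 iff x ∈ X0 . Also, with
\[
\rho'_1 \oplus 0 := \sum_{x\in\mathcal{X}_0} p_x s_x, \qquad
\sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} p_x s_x = (\rho_1 - \rho'_1) \oplus \rho_2 .
\]
Therefore,
\[
\begin{aligned}
\sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} g_f(p_x, q_x)\, s_x
&= \hat f(0) \sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} p_x s_x = \hat f(0)\,\{(\rho_1 - \rho'_1) \oplus \rho_2\}\\
&= \sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} g_f(p_x, q_x)\, \pi_{H_1} s_x \pi_{H_1} \oplus \pi_{H_2} s_x \pi_{H_2} .
\end{aligned}
\]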
4 Canonical Forms
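(8) reduces the computation of gf (ρ, σ) for arbitrary {ρ, σ} to that for pairs
in a certain canonical form.
Denote, with the first block row and column corresponding to supp σ and the
second to ker σ,
\[
\rho = \begin{bmatrix} \rho_{11} & \rho_{12}\\ \rho_{21} & \rho_{22} \end{bmatrix}, \qquad
C_1 := \begin{bmatrix} I & -\rho_{12}\rho_{22}^{-1}\\ 0 & I \end{bmatrix}.
\]
Then
\[
C_1 \rho C_1^\dagger = \begin{bmatrix} \tilde\rho & 0\\ 0 & \rho_{22} \end{bmatrix}, \qquad
C_1 \sigma C_1^\dagger = \sigma = \begin{bmatrix} \sigma & 0\\ 0 & 0 \end{bmatrix}, \tag{12}
\]
where
\[
\tilde\rho := \rho/\rho_{22} = \rho_{11} - \rho_{12}\rho_{22}^{-1}\rho_{21}.
\]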
Hereafter, for notational simplicity, a block component of a matrix and its
embedding into B (H) are denoted by the same symbol. For example, the 1-1
component of σ is also denoted by σ. Also, later ρ22 denotes the linear operator
V ρ22 V † , where V is an isometry from ker σ into H. Also, ρ/ρ22 denotes the
Schur complement [10].
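The block elimination by C1 is easy to check numerically. The following sketch assumes that supp σ is spanned by the first k coordinates; the helper name `block_diagonalise` and the 3×3 example are illustrative, not from the paper.

```python
import numpy as np

def block_diagonalise(rho, k):
    """Return (C1, rho_tilde) for the splitting (first k coords) + (the rest).

    rho_tilde is the Schur complement rho11 - rho12 rho22^{-1} rho21, with the
    Moore-Penrose inverse standing in for rho22^{-1}.
    """
    n = rho.shape[0]
    r11, r12 = rho[:k, :k], rho[:k, k:]
    r21, r22 = rho[k:, :k], rho[k:, k:]
    r22_inv = np.linalg.pinv(r22)
    c1 = np.block([[np.eye(k), -r12 @ r22_inv],
                   [np.zeros((n - k, k)), np.eye(n - k)]])
    rho_tilde = r11 - r12 @ r22_inv @ r21
    return c1, rho_tilde

rho = np.array([[2.0, 1.0, 0.0],
                [1.0, 2.0, 1.0],
                [0.0, 1.0, 2.0]])
sigma = np.diag([1.0, 3.0, 0.0])            # supp sigma = first two coordinates
c1, rho_tilde = block_diagonalise(rho, 2)

expected = np.zeros((3, 3))
expected[:2, :2] = rho_tilde
expected[2:, 2:] = rho[2:, 2:]
print(np.allclose(c1 @ rho @ c1.conj().T, expected))   # rho_tilde (+) rho22
print(np.allclose(c1 @ sigma @ c1.conj().T, sigma))    # sigma is left unchanged
```

In this example ρ̃ = [[2, 1], [1, 1.5]], which is dominated by the 1-1 block of ρ, as expected of a Schur complement.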
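Next, apply the following transform, which is analogous to the one above, to
the pair {ρ̃, σ}: we decompose supp σ into ker ρ̃ ⊕ supp ρ̃, so that, with block
rows and columns corresponding to ker ρ̃, supp ρ̃, and ker σ in this order,
\[
C_1 \rho C_1^\dagger = \begin{bmatrix} 0 & 0 & 0\\ 0 & \tilde\rho & 0\\ 0 & 0 & \rho_{22} \end{bmatrix}, \quad
\sigma = \begin{bmatrix} \sigma_{00} & \sigma_{01} & 0\\ \sigma_{10} & \sigma_{11} & 0\\ 0 & 0 & 0 \end{bmatrix}, \quad
C_2 := \begin{bmatrix} I & 0 & 0\\ -\sigma_{10}\sigma_{00}^{-1} & I & 0\\ 0 & 0 & I \end{bmatrix}.
\]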
Remark 4.1 Here, σ11 denotes πρ̃ σπρ̃ , and not πσ σπσ . The rule for σ is not
quite in accordance with the notation ρ11 = πσ ρπσ .
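Then, with C := C2 C1 ,
\[
C \rho C^\dagger = \begin{bmatrix} 0 & 0 & 0\\ 0 & \tilde\rho & 0\\ 0 & 0 & \rho_{22} \end{bmatrix}, \qquad
C \sigma C^\dagger = \begin{bmatrix} \sigma_{00} & 0 & 0\\ 0 & \tilde\sigma & 0\\ 0 & 0 & 0 \end{bmatrix}, \tag{13}
\]
where
\[
\tilde\sigma := \sigma/\sigma_{00} = \sigma_{11} - \sigma_{10}\sigma_{00}^{-1}\sigma_{01}.
\]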
By Theorem 5.6 of [10],
It is easy to check, for any X ∈ B (supp ρ ∩ supp σ) and for any Y ∈ B (supp σ),
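\[
\begin{aligned}
C X C^\dagger &= X, &\text{(14)}\\
C Y C^\dagger &\in B(\operatorname{supp}\sigma), &\text{(15)}\\
C_1 Y C_1^\dagger &= Y. &\text{(16)}
\end{aligned}
\]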
Remark 4.2 ρ̃ ∈ B≥ (H) (σ̃ ∈ B≥ (H), resp.) is the largest operator dominated
by ρ (by σ, resp.) and supported on supp σ (supp ρ, resp.) (Theorem 5.3, [10]):
By (14) and (15), to show this, we only have to show that ρ̃ is the largest
operator dominated by CρC † and supported on supp σ, which is easily verified
by (13).
By the above remark, ρ̃′ = ρ̃ and σ̃′ = σ̃. However, ρ′00 and σ′11 are not
unitarily equivalent to ρ11 and σ00 , respectively.
By (8) and (10),
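\[
\begin{aligned}
C\, g_f(\rho,\sigma)\, C^\dagger &= g_f(0 \oplus \tilde\rho \oplus \rho_{22},\ \sigma_{00} \oplus \tilde\sigma \oplus 0)\\
&= f(0)\,\sigma_{00} \oplus g_f(\tilde\rho, \tilde\sigma) \oplus \hat f(0)\,\rho_{22}. 
\end{aligned}
\tag{17}
\]
Therefore, by (14),
\[
g_f(\rho,\sigma) = C^{-1}\big\{ f(0)\,\sigma_{00} \oplus g_f(\tilde\rho, \tilde\sigma) \oplus \hat f(0)\,\rho_{22} \big\}\, C^{\dagger -1}.
\]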
Note that this theorem does not claim that gf (ρ, σ) > −∞ for all operator convex
functions f (which turns out to be the case).
Proof.
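\[
\begin{aligned}
g_f(\rho, I) &= g_f\Big(\sum_{i\in I} \lambda_i |e_i\rangle\langle e_i|,\ \sum_{i\in I} |e_i\rangle\langle e_i|\Big)\\
&= \sum_{i\in I} g_f\big(\lambda_i |e_i\rangle\langle e_i|,\ |e_i\rangle\langle e_i|\big),
\end{aligned}
\]
where the i-th direct component is a scalar multiple of |ei ⟩⟨ei |. Thus by (4),
\[
g_f(\rho, I) = \sum_{i\in I} f(\lambda_i)\, |e_i\rangle\langle e_i| = f(\rho).
\]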
5 Closed Formula
In this section, we prove that our definition of the generalized perspective
function is equivalent to the definition in the literature [3][4]:
If supp ρ ⊄ supp σ,
The above statements are true for any convex function f with (FC) provided
supp ρ ∩ supp σ is 1-dimensional.
By the Naimark extension theorem, there are orthogonal projections {Px }x∈X and
an isometry V from H′ onto supp σ such that
\[
s_x = V P_x V^\dagger, \qquad \sum_{x\in\mathcal{X}} s_x = I_{H'} .
\]
Then
\[
\begin{aligned}
\sum_{x\in\mathcal{X}} f(p_x)\, s_x
&= V \Big\{ \sum_{x\in\mathcal{X}} f(p_x)\, P_x \Big\} V^\dagger
 = V f\Big( \sum_{x\in\mathcal{X}} p_x P_x \Big) V^\dagger\\
&\ge f\Big( V \Big\{ \sum_{x\in\mathcal{X}} p_x P_x \Big\} V^\dagger \Big)\\
&= f\big( \sigma^{-1/2} \rho\, \sigma^{-1/2} \big),
\end{aligned}
\]
where the inequality is due to Jensen's inequality (Proposition A.1), and
equality can be achieved by the sx corresponding to the spectral decomposition
of σ −1/2 ρσ −1/2 . Therefore, we have (20).
If f is operator convex and supp ρ ⊄ supp σ, (21) is derived from (19) and (20).
If supp ρ ∩ supp σ is 1-dimensional, ρ̃ is a constant multiple of σ̃, since both
operators are supported on the 1-dimensional space supp ρ ∩ supp σ. Thus (18)
and (4) show that gf (ρ, σ) > −∞. Therefore, (19) leads to (21).
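For full-rank arguments, the familiar expression σ^{1/2} f (σ^{-1/2} ρ σ^{-1/2}) σ^{1/2}, which also appears on the right-hand side of (55), can be evaluated directly. The sketch below does so with NumPy and checks two facts for the operator convex function f (r) = r^2: the value reduces to ρ σ^{-1} ρ, and it transforms covariantly under congruence as in (8). The helper names and test matrices are hypothetical.

```python
import numpy as np

def herm_fun(a, fn):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, u = np.linalg.eigh(a)
    return (u * fn(w)) @ u.conj().T

def perspective_full_rank(f, rho, sigma):
    """sigma^{1/2} f(sigma^{-1/2} rho sigma^{-1/2}) sigma^{1/2}, assuming sigma > 0."""
    s_half = herm_fun(sigma, np.sqrt)
    s_inv_half = herm_fun(sigma, lambda w: 1.0 / np.sqrt(w))
    m = s_inv_half @ rho @ s_inv_half
    m = (m + m.conj().T) / 2            # remove rounding asymmetry before eigh
    return s_half @ herm_fun(m, f) @ s_half

f2 = lambda w: w ** 2
rho = np.array([[2.0, 1.0], [1.0, 1.0]])
sigma = np.array([[1.0, 0.0], [0.0, 2.0]])
C = np.array([[1.0, 2.0], [0.0, 1.0]])      # any invertible matrix

g = perspective_full_rank(f2, rho, sigma)
g_c = perspective_full_rank(f2, C @ rho @ C.T, C @ sigma @ C.T)
print(np.allclose(g, rho @ np.linalg.inv(sigma) @ rho))   # f(r) = r^2 case
print(np.allclose(g_c, C @ g @ C.T))                      # covariance as in (8)
```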
Theorem 6.1
\[
g_f(\rho,\sigma) = \inf\Big\{ \sum_x c_x\, g_f\big(|\psi_x\rangle\langle\psi_x|,\ |\varphi_x\rangle\langle\varphi_x|\big) \Big\}, \tag{24}
\]
Proof. In the definition of gf (ρ, σ), we may restrict the simultaneous
decompositions {sx , px , qx }x∈X to those with rank sx = 1, ∀x ∈ X . Therefore,
"≥" is obvious. Below, we show "≤".
Define
Therefore, there is a simultaneous decomposition {s′x , p′x , q′x }x∈X ′ of {ρ, σ}
such that
\[
\sum_{x\in\mathcal{X}'} g_f(p'_x, q'_x)\, s'_x = \sum_{x\in\mathcal{X}} c_x\, g_f\big(|\psi_x\rangle\langle\psi_x|,\ |\varphi_x\rangle\langle\varphi_x|\big),
\]
indicating "≤".
Lemma 7.1 The LHS of (26) is larger than or equal to the RHS of (26) and
(27)
\[
\mu_0 = \max_{\zeta_0^* \ge 0} F_*(\zeta_0^*). \tag{29}
\]
Below, we apply this proposition considering the RHS of (27) as the primal
problem, and obtain the reverse test as its dual problem. To proceed, we need
to introduce a proper mathematical framework.
Consider the space C of continuous real-valued functions on the compact
set D ⊂ [0, 1]^{×2} and the space Bsa of bounded self-adjoint linear operators
on the Hilbert space K. Endow C and Bsa with the norm
∥h∥ := sup(λ1 ,λ2 )∈D |h (λ1 , λ2 )| and the operator norm ∥W ∥, respectively. From
these two spaces, we compose the linear space
\[
\Big\{ \sum_{i=1}^{n} h_i W_i \,;\ h_i \in C,\ W_i \in B_{sa} \Big\},
\]
Then for each z ∈ Z, there exist bounded sequences {hi } and {Wi } with
z = \sum_{i=1}^{\infty} h_i W_i and
\[
\|z\|_\pi = \inf\Big\{ \sum_{i=1}^{\infty} \|h_i\|\, \|W_i\| \,;\ z = \sum_{i=1}^{\infty} h_i W_i \Big\}.
\]
The strict inequality z > 0 means that z is an interior point of the cone
{z ′ ; z ′ ≥ 0}.
Any bounded linear functional ζ ∗ on Z is the linearization of a bilinear form
on C and Bsa (see Section 2.2, [9]):
ζ ∗ (hW ) = ζ ∗ (h) (W ) ,
where ζ ∗ (h) (·) and ζ ∗ (·) (W ) are elements of Bsa∗ and C ∗ , respectively.
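Below, Z := C ⊗ˆπ Bsa (H ⊗ H), and let X = Ω be the set of pairs (Λ1 , Λ2 ) of
linear transforms on Bsa (H). Λθ can be identified with an element of
Bsa (H ⊗ H) by Choi's representation:
\[
\Phi := \sum_i |i\rangle\langle i| \otimes |i\rangle\langle i|, \qquad \Psi_\theta := \Lambda_\theta \otimes I(\Phi).
\]
Thus
\[
\begin{aligned}
X = \Omega &:= \big\{ \vec\Lambda = (\Lambda_1, \Lambda_2) \,;\ \Lambda_1, \Lambda_2 \in B_{sa}(B_{sa}(H)) \big\},\\
G(\vec\Lambda) &:= g_1 \Psi_1 + g_2 \Psi_2 - g_f \Phi,\\
F_A(\vec\Lambda) &:= \operatorname{tr} A\,(\Lambda_1(\rho) + \Lambda_2(\sigma)) = \operatorname{tr} \Psi_1 (A \otimes \rho^T) + \operatorname{tr} \Psi_2 (A \otimes \sigma^T).
\end{aligned}
\]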
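Then, using an argument almost parallel to that of [7], we have:
Lemma 7.3 Suppose f satisfies (FC) and gf (λ1 , λ2 ) is continuous on
D ⊂ [0, 1]^{×2}. Define
\[
W_{f,D} := \big\{ \vec\Lambda \,;\ g_f(\lambda_1, \lambda_2)\, I - \lambda_1 \Lambda_1 - \lambda_2 \Lambda_2 \text{ is CP},\ \forall (\lambda_1, \lambda_2) \in D \big\}.
\]
where the minimization in the LHS is taken over all maps (λ1 , λ2 ) → Zλ1 ,λ2 with
finitely many non-zero points and satisfying
\[
Z_{\lambda_1,\lambda_2} \ge 0, \qquad \operatorname{tr} Z_{\lambda_1,\lambda_2} \le 1, \tag{31}
\]
\[
\sum_{(\lambda_1,\lambda_2)\in D} \lambda_1 Z_{\lambda_1,\lambda_2} = A \otimes \rho^T, \qquad
\sum_{(\lambda_1,\lambda_2)\in D} \lambda_2 Z_{\lambda_1,\lambda_2} = A \otimes \sigma^T. \tag{32}
\]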
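Proof. G is convex, and Λ⃗∗ := (w1,∗ I, w2,∗ I), where (w1,∗ , w2,∗ ) satisfies
λ1 w1,∗ + λ2 w2,∗ − gf (λ1 , λ2 ) < 0 for any λ1 ≥ 0, λ2 ≥ 0, satisfies
G(Λ⃗∗ ) < 0.
With ζ ∗ ∈ Z∗ , ζ ∗ ≥ 0,
\[
\begin{aligned}
F_*(\zeta^*) &= \sup_{\vec\Psi} \big\{ \operatorname{tr} \Psi_1 (A \otimes \rho^T) + \operatorname{tr} \Psi_2 (A \otimes \sigma^T) - \zeta^*(g_1 \Psi_1 + g_2 \Psi_2 - g_f \Phi) \big\}\\
&= \sup_{\vec\Psi} \big\{ \big( \operatorname{tr} \Psi_1 (A \otimes \rho^T) - \zeta^*(g_1)(\Psi_1) \big) + \big( \operatorname{tr} \Psi_2 (A \otimes \sigma^T) - \zeta^*(g_2)(\Psi_2) \big) + \zeta^*(g_f)(\Phi) \big\}\\
&= \begin{cases}
\zeta^*(g_f)(\Phi), & \text{if } \operatorname{tr} \Psi_1 (A \otimes \rho^T) - \zeta^*(g_1)(\Psi_1) = \operatorname{tr} \rho W \text{ and } \Psi_2 (A \otimes \sigma^T) - \zeta^*(g_2)(\Psi_2),\\
\infty, & \text{otherwise.}
\end{cases}
\end{aligned}
\]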
Observe that h → ζ ∗ (h) (W ) is a bounded functional on C. Therefore, by the
Riesz-Markov representation theorem,
\[
\zeta^*(h)(W) = \int_D h(\lambda_1, \lambda_2)\, d\nu_W ,
\]
where νW is a regular measure over the Borel sets of D. Since the map
W → ζ ∗ (χ (B)) (W ) = νW (B), where χ (·) is the indicator function, is a linear
positive map,
|νW (B)| ≤ ∥W ∥ ∥ζ ∗ (χ (B)) (·)∥ = ∥W ∥ ζ ∗ (χ (B)) (1) = ∥W ∥ ν1 (B) . (33)
Therefore, νW is absolutely continuous relative to ν1 . Thus
ξλ1 ,λ2 (W ) := dνW /dν1 exists, and
\[
\zeta^*(h)(W) = \int_D h(\lambda_1, \lambda_2)\, \xi_{\lambda_1,\lambda_2}(W)\, d\nu_1 .
\]
By this identity and linearity of W → ζ ∗ (h) (W ), W → ξλ1 ,λ2 (W ) is linear.
Positivity of ξλ1 ,λ2 follows from positivity of ζ ∗ . By (33),
|ξλ1 ,λ2 (W )| ≤ ∥W ∥ , ν- a.e. (34)
Therefore, there is Zλ1 ,λ2 ≥ 0 with tr Zλ1 ,λ2 W = ξλ1 ,λ2 (W ) for any W ∈
Bsa . By (34), (31) holds ν-a.e. Since dim H < ∞, the optimal value can be
achieved by a sum over a finite set, by Caratheodory's theorem. Thus, rewriting
F∗ using Zλ1 ,λ2 , we have the assertion provided the LHS of (30) is finite. But
by the weak duality (28), this is true if there exists Zλ1 ,λ2 satisfying the
constraints (31) and (32) and
\sum_{(\lambda_1,\lambda_2)\in D} g_f(\lambda_1, \lambda_2) \operatorname{tr} Z_{\lambda_1,\lambda_2} < \infty.
Lemma 7.4 Suppose gf (λ1 , λ2 ) is continuous on D ⊂ [0, 1]^{×2}. Also suppose
A ≥ 0 is rank-1. Then the following identity holds provided the LHS of the
identity is finite:
\[
\min \sum_{x\in\mathcal{X}} g_f(p_x, q_x) \operatorname{tr} A s_x = \sup_{\vec\Lambda \in W_{f,D}} F_A(\vec\Lambda), \tag{35}
\]
where the minimization in the LHS is taken over all the simultaneous
decompositions of {ρ, σ} such that (px , qx ) ∈ D.
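Proof. Let π be the projector onto supp A, and apply π ⊥ ⊗ I to both sides
of the first identity of (32):
\[
\sum_{(\lambda_1,\lambda_2)\in D} \lambda_1 \big(\pi^\perp \otimes I\big) Z_{\lambda_1,\lambda_2} \big(\pi^\perp \otimes I\big) = 0.
\]
Since each term is positive, this means
(π ⊥ ⊗ I) Zλ1 ,λ2 (π ⊥ ⊗ I) = 0. Therefore, since A is rank-1,
\[
Z_{\lambda_1,\lambda_2} = A \otimes \tilde Z_{\lambda_1,\lambda_2}^{T},
\]
where
\[
\sum_{(\lambda_1,\lambda_2)\in D} \lambda_1 \tilde Z_{\lambda_1,\lambda_2} = \rho, \qquad
\sum_{(\lambda_1,\lambda_2)\in D} \lambda_2 \tilde Z_{\lambda_1,\lambda_2} = \sigma.
\]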
Lemma 7.5 Suppose A ≥ 0 is rank-1. Then the following identity holds provided
gf (ρ, σ) is finite:
\[
\operatorname{tr} A\, g_f(\rho,\sigma) = \sup_{\vec\Lambda \in W'_f} F_A(\vec\Lambda) = \sup_{\vec\Lambda \in W_f} F_A(\vec\Lambda). \tag{36}
\]
Proof. If f (0) < ∞ and fˆ (0) < ∞, gf (λ1 , λ2 ) is finite and continuous on
[0, 1]^{×2}. Thus, let D = [0, 1]^{×2}. Then the LHS of (35) is bounded by
tr Agf (ρ, σ), which is finite. Thus by Lemma 7.4, we obtain the assertion.
If f (0) = ∞ and fˆ (0) < ∞, the condition gf (ρ, σ) < ∞ demands supp ρ ⊃
supp σ. Define Dε := [ε, 1] × [0, 1]. Consider the simultaneous decomposition
obtained via the spectral decomposition of σ −1/2 ρ̃σ −1/2 . Then (px , qx ) ∈ Dr0 ,
where r0 > 0 is the smallest eigenvalue of σ −1/2 ρ̃σ −1/2 . Thus if ε ≤ r0 , a
feasible point of the LHS of (35) exists, and its LHS is bounded by
tr Agf (ρ, σ) from above. Thus (35) holds.
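Let us denote by D^1_ε and D^2_ε the LHS and the RHS of (35), respectively.
Since the domain of the variable is larger, tr Agf (ρ, σ) ≤ D^1_ε = D^2_ε . Also,
\[
\begin{aligned}
\lim_{\varepsilon \downarrow 0} D^2_\varepsilon
&= \sup\Big\{ F_A(\vec\Lambda) \,;\ \vec\Lambda \in \bigcap_{\varepsilon \in [0, r_0]} W_{f,D_\varepsilon} \Big\}\\
&= \sup\big\{ F_A(\vec\Lambda) \,;\ \vec\Lambda \in W_{f,D_0} \big\}
 = \sup\big\{ F_A(\vec\Lambda) \,;\ \vec\Lambda \in W'_f \big\}.
\end{aligned}
\]
Therefore, we have "≤" of the first identity of (36). Since the opposite
inequality holds by Lemma 7.1, we have the first identity of (36).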
Finally, observe
\[
\operatorname{tr} A\, g_f(\rho,\sigma) \ge \sup_{\vec\Lambda \in W_f} F_A(\vec\Lambda) \ge \sup_{\vec\Lambda \in W'_f} F_A(\vec\Lambda),
\]
where the first inequality is by Lemma 7.1 and the second by Wf ⊃ W′f .
Therefore, the first identity of (36) implies the second one.
Lemma 7.6 If (Λ1 , Λ2 ) ∈ W′f (H), it is possible to extend it to
(Λ̃1 , Λ̃2 ) ∈ W′f (K), K ⊃ H, so that the restriction of Λ̃θ to B(H) equals Λθ
(θ = 1, 2).
Lemma 7.7 (Λ1 , Λ2 ) ∈ Wf (H) (∈ W′f (H), resp.) iff (Λ′1 , Λ′2 ) ∈ Wf (H)
(∈ W′f (H), resp.), where
\[
\Lambda'_\theta(X) := C^{-1} \Lambda_\theta\big( C X C^\dagger \big)\, C^{-1\dagger}, \qquad \theta \in \{1, 2\}.
\]
This is equivalent to
f (r) − ar + b ≥ 0 (37)
and
\[
\begin{aligned}
& (f(r) + r)\,(f(r) - ar + b) - (f(r) - ar)^2\\
&\quad = (a + 1)\, r \Big( f(r) + \frac{b}{a+1} - ar \Big) + b f(r) \ge 0
\end{aligned}
\]
for all r ≥ 0. The latter is true if
\[
f(r) - ar + \frac{b}{a+1} \ge 0. \tag{38}
\]
Since f is convex and positive, for each a > 0, if b > 0 is large enough, both (37)
and (38) hold for all r ≥ 0, meaning the corresponding (Λ1 , Λ2 ) is an element
of W′f (H). Also,
\[
\Lambda_1(\rho) + \Lambda_2(\sigma) = \begin{bmatrix} -\tilde\rho & 0\\ 0 & a\rho_{22} \end{bmatrix}.
\]
Since ρ22 = ρ − ρ̃, we have the assertion by letting a → ∞.
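The algebraic identity behind the second positivity condition, namely (f (r) + r)(f (r) − ar + b) − (f (r) − ar)^2 = (a + 1) r (f (r) + b/(a + 1) − ar) + b f (r), can be confirmed by expanding both sides. The short check below treats the value f (r) as a free number `fr`; the function names `lhs` and `rhs` are illustrative only.

```python
import itertools
import math

def lhs(fr, r, a, b):
    """(f(r) + r)(f(r) - a r + b) - (f(r) - a r)^2, with fr standing for f(r)."""
    return (fr + r) * (fr - a * r + b) - (fr - a * r) ** 2

def rhs(fr, r, a, b):
    """(a + 1) r (f(r) + b/(a + 1) - a r) + b f(r)."""
    return (a + 1) * r * (fr + b / (a + 1) - a * r) + b * fr

# Sample a small grid of values; the two sides should agree up to rounding.
grid = [0.0, 0.5, 1.5, 2.0, 7.0]
ok = all(
    math.isclose(lhs(fr, r, a, b), rhs(fr, r, a, b), rel_tol=1e-9, abs_tol=1e-9)
    for fr, r, a, b in itertools.product(grid, repeat=4)
)
print(ok)
```

Both sides expand to (a + 1) r f (r) + b f (r) + b r − a (a + 1) r^2, which is why (38) together with f ≥ 0 suffices for non-negativity.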
Proof. Suppose gf (ρ, σ) is finite. Then since (36) holds for all rank-1 positive
operators A, we have (26) and (27). Next suppose gf (ρ, σ) = ∞. Then (26) and
(27) hold by Lemma 7.8.
Remark 7.10 When f is operator convex, gf (ρ, σ) may be defined by (26). But
some properties are not easy to show starting from this definition. For example,
to prove a statement such as (9), one must compose (Λ1 , Λ2 ) ∈ Wf (H1 ⊕ H2 )
using (Λ1,i , Λ2,i ) ∈ Wf (Hi ) (i = 1, 2), but the composition is not
straightforward. Even the proof of (5), which is easier, is not very
straightforward.
8 Continuity
Though we have defined the non-commutative extension of the perspective function
by simultaneous decomposition, the usual definition is (20) for full-rank
operators, extended to non-full-rank operators by taking a limit. Thus, we have
to investigate the relation between (21) and the limit of (20) along certain
continuous curves.
Proof. Suppose gf (ρ, σ) < ∞. Then by (26) for each δ > 0, there is (Λ1,δ , Λ2,δ ) ∈
Wf such that
gf (ρ, σ) ≤ Λ1,δ (ρ) + Λ2,δ (σ) + δI.
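Since (Λ1,δ , Λ2,δ ) is an element of Wf ,
\[
\begin{aligned}
\lim_{\varepsilon \downarrow 0} g_f(\rho_\varepsilon, \sigma_\varepsilon)
&\ge \lim_{\varepsilon \downarrow 0} \big( \Lambda_{1,\delta}(\rho_\varepsilon) + \Lambda_{2,\delta}(\sigma_\varepsilon) \big)\\
&= \Lambda_{1,\delta}(\rho) + \Lambda_{2,\delta}(\sigma) \ge g_f(\rho,\sigma) - \delta I.
\end{aligned}
\]
Since δ > 0 is arbitrary, we have the assertion.
Suppose gf (ρ, σ) = ∞. Then by (56), there is (Λ1,c , Λ2,c ) ∈ Wf
such that
tr Λ1,c (ρ) + tr Λ2,c (σ) ≥ c,
for all c > 0. Therefore,
\[
\begin{aligned}
\lim_{\varepsilon \downarrow 0} \operatorname{tr} g_f(\rho_\varepsilon, \sigma_\varepsilon)
&\ge \lim_{\varepsilon \downarrow 0} \big( \operatorname{tr} \Lambda_{1,c}(\rho_\varepsilon) + \operatorname{tr} \Lambda_{2,c}(\sigma_\varepsilon) \big)\\
&= \operatorname{tr} \Lambda_{1,c}(\rho) + \operatorname{tr} \Lambda_{2,c}(\sigma) \ge c.
\end{aligned}
\]
Since c > 0 is arbitrary, we have the assertion.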
Theorem 8.2 Suppose f is operator convex. Let {ρε , σε }ε≥0 be a continuous
linear curve with {ρ0 , σ0 } = {ρ, σ}. Then (39) holds.
Proof. By convexity of gf ,
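\[
\lim_{\varepsilon \downarrow 0} g_f(\rho_\varepsilon, \sigma_\varepsilon)
\le \lim_{\varepsilon \downarrow 0} \big\{ (1 - \varepsilon)\, g_f(\rho,\sigma) + \varepsilon\, g_f(\rho_1, \sigma_1) \big\}
= g_f(\rho,\sigma).
\]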
Thus combined with Proposition 8.1, we have the assertion.
Theorem 8.3 Suppose f is operator convex. Also let {ρε , σε }ε≥0 be a continu-
ous curve with {ρ0 , σ0 } = {ρ, σ}, ρε ≥ ρ, σε ≥ σ, and gf (ρε , σε ) < ∞, ∀ε ≥ 0.
Then (39) holds provided
\[
\lim_{\varepsilon \downarrow 0} g_f(\rho_\varepsilon - \rho,\ \sigma_\varepsilon - \sigma) \le 0. \tag{40}
\]
9 Dual Representation (2)
In this section, we investigate (Λ1 , Λ2 ) achieving the supremum of (27). Since
the case where gf (ρ, σ) = ∞ is already treated by Lemma 7.8, we suppose
gf (ρ, σ) < ∞. By Lemmas 7.6- 7.7, we suppose H = supp ρ + supp σ and {ρ, σ}
is in the form of (12), without loss of generality.
First, we study the case where ρ > 0, σ > 0. By Lemma 7.7, suppose σ = I
without loss of generality.
If dim H = 1, (cI, −f ∗ (c) I) ∈ W′f (H) and with proper choice of c, it
achieves the supremum of (27).
When dim H ≥ 2, we suppose f is operator convex in addition to satisfying
(FC).
Suppose (Λf1 , Λf2 ) achieves the maximum of (27). Then
where fk (r) := r^k ,
\[
\eta_t(r) := \frac{r}{1+t} - \frac{r}{t+r} = \frac{r}{1+t} - 1 + \frac{t}{t+r},
\]
and
\[
\int_{(0,\infty)} \frac{d\mu(t)}{(1+t)^2} < \infty.
\]
We show (26) for each term separately: first we show it for each term of the
expansion (45), f2 and ηt . By weak duality (Lemma 7.1), it suffices to show the
achievability of the LHS of (26).
Since
\[
\Lambda_1^{f_2}(X) = \rho X + X \rho, \qquad \Lambda_2^{f_2}(X) = -\rho X \rho,
\]
the map
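When f = ηt ,
\[
\begin{aligned}
\Lambda_1^{\eta_t}(X) &= \frac{X}{1+t} - t\,(tI + \rho)^{-1} X (tI + \rho)^{-1}\\
&= \frac{1}{1+t}\,(tI + \rho)^{-1} \big\{ t\,(\rho X + X \rho - X) + \rho X \rho \big\} (tI + \rho)^{-1},\\
\Lambda_2^{\eta_t}(X) &= -\rho\,(tI + \rho)^{-1} X (tI + \rho)^{-1} \rho\\
&= -\{(tI + \rho) - tI\}\,(tI + \rho)^{-1} X (tI + \rho)^{-1} \{(tI + \rho) - tI\}\\
&= -X + t\,(tI + \rho)^{-1} X + t X (tI + \rho)^{-1} - t^2 (tI + \rho)^{-1} X (tI + \rho)^{-1},
\end{aligned}
\tag{46}
\]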
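The maps for f2 and ηt can be sanity-checked numerically with σ = I. The sketch below (helper names and test matrices are hypothetical) verifies that Λ1 (ρ) + Λ2 (I) reproduces f (ρ) in both cases, and that for f2 the operator r^2 X − rΛ1 (X) − Λ2 (X) equals (rI − ρ) X (rI − ρ), which is positive whenever X is.

```python
import numpy as np

def maps_f2(rho):
    """(Lambda_1, Lambda_2) for f_2(r) = r^2, as given in the text."""
    return (lambda x: rho @ x + x @ rho), (lambda x: -rho @ x @ rho)

def maps_eta(rho, t):
    """(Lambda_1, Lambda_2) for eta_t, first lines of each formula in (46)."""
    res = np.linalg.inv(t * np.eye(rho.shape[0]) + rho)    # (tI + rho)^{-1}
    lam1 = lambda x: x / (1 + t) - t * res @ x @ res
    lam2 = lambda x: -rho @ res @ x @ res @ rho
    return lam1, lam2

def eta_of(rho, t):
    """eta_t(rho) = rho/(1+t) - rho (tI + rho)^{-1}."""
    return rho / (1 + t) - rho @ np.linalg.inv(t * np.eye(rho.shape[0]) + rho)

rho = np.array([[2.0, 1.0], [1.0, 3.0]])    # illustrative strictly positive rho
X = np.array([[1.0, 0.5], [0.5, 2.0]])      # illustrative positive X
I2 = np.eye(2)
t, r = 0.7, 1.3

l1, l2 = maps_f2(rho)
m1, m2 = maps_eta(rho, t)
print(np.allclose(l1(rho) + l2(I2), rho @ rho))          # equals f_2(rho)
print(np.allclose(m1(rho) + m2(I2), eta_of(rho, t)))     # equals eta_t(rho)
gap = r**2 * X - r * l1(X) - l2(X)
print(np.allclose(gap, (r * I2 - rho) @ X @ (r * I2 - rho)))
```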
Since
\[
\begin{aligned}
\frac{\partial}{\partial r} \eta_t(r) &= \frac{1}{1+t} - \frac{t}{(t+r)^2} = \frac{(2r - 1)\,t + r^2}{(1+t)\,(t+r)^2}\\
&\le \frac{2r\,(t+r)}{(1+t)\,(t+r)^2} = \frac{2r}{(1+t)\,(t+r)} \le \frac{2 \max\{r, 1\}}{(1+t)^2},
\end{aligned}
\]
the integrals in these formulas are convergent. Also, since r → ηt (r) is convex,
ε → (ηt (r + ε) − ηt (r))/ε is monotone increasing. Therefore, by the monotone
convergence theorem, the order of the differentiation and the integration by µ
can be interchanged, and (Λf1 , Λf2 ) in (48) and (49) coincides with the one in
(41) and (42).
(Λf1 , Λf2 ) ∈ W′f (H) follows from (Λf12 , Λf22 ) ∈ W′f2 (H) and
(Λη1t , Λη2t ) ∈ W′ηt (H). (44) has already been checked, but can also be
verified by (48) and (49).
So far, we had supposed f (0) < ∞. This assumption can be removed by
replacing f (r) by fδ (r) := f (r + δ), where δ ≤ δ0 , and δ0 is set to the half of
the smallest eigenvalue of ρ. Then differentiation of tr Agf (ρ′ , σ ′ ) at ρ′ = ρ − δI
and σ ′ = I leads to
Therefore,
Lemma 9.1 Suppose ρ > 0, σ ≥ 0 but σ ≯ 0, and fˆ (0) < ∞. Suppose also
(Λ1,0 , Λ2,0 ) ∈ W′f (supp σ) and
Proof. Let δ0 > 0 be arbitrary, and define
\[
\Lambda_1(X) := \begin{bmatrix}
(1 + \delta)\,\Lambda_{1,0}(X_{11}) - \delta \hat f(0)\, X_{11} & \hat f(0)\, X_{12}\\
\hat f(0)\, X_{21} & \alpha_1 X_{22}
\end{bmatrix},
\]
≥ ∥Λ1 ∥cb
where f′− denotes the right derivative. (Since f (0) = ∞, such r0 always exists.)
If f (0) < ∞, define r0 := 0. Then let
\[
\Lambda_2(X) := \begin{bmatrix}
(1 + \delta)\,\Lambda_{2,0}(X_{11}) - \delta f(r_0)\, X_{11} & f(r_0)\, X_{12}\\
f(r_0)\, X_{21} & \alpha_2 X_{22}
\end{bmatrix},
\]
where α2 := −(1 + δ^{-1}) f ∗ (c) − δ^{-1} f (r0 ).
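First, we check (Λ1 , Λ2 ) ∈ W′f (H), which is equivalent to checking
\[
\begin{aligned}
(f(r)\, I - r\Lambda_1 - \Lambda_2) \otimes I
&\ge \big[ \{ f(r_0) + (r - r_0)\, f'_-(r_0) \}\, I - r\Lambda_1 - \Lambda_2 \big] \otimes I\\
&= \big[ f(r_0)\, I - r_0 \Lambda_1 - \Lambda_2 + (r - r_0) \big( f'_-(r_0)\, I - \Lambda_1 \big) \big] \otimes I\\
&\ge [ f(r_0)\, I - r_0 \Lambda_1 - \Lambda_2 ] \otimes I.
\end{aligned}
\]
If r ≥ r0 ,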
and
Therefore, if r ≥ r0 ,
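\[
\begin{aligned}
& (f(r)\, I - r\Lambda_1 - \Lambda_2) \otimes I\,(Y)\\
&\ge \begin{bmatrix}
\delta \big( \hat f(0)\, r + f(r_0) - f(r) \big)\, Y_{11} & \big( f(r) - \hat f(0)\, r - f(r_0) \big)\, Y_{12}\\
\big( f(r) - \hat f(0)\, r - f(r_0) \big)\, Y_{21} & \delta^{-1} \big( \hat f(0)\, r + f(r_0) - f(r) \big)\, Y_{22}
\end{bmatrix}\\
&= \big( \hat f(0)\, r + f(r_0) - f(r) \big) \begin{bmatrix} \delta Y_{11} & -Y_{12}\\ -Y_{21} & \delta^{-1} Y_{22} \end{bmatrix}\\
&\ge 0,
\end{aligned}
\]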
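\[
\begin{aligned}
& \Lambda_1(\rho) + \Lambda_2(\sigma)\\
&= \begin{bmatrix}
(1 + \delta)\,(\Lambda_{1,0}(\tilde\rho) + \Lambda_{2,0}(\sigma)) - \delta \big( \hat f(0)\, \tilde\rho + f(r_0)\, \sigma \big) & 0\\
0 & \hat f(0)\, \rho_{22} - (\delta^2 + \delta)\, \rho_{22}
\end{bmatrix}\\
&\ge (1 + \delta)\,(g_f(\tilde\rho, \sigma) - \varepsilon I) + \hat f(0)\, \rho_{22} - \delta \big( \hat f(0)\, \tilde\rho + f(r_0)\, \sigma \big) - (\delta^2 + \delta)\, \rho_{22}
\end{aligned}
\]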
Remark 9.2 In the proof, Λ1,0 in general depends on ε. Thus, f (r0 ) may be
sharply increasing in ε−1 . However, observe that δ may be chosen arbitrarily
small independently of ε and f (r0 ), so that δf (r0 ) is small.
10 Dual Representation (3)
In this section, we compose (Λ1 , Λ2 ) ∈ W′f directly using (45), in the case where
f is operator convex, f (0) < ∞, and gf (ρ, σ) < ∞. This argument seems also
valid even if dim H = ∞. In the proof of Lemma 9.1, in the case of f (0) = ∞,
∥Λ1,0 ∥cb has to be finite. But this may not hold if dim H = ∞.
By assumption, either (i) fˆ (0) = ∞ and σ > 0 or (ii) fˆ (0) < ∞ and σ ≥ 0.
First, we study the former. In this case (41) and (42) cannot hold as they are,
since f ′ (0) may be infinite. By Lemma 7.7, let σ = I without loss of generality,
and define
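\[
\begin{aligned}
\Lambda_1^{t_1}(X) &:= aX + b\,\Lambda_1^{f_2}(X) + \int_{(t_1,\infty)} \Lambda_1^{\eta_t}(X)\, d\mu(t) + \int_{(0,t_1)} w_{t,1}\, d\mu(t)\, X,\\
\Lambda_2^{t_1}(X) &:= f(0)\, X + b\,\Lambda_2^{f_2}(X) + \int_{(t_1,\infty)} \Lambda_2^{\eta_t}(X)\, d\mu(t) + \int_{(0,t_1)} w_{t,2}\, d\mu(t)\, X,
\end{aligned}
\]
where
\[
w_{t,1} := (1 + t)^{-1} - t\,(t + 1)^{-2}, \qquad w_{t,2} := -(t + 1)^{-2}.
\]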
First, (Λt11 , Λt21 ) ∈ W′f is verified by checking the condition for each term:
(Λf12 , Λf22 ) ∈ W′f2 and (Λη1t , Λη2t ) ∈ W′ηt have been checked in the previous
section, and (wt,1 I, wt,2 I) ∈ W′ηt is checked easily. Second, the integral of
Ληθt (X) (θ ∈ {1, 2}) over (t1 , ∞), and the integral of wt,θ (θ ∈ {1, 2}) over
(0, ∞) are finite, since \int_{(0,\infty)} d\mu(t)/(1+t)^2 < \infty. (Here, it is
important that t1 > 0, since ∂ηt /∂r (0) = O (1/t) as t → 0.) Therefore,
(Λt11 , Λt21 ) is a well-defined member of W′f .
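In the scalar case, membership of (wt,1 I, wt,2 I) in W′ηt amounts to the inequality wt,1 r + wt,2 ≤ ηt (r) for all r ≥ 0. A quick computation shows that wt,1 = ηt′(1) and wt,2 = −wt,1 , so the line r → wt,1 r + wt,2 is the tangent of the convex function ηt at r = 1; this tangent interpretation is an observation added here, not a statement from the paper. The sketch below checks the inequality on a grid.

```python
def eta(r, t):
    """eta_t(r) = r/(1+t) - r/(t+r)."""
    return r / (1 + t) - r / (t + r)

def w1(t):
    """w_{t,1} = (1+t)^{-1} - t (1+t)^{-2}."""
    return 1 / (1 + t) - t / (1 + t) ** 2

def w2(t):
    """w_{t,2} = -(1+t)^{-2}."""
    return -1 / (1 + t) ** 2

def gap(r, t):
    """eta_t(r) minus the affine minorant w_{t,1} r + w_{t,2}."""
    return eta(r, t) - (w1(t) * r + w2(t))

# Grid over r in [0, 10] and a few values of t > 0 (arbitrary illustrative choices).
worst = min(gap(k / 10, t) for t in (0.1, 1.0, 5.0) for k in range(0, 101))
print(worst)          # non-negative up to rounding; the gap vanishes at r = 1
```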
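\int_{(0,\infty)} w_{t,\theta}\, d\mu(t) < \infty implies
\[
\lim_{t_1 \downarrow 0} \int_{(0,t_1)} w_{t,\theta}\, d\mu(t) = 0, \qquad \theta \in \{1, 2\}.
\]
Also, since \int_{(0,\infty)} \eta_t(r)\, d\mu(t) < \infty for all r,
\[
\begin{aligned}
\int_{(t_1,\infty)} \Lambda_1^{\eta_t}(\rho)\, d\mu(t) + \int_{(t_1,\infty)} \Lambda_2^{\eta_t}(I)\, d\mu(t)
&= \int_{(t_1,\infty)} \eta_t(\rho)\, d\mu(t)\\
&\to \int_{(0,\infty)} \eta_t(\rho)\, d\mu(t) = \int_{(0,\infty)} g_{\eta_t}(\rho, I)\, d\mu(t), \quad \text{as } t_1 \downarrow 0.
\end{aligned}
\]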
Next, we move on to the case (ii), where fˆ (0) < ∞ and σ ≥ 0. We apply
the above composition for {ρ̃, σ} (recall supp ρ̃ ⊂ supp σ), and use Lemma 9.1.
In this case, since f (0) < ∞, r0 = 0 and thus there is no need to suppose
∥Λ1,0 ∥cb < ∞.
When f (0) = ∞, fˆ (0) < ∞, ρ > 0 and σ ≥ 0, an almost parallel composition is
possible after replacing f by fˆ. However, if f (0) = ∞, fˆ (0) = ∞, and
dim H = ∞, the argument in the previous section is not valid, since the spectrum
of σ −1/2 ρσ −1/2 does not have a finite gap from 0 in general. Nor is any
Loewner-type integral formula available to the present author.
11 On Operator Convexity
From this section, we again come back to our usual set up, where dim H < ∞.
When σ = I, by (41) and (42), (Λf1 , Λf2 ) ∈ Wf (H) is equivalent to
\[
f(r)\, X - \frac{1}{2} \big( X f(\rho) + f(\rho) X \big) - Df(\rho) \Big[ rX - \frac{1}{2} ( X\rho + \rho X ) \Big] \ge 0, \qquad \forall r \ge 0. \tag{51}
\]
Theorem 12.4 gives the following characterization of operator convexity:
Claim 11.1 A function f with (FC) is operator convex iff (51) holds for all
ρ, X ∈ B≥ (H).
When X commutes with ρ and is invertible, this is equivalent to
f (rI) − f (ρ) ≥ f ′ (ρ) (rI − ρ) .
Proof. Since "only if" has already been shown, we show "if". By Lemma 7.1,
for each simultaneous decomposition of {ρ, I},
\[
\sum_{x\in\mathcal{X}} g_f(p_x, q_x)\, s_x \ge \Lambda_1^{f}(\rho) + \Lambda_2^{f}(I) = f(\rho),
\]
where the identity holds by defining qx = 1, ∀x ∈ X , and letting
\sum_{x\in\mathcal{X}} p_x s_x = \rho be the spectral decomposition of ρ. Thus,
f (ρ) = gf (ρ, I) .
Since the RHS is a convex function, so is the LHS.
12 Generalized f - divergence
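Define the generalized f-divergence, with A, ρ, σ ∈ B≥ (H), by
\[
D_{f,A}(\rho\|\sigma) := \inf\Big\{ \sum_{x\in\mathcal{X}} g_f(p_x, q_x) \operatorname{tr} A s_x \,;\ \{s_x, p_x, q_x\}_{x\in\mathcal{X}} \text{ is a simultaneous decomposition of } \{\rho, \sigma\} \Big\}.
\]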
Since it is a scalar quantity and gf (px , qx ) is bounded from below, Df,A (ρ∥σ) >
−∞. As easily verified, if f is operator convex,
Df,A (ρ∥σ) = tr Agf (ρ, σ) .
Theorem 12.1 Let |I| < ∞.
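(ii) If ci ∈ R≥ (∀i ∈ I) and \sum_i c_i = 1,
\[
D_{f,A}\Big( \sum_{i\in I} c_i \rho_i \,\Big\|\, \sum_{i\in I} c_i \sigma_i \Big) \le \sum_{i\in I} c_i\, D_{f,A}(\rho_i \| \sigma_i).
\]
(v) If C −1 exists,
\[
D_{f,A}\big( C\rho C^\dagger \,\|\, C\sigma C^\dagger \big) = D_{f,\,C^\dagger A C}(\rho\|\sigma). \tag{52}
\]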
Proof. (i)-(vi) are proved almost in parallel with the analogous assertion for
operator perspective gf (ρ, σ).
(vii) is immediate consequence of (iii) of Theorem 3.3. To obtain (viii),
combine (v), (vii) and the decompositions (12) and (13).
If supp ρ ⊂ supp σ,
\[
D_{f,A}(\rho\|\sigma) \le \operatorname{tr} A\, \sigma^{1/2} f\big( \sigma^{-1/2} \rho\, \sigma^{-1/2} \big)\, \sigma^{1/2}. \tag{55}
\]
Theorem 12.2 Suppose f satisfies (FC). Then Df,A (ρ∥σ) < ∞ only in the
following four cases.
(i) fˆ (0) < ∞ and f (0) < ∞;
(ii) fˆ (0) = ∞ , f (0) < ∞, and A (ρ − ρ̃) = 0 ;
(iii) fˆ (0) < ∞, f (0) = ∞, and A (σ − σ̃) = 0 ;
(iv) fˆ (0) = ∞ , f (0) = ∞, and A (ρ − ρ̃) = A (σ − σ̃) = 0.
Proof. By (54), it suffices to show Df,A (ρ̃∥σ̃) < ∞, which is the consequence
of (55).
where
Wf,A := {rW1 + W2 ≤ f (r) A, ∀r ≥ 0} .
The proof is almost the same as the one of Theorem in [7], thus omitted.
References
[1] Bhatia, R.: Matrix Analysis. Springer, Berlin (1996)
[2] Bhatia, R.: Positive Definite Matrices. Princeton (2007)
[3] Ebadian, A., Nikoufar, I., Gordji, M. E.: Perspectives of matrix convex
functions. Proc. Natl. Acad. Sci. USA 108(18), 7313–7314 (2011)
[4] Effros, E., Hansen, F.: Non-commutative perspectives. Ann. Funct. Anal.
5(2), 74–79 (2014)
[5] Hiai, F., Mosonyi, M., Petz, D., Beny, C.: Quantum f-divergences and
error correction. Rev. Math. Phys. 23, 691–747 (2011)
[6] Luenberger, D. G.: Optimization by Vector Space Methods. Wiley, New York
(1969)
[7] Matsumoto, K.: A new quantum version of f-divergence. arXiv:1311.4722
(2013)
[8] Rockafellar, R. T.: Convex Analysis. Princeton (1970)
[9] Ryan, R. A.: Introduction to Tensor Products of Banach Spaces. Springer,
Berlin (2002)
[10] Zhang, F. (ed.): The Schur Complement and Its Applications. Springer,
Berlin (2005)
implies
\[
X \ge C Y^{-1} C^\dagger, \qquad Y \ge C^\dagger X^{-1} C. \tag{58}
\]
Conversely, if X ≥ CY −1 C † and Y ≥ 0, then (57) holds.
Remark A.3 In [2] and [10], they suppose X > 0 and/or Y > 0. However,
since the range of C and C † is a subspace of supp X and supp Y respectively,
existence of ker X and ker Y does not cause any problem.