
Operator Perspective Function as Solution to Optimization Problems
Keiji Matsumoto
Quantum Computation Group, National Institute of Informatics,
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430,
e-mail:keiji@nii.ac.jp
February 6, 2018

Abstract
The perspective $g_f(\lambda_1,\lambda_2) := \lambda_2 f(\lambda_1/\lambda_2)$ of a convex function $f$ is a
positively homogeneous convex function from $\mathbb{R}_{\geq}\times\mathbb{R}_{\geq}$ to $\mathbb{R}$. Its operator
version has been proposed and studied by several authors [3][4] for operator
convex $f$. This paper characterizes the quantity as the solution to a
minimization problem and to a maximization problem, when the underlying Hilbert
space is finite dimensional. These characterizations yield new proofs of known
properties of the quantity and an explicit representation of the quantity when
the arguments are not full-rank. In addition, when the overlap of the supports
of the two arguments is one-dimensional, the quantity can be defined for any
convex, not necessarily operator convex, function $f$.

1 Introduction
The perspective $g_f(\lambda_1,\lambda_2) := \lambda_2 f(\lambda_1/\lambda_2)$ of a convex function $f$ is a
positively homogeneous convex function from $\mathbb{R}_{\geq}\times\mathbb{R}_{\geq}$ to $\mathbb{R}$. Its operator
version has been proposed and studied by several authors [3][4] for operator
convex $f$. This paper characterizes the quantity as the solution to a certain
minimization problem and to a maximization problem, when the underlying Hilbert
space is finite dimensional. These characterizations yield new proofs of known
properties of the quantity and an explicit representation of the quantity when
the arguments are not full-rank. In addition, when the overlap of the supports
of the two arguments is one-dimensional, the quantity can be defined for any
convex, not necessarily operator convex, function $f$.
In this paper, we define the operator perspective function as the solution to a
minimization problem, and later show that this definition coincides with the one
in [3][4]. We do so because its properties, representations, and extensions are
easier to prove from the new definition. Another motivation for this definition
is its link to the maximal non-commutative $f$-divergence, which is defined by a
minimization problem and coincides with $\mathrm{tr}\, g_f(\rho,\sigma)$ when $f$ is operator convex.
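We now summarize the new results. When $f$ is operator convex and
$\rho \not> 0$, $\sigma \not> 0$,
\[
\begin{aligned}
g_f(\rho,\sigma) &= g_f(\tilde\rho,\tilde\sigma) + f(0)\,(\sigma-\tilde\sigma) + \hat f(0)\,(\rho-\tilde\rho) \\
&= g_f(\tilde\rho,\sigma) + \hat f(0)\,(\rho-\tilde\rho) \\
&= \lim_{\varepsilon\downarrow 0} g_f(\rho+\varepsilon X, \sigma+\varepsilon X).
\end{aligned}
\]
Here $\hat f(r) := r f(1/r)$ and $X > 0$. Also, $\tilde\rho$ and $\tilde\sigma$ are the largest self-adjoint
operators supported on $\mathrm{supp}\,\rho\cap\mathrm{supp}\,\sigma$ with $\tilde\rho\le\rho$ and $\tilde\sigma\le\sigma$. ($\tilde\rho$ is in fact
the Schur complement $\rho_{11}-\rho_{12}\rho_{22}^{-1}\rho_{21}$, where the block components are defined
according to the decomposition $\mathrm{supp}\,\sigma\oplus\ker\sigma$.) This formula is in fact valid for
any convex, not necessarily operator convex, function $f$ provided
$\mathrm{supp}\,\rho\cap\mathrm{supp}\,\sigma$ is one-dimensional.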
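In particular, if $\rho$ and $\sigma$ are rank-1,
\[
g_f(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) =
\begin{cases}
f(r)\,|\phi\rangle\langle\phi|, & \text{if } |\psi\rangle\langle\psi| = r\,|\phi\rangle\langle\phi| \text{ for some } r\ge 0, \\
f(0)\,|\phi\rangle\langle\phi| + \hat f(0)\,|\psi\rangle\langle\psi|, & \text{otherwise.}
\end{cases}
\]
In fact, it turns out that $g_f$ is the largest convex function satisfying the above.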
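Also, we show the dual representation
\[
g_f(\rho,\sigma) = \sup_{(\Lambda_1,\Lambda_2)\in\mathcal{W}_f(\mathcal{H})} \{\Lambda_1(\rho)+\Lambda_2(\sigma)\}, \tag{1}
\]
where
\[
\mathcal{W}_f(\mathcal{H}) := \{(\Lambda_1,\Lambda_2)\,;\ p\Lambda_1+q\Lambda_2 \le g_f(p,q)\,\mathbf{I},\ \forall p,q\ge 0\}. \tag{2}
\]
Here, $\Lambda\ge 0$ ($\Lambda\in\mathcal{B}(\mathcal{B}(\mathcal{H}))$) means $\Lambda(X)\ge 0$ for any $X\in\mathcal{B}_{\geq}(\mathcal{H})$, and
$\mathbf{I}\in\mathcal{B}(\mathcal{B}(\mathcal{H}))$ is the identity map. This is a non-commutative version of the
representation of $g_f(\lambda_1,\lambda_2)$ as a pointwise supremum of linear functions.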
Using this representation, we show continuity properties of $g_f$. Also, as a
by-product of the proof of the dual representation, a necessary and sufficient
condition for operator convexity is given (51).
Finally, we investigate the properties of the generalized maximal $f$-divergence
$D_{f,A}(\rho\|\sigma)$. This quantity coincides with the maximal $f$-divergence if $A = I$ and
with $\mathrm{tr}\, A\, g_f(\rho,\sigma)$ if $f$ is operator convex.

2 Notations and Definitions


First, we fix terms and notations. Throughout the paper, we work on a
finite dimensional Hilbert space $\mathcal{H}$. In most cases, the underlying Hilbert space
is not mentioned unless confusion may arise. The space of bounded operators on
$\mathcal{H}$ is denoted by $\mathcal{B}(\mathcal{H})$, and the sets of its self-adjoint elements, positive
elements, and strictly positive elements are denoted by $\mathcal{B}_{sa}(\mathcal{H})$, $\mathcal{B}_{\geq}(\mathcal{H})$,
and $\mathcal{B}_{>}(\mathcal{H})$, respectively.
Also, for each positive operator $X$, denote by $\mathrm{supp}\, X$ its support, and
by $\pi_X$ the projection onto $\mathrm{supp}\, X$. The projection onto the space $\mathcal{K}$ is denoted
by $\pi_{\mathcal{K}}$. The orthogonal complement of the projector $\pi$ is denoted by $\pi^{\perp}$. For each
$X\in\mathcal{B}(\mathcal{H})$, $X^{-1}$ denotes the Moore-Penrose generalized inverse of $X$.
$\mathbb{R}_{\geq}$ and $\mathbb{R}_{>}$ denote the non-negative and the strictly positive half-line,
respectively. As in [8], we suppose that $h$ is a map from $\mathbb{R}^n$ to $\mathbb{R}\cup\{\pm\infty\}$.
Instead of saying that $h$ is not defined on a certain set, we say that
$h(r) = \infty$ on that set. The effective domain of $h$, denoted by $\mathrm{dom}\, h$, is the set
of all $r$ with $h(r) < \infty$. $h$ is said to be convex iff its epigraph, the set
$\mathrm{epi}\, h := \{(r,\lambda)\,;\ \lambda\ge h(r)\}$, is convex. A convex function $h$ is proper iff $h$ is
nowhere $-\infty$ and not $\infty$ everywhere, and is closed iff the set $\{r\,;\ \lambda\ge h(r)\}$ is
closed for any $\lambda$, or equivalently, iff its epigraph is closed (Theorem 7.1 of
[8]), or equivalently, iff it is lower semi-continuous.
Given a convex function $h$, its closure $\mathrm{cl}\, h$ is the greatest closed, or
equivalently, lower semi-continuous (not necessarily finite) function majorized
by $h$. The name comes from the fact that $\mathrm{epi}\,(\mathrm{cl}\, h) = \mathrm{cl}\,(\mathrm{epi}\, h)$. $\mathrm{cl}\, h$ coincides
with $h$ except perhaps at the relative boundary points of its effective domain.
If $h$ is proper and convex, so is $\mathrm{cl}\, h$ (Theorem 7.4, [8]).
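In this paper, unless otherwise mentioned, $f$ satisfies

(FC) $f$ is a proper closed convex function with $(0,\infty)\subset\mathrm{dom}\, f$.

For each $f$ with (FC), its perspective function $g_f$ is
\[
g_f(\lambda_1,\lambda_2) :=
\begin{cases}
\lambda_2 f(\lambda_1/\lambda_2), & \text{if } \lambda_1\in\mathrm{dom}\, f,\ \lambda_2 > 0, \\
\lim_{\lambda_2\downarrow 0}\lambda_2 f(\lambda_1/\lambda_2), & \text{if } \lambda_1\in\mathrm{dom}\, f,\ \lambda_2 = 0, \\
0, & \text{if } \lambda_1=\lambda_2=0, \\
\infty, & \text{otherwise.}
\end{cases} \tag{3}
\]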
Also, for each $r\ge 0$, define
\[
\hat f(r) := g_f(1,r).
\]
Then $\hat f$ also satisfies (FC), and
\[
g_{\hat f}(\lambda_1,\lambda_2) = g_f(\lambda_2,\lambda_1),
\]
and
\[
\hat f(0) = \lim_{\lambda_2\downarrow 0}\lambda_2 f(1/\lambda_2).
\]
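The scalar definition (3) and the identities for $\hat f$ can be checked numerically. The following is a minimal sketch, not part of the paper: the function name `perspective` and its argument layout are assumptions made here for illustration, the caller supplies the boundary values $f(0)$ and $\hat f(0)$ explicitly, and the test function $f(r) = r^2$ is an arbitrary choice.

```python
import math

def perspective(f, f_at_0, f_hat_at_0, lam1, lam2):
    """Scalar perspective g_f(lam1, lam2) in the sense of definition (3).

    f          -- convex function, evaluated only at strictly positive arguments
    f_at_0     -- the value f(0), possibly math.inf
    f_hat_at_0 -- the limit of t * f(1/t) as t -> 0+, possibly math.inf
    """
    if lam1 < 0 or lam2 < 0:
        return math.inf
    if lam1 == 0 and lam2 == 0:
        return 0.0
    if lam2 == 0:
        # lim_{t -> 0+} t * f(lam1 / t) = lam1 * f_hat(0) for lam1 > 0
        return lam1 * f_hat_at_0
    if lam1 == 0:
        return lam2 * f_at_0
    return lam2 * f(lam1 / lam2)


def f_square(r):
    return r * r        # f(r) = r^2, so f(0) = 0 and f_hat(0) = inf


def f_square_hat(r):
    return 1.0 / r      # f_hat(r) = r * f(1/r) = 1/r; its own "hat" is f again


g = perspective(f_square, 0.0, math.inf, 2.0, 4.0)              # 4 * (2/4)^2
g_scaled = perspective(f_square, 0.0, math.inf, 6.0, 12.0)      # both arguments times 3
g_swapped = perspective(f_square_hat, math.inf, 0.0, 4.0, 2.0)  # g_{f_hat}(4, 2)
```

Here `g_scaled` illustrates positive homogeneity and `g_swapped` illustrates $g_{\hat f}(\lambda_1,\lambda_2) = g_f(\lambda_2,\lambda_1)$ for this particular $f$.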

A generalization of the perspective function to $\mathcal{B}_{\geq}(\mathcal{H})\times\mathcal{B}_{\geq}(\mathcal{H})$ has been
introduced and studied by several authors. In this paper, we give another
definition of the quantity via a minimization problem. (These two definitions
turn out to be equivalent.) In this way, various properties of the quantity,
such as convexity, become almost trivial.
For notational simplicity, we extend the partial order "$\le$" in $\mathcal{B}_{sa}(\mathcal{H})$ to
$\mathcal{B}_{sa}(\mathcal{H})\cup\{-\infty,\infty\}$ so that
\[
-\infty < A < \infty
\]
for any element $A\in\mathcal{B}_{sa}(\mathcal{H})$. Also, $A+\infty := \infty$ and $A-\infty := -\infty$ for any
$A\in\mathcal{B}_{sa}(\mathcal{H})$.
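Suppose $\rho,\sigma\in\mathcal{B}_{\geq}(\mathcal{H})$. A simultaneous decomposition of $\{\rho,\sigma\}$ is a family
$\{s_x,p_x,q_x\}_{x\in\mathcal{X}}$, where $s_x\in\mathcal{B}_{\geq}(\mathcal{H})$, $p_x\in\mathbb{R}_{\geq}$, $q_x\in\mathbb{R}_{\geq}$ and $|\mathcal{X}|<\infty$, such that
\[
\rho = \sum_{x\in\mathcal{X}} p_x s_x, \qquad \sigma = \sum_{x\in\mathcal{X}} q_x s_x.
\]
Suppose $\rho,\sigma\in\mathcal{B}_{\geq}(\mathcal{H})$. We define
\[
g_f(\rho,\sigma) := \inf\Big\{ \sum_{x\in\mathcal{X}} g_f(p_x,q_x)\, s_x \,;\ \{s_x,p_x,q_x\}_{x\in\mathcal{X}} \text{ is a simultaneous decomposition of } \{\rho,\sigma\} \Big\},
\]
provided the RHS exists in $\mathcal{B}_{sa}(\mathcal{H})$. If every simultaneous decomposition has
$g_f(p_x,q_x) = \infty$ for some $x\in\mathcal{X}$, we define
\[
g_f(\rho,\sigma) := \infty.
\]
If neither of the above is true, we define
\[
g_f(\rho,\sigma) := -\infty.
\]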
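To make the minimization concrete, consider the one-dimensional case, where $\rho$, $\sigma$ and all $s_x$ are non-negative numbers. The sketch below is an illustration under stated assumptions, not the paper's construction: the helper names `g` and `objective` are made up here, $q_x > 0$ is assumed so that no boundary value of $f$ is needed, and $f(r) = (r-1)^2$ is an arbitrary convex test function. It samples random decompositions and compares their objective value with that of the one-term decomposition $\{1,\rho,\sigma\}$.

```python
import random

def g(f, p, q):
    """Scalar perspective q * f(p / q); this sketch assumes q > 0."""
    return q * f(p / q)

def objective(f, decomposition):
    """Value sum_x g_f(p_x, q_x) * s_x of a decomposition [(s_x, p_x, q_x), ...]."""
    return sum(g(f, p, q) * s for s, p, q in decomposition)

def f(r):
    return (r - 1.0) ** 2   # an arbitrary convex test function

rng = random.Random(0)      # fixed seed so the run is reproducible
gaps = []
for _ in range(200):
    # Random decomposition with 4 terms; rho and sigma are whatever it sums to.
    dec = [(rng.uniform(0.1, 1.0), rng.uniform(0.0, 2.0), rng.uniform(0.1, 2.0))
           for _ in range(4)]
    rho = sum(s * p for s, p, q in dec)
    sigma = sum(s * q for s, p, q in dec)
    trivial = [(1.0, rho, sigma)]           # the one-term decomposition
    gaps.append(objective(f, dec) - objective(f, trivial))

smallest_gap = min(gaps)
```

Up to rounding, `smallest_gap` is non-negative: by convexity and homogeneity of the scalar perspective, no decomposition beats the one-term one in dimension one.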
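It is easy to see that
\[
\forall (p,q)\in\mathbb{R}_{\geq}^2,\ \forall\rho\ge 0,\qquad g_f(p\rho,q\rho) = g_f(p,q)\,\rho. \tag{4}
\]
Also,
\[
g_f(\rho\oplus 0,\sigma\oplus 0) = g_f(\rho,\sigma)\oplus 0, \tag{5}
\]
since any simultaneous decomposition $\{s_x,p_x,q_x\}$ of $\{\rho\oplus 0,\sigma\oplus 0\}$ is of the form
$s_x = s'_x\oplus 0$.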
Remark 2.1 Since
\[
\sum_{x\in\mathcal{X}} g_f(p_x,q_x)\, s_x = \sum_{x\in\mathcal{X}} g_f(p'_x,q'_x)\, s'_x
\]
for any $\{s'_x,p'_x,q'_x\}_{x\in\mathcal{X}}$ with
\[
p_x s_x = p'_x s'_x, \qquad q_x s_x = q'_x s'_x,
\]
we may suppose $\mathrm{tr}\, s_x = 1$ without loss of generality. Then the simultaneous
decomposition can be identified with $(\Gamma, \{p_x,q_x\}_{x\in\mathcal{X}})$, where $\Gamma$ is a trace
preserving positive map from a commutative subalgebra of $\mathcal{B}(\mathcal{H})$ into $\mathcal{B}(\mathcal{H})$. This
interpretation gives an operational meaning to the above minimization problem,
but the restriction $\mathrm{tr}\, s_x = 1$ is rather cumbersome.
To simplify mathematical arguments, we often suppose that $q_x$ is either 0 or 1,
and that $p_x \ne p_{x'}$ for all $x\ne x'$.
Remark 2.2 In defining simultaneous decompositions, we assumed that the
cardinality of $\mathcal{X}$ is finite. This restriction is not essential as long as
$\dim\mathcal{H} < \infty$: by the usual argument using Carathéodory's theorem, one can show
that $|\mathcal{X}| \le 3(\dim\mathcal{H})^2$ is enough for the minimization.
3 Properties
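Lemma 3.1 Let $|I| < \infty$ and consider $\{\rho_i,\sigma_i\}$, $i\in I$. If $g_f(\rho_i,\sigma_i) > -\infty$ for
all $i\in I$,
\[
g_f\Big(\sum_{i\in I}\Lambda_i(\rho_i),\ \sum_{i\in I}\Lambda_i(\sigma_i)\Big) \le \sum_{i\in I}\Lambda_i\big(g_f(\rho_i,\sigma_i)\big)
\]
holds for any positive maps $\Lambda_i$ ($i\in I$) from $\mathcal{B}(\mathcal{H})$ to $\mathcal{B}(\mathcal{H})$.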

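Proof. If the LHS is $-\infty$ or $g_f(\rho_i,\sigma_i) = \infty$ for some $i\in I$, the statement is obvious.
Thus, suppose this is not the case, i.e., the RHS is finite.
Let $\{s_{i,x},p_{i,x},q_{i,x}\}_{x\in\mathcal{X}}$ be a simultaneous decomposition of $\{\rho_i,\sigma_i\}$. Then
$\{\Lambda_i(s_{i,x}),p_{i,x},q_{i,x}\}_{(i,x)\in I\times\mathcal{X}}$ is a simultaneous decomposition of
$\{\sum_{i\in I}\Lambda_i(\rho_i),\ \sum_{i\in I}\Lambda_i(\sigma_i)\}$. Therefore,
\[
\begin{aligned}
g_f\Big(\sum_{i\in I}\Lambda_i(\rho_i),\ \sum_{i\in I}\Lambda_i(\sigma_i)\Big)
&\le \inf\Big\{ \sum_{(i,x)\in I\times\mathcal{X}} g_f(p_{i,x},q_{i,x})\,\Lambda_i(s_{i,x}) \,;\ \{s_{i,x},p_{i,x},q_{i,x}\}_{x\in\mathcal{X}},\ i\in I \Big\} \\
&= \sum_{i\in I}\Lambda_i\Big( \inf\Big\{ \sum_{x\in\mathcal{X}} g_f(p_{i,x},q_{i,x})\, s_{i,x} \,;\ \{s_{i,x},p_{i,x},q_{i,x}\}_{x\in\mathcal{X}} \Big\} \Big) \\
&= \sum_{i\in I}\Lambda_i\big(g_f(\rho_i,\sigma_i)\big),
\end{aligned}
\]
where in the third line $\{s_{i,x},p_{i,x},q_{i,x}\}_{x\in\mathcal{X}}$ ranges over all the simultaneous
decompositions of $\{\rho_i,\sigma_i\}$ for each $i\in I$.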

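Corollary 3.2 (i) If $c_i\in\mathbb{R}_{\geq}$ ($\forall i\in I$) and $g_f(\rho_i,\sigma_i) > -\infty$ ($\forall i\in I$),
\[
g_f\Big(\sum_{i\in I} c_i\rho_i,\ \sum_{i\in I} c_i\sigma_i\Big) \le \sum_{i\in I} c_i\, g_f(\rho_i,\sigma_i).
\]

(ii) If $g_f(\rho,\sigma) > -\infty$, for any positive map $\Lambda$,
\[
g_f(\Lambda(\rho),\Lambda(\sigma)) \le \Lambda(g_f(\rho,\sigma)). \tag{6}
\]

(iii) Suppose $f(0)\le 0$ and $g_f(\rho,\sigma) > -\infty$. Then for all $\sigma'\in\mathcal{B}_{\geq}(\mathcal{H})$,
\[
g_f(\rho,\sigma) \ge g_f(\rho,\sigma+\sigma'). \tag{7}
\]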

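Proof. (i), (ii): trivial.
(iii): With $I := \{1,2\}$, $\Lambda_1 = \Lambda_2 := \mathbf{I}$, $\rho_1 := \rho$, $\rho_2 := 0$, $\sigma_1 := \sigma$ and $\sigma_2 := \sigma'$,
Lemma 3.1 gives
\[
g_f(\rho,\sigma+\sigma') \le g_f(\rho,\sigma) + g_f(0,\sigma') = g_f(\rho,\sigma) + f(0)\,\sigma' \le g_f(\rho,\sigma).
\]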
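Corollary 3.2 (i) can be sanity-checked numerically in a case where $g_f$ has a simple closed form. The sketch below assumes the formula $g_f(\rho,\sigma) = \sigma^{1/2} f(\sigma^{-1/2}\rho\sigma^{-1/2})\sigma^{1/2}$ (established later in the paper for operator convex $f$ and $\mathrm{supp}\,\rho\subset\mathrm{supp}\,\sigma$), which for $f(r) = r^2$ and $\sigma > 0$ reduces to $\rho\sigma^{-1}\rho$. The helper names and the sample $2\times 2$ real matrices are hypothetical choices made for illustration.

```python
def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def lin(c1, a, c2, b):
    """Linear combination c1 * a + c2 * b."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]

def g_square(rho, sigma):
    """rho sigma^{-1} rho: assumed value of g_f(rho, sigma) for f(r) = r^2, sigma > 0."""
    return mul(rho, mul(inv(sigma), rho))

def is_psd(a, tol=1e-9):
    """A symmetric 2x2 matrix is positive semidefinite iff trace and determinant are >= 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return trace >= -tol and det >= -tol

rho1, sigma1 = [[2.0, 1.0], [1.0, 1.0]], [[1.0, 0.0], [0.0, 2.0]]
rho2, sigma2 = [[1.0, 0.0], [0.0, 3.0]], [[3.0, 1.0], [1.0, 1.0]]

# Corollary 3.2 (i) with c_1 = c_2 = 1/2: the mixture is dominated by the average.
lhs = g_square(lin(0.5, rho1, 0.5, rho2), lin(0.5, sigma1, 0.5, sigma2))
rhs = lin(0.5, g_square(rho1, sigma1), 0.5, g_square(rho2, sigma2))
gap = lin(1.0, rhs, -1.0, lhs)
```

For this pair, `gap` is positive semidefinite, as the corollary predicts.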

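Theorem 3.3

(i) If $C^{-1}$ exists,
\[
g_f(C\rho C^{\dagger}, C\sigma C^{\dagger}) = C\, g_f(\rho,\sigma)\, C^{\dagger}. \tag{8}
\]

(ii) Suppose $\rho_i,\sigma_i\in\mathcal{B}_{\geq}(\mathcal{H}_i)$ ($i\in\{1,2\}$) satisfy $g_f(\rho_i,\sigma_i) > -\infty$ ($\forall i\in\{1,2\}$) and
$g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2) > -\infty$. Then
\[
g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2) = g_f(\rho_1,\sigma_1)\oplus g_f(\rho_2,\sigma_2). \tag{9}
\]

(iii) Consider $\rho_1,\sigma_1\in\mathcal{B}_{\geq}(\mathcal{H}_1)$, $\rho_2\in\mathcal{B}_{\geq}(\mathcal{H}_2)$. If $\{s'_x,p'_x,q'_x\}_{x\in\mathcal{X}'}$ is
a simultaneous decomposition of $\{\rho_1\oplus\rho_2,\sigma_1\oplus 0\}$, then there is another
simultaneous decomposition $\{s_x,p_x,q_x\}_{x\in\mathcal{X}}$ such that $q_x\ne 0$ iff $x\in\mathcal{X}_0\subset\mathcal{X}$,
\[
s_x :=
\begin{cases}
s_{x,1}\oplus 0, & x\in\mathcal{X}_0, \\
0\oplus s_{x,2}, & x\in\mathcal{X}\setminus\mathcal{X}_0,
\end{cases}
\]
and
\[
\sum_{x\in\mathcal{X}'} g_f(p'_x,q'_x)\, s'_x = \sum_{x\in\mathcal{X}} g_f(p_x,q_x)\, s_x.
\]
Accordingly,
\[
g_f(\rho_1\oplus\rho_2,\sigma_1\oplus 0) = g_f(\rho_1,\sigma_1)\oplus\hat f(0)\,\rho_2. \tag{10}
\]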

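Proof. (i): Applying Corollary 3.2 (ii) with $\Lambda(X) := CXC^{\dagger}$, we have "$\le$". The
opposite inequality is proved by applying Corollary 3.2 (ii) with
$\Lambda(X) := C^{-1}XC^{-1\dagger}$.
(ii): Letting the $\Lambda_i$ be the conjugations by the isometric embeddings of $\mathcal{H}_i$ into
$\mathcal{H}_1\oplus\mathcal{H}_2$, we obtain "$\le$":
\[
g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2) \le g_f(\rho_1,\sigma_1)\oplus g_f(\rho_2,\sigma_2). \tag{11}
\]
Let $P_i$ be the projection onto $\mathcal{H}_i$, and let $\{s_x,p_x,q_x\}_{x\in\mathcal{X}}$ be an optimal simultaneous
decomposition of $\{\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2\}$. Then $\{P_i s_x P_i,p_x,q_x\}_{x\in\mathcal{X}}$ is a simultaneous
decomposition of $\{\rho_i,\sigma_i\}$. Therefore,
\[
P_i\, g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2)\, P_i \ge g_f(\rho_i,\sigma_i),
\]
which, by (11), indicates
\[
P_i\, g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2)\, P_i = g_f(\rho_i,\sigma_i),
\]
and
\[
\sum_{i\in\{1,2\}} P_i\, g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2)\, P_i = g_f(\rho_1,\sigma_1)\oplus g_f(\rho_2,\sigma_2).
\]
This identity and the inequality (11) are both true only if
$P_i\,\{g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2)\}\,P_j = 0$ for all $i\ne j$. Thus the LHS of the above identity
equals $g_f(\rho_1\oplus\rho_2,\sigma_1\oplus\sigma_2)$, and we have the assertion.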
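(iii): Let $\{s_x,p_x,q_x\}_{x\in\mathcal{X}}$ be a simultaneous decomposition of $\{\rho_1\oplus\rho_2,\sigma_1\oplus 0\}$.
Let $\mathcal{X}_0 := \{x\,;\ \pi_{\mathcal{H}_2} s_x \pi_{\mathcal{H}_2} = 0\}$. Then $q_x\ne 0$ iff $x\in\mathcal{X}_0$. Also, with
$\rho'_1\oplus 0 := \sum_{x\in\mathcal{X}_0} p_x s_x$,
\[
\sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} p_x s_x = (\rho_1-\rho'_1)\oplus\rho_2.
\]
Therefore,
\[
\begin{aligned}
\sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} g_f(p_x,q_x)\, s_x
&= \hat f(0) \sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} p_x s_x = \hat f(0)\,\{(\rho_1-\rho'_1)\oplus\rho_2\} \\
&= \sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} g_f(p_x,q_x)\,\big(\pi_{\mathcal{H}_1} s_x \pi_{\mathcal{H}_1}\oplus\pi_{\mathcal{H}_2} s_x \pi_{\mathcal{H}_2}\big).
\end{aligned}
\]
Therefore, without loss of generality, we suppose $s_x = s_{x,1}\oplus s_{x,2}$. Since
\[
\sum_{x\in\mathcal{X}} g_f(p_x,q_x)\,(s_{x,1}\oplus s_{x,2}) = \sum_{x\in\mathcal{X}} g_f(p_x,q_x)\,(s_{x,1}\oplus 0) + \sum_{x\in\mathcal{X}} g_f(p_x,q_x)\,(0\oplus s_{x,2}),
\]
without loss of generality, we suppose that $s_x$ is of the following form:
\[
s_x :=
\begin{cases}
s_{x,1}\oplus 0, & x\in\mathcal{X}_0, \\
0\oplus s_{x,2}, & x\in\mathcal{X}\setminus\mathcal{X}_0.
\end{cases}
\]
This means that $q_x$ is non-zero iff $x\in\mathcal{X}_0$, and
\[
\sum_{x\in\mathcal{X}_0} p_x s_{x,1} = \rho_1, \qquad \sum_{x\in\mathcal{X}_0} q_x s_{x,1} = \sigma_1, \qquad \sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} p_x s_{x,2} = \rho_2.
\]
Since $\{s_{x,1},p_x,q_x\}_{x\in\mathcal{X}_0}$ is a simultaneous decomposition of $\{\rho_1,\sigma_1\}$ and
$\sum_{x\in\mathcal{X}\setminus\mathcal{X}_0} g_f(p_x,0)\, s_{x,2} = \hat f(0)\,\rho_2$, we have (10).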

4 Canonical Forms
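(8) reduces the computation of $g_f(\rho,\sigma)$ for arbitrary $\{\rho,\sigma\}$ to that for pairs in a
certain canonical form.
Denote
\[
\rho = \begin{bmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{bmatrix}, \qquad
C_1 := \begin{bmatrix} I & -\rho_{12}\rho_{22}^{-1} \\ 0 & I \end{bmatrix},
\]
where the first block corresponds to $\mathrm{supp}\,\sigma$ and the second to $\ker\sigma$. Then
\[
C_1\rho C_1^{\dagger} = \begin{bmatrix} \tilde\rho & 0 \\ 0 & \rho_{22} \end{bmatrix}, \qquad
C_1\sigma C_1^{\dagger} = \sigma = \begin{bmatrix} \sigma & 0 \\ 0 & 0 \end{bmatrix}, \tag{12}
\]
where
\[
\tilde\rho := \rho/\rho_{22} = \rho_{11} - \rho_{12}\rho_{22}^{-1}\rho_{21}.
\]
Hereafter, for notational simplicity, a block component of a matrix and its
embedding into $\mathcal{B}(\mathcal{H})$ are denoted by the same symbol. For example, the 1-1
component of $\sigma$ is also denoted by $\sigma$. Also, later $\rho_{22}$ denotes the linear operator
$V\rho_{22}V^{\dagger}$, where $V$ is an isometry from $\ker\sigma$ into $\mathcal{H}$. Also, $\rho/\rho_{22}$ denotes the Schur
complement [10].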
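The block transformation (12) is easy to verify in the smallest case, where every block is a $1\times 1$ real number. The sketch below is only an illustration: the helper names and the sample matrices are assumptions made here, and real symmetric matrices are used so that $C_1^{\dagger}$ is the transpose.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def reduce_by_c1(rho):
    """Return (C1, C1 rho C1^T) for a real symmetric 2x2 rho with rho[1][1] > 0."""
    c1 = [[1.0, -rho[0][1] / rho[1][1]], [0.0, 1.0]]
    return c1, matmul(c1, matmul(rho, transpose(c1)))

rho = [[2.0, 1.0], [1.0, 4.0]]
sigma = [[3.0, 0.0], [0.0, 0.0]]      # supp sigma is the first coordinate axis

c1, rho_reduced = reduce_by_c1(rho)
sigma_reduced = matmul(c1, matmul(sigma, transpose(c1)))

# Schur complement rho_11 - rho_12 rho_22^{-1} rho_21
schur = rho[0][0] - rho[0][1] * rho[1][0] / rho[1][1]
```

After the transformation, `rho_reduced` is block diagonal with the Schur complement in the top-left corner, and `sigma_reduced` coincides with `sigma`.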
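Next, we apply the following transform, analogous to the one above, to the pair
$\{\tilde\rho,\sigma\}$: we decompose $\mathrm{supp}\,\sigma$ into $\ker\tilde\rho\oplus\mathrm{supp}\,\tilde\rho$, so that
\[
C_1\rho C_1^{\dagger} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \tilde\rho & 0 \\ 0 & 0 & \rho_{22} \end{bmatrix}, \qquad
\sigma = \begin{bmatrix} \sigma_{00} & \sigma_{01} & 0 \\ \sigma_{10} & \sigma_{11} & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
C_2 := \begin{bmatrix} I & 0 & 0 \\ -\sigma_{10}\sigma_{00}^{-1} & I & 0 \\ 0 & 0 & I \end{bmatrix},
\]
where the three blocks correspond to $\ker\tilde\rho$, $\mathrm{supp}\,\tilde\rho$, and $\ker\sigma$, in this order.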

Remark 4.1 Here, σ11 denotes πρ̃ σπρ̃ , and not πσ σπσ . The rule for σ is not
quite in accordance with the notation ρ11 = πσ ρπσ .

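Then, with $C := C_2C_1$,
\[
C\rho C^{\dagger} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \tilde\rho & 0 \\ 0 & 0 & \rho_{22} \end{bmatrix}, \qquad
C\sigma C^{\dagger} = \begin{bmatrix} \sigma_{00} & 0 & 0 \\ 0 & \tilde\sigma & 0 \\ 0 & 0 & 0 \end{bmatrix}, \tag{13}
\]
where
\[
\tilde\sigma := \sigma/\sigma_{00} = \sigma_{11} - \sigma_{10}\sigma_{00}^{-1}\sigma_{01}.
\]
By Theorem 5.6 of [10],
\[
\mathrm{supp}\,\tilde\rho = \mathrm{supp}\,\tilde\sigma = \mathrm{supp}\,\rho\cap\mathrm{supp}\,\sigma.
\]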

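It is easy to check that, for any $X\in\mathcal{B}(\mathrm{supp}\,\rho\cap\mathrm{supp}\,\sigma)$ and for any $Y\in\mathcal{B}(\mathrm{supp}\,\sigma)$,
\[
\begin{aligned}
CXC^{\dagger} &= X, && (14) \\
CYC^{\dagger} &\in \mathcal{B}(\mathrm{supp}\,\sigma), && (15) \\
C_1YC_1^{\dagger} &= Y. && (16)
\end{aligned}
\]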

Remark 4.2 $\tilde\rho\in\mathcal{B}_{\geq}(\mathcal{H})$ ($\tilde\sigma\in\mathcal{B}_{\geq}(\mathcal{H})$, resp.) is the largest operator dominated
by $\rho$ (by $\sigma$, resp.) and supported on $\mathrm{supp}\,\sigma$ ($\mathrm{supp}\,\rho$, resp.) (Theorem 5.3, [10]).
By (14) and (15), to show this, we only have to show that $\tilde\rho$ is the largest
operator dominated by $C\rho C^{\dagger}$ and supported on $\mathrm{supp}\,\sigma$, which is easily verified
by (13).

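Interchanging $\rho$ and $\sigma$ above, we arrive at the other canonical forms,
\[
C'\rho C'^{\dagger} = \begin{bmatrix} \rho'_{00} & 0 & 0 \\ 0 & \tilde\rho' & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
C'\sigma C'^{\dagger} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \tilde\sigma' & 0 \\ 0 & 0 & \sigma'_{11} \end{bmatrix}.
\]
By the above remark, $\tilde\rho' = \tilde\rho$ and $\tilde\sigma' = \tilde\sigma$. However, $\rho'_{00}$ and $\sigma'_{11}$ are not unitarily
equivalent to $\rho_{11}$ and $\sigma_{00}$, respectively.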
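By (8) and (10),
\[
\begin{aligned}
C\, g_f(\rho,\sigma)\, C^{\dagger} &= g_f(0\oplus\tilde\rho\oplus\rho_{22},\ \sigma_{00}\oplus\tilde\sigma\oplus 0) \\
&= f(0)\,\sigma_{00}\oplus g_f(\tilde\rho,\tilde\sigma)\oplus\hat f(0)\,\rho_{22}.
\end{aligned} \tag{17}
\]
Therefore, by (14),
\[
\begin{aligned}
g_f(\rho,\sigma) &= C^{-1}\big\{ f(0)\,\sigma_{00}\oplus g_f(\tilde\rho,\tilde\sigma)\oplus\hat f(0)\,\rho_{22} \big\} C^{\dagger -1} \\
&= g_f(\tilde\rho,\tilde\sigma) + f(0)\, C^{-1}\sigma_{00}C^{\dagger -1} + \hat f(0)\, C^{-1}\rho_{22}C^{\dagger -1} \\
&= g_f(\tilde\rho,\tilde\sigma) + f(0)\, C^{-1}\big(C\sigma C^{\dagger}-\tilde\sigma\big)C^{\dagger -1} + \hat f(0)\, C^{-1}\big(C\rho C^{\dagger}-\tilde\rho\big)C^{\dagger -1} \\
&= g_f(\tilde\rho,\tilde\sigma) + f(0)\,(\sigma-\tilde\sigma) + \hat f(0)\,(\rho-\tilde\rho).
\end{aligned} \tag{18}
\]
Almost analogously,
\[
\begin{aligned}
g_f(\rho,\sigma) &= C_1^{-1}\big\{ g_f(\tilde\rho,\sigma) + \hat f(0)\,\rho_{22} \big\} C_1^{\dagger -1} \\
&= g_f(\tilde\rho,\sigma) + \hat f(0)\,(\rho-\tilde\rho).
\end{aligned} \tag{19}
\]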


Due to the decomposition (18), the following theorem is almost immediate.
Theorem 4.3 Suppose f satisfies (FC) and gf (ρ, σ) > −∞. Then gf (ρ, σ) <
∞ only in the following four cases.
(i) fˆ (0) < ∞ and f (0) < ∞;
(ii) fˆ (0) = ∞, f (0) < ∞, and supp ρ ⊂ supp σ;
(iii) fˆ (0) < ∞ , f (0) = ∞, and supp ρ ⊃ supp σ;
(iv) fˆ (0) = ∞ , f (0) = ∞, and supp ρ = supp σ.
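Proof. By (17), it suffices to show $g_f(\tilde\rho,\tilde\sigma) < \infty$. Observe that $\mathrm{supp}\,\tilde\rho = \mathrm{supp}\,\tilde\sigma$.
Let $\sum_{x\in\mathcal{X}} p_x s_x = \tilde\sigma^{-1/2}\tilde\rho\,\tilde\sigma^{-1/2}$ be the spectral decomposition of $\tilde\sigma^{-1/2}\tilde\rho\,\tilde\sigma^{-1/2}$,
and define $q_x := 1$. Then $\{s_x,p_x,q_x\}$ is a simultaneous decomposition of
$\{\tilde\sigma^{-1/2}\tilde\rho\,\tilde\sigma^{-1/2}, I\}$. Therefore, if $g_f(\rho,\sigma) > -\infty$,
\[
\begin{aligned}
g_f(\tilde\rho,\tilde\sigma) &= \tilde\sigma^{1/2}\, g_f\big(\tilde\sigma^{-1/2}\tilde\rho\,\tilde\sigma^{-1/2}, I\big)\,\tilde\sigma^{1/2} \\
&\le \tilde\sigma^{1/2} \sum_{x\in\mathcal{X}} f(p_x)\, s_x\ \tilde\sigma^{1/2} \\
&= \tilde\sigma^{1/2}\, f\big(\tilde\sigma^{-1/2}\tilde\rho\,\tilde\sigma^{-1/2}\big)\,\tilde\sigma^{1/2} < \infty.
\end{aligned}
\]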
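For commuting (diagonal) operators, the four cases of Theorem 4.3 can be cross-checked against an entrywise computation. The sketch below rests on assumptions made here for illustration: operators are represented by lists of diagonal entries, the direct-sum rule (9) and the scalar definition (3) are used to evaluate $g_f$ entry by entry, and only the finiteness of $f(0)$ and $\hat f(0)$ matters, so they are passed as numbers. The function names are hypothetical.

```python
import itertools
import math

def support(diag):
    """Indices on which a diagonal positive operator is non-zero."""
    return {i for i, v in enumerate(diag) if v > 0}

def finite_by_theorem(f0, fhat0, rho_diag, sigma_diag):
    """Finiteness of g_f(rho, sigma) as predicted by the four cases of Theorem 4.3."""
    s_rho, s_sigma = support(rho_diag), support(sigma_diag)
    if math.isfinite(f0) and math.isfinite(fhat0):
        return True                  # case (i)
    if math.isfinite(f0):
        return s_rho <= s_sigma      # case (ii): supp rho inside supp sigma
    if math.isfinite(fhat0):
        return s_rho >= s_sigma      # case (iii): supp rho contains supp sigma
    return s_rho == s_sigma          # case (iv)

def finite_entrywise(f0, fhat0, rho_diag, sigma_diag):
    """Finiteness of every diagonal entry g_f(rho_i, sigma_i) from definition (3)."""
    for r, s in zip(rho_diag, sigma_diag):
        if r > 0 and s == 0 and not math.isfinite(fhat0):
            return False             # entry equals r * f_hat(0) = inf
        if r == 0 and s > 0 and not math.isfinite(f0):
            return False             # entry equals s * f(0) = inf
    return True

# Exhaustive comparison over all 0/1 diagonal patterns in dimension 3.
mismatches = [
    (f0, fhat0, rho, sigma)
    for f0 in (0.0, math.inf)
    for fhat0 in (1.0, math.inf)
    for rho in itertools.product((0.0, 1.0), repeat=3)
    for sigma in itertools.product((0.0, 1.0), repeat=3)
    if finite_by_theorem(f0, fhat0, rho, sigma) != finite_entrywise(f0, fhat0, rho, sigma)
]
```

In this commuting setting the two predicates agree on every pattern, so `mismatches` is empty.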

Theorem 4.4 If $g_f(\rho,\sigma) > -\infty$ for all $\rho\ge 0$ and some $\sigma > 0$, then $f$ is operator
convex, and
\[
g_f(\rho,\sigma) = \sigma^{1/2}\, f\big(\sigma^{-1/2}\rho\,\sigma^{-1/2}\big)\,\sigma^{1/2}.
\]
Also, if $g_f(\rho,\sigma) > -\infty$ as $\rho\ge 0$ ranges over all rank-$k$ operators, $f$ is $k$-convex.

Note that this theorem does not claim that $g_f(\rho,\sigma) > -\infty$ for every operator convex
function $f$ (which turns out to be the case).
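Proof.
\[
\begin{aligned}
g_f(\rho, I) &= g_f\Big(\sum_{i\in I}\lambda_i\,|e_i\rangle\langle e_i|,\ \sum_{i\in I}|e_i\rangle\langle e_i|\Big) \\
&= \bigoplus_{i\in I} g_f\big(\lambda_i\,|e_i\rangle\langle e_i|,\ |e_i\rangle\langle e_i|\big),
\end{aligned}
\]
where the $i$-th direct summand is a scalar multiple of $|e_i\rangle\langle e_i|$. Thus by (4),
\[
g_f(\rho, I) = \sum_{i\in I} f(\lambda_i)\,|e_i\rangle\langle e_i| = f(\rho).
\]
Since $g_f$ is convex, so is $\rho\to f(\rho)$. Thus, if $\rho$ ranges over all of $\mathcal{B}_{\geq}(\mathcal{H})$, $f$ is
operator convex. If $\rho$ ranges over all rank-$k$ operators, $f$ is $k$-convex.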

5 Closed Formula
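In this section, we prove that our definition of the generalized perspective
function is equivalent to the definition in the literature [3][4]:

Theorem 5.1 Suppose $f$ is operator convex in addition to satisfying (FC),
and $\rho,\sigma\in\mathcal{B}_{\geq}(\mathcal{H})$. Then $g_f(\rho,\sigma) > -\infty$. If $\mathrm{supp}\,\rho\subset\mathrm{supp}\,\sigma$,
\[
g_f(\rho,\sigma) = \sigma^{1/2}\, f\big(\sigma^{-1/2}\rho\,\sigma^{-1/2}\big)\,\sigma^{1/2}. \tag{20}
\]
If $\mathrm{supp}\,\rho\not\subset\mathrm{supp}\,\sigma$,
\[
g_f(\rho,\sigma) = g_f(\tilde\rho,\sigma) + \hat f(0)\,(\rho-\tilde\rho). \tag{21}
\]
The above statements are true for any convex function $f$ with (FC) provided
$\mathrm{supp}\,\rho\cap\mathrm{supp}\,\sigma$ is 1-dimensional.

(Observe that $\tilde\rho$ is positive due to Proposition A.2; in fact, $\tilde\rho^{-1} = \pi_\sigma\rho^{-1}\pi_\sigma$.)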

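Proof. First, suppose $f$ is operator convex and $\mathrm{supp}\,\rho\subset\mathrm{supp}\,\sigma$, and prove
(20). In this case,
\[
g_f(\rho,\sigma) = \sigma^{1/2}\, g_f\big(\sigma^{-1/2}\rho\,\sigma^{-1/2}, \pi_\sigma\big)\,\sigma^{1/2}. \tag{22}
\]
Consider a simultaneous decomposition $\{s_x,p_x,q_x\}_{x\in\mathcal{X}}$ of $\{\sigma^{-1/2}\rho\,\sigma^{-1/2}, \pi_\sigma\}$.
Without loss of generality, we let $q_x = 1$, so that
\[
\sum_{x\in\mathcal{X}} p_x s_x = \sigma^{-1/2}\rho\,\sigma^{-1/2}, \qquad \sum_{x\in\mathcal{X}} s_x = \pi_\sigma.
\]
By the Naimark extension theorem, there are orthogonal projections $\{P_x\}_{x\in\mathcal{X}}$ and
an isometry $V$ from $\mathcal{H}'$ onto $\mathrm{supp}\,\sigma$ such that
\[
s_x = VP_xV^{\dagger}, \qquad \sum_{x\in\mathcal{X}} P_x = I_{\mathcal{H}'}.
\]
Then
\[
\begin{aligned}
\sum_{x\in\mathcal{X}} f(p_x)\, s_x
&= V\Big\{\sum_{x\in\mathcal{X}} f(p_x)\, P_x\Big\}V^{\dagger} = V f\Big(\sum_{x\in\mathcal{X}} p_x P_x\Big)V^{\dagger} \\
&\ge f\Big(V\Big\{\sum_{x\in\mathcal{X}} p_x P_x\Big\}V^{\dagger}\Big) \\
&= f\big(\sigma^{-1/2}\rho\,\sigma^{-1/2}\big),
\end{aligned}
\]
where the inequality in the second line is due to Jensen's inequality
(Proposition A.1), and equality in that inequality is achieved by the $s_x$
corresponding to the spectral decomposition of $\sigma^{-1/2}\rho\,\sigma^{-1/2}$. Therefore, we have (20).
If $f$ is operator convex and $\mathrm{supp}\,\rho\not\subset\mathrm{supp}\,\sigma$, (21) is derived from (19) and (20).
If $\mathrm{supp}\,\rho\cap\mathrm{supp}\,\sigma$ is 1-dimensional, $\tilde\rho$ is a constant multiple of $\tilde\sigma$, since both
operators are supported on the 1-dimensional space $\mathrm{supp}\,\rho\cap\mathrm{supp}\,\sigma$. Thus (18)
and (4) show that $g_f(\rho,\sigma) > -\infty$. Therefore, (19) leads to (21).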

6 Convex Roof Property


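Application of (21) to two rank-1 operators shows
\[
g_f(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) =
\begin{cases}
f(r)\,|\phi\rangle\langle\phi|, & \text{if } |\psi\rangle\langle\psi| = r\,|\phi\rangle\langle\phi| \text{ for some } r\ge 0, \\
f(0)\,|\phi\rangle\langle\phi| + \hat f(0)\,|\psi\rangle\langle\psi|, & \text{otherwise,}
\end{cases} \tag{23}
\]
where $f(0)$ and/or $\hat f(0)$ may be $\infty$. In fact, $g_f(\rho,\sigma)$ is the largest convex
function satisfying (23):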
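Theorem 6.1
\[
g_f(\rho,\sigma) = \inf\Big\{ \sum_x c_x\, g_f\big(|\psi_x\rangle\langle\psi_x|, |\phi_x\rangle\langle\phi_x|\big) \Big\}, \tag{24}
\]
where $\{|\psi_x\rangle, |\phi_x\rangle, c_x\}$ ranges over all the decompositions with
\[
\sum_x c_x\,|\psi_x\rangle\langle\psi_x| = \rho, \qquad \sum_x c_x\,|\phi_x\rangle\langle\phi_x| = \sigma, \qquad \sum_x c_x = 1, \qquad c_x\ge 0.
\]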
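The rank-1 values (23), which serve as the building blocks in (24), can be evaluated directly. The sketch below is an illustration under assumptions made here: vectors are real and two-dimensional, $f(0)$ and $\hat f(0)$ are finite and passed explicitly, the function names are hypothetical, and the test function is $f(r) = |r-1|$, for which $f(0) = \hat f(0) = 1$.

```python
def outer(v):
    """The rank-1 operator |v><v| for a real vector v in R^2."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def g_rank_one(f, f_at_0, f_hat_at_0, psi, phi, tol=1e-12):
    """Right-hand side of (23) for real psi, phi in R^2 with phi != 0.

    Finite f(0) and f_hat(0) are assumed; infinite values are not handled here.
    """
    cross = psi[0] * phi[1] - psi[1] * phi[0]
    if abs(cross) <= tol:
        # |psi><psi| = r |phi><phi| with r = <psi, psi> / <phi, phi>
        r = (psi[0] ** 2 + psi[1] ** 2) / (phi[0] ** 2 + phi[1] ** 2)
        return [[f(r) * entry for entry in row] for row in outer(phi)]
    a, b = outer(phi), outer(psi)
    return [[f_at_0 * a[i][j] + f_hat_at_0 * b[i][j] for j in range(2)]
            for i in range(2)]

def f(r):
    return abs(r - 1.0)     # f(0) = 1 and f_hat(0) = lim t * |1/t - 1| = 1

parallel = g_rank_one(f, 1.0, 1.0, (2.0, 0.0), (1.0, 0.0))     # r = 4, f(r) = 3
orthogonal = g_rank_one(f, 1.0, 1.0, (1.0, 0.0), (0.0, 1.0))   # |phi><phi| + |psi><psi|
```

In the parallel case the result is $3\,|\phi\rangle\langle\phi|$; in the orthogonal case it is the identity matrix.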
x x x

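Proof. In the definition of $g_f(\rho,\sigma)$, we may restrict the simultaneous
decompositions $\{s_x,p_x,q_x\}_{x\in\mathcal{X}}$ to those with $\mathrm{rank}\, s_x = 1$, $\forall x\in\mathcal{X}$. Therefore, "$\ge$" is
obvious. Below, we show "$\le$".
Define
\[
\begin{aligned}
\mathcal{X}_1 &:= \{x\,;\ |\psi_x\rangle\langle\psi_x| = r_x\,|\phi_x\rangle\langle\phi_x|\}, \\
\mathcal{X}_2 &:= \{\xi(x)\,;\ x\in\mathcal{X}\setminus\mathcal{X}_1\},
\end{aligned}
\]
where the map $x\to\xi(x)$ is bijective. Then by (23),
\[
\begin{aligned}
&\sum_{x\in\mathcal{X}} c_x\, g_f\big(|\psi_x\rangle\langle\psi_x|, |\phi_x\rangle\langle\phi_x|\big) \\
&= \sum_{x\in\mathcal{X}_1} c_x\, g_f(r_x,1)\,|\phi_x\rangle\langle\phi_x| + \sum_{x\in\mathcal{X}\setminus\mathcal{X}_1} c_x\,\big(g_f(0,1)\,|\phi_x\rangle\langle\phi_x| + g_f(1,0)\,|\psi_x\rangle\langle\psi_x|\big) \\
&= \sum_{x\in\mathcal{X}_1} g_f(c_xr_x,c_x)\,|\phi_x\rangle\langle\phi_x| + \sum_{x\in\mathcal{X}\setminus\mathcal{X}_1} g_f(0,c_x)\,|\phi_x\rangle\langle\phi_x| + \sum_{x\in\mathcal{X}\setminus\mathcal{X}_1} g_f(c_x,0)\,|\psi_x\rangle\langle\psi_x|.
\end{aligned}
\]
Therefore, there is a simultaneous decomposition $\{s'_x,p'_x,q'_x\}_{x\in\mathcal{X}'}$ of $\{\rho,\sigma\}$ such
that
\[
\sum_{x\in\mathcal{X}'} g_f(p'_x,q'_x)\, s'_x = \sum_{x\in\mathcal{X}} c_x\, g_f\big(|\psi_x\rangle\langle\psi_x|, |\phi_x\rangle\langle\phi_x|\big),
\]
indicating "$\le$".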

7 Dual Representation (1)


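Define $\mathcal{W}_f(\mathcal{H})$ as a subset of $\mathcal{B}(\mathcal{B}(\mathcal{H}))^{\times 2}$ such that
\[
\mathcal{W}_f(\mathcal{H}) := \{(\Lambda_1,\Lambda_2)\,;\ p\Lambda_1+q\Lambda_2 \le g_f(p,q)\,\mathbf{I},\ \forall p,q\ge 0\}. \tag{25}
\]
Here, $\Lambda\ge 0$ ($\Lambda\in\mathcal{B}(\mathcal{B}(\mathcal{H}))$) means $\Lambda(X)\ge 0$ for any $X\in\mathcal{B}_{\geq}(\mathcal{H})$, and
$\mathbf{I}\in\mathcal{B}(\mathcal{B}(\mathcal{H}))$ is the identity map. Below, $\mathcal{W}_f(\mathcal{H})$ is denoted by $\mathcal{W}_f$ unless
confusion may arise. Also, we define
\[
\mathcal{W}'_f(\mathcal{H}) := \{(\Lambda_1,\Lambda_2)\,;\ g_f(p,q)\,\mathbf{I} - p\Lambda_1 - q\Lambda_2 \text{ is completely positive},\ \forall p,q\ge 0\}.
\]
The purpose of the section is to show:
\[
\begin{aligned}
g_f(\rho,\sigma) &= \sup_{(\Lambda_1,\Lambda_2)\in\mathcal{W}_f(\mathcal{H})} \{\Lambda_1(\rho)+\Lambda_2(\sigma)\} && (26) \\
&= \sup_{(\Lambda_1,\Lambda_2)\in\mathcal{W}'_f(\mathcal{H})} \{\Lambda_1(\rho)+\Lambda_2(\sigma)\}. && (27)
\end{aligned}
\]
Here, the RHS's are defined to be $\infty$ when there is no operator $X$ with
$X \ge \Lambda_1(\rho)+\Lambda_2(\sigma)$, $\forall(\Lambda_1,\Lambda_2)\in\mathcal{W}_f(\mathcal{H})$ or $\mathcal{W}'_f(\mathcal{H})$.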

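Lemma 7.1 The LHS of (26) is larger than or equal to the RHS of (26) and
(27).

Proof. If $\{s_x,p_x,q_x\}_{x\in\mathcal{X}}$ is a simultaneous decomposition of $\{\rho,\sigma\}$ and
$(\Lambda_1,\Lambda_2)\in\mathcal{W}_f(\mathcal{H})$ or $\mathcal{W}'_f(\mathcal{H})$,
\[
\begin{aligned}
\Lambda_1(\rho)+\Lambda_2(\sigma) &= \sum_{x\in\mathcal{X}} \big(p_x\Lambda_1(s_x) + q_x\Lambda_2(s_x)\big) \\
&\le \sum_{x\in\mathcal{X}} g_f(p_x,q_x)\, s_x.
\end{aligned}
\]
Thus taking the infimum of the RHS, we have the assertion.

By this lemma, we only have to show the achievability of (27).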

Proposition 7.2 (Theorem 8.6.1 of [6]) Let $F$ be a real-valued convex
functional defined on a convex subset $\Omega$ of a vector space $X$, and let $G$ be a
convex mapping of $X$ into a partially ordered normed space $Z$. Define
\[
\mu_0 := \inf\{F(\vec\Lambda)\,;\ \vec\Lambda\in\Omega,\ G(\vec\Lambda)\le 0\}.
\]
Then for any $\zeta_0^*\ge 0$,
\[
\mu_0 \ge \inf\{F(\vec\Lambda) + \langle\zeta_0^*, G(\vec\Lambda)\rangle\,;\ \vec\Lambda\in\Omega\} =: F_*(\zeta_0^*). \tag{28}
\]
Also, if there exists a $\vec\Lambda_1$ such that $G(\vec\Lambda_1) < 0$ and $\mu_0$ is finite,
\[
\mu_0 = \max_{\zeta_0^*\ge 0} F_*(\zeta_0^*). \tag{29}
\]

Below, we apply this proposition considering the RHS of (27) as the primal
problem, and obtain the reverse test as its dual problem. To proceed, we need
to introduce a proper mathematical framework.
Consider the space $\mathcal{C}$ of continuous real valued functions on the compact
set $D\subset[0,1]^{\times 2}$ and the space $\mathcal{B}_{sa}$ of the bounded self-adjoint
linear operators on the Hilbert space $\mathcal{K}$. Endow $\mathcal{C}$ and $\mathcal{B}_{sa}$ with the norm
$\|h\| := \sup_{(\lambda_1,\lambda_2)\in D}|h(\lambda_1,\lambda_2)|$ and the operator norm $\|W\|$, respectively. From
these two spaces, we compose the linear space
\[
\Big\{ \sum_{i=1}^n h_iW_i\,;\ h_i\in\mathcal{C},\ W_i\in\mathcal{B}_{sa} \Big\},
\]
and its completion with respect to the projective norm
\[
\|z\|_\pi := \inf\Big\{ \sum_{i=1}^n \|h_i\|\,\|W_i\|\,;\ z = \sum_{i=1}^n h_iW_i \Big\}
\]
is denoted by $\mathcal{Z}$. In fact, $\mathcal{Z}$ is the projective tensor product $\mathcal{C}\,\hat\otimes_\pi\,\mathcal{B}_{sa}$ (that $\|\cdot\|_\pi$
is a norm and $\|hW\|_\pi = \|h\|\,\|W\|$ is known [9]).
Then for each $z\in\mathcal{Z}$, there exist bounded sequences $\{h_i\}$ and $\{W_i\}$ with
$z = \sum_{i=1}^\infty h_iW_i$ and
\[
\|z\|_\pi = \inf\Big\{ \sum_{i=1}^\infty \|h_i\|\,\|W_i\|\,;\ z = \sum_{i=1}^\infty h_iW_i \Big\}.
\]
One can endow $\mathcal{Z}$ with the partial order $\ge$ by
\[
z\ge 0 \iff \sum_{i=1}^\infty h_i(\lambda_1,\lambda_2)\,W_i \ge 0,\quad \forall(\lambda_1,\lambda_2)\in D.
\]
The strict inequality $z > 0$ means that $z$ is an interior point of the cone
$\{z'\,;\ z'\ge 0\}$.
Any bounded linear functional $\zeta^*$ on $\mathcal{Z}$ is the linearization of a bilinear form
on $\mathcal{C}$ and $\mathcal{B}_{sa}$ (see Section 2.2, [9]):
\[
\zeta^*(hW) = \zeta^*(h)(W),
\]
where $\zeta^*(h)(\cdot)$ and $\zeta^*(\cdot)(W)$ are elements of $\mathcal{B}_{sa}^*$ and $\mathcal{C}^*$, respectively.
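Below, $\mathcal{Z} := \mathcal{C}\,\hat\otimes_\pi\,\mathcal{B}_{sa}(\mathcal{H}\otimes\mathcal{H})$, and we let $X = \Omega$ be the set of pairs $(\Lambda_1,\Lambda_2)$ of
linear transforms on $\mathcal{B}_{sa}(\mathcal{H})$. Each $\Lambda_\theta$ can be identified with an element of
$\mathcal{B}_{sa}(\mathcal{H}\otimes\mathcal{H})$ by Choi's representation:
\[
\Phi := \sum_i |i\rangle\langle i|\otimes|i\rangle\langle i|, \qquad \Psi_\theta := \Lambda_\theta\otimes\mathbf{I}\,(\Phi).
\]
Thus
\[
\begin{aligned}
X = \Omega &:= \big\{\vec\Lambda = (\Lambda_1,\Lambda_2)\,;\ \Lambda_1,\Lambda_2\in\mathcal{B}_{sa}(\mathcal{B}_{sa}(\mathcal{H}))\big\}, \\
G(\vec\Lambda) &:= g_1\Psi_1 + g_2\Psi_2 - g_f\Phi, \\
F_A(\vec\Lambda) &:= \mathrm{tr}\, A\,(\Lambda_1(\rho)+\Lambda_2(\sigma)) = \mathrm{tr}\,\Psi_1(A\otimes\rho^T) + \mathrm{tr}\,\Psi_2(A\otimes\sigma^T).
\end{aligned}
\]
Then, by an argument almost parallel to that of [7], we have: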
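Lemma 7.3 Suppose $f$ satisfies (FC) and $g_f(\lambda_1,\lambda_2)$ is continuous on
$D\subset[0,1]^{\times 2}$. Define
\[
\mathcal{W}_{f,D} := \big\{\vec\Lambda\,;\ g_f(\lambda_1,\lambda_2)\,\mathbf{I} - \lambda_1\Lambda_1 - \lambda_2\Lambda_2 \text{ is CP},\ \forall(\lambda_1,\lambda_2)\in D\big\}.
\]
Then the following identity holds provided there is $Z$ satisfying the constraints
(31) and (32) with $\sum_{(\lambda_1,\lambda_2)\in D} g_f(\lambda_1,\lambda_2)\,\mathrm{tr}\, Z_{\lambda_1,\lambda_2} < \infty$:
\[
\min \sum_{(\lambda_1,\lambda_2)\in D} g_f(\lambda_1,\lambda_2)\,\mathrm{tr}\, Z_{\lambda_1,\lambda_2} = \sup_{\vec\Lambda\in\mathcal{W}_{f,D}} F_A(\vec\Lambda), \tag{30}
\]
where the minimization in the LHS is taken over all the maps $(\lambda_1,\lambda_2)\to Z_{\lambda_1,\lambda_2}$ with
finitely many non-zero points and satisfying
\[
Z_{\lambda_1,\lambda_2}\ge 0, \qquad \mathrm{tr}\, Z_{\lambda_1,\lambda_2}\le 1, \tag{31}
\]
\[
\sum_{(\lambda_1,\lambda_2)\in D}\lambda_1 Z_{\lambda_1,\lambda_2} = A\otimes\rho^T, \qquad \sum_{(\lambda_1,\lambda_2)\in D}\lambda_2 Z_{\lambda_1,\lambda_2} = A\otimes\sigma^T. \tag{32}
\]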
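Proof. $G$ is convex, and with $\vec\Lambda_* := (w_{1,*}\mathbf{I}, w_{2,*}\mathbf{I})$, where $(w_{1,*},w_{2,*})$ satisfies
$\lambda_1w_{1,*} + \lambda_2w_{2,*} - g_f(\lambda_1,\lambda_2) < 0$ for any $\lambda_1\ge 0$, $\lambda_2\ge 0$,
\[
G(\vec\Lambda_*) < 0.
\]
With $\zeta^*\in\mathcal{Z}^*$, $\zeta^*\ge 0$,
\[
\begin{aligned}
F_*(\zeta^*) &= \sup_\Psi \big\{ \mathrm{tr}\,\Psi_1(A\otimes\rho^T) + \mathrm{tr}\,\Psi_2(A\otimes\sigma^T) - \zeta^*(g_1\Psi_1 + g_2\Psi_2 - g_f\Phi) \big\} \\
&= \sup_\Psi \big\{ \big(\mathrm{tr}\,\Psi_1(A\otimes\rho^T) - \zeta^*(g_1)(\Psi_1)\big) + \big(\mathrm{tr}\,\Psi_2(A\otimes\sigma^T) - \zeta^*(g_2)(\Psi_2)\big) + \zeta^*(g_f)(\Phi) \big\} \\
&= \begin{cases}
\zeta^*(g_f)(\Phi), & \text{if } \mathrm{tr}\,\Psi_1(A\otimes\rho^T) = \zeta^*(g_1)(\Psi_1) \text{ and } \mathrm{tr}\,\Psi_2(A\otimes\sigma^T) = \zeta^*(g_2)(\Psi_2) \text{ for all } \Psi_1,\Psi_2, \\
\infty, & \text{otherwise.}
\end{cases}
\end{aligned}
\]
Observe that $h\to\zeta^*(h)(W)$ is a bounded functional on $\mathcal{C}$. Therefore, by the
Riesz-Markov representation theorem,
\[
\zeta^*(h)(W) = \int_D h(\lambda_1,\lambda_2)\, d\nu_W,
\]
where $\nu_W$ is a regular measure over the Borel sets of $D$. Since the map
$W\to\zeta^*(\chi(B))(W) = \nu_W(B)$, where $\chi(\cdot)$ is the indicator function, is a linear
positive map,
\[
|\nu_W(B)| \le \|W\|\,\|\zeta^*(\chi(B))(\cdot)\| = \|W\|\,\zeta^*(\chi(B))(1) = \|W\|\,\nu_1(B). \tag{33}
\]
Therefore, $\nu_W$ is absolutely continuous relative to $\nu_1$. Thus $\xi_{\lambda_1,\lambda_2}(W) := \frac{d\nu_W}{d\nu_1}$
exists, and
\[
\zeta^*(h)(W) = \int_D h(\lambda_1,\lambda_2)\,\xi_{\lambda_1,\lambda_2}(W)\, d\nu_1.
\]
By this identity and the linearity of $W\to\zeta^*(h)(W)$, the map $W\to\xi_{\lambda_1,\lambda_2}(W)$ is linear.
Positivity of $\xi_{\lambda_1,\lambda_2}$ follows from positivity of $\zeta^*$. By (33),
\[
|\xi_{\lambda_1,\lambda_2}(W)| \le \|W\|, \quad \nu_1\text{-a.e.} \tag{34}
\]
Therefore, there is $Z_{\lambda_1,\lambda_2}\ge 0$ with $\mathrm{tr}\, Z_{\lambda_1,\lambda_2}W = \xi_{\lambda_1,\lambda_2}(W)$ for any $W\in\mathcal{B}_{sa}$.
By (34), (31) holds $\nu_1$-a.e. Since $\dim\mathcal{H} < \infty$, the optimal value can be
achieved by a sum over a finite set by Carathéodory's theorem. Thus, rewriting
$F_*$ using $Z_{\lambda_1,\lambda_2}$, we have the assertion provided the LHS of (30) is finite. But
by the weak duality (28), this is true if there is $Z$ satisfying the constraints
(31) and (32) with $\sum_{(\lambda_1,\lambda_2)\in D} g_f(\lambda_1,\lambda_2)\,\mathrm{tr}\, Z_{\lambda_1,\lambda_2} < \infty$.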

Lemma 7.4 Suppose $g_f(\lambda_1,\lambda_2)$ is continuous on $D\subset[0,1]^{\times 2}$. Also suppose
$A\ge 0$ is rank-1. Then the following identity holds provided the LHS of the
identity is finite:
\[
\min \sum_{x\in\mathcal{X}} g_f(p_x,q_x)\,\mathrm{tr}\, A s_x = \sup_{\vec\Lambda\in\mathcal{W}_{f,D}} F_A(\vec\Lambda), \tag{35}
\]
where the minimization in the LHS is taken over all the simultaneous
decompositions of $\{\rho,\sigma\}$ such that $(p_x,q_x)\in D$.

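Proof. Let $\pi$ be the projector onto $\mathrm{supp}\, A$, and apply $\pi^{\perp}\otimes I$ to both ends
of the first identity of (32):
\[
\sum_{(\lambda_1,\lambda_2)\in D}\lambda_1\,(\pi^{\perp}\otimes I)\, Z_{\lambda_1,\lambda_2}\,(\pi^{\perp}\otimes I) = 0.
\]
Since each term is positive, this means $(\pi^{\perp}\otimes I)\, Z_{\lambda_1,\lambda_2}\,(\pi^{\perp}\otimes I) = 0$. Therefore,
since $A$ is rank-1,
\[
Z_{\lambda_1,\lambda_2} = A\otimes\tilde Z^T_{\lambda_1,\lambda_2},
\]
where
\[
\sum_{(\lambda_1,\lambda_2)\in D}\lambda_1\tilde Z_{\lambda_1,\lambda_2} = \rho, \qquad \sum_{(\lambda_1,\lambda_2)\in D}\lambda_2\tilde Z_{\lambda_1,\lambda_2} = \sigma.
\]
This obviously corresponds to a simultaneous decomposition.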

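Lemma 7.5 Suppose $A\ge 0$ is rank-1. Then the following identity holds
provided $g_f(\rho,\sigma)$ is finite:
\[
\mathrm{tr}\, A\, g_f(\rho,\sigma) = \sup_{\vec\Lambda\in\mathcal{W}'_f} F_A(\vec\Lambda) = \sup_{\vec\Lambda\in\mathcal{W}_f} F_A(\vec\Lambda). \tag{36}
\]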

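Proof. If $f(0) < \infty$ and $\hat f(0) < \infty$, $g_f(\lambda_1,\lambda_2)$ is finite and continuous on $[0,1]^{\times 2}$.
Thus, let $D = [0,1]^{\times 2}$. Then the LHS of (35) is bounded by $\mathrm{tr}\, A\, g_f(\rho,\sigma)$, which
is finite. Thus by Lemma 7.4, we obtain the assertion.
If $f(0) = \infty$ and $\hat f(0) < \infty$, the condition $g_f(\rho,\sigma) < \infty$ demands $\mathrm{supp}\,\rho\supset\mathrm{supp}\,\sigma$.
Define $D_\varepsilon := [\varepsilon,1]\times[0,1]$. Consider the simultaneous decomposition
obtained via the spectral decomposition of $\sigma^{-1/2}\tilde\rho\,\sigma^{-1/2}$. Then $(p_x,q_x)\in D_{r_0}$,
where $r_0 > 0$ is the smallest eigenvalue of $\sigma^{-1/2}\tilde\rho\,\sigma^{-1/2}$. Thus if $\varepsilon\le r_0$, a feasible
point of the LHS of (35) exists, and the LHS is bounded by $\mathrm{tr}\, A\, g_f(\rho,\sigma)$ from
above. Thus (35) holds.
Let us denote by $D^1_\varepsilon$ and $D^2_\varepsilon$ the LHS and the RHS of (35), respectively.
Since the domain of the variable is larger, $\mathrm{tr}\, A\, g_f(\rho,\sigma)\le D^1_\varepsilon = D^2_\varepsilon$. Also,
\[
\begin{aligned}
\lim_{\varepsilon\downarrow 0} D^2_\varepsilon &= \sup\Big\{ F_A(\vec\Lambda)\,;\ \vec\Lambda\in\bigcap_{\varepsilon\in[0,r_0]}\mathcal{W}_{f,D_\varepsilon} \Big\} \\
&= \sup\big\{ F_A(\vec\Lambda)\,;\ \vec\Lambda\in\mathcal{W}_{f,D_0} \big\} = \sup\big\{ F_A(\vec\Lambda)\,;\ \vec\Lambda\in\mathcal{W}'_f \big\}.
\end{aligned}
\]
Therefore, we have "$\le$" of the first identity of (36). Since the opposite
inequality holds by Lemma 7.1, we have the first identity of (36).
Finally, observe
\[
\mathrm{tr}\, A\, g_f(\rho,\sigma) \ge \sup_{\vec\Lambda\in\mathcal{W}_f} F_A(\vec\Lambda) \ge \sup_{\vec\Lambda\in\mathcal{W}'_f} F_A(\vec\Lambda),
\]
where the first inequality is by Lemma 7.1 and the second by $\mathcal{W}_f\supset\mathcal{W}'_f$. Therefore,
the first identity of (36) indicates the second one.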

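Lemma 7.6 If $(\Lambda_1,\Lambda_2)\in\mathcal{W}'_f(\mathcal{H})$, it is possible to extend it to $(\tilde\Lambda_1,\tilde\Lambda_2)\in\mathcal{W}'_f(\mathcal{K})$,
$\mathcal{K}\supset\mathcal{H}$, so that $\tilde\Lambda_\theta|_{\mathcal{B}(\mathcal{H})} = \Lambda_\theta$ ($\theta = 1,2$).

Proof. Let $\mathcal{H}'$ be a finite dimensional Hilbert space such that $\mathcal{K}\subset\mathcal{H}\otimes\mathcal{H}'$.
Then,
\[
\tilde\Lambda_\theta := \pi_{\mathcal{H}}\,(\Lambda_\theta\otimes\mathbf{I}_{\mathcal{H}'})\,\pi_{\mathcal{H}}, \quad \theta\in\{1,2\},
\]
satisfies the required conditions.
By this lemma, we let $\mathcal{H} := \mathrm{supp}\,\rho + \mathrm{supp}\,\sigma$ without loss of generality.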

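Lemma 7.7 $(\Lambda_1,\Lambda_2)\in\mathcal{W}_f(\mathcal{H})$ ($\in\mathcal{W}'_f(\mathcal{H})$, resp.) iff $(\Lambda'_1,\Lambda'_2)\in\mathcal{W}_f(\mathcal{H})$
($\in\mathcal{W}'_f(\mathcal{H})$, resp.), where
\[
\Lambda'_\theta(X) := C^{-1}\,\Lambda_\theta\big(CXC^{\dagger}\big)\, C^{-1\dagger}, \quad \theta\in\{1,2\},
\]
for some invertible operator $C$.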

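Lemma 7.8 If $\hat f(0) = \infty$ and $\mathrm{supp}\,\rho\not\subset\mathrm{supp}\,\sigma$,
\[
\sup_{(\Lambda_1,\Lambda_2)\in\mathcal{W}'_f(\mathcal{H})} \mathrm{tr}\, A\,(\Lambda_1(\rho)+\Lambda_2(\sigma)) = \infty
\]
for all $A\ge 0$ with $\mathrm{tr}\, A\,(\rho-\tilde\rho)\ne 0$.
As a consequence, if $g_f(\rho,\sigma) = \infty$, the RHS of (26) and (27) are also $\infty$.

Proof. By Lemmas 7.6-7.7, we suppose $\mathcal{H} = \mathrm{supp}\,\rho + \mathrm{supp}\,\sigma$ and that $\{\rho,\sigma\}$ is of
the form (12), without loss of generality. (Then $\rho_{22}\ne 0$.)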
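Since $f$ is convex and $\lim_{r\to\infty} f(r) = \infty$, without loss of generality, suppose
$f(r)\ge 0$, $\forall r\ge 0$, and let
\[
\Lambda_1(X) := \begin{bmatrix} -X_{11} & aX_{12} \\ aX_{21} & aX_{22} \end{bmatrix}, \qquad
\Lambda_2(X) := \begin{bmatrix} 0 & 0 \\ 0 & -bX_{22} \end{bmatrix},
\]
where the block forms are in accordance with the decomposition $\mathrm{supp}\,\sigma\oplus\ker\sigma$.
Since $\Lambda_\theta$ is of the form $\Lambda_\theta(X) = D_\theta * X$ with a certain matrix $D_\theta$, $(\Lambda_1,\Lambda_2)\in\mathcal{W}'_f(\mathcal{H})$
if $f(r)\,\mathbf{1}_{\dim\mathcal{H}} - rD_1 - D_2\ge 0$. This is true iff
\[
\begin{bmatrix} f(r)+r & f(r)-ar \\ f(r)-ar & f(r)-ar+b \end{bmatrix} \ge 0, \quad \forall r\ge 0.
\]
This is equivalent to
\[
f(r) - ar + b \ge 0 \tag{37}
\]
and
\[
\begin{aligned}
&(f(r)+r)\,(f(r)-ar+b) - (f(r)-ar)^2 \\
&= (a+1)\, r\,\Big( f(r) + \frac{b}{a+1} - ar \Big) + b f(r) \ge 0
\end{aligned}
\]
for all $r\ge 0$. The latter is true if
\[
f(r) - ar + \frac{b}{a+1} \ge 0. \tag{38}
\]
Since $f$ is convex and positive, for each $a > 0$, if $b > 0$ is large enough, both (37)
and (38) hold for all $r\ge 0$, meaning that the corresponding $(\Lambda_1,\Lambda_2)$ is an element
of $\mathcal{W}'_f(\mathcal{H})$. Also,
\[
\Lambda_1(\rho)+\Lambda_2(\sigma) = \begin{bmatrix} -\tilde\rho & 0 \\ 0 & a\rho_{22} \end{bmatrix}.
\]
Since $\rho_{22} = \rho-\tilde\rho$, we have the assertion by letting $a\to\infty$.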

Theorem 7.9 If $g_f(\rho,\sigma) > -\infty$, (26) and (27) hold.

Proof. Suppose $g_f(\rho,\sigma)$ is finite. Then since (36) holds for all rank-1 positive
operators $A$, we have (26) and (27). Next suppose $g_f(\rho,\sigma) = \infty$. Then (26) and
(27) hold by Lemma 7.8.

Remark 7.10 When $f$ is operator convex, $g_f(\rho,\sigma)$ may be defined by (26). But
some properties are not easy to show starting from this definition. For example,
to prove a statement such as (9), one must compose $(\Lambda_1,\Lambda_2)\in\mathcal{W}_f(\mathcal{H}_1\oplus\mathcal{H}_2)$ using
$(\Lambda_{1,i},\Lambda_{2,i})\in\mathcal{W}_f(\mathcal{H}_i)$ ($i = 1,2$), but the composition is not straightforward.
Even the proof of (5), which is easier, is not very straightforward.

8 Continuity
Although we have defined the non-commutative extension of the perspective
function via simultaneous decompositions, the usual approach is to define it by
(20) for full-rank operators and to extend it to non-full-rank operators by
taking a limit. Thus, we have to investigate the relation between (21) and the
limit of (20) along certain continuous curves.

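Proposition 8.1 Suppose $f$ is operator convex. Let $\{\rho_\varepsilon,\sigma_\varepsilon\}_{\varepsilon\ge 0}$ be a continuous
curve with $\{\rho_0,\sigma_0\} = \{\rho,\sigma\}$. Then
\[
g_f(\rho,\sigma) \le \lim_{\varepsilon\downarrow 0} g_f(\rho_\varepsilon,\sigma_\varepsilon).
\]
In particular, if $g_f(\rho,\sigma) = \infty$ in addition,
\[
g_f(\rho,\sigma) = \lim_{\varepsilon\downarrow 0} g_f(\rho_\varepsilon,\sigma_\varepsilon). \tag{39}
\]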

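Proof. Suppose $g_f(\rho,\sigma) < \infty$. Then by (26), for each $\delta > 0$, there is
$(\Lambda_{1,\delta},\Lambda_{2,\delta})\in\mathcal{W}_f$ such that
\[
g_f(\rho,\sigma) \le \Lambda_{1,\delta}(\rho) + \Lambda_{2,\delta}(\sigma) + \delta I.
\]
Since $(\Lambda_{1,\delta},\Lambda_{2,\delta})$ is an element of $\mathcal{W}_f$,
\[
\begin{aligned}
\lim_{\varepsilon\downarrow 0} g_f(\rho_\varepsilon,\sigma_\varepsilon) &\ge \lim_{\varepsilon\downarrow 0}\big(\Lambda_{1,\delta}(\rho_\varepsilon) + \Lambda_{2,\delta}(\sigma_\varepsilon)\big) \\
&= \Lambda_{1,\delta}(\rho) + \Lambda_{2,\delta}(\sigma) \ge g_f(\rho,\sigma) - \delta I.
\end{aligned}
\]
Since $\delta > 0$ is arbitrary, we have the assertion.
Suppose $g_f(\rho,\sigma) = \infty$. Then by (56), for every $c > 0$ there is $(\Lambda_{1,c},\Lambda_{2,c})\in\mathcal{W}_f$
such that
\[
\mathrm{tr}\,\Lambda_{1,c}(\rho) + \mathrm{tr}\,\Lambda_{2,c}(\sigma) \ge c.
\]
Therefore,
\[
\begin{aligned}
\lim_{\varepsilon\downarrow 0}\mathrm{tr}\, g_f(\rho_\varepsilon,\sigma_\varepsilon) &\ge \lim_{\varepsilon\downarrow 0}\big(\mathrm{tr}\,\Lambda_{1,c}(\rho_\varepsilon) + \mathrm{tr}\,\Lambda_{2,c}(\sigma_\varepsilon)\big) \\
&= \mathrm{tr}\,\Lambda_{1,c}(\rho) + \mathrm{tr}\,\Lambda_{2,c}(\sigma) \ge c.
\end{aligned}
\]
Since $c > 0$ is arbitrary, we have the assertion.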
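Theorem 8.2 Suppose $f$ is operator convex. Let $\{\rho_\varepsilon,\sigma_\varepsilon\}_{\varepsilon\ge 0}$ be a continuous
linear curve with $\{\rho_0,\sigma_0\} = \{\rho,\sigma\}$. Then (39) holds.
Proof. By convexity of $g_f$,
\[
\begin{aligned}
\lim_{\varepsilon\downarrow 0} g_f(\rho_\varepsilon,\sigma_\varepsilon) &\le \lim_{\varepsilon\downarrow 0}\{(1-\varepsilon)\, g_f(\rho,\sigma) + \varepsilon\, g_f(\rho_1,\sigma_1)\} \\
&= g_f(\rho,\sigma).
\end{aligned}
\]
Thus combined with Proposition 8.1, we have the assertion.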
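Theorem 8.3 Suppose f is operator convex. Also let {ρε , σε }ε≥0 be a continuous
curve with {ρ0 , σ0 } = {ρ, σ}, ρε ≥ ρ, σε ≥ σ, and gf (ρε , σε ) < ∞, ∀ε ≥ 0.
Then (39) holds provided

\[ \lim_{\varepsilon\downarrow 0} g_f(\rho_\varepsilon - \rho,\ \sigma_\varepsilon - \sigma) \le 0. \tag{40} \]

(40) holds true if f (0) < ∞, fˆ (0) < ∞.

Proof. By Proposition 8.1, we only have to show

\[ \lim_{\varepsilon\downarrow 0} g_f(\rho_\varepsilon,\sigma_\varepsilon) \le g_f(\rho,\sigma). \]

Observe, by Lemma 3.1,

\[ g_f(\rho_\varepsilon,\sigma_\varepsilon) \le g_f(\rho,\sigma) + g_f(\rho_\varepsilon - \rho,\ \sigma_\varepsilon - \sigma). \]

Thus, taking ε ↓ 0 on both sides, we have the assertion.
Also, if f (0) < ∞, fˆ (0) < ∞,

\[
\begin{aligned}
g_f(\rho_\varepsilon - \rho,\ \sigma_\varepsilon - \sigma)
 &\le g_f(\rho_\varepsilon - \rho,\ 0) + g_f(0,\ \sigma_\varepsilon - \sigma) \\
 &= \hat f(0)\,(\rho_\varepsilon - \rho) + f(0)\,(\sigma_\varepsilon - \sigma).
\end{aligned}
\]

Thus, taking ε ↓ 0, (40) is derived.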
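
As an illustration of condition (40), consider the regularization ρε = ρ + εI, σε = σ + εI, a curve chosen here only as an example. The sketch below assumes that (20) is the usual full-rank formula gf (ρ, σ) = σ^{1/2} f (σ^{−1/2} ρ σ^{−1/2}) σ^{1/2} and that f (1) is finite; under these assumptions (40) can be checked directly for this curve, without using f (0) < ∞ or fˆ (0) < ∞.

```latex
% Assumption: (20) reads g_f(\rho,\sigma) = \sigma^{1/2} f(\sigma^{-1/2}\rho\,\sigma^{-1/2}) \sigma^{1/2} for \sigma > 0.
% Illustrative curve: \rho_\varepsilon = \rho + \varepsilon I, \sigma_\varepsilon = \sigma + \varepsilon I.
\begin{align*}
g_f(\rho_\varepsilon-\rho,\ \sigma_\varepsilon-\sigma)
  &= g_f(\varepsilon I,\ \varepsilon I) \\
  &= (\varepsilon I)^{1/2}\, f\bigl((\varepsilon I)^{-1/2}(\varepsilon I)(\varepsilon I)^{-1/2}\bigr)\,(\varepsilon I)^{1/2} \\
  &= \varepsilon f(1)\, I \;\longrightarrow\; 0 \quad (\varepsilon \downarrow 0).
\end{align*}
```

The limit is 0, so (40) holds for this particular curve whenever gf (ρ, σ) < ∞.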

9 Dual Representation (2)
In this section, we investigate the pair (Λ1 , Λ2 ) achieving the supremum of (27). Since
the case where gf (ρ, σ) = ∞ is already treated by Lemma 7.8, we suppose
gf (ρ, σ) < ∞. By Lemmas 7.6-7.7, we suppose, without loss of generality, that
H = supp ρ + supp σ and that {ρ, σ} is in the form of (12).
First, we study the case where ρ > 0, σ > 0. By Lemma 7.7, suppose σ = I
without loss of generality.
If dim H = 1, (cI, −f ∗ (c) I) ∈ W′f (H) and, with a proper choice of c, it
achieves the supremum of (27).
When dim H ≥ 2, we suppose f is operator convex in addition to satisfying
(FC).
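Suppose (Λf1 , Λf2 ) achieves the maximum of (27). Then

\[
\mathrm{tr}\,A\, g_f(\rho+\varepsilon X,\ \sigma+\varepsilon Y)
 = \mathrm{tr}\,A\, g_f(\rho,\sigma)
 + \varepsilon\bigl(\mathrm{tr}\,A\Lambda_1^f(X) + \mathrm{tr}\,A\Lambda_2^f(Y)\bigr) + o(\varepsilon)
\]

holds for any A ≥ 0. Therefore, if f is operator convex and σ = I > 0, by
differentiating (ρ, σ) → tr Agf (ρ + εX, σ + εY ), we obtain

\[ \Lambda_1^f(X) := Df(\rho)[X], \tag{41} \]

\[ \Lambda_2^f(X) := \frac12\bigl(f(\rho)X + Xf(\rho)\bigr) - \frac12\, Df(\rho)[X\rho + \rho X]. \tag{42} \]

Here, Df (ρ) [X] is the Fréchet derivative of f at ρ in the direction of X,

\[
\lim_{\varepsilon\to 0}\ \Bigl\| Df(\rho)[X] - \frac{1}{\varepsilon}\bigl(f(\rho+\varepsilon X) - f(\rho)\bigr) \Bigr\| = 0.
\]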

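In the basis diagonalizing ρ, and letting the ri be the eigenvalues of ρ,

\[ Df(\rho)[X] = \bigl[\, f^{[1]}(r_i,r_j)\, X_{i,j} \,\bigr], \]

\[
f^{[1]}(r_i,r_j) :=
\begin{cases}
\dfrac{f(r_i)-f(r_j)}{r_i-r_j}, & \text{if } r_i \ne r_j, \\[1ex]
f'(r_i), & \text{if } r_i = r_j.
\end{cases}
\tag{43}
\]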

By Df (ρ) [ρ] = ρf ′ (ρ), it is easy to check (Λf1 , Λf2 ) satisfies

Λf1 (ρ) + Λf2 (I) = f (ρ) = gf (ρ, I) . (44)
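
As a worked instance of (41), (42) and (44), take f (r) = r^2, a choice made here only for illustration, with the divided difference in (43) read as (f (ri ) − f (rj ))/(ri − rj ). The computation below is a sketch of how the general formulas specialize in this case.

```latex
% Worked example for f(r) = r^2; \rho is diagonal with eigenvalues r_i.
\begin{align*}
f^{[1]}(r_i,r_j) &= \frac{r_i^2-r_j^2}{r_i-r_j} = r_i+r_j
  \quad\Longrightarrow\quad Df(\rho)[X] = \rho X + X\rho, \\
\Lambda_1^f(X) &= \rho X + X\rho, \\
\Lambda_2^f(X) &= \tfrac12\bigl(\rho^2 X + X\rho^2\bigr)
   - \tfrac12\bigl(\rho\,(X\rho+\rho X) + (X\rho+\rho X)\,\rho\bigr) = -\rho X\rho, \\
\Lambda_1^f(\rho) + \Lambda_2^f(I) &= 2\rho^2 - \rho^2 = \rho^2 = f(\rho).
\end{align*}
```

For ri = rj the divided difference ri + rj equals 2ri = f ′ (ri ), in agreement with the diagonal case of (43).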

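If f (0) < ∞, by Theorem 8.1 of [5],

\[ f(r) = f(0) + a f_1(r) + b f_2(r) + \int_{(0,\infty)} \eta_t(r)\, d\mu(t), \tag{45} \]

where fk (r) := r^k ,

\[ \eta_t(r) := \frac{r}{1+t} - \frac{r}{t+r} = \frac{r}{1+t} - 1 + \frac{t}{t+r}, \]

and

\[ \int_{(0,\infty)} \frac{d\mu(t)}{(1+t)^2} < \infty. \]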

We show (26) for each term of the expansion (45) separately, namely for f2 and ηt .
By weak duality (Lemma 7.1), it suffices to show the
achievability of the LHS of (26).
Since

\[ \Lambda_1^{f_2}(X) = \rho X + X\rho, \qquad \Lambda_2^{f_2}(X) = -\rho X \rho, \]

the map

\[ X \mapsto f_2(r)\,X - r\Lambda_1^{f_2}(X) - \Lambda_2^{f_2}(X) = (rI-\rho)\,X\,(rI-\rho) \]

is completely positive. Thus $(\Lambda_1^{f_2}, \Lambda_2^{f_2}) \in W'_{f_2}(H)$. Also,

\[ \Lambda_1^{f_2}(\rho) + \Lambda_2^{f_2}(\sigma) = \rho^2 = g_{f_2}(\rho, I). \]

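When f = ηt ,

\[
\begin{aligned}
\Lambda_1^{\eta_t}(X) &= \frac{X}{1+t} - t\,(tI+\rho)^{-1} X (tI+\rho)^{-1} \\
 &= \frac{1}{1+t}\,(tI+\rho)^{-1}\bigl\{ t\,(\rho X + X\rho - X) + \rho X\rho \bigr\}(tI+\rho)^{-1}, \\
\Lambda_2^{\eta_t}(X) &= -\rho\,(tI+\rho)^{-1} X (tI+\rho)^{-1}\rho \\
 &= -\{(tI+\rho)-tI\}(tI+\rho)^{-1} X (tI+\rho)^{-1}\{(tI+\rho)-tI\} \\
 &= -X + t\,(tI+\rho)^{-1}X + t\,X(tI+\rho)^{-1} - t^2 (tI+\rho)^{-1} X (tI+\rho)^{-1},
\end{aligned}
\tag{46}
\]

and the map

\[
X \mapsto \eta_t(r)\,X - r\Lambda_1^{\eta_t}(X) - \Lambda_2^{\eta_t}(X)
 = \frac{t}{t+r}\,\bigl\{ I - (r+t)(tI+\rho)^{-1} \bigr\}\, X\, \bigl\{ I - (r+t)(tI+\rho)^{-1} \bigr\}
\]

is completely positive. Thus $(\Lambda_1^{\eta_t}, \Lambda_2^{\eta_t}) \in W'_{\eta_t}(H)$. Also,

\[
\Lambda_1^{\eta_t}(\rho) + \Lambda_2^{\eta_t}(\sigma)
 = \frac{\rho}{1+t} - \rho\,(tI+\rho)^{-1}
 = g_{\eta_t}(\rho, I)
 = \frac{\rho\,(tI+\rho)^{-1}(\rho - I)}{1+t}.
\tag{47}
\]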
Below we show

\[ \Lambda_1^f(X) = aX + b\,\Lambda_1^{f_2}(X) + \int_{(0,\infty)} \Lambda_1^{\eta_t}(X)\, d\mu(t), \tag{48} \]

\[ \Lambda_2^f(X) = f(0)\,X + b\,\Lambda_2^{f_2}(X) + \int_{(0,\infty)} \Lambda_2^{\eta_t}(X)\, d\mu(t). \tag{49} \]

Since

\[
\frac{\partial}{\partial r}\eta_t(r)
 = \frac{1}{1+t} - \frac{t}{(t+r)^2}
 = \frac{(2r-1)\,t + r^2}{(1+t)(t+r)^2}
 \le \frac{2r\,(t+r)}{(1+t)(t+r)^2}
 = \frac{2r}{(1+t)(t+r)}
 \le \frac{2\max\{r,1\}}{(1+t)^2},
\]

the integrals in these formulas are convergent. Also, since r → ηt (r) is convex,
ε → (ηt (r + ε) − ηt (r))/ε is monotone increasing. Therefore, by the monotone convergence
theorem, the order of the differentiation and the integration by µ can be
interchanged, and (Λf1 , Λf2 ) in (48) and (49) coincides with the one in (41) and
(42).
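
The two inequalities in the bound on ∂ηt /∂r can be unpacked as follows. This is a sketch of the elementary steps for r ≥ 0 and t > 0; the case split on r is added here for illustration and is not spelled out in the text.

```latex
% Steps behind the two inequalities, for r \ge 0 and t > 0.
\begin{align*}
(2r-1)\,t + r^2 &\le 2rt + 2r^2 = 2r\,(t+r), \\
\frac{2r}{(1+t)(t+r)} \le \frac{2\max\{r,1\}}{(1+t)^2}
  &\iff r\,(1+t) \le \max\{r,1\}\,(t+r), \\
r \ge 1:&\quad r + rt \le rt + r^2 \iff r \le r^2, \\
r < 1:&\quad r + rt \le t + r \iff rt \le t .
\end{align*}
```

Since the final bound is a constant multiple of (1 + t)^{−2}, integrability against µ follows from the condition on µ stated with (45).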
$(\Lambda_1^f, \Lambda_2^f) \in W'_f(H)$ follows from $(\Lambda_1^{f_2}, \Lambda_2^{f_2}) \in W'_{f_2}(H)$ and
$(\Lambda_1^{\eta_t}, \Lambda_2^{\eta_t}) \in W'_{\eta_t}(H)$. (44) has already been checked, but it can also be verified by (48) and (49).
So far, we have supposed f (0) < ∞. This assumption can be removed by
replacing f (r) by fδ (r) := f (r + δ), where δ ≤ δ0 , and δ0 is set to half of
the smallest eigenvalue of ρ. Then differentiation of $\mathrm{tr}\,A\, g_{f_\delta}(\rho', \sigma')$ at ρ′ = ρ − δI
and σ ′ = I leads to

\[ \Lambda_1^\delta(X) := Df(\rho)[X] = \Lambda_1^f(X), \]

\[
\Lambda_2^\delta(X) := \frac12\bigl(f(\rho)X + Xf(\rho)\bigr) - \frac12\, Df(\rho)[X\rho + \rho X - 2\delta X]
 = \Lambda_2^f(X) + \delta\,\Lambda_1^f(X).
\]

Thus the following map is CP for all r ≥ 0:

\[
f_\delta(r)\,I - r\Lambda_1^\delta - \Lambda_2^\delta
 = f(r+\delta)\,I - r\Lambda_1^f - \bigl(\Lambda_2^f + \delta\Lambda_1^f\bigr)
 = f(r+\delta)\,I - (r+\delta)\,\Lambda_1^f - \Lambda_2^f .
\]

Therefore,

\[ f(r)\,I - r\Lambda_1^f - \Lambda_2^f \ \text{is CP}, \quad \forall r \ge \delta. \]

Since δ ∈ (0, δ0 ] is arbitrary, (Λf1 , Λf2 ) ∈ W′f (H). (Recall that (44) was verified
without appealing to (45), though it is also possible to check the identity by
applying (45) to fδ .)

Lemma 9.1 Suppose ρ > 0, σ ≥ 0 but σ ≯ 0, and fˆ (0) < ∞. Suppose also
(Λ1,0 , Λ2,0 ) ∈ W′f (supp σ) and

Λ1,0 (ρ̃) + Λ2,0 (σ) ≥ gf (ρ̃, σ) − εI.

Then there is (Λ1 , Λ2 ) ∈ W′f (H) with

Λ1 (ρ) + Λ2 (σ) ≥ gf (ρ, σ) − 2εI.

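Proof. Let δ0 > 0 be arbitrary, and define

\[
\Lambda_1(X) :=
\begin{bmatrix}
(1+\delta)\,\Lambda_{1,0}(X_{11}) - \delta \hat f(0)\, X_{11} & \hat f(0)\, X_{12} \\
\hat f(0)\, X_{21} & \alpha_1 X_{22}
\end{bmatrix},
\]

where δ ∈ (0, δ0 ] and

\[ c := \hat f(0) - \delta^2 < \hat f(0), \]

\[ \alpha_1 := \bigl(1+\delta^{-1}\bigr)\,c - \delta^{-1}\hat f(0) = \hat f(0) - \delta^2 - \delta. \]

If f (0) = ∞, define r0 which is smaller than the minimum point of f and
satisfies

\[
\bigl| f'_-(r_0) \bigr| \ge (1+\delta_0)\,\|\Lambda_{1,0}\|_{cb} + (1+\delta_0)\,\bigl|\hat f(0)\bigr| + 2\delta_0
 \ge \|\Lambda_1\|_{cb},
\]

where f′− denotes the right derivative. (Since f (0) = ∞, such r0 always exists.)
If f (0) < ∞, define r0 := 0. Then let

\[
\Lambda_2(X) :=
\begin{bmatrix}
(1+\delta)\,\Lambda_{2,0}(X_{11}) - \delta f(r_0)\, X_{11} & f(r_0)\, X_{12} \\
f(r_0)\, X_{21} & \alpha_2 X_{22}
\end{bmatrix},
\]

where $\alpha_2 := -\bigl(1+\delta^{-1}\bigr) f^*(c) - \delta^{-1} f(r_0)$.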
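First, we check (Λ1 , Λ2 ) ∈ W′f (H), which is equivalent to checking

\[ f(r)\,I - r\Lambda_1 - \Lambda_2 \ \text{is CP}, \quad \forall r \ge r_0, \tag{50} \]

since for any r ∈ (0, r0 ),

\[
\begin{aligned}
(f(r)\,I - r\Lambda_1 - \Lambda_2) \otimes I
 &\ge \bigl[\{ f(r_0) + (r-r_0)\, f'_-(r_0) \}\,I - r\Lambda_1 - \Lambda_2 \bigr] \otimes I \\
 &= \bigl[ f(r_0)\,I - r_0\Lambda_1 - \Lambda_2 + (r-r_0)\bigl( f'_-(r_0)\,I - \Lambda_1 \bigr) \bigr] \otimes I \\
 &\ge \bigl[ f(r_0)\,I - r_0\Lambda_1 - \Lambda_2 \bigr] \otimes I.
\end{aligned}
\]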

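If r ≥ r0 ,

\[
\begin{aligned}
\bigl[(f(r)\,I - r\Lambda_1 - \Lambda_2) \otimes I\,(Y)\bigr]_{11}
 &= \bigl\{ f(r)\,I - (1+\delta)(r\Lambda_{1,0} + \Lambda_{2,0}) + \delta\bigl(r\hat f(0) + f(r_0)\bigr) I \bigr\} \otimes I\,(Y_{11}) \\
 &\ge \bigl\{ f(r)\,I - (1+\delta) f(r)\,I + \delta\bigl(r\hat f(0) + f(r_0)\bigr) I \bigr\} \otimes I\,(Y_{11}) \\
 &= \delta\bigl(\hat f(0)\,r + f(r_0) - f(r)\bigr)\, Y_{11} \\
 &\ge 0
\end{aligned}
\]

and

\[
\begin{aligned}
\bigl[(f(r)\,I - r\Lambda_1 - \Lambda_2) \otimes I\,(Y)\bigr]_{22}
 &= f(r)\,Y_{22} - \bigl(1+\delta^{-1}\bigr)\bigl(rc - f^*(c)\bigr) Y_{22} + \delta^{-1}\bigl(\hat f(0)\,r + f(r_0)\bigr) Y_{22} \\
 &\ge f(r)\,Y_{22} - \bigl(1+\delta^{-1}\bigr) f(r)\,Y_{22} + \delta^{-1}\bigl(\hat f(0)\,r + f(r_0)\bigr) Y_{22} \\
 &= \delta^{-1}\bigl(\hat f(0)\,r + f(r_0) - f(r)\bigr)\, Y_{22} \\
 &\ge 0.
\end{aligned}
\]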

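Therefore, if r ≥ r0 ,

\[
\begin{aligned}
(f(r)\,I - r\Lambda_1 - \Lambda_2) \otimes I\,(Y)
 &\ge
\begin{bmatrix}
\delta\bigl(\hat f(0)\,r + f(r_0) - f(r)\bigr) Y_{11} & \bigl(f(r) - \hat f(0)\,r - f(r_0)\bigr) Y_{12} \\
\bigl(f(r) - \hat f(0)\,r - f(r_0)\bigr) Y_{21} & \delta^{-1}\bigl(\hat f(0)\,r + f(r_0) - f(r)\bigr) Y_{22}
\end{bmatrix} \\
 &= \bigl(\hat f(0)\,r + f(r_0) - f(r)\bigr)
\begin{bmatrix}
\delta Y_{11} & -Y_{12} \\
-Y_{21} & \delta^{-1} Y_{22}
\end{bmatrix} \\
 &\ge 0,
\end{aligned}
\]

since fˆ (0) r + f (r0 ) − f (r) ≥ 0 and

\[
\delta Y_{11} - Y_{12}\bigl(\delta^{-1} Y_{22}\bigr)^{-1} Y_{21}
 = \delta\bigl( Y_{11} - Y_{12}\,(Y_{22})^{-1}\, Y_{21} \bigr) \ge 0.
\]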

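Therefore, (Λ1 , Λ2 ) satisfies (50), or equivalently (Λ1 , Λ2 ) ∈ W′f (H). Also,

\[
\begin{aligned}
\Lambda_1(\rho) + \Lambda_2(\sigma)
 &=
\begin{bmatrix}
(1+\delta)\bigl(\Lambda_{1,0}(\tilde\rho) + \Lambda_{2,0}(\sigma)\bigr) - \delta\bigl(\hat f(0)\,\tilde\rho + f(r_0)\,\sigma\bigr) & 0 \\
0 & \hat f(0)\,\rho_{22} - \bigl(\delta^2+\delta\bigr)\rho_{22}
\end{bmatrix} \\
 &\ge (1+\delta)\bigl(g_f(\tilde\rho,\sigma) - \varepsilon I\bigr) + \hat f(0)\,\rho_{22}
   - \delta\bigl(\hat f(0)\,\tilde\rho + f(r_0)\,\sigma\bigr) - \bigl(\delta^2+\delta\bigr)\rho_{22} \\
 &= g_f(\tilde\rho,\sigma) + \hat f(0)\,\rho_{22} - \varepsilon\,(1+\delta)\,I + O(\delta).
\end{aligned}
\]

Since δ ∈ (0, δ0 ] is arbitrary, we have the assertion.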

Remark 9.2 In the proof, Λ1,0 in general depends on ε. Thus, f (r0 ) may
increase sharply in ε−1 . However, observe that δ may be chosen arbitrarily small
independently of ε and f (r0 ), so that δf (r0 ) is small.

By this lemma, the construction of the optimal (Λ1 , Λ2 ) is possible in the
case where supp ρ ⊂ supp σ and gf (ρ, σ) is finite. By replacing f by fˆ, in
an almost parallel manner, the optimal (Λ1 , Λ2 ) is constructed in the case where
supp ρ ⊃ supp σ and gf (ρ, σ) is finite. If supp ρ ̸⊂ supp σ, supp ρ ̸⊃ supp σ and
gf (ρ, σ) is finite, then f (0) < ∞ and fˆ (0) < ∞. Thus, by repeating the extension
indicated in the lemma twice, we obtain the optimal (Λ1 , Λ2 ).

10 Dual Representation (3)
In this section, we compose (Λ1 , Λ2 ) ∈ W′f directly using (45), in the case where
f is operator convex, f (0) < ∞, and gf (ρ, σ) < ∞. This argument seems also
valid even if dim H = ∞. In the proof of Lemma 9.1, in the case of f (0) = ∞,
∥Λ1,0 ∥cb has to be finite. But this may not hold if dim H = ∞.
By assumption, either (i) fˆ (0) = ∞ and σ > 0 or (ii) fˆ (0) < ∞ and σ ≥ 0.
First, we study the former. In this case (41) and (42) cannot hold as they are,
since f ′ (0) may be infinite. By Lemma 7.7, let σ = I without loss of generality,
and define

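\[
\Lambda_1^{t_1}(X) := aX + b\,\Lambda_1^{f_2}(X) + \int_{(t_1,\infty)} \Lambda_1^{\eta_t}(X)\, d\mu(t) + \int_{(0,t_1)} w_{t,1}\, d\mu(t)\; X,
\]

\[
\Lambda_2^{t_1}(X) := f(0)\,X + b\,\Lambda_2^{f_2}(X) + \int_{(t_1,\infty)} \Lambda_2^{\eta_t}(X)\, d\mu(t) + \int_{(0,t_1)} w_{t,2}\, d\mu(t)\; X,
\]

where

\[ w_{t,1} := (1+t)^{-1} - t\,(t+1)^{-2}, \qquad w_{t,2} := -(t+1)^{-2}. \]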
First, $(\Lambda_1^{t_1}, \Lambda_2^{t_1}) \in W'_f$ is verified by checking the condition for each term:
$(\Lambda_1^{f_2}, \Lambda_2^{f_2}) \in W'_{f_2}$ and $(\Lambda_1^{\eta_t}, \Lambda_2^{\eta_t}) \in W'_{\eta_t}$ have been checked in the previous
section, and $(w_{t,1} I, w_{t,2} I) \in W'_{\eta_t}$ is easily checked. Second, the integral of $\Lambda_\theta^{\eta_t}(X)$
(θ ∈ {1, 2}) over (t1 , ∞), and the integral of $w_{t,\theta}$ (θ ∈ {1, 2}) over (0, ∞)
are finite, since $\int_{(0,\infty)} \frac{d\mu(t)}{(1+t)^2} < \infty$. (Here, it is important that t1 > 0, since
∂ηt /∂r (0) = O (1/t) as t → 0.) Therefore, $(\Lambda_1^{t_1}, \Lambda_2^{t_1})$ is a well-defined member of
W′f .
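
The membership of the scalar pair in W′ηt can be sketched as a tangent-line argument. The block below assumes the reading w_{t,1} = (1 + t)^{−1} − t (t + 1)^{−2} and w_{t,2} = −(t + 1)^{−2} of the definitions above, and uses only the convexity of ηt.

```latex
% \eta_t vanishes at r = 1, and r \mapsto w_{t,1} r + w_{t,2} is its tangent line there.
\begin{align*}
\eta_t(1) &= \frac{1}{1+t} - \frac{1}{t+1} = 0, \qquad
\eta_t'(1) = \frac{1}{1+t} - \frac{t}{(t+1)^2} = \frac{1}{(1+t)^2} = w_{t,1}, \\
w_{t,1}\, r + w_{t,2} &= \frac{r-1}{(1+t)^2} = \eta_t(1) + \eta_t'(1)\,(r-1) \;\le\; \eta_t(r)
  \qquad (r \ge 0), \\
|w_{t,\theta}| &= \frac{1}{(1+t)^2} \qquad (\theta \in \{1,2\}).
\end{align*}
```

The last line is why both scalar integrals are controlled by the integrability condition on µ.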

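Finiteness of $\int_{(0,\infty)} w_{t,\theta}\, d\mu(t)$ implies

\[ \lim_{t_1 \downarrow 0} \int_{(0,t_1)} w_{t,\theta}\, d\mu(t) = 0, \qquad \theta \in \{1, 2\}. \]

Also, since $\int_{(0,\infty)} \eta_t(r)\, d\mu(t) < \infty$ for all r,

\[
\begin{aligned}
\int_{(t_1,\infty)} \Lambda_1^{\eta_t}(\rho)\, d\mu(t) + \int_{(t_1,\infty)} \Lambda_2^{\eta_t}(I)\, d\mu(t)
 &= \int_{(t_1,\infty)} \eta_t(\rho)\, d\mu(t) \\
 &\to \int_{(0,\infty)} \eta_t(\rho)\, d\mu(t) = \int_{(0,\infty)} g_{\eta_t}(\rho, I)\, d\mu(t), \quad \text{as } t_1 \downarrow 0.
\end{aligned}
\]

Therefore, taking t1 small enough, we have

\[ \Lambda_1^{t_1}(\rho) + \Lambda_2^{t_1}(I) \ge g_f(\rho, I) - \varepsilon. \]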

Next, we move on to the case (ii), where fˆ (0) < ∞ and σ ≥ 0. We apply
the above construction to {ρ̃, σ} (recall supp ρ̃ ⊂ supp σ), and use Lemma 9.1.
In this case, since f (0) < ∞, r0 = 0 and thus there is no need to suppose
∥Λ1,0 ∥cb < ∞.
When f (0) = ∞, fˆ (0) < ∞, ρ > 0 and σ ≥ 0, replacing f by fˆ, an almost
parallel construction is possible. However, if f (0) = ∞, fˆ (0) = ∞, and
dim H = ∞, the argument in the previous section is not valid, since the spectrum of
σ −1/2 ρσ −1/2 does not have a finite gap from 0 in general. Nor is any Loewner-type
integral formula known to the present author.

11 On Operator Convexity
From this section, we again come back to our usual setup, where dim H < ∞.
When σ = I, by (41) and (42), (Λf1 , Λf2 ) ∈ Wf (H) is equivalent to

\[
f(r)\,X - \frac12\bigl(Xf(\rho) + f(\rho)X\bigr) - Df(\rho)\Bigl[ rX - \frac12\,(X\rho + \rho X) \Bigr] \ge 0, \quad \forall r \ge 0. \tag{51}
\]

Theorem 12.4 gives the following characterization of operator convexity:

Claim 11.1 A function f with (FC) is operator convex iff (51) holds for all ρ, X ∈
B≥ (H).

When X commutes with ρ and is invertible, this is equivalent to

\[ f(rI) - f(\rho) \ge f'(\rho)\,(rI - \rho). \]
Proof. Since "only if" has already been shown, we show "if". By Lemma 7.1,
for each simultaneous decomposition of {ρ, I},

\[ \sum_{x\in X} g_f(p_x, q_x)\, s_x \ge \Lambda_1^f(\rho) + \Lambda_2^f(I) = f(\rho), \]

where equality is achieved by defining qx = 1, ∀x ∈ X and letting $\sum_{x\in X} p_x s_x = \rho$
be the spectral decomposition of ρ. Thus,

\[ f(\rho) = g_f(\rho, I). \]

Since the RHS is a convex function, so is the LHS.

12 Generalized f - divergence
Define the generalized f -divergence, with A, ρ, σ ∈ B≥ (H), by

\[
D_{f,A}(\rho\|\sigma) := \inf\Bigl\{ \sum_{x\in X} g_f(p_x, q_x)\, \mathrm{tr}\,A s_x \ ;\ \{s_x, p_x, q_x\}_{x\in X} \text{ is a simultaneous decomposition of } \{\rho, \sigma\} \Bigr\}.
\]

Since it is a scalar quantity and gf (px , qx ) is bounded from below, Df,A (ρ∥σ) >
−∞. As is easily verified, if f is operator convex,

\[ D_{f,A}(\rho\|\sigma) = \mathrm{tr}\,A\, g_f(\rho, \sigma). \]

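Theorem 12.1 Let |I| < ∞.

(i) If Λi are positive linear maps from B (H) into B (H),

\[
D_{f,A}\Bigl( \sum_{i\in I} \Lambda_i(\rho_i) \,\Big\|\, \sum_{i\in I} \Lambda_i(\sigma_i) \Bigr)
 \le \sum_{i\in I} D^{\max}_{f,\Lambda_i^\dagger(A)}(\rho_i\|\sigma_i).
\]

(ii) If ci ∈ R≥ (∀i ∈ I) and $\sum_i c_i = 1$,

\[
D_{f,A}\Bigl( \sum_{i\in I} c_i \rho_i \,\Big\|\, \sum_{i\in I} c_i \sigma_i \Bigr)
 \le \sum_{i\in I} c_i\, D_{f,A}(\rho_i\|\sigma_i).
\]

(iii) For any positive map Λ,

\[ D_{f,A}(\Lambda(\rho)\|\Lambda(\sigma)) \le D_{f,\Lambda^\dagger(A)}(\rho\|\sigma). \]

(iv) Suppose f (0) ≤ 0. Then for all σ ′ ∈ B≥ (H),

\[ D_{f,A}(\rho\|\sigma) \ge D_{f,A}(\rho\|\sigma + \sigma'). \]

(v) If C −1 exists,

\[ D_{f,A}\bigl( C\rho C^\dagger \,\big\|\, C\sigma C^\dagger \bigr) = D_{f,C^\dagger A C}(\rho\|\sigma). \tag{52} \]

(vi) Suppose ρi , σi , Ai ∈ B≥ (Hi ) (i ∈ {1, 2}). Then

\[
D_{f,A_1\oplus A_2}(\rho_1\oplus\rho_2 \,\|\, \sigma_1\oplus\sigma_2) = \sum_{i\in\{1,2\}} D_{f,A_i}(\rho_i\|\sigma_i).
\]

(vii) For all ρ1 , σ1 ∈ B≥ (H1 ) , ρ2 ∈ B≥ (H2 ),

\[ D_{f,A}(\rho_1\oplus\rho_2 \,\|\, \sigma_1\oplus 0) = D_{f,A_1}(\rho_1\|\sigma_1) + \hat f(0)\,\mathrm{tr}\,A_2\rho_2, \tag{53} \]

where Ai := πHi AπHi (i ∈ {1, 2}).

(viii) For all ρ, σ ∈ B≥ (H) ,

\[
\begin{aligned}
D_{f,A}(\rho\|\sigma) &= D_{f,A}(\tilde\rho\|\sigma) + \hat f(0)\,\mathrm{tr}\,A(\rho - \tilde\rho) \\
 &= D_{f,A}(\rho\|\tilde\sigma) + f(0)\,\mathrm{tr}\,A(\sigma - \tilde\sigma) \\
 &= D_{f,A}(\tilde\rho\|\tilde\sigma) + \hat f(0)\,\mathrm{tr}\,A(\rho - \tilde\rho) + f(0)\,\mathrm{tr}\,A(\sigma - \tilde\sigma).
\end{aligned}
\tag{54}
\]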

Proof. (i)-(vi) are proved almost in parallel with the analogous assertions for
the operator perspective gf (ρ, σ).
(vii) is an immediate consequence of (iii) of Theorem 3.3. To obtain (viii),
combine (v), (vii) and the decompositions (12) and (13).

If supp ρ ⊂ supp σ,

\[ D_{f,A}(\rho\|\sigma) \le \mathrm{tr}\,A\,\sigma^{1/2} f\bigl(\sigma^{-1/2}\rho\,\sigma^{-1/2}\bigr)\,\sigma^{1/2}. \tag{55} \]

Theorem 12.2 Suppose f satisfies (FC). Then Df,A (ρ∥σ) < ∞ only in the
following four cases.
(i) fˆ (0) < ∞ and f (0) < ∞;
(ii) fˆ (0) = ∞ , f (0) < ∞, and A (ρ − ρ̃) = 0 ;
(iii) fˆ (0) < ∞, f (0) = ∞, and A (σ − σ̃) = 0 ;
(iv) fˆ (0) = ∞ , f (0) = ∞, and A (ρ − ρ̃) = A (σ − σ̃) = 0.

Proof. By (54), it suffices to show Df,A (ρ̃∥σ̃) < ∞, which is the consequence
of (55).

Lemma 12.3 Suppose Df,A (ρ∥σ) < ∞ and ρ, σ ∈ B≥ (H). Then

\[ D_{f,A}(\rho\|\sigma) = \sup_{(W_1,W_2)\in W_{f,A}} \{ \mathrm{tr}\,\rho W_1 + \mathrm{tr}\,\sigma W_2 \}, \tag{56} \]

where

\[ W_{f,A} := \{ (W_1, W_2) \ ;\ rW_1 + W_2 \le f(r)\,A, \ \forall r \ge 0 \}. \]

The proof is almost the same as the one of Theorem in [7], thus omitted.
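
The inequality "≥" in (56) can be sketched directly. The block below assumes that a simultaneous decomposition {sx , px , qx } means ρ = Σ px sx and σ = Σ qx sx with sx ≥ 0 and px , qx ≥ 0, that gf (p, q) = q f (p/q) for q > 0, and, for simplicity, that every qx > 0. These are assumptions about definitions made earlier in the paper, not statements taken from this section.

```latex
% Weak-duality sketch: (W_1, W_2) \in W_{f,A}, applied with r = p_x / q_x, and s_x \ge 0.
\begin{align*}
\mathrm{tr}\,\rho W_1 + \mathrm{tr}\,\sigma W_2
 &= \sum_{x} q_x\,\mathrm{tr}\, s_x\Bigl(\tfrac{p_x}{q_x}\, W_1 + W_2\Bigr) \\
 &\le \sum_{x} q_x\, f\!\Bigl(\tfrac{p_x}{q_x}\Bigr)\,\mathrm{tr}\, s_x A
  = \sum_{x} g_f(p_x,q_x)\,\mathrm{tr}\,A s_x .
\end{align*}
```

Taking the infimum over decompositions on the right and the supremum over Wf,A on the left gives one direction of (56); terms with qx = 0 would need a separate limiting argument.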

Theorem 12.4 (56) holds for all ρ ≥ 0, σ ≥ 0.

Proof. Suppose Df,A (ρ∥σ) = ∞. By Theorem 12.2, either fˆ (0) = ∞ and
A (ρ − ρ̃) ̸= 0, or f (0) = ∞ and A (σ − σ̃) ̸= 0. Since the latter case
reduces to the former by replacing f by fˆ, we only study the former. Since
$(\Lambda_1^\dagger(A), \Lambda_2^\dagger(A)) \in W_{f,A}$ provided (Λ1 , Λ2 ) ∈ W′f , by Lemma 7.8,

\[
\sup_{(W_1,W_2)\in W_{f,A}(H)} \{ \mathrm{tr}\,\rho W_1 + \mathrm{tr}\,\sigma W_2 \}
 \ge \sup_{(\Lambda_1,\Lambda_2)\in W'_f(H)} \bigl\{ \mathrm{tr}\,\rho\,\Lambda_1^\dagger(A) + \mathrm{tr}\,\sigma\,\Lambda_2^\dagger(A) \bigr\}
 = \infty.
\]

References
[1] Bhatia, R.: Matrix Analysis. Springer, Berlin (1996)
[2] Bhatia, R.: Positive Definite Matrices. Princeton (2007)
[3] Ebadian, A., Nikoufar, I., Gordji, M. E.: Perspectives of matrix convex
functions. Proc. Natl. Acad. Sci. USA 108(18), 7313–7314 (2011)
[4] Effros, E., Hansen, F.: Non-commutative perspectives. Ann.
Funct. Anal. 5(2), 74–79 (2014)
[5] Hiai, F., Mosonyi, M., Petz, D., Beny, C.: Quantum f-divergences and
error correction. Rev. Math. Phys. 23, 691–747 (2011)
[6] Luenberger, D. G.: Optimization by Vector Space Methods. Wiley, New York
(1969)
[7] Matsumoto, K.: A new quantum version of f-divergence. arXiv:1311.4722
(2013)
[8] Rockafellar, R. T.: Convex Analysis. Princeton (1970)
[9] Ryan, R. A.: Introduction to Tensor Products of Banach Spaces. Springer,
Berlin (2002)
[10] Zhang, F. (ed.): The Schur Complement and Its Applications. Springer, Berlin
(2005)

A Some backgrounds from matrix analysis


Proposition A.1 (Theorem V.2.3 of [1]) Let f be a continuous function on
[0, ∞). Then, if f is operator convex and f (0) ≤ 0, for any positive operator
X and any operator C such that ∥C∥ ≤ 1,

\[ f\bigl(C^\dagger X C\bigr) \le C^\dagger f(X)\, C. \]

Proposition A.2 (Exercise 1.3.5 of [2], Theorem 1.12 of [10]) Let X, Y be
positive definite matrices. Then

\[
\begin{bmatrix} X & C \\ C^\dagger & Y \end{bmatrix} \ge 0 \tag{57}
\]

implies

\[ X \ge C Y^{-1} C^\dagger, \qquad Y \ge C^\dagger X^{-1} C. \tag{58} \]

Conversely, if $X \ge C Y^{-1} C^\dagger$ and Y ≥ 0, then (57) holds.
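
A minimal instance of Proposition A.2 is the case of 1 × 1 blocks, written out below as an illustration. Here x, y and c are scalars introduced only for this example, with y > 0.

```latex
% 1x1 blocks: X = x, Y = y > 0, C = c \in \mathbb{C}.
\[
\begin{bmatrix} x & c \\ \bar c & y \end{bmatrix} \ge 0
\iff x \ge 0 \ \text{and}\ xy - |c|^2 \ge 0
\iff x \ge c\, y^{-1}\, \bar c = \frac{|c|^2}{y}.
\]
```

For example, x = 2, y = 1, c = 1 satisfies the condition, while x = 1/2, y = 1, c = 1 does not, since the determinant is then −1/2.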

Remark A.3 In [2] and [10], they suppose X > 0 and/or Y > 0. However,
since the range of C and C † is a subspace of supp X and supp Y respectively,
existence of ker X and ker Y does not cause any problem.
