Relative Entropies
Guanghua Shi
arXiv:1911.04222v2 [math.FA] 19 Jun 2020
Abstract
In this paper, we focus on variational representations of some matrix symmetric
norm functions that are related to the quantum Rényi relative entropy.
Concretely, we obtain variational representations of the function
$(A, B) \mapsto |||(B^{q/2} K^* A^p K B^{q/2})^s|||$
for symmetric norms by using the Hölder inequality and Young inequality. These
variational expressions make the proofs of the convexity/concavity of the
trace function $(A, B) \mapsto \mathrm{Tr}(B^{q/2} K^* A^p K B^{q/2})^s$ clearer.
1 Introduction
Let Mn be the set of n × n matrices and Pn be the set of n × n positive definite matrices.
A matrix A ∈ Pn with TrA = 1 is called a density matrix. Many of the statements in this
paper are of special interest for density matrices but we will not make such restriction.
For A, B ∈ Pn , the traditional Rényi relative entropy (due to Petz [24]) is defined as
$$D_\alpha(A\|B) = \frac{1}{\alpha-1} \log \mathrm{Tr}(A^\alpha B^{1-\alpha}), \quad \alpha \in (0, \infty)\setminus\{1\}. \tag{1.1}$$
A generalization of the traditional Rényi relative entropy was introduced by Müller-
Lennert, Dupuis, Szehr, Fehr, Tomamichel [23] and Wilde, Winter, Yang [28]. This entropy
is called the sandwiched Rényi relative entropy, and is defined as
$$\tilde D_\alpha(A\|B) = \frac{1}{\alpha-1} \log F_\alpha(A, B), \quad \alpha \in (0, \infty)\setminus\{1\}, \tag{1.2}$$
where
$$F_\alpha(A, B) = \mathrm{Tr}\left(B^{\frac{1-\alpha}{2\alpha}} A B^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}, \quad \alpha \in (0, \infty). \tag{1.3}$$
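The two entropies defined above can be compared numerically. The following is a minimal sketch, assuming real symmetric positive definite 2 × 2 matrices so that matrix powers can be computed in closed form with the standard library; the helper names (`matmul`, `mpow`, `petz_renyi`, `sandwiched_renyi`) and the sample matrices are illustrative and not taken from the paper.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mpow(M, e):
    """Real power M**e of a symmetric positive definite 2x2 matrix."""
    a, b, d = M[0][0], 0.5 * (M[0][1] + M[1][0]), M[1][1]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    lo, hi = mean - rad, mean + rad          # the two eigenvalues
    if hi - lo < 1e-14:                      # M is a multiple of the identity
        return [[lo ** e, 0.0], [0.0, lo ** e]]
    # P is the spectral projection for hi; I - P is the one for lo.
    P = [[(a - lo) / (hi - lo), b / (hi - lo)],
         [b / (hi - lo), (d - lo) / (hi - lo)]]
    return [[hi ** e * P[i][j] + lo ** e * ((i == j) - P[i][j])
             for j in range(2)] for i in range(2)]

def trace(M):
    return M[0][0] + M[1][1]

def petz_renyi(A, B, alpha):
    """D_alpha(A||B) = log Tr(A^alpha B^(1-alpha)) / (alpha - 1), as in (1.1)."""
    return math.log(trace(matmul(mpow(A, alpha), mpow(B, 1 - alpha)))) / (alpha - 1)

def sandwiched_renyi(A, B, alpha):
    """Sandwiched entropy log F_alpha(A, B) / (alpha - 1), as in (1.2)-(1.3)."""
    Bh = mpow(B, (1 - alpha) / (2 * alpha))
    F = trace(mpow(matmul(matmul(Bh, A), Bh), alpha))
    return math.log(F) / (alpha - 1)

# Two illustrative density matrices (trace one, positive definite).
A = [[0.7, 0.2], [0.2, 0.3]]
B = [[0.4, -0.1], [-0.1, 0.6]]
for alpha in (0.5, 2.0, 3.0):
    print(alpha, petz_renyi(A, B, alpha), sandwiched_renyi(A, B, alpha))
```

When A and B commute the two quantities coincide. For non-commuting inputs the sandwiched value is expected not to exceed the traditional one, which is a known consequence of the Araki-Lieb-Thirring inequality and not a statement made in this paper.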
The trace function Fα (A, B) is a parameterized version of the fidelity
$$F(A, B) = \mathrm{Tr}\left(B^{\frac12} A B^{\frac12}\right)^{\frac12}, \tag{1.4}$$
and is called the sandwiched quasi-relative entropy. Note that ([23])
$$\lim_{\alpha\to\infty} \tilde D_\alpha(A\|B) = \| \log B^{-\frac12} A B^{-\frac12} \|, \tag{1.5}$$
where $\|\cdot\|$ is the operator norm. The expression in (1.5) coincides with the Thompson
metric
for all CPTP maps Φ and density matrices A, B. This inequality is also known as the
Data Processing Inequality. Essentially, the data processing inequality is equivalent to
the joint convexity or concavity of the trace functions in the definition of the quantum
divergence D. The key to solving the joint convexity problems of such trace functions
is to develop the related operator convexity theorems. This line of work was started by
Lieb [22], who proved the so-called Lieb’s concavity theorem and thereby established the
convexity of the Wigner-Yanase-Dyson information. Lieb’s concavity theorem subsequently
gave impetus to several related works, such as Epstein’s theorem [14], Ando’s convexity
theorem [1] and their generalizations.
The convexity of the trace function $\mathrm{Tr}\,A^\alpha B^{1-\alpha}$ in the traditional Rényi relative entropy
$D_\alpha$ can be established by Lieb’s concavity theorem and Ando’s convexity theorem. The
convexity of $F_\alpha(A, B)$ in the definition of the sandwiched Rényi relative entropy was
established by Frank and Lieb in [15], and is based on extensions of Lieb’s concavity
theorem and Epstein’s theorem. The trace function in the α-z Rényi relative entropy
$D_{\alpha,z}$ can be abstracted into
$$\Psi_{p,q,s}(A, B) = \mathrm{Tr}\left(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}}\right)^{s}.$$
for A, B ∈ Pn and α ∈ R. These representations make the proof of the
convexity/concavity of $\Psi_{p,q,s}(A, B)$ clearer and enable us to give some new extensions.
This paper is organized as follows. In Section 2, we establish the reverse Hölder
inequality and Young inequality for symmetric norms. In Section 3, we derive variational
representations of some matrix functionals related to quantum Rényi relative entropies.
Finally, in Section 4, we recover the convexity/concavity of
$\Psi_{p,q,s}(A, B) = \mathrm{Tr}(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^s$ and
give a new generalization of Lieb’s concavity theorem in terms of symmetric anti-norms.
(i) Φ is a norm on Rn ,
(iv) Φ(1, 0, . . . , 0) = 1.
$$\frac{1}{r}|xy|^r \ge \frac{1}{p}|x|^p + \frac{1}{q}|y|^q. \tag{2.1}$$
Theorem 2.1. For x, y ∈ Rn and a symmetric gauge function Φ,
$$\Phi(|xy|^r)^{\frac1r} \ge \Phi(|x|^p)^{\frac1p}\, \Phi(|y|^q)^{\frac1q} \tag{2.2}$$
holds for r, p > 0, q < 0 with $\frac1r = \frac1p + \frac1q$.
$$\frac{1}{r}\Phi(|xy|^r) \ge \frac{1}{p}\Phi(|x|^p) + \frac{1}{q}\Phi(|y|^q).$$
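The function
$$\varphi(t) = \frac{r}{p}\, t^{p} a + \frac{r}{q}\, t^{-q} b, \quad \text{where } t, a, b > 0,$$
attains its maximum at the point $t_0 = (b/a)^{\frac{1}{p+q}}$, where
$\varphi'(t_0) = r\,(t_0^{p-1} a - t_0^{-q-1} b) = 0$, and
$$\varphi(t_0) = a^{\frac{r}{p}}\, b^{\frac{r}{q}}.$$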
Hence we have
$$\Phi(|xy|^r)^{\frac1r} \ge \Phi(|x|^p)^{\frac1p}\, \Phi(|y|^q)^{\frac1q}.$$
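The inequality of Theorem 2.1 and the maximization step in its proof can be spot-checked numerically. The sketch below assumes the Ky Fan k-norms (sum of the k largest absolute entries) as sample symmetric gauge functions and uses randomly generated vectors; the function names and parameter values are illustrative choices, not part of the paper.

```python
import math
import random

def ky_fan(v, k):
    """Ky Fan k-norm: sum of the k largest absolute entries of v."""
    return sum(sorted((abs(t) for t in v), reverse=True)[:k])

def holder_sides(x, y, p, q, k):
    """Both sides of the reverse Hoelder inequality (2.2) for Phi = Ky Fan k-norm.

    Also returns a = Phi(|x|^p), b = Phi(|y|^q) and r, with 1/r = 1/p + 1/q.
    """
    r = 1.0 / (1.0 / p + 1.0 / q)
    lhs = ky_fan([abs(s * t) ** r for s, t in zip(x, y)], k) ** (1.0 / r)
    a = ky_fan([abs(t) ** p for t in x], k)
    b = ky_fan([abs(t) ** q for t in y], k)
    return lhs, a ** (1.0 / p) * b ** (1.0 / q), a, b, r

def phi(t, a, b, p, q, r):
    """The scalar function (r/p) t^p a + (r/q) t^(-q) b from the proof."""
    return (r / p) * t ** p * a + (r / q) * t ** (-q) * b

random.seed(0)
p, q = 1.0, -2.0                      # r = 2, so r, p > 0 and q < 0
x = [random.uniform(-2, 2) for _ in range(5)]
# The entries of y must be nonzero because q is negative.
y = [random.choice((-1, 1)) * random.uniform(0.2, 2) for _ in range(5)]
for k in range(1, 6):
    lhs, rhs, a, b, r = holder_sides(x, y, p, q, k)
    t0 = (b / a) ** (1.0 / (p + q))   # stationary point of phi
    print(k, lhs >= rhs, phi(t0, a, b, p, q, r), a ** (r / p) * b ** (r / q))
```

For p = 1, q = −2 and a = 2, b = 1 the function reduces to 4t − t², whose maximum 4 is attained at t = 2, matching the closed form.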
Denote by s(A) the n-vector whose coordinates are the singular values of the matrix
A ∈ Mn in decreasing order, i.e. $s_1(A) \ge s_2(A) \ge \dots \ge s_n(A)$. Given a symmetric
gauge function Φ on Rn , the function
|||A||| := Φ(s(A))
holds for r, p, q > 0 with $\frac1r = \frac1p + \frac1q$ [6, (IV.43)]. Now we consider the reverse Hölder
inequality for symmetric norms.
Theorem 2.2. For a symmetric norm |||·||| and matrices A, B with B invertible, the inequality
$$|||\,|AB|^r\,|||^{\frac1r} \ge |||\,|A|^p\,|||^{\frac1p} \cdot |||\,|B|^q\,|||^{\frac1q} \tag{2.4}$$
holds for r, p > 0, q < 0 with $\frac1r = \frac1p + \frac1q$.
Proof. By Gelfand-Naimark Theorem (see [19, Theorem 6.13]) we have
It follows that
Hence it follows that
$$|||\,|AB|^r\,|||^{\frac1r} \ge |||\,|A|^p\,|||^{\frac1p} \cdot |||\,|B|^q\,|||^{\frac1q}.$$
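Theorem 2.2 can be illustrated for the trace norm. The following sketch assumes real symmetric positive definite 2 × 2 matrices A and B, for which |A|^p = A^p, |B|^q = B^q and |AB|^r = (B A² B)^{r/2}, and for which the trace norm of a positive matrix is its trace. All helper names and the random test matrices are hypothetical.

```python
import math
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mpow(M, e):
    """Real power M**e of a symmetric positive definite 2x2 matrix."""
    a, b, d = M[0][0], 0.5 * (M[0][1] + M[1][0]), M[1][1]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    lo, hi = mean - rad, mean + rad
    if hi - lo < 1e-14:
        return [[lo ** e, 0.0], [0.0, lo ** e]]
    P = [[(a - lo) / (hi - lo), b / (hi - lo)],
         [b / (hi - lo), (d - lo) / (hi - lo)]]
    return [[hi ** e * P[i][j] + lo ** e * ((i == j) - P[i][j])
             for j in range(2)] for i in range(2)]

def trace(M):
    return M[0][0] + M[1][1]

def random_pd(rng):
    """Random symmetric positive definite 2x2 matrix G^T G + 0.2 I."""
    g = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    gt = [[g[j][i] for j in range(2)] for i in range(2)]
    m = matmul(gt, g)
    return [[m[i][j] + 0.2 * (i == j) for j in range(2)] for i in range(2)]

def reverse_holder_sides(A, B, p, q):
    """Left and right sides of (2.4) for the trace norm and symmetric A, B > 0."""
    r = 1.0 / (1.0 / p + 1.0 / q)
    abs_ab_sq = matmul(matmul(B, matmul(A, A)), B)   # (AB)^*(AB) = B A^2 B
    lhs = trace(mpow(abs_ab_sq, r / 2)) ** (1.0 / r)
    rhs = trace(mpow(A, p)) ** (1.0 / p) * trace(mpow(B, q)) ** (1.0 / q)
    return lhs, rhs

rng = random.Random(0)
A, B = random_pd(rng), random_pd(rng)
for p, q in ((1.0, -2.0), (2.0, -3.0)):   # r = 2 and r = 6 respectively
    lhs, rhs = reverse_holder_sides(A, B, p, q)
    print(p, q, lhs, rhs, lhs >= rhs)
```

Restricting to 2 × 2 matrices keeps the spectral calculus explicit; larger sizes would need a numerical eigensolver.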
3 Variational representations of some matrix functionals
For a symmetric norm |||·|||, consider the function
$$\tilde\Psi_{p,q,s}(A, B) = |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||.$$
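$$
\tilde\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \min_{Z>0} \Big\{ |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}|||^{\frac{p}{p+q}} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||^{\frac{q}{p+q}} \Big\}, \\[2mm]
\displaystyle \min_{Z>0} \Big\{ \frac{p}{p+q}\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}||| + \frac{q}{p+q}\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}||| \Big\}.
\end{cases}
\tag{3.1}
$$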
(ii) Let s > 0, p > 0, q < 0 with p + q > 0; or s > 0, p < 0, q > 0 with p + q < 0. Then
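$$
\tilde\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \max_{Z>0} \Big\{ |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}|||^{\frac{p}{p+q}} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||^{\frac{q}{p+q}} \Big\}, \\[2mm]
\displaystyle \max_{Z>0} \Big\{ \frac{p}{p+q}\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}||| + \frac{q}{p+q}\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}||| \Big\}.
\end{cases}
\tag{3.2}
$$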
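Proof. From the Hölder inequality and Young inequality and their reverse versions, we have
for S, T ∈ Mn,
$$
\begin{aligned}
|||\,|ST|^{r_0}\,||| &\le |||\,|S|^{r_1}\,|||^{\frac{r_0}{r_1}}\; |||\,|T|^{r_2}\,|||^{\frac{r_0}{r_2}} && (3.3) \\
&\le \frac{r_0}{r_1}\, |||\,|S|^{r_1}\,||| + \frac{r_0}{r_2}\, |||\,|T|^{r_2}\,||| \quad \Big(r_0, r_1, r_2 > 0,\ \frac{1}{r_0} = \frac{1}{r_1} + \frac{1}{r_2}\Big); && (3.4)
\end{aligned}
$$
and for S, T ∈ Mn with T invertible,
$$
\begin{aligned}
|||\,|ST|^{r_0}\,||| &\ge |||\,|S|^{r_1}\,|||^{\frac{r_0}{r_1}}\; |||\,|T|^{r_2}\,|||^{\frac{r_0}{r_2}} && (3.5) \\
&\ge \frac{r_0}{r_1}\, |||\,|S|^{r_1}\,||| + \frac{r_0}{r_2}\, |||\,|T|^{r_2}\,||| \quad \Big(r_0, r_1 > 0,\ r_2 < 0,\ \frac{1}{r_0} = \frac{1}{r_1} + \frac{1}{r_2}\Big). && (3.6)
\end{aligned}
$$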
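Under the conditions of (i), set
$$r_0 = 2s, \quad r_1 = \frac{2s(p+q)}{p}, \quad r_2 = \frac{2s(p+q)}{q},$$
then the conditions $r_0, r_1, r_2 > 0$ and $\frac{1}{r_0} = \frac{1}{r_1} + \frac{1}{r_2}$ hold. Then by setting
$S = A^{\frac{p}{2}} K Z^{-\frac12}$ and $T = Z^{\frac12} B^{\frac{q}{2}}$ in inequalities (3.3) and (3.4), we have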
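$$
\begin{aligned}
\tilde\Psi_{p,q,s}(A, B) &= |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}||| \\
&= |||(B^{\frac{q}{2}} Z^{\frac12} Z^{-\frac12} K^* A^{\frac{p}{2}} A^{\frac{p}{2}} K Z^{-\frac12} Z^{\frac12} B^{\frac{q}{2}})^{s}||| \\
&= |||\,|A^{\frac{p}{2}} K Z^{-\frac12} \cdot Z^{\frac12} B^{\frac{q}{2}}|^{r_0}\,||| \\
&\le |||\,|A^{\frac{p}{2}} K Z^{-\frac12}|^{r_1}\,|||^{\frac{r_0}{r_1}} \cdot |||\,|Z^{\frac12} B^{\frac{q}{2}}|^{r_2}\,|||^{\frac{r_0}{r_2}} \\
&= |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}|||^{\frac{p}{p+q}} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||^{\frac{q}{p+q}} \\
&\le \frac{p}{p+q}\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}||| + \frac{q}{p+q}\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||.
\end{aligned}
$$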
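When
$$Z = B^{-q}\, \sharp_{\frac{q}{p+q}}\, (K^* A^p K),$$
we have
$$
\begin{aligned}
|||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}|||
&= |||((K^* A^p K)^{\frac12} Z^{-1} (K^* A^p K)^{\frac12})^{\frac{s(p+q)}{p}}||| \\
&= |||((K^* A^p K)^{\frac12} ((K^* A^p K)^{-1}\, \sharp_{\frac{p}{p+q}}\, B^{q}) (K^* A^p K)^{\frac12})^{\frac{s(p+q)}{p}}||| \\
&= |||((K^* A^p K)^{\frac12} B^{q} (K^* A^p K)^{\frac12})^{\frac{p}{p+q} \cdot \frac{s(p+q)}{p}}||| \\
&= |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||;
\end{aligned}
$$
and
$$
\begin{aligned}
|||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||
&= |||(B^{\frac{q}{2}} Z B^{\frac{q}{2}})^{\frac{s(p+q)}{q}}||| \\
&= |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{\frac{q}{p+q} \cdot \frac{s(p+q)}{q}}||| \\
&= |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||.
\end{aligned}
$$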
Under the conditions of (ii), set
$$r_0 = 2s, \quad r_1 = \frac{2s(p+q)}{p}, \quad r_2 = \frac{2s(p+q)}{q}.$$
Then we have $r_0 > 0$, $r_1 > 0$, $r_2 < 0$ and $\frac{1}{r_0} = \frac{1}{r_1} + \frac{1}{r_2}$. Now set
$S = A^{\frac{p}{2}} K Z^{-\frac12}$ and $T = Z^{\frac12} B^{\frac{q}{2}}$ in inequalities (3.5) and (3.6).
Following a similar argument as above, we obtain
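$$
\tilde\Psi_{p,q,s}(A, B) = |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}||| =
\begin{cases}
\displaystyle \max_{Z>0} \Big\{ |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}|||^{\frac{p}{p+q}} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||^{\frac{q}{p+q}} \Big\}, \\[2mm]
\displaystyle \max_{Z>0} \Big\{ \frac{p}{p+q}\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}||| + \frac{q}{p+q}\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}||| \Big\}.
\end{cases}
$$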
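The minimization form of the representation (the case treated with inequalities (3.3) and (3.4)) can be checked numerically for the trace norm. This is a minimal sketch under stated assumptions: real 2 × 2 matrices, p, q, s > 0, an invertible real K (so K* is the transpose), and the weighted geometric mean X ♯_t Y = X^{1/2}(X^{-1/2} Y X^{-1/2})^t X^{1/2}. The helper names and sample data are illustrative.

```python
import math
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mpow(M, e):
    """Real power M**e of a symmetric positive definite 2x2 matrix."""
    a, b, d = M[0][0], 0.5 * (M[0][1] + M[1][0]), M[1][1]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    lo, hi = mean - rad, mean + rad
    if hi - lo < 1e-14:
        return [[lo ** e, 0.0], [0.0, lo ** e]]
    P = [[(a - lo) / (hi - lo), b / (hi - lo)],
         [b / (hi - lo), (d - lo) / (hi - lo)]]
    return [[hi ** e * P[i][j] + lo ** e * ((i == j) - P[i][j])
             for j in range(2)] for i in range(2)]

def trace(M):
    return M[0][0] + M[1][1]

def sandwich(X, Y):
    """X Y X for a symmetric X."""
    return matmul(matmul(X, Y), X)

def gmean(X, Y, t):
    """Weighted geometric mean X #_t Y."""
    return sandwich(mpow(X, 0.5), mpow(sandwich(mpow(X, -0.5), Y), t))

def kak(A, K, p):
    """K^T A^p K (K is real, so K^* = K^T)."""
    Kt = [[K[j][i] for j in range(2)] for i in range(2)]
    return matmul(matmul(Kt, mpow(A, p)), K)

def psi(A, B, K, p, q, s):
    """Tr (B^{q/2} K^* A^p K B^{q/2})^s."""
    return trace(mpow(sandwich(mpow(B, q / 2), kak(A, K, p)), s))

def objective(Z, A, B, K, p, q, s):
    """Weighted-sum expression inside the minimum, with the trace as norm."""
    e = s * (p + q)
    t1 = trace(mpow(sandwich(mpow(Z, -0.5), kak(A, K, p)), e / p))
    t2 = trace(mpow(sandwich(mpow(Z, 0.5), mpow(B, q)), e / q))
    return (p * t1 + q * t2) / (p + q)

def minimizer(A, B, K, p, q):
    """Z = B^{-q} #_{q/(p+q)} (K^* A^p K), the choice made in the proof."""
    return gmean(mpow(B, -q), kak(A, K, p), q / (p + q))

def random_pd(rng):
    g = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    gt = [[g[j][i] for j in range(2)] for i in range(2)]
    m = matmul(gt, g)
    return [[m[i][j] + 0.2 * (i == j) for j in range(2)] for i in range(2)]

A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, -0.4], [-0.4, 1.5]]
K = [[1.0, 0.5], [-0.3, 1.2]]
p, q, s = 0.5, 0.7, 0.6
print(psi(A, B, K, p, q, s), objective(minimizer(A, B, K, p, q), A, B, K, p, q, s))
rng = random.Random(0)
print(min(objective(random_pd(rng), A, B, K, p, q, s) for _ in range(50)))
```

The first line should print two nearly equal values, and the minimum over random trial matrices Z should not fall below them.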
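$$
\tilde\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \min_{Z>0} \Big\{ |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}|||^{1-sq} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}|||^{sq} \Big\}, \\[2mm]
\displaystyle \min_{Z>0} \Big\{ (1-sq)\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}||| + sq\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}||| \Big\}.
\end{cases}
\tag{3.7}
$$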
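$$
\tilde\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \max_{Z>0} \Big\{ |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}|||^{1-sq} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}|||^{sq} \Big\}, \\[2mm]
\displaystyle \max_{Z>0} \Big\{ (1-sq)\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}||| + sq\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}||| \Big\}.
\end{cases}
\tag{3.8}
$$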
Proof. Under the conditions of (i), set $r_0 = 2s$, $r_1 = \frac{2s}{1-sq}$, $r_2 = \frac{2}{q}$; then $r_0, r_1, r_2 > 0$ and
$\frac{1}{r_0} = \frac{1}{r_1} + \frac{1}{r_2}$ hold. Now set $S = A^{\frac{p}{2}} K Z^{-\frac12}$ and $T = Z^{\frac12} B^{\frac{q}{2}}$ in inequalities (3.3) and (3.4);
then we have
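$$
\begin{aligned}
\tilde\Psi_{p,q,s}(A, B) &= |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}||| \\
&= |||(B^{\frac{q}{2}} Z^{\frac12} Z^{-\frac12} K^* A^{\frac{p}{2}} A^{\frac{p}{2}} K Z^{-\frac12} Z^{\frac12} B^{\frac{q}{2}})^{s}||| \\
&\le |||\,|A^{\frac{p}{2}} K Z^{-\frac12}|^{r_1}\,|||^{\frac{r_0}{r_1}} \cdot |||\,|Z^{\frac12} B^{\frac{q}{2}}|^{r_2}\,|||^{\frac{r_0}{r_2}} \\
&= |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}|||^{1-sq} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}|||^{sq} \\
&\le (1-sq)\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}||| + sq\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}|||.
\end{aligned}
$$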
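When
$$Z = B^{-q}\, \sharp_{sq}\, (K^* A^p K),$$
we have
$$|||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}||| = |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||;$$
and
$$|||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}||| = |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||.$$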
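Similarly, under the conditions of (ii) and by using inequalities (3.5) and (3.6), we
get
$$
\tilde\Psi_{p,q,s}(A, B) = |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}||| =
\begin{cases}
\displaystyle \max_{Z>0} \Big\{ |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}|||^{1-sq} \cdot |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}|||^{sq} \Big\}, \\[2mm]
\displaystyle \max_{Z>0} \Big\{ (1-sq)\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}||| + sq\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}||| \Big\}.
\end{cases}
$$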
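The representation with weights 1 − sq and sq can also be tried with a symmetric norm other than the trace norm. The sketch below assumes the operator norm (largest eigenvalue of a positive matrix), real 2 × 2 data, and parameters with s, q > 0 and sq < 1 so that r0, r1, r2 > 0; these parameter choices and all helper names are assumptions made for illustration.

```python
import math
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigs(M):
    """Eigenvalues (lo, hi) of a symmetric 2x2 matrix."""
    a, b, d = M[0][0], 0.5 * (M[0][1] + M[1][0]), M[1][1]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return mean - rad, mean + rad

def mpow(M, e):
    """Real power M**e of a symmetric positive definite 2x2 matrix."""
    lo, hi = eigs(M)
    if hi - lo < 1e-14:
        return [[lo ** e, 0.0], [0.0, lo ** e]]
    b = 0.5 * (M[0][1] + M[1][0])
    P = [[(M[0][0] - lo) / (hi - lo), b / (hi - lo)],
         [b / (hi - lo), (M[1][1] - lo) / (hi - lo)]]
    return [[hi ** e * P[i][j] + lo ** e * ((i == j) - P[i][j])
             for j in range(2)] for i in range(2)]

def sandwich(X, Y):
    return matmul(matmul(X, Y), X)

def gmean(X, Y, t):
    """Weighted geometric mean X #_t Y."""
    return sandwich(mpow(X, 0.5), mpow(sandwich(mpow(X, -0.5), Y), t))

def kak(A, K, p):
    Kt = [[K[j][i] for j in range(2)] for i in range(2)]
    return matmul(matmul(Kt, mpow(A, p)), K)

def opnorm_pow(X, e):
    """Operator norm of X**e for positive definite X and e > 0."""
    return eigs(X)[1] ** e

def psi_tilde(A, B, K, p, q, s):
    return opnorm_pow(sandwich(mpow(B, q / 2), kak(A, K, p)), s)

def objective(Z, A, B, K, p, q, s):
    u = opnorm_pow(sandwich(mpow(Z, -0.5), kak(A, K, p)), s / (1 - s * q))
    v = opnorm_pow(sandwich(mpow(Z, 0.5), mpow(B, q)), 1 / q)
    return (1 - s * q) * u + s * q * v

def random_pd(rng):
    g = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    gt = [[g[j][i] for j in range(2)] for i in range(2)]
    m = matmul(gt, g)
    return [[m[i][j] + 0.2 * (i == j) for j in range(2)] for i in range(2)]

A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, -0.4], [-0.4, 1.5]]
K = [[1.0, 0.5], [-0.3, 1.2]]
p, q, s = 0.8, 1.2, 0.5                      # sq = 0.6 < 1
Z_star = gmean(mpow(B, -q), kak(A, K, p), s * q)   # B^{-q} #_{sq} (K^* A^p K)
print(psi_tilde(A, B, K, p, q, s), objective(Z_star, A, B, K, p, q, s))
```

The two printed numbers should agree up to rounding, and `objective` evaluated at other positive definite Z should be no smaller.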
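$$
\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \min_{Z>0} \Big\{ \big[\mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}\big]^{\frac{p}{p+q}} \cdot \big[\mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}\big]^{\frac{q}{p+q}} \Big\}, \\[2mm]
\displaystyle \min_{Z>0} \Big\{ \frac{p}{p+q}\, \mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}} + \frac{q}{p+q}\, \mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}} \Big\}.
\end{cases}
\tag{3.9}
$$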
(ii) Let s > 0, p > 0, q < 0 with p + q > 0; or s > 0, p < 0, q > 0 with p + q < 0. Then
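$$
\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \max_{Z>0} \Big\{ \big[\mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}\big]^{\frac{p}{p+q}} \cdot \big[\mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}\big]^{\frac{q}{p+q}} \Big\}, \\[2mm]
\displaystyle \max_{Z>0} \Big\{ \frac{p}{p+q}\, \mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}} + \frac{q}{p+q}\, \mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}} \Big\}.
\end{cases}
\tag{3.10}
$$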
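$$
\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \min_{Z>0} \Big\{ \big[\mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}\big]^{1-sq} \cdot \big[\mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}\big]^{sq} \Big\}, \\[2mm]
\displaystyle \min_{Z>0} \Big\{ (1-sq)\, \mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}} + sq\, \mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}} \Big\}.
\end{cases}
\tag{3.11}
$$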
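$$
\Psi_{p,q,s}(A, B) =
\begin{cases}
\displaystyle \max_{Z>0} \Big\{ \big[\mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}}\big]^{1-sq} \cdot \big[\mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}\big]^{sq} \Big\}, \\[2mm]
\displaystyle \max_{Z>0} \Big\{ (1-sq)\, \mathrm{Tr}(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s}{1-sq}} + sq\, \mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}} \Big\}.
\end{cases}
\tag{3.12}
$$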
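Setting s = t, p = 1, $q = \frac{1-t}{t}$ and K = I in Corollary 3.3 or Corollary 3.4, we obtain:
Corollary 3.5. Let A, B ∈ Pn . If 0 ≤ t ≤ 1, then
$$
F_t(A, B) = \mathrm{Tr}\big(B^{\frac{1-t}{2t}} A B^{\frac{1-t}{2t}}\big)^{t} =
\begin{cases}
\displaystyle \min_{Z \in P_n} \Big\{ \big[\mathrm{Tr}\, Z^{-\frac12} A Z^{-\frac12}\big]^{t} \cdot \big[\mathrm{Tr}(Z^{\frac12} B^{\frac{1-t}{t}} Z^{\frac12})^{\frac{t}{1-t}}\big]^{1-t} \Big\}, \\[2mm]
\displaystyle \min_{Z \in P_n} \Big\{ t\, \mathrm{Tr}\, Z^{-\frac12} A Z^{-\frac12} + (1-t)\, \mathrm{Tr}(Z^{\frac12} B^{\frac{1-t}{t}} Z^{\frac12})^{\frac{t}{1-t}} \Big\}.
\end{cases}
$$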
If t ≥ 1, then
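$$
F_t(A, B) := \mathrm{Tr}\big(B^{\frac{1-t}{2t}} A B^{\frac{1-t}{2t}}\big)^{t}
= \max_{Z \in P_n} \Big\{ \big[\mathrm{Tr}\, Z^{-\frac12} A Z^{-\frac12}\big]^{t} \cdot \big[\mathrm{Tr}(Z^{\frac12} B^{\frac{1-t}{t}} Z^{\frac12})^{\frac{t}{1-t}}\big]^{1-t} \Big\}.
$$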
The variational expressions of $F_t(A, B)$ for t ∈ (0, 1) were obtained in [15]. See also [5]
and [7].
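The two minimization formulas for F_t with 0 < t < 1 can be compared on a small example. The sketch below assumes real symmetric positive definite 2 × 2 matrices and evaluates both forms at Z = B^{-q} ♯_{1−t} A with q = (1 − t)/t, which is the minimizer Z = B^{-q} ♯_{sq} (K* A^p K) specialized to s = t, p = 1, K = I. Helper names and sample matrices are illustrative.

```python
import math
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mpow(M, e):
    """Real power M**e of a symmetric positive definite 2x2 matrix."""
    a, b, d = M[0][0], 0.5 * (M[0][1] + M[1][0]), M[1][1]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    lo, hi = mean - rad, mean + rad
    if hi - lo < 1e-14:
        return [[lo ** e, 0.0], [0.0, lo ** e]]
    P = [[(a - lo) / (hi - lo), b / (hi - lo)],
         [b / (hi - lo), (d - lo) / (hi - lo)]]
    return [[hi ** e * P[i][j] + lo ** e * ((i == j) - P[i][j])
             for j in range(2)] for i in range(2)]

def trace(M):
    return M[0][0] + M[1][1]

def sandwich(X, Y):
    return matmul(matmul(X, Y), X)

def gmean(X, Y, t):
    """Weighted geometric mean X #_t Y."""
    return sandwich(mpow(X, 0.5), mpow(sandwich(mpow(X, -0.5), Y), t))

def fidelity_t(A, B, t):
    """F_t(A, B) = Tr (B^{(1-t)/(2t)} A B^{(1-t)/(2t)})^t."""
    return trace(mpow(sandwich(mpow(B, (1 - t) / (2 * t)), A), t))

def two_terms(Z, A, B, t):
    """Tr Z^{-1/2} A Z^{-1/2} and Tr (Z^{1/2} B^q Z^{1/2})^{1/q}, q = (1-t)/t."""
    q = (1 - t) / t
    t1 = trace(matmul(mpow(Z, -1.0), A))   # equals Tr Z^{-1/2} A Z^{-1/2}
    t2 = trace(mpow(sandwich(mpow(Z, 0.5), mpow(B, q)), 1 / q))
    return t1, t2

def product_form(Z, A, B, t):
    t1, t2 = two_terms(Z, A, B, t)
    return t1 ** t * t2 ** (1 - t)

def sum_form(Z, A, B, t):
    t1, t2 = two_terms(Z, A, B, t)
    return t * t1 + (1 - t) * t2

def random_pd(rng):
    g = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    gt = [[g[j][i] for j in range(2)] for i in range(2)]
    m = matmul(gt, g)
    return [[m[i][j] + 0.2 * (i == j) for j in range(2)] for i in range(2)]

A = [[0.7, 0.2], [0.2, 0.3]]
B = [[0.4, -0.1], [-0.1, 0.6]]
t = 0.6
Z_star = gmean(mpow(B, -(1 - t) / t), A, 1 - t)
Z_scaled = [[3.0 * z for z in row] for row in Z_star]
print(fidelity_t(A, B, t), product_form(Z_star, A, B, t), sum_form(Z_star, A, B, t))
print(product_form(Z_scaled, A, B, t), sum_form(Z_scaled, A, B, t))
```

A direct computation shows that the product form is unchanged when Z is multiplied by a positive scalar, whereas the weighted sum is not, so the second printed line should reproduce F_t only in its first entry.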
4 Generalizing Lieb’s Concavity Theorem via variational method
Now we consider the convexity or concavity of Ψ̃p,q,s(A, B) and Ψp,q,s(A, B) by using the
variational method. Before doing so, we recall some convexity/concavity theorems about
the matrix function
We first recover the following well-known conclusions by using Corollary 3.4. For
more information on Theorem 4.2 we refer the reader to [10, 29].
(i) If 0 ≤ p, q ≤ 1 and 0 < s ≤ 1/(p + q), then Ψp,q,s (A, B) is jointly concave.
is concave in A. And
$$\mathrm{Tr}(Z^{\frac12} B^q Z^{\frac12})^{\frac{1}{q}}$$
is concave in (A, B). Then, by the variational representation (3.11) and Lemma 13 of [10],
$\Psi_{p,q,s}(A, B)$ is jointly concave.
Under the conditions of (ii), we have
$$-1 \le p \le 0, \quad \frac{s}{1-sq} = \frac{1}{s^{-1} - q} > 0, \quad \text{and} \quad -1 \le q \le 0.$$
is convex in (A, B). Hence, by the variational representation (3.12) and Lemma 13 of [10],
$\Psi_{p,q,s}(A, B)$ is jointly convex.
Under the conditions of (iii), we have
$$1 \le p \le 2, \quad \frac{s}{1-sq} = \frac{1}{s^{-1} - q} \ge \frac{1}{p}, \quad \text{and} \quad -1 \le q \le 0.$$
is convex in (A, B). Hence, by the variational representation (3.12) and Lemma 13 of [10],
$\Psi_{p,q,s}(A, B)$ is jointly convex.
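Case (i) of the result recovered above, joint concavity for 0 ≤ p, q ≤ 1 and 0 < s ≤ 1/(p + q), can be spot-checked through the midpoint inequality Ψ((A1 + A2)/2, (B1 + B2)/2) ≥ (Ψ(A1, B1) + Ψ(A2, B2))/2. The sketch below assumes real 2 × 2 matrices and randomly generated positive definite inputs; it tests the inequality on samples and is not a proof. All names are illustrative.

```python
import math
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mpow(M, e):
    """Real power M**e of a symmetric positive definite 2x2 matrix."""
    a, b, d = M[0][0], 0.5 * (M[0][1] + M[1][0]), M[1][1]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    lo, hi = mean - rad, mean + rad
    if hi - lo < 1e-14:
        return [[lo ** e, 0.0], [0.0, lo ** e]]
    P = [[(a - lo) / (hi - lo), b / (hi - lo)],
         [b / (hi - lo), (d - lo) / (hi - lo)]]
    return [[hi ** e * P[i][j] + lo ** e * ((i == j) - P[i][j])
             for j in range(2)] for i in range(2)]

def trace(M):
    return M[0][0] + M[1][1]

def psi(A, B, K, p, q, s):
    """Tr (B^{q/2} K^T A^p K B^{q/2})^s for real K."""
    Kt = [[K[j][i] for j in range(2)] for i in range(2)]
    M = matmul(matmul(Kt, mpow(A, p)), K)
    Bh = mpow(B, q / 2)
    return trace(mpow(matmul(matmul(Bh, M), Bh), s))

def midpoint(X, Y):
    return [[0.5 * (X[i][j] + Y[i][j]) for j in range(2)] for i in range(2)]

def random_pd(rng):
    g = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    gt = [[g[j][i] for j in range(2)] for i in range(2)]
    m = matmul(gt, g)
    return [[m[i][j] + 0.2 * (i == j) for j in range(2)] for i in range(2)]

def concavity_gap(A1, B1, A2, B2, K, p, q, s):
    """Midpoint value minus average value; nonnegative under joint concavity."""
    mid = psi(midpoint(A1, A2), midpoint(B1, B2), K, p, q, s)
    return mid - 0.5 * (psi(A1, B1, K, p, q, s) + psi(A2, B2, K, p, q, s))

K = [[1.0, 0.5], [-0.3, 1.2]]
rng = random.Random(0)
for p, q, s in ((0.5, 0.5, 1.0), (0.3, 0.6, 1.0), (1.0, 0.5, 0.5)):
    gaps = [concavity_gap(random_pd(rng), random_pd(rng),
                          random_pd(rng), random_pd(rng), K, p, q, s)
            for _ in range(50)]
    print(p, q, s, min(gaps))
```

Each parameter triple satisfies s ≤ 1/(p + q), so every reported minimum gap is expected to be nonnegative up to rounding.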
More generally, Hiai [16] proved the following results, which can be viewed as
generalizations of Epstein’s theorem for symmetric (anti-)norms.
Theorem 4.3. Let K ∈ Mn , and let $|||\cdot|||_!$ be a symmetric anti-norm and |||·||| a symmetric
norm on matrices.
(i) If 0 < p ≤ 1 and 0 < s ≤ 1/p, then $|||(K^* A^p K)^s|||_!$ is concave in A ∈ Pn .
(ii) If −1 ≤ p < 0 and s > 0, then $|||(K^* A^p K)^s|||$ is convex in A ∈ Pn .
We now consider the extension of Lieb’s concavity theorem to symmetric anti-norms.
for S, T ∈ Pn , and $r, r_1, r_2 > 0$ with $\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2}$. Then for 0 ≤ p, q ≤ 1 and 0 < s ≤
1/(p + q), the map
$$(A, B) \mapsto |||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||_!$$
is jointly concave.
Proof. Since
$$0 \le p \le 1,\ \ \frac{s(p+q)}{p} \le \frac{1}{p}, \quad \text{and} \quad 0 \le q \le 1,\ \ \frac{s(p+q)}{q} \le \frac{1}{q},$$
is concave in A, and
$$|||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||_!$$
is concave in B. The Hölder inequality (4.1) ensures the corresponding Young inequality
and hence a similar version of the variational representation for symmetric anti-norms:
$$
|||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||_!
= \min_{Z \in P_n} \Big\{ \frac{p}{p+q}\, |||(Z^{-\frac12} K^* A^p K Z^{-\frac12})^{\frac{s(p+q)}{p}}|||_! + \frac{q}{p+q}\, |||(Z^{\frac12} B^q Z^{\frac12})^{\frac{s(p+q)}{q}}|||_! \Big\}.
$$
Hence, it follows that $|||(B^{\frac{q}{2}} K^* A^p K B^{\frac{q}{2}})^{s}|||_!$ is jointly concave in (A, B).
References
[1] T. Ando, Concavity of certain maps on positive definite matrices and applications to
Hadamard products. Linear Algebra Appl., 26 (1979), 203-241.
[4] S. Beigi, Sandwiched Rényi divergence satisfies data processing inequality. J. Math.
Phys., 54 (2013), 122202.
[7] R. Bhatia, T. Jain, Y. Lim, Strong convexity of sandwiched entropies and related
optimization problems. Reviews in Mathematical Physics, 30 (2018), 1850014.
[8] J. C. Bourin, F. Hiai, Jensen and Minkowski inequalities for operator means and
anti-norms. Linear Algebra Appl., 456 (2014), 22-53.
[9] E. A. Carlen, R. L. Frank, and E. H. Lieb, Some operator and trace function convexity
theorems. Linear Algebra Appl., 490 (2016) 174-185.
[10] E. A. Carlen, R. L. Frank, and E. H. Lieb, Inequalities for quantum divergences and
the Audenaert-Datta conjecture. Journal of Physics A: Mathematical and Theoretical,
51(48) (2018), 483001.
[11] E. A. Carlen, E. H. Lieb, A Minkowski type trace inequality and strong subadditivity
of quantum entropy. II. Convexity and concavity. Lett. Math. Phys., 83(2) (2008),
107-126.
[12] N. Datta, Min-and max-relative entropies and a new entanglement monotone, IEEE
Transactions on Information Theory 55 (2009), 2816-2826.
[13] E.G. Effros. A matrix convexity approach to some celebrated quantum inequalities.
Proc. Natl. Acad. Sci. USA, 106 (2009), 1006-1008.
[14] H. Epstein, Remarks on two theorems of E. Lieb. Comm. Math. Phys., 31 (1973),
317-325.
[16] F. Hiai, Concavity of certain matrix trace and norm functions. Linear Algebra Appl.,
439(5) (2013), 1568-1589.
[17] F. Hiai, Concavity of certain matrix trace and norm functions. II. Linear Algebra
Appl., 496 (2016), 193-220.
[19] F. Hiai, D. Petz, Introduction to matrix analysis and applications. Springer Science
& Business Media, 2014.
[20] De Huang, Generalizing Lieb’s Concavity Theorem via operator interpolation,
Advances in Mathematics, 369 (2020), 107208.
[21] V. Jaksic, Y. Ogata, Y. Pautrat and C.-A. Pillet, Entropic fluctuations in quantum
statistical mechanics. An Introduction. In: Quantum Theory from Small to Large
Scales: Lecture Notes of the Les Houches Summer School: Volume 95, August 2010,
Oxford University Press, 2012.
[22] E. H. Lieb, Convex trace functions and the Wigner-Yanase-Dyson conjecture.
Advances in Mathematics, 11 (1973), 267-288.
[24] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys., 23 (1986),
57-65.
[26] J. Tropp, From joint convexity of quantum relative entropy to a concavity theorem
of Lieb. Proc. Amer. Math. Soc. 140(5) (2012), 1757-1760.
[27] H. Umegaki, Conditional expectation in an operator algebra. IV. Entropy and
Information. Kodai Math. Sem. Rep., 14 (1962), 59-85.
[28] M. M. Wilde, A. Winter, D. Yang, Strong converse for the classical capacity of
entanglement-breaking and Hadamard channels via a sandwiched Rényi relative
entropy. Comm. Math. Phys., 331(2) (2014), 593-622.