Monatshefte für Mathematik 79, 303–306 (1975)

© by Springer-Verlag 1975

A Trace Inequality of John von Neumann


L. Mirsky, Sheffield, England

(Received 12 December 1973)

The principal aim of this note is to establish an effectively self-contained
proof of J. von Neumann's inequality
$|\operatorname{tr}(AB)| \le \sum_{r=1}^{n} \rho_r \sigma_r$, where $A$, $B$ are
any complex $n \times n$ matrices with singular values
$\rho_1 \ge \dots \ge \rho_n$, $\sigma_1 \ge \dots \ge \sigma_n$ respectively.

1. If $A$ is a complex $n \times n$ matrix, we shall denote its conjugate
transpose by $A^*$. The matrix $AA^*$ is non-negative hermitian and
its characteristic roots, say $\omega_1, \dots, \omega_n$, are therefore real
and non-negative. The numbers $+\sqrt{\omega_1}, \dots, +\sqrt{\omega_n}$ are
called the singular values of $A$.
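As a concrete illustration of this definition, the following minimal Python
sketch computes the singular values of a $2 \times 2$ complex matrix directly
from the characteristic roots of $AA^*$. The function name
`singular_values_2x2` and the restriction to $n = 2$ (so that the
characteristic equation is a quadratic) are assumptions made for the example
and are not part of the original note.

```python
import math

def singular_values_2x2(a):
    """Singular values of a 2x2 complex matrix a, given as a list of two rows.

    Hypothetical helper for illustration only. The characteristic roots
    w1 >= w2 of A A* satisfy w1 + w2 = tr(A A*) and w1 * w2 = |det A|^2.
    """
    # tr(A A*) equals the sum of the squared moduli of all entries of A.
    t = sum(abs(z) ** 2 for row in a for z in row)
    # det(A A*) = |det A|^2.
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0]) ** 2
    # Roots of w^2 - t*w + d = 0; max() guards against rounding below zero.
    disc = math.sqrt(max(t * t - 4.0 * d, 0.0))
    w1 = (t + disc) / 2.0
    w2 = max((t - disc) / 2.0, 0.0)
    return math.sqrt(w1), math.sqrt(w2)

# Example: A = [[3, 0], [4, 5]] has A A* = [[9, 12], [12, 41]],
# whose characteristic roots are 45 and 5.
rho1, rho2 = singular_values_2x2([[3, 0], [4, 5]])
```

The values are returned in non-increasing order, matching the ordering
convention used for $\rho_1 \ge \dots \ge \rho_n$ in the theorem.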
In 1937, J. VON NEUMANN [5] proved the following result concerning
the trace of matrix products.
Theorem. If $A$, $B$ are complex $n \times n$ matrices with singular values
$\rho_1 \ge \dots \ge \rho_n$, $\sigma_1 \ge \dots \ge \sigma_n$
respectively, then

$$|\operatorname{tr}(AB)| \le \sum_{r=1}^{n} \rho_r \sigma_r. \tag{1}$$

This inequality emerged in the course of a broadly based
investigation, and the original proof was consequently not
particularly easy. Some years ago, I offered what seemed quite a
simple derivation [4]. In view of the interest of von Neumann's
inequality, it may be worthwhile to present an alternative
treatment which is straightforward and elementary, and also (apart
from the use of a standard factorization theorem for matrices)
entirely self-contained.

2. A square matrix is said to be doubly-stochastic if its elements
are real and non-negative numbers and if the sum of the elements
in each row and in each column is equal to 1. We need the following
preliminary result, which has also a certain independent interest.
Lemma. If $(d_{rs})$ is a doubly-stochastic $n \times n$ matrix and if

$$x_1 \ge \dots \ge x_n \ge 0, \qquad y_1 \ge \dots \ge y_n \ge 0, \tag{2}$$

then

$$\sum_{r,s=1}^{n} d_{rs} x_r y_s \le \sum_{r=1}^{n} x_r y_r. \tag{3}$$
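The statement of the lemma can be tried out numerically. The sketch below is
only an illustration under assumed names (`is_doubly_stochastic`, `lemma_gap`,
and the example data are all hypothetical): it checks the doubly-stochastic
conditions and evaluates the difference between the right-hand and left-hand
sides of (3), which the lemma asserts to be non-negative.

```python
def is_doubly_stochastic(d, tol=1e-12):
    """True if d is square, non-negative, with all row and column sums 1."""
    n = len(d)
    if any(len(row) != n for row in d):
        return False
    if any(entry < 0 for row in d for entry in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in d)
    cols_ok = all(abs(sum(d[r][s] for r in range(n)) - 1.0) <= tol
                  for s in range(n))
    return rows_ok and cols_ok

def lemma_gap(d, x, y):
    """Return sum_r x_r y_r - sum_{r,s} d_rs x_r y_s (non-negative by (3))."""
    n = len(x)
    diagonal = sum(x[r] * y[r] for r in range(n))
    weighted = sum(d[r][s] * x[r] * y[s]
                   for r in range(n) for s in range(n))
    return diagonal - weighted

# Illustrative data: a circulant doubly-stochastic matrix and two
# non-increasing, non-negative sequences as required by (2).
d_example = [[0.5, 0.3, 0.2],
             [0.2, 0.5, 0.3],
             [0.3, 0.2, 0.5]]
x_example = [5.0, 3.0, 1.0]
y_example = [4.0, 2.0, 0.0]
gap = lemma_gap(d_example, x_example, y_example)
```

For the identity matrix the gap is zero, so the bound in (3) cannot be
improved in general.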

The conditions $x_n \ge 0$, $y_n \ge 0$ in (2) can actually be dispensed
with, but we prefer to formulate the assertion in the weakest form
which is adequate for our purpose. It should also be noted that the
lemma is closely related to a result of KY FAN [1, Lemma 1A].
However, the argument used below is different from FAN's.
In view of (2), there exist non-negative numbers $\xi_i$, $\eta_i$
$(1 \le i \le n)$ such that

$$x_r = \sum_{r \le i \le n} \xi_i, \qquad y_r = \sum_{r \le i \le n} \eta_i \qquad (1 \le r \le n).$$

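Hence, using the symbol $\delta_{rs}$ to denote the Kronecker delta, we
have

$$\begin{aligned}
\sum_{r=1}^{n} x_r y_r - \sum_{r,s=1}^{n} d_{rs} x_r y_s
&= \sum_{r,s=1}^{n} (\delta_{rs} - d_{rs}) x_r y_s \\
&= \sum_{1 \le r,s \le n} (\delta_{rs} - d_{rs}) \sum_{r \le i \le n} \xi_i \sum_{s \le j \le n} \eta_j \\
&= \sum_{1 \le i,j \le n} \xi_i \eta_j \sum_{\substack{1 \le r \le i \\ 1 \le s \le j}} (\delta_{rs} - d_{rs}).
\end{aligned} \tag{4}$$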


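If $i \le j$, then the inner sum on the right-hand side of (4) is
non-negative since

$$\sum_{\substack{1 \le r \le i \\ 1 \le s \le j}} (\delta_{rs} - d_{rs})
= i - \sum_{1 \le r \le i} \sum_{1 \le s \le j} d_{rs}
\ge i - \sum_{1 \le r \le i} \sum_{1 \le s \le n} d_{rs} = 0;$$

and the same conclusion is obtained in a similar way for $j \le i$.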

The assertion (3) now follows in view of (4). Finally, we may note
that the lemma is also a very easy consequence of G. Birkhoff's
well-known theorem on doubly-stochastic matrices.
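The rearrangement behind (4) can also be checked numerically. In the sketch
below, the helper names (`increments`, `inner_sum`, `both_sides_of_4`) and the
example data are assumptions introduced for illustration. The increments are
taken as $\xi_i = x_i - x_{i+1}$ with $x_{n+1} = 0$, which is one choice
satisfying $x_r = \sum_{r \le i \le n} \xi_i$.

```python
def increments(x):
    """xi_i = x_i - x_{i+1} with x_{n+1} = 0, so x_r = sum of xi_i for i >= r."""
    n = len(x)
    return [x[i] - (x[i + 1] if i + 1 < n else 0.0) for i in range(n)]

def inner_sum(d, i, j):
    """Sum of (delta_rs - d_rs) over 1 <= r <= i, 1 <= s <= j (1-based i, j)."""
    # The Kronecker deltas contribute min(i, j) to the sum.
    return min(i, j) - sum(d[r][s] for r in range(i) for s in range(j))

def both_sides_of_4(d, x, y):
    """Return (left-hand side, right-hand side) of identity (4)."""
    n = len(x)
    lhs = (sum(x[r] * y[r] for r in range(n))
           - sum(d[r][s] * x[r] * y[s] for r in range(n) for s in range(n)))
    xi, eta = increments(x), increments(y)
    rhs = sum(xi[i - 1] * eta[j - 1] * inner_sum(d, i, j)
              for i in range(1, n + 1) for j in range(1, n + 1))
    return lhs, rhs

# Illustrative doubly-stochastic matrix and sequences satisfying (2).
d_example = [[0.5, 0.3, 0.2],
             [0.2, 0.5, 0.3],
             [0.3, 0.2, 0.5]]
lhs, rhs = both_sides_of_4(d_example, [5.0, 3.0, 1.0], [4.0, 2.0, 0.0])
smallest_inner = min(inner_sum(d_example, i, j)
                     for i in range(1, 4) for j in range(1, 4))
```

Every term on the right-hand side is a product of non-negative factors, which
is exactly why (3) follows from (4).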

3. We now come to the proof of the inequality (1). Here we
shall invoke a standard theorem to the effect that any complex
square matrix $M$ can be expressed in the form $M = UDV$, where
$U$ and $V$ are unitary while $D$ is a diagonal matrix whose diagonal
elements are the singular values of $M$ (arranged in any pre-assigned
order). Let us, then, write

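$$A = U_1 R V_1, \qquad B = U_2 S V_2,$$

where $U_1$, $V_1$, $U_2$, $V_2$ are unitary matrices and

$$R = \operatorname{diag}(\rho_1, \dots, \rho_n), \qquad S = \operatorname{diag}(\sigma_1, \dots, \sigma_n).$$

Then, since $\operatorname{tr}(PQ) = \operatorname{tr}(QP)$, we have

$$\operatorname{tr}(AB) = \operatorname{tr}(V_2 U_1 R V_1 U_2 S) = \operatorname{tr}(U^T R V S),$$

where $U = (u_{rs}) = (V_2 U_1)^T$ and $V = (v_{rs}) = V_1 U_2$ are unitary
matrices. Hence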

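$$\operatorname{tr}(AB) = \sum_{r,s=1}^{n} u_{rs} v_{rs} \rho_r \sigma_s$$

and therefore

$$|\operatorname{tr}(AB)| \le \sum_{r,s=1}^{n} |u_{rs}| \, |v_{rs}| \, \rho_r \sigma_s
\le \frac{1}{2} \sum_{r,s=1}^{n} |u_{rs}|^2 \rho_r \sigma_s
+ \frac{1}{2} \sum_{r,s=1}^{n} |v_{rs}|^2 \rho_r \sigma_s.$$

But, as is plain, $(|u_{rs}|^2)$ and $(|v_{rs}|^2)$ are doubly-stochastic
matrices; and (1) now follows immediately by the application of the lemma.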
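The steps of this proof can be traced numerically for $n = 2$. The following
sketch is an illustration under stated assumptions: the helper names, the
parametrization of the unitary matrices by an angle and a phase, and the
chosen values of $\rho_r$, $\sigma_r$ are all hypothetical. It builds $A$ and
$B$ from prescribed singular values, so that no singular value computation is
needed, and then compares $\operatorname{tr}(AB)$ with the double sum and
with the bound in (1).

```python
import cmath
import math

def unitary(t, p):
    """A 2x2 unitary matrix from an angle t and a phase p (illustrative)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -cmath.exp(1j * p) * s],
            [cmath.exp(-1j * p) * s, c]]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]

# Prescribed singular values (assumed example data), in non-increasing order.
rho = [3.0, 1.0]
sigma = [2.0, 0.5]
u1, v1 = unitary(0.3, 1.1), unitary(1.2, -0.4)
u2, v2 = unitary(-0.7, 2.0), unitary(0.9, 0.6)

a = matmul(matmul(u1, diag(rho)), v1)      # A = U1 R V1
b = matmul(matmul(u2, diag(sigma)), v2)    # B = U2 S V2

# U = (V2 U1)^T and V = V1 U2, as in the proof.
w = matmul(v2, u1)
u = [[w[s][r] for s in range(2)] for r in range(2)]
v = matmul(v1, u2)

tr_ab = trace(matmul(a, b))
double_sum = sum(u[r][s] * v[r][s] * rho[r] * sigma[s]
                 for r in range(2) for s in range(2))
bound = sum(r * s for r, s in zip(rho, sigma))

# Row and column sums of (|u_rs|^2); each should equal 1 up to rounding.
squares = [[abs(z) ** 2 for z in row] for row in u]
row_sums = [sum(row) for row in squares]
col_sums = [sum(squares[r][s] for r in range(2)) for s in range(2)]
```

Constructing $A$ and $B$ from the factorization keeps the example
self-contained; for arbitrary input matrices a singular value routine would
be needed instead.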

4. We shall conclude our discussion with miscellaneous remarks
bearing on von Neumann's theorem.
An almost immediate consequence of (1) is the identity

$$\sup_{U,V} |\operatorname{tr}(AUBV)| = \sum_{r=1}^{n} \rho_r \sigma_r,$$

where the supremum is taken with respect to all pairs $U$, $V$ of
unitary matrices. For the derivation of this result (which is also
due to VON NEUMANN) from (1), see e.g. [4].
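One possible way to see the identity, using the factorization of Section 3,
is sketched below. This is an illustrative argument and not necessarily the
one given in [4]; the particular choice of $U$ and $V$ is an assumption made
for the sketch.

```latex
% Upper bound: AU and BV have the same singular values as A and B, since
% (AU)(AU)^* = AA^* and (BV)(BV)^* = BB^*, so (1) applies to the pair AU, BV.
\[
  |\operatorname{tr}(AUBV)| \le \sum_{r=1}^{n} \rho_r \sigma_r .
\]
% Attainment: with A = U_1 R V_1 and B = U_2 S V_2, choose the unitary
% matrices U = V_1^* U_2^* and V = V_2^* U_1^* (an assumed choice).
\[
  \operatorname{tr}(AUBV)
  = \operatorname{tr}(U_1 R V_1 V_1^* U_2^* U_2 S V_2 V_2^* U_1^*)
  = \operatorname{tr}(U_1 R S U_1^*)
  = \operatorname{tr}(RS)
  = \sum_{r=1}^{n} \rho_r \sigma_r .
\]
```

With this choice the supremum is in fact attained, so it is a maximum.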
The inequality (1) has stimulated a good deal of subsequent
research and has led to generalizations in several directions. We
content ourselves with a passing reference to the work of KY FAN
[1], of MARCUS and MOYLS [2], and of the present writer [3].
Important applications of the inequality will be found in the book
by R. SCHATTEN [6].
Finally, we may ask whether it is possible to assign a significant
lower bound to $|\operatorname{tr}(AB)|$. This question was considered by
I. SCHUR [7] who obtained a very striking result which deserves
to be better known (and which appeared in a paper published in
the same year as VON NEUMANN'S work). Schur demonstrated, in
fact, that, for any complex square matrices $A$, $B$ (of the same
order), the inequality

$$|\operatorname{tr}(AB)|^2 \ge \operatorname{tr}(A^*A\,BB^*) + \operatorname{tr}(AA^*\,B^*B) - \operatorname{tr}(A^*A) \cdot \operatorname{tr}(B^*B)$$

is valid.
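Schur's inequality is easy to evaluate for concrete matrices. The sketch
below is illustrative only: the helper names (`adjoint`, `matmul`, `trace`,
`schur_sides`) and the example matrices are assumptions, not taken from [7].
It returns both sides of the inequality so that they can be compared.

```python
def adjoint(a):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def schur_sides(a, b):
    """Return (|tr(AB)|^2, tr(A*A BB*) + tr(AA* B*B) - tr(A*A) tr(B*B))."""
    a_star, b_star = adjoint(a), adjoint(b)
    lhs = abs(trace(matmul(a, b))) ** 2
    t1 = trace(matmul(matmul(a_star, a), matmul(b, b_star)))
    t2 = trace(matmul(matmul(a, a_star), matmul(b_star, b)))
    t3 = trace(matmul(a_star, a)) * trace(matmul(b_star, b))
    # Each trace is real in exact arithmetic; .real discards rounding noise.
    rhs = complex(t1 + t2 - t3).real
    return lhs, rhs

# A nilpotent matrix and its adjoint: both sides coincide here.
lhs_eq, rhs_eq = schur_sides([[0, 1], [0, 0]], [[0, 0], [1, 0]])

# A less special (assumed) example with complex entries.
lhs_gen, rhs_gen = schur_sides([[1, 2j], [0, 1]], [[1, 0], [1j, 3]])
```

The right-hand side can be negative, in which case the bound is trivial; the
first example shows that equality can nevertheless occur.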


[1] FAN, K.: Maximum properties and inequalities for the eigenvalues
of completely continuous operators. Proc. Nat. Acad. Sci. 37, 760–766
(1951).
[2] MARCUS, M., and B. N. MOYLS: On the maximum principle of Ky
Fan. Canad. J. Math. 9, 313–320 (1957).
[3] MIRSKY, L.: Maximum principles in matrix theory. Proc. Glasgow
Math. Assoc. 4, 34–37 (1958).
[4] MIRSKY, L.: On the trace of matrix products. Math. Nachrichten
20, 171–174 (1959).
[5] VON NEUMANN, J.: Some matrix-inequalities and metrization of
matrix-space. Tomsk Univ. Rev. 1, 286–300 (1937). Reprinted in Collected
Works (Pergamon Press, 1962), iv, 205–219.
[6] SCHATTEN, R.: A Theory of Cross-Spaces. Princeton University
Press. 1950.
[7] SCHUR, I.: Über einige Ungleichungen im Matrizenkalkül. Prace
Mat.-fiz. 44, 353–370 (1937). Reprinted in Gesammelte Abhandlungen
(Springer-Verlag, 1973), iii, 330–347.

Prof. L. MIRSKY
Department of Pure Mathematics
University of Sheffield
Sheffield S3 7RH, England

