Hessian

Why is the Hessian of a function well-defined
only at its critical points?
Ivo Terek Couto
Defining d2 f p :
Let Mn be a differentiable manifold, f : M → R be a smooth function, and p ∈ M be
a critical point of f , that is, satisfying d f p = 0. This means that the partial derivatives
of f with respect to any chart around p vanish when evaluated at p. This allows us to
write the:
Definition. The Hessian of f at p is the bilinear form d2 f p : Tp M × Tp M → R defined
by
. n ∂2 f
d2 f p (v, w) = ∑
j
j
( p)viα wα ,
i
i,j=1 ∂xα ∂xα
where (Uα , ϕα = ( xα1 , . . . , xαn )) is a chart around p for which we write

n n
i ∂ j ∂
v = ∑ vα i and w = ∑ wα j .

i =1 ∂xα p i =1 ∂xα p
To make this definition valid, we have to verify that the expression does not depend
on the choice of chart around p. For this end, assume that we are given a second chart
(Uβ , ϕ β = ( x1β , . . . , x nβ )) around p. Then Uα ∩ Uβ is an open set around p (hence we
are able to take derivatives), and we may assume without loss of generality (and to
simplify the writing) that ϕα ( p) = ϕ β ( p) = 0 = (0, . . . , 0) ∈ Rn . The relation between
the coordinate vector fields along Uα ∩ Uβ , evaluated at the correct points, is just
n
∂ ∂xα` ∂
j
= ∑ j `
, j = 1, 2, . . . , n.
∂x β `=1 ∂x ∂xα
β
Seeing this as an equality between differential operators, it follows that

n
∂f ∂xα` ∂ f
j
= ∑ j ∂x` , j = 1, 2, . . . , n.
∂x β `=1 ∂x β α
Applying ∂/∂xiβ on both sides and applying the product rule, we get
n n
∂2 f ∂2 xα`
∂f ∂xαk ∂xα` ∂2 f
j
= ∑ i j ∂x` ∑ ∂xi j ∂xk ∂x` .
+
∂xiβ ∂x β `=1 ∂x β ∂x β α k,`=1 β ∂x β α α
1
Evaluating the above at the point p kills the first sum in the right hand side, in view
of the condition d f p = 0 (which implies that (∂ f /∂xα` )( p) = 0 for ` = 1, 2, . . . , n),
resulting in
n
∂2 f ∂xαk ∂xα` ∂2 f
j
( p) = ∑ i
( 0 ) j ( 0 ) k ` ( p ).
∂xi ∂x k,`=1 ∂x β
β β∂x ∂xα ∂xα
β
Now, to compute the Hessian of f according to the chart (Uβ , ϕ β ), we need to know
the components of the tangent vectors v and w with respect to this new coordinate
basis. Using self-evident notation, we have that
j
n ∂xiβ n ∂x β
∑ ∂xr (0)vrα ∑ ∂xs (0)wαs ,
j
viβ = and wβ = i, j = 1, 2, . . . , n.
r =1 α s =1 α
Putting everything together, we finally compute:

j
n
∂2 f n
∂xαk ∂xα` ∂2 f ∂xiβ ∂x β
∑ ∑ ∂xi
j
j
( p)viβ w β = ( 0 ) j
( 0 ) k ∂x `
( p ) r
( 0 ) v r
α s
(0)wαs
i ∂x ∂x ∂x
i,j=1 ∂x β ∂x β i,j,k,`,r,s=1 β ∂x β α α α α
 
i
! j
n n k n `
∂x ∂x ∂x ∂x ∂2 f
= ∑ ∑ ∂xiα (0) ∂xr (0)  ∑ αj (0) ∂xs (0) ∂xk ∂x` ( p)vrα wαs
β β
k,`,r,s=1 i =1 β α j=1 ∂x β α α α
n
∂2 f
= ∑ δrk δs`
∂xαk ∂xα`
( p)vrα wαs
k,`,r,s=1
n
∂2 f
= ∑ k `
( p)vkα wα` ,
k,`=1 ∂xα ∂xα
as wanted. This means that the Hessian is indeed well-defined if p is a critical point of
the function f . Perhaps a more elegant approach for checking this last part, avoiding
picking the tangent vectors v and w (but which obviously boils down to the same
computation), is to write the transformation law for the differentials at the point p
instead:
n ∂x i
dxiβ p = ∑ r (0) dxαr p
β
i = 1, 2, . . . , n,
r =1
∂x α
setting up
j
 
∂xiβ
!
n n n n
∂2 f ∂xαk ∂x ` ∂2 f ∂x β
∑ ∑ ∑ ⊗ ∑
j
(0) dxiβ p ⊗ dx β p = (0) dxαr (0) dxαs 

(0) αj (0) k ` ( p)
i,j=1 ∂xiβ ∂x β
j i
i,j,k,l =1 ∂x β ∂x β ∂xα ∂xα r =1
∂xαr p
s =1
∂xαs p
and recognizing δrk and δs` to again obtain the same conclusion.
Generalizations:
Everything here uses in a crucial way the fact that d f p = 0. So this raises the natural
question: is it possible to define such a Hessian for arbitrary points of the manifold M?
2
Without additional structure, the answer is no. If you do, however, have some extra
structure to work with, here’s what happens: let ∇ be a (Koszul) connection in the
tangent bundle TM, and define the covariant Hessian of f with respect to ∇ at p as the
map Hess∇ ( f ) p : Tp M × Tp M → R given by
Hess∇ ( f ) p (v, w) = v(w

e ( f )) − d f p (∇v w
e ),
where w e is some extension of w to a neighborhood of p (i.e., a vector field defined
in a neighborhood of p such that w e p = w). By the Leibniz rule for ∇ and its local
character, we see that the right hand side above is actually independent of the choice
of extension for w, and defines a bilinear form on Tp M. Note that if p happens to be a
critical point of f , we recover Hess∇ ( f ) p = d2 f p .
This actually induces a C∞ ( M )-bilinear map Hess∇ ( f ) : X( M) × X( M) → C∞ ( M ),
which is given in local coordinates (U, ( x1 , . . . , x n )) by
Hess∇ ( f )(∂i , ∂ j ) = ∂i ∂ j f − d f (∇∂i ∂ j )
!
n
= ∂i ∂ j f − d f ∑ Γijk ∂k
k =1
n
= ∂i ∂ j f − ∑ Γijk ∂k f ,
k =1
where the n3 functions Γijk are the connection components of ∇. Writing it in its full
glory, we have
!
n 2f n
∂ ∂ f
Hess∇ ( f ) = ∑ i ∂x j
− ∑ Γijk k dxi ⊗ dx j .
i,j=1 ∂x k =1 ∂x
One might also recognize the object Hess∇ ( f ) as the covariant differential ∇(d f ) of
the (0, 1)-tensor d f , which is then a (0, 2)-tensor. But despite all these ways of look-
ing at the Hessian, we cannot expect it to necessarily have good properties, since the
connection ∇ was so arbitrary. In fact, recall that the torsion of the connection ∇ is the
(0, 2)-tensor field τ ∇ : X( M) × X( M) → X( M) given by
τ ∇ ( X, Y ) = ∇ X Y − ∇Y X − [ X, Y ],
where [ X, Y ] is the Lie bracket of X and Y. The presence of [ X, Y ] has the purpose
of making the torsion τ ∇ C∞ ( M )-bilinear. We’ll conclude the discussion with the
following characterization of this torsion:
Proposition. τ ∇ = 0 if and only if Hess∇ ( f ) is a symmetric tensor, for every f ∈ C∞ ( M).
Proof: Given vector fields X, Y ∈ X( M), we compute directly that
Hess∇ ( f )(Y,X ) = Y ( X ( f )) − d f (∇Y X ) = X (Y ( f )) − [ X, Y ]( f ) − d f (∇Y X )
= X (Y ( f )) − d f (∇Y X + [ X, Y ]) = X (Y ( f )) − d f (∇ X Y − τ ∇ ( X, Y ))
= Hess∇ ( f )( X, Y ) + τ ∇ ( X, Y )( f ).
The conclusion follows.

Hessian

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Hessian

Uploaded by

Copyright:

Available Formats

Why is the Hessian of a function well-defined

only at its critical points?

Ivo Terek Couto

where (Uα , ϕα = ( xα1 , . . . , xαn )) is a chart around p for which we write

Seeing this as an equality between differential operators, it follows that

Putting everything together, we finally compute:

Hess∇ ( f ) p (v, w) = v(w

You might also like