You are on page 1of 1

DAMA50 - MATHEMATICS FOR MACHINE LEARNING

● Chain Rule using matrix representation

Consider a function f of two variables, x1 and x2, i.e. f(x1, x2), where x1 and x2
are themselves functions of (s, t). Compute the gradient of f with respect to s, t,

∂x1 ∂x1

[ ∂x1 ∂s ∂x2 ∂t ]
df ∂f ∂x ∂f ∂x1 ∂f ∂x2 ∂f ∂x1 ∂f ∂x2
= [ ∂x ∂x2 ]
∂f ∂f ∂s ∂t
= = + +
d(s, t) ∂x ∂(s, t) 1 ∂x2 ∂x2 ∂x2 ∂s ∂x1 ∂t
∂s ∂t

You might also like