
Matrix Calculus: From Wikipedia, The Free Encyclopedia

Matrix calculus refers to notations for doing multivariable calculus over spaces of matrices. It collects derivatives of functions with respect to matrices and vectors into single entities like matrices and vectors. There are competing notations regarding whether derivatives of scalars with respect to vectors are written as row or column vectors. Care must be taken to ensure notational consistency when combining results from different sources.


27/08/2016

Matrix calculus - Wikipedia, the free encyclopedia

Matrix calculus
From Wikipedia, the free encyclopedia

In mathematics, matrix calculus is a specialized notation for doing multivariable calculus, especially over spaces of matrices. It collects the various partial derivatives of a single function with respect to many variables, and/or of a multivariate function with respect to a single variable, into vectors and matrices that can be treated as single entities. This greatly simplifies operations such as finding the maximum or minimum of a multivariate function and solving systems of differential equations. The notation used here is commonly used in statistics and engineering, while the tensor index notation is preferred in physics.

Two competing notational conventions split the field of matrix calculus into two separate groups. The two groups can be distinguished by whether they write the derivative of a scalar with respect to a vector as a column vector or a row vector. Both of these conventions are possible even when the common assumption is made that vectors should be treated as column vectors when combined with matrices (rather than row vectors). A single convention can be somewhat standard throughout a single field that commonly uses matrix calculus (e.g. econometrics, statistics, estimation theory and machine learning). However, even within a given field different authors can be found using competing conventions. Authors of both groups often write as though their specific convention is standard. Serious mistakes can result when combining results from different authors without carefully verifying that compatible notations are used. Therefore, great care should be taken to ensure notational consistency. Definitions of these two conventions and comparisons between them are collected in the layout conventions section.

Contents
1 Scope
1.1 Relation to other derivatives
1.2 Usages
2 Notation
2.1 Alternatives
3 Derivatives with vectors
3.1 Vector-by-scalar
3.2 Scalar-by-vector
3.3 Vector-by-vector
4 Derivatives with matrices
4.1 Matrix-by-scalar
4.2 Scalar-by-matrix
4.3 Other matrix derivatives
5 Layout conventions
5.1 Numerator-layout notation
5.2 Denominator-layout notation
6 Identities
6.1 Vector-by-vector identities
6.2 Scalar-by-vector identities
6.3 Vector-by-scalar identities
6.4 Scalar-by-matrix identities
6.5 Matrix-by-scalar identities
6.6 Scalar-by-scalar identities
6.6.1 With vectors involved
6.6.2 With matrices involved
6.7 Identities in differential form
7 See also
8 Notes
9 Further reading
10 External links

Scope
Matrix calculus refers to a number of different notations that use matrices and vectors to collect the derivative of each component of the dependent variable with respect to each component of the independent variable. In general, the independent variable can be a scalar, a vector, or a matrix while the dependent variable can be any of these as well. Each different situation will lead to a different set of rules, or a separate calculus, using the broader sense of the term. Matrix notation serves as a convenient way to collect the many derivatives in an organized way.
As a first example, consider the gradient from vector calculus. For a scalar function of three independent variables, $f(x_1, x_2, x_3)$, the gradient is given by the vector equation
$$\nabla f = \frac{\partial f}{\partial x_1}\hat{x}_1 + \frac{\partial f}{\partial x_2}\hat{x}_2 + \frac{\partial f}{\partial x_3}\hat{x}_3,$$
where $\hat{x}_i$ represents a unit vector in the $x_i$ direction for $1 \le i \le 3$. This type of generalized derivative can be seen as the derivative of a scalar, f, with respect to a vector, $\mathbf{x}$, and its result can be easily collected in vector form.
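As a rough illustration of collecting the three partial derivatives into one vector, the sketch below approximates each partial with a central difference. The function `f`, the evaluation point and the step size `h` are assumptions chosen for the example, not part of the article.

```python
def gradient(f, x, h=1e-6):
    """Approximate [df/dx1, ..., df/dxn] with central differences."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


# Hypothetical scalar function of three variables: f = x1^2 + 3*x2 + x2*x3
def f(x):
    return x[0] ** 2 + 3 * x[1] + x[1] * x[2]


point = [1.0, 2.0, 3.0]
numeric = gradient(f, point)

# Hand-derived partials: [2*x1, 3 + x3, x2]
analytic = [2 * point[0], 3 + point[2], point[1]]
```

The list `numeric` plays the role of the single vector-valued object that the notation treats as one entity.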

More complicated examples include the derivative of a scalar function with respect to a matrix, known as the gradient matrix, which collects the derivative with respect to each matrix element in the corresponding position in the resulting matrix. In that case the scalar must be a function of each of the independent variables in the matrix. As another example, if we have an n-vector of dependent variables, or functions, of m independent variables we might consider the derivative of the dependent vector with respect to the independent vector. The result could be collected in an m×n matrix consisting of all of the possible derivative combinations. There are, of course, a total of nine possibilities using scalars, vectors, and matrices. Notice that as we consider higher numbers of components in each of the independent and dependent variables we can be left with a very large number of possibilities.
The six kinds of derivatives that can be most neatly organized in matrix form are collected in the following table.[1]
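Types of matrix derivatives
Types | Scalar | Vector | Matrix
Scalar | $\frac{\partial y}{\partial x}$ | $\frac{\partial \mathbf{y}}{\partial x}$ | $\frac{\partial \mathbf{Y}}{\partial x}$
Vector | $\frac{\partial y}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$ |
Matrix | $\frac{\partial y}{\partial \mathbf{X}}$ | |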
Here, we have used the term "matrix" in its most general sense, recognizing that vectors and scalars are simply matrices with one column and then one row respectively. Moreover, we have used bold letters to indicate vectors and bold capital letters for matrices. This notation is used throughout.
Notice that we could also talk about the derivative of a vector with respect to a matrix, or any of the other unfilled cells in our table. However, these derivatives are most naturally organized in a tensor of rank higher than 2, so that they do not fit neatly into a matrix. In the following three sections we will define each one of these derivatives and relate them to other branches of mathematics. See the layout conventions section for a more detailed table.

Relation to other derivatives
The matrix derivative is a convenient notation for keeping track of partial derivatives for doing calculations. The Fréchet derivative is the standard way in the setting of functional analysis to take derivatives with respect to vectors. In the case that a matrix function of a matrix is Fréchet differentiable, the two derivatives will agree up to translation of notations. As is the case in general for partial derivatives, some formulae may extend under weaker analytic conditions than the existence of the derivative as approximating linear mapping.

Usages
Matrix calculus is used for deriving optimal stochastic estimators, often involving the use of Lagrange multipliers. This includes the derivation of:
Kalman filter
Wiener filter
Expectation-maximization algorithm for Gaussian mixture

Notation
The vector and matrix derivatives presented in the sections to follow take full advantage of matrix notation, using a single variable to represent a large number of variables. In what follows we will distinguish scalars, vectors and matrices by their typeface. We will let M(n,m) denote the space of real n×m matrices with n rows and m columns. Such matrices will be denoted using bold capital letters: A, X, Y, etc. An element of M(n,1), that is, a column vector, is denoted with a boldface lowercase letter: a, x, y, etc. An element of M(1,1) is a scalar, denoted with lowercase italic typeface: a, t, x, etc. $\mathbf{X}^\top$ denotes matrix transpose, tr(X) is the trace, and det(X) is the determinant. All functions are assumed to be of differentiability class $C^1$ unless otherwise noted. Generally letters from the first half of the alphabet (a, b, c, …) will be used to denote constants, and from the second half (t, x, y, …) to denote variables.


NOTE: As mentioned above, there are competing notations for laying out systems of partial derivatives in vectors and matrices, and no standard appears to be emerging yet. The next two introductory sections use the numerator layout convention simply for the purposes of convenience, to avoid overly complicating the discussion. The section after them discusses layout conventions in more detail. It is important to realize the following:
1. Despite the use of the terms "numerator layout" and "denominator layout", there are actually more than two possible notational choices involved. The reason is that the choice of numerator vs. denominator (or in some situations, numerator vs. mixed) can be made independently for scalar-by-vector, vector-by-scalar, vector-by-vector, and scalar-by-matrix derivatives, and a number of authors mix and match their layout choices in various ways.
2. The choice of numerator layout in the introductory sections below does not imply that this is the "correct" or "superior" choice. There are advantages and disadvantages to the various layout types. Serious mistakes can result from carelessly combining formulas written in different layouts, and converting from one layout to another requires care to avoid errors. As a result, when working with existing formulas the best policy is probably to identify whichever layout is used and maintain consistency with it, rather than attempting to use the same layout in all situations.

Alternatives
The tensor index notation with its Einstein summation convention is very similar to the matrix calculus, except one writes only a single component at a time. It has the advantage that one can easily manipulate arbitrarily high rank tensors, whereas tensors of rank higher than two are quite unwieldy with matrix notation. All of the work here can be done in this notation without use of the single-variable matrix notation. However, many problems in estimation theory and other areas of applied mathematics would result in too many indices to properly keep track of, pointing in favor of matrix calculus in those areas. Also, Einstein notation can be very useful in proving the identities presented here, as an alternative to typical element notation, which can become cumbersome when the explicit sums are carried around. Note that a matrix can be considered a tensor of rank two.

Derivatives with vectors
Because vectors are matrices with only one column, the simplest matrix derivatives are vector derivatives.
The notations developed here can accommodate the usual operations of vector calculus by identifying the space M(n,1) of n-vectors with the Euclidean space $\mathbb{R}^n$, and the scalar M(1,1) is identified with $\mathbb{R}$. The corresponding concept from vector calculus is indicated at the end of each subsection.
NOTE: The discussion in this section assumes the numerator layout convention for pedagogical purposes. Some authors use different conventions. The section on layout conventions discusses this issue in greater detail. The identities given further down are presented in forms that can be used in conjunction with all common layout conventions.

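Vector-by-scalar

The derivative of a vector $\mathbf{y} = [y_1\ y_2\ \cdots\ y_m]^\top$, by a scalar x is written (in numerator layout notation) as
$$\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \frac{\partial y_2}{\partial x} \\ \vdots \\ \frac{\partial y_m}{\partial x} \end{bmatrix}.$$
In vector calculus the derivative of a vector y with respect to a scalar x is known as the tangent vector of the vector y, $\frac{\partial \mathbf{y}}{\partial x}$. Notice here that $\mathbf{y}: \mathbb{R}^1 \to \mathbb{R}^m$.

Example: Simple examples of this include the velocity vector in Euclidean space, which is the tangent vector of the position vector (considered as a function of time). Also, the acceleration is the tangent vector of the velocity.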
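The velocity example can be sketched numerically. The helix `position` and the time `t0` below are hypothetical choices, and the derivative is approximated with a central difference rather than computed symbolically.

```python
import math


def position(t):
    # Hypothetical position vector y(t) = [cos t, sin t, t] (a helix)
    return [math.cos(t), math.sin(t), t]


def tangent(y, t, h=1e-6):
    """Numerator-layout dy/dt: one entry per component of y."""
    up, down = y(t + h), y(t - h)
    return [(a - b) / (2 * h) for a, b in zip(up, down)]


t0 = 0.5
velocity = tangent(position, t0)

# Hand-derived tangent vector of the helix
exact = [-math.sin(t0), math.cos(t0), 1.0]
```

Applying `tangent` once more to the velocity would give the acceleration in the same way.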

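Scalar-by-vector

The derivative of a scalar y by a vector $\mathbf{x} = [x_1\ x_2\ \cdots\ x_n]^\top$ is written (in numerator layout notation) as
$$\frac{\partial y}{\partial \mathbf{x}} = \left[ \frac{\partial y}{\partial x_1}\ \ \frac{\partial y}{\partial x_2}\ \ \cdots\ \ \frac{\partial y}{\partial x_n} \right].$$
In vector calculus, the gradient of a scalar field y in the space $\mathbb{R}^n$ (whose independent coordinates are the components of x) is the transpose of the derivative of a scalar by a vector, because the numerator layout above produces a row vector. In physics, the electric field is the negative vector gradient of the electric potential.
The directional derivative of a scalar function f(x) of the space vector x in the direction of the unit vector u is defined using the gradient as follows.
$$\nabla_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u}$$
Using the notation just defined for the derivative of a scalar with respect to a vector we can rewrite the directional derivative as $\nabla_{\mathbf{u}} f = \frac{\partial f}{\partial \mathbf{x}} \mathbf{u}$. This type of notation will be nice when proving product rules and chain rules that come out looking similar to what we are familiar with for the scalar derivative.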
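A small numerical check of the directional derivative may help here. The sketch computes it two ways: as the row vector of partials times the unit vector u, and directly as the rate of change of f along the line through the point in direction u. The function `f`, the point `x0` and the direction `u` are assumptions made for the example.

```python
import math


def gradient(f, x, h=1e-6):
    """Approximate the row vector [df/dx1, ..., df/dxn]."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def f(x):  # hypothetical scalar field
    return x[0] * x[1] + math.sin(x[2])


x0 = [1.0, 2.0, 0.0]
u = [2 / 3, 1 / 3, 2 / 3]  # unit vector, since (4 + 1 + 4) / 9 = 1

# Row vector df/dx multiplied by the column vector u
via_gradient = sum(g * ui for g, ui in zip(gradient(f, x0), u))


# Rate of change of f along the line x0 + s*u, evaluated at s = 0
def along(s):
    return f([xi + s * ui for xi, ui in zip(x0, u)])


step = 1e-6
via_limit = (along(step) - along(-step)) / (2 * step)
```

For this choice of `f` the hand-derived gradient at `x0` is [2, 1, 1], so both values should be close to 7/3.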

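Vector-by-vector
Each of the previous two cases can be considered as an application of the derivative of a vector with respect to a vector, using a vector of size one appropriately. Similarly we will find that the derivatives involving matrices will reduce to derivatives involving vectors in a corresponding way.

The derivative of a vector function (a vector whose components are functions) $\mathbf{y} = [y_1\ y_2\ \cdots\ y_m]^\top$, with respect to an input vector, $\mathbf{x} = [x_1\ x_2\ \cdots\ x_n]^\top$, is written (in numerator layout notation) as
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}.$$
In vector calculus, the derivative of a vector function y with respect to a vector x whose components represent a space is known as the pushforward (or differential), or the Jacobian matrix.
The pushforward along a vector function f with respect to vector v in $\mathbb{R}^n$ is given by
$$d\,\mathbf{f}(\mathbf{v}) = \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \mathbf{v}.$$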
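The numerator-layout Jacobian and the pushforward can be sketched as follows. The map `f` from R^3 to R^2, the point `x0` and the vector `v` are hypothetical, and the Jacobian is approximated by central differences.

```python
def jacobian(f, x, h=1e-6):
    """m x n matrix whose (i, j) entry is d f_i / d x_j (numerator layout)."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        up = list(x)
        up[j] += h
        down = list(x)
        down[j] -= h
        f_up, f_down = f(up), f(down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return J


def f(x):  # hypothetical map from R^3 to R^2
    return [x[0] * x[1], x[1] + x[2] ** 2]


x0 = [1.0, 2.0, 3.0]
v = [1.0, 0.0, -1.0]

J = jacobian(f, x0)  # 2 rows (outputs), 3 columns (inputs)

# Pushforward of v: the matrix-vector product J v
pushforward = [sum(Jij * vj for Jij, vj in zip(row, v)) for row in J]
```

Here the hand-derived Jacobian at `x0` is [[2, 1, 0], [0, 1, 6]], so the pushforward of `v` should be close to [2, -6].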

Derivatives with matrices
There are two types of derivatives with matrices that can be organized into a matrix of the same size. These are the derivative of a matrix by a scalar and the derivative of a scalar by a matrix respectively. These can be useful in minimization problems found in many areas of applied mathematics and have adopted the names tangent matrix and gradient matrix respectively after their analogs for vectors.
NOTE: The discussion in this section assumes the numerator layout convention for pedagogical purposes. Some authors use different conventions. The section on layout conventions discusses this issue in greater detail. The identities given further down are presented in forms that can be used in conjunction with all common layout conventions.

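Matrix-by-scalar
The derivative of a matrix function Y by a scalar x is known as the tangent matrix and is given (in numerator layout notation) by
$$\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{bmatrix}.$$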

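Scalar-by-matrix
The derivative of a scalar y function of a p×q matrix X of independent variables, with respect to the matrix X, is given (in numerator layout notation) by
$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}} \\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}.$$
Important examples of scalar functions of matrices include the trace of a matrix and the determinant.
In analog with vector calculus this derivative is often written as the following.
$$\nabla_{\mathbf{X}}\, y(\mathbf{X}) = \frac{\partial y(\mathbf{X})}{\partial \mathbf{X}}$$
Also in analog with vector calculus, the directional derivative of a scalar f(X) of a matrix X in the direction of matrix Y is given by
$$\nabla_{\mathbf{Y}} f = \operatorname{tr}\left( \frac{\partial f}{\partial \mathbf{X}} \mathbf{Y} \right).$$
It is the gradient matrix, in particular, that finds many uses in minimization problems in estimation theory, particularly in the derivation of the Kalman filter algorithm, which is of great importance in the field.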
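To make the transposed indexing of the numerator-layout gradient matrix concrete, the sketch below differentiates f(X) = tr(AX) entry by entry and then forms the directional derivative as a trace. The matrices `A`, `X` and `Y` are arbitrary assumed values, and the helper names are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def grad_numerator_layout(f, X, h=1e-6):
    """q x p matrix whose (j, i) entry is d f / d x_ij, for a p x q matrix X."""
    p, q = len(X), len(X[0])
    G = [[0.0] * p for _ in range(q)]
    for i in range(p):
        for j in range(q):
            up = [row[:] for row in X]
            up[i][j] += h
            down = [row[:] for row in X]
            down[i][j] -= h
            G[j][i] = (f(up) - f(down)) / (2 * h)
    return G


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 x 2 constant matrix
X = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]  # 2 x 3 matrix of variables
Y = [[1.0, 0.0, 2.0], [0.0, -1.0, 1.0]]   # direction, same shape as X


def f(M):
    return trace(matmul(A, M))


G = grad_numerator_layout(f, X)    # 3 x 2; for tr(AX) this should match A
directional = trace(matmul(G, Y))  # tr(df/dX Y)
```

Because X is 2×3, the numerator-layout result is 3×2, which is why it can be multiplied by Y and traced without any extra transposes.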

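Other matrix derivatives
The three types of derivatives that have not been considered are those involving vectors-by-matrices, matrices-by-vectors, and matrices-by-matrices. These are not as widely considered and a notation is not widely agreed upon. As for vectors, the other two types of higher matrix derivatives can be seen as applications of the derivative of a matrix by a matrix by using a matrix with one column in the correct place. For this reason, in this subsection we consider only how one can write the derivative of a matrix by another matrix.
The differential or the matrix derivative of a matrix function F(X) that maps from n×m matrices to p×q matrices, $\mathbf{F}: M(n,m) \to M(p,q)$, is an element of $M(p,q) \otimes M(m,n)$, a fourth-rank tensor (the reversal of m and n here indicates the dual space of M(n,m)). In short it is an m×n matrix each of whose entries is a p×q matrix.
$$\frac{\partial \mathbf{F}}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial \mathbf{F}}{\partial X_{1,1}} & \cdots & \frac{\partial \mathbf{F}}{\partial X_{n,1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial \mathbf{F}}{\partial X_{1,m}} & \cdots & \frac{\partial \mathbf{F}}{\partial X_{n,m}} \end{bmatrix}$$
and note that each $\frac{\partial \mathbf{F}}{\partial X_{i,j}}$ is a p×q matrix defined as above. Note also that this matrix has its indexing transposed: m rows and n columns.

The pushforward along F of an n×m matrix Y in M(n,m) is then
$$d\mathbf{F}(\mathbf{Y}) = \operatorname{tr}\left( \frac{\partial \mathbf{F}}{\partial \mathbf{X}} \mathbf{Y} \right),$$
as formal block matrices.
Note that this definition encompasses all of the preceding definitions as special cases.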
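According to Jan R. Magnus and Heinz Neudecker, the following notations are both unsuitable, as the determinant of the second resulting matrix would have "no interpretation" and "a useful chain rule does not exist" if these notations are being used:[2]
Given $\phi$, a differentiable function of an n×q matrix $\mathbf{X} = (x_{i,j})$,
$$\frac{\partial \phi(\mathbf{X})}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial \phi}{\partial x_{1,1}} & \cdots & \frac{\partial \phi}{\partial x_{1,q}} \\ \vdots & \ddots & \vdots \\ \frac{\partial \phi}{\partial x_{n,1}} & \cdots & \frac{\partial \phi}{\partial x_{n,q}} \end{bmatrix}$$
Given $\mathbf{F} = (f_{s,t})$, a differentiable m×p function of an n×q matrix X,
$$\frac{\partial \mathbf{F}(\mathbf{X})}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial f_{1,1}}{\partial \mathbf{X}} & \cdots & \frac{\partial f_{1,p}}{\partial \mathbf{X}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_{m,1}}{\partial \mathbf{X}} & \cdots & \frac{\partial f_{m,p}}{\partial \mathbf{X}} \end{bmatrix}$$
The Jacobian matrix, according to Magnus and Neudecker,[2] is
$$\mathrm{D}\,\mathbf{F}(\mathbf{X}) = \frac{\partial \operatorname{vec} \mathbf{F}(\mathbf{X})}{\partial (\operatorname{vec} \mathbf{X})^\top}.$$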

Layout conventions
This section discusses the similarities and differences between notational conventions that are used in the various fields that take advantage of matrix calculus. Although there are largely two consistent conventions, some authors find it convenient to mix the two conventions in forms that are discussed below. After this section equations will be listed in both competing forms separately.
The fundamental issue is that the derivative of a vector with respect to a vector, i.e. $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, is often written in two competing ways. If the numerator y is of size m and the denominator x of size n, then the result can be laid out as either an m×n matrix or n×m matrix, i.e. the elements of y laid out in columns and the elements of x laid out in rows, or vice versa. This leads to the following possibilities:
1. Numerator layout, i.e. lay out according to $\mathbf{y}$ and $\mathbf{x}^\top$ (i.e. contrarily to x). This is sometimes known as the Jacobian formulation.
2. Denominator layout, i.e. lay out according to $\mathbf{y}^\top$ and $\mathbf{x}$ (i.e. contrarily to y). This is sometimes known as the Hessian formulation. Some authors term this layout the gradient, in distinction to the Jacobian (numerator layout), which is its transpose. (However, "gradient" more commonly means the derivative $\frac{\partial y}{\partial \mathbf{x}}$, regardless of layout.)
3. A third possibility sometimes seen is to insist on writing the derivative as $\frac{\partial \mathbf{y}}{\partial \mathbf{x}'}$ (i.e. the derivative is taken with respect to the transpose of x) and follow the numerator layout. This makes it possible to claim that the matrix is laid out according to both numerator and denominator. In practice this produces results the same as the numerator layout.
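When handling the gradient $\frac{\partial y}{\partial \mathbf{x}}$ and the opposite case $\frac{\partial \mathbf{y}}{\partial x}$, we have the same issues. To be consistent, we should do one of the following:
1. If we choose numerator layout for $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, we should lay out the gradient $\frac{\partial y}{\partial \mathbf{x}}$ as a row vector, and $\frac{\partial \mathbf{y}}{\partial x}$ as a column vector.
2. If we choose denominator layout for $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, we should lay out the gradient $\frac{\partial y}{\partial \mathbf{x}}$ as a column vector, and $\frac{\partial \mathbf{y}}{\partial x}$ as a row vector.
3. In the third possibility above, we write $\frac{\partial y}{\partial \mathbf{x}'}$ and $\frac{\partial \mathbf{y}}{\partial x}$, and use numerator layout.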
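The difference between the two layouts is only a transpose, which a short sketch can make visible. The function `y` (with m = 2 outputs and n = 3 inputs) and the helper names below are hypothetical; the partial derivatives are approximated with central differences.

```python
def partial(f, x, j, h=1e-6):
    """Vector of d f_i / d x_j for all outputs i, by central differences."""
    up = list(x)
    up[j] += h
    down = list(x)
    down[j] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(up), f(down))]


def numerator_layout(f, x):
    """m x n result: rows follow y, columns follow x."""
    cols = [partial(f, x, j) for j in range(len(x))]  # n lists of length m
    return [list(row) for row in zip(*cols)]


def denominator_layout(f, x):
    """n x m result: rows follow x, columns follow y."""
    return [partial(f, x, j) for j in range(len(x))]


def y(x):  # hypothetical function with m = 2 outputs and n = 3 inputs
    return [x[0] + 2 * x[1], x[1] * x[2]]


x0 = [1.0, 2.0, 3.0]
N = numerator_layout(y, x0)    # 2 x 3
D = denominator_layout(y, x0)  # 3 x 2
```

Checking the shape of a result against m and n in this way is a quick test for which convention a given source is using.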

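Not all math textbooks and papers are consistent in this respect throughout the entire paper. That is, sometimes different conventions are used in different contexts within the same paper. For example, some choose denominator layout for gradients (laying them out as column vectors), but numerator layout for the vector-by-vector derivative $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$.
Similarly, when it comes to scalar-by-matrix derivatives $\frac{\partial y}{\partial \mathbf{X}}$ and matrix-by-scalar derivatives $\frac{\partial \mathbf{Y}}{\partial x}$, then consistent numerator layout lays out according to $\mathbf{Y}$ and $\mathbf{X}^\top$, while consistent denominator layout lays out according to $\mathbf{Y}^\top$ and $\mathbf{X}$. In practice, however, following a denominator layout for $\frac{\partial \mathbf{Y}}{\partial x}$, and laying the result out according to $\mathbf{Y}^\top$, is rarely seen because it makes for ugly formulas that do not correspond to the scalar formulas. As a result, the following layouts can often be found:
1. Consistent numerator layout, which lays out $\frac{\partial \mathbf{Y}}{\partial x}$ according to $\mathbf{Y}$ and $\frac{\partial y}{\partial \mathbf{X}}$ according to $\mathbf{X}^\top$.
2. Mixed layout, which lays out $\frac{\partial \mathbf{Y}}{\partial x}$ according to $\mathbf{Y}$ and $\frac{\partial y}{\partial \mathbf{X}}$ according to $\mathbf{X}$.
3. Use the notation $\frac{\partial y}{\partial \mathbf{X}'}$, with results the same as consistent numerator layout.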


In the following formulas, we handle the five possible combinations $\frac{\partial y}{\partial \mathbf{x}}$, $\frac{\partial \mathbf{y}}{\partial x}$, $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, $\frac{\partial y}{\partial \mathbf{X}}$ and $\frac{\partial \mathbf{Y}}{\partial x}$ separately. We also handle cases of scalar-by-scalar derivatives that involve an intermediate vector or matrix. (This can arise, for example, if a multi-dimensional parametric curve is defined in terms of a scalar variable, and then a derivative of a scalar function of the curve is taken with respect to the scalar that parameterizes the curve.) For each of the various combinations, we give numerator-layout and denominator-layout results, except in the cases above where denominator layout rarely occurs. In cases involving matrices where it makes sense, we give numerator-layout and mixed-layout results. As noted above, cases where vector and matrix denominators are written in transpose notation are equivalent to numerator layout with the denominators written without the transpose.
Keep in mind that various authors use different combinations of numerator and denominator layouts for different types of derivatives, and there is no guarantee that an author will consistently use either numerator or denominator layout for all types. Match up the formulas below with those quoted in the source to determine the layout used for that particular type of derivative, but be careful not to assume that derivatives of other types necessarily follow the same kind of layout.
When taking derivatives with an aggregate (vector or matrix) denominator in order to find a maximum or minimum of the aggregate, it should be kept in mind that using numerator layout will produce results that are transposed with respect to the aggregate. For example, in attempting to find the maximum likelihood estimate of a multivariate normal distribution using matrix calculus, if the domain is a k×1 column vector, then the result using the numerator layout will be in the form of a 1×k row vector. Thus, either the results should be transposed at the end or the denominator layout (or mixed layout) should be used.
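Result of differentiating various kinds of aggregates with other kinds of aggregates
Denominator \ Numerator | Scalar y | Vector y (size m) | Matrix Y (size m×n)
Scalar x | $\frac{\partial y}{\partial x}$: scalar | $\frac{\partial \mathbf{y}}{\partial x}$: (numerator layout) size-m column vector; (denominator layout) size-m row vector | $\frac{\partial \mathbf{Y}}{\partial x}$: (numerator layout) m×n matrix
Vector x (size n) | $\frac{\partial y}{\partial \mathbf{x}}$: (numerator layout) size-n row vector; (denominator layout) size-n column vector | $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$: (numerator layout) m×n matrix; (denominator layout) n×m matrix |
Matrix X (size p×q) | $\frac{\partial y}{\partial \mathbf{X}}$: (numerator layout) q×p matrix; (denominator layout) p×q matrix | |

The results of operations will be transposed when switching between numerator-layout and denominator-layout notation.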

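Numerator-layout notation
Using numerator-layout notation, we have:[1]
$$\frac{\partial y}{\partial \mathbf{x}} = \left[ \frac{\partial y}{\partial x_1}\ \ \frac{\partial y}{\partial x_2}\ \ \cdots\ \ \frac{\partial y}{\partial x_n} \right]$$
$$\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \frac{\partial y_2}{\partial x} \\ \vdots \\ \frac{\partial y_m}{\partial x} \end{bmatrix}$$
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}$$
$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \cdots & \frac{\partial y}{\partial x_{p1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{1q}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}$$
The following definitions are only provided in numerator-layout notation:
$$\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{bmatrix}$$
$$d\mathbf{X} = \begin{bmatrix} dx_{11} & \cdots & dx_{1n} \\ \vdots & \ddots & \vdots \\ dx_{m1} & \cdots & dx_{mn} \end{bmatrix}$$

Denominator-layout notation
Using denominator-layout notation, we have:[3]
$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} \\ \frac{\partial y}{\partial x_2} \\ \vdots \\ \frac{\partial y}{\partial x_n} \end{bmatrix}$$
$$\frac{\partial \mathbf{y}}{\partial x} = \left[ \frac{\partial y_1}{\partial x}\ \ \frac{\partial y_2}{\partial x}\ \ \cdots\ \ \frac{\partial y_m}{\partial x} \right]$$
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}$$
$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \cdots & \frac{\partial y}{\partial x_{1q}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{p1}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}$$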

Identities
As noted above, in general, the results of operations will be transposed when switching between numerator-layout and denominator-layout notation.
To help make sense of all the identities below, keep in mind the most important rules: the chain rule, product rule and sum rule. The sum rule applies universally, and the product rule applies in most of the cases below, provided that the order of matrix products is maintained, since matrix products are not commutative. The chain rule applies in some of the cases, but unfortunately does not apply in matrix-by-scalar derivatives or scalar-by-matrix derivatives (in the latter case, mostly involving the trace operator applied to matrices). In the latter case, the product rule can't quite be applied directly, either, but the equivalent can be done with a bit more work using the differential identities.

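Vector-by-vector identities
This is presented first because all of the operations that apply to vector-by-vector differentiation apply directly to vector-by-scalar or scalar-by-vector differentiation simply by reducing the appropriate vector in the numerator or denominator to a scalar.
Identities: vector-by-vector
Condition | Expression | Numerator layout, i.e. by $\mathbf{y}$ and $\mathbf{x}^\top$ | Denominator layout, i.e. by $\mathbf{y}^\top$ and $\mathbf{x}$
a is not a function of x | $\frac{\partial \mathbf{a}}{\partial \mathbf{x}}$ | $\mathbf{0}$ | same
(none) | $\frac{\partial \mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{I}$ | same
A is not a function of x | $\frac{\partial \mathbf{A}\mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{A}$ | $\mathbf{A}^\top$
A is not a function of x | $\frac{\partial \mathbf{x}^\top \mathbf{A}}{\partial \mathbf{x}}$ | $\mathbf{A}^\top$ | $\mathbf{A}$
a is not a function of x, u = u(x) | $\frac{\partial a\mathbf{u}}{\partial \mathbf{x}}$ | $a \frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | same
a = a(x), u = u(x) | $\frac{\partial a\mathbf{u}}{\partial \mathbf{x}}$ | $a \frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \mathbf{u} \frac{\partial a}{\partial \mathbf{x}}$ | $a \frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \frac{\partial a}{\partial \mathbf{x}} \mathbf{u}^\top$
A is not a function of x, u = u(x) | $\frac{\partial \mathbf{A}\mathbf{u}}{\partial \mathbf{x}}$ | $\mathbf{A} \frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} \mathbf{A}^\top$
u = u(x), v = v(x) | $\frac{\partial (\mathbf{u} + \mathbf{v})}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ | same
u = u(x) | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}$
u = u(x) | $\frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}$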
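One of the vector-by-vector identities, that the derivative of Ax with respect to x is A in numerator layout and its transpose in denominator layout, can be spot-checked numerically. The matrix `A` and the point `x0` below are arbitrary assumed values.

```python
def jacobian(f, x, h=1e-6):
    """Numerator layout: (i, j) entry is d f_i / d x_j."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        up = list(x)
        up[j] += h
        down = list(x)
        down[j] -= h
        f_up, f_down = f(up), f(down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return J


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 x 3 constant matrix


def Ax(x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


x0 = [0.3, -1.2, 2.5]
J_num = jacobian(Ax, x0)                    # should be close to A
J_den = [list(col) for col in zip(*J_num)]  # transpose: close to A^T
```

The same pattern (differentiate numerically, compare with the tabulated closed form) works for the other rows, provided the layout of the numerical result matches the column of the table being checked.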

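Scalar-by-vector identities
The fundamental identities are placed above the thick black line.

Identities: scalar-by-vector
Condition | Expression | Numerator layout, i.e. by $\mathbf{x}^\top$; result is row vector | Denominator layout, i.e. by $\mathbf{x}$; result is column vector
a is not a function of x | $\frac{\partial a}{\partial \mathbf{x}}$ | $\mathbf{0}^\top$ [4] | $\mathbf{0}$ [4]
a is not a function of x, u = u(x) | $\frac{\partial au}{\partial \mathbf{x}}$ | $a \frac{\partial u}{\partial \mathbf{x}}$ | same
u = u(x), v = v(x) | $\frac{\partial (u + v)}{\partial \mathbf{x}}$ | $\frac{\partial u}{\partial \mathbf{x}} + \frac{\partial v}{\partial \mathbf{x}}$ | same
u = u(x), v = v(x) | $\frac{\partial uv}{\partial \mathbf{x}}$ | $u \frac{\partial v}{\partial \mathbf{x}} + v \frac{\partial u}{\partial \mathbf{x}}$ | same
u = u(x) | $\frac{\partial g(u)}{\partial \mathbf{x}}$ | $\frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{x}}$ | same
u = u(x) | $\frac{\partial f(g(u))}{\partial \mathbf{x}}$ | $\frac{\partial f(g)}{\partial g} \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{x}}$ | same
u = u(x), v = v(x) | $\frac{\partial (\mathbf{u} \cdot \mathbf{v})}{\partial \mathbf{x}} = \frac{\partial \mathbf{u}^\top \mathbf{v}}{\partial \mathbf{x}}$ | $\mathbf{u}^\top \frac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top \frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ (assumes numerator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$) | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} \mathbf{v} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}} \mathbf{u}$ (assumes denominator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$)
u = u(x), v = v(x), A is not a function of x | $\frac{\partial (\mathbf{u} \cdot \mathbf{A}\mathbf{v})}{\partial \mathbf{x}} = \frac{\partial \mathbf{u}^\top \mathbf{A} \mathbf{v}}{\partial \mathbf{x}}$ | $\mathbf{u}^\top \mathbf{A} \frac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^\top \mathbf{A}^\top \frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ (assumes numerator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$) | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} \mathbf{A} \mathbf{v} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}} \mathbf{A}^\top \mathbf{u}$ (assumes denominator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$)
--------
(none) | $\frac{\partial^2 f}{\partial \mathbf{x} \partial \mathbf{x}^\top}$ | $\mathbf{H}^\top$ | $\mathbf{H}$, the Hessian matrix [5]
a is not a function of x | $\frac{\partial (\mathbf{a} \cdot \mathbf{x})}{\partial \mathbf{x}} = \frac{\partial (\mathbf{x} \cdot \mathbf{a})}{\partial \mathbf{x}} = \frac{\partial \mathbf{a}^\top \mathbf{x}}{\partial \mathbf{x}} = \frac{\partial \mathbf{x}^\top \mathbf{a}}{\partial \mathbf{x}}$ | $\mathbf{a}^\top$ | $\mathbf{a}$
A is not a function of x, b is not a function of x | $\frac{\partial \mathbf{b}^\top \mathbf{A} \mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{b}^\top \mathbf{A}$ | $\mathbf{A}^\top \mathbf{b}$
A is not a function of x | $\frac{\partial \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{x}^\top (\mathbf{A} + \mathbf{A}^\top)$ | $(\mathbf{A} + \mathbf{A}^\top) \mathbf{x}$
A is not a function of x, A is symmetric | $\frac{\partial \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial \mathbf{x}}$ | $2 \mathbf{x}^\top \mathbf{A}$ | $2 \mathbf{A} \mathbf{x}$
A is not a function of x | $\frac{\partial^2 \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial \mathbf{x} \partial \mathbf{x}^\top}$ | $\mathbf{A} + \mathbf{A}^\top$ | same
A is not a function of x, A is symmetric | $\frac{\partial^2 \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial \mathbf{x} \partial \mathbf{x}^\top}$ | $2\mathbf{A}$ | same
(none) | $\frac{\partial (\mathbf{x} \cdot \mathbf{x})}{\partial \mathbf{x}} = \frac{\partial \mathbf{x}^\top \mathbf{x}}{\partial \mathbf{x}}$ | $2 \mathbf{x}^\top$ | $2 \mathbf{x}$
a is not a function of x, u = u(x) | $\frac{\partial (\mathbf{a} \cdot \mathbf{u})}{\partial \mathbf{x}} = \frac{\partial \mathbf{a}^\top \mathbf{u}}{\partial \mathbf{x}}$ | $\mathbf{a}^\top \frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ (assumes numerator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$) | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} \mathbf{a}$ (assumes denominator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$)
a, b are not functions of x | $\frac{\partial\, \mathbf{a}^\top \mathbf{x} \mathbf{x}^\top \mathbf{b}}{\partial \mathbf{x}}$ | $\mathbf{x}^\top (\mathbf{a} \mathbf{b}^\top + \mathbf{b} \mathbf{a}^\top)$ | $(\mathbf{a} \mathbf{b}^\top + \mathbf{b} \mathbf{a}^\top) \mathbf{x}$
A, b, C, D, e are not functions of x | $\frac{\partial\, (\mathbf{A}\mathbf{x} + \mathbf{b})^\top \mathbf{C} (\mathbf{D}\mathbf{x} + \mathbf{e})}{\partial \mathbf{x}}$ | $(\mathbf{D}\mathbf{x} + \mathbf{e})^\top \mathbf{C}^\top \mathbf{A} + (\mathbf{A}\mathbf{x} + \mathbf{b})^\top \mathbf{C} \mathbf{D}$ | $\mathbf{D}^\top \mathbf{C}^\top (\mathbf{A}\mathbf{x} + \mathbf{b}) + \mathbf{A}^\top \mathbf{C} (\mathbf{D}\mathbf{x} + \mathbf{e})$
a is not a function of x | $\frac{\partial\, \|\mathbf{x} - \mathbf{a}\|}{\partial \mathbf{x}}$ | $\frac{(\mathbf{x} - \mathbf{a})^\top}{\|\mathbf{x} - \mathbf{a}\|}$ | $\frac{\mathbf{x} - \mathbf{a}}{\|\mathbf{x} - \mathbf{a}\|}$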
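The quadratic-form rows are a common source of mistakes, because the shortcut for symmetric A is easy to misapply. The sketch below checks the general rule for a deliberately non-symmetric matrix and shows that the symmetric shortcut gives a different answer. The matrix `A` and the point `x0` are assumed values.

```python
def gradient(f, x, h=1e-6):
    """Approximate the row vector [df/dx1, ..., df/dxn]."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


A = [[2.0, 1.0], [-3.0, 4.0]]  # deliberately non-symmetric
n = len(A)


def quad(x):
    # x^T A x written out as a double sum
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


x0 = [1.0, 2.0]
numeric = gradient(quad, x0)

# General rule: x^T (A + A^T); entry j is sum_i x_i (A_ij + A_ji)
general = [sum(x0[i] * (A[i][j] + A[j][i]) for i in range(n)) for j in range(n)]

# Shortcut 2 x^T A, which is only valid when A is symmetric
shortcut = [2 * sum(x0[i] * A[i][j] for i in range(n)) for j in range(n)]
```

With these values `general` is [0, 14] while `shortcut` is [-8, 18], and only the former agrees with the numerical gradient.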

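Vector-by-scalar identities
Identities: vector-by-scalar
Condition | Expression | Numerator layout, i.e. by $\mathbf{y}$, result is column vector | Denominator layout, i.e. by $\mathbf{y}^\top$, result is row vector
a is not a function of x | $\frac{\partial \mathbf{a}}{\partial x}$ | $\mathbf{0}$ [4] | same
a is not a function of x, u = u(x) | $\frac{\partial a\mathbf{u}}{\partial x}$ | $a \frac{\partial \mathbf{u}}{\partial x}$ | same
A is not a function of x, u = u(x) | $\frac{\partial \mathbf{A}\mathbf{u}}{\partial x}$ | $\mathbf{A} \frac{\partial \mathbf{u}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} \mathbf{A}^\top$
u = u(x) | $\frac{\partial \mathbf{u}^\top}{\partial x}$ | $\left( \frac{\partial \mathbf{u}}{\partial x} \right)^\top$ | same
u = u(x), v = v(x) | $\frac{\partial (\mathbf{u} + \mathbf{v})}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} + \frac{\partial \mathbf{v}}{\partial x}$ | same
u = u(x), v = v(x) | $\frac{\partial (\mathbf{u}^\top \times \mathbf{v})}{\partial x}$ | $\left( \frac{\partial \mathbf{u}}{\partial x} \right)^\top \times \mathbf{v} + \mathbf{u}^\top \times \frac{\partial \mathbf{v}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} \times \mathbf{v} + \mathbf{u}^\top \times \left( \frac{\partial \mathbf{v}}{\partial x} \right)^\top$
u = u(x) | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial x}$ | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}$. Assumes consistent matrix layout; see below.
u = u(x) | $\frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial x}$ | $\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{u}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} \frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}} \frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}$. Assumes consistent matrix layout; see below.

NOTE: The formulas involving the vector-by-vector derivatives $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}$ and $\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}$ (whose outputs are matrices) assume the matrices are laid out consistent with the vector layout, i.e. numerator-layout matrix when numerator-layout vector and vice versa; otherwise, transpose the vector-by-vector derivatives.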

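Scalar-by-matrix identities
Note that exact equivalents of the scalar product rule and chain rule do not exist when applied to matrix-valued functions of matrices. However, the product rule of this sort does apply to the differential form (see below), and this is the way to derive many of the identities below involving the trace function, combined with the fact that the trace function allows transposing and cyclic permutation, i.e.:
$$\operatorname{tr}(\mathbf{A}) = \operatorname{tr}(\mathbf{A}^\top)$$
$$\operatorname{tr}(\mathbf{A}\mathbf{B}\mathbf{C}\mathbf{D}) = \operatorname{tr}(\mathbf{B}\mathbf{C}\mathbf{D}\mathbf{A}) = \operatorname{tr}(\mathbf{C}\mathbf{D}\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{D}\mathbf{A}\mathbf{B}\mathbf{C})$$
For example, to compute $\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^\top \mathbf{C})}{\partial \mathbf{X}}$:
$$\begin{aligned} d \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^\top \mathbf{C}) &= d \operatorname{tr}(\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^\top) = \operatorname{tr}(d(\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^\top)) \\ &= \operatorname{tr}(\mathbf{C}\mathbf{A}\mathbf{X}\, d(\mathbf{B}\mathbf{X}^\top) + d(\mathbf{C}\mathbf{A}\mathbf{X})\, \mathbf{B}\mathbf{X}^\top) \\ &= \operatorname{tr}(\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}\, (d\mathbf{X})^\top + \mathbf{C}\mathbf{A}\, (d\mathbf{X})\, \mathbf{B}\mathbf{X}^\top) \\ &= \operatorname{tr}(\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}\, (d\mathbf{X})^\top) + \operatorname{tr}(\mathbf{C}\mathbf{A}\, (d\mathbf{X})\, \mathbf{B}\mathbf{X}^\top) \\ &= \operatorname{tr}((d\mathbf{X})\, (\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B})^\top) + \operatorname{tr}(\mathbf{C}\mathbf{A}\, (d\mathbf{X})\, \mathbf{B}\mathbf{X}^\top) \\ &= \operatorname{tr}((\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B})^\top d\mathbf{X}) + \operatorname{tr}(\mathbf{B}\mathbf{X}^\top \mathbf{C}\mathbf{A}\, d\mathbf{X}) \\ &= \operatorname{tr}\left( (\mathbf{B}^\top \mathbf{X}^\top \mathbf{A}^\top \mathbf{C}^\top + \mathbf{B}\mathbf{X}^\top \mathbf{C}\mathbf{A})\, d\mathbf{X} \right) \end{aligned}$$
Therefore,
$$\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^\top \mathbf{C})}{\partial \mathbf{X}} = \mathbf{B}^\top \mathbf{X}^\top \mathbf{A}^\top \mathbf{C}^\top + \mathbf{B}\mathbf{X}^\top \mathbf{C}\mathbf{A}.$$
(For the last step, see the 'Conversion from differential to derivative form' section.)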
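The result of this worked example, stated in numerator layout, can be cross-checked by differentiating the trace entry by entry. The sketch below does that with central differences; all four matrices are arbitrary assumed values and the helper names are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def chain(*mats):
    """Multiply matrices left to right."""
    out = mats[0]
    for M in mats[1:]:
        out = matmul(out, M)
    return out


def grad_numerator_layout(f, X, h=1e-6):
    """q x p matrix whose (j, i) entry is d f / d x_ij, for a p x q matrix X."""
    p, q = len(X), len(X[0])
    G = [[0.0] * p for _ in range(q)]
    for i in range(p):
        for j in range(q):
            up = [row[:] for row in X]
            up[i][j] += h
            down = [row[:] for row in X]
            down[i][j] -= h
            G[j][i] = (f(up) - f(down)) / (2 * h)
    return G


A = [[1.0, 2.0], [0.0, 1.0]]                             # 2 x 2
B = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [3.0, 0.0, 1.0]]  # 3 x 3
C = [[2.0, 1.0], [1.0, 0.0]]                             # 2 x 2
X = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]                   # 2 x 3


def f(M):
    return trace(chain(A, M, B, transpose(M), C))


numeric = grad_numerator_layout(f, X)  # 3 x 2

Xt, At, Bt, Ct = transpose(X), transpose(A), transpose(B), transpose(C)
first = chain(Bt, Xt, At, Ct)  # B^T X^T A^T C^T
second = chain(B, Xt, C, A)    # B X^T C A
analytic = [[s + t for s, t in zip(r1, r2)] for r1, r2 in zip(first, second)]
```

Transposing `analytic` would give the denominator-layout form of the same identity.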

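Identities: scalar-by-matrix
Condition | Expression | Numerator layout, i.e. by $\mathbf{X}^\top$ | Denominator layout, i.e. by $\mathbf{X}$
a is not a function of X | $\frac{\partial a}{\partial \mathbf{X}}$ | $\mathbf{0}^\top$ [6] | $\mathbf{0}$ [6]
a is not a function of X, u = u(X) | $\frac{\partial au}{\partial \mathbf{X}}$ | $a \frac{\partial u}{\partial \mathbf{X}}$ | same
u = u(X), v = v(X) | $\frac{\partial (u + v)}{\partial \mathbf{X}}$ | $\frac{\partial u}{\partial \mathbf{X}} + \frac{\partial v}{\partial \mathbf{X}}$ | same
u = u(X), v = v(X) | $\frac{\partial uv}{\partial \mathbf{X}}$ | $u \frac{\partial v}{\partial \mathbf{X}} + v \frac{\partial u}{\partial \mathbf{X}}$ | same
u = u(X) | $\frac{\partial g(u)}{\partial \mathbf{X}}$ | $\frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{X}}$ | same
u = u(X) | $\frac{\partial f(g(u))}{\partial \mathbf{X}}$ | $\frac{\partial f(g)}{\partial g} \frac{\partial g(u)}{\partial u} \frac{\partial u}{\partial \mathbf{X}}$ | same
U = U(X) | $\frac{\partial g(\mathbf{U})}{\partial X_{ij}}$ [5] | $\operatorname{tr}\left( \frac{\partial g(\mathbf{U})}{\partial \mathbf{U}} \frac{\partial \mathbf{U}}{\partial X_{ij}} \right)$ | $\operatorname{tr}\left( \left( \frac{\partial g(\mathbf{U})}{\partial \mathbf{U}} \right)^\top \frac{\partial \mathbf{U}}{\partial X_{ij}} \right)$. Both forms assume numerator layout for $\frac{\partial \mathbf{U}}{\partial X_{ij}}$, i.e. mixed layout if denominator layout for X is being used.
(none) | $\frac{\partial \operatorname{tr}(\mathbf{X})}{\partial \mathbf{X}}$ | $\mathbf{I}$ | same
U = U(X), V = V(X) | $\frac{\partial \operatorname{tr}(\mathbf{U} + \mathbf{V})}{\partial \mathbf{X}}$ | $\frac{\partial \operatorname{tr}(\mathbf{U})}{\partial \mathbf{X}} + \frac{\partial \operatorname{tr}(\mathbf{V})}{\partial \mathbf{X}}$ | same
a is not a function of X, U = U(X) | $\frac{\partial \operatorname{tr}(a\mathbf{U})}{\partial \mathbf{X}}$ | $a \frac{\partial \operatorname{tr}(\mathbf{U})}{\partial \mathbf{X}}$ | same
g(X) is any polynomial with scalar coefficients, or any matrix function defined by an infinite polynomial series (e.g. $e^{\mathbf{X}}$, sin(X), cos(X), ln(X), etc. using a Taylor series); g(x) is the equivalent scalar function, g′(x) is its derivative, and g′(X) is the corresponding matrix function | $\frac{\partial \operatorname{tr}(\mathbf{g}(\mathbf{X}))}{\partial \mathbf{X}}$ | $\mathbf{g}'(\mathbf{X})$ | $(\mathbf{g}'(\mathbf{X}))^\top$
A is not a function of X | $\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = \frac{\partial \operatorname{tr}(\mathbf{X}\mathbf{A})}{\partial \mathbf{X}}$ [7] | $\mathbf{A}$ | $\mathbf{A}^\top$
A is not a function of X | $\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}^\top)}{\partial \mathbf{X}} = \frac{\partial \operatorname{tr}(\mathbf{X}^\top \mathbf{A})}{\partial \mathbf{X}}$ [5] | $\mathbf{A}^\top$ | $\mathbf{A}$
A is not a function of X | $\frac{\partial \operatorname{tr}(\mathbf{X}^\top \mathbf{A} \mathbf{X})}{\partial \mathbf{X}}$ [5] | $\mathbf{X}^\top (\mathbf{A} + \mathbf{A}^\top)$ | $(\mathbf{A} + \mathbf{A}^\top) \mathbf{X}$
A is not a function of X | $\frac{\partial \operatorname{tr}(\mathbf{X}^{-1} \mathbf{A})}{\partial \mathbf{X}}$ [5] | $-\mathbf{X}^{-1} \mathbf{A} \mathbf{X}^{-1}$ | $-(\mathbf{X}^{-1})^\top \mathbf{A}^\top (\mathbf{X}^{-1})^\top$
A, B are not functions of X | $\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B})}{\partial \mathbf{X}} = \frac{\partial \operatorname{tr}(\mathbf{B}\mathbf{A}\mathbf{X})}{\partial \mathbf{X}}$ | $\mathbf{B}\mathbf{A}$ | $\mathbf{A}^\top \mathbf{B}^\top$
A, B, C are not functions of X | $\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^\top \mathbf{C})}{\partial \mathbf{X}}$ | $\mathbf{B}\mathbf{X}^\top \mathbf{C}\mathbf{A} + \mathbf{B}^\top \mathbf{X}^\top \mathbf{A}^\top \mathbf{C}^\top$ | $\mathbf{A}^\top \mathbf{C}^\top \mathbf{X} \mathbf{B}^\top + \mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}$
n is a positive integer | $\frac{\partial \operatorname{tr}(\mathbf{X}^n)}{\partial \mathbf{X}}$ [5] | $n \mathbf{X}^{n-1}$ | $n (\mathbf{X}^{n-1})^\top$
A is not a function of X, n is a positive integer | $\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}^n)}{\partial \mathbf{X}}$ [5] | $\sum_{i=0}^{n-1} \mathbf{X}^i \mathbf{A} \mathbf{X}^{n-i-1}$ | $\sum_{i=0}^{n-1} (\mathbf{X}^i \mathbf{A} \mathbf{X}^{n-i-1})^\top$
(none) | $\frac{\partial \operatorname{tr}(e^{\mathbf{X}})}{\partial \mathbf{X}}$ [5] | $e^{\mathbf{X}}$ | $(e^{\mathbf{X}})^\top$
(none) | $\frac{\partial \operatorname{tr}(\sin(\mathbf{X}))}{\partial \mathbf{X}}$ [5] | $\cos(\mathbf{X})$ | $(\cos(\mathbf{X}))^\top$
(none) | $\frac{\partial |\mathbf{X}|}{\partial \mathbf{X}}$ [8] | $\operatorname{cofactor}(\mathbf{X})^\top = |\mathbf{X}| \mathbf{X}^{-1}$ | $\operatorname{cofactor}(\mathbf{X}) = |\mathbf{X}| (\mathbf{X}^{-1})^\top$
a is not a function of X | $\frac{\partial \ln |a\mathbf{X}|}{\partial \mathbf{X}}$ [5][9] | $\mathbf{X}^{-1}$ | $(\mathbf{X}^{-1})^\top$
A, B are not functions of X | $\frac{\partial |\mathbf{A}\mathbf{X}\mathbf{B}|}{\partial \mathbf{X}}$ [5] | $|\mathbf{A}\mathbf{X}\mathbf{B}| \mathbf{X}^{-1}$ | $|\mathbf{A}\mathbf{X}\mathbf{B}| (\mathbf{X}^{-1})^\top$
n is a positive integer | $\frac{\partial |\mathbf{X}^n|}{\partial \mathbf{X}}$ [5] | $n |\mathbf{X}^n| \mathbf{X}^{-1}$ | $n |\mathbf{X}^n| (\mathbf{X}^{-1})^\top$
(see pseudo-inverse) | $\frac{\partial \ln |\mathbf{X}^\top \mathbf{X}|}{\partial \mathbf{X}}$ [5] | $2 \mathbf{X}^{+}$ | $2 (\mathbf{X}^{+})^\top$
(see pseudo-inverse) | $\frac{\partial \ln |\mathbf{X}^\top \mathbf{X}|}{\partial \mathbf{X}^{+}}$ [5] | $-2 \mathbf{X}$ | $-2 \mathbf{X}^\top$
A is not a function of X, X is square and invertible | $\frac{\partial |\mathbf{X}^\top \mathbf{A} \mathbf{X}|}{\partial \mathbf{X}}$ | $2 |\mathbf{X}^\top \mathbf{A} \mathbf{X}| \mathbf{X}^{-1}$ | $2 |\mathbf{X}^\top \mathbf{A} \mathbf{X}| (\mathbf{X}^{-1})^\top$
A is not a function of X, X is non-square, A is symmetric | $\frac{\partial |\mathbf{X}^\top \mathbf{A} \mathbf{X}|}{\partial \mathbf{X}}$ | $2 |\mathbf{X}^\top \mathbf{A} \mathbf{X}| (\mathbf{X}^\top \mathbf{A}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{A}^\top$ | $2 |\mathbf{X}^\top \mathbf{A} \mathbf{X}| \mathbf{A} \mathbf{X} (\mathbf{X}^\top \mathbf{A} \mathbf{X})^{-1}$
A is not a function of X, X is non-square, A is non-symmetric | $\frac{\partial |\mathbf{X}^\top \mathbf{A} \mathbf{X}|}{\partial \mathbf{X}}$ | $|\mathbf{X}^\top \mathbf{A} \mathbf{X}| \left( (\mathbf{X}^\top \mathbf{A} \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{A} + (\mathbf{X}^\top \mathbf{A}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{A}^\top \right)$ | $|\mathbf{X}^\top \mathbf{A} \mathbf{X}| \left( \mathbf{A} \mathbf{X} (\mathbf{X}^\top \mathbf{A} \mathbf{X})^{-1} + \mathbf{A}^\top \mathbf{X} (\mathbf{X}^\top \mathbf{A}^\top \mathbf{X})^{-1} \right)$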
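The determinant rows can be illustrated on a 2×2 matrix, where the determinant and inverse have simple closed forms. The sketch below compares numerator-layout numerical gradients of |X| and ln|X| with |X| times the inverse and with the inverse itself. The matrix `X` is an assumed non-symmetric example so that the transpose in the layout actually matters.

```python
import math


def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]


def grad_numerator_layout(f, X, h=1e-6):
    """q x p matrix whose (j, i) entry is d f / d x_ij, for a p x q matrix X."""
    p, q = len(X), len(X[0])
    G = [[0.0] * p for _ in range(q)]
    for i in range(p):
        for j in range(q):
            up = [row[:] for row in X]
            up[i][j] += h
            down = [row[:] for row in X]
            down[i][j] -= h
            G[j][i] = (f(up) - f(down)) / (2 * h)
    return G


X = [[2.0, 1.0], [4.0, 3.0]]  # non-symmetric, determinant 2
d = det2(X)
inverse = [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]


def logdet(M):
    return math.log(det2(M))


grad_det = grad_numerator_layout(det2, X)
grad_logdet = grad_numerator_layout(logdet, X)

# Closed forms from the table (numerator layout)
expected_det = [[d * v for v in row] for row in inverse]  # |X| X^{-1}
expected_logdet = inverse                                 # X^{-1}
```

In denominator layout the same two gradients would be the transposes of these matrices.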

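Matrix-by-scalar identities

Identities: matrix-by-scalar
Condition | Expression | Numerator layout, i.e. by $\mathbf{Y}$
U = U(x) | $\frac{\partial a\mathbf{U}}{\partial x}$ | $a \frac{\partial \mathbf{U}}{\partial x}$
A, B are not functions of x, U = U(x) | $\frac{\partial \mathbf{A}\mathbf{U}\mathbf{B}}{\partial x}$ | $\mathbf{A} \frac{\partial \mathbf{U}}{\partial x} \mathbf{B}$
U = U(x), V = V(x) | $\frac{\partial (\mathbf{U} + \mathbf{V})}{\partial x}$ | $\frac{\partial \mathbf{U}}{\partial x} + \frac{\partial \mathbf{V}}{\partial x}$
U = U(x), V = V(x) | $\frac{\partial (\mathbf{U}\mathbf{V})}{\partial x}$ | $\mathbf{U} \frac{\partial \mathbf{V}}{\partial x} + \frac{\partial \mathbf{U}}{\partial x} \mathbf{V}$
U = U(x), V = V(x) | $\frac{\partial (\mathbf{U} \otimes \mathbf{V})}{\partial x}$ | $\mathbf{U} \otimes \frac{\partial \mathbf{V}}{\partial x} + \frac{\partial \mathbf{U}}{\partial x} \otimes \mathbf{V}$
U = U(x), V = V(x) | $\frac{\partial (\mathbf{U} \circ \mathbf{V})}{\partial x}$ | $\mathbf{U} \circ \frac{\partial \mathbf{V}}{\partial x} + \frac{\partial \mathbf{U}}{\partial x} \circ \mathbf{V}$
U = U(x) | $\frac{\partial \mathbf{U}^{-1}}{\partial x}$ | $-\mathbf{U}^{-1} \frac{\partial \mathbf{U}}{\partial x} \mathbf{U}^{-1}$
U = U(x,y) | $\frac{\partial^2 \mathbf{U}^{-1}}{\partial x \partial y}$ | $\mathbf{U}^{-1} \left( \frac{\partial \mathbf{U}}{\partial x} \mathbf{U}^{-1} \frac{\partial \mathbf{U}}{\partial y} - \frac{\partial^2 \mathbf{U}}{\partial x \partial y} + \frac{\partial \mathbf{U}}{\partial y} \mathbf{U}^{-1} \frac{\partial \mathbf{U}}{\partial x} \right) \mathbf{U}^{-1}$
A is not a function of x, g(X) is any polynomial with scalar coefficients, or any matrix function defined by an infinite polynomial series (e.g. $e^{\mathbf{X}}$, sin(X), cos(X), ln(X), etc.); g(x) is the equivalent scalar function, g′(x) is its derivative, and g′(X) is the corresponding matrix function | $\frac{\partial\, \mathbf{g}(x\mathbf{A})}{\partial x}$ | $\mathbf{A} \mathbf{g}'(x\mathbf{A}) = \mathbf{g}'(x\mathbf{A}) \mathbf{A}$
A is not a function of x | $\frac{\partial e^{x\mathbf{A}}}{\partial x}$ | $\mathbf{A} e^{x\mathbf{A}} = e^{x\mathbf{A}} \mathbf{A}$
Further see Derivative of the exponential map.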
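As a spot check of the inverse rule in this table, the sketch below compares a central-difference derivative of the inverse of a 2×2 matrix function with the closed form. The matrix function `U`, its hand-derived derivative `dU` and the point `x0` are all assumptions made for the example.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inv2(M):
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]


def U(x):  # hypothetical matrix-valued function of a scalar
    return [[1.0 + x, x * x], [2.0, 3.0 + x]]


def dU(x):  # entrywise derivative of U, derived by hand
    return [[1.0, 2.0 * x], [0.0, 1.0]]


x0, h = 0.5, 1e-6

# Central difference of the inverse, entry by entry
up, down = inv2(U(x0 + h)), inv2(U(x0 - h))
numeric = [[(a - b) / (2 * h) for a, b in zip(r1, r2)] for r1, r2 in zip(up, down)]

# Closed form: -U^{-1} (dU/dx) U^{-1}
Uinv = inv2(U(x0))
analytic = [[-v for v in row] for row in matmul(matmul(Uinv, dU(x0)), Uinv)]
```

Note that the order of the three factors matters here, since the matrices do not commute in general.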

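Scalar-by-scalar identities
With vectors involved
Identities: scalar-by-scalar, with vectors involved
Condition | Expression | Any layout (assumes dot product ignores row vs. column layout)
u = u(x) | $\frac{\partial g(\mathbf{u})}{\partial x}$ | $\frac{\partial g(\mathbf{u})}{\partial \mathbf{u}} \cdot \frac{\partial \mathbf{u}}{\partial x}$
u = u(x), v = v(x) | $\frac{\partial (\mathbf{u} \cdot \mathbf{v})}{\partial x}$ | $\mathbf{u} \cdot \frac{\partial \mathbf{v}}{\partial x} + \frac{\partial \mathbf{u}}{\partial x} \cdot \mathbf{v}$

With matrices involved
Identities: scalar-by-scalar, with matrices involved[5]
Condition | Expression | Consistent numerator layout, i.e. by $\mathbf{Y}$ and $\mathbf{X}^\top$ | Mixed layout, i.e. by $\mathbf{Y}$ and $\mathbf{X}$
U = U(x) | $\frac{\partial |\mathbf{U}|}{\partial x}$ | $|\mathbf{U}| \operatorname{tr}\left( \mathbf{U}^{-1} \frac{\partial \mathbf{U}}{\partial x} \right)$ | same
U = U(x) | $\frac{\partial \ln |\mathbf{U}|}{\partial x}$ | $\operatorname{tr}\left( \mathbf{U}^{-1} \frac{\partial \mathbf{U}}{\partial x} \right)$ | same
U = U(x) | $\frac{\partial^2 |\mathbf{U}|}{\partial x^2}$ | $|\mathbf{U}| \left[ \operatorname{tr}\left( \mathbf{U}^{-1} \frac{\partial^2 \mathbf{U}}{\partial x^2} \right) + \operatorname{tr}^2\left( \mathbf{U}^{-1} \frac{\partial \mathbf{U}}{\partial x} \right) - \operatorname{tr}\left( \left( \mathbf{U}^{-1} \frac{\partial \mathbf{U}}{\partial x} \right)^2 \right) \right]$ | same
U = U(x) | $\frac{\partial g(\mathbf{U})}{\partial x}$ | $\operatorname{tr}\left( \frac{\partial g(\mathbf{U})}{\partial \mathbf{U}} \frac{\partial \mathbf{U}}{\partial x} \right)$ | $\operatorname{tr}\left( \left( \frac{\partial g(\mathbf{U})}{\partial \mathbf{U}} \right)^\top \frac{\partial \mathbf{U}}{\partial x} \right)$
A is not a function of x, g(X) is any polynomial with scalar coefficients, or any matrix function defined by an infinite polynomial series (e.g. $e^{\mathbf{X}}$, sin(X), cos(X), ln(X), etc.); g(x) is the equivalent scalar function, g′(x) is its derivative, and g′(X) is the corresponding matrix function | $\frac{\partial \operatorname{tr}(\mathbf{g}(x\mathbf{A}))}{\partial x}$ | $\operatorname{tr}(\mathbf{A} \mathbf{g}'(x\mathbf{A}))$ | same
A is not a function of x | $\frac{\partial \operatorname{tr}(e^{x\mathbf{A}})}{\partial x}$ | $\operatorname{tr}(\mathbf{A} e^{x\mathbf{A}})$ | same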

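Identities in differential form
It is often easier to work in differential form and then convert back to normal derivatives. This only works well using the numerator layout. In these rules, "a" is a scalar.
Differential identities: scalar involving matrix[1][5]
Condition | Expression | Result (numerator layout)
(none) | $d(\operatorname{tr}(\mathbf{X}))$ | $\operatorname{tr}(d\mathbf{X})$
(none) | $d(|\mathbf{X}|)$ | $|\mathbf{X}| \operatorname{tr}(\mathbf{X}^{-1} d\mathbf{X}) = \operatorname{tr}(\operatorname{adj}(\mathbf{X})\, d\mathbf{X})$
(none) | $d(\ln |\mathbf{X}|)$ | $\operatorname{tr}(\mathbf{X}^{-1} d\mathbf{X})$

Differential identities: matrix[1][5]
Condition | Expression | Result (numerator layout)
A is not a function of X | $d(\mathbf{A})$ | $0$
a is not a function of X | $d(a\mathbf{X})$ | $a\, d\mathbf{X}$
(none) | $d(\mathbf{X} + \mathbf{Y})$ | $d\mathbf{X} + d\mathbf{Y}$
(none) | $d(\mathbf{X}\mathbf{Y})$ | $(d\mathbf{X}) \mathbf{Y} + \mathbf{X} (d\mathbf{Y})$
(Kronecker product) | $d(\mathbf{X} \otimes \mathbf{Y})$ | $(d\mathbf{X}) \otimes \mathbf{Y} + \mathbf{X} \otimes (d\mathbf{Y})$
(Hadamard product) | $d(\mathbf{X} \circ \mathbf{Y})$ | $(d\mathbf{X}) \circ \mathbf{Y} + \mathbf{X} \circ (d\mathbf{Y})$
(none) | $d(\mathbf{X}^\top)$ | $(d\mathbf{X})^\top$
(none) | $d(\mathbf{X}^{-1})$ | $-\mathbf{X}^{-1} (d\mathbf{X}) \mathbf{X}^{-1}$
(conjugate transpose) | $d(\mathbf{X}^{\mathrm{H}})$ | $(d\mathbf{X})^{\mathrm{H}}$

To convert to normal derivative form, first convert it to one of the following canonical forms, and then use these identities:

Conversion from differential to derivative form[1]
Canonical differential form | Equivalent derivative form
$dy = a\, dx$ | $\frac{dy}{dx} = a$
$dy = \mathbf{a}^\top d\mathbf{x}$ | $\frac{dy}{d\mathbf{x}} = \mathbf{a}^\top$
$dy = \operatorname{tr}(\mathbf{A}\, d\mathbf{X})$ | $\frac{dy}{d\mathbf{X}} = \mathbf{A}$
$d\mathbf{y} = \mathbf{a}\, dx$ | $\frac{d\mathbf{y}}{dx} = \mathbf{a}$
$d\mathbf{y} = \mathbf{A}\, d\mathbf{x}$ | $\frac{d\mathbf{y}}{d\mathbf{x}} = \mathbf{A}$
$d\mathbf{Y} = \mathbf{A}\, dx$ | $\frac{d\mathbf{Y}}{dx} = \mathbf{A}$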

See also
Derivative (generalizations)
Product integral

Notes
1. Minka, Thomas P. (December 28, 2000). "Old and New Matrix Algebra Useful for Statistics". MIT Media Lab note (1997; revised 12/00). Retrieved 5 February 2016.
2. Magnus, Jan R.; Neudecker, Heinz (1999). Matrix differential calculus with applications in statistics and econometrics (Revised ed.). New York: John Wiley & Sons. pp. 171-173. ISBN 9780471986331.
3. Felippa, Carlos A. "Appendix D, Linear Algebra: Determinants, Inverses, Rank". ASEN 5007: Introduction To Finite Element Methods (PDF). Boulder, Colorado: University of Colorado. Retrieved 5 February 2016. Uses the Hessian (transpose to Jacobian) definition of vector and matrix derivatives.
4. Here, $\mathbf{0}$ refers to a column vector of all 0's, of size n, where n is the length of x.
5. Petersen, Kaare Brandt; Pedersen, Michael Syskind. The Matrix Cookbook (PDF). Archived from the original on 2 March 2010. Retrieved 5 February 2016. This book uses a mixed layout, i.e. by $\mathbf{Y}$ in $\frac{\partial \mathbf{Y}}{\partial x}$, by $\mathbf{X}$ in $\frac{\partial y}{\partial \mathbf{X}}$.
6. Here, $\mathbf{0}$ refers to a matrix of all 0's, of the same shape as X.
7. Duchi, John C. "Properties of the Trace and Matrix Derivatives" (PDF). Stanford University. Retrieved 5 February 2016.
8. See Determinant#Derivative for the derivation.
9. The constant a disappears in the result. This is intentional. In general, $\frac{d \ln au}{dx} = \frac{1}{au} \frac{d(au)}{dx} = \frac{1}{au}\, a \frac{du}{dx} = \frac{1}{u} \frac{du}{dx} = \frac{d \ln u}{dx}$.

Further reading
Lax, Peter D. (2007). "9. Calculus of Vector- and Matrix-Valued Functions". Linear algebra and its applications (2nd ed.). Hoboken, N.J.: Wiley-Interscience. ISBN 9780471751564.

External links
Matrix Reference Manual (http://www.psi.toronto.edu/matrix/calculus.html), Mike Brookes, Imperial College London.
Matrix Differentiation (and some other stuff) (http://www.atmos.washington.edu/~dennis/MatrixCalculus.pdf), Randal J. Barnes, Department of Civil Engineering, University of Minnesota.
Notes on Matrix Calculus (http://www4.ncsu.edu/~pfackler/MatCalc.pdf), Paul L. Fackler, North Carolina State University.
Matrix Differential Calculus (https://wiki.inf.ed.ac.uk/twiki/pub/CSTR/ListenSemester1_2006_7/slide.pdf) (slide presentation), Zhang Le, University of Edinburgh.
Introduction to Vector and Matrix Differentiation (http://www.econ.ku.dk/metrics/Econometrics2_05_II/LectureNotes/matrixdiff.pdf) (notes on matrix differentiation, in the context of Econometrics), Heino Bohn Nielsen.
A note on differentiating matrices (http://mpra.ub.unimuenchen.de/1239/1/MPRA_paper_1239.pdf) (notes on matrix differentiation), Pawel Koval, from Munich Personal RePEc Archive.
Vector/Matrix Calculus (http://www.personal.rdg.ac.uk/~sis01xh/teaching/CY4C9/ANN3.pdf) More notes on matrix differentiation.
Matrix Identities (http://www.cs.nyu.edu/~roweis/notes/matrixid.pdf) (notes on matrix differentiation), Sam Roweis.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Matrix_calculus&oldid=735127910"
Categories: Matrix theory | Linear algebra | Multivariable calculus

This page was last modified on 18 August 2016, at 19:34.
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.
