27/08/2016
Matrix calculus - Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Matrix_calculus#Derivatives_with_matrices
Matrix calculus
From Wikipedia, the free encyclopedia
In mathematics, matrix calculus is a specialized notation for doing multivariable calculus, especially over spaces of matrices. It collects the various partial derivatives of a single function with respect to many variables, and/or of a multivariate function with respect to a single variable, into vectors and matrices that can be treated as single entities. This greatly simplifies operations such as finding the maximum or minimum of a multivariate function and solving systems of differential equations. The notation used here is commonly used in statistics and engineering, while the tensor index notation is preferred in physics.
Two competing notational conventions split the field of matrix calculus into two separate groups. The two groups can be distinguished by whether they write the derivative of a scalar with respect to a vector as a column vector or a row vector. Both of these conventions are possible even when the common assumption is made that vectors should be treated as column vectors when combined with matrices (rather than row vectors). A single convention can be somewhat standard throughout a single field that commonly uses matrix calculus (e.g. econometrics, statistics, estimation theory and machine learning). However, even within a given field different authors can be found using competing conventions. Authors of both groups often write as though their specific convention is standard. Serious mistakes can result when combining results from different authors without carefully verifying that compatible notations are used. Therefore, great care should be taken to ensure notational consistency. Definitions of these two conventions and comparisons between them are collected in the layout conventions section.
Contents
1 Scope
1.1 Relation to other derivatives
1.2 Usages
2 Notation
2.1 Alternatives
3 Derivatives with vectors
3.1 Vector-by-scalar
3.2 Scalar-by-vector
3.3 Vector-by-vector
4 Derivatives with matrices
4.1 Matrix-by-scalar
4.2 Scalar-by-matrix
4.3 Other matrix derivatives
5 Layout conventions
5.1 Numerator-layout notation
5.2 Denominator-layout notation
6 Identities
6.1 Vector-by-vector identities
6.2 Scalar-by-vector identities
6.3 Vector-by-scalar identities
6.4 Scalar-by-matrix identities
6.5 Matrix-by-scalar identities
6.6 Scalar-by-scalar identities
6.6.1 With vectors involved
6.6.2 With matrices involved
6.7 Identities in differential form
7 See also
8 Notes
9 Further reading
10 External links
Scope
Matrix calculus refers to a number of different notations that use matrices and vectors to collect the derivative of each component of the dependent variable with respect to each component of the independent variable. In general, the independent variable can be a scalar, a vector, or a matrix while the dependent variable can be any of these as well. Each different situation will lead to a different set of rules, or a separate calculus, using the broader sense of the term. Matrix notation serves as a convenient way to collect the many derivatives in an organized way.
As a first example, consider the gradient from vector calculus. For a scalar function of three independent variables, $f(x_1, x_2, x_3)$, the gradient is given by the vector equation
$$\nabla f = \frac{\partial f}{\partial x_1}\hat{x}_1 + \frac{\partial f}{\partial x_2}\hat{x}_2 + \frac{\partial f}{\partial x_3}\hat{x}_3,$$
where $\hat{x}_i$ represents a unit vector in the $x_i$ direction for $1 \le i \le 3$. This type of generalized derivative can be seen as the derivative of a scalar, $f$, with respect to a vector, $\mathbf{x}$, and its result can be easily collected in vector form.
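The gradient definition above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function `f` and the helper `numerical_gradient` are hypothetical names chosen for illustration, not part of any notation used in this article.

```python
import numpy as np

def f(x):
    # Hypothetical scalar function of three independent variables.
    x1, x2, x3 = x
    return x1**2 * x2 + np.sin(x3)

def numerical_gradient(func, x, h=1e-6):
    """Central-difference estimate of the partial derivatives of func at x."""
    grad = np.zeros(x.size)
    for i in range(x.size):
        step = np.zeros(x.size)
        step[i] = h
        grad[i] = (func(x + step) - func(x - step)) / (2 * h)
    return grad

x0 = np.array([1.0, 2.0, 0.5])
grad_numeric = numerical_gradient(f, x0)

# Analytic partial derivatives of f: (2*x1*x2, x1**2, cos(x3)).
grad_exact = np.array([2 * x0[0] * x0[1], x0[0] ** 2, np.cos(x0[2])])
```

Each entry of `grad_numeric` is the coefficient of one unit vector in the vector equation, so collecting the three partial derivatives into one array is exactly the "vector form" described above.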
More complicated examples include the derivative of a scalar function with respect to a matrix, known as the gradient matrix, which collects the derivative with respect to each matrix element in the corresponding position in the resulting matrix. In that case the scalar must be a function of each of the independent variables in the matrix. As another example, if we have an $n$-vector of dependent variables, or functions, of $m$ independent variables we might consider the derivative of the dependent vector with respect to the independent vector. The result could be collected in an $m \times n$ matrix consisting of all of the possible derivative combinations. There are, of course, a total of nine possibilities using scalars, vectors, and matrices. Notice that as we consider higher numbers of components in each of the independent and dependent variables we can be left with a very large number of possibilities.
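The six kinds of derivatives that can be most neatly organized in matrix form are collected in the following table.[1]

Types of matrix derivatives (columns: type of the dependent variable; rows: type of the independent variable)

| Types | Scalar | Vector | Matrix |
|---|---|---|---|
| Scalar | $\frac{\partial y}{\partial x}$ | $\frac{\partial \mathbf{y}}{\partial x}$ | $\frac{\partial \mathbf{Y}}{\partial x}$ |
| Vector | $\frac{\partial y}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$ | |
| Matrix | $\frac{\partial y}{\partial \mathbf{X}}$ | | |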
Here, we have used the term "matrix" in its most general sense, recognizing that vectors and scalars are simply matrices with one column and then one row respectively. Moreover, we have used bold letters to indicate vectors and bold capital letters for matrices. This notation is used throughout.
Notice that we could also talk about the derivative of a vector with respect to a matrix, or any of the other unfilled cells in our table. However, these derivatives are most naturally organized in a tensor of rank higher than 2, so that they do not fit neatly into a matrix. In the following three sections we will define each one of these derivatives and relate them to other branches of mathematics. See the layout conventions section for a more detailed table.
Relation to other derivatives
The matrix derivative is a convenient notation for keeping track of partial derivatives for doing calculations. The Fréchet derivative is the standard way in the setting of functional analysis to take derivatives with respect to vectors. In the case that a matrix function of a matrix is Fréchet differentiable, the two derivatives will agree up to translation of notations. As is the case in general for partial derivatives, some formulae may extend under weaker analytic conditions than the existence of the derivative as approximating linear mapping.
Usages
Matrix calculus is used for deriving optimal stochastic estimators, often involving the use of Lagrange multipliers. This includes the derivation of:
Kalman filter
Wiener filter
Expectation-maximization algorithm for Gaussian mixture
Notation
The vector and matrix derivatives presented in the sections to follow take full advantage of matrix notation, using a single variable to represent a large number of variables. In what follows we will distinguish scalars, vectors and matrices by their typeface. We will let $M(n,m)$ denote the space of real $n \times m$ matrices with $n$ rows and $m$ columns. Such matrices will be denoted using bold capital letters: $\mathbf{A}$, $\mathbf{X}$, $\mathbf{Y}$, etc. An element of $M(n,1)$, that is, a column vector, is denoted with a boldface lowercase letter: $\mathbf{a}$, $\mathbf{x}$, $\mathbf{y}$, etc. An element of $M(1,1)$ is a scalar, denoted with lowercase italic typeface: $a$, $t$, $x$, etc. $\mathbf{X}^T$ denotes matrix transpose, $\operatorname{tr}(\mathbf{X})$ is the trace, and $\det(\mathbf{X})$ is the determinant. All functions are assumed to be of differentiability class $C^1$ unless otherwise noted. Generally letters from the first half of the alphabet ($a, b, c, \ldots$) will be used to denote constants, and from the second half ($t, x, y, \ldots$) to denote variables.
NOTE: As mentioned above, there are competing notations for laying out systems of partial derivatives in vectors and matrices, and no standard appears to be emerging yet. The next two introductory sections use the numerator layout convention simply for the purposes of convenience, to avoid overly complicating the discussion. The section after them discusses layout conventions in more detail. It is important to realize the following:
1. Despite the use of the terms "numerator layout" and "denominator layout", there are actually more than two possible notational choices involved. The reason is that the choice of numerator vs. denominator (or in some situations, numerator vs. mixed) can be made independently for scalar-by-vector, vector-by-scalar, vector-by-vector, and scalar-by-matrix derivatives, and a number of authors mix and match their layout choices in various ways.
2. The choice of numerator layout in the introductory sections below does not imply that this is the "correct" or "superior" choice. There are advantages and disadvantages to the various layout types. Serious mistakes can result from carelessly combining formulas written in different layouts, and converting from one layout to another requires care to avoid errors. As a result, when working with existing formulas the best policy is probably to identify whichever layout is used and maintain consistency with it, rather than attempting to use the same layout in all situations.
Alternatives
The tensor index notation with its Einstein summation convention is very similar to the matrix calculus, except one writes only a single component at a time. It has the advantage that one can easily manipulate arbitrarily high rank tensors, whereas tensors of rank higher than two are quite unwieldy with matrix notation. All of the work here can be done in this notation without use of the single-variable matrix notation. However, many problems in estimation theory and other areas of applied mathematics would result in too many indices to properly keep track of, pointing in favor of matrix calculus in those areas. Also, Einstein notation can be very useful in proving the identities presented here, as an alternative to typical element notation, which can become cumbersome when the explicit sums are carried around. Note that a matrix can be considered a tensor of rank two.
Derivatives with vectors
Because vectors are matrices with only one column, the simplest matrix derivatives are vector derivatives.
The notations developed here can accommodate the usual operations of vector calculus by identifying the space $M(n,1)$ of $n$-vectors with the Euclidean space $\mathbb{R}^n$, and the scalar $M(1,1)$ is identified with $\mathbb{R}$. The corresponding concept from vector calculus is indicated at the end of each subsection.
NOTE: The discussion in this section assumes the numerator layout convention for pedagogical purposes. Some authors use different conventions. The section on layout conventions discusses this issue in greater detail. The identities given further down are presented in forms that can be used in conjunction with all common layout conventions.
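Vector-by-scalar
The derivative of a vector $\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_m \end{bmatrix}^T$, by a scalar $x$ is written (in numerator layout notation) as
$$\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \frac{\partial y_2}{\partial x} \\ \vdots \\ \frac{\partial y_m}{\partial x} \end{bmatrix}.$$
In vector calculus the derivative of a vector $\mathbf{y}$ with respect to a scalar $x$ is known as the tangent vector of the vector $\mathbf{y}$, $\frac{\partial \mathbf{y}}{\partial x}$. Notice here that $\mathbf{y}: \mathbb{R}^1 \to \mathbb{R}^m$.
Example: Simple examples of this include the velocity vector in Euclidean space, which is the tangent vector of the position vector (considered as a function of time). Also, the acceleration is the tangent vector of the velocity.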
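To make the velocity and acceleration example concrete, here is a small sketch, assuming NumPy. The helix trajectory `position` and the helper `d_dt` are hypothetical choices made for illustration; the derivative is approximated by central differences rather than computed symbolically.

```python
import numpy as np

def position(t):
    # Hypothetical trajectory: a helix in three-dimensional Euclidean space.
    return np.array([np.cos(t), np.sin(t), t])

def d_dt(vec_func, t, h=1e-5):
    """Central-difference derivative of a vector-valued function of a scalar."""
    return (vec_func(t + h) - vec_func(t - h)) / (2 * h)

t0 = 0.3
velocity = d_dt(position, t0)                         # tangent vector of the position
acceleration = d_dt(lambda t: d_dt(position, t), t0)  # tangent vector of the velocity

# Analytic values for the helix, used only as a cross-check.
velocity_exact = np.array([-np.sin(t0), np.cos(t0), 1.0])
acceleration_exact = np.array([-np.cos(t0), -np.sin(t0), 0.0])
```

Both results have the same number of components as the vector being differentiated, as the vector-by-scalar definition requires.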
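Scalar-by-vector
The derivative of a scalar $y$ by a vector $\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$, is written (in numerator layout notation) as
$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} & \frac{\partial y}{\partial x_2} & \cdots & \frac{\partial y}{\partial x_n} \end{bmatrix}.$$
In vector calculus, the gradient of a scalar field $y$ in the space $\mathbb{R}^n$ (whose independent coordinates are the components of $\mathbf{x}$) is the derivative of a scalar by a vector. In physics, the electric field is the negative vector gradient of the electric potential.
The directional derivative of a scalar function $f(\mathbf{x})$ of the space vector $\mathbf{x}$ in the direction of the unit vector $\mathbf{u}$ is defined using the gradient as follows.
$$\nabla_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u}$$
Using the notation just defined for the derivative of a scalar with respect to a vector we can rewrite the directional derivative as
$$\nabla_{\mathbf{u}} f = \frac{\partial f}{\partial \mathbf{x}} \mathbf{u}.$$
This type of notation will be nice when proving product rules and chain rules that come out looking similar to what we are familiar with for the scalar derivative.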
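The directional derivative can be illustrated with a short sketch, assuming NumPy. The scalar field `f` and its hand-derived gradient `grad_f` are hypothetical examples; the gradient is stored as a $1 \times n$ row vector to match the numerator layout used in this section.

```python
import numpy as np

def f(x):
    # Hypothetical scalar field on R^3.
    return x @ x + x[0] * x[1]

def grad_f(x):
    # Analytic derivative of f by x, stored as a 1 x n row vector.
    g = 2 * x + np.array([x[1], x[0], 0.0])
    return g.reshape(1, -1)

x0 = np.array([1.0, -2.0, 0.5])
u = np.array([1.0, 2.0, 2.0]) / 3.0   # unit vector, since 1 + 4 + 4 = 9

# Row vector times column of components gives a scalar.
via_gradient = (grad_f(x0) @ u).item()

# Independent check: rate of change of f along the direction u.
h = 1e-6
via_limit = (f(x0 + h * u) - f(x0 - h * u)) / (2 * h)
```

The two values agree up to rounding, which is the content of the rewritten formula: the row vector of partial derivatives multiplied by the direction vector.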
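Vector-by-vector
Each of the previous two cases can be considered as an application of the derivative of a vector with respect to a vector, using a vector of size one appropriately. Similarly we will find that the derivatives involving matrices will reduce to derivatives involving vectors in a corresponding way.
The derivative of a vector function (a vector whose components are functions) $\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_m \end{bmatrix}^T$, with respect to an input vector, $\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$, is written (in numerator layout notation) as
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}.$$
In vector calculus, the derivative of a vector function $\mathbf{y}$ with respect to a vector $\mathbf{x}$ whose components represent a space is known as the pushforward (or differential), or the Jacobian matrix.
The pushforward along a vector function $\mathbf{f}$ with respect to vector $\mathbf{v}$ in $\mathbb{R}^n$ is given by
$$d\,\mathbf{f}(\mathbf{v}) = \frac{\partial \mathbf{f}}{\partial \mathbf{v}}\, d\,\mathbf{v}.$$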
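A minimal sketch of the vector-by-vector derivative in numerator layout follows, assuming NumPy. The map `y` from $\mathbb{R}^2$ to $\mathbb{R}^3$ and the helper `jacobian` are hypothetical; the helper builds the $m \times n$ matrix one column (one input component) at a time.

```python
import numpy as np

def y(x):
    # Hypothetical vector function with n = 2 inputs and m = 3 outputs.
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def jacobian(func, x, h=1e-6):
    """Numerator-layout Jacobian: entry (i, j) approximates dy_i / dx_j."""
    columns = []
    for j in range(x.size):
        step = np.zeros(x.size)
        step[j] = h
        columns.append((func(x + step) - func(x - step)) / (2 * h))
    return np.stack(columns, axis=1)   # shape (m, n)

x0 = np.array([0.5, 2.0])
J = jacobian(y, x0)

# Hand-derived Jacobian for comparison.
J_exact = np.array([[x0[1], x0[0]],
                    [np.cos(x0[0]), 0.0],
                    [0.0, 2 * x0[1]]])

# Pushforward of a direction v: the Jacobian applied to v.
v = np.array([1.0, -1.0])
pushforward = J @ v
```

Applying `J` to a direction `v` gives the first-order change in `y`, which is how the pushforward acts on a vector.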
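Derivatives with matrices
There are two types of derivatives with matrices that can be organized into a matrix of the same size. These are the derivative of a matrix by a scalar and the derivative of a scalar by a matrix respectively. These can be useful in minimization problems found in many areas of applied mathematics and have adopted the names tangent matrix and gradient matrix respectively after their analogs for vectors.
NOTE: The discussion in this section assumes the numerator layout convention for pedagogical purposes. Some authors use different conventions. The section on layout conventions discusses this issue in greater detail. The identities given further down are presented in forms that can be used in conjunction with all common layout conventions.
Matrix-by-scalar
The derivative of a matrix function $\mathbf{Y}$ by a scalar $x$ is known as the tangent matrix and is given (in numerator layout notation) by
$$\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{bmatrix}.$$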
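Scalar-by-matrix
The derivative of a scalar $y$ function of a $p \times q$ matrix $\mathbf{X}$ of independent variables, with respect to the matrix $\mathbf{X}$, is given (in numerator layout notation) by
$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}} \\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}.$$
Important examples of scalar functions of matrices include the trace of a matrix and the determinant.
In analog with vector calculus this derivative is often written as the following.
$$\nabla_{\mathbf{X}}\, y(\mathbf{X}) = \frac{\partial y(\mathbf{X})}{\partial \mathbf{X}}$$
Also in analog with vector calculus, the directional derivative of a scalar $f(\mathbf{X})$ of a matrix $\mathbf{X}$ in the direction of matrix $\mathbf{Y}$ is given by
$$\nabla_{\mathbf{Y}} f = \operatorname{tr}\left(\frac{\partial f}{\partial \mathbf{X}} \mathbf{Y}\right).$$
It is the gradient matrix, in particular, that finds many uses in minimization problems in estimation theory, particularly in the derivation of the Kalman filter algorithm, which is of great importance in the field.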
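The scalar-by-matrix derivative and the trace form of the directional derivative can be checked on the trace function itself. The sketch below assumes NumPy; the matrices, the function `f`, and the helper `grad_same_shape_as_X` are hypothetical illustrations. The helper returns the array of partial derivatives in the shape of $\mathbf{X}$, and its transpose is the numerator-layout result.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 3
A = rng.standard_normal((q, p))   # constant matrix
X = rng.standard_normal((p, q))   # matrix of independent variables
Y = rng.standard_normal((p, q))   # direction for the directional derivative

def f(X):
    # Scalar function of a matrix: the trace of A X.
    return np.trace(A @ X)

def grad_same_shape_as_X(func, X, h=1e-6):
    """Entry (i, j) approximates d func / d X[i, j]; result is p x q like X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (func(X + E) - func(X - E)) / (2 * h)
    return G

G_den = grad_same_shape_as_X(f, X)   # p x q
G_num = G_den.T                      # q x p, the numerator-layout arrangement

# Directional derivative along Y, via the trace formula and via differences.
directional_trace = np.trace(G_num @ Y)
directional_fd = (f(X + 1e-6 * Y) - f(X - 1e-6 * Y)) / 2e-6
```

For this choice of `f` the numerator-layout result equals `A` and the same-shape result equals `A.T`, which is a useful sanity check when reading formulas from different sources.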
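Other matrix derivatives
The three types of derivatives that have not been considered are those involving vectors-by-matrices, matrices-by-vectors, and matrices-by-matrices. These are not as widely considered and a notation is not widely agreed upon. As for vectors, the other two types of higher matrix derivatives can be seen as applications of the derivative of a matrix by a matrix by using a matrix with one column in the correct place. For this reason, in this subsection we consider only how one can write the derivative of a matrix by another matrix.
The differential or the matrix derivative of a matrix function $\mathbf{F}(\mathbf{X})$ that maps from $n \times m$ matrices to $p \times q$ matrices, $\mathbf{F}: M(n,m) \to M(p,q)$, is an element of $M(p,q) \otimes M(m,n)$, a fourth-rank tensor (the reversal of $m$ and $n$ here indicates the dual space of $M(n,m)$). In short it is an $m \times n$ matrix each of whose entries is a $p \times q$ matrix,
$$\frac{\partial \mathbf{F}}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial \mathbf{F}}{\partial X_{1,1}} & \cdots & \frac{\partial \mathbf{F}}{\partial X_{n,1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial \mathbf{F}}{\partial X_{1,m}} & \cdots & \frac{\partial \mathbf{F}}{\partial X_{n,m}} \end{bmatrix},$$
and note that each $\frac{\partial \mathbf{F}}{\partial X_{i,j}}$ is a $p \times q$ matrix defined as above. Note also that this matrix has its indexing transposed: $m$ rows and $n$ columns. The pushforward along $\mathbf{F}$ of an $n \times m$ matrix $\mathbf{Y}$ in $M(n,m)$ is then
$$d\mathbf{F}(\mathbf{Y}) = \operatorname{tr}\left(\frac{\partial \mathbf{F}}{\partial \mathbf{X}} \mathbf{Y}\right),$$
as formal block matrices.
Note that this definition encompasses all of the preceding definitions as special cases.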
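According to Jan R. Magnus and Heinz Neudecker, the following notations are both unsuitable, as the determinant of the second resulting matrix would have "no interpretation" and "a useful chain rule does not exist" if these notations are being used:[2]
Given $\phi$, a differentiable function of an $n \times q$ matrix $\mathbf{X} = (x_{i,j})$,
$$\frac{\partial \phi(\mathbf{X})}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial \phi}{\partial x_{1,1}} & \cdots & \frac{\partial \phi}{\partial x_{1,q}} \\ \vdots & \ddots & \vdots \\ \frac{\partial \phi}{\partial x_{n,1}} & \cdots & \frac{\partial \phi}{\partial x_{n,q}} \end{bmatrix}.$$
Given $\mathbf{F} = (f_{s,t})$, a differentiable $m \times p$ matrix function of an $n \times q$ matrix $\mathbf{X}$,
$$\frac{\partial \mathbf{F}(\mathbf{X})}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial f_{1,1}}{\partial \mathbf{X}} & \cdots & \frac{\partial f_{1,p}}{\partial \mathbf{X}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_{m,1}}{\partial \mathbf{X}} & \cdots & \frac{\partial f_{m,p}}{\partial \mathbf{X}} \end{bmatrix}.$$
The Jacobian matrix, according to Magnus and Neudecker,[2] is
$$\mathrm{D}\,\mathbf{F}(\mathbf{X}) = \frac{\partial \operatorname{vec} \mathbf{F}(\mathbf{X})}{\partial (\operatorname{vec} \mathbf{X})^T}.$$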
Layout conventions
This section discusses the similarities and differences between notational conventions that are used in the various fields that take advantage of matrix calculus. Although there are largely two consistent conventions, some authors find it convenient to mix the two conventions in forms that are discussed below. After this section equations will be listed in both competing forms separately.
The fundamental issue is that the derivative of a vector with respect to a vector, i.e. $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, is often written in two competing ways. If the numerator $\mathbf{y}$ is of size $m$ and the denominator $\mathbf{x}$ of size $n$, then the result can be laid out as either an $m \times n$ matrix or $n \times m$ matrix, i.e. the elements of $\mathbf{y}$ laid out in columns and the elements of $\mathbf{x}$ laid out in rows, or vice versa. This leads to the following possibilities:
1. Numerator layout, i.e. lay out according to $\mathbf{y}$ and $\mathbf{x}^T$ (i.e. contrarily to $\mathbf{x}$). This is sometimes known as the Jacobian formulation.
2. Denominator layout, i.e. lay out according to $\mathbf{y}^T$ and $\mathbf{x}$ (i.e. contrarily to $\mathbf{y}$). This is sometimes known as the Hessian formulation. Some authors term this layout the gradient, in distinction to the Jacobian (numerator layout), which is its transpose. (However, "gradient" more commonly means the derivative $\frac{\partial y}{\partial \mathbf{x}}$, regardless of layout.)
3. A third possibility sometimes seen is to insist on writing the derivative as $\frac{\partial \mathbf{y}}{\partial \mathbf{x}^T}$ (i.e. the derivative is taken with respect to the transpose of $\mathbf{x}$) and follow the numerator layout. This makes it possible to claim that the matrix is laid out according to both numerator and denominator. In practice this produces results the same as the numerator layout.
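The difference between the two layouts is only in how the same partial derivatives are arranged. The following sketch, assuming NumPy, uses a hypothetical linear map $\mathbf{y} = \mathbf{A}\mathbf{x}$ and a hypothetical helper `partials` to show the two shapes side by side.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # m = 2 outputs, n = 3 inputs

def y(x):
    return A @ x

def partials(func, x, h=1e-6):
    """Return d with d[i, j] approximating dy_i / dx_j."""
    m = func(x).size
    d = np.zeros((m, x.size))
    for j in range(x.size):
        step = np.zeros(x.size)
        step[j] = h
        d[:, j] = (func(x + step) - func(x - step)) / (2 * h)
    return d

x0 = np.array([0.1, 0.2, 0.3])
numerator_layout = partials(y, x0)        # m x n: rows follow y, columns follow x
denominator_layout = numerator_layout.T   # n x m: rows follow x, columns follow y
```

For this map the numerator layout reproduces `A` and the denominator layout reproduces `A.T`, so a formula copied from a source that uses the other convention will be off by a transpose.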
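When handling the gradient $\frac{\partial y}{\partial \mathbf{x}}$ and the opposite case $\frac{\partial \mathbf{y}}{\partial x}$, we have the same issues. To be consistent, we should do one of the following:
1. If we choose numerator layout for $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, we should lay out the gradient $\frac{\partial y}{\partial \mathbf{x}}$ as a row vector, and $\frac{\partial \mathbf{y}}{\partial x}$ as a column vector.
2. If we choose denominator layout for $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, we should lay out the gradient $\frac{\partial y}{\partial \mathbf{x}}$ as a column vector, and $\frac{\partial \mathbf{y}}{\partial x}$ as a row vector.
3. In the third possibility above, we write $\frac{\partial y}{\partial \mathbf{x}^T}$ and $\frac{\partial \mathbf{y}}{\partial x}$, and use numerator layout.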
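Not all math textbooks and papers are consistent in this respect throughout the entire paper. That is, sometimes different conventions are used in different contexts within the same paper. For example, some choose denominator layout for gradients (laying them out as column vectors), but numerator layout for the vector-by-vector derivative $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$.
Similarly, when it comes to scalar-by-matrix derivatives $\frac{\partial y}{\partial \mathbf{X}}$ and matrix-by-scalar derivatives $\frac{\partial \mathbf{Y}}{\partial x}$, then consistent numerator layout lays out according to $\mathbf{Y}$ and $\mathbf{X}^T$, while consistent denominator layout lays out according to $\mathbf{Y}^T$ and $\mathbf{X}$. In practice, however, following a denominator layout for $\frac{\partial \mathbf{Y}}{\partial x}$, and laying the result out according to $\mathbf{Y}^T$, is rarely seen because it makes for ugly formulas that do not correspond to the scalar formulas. As a result, the following layouts can often be found:
1. Consistent numerator layout, which lays out $\frac{\partial \mathbf{Y}}{\partial x}$ according to $\mathbf{Y}$ and $\frac{\partial y}{\partial \mathbf{X}}$ according to $\mathbf{X}^T$.
2. Mixed layout, which lays out $\frac{\partial \mathbf{Y}}{\partial x}$ according to $\mathbf{Y}$ and $\frac{\partial y}{\partial \mathbf{X}}$ according to $\mathbf{X}$.
3. Use the notation $\frac{\partial y}{\partial \mathbf{X}^T}$, with results the same as consistent numerator layout.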
In the following formulas, we handle the five possible combinations $\frac{\partial y}{\partial \mathbf{x}}$, $\frac{\partial \mathbf{y}}{\partial x}$, $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$, $\frac{\partial y}{\partial \mathbf{X}}$ and $\frac{\partial \mathbf{Y}}{\partial x}$ separately. We also handle cases of scalar-by-scalar derivatives that involve an intermediate vector or matrix. (This can arise, for example, if a multi-dimensional parametric curve is defined in terms of a scalar variable, and then a derivative of a scalar function of the curve is taken with respect to the scalar that parameterizes the curve.) For each of the various combinations, we give numerator-layout and denominator-layout results, except in the cases above where denominator layout rarely occurs. In cases involving matrices where it makes sense, we give numerator-layout and mixed-layout results. As noted above, cases where vector and matrix denominators are written in transpose notation are equivalent to numerator layout with the denominators written without the transpose.
Keep in mind that various authors use different combinations of numerator and denominator layouts for different types of derivatives, and there is no guarantee that an author will consistently use either numerator or denominator layout for all types. Match up the formulas below with those quoted in the source to determine the layout used for that particular type of derivative, but be careful not to assume that derivatives of other types necessarily follow the same kind of layout.
When taking derivatives with an aggregate (vector or matrix) denominator in order to find a maximum or minimum of the aggregate, it should be kept in mind that using numerator layout will produce results that are transposed with respect to the aggregate. For example, in attempting to find the maximum likelihood estimate of a multivariate normal distribution using matrix calculus, if the domain is a $k \times 1$ column vector, then the result using the numerator layout will be in the form of a $1 \times k$ row vector. Thus, either the results should be transposed at the end or the denominator layout (or mixed layout) should be used.
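The shape issue in the maximum likelihood example can be seen in a small sketch, assuming NumPy. It treats only the mean of a multivariate normal distribution with a known covariance, which is a simplifying assumption; the data, `Sigma`, and the helper `grad_mu_denominator` are hypothetical. The gradient used is the standard one for the Gaussian log-likelihood, $\boldsymbol{\Sigma}^{-1} \sum_i (\mathbf{x}_i - \boldsymbol{\mu})$.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n_samples = 3, 50
samples = rng.standard_normal((n_samples, k))   # hypothetical observations
Sigma = np.diag([1.0, 2.0, 0.5])                # covariance, assumed known
Sigma_inv = np.linalg.inv(Sigma)

def grad_mu_denominator(mu):
    """Derivative of the log-likelihood by mu as a k x 1 column vector."""
    residual_sum = (samples - mu.ravel()).sum(axis=0).reshape(k, 1)
    return Sigma_inv @ residual_sum

# Candidate estimate: the sample mean, stored as a k x 1 column like mu itself.
mu_hat = samples.mean(axis=0).reshape(k, 1)

g_den = grad_mu_denominator(mu_hat)   # k x 1, same shape as mu
g_num = g_den.T                       # 1 x k, the numerator-layout arrangement
```

The gradient vanishes at the sample mean in either layout, but only the denominator-layout result has the same shape as the parameter, so an update such as `mu_hat + step * g_den` is well formed while the numerator-layout row would first need a transpose.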
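Result of differentiating various kinds of aggregates with other kinds of aggregates

| | Scalar $y$ | Vector $\mathbf{y}$ (size $m$) | Matrix $\mathbf{Y}$ (size $m \times n$) |
|---|---|---|---|
| Scalar $x$ | $\frac{\partial y}{\partial x}$: scalar | $\frac{\partial \mathbf{y}}{\partial x}$: (numerator layout) size-$m$ column vector; (denominator layout) size-$m$ row vector | $\frac{\partial \mathbf{Y}}{\partial x}$: (numerator layout) $m \times n$ matrix |
| Vector $\mathbf{x}$ (size $n$) | $\frac{\partial y}{\partial \mathbf{x}}$: (numerator layout) size-$n$ row vector; (denominator layout) size-$n$ column vector | $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$: (numerator layout) $m \times n$ matrix; (denominator layout) $n \times m$ matrix | |
| Matrix $\mathbf{X}$ (size $p \times q$) | $\frac{\partial y}{\partial \mathbf{X}}$: (numerator layout) $q \times p$ matrix; (denominator layout) $p \times q$ matrix | | |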
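The results of operations will be transposed when switching between numerator-layout and denominator-layout notation.
Numerator-layout notation
Using numerator-layout notation, we have:[1]
$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} & \frac{\partial y}{\partial x_2} & \cdots & \frac{\partial y}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \vdots \\ \frac{\partial y_m}{\partial x} \end{bmatrix},$$
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \cdots & \frac{\partial y}{\partial x_{p1}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{1q}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}.$$
The following definitions are only provided in numerator-layout notation:
$$\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{bmatrix}, \qquad d\mathbf{X} = \begin{bmatrix} dx_{11} & \cdots & dx_{1n} \\ \vdots & \ddots & \vdots \\ dx_{m1} & \cdots & dx_{mn} \end{bmatrix}.$$
Denominator-layout notation
Using denominator-layout notation, we have:[3]
$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} \\ \vdots \\ \frac{\partial y}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} & \frac{\partial y_2}{\partial x} & \cdots & \frac{\partial y_m}{\partial x} \end{bmatrix},$$
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}, \qquad \frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \cdots & \frac{\partial y}{\partial x_{1q}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{p1}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}.$$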
Identities
As noted above, in general, the results of operations will be transposed when switching between numerator-layout and denominator-layout notation.
To help make sense of all the identities below, keep in mind the most important rules: the chain rule, product rule and sum rule. The sum rule applies universally, and the product rule applies in most of the cases below, provided that the order of matrix products is maintained, since matrix products are not commutative. The chain rule applies in some of the cases, but unfortunately does not apply in matrix-by-scalar derivatives or scalar-by-matrix derivatives (in the latter case, mostly involving the trace operator applied to matrices). In the latter case, the product rule can't quite be applied directly, either, but the equivalent can be done with a bit more work using the differential identities.
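The remark about keeping the order of matrix products can be checked numerically. The sketch below assumes NumPy; the matrix functions `U` and `V` and the helper `d_dx` are hypothetical examples.

```python
import numpy as np

def U(x):
    # Hypothetical matrix-valued function of a scalar.
    return np.array([[x, x**2],
                     [1.0, np.sin(x)]])

def V(x):
    # A second hypothetical matrix-valued function of the same scalar.
    return np.array([[np.cos(x), 2.0],
                     [x**3, x]])

def d_dx(M, x, h=1e-6):
    """Entrywise central-difference derivative of a matrix function."""
    return (M(x + h) - M(x - h)) / (2 * h)

x0 = 0.7
lhs = d_dx(lambda x: U(x) @ V(x), x0)

# Product rule with the factor order preserved.
rhs = U(x0) @ d_dx(V, x0) + d_dx(U, x0) @ V(x0)

# Swapping the factors gives the derivative of V U instead, which differs.
wrong_order = d_dx(V, x0) @ U(x0) + V(x0) @ d_dx(U, x0)
```

Here `lhs` matches `rhs` but not `wrong_order`, which is why the product rule for matrices must keep each factor on its original side.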
Vector-by-vector identities
This is presented first because all of the operations that apply to vector-by-vector differentiation apply directly to vector-by-scalar or scalar-by-vector differentiation simply by reducing the appropriate vector in the numerator or denominator to a scalar.
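Identities: vector-by-vector

| Condition | Expression | Numerator layout, i.e. by $\mathbf{y}$ and $\mathbf{x}^T$ | Denominator layout, i.e. by $\mathbf{y}^T$ and $\mathbf{x}$ |
|---|---|---|---|
| $\mathbf{a}$ is not a function of $\mathbf{x}$ | $\frac{\partial \mathbf{a}}{\partial \mathbf{x}}$ | $\mathbf{0}$ | $\mathbf{0}$ |
| | $\frac{\partial \mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{I}$ | $\mathbf{I}$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$ | $\frac{\partial \mathbf{A}\mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{A}$ | $\mathbf{A}^T$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$ | $\frac{\partial \mathbf{x}^T\mathbf{A}}{\partial \mathbf{x}}$ | $\mathbf{A}^T$ | $\mathbf{A}$ |
| $a$ is not a function of $\mathbf{x}$, $\mathbf{u} = \mathbf{u}(\mathbf{x})$ | $\frac{\partial a\mathbf{u}}{\partial \mathbf{x}}$ | $a\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $a\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ |
| $a = a(\mathbf{x})$, $\mathbf{u} = \mathbf{u}(\mathbf{x})$ | $\frac{\partial a\mathbf{u}}{\partial \mathbf{x}}$ | $a\frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \mathbf{u}\frac{\partial a}{\partial \mathbf{x}}$ | $a\frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \frac{\partial a}{\partial \mathbf{x}}\mathbf{u}^T$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$, $\mathbf{u} = \mathbf{u}(\mathbf{x})$ | $\frac{\partial \mathbf{A}\mathbf{u}}{\partial \mathbf{x}}$ | $\mathbf{A}\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}\mathbf{A}^T$ |
| $\mathbf{u} = \mathbf{u}(\mathbf{x})$, $\mathbf{v} = \mathbf{v}(\mathbf{x})$ | $\frac{\partial (\mathbf{u} + \mathbf{v})}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ |
| $\mathbf{u} = \mathbf{u}(\mathbf{x})$ | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}$ |
| $\mathbf{u} = \mathbf{u}(\mathbf{x})$ | $\frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}$ |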
Scalar-by-vector identities
The fundamental identities are listed first.
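Identities: scalar-by-vector

| Condition | Expression | Numerator layout, i.e. by $\mathbf{x}^T$; result is row vector | Denominator layout, i.e. by $\mathbf{x}$; result is column vector |
|---|---|---|---|
| $a$ is not a function of $\mathbf{x}$ | $\frac{\partial a}{\partial \mathbf{x}}$ | $\mathbf{0}^T$ [4] | $\mathbf{0}$ [4] |
| $a$ is not a function of $\mathbf{x}$, $u = u(\mathbf{x})$ | $\frac{\partial au}{\partial \mathbf{x}}$ | $a\frac{\partial u}{\partial \mathbf{x}}$ | $a\frac{\partial u}{\partial \mathbf{x}}$ |
| $u = u(\mathbf{x})$, $v = v(\mathbf{x})$ | $\frac{\partial (u+v)}{\partial \mathbf{x}}$ | $\frac{\partial u}{\partial \mathbf{x}} + \frac{\partial v}{\partial \mathbf{x}}$ | $\frac{\partial u}{\partial \mathbf{x}} + \frac{\partial v}{\partial \mathbf{x}}$ |
| $u = u(\mathbf{x})$, $v = v(\mathbf{x})$ | $\frac{\partial uv}{\partial \mathbf{x}}$ | $u\frac{\partial v}{\partial \mathbf{x}} + v\frac{\partial u}{\partial \mathbf{x}}$ | $u\frac{\partial v}{\partial \mathbf{x}} + v\frac{\partial u}{\partial \mathbf{x}}$ |
| $u = u(\mathbf{x})$ | $\frac{\partial g(u)}{\partial \mathbf{x}}$ | $\frac{\partial g(u)}{\partial u}\frac{\partial u}{\partial \mathbf{x}}$ | $\frac{\partial g(u)}{\partial u}\frac{\partial u}{\partial \mathbf{x}}$ |
| $u = u(\mathbf{x})$ | $\frac{\partial f(g(u))}{\partial \mathbf{x}}$ | $\frac{\partial f(g)}{\partial g}\frac{\partial g(u)}{\partial u}\frac{\partial u}{\partial \mathbf{x}}$ | $\frac{\partial f(g)}{\partial g}\frac{\partial g(u)}{\partial u}\frac{\partial u}{\partial \mathbf{x}}$ |
| $\mathbf{u} = \mathbf{u}(\mathbf{x})$, $\mathbf{v} = \mathbf{v}(\mathbf{x})$ | $\frac{\partial (\mathbf{u} \cdot \mathbf{v})}{\partial \mathbf{x}} = \frac{\partial \mathbf{u}^T\mathbf{v}}{\partial \mathbf{x}}$ | $\mathbf{u}^T\frac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^T\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$; assumes numerator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}\mathbf{v} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}}\mathbf{u}$; assumes denominator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ |
| $\mathbf{u} = \mathbf{u}(\mathbf{x})$, $\mathbf{v} = \mathbf{v}(\mathbf{x})$, $\mathbf{A}$ is not a function of $\mathbf{x}$ | $\frac{\partial (\mathbf{u} \cdot \mathbf{A}\mathbf{v})}{\partial \mathbf{x}} = \frac{\partial \mathbf{u}^T\mathbf{A}\mathbf{v}}{\partial \mathbf{x}}$ | $\mathbf{u}^T\mathbf{A}\frac{\partial \mathbf{v}}{\partial \mathbf{x}} + \mathbf{v}^T\mathbf{A}^T\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$; assumes numerator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}\mathbf{A}\mathbf{v} + \frac{\partial \mathbf{v}}{\partial \mathbf{x}}\mathbf{A}^T\mathbf{u}$; assumes denominator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}, \frac{\partial \mathbf{v}}{\partial \mathbf{x}}$ |
| | $\frac{\partial^2 f}{\partial \mathbf{x}\,\partial \mathbf{x}^T}$ | $\mathbf{H}^T$ | $\mathbf{H}$, the Hessian matrix [5] |
| $\mathbf{a}$ is not a function of $\mathbf{x}$ | $\frac{\partial (\mathbf{a} \cdot \mathbf{x})}{\partial \mathbf{x}} = \frac{\partial (\mathbf{x} \cdot \mathbf{a})}{\partial \mathbf{x}}$ | $\mathbf{a}^T$ | $\mathbf{a}$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$, $\mathbf{b}$ is not a function of $\mathbf{x}$ | $\frac{\partial \mathbf{b}^T\mathbf{A}\mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{b}^T\mathbf{A}$ | $\mathbf{A}^T\mathbf{b}$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$ | $\frac{\partial \mathbf{x}^T\mathbf{A}\mathbf{x}}{\partial \mathbf{x}}$ | $\mathbf{x}^T(\mathbf{A} + \mathbf{A}^T)$ | $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$, $\mathbf{A}$ is symmetric | $\frac{\partial \mathbf{x}^T\mathbf{A}\mathbf{x}}{\partial \mathbf{x}}$ | $2\mathbf{x}^T\mathbf{A}$ | $2\mathbf{A}\mathbf{x}$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$ | $\frac{\partial^2 \mathbf{x}^T\mathbf{A}\mathbf{x}}{\partial \mathbf{x}\,\partial \mathbf{x}^T}$ | $\mathbf{A} + \mathbf{A}^T$ | $\mathbf{A} + \mathbf{A}^T$ |
| $\mathbf{A}$ is not a function of $\mathbf{x}$, $\mathbf{A}$ is symmetric | $\frac{\partial^2 \mathbf{x}^T\mathbf{A}\mathbf{x}}{\partial \mathbf{x}\,\partial \mathbf{x}^T}$ | $2\mathbf{A}$ | $2\mathbf{A}$ |
| | $\frac{\partial (\mathbf{x} \cdot \mathbf{x})}{\partial \mathbf{x}} = \frac{\partial \mathbf{x}^T\mathbf{x}}{\partial \mathbf{x}}$ | $2\mathbf{x}^T$ | $2\mathbf{x}$ |
| $\mathbf{a}$ is not a function of $\mathbf{x}$, $\mathbf{u} = \mathbf{u}(\mathbf{x})$ | $\frac{\partial (\mathbf{a} \cdot \mathbf{u})}{\partial \mathbf{x}} = \frac{\partial \mathbf{a}^T\mathbf{u}}{\partial \mathbf{x}}$ | $\mathbf{a}^T\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$; assumes numerator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}\mathbf{a}$; assumes denominator layout of $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ |
| $\mathbf{a}$, $\mathbf{b}$ are not functions of $\mathbf{x}$ | $\frac{\partial \mathbf{a}^T\mathbf{x}\mathbf{x}^T\mathbf{b}}{\partial \mathbf{x}}$ | $\mathbf{x}^T(\mathbf{a}\mathbf{b}^T + \mathbf{b}\mathbf{a}^T)$ | $(\mathbf{a}\mathbf{b}^T + \mathbf{b}\mathbf{a}^T)\mathbf{x}$ |
| $\mathbf{A}$, $\mathbf{b}$, $\mathbf{C}$, $\mathbf{D}$, $\mathbf{e}$ are not functions of $\mathbf{x}$ | $\frac{\partial (\mathbf{A}\mathbf{x} + \mathbf{b})^T\mathbf{C}(\mathbf{D}\mathbf{x} + \mathbf{e})}{\partial \mathbf{x}}$ | $(\mathbf{D}\mathbf{x} + \mathbf{e})^T\mathbf{C}^T\mathbf{A} + (\mathbf{A}\mathbf{x} + \mathbf{b})^T\mathbf{C}\mathbf{D}$ | $\mathbf{D}^T\mathbf{C}^T(\mathbf{A}\mathbf{x} + \mathbf{b}) + \mathbf{A}^T\mathbf{C}(\mathbf{D}\mathbf{x} + \mathbf{e})$ |
| $\mathbf{a}$ is not a function of $\mathbf{x}$ | $\frac{\partial \lVert \mathbf{x} - \mathbf{a} \rVert}{\partial \mathbf{x}}$ | $\frac{(\mathbf{x} - \mathbf{a})^T}{\lVert \mathbf{x} - \mathbf{a} \rVert}$ | $\frac{\mathbf{x} - \mathbf{a}}{\lVert \mathbf{x} - \mathbf{a} \rVert}$ |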
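One of the most frequently used scalar-by-vector results is the derivative of a quadratic form. It is a standard fact that the derivative of $\mathbf{x}^T\mathbf{A}\mathbf{x}$ by $\mathbf{x}$ is $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$ as a column vector, or $\mathbf{x}^T(\mathbf{A} + \mathbf{A}^T)$ as a row vector. The sketch below, assuming NumPy, checks this for a deliberately non-symmetric matrix; the matrix `A` and the helper `grad_column` are hypothetical.

```python
import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [3.0,  1.0, 4.0],
              [0.5,  0.0, 1.0]])   # deliberately non-symmetric

def q(x):
    # Quadratic form x^T A x.
    return x @ A @ x

def grad_column(func, x, h=1e-6):
    """Central-difference gradient returned as an n x 1 column vector."""
    g = np.zeros(x.size)
    for i in range(x.size):
        step = np.zeros(x.size)
        step[i] = h
        g[i] = (func(x + step) - func(x - step)) / (2 * h)
    return g.reshape(-1, 1)

x0 = np.array([1.0, 2.0, -1.0])
g_den = grad_column(q, x0)   # column vector: denominator layout
g_num = g_den.T              # row vector: numerator layout

expected_den = ((A + A.T) @ x0).reshape(-1, 1)
expected_num = (x0 @ (A + A.T)).reshape(1, -1)

# The shortcut 2 A x is only valid when A is symmetric.
symmetric_shortcut = (2 * A @ x0).reshape(-1, 1)
```

For this non-symmetric `A` the shortcut `2 * A @ x0` gives a different vector, which shows why the symmetry condition matters.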
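Vector-by-scalar identities

Identities: vector-by-scalar

| Condition | Expression | Numerator layout, i.e. by $\mathbf{y}$; result is column vector | Denominator layout, i.e. by $\mathbf{y}^T$; result is row vector |
|---|---|---|---|
| $\mathbf{a}$ is not a function of $x$ | $\frac{\partial \mathbf{a}}{\partial x}$ | $\mathbf{0}$ [4] | $\mathbf{0}^T$ |
| $a$ is not a function of $x$, $\mathbf{u} = \mathbf{u}(x)$ | $\frac{\partial a\mathbf{u}}{\partial x}$ | $a\frac{\partial \mathbf{u}}{\partial x}$ | $a\frac{\partial \mathbf{u}}{\partial x}$ |
| $\mathbf{A}$ is not a function of $x$, $\mathbf{u} = \mathbf{u}(x)$ | $\frac{\partial \mathbf{A}\mathbf{u}}{\partial x}$ | $\mathbf{A}\frac{\partial \mathbf{u}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x}\mathbf{A}^T$ |
| $\mathbf{u} = \mathbf{u}(x)$ | $\frac{\partial \mathbf{u}^T}{\partial x}$ | $\left(\frac{\partial \mathbf{u}}{\partial x}\right)^T$ | $\left(\frac{\partial \mathbf{u}}{\partial x}\right)^T$ |
| $\mathbf{u} = \mathbf{u}(x)$, $\mathbf{v} = \mathbf{v}(x)$ | $\frac{\partial (\mathbf{u} + \mathbf{v})}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} + \frac{\partial \mathbf{v}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} + \frac{\partial \mathbf{v}}{\partial x}$ |
| $\mathbf{u} = \mathbf{u}(x)$, $\mathbf{v} = \mathbf{v}(x)$ | $\frac{\partial (\mathbf{u} \times \mathbf{v})}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x} \times \mathbf{v} + \mathbf{u} \times \frac{\partial \mathbf{v}}{\partial x}$ | $\left(\left(\frac{\partial \mathbf{u}}{\partial x}\right)^T \times \mathbf{v} + \mathbf{u} \times \left(\frac{\partial \mathbf{v}}{\partial x}\right)^T\right)^T$ |
| $\mathbf{u} = \mathbf{u}(x)$ | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial x}$ | $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\frac{\partial \mathbf{u}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x}\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}$ |
| $\mathbf{u} = \mathbf{u}(x)$ | $\frac{\partial \mathbf{f}(\mathbf{g}(\mathbf{u}))}{\partial x}$ | $\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\frac{\partial \mathbf{u}}{\partial x}$ | $\frac{\partial \mathbf{u}}{\partial x}\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}$ |

The last two rows assume consistent matrix layout; see below.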
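NOTE: The formulas involving the vector-by-vector derivatives $\frac{\partial \mathbf{g}(\mathbf{u})}{\partial \mathbf{u}}$ and $\frac{\partial \mathbf{f}(\mathbf{g})}{\partial \mathbf{g}}$ (whose outputs are matrices) assume the matrices are laid out consistent with the vector layout, i.e. numerator-layout matrix when numerator-layout vector and vice versa; otherwise, transpose the vector-by-vector derivatives.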
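Scalar-by-matrix identities
Note that exact equivalents of the scalar product rule and chain rule do not exist when applied to matrix-valued functions of matrices. However, the product rule of this sort does apply to the differential form (see below), and this is the way to derive many of the identities below involving the trace function, combined with the fact that the trace function allows transposing and cyclic permutation, i.e.:
$$\operatorname{tr}(\mathbf{A}) = \operatorname{tr}(\mathbf{A}^T), \qquad \operatorname{tr}(\mathbf{A}\mathbf{B}\mathbf{C}\mathbf{D}) = \operatorname{tr}(\mathbf{B}\mathbf{C}\mathbf{D}\mathbf{A}) = \operatorname{tr}(\mathbf{C}\mathbf{D}\mathbf{A}\mathbf{B}) = \operatorname{tr}(\mathbf{D}\mathbf{A}\mathbf{B}\mathbf{C}).$$
For example, to compute $\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^T\mathbf{C})}{\partial \mathbf{X}}$:
$$\begin{aligned}
d\operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^T\mathbf{C}) &= d\operatorname{tr}(\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^T) = \operatorname{tr}\left(d(\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^T)\right) \\
&= \operatorname{tr}\left(\mathbf{C}\mathbf{A}\mathbf{X}\, d(\mathbf{B}\mathbf{X}^T) + d(\mathbf{C}\mathbf{A}\mathbf{X})\,\mathbf{B}\mathbf{X}^T\right) \\
&= \operatorname{tr}\left(\mathbf{C}\mathbf{A}\mathbf{X}\mathbf{B}\,(d\mathbf{X})^T\right) + \operatorname{tr}\left(\mathbf{C}\mathbf{A}\,(d\mathbf{X})\,\mathbf{B}\mathbf{X}^T\right) \\
&= \operatorname{tr}\left((d\mathbf{X})\,\mathbf{B}^T\mathbf{X}^T\mathbf{A}^T\mathbf{C}^T\right) + \operatorname{tr}\left(\mathbf{C}\mathbf{A}\,(d\mathbf{X})\,\mathbf{B}\mathbf{X}^T\right) \\
&= \operatorname{tr}\left(\mathbf{B}^T\mathbf{X}^T\mathbf{A}^T\mathbf{C}^T\, d\mathbf{X}\right) + \operatorname{tr}\left(\mathbf{B}\mathbf{X}^T\mathbf{C}\mathbf{A}\, d\mathbf{X}\right) \\
&= \operatorname{tr}\left(\left(\mathbf{B}^T\mathbf{X}^T\mathbf{A}^T\mathbf{C}^T + \mathbf{B}\mathbf{X}^T\mathbf{C}\mathbf{A}\right) d\mathbf{X}\right).
\end{aligned}$$
Therefore,
$$\frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^T\mathbf{C})}{\partial \mathbf{X}} = \mathbf{B}^T\mathbf{X}^T\mathbf{A}^T\mathbf{C}^T + \mathbf{B}\mathbf{X}^T\mathbf{C}\mathbf{A}$$
in numerator layout. (For the last step, see the "Conversion from differential to derivative form" section.)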
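A derivation by differentials like the one in this section can be cross-checked numerically. The sketch below, assuming NumPy, uses the function $\operatorname{tr}(\mathbf{A}\mathbf{X}\mathbf{B}\mathbf{X}^T\mathbf{C})$ with hypothetical random square matrices and a hypothetical helper `grad_same_shape_as_X`. Expanding the differential with the product rule and the cyclic property of the trace gives $\mathbf{B}\mathbf{X}^T\mathbf{C}\mathbf{A} + \mathbf{B}^T\mathbf{X}^T\mathbf{A}^T\mathbf{C}^T$ in numerator layout, which is the formula tested here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A, B, C, X = (rng.standard_normal((n, n)) for _ in range(4))

def f(X):
    return np.trace(A @ X @ B @ X.T @ C)

def grad_same_shape_as_X(func, X, h=1e-6):
    """Entry (i, j) approximates d func / d X[i, j] (denominator layout)."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (func(X + E) - func(X - E)) / (2 * h)
    return G

G_den = grad_same_shape_as_X(f, X)

# Closed form in numerator layout; its transpose is the denominator layout.
numerator_formula = B @ X.T @ C @ A + B.T @ X.T @ A.T @ C.T
```

Comparing `G_den.T` with `numerator_formula` makes the layout explicit: the finite-difference array is indexed like `X`, so it has to be transposed before it is compared with a numerator-layout formula.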
Identities:scalarbymatrix
Condition
Expression
aisnota
functionofX
Numeratorlayout,i.e.byXT
Denominatorlayout,i.e.byX
[6]
[6]
aisnota
functionofX,
u=u(X)
u=u(X),v=
v(X)
u=u(X),v=
v(X)
u=u(X)
u=u(X)
[5]
U=U(X)
Bothformsassumenumeratorlayoutfor
i.e.mixedlayoutifdenominatorlayoutforXisbeingused.
U=U(X),V
=V(X)
aisnota
functionofX,
U=U(X)
g(X)isany
polynomial
withscalar
coefficients,
oranymatrix
function
definedbyan
infinite
polynomial
series(e.g.
eX,sin(X),
cos(X),ln(X),
etc.usinga
Taylorseries)
g(x)isthe
equivalent
scalar
function,g(x)
isits
derivative,
andg(X)is
the
corresponding
matrix
function
Aisnota
functionofX
[7]
Aisnota [5]
functionofX
https://en.wikipedia.org/wiki/Matrix_calculus#Derivatives_with_matrices
13/18
27/08/2016
MatrixcalculusWikipedia,thefreeencyclopedia
Aisnota
functionofX
[5]
Aisnota
functionofX
[5]
A,Barenot
functionsof
X
A,B,Care
notfunctions
ofX
nisapositive
integer
Aisnota
functionofX,
nisapositive
integer
[5]
[5]
[5]
[5]
[8]
aisnota
functionofX
A,Barenot
functionsof
X
[5]
[9]
[5]
nisapositive
integer
[5]
(seepseudo
inverse)
[5]
(seepseudo
inverse)
[5]
Aisnota
functionofX,
Xissquare
andinvertible
Aisnota
functionofX,
Xisnon
square,
Ais
symmetric
Aisnota
functionofX,
Xisnon
square,
Aisnon
symmetric
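The determinant is one of the scalar functions of a matrix covered by these identities. A well-known result (Jacobi's formula) states that $d\det(\mathbf{X}) = \det(\mathbf{X})\operatorname{tr}(\mathbf{X}^{-1}\,d\mathbf{X})$, which corresponds to the derivative $\det(\mathbf{X})\,\mathbf{X}^{-1}$ in numerator layout. The sketch below, assuming NumPy, checks the differential form for a hypothetical matrix `X` and a small random perturbation `dX`.

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, 1.0],
              [1.0, 0.0, 4.0]])          # invertible: det(X) = 23
dX = 1e-6 * rng.standard_normal((3, 3))  # small perturbation of X

# Change in the determinant, symmetrised to cancel second-order terms.
actual_change = (np.linalg.det(X + dX) - np.linalg.det(X - dX)) / 2

# First-order prediction from the differential identity.
predicted_change = np.linalg.det(X) * np.trace(np.linalg.inv(X) @ dX)

# Matrix multiplying dX inside the trace: the numerator-layout derivative.
derivative_numerator = np.linalg.det(X) * np.linalg.inv(X)
derivative_denominator = derivative_numerator.T
```

Reading off the matrix that multiplies `dX` inside the trace is the general recipe for turning a differential into a numerator-layout derivative; transposing it gives the denominator-layout form.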
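Matrix-by-scalar identities

Identities: matrix-by-scalar

| Condition | Expression | Numerator layout, i.e. by $\mathbf{Y}$ |
|---|---|---|
| $\mathbf{U} = \mathbf{U}(x)$ | $\frac{\partial a\mathbf{U}}{\partial x}$ | $a\frac{\partial \mathbf{U}}{\partial x}$ |
| $\mathbf{A}$, $\mathbf{B}$ are not functions of $x$, $\mathbf{U} = \mathbf{U}(x)$ | $\frac{\partial \mathbf{A}\mathbf{U}\mathbf{B}}{\partial x}$ | $\mathbf{A}\frac{\partial \mathbf{U}}{\partial x}\mathbf{B}$ |
| $\mathbf{U} = \mathbf{U}(x)$, $\mathbf{V} = \mathbf{V}(x)$ | $\frac{\partial (\mathbf{U} + \mathbf{V})}{\partial x}$ | $\frac{\partial \mathbf{U}}{\partial x} + \frac{\partial \mathbf{V}}{\partial x}$ |
| $\mathbf{U} = \mathbf{U}(x)$, $\mathbf{V} = \mathbf{V}(x)$ | $\frac{\partial (\mathbf{U}\mathbf{V})}{\partial x}$ | $\mathbf{U}\frac{\partial \mathbf{V}}{\partial x} + \frac{\partial \mathbf{U}}{\partial x}\mathbf{V}$ |
| $\mathbf{U} = \mathbf{U}(x)$, $\mathbf{V} = \mathbf{V}(x)$ | $\frac{\partial (\mathbf{U} \otimes \mathbf{V})}{\partial x}$ | $\mathbf{U} \otimes \frac{\partial \mathbf{V}}{\partial x} + \frac{\partial \mathbf{U}}{\partial x} \otimes \mathbf{V}$ |
| $\mathbf{U} = \mathbf{U}(x)$, $\mathbf{V} = \mathbf{V}(x)$ | $\frac{\partial (\mathbf{U} \circ \mathbf{V})}{\partial x}$ | $\mathbf{U} \circ \frac{\partial \mathbf{V}}{\partial x} + \frac{\partial \mathbf{U}}{\partial x} \circ \mathbf{V}$ |
| $\mathbf{U} = \mathbf{U}(x)$ | $\frac{\partial \mathbf{U}^{-1}}{\partial x}$ | $-\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial x}\mathbf{U}^{-1}$ |
| $\mathbf{U} = \mathbf{U}(x, y)$ | $\frac{\partial^2 \mathbf{U}^{-1}}{\partial x\,\partial y}$ | $\mathbf{U}^{-1}\left(\frac{\partial \mathbf{U}}{\partial x}\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial y} - \frac{\partial^2 \mathbf{U}}{\partial x\,\partial y} + \frac{\partial \mathbf{U}}{\partial y}\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial x}\right)\mathbf{U}^{-1}$ |
| $\mathbf{A}$ is not a function of $x$; $g(\mathbf{X})$ is any polynomial with scalar coefficients, or any matrix function defined by an infinite polynomial series (e.g. $e^{\mathbf{X}}$, $\sin(\mathbf{X})$, $\cos(\mathbf{X})$, $\ln(\mathbf{X})$, etc.); $g(x)$ is the equivalent scalar function, $g'(x)$ is its derivative, and $g'(\mathbf{X})$ is the corresponding matrix function | $\frac{\partial g(x\mathbf{A})}{\partial x}$ | $\mathbf{A}\,g'(x\mathbf{A}) = g'(x\mathbf{A})\,\mathbf{A}$ |
| $\mathbf{A}$ is not a function of $x$ | $\frac{\partial e^{x\mathbf{A}}}{\partial x}$ | $\mathbf{A}e^{x\mathbf{A}} = e^{x\mathbf{A}}\mathbf{A}$ |

Further see Derivative of the exponential map.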
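As an illustration of a matrix-by-scalar identity, the derivative of a matrix inverse is $-\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial x}\mathbf{U}^{-1}$, a standard result that follows from differentiating $\mathbf{U}\mathbf{U}^{-1} = \mathbf{I}$. The sketch below, assuming NumPy, checks it against finite differences; the matrix function `U` and its hand-derived derivative `dU` are hypothetical.

```python
import numpy as np

def U(x):
    # Hypothetical invertible matrix function of a scalar.
    return np.array([[2.0 + x, x**2],
                     [np.sin(x), 3.0]])

def dU(x):
    # Entrywise derivative of U, derived by hand.
    return np.array([[1.0, 2 * x],
                     [np.cos(x), 0.0]])

x0, h = 0.4, 1e-6
U_inv = np.linalg.inv(U(x0))

# Finite-difference derivative of the inverse, entry by entry.
numeric = (np.linalg.inv(U(x0 + h)) - np.linalg.inv(U(x0 - h))) / (2 * h)

# Closed form; note that the factor order cannot be rearranged.
formula = -U_inv @ dU(x0) @ U_inv
```

The result has the same shape as `U`, as expected for a matrix-by-scalar derivative laid out according to the numerator.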
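Scalar-by-scalar identities
With vectors involved

Identities: scalar-by-scalar, with vectors involved

| Condition | Expression | Any layout (assumes dot product ignores row vs. column layout) |
|---|---|---|
| $\mathbf{u} = \mathbf{u}(x)$ | $\frac{\partial g(\mathbf{u})}{\partial x}$ | $\frac{\partial g(\mathbf{u})}{\partial \mathbf{u}} \cdot \frac{\partial \mathbf{u}}{\partial x}$ |
| $\mathbf{u} = \mathbf{u}(x)$, $\mathbf{v} = \mathbf{v}(x)$ | $\frac{\partial (\mathbf{u} \cdot \mathbf{v})}{\partial x}$ | $\mathbf{u} \cdot \frac{\partial \mathbf{v}}{\partial x} + \frac{\partial \mathbf{u}}{\partial x} \cdot \mathbf{v}$ |

With matrices involved

Identities: scalar-by-scalar, with matrices involved[5]

| Condition | Expression | Consistent numerator layout, i.e. by $\mathbf{Y}$ and $\mathbf{X}^T$ | Mixed layout, i.e. by $\mathbf{Y}$ and $\mathbf{X}$ |
|---|---|---|---|
| $\mathbf{U} = \mathbf{U}(x)$ | $\frac{\partial \det(\mathbf{U})}{\partial x}$ | $\det(\mathbf{U})\operatorname{tr}\left(\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial x}\right)$ | same |
| $\mathbf{U} = \mathbf{U}(x)$ | $\frac{\partial \ln \det(\mathbf{U})}{\partial x}$ | $\operatorname{tr}\left(\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial x}\right)$ | same |
| $\mathbf{U} = \mathbf{U}(x)$ | $\frac{\partial^2 \det(\mathbf{U})}{\partial x^2}$ | $\det(\mathbf{U})\left[\operatorname{tr}\left(\mathbf{U}^{-1}\frac{\partial^2 \mathbf{U}}{\partial x^2}\right) + \operatorname{tr}^2\left(\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial x}\right) - \operatorname{tr}\left(\left(\mathbf{U}^{-1}\frac{\partial \mathbf{U}}{\partial x}\right)^2\right)\right]$ | same |
| $\mathbf{U} = \mathbf{U}(x)$ | $\frac{\partial g(\mathbf{U})}{\partial x}$ | $\operatorname{tr}\left(\frac{\partial g(\mathbf{U})}{\partial \mathbf{U}}\frac{\partial \mathbf{U}}{\partial x}\right)$ | $\operatorname{tr}\left(\left(\frac{\partial g(\mathbf{U})}{\partial \mathbf{U}}\right)^T\frac{\partial \mathbf{U}}{\partial x}\right)$ |
| $\mathbf{A}$ is not a function of $x$; $g(\mathbf{X})$ is any polynomial with scalar coefficients, or any matrix function defined by an infinite polynomial series (e.g. $e^{\mathbf{X}}$, $\sin(\mathbf{X})$, $\cos(\mathbf{X})$, $\ln(\mathbf{X})$, etc.); $g(x)$ is the equivalent scalar function, $g'(x)$ is its derivative, and $g'(\mathbf{X})$ is the corresponding matrix function | $\frac{\partial \operatorname{tr}(g(x\mathbf{A}))}{\partial x}$ | $\operatorname{tr}\left(\mathbf{A}\,g'(x\mathbf{A})\right)$ | same |
| $\mathbf{A}$ is not a function of $x$ | $\frac{\partial \operatorname{tr}(e^{x\mathbf{A}})}{\partial x}$ | $\operatorname{tr}\left(\mathbf{A}e^{x\mathbf{A}}\right)$ | same |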
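Identities in differential form
It is often easier to work in differential form and then convert back to normal derivatives. This only works well using the numerator layout. In these rules, "a" is a scalar.

Differential identities: scalar involving matrix[1][5]

| Condition | Expression | Result (numerator layout) |
|---|---|---|
| | $d(\operatorname{tr}(\mathbf{X}))$ | $\operatorname{tr}(d\mathbf{X})$ |
| | $d(\det(\mathbf{X}))$ | $\det(\mathbf{X})\operatorname{tr}(\mathbf{X}^{-1}\,d\mathbf{X})$ |
| | $d(\ln \det(\mathbf{X}))$ | $\operatorname{tr}(\mathbf{X}^{-1}\,d\mathbf{X})$ |

Differential identities: matrix[1][5]

| Condition | Expression | Result (numerator layout) |
|---|---|---|
| $\mathbf{A}$ is not a function of $\mathbf{X}$ | $d(\mathbf{A})$ | $\mathbf{0}$ |
| $a$ is not a function of $\mathbf{X}$ | $d(a\mathbf{X})$ | $a\,d\mathbf{X}$ |
| | $d(\mathbf{X} + \mathbf{Y})$ | $d\mathbf{X} + d\mathbf{Y}$ |
| | $d(\mathbf{X}\mathbf{Y})$ | $(d\mathbf{X})\mathbf{Y} + \mathbf{X}(d\mathbf{Y})$ |
| (Kronecker product) | $d(\mathbf{X} \otimes \mathbf{Y})$ | $(d\mathbf{X}) \otimes \mathbf{Y} + \mathbf{X} \otimes (d\mathbf{Y})$ |
| (Hadamard product) | $d(\mathbf{X} \circ \mathbf{Y})$ | $(d\mathbf{X}) \circ \mathbf{Y} + \mathbf{X} \circ (d\mathbf{Y})$ |
| | $d(\mathbf{X}^T)$ | $(d\mathbf{X})^T$ |
| (conjugate transpose) | $d(\mathbf{X}^H)$ | $(d\mathbf{X})^H$ |
| | $d(\mathbf{X}^{-1})$ | $-\mathbf{X}^{-1}(d\mathbf{X})\mathbf{X}^{-1}$ |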
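To convert to normal derivative form, first convert it to one of the following canonical forms, and then use these identities:

Conversion from differential to derivative form[1]

| Canonical differential form | Equivalent derivative form |
|---|---|
| $dy = a\,dx$ | $\frac{dy}{dx} = a$ |
| $dy = \mathbf{a}^T\,d\mathbf{x}$ | $\frac{dy}{d\mathbf{x}} = \mathbf{a}^T$ |
| $dy = \operatorname{tr}(\mathbf{A}\,d\mathbf{X})$ | $\frac{dy}{d\mathbf{X}} = \mathbf{A}$ |
| $d\mathbf{y} = \mathbf{a}\,dx$ | $\frac{d\mathbf{y}}{dx} = \mathbf{a}$ |
| $d\mathbf{y} = \mathbf{A}\,d\mathbf{x}$ | $\frac{d\mathbf{y}}{d\mathbf{x}} = \mathbf{A}$ |
| $d\mathbf{Y} = \mathbf{A}\,dx$ | $\frac{d\mathbf{Y}}{dx} = \mathbf{A}$ |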
See also
Derivative (generalizations)
Product integral
Notes
1. Minka, Thomas P. (December 28, 2000). "Old and New Matrix Algebra Useful for Statistics". MIT Media Lab note (1997; revised 12/00). Retrieved 5 February 2016.
2. Magnus, Jan R.; Neudecker, Heinz (1999). Matrix Differential Calculus with Applications in Statistics and Econometrics (Revised ed.). New York: John Wiley & Sons. pp. 171-173. ISBN 9780471986331.
3. Felippa, Carlos A. "Appendix D, Linear Algebra: Determinants, Inverses, Rank". ASEN 5007: Introduction To Finite Element Methods (PDF). Boulder, Colorado: University of Colorado. Retrieved 5 February 2016. Uses the Hessian (transpose to Jacobian) definition of vector and matrix derivatives.
4. Here, $\mathbf{0}$ refers to a column vector of all 0's, of size $n$, where $n$ is the length of $\mathbf{x}$.
5. Petersen, Kaare Brandt; Pedersen, Michael Syskind. The Matrix Cookbook (PDF). Archived from the original on 2 March 2010. Retrieved 5 February 2016. This book uses a mixed layout, i.e. by $\mathbf{Y}$ in $\frac{\partial \mathbf{Y}}{\partial x}$, by $\mathbf{X}$ in $\frac{\partial y}{\partial \mathbf{X}}$.
6. Here, $\mathbf{0}$ refers to a matrix of all 0's, of the same shape as $\mathbf{X}$.
7. Duchi, John C. "Properties of the Trace and Matrix Derivatives" (PDF). Stanford University. Retrieved 5 February 2016.
8. See Determinant#Derivative for the derivation.
9. The constant $a$ disappears in the result. This is intentional. In general, $\frac{d \ln(au)}{dx} = \frac{1}{au}\frac{d(au)}{dx} = \frac{1}{au}\,a\frac{du}{dx} = \frac{1}{u}\frac{du}{dx} = \frac{d \ln u}{dx}$.
Further reading
Lax, Peter D. (2007). "9. Calculus of Vector- and Matrix-Valued Functions". Linear Algebra and Its Applications (2nd ed.). Hoboken, N.J.: Wiley-Interscience. ISBN 9780471751564.
External links
Matrix Reference Manual (http://www.psi.toronto.edu/matrix/calculus.html), Mike Brookes, Imperial College London.
Matrix Differentiation (and some other stuff) (http://www.atmos.washington.edu/~dennis/MatrixCalculus.pdf), Randal J. Barnes, Department of Civil Engineering, University of Minnesota.
Notes on Matrix Calculus (http://www4.ncsu.edu/~pfackler/MatCalc.pdf), Paul L. Fackler, North Carolina State University.
Matrix Differential Calculus (https://wiki.inf.ed.ac.uk/twiki/pub/CSTR/ListenSemester1_2006_7/slide.pdf) (slide presentation), Zhang Le, University of Edinburgh.
Introduction to Vector and Matrix Differentiation (http://www.econ.ku.dk/metrics/Econometrics2_05_II/LectureNotes/matrixdiff.pdf) (notes on matrix differentiation, in the context of Econometrics), Heino Bohn Nielsen.
A note on differentiating matrices (http://mpra.ub.unimuenchen.de/1239/1/MPRA_paper_1239.pdf) (notes on matrix differentiation), Pawel Koval, from Munich Personal RePEc Archive.
Vector/Matrix Calculus (http://www.personal.rdg.ac.uk/~sis01xh/teaching/CY4C9/ANN3.pdf) More notes on matrix differentiation.
Matrix Identities (http://www.cs.nyu.edu/~roweis/notes/matrixid.pdf) (notes on matrix differentiation), Sam Roweis.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Matrix_calculus&oldid=735127910"
Categories: Matrix theory, Linear algebra, Multivariable calculus
This page was last modified on 18 August 2016, at 19:34.
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.