Quick Linear Algebra for Econometrics


Rob Letzler

January 19, 2007

Thanks to Professor Jill Pipher for teaching me linear algebra and to John Leen for providing

useful input on this talk. This document is prepared in LaTeX, which I learned from Leslie Lamport's

LaTeX: A Document Preparation System book.

Linear Algebra:

• Provides natural, convenient tools to write down and characterize operations involving

adding or multiplying lots of numbers.

• Is not difficult. There are no deep new ideas here.

• Involves a considerable amount of mechanics, jargon, and notation.

This document provides a quick overview of linear algebra. There are plenty of linear

algebra textbooks and appendices in econometrics books that provide more details; show you

how to do proofs involving linear algebra; provide plenty of practice problems; and present a

lot of topics that you do not really need to do the kind of statistics that social science grad

students typically deal with. (I can give you the lay of the land in 90 minutes, but you have

to practice before it will get into your bloodstream.) Howard Anton’s Elementary Linear

Algebra books are particularly good. William Greene’s Econometric Analysis textbook has

an appendix that covers linear algebra. It is very terse and provides few examples and little

context about why things are important.

2.1 Motivation: Regression Analysis

Suppose we want to use a dataset to analyze how public action affects outcomes.

For example, we might be wondering whether to give unemployed people more training

and want to estimate the relationship between the number of job offers the person gets (yi )

and the person’s training and age:


yi = α + β1 × (number of hours of training takeni) + β2 × (agei) + εi

Similarly, we might wonder whether to require safety equipment on cars and want

to estimate the relationship between the probability of driver injury yi and a characteristic

of the car:

yi = α + β1 × (number of airbags on cari) + εi

In other words, this is a relationship between

• the list of outcomes, yi, one for each of the i observations,

• the list of data values, xij, one for each observation i and variable j, and

• the list of coefficients, βj, for each of the j variables that we are trying to solve for.

Linear algebra lets us express this relationship as: ~y = Xβ~ where ~y and β~ are vectors

(1-dimensional lists of numbers) and X is a matrix (a two-dimensional array of numbers).

In linear algebra, we call a single number, like 5, a scalar.

(Throughout this document, I will write vectors in lower case with arrows above them,
matrices in uppercase bold, and scalars in lower case. Often I use an (optional) dot

(·) to indicate multiplication. Books vary slightly in the notation they use for vectors and

matrices.)

[ y1 ]   [ x11  x12 ]
[ y2 ] = [ x21  x22 ] · [ α  ]
[ y3 ]   [ x31  x32 ]   [ β1 ]

Notice that we write an entry in matrix X as xrow#,column#, with the first subscript being the
row.
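As a quick numerical sketch (using NumPy, which is not part of these notes), we can stack all the regression equations into one matrix product ~y = Xβ~. The data values here are made up for illustration:

```python
import numpy as np

# Hypothetical job-training data: one row per person,
# columns are [1 (for the intercept α), hours of training, age].
X = np.array([[1.0, 40.0, 25.0],
              [1.0, 10.0, 30.0],
              [1.0, 25.0, 22.0]])

# A candidate coefficient vector [alpha, beta1, beta2] (also made up).
beta = np.array([0.5, 0.1, 0.02])

# y = X beta evaluates every person's regression equation at once.
y = X @ beta
print(y)  # [5.   2.1  3.44]
```

One matrix product replaces writing out the α + β1·(training) + β2·(age) equation once per person.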

Suppose we want to solve the system of equations:

x + 2y = 4

2x + y = 5

(How would you solve this? The matrix approach to solving it – called "Gaussian Elimination,"
which yields a "triangular matrix" – is almost exactly the same as the grade-school algebra
approach of adding and subtracting the equations from each other in order to eliminate all
but one variable from one equation.)
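We can check the solution numerically (a NumPy sketch, not part of the original notes; `np.linalg.solve` uses a refined form of Gaussian elimination):

```python
import numpy as np

# The system  x + 2y = 4,  2x + y = 5  written as A v = b.
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
b = np.array([4.0, 5.0])

# Solve by (LU-factorized) Gaussian elimination.
v = np.linalg.solve(A, b)
print(v)  # [2. 1.], i.e. x = 2, y = 1
```

Plugging back in: 2 + 2·1 = 4 and 2·2 + 1 = 5, as required.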

Graphically, we want to know where the two lines intersect.

We can write the system of equations as matrices and vectors:

[ 1  2 ]   [ x ]   [ 4 ]
[ 2  1 ] · [ y ] = [ 5 ]

We can express the same question as above as the challenge of finding scaling factors x and

y that stretch or shrink two vectors so that their sum is equal to a third vector:

    [ 1 ]       [ 2 ]   [ 4 ]
x · [ 2 ] + y · [ 1 ] = [ 5 ]

[Figure: the vectors (1,2) and (2,1), scaled and summed to reach (4,5).]
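The same solution (x = 2, y = 1) from above can be read as a linear combination of the two columns; a quick NumPy check (not part of the original notes):

```python
import numpy as np

col1 = np.array([1.0, 2.0])
col2 = np.array([2.0, 1.0])

# Scaling factors x = 2 and y = 1 stretch the columns
# so that their sum lands exactly on the target vector (4, 5).
combo = 2 * col1 + 1 * col2
print(combo)  # [4. 5.]
```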

Thinking of matrix multiplication as an operation that can stretch, rotate, and potentially

flatten vectors will be a very natural way to understand whether a matrix is “well behaved”

– which I talk about below.

• Can we "add" vectors? Can we "add" matrices? Answer: yes. If two matrices (or vectors)
are of the same size, addition works by adding the corresponding entries in the
"obvious" way. If the vectors or matrices are different sizes, they cannot be added. Here
is an example:

[ 1 ]   [ 3 ]   [ 4 ]
[ 2 ] + [ 4 ] = [ 6 ]

• Can we multiply matrices and vectors by scalars? Answer: yes, it happens in the
"obvious" way: multiply every entry in the vector or matrix by the scalar value.

    [ 3 ]   [ 6  ]
2 · [ 5 ] = [ 10 ]

• Can we “multiply” matrices and vectors? What are the properties of matrix multipli-

cation? Answer: Yes. The rest of this document discusses matrix multiplication and

its pitfalls.
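Both entrywise operations above are one-liners numerically (a NumPy sketch, not part of the original notes):

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])

# Vector addition: corresponding entries are added.
print(a + b)                      # [4. 6.]

# Scalar multiplication: every entry is scaled.
print(2 * np.array([3.0, 5.0]))   # [ 6. 10.]
```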


3 Matrix Multiplication

Here is a simple example of matrix multiplication:

• Notice that we multiply row times column – in other words, we multiply rows in the

first matrix times the columns in the second matrix.

• Each multiplication of a row times a column maps to a scalar in the result. Specifically,

we multiply the first (nth) entry in the row times the first (nth) entry in the column,

which yields n pairs of products – which we then add up. The example below illustrates

this:

[ 3  4 ]   [ 1 ]   [ 3×1 + 4×2 ]   [ 11 ]
[ 5  6 ] · [ 2 ] = [ 5×1 + 6×2 ] = [ 17 ]
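NumPy's `@` operator implements exactly this row-times-column rule (a sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[3.0, 4.0],
              [5.0, 6.0]])
v = np.array([1.0, 2.0])

# Each entry of the result is one row of A dotted with v:
#   row 1: 3*1 + 4*2 = 11,  row 2: 5*1 + 6*2 = 17.
print(A @ v)  # [11. 17.]
```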

As we move right (down) in the columns of the second (first) matrix, we move right

(down) in the results matrix. Here is a simple, abstract example involving multiplying a

matrix consisting of two rows – r~1 and r~2 – by a matrix containing two columns c~1 and c~2 .

[ – r1 – ]   [ |    |  ]   [ r1·c1   r1·c2 ]
[ – r2 – ] · [ c1   c2 ] = [ r2·c1   r2·c2 ]
             [ |    |  ]

We can only multiply two matrices if the number of columns in the first matrix matches
the number of rows in the second matrix.

Transposition, which exchanges a matrix's rows and columns, enables us to multiply a
vector by itself.

An OLS regression seeks to minimize the sum of the squared errors in its prediction. It
is easy to write down the errors as a vector:

     [ e1  ]
     [ e2  ]
~e = [ ... ]
     [ en  ]

And we’d like to be able to multiply this vector by itself to get the sum of the squared
errors (Σ ei²), but just multiplying the vector times itself does not work because the vectors
are of the wrong sizes – we cannot match the single entry in each row to the n entries in each
column:

          [ e1  ]   [ e1  ]
          [ e2  ]   [ e2  ]
~e · ~e = [ ... ] · [ ... ]
          [ en  ]   [ en  ]


Solution: Taking the transpose of the error vector, denoted either ~e' or ~eT, solves the
problem and yields:

                               [ e1  ]
                               [ e2  ]
~e'~e = [ e1  e2  ...  en ]  · [ ... ] = Σ ei²
                               [ en  ]
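Numerically, the transpose trick is just an inner product (a NumPy sketch with made-up residuals, not part of the original notes):

```python
import numpy as np

e = np.array([1.0, -2.0, 3.0])   # hypothetical residual vector

# e' e: a (1 x n) row times an (n x 1) column gives a scalar,
# the sum of squared errors: 1 + 4 + 9 = 14.
sse = e.T @ e
print(sse)  # 14.0
```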

Taking the transpose:

• exchanges the row and column index, so entry xij in matrix X is entry xji in X'

• reflects items across the diagonal in a square matrix. (The diagonal that runs from the

top left to the bottom right is very important in linear algebra. When we talk about

a diagonal, we mean that one – we generally ignore the other diagonal.)

• converts column vectors into row vectors – and row vectors into column vectors.

Multiplying matrices is different from multiplying individual numbers because order matters.

Consider the following multiplications that take the same two matrices, A and B, but

get different results depending on their order.

[ 0  1 ]   [ 1  0 ]   [ 3  0 ]
[ 0  0 ] · [ 3  0 ] = [ 0  0 ]

[ 1  0 ]   [ 0  1 ]   [ 0  1 ]
[ 3  0 ] · [ 0  0 ] = [ 0  3 ]
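The non-commutativity is easy to see numerically (a NumPy sketch, not part of the original notes):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[1.0, 0.0],
              [3.0, 0.0]])

# Same two matrices, different order, different results.
print(A @ B)  # [[3. 0.]
              #  [0. 0.]]
print(B @ A)  # [[0. 1.]
              #  [0. 3.]]
```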

The discussion above looked at X~b, which is very well behaved. We cannot calculate the
product of the matrices in the opposite order, ~bX, since the two items are the wrong shape
to be multiplied together:

[ x ]   [ 1  2 ]
[ y ] · [ 2  1 ]

4 (Multiplicative) Inverses

Econometrics is the story of the valiant search for the correct (vector of) coefficients, β~,

that show how X affects Y. There’s a dramatic moment in a matrix-based presentation of


econometrics where we have almost got the vector β~ that we want. The story so far shows

that getting the best estimate implies that:[1]

Aβ~ = X'~y

So we can discover the β~ vector if we can just get rid of that pesky A matrix.

Your algebra 1 instincts are (almost) right here about what we need to do. If A were a
single number, we would divide both sides by A – or, equivalently, we would multiply both
sides of the equation by A's multiplicative inverse, 1/A. With a matrix, we will do the same
thing – we will multiply both sides of the equation by the multiplicative inverse of matrix
A, which we denote as A−1 and call "A-inverse" when we talk.

Notice that multiplying scalar A by its multiplicative inverse, 1/A, yields the multiplicative
identity, 1: (1/A) × A = 1. Multiplying a matrix by its inverse yields the identity matrix[2], I:
A−1A = I. While in general order matters in matrix multiplication, it turns out that the
left and right inverses are the same, so A−1A = AA−1 = I.
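A quick numerical illustration (using NumPy, not part of the original notes): multiplying by the inverse on either side recovers the identity matrix.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
A_inv = np.linalg.inv(A)

# Left and right inverses agree: both products give the identity matrix I.
print(np.allclose(A_inv @ A, np.eye(2)))  # True
print(np.allclose(A @ A_inv, np.eye(2)))  # True
```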

Not every matrix is invertible.

Consider the following regression:[3]

Suppose that each year of education is exactly two semesters of education. This yields
an example data matrix X and outcome vector ~y as follows:

    [ 3   6  ]        [ 6  ]
X = [ 1   2  ]   ~y = [ 2  ]
    [ 5   10 ]        [ 10 ]

[1] For purposes of keeping the presentation simple, I gloss over the fact that the A matrix is actually the
matrix we get by transposing the data matrix and multiplying by itself: (X'X) = A.

[2] The identity matrix has ones on the diagonal that runs from the top left to the bottom right. All its
other entries are zero. Here is a two-dimensional identity matrix:

I = [ 1  0 ]
    [ 0  1 ]

[3] Notice that this regression asks an impossible-to-answer question: it demands that we separate the
effects of education from the effects of education measured in exactly the same way. If you were to feed it
to Stata, Stata would drop one of the two education variables before attempting to answer.


We want to solve for the β~ in the equations of the form Xβ~ = ~y below.[4] The following
calculations show that there are two β~ vectors that solve Xβ~ = ~y equally well.[5]

[ 3   6  ]   [ 0 ]   [ 6  ]
[ 1   2  ] · [ 1 ] = [ 2  ]
[ 5   10 ]           [ 10 ]

[ 3   6  ]   [ 2 ]   [ 6  ]
[ 1   2  ] · [ 0 ] = [ 2  ]
[ 5   10 ]           [ 10 ]

When we multiply this matrix by any two-dimensional vector – which can represent any
point in two-dimensional space – we get points on a single line of the form y = (1/3)x = (1/5)z.
The matrix maps many vectors to the same ~y vector. Thus, we cannot tell which of the
many ways to get to this result we actually took, so this matrix has no inverse.

If one column of the matrix is a linear combination (weighted sum) of other columns in

the matrix, it is not “linearly independent” of them. (In our singular matrix in this section,

the right column is exactly twice the left column.) That means we can capture an equal

amount of information in fewer columns and that the matrix is singular. The rank of the

matrix is the number of linearly independent rows. A matrix is said to have full rank if all

of its columns are linearly independent. The matrix in the example above is not of full rank.

A square matrix of full rank is invertible. All other matrices are singular.
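We can confirm the rank deficiency numerically (a NumPy sketch, not part of the original notes), using the example data matrix from this section:

```python
import numpy as np

# The example data matrix: the right column is exactly twice the left,
# so the columns are not linearly independent.
X = np.array([[3.0, 6.0],
              [1.0, 2.0],
              [5.0, 10.0]])

print(np.linalg.matrix_rank(X))  # 1, not the full rank of 2

# X'X is therefore singular, and inverting it fails.
try:
    np.linalg.inv(X.T @ X)
except np.linalg.LinAlgError:
    print("X'X is singular")
```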

5 Good and Bad Matrices

A linear algebra book spends a great deal of time exploring the implications of being a good

or bad matrix. Here are some of the punchlines. Some but not all of these results have direct

implications for econometrics.

[4] I have simplified this example by presenting an example where we can find β~ so that the regression
line passes through every data point exactly and all the residuals are zero. If this were a real econometrics
problem where there is no perfect regression line, the fact that the data matrix is not of full rank, so X'X
is singular, means that there would be more than one regression line that minimized the residuals.

[5] There are in fact an infinite family of β~'s that solve this set of equations, which is a real problem when
econometrics aspires to provide a single, unique solution.


Invertible (Good) Matrices vs. Singular (Bad) Matrices:

• Invertible: multiplying a vector by an invertible matrix rotates and stretches the vector.
  Singular: multiplying a vector by a singular matrix rotates, stretches, and flattens the
  vector. In other words, the multiplication destroys information.

• Invertible: multiplication maps each input vector (matrix) to a unique output vector
  (matrix).
  Singular: multiplication maps many different inputs to each output.

• Invertible: a unique inverse A−1 exists; multiplying by A−1 undoes the effects of
  multiplying by A.
  Singular: has no inverse.

• Invertible: is square. (Full rank rectangular matrices are somewhat well behaved; and
  if A is a full rank rectangular matrix, then A'A is a well behaved invertible square
  matrix.)
  Singular: square or rectangular.

• Invertible: maps only the zero vector (~0) to ~0.
  Singular: maps an infinite family of nonzero vectors to ~0. This family is called its
  null space.

• Invertible: is a basis for R^N and spans R^N (i.e., for every ~y ∈ R^N there exists an
  ~x such that A~x = ~y).
  Singular: is NOT a basis for R^N; spans some space, but it is smaller than R^N.

• Invertible: is of full rank; i.e., there is no way to write down a matrix that contains
  fewer vectors in its narrowest dimension but contains as much information.
  Singular: is not of full rank; at least one of the vectors in the matrix can be expressed
  as a linear combination of the others.

• Invertible: has N (potentially imaginary) nonzero eigenvalues.
  Singular: at least one eigenvalue is zero; the corresponding eigenvector is in its
  null space.

6 Doing Statistics with Linear Algebra

• A database of right hand side variables with one entry per observation can be a matrix.

We typically call it the X matrix.

• Vectors are natural ways to write down the outcomes Y~ and the coefficients we are

solving for, β~

• The sum of squares is just the dot product of two vectors: Σ xi yi = ~x' · ~y and
Σ xi xi = ~x' · ~x


• Further, we can compute the OLS β~ through matrix multiplication: β~ = (X0 X)−1 X0 ~y
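The OLS formula can be computed directly (a NumPy sketch with a made-up, full-rank dataset, not part of the original notes):

```python
import numpy as np

# A small hypothetical dataset: an intercept column plus one regressor.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([2.0, 3.0, 5.0])

# The OLS formula from the text: beta = (X'X)^{-1} X' y.
beta = np.linalg.inv(X.T @ X) @ X.T @ y
print(beta)  # approximately [0.333, 1.5]
```

(In practice, `np.linalg.lstsq` is preferred over forming the inverse explicitly, but the formula above mirrors the text.)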

Idempotent matrices are useful statistically. A matrix A is idempotent if multiplying it by
itself does not affect its value, i.e. AA = A² = A.

We can calculate the sum of squared deviations from the mean of a vector, ~x – the key
ingredient of its variance and standard deviation – using the following matrix multiplication:

      [ (n−1)/n   −1/n     ...   −1/n    ]
      [ −1/n      (n−1)/n  ...   −1/n    ]
~x' · [ ...       ...      ...   ...     ] · ~x
      [ −1/n      −1/n     ...   (n−1)/n ]
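A numerical sketch (using NumPy, not part of the original notes), assuming the centering matrix has (n−1)/n on the diagonal and −1/n off it; it is also idempotent, tying back to the definition above:

```python
import numpy as np

n = 4
x = np.array([1.0, 2.0, 3.0, 6.0])   # made-up data, mean 3

# Centering matrix M: (n-1)/n on the diagonal, -1/n off the diagonal.
M = np.eye(n) - np.ones((n, n)) / n

# x' M x equals the sum of squared deviations from the mean:
# (-2)^2 + (-1)^2 + 0^2 + 3^2 = 14.
print(x @ M @ x)              # 14.0

# M is idempotent: demeaning already-demeaned data changes nothing.
print(np.allclose(M @ M, M))  # True
```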

We can get a vector containing the average value in ~x through the following multiplication:

      [ 1/n  1/n  ...  1/n ]
      [ 1/n  1/n  ...  1/n ]
~x' · [ ...  ...  ...  ... ]
      [ 1/n  1/n  ...  1/n ]
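The averaging matrix can be checked the same way (a NumPy sketch, not part of the original notes):

```python
import numpy as np

n = 4
x = np.array([1.0, 2.0, 3.0, 6.0])   # made-up data, mean 3

# A matrix with every entry 1/n: multiplying by it replaces
# each entry with the average of x.
J_over_n = np.ones((n, n)) / n
print(x @ J_over_n)  # [3. 3. 3. 3.]
```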
