
Linear Algebra for Data Science (DataCamp)

Ch. 1 - Introduction to Linear Algebra


Motivations
[Video]

Creating Vectors in R
# Creating three 3's and four 4's, respectively
rep(3, 3)

## [1] 3 3 3

rep(4, 4)

## [1] 4 4 4 4

# Creating a vector with the first three even numbers and the first three odd numbers
seq(2, 6, by = 2)

## [1] 2 4 6

seq(1, 5, by = 2)

## [1] 1 3 5

# Re-creating the previous four vectors using the 'c' command


c(3, 3, 3)

## [1] 3 3 3

c(4, 4, 4, 4)

## [1] 4 4 4 4

c(2, 4, 6)

## [1] 2 4 6

c(1, 3, 5)

## [1] 1 3 5

The Algebra of Vectors
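
The vectors x, y, and z used below are never created in these notes. Definitions consistent with the printed output would be:

# Hypothetical definitions, inferred from the results that follow
x <- 1:7        # 1 2 3 4 5 6 7
y <- 2 * x      # 2 4 6 8 10 12 14
z <- c(1, 1, 2)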


# Add x to y and print
print(x + y)

## [1] 3 6 9 12 15 18 21

# Multiply z by 2 and print


print(2*z)

## [1] 2 2 4

# Multiply x and y by each other and print


print(x*y)

## [1] 2 8 18 32 50 72 98

# Add x to z, if possible, and print


print(x + z)


## Warning in x + z: longer object length is not a multiple of shorter object
## length

## [1] 2 3 5 5 6 8 8

Creating Matrices in R
# Create a matrix of all 1's and all 2's that are 2 by 3 and 3 by 2, respectively
matrix(1, nrow = 2, ncol = 3)

## [,1] [,2] [,3]


## [1,] 1 1 1
## [2,] 1 1 1

print(matrix(2, nrow = 3, ncol = 2))

## [,1] [,2]
## [1,] 2 2
## [2,] 2 2
## [3,] 2 2

# Create a matrix, changing the byrow designation.


B <- matrix(c(1, 2, 3, 2), nrow = 2, ncol = 2, byrow = FALSE)
B <- matrix(c(1, 2, 3, 2), nrow = 2, ncol = 2, byrow = TRUE)
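
The matrix A added below is also not created in these notes. Given the byrow = TRUE version of B, a definition consistent with the printed sum is the 2 by 2 matrix of all 1's:

# Hypothetical definition, inferred from the sum printed below
A <- matrix(1, nrow = 2, ncol = 2)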

# Add A to the previously-created matrix


A + B

## [,1] [,2]
## [1,] 2 3
## [2,] 4 3

Matrix-Vector Operations
[Video]

Matrix-Vector Compatibility
Consider the matrix A created by the R code:

A = matrix(c(1, 2, 3, -1, 0, 3), nrow = 2, ncol = 3, byrow = TRUE)

Which of the following vectors b can be multiplied by A to create Ab?

[*] b = c(1, 1, -1)


b = c(-2, 2)
b = c(2, -1, 3, 4, 7)
b = c(-1, 2, 1, 3)
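
The rule: Ab is defined only when the length of b equals the number of columns of A. A quick console check of the compatible choice:

A <- matrix(c(1, 2, 3, -1, 0, 3), nrow = 2, ncol = 3, byrow = TRUE)
b <- c(1, 1, -1)   # length 3 matches the 3 columns of A
A %*% b            # a 2 x 1 result: (0, -4)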

Matrix Multiplication as a Transformation


# Multiply A by b
A%*%b

## [,1]
## [1,] 4
## [2,] 1

# Multiply B by b
B%*%b

## [,1]
## [1,] 0.000000
## [2,] 1.666667

Reflections
# Multiply A by b
A%*%b

## [,1]
## [1,] -2
## [2,] 1


# Multiply B by b
B%*%b

## [,1]
## [1,] 2
## [2,] -1

# Multiply C by b
C%*%b

## [,1]
## [1,] -8
## [2,] -2
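
The matrices A, B, and C (and the vector b) are not shown being created. One set of definitions consistent with the printed results, assuming b = c(2, 1), is:

# Hypothetical definitions, consistent with the output above
b <- c(2, 1)
A <- diag(c(-1, 1))    # reflection across the y-axis: (2, 1) -> (-2, 1)
B <- diag(c(1, -1))    # reflection across the x-axis: (2, 1) -> (2, -1)
C <- diag(c(-4, -2))   # reflection through the origin plus scaling: (2, 1) -> (-8, -2)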

Matrix-Matrix Calculations
[Video]

Matrix Multiplication Compatibility


The two matrices generated by the R code below are (small) examples of the weight matrices used in neural network models to weigh datasets for prediction:

A = matrix(c(1, 3, 2, -1, 0, 1), nrow = 2, ncol = 3)

B = matrix(c(-1, 1, 2, -3), nrow = 2, ncol = 2)

Oftentimes these collections of weights are applied iteratively, using successive applications of matrix multiplication.

Are A and B compatible in any way in terms of matrix multiplication? Use A%*%B and B%*%A in the console to check. What are the dimensions of
the resulting matrix?

No, these matrices are not compatible.


[*] Yes, the multiplication BA results in a 2 by 3 matrix.
Yes, the multiplication AB results in a 2 by 3 matrix.
Yes, the multiplication BA results in a 3 by 2 matrix.
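
Checking in the console confirms that only one order is conformable:

A <- matrix(c(1, 3, 2, -1, 0, 1), nrow = 2, ncol = 3)
B <- matrix(c(-1, 1, 2, -3), nrow = 2, ncol = 2)
dim(B %*% A)   # 2 3: a (2 x 2) times a (2 x 3) gives a (2 x 3) matrix
# A %*% B fails: A has 3 columns but B has only 2 rows (non-conformable arguments)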

Matrix Multiplication - Order Matters


# Multiply A by B
A%*%B

## [,1] [,2]
## [1,] 0.7071068 0.7071068
## [2,] 0.7071068 -0.7071068

# Multiply A on the right of B


B%*%A

## [,1] [,2]
## [1,] 0.7071068 -0.7071068
## [2,] -0.7071068 -0.7071068

# Multiply the product of A and B by the vector b


A%*%B%*%b

## [,1]
## [1,] 1.414214
## [2,] 0.000000

# Multiply A on the right of B, and then by the vector b


B%*%A%*%b

## [,1]
## [1,] 0.000000
## [2,] -1.414214
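
A and B are not defined in these notes, but every entry above is ±1/√2 ≈ ±0.7071068, consistent with A being a 45-degree rotation and B a reflection across the x-axis, applied to b = c(1, 1):

# Hypothetical definitions, consistent with the output above
theta <- pi / 4
A <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)  # rotate by 45 degrees
B <- matrix(c(1, 0, 0, -1), nrow = 2)                                      # reflect across the x-axis
b <- c(1, 1)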

Intro to The Matrix Inverse


# Take the inverse of the 2 by 2 identity matrix
solve(diag(2))

## [,1] [,2]
## [1,] 1 0
## [2,] 0 1


# Take the inverse of the matrix A


Ainv <- solve(A)

# Multiply A inverse by A
Ainv%*%A

## [,1] [,2]
## [1,] 1 0
## [2,] 0 1

# Multiply A by its inverse


A%*%Ainv

## [,1] [,2]
## [1,] 1 0
## [2,] 0 1
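
The matrix A inverted above is likewise not shown being created; any invertible square matrix behaves the same way. A minimal sketch with a matrix chosen purely for illustration:

A <- matrix(c(2, 1, 1, 3), nrow = 2)  # invertible: its determinant is 5
Ainv <- solve(A)
round(Ainv %*% A)                     # the identity, up to floating-point rounding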

Ch. 2 - Matrix-Vector Equations


Motivation for Solving Matrix-Vector Equations
[Video]

The Meaning of Ax = b
A great deal of applied mathematics and statistics, as well as data science, ends in a matrix-vector equation of the form:

Ax = b

Which of the following is the most correct way to describe what solving this equation for x is trying to accomplish?

Finding the vector x that, upon some mysterious transformation, makes b.


Finding the vector x that is a linear combination of the elements of b.
[*] To produce b using a linear combination of the columns of A.
To produce b using a linear combination of the rows of A.
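
In other words, Ax is exactly the linear combination x1*a1 + x2*a2 + ... of the columns of A. A small check:

A <- matrix(c(1, 2, 3, 4), nrow = 2)  # columns a1 = (1, 2), a2 = (3, 4)
x <- c(2, -1)
A %*% x                # (-1, 0)
2*A[, 1] - 1*A[, 2]    # the same linear combination of the columns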

Exploring WNBA Data


# Print the Massey Matrix M
print(M)

## Atlanta Chicago Connecticut Dallas Indiana Los.Angeles Minnesota New.York


## 1 33 -4 -2 -3 -3 -3 -3 -3
## 2 -4 33 -3 -3 -3 -3 -2 -3
## 3 -2 -3 34 -3 -3 -3 -3 -4
## 4 -3 -3 -3 34 -3 -4 -3 -3
## 5 -3 -3 -3 -3 33 -3 -3 -3
## 6 -3 -3 -3 -4 -3 41 -8 -3
## 7 -3 -2 -3 -3 -3 -8 41 -3
## 8 -3 -3 -4 -3 -3 -3 -3 34
## 9 -3 -3 -4 -2 -3 -6 -4 -3
## 10 -3 -3 -3 -3 -3 -3 -3 -2
## 11 -3 -3 -3 -3 -2 -2 -3 -3
## 12 -3 -3 -3 -4 -4 -3 -6 -4
## Phoenix San.Antonio Seattle Washington
## 1 -3 -3 -3 -3
## 2 -3 -3 -3 -3
## 3 -4 -3 -3 -3
## 4 -2 -3 -3 -4
## 5 -3 -3 -2 -4
## 6 -6 -3 -2 -3
## 7 -4 -3 -3 -6
## 8 -3 -2 -3 -4
## 9 38 -3 -4 -3
## 10 -3 32 -4 -2
## 11 -4 -4 33 -3
## 12 -3 -2 -3 38

# Print the vector of point differentials f


print(f)


## Differential
## 1 -135
## 2 -171
## 3 152
## 4 -104
## 5 -308
## 6 292
## 7 420
## 8 83
## 9 -4
## 10 -213
## 11 -5
## 12 -7
## 13 0

# Find the sum of the first column of M


sum(M[, 1])

## [1] 0

# Find the sum of the vector f


sum(f)

## [1] 0

Matrix-Vector Equations - Some Theory


[Video]

Why is a Matrix Not Invertible?


For our WNBA Massey Matrix model, some adjustments need to be made for a solution to our rating problem to exist and be unique.

To see this, notice that the following code produces an error:

> print(M)
(the 12-by-12 Massey matrix shown above)

> solve(M)
Error in solve.default(M) : system is computationally singular:
  reciprocal condition number = 3.06615e-17

Which of the conditions does M explicitly violate in this case?

M is not a square matrix.


The determinant of M is zero.
The sum of each of the columns and rows is equal to zero.
[*] M does not have an inverse.

Understanding a Linear System’s Three Outcomes


In two dimensions, the solution structure of a system of two equations in two unknowns can be understood in a straightforward way via pictures,
with the two equations representing lines (this is why it’s called linear algebra) in the x-y (or x1-x2) plane. A solution is any point (x, y)
(or (x1, x2)) where the two lines intersect.

Which of the following three graphs is that of a linear system of two equations with two unknowns that has no solutions?

The first graph.


[*] The second graph.
The third graph.

Understanding the Massey Matrix


For our WNBA Massey Matrix model, some adjustments need to be made for a solution to our rating problem to exist and be unique.

This is because the matrix M (the 12-by-12 Massey matrix printed above) usually does not (computationally) have an inverse, as shown by the
error produced from running solve(M) in a previous exercise.

One way we can change this is to add a row of 1's on the bottom of the matrix M, a column of -1's to the far right of M, and a 0 to the bottom of
the vector of point differentials f.

What does that row of 1's represent in the setting of rating teams? In other words, what does the final equation stipulate?

Each team gets an equal rating.


[*] The ratings for the entire league add to zero.
The sum of the ratings for the entire league is positive.
The sum of the ratings for the entire league is negative.


Adjusting the Massey Matrix


# Add a row of 1's
M_2 <- rbind(M, rep(1, 12))

# Add a column of -1's


M_3 <- cbind(M_2, rep(-1, 13))

# Change the element in the lower-right corner of the matrix


M_3[13, 13] <- 1

# Print M_3
print(M_3)

## Atlanta Chicago Connecticut Dallas Indiana Los.Angeles Minnesota New.York


## 1 33 -4 -2 -3 -3 -3 -3 -3
## 2 -4 33 -3 -3 -3 -3 -2 -3
## 3 -2 -3 34 -3 -3 -3 -3 -4
## 4 -3 -3 -3 34 -3 -4 -3 -3
## 5 -3 -3 -3 -3 33 -3 -3 -3
## 6 -3 -3 -3 -4 -3 41 -8 -3
## 7 -3 -2 -3 -3 -3 -8 41 -3
## 8 -3 -3 -4 -3 -3 -3 -3 34
## 9 -3 -3 -4 -2 -3 -6 -4 -3
## 10 -3 -3 -3 -3 -3 -3 -3 -2
## 11 -3 -3 -3 -3 -2 -2 -3 -3
## 12 -3 -3 -3 -4 -4 -3 -6 -4
## 13 1 1 1 1 1 1 1 1
## Phoenix San.Antonio Seattle Washington rep(-1, 13)
## 1 -3 -3 -3 -3 -1
## 2 -3 -3 -3 -3 -1
## 3 -4 -3 -3 -3 -1
## 4 -2 -3 -3 -4 -1
## 5 -3 -3 -2 -4 -1
## 6 -6 -3 -2 -3 -1
## 7 -4 -3 -3 -6 -1
## 8 -3 -2 -3 -4 -1
## 9 38 -3 -4 -3 -1
## 10 -3 32 -4 -2 -1
## 11 -4 -4 33 -3 -1
## 12 -3 -2 -3 38 -1
## 13 1 1 1 1 1

Inverting the Massey Matrix


# Find the inverse of M
solve(M)


## [,1] [,2] [,3] [,4] [,5]


## Atlanta 0.032449804 0.005402927 0.003876665 0.004630004 0.004629590
## Chicago 0.005402927 0.032446789 0.004608094 0.004626913 0.004628272
## Connecticut 0.003876665 0.004608094 0.031714805 0.004613451 0.004629714
## Dallas 0.004630004 0.004626913 0.004613451 0.031707219 0.004649172
## Indiana 0.004629590 0.004628272 0.004629714 0.004649172 0.032447936
## Los.Angeles 0.004626242 0.004554829 0.004676789 0.005214940 0.004652111
## Minnesota 0.004611109 0.003985203 0.004651940 0.004727810 0.004678479
## New.York 0.004609212 0.004627729 0.005362761 0.004647832 0.004649262
## Phoenix 0.004610546 0.004608018 0.005295038 0.004013187 0.004613089
## San.Antonio 0.004630254 0.004631081 0.004608596 0.004609009 0.004587382
## Seattle 0.004629212 0.004631185 0.004646217 0.004595132 0.003854641
## Washington 0.004627769 0.004582295 0.004649264 0.005298666 0.005313685
## rep(-1, 13) -0.083333333 -0.083333333 -0.083333333 -0.083333333 -0.083333333
## [,6] [,7] [,8] [,9] [,10]
## Atlanta 0.004626242 0.004611109 0.004609212 0.004610546 0.004630254
## Chicago 0.004554829 0.003985203 0.004627729 0.004608018 0.004631081
## Connecticut 0.004676789 0.004651940 0.005362761 0.005295038 0.004608596
## Dallas 0.005214940 0.004727810 0.004647832 0.004013187 0.004609009
## Indiana 0.004652111 0.004678479 0.004649262 0.004613089 0.004587382
## Los.Angeles 0.027807608 0.007319076 0.004637275 0.006363490 0.004606288
## Minnesota 0.007319076 0.027810474 0.004677632 0.005388578 0.004578013
## New.York 0.004637275 0.004677632 0.031716432 0.004648253 0.003835528
## Phoenix 0.006363490 0.005388578 0.004648253 0.029212019 0.004646110
## San.Antonio 0.004606288 0.004578013 0.003835528 0.004646110 0.033267202
## Seattle 0.004032687 0.004573214 0.004607331 0.005265228 0.005427397
## Washington 0.004841998 0.006331805 0.005314087 0.004669776 0.003906474
## rep(-1, 13) -0.083333333 -0.083333333 -0.083333333 -0.083333333 -0.083333333
## [,11] [,12] [,13]
## Atlanta 0.004629212 0.004627769 8.333333e-02
## Chicago 0.004631185 0.004582295 8.333333e-02
## Connecticut 0.004646217 0.004649264 8.333333e-02
## Dallas 0.004595132 0.005298666 8.333333e-02
## Indiana 0.003854641 0.005313685 8.333333e-02
## Los.Angeles 0.004032687 0.004841998 8.333333e-02
## Minnesota 0.004573214 0.006331805 8.333333e-02
## New.York 0.004607331 0.005314087 8.333333e-02
## Phoenix 0.005265228 0.004669776 8.333333e-02
## San.Antonio 0.005427397 0.003906474 8.333333e-02
## Seattle 0.032485332 0.004585756 8.333333e-02
## Washington 0.004585756 0.029211757 8.333333e-02
## rep(-1, 13) -0.083333333 -0.083333333 2.220446e-16

Solving Matrix-Vector Equations


[Video]

An Analogy with Regular Algebra


As we saw in the video, solving matrix-vector equations is as simple as multiplying both sides of the equation by A’s inverse, A⁻¹, should it exist.
The analogy with solving linear equations like 5x = 7 is a good one.

If A⁻¹ doesn’t exist, this does not work. The equivalent analogy for linear equations would be a situation in which the coefficient in front of the x
were 0, which is the only real number that does not have an inverse. Which of the following does NOT analogize in this situation?

Dividing by zero is illegal, and is analogous to trying to invert a matrix with a zero determinant.
[*] All of the elements of a matrix must be zero for it to fail to have an inverse.
The equation 0x=b has zero solutions (if b≠0).
The equation 0x=b has infinitely many solutions (if b=0).
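
To make the analogy concrete, a short sketch (with an invertible matrix chosen purely for illustration):

5^-1 * 7              # solving 5x = 7 by multiplying by the inverse of 5
A <- matrix(c(2, 0, 0, 4), nrow = 2)
b <- c(6, 8)
solve(A) %*% b        # x = A^-1 b
solve(A, b)           # equivalent, and numerically preferable: solve Ax = b directly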

2017 WNBA Ratings!


# Solve for r and rename column
r <- solve(M)%*%f
colnames(r) <- "Rating"

# Print r
print(r)

## Rating
## Atlanta -4.012938e+00
## Chicago -5.156260e+00
## Connecticut 4.309525e+00
## Dallas -2.608129e+00
## Indiana -8.532958e+00
## Los.Angeles 7.850327e+00
## Minnesota 1.061241e+01
## New.York 2.541565e+00
## Phoenix 8.979110e-01
## San.Antonio -6.181574e+00
## Seattle -2.666953e-01
## Washington 5.468121e-01
## WNBA 1.043610e-14


Who Was the Champion?


The dplyr package has been loaded for you, as has the solution to the previous exercise. The arrange() function in dplyr allows you to
re-order rows based on the values of a column.

In the previous exercise, you rated the teams at the end of the 2017 WNBA season using the solution to a matrix-vector equation.

Using the syntax

arrange(r, -Rating)

we can see which team was the best in the WNBA in 2017 (the negative (“-”) sign in front of the ordering variable (“Rating”) puts the values
in descending order, as opposed to the ascending order produced by just “Rating”).

Which team was the best?

# arrange(r, -Rating)

San Antonio
[*] Minnesota
Los Angeles
Phoenix
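
Note that arrange() expects a data frame, while the r created above is a matrix. One way to run the sort (an assumption about how the exercise environment was set up) is to convert first:

library(dplyr)
r_df <- data.frame(Team = rownames(r), Rating = r[, "Rating"])  # matrix -> data frame
arrange(r_df, -Rating)  # Minnesota tops the list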

Other Considerations for Matrix-Vector Equations


[Video]

Other Methods for Matrix-Vector Equations


Which of the following was NOT proposed as a method to solve matrix-vector equations with non-square matrices?

[*] Euler’s method


Least squares
Singular Value Decomposition
Row reduction

Alternatives to the Regular Matrix Inverse


# Print M
print(M)

## Atlanta Chicago Connecticut Dallas Indiana Los.Angeles Minnesota New.York


## [1,] 33 -4 -2 -3 -3 -3 -3 -3
## [2,] -4 33 -3 -3 -3 -3 -2 -3
## [3,] -2 -3 34 -3 -3 -3 -3 -4
## [4,] -3 -3 -3 34 -3 -4 -3 -3
## [5,] -3 -3 -3 -3 33 -3 -3 -3
## [6,] -3 -3 -3 -4 -3 41 -8 -3
## [7,] -3 -2 -3 -3 -3 -8 41 -3
## [8,] -3 -3 -4 -3 -3 -3 -3 34
## [9,] -3 -3 -4 -2 -3 -6 -4 -3
## [10,] -3 -3 -3 -3 -3 -3 -3 -2
## [11,] -3 -3 -3 -3 -2 -2 -3 -3
## [12,] -3 -3 -3 -4 -4 -3 -6 -4
## [13,] 1 1 1 1 1 1 1 1
## Phoenix San.Antonio Seattle Washington WNBA
## [1,] -3 -3 -3 -3 -1
## [2,] -3 -3 -3 -3 -1
## [3,] -4 -3 -3 -3 -1
## [4,] -2 -3 -3 -4 -1
## [5,] -3 -3 -2 -4 -1
## [6,] -6 -3 -2 -3 -1
## [7,] -4 -3 -3 -6 -1
## [8,] -3 -2 -3 -4 -1
## [9,] 38 -3 -4 -3 -1
## [10,] -3 32 -4 -2 -1
## [11,] -4 -4 33 -3 -1
## [12,] -3 -2 -3 38 -1
## [13,] 1 1 1 1 1

# Find the rating vector the conventional way


r <- solve(M)%*%f
colnames(r) <- "Rating"
print(r)


## Rating
## Atlanta -4.012938e+00
## Chicago -5.156260e+00
## Connecticut 4.309525e+00
## Dallas -2.608129e+00
## Indiana -8.532958e+00
## Los.Angeles 7.850327e+00
## Minnesota 1.061241e+01
## New.York 2.541565e+00
## Phoenix 8.979110e-01
## San.Antonio -6.181574e+00
## Seattle -2.666953e-01
## Washington 5.468121e-01
## WNBA 1.043610e-14

# Find the rating vector using ginv


library(MASS) # ginv() is the Moore-Penrose generalized inverse from the MASS package
r <- ginv(M)%*%f
colnames(r) <- "Rating"
print(r)

## Rating
## [1,] -4.012938e+00
## [2,] -5.156260e+00
## [3,] 4.309525e+00
## [4,] -2.608129e+00
## [5,] -8.532958e+00
## [6,] 7.850327e+00
## [7,] 1.061241e+01
## [8,] 2.541565e+00
## [9,] 8.979110e-01
## [10,] -6.181574e+00
## [11,] -2.666953e-01
## [12,] 5.468121e-01
## [13,] 5.773160e-14

Ch. 3 - Eigenvalues and Eigenvectors


Intro to Eigenvalues and Eigenvectors
[Video]

Matrix-Vector Multiplications

Rotations
Reflections
Dilations
Contractions
Projections
Every imaginable combination of these

Scalar Multiplication

A scalar c times a vector x. Notation: cx

Interpreting Scalar Multiplication


Scaling Different Axes
Definition of Eigenvalues and Eigenvectors
Why “Eigen”?
Finding Eigenvalues in R
Scalar Multiplies of Eigenvectors are Eigenvectors
Computing Eigenvalues and Eigenvectors in R
How Many Eigenvalues?
Verifying the Math on Eigenvalues
Computing Eigenvectors in R
Some More on Eigenvalues and Eigenvectors


Eigenvalue Ordering
Markov Models for Allele Frequencies
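
These subsections survive only as headings in these notes. A minimal sketch of the computations they cover, using eigen() on a matrix chosen purely for illustration:

A <- matrix(c(2, 1, 1, 2), nrow = 2)
e <- eigen(A)
e$values                      # eigenvalues, ordered largest first: 3 1
e$vectors                     # columns are the corresponding (unit-length) eigenvectors
# Verify A v = lambda v for the leading pair
A %*% e$vectors[, 1]
e$values[1] * e$vectors[, 1]
# Any scalar multiple of an eigenvector is again an eigenvector
A %*% (5 * e$vectors[, 1])    # equals lambda times the scaled vector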

Ch. 4 - Principal Component Analysis


Intro to the Idea of PCA
[Video]

What Does “Big Data” Mean?


In data science, what is the word “big” in the term “big data” generally referring to?

[*] The number of rows and the number of columns.


The number of rows.
The number of columns.
The number of rows or the number of columns.

Finding Redundancies
# Print the first 6 observations of the dataset
head(combine)

## player position school year height weight forty vertical


## 1 Jaire Alexander CB Louisville 2018 71 192 4.38 35.0
## 2 Brian Allen C Michigan St. 2018 73 298 5.34 26.5
## 3 Mark Andrews TE Oklahoma 2018 77 256 4.67 31.0
## 4 Troy Apke S Penn St. 2018 74 198 4.34 41.0
## 5 Dorance Armstrong EDGE Kansas 2018 76 257 4.87 30.0
## 6 Ade Aruna DE Tulane 2018 78 262 4.60 38.5
## bench broad_jump three_cone shuttle
## 1 14 127 6.71 3.98
## 2 27 99 7.81 4.71
## 3 17 113 7.34 4.38
## 4 16 131 6.56 4.03
## 5 20 118 7.12 4.23
## 6 18 128 7.53 4.48
## drafted
## 1 Green Bay Packers / 1st / 18th pick / 2018
## 2 Los Angeles Rams / 4th / 111th pick / 2018
## 3 Baltimore Ravens / 3rd / 86th pick / 2018
## 4
## 5
## 6 Minnesota Vikings / 6th / 218th pick / 2018

# Find the correlation between variables forty and three_cone


cor(combine$forty, combine$three_cone)

## [1] 0.8315171

# Find the correlation between variables vertical and broad_jump


cor(combine$vertical, combine$broad_jump)

## [1] 0.8163375

Given the results of the previous parts of the exercise, what can you say about the dataset combine at this point?

We have yet to find any redundancy in the dataset.


forty and three_cone are the only redundant variables we’ve found so far.
vertical and broad_jump are the only redundant variables we’ve found so far.
[*] There are at least two sets of redundant variables in this dataset.

The Linear Algebra Behind PCA


[Video]

Covariance Explored
If the covariance between two columns of a matrix is positive and large, what can we say?

The variables are not related.


When one of the variables goes up, the other goes down.
[*] When one of the variables goes up, the other goes up as well.
The variables are related, but we don’t know how.

Standardizing Your Data



# Extract columns 5-12 of combine


A <- combine[, 5:12]

# Make A into a matrix


A <- as.matrix(A)

# Subtract the mean of each column


A[, 1] <- A[, 1] - mean(A[, 1])
A[, 2] <- A[, 2] - mean(A[, 2])
A[, 3] <- A[, 3] - mean(A[, 3])
A[, 4] <- A[, 4] - mean(A[, 4])
A[, 5] <- A[, 5] - mean(A[, 5])
A[, 6] <- A[, 6] - mean(A[, 6])
A[, 7] <- A[, 7] - mean(A[, 7])
A[, 8] <- A[, 8] - mean(A[, 8])
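
The eight subtractions above can be collapsed into a single idiomatic call that centers every column at once:

# Either line centers all columns in one step
A <- sweep(A, 2, colMeans(A))
# A <- scale(A, center = TRUE, scale = FALSE)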

Variance/Covariance Calculations
# Create matrix B from equation in instructions
B <- t(A)%*%A/(nrow(A) - 1)

# Compare 1st element of the 1st column of B to the variance of the first column of A
B[1,1]

## [1] 7.159794

var(A[, 1])

## [1] 7.159794

# Compare the 1st element of the 2nd column of B to the 1st element of the 2nd row of B, and to the covariance between the first two columns of A
B[1, 2]

## [1] 90.78808

B[2, 1]

## [1] 90.78808

cov(A[, 1], A[, 2])

## [1] 90.78808

Eigenanalyses of Combine Data


# Find eigenvalues of B
V <- eigen(B)

# Print eigenvalues
V$values

## [1] 2.187628e+03 4.403246e+01 2.219205e+01 5.267129e+00 2.699702e+00


## [6] 6.317016e-02 1.480866e-02 1.307283e-02

Where’s the Variance?


The eigenvalues of B are, rounded to four digits,

2187.6283 44.0325 22.1921 5.2671 2.6997 0.0632 0.0148 0.0131

Roughly how much of the variability in the dataset can be explained by the first principal component?

About 15 percent.
About 50 percent.
About 75 percent.
[*] About 95 percent.
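
The answer follows from dividing the leading eigenvalue by the sum of all of them:

V$values[1] / sum(V$values)   # roughly 0.967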

Performing PCA in R
[Video]

Scaling Data Before PCA


# Scale columns 5-12 of combine


B <- scale(combine[, 5:12])

# Print the first 6 rows of the data


head(B)

## height weight forty vertical bench broad_jump


## [1,] -1.11844839 -1.30960025 -1.3435337 0.5624657 -1.1089286 1.45502476
## [2,] -0.37100257 1.00066356 1.6449741 -1.4281627 0.9238361 -1.49512459
## [3,] 1.12388907 0.08527601 -0.4407553 -0.3743006 -0.6398290 -0.02004991
## [4,] 0.00272034 -1.17883060 -1.4680548 1.9676151 -0.7961955 1.87647467
## [5,] 0.75016616 0.10707096 0.1818505 -0.6084922 -0.1707295 0.50676247
## [6,] 1.49761199 0.21604566 -0.6586673 1.3821362 -0.4834625 1.56038724
## three_cone shuttle
## [1,] -1.38083506 -1.5879750
## [2,] 1.16888714 1.1170258
## [3,] 0.07946038 -0.1057828
## [4,] -1.72852445 -1.4027010
## [5,] -0.43048406 -0.6616049
## [6,] 0.51986694 0.2647653

# Summarize the principal component analysis


summary(prcomp(B))

## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 2.3679 0.9228 0.78904 0.61348 0.46811 0.37178 0.34834
## Proportion of Variance 0.7009 0.1064 0.07782 0.04704 0.02739 0.01728 0.01517
## Cumulative Proportion 0.7009 0.8073 0.88514 0.93218 0.95957 0.97685 0.99202
## PC8
## Standard deviation 0.25266
## Proportion of Variance 0.00798
## Cumulative Proportion 1.00000

Summarizing PCA in R
# Subset combine only to "WR"
combine_WR <- subset(combine, position == "WR")

# Scale columns 5-12 of combine_WR


B <- scale(combine_WR[, 5:12])

# Print the first 6 rows of the data


head(B)

## height weight forty vertical bench broad_jump


## 7 1.4022982 0.88324903 1.20674474 -0.3430843 -0.3223377 0.07414249
## 17 0.5575402 -0.09700717 -0.80129388 -0.4969965 -0.7938424 -0.95388361
## 18 0.9799192 1.58343202 0.88968601 1.0421255 0.8564239 1.61618163
## 25 0.9799192 1.16332222 1.41811723 -1.5743819 -0.7938424 -1.29655897
## 29 -1.1319757 -1.56739147 -0.80129388 -0.1891721 -0.0865854 -1.29655897
## 46 0.1351613 0.11304773 0.04419607 0.2725645 -1.0295947 0.24548017
## three_cone shuttle
## 7 0.712845019 0.02833449
## 17 -1.098542478 0.84141123
## 18 -1.853287268 -1.46230619
## 25 -1.148858797 0.50262926
## 29 0.008416548 -0.64922946
## 46 0.109049187 0.84141123

# Summarize the principal component analysis


summary(prcomp(B))

## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 1.5425 1.4255 1.0509 0.9603 0.77542 0.63867 0.59792
## Proportion of Variance 0.2974 0.2540 0.1380 0.1153 0.07516 0.05099 0.04469
## Cumulative Proportion 0.2974 0.5514 0.6894 0.8047 0.87987 0.93085 0.97554
## PC8
## Standard deviation 0.44235
## Proportion of Variance 0.02446
## Cumulative Proportion 1.00000

Does Subsetting Change Things?


In the last exercise, you looked at the PCA analysis of just the wide receivers in the NFL combine data. The summaries of the PCA analysis for
the whole combine dataset and the wide receiver subset are loaded as pca_summary and pca_summary_wr, respectively.

What is true about this data in relation to the dataset as a whole?

With less data, the first PC of the subset data explains more of the variability in the dataset.
The first PC explains similar amounts of variability for both datasets.
[*] It takes the first 3 PCs of the subset data to explain the same amount of variability as the first PC of the whole dataset.

Wrap-Up
[Video]

About Michael Mallari


Michael is a hybrid thinker and doer (https://www.michaelmallari.com/)—a byproduct of being a StrengthsFinder “Learner”
(https://news.gallup.com/businessjournal/694/learner.aspx) over time. With 20+ years of engineering, design, and product experience, he helps
organizations identify market needs, mobilize internal and external resources, and deliver delightful digital customer experiences that align with
business goals. He has been entrusted with problem-solving for brands—ranging from Fortune 500 companies to early-stage startups to not-for-
profit organizations.

Michael earned his BS in Computer Science from New York Institute of Technology and his MBA from the University of Maryland, College Park. He
is also a candidate to receive his MS in Applied Analytics from Columbia University.

LinkedIn (https://www.linkedin.com/in/mmallari/) | Twitter (https://twitter.com/MichaelMallari) | www.michaelmallari.com/data


(https://www.michaelmallari.com/data/) | www.columbia.edu/~mm5470 (http://www.columbia.edu/~mm5470/)
