DSA1101
Introduction to Data Science
August 31, 2018
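\[
\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n}(a x_i + b) &= \frac{1}{n}\sum_{i=1}^{n} a x_i + \frac{1}{n}\sum_{i=1}^{n} b \\
&= a \cdot \frac{1}{n}\sum_{i=1}^{n} x_i + \frac{nb}{n} \\
&= a\bar{x} + b
\end{aligned}
\]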
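The identity above can be checked numerically in R. This is a minimal sketch: the values of `x`, `a`, and `b` are made up for illustration and do not come from the tutorial.

```r
# Hypothetical example values, chosen only for illustration
x <- c(2, 4, 6, 8, 10)
a <- 3
b <- 5

lhs <- mean(a * x + b)   # mean of the transformed data
rhs <- a * mean(x) + b   # the original mean, transformed

# all.equal() compares with a numerical tolerance, which is safer than ==
# for floating-point values
isTRUE(all.equal(lhs, rhs))
```

A numerical check like this does not replace the proof, but it is a quick way to catch an algebra slip.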
(b) Recall from lecture that the sample variance of x is given by
$\mathrm{var}(x) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$. For the new data vector
$ax = c(ax_1, ax_2, \ldots, ax_n)$, show that $\mathrm{var}(ax) = a^2\,\mathrm{var}(x)$.
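\[
\begin{aligned}
\mathrm{var}(ax) &= \frac{1}{n-1}\sum_{i=1}^{n}(a x_i - \overline{ax})^2 \\
&= \frac{1}{n-1}\sum_{i=1}^{n}(a x_i - a\bar{x})^2 \\
&= \frac{1}{n-1}\sum_{i=1}^{n} a^2 (x_i - \bar{x})^2 \\
&= a^2 \cdot \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \\
&= a^2\,\mathrm{var}(x)
\end{aligned}
\]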
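The variance result can be checked the same way with R's built-in `var()`, which uses the n − 1 denominator. This is a minimal sketch: the vector `x` and the constant `a` are illustrative values, not data from the tutorial.

```r
# Hypothetical example values, chosen only for illustration
x <- c(2, 4, 6, 8, 10)
a <- 3

var_scaled <- var(a * x)     # sample variance of the scaled vector
var_claim  <- a^2 * var(x)   # a^2 times the original sample variance

# Compare with a numerical tolerance rather than exact equality
isTRUE(all.equal(var_scaled, var_claim))
```

Because the scaling factor enters squared, the variance is unchanged when `a` is replaced by `-a`.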
Exercise 2. Data input and manipulation in R.
(e) Draw the histograms of GradPer separately for LibArts (liberal arts)
and Univ (University) institutions
hist(col$GradPer[col$School_Type == "Lib Arts"])
hist(col$GradPer[col$School_Type != "Lib Arts"])
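To compare the two groups at a glance, the histograms can be drawn side by side. The sketch below does not use the tutorial's data: `toy` is a hypothetical stand-in for the `col` data frame, with made-up values, and the names `lib_arts` and `univ` are introduced only for this example.

```r
# Hypothetical stand-in for the tutorial's `col` data frame (made-up values)
toy <- data.frame(
  School_Type = c("Lib Arts", "Univ", "Lib Arts", "Univ", "Univ", "Lib Arts"),
  GradPer     = c(88, 72, 91, 65, 80, 84),
  stringsAsFactors = FALSE
)

# Logical indexing splits GradPer by institution type
lib_arts <- toy$GradPer[toy$School_Type == "Lib Arts"]
univ     <- toy$GradPer[toy$School_Type != "Lib Arts"]

# Draw the two histograms side by side, then restore the previous layout
old_par <- par(mfrow = c(1, 2))
hist(lib_arts, main = "Lib Arts", xlab = "GradPer")
hist(univ, main = "Univ", xlab = "GradPer")
par(old_par)
```

Setting `main` and `xlab` explicitly keeps the panels readable; otherwise `hist()` builds the title from the indexing expression.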
(f) Find out which institutions have more than 75% of faculty members
with Ph.D. degrees
col$School[col$PerPhD > 75]
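The one-liner above relies on logical indexing: the comparison produces a TRUE/FALSE vector, which then selects the matching school names. The sketch below shows the mechanism on a small made-up data frame; `toy`, its school names, and its percentages are hypothetical and are not the tutorial's data.

```r
# Hypothetical stand-in for the tutorial's `col` data frame (made-up values)
toy <- data.frame(
  School = c("College A", "College B", "College C", "College D"),
  PerPhD = c(81, 75, 62, 90),
  stringsAsFactors = FALSE
)

# Step 1: the comparison gives one TRUE/FALSE per row
many_phd <- toy$PerPhD > 75

# Step 2: the logical vector picks out the matching school names
selected <- toy$School[many_phd]
selected
```

Note that `>` is strict, so a school with exactly 75% is excluded. If `PerPhD` contained missing values, this indexing would return `NA` entries for those rows; wrapping the condition in `which()` is one way to drop them.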