Statistical Learning Homework 1 (ch2.8) - Jupyter Notebook (landscape)
In [34]: # Part (a)
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
college = pd.read_csv('College.csv')  # read the CSV file with pandas
college
Out[34]:
     Unnamed: 0                      Private   Apps  Accept  Enroll  Top10perc  Top25perc  F.Undergrad  P.Undergrad  Outstate  Room.Board  Books  Personal
  0  Abilene Christian University        Yes   1660    1232     721         23         52         2885          537      7440        3300    450      2200
  1  Adelphi University                  Yes   2186    1924     512         16         29         2683         1227     12280        6450    750      1500
  2  Adrian College                      Yes   1428    1097     336         22         50         1036           99     11250        3750    400      1165
  3  Agnes Scott College                 Yes    417     349     137         60         89          510           63     12960        5450    450       875
  4  Alaska Pacific University           Yes    193     146      55         16         44          249          869      7560        4120    800      1500
...  ...                                 ...    ...     ...     ...        ...        ...          ...          ...       ...         ...    ...       ...
772  Worcester State College              No   2197    1515     543          4         26         3089         2029      6797        3900    500      1200
773  Xavier University                   Yes   1959    1805     695         24         47         2849         1107     11520        4960    600      1250
774  Xavier University of Louisiana      Yes   2097    1915     695         34         61         2793          166      6900        4200    617       781
775  Yale University                     Yes  10705    2453    1317         95         99         5217           83     19840        6510    630      2115
776  York College of Pennsylvania        Yes   2989    1855     691         28         63         2988         1726      4990        3560    500      1250
http://localhost:8888/notebooks/Desktop/統計學習作業1%20(ch2.8).ipynb Page 1 (of 10)
Statistical Learning Homework 1 (ch2.8) - Jupyter Notebook 2023/9/30 6:11 PM
In [35]: # Part (b)
college2 = pd.read_csv('College.csv', index_col=0)
college2
Out[35]:
                                Private   Apps  Accept  Enroll  Top10perc  Top25perc  F.Undergrad  P.Undergrad  Outstate  Room.Board  Books  Personal  PhD
Abilene Christian University        Yes   1660    1232     721         23         52         2885          537      7440        3300    450      2200   70
Adelphi University                  Yes   2186    1924     512         16         29         2683         1227     12280        6450    750      1500   29
Adrian College                      Yes   1428    1097     336         22         50         1036           99     11250        3750    400      1165   53
Agnes Scott College                 Yes    417     349     137         60         89          510           63     12960        5450    450       875   92
Alaska Pacific University           Yes    193     146      55         16         44          249          869      7560        4120    800      1500   76
...                                 ...    ...     ...     ...        ...        ...          ...          ...       ...         ...    ...       ...  ...
Worcester State College              No   2197    1515     543          4         26         3089         2029      6797        3900    500      1200   60
Xavier University                   Yes   1959    1805     695         24         47         2849         1107     11520        4960    600      1250   73
Xavier University of Louisiana      Yes   2097    1915     695         34         61         2793          166      6900        4200    617       781   67
Yale University                     Yes  10705    2453    1317         95         99         5217           83     19840        6510    630      2115   96
York College of Pennsylvania        Yes   2989    1855     691         28         63         2988         1726      4990        3560    500      1250   75
Out[42]:
                                Private   Apps  Accept  Enroll  Top10perc  Top25perc  F.Undergrad  P.Undergrad  Outstate  Room.Board  Books  Personal  PhD
College
Abilene Christian University        Yes   1660    1232     721         23         52         2885          537      7440        3300    450      2200   70
Adelphi University                  Yes   2186    1924     512         16         29         2683         1227     12280        6450    750      1500   29
Adrian College                      Yes   1428    1097     336         22         50         1036           99     11250        3750    400      1165   53
Agnes Scott College                 Yes    417     349     137         60         89          510           63     12960        5450    450       875   92
Alaska Pacific University           Yes    193     146      55         16         44          249          869      7560        4120    800      1500   76
...                                 ...    ...     ...     ...        ...        ...          ...          ...       ...         ...    ...       ...  ...
Worcester State College              No   2197    1515     543          4         26         3089         2029      6797        3900    500      1200   60
Xavier University                   Yes   1959    1805     695         24         47         2849         1107     11520        4960    600      1250   73
Xavier University of Louisiana      Yes   2097    1915     695         34         61         2793          166      6900        4200    617       781   67
Yale University                     Yes  10705    2453    1317         95         99         5217           83     19840        6510    630      2115   96
York College of Pennsylvania        Yes   2989    1855     691         28         63         2988         1726      4990        3560    500      1250   75
In [37]: college=college3
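The cell that built `college3` is not shown in this printout. A minimal sketch of one way to obtain a frame with a named `College` column, assuming the unnamed first column of the CSV was simply renamed; the two-row inline CSV stands in for `College.csv`, and the optional `set_index` step is an assumption, not something the source confirms:

```python
import io
import pandas as pd

# Two rows standing in for College.csv (values copied from the output above).
csv_text = (
    ",Private,Apps\n"
    "Abilene Christian University,Yes,1660\n"
    "Adelphi University,Yes,2186\n"
)

college = pd.read_csv(io.StringIO(csv_text))
# pandas labels a header-less first column 'Unnamed: 0'; give it a real name.
college3 = college.rename(columns={'Unnamed: 0': 'College'})
# Optional (assumed) step: use the names as row labels instead of a column.
college3_indexed = college3.set_index('College')
print(college3_indexed)
```

Keeping `College` as an ordinary column makes it show up in `describe(include='all')`; moving it into the index keeps it out of the summary.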
In [38]: # Part (c)
college.describe(include='all')  # basic descriptive statistics; include='all' also covers the non-numeric (qualitative) columns
Out[38]:
                             College  Private       Apps     Accept    Enroll  Top10perc  Top25perc  F.Undergrad  P.Undergrad   Outstate  Room.Board     Books
count                            777      777     777.00     777.00    777.00     777.00     777.00       777.00       777.00     777.00      777.00    777.00
unique                           777        2        NaN        NaN       NaN        NaN        NaN          NaN          NaN        NaN         NaN       NaN
top     Abilene Christian University      Yes        NaN        NaN       NaN        NaN        NaN          NaN          NaN        NaN         NaN       NaN
freq                               1      565        NaN        NaN       NaN        NaN        NaN          NaN          NaN        NaN         NaN       NaN
mean                             NaN      NaN   3,001.64   2,018.80    779.97      27.56      55.80     3,699.91       855.30  10,440.67    4,357.53    549.38
std                              NaN      NaN   3,870.20   2,451.11    929.18      17.64      19.80     4,850.42     1,522.43   4,023.02    1,096.70    165.11
min                              NaN      NaN      81.00      72.00     35.00       1.00       9.00       139.00         1.00   2,340.00    1,780.00     96.00
25%                              NaN      NaN     776.00     604.00    242.00      15.00      41.00       992.00        95.00   7,320.00    3,597.00    470.00
50%                              NaN      NaN   1,558.00   1,110.00    434.00      23.00      54.00     1,707.00       353.00   9,990.00    4,200.00    500.00
75%                              NaN      NaN   3,624.00   2,424.00    902.00      35.00      69.00     4,005.00       967.00  12,925.00    5,050.00    600.00
max                              NaN      NaN  48,094.00  26,330.00  6,392.00      96.00     100.00    31,643.00    21,836.00  21,700.00    8,124.00  2,340.00
In [40]: # Part (e)
# Put Outstate on the y-axis and Private on the x-axis, then draw the box plot
sns.boxplot(x='Private', y='Outstate', data=college)
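The box plot itself is not reproduced in this printout, so a numeric companion may help. A minimal sketch, assuming the same `Private` and `Outstate` columns; the six rows are copied from the displayed output rather than loaded from `College.csv`, so the numbers describe this tiny sample only:

```python
import pandas as pd

# Six rows copied from the displayed output (not the full data set).
college = pd.DataFrame({
    'Private':  ['Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No'],
    'Outstate': [7440, 12280, 11250, 12960, 7560, 6797],
})

# Per-group quartiles: the same quantities a box plot draws as its box.
summary = college.groupby('Private')['Outstate'].describe()
print(summary[['count', '25%', '50%', '75%']])
```

Run against the full `college` frame, the same `groupby(...).describe()` call gives the medians and quartiles behind each box.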
In [44]: # Part (f)
Out[44]: No 699
Yes 78
Name: Elite, dtype: int64
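The code for this cell is missing from the printout; only its output survives. A minimal sketch of one way to build such a column, assuming `Elite` marks schools where `Top10perc` exceeds 50 (the threshold and the use of `pd.cut` are assumptions, not shown in the source); the rows are copied from the displayed output:

```python
import pandas as pd

# Six schools copied from the displayed output (not the full data set).
college = pd.DataFrame(
    {'Top10perc': [23, 16, 22, 60, 16, 95]},
    index=['Abilene Christian University', 'Adelphi University',
           'Adrian College', 'Agnes Scott College',
           'Alaska Pacific University', 'Yale University'],
)

# Assumed rule: (0, 50] -> 'No', (50, 100] -> 'Yes'.
college['Elite'] = pd.cut(college['Top10perc'], [0, 50, 100], labels=['No', 'Yes'])
counts = college['Elite'].value_counts()
print(counts)
```

With explicit bin edges, `pd.cut` uses right-closed intervals by default, so a school at exactly 50 would fall in the `No` group.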
In [29]:
# Part (g)
# To draw the histograms, first bin each variable into groups
college['PhD'] = pd.cut(college['PhD'], 3, labels=['Low', 'Medium', 'High'])
# Split the PhD column into 3 groups: ['Low', 'Medium', 'High']
college['Grad.Rate'] = pd.cut(college['Grad.Rate'], 5, labels=['Very low', 'Low', 'Medium', 'High', 'Very high'])
# Split the Grad.Rate column into 5 groups: ['Very low', 'Low', 'Medium', 'High', 'Very high']
college['Books'] = pd.cut(college['Books'], 2, labels=['Low', 'High'])
# Split the Books column into 2 groups: ['Low', 'High']
college['Enroll'] = pd.cut(college['Enroll'], 4, labels=['Very low', 'Low', 'High', 'Very high'])
# Split the Enroll column into 4 groups: ['Very low', 'Low', 'High', 'Very high']
# Plot
fig = plt.figure()
plt.subplot(221)  # top-left of the 2x2 grid
college['PhD'].value_counts().plot(kind='bar', title='PhD');
plt.subplot(222)  # top-right of the 2x2 grid
college['Grad.Rate'].value_counts().plot(kind='bar', title='Grad.Rate');
plt.subplot(223)  # bottom-left of the 2x2 grid
college['Books'].value_counts().plot(kind='bar', title='Books');
plt.subplot(224)  # bottom-right of the 2x2 grid
college['Enroll'].value_counts().plot(kind='bar', title='Enroll');
fig.subplots_adjust(hspace=1)  # add spacing between the subplots
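When given an integer, `pd.cut` splits the observed range into that many equal-width intervals, and `value_counts()` then orders the result by frequency, so the bars need not run from Low to High. A minimal sketch of restoring the category order with `sort_index()`, using the ten `PhD` values visible in the displayed output (a tiny sample, so the counts are illustrative only):

```python
import pandas as pd

# PhD values copied from the ten rows shown in the output above.
phd = pd.Series([70, 29, 53, 92, 76, 60, 73, 67, 96, 75], name='PhD')

# Three equal-width bins over the observed range 29..96.
phd_group = pd.cut(phd, 3, labels=['Low', 'Medium', 'High'])

by_frequency = phd_group.value_counts()             # most common group first
by_category = phd_group.value_counts().sort_index() # Low, Medium, High
print(by_category)
```

Chaining `.plot(kind='bar')` onto `by_category` instead of the frequency-sorted counts would keep the bars in their natural order, which reads more like a histogram.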