janzaib-masood / Educational-Data-Mining
Branch: master · Educational-Data-Mining / EDM Unbalanced data classification with Cross Validation.ipynb
https://github.com/janzaib-masood/Educational-Data-Mining/blob/master/EDM%20Unbalanced%20data%20classification%20with%20Cross%20V… 1/7
7/11/2018 Educational-Data-Mining/EDM Unbalanced data classification with Cross Validation.ipynb at master · janzaib-masood/Educational-D…
In [1]: """
Authors: Abdul Samad samad19472002@gmail.com
Janzaib Masood janzaibaloch786@gmail.com
"""
import warnings
import pandas as pd  # pd is used later to build the results table

warnings.filterwarnings('ignore')
%pylab inline
pylab.rcParams['figure.figsize'] = (12, 6)
plt.style.use('fivethirtyeight')
Out[2]:
…ers  IntSchoolBrothers  IntSchoolSisters  ClassSchoolStatus  Disability01  Lang1  Lang2  Lang3  Lang4  Religion  RESULT
   …                  3                 2                  1             0      1      0      0      0         1    FAIL
   …                  1                 1                  1             0      1      0      0      0         1    PASS
   …                  0                 0                  1             0      1      0      0      0         1    PASS
   …                  1                 4                  1             0      1      0      0      0         1    PASS
   …                  0                 0                  1             0      1      0      0      0         1    FAIL
In [3]: # Slice the main dataframe into input data (X) and output data (y)
y = df.iloc[:, 12].copy()  # copy so the assignments below do not write into a view of df
X = df.iloc[:, :12]
# Replace PASS and FAIL with the integers 1 and 0, respectively
y[y == 'PASS'] = 1
y[y == 'FAIL'] = 0
display(X.head())
display(y.head())
0 0 3 3 3 2 1 0 1 0
1 0 4 5 1 1 1 0 1 0
2 0 5 3 0 0 1 0 1 0
3 0 2 4 1 4 1 0 1 0
4 0 2 1 0 0 1 0 1 0
0 0
1 1
2 1
3 1
4 0
Name: RESULT, dtype: object
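The label replacement in cell 3 leaves the column with dtype object, as the output above shows. A minimal sketch of one alternative, assuming the column holds only the strings PASS and FAIL: map the labels through a dictionary so the result is an integer column. The names `result`, `label_map`, and `y_encoded` are illustrative and do not come from the notebook.

```python
import pandas as pd

# Stand-in for the first five RESULT values shown in Out[2].
result = pd.Series(['FAIL', 'PASS', 'PASS', 'PASS', 'FAIL'], name='RESULT')

# Map each label to an integer; a label missing from the dict would become NaN.
label_map = {'PASS': 1, 'FAIL': 0}
y_encoded = result.map(label_map)

# Fail loudly if any label was not covered by the mapping.
assert y_encoded.notna().all(), "unmapped label in RESULT"
y_encoded = y_encoded.astype(int)

print(y_encoded.tolist())
```

An integer dtype avoids surprises later, since some estimators and metrics treat object-typed targets differently from numeric ones.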
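In [4]: # Convert the pandas objects into plain Python lists of samples (X) and labels (Y)
a = y.values
b = X.values
del y
del X
X = []
Y = []
length = len(a)
for i in range(length):
    X.append(b[i, :])  # one feature row per sample
    Y.append(a[i])     # matching label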
scores_knn = cross_val_score(knn, X_resampled, Y_resampled, cv = 10, scoring='roc_auc')
scores_knn = scores_knn.mean()
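For readers who want to try the scoring step in isolation, here is a minimal, self-contained sketch of 10-fold cross-validation scored by ROC AUC. It runs on synthetic data from `make_classification`, not on the notebook's dataset, and it does no resampling. The names `X_demo`, `y_demo`, `knn_demo`, and `fold_scores` are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, imbalanced stand-in data (roughly 80% / 20%); not the notebook's dataset.
X_demo, y_demo = make_classification(
    n_samples=500, n_features=12, n_informative=6,
    weights=[0.8, 0.2], random_state=0,
)

knn_demo = KNeighborsClassifier(n_neighbors=5)

# Stratified folds keep the class ratio similar in every fold, so each
# test fold contains both classes and ROC AUC is defined.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
fold_scores = cross_val_score(knn_demo, X_demo, y_demo, cv=cv, scoring='roc_auc')

print(fold_scores.shape)   # one score per fold
print(fold_scores.mean())  # mean ROC AUC across the folds
```

Passing `cv=10` with a classifier, as the cell above does, also uses stratified folds but without shuffling. Note that when a dataset is resampled before cross-validation, copies or synthetic neighbours of the same original sample can land in both the training and test folds, which may inflate the scores. Resampling inside each training fold is a common way to avoid that.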
print(report[0,:])
report.shape
#del(df)
df = pd.DataFrame(report, columns = Samplers, index = Classifiers)
df
DecisionTreeClassifier  0.496118  0.440419  0.329967  0.424171  0.666050  0.834829  0.489381  0.728201  0.758342
LogisticRegression      0.509190  0.331147  0.416988  0.617306  0.524899  0.527254  0.511774  0.364268  0.359198
KNeighborsClassifier    0.507561  0.469819  0.358006  0.489123  0.668403  0.860551  0.505590  0.718269  0.735444
RandomForestClassifier  0.502880  0.422818  0.281351  0.449868  0.730360  0.924528  0.513475  0.761615  0.803596
MLPClassifier           0.520391  0.372000  0.408957  0.673992  0.572032  0.690398  0.516574  0.505113  0.543419
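A table like this is easier to read with a couple of lookups. The sketch below rebuilds it from the values printed above and finds the best-scoring cell. The sampler names are not visible in this copy of the notebook, so the column labels `sampler_0` to `sampler_8` are hypothetical placeholders, as are the variable names.

```python
import pandas as pd

classifiers = [
    'DecisionTreeClassifier', 'LogisticRegression', 'KNeighborsClassifier',
    'RandomForestClassifier', 'MLPClassifier',
]
# Scores copied from the table above, one row per classifier.
values = [
    [0.496118, 0.440419, 0.329967, 0.424171, 0.666050, 0.834829, 0.489381, 0.728201, 0.758342],
    [0.509190, 0.331147, 0.416988, 0.617306, 0.524899, 0.527254, 0.511774, 0.364268, 0.359198],
    [0.507561, 0.469819, 0.358006, 0.489123, 0.668403, 0.860551, 0.505590, 0.718269, 0.735444],
    [0.502880, 0.422818, 0.281351, 0.449868, 0.730360, 0.924528, 0.513475, 0.761615, 0.803596],
    [0.520391, 0.372000, 0.408957, 0.673992, 0.572032, 0.690398, 0.516574, 0.505113, 0.543419],
]
# Hypothetical column labels: the real sampler names are not shown in the source.
sampler_labels = [f"sampler_{i}" for i in range(9)]
report_df = pd.DataFrame(values, index=classifiers, columns=sampler_labels)

# Row holding the highest score, then the column where that row peaks.
best_classifier = report_df.max(axis=1).idxmax()
best_sampler = report_df.loc[best_classifier].idxmax()
best_score = report_df.loc[best_classifier, best_sampler]

print(best_classifier, best_sampler, best_score)
# Best classifier for each sampler column.
print(report_df.idxmax(axis=0))
```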