By
M Raghavendra Rao (161FA04246)
S Suresh (161FA04268)
P Ram Chand (161FA04260)
This is to certify that the report entitled “Higher education prediction Using Classification” is
being submitted by M Raghavendra Rao (161FA04246), S Suresh (161FA04268) and
P Ram Chand (161FA04260) in partial fulfilment of the course work of Data Mining Techniques
as a Minor Project, carried out in the Department of CSE, Vignan’s Foundation for Science,
Technology and Research, Deemed to be University.
Dr N.Gnaneshwar Rao
Problem Description:
1. The major issue facing students at the Higher Secondary stage is the selection of
their career.
2. It is mainly due to a lack of information about the area they want to choose.
3. Ignorance is the first obstacle that blocks them from reaching the right
destination.
4. In the end they choose a course and institution at random, sacrificing their own
dream.
5. Because of this, our country loses many potential students in various areas.
6. India produces 3,60,000 engineering graduates every year, of whom only 25% are
employable.
7. One of the drawbacks that makes them unemployable is improper selection of
branch during the enrolment process.
8. We propose a solution to this problem using a knowledge-based decision technique.
9. Our motivation behind this work is that students who enroll in the right branch
will be able to perform better.
Literature Survey:
1. The proposed knowledge-based decision technique will guide the student towards admission in
a suitable branch of engineering.
2. More specifically, it helps the student choose how many and which courses to enroll in,
on the basis of the experience of previous students with similar academic
achievements.
3. The Naive Bayes model assumes that all variables contribute toward classification and that
they are independent of one another given the class.
4. In other words, once the class is known, it treats the variables as uncorrelated.
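The independence assumption in items 3 and 4 can be illustrated with a minimal sketch of a categorical Naive Bayes score computed by hand. The toy rows, the function name naive_bayes_posterior and the Laplace smoothing constant are assumptions made for illustration; they are not taken from the report.

```python
from collections import Counter, defaultdict

def naive_bayes_posterior(rows, labels, query, n_values=5, alpha=1.0):
    """Return P(label | query) for categorical features, assuming independence."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] = number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for idx, value in enumerate(row):
            feature_counts[label][idx][value] += 1

    scores = {}
    total = len(labels)
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for idx, value in enumerate(query):
            # Laplace-smoothed likelihood P(feature idx = value | label).
            # Multiplying the per-feature terms is exactly the independence assumption.
            num = feature_counts[label][idx][value] + alpha
            den = count + alpha * n_values
            score *= num / den
        scores[label] = score

    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}

# Hypothetical toy data: (maths, physics, chemistry, aptitude) on a 1-5 scale.
rows = [(5, 4, 2, 5), (5, 5, 1, 4), (2, 3, 5, 2), (1, 2, 5, 3)]
labels = ["CSE", "CSE", "CHEMICAL", "CHEMICAL"]
posterior = naive_bayes_posterior(rows, labels, query=(5, 4, 1, 5))
```

Each feature contributes one multiplicative term per class, so no joint statistics between subjects are ever estimated. That keeps the model simple, at the cost of ignoring any real correlation between, say, maths and physics marks.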
Proposed Methodology:
1. In our proposed framework, students enter their marks in the respective fields:
Mathematics, Chemistry, Physics and aptitude.
2. Using the submitted marks and data mining techniques, the system suggests
which branch the student could take.
Code:
1. For generating the CSV file (dataset)
import pandas as pd
from random import randint

class GenerateCsv:
    def createCsv(self):
        maths = []
        physics = []
        chemistry = []
        aptitude = []
        branch = []
        n = 10000
        # Every mark and every branch code is drawn uniformly from 1 to 5.
        for x in range(n):
            math = randint(1, 5)
            chem = randint(1, 5)
            phy = randint(1, 5)
            apt = randint(1, 5)
            bran = randint(1, 5)
            maths.append(math)
            physics.append(phy)
            chemistry.append(chem)
            aptitude.append(apt)
            branch.append(bran)
        finalUserData = {'m': maths, 'p': physics, 'c': chemistry, 'a': aptitude, 'b': branch}
        df = pd.DataFrame(finalUserData, columns=['m', 'p', 'c', 'a', 'b'])
        # Raw string so the backslash in the Windows path is not read as an escape.
        # The default index column is kept; the classifier code skips it as column 0.
        df.to_csv(r'G:\marks1.csv')

GenerateCsv().createCsv()
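The generator above produces different data on every run, and the file it writes carries an unnamed index column ahead of m, p, c, a, b. The sketch below is a standard-library variant that seeds the random source and builds the same column layout in memory. The helper names generate_rows and rows_to_csv_text and the seed value are hypothetical, introduced only for this illustration.

```python
import csv
import io
from random import Random

def generate_rows(n, seed=0):
    """Yield [index, m, p, c, a, b] rows with every value drawn uniformly from 1-5."""
    rng = Random(seed)  # seeded so repeated runs give identical data
    for index in range(n):
        yield [index] + [rng.randint(1, 5) for _ in range(5)]

def rows_to_csv_text(rows):
    """Serialise rows with an unnamed index column followed by m, p, c, a, b."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["", "m", "p", "c", "a", "b"])  # empty name for the index column
    writer.writerows(rows)
    return buffer.getvalue()

sample_text = rows_to_csv_text(generate_rows(3, seed=42))
sample_again = rows_to_csv_text(generate_rows(3, seed=42))
```

With a fixed seed the two calls return identical text, which makes a training run repeatable. The header line also shows why the reading code starts at column 1 for the marks and takes column 5 as the branch.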
2. Classification Using Naïve Bayes:
# Branch codes: 1-CSE, 2-ECE, 3-EEE, 4-CHEMICAL, 5-MECH
import csv
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = []
Y = []
data = []
n = 10000

# Read every row of the generated file; row 0 is the header.
with open(r"G:\marks1.csv", "r") as csvfile:
    for row in csv.reader(csvfile):
        data.append(row)

# Column 0 is the index written by pandas, columns 1-4 are the marks,
# and column 5 is the branch code.
for i in range(1, n + 1):
    X.append(data[i][1:5])
    Y.append(data[i][5])

# Plain int is used because np.integer is an abstract type, not a concrete dtype.
X = np.array(X).astype(int)
Y = np.array(Y).astype(int)

clf_pf = GaussianNB()
clf_pf.partial_fit(X, Y, np.unique(Y))
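The training step fits the classifier on all rows, so it gives no indication of how well the predictions generalise. A minimal sketch of a held-out check follows. It assumes in-memory data drawn the same way as the generated CSV (marks and branch codes uniform on 1-5); the variable names, sample size and 80/20 split are illustrative choices, not part of the report.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumed stand-in for the CSV contents: four marks and one branch code per row.
rng = np.random.default_rng(0)
X_demo = rng.integers(1, 6, size=(1000, 4))  # marks in 1-5
y_demo = rng.integers(1, 6, size=1000)       # branch codes 1-5

# Hold out 20% of the rows so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

model = GaussianNB().fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the branch codes here are drawn independently of the marks, accuracy near chance level (about one in five) would be the expected outcome. A dataset of real marks and admissions would be needed for the score to be meaningful.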
3. For creating the interface:
# Run this after training: the star import rebinds the name X to tkinter's fill constant.
from tkinter import *

fields = ('maths', 'physics', 'chemistry', 'aptitude', 'branch')

def prediction(entries):
    # Read the four marks from the form and ask the trained classifier for a branch code.
    m = float(entries['maths'].get())
    p = float(entries['physics'].get())
    c = float(entries['chemistry'].get())
    a = float(entries['aptitude'].get())
    result = clf_pf.predict([[m, p, c, a]])
    branch = ["CSE", "ECE", "EEE", "CHEMICAL", "MECH"]
    # Branch codes start at 1, list indices at 0.
    entries['branch'].delete(0, END)
    entries['branch'].insert(0, branch[result[0] - 1])

def makeform(root, fields):
    # Build one labelled entry row per field and return the entries keyed by field name.
    entries = {}
    for field in fields:
        row = Frame(root)
        lab = Label(row, width=22, text=field + ": ", anchor='w')
        ent = Entry(row)
        ent.insert(0, "0")
        row.pack(side=TOP, fill=X, padx=5, pady=5)
        lab.pack(side=LEFT)
        ent.pack(side=RIGHT, expand=YES, fill=X)
        entries[field] = ent
    return entries

if __name__ == '__main__':
    root = Tk()
    ents = makeform(root, fields)
    # The original bound <Return> to an undefined fetch(); it now triggers the prediction.
    root.bind('<Return>', (lambda event, e=ents: prediction(e)))
    b1 = Button(root, text='give branch',
                command=(lambda e=ents: prediction(e)))
    b1.pack(side=LEFT, padx=5, pady=5)
    b2 = Button(root, text='Quit', command=root.quit)
    b2.pack(side=LEFT, padx=5, pady=5)
    root.mainloop()
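The interface passes whatever is typed in the entry boxes straight to float(), so empty or out-of-range input either raises an error or reaches the classifier unchecked. One possible guard is sketched below. The function name parse_mark and the 1-5 bounds (taken from the range used when generating the dataset) are assumptions for illustration.

```python
def parse_mark(text, low=1, high=5):
    """Convert entry text to a float mark, raising ValueError when it is invalid."""
    try:
        value = float(text.strip())
    except ValueError:
        raise ValueError(f"not a number: {text!r}") from None
    # The chained comparison is also False for NaN, so NaN is rejected here.
    if not low <= value <= high:
        raise ValueError(f"mark {value} is outside {low}-{high}")
    return value

def check(text):
    """Return the parsed mark, or an error message suitable for display in the form."""
    try:
        return parse_mark(text)
    except ValueError as err:
        return f"invalid input ({err})"
```

A prediction handler could call parse_mark on each field and show the message in the branch box instead of predicting. Note that the form's default entry value of "0" would be rejected under these assumed bounds.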
Expected Result & Implementation:
For marks entry:
Languages used:
1. Python
2. CSV file
Conclusions:
1. Information about test results is used to predict the suitable branch.
2. This study helps to minimize the failure ratio and to make appropriate career decisions.
3. This study can help students by guiding them to an appropriate decision
when choosing a stream for their career.
4. This system will help the college to analyze the admissions and take the necessary actions
depending upon the results.
Progress of Project:
References:
1. http://staffwww.itn.liu.se/~aidvi/courses/06/dm/lectures/lec7.pdf
2. https://www.slideshare.net/INSOFE/apriorialgorithm-36054672
3. https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/
4. International Research Journal of Engineering and Technology (IRJET)