
A MINOR PROJECT REPORT ON

“Higher education prediction Using Classification”


Submitted
in partial fulfilment of the requirements for the
Data Mining Techniques course

By
M Raghavendra Rao (161FA04246)
S Suresh (161FA04268)
P Ram Chand (161FA04260)

Under the esteemed guidance of


Dr N Gnaneshwar Rao, Professor

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


VIGNAN'S FOUNDATION FOR SCIENCE, TECHNOLOGY AND RESEARCH
(Accredited by NAAC “A” grade)
Vadlamudi, Guntur.
CERTIFICATE

This is to certify that the report entitled “Higher education prediction Using Classification” is
being submitted by M Raghavendra Rao (161FA04246), S Suresh (161FA04268) and
P Ram Chand (161FA04260) in partial fulfilment of the course work of Data Mining Techniques
as a Minor Project, carried out in the Department of CSE, Vignan’s Foundation for Science,
Technology and Research, Deemed to be University.

Dr N Gnaneshwar Rao

Professor External Examiner


Objectives:
Our system consists of a mock test. After taking the test, the student receives a recommendation
for the branch in which he/she can enroll.

Problem Description:
1. The major issue facing students at the Higher Secondary stage is the choice of a career.
2. This is mainly due to a lack of information about the field they want to choose.
3. Ignorance is the first obstacle that keeps them from reaching the right destination.
4. In the end they choose a course and an institution at random, sacrificing their own
dreams.
5. Because of this, our country loses many potential students in various fields.
6. India produces 3,60,000 engineering graduates every year, of whom only 25% are
employable.
7. One of the factors that makes them unemployable is the wrong choice of branch
during the enrolment process.
8. We propose a solution to this problem using a knowledge-based decision technique.
9. Our motivation is that students who enroll in the right branch will be able to
perform better.

Literature Survey:
1. The proposed knowledge-based decision technique guides the student towards admission in a
suitable branch of engineering.
2. More specifically, it helps the student to choose how many and which courses to enroll in,
on the basis of the experience of previous students with similar academic achievements.
3. The Naive Bayes model assumes that all variables contribute to the classification and that
they are mutually independent given the class.
4. In other words, it assumes that, within each class, the variables are not correlated.
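
The independence assumption in items 3 and 4 can be written out as a formula. The symbols below are chosen here for illustration and do not come from the report: b is the branch, x_1 to x_4 are the four marks, and mu_ib and sigma_ib^2 are the mean and variance of mark i among students of branch b, as a Gaussian Naive Bayes classifier would estimate them.

```latex
P(b \mid x_1, \dots, x_4) \;\propto\; P(b) \prod_{i=1}^{4} P(x_i \mid b),
\qquad
P(x_i \mid b) = \frac{1}{\sqrt{2\pi\sigma_{ib}^{2}}}
\exp\!\left( -\frac{(x_i - \mu_{ib})^{2}}{2\sigma_{ib}^{2}} \right)
```

The predicted branch is the value of b for which the right-hand side of the first expression is largest.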

Proposed Methodology:
1. In our proposed framework, students enter their marks in Mathematics, Chemistry,
Physics and Aptitude.
2. Using the submitted marks and data mining techniques, the system suggests which
branch the student could take.

Student Branch Prediction:


Start

Take the Marks of student from the form

Give it to the Gaussian naive Bayes predictor

Get the result from the predictor

Print the result

Stop
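
The flow above can be sketched end to end without any third-party library. This is a minimal from-scratch stand-in for a Gaussian Naive Bayes predictor, not the scikit-learn classifier used in the report; the function names fit_gaussian_nb and predict, the small eps added to each variance, and the six training rows are all made up for illustration.

```python
import math
from collections import defaultdict

BRANCHES = ["CSE", "ECE", "EEE", "CHEMICAL", "MECH"]  # label k maps to BRANCHES[k - 1]

def fit_gaussian_nb(rows, labels, eps=1e-9):
    """Estimate the prior and the per-feature mean and variance of each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # eps keeps the variance strictly positive (an assumption of this sketch)
        variances = [sum((v - mu) ** 2 for v in col) / n + eps
                     for col, mu in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model

def predict(model, marks):
    """Return the label with the highest log-posterior for one row of marks."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for x, mu, var in zip(marks, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical training rows: [maths, physics, chemistry, aptitude]
train_rows = [[5, 4, 2, 5], [5, 5, 1, 4], [4, 4, 2, 5],   # label 1
              [2, 2, 5, 2], [1, 2, 4, 3], [2, 1, 5, 2]]   # label 4
train_labels = [1, 1, 1, 4, 4, 4]

model = fit_gaussian_nb(train_rows, train_labels)
marks = [5, 4, 1, 5]            # take the marks of the student
label = predict(model, marks)   # give them to the predictor and get the result
print(BRANCHES[label - 1])      # print the result
```

Working in log-probabilities avoids multiplying many small numbers together, which is the usual reason for writing the product of likelihoods as a sum.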

Code:
1. Generating the CSV file (dataset):
import pandas as pd
from random import randint

class GenerateCsv:
    def createCsv(self):
        maths = []
        physics = []
        chemistry = []
        aptitude = []
        branch = []
        n = 10000
        for x in range(n):
            # Each mark and the branch code is drawn independently from 1 to 5
            math = randint(1, 5)
            chem = randint(1, 5)
            phy = randint(1, 5)
            apt = randint(1, 5)
            bran = randint(1, 5)
            maths.append(math)
            physics.append(phy)
            chemistry.append(chem)
            aptitude.append(apt)
            branch.append(bran)
        finalUserData = {'m': maths, 'p': physics, 'c': chemistry, 'a': aptitude, 'b': branch}
        df = pd.DataFrame(finalUserData, columns=['m', 'p', 'c', 'a', 'b'])
        # Raw string keeps the backslash literal; to_csv also writes the row index as column 0
        df.to_csv(r'G:\marks1.csv')

GenerateCsv().createCsv()
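
Because the generator draws the branch code independently of the four marks, the average of any mark should come out close to 3 (the mean of a uniform draw from 1 to 5) for every branch. The sketch below checks this with the standard library only. The fixed seed and the names totals and mean_maths are additions for illustration and are not part of the report's code.

```python
import random
from collections import defaultdict

random.seed(0)  # fixed seed so the run is repeatable (not in the report's code)

n = 10000
totals = defaultdict(lambda: [0, 0])  # branch code -> [sum of maths marks, row count]
for _ in range(n):
    maths = random.randint(1, 5)
    branch = random.randint(1, 5)     # drawn independently of the mark, as in the generator
    totals[branch][0] += maths
    totals[branch][1] += 1

# Average maths mark per branch code
mean_maths = {b: s / c for b, (s, c) in sorted(totals.items())}
for b, mean in mean_maths.items():
    print(b, round(mean, 2))
```

When the per-branch averages are this similar, the marks carry little information about the label, so a classifier trained on such data can be expected to do no better than chance. Labels taken from real admission or performance records would not have this property.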
2. Classification using Naive Bayes:
# Branch codes: 1-CSE, 2-ECE, 3-EEE, 4-CHEMICAL, 5-MECH
import csv
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = []
Y = []
data = []
n = 10000

# Read every row of the CSV; row 0 is the header
with open(r"G:\marks1.csv", "r") as csvfile:
    for row in csv.reader(csvfile):
        data.append(row)

for i in range(1, n + 1):
    # Column 0 is the index written by pandas; columns 1-4 hold the marks
    X.append([data[i][j] for j in range(1, 5)])
    # Column 5 holds the branch code
    Y.append(data[i][5])

# The CSV values are strings, so convert them to integers
X = np.array(X).astype(int)
Y = np.array(Y).astype(int)

clf_pf = GaussianNB()
clf_pf.partial_fit(X, Y, np.unique(Y))
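
The loader above depends on column positions: the file starts with an unnamed index column, followed by m, p, c, a and b. As an alternative, the same layout can be read by column name with csv.DictReader from the standard library. The two sample rows below are made up to mimic the file layout, and the constants FEATURES and LABEL are names chosen for this sketch.

```python
import csv
import io

# Made-up sample in the layout of the generated file:
# an unnamed index column, then m, p, c, a, b.
sample = io.StringIO(
    ",m,p,c,a,b\n"
    "0,3,5,1,4,2\n"
    "1,2,2,4,1,5\n"
)

FEATURES = ["m", "p", "c", "a"]  # maths, physics, chemistry, aptitude
LABEL = "b"                      # branch code

X, Y = [], []
for row in csv.DictReader(sample):
    # Look values up by header name, so the index column is simply ignored
    X.append([int(row[name]) for name in FEATURES])
    Y.append(int(row[LABEL]))

print(X)
print(Y)
```

Reading by name keeps working if the index column is dropped or the columns are reordered, and it needs no hard-coded row count.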
3. Creating the interface:
from tkinter import *

# The last field, 'branch', is used to display the prediction
fields = ('maths', 'physics', 'chemistry', 'aptitude', 'branch')

def prediction(entries):
    # Read the four marks from the form
    m = float(entries['maths'].get())
    p = float(entries['physics'].get())
    c = float(entries['chemistry'].get())
    a = float(entries['aptitude'].get())
    result = clf_pf.predict([[m, p, c, a]])
    branch = ["CSE", "ECE", "EEE", "CHEMICAL", "MECH"]
    # Branch codes are 1-based, list indices are 0-based
    entries['branch'].delete(0, END)
    entries['branch'].insert(0, branch[result[0] - 1])

def makeform(root, fields):
    entries = {}
    for field in fields:
        row = Frame(root)
        lab = Label(row, width=22, text=field + ": ", anchor='w')
        ent = Entry(row)
        ent.insert(0, "0")
        # X here is tkinter's fill constant, brought in by the wildcard import
        row.pack(side=TOP, fill=X, padx=5, pady=5)
        lab.pack(side=LEFT)
        ent.pack(side=RIGHT, expand=YES, fill=X)
        entries[field] = ent
    return entries

if __name__ == '__main__':
    root = Tk()
    ents = makeform(root, fields)
    # Pressing Return also runs the prediction
    root.bind('<Return>', (lambda event, e=ents: prediction(e)))
    b1 = Button(root, text='give branch',
                command=(lambda e=ents: prediction(e)))
    b1.pack(side=LEFT, padx=5, pady=5)
    b2 = Button(root, text='Quit', command=root.quit)
    b2.pack(side=LEFT, padx=5, pady=5)
    root.mainloop()
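
The prediction function converts the form fields with float() directly, so a blank or non-numeric entry raises an exception inside the callback. A possible guard is sketched below. The helper names parse_mark and branch_name are hypothetical, and the accepted range of 1 to 5 is an assumption taken from the range used by the dataset generator.

```python
BRANCHES = ["CSE", "ECE", "EEE", "CHEMICAL", "MECH"]

def parse_mark(text, low=1, high=5):
    """Convert a form field to a number, rejecting blanks, text and out-of-range values."""
    try:
        value = float(text.strip())
    except ValueError:
        raise ValueError(f"not a number: {text!r}") from None
    if not low <= value <= high:
        raise ValueError(f"mark {value} is outside {low}-{high}")
    return value

def branch_name(code):
    """Map a 1-based branch code to its name."""
    if not 1 <= code <= len(BRANCHES):
        raise ValueError(f"unknown branch code: {code}")
    return BRANCHES[code - 1]

print(parse_mark(" 4 "))
print(branch_name(1))
```

With this range, the default value "0" that the form places in every field would be rejected until the student enters a mark, which may or may not be the desired behaviour.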

Expected Result & Implementation:
For marks entry:

[Figure: marks entry form before giving branch]    [Figure: marks entry form after giving branch]


Tools & Languages used:
Tools used:
1. Anaconda
2. JupyterLab

Languages and data formats used:
1. Python
2. CSV file (data format)

Data Sets & Their Description:


Relation Name: Marks
Attributes:
1. Mathematics
2. Aptitude
3. Chemistry
4. Physics
5. Branch

Conclusions:
1. Information about test results is used to predict a suitable branch.
2. This study helps to minimize the failure ratio and to take appropriate action for a career.
3. This study can help students by guiding them to take an appropriate decision
when choosing a stream as their career.
4. This system will help the college to analyze admissions and take the necessary actions
depending on the results.

Progress of Project:

S.No  Modules                                Completion time
1     Data creation                          13/03/2019 - 15/03/2019
2     Creating interface for entering marks  16/03/2019 - 25/03/2019
3     Generating required branch             26/03/2019 - 27/04/2019

