You are on page 1of 23

BHARATI VIDYAPEETH

COLLEGE OF ENGINEERING
DEPARTMENT OF INFORMATION TECHNOLOGY
ACADEMIC YEAR: 2021-2022

COURSE NAME: DS LAB

COURSE CODE ITL605

EXPERIMENT NO. Mini Project

EXPERIMENT TITLE. Salary Prediction

NAME OF STUDENT Rutuja Navghane

ROLL NO. 48

CLASS TE-IT

SEMESTER VI

GIVEN DATE 04/02/2022

SUBMISSION DATE 08/04/2022

CORRECTION DATE

REMARK

TIMELY TOTAL
PRESENTATION UNDERSTANDING
SUBMISSION MARKS

01 01 03 05

NAME& SIGN.
Prof. B.S. Dakhare
OF FACULTY
Department of Information Technology
Bharati Vidyapeeth College of Engineering, Navi Mumbai

Academic Year 2021-2022

DS-Mini Project
On
Salary Prediction

Submitted in partial fulfillment of the requirements of the degree of

Bachelor of Engineering

By
RUTUJA NAVGHANE [48]

Project Guide:

Prof .B.S.Dakhare
Page 3 of 25

Bharati Vidyapeeth College of Engineering,


Navi Mumbai
Department of Information Technology

CERTIFICATE

This is to certify that,

RUTUJA NAVGHANE [48]

Class- TEIT Semester-VI DS-Mini Project have completed the Mini Project Salary
Prediction Satisfactorily in the Department of Information Technology as
prescribed by the Mumbai University in the academic year 2021-2022.

PROF. B.S Dakhare

Mini Project Guide External Examiner

Prof. B.S. Dakhare Prof. S.M Patil

Mini Project Coordinator H.O.D.


Page 4 of 25

Contents

Sr No Topic Page No
1 Introduction 6
2 Literature Review 8
3 Problem Definition 9
4 Objectives of Work 10
5 Model/Block Diagram 11
6 System Requirements 12
7 Implementation 13
8 Results and Analysis 24
9 Conclusion 25
10 Future Scope 26
11 References 27
Page 5 of 25

Abstract

Machine learning is a technology which allows a software program to became


more accurate at pretending more accurate results without being explicitly
programmed and also ML algorithms uses historic data to predicts the new
outputs. Because of this ML gets a distinguish attention. Now a day’s prediction
engine has become so popular that they are generating accurate and affordable
predictions just like a human, and being using industry to solve many of the
problems. Predicting justified salary for employee is always being a challenging
job for an employer. In this paper and proposing a salary prediction model with
suitable algorithm using key features required to predict the salary of
employee.Machine learning rule is applied to predict salary levels of individuals
supported their years of expertise. Random forest regression is employed since it
gave higher accuracy compared to decision tree regression and Support Vector
Regression classifier. Choosing the foremost effective machine learning rule so as
to unravel the problems of classification and prediction of data is the most vital
part of machine learning which depends on dataset likewise. The predictive
accuracy of the Random forest regression on test data is 97%, while the accuracy
of Decision tree regression and Support Vector Regression is 85% and 90%
respectively. The model has been used on the training data to predict dependent
variables and to extract features with highest impact on salary prediction.
Chapter 1

Introduction

Now days, Major reason an employee switches the company is the salary of the
employee. Employees keep switching the company to get the expected salary. And
it leads to loss of the company and to overcome this loss we came with an idea
what if the employee gets the desired/expected salary from the Company or
Organization. In this Competitive world everyone has a higher expectation and
goals. But we cannot randomly provide everyone their expected salary there should
be a system which should measure the ability of the Employee for the Expected
salary. We cannot decide the exact salary but we can predict it by using certain
data sets. A prediction is an assumption about a future event. In this paper the main
aim is predicting salary and making a suitable user-friendly graph. So that an
Employee can get the desired salary on the basis of his qualification and hard
work. For developing this system, we are using a Linear regression algorithm of
supervised learning in machine learning. Supervised learning is basically a learning
task of a learning function that maps an input to an output of given example. In
supervised learning each example is pair having input parameter and the desired
output value. Linear regression algorithm in machine learning is a supervised
learning technique to approximate the mapping function to get the best predictions.
The main goal of regression is the construction of an efficient model to predict the
dependent attribute from a bunch of attribute variables. A regression problem is
when the output value is real or a continuous value like salary.
Page 8 of 25

Chapter 2

Review of Literature

1) Susmita Ray," A Quick Review of Machine Learning Algorithms," 2019


International Conference on Machine Learning, Big Data, Cloud and Parallel
Computing (Com-ITCon), India, 14th -16th Feb 2019 a brief review of various
machine learning algorithms which are most frequently used to solve
classification, regression and clustering problems. The advantages, disadvantages
of these algorithms have been discussed along with comparison of different
algorithms (wherever possible) in terms of performance, learning rate etc. Along
with that, examples of practical applications of these algorithms have been
discussed.[1]

2) Sananda Dutta, Airiddha Halder, Kousik Dasgupta,” Design of a novel


Prediction Engine for predicting suitable salary for a job” 2018 Fourth
International Conference on Research in Computational Intelligence and
Communication Networks (ICRCICN) - focused on the problem of predicting
salary for job advertisements in which salary are not mentioned and also tried to
help fresher to predict possible salary for different companies in different
locations. The corner stone of this study is a dataset provided by ADZUNA.
model is well capable to predict precise value.[2]

3) Pornthep Khongchai, Pokpong Songmuang, “Improving Students’ Motivation


to Study using Salary Prediction System” - proposed prediction model using
Decision tree technique with seven features. Moreover, the result of the system is
not only a predicted salary, but also the 3-highest salary of the graduated students
which share common attributes to the users. To test the system’s efficiency, they
set up an experiment by using 13,541 records of actual graduated student data.
The total result in accuracy is 41.39%.[3]

Page 9 of 25
Chapter 3

Problem Definition

Our aim is to build and test models which can predict the salary
(regression). In this paper, our purpose is to predict average salary via a
regression model and monitor the change of the regression parameters to
make inferences about the determinants of programming languages
known. We apply regression analysis using the acoustic features and the
stream data provided from dataset. However, OLS regression does not
take account of the dynamic behavior of the time-varying regression
coefficients. For this reason, we use the FLS regression to smoothen the
changes of the regression coefficients because in real life we do not
expect extreme changes in the coefficients over time. In this section, we
explain the methodology that we use to solve this type of regression
problem.
Page 10 of 25

Chapter 4

Objectives of Work

 The primary objective of this research paper is to predict employee


average salary.

 In this paper, we propose a novel model for predicting Employee


Salary using Machine Learning based approach which is highly
robust.

 In order to validate the accuracy of the system proposed for


Employee salary, the data set is acquired via online database and
fetched to the system and highly stunning and precision results are
shown by the system with regard to Employee salary.

 In this paper the main aim is predicting salary and making a suitable
user-friendly graph.

 So that an Employee can get the desired salary on the basis of his
qualification and hard work.

 For developing this system, we are using a Linear regression


algorithm of supervised learning in machine learning.
Page 11 of 25

Chapter 5
Model/Block
Diagram
Page 12 of 25

Chapter 6
System Requirements

HARDWARE AND SOFTWARE REQUIREMENTS

Hardware requirements:
Processor: Intel i3/i5/i7 /AMD FX Series
Ram: 4 GB or higher

Software requirements:
IDE: Google Colab
Programming Language: Python
Page 13 of 25

Chapter 7

Implementation
Page 24 of 25

Chapter8

Results and Analysis

The following section discusses the results of the proposed


system in comparison with other systems and methods.

Experimental Results

The precision, recall, and F1 functions were used to test the


effectiveness of UIBB, and the results were compared against the
following algorithms using the same sample of consumers. Traditional
CF based on users (CFUB): CFIB is a type of CF that is based on
things. The number of common consumers who purchased the same
item is used by CFIB to determine the item’s similarity. FPP creates a
rating matrix exclusively based on the users’ purchase data. FPP
incorporates both CF-based and SPA-based recommendations. (UIBB)
= user item behavior-based, the proposed method. Figures 4–6 show
that CFUB and In terms of precision and recall, the UIBB technique
outperformed the others since it took into account not just the user’s
purchasing behavior but also other behaviors, which enhanced the
forecast accuracy.
Page 25 of 25

Chapter9

Conclusion

We conclude that average salary can be predicted reasonably well using


machine learning techniques. With further feature engineering, we aim
to improve performance of current models. For further work, we plan to
optimize current implementation. We intend to leverage domain
knowledge to improve our performance metrics.

Through several different feature selection algorithms, we were able to


identify the most influential features in our dataset by taking the
intersection among the feature selection algorithms, namely low salary,
programing languages known. This would allow us to more accurately
capture what we define to be popular and allow us to generalize our
findings to commonly understood metrics. Furthermore, we can
potentially extend the application of this project to a job
recommendation system using salary prediction.
Page 26 of 29

Chapter 10

Future Score

In future work, we would like add graphical user interface to system


and try to save and reuse trained model. Additionally, we would like to
explore the possibility of further optimizing our feature extraction and
selection techniques. In future work we would like to include features
based on document statistics (document length, average number of
words, Flesch-Kincaid readability statistics, etc.) as well as features
based on similarities between documents (perhaps using latent dirichlet
allocation, an unsupervised learning technique which assumes that the
distribution of words in a document is indicative of underlying
categories).

Page 27 of 28
Chapter 11

References

[1] Susmita Ray," A Quick Review of Machine Learning Algorithms,"


2019 International Conference on Machine Learning, Big Data, Cloud
and Parallel Computing (ComIT-Con), India, 14th -16th Feb 2019

[2] Sananda Dutta, Airiddha Halder, Kousik Dasgupta,” Design of a


novel Prediction Engine for predicting suitable salary for a job” 2018
Fourth International Conference on Research in Computational
Intelligence and Communication Networks (ICRCICN).

[3] Pornthep Khongchai, Pokpong Songmuang, “Improving Students’


Motivation to Study using Salary Prediction System” 2016 13th
International Joint Conference on Computer Science and Software
Engineering (JCSSE)

[4] Phuwadol Viroonluecha, Thongchai Kaewkiriya,” Salary Predictor


System for Thailand Labour Workforce using Deep Learning” The 18th
International Symposium on Communications and Information
Technologies (ISCIT 2018)

[5] Mangui Wu, Shunmin Shu,” Top Management Salary, Stock Ratio
and Firm Performance: A Comparative Study of State-owned and
Private Listed Companies in China”

You might also like