Problems Faced
CHAPTER 9 CONCLUSION
ABSTRACT
Accurate recruitment of employees is a key element of every company's business strategy
because of its impact on productivity and competitiveness. The internet has deeply affected
the labour market, and identifying the most rewarded and most demanded items in job offers
is important for both recruiters and candidates. This work concludes that experience is
more rewarded than education. The proposed work identifies profile clusters based on the
required skills and uses tree-based ensembles to build a precise salary range classifier.
The recruiter plays a significant part in the proposed system, in which wage range
information is predicted with the Support Vector Machine (SVM) algorithm and the Random
Forest Algorithm (RFA). The results show that the model provides information on the
company's wage spectrum.
CHAPTER 1
INTRODUCTION
Machine learning methods have also been extensively applied to e-Recruitment. One study
proposes a machine learning model for detecting talent and updating a company's knowledge
taxonomy, which helps recruiters to detect and incorporate the professional profiles the
company lacks. Other authors employ different classification and clustering techniques, and
others automatically group job offers using supervised machine learning combined with expert
labelling. Promising results have also been reported by authors who resort to pattern
recognition in order to predict competency or skill emergence in the job market. Some
studies have also proposed systems and databases enriched by data mined from the web. In
particular, the use of social networks for recruitment purposes has gained attention recently,
especially on the recruiter side. For example, one expert retrieval system is based on
profile information and user behaviour inside different social networks (Twitter, Facebook
and LinkedIn), and one expert finding algorithm is based on location and connections to
potential candidates.
Most of the works that focus on the extraction of insights from e-Recruitment portals retrieve
the information associated with each job post as text and then they represent each sample as a
vector of word/keyword frequencies. As a consequence, these vectors are often characterized
by a very high number of dimensions (in the order of thousands). Therefore, it is necessary to
collect huge amounts of job posts to be able to train a classifier or a regression model
effectively. However, for websites with a limited target audience, such as portals developed
for a specific geographic area or job sector, there are relatively few job posts. Among these,
only a small percentage has an explicit indication of the offered salary. This can make the
prediction of the salary from the job post features a challenging task.
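The keyword-frequency representation described above can be illustrated with a minimal sketch. The class name `KeywordVectorizer`, the tokenization rule and the sample posts are assumptions made for illustration; they are not taken from the project code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

/** Hypothetical helper: turns job-post text into keyword-frequency vectors. */
final class KeywordVectorizer {
    private final List<String> vocabulary;

    /** Builds a sorted vocabulary from every token seen in the given posts. */
    KeywordVectorizer(List<String> posts) {
        TreeSet<String> words = new TreeSet<>();
        for (String post : posts) {
            words.addAll(tokenize(post));
        }
        this.vocabulary = new ArrayList<>(words);
    }

    /** Lower-cases the text and splits it on anything that is not an ASCII letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    List<String> vocabulary() {
        return vocabulary;
    }

    /** Counts how often each vocabulary word occurs in one post. */
    int[] vectorize(String post) {
        int[] counts = new int[vocabulary.size()];
        for (String token : tokenize(post)) {
            int index = vocabulary.indexOf(token);
            if (index >= 0) {
                counts[index]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> posts = Arrays.asList("Java developer, Java and SQL", "SQL administrator");
        KeywordVectorizer vectorizer = new KeywordVectorizer(posts);
        System.out.println(vectorizer.vocabulary());
        System.out.println(Arrays.toString(vectorizer.vectorize(posts.get(0))));
    }
}
```

Each vector has one dimension per distinct word in the corpus, which is why a few thousand posts can easily produce thousands of dimensions.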
In this work we present a case study based on data collected from Tecnoempleo, an
e-Recruitment website specifically designed for IT jobs in Spain. The website hosts a large
collection of job offers with many machine-readable fields that are not common in
other similar sites, such as the requested skills. However, only a small portion of posts
include the offered salary. As a result, our dataset, which covers a period of 5 months,
includes only ≈ 4,000 job posts, which are represented as vectors of ≈ 2,000 features.
CHAPTER 2
SOFTWARE PROJECT PLAN
Table 2. Software Project Plan
CHAPTER 3
Functional Requirements
Purpose : Prediction of salary in the IT job market
Input : Employee and job data sets
Process : K-means, SVM, Random Forest
Output : Salary prediction and performance analysis of SVM and Random Forest
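As a rough illustration of the K-means step named in the Process entry, the following sketch clusters numeric feature vectors. It is not the project's implementation: the class name `KMeansSketch`, the choice of the first k points as initial centroids and the toy data are all assumptions made for reproducibility, and the input must contain at least k points.

```java
import java.util.Arrays;

/** Hypothetical minimal K-means over numeric feature vectors (e.g. skill indicators). */
final class KMeansSketch {

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the centroid closest to the given point. */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Returns the cluster index of each point. Centroids start at the first k
     * points, which is a simplification; real systems usually randomize this.
     */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);
        int dims = points[0].length;

        for (int iteration = 0; iteration < maxIterations; iteration++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int p = 0; p < points.length; p++) {
                int c = nearest(points[p], centroids);
                if (c != assignment[p]) {
                    assignment[p] = c;
                    changed = true;
                }
            }
            if (!changed) {
                break; // Converged: no point switched cluster.
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < dims; d++) {
                    sums[assignment[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) {
                    for (int d = 0; d < dims; d++) {
                        centroids[c][d] = sums[c][d] / counts[c];
                    }
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] points = {{1.0, 0.0}, {0.0, 1.0}, {0.9, 0.1}, {0.1, 0.9}};
        System.out.println(Arrays.toString(cluster(points, 2, 10)));
    }
}
```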
Training:
Data sets are fed into the system, which should preprocess them to remove
noise and then identify and model the objects for feature extraction.
Validation:
The system should take a sample data set as validation input and
check it against the features extracted during training.
Testing:
The system should take a data set as input and predict
salaries using the job and employee data sets.
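The three phases above presuppose that the available rows are partitioned into training, validation and testing sets. A minimal sketch of such a partition follows; the class name `DatasetSplit`, the fractions and the fixed seed are illustrative assumptions rather than values used by the project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical holder for the three partitions of a data set. */
final class DatasetSplit<T> {
    final List<T> training;
    final List<T> validation;
    final List<T> testing;

    private DatasetSplit(List<T> training, List<T> validation, List<T> testing) {
        this.training = training;
        this.validation = validation;
        this.testing = testing;
    }

    /**
     * Shuffles a copy of the rows with a fixed seed and cuts it into three parts.
     * Whatever is left after the training and validation parts becomes the test set.
     */
    static <T> DatasetSplit<T> split(List<T> rows, double trainFraction,
                                     double validationFraction, long seed) {
        if (trainFraction < 0 || validationFraction < 0
                || trainFraction + validationFraction > 1.0) {
            throw new IllegalArgumentException("fractions must be non-negative and sum to at most 1");
        }
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int n = shuffled.size();
        int trainEnd = Math.min(n, (int) Math.round(n * trainFraction));
        int validationEnd = Math.min(n, trainEnd + (int) Math.round(n * validationFraction));
        return new DatasetSplit<>(
                new ArrayList<>(shuffled.subList(0, trainEnd)),
                new ArrayList<>(shuffled.subList(trainEnd, validationEnd)),
                new ArrayList<>(shuffled.subList(validationEnd, n)));
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(i);
        }
        DatasetSplit<Integer> parts = split(rows, 0.6, 0.2, 42L);
        System.out.println("training=" + parts.training.size()
                + " validation=" + parts.validation.size()
                + " testing=" + parts.testing.size());
    }
}
```

Keeping the test rows out of both training and validation is what makes the reported accuracy meaningful for unseen job posts.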
Non-Functional Requirements
Performance Requirements:
1. The system should perform 1000 tests per second at peak load.
2. The system shall achieve an accuracy of more than 90% in predicting
salaries.
3. The system should work with various data sets describing the same salaries.
Interface Requirements
User Interfaces
1. Programming language: Java
2. GUI: Java Eclipse
Hardware Interfaces:
Software Interfaces:
CHAPTER 4
SYSTEM ANALYSIS
ARCHITECTURE DIAGRAM
DFD DIAGRAM:
LEVEL 0
LEVEL 1
LEVEL 2
Class Diagram :
Sequence Diagram :
CHAPTER 5
DESIGN
HOME PAGE
CHAPTER 6
CODING
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.*;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.oreilly.servlet.MultipartRequest;
import CBF.Stem;
import CBF.Stopwords;
import CBF.replace;
import au.com.bytecode.opencsv.CSVReader;
import db.DB;
@WebServlet("/Upload_Action")
public class Upload_Action extends HttpServlet {
private static final long serialVersionUID = 1L;
public Upload_Action() {
super();
// TODO Auto-generated constructor stub
}
protected void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
// TODO Auto-generated method stub
PrintWriter out=response.getWriter();
ServletContext sc=request.getSession().getServletContext();
MultipartRequest m=new MultipartRequest(request,sc.getRealPath("dataset"));
File file=m.getFile("file");
String fname=file.getName();
String csvFile =sc.getRealPath("dataset")+"\\"+fname;
CSVReader reader = null;
Connection con=new DB().Connect();
try
{
PreparedStatement paa= con.prepareStatement("truncate table job_dataset");
paa.executeUpdate();
reader = new CSVReader(new FileReader(csvFile));
String[] line;
Stopwords st=new Stopwords();
Stem stem=new Stem();
replace rep=new replace();
// The opening of the original INSERT statement is missing from the source; a
// positional insert of the 13 cleaned columns into job_dataset is assumed here.
// Placeholders replace string concatenation so quotes in the data cannot break the SQL.
PreparedStatement ps=con.prepareStatement(
"insert into job_dataset values(?,?,?,?,?,?,?,?,?,?,?,?,?)");
while ((line = reader.readNext()) != null)
{
    // Clean every column: strip unwanted characters, drop stop words, then stem.
    for (int col=0; col<13; col++)
    {
        ps.setString(col+1, stem.stem(st.words(rep.remove(line[col]))));
    }
    ps.executeUpdate();
}
out.println("<script type=\"text/javascript\">");
out.println("alert(\"Uploaded And Pre-Processed Successfully.\")");
out.println("</script>");
RequestDispatcher rd=request.getRequestDispatcher("Upload_Dataset.jsp");
rd.include(request, response);
} catch (IOException | SQLException e)
{
System.out.println(e);
out.println("<script type=\"text/javascript\">");
out.println("alert(\"Please Try Again..\")");
out.println("</script>");
RequestDispatcher rd=request.getRequestDispatcher("Upload_Dataset.jsp");
rd.include(request, response);
}
}
protected void doPost(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
doGet(request, response);
}
}
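The servlet above relies on the helper classes `CBF.replace`, `CBF.Stopwords` and `CBF.Stem`, whose sources are not shown. The sketch below is a hypothetical stand-in that illustrates the same three steps (symbol removal, stop-word removal, stemming); the class name `TextPreprocessor`, the tiny stop-word list and the crude suffix rule are assumptions and do not reproduce the project's helpers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for the preprocessing helpers used by the upload servlet. */
final class TextPreprocessor {
    // Tiny illustrative stop-word list; a real list would be much longer.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "in", "for", "with", "to"));

    /** Lower-cases the text and replaces everything except ASCII letters, digits and spaces with a space. */
    static String removeSymbols(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9 ]", " ");
    }

    /** Splits on whitespace and drops the words found in the stop-word list. */
    static List<String> removeStopWords(String text) {
        List<String> kept = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    /** Very rough suffix stripping; this is not a Porter stemmer. */
    static String stem(String word) {
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (word.length() > suffix.length() + 2 && word.endsWith(suffix)) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    /** Runs the three steps in the same order as the servlet: clean, filter, stem. */
    static String preprocess(String text) {
        List<String> stems = new ArrayList<>();
        for (String word : removeStopWords(removeSymbols(text))) {
            stems.add(stem(word));
        }
        return String.join(" ", stems);
    }

    public static void main(String[] args) {
        System.out.println(preprocess("Testing of the Web-Servers!"));
    }
}
```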
CLUSTERING:
<!DOCTYPE html>
<%@page import="java.sql.*"%>
<%@page import="db.DB"%>
<html lang="en">
<head>
<title>Clustering</title>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!--
==================================================================
=============================-->
<link rel="icon" type="image/png" href="login/images/icons/favicon.ico"/>
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css"
href="login/vendor/bootstrap/css/bootstrap.min.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css" href="login/fonts/font-awesome-4.7.0/css/font-awesome.min.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css" href="login/fonts/Linearicons-Free-v1.0.0/icon-font.min.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css" href="login/vendor/animate/animate.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css" href="login/vendor/css-hamburgers/hamburgers.min.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css"
href="login/vendor/animsition/css/animsition.min.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css" href="login/vendor/select2/select2.min.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css"
href="login/vendor/daterangepicker/daterangepicker.css">
<!--
==================================================================
=============================-->
<link rel="stylesheet" type="text/css" href="login/css/util.css">
<link rel="stylesheet" type="text/css" href="login/css/main.css">
<!--
==================================================================
=============================-->
</head>
<body>
<div class="limiter">
<div class="container-login100">
<div class="wrap-login100 p-l-55 p-r-55 p-t-65 p-b-50">
<form class="login100-form validate-form"
action="Cluster_Action.jsp">
<span class="login100-form-title p-b-33">
Skill-Based Clustering
</span>
<span class="focus-input100-1"></span>
<span class="focus-input100-2"></span>
</div>
</form>
</div>
</div>
</div>
<!--
==================================================================
=============================-->
<script src="login/vendor/jquery/jquery-3.2.1.min.js"></script>
<!--
==================================================================
=============================-->
<script src="login/vendor/animsition/js/animsition.min.js"></script>
<!--
==================================================================
=============================-->
<script src="login/vendor/bootstrap/js/popper.js"></script>
<script src="login/vendor/bootstrap/js/bootstrap.min.js"></script>
<!--
==================================================================
=============================-->
<script src="login/vendor/select2/select2.min.js"></script>
<!--
==================================================================
=============================-->
<script src="login/vendor/daterangepicker/moment.min.js"></script>
<script src="login/vendor/daterangepicker/daterangepicker.js"></script>
<!--
==================================================================
=============================-->
<script src="login/vendor/countdowntime/countdowntime.js"></script>
<!--
==================================================================
=============================-->
<script src="login/js/main.js"></script>
</body>
</html>
SALARY PREDICTION:
<!DOCTYPE html>
<%@page import="java.sql.*"%>
<%@page import="db.DB"%>
<%@page import="java.sql.PreparedStatement"%>
<html lang="en">
<head>
<!-- Required meta tags -->
<meta charset="utf-8" />
<meta
name="viewport"
content="width=device-width, initial-scale=1, shrink-to-fit=no"
/>
<title>Salary Prediction</title>
<link rel="icon" href="img/favicon.png" />
<!-- Bootstrap CSS -->
<link rel="stylesheet" href="css/bootstrap.min.css" />
<!-- animate CSS -->
<link rel="stylesheet" href="css/animate.css" />
<!-- owl carousel CSS -->
<link rel="stylesheet" href="css/owl.carousel.min.css" />
<!-- themify CSS -->
<link rel="stylesheet" href="css/themify-icons.css" />
<!-- flaticon CSS -->
<link rel="stylesheet" href="css/flaticon.css" />
<!-- magnific popup CSS -->
<link rel="stylesheet" href="css/magnific-popup.css" />
<!-- slick CSS -->
<link rel="stylesheet" href="css/slick.css" />
<!-- style CSS -->
<link rel="stylesheet" href="css/style.css" />
</head>
<body>
<!--::header part start::-->
<header class="main_menu home_menu">
<div class="container">
<div class="row align-items-center">
<div class="col-lg-12">
<nav class="navbar navbar-expand-lg navbar-light">
<button
class="navbar-toggler"
type="button"
data-toggle="collapse"
data-target="#navbarSupportedContent"
aria-controls="navbarSupportedContent"
aria-expanded="false"
aria-label="Toggle navigation"
>
<span class="ti-menu"></span>
</button>
<div
class="collapse navbar-collapse main-menu-item justify-content-center"
id="navbarSupportedContent"
>
<ul class="navbar-nav align-items-center">
<li class="nav-item">
<a class="nav-link" href="index.jsp">Home</a>
</li>
<li class="nav-item">
<a class="nav-link" href="Upload_Dataset.jsp">Upload Dataset</a>
</li>
<li class="nav-item">
<a class="nav-link" href="View_Process.jsp">View Pre-Processed Data</a>
</li>
<li class="nav-item">
<a class="nav-link" href="Clustering.jsp">Clustering</a>
</li>
<li class="nav-item dropdown">
<a
class="nav-link dropdown-toggle"
href="blog.html"
id="navbarDropdown"
role="button"
data-toggle="dropdown"
aria-haspopup="true"
aria-expanded="false"
>
Classification
</a>
<div class="dropdown-menu" aria-labelledby="navbarDropdown">
<a class="dropdown-item" href="Salary_Prediction.jsp">Salary Prediction</a>
<a class="dropdown-item" href="Employee_Prediction.jsp"
>Employee Prediction</a
>
</div>
</li>
<li class="nav-item dropdown">
<a
class="nav-link dropdown-toggle"
href="#"
id="navbarDropdown"
role="button"
data-toggle="dropdown"
aria-haspopup="true"
aria-expanded="false"
>
Performance Measure
</a>
<div class="dropdown-menu" aria-labelledby="navbarDropdown">
<a class="dropdown-item" href="svm_Performance.jsp">SVM</a>
<a class="dropdown-item" href="Employee_Prediction.jsp"
>Random Forests</a>
</div>
</li>
<li class="nav-item">
<a class="nav-link" href="Graph.jsp">Graph</a>
</li>
</ul>
</div>
</nav>
</div>
</div>
</div>
</header>
<!-- Header part end-->
<style>
table{
width:40%
}
td,input{
padding-bottom:5px;
}
input{
color: orangered;
}
td{
color:navy;
font-family: cursive;
font-size: 15px;
}
</style>
<!-- banner part start-->
<section class="banner_part" style="height: 100px;">
</section>
<div style="margin-top:50px;">
<center>
<h3 style="color:navy;font-size:30px;font-family:inherit;">Classification Based on
Salary</h3><br>
<style>
table,th,td,tr{
border-collapse:collapse;
border: 1px solid black;
font-weight:bold;
}
th{
color:black;
text-align:center;
}
</style>
<div style="overflow: scroll;width: 85%;height: 400px;" >
<table id="scr" style=" overflow: scroll;width: 65%;height: 337px;">
<tr>
<th>ID</th>
<th>Company_name</th>
<th>Qualification</th>
<th>Experience</th>
<th>Industry</th>
<th>Job Id</th>
<th>Job Title</th>
<th>Location</th>
<th>Vacancies</th>
<th>Pay Scale</th>
<th>Posted Date</th>
<th>Skills</th>
</tr>
<%
Connection con=new DB().Connect();
PreparedStatement ps=con.prepareStatement("SELECT * FROM job_dataset GROUP BY payscale ");
ResultSet r=ps.executeQuery();
while(r.next()){
%>
<tr>
<td><%=r.getString("id") %></td>
<td><%=r.getString("company_name") %></td>
<td><%=r.getString("education") %></td>
<td><%=r.getString("experience") %></td>
<td><%=r.getString("industry") %></td>
<td><%=r.getString("job_id") %></td>
<td><%=r.getString("job_title") %></td>
<td><%=r.getString("job_location") %></td>
<td><%=r.getString("vaccancies") %></td>
<td><%=r.getString("payscale") %></td>
<td><%=r.getString("post_date") %></td>
<td><%=r.getString("skill") %></td>
</tr>
<%} %>
</table></div>
</center>
</div>
<!-- banner part start-->
<!-- jquery plugins here-->
<script src="js/jquery-1.12.1.min.js"></script>
<!-- popper js -->
<script src="js/popper.min.js"></script>
<!-- bootstrap js -->
<script src="js/bootstrap.min.js"></script>
<!-- easing js -->
<script src="js/jquery.nice-select.min.js"></script>
<!-- custom js -->
<script src="js/custom.js"></script>
</body>
</html>
CHAPTER 7
TESTING
UNIT TESTING
Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly and that program inputs produce valid outputs. All decision branches
and internal code flow should be validated. It is the testing of individual software units of the
application, and it is done after the completion of an individual unit and before integration.
This is structural testing that relies on knowledge of the unit's construction and is invasive.
Unit tests perform basic tests at component level and test a specific business process,
application, and/or system configuration. Unit tests ensure that each unique path of a business
process performs accurately to the documented specifications and contains clearly defined
inputs and expected results.
TEST-ID UTSO1
INTEGRATION TESTING
Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event driven and is
more concerned with the basic outcome of screens or fields. Integration tests
demonstrate that although the components were individually satisfactory, as
shown by successful unit testing, the combination of components is correct
and consistent. Integration testing is specifically aimed at exposing the
problems that arise from the combination of components.
TEST-ID ITS-101
Validation Testing
Validation testing begins at the end of integration testing, when individual components
have been exercised, the software is completely assembled as a package and interfacing
errors have been uncovered. Testing focuses on user-visible actions and user-recognizable
output from the system. By definition, validation succeeds when the software functions in a
manner that can reasonably be expected by the customer.
Test Results: All the test cases mentioned above passed successfully. No
defects were encountered.
CHAPTER 8
IMPLEMENTATION
PROBLEM FACED
Several problems were faced while building the code; file uploading in particular
was very difficult to implement. In the beginning stage, it was difficult to identify
the features and use case requirements, and then to identify the actors of the system
and their operations. Finding a way to implement the project was also very difficult.
Lesson Learnt
While developing this project I learnt many lessons. They are as follows:
Before starting the project, I should have a proper plan for it.
Do not jump into coding directly. First, the project should be analyzed
thoroughly.
Code should be kept simple while implementing a concept, aiming for better
results.
Learn more ideas and information about the project, which helps to
produce better results.
CHAPTER 9
9.1 CONCLUSION:
This work focuses on the challenge of predicting the salary offered by companies through
job posts on the web. Instead of focusing on international and multidomain web portals,
which are abundant in terms of number of posts, this work analyses job posts collected from
Tecnoempleo, an e-Recruitment website specialized in IT jobs for young people in Spain.
Domain and geographical restrictions of the website make salary prediction a challenging
task. In fact, the number of posts including an explicit indication of the salary, collected in 5
months on a daily basis, is only ≈ 4,000. Moreover, each post is retrieved as a vector of ≈
2,000 features. From a machine learning perspective, the task is difficult because of the
limited number of samples, the relatively high dimensionality and the presence of noise.
After analysing key aspects of the job market, we assess the relevance of the features that
can be used to predict salaries. Results indicate that some features, such as experience, job
stability or certain job roles (i.e., Team Leader and IT Architect), contribute significantly to
the final salary received by employees. Furthermore, we observe that posts can be arranged into
5 different skill-based profiles, namely: Back-end developer, Systems Administrator, .Net
developer, Java developer and Front-end developer. Such profiles seem to be similarly paid,
even though the demand for Back-end developers (including Java and .Net technologies) is
higher than that for the rest of the professionals. Finally, this work classifies job posts
according to the offered salary range in a noisy and example-scarce context. After collection,
features are preprocessed and the dimensionality is reduced tenfold by a customized
procedure that exploits domain knowledge. Embedded feature selection and other
state-of-the-art filter methods are not beneficial in terms of classification accuracy. We
compare several models, such as SVMs and random forests, and ensembles based on all or part
of them. Experiments show that ensembles based on decision trees generally behave better and
that a voting committee based on them leads to an accuracy of ≈ 84%.
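To make the idea of a voting committee concrete, the following minimal sketch combines the salary-range labels predicted by several models through a majority vote and measures accuracy against known labels. The class name `VotingCommittee`, the tie-breaking rule and the example labels are assumptions made for illustration; the sketch does not reproduce the experiments reported above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical majority-vote combiner for salary-range labels. */
final class VotingCommittee {

    /**
     * Returns the label chosen by most models. On a tie, the label that
     * first reached the highest count wins (an arbitrary but deterministic rule).
     */
    static String majorityVote(String[] votes) {
        Map<String, Integer> counts = new HashMap<>();
        String winner = null;
        int best = 0;
        for (String vote : votes) {
            int count = counts.merge(vote, 1, Integer::sum);
            if (count > best) {
                best = count;
                winner = vote;
            }
        }
        return winner;
    }

    /** Fraction of predictions that match the known labels. */
    static double accuracy(String[] predicted, String[] actual) {
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (predicted[i].equals(actual[i])) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }

    public static void main(String[] args) {
        // Each row holds the labels that three hypothetical models gave to one job post.
        String[][] modelVotes = {
            {"high", "low", "high"},
            {"low", "low", "medium"},
            {"medium", "medium", "medium"},
            {"low", "high", "high"}
        };
        String[] actual = {"high", "low", "medium", "low"};
        String[] combined = new String[modelVotes.length];
        for (int i = 0; i < modelVotes.length; i++) {
            combined[i] = majorityVote(modelVotes[i]);
        }
        System.out.println("accuracy = " + accuracy(combined, actual));
    }
}
```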