A Customized Hiring Process: Problem Statement

A customized hiring process
INTRODUCTION
Many companies compete for talent in the market, and many training sites help candidates, both
freshers and experienced professionals, prepare for and find a suitable role in the industry. On
the hiring side, a company may have screened dozens of applicants and vetted a select few through
multiple stages of its hiring process, until it is down to the final two candidates. Interviewing
the first candidate is like playing a great game of tennis: you serve the question and she smashes
it right back with a well-crafted answer. At times the conversation is like the perfect rally, and
you cannot fault her game. The second candidate looks great on paper, but she is stumbling and
struggling to find her feet. She is just not giving you any kind of game. You think she can do the
job, but she is not convincing you.
Problem Statement
As new technologies evolve day by day, human resourcing faces peculiar challenges in meeting
requirements that differ from client to client. The same set of resumes for the same job
description (JD) does not work for all clients, because every organization reads a resume from a
different point of view. For serious organizations, matching skills and experience alone is no
longer enough. For example, some companies weigh domain expertise, while others give more
importance to the number of skills and the total years of professional experience. Human Resource
(HR) agencies use various head-hunting tools and online search methods, which are connected to
databases of millions of resumes.
There is no portal where candidates can take tests and find out what kind of company suits them.
Hence this project makes an effort to judge the candidate's capabilities and find the cluster for
the candidate based on answer analysis using a KNN classifier.
METHODOLOGY
Fig: System Architecture (components: Angular/Ext JS view with JSP; Tomcat web container;
middleware controlling layer; delegate layer; services for registration, login, question creation,
test analysis, KNN classification, company list preparation for each cluster, candidate preparation
links, recommendations, resume upload, and classification and search of resumes; data layer using
MySQL)
Angular/Ext JS View With JSP
This module generates the front-end views using the Angular and Ext JS frameworks along with
JavaServer Pages (JSP).
TOMCAT Web Container
Many servers are available for handling web requests. Many of them are heavyweight, and several
are commercial. This project uses the open-source, lightweight Apache Tomcat server.
Middle Ware – Controlling layer
This module handles each web request and forwards it to the authentication layer. It also performs
basic validations such as empty checks and regex validations. If any validation fails, an error
response is sent to the front end; otherwise the request is forwarded to the authentication layer
and the respective services.
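The empty checks and regex validations described above could be sketched as follows. This is only
an illustration: the class name RequestValidator, the error messages, and both patterns (a
deliberately simple email shape and a 10-digit phone number) are assumptions, not rules stated in
this document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical controlling-layer helper; all names and rules here are illustrative. */
final class RequestValidator {

    // Assumed email shape: something@something.something (not a full RFC 5322 check).
    private static final Pattern EMAIL = Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");
    // Assumed phone rule: exactly 10 digits.
    private static final Pattern PHONE = Pattern.compile("^\\d{10}$");

    private RequestValidator() {}

    /** Returns error messages; an empty list means the request may be forwarded. */
    static List<String> validateRegistration(String name, String email, String phone) {
        List<String> errors = new ArrayList<>();
        if (isEmpty(name)) {
            errors.add("Name is required");
        }
        if (isEmpty(email)) {
            errors.add("Email Id is required");
        } else if (!EMAIL.matcher(email.trim()).matches()) {
            errors.add("Email Id is not valid");
        }
        if (isEmpty(phone)) {
            errors.add("Phone No is required");
        } else if (!PHONE.matcher(phone.trim()).matches()) {
            errors.add("Phone No must be 10 digits");
        }
        return errors;
    }

    // Empty check: null, empty, or whitespace only.
    private static boolean isEmpty(String value) {
        return value == null || value.trim().isEmpty();
    }
}
```

A controller would send the returned messages back to the front end when the list is non-empty and
forward the request otherwise.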
Delegate Layer
This layer calls the respective service to perform a specific task.
Registration Service
This module allows the user to register with the application by providing details such as Name,
Password, Confirm Password, Email Id, Gender, Phone No, City, State and Country.
Login Service
This module allows the user to provide a username and password and log in to the application
either as an administrator or as a user.
Question Creation Service
This module is used by an administrator to create a set of questions. Each question has the
following information:
1) Type, 2) Question Description, 3) Answer1, 4) Answer2, 5) Answer3, 6) Answer4,
7) Rating1, 8) Rating2, 9) Rating3, 10) Rating4 and 11) QuestionID
Test Analysis
This module presents the user with aptitude, technical and general questions, analyzes all the
answers, and generates a matrix that acts as the input to the KNN classification algorithm.
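One way to turn the answers into KNN input is to sum, per question type, the rating attached to
the option the candidate chose. The sketch below assumes exactly that; the class AnswerAnalysis,
its nested types, and the choice of one total per type are illustrative assumptions, since the
document does not specify the layout of the matrix.

```java
import java.util.List;

/** Hypothetical sketch: one rating total per question type forms the feature vector. */
final class AnswerAnalysis {

    /** Question types named in the text. */
    enum Type { APTITUDE, TECHNICAL, GENERAL }

    /** One answered question: its type, Rating1..Rating4, and the chosen option. */
    static final class AnsweredQuestion {
        final Type type;
        final int[] ratings; // ratings of Answer1..Answer4, in order
        final int chosen;    // 0-based index of the answer the candidate picked

        AnsweredQuestion(Type type, int[] ratings, int chosen) {
            this.type = type;
            this.ratings = ratings;
            this.chosen = chosen;
        }
    }

    private AnswerAnalysis() {}

    /** Returns {aptitude total, technical total, general total}. */
    static double[] featureVector(List<AnsweredQuestion> answers) {
        double[] totals = new double[Type.values().length];
        for (AnsweredQuestion answer : answers) {
            totals[answer.type.ordinal()] += answer.ratings[answer.chosen];
        }
        return totals;
    }
}
```

Stacking one such vector per candidate gives a matrix with one row per candidate and one column
per question type.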
KNN Classification
This module takes the total rating of the answers, counts the nearest neighbors across the three
clusters (Norm Package, Medium Package and High Package), and predicts which kind of package the
candidate is suitable for.
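The neighbor counting described above can be illustrated with a small k-nearest-neighbors sketch.
The choice of Euclidean distance, the tie-breaking rule, and the names KnnClassifier and Sample
are assumptions made for illustration; the document does not state which distance measure or
value of k the project uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical KNN sketch over rating vectors labeled Norm, Medium or High. */
final class KnnClassifier {

    /** A labeled training point: rating totals plus the cluster it belongs to. */
    static final class Sample {
        final double[] features;
        final String cluster;

        Sample(double[] features, String cluster) {
            this.features = features;
            this.cluster = cluster;
        }
    }

    private KnnClassifier() {}

    /** Predicts the cluster that holds the most of the k nearest training samples. */
    static String classify(List<Sample> training, final double[] query, int k) {
        if (training.isEmpty() || k <= 0) {
            throw new IllegalArgumentException("need training data and k > 0");
        }
        // Sort a copy of the training data by distance to the query, nearest first.
        List<Sample> byDistance = new ArrayList<>(training);
        byDistance.sort(Comparator.comparingDouble((Sample s) -> distance(s.features, query)));

        // Count votes among the k nearest; on a tie, the cluster that reached
        // the winning count first (among nearer neighbors) is kept.
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        int limit = Math.min(k, byDistance.size());
        for (int i = 0; i < limit; i++) {
            String cluster = byDistance.get(i).cluster;
            int count = votes.merge(cluster, 1, Integer::sum);
            if (count > bestVotes) {
                bestVotes = count;
                best = cluster;
            }
        }
        return best;
    }

    /** Euclidean distance (assumed metric) between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

An odd k reduces the chance of tied votes, although with three clusters ties remain possible.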
Prepare Company List for Cluster
This module creates company information such as company name, company URL and company
description, along with the cluster (Norm, Medium or High), so that after KNN classification the
list of companies can be provided based on the cluster.
Candidate Preparation Link
This module provides links that help candidates prepare for the aptitude, technical and general
questions. Each preparation item contains a Name, Link and Category.
Recommendations Service
This module recommends the companies that are most suitable for the candidate.
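Given a predicted cluster, the recommendation step can be as simple as filtering the stored
company list by that cluster. The sketch below assumes this; the class CompanyRecommender and the
Company fields are hypothetical names modeled on the company information listed earlier.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: recommend the companies stored under the predicted cluster. */
final class CompanyRecommender {

    /** Company record with the fields named in the text plus its cluster. */
    static final class Company {
        final String name;
        final String url;
        final String description;
        final String cluster; // "Norm", "Medium" or "High"

        Company(String name, String url, String description, String cluster) {
            this.name = name;
            this.url = url;
            this.description = description;
            this.cluster = cluster;
        }
    }

    private CompanyRecommender() {}

    /** Returns the companies whose cluster matches the prediction, in stored order. */
    static List<Company> recommend(List<Company> companies, String predictedCluster) {
        List<Company> matches = new ArrayList<>();
        for (Company company : companies) {
            if (company.cluster.equalsIgnoreCase(predictedCluster)) {
                matches.add(company);
            }
        }
        return matches;
    }
}
```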
Resume Upload
The candidate will be able to upload a resume.
Data Cleaning of Resumes
The data cleaning algorithm removes stop words from each resume. Stop words are words that carry
no specific meaning on their own. Standard stop-word lists used in data mining include words such
as a, able, about, across, after, all, almost, also, am, among and an.
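A minimal sketch of stop-word removal follows. It uses only the example words quoted above as its
list, which is an assumption made to keep the example short; a real list would be much longer.
The class name StopWordFilter is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical data-cleaning sketch: drop stop words from whitespace-separated text. */
final class StopWordFilter {

    // Only the example words quoted in the text; a production list would be longer.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "able", "about", "across", "after", "all",
            "almost", "also", "am", "among", "an"));

    private StopWordFilter() {}

    /** Returns the text with stop words removed and single spaces between words. */
    static String clean(String text) {
        StringBuilder out = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Compare case-insensitively; skip empty pieces and stop words.
            if (word.isEmpty() || STOP_WORDS.contains(word.toLowerCase(Locale.ROOT))) {
                continue;
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word);
        }
        return out.toString();
    }
}
```

Words with attached punctuation (for example "all,") are not matched by this sketch; stripping
punctuation first would handle them.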
Tokenization of Resumes
Tokenization is the process of converting the cleaned data into a set of words known as tokens.
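As an illustration, tokenization can be done by lowercasing the text and splitting on characters
that cannot be part of a word. The rule used below (keep letters, digits, '+' and '#' so that
skill names such as C++ and C# survive) and the class name ResumeTokenizer are assumptions, not
the project's stated rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical tokenizer sketch for cleaned resume text. */
final class ResumeTokenizer {

    private ResumeTokenizer() {}

    /** Splits text into lowercase tokens made of letters, digits, '+' and '#'. */
    static List<String> tokenize(String cleanText) {
        List<String> tokens = new ArrayList<>();
        String lower = cleanText.toLowerCase(Locale.ROOT);
        for (String part : lower.split("[^a-z0-9+#]+")) {
            // A leading delimiter yields one empty piece; skip it.
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }
}
```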
Frequency Computation of Resumes
This step computes token frequencies for each resume. The frequency is the number of times the
i-th token appears in the j-th resume.
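The per-resume frequency count can be sketched with a map from token to count. The class name
TermFrequency is hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: count how often each token occurs in one resume. */
final class TermFrequency {

    private TermFrequency() {}

    /** Returns token -> number of occurrences in the given token list. */
    static Map<String, Integer> count(List<String> tokens) {
        Map<String, Integer> frequency = new HashMap<>();
        for (String token : tokens) {
            // Insert 1 for a new token, otherwise add 1 to the stored count.
            frequency.merge(token, 1, Integer::sum);
        }
        return frequency;
    }
}
```

Calling count once per resume gives one map per resume, which together hold the frequency of
every token in every resume.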
TF-IDF Computation of Resumes
This module computes the inverse document frequency (IDF) of each token from the total number of
resumes and the number of resumes that contain the token, and combines it with the token's
frequency (TF) in each resume.
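A small sketch of the computation follows. It assumes one common formulation, weight = tf *
ln(N / df), where tf is the token's count in the resume, N is the number of resumes and df is the
number of resumes containing the token; the document does not say which variant the project uses,
and the class name TfIdf is hypothetical.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical TF-IDF sketch using weight = tf * ln(N / df). */
final class TfIdf {

    private TfIdf() {}

    /**
     * @param token    the token to weigh
     * @param docIndex index of the resume in {@code docs}
     * @param docs     one token-count map per resume
     * @return the TF-IDF weight, or 0 when the token is absent from that resume
     */
    static double weight(String token, int docIndex, List<Map<String, Integer>> docs) {
        int tf = docs.get(docIndex).getOrDefault(token, 0);
        if (tf == 0) {
            return 0.0;
        }
        // Document frequency: number of resumes that contain the token (at least 1 here).
        int df = 0;
        for (Map<String, Integer> doc : docs) {
            if (doc.containsKey(token)) {
                df++;
            }
        }
        return tf * Math.log((double) docs.size() / df);
    }
}
```

Under this formulation a token that occurs in every resume gets weight 0, so only tokens that
distinguish one resume from the others score highly.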
Classification of Domain for Resume
This module trains a support vector machine (SVM) on a labeled training data set, using the
attribute frequencies as features. An appropriate kernel is selected, and the module then computes
the distance of each resume from the decision boundary and classifies the domain to which the
resume most likely belongs.
Ranking of Resumes
The query is divided into tokens, the frequency of those tokens across the various resumes is
found, and the resumes are ranked in descending order of that frequency.
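The ranking step can be sketched as summing, for each resume, the counts of the query tokens and
sorting by that sum. The class name ResumeRanker, the use of a plain sum as the score, and the
alphabetical tie-break are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: rank resumes by total frequency of the query tokens. */
final class ResumeRanker {

    private ResumeRanker() {}

    /**
     * @param queryTokens tokens of the HR search query
     * @param resumes     resume name -> token counts for that resume
     * @return resume names, highest total query-token frequency first
     */
    static List<String> rank(List<String> queryTokens, Map<String, Map<String, Integer>> resumes) {
        final Map<String, Integer> scores = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> entry : resumes.entrySet()) {
            int score = 0;
            for (String token : queryTokens) {
                score += entry.getValue().getOrDefault(token, 0);
            }
            scores.put(entry.getKey(), score);
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            // Descending by score; equal scores fall back to name order.
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return ranked;
    }
}
```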
Hybrid Recommendations based on Association Rule Mining
This module combines multiple criteria of the resume and ranks the best resumes for
multi-attribute searches by intersecting the result sets of the various algorithms.
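The intersection step described above can be sketched as keeping only the resumes that every
algorithm returned, in the order of the first result list. This covers only the intersection, not
association rule mining itself; the class name HybridRecommender and the ordering rule are
assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: intersect the result lists produced by several algorithms. */
final class HybridRecommender {

    private HybridRecommender() {}

    /** Returns resumes present in every list, ordered as in the first list. */
    static List<String> intersect(List<List<String>> resultLists) {
        if (resultLists.isEmpty()) {
            return new ArrayList<>();
        }
        // LinkedHashSet keeps the ranking order of the first algorithm's results.
        Set<String> common = new LinkedHashSet<>(resultLists.get(0));
        for (List<String> other : resultLists.subList(1, resultLists.size())) {
            common.retainAll(new HashSet<>(other));
        }
        return new ArrayList<>(common);
    }
}
```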
Objectives
1. The first objective is to classify candidates for various companies (High, Medium and Low
Package) using the KNN machine learning algorithm.
2. The second objective is to recommend a list of companies to the candidate based on the
answer analysis.
3. The third objective is to classify resumes into testing and development profiles using
SVM.
4. The fourth objective is to give HR the capability to rank resumes on specific criteria
keywords, so that the resume that best suits the requirement ranks highest, based on a
modified feature vector.
Hardware Requirements
Sl No   Parameter    Description
1       RAM          4 GB to 8 GB
2       Hard Disk    500 GB to 1 TB
Software Requirements
Sl No   Parameter Name                           Parameter Value
1       Development Language                     Java
2       Java Development Kit Version             JDK 1.6
3       Java Runtime Environment                 JRE 6
4       Database for Routing Tables Backend      MySQL
5       Database Front End for Routing Tables    HeidiSQL
7       Development Tool                         Eclipse
8       Server Type                              Web Server
9       Web Server                               Tomcat 8.0
11      Framework Used                           Spring Framework
12      View Technology Used                     JavaServer Pages
13      Designing                                Cascading Style Sheets
End User requirements
ReqID   Requirement Name        Requirement Description
1       Registration            Responsible for registration of the user
2       Login                   Performs the login functionality and identifies the user as
                                either Admin or Customer
3       Question Creation       Responsible for creating the questions
4       Test Analysis           Performs the analysis of the test
5       Company Information     Saves the company information (company name, company URL and
                                company image URL) for each of the 3 clusters
6       Training Data           Sample data set that holds the company information for aptitude,
                                general and technical with rating and resultant companies
7       Prepare Link            Provides preparation links to help the candidate prepare
8       Resume Upload           Used to store the resume
9       Data Cleaning           Used for removal of stop words from the resume
10      Tokenization            Used for converting statements in the resume into a set of words
11      Frequency Computation   Used for removal of redundancy
12      TF-IDF                  Responsible for computing important keywords in the resume
13      Ranking                 Used by HR to rank resumes
14      Classification          Responsible for classification of resumes into QA and Development