International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056

Volume: 04 Issue: 05 | May -2017 www.irjet.net p-ISSN: 2395-0072

Implementation of an Automated Job Recommendation System Based on Candidate Profiles
Vinay Desai, Dheeraj Bahl, Shreekumar Vibhandik, Isra Fatma
Prof. Vijay Kolekar, Dept. of Computer Engineering, K.J. College of Engg & Mgt. Research, Pune, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Faced with the enormous amount of recruiting information on the Internet, a job seeker can spend hours finding the useful postings. To reduce this laborious work, we design and implement a recommendation system for online job hunting. In this paper, we compare user-based and item-based collaborative filtering algorithms and choose the better-performing one. We also take background information into consideration, including students' resumes and the details of recruiting information, and bring the weights of co-apply users (the users who had applied for the candidate jobs) and the weights of students' used-liked jobs into the recommendation algorithm. Finally, the proposed model is verified through experiments on actual data. The recommended results achieve higher precision and recall, and they are more relevant to users' earlier preferences.

Key Words - recommendation system; item-based collaborative filtering; content-based filtering; Vector Space Model (VSM); Mahout

1. INTRODUCTION

The increasing usage of the Internet has heightened the need for online job hunting. According to Jobsite's report 2014, 68% of online jobseekers are college graduates or post graduates. The key problem is that most job-hunting websites just display recruitment information to website viewers. Students have to search through all the information to find jobs they want to apply for. The whole procedure is tedious and inefficient. An easy-to-use job recommendation system gives everyone a fair chance and saves a lot of time and money on both the industry side and the job seeker's side. Moreover, because each candidate gets a fair chance to prove their talent in the real world, the system is considerably more efficient. The basic aim of every algorithm in use today, whether traditional or hybrid, is to provide a suitable job that the user actually seeks.

1.1 LITERATURE SURVEY

There are many algorithms to help a seeker find the right job: some are traditional, some are newly developed, and a large number are hybrids that combine several algorithms. All of these algorithms share one goal, which is to find the right job for the candidate.

CF is a popular recommendation algorithm that bases its predictions and recommendations on the ratings or behaviour of other users of the system. It also draws on Information Retrieval (IR), a set of techniques aimed at returning the most accurate and relevant results without compromising efficiency.

1.2 RELATED WORK

A. Recommendation Algorithms

1) Content-based filtering (CBF):
In content-based methods, features of items are abstracted and compared with a profile of the user's preferences. In other words, this algorithm tries to recommend items that are similar to those that a user liked in the past. It is widely applied in information retrieval (IR). However, it performs badly in multimedia fields such as music or movie recommendation, because it is sometimes hard to extract item attributes and obtain the user's preferences.

2) Collaborative Filtering (CF):
CF is a popular recommendation algorithm that bases its predictions and recommendations on the ratings or behavior of other users in the system. There are two basic types: User-based CF and Item-based CF.

• User-based CF: find other users whose past rating behavior is similar to that of the current user and use their ratings on other items to predict what the current user will like. User-based CF is a fairly simple technique: it examines the past interests of users and, based on them, predicts which jobs a candidate is likely to apply for.

• Item-based CF: rather than using similarities between users' rating behavior to predict preferences, item-based CF uses similarities between the rating patterns of items. Since finding similar items is easier than finding similar users, and attributes of items are more stable than users' preferences, item-based methods are suitable for off-line computing.

The preferred outcome of either method is to provide a suitable job. There are some drawbacks in

© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1018

Collaborative Filtering as a whole. Collaborative Filtering approaches often suffer from three problems: cold start, scalability and sparsity. These drawbacks can be serious, and crucial opportunities can be missed because of them.

B. Methods of Similarity Calculation

1) Cosine Similarity
Cosine similarity uses the cosine of the angle between two N-dimensional vectors to indicate the degree of similarity between them. It is widely used in information retrieval (IR).

2) Tanimoto Coefficient
The Tanimoto coefficient, also known as the Jaccard index, measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets.

$$\mathrm{Jaccard}(X, Y) = \frac{|X \cap Y|}{|X \cup Y|} \qquad (2)$$

3) Log Likelihood
Similar to the Tanimoto coefficient, the Log likelihood method calculates similarity based on the common preferences two users share. Given the total number of items and the number of items each user rated, the final result reflects how unlikely it is that the two users share such common preferences by chance.

4) The City Block Distance
The city block distance is the sum of the lengths of the projections of the line segment between the points onto the coordinate axes.

$$D(x, y) = \sum_{i=1}^{n} |x_i - y_i| \qquad (3)$$

2. RECOMMENDATION SYSTEM OF STUDENTS JOB HUNTING (SJH)

A. Procedure of SJH Recommendation

There are four steps in our system, as Fig. 1 shows:

• Data Preprocessing: In this step, we clean the raw data to filter out useless data, including inactive users and expired recruiting information.

Figure. 1. Procedure of SJH recommendation

B. Item-based CF Deals with Boolean Data:

Rather than using similarities between users' rating behavior to predict preferences, item-based CF uses similarities between the rating patterns of items. Since finding similar items is easier than finding similar users, and attributes of items are more stable than users' preferences, item-based methods are suitable for off-line computing. The preferred outcome of either method is to provide a suitable job.

The procedure is presented below:

for each job_i that user_i applied {
    for each co-applied user_j who applied job_i {
        find out jobs that user_j applied;
        add these jobs to candidate set;
    }
    delete job_i from candidate set;
}

IV. EXPERIMENTS

In this section, the user-based and item-based CF algorithms are each tested on our data set. Then item-based CF, the better-performing one, is applied to the Student Job Hunting recommendation system. Finally, we evaluate the performance of the improved recommender, which applies used-liked job weights and co-apply user weights on top of the item-based algorithm. The implementation of our experiments is based on Apache Mahout.

Fig. 2 shows that all three user-based algorithms and the item-based algorithm using Log likelihood similarity reached higher precision and recall than the other two algorithms.

Under this circumstance, we continue to evaluate these four methods against other variables, for example the neighborhood size.
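As a rough illustration of the candidate-set procedure and of two of the Boolean-data similarity measures described above (the Tanimoto coefficient and the city block distance), the following Python sketch may help. It is not the Mahout implementation used in the experiments. The function names, the toy data layout (a dictionary from user to the set of jobs applied for) and the scoring rule in `rank_candidates` (summing Tanimoto similarities against the jobs already applied for) are assumptions made for illustration only. The sketch also removes every job the user already applied for from the candidate set, which is slightly stronger than the per-iteration deletion in the pseudocode.

```python
from collections import defaultdict


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient: |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def city_block(x: list, y: list) -> int:
    """City block distance: sum of |x_i - y_i| over all coordinates."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


def applicants_by_job(applications: dict) -> dict:
    """Invert the records: job -> set of users who applied for it."""
    applicants = defaultdict(set)
    for user, jobs in applications.items():
        for job in jobs:
            applicants[job].add(user)
    return applicants


def candidate_jobs(applications: dict, user: str) -> set:
    """Collect the jobs applied for by users who co-applied with `user`."""
    applied = applications.get(user, set())
    applicants = applicants_by_job(applications)
    candidates = set()
    for job in applied:
        for co_user in applicants[job] - {user}:
            candidates |= applications[co_user]
    # Assumption: drop every job the user has already applied for.
    return candidates - applied


def rank_candidates(applications: dict, user: str) -> list:
    """Rank candidates by summed Tanimoto similarity (hypothetical scoring)."""
    applied = applications.get(user, set())
    applicants = applicants_by_job(applications)
    scores = {
        cand: sum(tanimoto(applicants[cand], applicants[job]) for job in applied)
        for cand in candidate_jobs(applications, user)
    }
    # Highest score first; ties broken by job id for a stable order.
    return sorted(scores, key=lambda job: (-scores[job], job))


if __name__ == "__main__":
    # Toy application records, invented for this example.
    toy = {
        "u1": {"j1", "j2"},
        "u2": {"j1", "j3"},
        "u3": {"j2", "j3", "j4"},
        "u4": {"j5"},
    }
    print(candidate_jobs(toy, "u1"))
    print(rank_candidates(toy, "u1"))
```

Working on sets of applicants rather than rating vectors matches the Boolean nature of the application records: a user either applied for a job or did not.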

C. Evaluation Of User-based CF

1) Contrast of different similarities

A user's preference for a job takes a value in {0, 1} in the job application records, and Log likelihood, City Block and Tanimoto are three similarity methods suitable for Boolean data. In this experiment we therefore recommended three items and measured precision and recall with each of these three methods. We chose different neighborhood sizes to reduce the influence of this parameter.

TABLE I. PRECISION OF USER-BASED CF WITH DIFFERENT SIMILARITY METHODS

Figure. 3. Precision of user-based CF with different similarity methods

TABLE II. RECALL OF USER-BASED CF WITH DIFFERENT SIMILARITY

Figure. 4. Recall of user-based CF with different similarity methods

It should be noted that the GenericRecommenderIRStatsEvaluator that Mahout offers focuses only on users who received recommended items, and automatically ignores users for whom nothing can be recommended. As a result, the evaluation results appear very good. In the next section we test the recommendation results manually.

D. Evaluation of Item-based CF

According to the result of Section IV.C, the item-based algorithm performed well when using the Log likelihood similarity method. So we decided to use item-based CF and selected the Log likelihood method to compute candidate items' similarities in the Student Job Hunting recommendation system.

1) The performance of improved recommender:

We evaluate the capability of the original item-based recommender and of the improved recommender, which takes co-apply users' weight and used-liked jobs' weight into account, when recommending two or three jobs for each student. Here r_num stands for the number of recommended items. The results are recorded in Table V and Table VI.

TABLE III. PERFORMANCE OF IMPROVED RECOMMENDER IN SJH SYSTEM (R_NUM=3)
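The precision, recall and F1 scores used in this section can be illustrated with a minimal Python sketch. This is not Mahout's evaluator. The function name, the toy job identifiers and the choice to divide precision by the number of items actually returned are assumptions made for illustration. The example also shows why the cut-off matters when two or three jobs are recommended: re-ordering a fixed list of three jobs cannot change the top-3 scores, but it can change the top-2 scores.

```python
def precision_recall_f1(recommended: list, relevant: set, k: int) -> tuple:
    """Precision, recall and F1 of the top-k recommended items.

    `recommended` is a ranked list (best first); `relevant` is the set of
    items the user is known to be interested in.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Assumption: precision is measured over the items actually returned.
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Invented data: the user is actually interested in job2 and job3.
    relevant = {"job2", "job3"}
    original = ["job1", "job2", "job3"]   # ranking before re-ordering
    reranked = ["job2", "job3", "job1"]   # same three jobs, new order

    for k in (3, 2):
        print(k, precision_recall_f1(original, relevant, k))
        print(k, precision_recall_f1(reranked, relevant, k))
```

With a cut-off of three, both rankings contain the same jobs and score identically. With a cut-off of two, only the re-ordered list places both relevant jobs inside the cut-off.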


Figure. 5. Performance of improved recommender in SJH system (r_num=3)

TABLE VI. PERFORMANCE OF IMPROVED RECOMMENDER IN SJH SYSTEM (R_NUM=2)

As Fig. 5 shows, when recommending three jobs for each student, the improved recommender brought a slight improvement in precision, recall and F1 score. When the number of recommended items dropped to two, as Fig. 6 shows, all three scores increased significantly, while the Reach Rate remained as before. Because of the sparseness of our application records dataset, the recommender can offer only 3 or 4 recommended results for some students. To evaluate the recommender's overall performance, we therefore considered three and two as the number of recommended items. If user U has three recommended jobs (job 1, job 2, job 3), the improved recommender merely re-ranks the three jobs recommended by the traditional recommender, so it has no influence on precision, recall and F1 score. However, when evaluating the top 2 recommended jobs, the improved recommender changed the order of these three jobs, so the top 2 jobs differ from the former ones. The increased scores suggest that the improved recommender works well, because the jobs that take precedence (1st and 2nd recommended jobs) are better than the latter ones (3rd recommended job).

V. CONCLUSION

On the basis of this study of various techniques, and after implementing the algorithms, we selected the CF-based algorithm for its better performance and overall characteristics. Of course, many improvements remain, and hybrid algorithms need to be implemented alongside the CF algorithm. To further optimize the recommendation system and improve its performance, the sparsity of user profiles must be kept in check, and methods for filling in the user's preference matrix can be utilized.
