Irjet V4i5343
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1018
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 05 | May -2017 www.irjet.net p-ISSN: 2395-0072
Collaborative Filtering as a whole aspect. Collaborative Filtering approaches often suffer from three problems: cold start, scalability and sparsity. These three drawbacks can be serious, and crucial opportunities can be missed because of them.

B. Methods of Similarity Calculation
1) Cosine Similarity
Cosine similarity uses the cosine of the angle between two N-dimensional vectors to indicate the degree of similarity between them. It is widely used in information retrieval (IR).
2) Tanimoto Coefficient
The Tanimoto coefficient, also known as the Jaccard index, measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets.
$$\mathrm{Jaccard}(X, Y) = \frac{|X \cap Y|}{|X \cup Y|} \qquad (2)$$
3) Log Likelihood
Similar to the Tanimoto coefficient, the Log likelihood method calculates similarity based on the preferences two users have in common. Given the total number of items and the number of items each user rated, the final result expresses how unlikely it is that the two users share such common preferences by chance.
4) The City Block Distance
The city block distance is the sum of the lengths of the projections of the line segment between the points onto the coordinate axes.
$$D(x, y) = \sum_{i=1}^{n} |x_i - y_i| \qquad (3)$$

2. RECOMMENDATION SYSTEM OF STUDENTS JOB HUNTING (SJH)
Data Preprocessing: In this step, we clean the raw data to filter out useless data, including inactive users and expired recruiting information.
B. Item-based CF Deals with Boolean Data:
Rather than using similarities between users' rating behavior to predict preferences, item-based CF uses similarities between the rating patterns of items. Since finding similar items is easier than finding similar users, and attributes of items are more stable than users' preferences, item-based methods are suitable for off-line computing. The goal of either method is to recommend a suitable job.
The procedure is presented below:
for each job_i that user_i applied to {
    for each co-applied user_j who applied to job_i {
        find the jobs that user_j applied to;
        add these jobs to the candidate set;
    }
    delete job_i from the candidate set;
}

IV. EXPERIMENTS
In this section, user-based and item-based CF algorithms are each tested on our data set. Item-based CF, which performed better, is then applied to the Student Job Hunting recommendation system. Finally, we evaluate the performance of the improved recommender, which uses used-liked job and co-apply user weights on top of the item-based algorithm. The implementation of our experiments is based on Apache Mahout.
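As an illustration of the similarity measures and the candidate-generation procedure described above, the following is a minimal Python sketch, not the Mahout-based implementation used in our experiments. The function names, the `applications` dictionary (user to set of applied jobs) and the convention of returning 0.0 for empty inputs are all assumptions made for this example.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumed convention for all-zero vectors
    return dot / (norm_x * norm_y)


def tanimoto(x, y):
    """Tanimoto (Jaccard) coefficient of two sets: |X n Y| / |X u Y|."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(x & y) / len(union)


def city_block(x, y):
    """City block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def candidate_jobs(user, applications):
    """Collect jobs applied to by users who co-applied with `user`.

    `applications` is a hypothetical mapping from user id to the set of
    job ids that user applied to (Boolean data: applied or not).
    """
    applied = applications.get(user, set())
    candidates = set()
    for job in applied:
        for other, other_jobs in applications.items():
            # A co-applied user is any other user who applied to this job.
            if other != user and job in other_jobs:
                candidates |= other_jobs
    # Jobs the user already applied to are removed from the candidates.
    return candidates - applied


applications = {
    "u1": {"j1", "j2"},
    "u2": {"j2", "j3"},
    "u3": {"j4"},
}
print(candidate_jobs("u1", applications))
```

In this toy data, `u2` is the only user who co-applied with `u1` (both applied to `j2`), so the only job from `u2` that `u1` has not yet applied to becomes a candidate. The candidates could then be ranked with any of the similarity functions.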
C. Evaluation of User-based CF
1) Contrast of different similarities
TABLE II. RECALL OF USER-BASED CF WITH DIFFERENT SIMILARITY

D. Evaluation of Item-based CF
According to the results of Section IV.C, the item-based algorithm performed well when using the Log likelihood similarity method. We therefore decided to use item-based CF and selected the Log likelihood method to compute candidate items' similarities in the Student Job Hunting recommendation system.
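To make the Log likelihood similarity between two items concrete, the sketch below computes a log-likelihood ratio from a 2x2 contingency table of users (applied to both jobs, to only one of them, or to neither). It is a hedged illustration and not the exact code used in our experiments: the function names are hypothetical, and mapping the ratio into [0, 1) with 1 - 1/(1 + LLR) is an assumption of this example.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a list of counts: N ln N - sum(k ln k)."""
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """Log-likelihood ratio of a 2x2 contingency table.

    k11: users who applied to both jobs
    k12: users who applied to job A only
    k21: users who applied to job B only
    k22: users who applied to neither job
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb small negative floating-point errors.
    return max(0.0, 2.0 * (row + col - mat))


def llr_similarity(users_a, users_b, total_users):
    """Similarity of two jobs from the sets of users who applied to each.

    `total_users` must be at least the number of distinct users in the
    two sets. The mapping 1 - 1/(1 + LLR) is an assumed normalization.
    """
    a, b = set(users_a), set(users_b)
    k11 = len(a & b)
    k12 = len(a) - k11
    k21 = len(b) - k11
    k22 = total_users - k11 - k12 - k21
    llr = log_likelihood_ratio(k11, k12, k21, k22)
    return 1.0 - 1.0 / (1.0 + llr)


same = llr_similarity({"u1", "u2"}, {"u1", "u2"}, total_users=4)
partial = llr_similarity({"u1", "u2"}, {"u2", "u3"}, total_users=4)
print(same, partial)
```

Two jobs with identical applicant sets score higher than two jobs whose overlap is what chance alone would produce, which is the behavior the Log likelihood method relies on.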
V. CONCLUSION