
COURSE RECOMMENDER USING COLLABORATIVE

FILTERING

A SEMINAR REPORT

Submitted by

G JAYANDRAN [RA1811027020039]
M MUSTAFA [RA1811027020041]
A RUSHAB [RA1811027020046]
Under the guidance of

Mrs. R. SUJEETHA
(Assistant Professor, Department of Computer Science and Engineering)

In partial fulfillment for the award of the

degree of

BACHELOR OF TECHNOLOGY
in

COMPUTER SCIENCE AND ENGINEERING WITH


SPECIALIZATION IN BIG DATA ANALYTICS
of
FACULTY OF ENGINEERING AND TECHNOLOGY

SRM INSTITUTE OF SCIENCE AND TECHNOLOGY


RAMAPURAM CAMPUS, CHENNAI -600089
DECEMBER, 2020
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
(Deemed to be University u/S 3 of UGC Act, 1956)

BONAFIDE CERTIFICATE

Certified that this seminar report “COURSE RECOMMENDER USING COLLABORATIVE FILTERING”
is the bonafide work of “G. JAYANDRAN (RA1811027020039), M. MUSTAFA (RA1811027020041),
A. RUSHAB (RA1811027020046)”, who carried out the project work under my supervision.

SIGNATURE SIGNATURE

Ms. R.SUJEETHA Dr. K. Raja


SUPERVISOR HEAD OF THE DEPARTMENT

Assistant Professor, Computer Science and Engineering,

Computer Science and Engineering,


SRM Institute of Science and Technology, SRM Institute of Science and Technology,

Ramapuram Campus, Chennai. Ramapuram Campus, Chennai.

Submitted for the Viva Voce Examination held on 07-12-2020 at SRM Institute of Science
and Technology, Ramapuram Campus, Chennai - 600089

INTERNAL EXAMINER I INTERNAL EXAMINER


ABSTRACT

• In an era of advanced technology, people are more open-minded and rely heavily on modern
applications for buying accessories, watching movies, finding learning resources and meeting
other daily needs. As more and more becomes available online and in digital form,
organizations are relying on machine-learning-based technologies that help them reach their
actual target users with less effort than earlier methods of advertising. These technologies
also give users a better experience by recommending related products, so that users do not
have to turn to competitors to get the products they want. We are developing a system that
helps us understand the user's requirements and recommends products and services related to
the user's searches by comparing them with the user's previous history. The model compares
various machine learning algorithms for recommendation based on users' buying patterns and
gives more accurate results for a search. It also retains the user's buying and viewing
history, so it is easier for the model to predict the best option and recommend a product.
• Key Words: Machine Learning, Filtering, Recommendation System
TABLE OF CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
1. INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Objective
1.4 Scope of Project
2. LITERATURE REVIEW
3. SYSTEM SPECIFICATION
3.1 System Requirements
3.2 System Design
4. IMPLEMENTATION
4.1 Explanation about Project
4.2 How to do it?
4.3 Architecture Diagram
4.4 Steps for Implementing the Code
4.5 Flow Chart
4.6 Data Set
4.7 Result
5. CONCLUSION & FUTURE SCOPE
6. REFERENCES
7. APPENDIX
LIST OF FIGURES
Figure 1: - Architecture Diagram

Figure 2: - Flow Chart

Figure 3: - Login Page Interface

Figure 4: - Login Credentials

Figure 5: - After Login click “Get Recommendations”

Figure 6: - Interface to Enter Subjects

Figure 7: - After Entering Gained Knowledge Course

Figure 8: - Final Output

LIST OF TABLES

TABLE 1: - COLLABORATIVE FILTERING ADVANTAGES AND DISADVANTAGES


LIST OF ABBREVIATIONS
NumPy – Numerical Python
Tkinter – Tk Interface
GUI – Graphical User Interface
SQL – Structured Query Language
CBF – Content-Based Filtering
CF – Collaborative Filtering
ML – Machine Learning
MAE – Mean Absolute Error

CHAPTER 1
INTRODUCTION

1.1 Background
A recommender system, or recommendation system, is a subclass of information filtering
system that seeks to predict the "rating" or "preference" a user would give to an item. Such
systems are primarily used in commercial applications.

Collaborative filtering refers to user-to-user association. It follows the idea that if two
or more people have identical interests in one area, they are likely to be attracted to
similar products or items in other categories as well. Implicit and explicit user ratings are
used to compute the similarity between two or more users. Implicit ratings are derived from a
user's browsing pattern and click-through rate, whereas explicit ratings are provided directly
by the user. The "people you may know", suggested posts, similar pages you may like and
suggested pokes displayed on Facebook are examples of collaborative filtering. These
recommendations are based on features such as the number of mutual friends, similar pages
liked, the number of mutual groups, and locations a user has been to or belongs to. For
example, if two users have mutual friends, it is likely that the two know each other as well.

MongoDB is a cross-platform, document-oriented database program. Classified as a NoSQL
database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed
by MongoDB Inc. and licensed under the Server Side Public License.

Tkinter is a Python binding to the Tk GUI toolkit. It is the standard Python interface to the
Tk GUI toolkit and is Python's de facto standard GUI. Tkinter is included with standard Linux,
Microsoft Windows and Mac OS X installs of Python. The name Tkinter comes from "Tk interface".

Companies using recommender systems focus on increasing sales through highly personalized
offers and an enhanced customer experience. Recommendations typically speed up searches, make
it easier for users to access content they are interested in, and surprise them with offers
they would never have searched for. What is more, companies are able to gain and retain
customers by sending out emails with links to new offers that meet the recipients' interests,
or suggestions of films and TV shows that suit their profiles. The user starts to feel known
and understood and is more likely to buy additional products or consume more content. By
knowing what a user wants, the company gains a competitive advantage, and the threat of losing
a customer to a competitor decreases. Providing that added value to users by including
recommendations in systems and products is appealing. Furthermore, it allows companies to
position themselves ahead of their competitors and eventually increase their earnings.

1.2 Problem Statement


To recommend courses to users using collaborative filtering and to evaluate the performance of
this technique. The system recommends courses to the user based on previously gained
knowledge.

1.3 Objective
• The objective of our project is to give course recommendations using collaborative
filtering.
• The objective of recommender systems is to provide recommendations based on recorded
information on the users' preferences. These systems use information filtering techniques to
process information and provide the user with potentially more relevant items.
• A recommender system has the ability to predict whether a particular user would prefer an
item or not based on the user's profile. Recommender systems are beneficial to both service
providers and users. They reduce the transaction costs of finding and selecting items in an
online shopping environment.

1.4 Scope of the project


The system makes it easy for the user to get recommendations. It uses collaborative filtering
to suggest further courses to the user. It aims to give highly accurate recommendations and
helps the user in taking up new courses.

CHAPTER 2
LITERATURE REVIEW
2.1 LITERATURE OVERVIEW

To build a system that can automatically recommend items to users based on the preferences
of other users, the first step is to find similar users or items. The second step is to predict the
ratings of the items that are not yet rated by a user. So, you will need the answers to these
questions:

• How do you determine which users or items are similar to one another?
• Given that you know which users are similar, how do you determine the rating that a user
would give to an item based on the ratings of similar users?
• How do you measure the accuracy of the ratings you calculate?

• The first two questions don’t have single answers. Collaborative filtering is a family of
algorithms where there are multiple ways to find similar users or items and multiple ways to
calculate rating based on ratings of similar users. Depending on the choices you make, you
end up with a type of collaborative filtering approach.
• One important thing to keep in mind is that in an approach based purely on collaborative
filtering, the similarity is not calculated using factors like the age of users, genre of the
movie, or any other data about users or items. It is calculated only on the basis of the rating
(explicit or implicit) a user gives to an item.
• The third question for how to measure the accuracy of your predictions also has multiple
answers, which include error calculation techniques that can be used in many places and not
just recommenders based on collaborative filtering.
• One of the approaches to measure the accuracy of your result is the Root Mean Square
Error (RMSE), in which you predict ratings for a test dataset of user-item pairs whose rating
values are already known. The difference between the known value and the predicted value
would be the error. Square all the error values for the test set, find the average (or mean), and
then take the square root of that average to get the RMSE.
• Another metric for measuring accuracy is the Mean Absolute Error (MAE), in which you find
the magnitude of each error by taking its absolute value and then take the average of all the
error values.

<name> (2020) incorporated collaborative filtering following two approaches, the Memory-Based
approach and the Model-Based approach, to achieve an efficient filtering system for
recommending movies.
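The two error measures described above can be illustrated with a minimal sketch. The rating values below are hypothetical and are used only to show the arithmetic; they are not results from this project.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error over paired known and predicted ratings."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(actual, predicted):
    """Mean Absolute Error over paired known and predicted ratings."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)

# Hypothetical test-set ratings (already known) and the model's predictions.
known_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.0, 3.0, 4.0, 4.0]

rmse_value = rmse(known_ratings, predicted_ratings)
mae_value = mae(known_ratings, predicted_ratings)
print(f"RMSE = {rmse_value:.3f}, MAE = {mae_value:.3f}")
```

Because RMSE squares each error before averaging, a single large error raises it more than it raises MAE, which is why the two values differ for the same predictions.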
The Memory-Based approach takes a matrix of users' preferences for items and uses this matrix
to predict missing preferences and recommend the items with high predictions. Simply stated,
it covers “Item-Item CF and User-Item CF”.
The Model-Based approach, described in the paper as a KNN collaborative system, uses cosine
similarity to calculate the distance between the target movie and every other movie in the
dataset and then ranks the top ‘K’ nearest similar movies. This methodology was implemented
using two datasets.
Item-based CF was used to obtain better results. The system proposed by the paper was observed
to be more reliable and accurate than the existing system. On the other hand, the KNN approach
turned out to be faster and predictive in nature, with low calculation time.
The paper explains content-based filtering but mainly focuses on collaborative filtering and
its two popular approaches:
1. Memory-Based Approach: It takes a matrix of users' preferences for items and uses this
matrix to predict missing preferences and recommend the items with high predictions. Simply
stated, it covers “Item-Item CF and User-Item CF”.
2. Model-Based Approach: A KNN collaborative recommendation system is proposed that uses
cosine similarity to calculate the distance between one course and every other course in the
dataset and then ranks the top ‘K’ nearest similar courses.
The paper also implements the idea of the proposed system with two datasets.
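As an illustration of the cosine-similarity ranking described above, the following is a minimal sketch and not the code of the reviewed paper. The course names, the rating vectors and the helper names `cosine_similarity` and `top_k_similar` are assumptions made for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a course with no ratings is treated as similar to nothing
    return dot / (norm_u * norm_v)

def top_k_similar(target, course_ratings, k):
    """Rank every other course by similarity to the target and keep the top k."""
    scores = [
        (other, cosine_similarity(course_ratings[target], vector))
        for other, vector in course_ratings.items()
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Hypothetical data: each course maps to ratings from four users (0 = not rated).
course_ratings = {
    "Python Basics": [5, 4, 0, 1],
    "Data Analysis": [4, 5, 0, 1],
    "Web Design": [0, 1, 5, 4],
    "Statistics": [5, 3, 1, 0],
}

neighbours = top_k_similar("Python Basics", course_ratings, k=2)
for name, score in neighbours:
    print(f"{name}: {score:.3f}")
```

Courses rated highly by the same users end up with vectors that point in nearly the same direction, so their cosine similarity is close to 1.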
For years, e-learning systems have provided students and learners with virtual educational
environments in which they do not need others' assistance in the process of learning. The
major goal of such systems is to improve the education and learning levels of users by
leveraging the system's facilities. The core component of a working and efficient e-learning
system is its recommender system. Because personalization of e-learning platforms is so
important, developing recommender systems for them has become one of the most important
research areas in the field. We use hybrid recommender systems. The architecture of the
proposed e-learning system is combined with a neural-network-based matching algorithm powered
by user-specific patterns and system capabilities. The advantage of the hybrid approach is the
presence of a knowledge base that supports the decision-making process and helps to improve
it. However, hidden patterns in the system's log and in users' behaviour are not explored in
this architecture.
2.2 WHAT IS RECOMMENDATION SYSTEM & ITS TYPES
A recommendation engine is a system that suggests products, services and information to users
based on analysis of data. The recommendation can derive from a variety of factors, such as
the history of the user and the behaviour of similar users.

Recommendation systems are quickly becoming the primary way for users to be exposed to the
whole digital world through the lens of their experiences, behaviours, preferences and
interests. In a world of information density and product overload, a recommendation engine
provides an efficient way for companies to provide consumers with personalised information and
solutions.

Types:

Collaborative Recommender System:

It is one of the most sought-after, most widely implemented and most mature technologies
available in the market. Collaborative recommender systems aggregate ratings or
recommendations of objects, recognize commonalities between users on the basis of their
ratings, and generate new recommendations based on inter-user comparisons. The greatest
strength of collaborative techniques is that they are completely independent of any
machine-readable representation of the objects being recommended, and they work well for
complex objects where variations in taste are responsible for much of the variation in
preferences. Collaborative filtering is based on the assumption that people who agreed in the
past will agree in the future and that they will like objects similar to those they liked in
the past.
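The inter-user comparison described above can be sketched as a similarity-weighted average of other users' ratings. This is only one of several possible prediction rules, and the user identifiers, course names and function names below are hypothetical.

```python
import math

def user_similarity(ratings_a, ratings_b):
    """Cosine similarity computed over the courses both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[c] * ratings_b[c] for c in common)
    norm_a = math.sqrt(sum(ratings_a[c] ** 2 for c in common))
    norm_b = math.sqrt(sum(ratings_b[c] ** 2 for c in common))
    return dot / (norm_a * norm_b)

def predict_rating(user, course, all_ratings):
    """Similarity-weighted average of other users' ratings for the course."""
    weighted_sum, weight_total = 0.0, 0.0
    for other, ratings in all_ratings.items():
        if other == user or course not in ratings:
            continue
        sim = user_similarity(all_ratings[user], ratings)
        weighted_sum += sim * ratings[course]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None

# Hypothetical explicit ratings on a 1-5 scale.
all_ratings = {
    "u1": {"Python": 5, "SQL": 4},
    "u2": {"Python": 5, "SQL": 4, "Machine Learning": 5},
    "u3": {"Python": 1, "SQL": 5, "Machine Learning": 2},
}

predicted = predict_rating("u1", "Machine Learning", all_ratings)
print(f"Predicted rating for u1: {predicted:.2f}")
```

Here u1 agrees with u2 on every shared course, so u2's rating of "Machine Learning" carries more weight in the prediction than u3's.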

Content based Recommender System:

It is mainly classified as an outgrowth and continuation of information filtering research. In
this system, the objects are mainly defined by their associated features. A content-based
recommender learns a profile of the new user's interests based on the features present in the
objects the user has rated. It is essentially a keyword-specific recommender system, in which
keywords are used to describe the items. Thus, a content-based recommender system recommends
items similar to those the user has liked in the past or is currently examining.
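A minimal sketch of the keyword-based idea follows. It assumes that each course is described by a set of keywords and uses Jaccard overlap as the similarity measure; the catalogue, the keywords and the function names are hypothetical, and other measures could be used instead.

```python
def jaccard(a, b):
    """Overlap between two keyword sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_by_content(liked, catalogue, top_n=2):
    """Score courses the user has not seen by keyword overlap with the profile."""
    # The profile is the union of the keywords of everything the user liked.
    profile = set().union(*(catalogue[c] for c in liked))
    scores = {
        course: jaccard(profile, keywords)
        for course, keywords in catalogue.items()
        if course not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical catalogue: course title -> descriptive keywords.
catalogue = {
    "Intro to Python": {"python", "programming", "beginner"},
    "Data Science with Python": {"python", "statistics", "data"},
    "Watercolour Painting": {"art", "painting", "drawing"},
    "Advanced Statistics": {"statistics", "data", "probability"},
}

suggestions = recommend_by_content(["Data Science with Python"], catalogue)
print(suggestions)
```

Unlike collaborative filtering, this sketch never looks at other users' ratings; it relies only on the item descriptions.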
Demographic based Recommender System:

This system aims to categorize users based on their attributes and makes recommendations based
on demographic classes. Many industries have taken this approach because it is not complex and
is easy to implement. A demographic-based recommender system first needs proper market
research in the specified region, accompanied by a short survey to gather data for
categorization. Demographic techniques form “people-to-people” correlations like collaborative
ones, but use different data. The benefit of a demographic approach is that it does not
require a history of user ratings of the kind needed by collaborative and content-based
recommender systems.

2.3 CHALLENGES AND ISSUES OF RECOMMENDATION SYSTEM


The following challenges and issues are found in recommender systems.
• Changing user preferences: A recommender system is mainly based on the user's interests and
profile. Users' interests and preferences change over time, so keeping up with changing user
preferences is one of the main challenges in recommender systems.
• Sparsity: There are large numbers of users and items, but most users rate just a few items.
Recommender techniques create neighborhoods of users from profiles generated from their
interests. If a user has rated only a small number of items, it is hard to determine his/her
taste, and he/she could be assigned to the wrong neighborhood. Sparsity is the problem of
having too little information.
• Scalability: As the numbers of users and items grow, the system needs more resources for
processing information and giving recommendations.
• Synonymy: Synonymy is the tendency of very similar items to have different names or entries.
Most recommender systems find it hard to handle closely related items that are named
differently, such as "baby wear" and "baby cloth".
• Privacy: To give the most accurate recommendations, the system must acquire as much
information as possible about the user, including demographic data and data about the
location of a particular user.
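The sparsity issue listed above can be quantified as the fraction of user-item pairs that have no rating. The sketch below uses a small hypothetical set of ratings; the identifiers are placeholders.

```python
def sparsity(ratings, n_users, n_items):
    """Fraction of user-item pairs that have no rating."""
    n_rated = sum(len(items) for items in ratings.values())
    return 1.0 - n_rated / (n_users * n_items)

# Hypothetical ratings: 3 users, 4 courses, only 4 of the 12 possible ratings present.
ratings = {
    "u1": {"c1": 5, "c2": 3},
    "u2": {"c3": 4},
    "u3": {"c1": 2},
}

value = sparsity(ratings, n_users=3, n_items=4)
print(f"Sparsity: {value:.1%}")
```

A value close to 1 means that most users have rated very few items, which is the situation in which neighborhoods become unreliable.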

2.4 WHAT IS MACHINE LEARNING AND ITS TYPES


Widely used learning algorithms deal with the following major classes of problems:
A. Classification
This deals with assigning a class to each of the analyzed items. An example of such a task is
image recognition.
B. Regression
This is the estimation of a real value for an object. A good example of such a task is
estimating the value of bonds on the basis of economic variables.
C. Ranking
This is responsible for sorting items according to a specific criterion. The most popular task
of this type is a search engine returning websites that satisfy the query sent by the user.
D. Clustering
This deals with dividing objects into homogeneous groups. Such algorithms can be used in
social networks to identify smaller social groups within a large number of people.
E. Dimensionality reduction
This is the transformation of the original representation of the examined object into a
representation with fewer dimensions while preserving as much of the original information as
possible. Over time, the amount of data becomes too large for efficient interpretation.
Therefore, by discovering the correlation between several variables, they can be reduced to
one variable.

Depending on the availability of training data and test data and on how learning is evaluated,
the following types of machine learning algorithms can be distinguished:
a) Supervised learning
In supervised learning (learning with a teacher), the algorithm receives training data in
which the output value for each input is known. This is one of the most popular learning
methods.
b) Unsupervised learning
In contrast to supervised learning, the algorithm receives training data that does not specify
which output value should be obtained from the input data. In this scenario, assessing how
well the algorithm has learned from the training data can be troublesome.
c) Semi-supervised learning
In semi-supervised learning, the training data consists of samples that have the expected
output value as well as samples that do not. This method is popular when the input data is
easy to obtain but the output data is much more expensive.
d) Reinforcement learning
The training and testing phases are combined in the reinforcement approach. The learning
algorithm collects data by interacting with the environment and, depending on the action
taken, receives a reward or a penalty. The purpose of this method is to maximize the reward
received by the algorithm.

TABLE 1: - COLLABORATIVE FILTERING ADVANTAGES AND DISADVANTAGES

Collaborative Filtering Techniques | Memory Based Filtering | Model Based Filtering | Hybrid Filtering
Advantage | Easy implementation | Better sparsity and scalability, improved recommendation performance | Improved sparsity
Disadvantage | Performance decreases when data are sparse; requires a lot of memory and CPU time | Loses useful information through the dimensionality reduction technique; expensive | Increased complexity; expensive to implement

DESCRIPTION
The collaborative filtering technique has several advantages over other techniques.
Its main advantage over the content-based filtering technique is that it improves
the performance of the recommender system and gives better recommendations,
because it also considers the interests and past history of other similar users
when giving recommendations/suggestions to the user.

CHAPTER 3
SYSTEM SPECIFICATION
3.1 System Requirements
To accomplish the project, we required:
Anaconda Navigator
Python 3.6
NumPy Library
PyAutoGUI Library

3.2 System Design


The project was designed so that users can easily get course recommendations based on their
interests. It uses collaborative filtering, which takes users' reviews and suggests courses to
other users.
Compatibility: Windows 7 and above
RAM: 4 GB
Memory: 100 MB

CHAPTER 4
IMPLEMENTATION
4.1 Explanation About Project

This project aims to develop a tool for recommending courses to users based on their
interests. People like to get recommendations easily and without difficulty. For that purpose,
instead of relying on the internet or on other people's references, we built a recommendation
system using collaborative filtering which gives users a better choice of what to study. By
using machine learning techniques and algorithms, the system takes the user's inputs and
recommends further courses. First, the user gives input based on his or her knowledge; the
system then determines the best courses to learn next and presents them to the user as
recommendations. This system uses collaborative filtering.
LIBRARIES
To build an active data filtering system of any type, certain support packages are needed. The
methodology that we propose in this project uses the Python programming language and its
libraries as the backbone. The libraries need to be installed using pip from the terminal, and
the required libraries are:

1. NumPy
NumPy is a low-level library written in C (and Fortran) for high level mathematical
functions. NumPy cleverly overcomes the problem of running slower algorithms on
Python by using multidimensional arrays and functions that operate on arrays. Any
algorithm can then be expressed as a function on arrays, allowing the algorithms to be
run quickly.

2. SciPy
It is a library that uses NumPy for more mathematical functions. SciPy uses NumPy
arrays as the basic data structure, and comes with modules for various commonly used
tasks in scientific programming, including linear algebra, integration (calculus),
ordinary differential equation solving, and signal processing.

3. Pandas
Pandas is a data manipulation library based on NumPy which provides many useful
functions for accessing, indexing, merging, and grouping data easily. The main data
structure (DataFrame) is close to what can be found in the R statistical package;
that is, heterogeneous data tables with name indexing, time series operations, and
auto-alignment of data.

4.2 How to do it?
• Open the application.

• Create a new account and log in.

• Enter the previously studied subjects.

• Click on the get recommendation button.

• Get suggestions.

4.3 ARCHITECTURE DIAGRAM

Figure 1 :- Architecture Diagram


4.4 Steps for Implementing the Code:
Step-1: Open the Tkinter application.

Step-2: Create a new account and log in using the credentials. The credentials of each newly
created account are stored in an SQL database.

Step-3: Enter the previously studied subjects. The system uses this input to understand the
user's requirements and gives recommendations related to what the user has searched for by
comparing it with the user's previous history. The model compares various machine learning
algorithms for recommendation based on users' patterns and gives more accurate results for a
search. It also retains the user's history, so it is easier for the model to predict the best
option and recommend it.

Step-4: Get the recommendations. The output is the list of subjects recommended to us. This is
an online course recommendation system which uses the Udacity course catalog to recommend
courses to the user using collaborative filtering. The efficiency of the recommender has also
been calculated by splitting the dataset into training and testing parts.
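The training/testing split mentioned in Step-4 can be sketched as follows. This is a minimal illustration under stated assumptions and not the project's actual evaluation code: the rating records are hypothetical, and a simple global-mean predictor stands in for the collaborative-filtering model.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rating records and split off a test portion."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def evaluate(predict, test):
    """Mean absolute error of a predict(user, course) function on held-out ratings."""
    return sum(abs(predict(u, c) - r) for u, c, r in test) / len(test)

# Hypothetical (user, course, rating) records.
records = [
    ("u1", "c1", 5), ("u1", "c2", 3), ("u2", "c1", 4), ("u2", "c3", 2),
    ("u3", "c2", 4), ("u3", "c3", 5), ("u4", "c1", 3), ("u4", "c2", 1),
]

train, test = train_test_split(records)

# Placeholder model: always predict the mean rating seen in the training part.
global_mean = sum(r for _, _, r in train) / len(train)
error = evaluate(lambda user, course: global_mean, test)
print(f"train={len(train)} test={len(test)} MAE={error:.2f}")
```

The held-out ratings are never shown to the model, so the error measured on them indicates how well the recommender generalizes to ratings it has not seen.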

4.5 FLOW CHART DIAGRAM


Figure 2 :- Flow Chart

4.6 IDEAL DATASET


The dataset serves as the fuel for the engine that is the recommender system. The recommender
model can be tuned according to the kind of data we have at our disposal. For the model to
perform to its full potential, the dataset must contain specific types of information: varied
information about the products we are dealing in, and information about the users for whom we
are trying to streamline the recommendation process, which essentially tries to mimic the
behaviour of the user or customer. The performance of the model depends heavily on the quality
of the dataset. A model working on a dataset that contains not only the basic information
about products and users but also logs of the users' activity, such as past orders, products
in the wishlist and cart, uploaded reviews or star ratings, will outperform a model working on
a dataset that contains only the basic information about users and products.
DATA PREPROCESSING
After importing all the libraries stated in the Libraries part, we process the data before we
can leverage it for the recommending capabilities of the system we are aiming for. The
procedure for pre-processing can be generalized as follows:
1. Loading the dataset
Using the pandas library, the dataset is loaded and a dataframe is instantiated, which acts
as the instance for all the analysis and data manipulation.

2. Accounting for the datatype of each attribute
Keeping track of the various types of data in the dataframe is essential so that the
process of dealing with missing values becomes error free.

3. Accounting for the total number of products, ratings and reviews
Identify the unique values in the product- and user-related attributes and truncate the
dataframe accordingly by omitting duplicates.

4. Handling missing values
Missing values can be dealt with by replacing the null values with the median or the mode,
depending on the type of attribute.

5. Dropping the least significant attributes
Least significant attributes such as timestamps can be omitted in order to truncate the
dataframe.
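The pre-processing steps above can be sketched with pandas as follows. The column names and values are hypothetical, and a small in-memory table stands in for the real dataset file, which would normally be loaded in step 1 with `pd.read_csv`.

```python
import pandas as pd

# 1. Load the dataset (a hypothetical in-memory table stands in for the real file).
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3],
    "course": ["Python", "Python", "SQL", "Python", "SQL", None],
    "rating": [5.0, 5.0, None, 4.0, 3.0, 2.0],
    "timestamp": [100, 100, 101, 102, 103, 104],
})

# 2. Inspect the datatype of each attribute.
column_types = df.dtypes

# 3. Count unique users and courses, then drop duplicate rows.
n_users = df["user_id"].nunique()
n_courses = df["course"].nunique()
df = df.drop_duplicates()

# 4. Fill missing values: median for the numeric column, mode for the categorical one.
df["rating"] = df["rating"].fillna(df["rating"].median())
df["course"] = df["course"].fillna(df["course"].mode()[0])

# 5. Drop the least significant attribute.
df = df.drop(columns=["timestamp"])

print(column_types)
print(n_users, n_courses)
print(df)
```

Filling with the median keeps a numeric column on its original scale, while the mode is the natural choice for a categorical column, where an average is not defined.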

4.7 RESULT
Login Page Interface: Figure 3

This is the interface shown when the user opens the application. The user should click Login
to enter the credentials.

Login Credentials: Figure 4

This is the interface where the user enters the credentials and proceeds to the personal
dashboard.

After Login click “Get Recommendations”: Figure 5

This is the interface where the user needs to click on Get Recommendations.

Interface to Enter Subjects: Figure 6

This is the interface for entering the gained knowledge and giving a knowledge rating so that
the application can verify it.

After Entering Gained Knowledge Course: Figure 7

After entering the gained knowledge and rating, click on Get Recommendation.

Final Output: Figure 8

After clicking on Get Recommendation, you will get the courses you can study. The output also
includes some recommended courses which do not require prerequisite knowledge.
CHAPTER 5
CONCLUSION

By using machine learning techniques and algorithms, the system is able to take the user's
previously gained knowledge as input and recommend further courses in an easy way. Before a
course is recommended to the user, all levels of filtering are consolidated by passing the
data through each filtering funnel (CBF, collaborative filtering and demographic-based
filtering), and the courses with the highest favourable scores are then presented to the
user. This helps in achieving robustness due to the hybrid nature of the model.
FUTURE SCOPE

Since the recommender system is based on machine learning algorithms and artificial
intelligence, it requires a large amount of SQL database memory and should provide almost 100%
accuracy. There are many enhancements still to be made, such as:
• Data should be updated regularly.
• Since the data will grow to a large scale in future, content-based filtering based on the
user's interests and previously gained knowledge will be required.
• The recommender system should give all the related outputs based on the user input.
• The application should be compatible with the latest technology and should be more user
friendly.
• Recommender systems are an important research field today. The rapid increase in data size,
such as the number of items and users across sites, calls for big data analysis techniques
such as Spark, MapReduce and Apache Hadoop. The recommender system is used to recommend
courses to users according to their interests and previously gained knowledge.
CHAPTER 6
REFERENCES

1) Soanpet Sree Lakshmi and T. Adi Lakshmi, "Recommendation Systems: Issues and Challenges",
(IJCSIT) International Journal of Computer Science and Information Technologies, vol. 5(4),
2014, pp. 5771-5772.

2) Debashis Das, Laxman Sahoo and Sujoy Datta, "A Survey on Recommendation System",
International Journal of Computer Applications, vol. 160(7), February 2017, pp. 6-10.

3) Tanya Maan, Shikha Gupta and Atul Mishra, "A Survey on Recommendation System",
International Conference on Recent Innovations in Management, Engineering, Science and
Technology (RIMEST 2018), 2018, pp. 543-549.

4) Yagnesh G. Patel and Vishal P. Patel, "A Survey on Various Techniques of Recommendation
System in Web Mining", International Journal of Engineering Development and Research (IJEDR),
vol. 3(4), 2015, pp. 696-700.

5) Mustansar Ali Ghazanfar and Adam Prugel-Bennett, "A Scalable, Accurate Hybrid Recommender
System", 2010 Third International Conference on Knowledge Discovery and Data Mining.

6) Qian Wang, Xianhu Yuan and Min Sun, "Collaborative Filtering Recommendation Algorithm
based on Hybrid User Model", FSKD, 2010.

7) Rahul Katarya and Om Prakash Verma, "A collaborative recommender system enhanced with
particle swarm optimization technique", Multimedia Tools and Applications, vol. 75(15), 2016,
pp. 9225-9239.

8) Li Xie, Wenbo Zhou and Yaosen Li, "Application of Improved Recommendation System Based on
Spark Platform in Big Data Analysis", Cybernetics and Information Technologies, vol. 16(6),
2016, pp. 245-255.

9) Lakshmi Tharun Ponnam, et al., "Movie recommender system using item-based collaborative
filtering technique", Emerging Trends in Engineering, Technology and Science (ICETETS),
International Conference on, IEEE, 2016.

10) P. Priyanga and A. R. Nadira Banu Kamal, "Methods of Mining the Data from Big Data and
Social Networks Based on Recommender System", International Journal of Advanced Networking &
Applications (IJANA), vol. 8(5), 2017, pp. 55-60.
APPENDIX

PYTHON CODE
import tkinter as tk
from tkinter import *
from tkinter import messagebox as ms
import sqlite3
from user_specific import *
from user_data import user_record
from similarity_matrix import sim_matrix
from inverted_index import List_of_Courses

# Create the database and the user table (if it does not already exist) at program start-up.
with sqlite3.connect('quit.db') as db:
    c = db.cursor()
    c.execute('CREATE TABLE IF NOT EXISTS user '
              '(username TEXT NOT NULL, password TEXT NOT NULL);')
    db.commit()
db.close()

def cleanup(array, array2):
    # Append each item to array2 with its last character removed.
    for items in array:
        array2.append(items[:len(items)-1])

i, j = 0, 0

class s(tk.Tk):
    # Root window: builds every page once and raises the requested one.
    def __init__(self, *args, **kwargs):
        tk.Tk.__init__(self, *args, **kwargs)
        container = tk.Frame(self, bg="blue")
        container.pack(expand=True, fill="both")

        self.frame = {}
        for F in (startpage, main, p1, cr, page2, logout):
            frame = F(container, self)
            self.frame[F] = frame
            frame.grid(row=0, column=0, sticky="nsew")

        self.show_frame(startpage)

    def show_frame(self, container):
        # Bring the requested page to the front.
        frame = self.frame[container]
        frame.tkraise()

class startpage(tk.Frame):
    def __init__(self, parent, controller):
        tk.Frame.__init__(self, parent)
        label = tk.Label(self, text="WELCOME!! CLICK BELOW TO GO TO LOGIN PAGE",
                         font=("Arial", 24), bg="white", fg="blue")
        label.pack(padx=300, pady=100)

        button1 = Button(self, text="Login", font=("Arial", 24), bg="red", fg="black",
                         command=lambda: controller.show_frame(main))
        button1.pack(ipadx=200, pady=100)

class page2(tk.Frame):
    def __init__(self, parent, controller):
        tk.Frame.__init__(self, parent)
        button1 = Button(self, text="Get recommendations", font=("Arial", 24),
                         bg="white", fg="blue",
                         command=lambda: controller.show_frame(p1))
        button1.pack(padx=290, pady=20)

        button2 = Button(self, text="Logout", font=("Arial", 24), bg="red", fg="black",
                         command=lambda: controller.show_frame(logout))
        button2.pack(ipadx=250, pady=300)

class logout(tk.Frame):
    def __init__(self, parent, controller):
        tk.Frame.__init__(self, parent)
        label1 = Label(self, text="You have been successfully logged out",
                       font=("Arial", 24), bg="white", fg="blue")
        label1.pack(padx=290, pady=20)
        button1 = Button(self, text="Click here to go to login again",
                         font=("Arial", 24), bg="red", fg="black",
                         command=lambda: controller.show_frame(main))
        button1.pack(ipadx=250, pady=300)

class main(tk.Frame):
    def __init__(self, parent, controller):
        tk.Frame.__init__(self, parent)
        self.controller = controller
        self.username = StringVar()
        self.password = StringVar()
        self.label1 = Label(self, text='LOGIN', font=('', 35), pady=10)
        self.label1.pack(padx=50, pady=50)
        Label(self, text='Username: ', font=('', 20), pady=5, padx=5).pack()
        Entry(self, textvariable=self.username, bd=5, font=('', 15)).pack()
        Label(self, text='Password: ', font=('', 20), pady=5, padx=5).pack()
        Entry(self, textvariable=self.password, bd=5, font=('', 15), show='*').pack()
        Button(self, text=' Login ', bd=3, font=('', 15), padx=5, pady=5,
               command=self.login).pack()
        Button(self, text=' Create Account ', bd=3, font=('', 15), padx=5, pady=5,
               command=lambda: controller.show_frame(cr)).pack()

    # Login function
    def login(self):
        # Establish connection
        with sqlite3.connect('quit.db') as db:
            c = db.cursor()
            # Look for a row matching both the username and the password
            find_user = 'SELECT * FROM user WHERE username = ? and password = ?'
            c.execute(find_user, [self.username.get(), self.password.get()])
            result = c.fetchall()
        if result:
            self.controller.show_frame(page2)
        else:
            ms.showerror('Oops!', 'Invalid username or password.')

class cr(tk.Frame):
    def __init__(self, parent, controller):
        tk.Frame.__init__(self, parent)
        self.n_username = StringVar()
        self.n_password = StringVar()
        self.label1 = Label(self, text='CREATE ACCOUNT', font=('', 35), pady=10)
        self.label1.pack(padx=50, pady=50)
        Label(self, text='Username: ', font=('', 20), pady=5, padx=5).pack()
        Entry(self, textvariable=self.n_username, bd=5, font=('', 15)).pack()
        Label(self, text='Password: ', font=('', 20), pady=5, padx=5).pack()
        Entry(self, textvariable=self.n_password, bd=5, font=('', 15), show='*').pack()
        Button(self, text='Create Account', bd=3, font=('', 15), padx=5, pady=5,
               command=self.new_user).pack()
        Button(self, text='Go to Login', bd=3, font=('', 15), padx=5, pady=5,
               command=lambda: controller.show_frame(main)).pack()

    def new_user(self):
        # Establish connection
        with sqlite3.connect('quit.db') as db:
            c = db.cursor()
            # Reject the request if the username already exists
            find_user = 'SELECT * FROM user WHERE username = ?'
            c.execute(find_user, [self.n_username.get()])
            if c.fetchall():
                ms.showerror('Error!', 'Username Taken. Try a Different One.')
            else:
                # Create the new account, then report success
                insert = 'INSERT INTO user(username,password) VALUES(?,?)'
                c.execute(insert, [self.n_username.get(), self.n_password.get()])
                db.commit()
                ms.showinfo('Success!', 'Account Created!')

class p1(tk.Frame):
    def __init__(self, parent, controller):
        tk.Frame.__init__(self, parent)
        self.array = []
        self.text = StringVar()
        self.items = StringVar()
        # Load the course titles offered in the drop-down menus
        with open('Course-List.txt') as topics:
            self.array.extend(topics.readlines())
        self.topic_new = []
        self.l = []
        cleanup(self.array, self.topic_new)
        self.var = []    # one StringVar per selected subject
        self.var2 = []   # one StringVar per rating
        self.var3 = StringVar()
        self.var4 = StringVar()
        self.var5 = StringVar()
        self.var8 = StringVar()
        self.subject_rating = {}
        self.sub = []
        entry = Entry(self, textvariable=self.var8, justify=CENTER)
        entry.place(x=300, y=140)
        self.button1 = Button(self, bg="blue", activebackground="pink",
                              text="Back to Previous page", fg="black",
                              font=("Arial", 18),
                              command=lambda: controller.show_frame(page2))
        self.button1.place(x=700, y=10)
        label = Label(self, bg="white", fg="blue", font=("Arial", 30),
                      textvariable=self.var5)
        self.var5.set("Course recommendation system")
        label.place(x=10, y=10)
        label2 = Label(self, bg="white", fg="black", font=("Arial", 18),
                       textvariable=self.var4)
        self.var4.set("Please Enter your subjects")
        label2.place(x=10, y=100)
        # First subject/rating row; further rows are added by self.action
        self.var.append(StringVar())
        subject1 = OptionMenu(self, self.var[0], *list(self.topic_new))
        self.var[0].set("Enter The Subject")
        subject1.place(x=10, y=140)
        self.var2.append(StringVar())
        Spinbox(self, textvariable=self.var2[0], from_=1, to=5).place(x=300, y=140)
        self.var2[0].set("")
        self.add = Button(self, bg="blue", activebackground="pink", text="Add",
                          fg="black", font=("Arial", 18), command=self.action)
        self.add.place(x=10, y=200)
        self.submit = Button(self, bg="red", activebackground="pink", text="Submit",
                             fg="black", font=("Arial", 18), command=self.action2)
        self.submit.place(x=100, y=200)
        self.submit1 = Button(self, bg="green", activebackground="black",
                              text="Get recommendations", fg="black",
                              font=("Arial", 18), command=self.action3)
        self.submit1.place(x=10, y=300)

    def action2(self):
        ms.showinfo('Success!', 'Ratings successfully submitted')

    def action(self):
        global i, j
        # Store the current subject/rating pair, then add a new input row
        self.sub.append(self.var[i].get())
        self.subject_rating[self.sub[i]] = int(self.var2[i].get())
        i = i + 1
        self.var.append(StringVar())
        self.var2.append(StringVar())
        OptionMenu(self, self.var[i], *list(self.topic_new)).place(x=10, y=180+j)
        self.var[i].set("Enter Subject")
        # Move the buttons down to make room for the new row
        self.add.place(x=10, y=220+j)
        self.submit.place(x=100, y=220+j)
        self.submit1.place(x=10, y=320+j)
        Spinbox(self, textvariable=self.var2[i], from_=1, to=5).place(x=300, y=180+j)
        self.var2[i].set("")
        j = j + 40

    def action3(self):
        global i, j
        # Record the last subject/rating pair before asking for recommendations
        self.sub.append(self.var[i].get())
        self.subject_rating[self.sub[i]] = int(self.var2[i].get())
        print(self.subject_rating)
        l = Course_recommendation(self.subject_rating, user_record)
        self.display = Label(self, bg="blue", fg="black", font=("Arial", 24),
                             textvariable=self.text)
        self.display.place(x=500, y=300)
        self.text.set("You May Also Like")
        scrollbar = Scrollbar(self)
        scrollbar.pack(side="right")
        mylist = Listbox(self, bg="black", bd=5, fg="red", width=50, height=10,
                         font=('', 16), yscrollcommand=scrollbar.set)
        # Fill the list box with the recommended courses
        for items in l:
            mylist.insert(END, items)
        mylist.place(x=500, y=350)
        scrollbar.config(command=mylist.yview)

app=s()
app.mainloop()
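
The listing above calls Course_recommendation(subject_rating, user_record), which is imported from user_specific and is not reproduced in this appendix. As a rough illustration of the kind of user-based collaborative filtering such a function could perform, here is a minimal sketch. The names recommend_courses and cosine_similarity, the shape of user_record (user id mapped to a {course: rating} dictionary), and the sample data are all assumptions made for this example, not the project's actual code.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two {course: rating} dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend_courses(subject_rating, user_record, top_n=10):
    """Rank courses the user has not rated by the similarity-weighted
    average rating given by other users (hypothetical helper)."""
    scores, weights = {}, {}
    for other in user_record.values():
        sim = cosine_similarity(subject_rating, other)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for course, rating in other.items():
            if course in subject_rating:
                continue  # already rated by the current user
            scores[course] = scores.get(course, 0.0) + sim * rating
            weights[course] = weights.get(course, 0.0) + sim
    ranked = sorted(scores, key=lambda c: scores[c] / weights[c], reverse=True)
    return ranked[:top_n]

# Hypothetical ratings, for illustration only
user_record = {
    "u1": {"Python": 5, "Machine Learning": 5, "Statistics": 4},
    "u2": {"Python": 4, "Web Development": 3},
    "u3": {"Databases": 3, "Networking": 4},
}
result = recommend_courses({"Python": 5, "Statistics": 4}, user_record)
print(result)
```

Users who share no rated course with the current user contribute nothing, so their courses are not recommended in this sketch.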

GRAPHICAL USER INTERFACE


import os,tkinter,json
from tkinter import *
from tkinter import messagebox as ms
from user_specific import *
from user_data import user_record
from similarity_matrix import sim_matrix
from inverted_index import List_of_Courses

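root = Tk()
root.title('Course Recommender System')
root.geometry("750x750")
array = []
# Index of the current subject row and vertical offset of the next row
i, j = 0, 0
# Load the course titles offered in the drop-down menus
with open('Course-List.txt') as topics:
    array.extend(topics.readlines())

def cleanup(array, array2):
    # Copy each course title into array2 without its line ending
    for items in array:
        array2.append(items.rstrip('\r\n'))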

def action():
    global i, j
    # Store the current subject/rating pair, then add a new input row
    sub.append(var[i].get())
    subject_rating[sub[i]] = int(var2[i].get())
    i = i + 1
    var.append(StringVar())
    var2.append(StringVar())
    OptionMenu(root, var[i], *list(topic_new)).place(x=10, y=180+j)
    var[i].set("Enter Subject")
    # Move the buttons down to make room for the new row
    add.place(x=10, y=220+j)
    submit1.place(x=10, y=320+j)
    submit.place(x=100, y=220+j)
    Spinbox(root, textvariable=var2[i], from_=1, to=5).place(x=300, y=180+j)
    var2[i].set("")
    j = j + 40

def action2():
    ms.showinfo('Success!', 'Ratings successfully submitted')

def action3():
    # Record the last subject/rating pair before asking for recommendations
    sub.append(var[i].get())
    subject_rating[sub[i]] = int(var2[i].get())
    print(subject_rating)
    l = Course_recommendation(subject_rating, user_record)
    # l contains a list of the top 10 courses to be recommended
    text = StringVar()
    display = Label(bg="white", fg="black", font=("Arial", 24), textvariable=text)
    display.place(x=10, y=360+j)
    text.set("Recommended Courses For you")
    k = 0
    # Show one label per recommended course, 40 pixels apart
    for items in l:
        subject = StringVar()
        Label(bg="white", fg="black", font=("Arial", 18),
              textvariable=subject).place(x=10, y=420+j+k)
        subject.set(items)
        k = k + 40

topic_new = []
cleanup(array, topic_new)
var = []    # one StringVar per selected subject
var2 = []   # one StringVar per rating
var3 = StringVar()
var4 = StringVar()
var5 = StringVar()
var8 = StringVar()
subject_rating = {}
sub = []

entry = Entry(root, textvariable=var8, justify=CENTER)
entry.place(x=300, y=140)
label = Label(bg="white", fg="blue", font=("Arial", 30), textvariable=var5)
var5.set("Course Recommender System")
label.place(x=10, y=10)
root.configure(background="white")
label2 = Label(bg="white", fg="black", font=("Arial", 18), textvariable=var4)
var4.set("Please Enter your subjects")
label2.place(x=10, y=100)
# First subject/rating row; further rows are added by action()
var.append(StringVar())
subject1 = OptionMenu(root, var[0], *list(topic_new))
var[0].set("Enter The Subject")
subject1.place(x=10, y=140)
var2.append(StringVar())
Spinbox(root, textvariable=var2[0], from_=1, to=5).place(x=300, y=140)
var2[0].set("")
add = Button(root, bg="red", activebackground="black", text="Add", fg="black",
             font=("Arial", 18), command=action)
add.place(x=10, y=200)
submit = Button(root, bg="green", activebackground="black", text="Submit",
                fg="black", font=("Arial", 18), command=action2)
submit.place(x=100, y=200)
submit1 = Button(root, bg="blue", activebackground="black",
                 text="Get recommendations", fg="black", font=("Arial", 18),
                 command=action3)
submit1.place(x=10, y=300)
root.mainloop()
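
Both GUI listings read the course titles from Course-List.txt with readlines(), so every title arrives with a line ending that has to be removed before it is shown in the drop-down menu. The following is a minimal sketch of one way to do this that does not depend on the kind of line ending or on whether the last line has one. The helper name load_course_titles and the sample lines are assumptions for illustration.

```python
def load_course_titles(lines):
    """Return course titles with surrounding whitespace and blank lines removed."""
    return [line.strip() for line in lines if line.strip()]

# Hypothetical file contents with mixed line endings and a blank line
sample = ["Machine Learning\r\n", "Databases\n", "\n", "Statistics"]
titles = load_course_titles(sample)
print(titles)
```

With a real file, the same helper could be applied to the open file object, since iterating over it yields one line at a time.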

DATA PREPROCESSING
import json
import os
import math
import urllib.request
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.tokenize import RegexpTokenizer
from nltk.stem import PorterStemmer

ps = PorterStemmer()
tokenizer = RegexpTokenizer(r'\w+')

def create_data_files(project_name, base_url, filename):
    # Append to the output file if it exists, otherwise create it
    queue = project_name + '/' + filename + '.txt'
    if os.path.isfile(queue):
        append_to_file(queue, base_url)
    else:
        write_file(queue, base_url)

def write_file(path, data):
    with open(path, 'w') as f:
        f.write(data)

def append_to_file(path, data):
    with open(path, 'a') as file:
        file.write(data + '\n')

i = 0
final_ans_total = "{"
stop_words = set(stopwords.words("english"))
with open('IRDATA.json') as json_data:
    json_response = json.load(json_data)
    for course in json_response['courses']:
        i = i + 1
        course_title = course['title']
        course_level = course['level']
        course_summary = course['summary'].lower()

        # Tokenize the summary and drop stop words
        words = tokenizer.tokenize(course_summary)
        filtered_sentence = [w for w in words if w not in stop_words]
        # Stem first, then sort, so that equal stems are adjacent
        stems = sorted(ps.stem(w) for w in filtered_sentence)

        new_str = ""
        final_str = "'" + course_title + "':[{'level':'" + course_level + "'},{"
        if stems:
            freq = 1
            prev = stems[0]
            for curr in stems[1:]:
                if curr == prev:
                    freq = freq + 1
                else:
                    # Emit the finished stem with its frequency
                    final_str += "'" + prev + "': " + str(freq) + ","
                    new_str += prev + "\n"
                    freq = 1
                    prev = curr
            # Emit the last stem, which the loop above never reaches
            final_str += "'" + prev + "': " + str(freq) + ","
            new_str += prev + "\n"
        final_str += "}]"
        final_ans_total += final_str + ","
        create_data_files('final_IR', new_str, 'myfile_27_oct_pl')

final_ans_total += "}"
create_data_files('final_IR', final_ans_total, 'myfile_27_oct')
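
The preprocessing step above builds, for each course, a mapping from terms in the summary to counts and writes it out as a hand-assembled string. A compact alternative could count terms with collections.Counter and serialize with json.dumps, which yields valid JSON. The sketch below is only an illustration under stated assumptions: the function names, the small stop-word set (standing in for NLTK's English list), the output layout, and the sample course record are all hypothetical, and stemming is left out so the example runs without NLTK.

```python
import json
import re
from collections import Counter

# Small stand-in stop-word set (assumption); the listing uses NLTK's English list
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "with"}

def term_frequencies(summary):
    """Count the non-stop-word tokens of a course summary (no stemming here)."""
    tokens = re.findall(r"\w+", summary.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

def build_index(courses):
    """Map each course title to its level and term counts."""
    return {
        c["title"]: {
            "level": c["level"],
            "terms": dict(term_frequencies(c["summary"])),
        }
        for c in courses
    }

# Hypothetical record with the title/level/summary fields read by the listing
courses = [
    {"title": "Intro to Data", "level": "beginner",
     "summary": "Learn the basics of data and data analysis."},
]
index = build_index(courses)
print(json.dumps(index, indent=2))
```

Because the result is an ordinary dictionary, it can be reloaded later with json.load instead of being parsed by hand.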
