Music Recommendation System

Problem Statement
To develop a music recommendation system that recommends songs to
users based on the audio features and metadata of the songs they have
recently listened to.
Approach
❖ Import Libraries (Spotipy, Basic Libraries)

❖ Reading the Dataset (Spotify dataset from Kaggle)

❖ Exploratory Data Analysis

➢ Clustering Songs with K-Means

❖ Building a Content-Based Recommender System using cosine distance.

➢ Compute the average vector of the audio and metadata features of the songs the user has listened
to.

➢ Find the n closest data points in the dataset, measured by cosine distance to this average vector
and excluding the songs already in the user’s listening history.

➢ Take these n points and recommend the songs corresponding to them.

❖ Generating song recommendations
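The three recommender steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name recommend, the toy feature matrix, and the assumption that features are already scaled to comparable ranges are all hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist


def recommend(features, history_idx, n=3):
    """Return indices of the n songs closest (by cosine distance) to the
    mean feature vector of the listening history, excluding the history."""
    features = np.asarray(features, dtype=float)
    # Step 1: average the feature vectors of the songs already listened to.
    mean_vec = features[history_idx].mean(axis=0, keepdims=True)
    # Step 2: cosine distance from the average vector to every song.
    dist = cdist(mean_vec, features, metric="cosine")[0]
    # Songs in the listening history must not be recommended again.
    dist[history_idx] = np.inf
    # Step 3: the n smallest distances give the recommendations.
    return np.argsort(dist)[:n].tolist()


# Toy catalogue: rows are songs, columns are hypothetical scaled features.
songs = np.array([
    [0.90, 0.10, 0.20],  # 0: in the listening history
    [0.80, 0.20, 0.10],  # 1: in the listening history
    [0.85, 0.15, 0.15],  # 2: very close to the history average
    [0.10, 0.90, 0.80],  # 3: very different
    [0.70, 0.30, 0.20],  # 4: fairly similar
])
print(recommend(songs, history_idx=[0, 1], n=2))
```

In a real dataset the rows would be looked up by song name or ID, and the features would normally be standardized first so that no single column dominates the average.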


Cosine Distance
➢ The cosine distance is commonly used in recommender systems and can work well even
when the vectors being used have different magnitudes.
➢ The cosine distance is one minus the cosine similarity, where the cosine similarity is the cosine of
the angle between the two vectors.

➢ If the vectors for two songs are parallel, the angle between them is zero. The cosine of zero is 1,
so the cosine similarity is 1 and the cosine distance is 1 - 1 = 0.
➢ The function scipy.spatial.distance.cdist(), called with metric="cosine", is used to calculate the cosine distance.
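A small check of these properties, using made-up vectors. The helper cosine_distance is a hypothetical name added here only to compare the formula against cdist.

```python
import numpy as np
from scipy.spatial.distance import cdist

a = np.array([[1.0, 2.0, 3.0]])
b = np.array([[2.0, 4.0, 6.0]])    # parallel to a, twice the magnitude
c = np.array([[-3.0, 0.0, 1.0]])   # orthogonal to a (dot product is 0)


def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v."""
    u, v = np.ravel(u), np.ravel(v)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


# cdist takes two 2-D arrays and returns a matrix of pairwise distances.
d_parallel = cdist(a, b, metric="cosine")[0, 0]
d_orthogonal = cdist(a, c, metric="cosine")[0, 0]

print(round(d_parallel, 6), round(d_orthogonal, 6))
```

The parallel pair has distance 0 (up to floating-point error) even though the magnitudes differ, and the orthogonal pair has distance 1.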
Clustering

❖ Clustering, or cluster analysis, is a machine learning technique that groups an unlabelled dataset. It
can be defined as a way of grouping data points into clusters so that each cluster consists of similar
data points.
❖ It does this by finding similar patterns in the unlabelled dataset and dividing the data points
according to the presence or absence of those patterns.
Partitioning Clustering

It is a type of clustering that divides the data into non-hierarchical
groups. It is also known as the centroid-based method. The most
common example of partitioning clustering is the K-Means Clustering
algorithm.
K-Means Clustering Algorithm

❏ K-Means Clustering is an Unsupervised Learning algorithm, which groups the unlabeled dataset
into different clusters.
❏ It is an iterative algorithm that divides the unlabeled dataset into k different clusters in such a
way that each data point belongs to only one group, whose members have similar properties.
❏ It is a centroid-based algorithm.
❏ The main aim of this algorithm is to minimize the sum of squared distances between the data points
and the centroids of their corresponding clusters.
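To make the objective concrete, the sketch below computes that sum for two candidate clusterings of the same four points, assuming the usual squared Euclidean distance. The function name within_cluster_ss and the data are illustrative only.

```python
import numpy as np


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    diffs = points - centroids[labels]
    return float(np.sum(diffs ** 2))


points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

# Clustering A: group the left pair and the right pair.
labels_a = np.array([0, 0, 1, 1])
centroids_a = np.array([[0.0, 1.0], [10.0, 1.0]])

# Clustering B: group the bottom pair and the top pair.
labels_b = np.array([0, 1, 0, 1])
centroids_b = np.array([[5.0, 0.0], [5.0, 2.0]])

score_a = within_cluster_ss(points, labels_a, centroids_a)
score_b = within_cluster_ss(points, labels_b, centroids_b)
print(score_a, score_b)
```

Clustering A scores far lower than clustering B, which is why K-Means would prefer it.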
The k-means clustering algorithm mainly performs two tasks:

● Determines the best positions for the K center points (centroids) by an iterative process.
● Assigns each data point to its closest centroid. The data points nearest to a particular centroid form a
cluster.

How does the K-Means Algorithm Work?

Step-1: Select the number K to decide the number of clusters.

Step-2: Select random K points or centroids.

Step-3: Assign each data point to its closest centroid, which will form the predefined K clusters.

Step-4: Compute the mean of the points in each cluster and place the cluster's new centroid there.

Step-5: Reassign each data point to its new closest centroid.

Step-6: If any reassignment occurred, go back to Step-4; otherwise, finish.
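The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a production implementation: the function name k_means, the seed and max_iter parameters, and the choice to keep an empty cluster's old centroid are all assumptions made here.

```python
import numpy as np


def k_means(points, k, seed=0, max_iter=100):
    """Plain K-Means: returns (labels, centroids)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step-2: pick k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Step-3 / Step-5: assign every point to its closest centroid.
        dists = np.linalg.norm(
            points[:, None, :] - centroids[None, :, :], axis=2
        )
        new_labels = dists.argmin(axis=1)
        # Step-6: stop as soon as no point changes cluster.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step-4: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids


# Step-1: choose K. Two well-separated groups, so K=2.
data = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
                 [8.0, 8.0], [9.0, 8.5], [8.5, 9.0]])
labels, centroids = k_means(data, k=2)
print(labels, centroids)
```

Which group receives label 0 depends on the random initial centroids, so only the grouping itself is meaningful, not the label values.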


Let's understand the above steps by considering the visual plots:

-->Suppose we have two variables M1 and M2. The x-y axis scatter plot of these two
variables is given below:
Step-1: Select the number K to decide the number of clusters.

Step-2: Select random K points or centroids.

-->Let's take the number of clusters to be K=2, so that the data points will be divided into two clusters.

-->We need to choose K random points or centroids to form the clusters. These points can be either
points from the dataset or any other points.
Step-3: Assign each data point to its closest centroid, which will form the predefined K
clusters

-->Now we will assign each data point of the scatter plot to its closest K-point or centroid, using the
ordinary formula for the distance between two points. For K=2, this amounts to drawing the
perpendicular bisector of the segment joining the two centroids.
-->Points to the left of this line are closer to the K1 (blue) centroid, and points to the right of
the line are closer to the yellow centroid.
Step-4: Compute the mean of the points in each cluster and place the cluster's new centroid there

-->Because some points were reassigned, we return to Step-4 and find new centroids or
K-points.
Step-5: Reassign each data point to its new closest centroid

-->No data points lie on the wrong side of the dividing line, so no further reassignment occurs and
our model is complete.
Reference

https://www.kaggle.com/vatsalmavani/music-recommendation-system-using-spotify-dataset/notebook

https://towardsdatascience.com/how-to-build-an-amazing-music-recommendation-system-4cce2719a572

https://www.javatpoint.com/k-means-clustering-algorithm-in-machine-learning
Research Papers:
https://www.ijert.org/music-recommendation-system-using-content-and-collaborative-filtering-methods

Inference:

The application allows users to select and listen to the songs available on the device. Whenever a user listens to a
particular song, a log is created. To suggest songs to users, various strategies are used to implement the
recommendation engine. The main motive of the proposed system is to extend the capabilities of traditional
recommendation systems. Traditional music recommendation systems depend on collaborative filtering or on
content-based filtering that uses the audio features and metadata of the songs to generate recommendations.
A K-Nearest Neighbor model and collaborative filtering are used to recommend songs based on song metadata.

https://link.springer.com/article/10.1007/s11042-015-3202-4

Inference:

People have dissimilar needs regarding stress-relief music. In this paper, we proposed a personalized stress-relieving
music recommendation system. The system structure comprises the following features: (a) automated music
categorization, in which a new clustering algorithm, K-MeansH, is employed to precluster music and improve
processing time; (b) the access and analysis of users’ EEG data to identify perceived stress-relieving music; and (c)
personalized recommendations based on collaborative filtering and provided according to personal preferences.
https://www.sciencedirect.com/science/article/pii/S18770509193106

Inference:

The role of a music recommender system is essential for music providers. It can predict and then offer appropriate
songs to their users, so that providers can increase user satisfaction and sell more diverse music.
Generally, a music recommender system can be divided into three main parts: (i) users, (ii) items, and (iii)
user-item matching algorithms. The matching algorithm should be able to automatically recommend
personalized music to listeners. There are two main approaches to the matching algorithm: collaborative filtering and
content-based filtering.

https://www.ijltemas.in/DigitalLibrary/Vol.8Issue3/84-86.pdf

Inference:

These days recommendation systems are used in many fields, such as item recommendation by Amazon and show
recommendation by Netflix. Such recommendation systems are based mainly on collaborative filtering, which
involves recognizing similar users and combining their recommendations based on each user's preferences. The other
method, used in combination with collaborative filtering, is content-based filtering. This paper presents a project to
implement a music recommendation system that recommends the next song based on the user's preferences and
listening history, using a machine learning model. The model is built using the k-means clustering
algorithm.
Music Recommendation System Based on Usage History and Automatic Genre Classification

https://sci-hub.hkvisa.net/10.1109/icce.2015.7066352
Inference :
The MusicRecom system is a personalized music service based on usage history and automatic
genre classification. The authors used 5- and 10-dimensional feature vectors for genre classification without
performance degradation. The system can be applied to various audio devices, apps and services.

Music Recommendation from Song Sets


http://shiftleft.com/mirrors/www.hpl.hp.com/techreports/2004/HPL-2004-148.pdf

Inference:
The authors proposed solutions to the problem of music recommendation based solely on acoustics from sets of
related songs. They found that, for a timbre-based similarity measure, the best recommendations were
obtained by ranking songs by the minimum of their distance to songs in the song set.
A Novel Hybrid Music Recommendation System using K-Means Clustering
https://sciresol.s3.us-east-2.amazonaws.com/IJST/Articles/2016/Issue-28/Article81.pdf
Inference :

A music recommender system using a hybrid K-Means approach is expected to be efficient for its
users. Such a system saves users the time spent searching and fulfils their needs without requiring
any search effort from them.

Music Recommendation System


https://www.ijert.org/research/music-recommendation-system-IJERTV8IS070267.pdf
Inference :
Music Recommendation System is used to recommend songs based on factors that have lyrics similarity
between songs, audio features of songs, metadata of songs using Artificial Neural Network (ANN) and KNN
Regression algorithm. The system allows users to create playlists, add songs to it and stream it whenever
they are logged in. Recommendations are also made based on the same artist.
Music Recommendation System

https://www.ijert.org/music-recommendation-system

Inference:
Music Recommendation System is used to recommend songs based on factors that have lyrics similarity between
songs, audio features of songs, metadata of songs using Artificial Neural Network (ANN) and KNN Regression
algorithm. The system allows users to create playlists, add songs to it and stream it whenever they are logged in.
Recommendations are also made based on the same artist. The system contains 2215 songs which can be played
along with instantaneous recommendations for each song which is being played.

https://www.irjet.net/archives/V6/i6/IRJET-V6I6319.pdf
Inference:
We have presented a personalized music recommendation system based on the CNN approach and a
collaborative filtering algorithm. The CNN approach was used to classify music into genres based on its audio
signals and to give genre-based recommendations. The CF algorithm uses a log file to provide
recommendations to users by calculating item-based similarity. As part of our future work, we would like to
work on the efficiency of the genre classification. We would also like to extract user information (like
geographical location, time, emotions, etc.) to provide better music recommendations that match the user’s
preferences. The latest songs can also be added to the dataset to keep the system updated.
Thank You!!!
