
MUSIC GENRE CLASSIFICATION USING MACHINE

LEARNING

Minor project report submitted


in partial fulfillment of the requirement for award of the degree of

Bachelor of Technology
in
Computer Science & Engineering

By

G. Yoganandha Reddy (20UECS0314) (VTU17188)


K. Veera Babu (20UECS0448) (VTU17208)
B. Jayakrishna (20UECS0098) (VTU17600)

Under the guidance of


Dr. V. Kalpana, M.E., Ph.D.,
ASSOCIATE PROFESSOR

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


SCHOOL OF COMPUTING

VEL TECH RANGARAJAN DR. SAGUNTHALA R&D INSTITUTE OF


SCIENCE & TECHNOLOGY
(Deemed to be University Estd u/s 3 of UGC Act, 1956)
Accredited by NAAC with A++ Grade
CHENNAI 600 062, TAMILNADU, INDIA

November, 2023
CERTIFICATE
It is certified that the work contained in the project report titled “MUSIC GENRE CLASSIFICATION USING MACHINE LEARNING” by “G. Yoganandha Reddy (20UECS0314), K. Veera Babu (20UECS0448), B. Jayakrishna (20UECS0098)” has been carried out under my supervision and that this work has not been submitted elsewhere for a degree.

Signature of Supervisor
Dr. V. Kalpana, M.E., Ph.D.,
Associate Professor
Computer Science & Engineering
School of Computing
Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science & Technology
November, 2023

Signature of Head of the Department
Dr. M.S. Murali Dhar, M.E., Ph.D.,
Associate Professor & Head
Computer Science & Engineering
School of Computing
Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science & Technology
November, 2023

Signature of the Dean
Dr. V. Srinivasa Rao
Professor & Dean
Computer Science & Engineering
School of Computing
Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science & Technology
November, 2023

DECLARATION

We declare that this written submission represents our ideas in our own words and that, where others’ ideas or words have been included, we have adequately cited and referenced the original sources. We also declare that we have adhered to all principles of academic honesty and integrity and have not misrepresented or fabricated or falsified any idea/data/fact/source in our submission. We understand that any violation of the above will be cause for disciplinary action by the Institute and can also evoke penal action from the sources which have thus not been properly cited or from whom proper permission has not been taken when needed.

G.YOGANANDHA REDDY
Date: / /

K.VEERA BABU
Date: / /

B.JAYAKRISHNA
Date: / /

APPROVAL SHEET

This project report entitled “MUSIC GENRE CLASSIFICATION USING MACHINE LEARNING” by
G. Yoganandha Reddy (20UECS0314), K. Veera Babu (20UECS0448), B. Jayakrishna (20UECS0098)
is approved for the degree of B.Tech in Computer Science & Engineering.

Examiners Supervisor

Dr. V. Kalpana, M.E., Ph.D.,

Date: / /
Place:

ACKNOWLEDGEMENT

We express our deepest gratitude to our respected Founder Chancellor and President Col. Prof.
Dr. R. RANGARAJAN B.E. (EEE), B.E. (MECH), M.S (AUTO),D.Sc., Foundress President Dr.
R. SAGUNTHALA RANGARAJAN M.B.B.S. Chairperson Managing Trustee and Vice President.

We are very much grateful to our beloved Vice Chancellor Prof. S. SALIVAHANAN, for provid-
ing us with an environment to complete our project successfully.

We record our indebtedness to our Professor & Dean, Department of Computer Science &
Engineering, School of Computing, Dr. V. SRINIVASA RAO, M.Tech., Ph.D., for his immense care
and encouragement towards us throughout the course of this project.

We are thankful to our Head of the Department, Department of Computer Science & Engineering,
Dr. M.S. MURALI DHAR, M.E., Ph.D., for providing immense support in all our endeavors.

We also take this opportunity to express a deep sense of gratitude to our Internal Supervisor Dr.
V. KALPANA, M.E., Ph.D., for her cordial support, valuable information and guidance; she helped
us complete this project through its various stages.

A special thanks to our Project Coordinators Mr. V. ASHOK KUMAR, M.Tech., Ms. C.
SHYAMALA KUMARI, M.E., for their valuable guidance and support throughout the course of the
project.

We thank our department faculty, supporting staff and friends for their help and guidance to com-
plete this project.

G. Yoganandha Reddy (20UECS0314)


K. Veera Babu (20UECS0448)
B. Jayakrishna (20UECS0098)

ABSTRACT

Music plays a very important role in people’s lives. The volume of music available on
the Internet today is increasing rapidly, and these audio files must be properly indexed
if we want convenient access to them. Even the search engines available in the market
find it challenging to classify and retrieve the audio files relevant to a user’s interest.
In the proposed system, a machine learning approach is used to classify tracks into
different genres (Classical, Hip Hop, Country, Rock, Metal, Blues, Pop, Jazz, and
Disco). The application uses a K-Nearest Neighbour (KNN) model to perform the
classification. The Mel frequency features of each track from the GTZAN dataset are
obtained, and a piece of software is implemented which classifies a database of 1000
audio files into their respective genres. An extension of this work would be to consider
bigger datasets and tracks in other formats (mp3, au, etc.). The performance of the
system is evaluated using the Kaggle copy of the dataset; the proposed KNN-based
system gives an accuracy of 80.00%, which compares favourably with other models.

Keywords: GTZAN Dataset, Genre Classification, Kaggle, Mel Frequency, Ma-


chine Learning

LIST OF FIGURES

4.1 Architecture Diagram of Music Genre Classification . . . . . . . 10


4.2 Data Flow Diagram For Music Genre Classification . . . . . . . 12
4.3 Use Case Diagram For Music Genre Classification . . . . . . . . 13
4.4 Class Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.5 Activity Diagram For Music Genre Classification . . . . . . . . . 15

5.1 Dataset for Different Types of Genres . . . . . . . . . . . . . . . 21


5.2 Output Design of Genre Classification . . . . . . . . . . . . . . . 22
5.3 Output of Testing Audio Files . . . . . . . . . . . . . . . . . . . . 24

6.1 Output and accuracy of given audio file (jazz) . . . . . . . . . . 28

7.1 Plagiarism Report . . . . . . . . . . . . . . . . . . . . . . . . . . 30

8.1 Poster Presentation . . . . . . . . . . . . . . . . . . . . . . . . . 34

LIST OF ACRONYMS AND
ABBREVIATIONS

KNN K Nearest Neighbor


MFCC Mel Frequency Cepstral Coefficients
MGR Music Genre Recognition
MRS Music Recommender System
ML Machine Learning

TABLE OF CONTENTS

Page.No

ABSTRACT v

LIST OF FIGURES vi

LIST OF ACRONYMS AND ABBREVIATIONS vii

1 INTRODUCTION 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Aim of the Project . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Project Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Scope of the Project . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 LITERATURE REVIEW 4

3 PROJECT DESCRIPTION 7
3.1 Existing System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 Proposed System . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3 Feasibility Study . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.3.1 Economic Feasibility . . . . . . . . . . . . . . . . . . . . . 7
3.3.2 Technical Feasibility . . . . . . . . . . . . . . . . . . . . . 8
3.3.3 Social Feasibility . . . . . . . . . . . . . . . . . . . . . . . 8
3.4 System Specification . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.4.1 Hardware Specification . . . . . . . . . . . . . . . . . . . . 9
3.4.2 Software Specification . . . . . . . . . . . . . . . . . . . . 9
3.4.3 Standards and Policies . . . . . . . . . . . . . . . . . . . . 9

4 METHODOLOGY 10
4.1 General Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 10
4.2 Design Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.2.1 Data Flow Diagram . . . . . . . . . . . . . . . . . . . . . . 12
4.2.2 Use Case Diagram . . . . . . . . . . . . . . . . . . . . . . 13
4.2.3 Class Diagram . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2.4 Activity Diagram . . . . . . . . . . . . . . . . . . . . . . . 15
4.3 Algorithm & Pseudo Code . . . . . . . . . . . . . . . . . . . . . . 16
4.3.1 Algorithm: K-Nearest Neighbours . . . . . . . . . . . . . . 16
4.3.2 Pseudo Code . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.4 Module Description . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.4.1 Module1:Import required libraries . . . . . . . . . . . . . . 17
4.4.2 Module2:Processing of data . . . . . . . . . . . . . . . . . 18
4.4.3 Module3:Apply Machine Learning Algorithms(KNN) . . . 19
4.5 Steps to execute/run/implement the project . . . . . . . . . . . . . . 19
4.5.1 Step1:Requirements . . . . . . . . . . . . . . . . . . . . . 19
4.5.2 Step2:Collection of Data set . . . . . . . . . . . . . . . . . 19
4.5.3 Step3:Modules . . . . . . . . . . . . . . . . . . . . . . . . 19
4.5.4 Step4:Output . . . . . . . . . . . . . . . . . . . . . . . . . 20

5 IMPLEMENTATION AND TESTING 21


5.1 Input and Output . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.1.1 Input Design . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.1.2 Output Design . . . . . . . . . . . . . . . . . . . . . . . . 22
5.2 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.3 Types of Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.3.1 Unit testing . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.3.2 Integration testing . . . . . . . . . . . . . . . . . . . . . . 23
5.3.3 System testing . . . . . . . . . . . . . . . . . . . . . . . . 23
5.3.4 Test Result . . . . . . . . . . . . . . . . . . . . . . . . . . 24

6 RESULTS AND DISCUSSIONS 25


6.1 Efficiency of the Proposed System . . . . . . . . . . . . . . . . . . 25
6.2 Comparison of Existing and Proposed System . . . . . . . . . . . . 25
6.3 Sample Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
6.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
6.5 Future Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . 29

7 PLAGIARISM REPORT 30

8 SOURCE CODE & POSTER PRESENTATION 31


8.1 Source Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
8.2 Poster Presentation . . . . . . . . . . . . . . . . . . . . . . . . . . 34

References 35
Chapter 1

INTRODUCTION

1.1 Introduction

Music classification is considered a very challenging task because of the selection
and extraction of appropriate audio features. Online music databases are growing
rapidly nowadays, and it is very hard for people to access this data. One way to
arrange and categorize songs is based on genre. Genres are identified by
characteristics of the music such as rhythmic structure, harmonic content and instru-
mentation. Building such a system requires extracting acoustic features that are good
estimators of the genres we are interested in, followed by a single- or multi-label
classification stage or, in some cases, a regression stage. The features are then used as
input to the machine learning stage. We make use of the GTZAN dataset, which
is well known in Music Information Retrieval.


Music genre classification, that is, the sorting of music into different categories or
genres, is a concept that helps listeners differentiate between genres based on their
composition or their beat. In recent times it has become a very popular problem as
more and more genres emerge around the world. Music is a universal language that
has been an integral part of human culture for centuries. With the advent of digital
music streaming services, the demand for music recommendation systems has increased
significantly. Music genre classification is a fundamental task in music information
retrieval, which is essential for building effective music recommendation systems and
music search engines. In recent years, machine learning-based approaches have become
popular for this task due to their high accuracy and scalability. With such a vast
collection of music now available, categorizing it into different genres is crucial for
providing better recommendations and improving the user experience. Music genre
classification is also useful in various other applications such as music information
retrieval, music recommendation systems, and music search engines.

1.2 Aim of the Project

The main aim is to create a machine learning model that classifies music samples
into different genres. It predicts the genre (Classical, Hip Hop, Country, Rock, Metal,
Blues, Pop, Jazz, or Disco) from an audio signal given as input. The KNN algorithm
is used, and the objective of automating music classification is to make the selection
of songs quick and less cumbersome.

1.3 Project Domain

Machine learning is the scientific study of algorithms and statistical models that
computer systems use to perform a specific task without using explicit instructions,
relying on patterns and inference instead. It is seen as a subset of artificial intelli-
gence. Machine learning algorithms build a mathematical model based on sample
data, known as ”training data”, in order to make predictions or decisions without
being explicitly programmed to perform the task. Machine learning algorithms are
used in a wide variety of applications, such as email filtering and computer vision,
where it is difficult or infeasible to develop a conventional algorithm for effectively
performing the task.

Machine learning is closely related to computational statistics, which focuses on
making predictions using computers. The study of mathematical optimization delivers
methods, theory and application domains to the field of machine learning. Data
mining is a field of study within machine learning that focuses on exploratory data
analysis through unsupervised learning. In its application across business problems,
machine learning is also referred to as predictive analytics.

1.4 Scope of the Project

The scope of the project is to predict the genre of a particular piece of music
supplied as an audio file. The classifier can also be used to compare the accuracy of
different machine learning models.

Chapter 2

LITERATURE REVIEW

Keunwoo Choi et al. [1] compare the performance of deep convolutional neural
networks using two different types of audio features for music genre classification.
They train and evaluate their models on a large dataset of over 1 million tracks
from the Million Song Dataset and achieve state-of-the-art results for both feature
types. Convolutional neural networks have been actively used for various music
classification tasks such as music genre classification and user-item latent feature
prediction for recommendation.

A. Gomez et al.,[2] proposed a music genre classification system based on deep


CNNs, trained on spectrogram images of audio signals. They compare their results
with several other approaches, including traditional machine learning algorithms and
other deep learning architectures, and demonstrate the superiority of their proposed
system. Most of the music files are stored according to the song title or the artist
name. This may cause trouble in searching for a song related to a specific genre.

Suryatej Reddy et al.,[3] proposed a transfer learning-based approach for music


genre classification using deep neural networks. They pretrain a convolutional neu-
ral network on a large dataset of image classification tasks, and then fine-tune the
network on a smaller dataset of audio files. They achieve state-of-the-art results on
several benchmark datasets.

Seokjin Kim et al.,[4] proposed an ensemble-based approach for music genre clas-
sification using deep neural networks. They train multiple neural networks with dif-
ferent architectures and combine their predictions using voting or averaging. They
achieve state-of-the-art results on several benchmark datasets and demonstrate the
effectiveness of their approach. Music has also been divided into genres and sub-genres
not only on the basis of the music itself but also on the lyrics, which makes music
genre classification difficult.

Xi Xiong et al. [5] used a novel deep learning architecture for music genre
classification, which incorporates temporal attention mechanisms to capture the tem-
poral dynamics of music. They evaluate their approach on several benchmark datasets
and demonstrate its superiority over several other approaches, including traditional
machine learning algorithms and other deep learning architectures.

Jian Wang et al.,[6] described an attention-based Convolutional Neural Network


(CNN) for music genre classification. Their approach uses both temporal and spec-
tral attention mechanisms to better capture the salient features of music signals. They
achieve state-of-the-art results on several benchmark datasets. The same principles
are applied in music analysis: machine learning techniques have proved to be quite
successful in extracting trends and patterns from large pools of data.

Kiwoong Park et al.,[7] designed a music genre classification system based on


feature learning with Temporal Convolutional Networks (TCNs). They use a novel
feature learning approach to automatically learn discriminative features from raw
audio signals and achieve state-of-the-art results on several benchmark datasets.
Companies nowadays use music classification either to make recommendations to
their customers or simply as a product; determining the music genre is the first step
in the process of music recommendation.

Rodrigo Schramm et al.,[8] described an end-to-end music genre classification


system based on recurrent neural networks. Their approach uses a bidirectional RNN
architecture to capture both forward and backward temporal dependencies in music
signals. They achieve state-of-the-art results on several benchmark datasets.
Categorizing music files according to their genre is a challenging task in the area of
Music Information Retrieval (MIR). Automatic music genre classification is important
for retrieving music from a large collection, and it finds real-world applications in
fields like automatic tagging of an unknown piece of music (useful for apps like Saavn
and Wynk).

Wentao Zhang et al. [9] proposed an adversarial learning-based approach for music
genre classification. A music genre is a conventional category that identifies some
pieces of music as belonging to a shared tradition or set of conventions; it is to be
distinguished from musical form and musical style. Music can be divided into
different genres in many different ways, and the popular genres include Pop, Hip-Hop,
Rock, Jazz, Blues, Country and Metal. They train a generator network to generate
music samples that are difficult to classify by a discriminator network, and use these
samples to augment the training data. They achieve state-of-the-art results on several
benchmark datasets.

Yichao Lu et al. [10] designed a Multi-Scale Convolutional Neural Network (MSCNN)
for music genre classification. Their approach uses a hierarchical architecture
with multiple scales of convolutional layers to capture both local and global features
in music signals. They achieve state-of-the-art results on several benchmark datasets.

Chapter 3

PROJECT DESCRIPTION

3.1 Existing System

In the existing system, the K-means clustering algorithm computes centroids and
repeats until the optimal centroids are found; the number of clusters is presumed to
be known in advance. It is also known as the flat clustering algorithm. In this method,
data points are assigned to clusters in such a way that the sum of the squared distances
between the data points and their centroid is as small as possible. It is essential to note
that reduced diversity within clusters leads to more similar data points within the
same cluster. K-means implements the Expectation-Maximization strategy to solve
the problem: the Expectation step assigns data points to the nearest cluster, and the
Maximization step computes the centroid of each cluster.

3.2 Proposed System

The advent of huge music collections has presented the challenge of how to retrieve,
browse, and recommend their contents. One approach to facilitating access to a huge
music collection is to keep label annotations for all music resources. Labels can be
added either manually or automatically; however, because of the high human effort
needed for manual labelling, automatic labelling is more cost-effective.
To solve this issue, the proposed system uses a K-NN classifier to classify genres in
the GTZAN dataset. From the results, we found that K-NN gave more accurate
outcomes.

3.3 Feasibility Study

3.3.1 Economic Feasibility

Economic feasibility involves a cost/benefit analysis of the project, helping organizations
determine the viability, cost, and benefits associated with a project before financial
resources are allocated. This project is completely user-friendly, and the resources
needed for it are free; everyone can use our source code as a resource for their own
project implementations.
The economic feasibility step of project development is the period during which a
break-even financial model of the project venture is developed, based on all costs
associated with taking the product from idea to market and achieving sales sufficient
to satisfy requirements.

3.3.2 Technical Feasibility

Technical feasibility helps organizations determine whether the technical resources
meet capacity and whether the technical team is capable of converting the ideas into
working systems. It also involves the evaluation of the hardware, software, and other
technical requirements of the proposed system. Our project uses the GTZAN dataset,
which contains a thousand audio clips across genres such as country and jazz, and we
used Visual Studio Code, which is available as a free download, for executing the
program. For testing, some audio files were downloaded from the Internet. Technical
feasibility is the formal process of assessing whether it is technically possible to
manufacture a product or service; before launching a new offering or taking up a
client project, it is essential to plan and prepare for every step of the operation. It
helps determine the efficiency of the proposed plan by analysing the process, including
tools, technology, material, labour and logistics.

3.3.3 Social Feasibility

Social feasibility is a detailed study of how one interacts with others within a
system or an organization. Social impact analysis is an exercise aimed at identifying
and analyzing such impacts in order to understand the scale and reach of the project’s
social impacts.
In the music sector, our project has wide scope: genre classification is a valuable
capability to implement, and people who love the music of a particular genre will take
the greatest interest in it, so it will be useful to society.

3.4 System Specification

3.4.1 Hardware Specification

• RAM: 4 GB or greater
• Hard disk: 256 GB or more
• Processor: Intel Core i3 or better

3.4.2 Software Specification

• Windows 8/10
• Visual Studio Code with Python
• Jupyter Notebook (latest version)
• Python libraries: numpy, pandas, tempfile, os, pickle, random, operator, math

3.4.3 Standards and Policies

Visual Studio Code


Visual Studio Code is a source code editor that can be used with a variety of
programming languages; our UI is written in Python.
Standard Used: ISO/IEC 27001
Jupyter
Jupyter is an open-source web application that allows us to create and share documents
containing live code, equations, visualizations and narrative text. It can be used for
data cleaning and transformation, numerical simulation, statistical modeling, data
visualization, and machine learning.
Standard Used: ISO/IEC 27001

Chapter 4

METHODOLOGY

4.1 General Architecture

Figure 4.1: Architecture Diagram of Music Genre Classification

The Figure 4.1 explains the architecture, which defines the process of music genre
classification as follows:

Data collection: The first step is to collect a dataset of audio recordings that have
been labeled with their corresponding genres. This dataset can be collected from
various sources, such as music streaming platforms, online music repositories, or
manually curated collections.

Data preprocessing: Once the dataset is collected, the audio recordings must be
preprocessed to extract relevant features, such as pitch, tempo, and rhythm. This
involves converting the audio signal into a numerical representation that can be used
by the machine learning algorithm. Preprocessing may also include data cleaning
and normalization.

Feature extraction: The preprocessed audio data is then analyzed to extract fea-
tures that are relevant to the task of music genre classification. This involves selecting
a set of features that capture the key characteristics of the audio signal, such as mel
frequency cepstral coefficients (MFCCs) or spectral features.
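A common fixed-length representation of a track is the mean vector plus covariance matrix of its frame-wise MFCCs. The sketch below uses a random stand-in matrix in place of real MFCC frames, which in this project are extracted from the audio signal:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the MFCC matrix of one track:
# 300 analysis frames, 13 coefficients per frame.
mfcc_feat = rng.normal(size=(300, 13))

# Summarize the variable-length track as one fixed-size feature:
# the per-coefficient mean and the covariance across frames.
mean_vec = mfcc_feat.mean(axis=0)          # shape (13,)
cov_mat = np.cov(mfcc_feat.T)              # shape (13, 13)
feature = (mean_vec, cov_mat)
```

This pair captures both the average spectral shape of the track and how its coefficients vary together over time, regardless of the track's length.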

Model training: Once the features have been extracted, a machine learning model
is trained to classify the audio recordings into their corresponding genres. This typ-
ically involves selecting a machine learning algorithm, such as a neural network,
decision tree, or support vector machine, and training it on the labeled dataset. The
training process involves optimizing the model parameters to minimize the classifi-
cation error.
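The nearest-neighbour classification step on such feature vectors can be sketched as follows. The features and labels here are toy one-dimensional stand-ins, and plain Euclidean distance stands in for whatever distance function the trained system uses:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Distance from the query vector to every training vector.
    d = np.linalg.norm(X_train - x, axis=1)
    # Majority vote among the labels of the k nearest neighbours.
    votes = y_train[np.argsort(d)[:k]]
    vals, counts = np.unique(votes, return_counts=True)
    return vals[np.argmax(counts)]

# Two toy "genres": feature values near 0 are genre 0, near 10 are genre 1.
X = np.array([[0.0], [1.0], [0.5], [10.0], [11.0], [10.5]])
y = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(X, y, np.array([0.8]), k=3)   # → 0
```

Because KNN stores the training set and defers all work to query time, "training" here is simply keeping `X` and `y`; the model parameters to tune are `k` and the distance function.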

Evaluation: Once the model is trained, it is evaluated on a separate validation set


to measure its performance. This involves calculating various metrics, such as ac-
curacy, precision, recall, and F1 score, to evaluate the model’s ability to classify the
audio recordings into their corresponding genres.
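The metrics named above follow directly from the confusion counts. A pure-Python sketch with hypothetical genre labels, treating one genre as the "positive" class:

```python
def precision_recall_f1(y_true, y_pred, positive):
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = ["jazz", "rock", "jazz", "pop", "jazz"]
y_pred = ["jazz", "jazz", "jazz", "pop", "rock"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)   # 3/5 = 0.6
p, r, f = precision_recall_f1(y_true, y_pred, "jazz")
```

For a multi-genre problem these per-class scores are typically averaged across all genres to give a single figure.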

Deployment: Finally, the trained model can be deployed in a production environ-


ment, such as a music streaming platform, where it can be used to classify new audio
recordings in real-time. The models learn to identify patterns and classify music files
into different genres.

Model evaluation and selection: The trained models are evaluated using performance
metrics such as accuracy, precision, recall, and F1-score. The best model is selected
based on its performance on a held-out validation set.

Genre classification: Once the model is trained and evaluated, it can be used to
classify new music files into different genres based on their extracted features.

Testing: Finally, the selected model can be used to classify new audio files into
different music genres.

4.2 Design Phase

4.2.1 Data Flow Diagram

Figure 4.2: Data Flow Diagram For Music Genre Classification

The Figure 4.2 shows the data flow diagram, as follows:
Sources of data: The first component of the DFD is the source of data, which includes
the audio dataset, the genre labels associated with the audio files, and any additional
data needed for preprocessing.
Processes: The second component of the DFD is the set of processes involved in
classification. This includes the preprocessing of the audio data, the feature extraction
process, the machine learning algorithm used to classify the audio files, and the
evaluation process that assesses the performance of the algorithm.
Data stores: The third component of the DFD is the data stores, which include the
preprocessed data store, the extracted-feature data store, and the trained-model data
store.
Outputs: The final component of the DFD is the output, which includes the
classification labels for the audio files, the evaluation results, and any recommendations
or playlists generated from the classification results.

4.2.2 Use Case Diagram

Figure 4.3: Use Case Diagram For Music Genre Classification

The Figure 4.3 describes the use case diagram for music genre classification using
machine learning, as follows:
File explorer: a software application used to browse and manage files and folders on
a computer. It provides a graphical user interface that allows users to access files,
folders, and other storage devices such as hard drives, flash drives, and network drives.
File converter: a software application or online tool used to convert files from one
format to another. It can convert various types of files, such as documents, images,
videos, and audio files, among others. File converters are useful when a file needs to
be opened, edited, or played but the application being used does not support the
original file format.

4.2.3 Class Diagram

Figure 4.4: Class Diagram

The Figure 4.4 shows the class diagram for music genre classification using machine
learning. A class diagram is a type of diagram in software engineering that illustrates
the relationships and structure of classes in an object-oriented programming language;
in this context, it represents the structure of the program and its various classes.
The class diagram is a visual representation of the software system’s objects, classes,
and their relationships. It provides a clear overview of the different components of the
system and how they interact with each other. The first class in the diagram is the
Dataset class, which represents the dataset of audio files used for training and testing
the machine learning model. This class contains methods to load and preprocess the
audio files, as well as methods to split the dataset into training, validation, and testing
sets.
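The split responsibility of the Dataset class can be sketched as a seeded shuffle-and-slice helper. The fractions and seed below are illustrative assumptions, not values from the report:

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=42):
    # Shuffle a copy so the caller's original ordering is untouched.
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# 100 stand-in file IDs -> 70 train, 15 validation, 15 test.
train, val, test = split_dataset(range(100))
```

Seeding the shuffle makes the split reproducible across runs, which matters when comparing models on the same held-out sets.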

4.2.4 Activity Diagram

Figure 4.5: Activity Diagram For Music Genre Classification

The Figure 4.5 shows the activity diagram for music genre classification using machine
learning. An activity diagram is a type of diagram that illustrates the flow of activities
in a system; in the context of music classification using machine learning, it represents
the various steps involved in the classification process. The ”Genre Classification”
activity involves predicting the genre of a new piece of music using the trained model.

4.3 Algorithm & Pseudo Code

4.3.1 Algorithm: K-Nearest Neighbours

Step 1: Import the required libraries.
Step 2: Collect the dataset from Kaggle.
Step 3: Define a function to calculate the distance between feature vectors and to
find neighbours.
Step 4: Identify the class of the nearest neighbours.
Step 5: Define a function to evaluate the model and check the accuracy and perfor-
mance of the algorithm we build.
Step 6: Extract the important features from the data.
Step 7: Split the dataset into training and test sets and calculate the distance between
two instances.
Step 8: Make the predictions and calculate the accuracy. Test the classifier with a
new audio file; finally, it shows the genre the file belongs to.

4.3.2 Pseudo Code

# Import required libraries:
from python_speech_features import mfcc
import scipy.io.wavfile as wav
import numpy as np
from tempfile import TemporaryFile
import os
import pickle
import random
import operator
import math

# Define a function to get the distance between feature vectors and find neighbors:
def getNeighbors(trainingSet, instance, k):

# Identify the nearest neighbors:
def nearestClass(neighbors):

# Define a function for model evaluation:
def getAccuracy(testSet, predictions):

# Extract features from the dataset and dump these features into a binary .dat file my.dat:
directory = "path_to_dataset"
f = open("my.dat", "wb")
i = 0

# Train and test split on the dataset:
dataset = []
def loadDataset(filename, split, trSet, teSet):

# Make predictions using KNN and get the accuracy on test data:
leng = len(testSet)
predictions = []

# Test the classifier with a new audio file:
for folder in os.listdir("./musics/wav_genres/"):
    results[i] = folder
    i += 1
(rate, sig) = wav.read("path_to_new_audio_file")

4.4 Module Description

4.4.1 Module1:Import required libraries

NumPy: NumPy is a Python library for scientific computing that provides support
for large, multi-dimensional arrays and matrices. It is commonly used for numerical
calculations and data analysis.

Pandas: Pandas is a Python library for data manipulation and analysis. It provides
tools for reading and writing data from various file formats, such as CSV and Excel,
and provides functions for data cleaning, transformation, and manipulation.

Wavfile: Waveform Audio File Format (WAV) is an audio file format standard, de-
veloped by IBM and Microsoft, for storing an audio bitstream on personal computers.
It is the main format used on Microsoft Windows systems for uncompressed audio.
The usual bitstream encoding is the linear pulse-code modulation (LPCM) format.
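As a minimal illustration of the format described above, Python's standard-library wave module can write and read a 16-bit PCM WAV file; the file name example.wav is arbitrary:

```python
import wave

# Write one second of silence as 16-bit PCM mono at 44.1 kHz (hypothetical file name).
with wave.open("example.wav", "wb") as w:
    w.setnchannels(1)                  # mono
    w.setsampwidth(2)                  # 2 bytes per sample = 16-bit
    w.setframerate(44100)              # 44.1 kHz sample rate
    w.writeframes(b"\x00\x00" * 44100) # 44100 zero-valued frames

# Read it back and inspect the header fields.
with wave.open("example.wav", "rb") as r:
    print(r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
# prints: 1 2 44100 44100
```

The header fields read back confirm the uncompressed LPCM layout: channel count, bytes per sample, sample rate, and frame count.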

OS module: The OS module in Python provides functions for interacting with the
operating system. It is part of Python's standard utility modules and provides a
portable way of using operating-system-dependent functionality.
4.4.2 Module2:Processing of data

Read in the audio files: The first step is to read in the audio files in a suitable format.
Commonly used formats for music files include WAV and MP3. Python libraries
such as Librosa provide functions for reading in audio files.

Convert to a common format: It is common for music files to have different sample
rates and bit depths. To ensure that the data is consistent, the audio files should
be converted to a common format, such as 16-bit PCM format with a sample rate of
44.1 kHz.

Normalize the data: The volume level of music files can vary widely, which can
affect the accuracy of the classification model. Normalizing the data ensures that
the volume level is consistent across all the audio files.

Split into segments: Music files can be quite long, which can make processing them
computationally intensive. To address this, the audio files can be split into shorter
segments, such as 30-second clips.
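A minimal NumPy sketch of both steps, assuming the waveform is already loaded as a float array (the sample rate, sine-wave signal, and function names here are illustrative, not the project's code):

```python
import numpy as np

def peak_normalize(signal):
    """Scale the waveform so its peak amplitude is 1.0 (leaves silence untouched)."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def split_segments(signal, sample_rate, seconds=30):
    """Chop the waveform into fixed-length clips, dropping any short remainder."""
    seg_len = sample_rate * seconds
    n_full = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_full)]

# Example: a 75-second sine wave at an assumed 22050 Hz sample rate.
sr = 22050
t = np.arange(75 * sr) / sr
audio = 0.5 * np.sin(2 * np.pi * 440 * t)
clips = split_segments(peak_normalize(audio), sr)
print(len(clips), len(clips[0]))  # 2 full 30-second clips of 661500 samples each
```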

Apply preprocessing filters: Preprocessing filters, such as high-pass and low-pass


filters, can be applied to remove noise and unwanted frequencies from the audio
signal. Filters such as the pre-emphasis filter can also be applied to enhance the
high-frequency content of the audio signal.
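The pre-emphasis filter mentioned above is commonly implemented as the first-order difference y[n] = x[n] - a*x[n-1], with a typically around 0.97; a minimal NumPy sketch (the test signal is invented for illustration):

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1]."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

x = np.array([1.0, 1.0, 1.0, 1.0])  # a constant (pure DC) signal
y = pre_emphasis(x)
print(y)  # DC content is strongly attenuated after the first sample
```

Because the filter differences adjacent samples, slowly varying (low-frequency) content is suppressed while rapid changes pass through.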

Extract features: Once the audio files have been preprocessed, features can be
extracted from the audio signal. Commonly used features for music genre classification
include Mel-Frequency Cepstral Coefficients (MFCCs), spectral features, and
rhythm features.

Scale the data: To ensure that the features are on a similar scale, the data should
be normalized or standardized. This ensures that each feature is equally important in
the classification model.
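Per-feature standardization (zero mean, unit variance) can be sketched with NumPy; the feature matrix below is made up for illustration:

```python
import numpy as np

def standardize(features):
    """Scale each feature column to zero mean and unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - mean) / std

X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])  # two features on very different scales
Z = standardize(X)
print(Z.mean(axis=0), Z.std(axis=0))  # both columns now have mean 0 and std 1
```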

4.4.3 Module3:Apply Machine Learning Algorithms(KNN)

K-Nearest Neighbour (KNN) is one technique that has reportedly been successful in
categorizing music into different genres.
The K-Nearest Neighbour technique is a supervised machine learning algorithm
used to solve classification and regression problems. It relies on labeled input data
to classify unlabeled data, which is how this ML technique is used in music genre
classification.
Step 1: Select the value of K neighbors.
Step 2: Find the K nearest data point for our new data point based on Euclidean
distance.
Step 3: Among these K data points count the data points in each category.
Step 4: Assign the new data point to the category that has the most neighbors of the
new data point.
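The four steps above can be sketched in NumPy with Euclidean distance and a majority vote; the toy 2-D points and genre labels are invented for illustration:

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, point, k=3):
    """Classify `point` by majority vote among its k nearest training points."""
    dists = np.linalg.norm(train_X - point, axis=1)  # Euclidean distance (Step 2)
    nearest = np.argsort(dists)[:k]                  # k nearest data points (Steps 1-2)
    votes = Counter(train_y[i] for i in nearest)     # count points per category (Step 3)
    return votes.most_common(1)[0][0]                # category with most neighbours (Step 4)

X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = ["blues", "blues", "blues", "metal", "metal", "metal"]
print(knn_predict(X, y, np.array([0.5, 0.5])))  # -> blues
print(knn_predict(X, y, np.array([5.5, 5.5])))  # -> metal
```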

4.5 Steps to execute/run/implement the project

4.5.1 Step1:Requirements

• Data set.
• Sample audio files.
• Editor, with the required Python packages downloaded.

4.5.2 Step2:Collection of Data set

• Download the GTZAN dataset from the internet, extract all the files, and save
them in a folder. The URL is https://www.kaggle.com/datasets/andradaolteanu/gtzan-
dataset-music-genre-classification

4.5.3 Step3:Modules

• Download the required Python packages such as numpy, wavfile, pickle, and math.
• Using the KNN algorithm, calculate the distance and the accuracy for the file.
• Download sample audio files and save them. Then, using the directory, test the
wave files and finally show the result: the genre to which each audio file belongs.

4.5.4 Step4:Output

• Finally, the output is shown for the particular test file, with its accuracy and the
genre it belongs to.

Chapter 5

IMPLEMENTATION AND TESTING

5.1 Input and Output

5.1.1 Input Design

The input design for our project involves a module that takes audio files as the test
input for genre prediction. The user should supply only audio files; otherwise, an
error will pop up until the test file is a music file. Many music files can be tested and
taken as input at a time.

Figure 5.1: Dataset for Different Types of Genres

Figure 5.1 shows the dataset. Music genre classification is a popular application
of machine learning, where the goal is to automatically assign a genre label
to a given piece of music. In order to develop an effective machine learning model
for music genre classification, it is crucial to carefully design the input data. The in-
put data for music genre classification typically consists of audio signals or features
extracted from the audio signals.

5.1.2 Output Design

Figure 5.2: Output Design of Genre Classification

Figure 5.2 shows the output design for the project, which involves displaying the
genre for the given test files. Users can use these classified genres for their personal
use, and global applications can also use this source to more accurately convert
collections of music files into the categories called genres.

5.2 Testing

For the prediction of music genre classification, we used machine learning tech-
niques programmed in Python to develop a system that predicts the genre of an
audio file, and we trained and tested the model using datasets.

5.3 Types of Testing

5.3.1 Unit testing

The first step is to test the preprocessing of the Kaggle music data used by the ma-
chine learning technique. Unit testing can help ensure that the data preprocessing
step is producing accurate and consistent results. Once the data is preprocessed, the
next step is to extract features from the data that can be used as inputs to the machine
learning model. Unit testing can help verify that the feature extraction process is
producing accurate and meaningful features. After the features have been extracted,
the next step is to train the machine learning model. Unit testing can help verify that
the model is being trained correctly and is producing accurate results.
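As an illustrative sketch, a unit test for the feature-extraction step might assert the shape and finiteness of the returned vector; the extract_features function here is a hypothetical stand-in, not the project's actual extractor:

```python
import unittest
import numpy as np

def extract_features(signal, n_features=13):
    """Hypothetical stand-in extractor: per-block means of the absolute signal."""
    blocks = np.array_split(np.abs(signal), n_features)
    return np.array([b.mean() for b in blocks])

class TestFeatureExtraction(unittest.TestCase):
    def test_shape_and_values(self):
        signal = np.sin(np.linspace(0, 100, 22050))  # one second of fake audio
        feats = extract_features(signal)
        self.assertEqual(feats.shape, (13,))          # fixed-length feature vector
        self.assertTrue(np.all(np.isfinite(feats)))   # no NaNs or infinities

unittest.main(argv=["feature_tests"], exit=False)     # run the tests without exiting
```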

5.3.2 Integration testing

Integration tests come after unit tests. Their main purpose is to find any irregularity
in the interactions between the different components of the software.
After unit tests, it's useful to test how components work together; for that, we use
integration testing. Integration testing doesn't necessarily mean testing the whole
ML project altogether, but rather one logical part of the project as a single unit.

5.3.3 System testing

System testing is performed on a complete, integrated system to evaluate the com-
pliance of the system with the corresponding requirements. Execute the packages
and modules based on the requirements for frequency extraction from music, imple-
ment the KNN module to get the accuracy for the extracted frequencies, and test the
nearest value based on the accuracy to predict and display the genre.

5.3.4 Test Result

Figure 5.3: Output of Testing Audio Files

Figure 5.3 shows the output whenever a test audio file is given: it clearly shows that
the predicted genre is 'Hiphop'.

Chapter 6

RESULTS AND DISCUSSIONS

6.1 Efficiency of the Proposed System

The proposed system is based on the K-Nearest Neighbor algorithm. K-Nearest
Neighbor uses a distance-based method to find the K most similar neighbours to
new data; the class in which the majority of the neighbours lie is returned as the
output.
The dataset we use is the GTZAN genre collection dataset, a very popular audio
collection dataset. It contains approximately 1000 audio files that belong to 10
different classes: Blues, Classical, Country, Disco, Hip-hop, Jazz, Metal, Pop,
Reggae, and Rock. Each audio file is in .wav format. These are saved in a file.
Finally, we include some audio test files, and the genre each one belongs to is taken
as the output.

6.2 Comparison of Existing and Proposed System

Existing system:
In the existing system, the K-means clustering algorithm computes centroids and
repeats until the optimal centroid is found. It is presumptively known how many
clusters there are. It is also known as the flat clustering algorithm. The number of
clusters found from data by the method is denoted by the letter ‘K’ in K-means. In
this method, data points are assigned to clusters in such a way that the sum of the
squared distances between the data points and the centroid is as small as possible. It
is essential to note that reduced diversity within clusters leads to more identical data.
K-means implements the Expectation-Maximization strategy to solve the problem.
The Expectation-step is used to assign data points to the nearest cluster, and the
Maximization-step is used to compute the centroid of each cluster.
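The Expectation-Maximization alternation described above can be sketched with NumPy; the one-dimensional data points and K = 2 are invented for illustration:

```python
import numpy as np

def kmeans(points, k=2, iters=10, seed=0):
    """Plain K-means: alternate assignment (E-step) and centroid update (M-step)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # E-step: assign each point to its nearest centroid.
        labels = np.argmin(np.abs(points[:, None] - centroids[None, :]), axis=1)
        # M-step: recompute each centroid as the mean of its assigned points.
        centroids = np.array([points[labels == j].mean() for j in range(k)])
    return centroids, labels

data = np.array([1.0, 1.1, 0.9, 8.0, 8.2, 7.8])
centroids, labels = kmeans(data)
print(sorted(centroids))  # centroids settle near the two cluster centres, ~1.0 and ~8.0
```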
Proposed system:
The K-NN algorithm assumes similarity between the new data and the available
cases and puts the new case into the category most similar to the available categories.
The K-NN algorithm stores all the available data and classifies a new data point based
on similarity. This means that when new data appears, it can easily be classified
into a well-suited category using the K-NN algorithm. K-NN can be used
for regression as well as classification, but it is mostly used for classification
problems. K-NN is a non-parametric algorithm, which means it makes no
assumptions about the underlying data. It is also called a lazy learner algorithm because it
does not learn from the training set immediately; instead it stores the dataset and, at
the time of classification, performs an action on the dataset. At the training phase,
the KNN algorithm just stores the dataset, and when it gets new data, it classifies that
data into the category most similar to the new data.

6.3 Sample Code

from python_speech_features import mfcc
import scipy.io.wavfile as wav
import numpy as np

from tempfile import TemporaryFile
import os
import pickle
import random
import operator
import math

def getNeighbors(trainingSet, instance, k):
    distances = []
    for x in range(len(trainingSet)):
        dist = distance(trainingSet[x], instance, k) + distance(instance, trainingSet[x], k)
        distances.append((trainingSet[x][2], dist))
    distances.sort(key=operator.itemgetter(1))
    neighbors = []
    for x in range(k):
        neighbors.append(distances[x][0])
    return neighbors

def nearestClass(neighbors):
    classVote = {}
    for x in range(len(neighbors)):
        response = neighbors[x]
        if response in classVote:
            classVote[response] += 1
        else:
            classVote[response] = 1
    sorter = sorted(classVote.items(), key=operator.itemgetter(1), reverse=True)
    return sorter[0][0]

def getAccuracy(testSet, predictions):
    correct = 0
    for x in range(len(testSet)):
        if testSet[x][-1] == predictions[x]:
            correct += 1
    return 1.0 * correct / len(testSet)

directory = "machine_learning_musicgenre/Data/genres_original/"
f = open("my.dat", "wb")
i = 0
for folder in os.listdir(directory):
    i += 1
    if i == 11:
        break
    for file in os.listdir(directory + folder):
        (rate, sig) = wav.read(directory + folder + "/" + file)
        mfcc_feat = mfcc(sig, rate, winlen=0.020, appendEnergy=False)
        covariance = np.cov(np.matrix.transpose(mfcc_feat))
        mean_matrix = mfcc_feat.mean(0)
        feature = (mean_matrix, covariance, i)
        pickle.dump(feature, f)
f.close()

dataset = []
def loadDataset(filename, split, trSet, teSet):
    with open("my.dat", "rb") as f:
        while True:
            try:
                dataset.append(pickle.load(f))
            except EOFError:
                f.close()
                break
    for x in range(len(dataset)):
        if random.random() < split:
            trSet.append(dataset[x])
        else:
            teSet.append(dataset[x])

trainingSet = []
testSet = []
loadDataset("my.dat", 0.66, trainingSet, testSet)

def distance(instance1, instance2, k):
    distance = 0
    mm1 = instance1[0]
    cm1 = instance1[1]
    mm2 = instance2[0]
    cm2 = instance2[1]
    distance = np.trace(np.dot(np.linalg.inv(cm2), cm1))
    distance += np.dot(np.dot((mm2 - mm1).transpose(), np.linalg.inv(cm2)), mm2 - mm1)
    distance += np.log(np.linalg.det(cm2)) - np.log(np.linalg.det(cm1))
    distance -= k
    return distance

length = len(testSet)
predictions = []
for x in range(length):
    predictions.append(nearestClass(getNeighbors(trainingSet, testSet[x], 5)))

accuracy1 = getAccuracy(testSet, predictions)
print(accuracy1)

from collections import defaultdict
results = defaultdict(int)

directory = "../input/gtzan-dataset-music-genre-classification/Data/genres_original"

i = 1
for folder in os.listdir(directory):
    results[i] = folder
    i += 1

Output

Figure 6.1: Output and accuracy of given audio file (jazz)

Figure 6.1 shows that the output displays the correct genre whenever the test audio
is changed; the genre shown in the output is 'jazz', which is correct.

CONCLUSION AND FUTURE ENHANCEMENTS
6.4 Conclusion

The proposed KNN model has shown improved performance in terms of music
genre classification, music similarity, and music recommendation compared to pre-
vious studies. Because KNN is such a simple model, it is appropriate to employ in
music streaming applications for music similarity and recommendation. When the
performance results are examined, some similar music genres, such as Jazz and
Classical, can lead to misclassification. In order to improve the current results, we
plan to design more comprehensive deep neural network models and to add extra
data models as input in addition to using only the spectrogram. Big data processing
techniques and tools can also be utilized for feature extraction and model creation in
music genre recommendation systems. The accuracy of the proposed system using
KNN is approximately 70.00%.

6.5 Future Enhancements

This represents an initial exploration in symbolic music feature analysis: other
and more complex feature sets will be taken into account to build computational
models better suited to recognizing smaller differences between styles and genres.
Our medium-term target is also the realization of sensibly larger musical corpora,
with different dimensions, class granularity, and coverage. Large-scale resources are
in fact necessary to support more systematic experiments and to assess comparative
analysis.

Chapter 7

PLAGIARISM REPORT

Figure 7.1: Plagiarism Report

Chapter 8

SOURCE CODE & POSTER


PRESENTATION

8.1 Source Code

from python_speech_features import mfcc
import scipy.io.wavfile as wav
import numpy as np

from tempfile import TemporaryFile
import os
import pickle
import random
import operator
import math

def getNeighbors(trainingSet, instance, k):
    distances = []
    for x in range(len(trainingSet)):
        dist = distance(trainingSet[x], instance, k) + distance(instance, trainingSet[x], k)
        distances.append((trainingSet[x][2], dist))
    distances.sort(key=operator.itemgetter(1))
    neighbors = []
    for x in range(k):
        neighbors.append(distances[x][0])
    return neighbors

def nearestClass(neighbors):
    classVote = {}
    for x in range(len(neighbors)):
        response = neighbors[x]
        if response in classVote:
            classVote[response] += 1
        else:
            classVote[response] = 1
    sorter = sorted(classVote.items(), key=operator.itemgetter(1), reverse=True)
    return sorter[0][0]

def getAccuracy(testSet, predictions):
    correct = 0
    for x in range(len(testSet)):
        if testSet[x][-1] == predictions[x]:
            correct += 1
    return 1.0 * correct / len(testSet)

directory = "machine_learning_musicgenre/Data/genres_original/"
f = open("my.dat", "wb")
i = 0
for folder in os.listdir(directory):
    i += 1
    if i == 11:
        break
    for file in os.listdir(directory + folder):
        (rate, sig) = wav.read(directory + folder + "/" + file)
        mfcc_feat = mfcc(sig, rate, winlen=0.020, appendEnergy=False)
        covariance = np.cov(np.matrix.transpose(mfcc_feat))
        mean_matrix = mfcc_feat.mean(0)
        feature = (mean_matrix, covariance, i)
        pickle.dump(feature, f)
f.close()

dataset = []
def loadDataset(filename, split, trSet, teSet):
    with open("my.dat", "rb") as f:
        while True:
            try:
                dataset.append(pickle.load(f))
            except EOFError:
                f.close()
                break
    for x in range(len(dataset)):
        if random.random() < split:
            trSet.append(dataset[x])
        else:
            teSet.append(dataset[x])

trainingSet = []
testSet = []
loadDataset("my.dat", 0.66, trainingSet, testSet)

def distance(instance1, instance2, k):
    distance = 0
    mm1 = instance1[0]
    cm1 = instance1[1]
    mm2 = instance2[0]
    cm2 = instance2[1]
    distance = np.trace(np.dot(np.linalg.inv(cm2), cm1))
    distance += np.dot(np.dot((mm2 - mm1).transpose(), np.linalg.inv(cm2)), mm2 - mm1)
    distance += np.log(np.linalg.det(cm2)) - np.log(np.linalg.det(cm1))
    distance -= k
    return distance

length = len(testSet)
predictions = []
for x in range(length):
    predictions.append(nearestClass(getNeighbors(trainingSet, testSet[x], 5)))

accuracy1 = getAccuracy(testSet, predictions)
print(accuracy1)

from collections import defaultdict
results = defaultdict(int)

directory = "../input/gtzan-dataset-music-genre-classification/Data/genres_original"

i = 1
for folder in os.listdir(directory):
    results[i] = folder
    i += 1
pred = nearestClass(getNeighbors(dataset, feature, 5))
print(results[pred])

8.2 Poster Presentation

Figure 8.1: Poster Presentation

References

[1] M. Li, X. Liu, and Y. Zhang, (2021). "Music Genre Classification Using Convo-
lutional Neural Networks with Multi-Scale Time-Frequency Representations,"
IEEE Transactions on Multimedia, vol. 23, no. 5, pp. 2116-2127.

[2] Y. Chen, Y. Xu, and C. Xu, (2020). "Music Genre Classification Using Convo-
lutional Neural Networks with Attention Mechanism," IEEE Access, vol. 8, pp.
52749-52757.

[3] A. Holzapfel and Y. Stylianou, (2021). "Musical Genre Classification Using
Nonnegative Matrix Factorization-Based Features," IEEE Transactions on Audio,
Speech, and Language Processing, vol. 16, no. 2, pp. 424-434.

[4] Y. Ren, L. Li, and D. Li, (2022). "Music Genre Classification Using Deep
Learning with High-Level Features," IEEE Access, vol. 7, pp. 14795-14803.

[5] S. Lee and S. Lee, (2020). "Music Genre Classification Using Recurrent Neural
Networks and Attention Mechanism," IEEE Access, vol. 7, pp. 165778-165787.

[6] X. Cui, Q. Wu, and J. Chen, (2021). "Music Genre Classification Using
Ensemble Learning Based on Deep Belief Networks," IEEE Access, vol. 8, pp.
25814-25824.

[7] L. Shen and Y. Zhang, (2021). "Music Genre Classification Based on Multimodal
Deep Learning," IEEE Transactions on Multimedia, vol. 23, no. 1, pp. 41-54.

[8] Y. Yang, Y. Yu, and S. Zhang, (2021). "Music Genre Classification Using a
Hybrid Convolutional and Recurrent Neural Network," IEEE Access, vol. 9, pp.
100862-100871.

[9] Y. Wang and L. Zhang, (2022). "Music Genre Classification Based on CNN and
LSTM Neural Networks," IEEE Access, vol. 8, pp. 135982-135990.

[10] L. Xia, Z. Xia, and J. Liu, (2021). "Music Genre Classification Using Convo-
lutional Neural Networks with Transfer Learning," in 2020 IEEE International
Conference on Information and Automation (ICIA), pp. 1630-1635.

[11] A. Porter, D. Bogdanov, R. Kaye, R. Tsukanov, and X. Serra, (2020). "Acoustic-
Brainz: A Community Platform for Gathering Music Information Obtained from
Audio," in ISMIR.

[12] M. I. Mandel and D. P. W. Ellis, (2020). "Song-Level Features and Support Vec-
tor Machines for Music Classification," Queen Mary, University of London.

[13] J.-J. Aucouturier and F. Pachet, (2021). "Improving Timbre Similarity: How
High's the Sky?," Journal of Negative Results in Speech and Audio Sciences,
vol. 1, no. 1.

[14] A. V. Oppenheim, "A Speech Analysis-Synthesis System Based on Homomor-
phic Filtering," Journal of the Acoustical Society of America, vol. 45, pp. 458-465,
February; K. West and S. Cox, (2021). "Features and Classifiers for the Automatic
Classification of Musical Audio Signals," in International Symposium on Music
Information Retrieval.

[15] D. Ellis, A. Berenzweig, and B. Whitman, (2021). "The 'uspop2020' Pop
Music Data Set."

[16] B. Logan and A. Salomon, (2020). "A Music Similarity Function Based on
Signal Analysis," in ICME 2020, Tokyo, Japan.
