1 Professor & HOD, 2 Assistant Professor
Department of Electronics and Communication Engineering
CMR College of Engineering & Technology, Kandlakoya, Medchal Road, Hyderabad-501401
Abstract- Signals and sound waves are part of our everyday life; music, however, is a distinct type of signal. Each musical sound has a certain pitch that we can differentiate as a note. A song basically contains two things: vocals and background music. Reading a song from sheet music and then playing it on an instrument is an easy task for any musician, and in this century computer software has been designed to do the same: programs can generate audio files (music we can hear) from sheet music very effectively for a whole range of instruments. The major problem is that the reverse task, listening to or recording audible music and then generating the sheet music for that piece, is much more difficult, for computers and talented musicians alike. Extracting the characteristics of a song is important for various objectives such as learning, teaching, and composing. The idea of this project is to develop a program that takes an audio input (a song) and processes it in order to give the musical notes as output.
Keywords: Time-Frequency Analysis; Musical Note; Sampling Frequency; Recording;
Extraction.
I. Introduction
Songs play a vital role in our day-to-day life. A song contains two things: vocals and background music. The characteristics of the voice depend on the singer, while the background music involves a mixture of different musical instruments such as piano, guitar, and drums. Extracting the characteristics of a song is important for various objectives like learning, teaching, and composing. This project takes a song as input, extracts its features, and detects and identifies the notes, each with a duration. First, the song is recorded, and digital signal processing algorithms are used to identify its characteristics. The experiment is first done with several piano songs whose notes are already known; the identified notes are compared with the original notes until the detection rate goes higher. The proposed algorithm is then applied to piano songs with unknown notes. The ability to derive the relevant musical information from a live or recorded performance is relatively easy for a trained listener but highly non-trivial for a learner or a computer. For several practical applications, it would be desirable to obtain this information in a quick, error-free, automated fashion. This thesis discusses the design of a software system that accepts as input a digitized waveform representing an acoustical music signal and attempts to derive the notes from the signal so that a musical score can be produced. The signal processing algorithm involves event detection, locating precisely where within the signal the various notes begin and end, and pitch extraction, identifying the pitches being played in each interval. Event detection is carried out using time-domain analysis of the signal, where the problem arises at different speeds. Pitch detection (that is, frequency identification) is more complicated because of a situation we call harmonic ambiguity, which occurs when one pitch's fundamental frequency is an integer multiple of another's. The problem is addressed by careful signal processing in both the time domain and the frequency domain. The main objective of this project is to create a learning aid for musicians, producers, composers, DJs, remixers, teachers, and music students. The project can be treated as a box: you give any song as input and get the features of the song out. It aims to propose methods to analyze and describe a signal from which the musical parameters can be easily and objectively obtained in a sensible manner. A common limitation found in the musical literature is that the way such parameters are obtained is intuitively satisfactory but, in our view, not very sound from a signal processing perspective.
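The event-detection stage described above can be illustrated with a minimal time-domain sketch: a short-time-energy onset detector. The frame size, hop, and threshold below are illustrative assumptions, not the paper's actual parameters, and the detector is deliberately simplistic.

```python
import numpy as np

def detect_onsets(signal, frame_size=1024, hop=512, threshold=4.0):
    """Return sample indices of frames where the short-time energy jumps
    above `threshold` times the energy of the previous frame."""
    onsets = []
    prev_energy = None
    for start in range(0, len(signal) - frame_size, hop):
        frame = signal[start:start + frame_size]
        energy = float(np.sum(frame ** 2))
        if prev_energy is not None and prev_energy > 0:
            if energy / prev_energy > threshold:
                onsets.append(start)
        prev_energy = energy
    return onsets

# Synthetic test: near-silence followed by a 440 Hz tone starting at 0.5 s.
fs = 8000
t = np.arange(fs) / fs                                  # one second of samples
signal = np.where(t < 0.5, 0.001, 1.0) * np.sin(2 * np.pi * 440 * t)
onsets = detect_onsets(signal)
print(onsets[0] / fs)   # first detected onset, near 0.5 s
```

A real pipeline would smooth the energy envelope and suppress duplicate detections within one note, but the thresholded energy-jump idea is the same.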
Fig 1: Note durations

Every song has a tempo, the speed at which the music is to be played. Tempo is defined in beats per minute, where a beat is usually defined to be a particular length of note. All note lengths are then given a value, such as a quarter or a half, and this value determines how many beats the note should last. Interestingly enough, a beat is usually defined to be one quarter note, and thus a quarter note is 1 beat, a half note is 2 beats, and an eighth note is half a beat.

Fig 2: Frequency table

II. LITERATURE SURVEY
Scales are fundamental to all music. In Ancient Greece, scales were referred to as modes. Some commonly known scales are the major scale, minor scale, jazz scale, blues scale, etc. A Raga can be derived from a scale; it is a unique personality or distinct flavor with no fixed rules about precisely which combination of notes to use. The ascending and descending note sequences in Indian classical music are called Arohana and Avarohana. The home note is named the Griha Swara, the dominant is called the Vaadi, and the subdominant is called the Samvadi. The dissonant is called the Vivadi, and the landing or resting notes are called Nyasa Swaras. A Raga must consist of at least 5 notes in an octave; Ragas sung with only 3 or 4 notes in an octave are very rarely performed. The root note is "Sa", and a Raga must use "Ma" or "Pa" by default; either one of them or both can be used in the same Raga, while other notes are optional. Various combinations can be performed using the octaves.

To gain expertise in these concepts and to develop an algorithm, some more references were reviewed and are discussed below. In the paper Musical Notes Identification using Digital Signal Processing [1], the input is taken as an audio file, which is processed to extract the music features needed to identify the notes of the song. Digital signal processing techniques are used to identify the characteristics of the song, and their use is explained. Only piano songs are allowed as input; piano songs are used because their notes are already known, so the identified notes and the original notes can be compared until a higher detection rate is reached. The method used there for note identification is more optimized than previous methods.
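The tempo arithmetic described above (one beat = one quarter note, a half note = 2 beats, an eighth note = half a beat) can be sketched as a small helper. The function name and the convention of writing a quarter note as 0.25 are illustrative choices, not part of the paper.

```python
def note_duration(note_value, bpm):
    """note_value: 0.25 for a quarter note, 0.5 for a half note, 0.125 for
    an eighth note. Returns the note's duration in seconds at `bpm` beats
    per minute, assuming one beat = one quarter note."""
    beats = note_value / 0.25      # how many beats this note lasts
    return beats * 60.0 / bpm      # one beat lasts 60/bpm seconds

print(note_duration(0.25, 120))    # quarter note at 120 BPM -> 0.5 s
print(note_duration(0.5, 120))     # half note -> 1.0 s
print(note_duration(0.125, 120))   # eighth note -> 0.25 s
```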
We can get the results by varying parameters such as the threshold values and the width, together with the time duration of each note. Thus, it can be used as a tool for learning the notes of a song.

pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis [8] presents an open-source Python library that provides a wide range of audio analysis procedures, including feature extraction, classification of audio signals, and supervised and unsupervised segmentation. pyAudioAnalysis is licensed under the Apache License. The paper describes the implemented methodologies and their theoretical background, along with an evaluation of the methods on several metrics. Several audio analysis research applications use pyAudioAnalysis: speech emotion recognition, depression classification based on audiovisual features, smart-home functionalities through audio event detection, music segmentation, multimodal content-based movie recommendation, and health applications such as monitoring eating habits. SVM regression maps the audio features extracted in the previous steps to one or more supervised variables. The library also provides a semi-supervised silence removal functionality. Based on the literature survey, future enhancements have been drafted.

2.1 System Model
A sound can be characterized by the following three quantities:
(i) Pitch.
(ii) Quality.
(iii) Loudness.
Pitch is the frequency of a sound as perceived by the human ear. A high frequency gives rise to a high-pitched note, and a low frequency produces a low-pitched note. A pure tone is a sound of only one frequency, such as that given by a tuning fork or an electronic signal generator. The fundamental note has the greatest amplitude and is heard predominantly because it has the largest intensity. The other frequencies, such as 2fo, 3fo, and 4fo, are called overtones or harmonics, and they determine the quality of the sound. Loudness is a physiological sensation; it depends mainly on sound pressure but also on the spectrum of the harmonics and the physical duration.

2.2 Musical Notes
Humans can hear signal frequencies ranging from 20 Hz to 20 kHz. From this wide range, some part is associated with the piano, and different pianos have different ranges. Each tone of the piano has one particular fundamental frequency and is represented by a note such as C, D, etc., as shown in fig 1.
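Assuming standard twelve-tone equal temperament with a reference pitch of A4 = 440 Hz (a common convention; the reference is an assumption here, not stated in the paper), the note-to-frequency relation behind the frequency table can be sketched as:

```python
# Each half step multiplies the frequency by 2**(1/12), so a note 12 half
# steps (one octave) higher has exactly double the fundamental frequency.
A4 = 440.0  # assumed concert-pitch reference

def note_frequency(half_steps_from_a4):
    """Fundamental frequency of the note `half_steps_from_a4` half steps
    above (positive) or below (negative) A4, in equal temperament."""
    return A4 * 2 ** (half_steps_from_a4 / 12)

print(round(note_frequency(0), 2))     # A4 -> 440.0
print(round(note_frequency(12), 2))    # A5 -> 880.0 (one octave up, doubled)
print(round(note_frequency(3), 2))     # C5 -> 523.25
```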
The next C is 12 half steps away from the previous one and has double the fundamental frequency. Hence this portion (from one C to the immediately next C) is called one octave. Different octaves are differentiated as C1, C2, etc.

The process of recording discrete time values is called sampling, and the process of recording discrete pressures is called quantizing. Recording studios use a standard sampling frequency of 48 kHz, while CDs use a rate of twice the highest frequency present in the signal. Humans can hear frequencies from 20 Hz to 20 kHz.

2.4 Frequency & Fourier Transforms (FFT)
A Fourier transform provides the means to break up a complicated signal, such as a musical tone, into its constituent sinusoids. The classical method involves many integrals and a continuous signal; since we want to perform a Fourier transform on a sampled (rather than continuous) signal, we have to use the Discrete Fourier Transform instead.

The audio signals stored in the computer differ in format, sampling rate, and number of bits, and the original audio signal may contain sharp noise, which can affect the processing. At the same time, to unify the feature extraction, the original audio data needs to be pre-processed into a unified
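The DFT analysis described in section 2.4 can be sketched with NumPy's FFT routines. A synthetic middle-C tone stands in for a recorded song here; the sampling rate and note choice are illustrative assumptions.

```python
import numpy as np

fs = 44100                                  # assumed CD-quality sampling rate
t = np.arange(fs) / fs                      # one second of samples
tone = np.sin(2 * np.pi * 261.63 * t)       # synthetic middle-C (C4) tone

# Discrete Fourier Transform of the sampled signal; rfft returns only the
# non-negative frequency bins of a real-valued input.
spectrum = np.abs(np.fft.rfft(tone))
freqs = np.fft.rfftfreq(len(tone), d=1 / fs)

peak = freqs[np.argmax(spectrum)]
print(peak)   # dominant spectral peak, close to 261.63 Hz
```

With one second of signal the bins are spaced 1 Hz apart; a real pipeline would also have to distinguish the fundamental from its harmonics (2fo, 3fo, ...), which is the harmonic-ambiguity problem noted in the introduction.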