
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/311848940

A study of Shazam’s Audio Recognition

Presentation · December 2016


DOI: 10.13140/RG.2.2.21768.21766


1 author:

Guendalina Palmirotta
University of Luxembourg

All content following this page was uploaded by Guendalina Palmirotta on 23 December 2016.



A study of Shazam’s Audio Recognition
Seminar in Data Science

PALMIROTTA Guendalina

University of Luxembourg,
Faculty of Science, Technology and Communication,
Master in Mathematics
Winter semester 2016-2017

December 21, 2016


1 Get in touch with Shazam
    Motivation
2 The magic behind the Shazam algorithm
    From digital sound to spectrogram
        Spectrogram (Sonic visualization)
    Fingerprinting and Hash functions
        Anchor point and target zone
    Matching of a song
        Scatterplot and Histogram
3 Conclusion
    Robustness, Speed and Noise resistance
4 A mini Shazam on R




Motivation – Introduction

1 Capture the music for a few seconds (5-15 s)
2 Identification of the song
3 Display the information (name, artist, album)

Goal
Recognize an unknown song in a short time using the Shazam music application.


Difficulty and constraints

Develop an algorithm that is able to:
    Work from a short sample of music captured by a small microphone
    Cope with heavy ambient noise that is often mixed into the sample
    Identify the song quickly in a large database of music ⇒ 2M tracks

Keywords: Robustness, Noise resistance and Speed


Seems to be magic, but...




Overview

Shazam has two sides: the ’Client’ side and the ’Server’ side.


Step by step – Continuous signal to discrete signal

Step 1
The song must be transformed into a time-frequency graph, which we call a spectrogram. We then filter the spectrogram to obtain a constellation map that keeps only the ’important points’.


How to get a spectrogram?

Goal: From digital sound to frequency

Discrete Fourier Transform (DFT)

$$X(n) = \sum_{k=0}^{N-1} x[k]\, e^{-\frac{2\pi i k n}{N}},$$

where
    N is the size of the (Hamming) window,
    X(n) is the n-th bin of frequencies,
    x[k] is the k-th sample of the audio signal.

⇒ Use the Fast Fourier Transform (FFT) instead of evaluating the DFT sum directly
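To make the DFT formula and the FFT remark concrete, here is a minimal Python sketch that uses only the standard library. The function names `dft` and `fft` and the 8-sample test signal are illustrative assumptions, not part of the presentation. The naive `dft` evaluates the sum exactly as written above, and the recursive radix-2 `fft` gives the same bins with fewer operations, assuming the window size is a power of two.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT: X(n) = sum_k x[k] * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

def fft(x):
    """Recursive radix-2 FFT. Assumption: len(x) is a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # DFT of the even-indexed samples
    odd = fft(x[1::2])    # DFT of the odd-indexed samples
    out = [0j] * N
    for n in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * n / N) * odd[n]
        out[n] = even[n] + twiddle
        out[n + N // 2] = even[n] - twiddle
    return out

# Hypothetical test signal: a sine wave with exactly 2 cycles in an 8-sample window.
N = 8
signal = [math.sin(2 * math.pi * 2 * k / N) for k in range(N)]
magnitudes = [abs(c) for c in fft(signal)]
```

For this signal the energy is concentrated in bin 2 (and in its mirror, bin 6), which is the kind of frequency content a spectrogram column records.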


Spectrogram filtering

Combination of sine waves at multiple frequencies
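As a rough illustration of a spectrogram for a combination of sine waves, the sketch below cuts a signal into overlapping frames, applies a Hamming window to each frame and takes the magnitudes of its DFT bins. The sample rate, window size, hop size and the two test frequencies are hypothetical values chosen so that each frequency falls exactly on a bin. A naive DFT is used to keep the block self-contained.

```python
import cmath
import math

def hamming(N):
    """Hamming window of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (N - 1)) for k in range(N)]

def dft_magnitudes(frame):
    """Magnitudes of the first N/2 DFT bins of one frame (naive DFT)."""
    N = len(frame)
    return [abs(sum(frame[k] * cmath.exp(-2j * cmath.pi * k * n / N)
                    for k in range(N)))
            for n in range(N // 2)]

def spectrogram(samples, window_size, hop):
    """Return a list of frames; each frame is a list of bin magnitudes."""
    window = hamming(window_size)
    frames = []
    for start in range(0, len(samples) - window_size + 1, hop):
        chunk = samples[start:start + window_size]
        frames.append(dft_magnitudes([s * w for s, w in zip(chunk, window)]))
    return frames

# Hypothetical parameters: 1024 Hz sampling and 64-sample windows give 16 Hz bins.
SAMPLE_RATE = 1024
WINDOW = 64
samples = [math.sin(2 * math.pi * 128 * t / SAMPLE_RATE)
           + 0.5 * math.sin(2 * math.pi * 320 * t / SAMPLE_RATE)
           for t in range(256)]
spec = spectrogram(samples, WINDOW, hop=32)
```

With these values the 128 Hz component lands in bin 8 and the weaker 320 Hz component in bin 20, so every frame of `spec` shows two ridges at those bins.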


Filtration: Constellation map

What are the important points to consider?
    It depends on the coefficients of the bins
    It depends on the number of bands in which the strongest time-frequency point is searched
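One possible way to select the ’important points’ is sketched below. It is a common simplification, assumed here for illustration only, and not necessarily the filter used in the presentation or by Shazam: each spectrogram frame is split into frequency bands, the strongest bin of each band is taken as a candidate, and a candidate is kept only if its magnitude reaches `coeff` times the mean of the candidates in that frame. The names `constellation`, `band_edges` and `coeff` and the toy spectrogram are hypothetical.

```python
def constellation(spec, band_edges, coeff=1.0):
    """Return (time_index, bin_index) peaks of a spectrogram.

    spec       : list of frames, each a list of bin magnitudes
    band_edges : bin indices delimiting the bands, e.g. [0, 2, 4, 6, 8]
    coeff      : threshold factor applied to the mean candidate magnitude
    """
    peaks = []
    for t, frame in enumerate(spec):
        # Strongest bin inside each band of this frame.
        candidates = []
        for lo, hi in zip(band_edges[:-1], band_edges[1:]):
            best = max(range(lo, hi), key=lambda b: frame[b])
            candidates.append((frame[best], best))
        # Keep only the candidates that stand out within the frame.
        threshold = coeff * sum(m for m, _ in candidates) / len(candidates)
        peaks.extend((t, b) for m, b in candidates if m >= threshold)
    return peaks

# Toy spectrogram: 2 frames of 8 bins, split into 4 bands of 2 bins each.
spec = [
    [0.1, 5.0, 0.2, 0.1, 0.3, 0.2, 4.0, 0.1],
    [0.2, 0.1, 6.0, 0.1, 0.2, 0.1, 0.3, 0.2],
]
peaks = constellation(spec, band_edges=[0, 2, 4, 6, 8])
```

Using more bands or a smaller `coeff` keeps more points, which makes the map denser at the cost of more hashes to store later.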


Step by step – Fingerprinting and Hash functions

Step 2
We encode the song as a unique acoustic fingerprint and store it in a hash table.


How to store?

Use anchor points, each with its corresponding zone, called the target zone.

Idea
Look for multiple points at the same time instead of comparing each point one by one.


Hash details

Each time a point lies inside the target zone of an anchor point → a hash is created.


Searching and Scoring the fingerprints

Fingerprint
    Unknown song: [(f1, f2, ∆t), t1] → [t1]
    Adding a new song: [(f1, f2, ∆t), t1, ID] → [t1, ID]

where
    (t1, f1) is the time-frequency point at which the anchor point is located,
    (t2, f2) is the time-frequency point at which the point in the target zone is positioned,
    ∆t = t2 − t1 is the time difference between t1 and t2,
    ID is the identifier of the song (name, artist, album).
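The fingerprint [(f1, f2, ∆t), t1] can be sketched in Python as follows. Each peak of the constellation map is used in turn as an anchor and paired with the peaks that fall inside its target zone. The shape of the target zone (the limits `min_dt`, `max_dt`, `max_df`) and the cap `fan_out` on the number of pairs per anchor are assumptions made for this example, as is the function name `fingerprints`.

```python
def fingerprints(peaks, min_dt=1, max_dt=10, max_df=50, fan_out=3):
    """Yield ((f1, f2, dt), t1) for each anchor/target pair.

    peaks : iterable of (t, f) points from the constellation map
    The target zone of an anchor (t1, f1) is assumed to contain the
    points (t2, f2) with min_dt <= t2 - t1 <= max_dt and |f2 - f1| <= max_df.
    """
    peaks = sorted(peaks)  # order by time, then frequency
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # later peaks are even further away in time
            if dt >= min_dt and abs(f2 - f1) <= max_df:
                yield (f1, f2, dt), t1
                paired += 1
                if paired == fan_out:
                    break

# Hypothetical constellation: (time index, frequency bin).
peaks = [(0, 10), (1, 20), (2, 15), (30, 10)]
prints = list(fingerprints(peaks))
```

The last peak is too far in time from the others, so it neither produces nor belongs to any pair. The hash (f1, f2, ∆t) does not contain t1, which is what makes it independent of where in the song the sample was recorded.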


Hash function and table

⇒ Store in a hash table in the database

Hash table: Hashtable = [Hash(1), . . . , Hash(n)]
A bucket is a specific location in the database.
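A hash table with buckets can be mimicked with a Python dictionary, as in the sketch below. The key is the hash (f1, f2, ∆t) and the bucket is the list of [t1, ID] entries stored under that key. The song identifiers and fingerprint values are made up for the example, and a real database would use a persistent store rather than an in-memory dictionary.

```python
from collections import defaultdict

def add_song(table, song_id, prints):
    """Store every fingerprint of a song: hash -> bucket of (t1, song_id)."""
    for key, t1 in prints:
        table[key].append((t1, song_id))

table = defaultdict(list)

# Hypothetical fingerprints, in the form ((f1, f2, dt), t1).
add_song(table, "song_A", [((10, 20, 1), 0), ((10, 15, 2), 0), ((20, 15, 1), 1)])
add_song(table, "song_B", [((10, 20, 1), 7), ((30, 40, 3), 9)])

# Both songs contain the hash (10, 20, 1), so they share one bucket.
bucket = table[(10, 20, 1)]
```

Looking up a hash costs the same on average whatever the number of stored songs, which is what keeps the search fast over a large database.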


Step by step – Matching of a song

Step 3
To score a match we use a scatterplot and the corresponding histogram.


Scatterplot and Histogram of no matching


Scatterplot and Histogram of matching
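The scoring behind the scatterplot and the histogram can be sketched as follows, under the assumption that each hash of the unknown sample is looked up in the table and that, for every hit, the difference between the time in the stored song and the time in the sample is recorded. For the right song these differences are nearly all equal (the points of the scatterplot line up on a diagonal), so the histogram of differences shows one tall peak. For a wrong song they are scattered. The function name `best_match` and all data are illustrative.

```python
from collections import Counter, defaultdict

def best_match(table, query_prints):
    """Return (song_id, score), where score is the tallest histogram bar."""
    offsets = defaultdict(Counter)  # song_id -> histogram of time offsets
    for key, t_query in query_prints:
        for t_song, song_id in table.get(key, []):
            offsets[song_id][t_song - t_query] += 1
    if not offsets:
        return None, 0
    # Score of a song = height of the highest peak in its histogram.
    scores = {song: hist.most_common(1)[0][1] for song, hist in offsets.items()}
    winner = max(scores, key=scores.get)
    return winner, scores[winner]

# Hypothetical database: hash -> bucket of (t1, song_id).
table = {
    (10, 20, 1): [(100, "song_A"), (7, "song_B")],
    (10, 15, 2): [(100, "song_A")],
    (20, 15, 1): [(101, "song_A"), (50, "song_B")],
}
# Hypothetical fingerprints of the recorded sample: ((f1, f2, dt), t1).
query = [((10, 20, 1), 0), ((10, 15, 2), 0), ((20, 15, 1), 1)]
result = best_match(table, query)
```

Here all three hits for song_A share the offset 100, while the two hits for song_B have different offsets, so song_A wins with a score of 3.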




Robustness, Speed and Noise resistance

Identify the music with a 90% rate of correctness, in a short time, even with noise!

