A Perceptually Grounded Approach to Sound Analysis

A study of an algorithm for real-time audio onset detection based on a constant-Q transform. The work was developed within the Orchestra Meccanica Marinetti project, which consists of two drum-playing robots controlled by human gestures via MIDI. The algorithm detects the perceived attack of the sound, so that the delay between the generation of a MIDI note and the sound produced by the hit on the drum can be measured and compensated for during a live performance.

Published by: Corrado Zenji Scanavino on Oct 07, 2009
Copyright: Attribution Non-commercial


11/14/2012

The bonk∼ method works essentially on a specialization of the constant-Q filter bank
analysis, called bounded-Q analysis. This method has the advantage of drastically
reducing the complexity of the constant-Q transform, as described in [17] after
[29] and [11]. In this kind of analysis the value of Q is limited (bounded) to approximately
5, and a small number of filters can be used to obtain a filterbank that gives at least
the same results as a constant-Q analysis. In addition, bounded-Q analysis can take
advantage of an FFT-like algorithm applied within each octave. This
is possible because the octaves are geometrically spaced, while within each octave
the frequency bins are equally spaced, as shown in figure 6.5. With a proper number of
channels per octave, this channel distribution becomes a good approximation of the
geometric scale.
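
The channel layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the bonk∼ implementation and not the filterbank of table 6.1: the function names, the starting frequency, the number of octaves and the number of bins per octave are all hypothetical. The sketch places octave edges geometrically (each edge doubles), spaces the bins linearly inside each octave, and computes Q as centre frequency divided by bandwidth.

```python
def bounded_q_centers(f_min, n_octaves, bins_per_octave):
    """Centre frequencies of a bounded-Q style layout (hypothetical helper).

    Octave k spans [f_min * 2**k, f_min * 2**(k + 1)); inside it the
    bins are equally wide, so their centres are linearly spaced.
    """
    centers = []
    for k in range(n_octaves):
        lo = f_min * 2 ** k            # lower edge of octave k (geometric)
        width = lo / bins_per_octave   # the octave is 'lo' Hz wide
        for b in range(bins_per_octave):
            centers.append(lo + (b + 0.5) * width)
    return centers


def bounded_q_factors(f_min, n_octaves, bins_per_octave):
    """Q = centre frequency / bandwidth for every bin of the layout."""
    factors = []
    for k in range(n_octaves):
        lo = f_min * 2 ** k
        width = lo / bins_per_octave
        for b in range(bins_per_octave):
            factors.append((lo + (b + 0.5) * width) / width)
    return factors


if __name__ == "__main__":
    # Illustrative values only: 3 octaves starting at 100 Hz, 2 bins each.
    print(bounded_q_centers(100.0, 3, 2))
    print(bounded_q_factors(100.0, 3, 2))
```

In this simplified layout Q repeats identically in every octave and stays between bins_per_octave + 0.5 and 2 * bins_per_octave - 0.5, which is the sense in which Q is bounded rather than constant.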

Puckette in [38] says: "the bonk∼ object was written for dealing with sound sources
for which sinusoidal decomposition breaks down; the first application has been to drums
and percussion." That is exactly our case. The bandwidths of the filters subdivide the sound
spectrum into regions which are approximately tuned around the critical bands, in a
similar manner to the Klapuri approach described above. This should mimic the behavior
of the auditory system well.

6.5 – Onset Detection in ·O M M·

Figure 6.5: Graphical representation of the bounded-Q filterbank. Only the octaves are
geometrically spaced; within each octave the spacing between analysis bins is linear.
This allows an FFT-like algorithm to be applied to calculate the spectrum of each
component.

We found that an implementation with 15 non-overlapping filters was successful for
our case. See table 6.1 for details on the filters used for the band-wise analysis. In this
table the spacing of two filters per octave can be easily recognized, except where
prohibited (the first two filters do not respect this spacing; see footnote 20). The details
of the filterbank's implementation can be found in the appendix of this thesis.
The final stage, which we earlier called the peak-picking stage, in bonk∼ works
essentially with the definition of a growth function.
