
Modern Methods of Extracting Vocal A Cappellas From Stereo Recordings


Lachlan Nixon


























TABLE OF CONTENTS

INTRODUCTION
MOTIVATIONS BEHIND EXTRACTING VOCALS FROM STEREO RECORDINGS
LIMITATIONS OF THE METHODS
ACQUIRING A CAPPELLAS INSTEAD OF CREATING THEM
DAWS AND VOCAL EXTRACTION

METHOD 1: PHASE INVERSION

INTRODUCTION
FINDING INSTRUMENTAL VERSIONS OF SONGS
DESCRIPTION OF METHOD
AUDIO EXAMPLE
EVALUATION OF THE PHASE INVERSION METHOD

METHOD 2: SUBTRACTIVE EQUALISATION

INTRODUCTION
DESCRIPTION OF METHOD
AUDIO EXAMPLE
EVALUATION OF THE SUBTRACTIVE EQUALISATION METHOD

METHOD 3: CENTRE CHANNEL ISOLATION

INTRODUCTION
DESCRIPTION OF METHOD
AUDIO EXAMPLE
CENTRE CHANNEL ISOLATION USING VST PLUGINS
EVALUATION OF THE CENTRE CHANNEL ISOLATION METHOD

METHOD 4: STEREO FIELD EXTRACTION

INTRODUCTION
DESCRIPTION OF METHOD
SOFTWARE CAPABLE OF STEREO FIELD EXTRACTION
AUDIO EXAMPLE
EVALUATION OF THE STEREO FIELD EXTRACTION METHOD

METHOD 5: NOISE REMOVAL

INTRODUCTION
DESCRIPTION OF METHOD
DAWS AND VSTS CAPABLE OF NOISE PROFILING
AUDIO EXAMPLE
EVALUATION OF THE NOISE REMOVAL METHOD

REMOVING THE ARTEFACTS

NOISE GATES
DOWNWARDS EXPANSION
PHASE INVERTED SAMPLES OR LOOPS
SPECTRAL EDITING

USING MULTIPLE METHODS

CONCLUSION

GLOSSARY






























Disclosure

The information in this document is intended for academic and educational
purposes only. Nothing in this document should be taken to encourage the
unauthorised use of intellectual property owned by another person or entity.
Before applying the information contained in this document to the intellectual
property of another person or entity, their express permission must first be
obtained. The author of this document takes no responsibility for any breach of
intellectual property rights that may result from the provision of the
information. Users of this document are wholly responsible for any
consequences resulting from their use of the information provided.
























Figures

Figure 1: The instrumental and original in two separate audio tracks in Ableton
Live 9 Suite.
Figure 2: Ableton's Clip View for an audio sample, showing the warp button and
the vertical gain slider.
Figure 3: Turning the grid off in Ableton Live 9 Suite.
Figure 4: Aligned instrumental and original audio tracks at full zoom in
Ableton's arrangement view.
Figure 5: Ableton's Utility audio effect.
Figure 6: The show/hide track delay button.
Figure 7: Setting the track delay by samples.
Figure 8: The show/hide in/out preferences button.
Figure 9: Setting an audio track to resampling mode.
Figure 10: Audio in the process of being resampled into a new audio track.
Figure 11: Ableton's EQ8 with one active filter.
Figure 12: Selecting a steep High Pass Filter in EQ8.
Figure 13: Setting the High Pass Filter in EQ8.
Figure 14: Adding a notch filter to EQ8.
Figure 15: A notch filter in EQ8 with maximum gain.
Figure 16: A notch filter in EQ8 with a narrowed bandwidth.
Figure 17: Identifying unwanted frequencies with the notch filter in audition
mode.
Figure 18: The frequency range of common instruments.
Figure 19: Reducing the gain of unwanted frequencies in EQ8.
Figure 20: Using subtractive EQ to remove several instruments in a section of a
song.
Figure 21: Selecting the EQ8 in the Fades/Device chooser.
Figure 22: Selecting to automate the on/off function of filter 2 of the EQ8.
Figure 23: Automation of the on/off function of the EQ8.
Figure 24: Switching the EQ8 to mid/side mode.
Figure 25: Switching the EQ8 to side mode.
Figure 26: Adding a high pass filter to EQ8 in Side Mode.
Figure 27: Extreme high-passing the side panned frequencies using EQ8.
Figure 28: The play button within R-Mix.
Figure 29: R-Mix's harmonic placement window during playback.
Figure 30: Lowering the outside level fader in R-Mix to the lowest possible
value.
Figure 31: Finding the vocals in the Stereo Field in R-Mix.
Figure 32: R-Mix's Shape Switch.
Figure 33: R-Mix's Export Button.
Figure 34: The noise reduction window in Audacity.
Figure 35: A noise gate in Ableton removing all the audio set below the
threshold.
Figure 36: Ableton's Multiband Dynamics plugin.
Figure 37: Spectral Editing in Audacity.


Tables

Table 1: A list of the names and URLs of current a cappella hosting websites
Table 2: A list of the names and URLs of current online remix competitions
Table 3: A list of the names and URLs of Instrumental Hosting Websites
Table 4: A list of VST plugins capable of Centre Channel Isolation
Table 5: A list of VSTs and standalone software capable of Stereo Field
Extraction
Table 6: A list of DAWs and VSTs capable of setting the Noise Profile
Table 7: A list of VST plugins, DAWs and standalone software capable of Spectral
Editing




































Audio Examples

These tracks are provided on a USB drive along with this essay.

Track 1: Original version of Can't Feel My Face by The Weeknd
Track 2: Instrumental version of Can't Feel My Face by The Weeknd
Track 3: A Cappella version of Can't Feel My Face by The Weeknd created using
Phase Inversion
Track 4: A Cappella version of Can't Feel My Face by The Weeknd created using
Subtractive Equalisation
Track 5: A Cappella version of Can't Feel My Face by The Weeknd created using
Centre Channel Isolation
Track 6: A Cappella version of Can't Feel My Face by The Weeknd created using
Stereo Field Extraction
Track 7: A Cappella version of Can't Feel My Face by The Weeknd created using
Noise Removal
Track 8: A Cappella version of Can't Feel My Face by The Weeknd created using
multiple methods






Introduction

An a cappella, Italian for "in the manner of the chapel"1, is defined as a vocal
performance without any instrumental accompaniment.2 In the digital age, an a
cappella version of a recording refers to a digital audio file which contains only
the vocals and no other accompanying instruments. Digital a cappellas can be
either obtained from various sources or created from the original recording.
Once an a cappella has been acquired, it can be used by music producers for their
own productions.

Currently, no comprehensive analysis exists that describes, compares and
contrasts each method of vocal extraction. An analysis and comparison of all of
the methods together will provide a helpful resource, as a combination of
methods is often required to produce satisfactory results. To this end, this essay
aims to provide an overview of all the processes that are currently available to
extract vocals from a stereo recording and how to get the best results from them.
It will also provide audio examples of a cappellas created using these methods,
as well as offering recommendations on the context in which they should be
used.

This essay utilises advanced audio engineering concepts and techniques. A
strong understanding of Ableton Live 9 (or other similar DAW), audio
recordings, VST plugins and mixing and production techniques will be required
to make sense of the processes described. The glossary provides definitions of
key terms.

Motivations behind extracting vocals from stereo
recordings

It is important to understand why a cappellas are created and what they are used
for, as this influences the choice of technique used. Due to the nature of the
motivations described below, it is not necessary to create a perfect a cappella in
the majority of situations. The following is a non-exhaustive list of motivations
behind the desire to extract the vocals from a recording.

(1) To Create a Bootleg Remix

A primary use of a cappellas is the creation of bootleg remixes. Bootleg
remixing is a trend in electronic music production that involves creating an
adaptation of a song using the original recording as source material.3 The
resulting music is referred to as a bootleg as it is not created with the express
permission of the original artist.4 Many electronic music producers are
constantly searching for and creating a cappellas. Once the isolated vocals have
been acquired, either by obtaining them from another source or by creating their
own, a producer can add their own musical elements to produce a version of
another artist's track.5

1 "A cappella" Oxford Music Online. Oxford University Press, accessed October 23, 2015,
http://www.oxfordmusiconline.com/subscriber/article/grove/music/00091.
2 I. R. Titze, The Human Instrument, Scientific American 298 (2008): 94-101.
3 Neil Strauss, Spreading By The Web, Pop's Bootleg Remix, The New York Times, May 9
2002, 18.

(2) To Create a Mash-up

A Mash-up is a track that typically merges the instrumentation of one song
with the vocals of another.6 Once an a cappella of one song has been acquired
and the instrumental of another obtained or created, they are typically combined
together. Mash-ups have become incredibly popular in the 21st century.7 Many
artists exist who have created careers solely from the art of creating and
performing Mash-ups of other people's songs.8

(3) For Stem Mixing

When DJs have access to a cappella versions of songs, they gain the ability to mix
the vocals of one song over the instruments from another song while performing
live. Isolated a cappellas can be used as loops, one shot samples or entire melody
tracks.9 When a cappellas are used in this way while a DJ performs it is called
stem mixing10. Each stem is played on a separate deck (some DJs perform with
up to 8 decks) or through the use of a DAW such as Ableton Live 9 Suite. This is a
highly creative process for DJs who wish to take their performances to the next
level.11

(4) For Vocal Analysis

Extracting the vocals from a recording can be incredibly useful for vocalists who
wish to learn from other performers. Once a vocal has been isolated, it is easier
to hear the nuances and subtleties in a vocalist's delivery. The a cappella version
of the recording allows vocal analysis free from any distraction from the
frequencies produced by the other instruments.12

4 Neil Strauss, Spreading By The Web, Pop's Bootleg Remix, The New York Times, May 9
2002, 18.
5 A. Renzo, '"Sounds Like an Official Mix": The Mainstream Aesthetics of Amateur Remix
Production', in Redefining Mainstream Popular Music, ed. Andy Bennet et al. (Ashgate:
Aldershot, 2013), 139.
6 Neil Strauss, Spreading By The Web, Pop's Bootleg Remix, The New York Times, May 9
2002, 18.
7 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.
8 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.
9 David Schulman, Finding and Making A Cappella Tracks For DJs, DJ Tech Tools, July 28
2013, accessed 4 October 2015,
http://djtechtools.com/2013/07/28/getting-vocals-for-track-acappellas-for-djs/.
10 Stem Mixing: The Future of DJing?, accessed 29 October 2015,
http://www.64studio.com/blog_stem_mixing
11 Ed Montano, How do you know he's not playing Pac-Man while he's supposed to be
DJing? technology, formats and the digital future of DJ culture. Popular Music 29 (2010),
397-416.

(5) For Sale

Companies such as X Tracks13 have managed to create online businesses from
the provision of vocal extraction services. The customer provides a recording
from which they wish to isolate the vocal. The company will then perform vocal
extraction and send the isolated vocal file back via email to the customer in
exchange for a fee.14

Limitations of the Methods

It is difficult to completely strip a recording of every element of the other
instruments. In many ways, extracting the vocals from a stereo recording can be
compared to taking the egg out of a cake. The effectiveness of each method varies
but generally some artefacts, left over from other instruments, will not be
removed. Instruments that occupy a similar frequency range to the vocals and
are panned in a similar fashion can be incredibly difficult to remove without also
detracting from the vocals. In addition, the quality of the vocals that are obtained
will often be degraded due to the process used. This essay will focus on the
methods used for creating a cappellas for the purposes described above, which
may still contain some artefacts and somewhat affected vocals.

Acquiring A Cappellas Instead of Creating Them

The process of creating a cappellas can be tedious and time consuming. For this
reason many producers choose to obtain a cappellas from other sources before
attempting to make their own. The following is a non-exhaustive list of sources
where digital a cappellas may be found.

(1) A Cappella Hosting Websites

The internet is an essential tool which can be used to find a cappellas.15 Many
websites host vocal tracks which have been obtained from the artist or have
been created. These a cappellas are available for download for free or for a small
fee. A few of these websites include:

A Cappella Hosting Websites
Acappellas4u www.acappellas4u.co.uk
Voclr www.voclr.it/acappellas/
Reddit www.reddit.com/r/isolatedvocals
Reddit www.reddit.com/r/SongStems
Discogs www.discogs.com
Stems Music www.stems-music.com/
Breakbeat Paradise www.breakbeat-paradise.com/
Looperman http://www.looperman.com/acappellas
YouTube www.youtube.com
A Cappella Town http://acappellatown.net/
Table 1: A list of the names and URLs of current A Cappella Hosting Websites

12 For examples of vocal analysis see
https://kpopvocalanalysis.wordpress.com/categoryvocal-analyses/.
13 See http://www.xtracks.tv/
14 Derek Luff, Co-President of X-Tracks, email message to author, September 22 2015.
15 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.

(2) Remix Competition Websites



Artists or record companies will often release the stems of a track (including the
isolated vocal a cappella) as part of remix competitions.16 Isolated vocal stems
are provided so that producers can create remixes. There is a limited range of a
cappellas available via these websites, but their quality is high as they are
obtained directly from the artist. The websites listed in Table 2 contain many
vocal a cappellas for download as they have featured in past remix competitions:

Remix Competition Websites
Remix Comps http://www.remixcomps.com/
Beatport Play http://play.beatport.com/
Find Remix http://findremix.com/
Indaba Music http://www.indabamusic.com/
Laptop Rockers http://www.laptoprockers.eu/
Remix Competitions http://www.remixcompetition.com/
Table 2: A list of the names and URLs of current online Remix Competitions


(3) Directly From the artist/record company

An a cappella may sometimes be obtained by contacting the relevant artist
directly. An a cappella obtained this way is typically of high quality. Artists who
are not signed to a record label will often provide a cappellas upon request as
long as credit is given in their use.17 Unfortunately, record companies will
generally not provide the isolated vocals upon request; they usually only allow
the dissemination of the isolated vocals to a collaborator creating an approved
official remix.18

(4) EP Singles, Promotional CDs and B-sides

Some record companies will provide a cappellas as a track on an EP single,
Promotional CD or on a B-side.19


16 Remix Competitions, accessed 29 October 2015, http://www.remixcomps.com/.
17 The Remix Business: Part One, accessed 29 October 2015,

https://www.soundonsound.com/sos/jun09/articles/worldoftheremixerpt1.htm.
18 A. Renzo, '"Sounds Like an Official Mix": The Mainstream Aesthetics of Amateur Remix

Production', in Redefining Mainstream Popular Music, ed. Andy Bennet et al. (Ashgate:
Aldershot, 2013), 139.
19 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.

DAWs and Vocal Extraction

In order to illustrate the methods described below I will be using Ableton Live 9
Suite. However, many of the processes can be adapted for use in any DAW. The
methods that require the use of VST plugins can be used in another DAW if the
VST plugin and DAW are compatible. Where necessary, I will describe other
DAWs, standalone software and VST plugins that may be used to the same effect.







































Method 1: Phase Inversion

Introduction

Phase Inversion (sometimes referred to as Phase Cancellation) requires both the
original recording and the instrumental version of the track. The method
involves inverting the polarity of the instrumental version and playing it
together with the original recording. Everything the two versions have in
common cancels out, ideally leaving only the vocal.

Phase Inversion requires a high-quality studio version of the instrumental
track.20 Instrumentals that have been reproduced by third parties will not yield
the best results. File types with lossy encoding (such as .mp3) should not
be used.21 Ideally both the instrumental and the original are in an uncompressed
format (such as .wav) with the same sample rate and bit depth (for example
44.1 kHz and 16-bit, which in stereo corresponds to about 1,411 kbps).
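
As a point of reference, the bit rate of uncompressed PCM audio follows directly from its sample rate, bit depth and channel count. The short Python sketch below works this out for CD-quality stereo. The function name is hypothetical and is only there to illustrate the arithmetic.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# CD-quality stereo: 44,100 samples per second, 16 bits per sample, 2 channels
cd_quality = pcm_bit_rate_kbps(44100, 16, 2)
print(f"CD-quality stereo PCM: {cd_quality:.1f} kbps")
# CD-quality stereo PCM: 1411.2 kbps
```

This figure is several times higher than the bit rates typical of .mp3 files, which is one way of seeing how much information lossy encoding discards.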

Finding Instrumental Versions of Songs

The websites in Table 3 host instrumental tracks for download either for free or
for a small fee.

Instrumental Hosting Websites
Beatport http://www.beatport.com/
Youtube http://www.youtube.com/
iTunes Store http://www.apple.com/itunes/
Reddit http://www.reddit.com/r/Instrumentals/
Stems Music http://www.stems-music.com/
Karaoke Version http://www.Karaoke-Version.com/
Table 3: A list of the names and URLs of Instrumental Hosting Websites
In addition, promotional CDs, EP singles and B-sides may also sometimes contain
instrumentals.22

Description of Method

In order to demonstrate Phase Inversion, I will be using the stereo recording of
Can't Feel My Face by The Weeknd. Please listen to the original (Track 1 on the
USB) as well as the official instrumental (Track 2 on the USB).

20 Audacity Manual Tutorial on Vocal Removal and Isolation, last modified 11 July 2015,
http://manual.audacityteam.org/o/man/tutorial_vocal_removal_and_isolation.html.
21 E.P. Ruzanski, "Effects of MP3 encoding on the sounds of music," in Potentials 25
(2006), 43-45.
22 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.

Step (1)

The .wav instrumental and .wav original are imported into two separate audio
tracks in the arrangement view in Ableton Live 9 Suite (see Figure 1).

Figure 1: The instrumental and original in two separate audio tracks in Ableton Live 9 Suite

Both tracks must be at the same volume. This is achieved by clicking on each
audio clip so that the Sample menu appears at the bottom of the screen. Next,
the clip gain for both tracks is set to 0.00 dB using the vertical slider (see
Figure 2).

Step (2)

Warping must be turned off for each track. If warping remains on, Ableton will
match the tempo of the track to the global tempo of the session, time-stretching
the audio so that the two files no longer match sample for sample. To turn
warping off, the yellow Warp icon is unchecked in the Sample Menu (see Figure 2).















Figure 2: Ableton's Clip View for an audio sample, showing the warp button and the vertical gain slider


Step (3)

In order for Phase Inversion to work the arrangement view grid needs to be
deactivated. This allows the tracks to be freely moved without automatically
snapping to the timing of the grid. Turning off the grid can be done by right
clicking the grid and selecting Off (see Figure 3).


























Figure 3: Turning the grid off in Ableton Live 9 Suite

Step (4)

As both tracks need to be aligned perfectly in order for Phase Inversion to work,
the peaks and troughs of each track need to be aligned. In order to do this,
zooming in is required. To zoom, the dark grey scrub space above the
arrangement view is selected and dragged downwards. A kick drum hit (or
another obvious peak) is used as a reference point as it is easy to find in both the
instrumental and the original. While zooming in, it is necessary to continually
stop in order to align the peak or trough at that zoom level before zooming
further. Once the tracks are perfectly aligned at the highest zoom level, the
session will look like Figure 4.

Figure 4: Aligned instrumental and original audio tracks at full zoom in Ableton's arrangement view

Step (5)

The next step is adding Ableton's Utility audio effect to the instrumental track by
selecting Utility from the audio effects browser and dragging it onto the
instrumental track. Both the phase left (Phz-L) and phase right (Phz-R) buttons
are checked. This will invert the phase of the instrumental track. The width
should remain at the default of 100% and the Gain at 0 dB (see Figure 5).
















Figure 5: Ableton's Utility audio effect

Step (6)

Now when both the instrumental and the original are played together, only the
vocals will be heard. However, if the tracks have not been aligned properly,
remnants of other instruments may still be present.
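
The principle behind this step can be sketched outside a DAW in a few lines of Python. This is a minimal illustration only, assuming NumPy is available and using synthetic sine waves as stand-ins for the instrumental and the vocal. The function name phase_invert_and_sum is hypothetical and does not belong to Ableton or to any library.

```python
import numpy as np


def phase_invert_and_sum(original, instrumental):
    """Invert the polarity of the instrumental and sum it with the original."""
    original = np.asarray(original, dtype=np.float64)
    instrumental = np.asarray(instrumental, dtype=np.float64)
    if original.shape != instrumental.shape:
        raise ValueError("both signals must have the same length and channel count")
    return original + (-instrumental)


# Synthetic stand-ins for real audio (not the tracks used in this essay)
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
backing = 0.5 * np.sin(2 * np.pi * 110 * t)  # plays the role of the instrumental
vocal = 0.3 * np.sin(2 * np.pi * 440 * t)    # plays the role of the vocal
full_mix = backing + vocal

# Perfectly aligned: the backing cancels and only the vocal is left
extracted = phase_invert_and_sum(full_mix, backing)
residual_aligned = np.max(np.abs(extracted - vocal))

# Misaligned by a single sample: part of the backing survives
misaligned = phase_invert_and_sum(full_mix, np.roll(backing, 1))
residual_misaligned = np.max(np.abs(misaligned - vocal))

print("peak residual when aligned:   ", residual_aligned)
print("peak residual when misaligned:", residual_misaligned)
```

In this idealised case the aligned residual is effectively zero, while an offset of just one sample already leaves an audible trace of the backing. Real recordings rarely match as closely as these test signals, which is why careful alignment matters.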


To further refine the alignment of the tracks, it is possible to use Ableton's track
delay function. This function allows delaying or pre-delaying of a track by the
millisecond or by the sample. To show the track delay feature, the Show/Hide
Track Delay button on the bottom right corner of the Mixer/Drop Area needs to
be selected (see Figure 6).







Figure 6: The show/hide track delay button


The track delay control for the instrumental track is then changed to samples
(Smp) as it is set to milliseconds by default (see Figure 7). Setting the track
delay to samples allows a finer resolution and therefore more accuracy in
alignment can be achieved.







Figure 7: Setting the track delay by samples

The number value is dragged up or down in order to delay or pre-delay the
instrumental track by the specified number of samples. To line the
instrumental up with the original, the two tracks are played together
and the delay value is adjusted until the other instruments are at their
quietest. At that value the instrumental is in line with the original, and the
vocals will be heard clearly with few remnants of other instruments.
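
The search for the best delay can also be automated. The sketch below is one possible approach and not a description of what Ableton does internally: it slides the instrumental against the original over a small range of sample offsets and keeps the offset with the highest correlation. It assumes NumPy and mono signals, the helper names are hypothetical, and the test signal is random noise standing in for real audio.

```python
import numpy as np


def find_offset_in_samples(original, instrumental, max_lag):
    """Return the lag (in samples) at which the instrumental best matches the original.

    A positive result means the instrumental must be delayed by that many
    samples; a negative result means it must be pre-delayed.
    """
    original = np.asarray(original, dtype=np.float64)
    instrumental = np.asarray(instrumental, dtype=np.float64)
    n = min(len(original), len(instrumental)) - max_lag  # overlap compared at every lag
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a = original[lag:lag + n]
            b = instrumental[:n]
        else:
            a = original[:n]
            b = instrumental[-lag:-lag + n]
        score = float(np.dot(a, b))  # high when the two excerpts line up
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay(signal, lag):
    """Delay (lag > 0) or pre-delay (lag < 0) a signal while keeping its length."""
    if lag >= 0:
        return np.concatenate([np.zeros(lag), signal])[:len(signal)]
    return np.concatenate([signal[-lag:], np.zeros(-lag)])


rng = np.random.default_rng(0)
instrumental = rng.standard_normal(2000)  # noise-like stand-in for a backing track
original = delay(instrumental, 7)         # the same audio arriving 7 samples late

offset = find_offset_in_samples(original, instrumental, max_lag=20)
residual = original - delay(instrumental, offset)
print("estimated offset in samples:", offset)
print("peak residual after alignment:", np.max(np.abs(residual)))
```

At a sample rate of 44.1 kHz one sample lasts roughly 0.023 ms, which gives a sense of how fine the adjustment in sample mode is. On real material the correlation peak is less clear-cut than with noise, so the estimate is best treated as a starting point to confirm by ear.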

Step (7)

The audio is then resampled to a new audio track. The in/out preferences need
to be accessed by selecting the i/o button on the bottom right of the mixer area
(see Figure 8).









Figure 8: The show/hide in/out preferences button


Then, Resampling is selected from the drop down menu in the in/out
preferences of the new audio track (see Figure 9).










Figure 9: Setting an audio track to resampling mode


Finally, the resampling destination track is armed and the global arrangement
record button at the top of the screen is engaged. The song is played from start to
finish and the combination of audio tracks 1 and 2 will be resampled into track 3
(see Figure 10).

Figure 10: Audio in the process of being resampled into a new audio track

The resampled audio in track 3 should contain an a cappella version of the
recording.

Audio Example

Play Track 3 on the USB to hear the audio of the completed a cappella of Can't
Feel My Face by The Weeknd created using Phase Inversion.

Evaluation of the Phase Inversion Method

The Phase Inversion method will almost always leave some artefacts or
remnants of other instruments.23 In the above audio example, the drums, the
bass and some of the synthesizer can still be heard in the background. These
artefacts are primarily caused by mixing, mastering and quality differences
between the instrumental and the full version.24 Any slight differences in
dynamics and pitch will also cause a greater number of artefacts. I will discuss
strategies employed to remove these artefacts in the Removing the Artefacts
section of this essay.
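
To give a feel for how sensitive cancellation is to level differences, the following Python sketch estimates how much of the instrumental is left when the inverted copy is slightly louder than the one in the full mix. It is a simplified model that assumes the two copies differ only by a constant gain. The function name is hypothetical.

```python
import math


def residual_level_db(gain_mismatch_db):
    """Level of the un-cancelled instrumental, relative to its level in the mix,
    when the inverted copy is off by gain_mismatch_db decibels."""
    ratio = 10 ** (gain_mismatch_db / 20)  # linear gain of the mismatched copy
    leftover = abs(1 - ratio)              # what remains after summing with the inverted copy
    if leftover == 0:
        return -math.inf                   # perfect cancellation
    return 20 * math.log10(leftover)


for mismatch in (0.1, 0.5, 1.0):
    level = residual_level_db(mismatch)
    print(f"{mismatch} dB mismatch leaves the instrumental at {level:.1f} dB")
```

Under this model even a mismatch of a fraction of a decibel leaves the instrumental clearly audible beneath the vocal, which is consistent with the remnants heard in the audio example.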

Phase Inversion relies on the identical nature of digital files. The best results are
obtained using high-quality instrumentals where the vocals are muted out at the
mix down. Instrumentals that have an analog derivative (for example files
recorded to digital form from vinyl) will not produce good results. Consequently,
Phase Inversion is dependent on the acquisition of digital source material.

The primary advantage of Phase Inversion is that, if done correctly, the vocals
will not be affected much by the process. This is due to the fact that the process
involved in Phase Inversion does not require the removal of any of the vocal
frequencies. If a good quality instrumental version of the track can be found,
using Phase Inversion should be able to produce a high-quality a cappella.25




















23 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.
24 Phase Cancellation, accessed 29 October 2015,
http://forum.dancehallreggae.com/archive/index.php/t-95246.html
25 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.
Method 2: Subtractive Equalisation

Introduction

Depending on what instruments are used in the track and what frequencies they
occupy, it may be possible to isolate the vocals by Subtractive Equalisation (or
Subtractive EQ).26 A parametric EQ processor, such as Ableton's EQ8, can be used
to remove the frequencies of the other instruments, leaving only the vocals. Any
other VST EQ effect with several parametric filters may also be used to similar
effect.27

Description of Method

Step (1)

The first step is to import the recording into an audio track in Ableton's
arrangement view. Then an EQ8 audio effect is inserted into the track by
selecting it from the audio effects browser and dragging it onto the track. All of
the filters except for one are then turned off by unchecking their orange filter
activator buttons (see Figure 11).

Figure 11: Ableton's EQ8 with one active filter

Step (2)

A steep High Pass filter is engaged by selecting its icon from the filter mode
dropdown selector (see Figure 12).


26 Francis Preve, Mash It Up, Keyboard, January 6 2006, 38.
27 For a list of free parametric EQ VST plugins, see
http://bedroomproducersblog.com/2011/03/03/bpb-freeware-studio-best-free-parametric-equalizer-vst-plugins/.
Figure 12: Selecting a steep High Pass Filter in EQ8


Step (3)

While the track is playing, the high pass filter is dragged to the point where the
most bass frequencies can be removed without taking the bottom end off the
vocals (see Figure 13).

Figure 13: Setting the High Pass Filter in EQ8
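
The effect of a high pass filter can be illustrated with a short Python sketch. It is an idealised stand-in and not a model of EQ8: it simply zeroes every frequency below the cutoff using an FFT, whereas a real EQ has a finite slope. NumPy is assumed, the function name is hypothetical, and the two sine waves stand in for low-end content and a vocal.

```python
import numpy as np


def brickwall_high_pass(signal, sample_rate, cutoff_hz):
    """Remove everything below cutoff_hz by zeroing the matching FFT bins."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
low_end = 0.8 * np.sin(2 * np.pi * 60 * t)  # stand-in for bass and kick content
voice = 0.4 * np.sin(2 * np.pi * 300 * t)   # stand-in for the vocal

filtered = brickwall_high_pass(low_end + voice, sample_rate, cutoff_hz=120)
error = np.max(np.abs(filtered - voice))
print("peak difference from the vocal alone:", error)
```

Raising the cutoff above 300 Hz in this example would begin to remove the stand-in vocal as well, which mirrors the trade-off described in Step (3).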

Step (4)

A second filter is added and set to Notch by selecting its icon from the dropdown
menu as before. The filter is then activated by pressing the grey square filter
activator button, turning it orange (see Figure 14).
















Figure 14: Adding a notch filter to EQ8

Step (5)

The gain of the notch filter is increased to the maximum amount by selecting, in
this case, the number 3 filter dot and dragging upwards, or alternatively using
the gain control on the left hand side of the EQ8 (see Figure 15).

Figure 15: A notch filter in EQ8 with maximum gain

Step (6)

The bandwidth of the notch filter is narrowed by increasing the Q value at the
bottom left of EQ8 to around 3.00 (see Figure 16). This reduces the frequency
range that the filter applies to.


Figure 16: A notch filter in EQ8 with a narrowed bandwidth


Step (7)

Audition mode is enabled by clicking the headphone icon in the top right corner
of the EQ8. In Audition mode, clicking and holding on a filter dot allows you to
hear only that filter's effect on the output (see Figure 17).

Step (8)

While the track is playing, the frequency of unwanted instruments can be
identified by dragging the filter dot of the notch filter across the frequency
spectrum (see Figure 17).

Figure 17: Identifying unwanted frequencies with the notch filter in audition mode


For this step, it is helpful to know what frequency range each instrument
generally operates in. Figure 18 shows the frequency range in which most
common instruments can be found.



Figure 18: The frequency range of common instruments28

Step (9)

Once the frequency of an unwanted instrument has been identified, audition
mode can be turned off by unchecking the headphone icon. The unwanted range
of frequencies can then be reduced by lowering the gain of the notch filter to the
minimum value of -15.00 dB (see Figure 19).


28 Frequency Chart, Independent Recording, accessed 2 October 2015,
http://www.independentrecording.net/irn/resources/freqchart/main_display.htm.
Figure 19: Reducing the gain of unwanted frequencies in EQ8


Step (10)

To limit the loss of wanted frequencies, the bandwidth of the notch filter should
be narrowed by increasing the Q value of the filter. It is best to keep the
bandwidth of the notch filter as small as possible while still taking out the
unwanted frequencies.
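
The relationship between Q and bandwidth can be made concrete with a small Python sketch. It models a gain-adjustable cut as a peaking (bell) biquad using the widely circulated Audio EQ Cookbook formulas. Whether EQ8 uses these exact formulas is not something I can confirm, so the numbers are indicative only, and the function names are hypothetical.

```python
import cmath
import math


def peaking_eq_coefficients(centre_hz, gain_db, q, sample_rate):
    """Biquad peaking-EQ coefficients following the Audio EQ Cookbook formulas."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin)
    a = (1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin)
    return b, a


def gain_db_at(freq_hz, b, a, sample_rate):
    """Gain of the biquad in decibels at a single frequency."""
    z = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z)
    return 20 * math.log10(abs(h))


sample_rate = 44100
wide = peaking_eq_coefficients(1000, -15.0, q=3.0, sample_rate=sample_rate)
narrow = peaking_eq_coefficients(1000, -15.0, q=12.0, sample_rate=sample_rate)

for label, (b, a) in (("Q = 3", wide), ("Q = 12", narrow)):
    at_centre = gain_db_at(1000, b, a, sample_rate)
    nearby = gain_db_at(1200, b, a, sample_rate)
    print(f"{label}: {at_centre:.2f} dB at 1000 Hz, {nearby:.2f} dB at 1200 Hz")
```

Both settings cut the centre frequency by the same amount, but the higher Q leaves the neighbouring frequency far closer to untouched. This is the reason for keeping the bandwidth as small as the unwanted instrument allows.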

Step (11)

Steps 4-10 are repeated for each instrument that needs to be taken out in a
certain section of the track (e.g. the chorus). Each instrument will have to be
taken out individually. It is unwise to take out instruments that take up a large
frequency range (e.g. synthesizers) or instruments that have a similar frequency
range to the vocals (e.g. guitars), as doing so will affect the vocals too heavily.
Once this is completed, the EQ8 should have a subtracted notch for each
instrument that has been removed (see Figure 20).


Figure 20: Using subtractive EQ to remove several instruments in a section of a song

Figure 20 shows Subtractive EQ being used in the second verse of Can't Feel My
Face by The Weeknd. Filter 6 is being used to cut out the guitar, filter 5 is being
used to cut out the snare and filter 2 is cutting out the bass guitar and kick.

Step (12)

The next step is to automate the on/off function of each notch filter of the EQ8
throughout the song. This means that each notch filter is only in operation when
the instrument it is trying to minimise is present in the mix, reducing the loss of
vocal frequencies throughout the track. The automation used will depend on the
structure of the track and which instruments are playing when.

In order to automate the on/off function of a notch filter, select the EQ8 from the
Fades/Device chooser drop-down menu (see Figure 21).







Figure 21: Selecting the EQ8 in the Fades/Device chooser

The on/off function of each filter can then be selected for automation using the
automation control chooser dropdown list (see Figure 22).




















Figure 22: Selecting to automate the on/off function of filter 2 of the EQ8

Automating this filter can then be carried out by clicking and dragging points on
the red breakpoint envelope located on top of the track display. Creating a
breakpoint and dragging upwards will automate the filter to turn on, while
dragging down will automate it to turn off. A sequence of breakpoints is created
throughout the song, turning the filter on when the instrument to be removed
comes in and turning it off when the instrument ceases. An example of
automation of the on/off function of a notch filter can be seen in Figure 23.

Figure 23: Automation of the on/off function of a notch filter


Figure 23 shows automation of the on/off function of filter 6 of an EQ8
throughout the song Can't Feel My Face by The Weeknd. Filter 6 aims to take out
the guitar and so is only activated when the guitar is playing. At all other points
in the song the filter is off.

Audio Example

To hear an a cappella of Can't Feel My Face by The Weeknd created using
subtractive equalisation, listen to Track 4 on the USB.

Evaluation of the Subtractive Equalisation Method

As you can hear from the audio example above, the results obtained from
subtractive equalisation in this case are less than impressive. The effectiveness
of this method depends heavily on the frequencies of the instruments that are
used in the recording. If you have instruments that occupy the same frequencies
as the vocals, they will be difficult to remove using subtractive EQ without
removing part of the vocal as well. If too much subtractive EQ is used in the
frequency range of the vocals, the vocals will be too heavily affected to be used
for other purposes.

Another difficulty associated with subtractive equalisation is removing all the
harmonics of an instrument. If instruments only occupied single frequencies it
would be possible to easily isolate them in the mix. However, a note played by an
instrument contains harmonics that give character to the sound throughout the
frequency spectrum.29 The frequencies of harmonics can be difficult to
identify. As a result, it is often easy to remove the fundamental note of an
instrument but much harder to remove all of the harmonics.

29 John Taylor, Classical Mechanics (University Science Books, 2006), 87.

Despite this, Subtractive Equalisation can be useful in some situations. The
method does not require anything other than the original recording and Ableton
Live 9 Suite. The sourcing and purchase of VST plugins or instrumental versions
of a recording is not required. Subtractive Equalisation may be carried out when
resources required for other methods are unavailable. It is also often used in
conjunction with other methods.









































Method 3: Centre Channel Isolation

Introduction

This method of isolating vocals is only effective if the vocals sit in the centre
of the mix. It involves removing the content that is panned to either
side of the recording, leaving just what is common between the left and right
channels. This can be done using Ableton's EQ8, as its mid/side function allows
the EQ of instruments which sit solely in the left and/or right channels to be
adjusted. Other EQs with a mid/side function can also be used for this method.30

Description of Method

Step (1)

The recording is imported onto an audio track in arrangement view of Ableton
and an EQ8 is added to the track.

Step (2)

The EQ8 is put into M/S (or Mid/Side) mode using the Mode drop-down list
(see Figure 24).

Figure 24: switching the EQ8 to mid/side mode

Step (3)

The Edit Toggle Switch on the right-hand side of the EQ is used to select S (see
Figure 25). This will put the EQ8 into Side Mode.


30 For example, the VSTs Voxengo CurveEQ and FabFilter Pro-Q 2 also contain a
mid/side function.

Figure 25: switching the EQ8 to side mode


Step (4)

All of the filters need to be unchecked except for one steep high pass filter, which
is identified by its filter-shape symbol (see Figure 26).

Figure 26: Adding a high pass filter in Side Mode


Step (5)

The filter dot (in this case 3) should then be dragged to the bottom right corner
of the frequency display (See Figure 27). This will remove everything panned to
either side of the mix.

Figure 27: Extreme High-Passing the side panned frequencies using EQ8
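
The effect of silencing the side signal can be sketched numerically. The following is a minimal illustration only, not Ableton's implementation: it assumes the common mid/side convention mid = (L + R) / 2 and side = (L - R) / 2, and the function name isolate_centre and the sample values are hypothetical.

```python
def isolate_centre(left, right):
    """Return (left, right) after discarding the side signal.

    Assumes the common convention mid = (L + R) / 2, side = (L - R) / 2.
    With the side signal removed, both output channels equal the mid signal.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    # Decoding with side = 0 gives L = mid + 0 and R = mid - 0.
    return mid, list(mid)


# Toy example: a "vocal" that is identical in both channels and a "wide"
# element that is exactly opposite in the two channels (pure side content).
vocal = [0.5, -0.5, 0.25, 0.0]
wide = [0.25, 0.25, -0.25, 0.5]
left = [v + w for v, w in zip(vocal, wide)]
right = [v - w for v, w in zip(vocal, wide)]

out_left, out_right = isolate_centre(left, right)
```

In this toy case the wide element cancels completely because it is exactly opposite in the two channels. An element present in one channel only is instead halved in level and moved to the centre: isolate_centre([1.0, 0.0], [0.0, 0.0]) returns ([0.5, 0.0], [0.5, 0.0]). This is one reason the method rarely removes everything apart from the vocal.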


Audio Example

To hear an a cappella created using the Centre Channel Isolation method, listen
to Track 5 on the USB.

Centre Channel Isolation using VST plugins

In addition to EQ VST plugins containing a mid/side function that allow Centre
Channel Isolation through a similar process to that described above, there are
also other VST plugins specifically developed to carry out Centre Channel
Isolation. These are listed in Table 4.

VST plugins capable of Centre
Channel Isolation
Kn0ck0ut
Brainworx bx_solo
Voicetrap
Table 4: A list of VST plugins capable of Centre Channel Isolation

Evaluation of the Centre Channel Isolation Method



Centre Channel Isolation will fail to remove other instruments that are also
panned to the centre of the mix. These are typically the drums and the bass guitar, as
bass frequencies are usually kept mono and in the centre. In the above audio
example, Centre Channel Isolation has also failed to take out the majority of the
synthesizer. For this reason, the Centre Channel Isolation method works well
with the Subtractive EQ method, as some of the remaining centre panned
instruments can be removed by subtracting their frequencies.

In addition, as all the side-panned content is removed by Centre Channel
Isolation, the resulting a cappella will lack stereo width. This is often undesirable
as the vocal performance will sound as though it lacks depth. It is often necessary
to add reverb in order to regain some of this lost stereo width, although the
reverb can itself produce a sound that is not desired by the producer.














































Method 4: Stereo Field Extraction

Introduction

The Stereo Field Extraction method allows the isolation and removal of regions
of audio based on their stereo field location and frequency.31 A Stereo Field
Extractor displays audio in what is called a "harmonic placement window". This
window displays the audio as coloured clouds showing the frequency,
amplitude and stereo position of each element of audio.32

A specific frequency range and stereo position can then be selected and isolated.
A consistently loud, centred and mid-range collection of elements in the
harmonic placement window is almost always the vocal.33 Most Stereo Field
Extractors allow the frame that selects the audio to be made
rectangular, square, circular or oval in shape. The frame size can also be adjusted to
further refine the selection of audio.
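
The idea of selecting audio by frequency range and stereo position can be sketched in code. This is a rough illustration under stated assumptions, not the algorithm used by R-Mix or any other product, which the sources cited here do not describe. It assumes that stereo position can be estimated for each frequency bin from the relative magnitudes of the left and right channels, it uses a rectangular frame only, and it transforms the whole clip at once with a deliberately naive DFT so that it needs nothing beyond the Python standard library. The names dft, idft and extract_region are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def extract_region(left, right, sample_rate, f_lo, f_hi, pan_lo, pan_hi):
    """Keep only bins whose frequency and pan position fall inside the frame.

    Pan is estimated per bin as (|R| - |L|) / (|R| + |L|):
    -1 is hard left, 0 is centre, +1 is hard right.
    """
    n = len(left)
    spec_l, spec_r = dft(left), dft(right)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies of a real signal.
        freq = min(k, n - k) * sample_rate / n
        mag_l, mag_r = abs(spec_l[k]), abs(spec_r[k])
        total = mag_l + mag_r
        pan = (mag_r - mag_l) / total if total > 1e-12 else 0.0
        if not (f_lo <= freq <= f_hi and pan_lo <= pan <= pan_hi):
            spec_l[k] = 0
            spec_r[k] = 0
    return idft(spec_l), idft(spec_r)


# Toy clip: an 8 Hz tone in the centre and a 16 Hz tone in the left channel only.
sample_rate = 64
centre = [0.5 * math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(64)]
left_only = [0.5 * math.sin(2 * math.pi * 16 * t / sample_rate) for t in range(64)]
left = [c + s for c, s in zip(centre, left_only)]
right = list(centre)

# Frame: 4 Hz to 12 Hz, close to the centre of the stereo field.
out_left, out_right = extract_region(left, right, sample_rate, 4, 12, -0.2, 0.2)
```

Here the left-only tone falls outside the frame in both frequency and pan position, so the output contains only the centred tone. Practical tools would work on short overlapping windows rather than the whole clip, so that the selection can follow the music over time.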

Description of Method

To demonstrate Stereo Field Extraction this essay will use the standalone
software Roland R-Mix.

Step (1)

R-Mix is launched and then "New Project" is clicked. The import window will
then appear. The audio file from which the a cappella is to be extracted is
selected.

Step (2)

The play button below the harmonic placement window is engaged to commence
playback of the audio (see Figure 28).

Figure 28: The play button within R-Mix

A visual representation of the recording will now be displayed in R-Mix's
harmonic placement window. Each element's colour indicates its amplitude
(black for quieter and white for louder; other colours indicate in-between
gradations). Frequency is mapped vertically with high frequencies at the top and
low frequencies at the bottom. Stereo placement is mapped from left to right (see
Figure 29).34

Figure 29: R-Mix's harmonic placement window during playback

31 Fit It In The R-Mix, last modified September 13, 2013,
http://www.soundonsound.com/sos/sep13/articles/sonar-notes-0913.htm.
32 R-Mix: Audio Processing Software, last modified February 12 2015,
http://www.rolandus.com/products/r-mix/.
33 Fit It In The R-Mix, last modified September 13, 2013,
http://www.soundonsound.com/sos/sep13/articles/sonar-notes-0913.htm.


Step (3)

The Outside Level fader located to the right of the harmonic placement window
is lowered to the minimum (-INF) value (see Figure 30). This will remove all
audio in the harmonic placement window that is outside of the red frame.















Figure 30: Lowering the Outside Level fader to the lowest possible value

34 R-Mix: Audio Processing Software, last modified February 12 2015,
http://www.rolandus.com/products/r-mix/.

Step (4)

The red frame is moved around the harmonic placement window to find the
location where the vocals are most prevalent (see Figure 31). This will generally
be in the centre of the stereo field and about halfway up the vertical frequency
scale.

Figure 31: Finding the vocals in the stereo field in R-Mix

Step (5)

The shape and size of the red frame can then be refined in order to encapsulate
as much of the vocal as possible without letting any other instruments into the
output. The shape of the red frame can be adjusted by using the Shape Switch at
the top left of the harmonic placement window (see Figure 32).

Figure 32: R-Mix's Shape Switch

The size of the red frame can be adjusted by grabbing any of its vertical or
horizontal lines (or a corner) and dragging.

Step (6)

The export button above the harmonic placement window is used to export the
audio (see Figure 33).






Figure 33: R-Mix's export button

Software Capable of Stereo Field Extraction

Table 5 lists VST plugins and standalone software, including Roland R-Mix, that
are capable of Stereo Field Extraction.

Software Capable of Stereo Field Extraction
VST Plugins             Standalone Software
QuikQuak MashTactic     Roland R-Mix
Extraboy Pro
Table 5: A list of VST plugins and standalone software capable of Stereo Field Extraction


Audio Example

To hear an a cappella of Can't Feel My Face by The Weeknd created using Stereo
Field Extraction, listen to Track 6 on the USB.

Evaluation of the Stereo Field Extraction Method

It is often difficult to encapsulate the full breadth of the vocals within the frame
selector when using Stereo Field Extraction. As a result, the vocals are likely
to lose harmonic content during the process. In addition, instruments that occupy
the same frequency range and stereo position as the vocals are not removed.
In most cases, Stereo Field Extraction will therefore not completely remove all
of the other components of a recording.

Despite this, Stereo Field Extraction is capable of producing high-quality a
cappellas in some situations. The more isolated the vocal part is with regard to
frequency range and panning, the more effective the Stereo Field Extraction
process. The ability to adjust the size and shape of the frame in the harmonic
placement window is an incredibly powerful tool. It allows Stereo Field
Extraction to be more accurate than using a combination of the Centre Channel
Isolation and Subtractive Equalisation techniques.

Method 5: Noise Removal

Introduction

Several programs and VSTs have been developed to remove noise and hiss from
audio. The noise removal capabilities of these programs can be utilised in a
similar way to Ableton's Utility audio effect during Phase Inversion in order to
create an a cappella. These programs allow you to define what you want to
remove from the audio (called "noise profiling"). If an instrumental version of
the track is set as the noise profile, the program will attempt to subtract the
instrumental from the original track when the noise removal effect is applied,
leaving just the vocals.
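
One classic way of removing a profiled sound is magnitude spectral subtraction, sketched below. This is an assumption made for illustration: it is not confirmed to be the algorithm used by Audacity or by any other program named in this essay. The sketch subtracts the profile's magnitude from the mix in each frequency bin while keeping the phase of the mix, and it processes the whole clip at once using a deliberately naive DFT so that only the Python standard library is needed. The names dft, idft and spectral_subtract are hypothetical, and both inputs are assumed to have the same length.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def spectral_subtract(mix, noise_profile):
    """Subtract the profile's magnitude from the mix in every frequency bin.

    The phase of the mix is kept, and negative magnitudes are clamped to zero.
    """
    cleaned = []
    for m, p in zip(dft(mix), dft(noise_profile)):
        magnitude = max(abs(m) - abs(p), 0.0)
        cleaned.append(cmath.rect(magnitude, cmath.phase(m)))
    return idft(cleaned)


# Toy clip: a "vocal" tone plus an "instrumental" tone at another frequency.
n = 64
vocal = [0.5 * math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
instrumental = [0.3 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
mix = [v + i for v, i in zip(vocal, instrumental)]

# The instrumental is used as the noise profile.
result = spectral_subtract(mix, instrumental)
```

In this toy case the vocal and the instrumental occupy different frequencies, so the result is very close to the vocal alone. Where they overlap in frequency, subtracting magnitudes also removes part of the vocal, which is one source of audible artefacts.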

Description of Method

The DAW Audacity will be used to demonstrate this method.

Step (1)

First, the instrumental version of the recording must be acquired (see the
Finding Instrumental Versions of Songs section above).

Step (2)

Next, the instrumental is imported into Audacity using File > Import.

Step (3)

The whole of the instrumental is selected by dragging the mouse over it. The
instrumental is then set as a noise profile by clicking Effect > Noise Reduction
and then clicking on the Get Noise Profile button (see Figure 34).

Figure 34: The Noise Reduction window in Audacity


Step (4)

The original recording is imported and noise reduction applied to it by
selecting Effect > Noise Reduction and then clicking on the OK button.

DAWs and VSTs capable of Noise Profiling

Although many DAWs and VSTs are capable of noise reduction, extracting an a
cappella using noise reduction requires the DAW or VST to be capable of setting
the instrumental as a noise profile. Table 6 lists the DAWs and VSTs that are
capable of this:

Software capable of Noise Profiling
DAWs              VSTs
Audacity          X Noise Stereo
Adobe Audition    Izotope RX5
FL Studio 12      Reaper ReaFIR
Table 6: A list of DAWs and VSTs capable of setting the Noise Profile


Audio Example

To hear an a cappella of Can't Feel My Face by The Weeknd created using Noise
Removal, listen to Track 7 on the USB.

Evaluation of the Noise Removal Method

This method is similar to Phase Inversion in its use of the instrumental version of
the track. However, it is rare that using this method will result in a better quality
a cappella than using Phase Inversion. Setting the noise profile in noise reduction
software does not allow you to line up the tracks perfectly. If the noise profile is
applied and the tracks do not align, the a cappella created will contain many
artefacts. The results are therefore inferior to those achieved using the
instrumental version of the recording in the Phase Inversion method.












Removing the Artefacts

In the methods described above, "artefacts" or remnants of other instruments
will often remain in the a cappella. The extent and type of these artefacts
depend on the method used and the individual track. Depending on what
artefacts are left, the tools below can be used to reduce or remove them.

Noise Gates

Noise Gates can be used to remove unwanted audio that is below a certain
volume threshold. The gate opens as the signal rises above a threshold, and
closes when it falls below it. If the threshold is set just above the volume of the
unwanted audio but below the volume of the vocal, the gate will close during
unwanted audio and open as soon as the vocal is detected.35 Figure 35 shows a
noise gate in operation in Ableton Live.


Figure 35: A noise gate removing all the audio below the set threshold
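
The open and close behaviour of a gate can be sketched as follows. This is a simplified illustration, not the design of Ableton's Gate device: it assumes a basic peak envelope follower that rises instantly and decays by a fixed factor per sample, and the function name noise_gate and the parameter release are hypothetical.

```python
def noise_gate(samples, threshold, release=0.5):
    """Pass audio while the signal envelope is at or above the threshold.

    The envelope jumps up instantly to any louder sample and otherwise
    decays by the factor `release` per sample, so the gate stays open
    briefly after the loud material ends.
    """
    out = []
    envelope = 0.0
    for s in samples:
        envelope = max(abs(s), envelope * release)
        out.append(s if envelope >= threshold else 0.0)
    return out


# Quiet residue, then a louder "vocal" burst, then quiet residue again.
signal = [0.02, -0.03, 0.6, -0.05, 0.5, 0.01, 0.01, 0.01]
gated = noise_gate(signal, threshold=0.1)
```

The quiet samples before the burst are silenced, the burst passes unchanged, and the gate closes again a few samples after the burst ends. Any residue that sounds at the same time as the vocal passes through with it, because a gate acts on the whole signal at once.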

Downwards Expansion

Downwards expansion can also be used to reduce the volume of audio that falls
below a certain threshold. Ableton's Multiband Dynamics plugin can act as a
Downwards Expander and many other compressors, expanders and multiband
dynamics plugins also have this capability.36 The advantage of using a multiband
effect is that downwards expansion can be applied to certain frequency bands of
audio and not others (see Figure 36). If the threshold for a band is set so that
the vocals rise above it and all other content in that band falls below it, then the
downwards expander will be able to reduce the unwanted content in
that frequency range.

35 http://www.soundonsound.com/sos/jan12/articles/noise-reduction.htm
36 For example, the FabFilter Pro-MB.


Figure 36: Ableton Multiband Dynamics plugin

Figure 36 shows Ableton's Multiband Dynamics plugin in "below" mode, which
turns it into a Downwards Expander. Downwards expansion is being applied to
the mid and low bands of audio as the amplitude of the frequencies in those bands
(the orange bars) has fallen below the set threshold.

Downwards expansion is often preferable to noise gating as it is frequency
specific: it allows a separate threshold to be set for each frequency band of audio.
It also avoids the "chatter" effect that can be caused by noise gates, because
downwards expansion does not completely remove unwanted frequencies; it only
reduces them. By merely reducing their amplitude and not cutting the signal
completely, downwards expansion is a more subtle effect.
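
The difference between reducing and removing can be sketched with a simple gain rule. This is an illustrative single-band sketch, not the behaviour of any particular plugin: it assumes the usual expander rule that for every 1 dB the input falls below the threshold, the output falls ratio dB below it. The function name downward_expand and the parameter values are hypothetical, and the level is measured per sample for clarity, whereas practical expanders use a smoothed level detector.

```python
import math


def downward_expand(samples, threshold_db, ratio):
    """Attenuate samples whose level is below threshold_db (ratio > 1).

    Levels at or above the threshold pass unchanged. A level d dB below
    the threshold comes out ratio * d dB below it.
    """
    out = []
    for s in samples:
        if s == 0:
            out.append(0.0)
            continue
        level_db = 20 * math.log10(abs(s))
        if level_db >= threshold_db:
            out.append(s)
        else:
            # Extra attenuation beyond what the signal already has.
            gain_db = (level_db - threshold_db) * (ratio - 1)
            out.append(s * 10 ** (gain_db / 20))
    return out


# 0.01 is about -40 dB (residue); 0.5 is about -6 dB (vocal).
expanded = downward_expand([0.01, 0.5], threshold_db=-20, ratio=2)
```

With a threshold of -20 dB and a ratio of 2, the -40 dB residue is pushed down to about -60 dB (0.001) rather than being cut to silence, while the louder sample is left untouched. A multiband version would first split the audio into frequency bands and apply this rule to each band with its own threshold.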

Phase Inverted Samples or Loops

A sample from another part of the song can be phase inverted and used to
remove instruments in other parts of the recording. This is only effective if a
song is repetitive or uses loops. If there is a section of a recording that contains
only drums, this section can be sampled, inverted and applied to other parts of
the recording in order to remove the drums.

A similar technique also works for removing drums in electronic music if the
drums can be identified as being from a classic drum machine such as the 707,
808 or 909. Samples from these drum machines are readily available online.37 If
they are phase inverted and placed on a recording that uses that drum
machine, the drums can be removed.
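
The cancellation relied on here can be sketched in a few lines. This is a toy illustration with made-up sample values, and the function name subtract_by_inversion is hypothetical. It assumes that the drum sample is identical to the drums in the mix and lined up to the exact sample.

```python
def subtract_by_inversion(mix, sample, offset=0):
    """Invert `sample` and add it to `mix`, starting at index `offset`."""
    out = list(mix)
    for i, s in enumerate(sample):
        if 0 <= offset + i < len(out):
            out[offset + i] += -s  # adding the inverted sample
    return out


# A four-sample drum loop that plays twice underneath an eight-sample vocal.
drums = [0.5, -0.25, 0.125, 0.0]
vocal = [0.25, 0.25, -0.5, 0.125, 0.0, -0.25, 0.5, 0.25]
mix = [v + d for v, d in zip(vocal, drums + drums)]

# Apply the inverted loop at each position where it occurs in the mix.
result = subtract_by_inversion(mix, drums, offset=0)
result = subtract_by_inversion(result, drums, offset=4)
```

In this idealised case the result equals the vocal exactly. Any difference in level, timing or processing between the sample and the drums in the mix leaves a residue instead of silence.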

Spectral Editing

Spectral editors allow the visualisation and removal of specific frequencies of
audio. They typically display audio on a spectrogram, which depicts frequency
content in the audio over time.38 Grabbing and editing tools within the editor can
then be used to identify unwanted sounds and remove them. Although the
method will vary for each Spectral Editor, the basic process of selecting and
deleting frequency content is the same. Figure 37 shows the Spectrogram view in
Audacity.

37 Tons of Electronic Drum Kit Samples for Free!, accessed 29 October 2015,
http://howtomakeelectronicmusic.com/tons-of-classic-drum-machine-samples-for-free

Figure 37: The Spectrogram view in Audacity


Spectral Editing can be done in some DAWs (e.g. Audacity and Adobe Audition)
as well as some VST plugins and standalone software.39 Table 7 contains a list of
current VSTs, DAWs and standalone software that are capable of Spectral Editing.

Spectral Editors
VSTs                     DAWs                      Standalone Software
Stillwell Spectro        Adobe Audition            Interactive Source Separation Editor (ISSE)
Melodyne                 Audacity                  TAPESTREA
Imageline Edison         CEDAR Studio 7            Sony SpectraLayers Pro 3
Ircam Anasynth           Magix Samplitude Pro X    Magix Audio Cleaning Lab 2016
Algorithmix reNOVAtor                              Klingbeil SPEAR
                                                   Audio Master Suite 2
                                                   Izotope RX5
Table 7: A list of VST plugins, DAWs and standalone software capable of Spectral Editing

38 Kelly Fitz; Sean Fulop, Digital algorithms for computing the time-corrected
instantaneous frequency (reassigned) spectrogram, with applications,
The Journal of the Acoustical Society of America 116 (2004), 2582.
39 Spectral Selection, accessed 28 October, 2015,
http://manual.audacityteam.org/o/man/spectral_selection.html.


When using Spectral Editing it can be difficult to identify the elements that you
are trying to remove, as there is often no way of soloing the frequency region to
hear what resides within the selection. As a result, the process often involves
large amounts of trial and error and can be very time consuming.

Some more recent Spectral Editors such as Melodyne and Sony SpectraLayers
Pro 3 allow for greater precision and soloing of a region in their spectrogram
displays. This software is able to further separate the frequency content
of the audio into the individual instruments that it detects in the mix.
However, it is currently very expensive and the algorithms used to group
the frequency content into different instruments are mediocre.































Using Multiple Methods

In order to create the best quality a cappella, it is advisable to combine several of
the aforementioned methods. To hear an a cappella of Can't Feel My Face
created using a combination of the above methods, listen to Track 8 on the USB.

The following processes were applied to create this a cappella:
• A phase inverted instrumental was first applied to the original stereo recording;
• The centre channel was then isolated;
• Subtractive EQ was used to remove the frequencies above and below the vocals;
• Further Subtractive EQ was applied using automated notch filters to remove as much as possible of the drums;
• Downwards expansion was applied to remove audio below the set threshold in the mid frequencies;
• Spectral Editing was used to further remove artefacts;
• Reverb was added to regain the stereo width lost from Centre Channel Isolation.

As can be heard from the above audio example, a combination of methods can
produce a far better result than any one method used on its own.






















Conclusion

Before the trend of creating bootleg remixes and mash-ups existed, there was
little motivation to explore the methods behind extracting vocals from a stereo
recording. Now, there is a strong demand for a cappella versions of stereo
recordings, and interest in the techniques used to isolate the vocals from a
recording has dramatically increased.

Currently, the creation of a perfect a cappella is almost impossible without
obtaining it directly from the artist. There are, however, several methods that
can be used to largely eliminate the other instruments from the mix. Many of
these techniques will not produce stellar results when used on their own, but
when they are applied together will result in an a cappella that is usable for
purposes such as creating a Bootleg Remix or Mash-up.

The technology surrounding the extraction of a cappellas is constantly updating
and improving. From a development perspective, Spectral Editing software is in
its infancy and will, for the moment, struggle to produce a high quality a
cappella on its own. However, the future is likely to see the creation of an
advanced Spectral Editing program for audio that is capable of easily identifying
and removing instruments from a track.


























Glossary


A Cappella: a vocal performance without any instrumental accompaniment

Amplitude: the maximum extent of a vibration or oscillation, measured from the
position of equilibrium.

Analog: a mechanism that represents data by measurement of a continuous
physical variable

Arrangement View: A view in Ableton Live that contains music laid out along a
horizontal timeline

Bandwidth: a range of frequencies within a given band

Bit Rate: the number of bits per second that can be transmitted along a digital
network, measured in kilobits per second (kbps)

Compression: reducing the volume of loud sounds and amplifying quiet sounds
by narrowing an audio signal's dynamic range

DAW: Digital Audio Workstation; computer software for recording, editing and
producing audio files

dB: Decibels; a unit used to measure the intensity of a sound or the power level
of an electrical signal

Deck: A digital turntable used by disc jockeys

Digital: data expressed as series of the digits 0 and 1, typically represented by
values of a physical quantity

DJ: Disc Jockey; a person who introduces and plays recorded music

Drum Machine: a programmable electronic device able to imitate the sounds of
a drum kit

EQ: Equalisation; the process of adjusting the balance between frequency
components within an electronic signal

Fader: a knob or button or slider that is used to increase/decrease a parameter

Filter: a frequency dependent amplifier

Fundamental Note: the lowest harmonic of a note played by an instrument

Frequency: the rate at which a sound wave oscillates; the human ear can generally
perceive frequencies between 20 and 20,000 Hz

Gain: a measure of the ratio of the signal output of a system to the signal input
of the system

Global Tempo: The overall tempo of a session in Ableton Live

Grid: Fixed vertical lines that denote a time value in Ableton's arrangement view

Harmonics: component frequencies of a signal that are integer multiples of the
fundamental frequency

Hertz (Hz): a unit used to measure frequency defined as one cycle per second

High Pass Filter: a filter that allows signals with a frequency higher than a
certain cutoff to pass and attenuates signals with frequencies lower than the
cutoff frequency

Invert: flipping the audio upside-down, reversing its polarity

Loop: short sections of tracks which are repeated

Mash-up: a musical track comprising the vocals of one recording placed over the
instrumental backing of another

Mastering: preparing and transferring recorded audio from a source containing
the final mix to a data storage device

Mixing: the process by which multiple sounds are combined into one or more
channels

.mp3: an audio coding format for digital audio which uses a form of lossy data
compression

Music Producer: someone who oversees and manages creation of music

Notch Filter: a filter that attenuates signals within a very narrow band of
frequencies

One-shot sample: a sample that is played only once (i.e. it does not loop)

Panning: the distribution of a sound signal into a stereo or multi-channel field

Parametric: multi-band variable equalizers which allow users to control the
three primary parameters: amplitude, centre frequency and bandwidth

Phase: how far along its cycle an audio waveform is

Remix: music that has been altered from its original state by adding, removing,
and/or changing elements of it

Sample: a digital representation of an analog signal

Session: a working file in Ableton Live

Standalone Software: a program that does not require a host program to run

Stem: audio recordings of each instrument which are mixed together when
mixing

Stereo: sound that is directed through two or more speakers so that it seems to
surround the listener and to come from more than one source

Tempo: the speed at which a passage of music is or should be played

Track: an audio recording

VST Plugin: Virtual Studio Technology Plugin; software that integrates into a
host DAW

Warping: Stretching or condensing an audio file in order to match it with a set
tempo

.wav: Waveform Audio File Format; a Microsoft and IBM audio file format
standard for storing audio
