
Technology

Artificial intelligence in sound engineering

ROBIN REUMERS has the greatest enthusiasm and confidence in our mission. And wants to help you*

Tech companies like IBM have for decades been trying to understand and replicate the neural networks of a human brain. You might have heard of ‘Watson’ — that’s where Artificial Intelligence (AI) really gained its first mainstream attention.
Machine learning, a subset of AI, can automatically learn and improve from experience without being explicitly programmed. Coinciding with the vast amount of computing power available to us, AI has started to slowly sneak its way into our lives, from chatbots to virtual assistants all the way to autonomous driving. And there is no doubt it will continue to evolve.
Music is extremely complex, unique and activates all parts of the human brain. Let’s take a closer look at how machine learning has helped our sound engineering field, what the obstacles are and what the future could look like.

AI Mastering
There is a fair chance you have come across platforms such as LANDR and AI Mastering. Users can upload their music and the platform analyses it and masters the song for you. Even though their technology isn’t public information, it’s fair to assume they fed their AI with mixes and masters and let it analyse information such as frequency response, dynamics and harmonics. Future development in this area will undoubtedly see AI masters with presets for specific engineers, each with a different taste.
Personal Take: It feels and sounds like a fast-food chain. Sure, lots of people will eat there. But still there are plenty who will prefer a fresh home cooked meal or to go to a Michelin star restaurant every now and then.
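For the curious, here is a rough Python sketch of the kind of analysis such a platform might run on an upload: broad frequency balance plus a simple dynamics measure. LANDR’s and AI Mastering’s real pipelines are not public, so the file name, band edges and the choice of crest factor below are placeholders of my own, not their method.

import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

# Hypothetical stereo pre-master uploaded by the user
rate, mix = wavfile.read("premaster.wav")
mono = mix.astype(np.float64).mean(axis=1)

# Frequency response: long-term spectrum summarised in a few broad bands
freqs, power = welch(mono, fs=rate, nperseg=8192)
bands = [(20, 60), (60, 250), (250, 2000), (2000, 6000), (6000, 20000)]
balance_db = [10 * np.log10(power[(freqs >= lo) & (freqs < hi)].mean() + 1e-12)
              for lo, hi in bands]

# Dynamics: crest factor (peak over RMS) as a crude stand-in for loudness metrics
rms = np.sqrt(np.mean(mono ** 2))
crest_db = 20 * np.log10(np.abs(mono).max() / (rms + 1e-12))

print("band levels (dB):", np.round(balance_db, 1))
print("crest factor (dB):", round(crest_db, 1))

A trained model would compare numbers like these against the masters it has learned from and decide how much EQ, compression and limiting to apply.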

De-noising
The second example I have is de-noising.
CEDAR, and now iZotope, have been
tremendous innovators in this field, and the
latest versions of iZotope RX have started to incorporate AI engines. Considering it's an
offline process and therefore has access to a full
audio track, it’s the perfect candidate to benefit
from AI (more on that in the next section).
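To make the offline advantage concrete, here is a minimal spectral-subtraction sketch in Python. This is a decades-old baseline, not how CEDAR or the RX AI engines actually work (those are not public), and it assumes the first half second of a hypothetical mono file contains only noise.

import numpy as np
from scipy.io import wavfile
from scipy.signal import stft, istft

rate, audio = wavfile.read("noisy_take.wav")      # hypothetical mono recording
audio = audio.astype(np.float64)

# Learn a noise profile from an assumed noise-only region at the start
f, t, spec = stft(audio, fs=rate, nperseg=2048)
noise_profile = np.abs(spec[:, t < 0.5]).mean(axis=1, keepdims=True)

# Subtract the profile from every frame, keeping a small spectral floor
mag, phase = np.abs(spec), np.angle(spec)
clean_mag = np.maximum(mag - 1.5 * noise_profile, 0.1 * mag)
_, clean = istft(clean_mag * np.exp(1j * phase), fs=rate, nperseg=2048)

wavfile.write("denoised_take.wav", rate, clean.astype(np.int16))

Because the whole file is available, the noise profile can be estimated from anywhere in the take; a real-time plug-in never gets that luxury.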

Personal Take: tread carefully. I often hear engineers reaching for that de-noiser, but in my personal opinion you have to be very careful that the cure is not worse than the disease: it can cause serious artefacts.

De-mixing
The third one is a more recent example, and
that is ‘Sound Source Separation’. iZotope and
Audionamix (review, Resolution V17.4) as well as
many open source projects (ISSE among
others) in recent years have introduced their
own version of tools that allow you to take a
stereo mix and separate specific instruments
from the mix, for example Drums/Percussion,
Bass, Vocals and all the rest. This technology is
also based on Deep Learning.
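The core idea behind these deep-learning tools is a time-frequency mask: a network looks at the mix’s spectrogram and estimates, bin by bin, how much of the energy belongs to the target source. The Python sketch below shows only the masking step; vocal_mask_model is a stand-in for a trained network (the vendors’ actual models are not public) and the file names are hypothetical.

import numpy as np
from scipy.io import wavfile
from scipy.signal import stft, istft

def vocal_mask_model(magnitude):
    # Placeholder for a trained neural network that predicts a 0..1 mask.
    # This naive stand-in just keeps a mid band where vocals tend to live.
    mask = np.zeros_like(magnitude)
    mask[20:400, :] = 1.0        # roughly 200 Hz to 4 kHz at 44.1 kHz, nperseg=4096
    return mask

rate, mix = wavfile.read("stereo_mix.wav")
mono = mix.astype(np.float64).mean(axis=1)        # fold to mono for simplicity

f, t, spec = stft(mono, fs=rate, nperseg=4096)
mask = vocal_mask_model(np.abs(spec))
_, vocals = istft(spec * mask, fs=rate, nperseg=4096)

wavfile.write("vocals_estimate.wav", rate, vocals.astype(np.int16))

Wherever the mask estimate is wrong, energy from other instruments leaks through or bits of the target go missing, which is exactly where the artefacts come from.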
Personal Take: Even though the premise sounds exciting, care should be taken. I find it extremely powerful for a creative effect but have had no luck using it to ‘remix’ a crappy mix. In most cases, there are still plenty of artefacts — we’ll call them ‘space monkeys’.

Job crafting
Then comes something that was recently introduced by our dear friends at Acustica Audio, where machine learning is used to analyse engineers’ moves on a particular piece of gear. While using that piece of gear, you could ask: what would Engineer X do if he/she were me?
Personal Take: I definitely find it an interesting application. For me personally, I’m more about wanting to achieve a certain result I have in my head, so it won’t do much for me. But no doubt for many, including musicians, this can be helpful.

Research & Development
A last example where AI is useful is for new plug-in development or R&D. Imagine trying to solve a serious audio problem, where on one hand you have ‘audio you like’ and on the other hand you have ‘audio you don’t like’, but you can’t figure out what the problem is. Using AI can help you narrow down and see patterns. It’s a really powerful tool, undoubtedly currently in use by audio plug-in developers to solve serious problems and make the next killer plug-ins.

How Does It Work?
Let’s focus on Deep Learning, which in turn is a subset of machine learning. This is the subset most applicable to audio processing. Artificial neural networks are created which adapt and
learn from vast amounts of data.
To understand neural networks, let’s have a closer look at how the brain works. It all starts with a neuron, which is a node with many inputs and one output. A neural network in turn consists of many of these interconnected neurons. Simply put, it receives data at the input and provides a response. What makes it unique is that the network learns to correlate incoming and outgoing signals with each other. Evolutionarily designed to separate a signal from ‘random noise’, it can also create new hierarchical views by learning from its own experience.
With deep neural networks, algorithms learn from their own experience, forming in the learning process multi-level, hierarchical ideas about the information they’re fed. As an example, let’s say I feed a neural network 100 audio examples, ten of which are bass guitars and the rest other instruments. Then I tell the network to output only those ten tracks (they are marked as ‘bass’): in go 100 tracks, out come only the ten bass tracks. The neural network can then analyse everything from frequency response and dynamic envelope to harmonic structure, to ‘learn’ what a bass guitar sounds like. Ultimately, with enough training, it would be able to identify them perfectly. Now imagine what else it can be trained to do.
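Here is a toy version of that experiment in Python: 100 labelled clips, a small neural network from scikit-learn, and a learned ‘is this a bass?’ decision. The file names and the long-term-spectrum feature are my own illustrative choices, and the clips are assumed to be mono and at least a few seconds long.

import numpy as np
from scipy.io import wavfile
from scipy.signal import welch
from sklearn.neural_network import MLPClassifier

def spectral_features(path):
    # Normalised long-term spectrum as a simple fingerprint of the instrument
    rate, audio = wavfile.read(path)
    freqs, power = welch(audio.astype(np.float64), fs=rate, nperseg=4096)
    return power / (power.sum() + 1e-12)

# 100 training clips: ten bass guitars, ninety other instruments
paths = [f"clips/bass_{i:02d}.wav" for i in range(10)] + \
        [f"clips/other_{i:02d}.wav" for i in range(90)]
X = np.array([spectral_features(p) for p in paths])
y = np.array([1] * 10 + [0] * 90)                 # 1 = bass, 0 = everything else

net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=2000)
net.fit(X, y)

print(net.predict([spectral_features("clips/unknown.wav")]))  # 1 if it 'sounds like' a bass

With only 100 clips this will be shaky in practice, which is precisely the data problem discussed next.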

Challenges
It all sounds like fun so far, but let’s talk about some of the challenges in building or using AI in sound processing.
First of all, in order for a neural network to work well, it needs large amounts of data. When you’re Facebook and you’re at this moment generating about four petabytes of data per day, using AI to optimize certain tasks makes perfect sense. It’s hard to put a number on it for audio, since it depends on the application, but it’s safe to say that you need many thousands of audio examples to efficiently train a neural network. For certain nonlinear algorithms, you can even quadruple this amount. Sound engineering is an extremely niche market, and we’re often limited in the amount of input data we have available, often also by budgets.
The next challenge has even more impact on
our industry, and that’s the need for offline
processing. As I mentioned before, AI does well when you feed it a lot of data and ask it to generate an output. Most of us who are recording or mixing expect sound processing
to happen in real time. We’re being creative,
hear something and want to act on it. If every
time we wanted to process something, we’d
have to ‘capture’ the whole track and then
process it, it would take us out of the creative
process. We want to throw on a plug-in and
hear its effect immediately. Speed is important.
That means these plug-ins only receive very
small buffers, say 32 samples of audio at a time.
By the time the plug-in receives the next buffer,
it needs to have processed the first one already.
With only 32 samples, there is almost nothing an AI can do. For it to be effective,
it needs to have access to much more audio,
something currently not present in our typical
workflows (except for offline processing of
course).
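A small sketch makes the arithmetic plain: at common sample rates, a 32-sample buffer is well under a millisecond of audio. The block size and sample rate below are typical values, not tied to any particular DAW or plug-in format.

import numpy as np

BLOCK = 32            # samples handed to the plug-in per call
SAMPLE_RATE = 48000   # 32 / 48000 is roughly 0.67 milliseconds

def process_block(block: np.ndarray) -> np.ndarray:
    # A real-time plug-in must return before the next block arrives, and it
    # only ever 'sees' this tiny window: far too little context for a model
    # that needs to hear whole phrases or an entire track.
    return block      # pass-through; real processing would happen here

song = np.random.randn(SAMPLE_RATE * 3)           # stand-in for 3 seconds of audio
out = np.concatenate([process_block(song[i:i + BLOCK])
                      for i in range(0, len(song), BLOCK)])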

The future of AI in audio processing


I obviously have no crystal ball, but I’ll explain
what I hope to see in the coming years. The
future will tell if it actually happens. I honestly
believe we need to see a serious paradigm shift
in the way DAWs are designed. Most of them
still follow the linear real-time model, an
evolutionary step from using tape. However, for
things such as editing, mixing and mastering (or
even post recording), a DAW has access to all
the audio in the session.
If DAWs and plug-ins could communicate
more with each other, instead of just
exchanging buffers, imagine the possibilities
this would bring.
Let’s say you have a track with a bass guitar
on it, and you want to use a particular plug-in
that adds harmonics and compression. If that
plug-in could access all the data on the track
while it gets instantiated, it could give you
feedback about the existing harmonic structure
and dynamic content. That's where I think AI can be extremely useful, giving suggestions to producers and engineers that will speed up their workflows.
For example, how many times have you heard a resonance in a piano recording and had to notch it out? What if the DAW understood it’s a piano recording, found the resonance automatically and gave you a suggestion: “I believe there is a resonance at 2050Hz, do you want me to remove it?” Then, as a creative, you would still have the choice to act on it or leave it alone. Or how about: “I notice this track is funky, but your bass guitar has a lot of 3rd order harmonics, where most bass guitars in this genre have more 2nd order harmonics. Would you like to change its harmonic structure?”
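As a hedged illustration of how such a suggestion could be generated, the sketch below scans a track’s long-term spectrum for unusually prominent narrow peaks and phrases the result as a question. The file name and the 12 dB prominence threshold are arbitrary choices of mine, not a shipping feature of any DAW.

import numpy as np
from scipy.io import wavfile
from scipy.signal import welch, find_peaks

rate, piano = wavfile.read("piano_take.wav")      # hypothetical mono piano track
freqs, power = welch(piano.astype(np.float64), fs=rate, nperseg=8192)
power_db = 10 * np.log10(power + 1e-12)

# Peaks that poke well above their neighbourhood look like resonances
peaks, _ = find_peaks(power_db, prominence=12)
for idx in peaks:
    if 100 < freqs[idx] < 8000:
        print(f"I believe there is a resonance at {freqs[idx]:.0f}Hz, "
              f"do you want me to remove it?")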
Also, plug-ins such as Neutron could become a lot more powerful if they had access to the data in advance. It would make the workflow of detecting masking between tracks a whole lot easier. If such a protocol for better communication between DAWs and plug-ins were to arrive in the coming years, it would really inspire a serious revolution in the design of tools for audio processing.
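To show what I mean, here is a purely imaginary sketch of that handshake in Python. None of these names exist in any current plug-in format; the point is only that a plug-in handed the whole track at instantiation can profile it before the first buffer arrives.

import numpy as np
from dataclasses import dataclass

@dataclass
class TrackContext:
    sample_rate: int
    audio: np.ndarray                 # the full track, provided by the DAW

class HarmonicsPlugin:
    def on_instantiate(self, ctx: TrackContext):
        # With the whole track available, profile dynamics (and, in a fuller
        # version, harmonic structure) up front and suggest settings.
        peak = np.abs(ctx.audio).max() + 1e-12
        rms = np.sqrt(np.mean(ctx.audio.astype(np.float64) ** 2)) + 1e-12
        self.crest_factor_db = 20 * np.log10(peak / rms)

    def process(self, block: np.ndarray) -> np.ndarray:
        # The real-time path stays exactly as small-buffered as it is today
        return block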

What about the creative process?


I’ve already explained why, in my vision, AI is there to assist us, not to take over creative control. I personally feel that the creative
process of crafting a song is sacred. In most cases, you’re telling a story
— your story. Music is a reflection of society and of the person behind the
music. In the end, it would be possible to use AI to learn more about the
personality and workflow of an artist and create a song in his or her style.
But that’s not what music is all about; it’s an art form. It’s also not about perfection, so we shouldn’t try to use AI to make something perfect. Let’s use it to fix things that bother us or shape things that fit our taste. If we were to go down the route where we let AI take over our creative
process, then who owns the copyright? Is it the developer who created
the AI? It opens up a whole can of worms. Let’s hope AI will just help us to
be more creative and allow us to create sounds that knock the socks off our listeners.

*2001: A Space Odyssey (Stanley Kubrick 1968) — As Dave moves to shut down
HAL, the AI computer pleads with him and promises to change: “I know I’ve made
some very poor decisions recently, but I can give you my complete assurance that
my work will be back to normal. I’ve still got the greatest enthusiasm and
confidence in the mission. And I want to help you…”
