Simple Audio Recognition On A Raspberry Pi Using Machine Learning (I2S, TensorFlow Lite) Electronut
electronut
Introduction
You know a technology is maturing when there is an
abundance of jokes about it on the internet.
Source: https://xkcd.com/1838/
Source: https://electronut.in/audio-recongnition-ml/ (captured 1/29/22, 5:43 PM)
Objective
Adapt the official TensorFlow simple audio recognition
example to use live audio data from an I2S microphone on
a Raspberry Pi.
The Method
Here’s our plan:
Project Architecture
def stft(x):
    # (STFT computation that produces spec elided in this excerpt)
    return tf.convert_to_tensor(np.abs(spec))

def get_spectrogram(waveform):
    # same length
    spectrogram.set_shape((129, 124))
    return spectrogram
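The excerpt above hints at the key trick in this port: the spectrogram is computed in NumPy/SciPy rather than with tf.signal, so the exact same code can later run on the Pi. A minimal sketch of how such a function might be wired into TensorFlow follows; the STFT parameters (nperseg=256, noverlap=128 at 16 kHz) are my assumption, chosen because they yield the (129, 124) shape set above:

```python
import numpy as np
import tensorflow as tf
from scipy import signal

def stft_np(x):
    # SciPy STFT so training and on-Pi inference share the same math
    _, _, Zxx = signal.stft(x, fs=16000, nperseg=256, noverlap=128,
                            boundary=None, padded=False)
    return np.abs(Zxx).astype(np.float32)

def get_spectrogram(waveform):
    # run the NumPy code inside the TensorFlow pipeline
    spectrogram = tf.numpy_function(stft_np, [waveform], tf.float32)
    # every clip is padded to the same 1 s length, so the shape is fixed
    spectrogram.set_shape((129, 124))
    return spectrogram

spec = get_spectrogram(tf.zeros(16000))
print(spec.shape)  # (129, 124)
```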
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)
EPOCHS = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=EPOCHS,
    callbacks=tf.keras.callbacks.EarlyStopping(verbose=1, patience=2),
)
model.save('simple_audio_model_numpy.sav')

# convert from the SavedModel directory
converter = tf.lite.TFLiteConverter.from_saved_model('simple_audio_model_numpy.sav')
tflite_model = converter.convert()

with open('simple_audio_model_numpy.tflite', 'wb') as f:
    f.write(tflite_model)
$ python3
>>> import tensorflow as tf
>>> interpreter = tf.lite.Interpreter(model_path='simple_audio_model_numpy.tflite')
>>> interpreter.allocate_tensors()
>>> interpreter.get_input_details()
>>> interpreter.get_output_details()
>>>
Now let’s look at how to get the audio input for inference.
After you set up the I2S mic and Pi audio, it's a good idea to collect some sound samples before you try any inference. Take a look at the audio_test.py utility program in the repository. First, run it with the --list option to get a list of the input devices on the Pi.
Found 4 devices:
dmic_hw
dmic_sv
dmix
done.
CHUNK = 4096
FORMAT = pyaudio.paInt32
CHANNELS = 2
RATE = 16000
RECORD_SECONDS = nsec
WAVE_OUTPUT_FILENAME = wavfile_name
# initialize pyaudio
p = pyaudio.PyAudio()

getInputDevice(p)

print('opening stream...')
# (p.open call reconstructed; the format argument was cut in this excerpt)
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK,
                input_device_index=1)

frames = []
# discard the first chunk while the stream settles
data = stream.read(CHUNK)
# (recording loop reconstructed from RATE, CHUNK, and RECORD_SECONDS)
for _ in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    #print(data)
    frames.append(data)

stream.stop_stream()
stream.close()
p.terminate()
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
This is what the WAV file looks like when I transfer it over to my computer and open it in Audacity.
As you can see, there are 2 channels, but data only on one
of them (as expected), and the volume level is very low.
Here’s what it looks like after normalising it in Audacity.
def show_audio(wavfile_name):
    # (WAV reading that produces waveform0 elided in this excerpt)
    print_info(waveform0)

    # pick the channel that carries the I2S data
    if len(waveform0.shape) == 2:
        waveform = waveform0.T[1]
    else:
        waveform = waveform0

    # normalise audio
    wabs = np.abs(waveform)
    wmax = np.max(wabs)
    waveform = waveform / wmax  # (reconstructed: scale to [-1, 1])

    display.display(display.Audio(waveform, rate=16000))

    # (print statement reconstructed around the surviving expressions)
    print('RMS = {}, mean(abs) = {}'.format(
        np.sqrt(np.mean(waveform**2)),
        np.mean(np.abs(waveform))))

    max_index = np.argmax(waveform)
    # (start_index/end_index around max_index elided in this excerpt)

    fig, axes = plt.subplots(4)  # (figure setup reconstructed)
    timescale = np.arange(waveform0.shape[0])
    axes[0].plot(timescale, waveform0)

    timescale = np.arange(waveform.shape[0])
    axes[1].plot(timescale, waveform)

    timescale = np.arange(waveform.shape[0])
    axes[2].plot(timescale, waveform)

    timescale = np.arange(16000)
    axes[3].plot(timescale, waveform[start_index:end_index])

    plt.show()
[[ 0 -122814464]
[ 0 -122912768]...]
While working with data, it’s a good idea to print out the
shape of your numpy arrays at various stages to verify that
your assumptions are correct.
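For instance, a quick sanity check on the recorded data might look like this (the array here is a stand-in for a hypothetical 4-second stereo capture from audio_test.py):

```python
import numpy as np

# simulate a 4-second stereo int32 capture at 16 kHz
data = np.zeros((4 * 16000, 2), dtype=np.int32)
print(data.shape)          # (64000, 2): stereo frames
mono = data.T[1]           # the I2S mic puts data on one channel only
print(mono.shape)          # (64000,)
print(mono[:16000].shape)  # (16000,): one second for the model
```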
waveform_padded = np.zeros((16000,))
waveform_padded[:waveform.shape[0]] = waveform
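The padding above assumes the clip is shorter than one second; a small helper that handles both short and long clips might look like this (the helper name is mine, not from the original code):

```python
import numpy as np

def pad_waveform(waveform, nsamples=16000):
    """Zero-pad (or truncate) a waveform to exactly nsamples samples."""
    waveform_padded = np.zeros((nsamples,), dtype=np.float32)
    n = min(waveform.shape[0], nsamples)
    waveform_padded[:n] = waveform[:n]
    return waveform_padded

print(pad_waveform(np.ones(8000)).shape)   # (16000,)
print(pad_waveform(np.ones(20000)).shape)  # (16000,)
```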
    PTP = np.ptp(waveform)
    # (silence check based on the peak-to-peak value elided in this excerpt)
    return []

def get_spectrogram(waveform):
    waveform_padded = process_audio_data(waveform)

    if not len(waveform_padded):
        return []

    # compute spectrogram
    # (scipy.signal.stft call that produces Zxx elided in this excerpt)
    spectrogram = np.abs(Zxx)

    if VERBOSE_DEBUG:
        print("spectrogram:", spectrogram.shape, type(spectrogram))
        print(spectrogram[0, 0])

    return spectrogram
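The STFT call that produces Zxx is not visible in the excerpt above. Assuming the same parameters as on the training side (nperseg=256, noverlap=128 at 16 kHz, which are my assumption), a SciPy version that yields the expected (129, 124) shape looks like:

```python
import numpy as np
from scipy import signal

fs = 16000
waveform_padded = np.zeros(fs, dtype=np.float32)  # stand-in for real audio

# boundary=None and padded=False give exactly 124 frames for 16000 samples
f, t, Zxx = signal.stft(waveform_padded, fs=fs, nperseg=256,
                        noverlap=128, boundary=None, padded=False)
spectrogram = np.abs(Zxx)
print(spectrogram.shape)  # (129, 124)
```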
OLED Display
Since we want to be a bit fancy with the output, we're going to hook up a 128 x 64 pixel I2C OLED display to the Raspberry Pi. It's connected to the Pi as follows:
VCC → 3.3 V
GND → GND
    spectrogram = get_spectrogram(waveform)

    if not len(spectrogram):
        #time.sleep(1)
        return

    # add batch and channel dimensions for the model
    # (reshape target reconstructed; the arguments were cut in this excerpt)
    spectrogram1 = np.reshape(spectrogram, (1,) + spectrogram.shape + (1,))

    if VERBOSE_DEBUG:
        print('spectrogram1: {} {}'.format(spectrogram1.dtype, spectrogram1.shape))

    interpreter = Interpreter('simple_audio_model_numpy.tflite')
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    #print(input_details)
    #print(output_details)

    input_shape = input_details[0]['shape']
    input_data = spectrogram1.astype(np.float32)
    interpreter.set_tensor(input_details[0]['index'], input_data)

    print("running inference...")
    interpreter.invoke()

    output_data = interpreter.get_tensor(output_details[0]['index'])
    yvals = output_data[0]

    if VERBOSE_DEBUG:
        print(output_data[0])
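With yvals in hand, the predicted command is just the index of the largest logit, as in the STOP detection shown below. Assuming the labels are in the alphabetical folder order of the mini Speech Commands dataset (the logit values here are made up for illustration):

```python
import numpy as np

# assumed label order: alphabetical mini Speech Commands folders
commands = ['down', 'go', 'left', 'no', 'right', 'stop', 'up', 'yes']

# example logits, as returned in yvals = output_data[0]
yvals = np.array([0.3, -1.2, 0.1, -0.5, 0.2, 4.1, -0.3, 0.0], dtype=np.float32)
print(commands[np.argmax(yvals)])  # stop
```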
opening stream…
Listening…
Listening…
running inference…
>>> STOP
Listening…
Conclusion
In this project, we trained an audio recognition model on our computer using TensorFlow, converted it to a TensorFlow Lite model, and used it to infer commands from a live audio stream on a Raspberry Pi. The inference works quite well in practice, even though we trained on only a subset of the Speech Commands dataset, and only for 10 epochs.
Downloads
All code for this project can be downloaded from the GitHub link below:
https://github.com/mkvenkit/simple_audio_pi