SHREE L.R. TIWARI COLLEGE OF ENGINEERING


DEPARTMENT OF ELECTRONICS AND TELECOMMUNICATION
ENGINEERING
MIRA ROAD (E) – 401107
UNIVERSITY OF MUMBAI
Academic Year 2019–20

Project Report on

“TO READ A SPEECH SIGNAL AND PLOT IT”

Submitted by
TANMAY HARESH PATIL

Under Guidance of
PROF. ADITYA DESAI

INDEX:
SR NO. TOPIC

1. ACKNOWLEDGEMENT

2. ABSTRACT

3. INTRODUCTION

4. DESCRIPTION

5. SOFTWARE

6. PROGRAM

7. OUTPUT

8. CONCLUSION

9. REFERENCES

ACKNOWLEDGEMENT

Special thanks to our guide, PROF. ADITYA DESAI, for assisting us in completing our
project report on TO READ A SPEECH SIGNAL AND PLOT IT. She is our faculty for
Mini Project, and her expertise and talent in OPEN SOURCE TECHNOLOGY,
troubleshooting, and logical regression helped us complete this project effectively.
We would also like to thank our HOD, PROF. ZAINAB MIZWAN, for providing us with facilities
and labs, which constantly helped us increase our technical knowledge and write this
project report.

ABSTRACT

This project focuses on a Python program that reads a speech signal and plots its graph.
The recorded speech signal is plotted as a waveform, and its spectrum is shown in the
frequency domain.

INTRODUCTION
Speech is a complex phenomenon. People rarely understand how it is produced and
perceived. The naive perception is often that speech is built with words and each word
consists of phones. The reality is unfortunately very different. Speech is a dynamic process
without clearly distinguished parts. It is always useful to open a recording of speech in a
sound editor, look at it, and listen to it. Here, for example, is a speech recording in an audio
editor. [3]

DESCRIPTION
Speech processing is the study of speech signals and processing methods. The signals are
usually processed in a digital representation, so speech processing can be regarded as a
special case of digital signal processing, applied to speech signals. Aspects of speech
processing include the acquisition, manipulation, storage, transfer, and output of speech
signals.
Speech processing has been defined as the study of speech signals and their processing
methods, and also as the intersection of digital signal processing and natural language
processing.
Speech processing technologies are used for digital speech coding, spoken language dialog
systems, text-to-speech synthesis, and automatic speech recognition. Information (such as
speaker, gender, or language identification, or speech recognition) can also be extracted from
speech. [1]
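
As a rough illustration of what a digital representation involves, the sketch below takes a synthetic tone as a stand-in for speech, quantizes it to 16-bit samples, stores it as WAV data in memory, and reads it back. The 16 kHz sampling rate, the 200 Hz tone, and all variable names are assumptions chosen for the example and are not part of the project program.

```python
import io
import wave

import numpy as np

fs = 16000                     # assumed sampling rate in Hz
t = np.arange(fs) / fs         # one second of sample times
# Synthetic stand-in for a speech sound (assumption: 200 Hz tone, amplitude 0.5)
analog_like = 0.5 * np.sin(2 * np.pi * 200 * t)

# Acquisition/manipulation: quantize to 16-bit little-endian integers
pcm = np.round(analog_like * 32767).astype("<i2")

# Storage: write the samples as a mono WAV file held in memory
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_out:
    wav_out.setnchannels(1)    # mono
    wav_out.setsampwidth(2)    # 2 bytes = 16 bits per sample
    wav_out.setframerate(fs)
    wav_out.writeframes(pcm.tobytes())

# Output: read the stored samples back for further processing
buffer.seek(0)
with wave.open(buffer, "rb") as wav_in:
    frames = wav_in.readframes(wav_in.getnframes())
    restored = np.frombuffer(frames, dtype="<i2")

print("Samples stored:", restored.size)
```

Writing to an in-memory buffer keeps the example free of file paths; passing a file name to `wave.open` would store the same data on disk.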

SOFTWARE
Python is an interpreted, high-level, general-purpose programming language. Created by
Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code
readability through use of significant whitespace. Its language constructs and object-oriented
approach aim to help programmers write clear, logical code for small and large-scale
projects.

PROGRAM
import matplotlib.pyplot as plt
import numpy as np
import sounddevice as sd
from scipy.fftpack import fft

plt.close('all')

# details for sound recording
Fs = 16000  # sampling frequency in Hz
d = 3       # recording duration in seconds

# record sound
print('Start Speaking')
a = sd.rec(int(d*Fs), Fs, 1, blocking=True)  # wait until the recording is finished
a = a.flatten()  # convert the (n, 1) matrix into a 1-D array

# optional test signal: a 2000 Hz sine wave instead of recorded speech
#t = np.arange(0, d, 1/Fs)
#a = np.sin(2*np.pi*2000*t)
print('End Recording')

# play
sd.play(a, Fs)

# plot the recorded wave
plt.plot(a)
plt.title('Recorded Sound')

# spectrum
X_f = fft(a)

# create frequency axis (positive frequencies only, bin k is at k*Fs/n Hz)
n = np.size(a)
fr = (Fs/n)*np.arange(n//2)
X_m = (2/n)*abs(X_f[0:np.size(fr)])

# plot spectrum
plt.figure()
plt.plot(fr, X_m)
plt.xlabel('Frequency(Hz)')
plt.ylabel('Magnitude')
plt.title('Sound Spectrum')
plt.show()  # display both figures when run as a script
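
The spectrum calculation can be checked without a microphone by feeding it the 2000 Hz sine wave hinted at in the commented-out test lines of the program. The sketch below is a minimal check under stated assumptions: it uses NumPy's `rfft` and `rfftfreq` in place of `scipy.fftpack.fft`, and the helper name `magnitude_spectrum` is hypothetical and not part of the program.

```python
import numpy as np

def magnitude_spectrum(signal, fs):
    """Return (frequencies in Hz, single-sided magnitude) for a real signal."""
    n = signal.size
    spectrum = np.fft.rfft(signal)            # FFT bins from 0 Hz up to fs/2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)    # matching frequency axis in Hz
    magnitude = (2.0 / n) * np.abs(spectrum)  # same 2/n scaling as the program
    return freqs, magnitude

fs = 16000                            # sampling rate used in the program
duration = 3                          # seconds, same as d in the program
t = np.arange(duration * fs) / fs     # sample times
tone = np.sin(2 * np.pi * 2000 * t)   # 2000 Hz test tone

freqs, magnitude = magnitude_spectrum(tone, fs)
peak_hz = freqs[np.argmax(magnitude)]
print("Peak frequency (Hz):", peak_hz)
```

If the frequency axis and scaling are right, the largest peak should sit at about 2000 Hz with a magnitude close to 1, which is the amplitude of the test tone.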

OUTPUT

CONCLUSION

In conclusion, we can say that this software offers strong possibilities. When a speech
signal was given as input, the desired time-domain and frequency-domain graphs were
obtained by executing the above code. The Fast Fourier Transform (FFT) is used to compute
the spectrum for the frequency-domain plot. The FFT is so efficient for power-of-two
transform lengths that such lengths are frequently used regardless of the actual length of
the given speech signal.
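
The remark about power-of-two transform lengths can be illustrated with a short sketch. It assumes NumPy's `rfft`, which zero-pads the input when asked for a longer transform, and uses random noise as a stand-in for a 3-second recording. The helper `next_power_of_two` is a hypothetical name introduced here and does not appear in the project program.

```python
import numpy as np

def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    return 1 << (n - 1).bit_length()

fs = 16000
# Stand-in for 3 s of recorded speech (assumption: random noise, 48000 samples)
signal = np.random.default_rng(0).standard_normal(3 * fs)

n_fft = next_power_of_two(signal.size)   # transform length rounded up
spectrum = np.fft.rfft(signal, n=n_fft)  # input is zero-padded up to n_fft samples

print("Signal length:", signal.size)
print("FFT length:", n_fft)
print("Number of frequency bins:", spectrum.size)
```

Zero-padding changes the spacing of the frequency bins (here `fs / n_fft` instead of `fs / signal.size`), so the frequency axis has to be built from the padded length.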

REFERENCES
[1] https://www.sciencedirect.com/topics/neuroscience/speech-processing
[2] https://jcbrolabs.org
[3] https://cmusphinx.github.io/wiki/tutorialconcepts/
