
Processing

Introduction

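Figure 1 illustrates the speech production model universally used in speech signal processing.

[Figure: Speech Production Model. Voiced branch: pitch impulse train generator P(z), p[n] (pitch period $N_p$), followed by the glottal pulse model G(z), g[n], scaled by gain $A_V$. Unvoiced branch: random noise generator U(z), u[n], scaled by gain $A_N$. A voiced/unvoiced switch selects the excitation e[n], which drives the vocal tract model V(z), v[n] (set by the vocal tract parameters), followed by the radiation model R(z), r[n], producing the speech signal s[n].]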

From this model, we can see that the important parameters include

◮ pitch period $N_p$

◮ voiced/unvoiced detection

◮ glottal pulse g[n]

◮ gains $A_V$ and $A_N$

◮ vocal tract model v[n]

◮ radiation model r[n]

Short-Time Speech Analysis

Since speech is time-varying, the model parameters are also time-varying, and we need a short-time analysis to estimate them. Furthermore, going from speech samples to model parameters often requires an alternative short-time representation as an intermediate step.

[Figure: Short-time Speech Analysis for Parameter Estimation. Speech waveform s[n] → short-time analysis → alternate representation $Q_{\hat n}$ → parameter estimation → model parameters.]

Here we introduce the representations of

◮ the short-time energy

◮ the short-time zero-crossing rate

◮ the short-time autocorrelation function

◮ the short-time Fourier transform


Assumption

The properties of the speech signal change relatively slowly (compared to the sampling rate of the speech waveform), with rates of change on the order of 10-30 times/sec, corresponding to a speech rate of 5-15 phones (or subphones) per second.

Basic Framework

A speech signal is partitioned into short segments, each of which is assumed to be similar to a frame from a sustained sound. Such a segment is called an (analysis) frame. The sounds are detected frame by frame and then integrated to form the speech.

Frame

◮ window function w[n]: the (finite-duration) function used to extract a frame from the speech waveform

  ◮ rectangular window vs. Hamming window

$$w_R[n] = \begin{cases} 1, & 0 \le n \le L-1 \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

$$w_H[n] = \begin{cases} 0.54 - 0.46\cos\left(\dfrac{2\pi n}{L-1}\right), & 0 \le n \le L-1 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

◮ frame length (size) L: the length (seconds or samples) of a frame

  ◮ short frames (5-20 ms), medium frames (20-100 ms), long frames (100-500 ms)

◮ frame shift R: the time between frames
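As a minimal sketch of the two windows in (1) and (2), the Python below builds both for a frame length given in milliseconds. The sampling rate of 8000 Hz and the helper names are assumptions made for illustration; they do not come from the lecture.

```python
import math

def rectangular_window(L):
    """Rectangular window of equation (1): 1 for 0 <= n <= L-1."""
    return [1.0] * L

def hamming_window(L):
    """Hamming window of equation (2); requires L >= 2."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (L - 1)) for n in range(L)]

def ms_to_samples(duration_ms, fs):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(duration_ms * fs / 1000.0))

fs = 8000                    # assumed sampling rate in Hz (not from the lecture)
L = ms_to_samples(20, fs)    # a 20 ms frame
w_rect = rectangular_window(L)
w_hamm = hamming_window(L)
print(L, w_hamm[0], w_hamm[L // 2])
```

The Hamming window tapers to 0.08 at both ends of the frame, whereas the rectangular window weights every sample in the frame equally.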

General Representation of Short-time Processing

[Figure: General Representation of Short-time Analysis. s[n] → linear filter → x[n] → T( ) → T(x[n]) → lowpass filter $\tilde w[n]$ → $Q_{\hat n}$.]

All the short-time processing we will discuss can be represented mathematically as

$$Q_{\hat n} = \sum_{m} T(x[m])\, \tilde w[\hat n - m] \tag{3}$$

In (3), $T(\cdot)$ extracts certain feature(s) of the speech signal. The feature values are then weighted by the window $\tilde w[\hat n - m]$ anchored at $\hat n$ and summed. The result is a short-time feature near $\hat n$. For example, for a rectangular window of length L,

$$Q_{\hat n} = \sum_{m=\hat n-L+1}^{\hat n} T(x[m]) \tag{4}$$
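The general form (3) can be sketched directly in Python. The function name `short_time` and the convention that samples outside the recorded signal are skipped are assumptions of this sketch, not part of the lecture.

```python
def short_time(x, T, w_tilde, n_hat):
    """Evaluate Q at n_hat as in (3): sum over m of T(x[m]) * w_tilde[n_hat - m].

    w_tilde is a list holding the window values for indices 0..len(w_tilde)-1;
    the window is taken to be zero elsewhere. Samples outside x are skipped.
    """
    total = 0.0
    for j, wj in enumerate(w_tilde):   # j plays the role of n_hat - m
        m = n_hat - j
        if 0 <= m < len(x):
            total += T(x[m]) * wj
    return total

x = [1, -2, 3, -4, 5]
rect = [1.0, 1.0, 1.0]                  # rectangular window, L = 3
q = short_time(x, lambda v: v * v, rect, n_hat=4)
print(q)
```

With the rectangular window the sum covers m = 2, 3, 4, which is the special case (4) with T(x) = x².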

Frame Shift

In (3), let $\hat n$ advance R samples at a time, i.e.,

$$\hat n = kR, \quad k = 0, 1, \ldots \tag{5}$$

The choice of R depends on the frame length L. First, R < L, since we do not want to exclude samples from analysis. Second, R must not be too large, or the short-time analysis will fail to keep up with the rate of change of speech sounds.
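A small sketch of (5): the analysis times are multiples of the frame shift R. The values L = 160 and R = 80 (20 ms frames shifted by 10 ms at an assumed 8000 Hz sampling rate) and the helper name `frame_anchors` are illustrative assumptions only.

```python
def frame_anchors(num_samples, L, R):
    """Analysis times n_hat = k*R of equation (5) that fall inside the signal."""
    if not 0 < R < L:
        raise ValueError("expected 0 < R < L, so that consecutive frames overlap")
    return list(range(0, num_samples, R))

anchors = frame_anchors(num_samples=400, L=160, R=80)
overlap = 160 - 80            # samples shared by consecutive frames
print(anchors, overlap)
```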

Energy

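$$E_{\hat n} = \sum_{m} \left(x[m]\, w[\hat n - m]\right)^2 = \sum_{m} (x[m])^2\, \tilde w[\hat n - m] \tag{6}$$

where $\tilde w[n] = (w[n])^2$.

For the case of a rectangular window of length L,

$$E_{\hat n} = \sum_{m=\hat n-L+1}^{\hat n} (x[m])^2 \tag{7}$$

In the absence of additive noise, $E_{\hat n}$ is useful for detecting voiced segments of speech. It is also useful for detecting silent segments.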

Magnitude

The short-time magnitude is similar to the short-time energy, except that $(\cdot)^2$ is replaced by $|\cdot|$.
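The rectangular-window energy (7) and the corresponding magnitude can be sketched as follows. The function names and the toy signal are made up for illustration, and frames that start before the first sample are simply truncated.

```python
def short_time_energy(x, n_hat, L):
    """Equation (7): sum of x[m]^2 for m = n_hat-L+1 .. n_hat."""
    start = max(0, n_hat - L + 1)
    return sum(v * v for v in x[start:n_hat + 1])

def short_time_magnitude(x, n_hat, L):
    """Same sum as the energy, with |x[m]| in place of x[m]^2."""
    start = max(0, n_hat - L + 1)
    return sum(abs(v) for v in x[start:n_hat + 1])

x = [0, 1, -1, 20, -30, 20, 1]          # quiet start, loud middle
e_quiet = short_time_energy(x, n_hat=2, L=3)
e_loud = short_time_energy(x, n_hat=5, L=3)
m_loud = short_time_magnitude(x, n_hat=5, L=3)
print(e_quiet, e_loud, m_loud)
```

Because the magnitude does not square the samples, it spans a smaller dynamic range than the energy between quiet and loud frames.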

Zero-Crossing Rate

Zero-Crossing

For discrete-time signals, a zero-crossing occurs if two adjacent samples are of different signs.

Consider a sinusoidal wave $x_c(t)$ of frequency $F_0$. If the sampling period T is not longer than half the period, $1/(2F_0)$, then the number of zero-crossings per second of $x[n] = x_c(nT)$ is $2F_0$, since there are two zero-crossings in x[n] per cycle of $x_c(t)$ and there are $F_0$ cycles per second. The number of zero-crossings per sample is

$$Z^{(1)} = \frac{2F_0}{F_S}, \tag{8}$$

where $F_S = 1/T$ is the sampling frequency. The number of zero-crossings per M samples is

$$Z^{(M)} = M\,\frac{2F_0}{F_S} \tag{9}$$

Reversing (8), the equivalent sinusoidal frequency can be defined as

$$F_e = \frac{1}{2} F_S Z^{(1)} \tag{10}$$

Examples

Suppose $F_S = 10000$ samples/sec. For sinusoidal signals with $F_0 = 100, 1000, 5000$ Hz, the zero-crossing rates are 0.02, 0.2, 1 crossings/sample respectively. (Note the typo in text)
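The example values follow directly from (8), and (10) recovers the frequency from the rate. A short sketch, with function names chosen here for illustration:

```python
def zero_crossings_per_sample(f0, fs):
    """Equation (8): Z(1) = 2*F0/FS for a sinusoid of frequency F0."""
    return 2.0 * f0 / fs

def zero_crossings_per_m_samples(f0, fs, M):
    """Equation (9): Z(M) = M * 2*F0/FS."""
    return M * zero_crossings_per_sample(f0, fs)

def equivalent_frequency(z1, fs):
    """Equation (10): Fe = 0.5 * FS * Z(1)."""
    return 0.5 * fs * z1

FS = 10000
rates = [zero_crossings_per_sample(f0, FS) for f0 in (100, 1000, 5000)]
print(rates)
print(equivalent_frequency(rates[1], FS))
```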

Practical Considerations

The zero-crossing rate is vulnerable to noise and DC offset.

General Representation

In terms of (3), we can write

$$Z_{\hat n} = \frac{1}{2L} \sum_{m=\hat n-L+1}^{\hat n} \left|\operatorname{sgn}(x[m]) - \operatorname{sgn}(x[m-1])\right| \tag{11}$$
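A minimal sketch of (11), in which each pair of adjacent samples with different signs contributes 2 to the sum. The convention sgn(0) = +1 and the skipping of m = 0 (where x[m-1] does not exist) are assumptions of this sketch; the test signal is the 1000 Hz sinusoid sampled at 10000 samples/sec from the examples, with an arbitrary phase of 0.3 rad so that no sample lands exactly on zero.

```python
import math

def sgn(v):
    """Sign function with the assumed convention sgn(0) = +1."""
    return 1 if v >= 0 else -1

def short_time_zcr(x, n_hat, L):
    """Equation (11): zero-crossings per sample over the L samples ending at n_hat."""
    total = 0
    for m in range(n_hat - L + 1, n_hat + 1):
        if m >= 1:                       # x[m-1] must exist
            total += abs(sgn(x[m]) - sgn(x[m - 1]))
    return total / (2 * L)

FS, F0 = 10000, 1000
x = [math.sin(2.0 * math.pi * F0 * n / FS + 0.3) for n in range(101)]
z = short_time_zcr(x, n_hat=100, L=100)
print(z)
```

The measured rate can be compared with the value $2F_0/F_S$ predicted by (8).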

Usage

Together with energy, $Z$ or $Z^{(M)}$ can be used in the classification of voiced/unvoiced segments of a speech signal. Specifically, a voiced segment is high in E and low in Z.

Autocorrelation Function

The deterministic autocorrelation function of a discrete-time signal x[n] is defined as

$$\phi[k] = \sum_{m=-\infty}^{\infty} x[m]\, x[m+k] \tag{12}$$

If x[n] is random or periodic, the useful definition is

$$\phi[k] = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{m=-N}^{N} x[m]\, x[m+k] \tag{13}$$

Properties

$\phi[k] = \phi[-k]$; $|\phi[k]| \le \phi[0]$; $\phi[0]$ is the signal energy or power.

Short-time Autocorrelation

At analysis time $\hat n$, the short-time autocorrelation is defined as the autocorrelation function of the windowed segment, i.e.,

$$R_{\hat n}[k] = \sum_{m=-\infty}^{\infty} \left(x[m]\, w[\hat n - m]\right)\left(x[m+k]\, w[\hat n - k - m]\right) \tag{14}$$

The properties holding for $\phi[k]$ also hold for $R_{\hat n}[k]$.

General Representation

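In terms of (3), we can write

$$\begin{aligned} R_{\hat n}[k] = R_{\hat n}[-k] &= \sum_{m=-\infty}^{\infty} \left(x[m]\, x[m-k]\right)\left(w[\hat n - m]\, w[\hat n + k - m]\right) \\ &= \sum_{m=-\infty}^{\infty} \left(x[m]\, x[m-k]\right) \tilde w_k[\hat n - m], \end{aligned} \tag{15}$$

where $\tilde w_k[\hat n - m]$ is the effective window for lag k, defined as $\tilde w_k[\hat n] = w[\hat n]\, w[\hat n + k]$.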

Computation

(14) can be re-written as

$$R_{\hat n}[k] = \sum_{m=-\infty}^{\infty} \left(x[\hat n + m]\, w'[m]\right)\left(x[\hat n + m + k]\, w'[k + m]\right), \tag{16}$$

where $w'[m] = w[-m]$, and the change of variable $m \to \hat n + m'$ has been made. Taking $w'[m]$ to be nonzero only for $0 \le m \le L-1$, we can see that only finitely many terms in (16) are non-zero, and it becomes

$$R_{\hat n}[k] = \sum_{m=0}^{L-1-k} \left(x[\hat n + m]\, w'[m]\right)\left(x[\hat n + m + k]\, w'[k + m]\right) \tag{17}$$
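The finite sum (17) can be sketched as below. The function name, the list-based window argument, and the default rectangular window are assumptions of this sketch; the caller must supply at least n_hat + L samples.

```python
def short_time_autocorr(x, n_hat, L, k, window=None):
    """Equation (17): sum over m = 0 .. L-1-k of
    (x[n_hat+m] * w'[m]) * (x[n_hat+m+k] * w'[m+k])."""
    if window is None:
        window = [1.0] * L              # rectangular w'[m], an assumed default
    k = abs(k)                          # R[k] = R[-k]
    total = 0.0
    for m in range(L - k):
        total += (x[n_hat + m] * window[m]) * (x[n_hat + m + k] * window[m + k])
    return total

x = [1, 2, 3, 4]
r = [short_time_autocorr(x, n_hat=0, L=4, k=k) for k in range(4)]
print(r)
```

The lag-0 value is the energy of the windowed segment, and each larger lag sums one fewer product.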

Practical Considerations

$R_{\hat n}[k]$ tends to decrease with k because the summation has only L − k terms. The definition can be modified so that there are L terms for each k.

Usage

For voiced segments, the autocorrelation function shows periodicity. Furthermore, the rectangular window works better than the Hamming window.
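One way to see this periodicity is to compute the rectangular-window autocorrelation of a synthetic periodic signal and look for the lag with the largest value. Picking the strongest lag is a common heuristic that the lecture does not spell out, and the 8-sample pattern and function names below are invented for illustration.

```python
def short_time_autocorr(x, n_hat, L, k):
    """Equation (17) with a rectangular window (w'[m] = 1 for 0 <= m <= L-1)."""
    return sum(x[n_hat + m] * x[n_hat + m + k] for m in range(L - k))

def strongest_lag(x, n_hat, L, k_min, k_max):
    """Lag in [k_min, k_max] with the largest short-time autocorrelation."""
    lags = range(k_min, k_max + 1)
    return max(lags, key=lambda k: short_time_autocorr(x, n_hat, L, k))

pattern = [4, 1, 0, -1, -2, -1, 0, 1]   # one made-up "pitch period" of 8 samples
x = pattern * 5                          # 40 samples of a periodic signal
lag = strongest_lag(x, n_hat=0, L=40, k_min=1, k_max=20)
print(lag, short_time_autocorr(x, 0, 40, lag))
```

The autocorrelation also peaks at twice the period, but with fewer terms in the sum, so that peak is lower than the one at the period itself.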

National Sun Yat-Sen University - Kaohsiung, Taiwan
