
Precedence Effect

Beamforming

Demo of the Franssen effect (demonstrates precedence)

Introduction to 3D Audio (capture)


Directivity of a microphone:
Omni-directional: captures all sound, including the sound of interest (but also everything else)
Directional: captures sound from a preferred direction

Beamforming
Given N microphones, combine their signals so that some desired result occurs. The word arises from the use of parabolic reflectors to form pencil beams for broadcast and reception. Alternate term: spatial filtering. Examples: towed-array and fixed-array sonars.

Delay and Sum Beamforming


If the source location s is known, the delay to each microphone can be obtained. A signal x at location s arrives at microphone m_l as

x(t − |s − m_l| / c)

Signals at the microphones can be appropriately delayed and weighted, with

Δ_l = |s − m_l| / c,   w_l = 1 / |s − m_l|

Output signal:

y(k) = (1/N) · Σ_{l=1}^{N} w_l · x_l(k − Δ_l)
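A minimal time-domain sketch of this delay-and-sum scheme (delays rounded to whole samples; the geometry and sampling rate below are illustrative, not from the slides):

```python
import numpy as np

def delay_and_sum(signals, mic_pos, src_pos, fs, c=343.0):
    """Delay-and-sum beamformer for a known source location.

    signals: (N, T) array, one row per microphone; channel l is assumed
    to carry x delayed by Delta_l = |s - m_l| / c.
    Each channel is advanced by its own delay (so all copies align) and
    weighted by w_l = 1 / |s - m_l| before averaging.
    """
    dists = np.linalg.norm(mic_pos - src_pos, axis=1)
    delays = np.round(dists / c * fs).astype(int)   # Delta_l in samples
    weights = 1.0 / dists                           # w_l = 1 / |s - m_l|
    N, T = signals.shape
    out = np.zeros(T)
    for l in range(N):
        out[: T - delays[l]] += weights[l] * signals[l, delays[l]:]
    return out / N
```

With all channels realigned, the source adds coherently while signals from other directions add with mismatched phases.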

Behavior of simple beamformer


Usually the source is assumed to be far away; the weights are then approximately equal.

The signal from the source direction adds in phase, so the signal is amplified N times.

Signals from other directions add with random phases, and their power decreases by a factor of 1/N.

The directivity index is a measure of the gain of the array in the look direction (set by the delays), in decibels. For N microphones it is 10 · log10(N).

Requires the ability to store the signal (at least for max{Δ_l} samples). Jargon: taps = number of samples in time that are stored.

Data independent beamforming:


Weights are fixed

Data dependent (adaptive)


Weights change according to the data

Simple example:
Fixed: delay and sum looking at a particular point (direction)
Adaptive: delay and sum tracking a particular moving source

More general beamforming


Suppose we want to take advantage of the stored data. Write the beamformer output as

y(k) = Σ_{l=1}^{N} Σ_{m=0}^{M} w_{lm} · x_l(k − m)

This can be written as y = w^H x. Take the Fourier transform of the weights and the signal.
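The double-sum and the stacked inner-product form give the same output; a small sketch (sizes and data are arbitrary, real-valued, so w^H reduces to w^T) verifies the equivalence:

```python
import numpy as np

# Arbitrary sizes: N microphones, taps m = 0..M, real-valued data/weights.
rng = np.random.default_rng(0)
N, M, T = 3, 4, 32
x = rng.standard_normal((N, T))        # x_l(k)
w = rng.standard_normal((N, M + 1))    # w_{lm}

k = 10                                 # evaluation index (k >= M)
# Double-sum form: y(k) = sum_l sum_m w_{lm} x_l(k - m)
y_sum = sum(w[l, m] * x[l, k - m] for l in range(N) for m in range(M + 1))

# Stacked form: y(k) = w^T x_stacked (= w^H x for real data)
w_vec = w.reshape(-1)
x_vec = np.stack([x[l, k - np.arange(M + 1)] for l in range(N)]).reshape(-1)
y_dot = float(w_vec @ x_vec)
```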

Speech and Audio Processing

Microphone Array Processing


Slides adapted from those of Marc Moonen/Simon Doclo Dept. E.E./ESAT, K.U.Leuven www.esat.kuleuven.ac.be/~moonen/

Introduction
Each microphone is characterized by a `directivity pattern', which specifies the gain (and phase shift) that the microphone applies to a signal coming from a certain direction (`angle of arrival'). The directivity pattern is a function of angle of arrival and frequency. The directivity pattern is a (physical) microphone design issue.

[Figure: example directivity pattern (for one frequency slice), gain vs. frequency (0–3000 Hz) and angle (0–180 deg)]

Introduction
By weighting/filtering and summing the signals from different microphones, a `virtual' directivity pattern may be produced.

[Diagram: microphone signals y1[k], y2[k], …, yM[k] filtered by f1[k], f2[k], …, fM[k] and summed to give output z[k]]
This is `spatial filtering' and `spatial filter design', based on given microphone characteristics (with correspondences to traditional (spectral) filter design). Applications: teleconferencing, hands-free telephony, hearing aids, voice-controlled systems, …

Introduction
An important aspect is that the different microphones in a microphone array are in different positions/locations, hence receive different signals. Example: a linear array with uniform inter-microphone distances, under far-field (plane wavefront) conditions. Each microphone receives the same signal S(Ω), but with different delays; microphone m's signal Y_m(Ω, θ) is filtered by F_m(Ω), and the filtered signals are summed to give Z(Ω, θ).

[Diagram: uniform linear array; source at angle θ; extra path length d_m cos θ to microphone m; signals Y_1(Ω, θ) … Y_M(Ω, θ) filtered by F_1(Ω) … F_M(Ω) and summed to Z(Ω, θ)]
Hence `spatial filter design' is based on microphone characteristics + microphone array configuration. Often simple assumptions are made, e.g. microphone gain = 1 for all frequencies and all angles.

Introduction
Background/history: ideas borrowed from antenna array design/processing for RADAR and (later) wireless communications. Microphone array processing is considerably more difficult than antenna array processing:
narrowband radio signals versus broadband audio signals
far-field (plane wavefronts) versus near-field (spherical wavefronts)
pure-delay environment versus multi-path reverberant environment

Classification:
fixed beamforming: data-independent, fixed filters f_m[k]
  e.g. delay-and-sum, weighted-sum, filter-and-sum
adaptive beamforming: data-dependent, adaptive filters f_m[k]
  e.g. LCMV beamformer, …

Beamforming basics
General form: filter-and-sum beamformer
linear microphone array with M microphones and inter-microphone distance d_m
microphone gains are assumed to be equal to 1 for all frequencies/angles
(otherwise, this characteristic is to be included in the steering vector, see next page)
source S(Ω) at angle θ (far-field, no multipath)
filters f_m[k] with filter length L

[Diagram: source S(Ω) at angle θ; path difference d_m cos θ; microphone signals Y_1(Ω, θ) … Y_M(Ω, θ) filtered by F_1(Ω) … F_M(Ω) and summed to give Z(Ω, θ)]

F_m(Ω) = Σ_{k=0}^{L−1} f_m[k] · e^{−jΩk}

Terminology: `broadside' direction: θ = 90°; `end-fire' direction: θ = 0°

Near-field beamforming
Far-field assumptions are not valid for sources close to the microphone array:
spherical wavefronts instead of planar wavefronts
attenuation of the signals must be included
3 spherical coordinates θ, φ, r (= position q) instead of 1 coordinate θ

Different steering vector:

d(Ω, q) = [ a_1 e^{−jΩτ_1(q)}   a_2 e^{−jΩτ_2(q)}   …   a_M e^{−jΩτ_M(q)} ]^T

a_m = ‖q − p_ref‖ / ‖q − p_m‖,   τ_m(q) = ( ‖q − p_ref‖ − ‖q − p_m‖ ) · f_s / c

with q the position of the source, p_ref the position of the reference microphone, and p_m the position of the m-th microphone

Beamforming basics
Data model:
Microphone signals are delayed versions of S(Ω):

Y_m(Ω, θ) = e^{−jΩτ_m(θ)} · S(Ω),   y_m[k] = s[k − τ_m(θ)],   τ_m(θ) = d_m cos θ · f_s / c

Stack all microphone signals in a vector:

Y(Ω, θ) = d(Ω, θ) · S(Ω),   d(Ω, θ) = [ 1   e^{−jΩτ_2(θ)}   …   e^{−jΩτ_M(θ)} ]^T

d is the `steering vector'. The output signal Z(Ω, θ) is

Z(Ω, θ) = Σ_{m=1}^{M} F_m*(Ω) · Y_m(Ω, θ) = F^H(Ω) · Y(Ω, θ)
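A quick numerical check of this data model; a minimal sketch (array parameters taken from the slide examples, function name is mine) builds the steering vector and verifies that the matched filter F = d(Ω, ψ)/M of the later delay-and-sum slide gives unit response, H(Ω, ψ) = F^H d = 1:

```python
import numpy as np

def steering_vector(omega, theta, M, d, fs, c=343.0):
    """Far-field steering vector d(omega, theta) for a uniform linear array.

    omega: normalized frequency (rad/sample); theta: angle of arrival (rad);
    tau_m(theta) = (m - 1) * d * cos(theta) * fs / c  (delay in samples).
    """
    tau = np.arange(M) * d * np.cos(theta) * fs / c
    return np.exp(-1j * omega * tau)

# Slide-style parameters: M = 5 mics, d = 3 cm, fs = 16 kHz.
M, d, fs = 5, 0.03, 16000.0
omega = 2 * np.pi * 1000.0 / fs      # 1 kHz
psi = np.deg2rad(60.0)               # steering angle

d_vec = steering_vector(omega, psi, M, d, fs)
F = d_vec / M                        # matched filter (delay-and-sum)
H = np.vdot(F, d_vec)                # H(omega, psi) = F^H d
```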

Beamforming basics
Data model:
Microphone signals are corrupted by additive noise:

y_m[k] = s[k − τ_m(θ)] + n_m[k]

Stack all noise signals in a vector:

N(Ω) = [ N_1(Ω)   N_2(Ω)   …   N_M(Ω) ]^T

Define the noise correlation matrix as

Φ_NN(Ω) = E{ N(Ω) · N^H(Ω) }

We assume the noise field is homogeneous, i.e. the diagonal elements of Φ_NN(Ω) are equal:

Φ_ii(Ω) = Φ_noise(Ω), ∀ i

Then the noise coherence matrix is

Γ_NN(Ω) = Φ_NN(Ω) / Φ_noise(Ω)

Beamforming basics
Definitions:
Spatial directivity pattern (`transfer function' for a source at angle θ):

H(Ω, θ) = Z(Ω, θ) / S(Ω) = Σ_{m=1}^{M} F_m*(Ω) · e^{−jΩτ_m(θ)} = F^H(Ω) · d(Ω, θ)

Steering direction θ_max = angle with maximum amplification (for 1 frequency)
Beamwidth = region around θ_max with amplification > −3 dB (for 1 frequency)
Array gain = improvement in SNR:

G(Ω, θ) = SNR_output / SNR_input = |F^H(Ω) · d(Ω, θ)|² / ( F^H(Ω) · Γ_NN(Ω) · F(Ω) )

Beamforming basics
Definitions:
Array gain = improvement in SNR:

G(Ω, θ) = SNR_output / SNR_input = |F^H(Ω) · d(Ω, θ)|² / ( F^H(Ω) · Γ_NN(Ω) · F(Ω) )

Directivity = array gain for θ_max and diffuse noise (= coming from all directions):

DI(Ω) = |F^H(Ω) · d(Ω, θ_max)|² / ( F^H(Ω) · Γ_NN^diffuse(Ω) · F(Ω) )

White noise gain = array gain for θ_max and spatially uncorrelated noise (Γ_NN = I), e.g. sensor noise:

WNG(Ω) = |F^H(Ω) · d(Ω, θ_max)|² / ( F^H(Ω) · F(Ω) )

PS: WNG is often used as a measure for robustness.
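These definitions can be checked numerically; a short sketch (function name is mine, steering vector is arbitrary) computes the array gain for a given coherence matrix and confirms that, for spatially white noise (Γ_NN = I), the delay-and-sum choice F = d/M attains WNG = M:

```python
import numpy as np

def array_gain(F, d_vec, Gamma):
    """G(omega, theta) = |F^H d|^2 / (F^H Gamma F), for one frequency."""
    num = np.abs(np.vdot(F, d_vec)) ** 2
    den = np.real(np.vdot(F, Gamma @ F))
    return num / den

M = 5
tau = np.arange(M) * 0.7           # arbitrary delays (samples)
d_vec = np.exp(-1j * 1.2 * tau)    # unit-modulus steering vector
F = d_vec / M                      # delay-and-sum weights

wng = array_gain(F, d_vec, np.eye(M))   # white noise gain
```

For unit-modulus steering entries, |F^H d|² = 1 and F^H F = 1/M, so the gain is exactly M (i.e. 10·log10(M) dB).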

Delay-and-sum beamforming
Microphone signals are delayed and summed together; the array can be virtually steered to angle ψ:

z[k] = (1/M) · Σ_{m=1}^{M} y_m[k + τ_m],   τ_m = d_m cos ψ · f_s / c,   F_m(Ω) = e^{jΩτ_m} / M

For a uniform linear array: d_m = (m − 1) · d, τ_m = (m − 1) · τ.

[Diagram: uniform linear array with spacing d; delays τ_1 … τ_M; path difference (m − 1) d cos ψ]

Angular selectivity is obtained from constructive (for θ = ψ) and destructive (for θ ≠ ψ) interference. For θ = ψ, this is referred to as a `matched filter':

F(Ω) = d(Ω, ψ) / M,   so that H(Ω, θ = ψ) = 1

PS: H(Ω, θ) = H(Ω, −θ) (explain!) (if microphone characteristics are ignored)

Delay-and-sum beamforming
Spatial directivity pattern H(Ω, θ) for the uniform DS beamformer:

H(Ω, θ) = (1/M) · Σ_{m=1}^{M} e^{−j(m−1)Δ} = (1/M) · ( sin(MΔ/2) / sin(Δ/2) ) · e^{−j(M−1)Δ/2},   with Δ = Ω · d (cos θ − cos ψ) · f_s / c

H(Ω, θ) has a sinc-like shape and is frequency-dependent, with H(Ω, θ = ψ) = 1.

[Figure: |H(Ω, θ)| vs. frequency (0–8000 Hz) and angle (0–180°), plus polar pattern (0 to −20 dB) at f = 5000 Hz, for M = 5 microphones, d = 3 cm inter-microphone distance, ψ = 60° steering angle, f_s = 16 kHz sampling frequency; labels: θ = 60°, endfire, wavelength = 4 cm]
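The pattern above can be evaluated directly; a small sketch (function name is mine; defaults follow the slide example M = 5, d = 3 cm) sums the phasors and returns |H(f, θ)|:

```python
import numpy as np

def ds_pattern(f, theta, psi, M=5, d=0.03, c=343.0):
    """|H(f, theta)| of a uniform delay-and-sum beamformer steered to psi.

    Per-element phase step: Delta = 2*pi*f*d*(cos(theta) - cos(psi)) / c.
    """
    delta = 2 * np.pi * f * d * (np.cos(theta) - np.cos(psi)) / c
    # geometric sum of exp(-1j*(m-1)*Delta), m = 1..M, normalized by M
    return np.abs(np.exp(-1j * np.arange(M) * delta).sum()) / M
```

At the steering angle the response is exactly 1; away from it, the magnitude falls off in the sinc-like fashion shown in the figure.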

Delay-and-sum beamforming

For d ≥ c / ( f · (1 + cos ψ) ) an ambiguity, called spatial aliasing, occurs. This is analogous to time-domain aliasing, where now the spatial sampling (= d) is too coarse. Aliasing does not occur (for any ψ) if

d ≤ λ_min / 2 = c / (2 · f_max)

[Figure: |H(Ω, θ)| vs. frequency (0–8000 Hz) and angle (0–180°) for M = 5, ψ = 60°, f_s = 16 kHz, d = 8 cm, showing grating lobes]

Details: H(Ω, θ) = 1 iff Δ = 2πp for an integer p
1) Δ = 0 for θ = ψ (for all Ω)
2) Δ = 2π occurs for θ = 0° and f = c / ( d · (1 − cos ψ) )
3) Δ = −2π occurs for θ = 180° and f = c / ( d · (1 + cos ψ) )
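The aliasing conditions above reduce to two one-line helpers (names are mine); they reproduce the slide's example, where d = 8 cm and ψ = 60° give a grating lobe near 2858 Hz, well below f_s/2 = 8 kHz:

```python
import numpy as np

def alias_onset_freq(d, psi, c=343.0):
    """Lowest frequency (Hz) at which a grating lobe (|Delta| = 2*pi) appears."""
    return c / (d * (1 + np.abs(np.cos(psi))))

def max_alias_free_spacing(f_max, c=343.0):
    """Largest spacing d that avoids spatial aliasing for ANY steering angle."""
    return c / (2 * f_max)
```

For f_max = 8 kHz the spacing would have to shrink to about 2.1 cm to stay alias-free at every steering angle.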

Delay-and-sum beamforming
Beamwidth: for a uniform delay-and-sum beamformer, the −3 dB beamwidth is inversely proportional to the product f · d · M; hence a large dependence on the number of microphones, the inter-microphone distance (compare p. 14 & 15) and the frequency (e.g. BW infinitely large at DC).

Array topologies:
Uniformly spaced arrays
Nested (logarithmic) arrays (small d for high frequencies, large d for low frequencies)
Planar / 3D arrays

[Diagram: nested array with spacings d, 2d, 4d]

Weighted-sum beamforming
`delay-and-weight/sum': sensor-dependent complex weight + delay (compare to p. 13):

z[k] = Σ_{m=1}^{M} w_m · y_m[k + τ_m]

H(Ω, θ) = Σ_{m=1}^{M} w_m · e^{−j(m−1) · Ω d (cos θ − cos ψ) · f_s / c}

[Diagram: uniform linear array with spacing d; weights w_1 … w_M and delays τ_1 … τ_M; path difference (m − 1) d cos ψ]

The weights are added to allow for better beam shaping; the design is similar to traditional (spectral) filter design.
Ex: Dolph-Chebyshev design: beampattern with uniform sidelobe level (`equiripple')

Filter-and-sum beamforming
Sensor-dependent filters implement frequency-dependent complex weights, to obtain a desired response over the whole frequency/angle range of interest:

z[k] = Σ_{m=1}^{M} f_m[k] ∗ y_m[k]

H(Ω, θ) = Σ_{m=1}^{M} F_m*(Ω) · e^{−j(m−1) · Ω d cos θ · f_s / c}

[Diagram: uniform linear array with spacing d; filters f_1[k] … f_M[k]; path difference (m − 1) d cos θ]

Design strategies, for a desired beampattern P(Ω, θ):

Non-linear:   min over f_m[k], m = 1…M of   ∫_{Ω1}^{Ω2} ∫_{θ1}^{θ2} ( |H(Ω, θ)| − P(Ω, θ) )² dθ dΩ

Quadratic:   min over f_m[k], m = 1…M of   ∫_{Ω1}^{Ω2} ∫_{θ1}^{θ2} | H(Ω, θ) − P(Ω, θ) |² dθ dΩ

Frequency sampling, i.e. design the weights at sampled frequencies Ω_i and then interpolate:   min over F_m(Ω_i), m = 1…M of   ∫_{θ1}^{θ2} | H(Ω_i, θ) − P(Ω_i, θ) |² dθ

Filter-and-sum beamforming
Example 1: frequency-independent beamforming (continued)

[Figure: designed directivity pattern vs. frequency (0–3000 Hz) and angle (0–180°), for M = 8 microphones, logarithmic array, filter length L = 50, ψ = 90°, f_s = 8 kHz]

Filter-and-sum beamforming
Example 2: `superdirective' beamforming
Maximize the directivity for a known (diffuse) noise field. The maximum directivity (= M²) is obtained for diffuse noise and end-fire steering (θ_max = 0°).

Design: find the F(Ω) that, for a given steering angle θ_max, maximizes

DI(Ω) = |F^H(Ω) · d(Ω, θ_max)|² / ( F^H(Ω) · Γ_NN^diffuse(Ω) · F(Ω) )

The optimal solution is (up to a scaling)

F(Ω) = ( Γ_NN^diffuse(Ω) )^{−1} · d(Ω, θ_max)

This is equivalent to minimizing the noise output power, subject to a unit response for the steering angle (**):

min over F(Ω) of   F^H(Ω) · Γ_NN(Ω) · F(Ω)   s.t.   F^H(Ω) · d(Ω, θ_max) = 1

PS: the delay-and-sum beamformer similarly maximizes the WNG (Γ_NN = I): F(Ω) = d(Ω, θ_max) (up to a scaling)
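A minimal numerical sketch of the constrained formulation (**) (function name is mine; the normalization by d^H Γ⁻¹ d makes the scaling explicit so that the unit-response constraint F^H d = 1 holds):

```python
import numpy as np

def superdirective_weights(d_vec, Gamma):
    """Solve  min_F F^H Gamma F  s.t.  F^H d = 1.

    Closed form: F = Gamma^{-1} d / (d^H Gamma^{-1} d).
    Gamma: noise coherence matrix; d_vec: steering vector at theta_max.
    """
    g = np.linalg.solve(Gamma, d_vec)   # Gamma^{-1} d
    return g / np.vdot(d_vec, g)

# Sanity checks: for Gamma = I this reduces to the (scaled) delay-and-sum
# beamformer F = d / M; for any Gamma the unit-response constraint holds.
M = 5
d_vec = np.exp(-1j * np.linspace(0.0, 3.0, M))
F_white = superdirective_weights(d_vec, np.eye(M))
Gamma = np.diag(np.arange(1.0, M + 1))   # illustrative non-white noise
F_sd = superdirective_weights(d_vec, Gamma)
```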

Filter-and-sum beamforming
Example 2: `superdirective' beamforming (continued)
Directivity patterns for end-fire steering:

[Figure: polar directivity patterns (0 to −20 dB) at f = 3000 Hz for the delay-and-sum and the superdirective beamformer; M = 5, d = 3 cm, θ_max = 0°, f_s = 16 kHz]

The superdirective beamformer has the highest DI (approaching M² = 25), but a very poor WNG, hence problems with robustness (e.g. sensor noise)!

[Figure: directivity (linear) and white noise gain (dB) vs. frequency (0–8000 Hz) for the superdirective and delay-and-sum beamformers; the delay-and-sum directivity corresponds to 10 · log10(5) = 6.99 dB]

PS: diffuse noise = white noise for high frequencies

LCMV-beamforming
Adaptive filter-and-sum structure:
The aim is to minimize the noise output power, while maintaining a chosen frequency response in a given look direction (and/or other linear constraints, see below). This corresponds to the operation of a superdirective array (see (**) p. 25), but now the noise field is unknown. Implemented as an adaptive filter (e.g. constrained LMS algorithm).

[Diagram: speaker and noise impinging on M microphones; signals y_1[k] … y_M[k] filtered by f_1[k] … f_M[k] and summed to give z[k]]

Notation:

z[k] = f^T y[k] = Σ_{m=1}^{M} f_m^T y_m[k]

y[k] = [ y_1^T[k]   y_2^T[k]   …   y_M^T[k] ]^T,   f = [ f_1^T   f_2^T   …   f_M^T ]^T

y_m[k] = [ y_m[k]   y_m[k−1]   …   y_m[k−L+1] ]^T,   f_m = [ f_m[0]   f_m[1]   …   f_m[L−1] ]^T

LCMV-beamforming
LCMV = Linearly Constrained Minimum Variance
f is designed to minimize the variance of the output z[k]:

min_f E{ z²[k] } = min_f f^T R_yy[k] f

To avoid desired-signal distortion/cancellation, add linear constraints:

C^T f = b,   with C an ML × J matrix and b a J-vector

If noise and speech are uncorrelated, constrained output power minimization corresponds to constrained noise power minimization.

Types of constraints: frequency response in the look direction; point, line and derivative constraints.
Ex:   Σ_{m=1}^{M} F_m(z) = 1 (for broadside)   (= L constraints)

The solution (obtained using Lagrange multipliers, etc.) is:

f_opt = R_yy^{−1}[k] · C · ( C^T R_yy^{−1}[k] C )^{−1} · b
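The closed-form solution can be sketched directly with NumPy (matrix sizes and data here are synthetic, for illustration only; real-valued case):

```python
import numpy as np

def lcmv_weights(Ryy, C, b):
    """f_opt = Ryy^{-1} C (C^T Ryy^{-1} C)^{-1} b  (real-valued case)."""
    RinvC = np.linalg.solve(Ryy, C)                 # Ryy^{-1} C
    return RinvC @ np.linalg.solve(C.T @ RinvC, b)

# Synthetic example: ML = 6 stacked filter taps, J = 2 constraints.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
Ryy = A @ A.T + 6 * np.eye(6)                       # symmetric positive definite
C = rng.standard_normal((6, 2))
b = np.array([1.0, 0.0])
f_opt = lcmv_weights(Ryy, C, b)
```

By construction C^T f_opt = b, and among all f satisfying the constraints, f_opt minimizes the output power f^T R_yy f.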
