
Slide 1

A COMPARISON OF COMMERCIAL
SPEECH RECOGNITION
COMPONENTS FOR USE IN POLICE
CRUISERS

3rd Annual Intelligent Vehicle Systems Symposium
Andrew L. Kun
Brett Vinciguerra
June 11, 2003

A Free sample background from www.powerpointbackgrounds.com


Slide 2

Outline of Presentation

 Introduction - What, Why and How?


 Background
 Speech Recognition Evaluation Program Software
 Testing
 Results and Discussion
 Conclusion



Slide 3

Project54 Overview

 UNH / NHSP / DOJ


 Integrates
 Controls
 Standard Interface



Slide 4

Diagram labels:
– Computer aided dispatch
– GPS vehicle tracking
– Fingerprint checks
– Central data resources: motor vehicle, criminal, fingerprints
– Voice command
– Digital radio
– Voice response
– Remote access to vehicle resources
– Video
– Central database access and forms entry
Slide 5



Slide 6

Introduction

 What was the goal of this research?

– Compare SR engine and microphone combinations
– Accuracy and efficiency
– Quantitatively



Slide 7

Introduction

 Why was this research important?

– Limit distraction
– Limit frustration
– Standard Process



Slide 8

Introduction

 How was this goal accomplished?

– 16 combinations (4 engines x 4 mics) evaluated

– Speech Recognition Evaluation Program (SREP)


• Simulates
• Classifies
• Calculates



Slide 9

Introduction

 Accuracy

– # of correct commands versus total commands

 Efficiency

– false recognitions
– weighted
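
The two measures rest on sorting each trial into one of three outcomes. A minimal sketch of that classification and of the accuracy ratio follows. It is an illustration, not SREP's code: the function names and the use of the literal string "UNRECOGNIZED" as the engine's no-match result are assumptions made for the example.

```python
def classify(said: str, heard: str) -> str:
    """Label one trial as 'correct', 'unrecognized', or 'false'."""
    if heard == said:
        return "correct"
    if heard == "UNRECOGNIZED":  # assumed marker for "engine returned no match"
        return "unrecognized"
    return "false"  # engine returned a different phrase: a false recognition


def accuracy_percent(trials: list[tuple[str, str]]) -> float:
    """Number of correct commands versus total commands, as a percentage."""
    correct = sum(1 for said, heard in trials if classify(said, heard) == "correct")
    return 100.0 * correct / len(trials)


# Ten trials: nine recognized correctly, one not recognized at all.
trials = [("LIGHTS", "LIGHTS")] * 9 + [("LIGHTS", "UNRECOGNIZED")]
print(accuracy_percent(trials))  # 90.0
```

Keeping false recognitions separate from unrecognized commands is what lets the efficiency measure weight them differently.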



Slide 10

Outline of Presentation

 Introduction - What, Why and How?


 Background
 Speech Recognition Evaluation Program Software
 Testing
 Results
 Discussion
 Conclusion



Slide 11

SR ENGINE OPTIONS
 Speed of Speech
– Discrete
– Continuous

 Type of Application
– Command-and-control
– Dictation

 User-Dependency
– Speaker dependent
– Speaker independent

 Field of Application
– PC
– Telephone
– Noise robust

 Grammar File
Slide 12

Comparing SR Engines

 Field test

 Simulated tests
– Speaker source
– Background noise
– Number of speakers



Slide 13

Accuracy Ratings

 Not consistent

– Different conditions

 Hyde’s Law

– ‘Because speech recognisers have an accuracy of 98%, tests must be arranged to prove it’



Slide 14

Component Requirements

 Speech Recognition Engine


– Must be SAPI 4.0 compliant

 Microphone
– Must be far-field
– Mountable on dashboard
– Cancel noise
• Array
• Directional



Slide 15

Outline of Presentation

 Introduction - What, Why and How?


 Background
 Speech Recognition Evaluation Program Software
 Testing
 Results and Discussion
 Conclusion



Slide 16

[Diagram: a central Application Manager connected to Applications A through H]



Slide 17



Slide 18

LOOP ENGINES

LOOP BACKGROUND

LOOP COMMANDS
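
The three loops can be read as a nested iteration. A minimal sketch follows, assuming the nesting order engines, then background noises, then commands, as the labels are listed on the slide. `recognize` is a hypothetical stand-in for playing a mixed sound file to an engine and reading back its result; it is not SREP's actual interface, and the background names are made up for the example.

```python
def run_evaluation(engines, backgrounds, commands, recognize):
    """Play every command over every background noise for every engine."""
    results = []
    for engine in engines:                  # LOOP ENGINES
        for background in backgrounds:      # LOOP BACKGROUND
            for command in commands:        # LOOP COMMANDS
                heard = recognize(engine, background, command)
                results.append((engine, background, command, heard))
    return results


def echo_recognizer(engine, background, command):
    """Toy recognizer that always 'hears' exactly what was said."""
    return command


results = run_evaluation(
    engines=["ENG 1", "ENG 2"],
    backgrounds=["idle", "siren"],       # hypothetical noise recordings
    commands=["LIGHTS", "SIREN ON"],
    recognize=echo_recognizer,
)
print(len(results))  # 2 engines * 2 backgrounds * 2 commands = 8
```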



Slide 19

Obtaining Sound Files

 Laptop w/ SoundBlaster
 Earthworks M30BX
 Background recorded on patrol
 Speech commands in lab
– Microsoft Audio Collection Tool
– 5 Speakers (4 male, 1 female)
– 40 phrases



Slide 20

Processing Sound Files

 Matlab script

Signal strength = variance(signal) + mean(signal)^2

 Set volume and signal-to-noise ratio
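
With the population variance, the signal strength defined on this slide equals the mean of the squared samples (Matlab's `var` defaults to the sample variance, which differs slightly). The sketch below shows one way to scale a noise recording to a target signal-to-noise ratio. It is written in Python rather than Matlab, and it assumes the SNR is expressed in decibels as a ratio of signal strengths; the slides do not state the exact definition used, and the function names and sample values are illustrative.

```python
import math
from statistics import fmean, pvariance


def signal_strength(signal):
    """variance(signal) + mean(signal)^2, using the population variance."""
    mu = fmean(signal)
    return pvariance(signal, mu) + mu ** 2


def noise_gain_for_snr(speech, noise, snr_db):
    """Amplitude gain for the noise so that speech/noise strength equals snr_db."""
    target_noise_strength = signal_strength(speech) / (10 ** (snr_db / 10))
    return math.sqrt(target_noise_strength / signal_strength(noise))


speech = [0.5, -0.5, 0.5, -0.5]   # toy samples, strength 0.25
noise = [0.1, -0.1, 0.1, -0.1]    # toy samples, strength 0.01

gain = noise_gain_for_snr(speech, noise, snr_db=10.0)
scaled_noise = [gain * x for x in noise]
achieved_db = 10 * math.log10(signal_strength(speech) / signal_strength(scaled_noise))
print(round(achieved_db, 6))  # 10.0
```

Scaling the noise rather than the speech keeps the command volume fixed while the SNR varies, which is one plausible reading of "set volume and signal-to-noise ratio".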



Slide 21



Slide 22

Control File Structure

 Background Noises
– WAV filename
– Desired SNR
– Signal strength
– Description of file
 Voice Commands
– WAV filename
– Number of loops
– Signal strength
– Phrase
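
The fields listed above could be held in two small record types. This is a sketch only: the slide gives the fields but not the on-disk format, so the class names, field names, units (SNR in dB), and sample values below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BackgroundNoise:
    wav_filename: str        # WAV filename
    desired_snr_db: float    # desired SNR (dB assumed)
    signal_strength: float   # precomputed signal strength of the recording
    description: str         # description of file


@dataclass
class VoiceCommand:
    wav_filename: str        # WAV filename
    loops: int               # number of loops
    signal_strength: float   # precomputed signal strength of the recording
    phrase: str              # phrase expected from the recognizer


# Hypothetical entries; file names and numbers are placeholders.
noise = BackgroundNoise("siren.wav", 10.0, 0.01, "Siren on, vehicle moving")
command = VoiceCommand("speaker1_lights.wav", 1, 0.25, "LIGHTS")
print(noise.desired_snr_db, command.phrase)
```

Storing the signal strength next to each file name would let the evaluation program set the mixing gain without re-reading the audio.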



Slide 23

Outline of Presentation

 Introduction - What, Why and How?


 Background
 Speech Recognition Evaluation Program Software
 Testing
 Results and Discussion
 Conclusion



Slide 24

PRODUCTS TESTED

 Four microphones
– A, B, C and D.
 Four SR engines
– 1, 2, 3, and 4.
 16 unique combinations
– A1 through D4



Slide 25



Slide 26

SR ENGINES

 SR Engine 1
– Microsoft SR Engine 4.0

 SR Engine 2
– Microsoft SR Engine 4.0 (telephone mode)

 SR Engine 3
– Dragon NaturallySpeaking 4.0

 SR Engine 4
– IBM ViaVoice 8.01



Slide 27

PREPARATION

 Freshly installed engines


 Minimum training
 Default settings
 Microphone Set-up Wizard



Slide 28

TEST SCENARIO

 Identical conditions
 42 phrase grammar
 10 speech commands
 5 speakers
 6 background noises
 3 SNR levels
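
If every speech command was played for every speaker, background noise, and SNR level (an assumption; the slide lists the factors but not how they were crossed), the number of trials per microphone and engine combination works out as below. The result is consistent with the divisor of 13.5 in the efficiency formula, which scales an all-correct run to exactly 100.

```python
speech_commands = 10
speakers = 5
background_noises = 6
snr_levels = 3

# Assumed full crossing of all four factors.
trials_per_configuration = speech_commands * speakers * background_noises * snr_levels
print(trials_per_configuration)  # 900
```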



Slide 29



Slide 30

Outline of Presentation

 Introduction - What, Why and How?


 Background
 Speech Recognition Evaluation Program Software
 Testing
 Results and Discussion
 Conclusion



Slide 31

ACCURACY BY ENGINE
[Bar chart: accuracy (%), axis 0 to 80, grouped by engine (ENG 1 to ENG 4) with one bar per microphone (MIC A to MIC D)]
Slide 32

ACCURACY BY MIC
[Bar chart: accuracy (%), axis 0 to 80, grouped by microphone (MIC A to MIC D) with one bar per engine (ENG 1 to ENG 4)]
Slide 33

RANKED ACCURACY
[Bar chart: accuracy (%), axis 0 to 80, one bar per configuration. Legend order: C2, A2, D2, A1, C1, B2, D1, B1, D4, C4, B4, B3, A3, C3, D3, A4]
Slide 34

Efficiency Score

 Specific to Project54
 False recognitions



Slide 35

Efficiency Score
SAID      HEARD
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS

LOSS = 0



Slide 36

Efficiency Score
SAID      HEARD
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    UNRECOGNIZED
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS
LIGHTS    LIGHTS

LOSS = 1



Slide 37

Efficiency Score
SAID         HEARD
LIGHTS       LIGHTS
LIGHTS       LIGHTS
LIGHTS       LIGHTS
LIGHTS       SIREN ON
SIREN OFF    SIREN OFF
LIGHTS       LIGHTS
LIGHTS       LIGHTS
LIGHTS       LIGHTS
LIGHTS       LIGHTS
LIGHTS       LIGHTS

LOSS = 1.5



Slide 38

Efficiency Score
 Scoring system
– Correctly recognized = 1.5
– Unrecognized = 0.5
– Falsely recognized = 0

Eff. = ((#correct * 1.5) + (#unrec. * 0.5)) / 13.5

 Extreme scores
– All correct => Eff. = 100
– All unrecognized => Eff. = 33
– All falsely recognized => Eff. = 0
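
The scoring formula and its extreme values can be checked with a short sketch. The function name is illustrative, and the figure of 900 trials per configuration is an inference from the test scenario slide (10 commands x 5 speakers x 6 background noises x 3 SNR levels), not a number stated in the deck.

```python
def efficiency(num_correct: int, num_unrecognized: int) -> float:
    """Eff. = ((#correct * 1.5) + (#unrec. * 0.5)) / 13.5.

    Falsely recognized commands contribute 0, so they do not appear.
    """
    return (num_correct * 1.5 + num_unrecognized * 0.5) / 13.5


TRIALS = 900  # assumed trials per configuration (see lead-in)

print(efficiency(TRIALS, 0))            # all correct: 100.0
print(round(efficiency(0, TRIALS), 1))  # all unrecognized: 33.3
print(efficiency(0, 0))                 # all falsely recognized: 0.0
```

Because an unrecognized command still earns a third of the credit of a correct one, a configuration that stays silent scores higher than one that acts on the wrong command.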



Slide 39

RANKED EFFICIENCY
[Bar chart: efficiency (max 100), axis 0 to 80, one bar per configuration. Legend order: C2, A2, A1, C1, D2, D1, D4, B2, C4, B4, B1, B3, A3, C3, D3, A4]
Slide 40

WINNER

 Accuracy
– Configuration C2 accuracy = 70.3 %

 Efficiency
– Configuration C2 efficiency = 72.4

 Logical choices
– Microphone C
– SR Engine 2



Slide 41

WHY LOW ACCURACIES?

 Speakers' SR experience
 Limited training
 Training Environment
 Default settings
 Microphone and speaker placement
 SNR

 Absolute scores not important



Slide 42

Outline of Presentation

 Introduction - What, Why and How?


 Background
 Speech Recognition Evaluation Program Software
 Testing
 Results and Discussion
 Conclusion



Slide 43

CONCLUSION

 The main goal of this research was to compare

– SR engine and microphone combinations
– Accuracy and efficiency
– Quantitatively



Slide 44

CONCLUSION

 This research was important in order to

– Limit distraction
– Limit frustration



Slide 45

CONCLUSION

 The goal was reached by

– Evaluating 16 combinations (4 engines x 4 mics)

– Speech Recognition Evaluation Program (SREP)


• Simulated
• Classified
• Calculated



Slide 46

CONCLUSION

 Configuration C2
– Most accurate
– Most efficient

SR ENGINE 2
Microsoft SR Engine 4.0
Telephone mode



Slide 47

CURRENT STATUS

 9 vehicles on road
 300 in production

 Now supports non-SAPI 4.0 engines


 Evaluating new engines



Slide 48

MORE INFORMATION

 www.project54.unh.edu

 andrew.kun@unh.edu
 brettv@unh.edu
