06/04/2021 Certification - Speech Testing | Assistant Partners

Certification - Speech Testing

Overview

Speech validation tests comprise the following:

Automatic Speech Recognition (ASR) and Hotword Acceptance - WER tests

Acoustic Echo Cancellation (AEC) - Eraser tests

Hotword False Accept - False Accept tests

Assistant Speech Validation Tests utilize a series of queries to validate hotword acceptance
and Automatic Speech Recognition in quiet conditions, noisy conditions, and with media
playback from the device. Recordings of speech and media which do not contain hotwords are
used to validate hotword false accept performance.

ASR and Hotword Acceptance

To assess the Word Error Rate and False Reject Rate, queries will be played to the device. After
playback is complete, the hotword count and machine transcript will be retrieved from the
device. A total of 2,000 queries will be played, divided between two speaker positions. For half
of the queries played from each position, wide-band diffused noise will be added.
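The split described above can be sketched as a small scenario table. This is only an illustration: the position labels are hypothetical, and the sketch assumes the 2,000 queries are divided evenly between the two positions, which the text does not state explicitly.

```python
from itertools import product

TOTAL_QUERIES = 2000                       # from the text
POSITIONS = ("position_1", "position_2")   # hypothetical labels
NOISE_SETTINGS = (False, True)             # quiet, or wide-band noise added


def build_scenarios(total=TOTAL_QUERIES):
    """Return {(position, noisy): query_count} for each test scenario.

    Assumes an even split between positions; half of each position's
    queries are played with noise, as the text describes.
    """
    per_position = total // len(POSITIONS)
    per_scenario = per_position // len(NOISE_SETTINGS)
    return {
        (position, noisy): per_scenario
        for position, noisy in product(POSITIONS, NOISE_SETTINGS)
    }


scenarios = build_scenarios()
```

Under these assumptions there are four scenarios, each of which is scored on its own.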

Each test scenario (combination of speaker position and noise setting) will be scored
independently. After each query is played, the device's state will be retrieved. This state
includes the current count of hotwords detected and the most recent speech transcript.
Between each query, the device's volume level and media playback state will be reset so that
each query is played with the device idle, not answering a previous query, and not playing an
alarm or media. After all queries are played, the scores will be calculated as follows:

False Reject Rate is calculated as the percentage of hotwords played but not detected.

Word Error Rate is calculated from the subset of queries for which the device detected
the hotword. Word errors are counted as the total number of words inserted, substituted,
or removed. The Word Error Rate is the total word errors divided by the total number of
words played excluding hotwords.

Crash Count is the number of times the device restarted or reset its state during the test.

Query Error Rate, Second Stage False Reject Rate, and other metrics are produced for
informational purposes and are not used to certify the device.
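The two certification metrics above can be illustrated with a short sketch. It is not the certification tool's implementation. It assumes that word errors are counted with a word-level edit distance (the minimum number of insertions, substitutions, and removals), that reference and transcript strings already exclude the hotword, and that the Word Error Rate denominator covers only the queries whose hotword was detected. The function names and the sample data are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Word-level edit distance: inserted + substituted + removed words."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word removed
                            curr[j - 1] + 1,      # word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def score(results):
    """results: list of (reference, transcript, hotword_detected) tuples.

    Returns (false_reject_rate, word_error_rate) as percentages.
    The word error rate is None if no hotword was detected at all.
    """
    played = len(results)
    detected = [(ref, hyp) for ref, hyp, ok in results if ok]
    frr = 100.0 * (played - len(detected)) / played
    errors = sum(word_errors(ref, hyp) for ref, hyp in detected)
    words = sum(len(ref.split()) for ref, _ in detected)
    wer = 100.0 * errors / words if words else None
    return frr, wer


# Made-up sample data for one test scenario.
sample = [
    ("turn on the lights", "turn on the light", True),
    ("what time is it", "", False),   # hotword missed: counts toward FRR only
    ("set a timer", "set a timer", True),
    ("play some music", "play music", True),
]
frr, wer = score(sample)
```

In this sample one of four hotwords is missed, and the three detected queries contain two word errors over ten reference words.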

Acoustic Echo Cancellation

The AEC tests are executed in the same fashion as the ASR and Hotword Acceptance tests.
Only one speaker position is used and noise is not added; instead, music is played from the
device during the test. The device's volume is set such that, in the voice range of 100 Hz to
8 kHz, the level received on the lab's reference microphone is approximately 70 dB(A).

A total of 100 queries will be played and scored for False Reject Rate and Word Error Rate.
Marginal failures, where the device fails one or more metrics by a narrow margin, may be
retested with 500 queries to confirm the result.
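The retest rule above can be sketched as a simple check. The source gives neither the pass limits nor what counts as a narrow margin, so `limit` and `margin` below are hypothetical values supplied by the caller. The sketch also assumes that both metrics are rates where lower is better.

```python
RETEST_QUERY_COUNT = 500  # from the text


def is_marginal_failure(measured, limit, margin):
    """True if a lower-is-better metric exceeds its limit by at most `margin`.

    `limit` and `margin` are hypothetical; the source does not publish them.
    """
    return limit < measured <= limit + margin


def queries_for_retest(metrics, limits, margin):
    """Return 500 if any metric is a marginal failure, else 0 (no retest)."""
    marginal = any(
        is_marginal_failure(metrics[name], limits[name], margin)
        for name in metrics
    )
    return RETEST_QUERY_COUNT if marginal else 0
```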

Hotword False Acceptance

To ensure the device does not wake up and listen when no hotword has been spoken, this test
will play approximately one hour and fifteen minutes of background noise.

The background noise is composed of news broadcasts, media, and other common household
noise. These recordings have been screened to ensure no hotwords are present. After
playback, the device's hotword count is retrieved to determine if any hotwords were detected. A
maximum of one false accept event is permitted during the test.
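The pass criterion above amounts to comparing the device's hotword count before and after playback. A minimal sketch follows, assuming the count is cumulative and is read once before the test starts; the function name is hypothetical.

```python
MAX_FALSE_ACCEPTS = 1  # from the text: at most one false accept is permitted


def false_accept_result(count_before, count_after):
    """Return (false_accepts, passed) for one false accept run.

    Assumes the hotword count is cumulative, so the difference is the
    number of false accepts during playback.
    """
    false_accepts = count_after - count_before
    if false_accepts < 0:
        # A lower count suggests the device restarted or reset its state.
        raise ValueError("hotword count decreased during the test")
    return false_accepts, false_accepts <= MAX_FALSE_ACCEPTS
```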

Last updated 2020-06-27 UTC.

https://developers.google.com/assistant-partners/resources/art/speech
