Use this report to...

Understand significant trends in the PC-based speech recognition market in order to
plan your go-to-market strategy

www.datamonitor.com/technology
A Datamonitor report

Automating and Enhancing Processes
through Voice in Desktop and Back
Office Environments (Strategic Focus)

Digitizing document production using speech recognition
Published: Mar-08
Product Code: DMTC2178
Providing you with:

Analysis of the key trends, drivers and
inhibitors in the PC and server-based
speech recognition market

Discussion of the different applications for speech recognition and assessment of the leading vendors in this market

Assessment of the future trends and
potential growth for speech recognition
Contact us...
From Europe:
tel: +44 20 7551 9338
fax: +44 20 7551 9089
email: tcmarketing@datamonitor.com
From the US:
tel: +1 212 686 7400
fax: +1 646 365 3362
email: ustcmarketing@datamonitor.com
From Asia Pacific:
tel: +61 2 8705 6900
fax: +61 2 8705 6901
email: apinfo@datamonitor.com
Introduction

Speech-based transcription, dictation and search capabilities have improved
vastly over the past decade due to refined algorithms and improved computer
processors. As a result, speech recognition in PC-based environments is quickly
emerging as a compelling solution that enhances workforce productivity across
both niche and broader markets.

Key findings and highlights

Speech recognition initially gained traction amongst people with a disability who were
unable to use a computer in the traditional way. The number of cases of RSI and upper limb
musculoskeletal diseases is growing due to increased computer usage in the work
environment and the lack of education around ways to prevent these problems.

There are key job roles for which speech recognition is well suited, such as radiologists,
who have repetitive procedures and reports to create. Although there has been substantial
uptake in the healthcare industry, and a growing number of users in the professional
services industry, the technology is still only utilized in specific positions.

There are two leading vendors in the PC- and server-based speech recognition market:
Nuance and Philips. These two vendors compete directly in some markets, most noticeably
in healthcare and legal transcription, but they approach the market differently,
including in how they integrate their products with other systems.

Reasons to buy
Understand significant trends in the PC-based speech recognition market in order to plan
your go-to-market strategy
Learn about the key industries where the technology is deployed and discover their
motivations for investment
Gain insight into the global market size and discover Datamonitor's predictions for revenue
growth
Sample pages from the report
Market Opportunity
Automating and enhancing processes through voice in desktop environments
DMTC2178/ Published 03/2008
© Datamonitor. This report is a licensed product and is not to be photocopied
Page 9
Figure 1:
An example of the use of front-end speech recognition for a physician to dictate an EHR
[Diagram: the dictating physician's speech passes through speech recognition; the physician views, checks and edits the EHR, then completes and files it to the EHR management system for later retrieval.]
Source: Datamonitor

Back-end or server-based speech recognition takes place after the dictation has been converted to an audio file (usually a .wav file) on a computer. The conversion from audio to text typically occurs before the file is sent to a team of transcriptionists. In this case, the physician or dictator is not aware whether speech recognition is used, and the editing is done by a transcriptionist who receives both the audio file and the converted text file. This method relies heavily on transcriptionists to edit and review the work accurately: rather than transcribing the audio file, their role changes to one of editing and formatting. Although there will not be significant overhead cost savings for transcriptionists when using back-end speech recognition, the pressure on staff is reduced because typing skills are not needed, and documents are produced faster. Figure 2 shows the process of back-end speech recognition, whereby a lawyer dictating case notes records these in digital format before sending the file on to be transcribed. Speech recognition is carried out on the network side before the transcriptionist receives the audio and text files together for editing.

In some cases, both front-end and back-end speech recognition are needed to provide a more flexible solution that adapts to the availability of staff at either end of the process. For example, if a short report can be processed quickly and edited by a physician, or if it is needed immediately, front-end speech recognition can be used. But if the work requires a significant time commitment and the physician’s time is limited, back-end speech recognition might be a more viable option.
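
The choice between the two approaches described above can be illustrated with a short sketch. This is an illustration only and is not taken from any vendor's product: the names Dictation, choose_mode and back_end_package, the two-minute cut-off for a "short report" and the stand-in recognizer are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Mode(Enum):
    FRONT_END = "front-end"  # the dictator reviews and edits the text directly
    BACK_END = "back-end"    # a transcriptionist edits the draft against the audio


@dataclass
class Dictation:
    author: str
    audio_path: str          # e.g. a .wav file recorded by the dictator
    minutes: float
    urgent: bool = False


# Hypothetical cut-off for a "short report"; the report gives no figure.
SHORT_REPORT_MINUTES = 2.0


def choose_mode(job: Dictation, author_has_time: bool) -> Mode:
    """Apply the rule of thumb: urgent or short-and-editable work goes front-end."""
    if job.urgent:
        return Mode.FRONT_END
    if job.minutes <= SHORT_REPORT_MINUTES and author_has_time:
        return Mode.FRONT_END
    return Mode.BACK_END


def back_end_package(job: Dictation, recognize: Callable[[str], str]) -> dict:
    """Build what the transcriptionist receives: the audio plus a draft text."""
    return {"audio": job.audio_path, "draft_text": recognize(job.audio_path)}


# Usage with a stand-in recognizer (no real speech engine is called here).
notes = Dictation("lawyer", "case_notes.wav", minutes=25.0)
if choose_mode(notes, author_has_time=False) is Mode.BACK_END:
    package = back_end_package(notes, recognize=lambda path: "<draft for %s>" % path)
    print(package["audio"], package["draft_text"])
```

In this sketch the transcriptionist always receives the audio and the draft text together, which mirrors the back-end process in which the role shifts from transcribing to editing and formatting.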

Market Opportunity
Page 11
content. Cutting out the need for tapes makes the process much more efficient, as audio can easily be replayed and the
length of files is not important.

Speech recognition is, in many cases, a natural extension of digital dictation. The increased use of digital dictation has given speech recognition vendors a new route to market. Currently around 10% of digital dictation users have implemented speech recognition, which represents a significant growth opportunity: there is a large base of existing digital dictation users that could invest in speech, in addition to the growing adoption of digital dictation systems. Speech recognition can be used in tandem with a digital dictation system to simplify the document creation process. In particular, there is demand for digital dictation systems in the medical field within healthcare and the legal field within professional services. In these areas the value proposition for speech recognition is simple: the technology automates highly repetitive tasks such as report creation and transcription, which drives cost savings and efficiencies.

Automation of processes and cost pressure in the healthcare industry

Cost pressures, concerns over the speed of delivery of patient information and the push towards making health records electronic are key drivers for speech recognition adoption in the healthcare industry. It is believed that automating processes can help eliminate diagnostic errors by reducing illegible handwriting and simplifying processes. Speech recognition is therefore being used by radiologists to create reports. In addition, the movement towards electronic health records (EHRs) is laying the groundwork for greater adoption of speech in the healthcare industry. The benefits of using speech recognition with EHRs are improved efficiency in completing forms and cost savings where transcriptionists are no longer needed. Automatic speech-based transcription and dictation systems, such as Nuance’s Dragon NaturallySpeaking and Philips’ SpeechMagic, are evolving to become better integrated with EHR and radiology solutions.

Market challenges
Poor accuracy and disappointing deployments in the past make users hesitant to adopt

The main challenge for vendors is changing market perception. Vendors must persuade users that speech recognition accuracy has improved. One of the key issues for vendors is that past implementations of speech recognition were not successful. When the technologies were first released as commercial products, there were a handful of poor deployments, mostly in the enterprise, which were highly publicized. In the PC space, many people who tried speech recognition found it difficult to use because the time required to train the system was long and accuracy was low. For this reason solutions were abandoned. This presents a significant challenge for vendors, as they need to prove that speech recognition has matured and that deployments will provide value to the user rather than hindering work patterns.

Moving technology out of niche environments will be difficult for vendors

There are key job roles for which speech recognition is well suited, such as radiologists, who have repetitive procedures and reports to create. Although there has been substantial uptake of speech recognition in the healthcare industry and growing uptake in the professional services industry, the technology is still only utilized in specific job roles. Certain factors will prevent speech recognition from achieving widespread use in these industries, and it will therefore remain a niche technology in the short term. For example, ambient noise may prevent speech recognition from being used on a hospital ward or in an open-plan office. In addition, a lot of confidential information passes through businesses, and this needs to be typed rather than spoken aloud to reduce the risk of a breach of confidentiality.

Competitive Landscape
Page 23
Figure 8:
Speech recognition applications and their uses
[Figure: vendors and products grouped by application. Labels shown: Language, Pronunciation and fluency, Translation, Command & Control, Transcription / Dictation.]
Vendors and products shown:
• Carnegie: Speech Assessment, NativeAccent
• RosettaStone
• Soliloquy Learning: Reading Assistant
• Lingvosoft: Windows Lingvobit
• IBM: MASTOR, TALES, ViaVoice
• Philips: SpeechMagic
• Nuance: Dragon NaturallySpeaking, PowerScribe, RadWhere
• RedStart Systems: UtterCommand
• Microsoft: Vista
• Spheris: ClarityConvert
• eScription: AutoScript
Source: Datamonitor
Nuance
Strengths and opportunities

Nuance’s strengths lie in the range and quality of its products. It is the only vendor in this space that offers specialist medical and legal versions of its transcription / dictation products as well as cheaper consumer alternatives, which are marketed towards those with disabilities or home offices. It currently offers five versions of Dragon NaturallySpeaking: Standard, Preferred, Professional, Legal and Medical. The vendor also acts as a publisher for IBM’s ViaVoice speech recognition software, which is aimed at customers who rarely use their PC for document creation but do require some form of desktop speech recognition. The Standard edition of Dragon NaturallySpeaking is aimed at users who may wish to dictate rather than type emails and documents. The Preferred edition offers support for macros in Excel. The Medical and Legal versions are highly specialized, with specific vocabularies and intelligent learning capabilities. Version 9, the latest version of the software, was released at the end of 2006 and provides the ability to centrally manage a word list and create user profiles.

Nuance has invested heavily in both speech recognition and medical transcription technology. In 2006 it acquired Dictaphone Healthcare Solutions, a leading provider of medical dictation services in the US. This acquisition has allowed Nuance to gain significant traction in the healthcare market, as Dictaphone has a large installed base of customers in the healthcare industry and partnerships with electronic health record solutions providers. Through this acquisition and that of Commissure towards the end of 2007, Nuance has gained two more speech recognition solutions, Dictaphone’s PowerScribe and Commissure’s RadWhere, both aimed at radiologists. More recently Nuance has acquired Vocada to provide a text-to-speech based Critical Test Result Management system, Veriphy, expanding addressing healthcare issues

Request more sample pages...for FREE!
