Voice Technology Seminar
Internal Guide: E.Nithya
Co-ordinators: S.Revathi
HOD: C.Murugamani
BY
N.Aishwarya
17321A1202
Contents
• Introduction
• Evolution
• Voice recognition through AI
• Voice recognition and algorithm
• Applications
• Pros & cons
• Conclusion
What makes voice technology so popular?
• Isolated word
• Connected Word
• Continuous speech
• Spontaneous speech
• Voice verification/identification
Text-Dependent
Text-Independent
Speech recognition process
• It involves the following five major steps:
1) Signal processing
2) Speech recognition
3) Semantic interpretation
4) Dialog management
5) Response Generation
Signal processing:
Dialog Management:
• The system attempts to correct any errors it encounters.
Response Generation:
Signal Analysis:
Raw speech should first be transformed and
compressed in order to simplify subsequent processing.
Signal analysis converts raw speech into speech frames.
Speech Frames:
The result of signal analysis is a sequence of speech
frames, typically at 10 millisecond intervals, with about
16 coefficients per frame.
The speech frames are then used for acoustic analysis.
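The framing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 25 ms window length, the 8 kHz sample rate, the synthetic test tone, and the choice of log-magnitude DFT bins as the 16 coefficients are all hypothetical stand-ins, since the slides do not say which front end is used. The names frame_signal and frame_coefficients are made up for this example.

```python
import cmath
import math

def frame_signal(samples, sample_rate, step_ms=10, window_ms=25):
    # Cut the signal into overlapping windows, one starting every step_ms.
    # The 25 ms window length is an assumption, not taken from the slides.
    step = sample_rate * step_ms // 1000
    window = sample_rate * window_ms // 1000
    return [samples[i:i + window]
            for i in range(0, len(samples) - window + 1, step)]

def frame_coefficients(frame, n_coeffs=16):
    # Toy feature extractor: log magnitude of the first n_coeffs DFT bins.
    # A real system would use a proper spectral front end instead.
    n = len(frame)
    coeffs = []
    for k in range(1, n_coeffs + 1):
        bin_value = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(frame))
        coeffs.append(math.log(abs(bin_value) + 1e-10))
    return coeffs

sample_rate = 8000
# One second of a synthetic 440 Hz tone standing in for recorded speech.
signal = [math.sin(2 * math.pi * 440 * t / sample_rate)
          for t in range(sample_rate)]

frames = frame_signal(signal, sample_rate)
features = [frame_coefficients(f) for f in frames]
print(len(features), "frames,", len(features[0]), "coefficients each")
```

Each entry of features is one speech frame: a short vector of 16 numbers that replaces a much longer run of raw samples.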
Acoustic Models:
In order to analyze the speech frames for their
acoustic content, we need a set of acoustic models.
Acoustic model: template and state representation for the word “cat”
Acoustic analysis and frame score:
Acoustic analysis is performed by applying each
acoustic model to each frame of speech, yielding a
matrix of frame scores. Scores are computed
according to the type of acoustic model that is being
used.
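A minimal sketch of building such a score matrix, assuming the simplest kind of acoustic model: a single template vector per sound, scored by negative Euclidean distance. The sound labels "k", "ae", "t" for the word “cat”, the 2-D toy vectors, and the function names are hypothetical and chosen only to keep the example small.

```python
import math

def frame_score(model_vector, frame):
    # Negative Euclidean distance, so a higher score means a closer match.
    return -math.sqrt(sum((m - x) ** 2 for m, x in zip(model_vector, frame)))

def score_matrix(models, frames):
    # One row per acoustic model, one column per speech frame.
    return [[frame_score(vector, frame) for frame in frames]
            for vector in models.values()]

# Hypothetical template models for three sounds in "cat" (toy 2-D vectors).
models = {"k": [1.0, 0.0], "ae": [0.0, 1.0], "t": [1.0, 1.0]}
# Three toy speech frames, each close to one of the templates.
frames = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.9]]

matrix = score_matrix(models, frames)
names = list(models)
# For each frame, report which model scored highest.
best_per_frame = [max(names, key=lambda n: matrix[names.index(n)][t])
                  for t in range(len(frames))]
print(best_per_frame)
```

With a different model type (for example a statistical model per state), only frame_score would change; the matrix layout stays the same.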
The alignment path with the best total
score identifies the word sequence and
its segmentation.
Time Alignment:
The process of searching for the best
alignment path is called time alignment.
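One common way to search for the best alignment path is dynamic programming. The sketch below is an assumption, not necessarily the method the seminar has in mind: it aligns frames to a fixed left-to-right sequence of states, where each frame either stays in the current state or advances to the next one. The score values and the state labels are hypothetical, and the function assumes there are at least as many frames as states.

```python
def time_align(scores):
    # scores[s][t] is the score of state s on frame t.
    # Returns (best total score, path), where path[t] is the state of frame t.
    n_states, n_frames = len(scores), len(scores[0])
    neg_inf = float("-inf")
    best = [[neg_inf] * n_frames for _ in range(n_states)]
    back = [[0] * n_frames for _ in range(n_states)]
    best[0][0] = scores[0][0]  # the path must start in the first state
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[s][t - 1]
            advance = best[s - 1][t - 1] if s > 0 else neg_inf
            if advance > stay:
                best[s][t], back[s][t] = advance + scores[s][t], s - 1
            else:
                best[s][t], back[s][t] = stay + scores[s][t], s
    # Trace back from the last state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[path[-1]][t])
    path.reverse()
    return best[-1][-1], path

# Hypothetical frame scores for the states "k", "ae", "t" over five frames.
states = ["k", "ae", "t"]
scores = [
    [-0.1, -0.2, -1.5, -1.4, -1.6],  # k
    [-1.2, -1.0, -0.1, -0.3, -1.1],  # ae
    [-1.3, -1.4, -1.2, -0.9, -0.2],  # t
]
total, path = time_align(scores)
segmentation = [states[s] for s in path]
print(segmentation)
```

The returned path gives the segmentation: which frames belong to which state. Repeating the search over competing word models and keeping the best total score would give the recognized word.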
Word Sequence:
The end result of time alignment is a
word sequence: the sentence
hypothesis for the utterance.
Voice recognition software
• Julius
• Google Now
• Siri
• S Voice
• Iris (Intelligent Rival Imitator of Siri)
• Dragon NaturallySpeaking
• Windows Speech Recognition
Advantages:
• Talking is faster than typing.
• It can especially assist people who
have little keyboard skill or experience.
• It assists people who are slow typists or do not have
the time or resources to develop keyboard
skills.
• It helps people with physical disabilities that affect
either data entry or the ability to read what
they have entered.
Disadvantages:
• Privacy of recorded voice data.
• Errors and misinterpretation of words.
Conclusion: