
Topic - Text to Speech Software
Major Project Instructor: Dr. Philemon Daniel
Presented by:
• Tanuj Kumar 194115
• Ritesh Pushkar 194114
• Virendra Saini 194113
• Animesh Barnwal 194105
Text to speech software
September 26, 2022
Overview
Text-to-speech (TTS) is the capacity of a computer or device to read text aloud through software
such as Speechify. The benefits of this technology are wide-ranging. Research has shown that
text-to-speech technology improves accessibility, facilitates comprehension, and creates a more
efficient learning environment. It can be used to assist students, to serve business needs, to help
children, or simply for personal enjoyment. Individuals with disabilities such as dyslexia or
blindness benefit particularly from this speech software.
Understanding the problem

Single Box Text

The text in any single box is spoken continuously from top to bottom.

Individual boxes are spoken in the order they are added to the slide. There appears to be no way to
alter this, and the way to force a different order is to create a new slide and copy the text boxes
in the required order.
Animated Text

Spoken in the order that the animations play, waiting for each to complete before speaking the
next. However, if a number of animations are used within a single text box, PowerTalk waits for the
first animation only and then speaks all the remaining text. You may want to break the text into
separate boxes and animate each.
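The animation rule above can be modelled in a few lines of Python. This is only an illustrative sketch of the behaviour as described on this slide, not PowerTalk's code: the `TextBox` class and the `speech_events` function are hypothetical names, and the model assumes every paragraph in a box carries its own animation.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """Hypothetical model of a slide text box; each paragraph is assumed animated."""
    name: str
    paragraphs: list


def speech_events(boxes):
    """Return (animation waited for, text spoken) pairs in playback order."""
    events = []
    for box in boxes:
        if not box.paragraphs:
            continue
        # Only the first animation in a box is waited for; all of the
        # box's text is then spoken in one go.
        events.append((f"{box.name}: first animation", " ".join(box.paragraphs)))
    return events


one_box = [TextBox("body", ["First point.", "Second point.", "Third point."])]
split_boxes = [TextBox(f"box{i}", [text])
               for i, text in enumerate(one_box[0].paragraphs, start=1)]

print(len(speech_events(one_box)))      # one speech event for the whole box
print(len(speech_events(split_boxes)))  # one speech event per animated box
```

Under this model, splitting the text into separate boxes is what gives each animation its own speech event, which is the workaround suggested above.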
Text for Images

It is possible to specify the text spoken for images (and indeed any object, including text) by
entering Alternative Web Text for the image. This is done as follows: right click on the image,
select 'format object' ('size and shape'), select the Web tab and enter the alternative text to be
spoken. If you enter a space then nothing is spoken at all.
Natural Language Processing (NLP) Module

• It produces a phonetic transcription of the text read, together with prosody.
• Major operations of the NLP module
Text Analysis: First the text is segmented into tokens. The token-to-word conversion then creates
the orthographic form of each token. For the token “Mr”, the orthographic form is “Mister”.

Application of Pronunciation Rules: After the text analysis has been completed, pronunciation rules
can be applied. Letters cannot be transformed 1:1 into phonemes because the correspondence is not
always parallel.

Prosody Generation: After the pronunciation has been determined, the prosody is generated. The
degree of naturalness of a TTS system depends on prosodic factors such as intonation modelling,
amplitude modelling and duration modelling.
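The three operations can be sketched as a toy Python front end. This is a minimal illustration under stated assumptions, not the engine used in this project: the abbreviation table, the letter-to-sound rules, the phoneme labels and the pitch and duration numbers are all made up for the example, and amplitude modelling is left out.

```python
import re

# Hypothetical abbreviation table for token-to-word conversion.
ABBREVIATIONS = {"mr": "mister", "dr": "doctor"}

# Toy letter-to-sound rules, longest grapheme first. Multi-letter graphemes
# and "x" show why letters do not map 1:1 onto phonemes.
LETTER_TO_SOUND = [
    ("tch", ["CH"]), ("sh", ["SH"]), ("ph", ["F"]),
    ("th", ["TH"]), ("ck", ["K"]), ("x", ["K", "S"]),
]
VOWELS = set("AEIOU")


def tokenize(text):
    """Text analysis, step 1: segment text into word and punctuation tokens."""
    return re.findall(r"[A-Za-z]+|[.?!]", text)


def to_words(tokens):
    """Text analysis, step 2: expand tokens to their orthographic form."""
    words, after_abbreviation = [], False
    for tok in tokens:
        if after_abbreviation and tok == ".":
            after_abbreviation = False  # the period belongs to the abbreviation
            continue
        expanded = ABBREVIATIONS.get(tok.lower())
        after_abbreviation = expanded is not None
        words.append(expanded or tok.lower())
    return words


def to_phonemes(word):
    """Apply the letter-to-sound rules; unknown letters stand for themselves."""
    phonemes, i = [], 0
    while i < len(word):
        for grapheme, sounds in LETTER_TO_SOUND:
            if word.startswith(grapheme, i):
                phonemes.extend(sounds)
                i += len(grapheme)
                break
        else:
            phonemes.append(word[i].upper())
            i += 1
    return phonemes


def add_prosody(phonemes, is_question):
    """Attach toy duration (ms) and pitch (Hz) targets to each phoneme."""
    last = max(len(phonemes) - 1, 1)
    slope = 30 if is_question else -30  # rising contour for questions
    return [(p, 120 if p in VOWELS else 80, round(120 + slope * k / last))
            for k, p in enumerate(phonemes)]


def front_end(text):
    """Run text analysis, pronunciation rules and prosody generation in order."""
    tokens = tokenize(text)
    words = [w for w in to_words(tokens) if w.isalpha()]
    phonemes = [p for w in words for p in to_phonemes(w)]
    return add_prosody(phonemes, is_question="?" in tokens)


print(to_words(tokenize("Mr. Fox")))
print(to_phonemes("box"), to_phonemes("ship"))
```

In this sketch "box" has three letters but four phonemes, while "ship" has four letters but three, which is the kind of mismatch the pronunciation rules have to handle.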
The output of the NLP module is passed to the DSP module. This is where the actual synthesis of the
speech signal happens.

In concatenative synthesis, the selection and linking of speech segments take place. For each
individual sound, the best option (where several appropriate options are available) is selected
from a database, and the selected segments are concatenated.
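The selection and linking step can be sketched as follows. This is a simplified illustration, not the DSP module of any particular engine: the `Unit` class, the pitch-based cost and the tiny database are assumptions made for the example, and the greedy left-to-right choice stands in for the more thorough search a real synthesizer would typically use.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical database entry: one recorded speech segment."""
    phoneme: str
    pitch: float   # average pitch of the recording, in Hz
    samples: list  # waveform samples (toy values here)


def select_units(targets, database):
    """For each (phoneme, target pitch), greedily pick the cheapest candidate."""
    chosen = []
    for phoneme, target_pitch in targets:
        candidates = [u for u in database if u.phoneme == phoneme]
        if not candidates:
            raise KeyError(f"no recorded unit for {phoneme!r}")

        def cost(unit):
            # Target cost: distance from the pitch requested by the prosody.
            target_cost = abs(unit.pitch - target_pitch)
            # Join cost: pitch jump relative to the previously chosen unit.
            join_cost = abs(unit.pitch - chosen[-1].pitch) if chosen else 0.0
            return target_cost + join_cost

        chosen.append(min(candidates, key=cost))
    return chosen


def concatenate(units):
    """Link the selected segments into one sample sequence."""
    signal = []
    for unit in units:
        signal.extend(unit.samples)
    return signal


database = [
    Unit("HH", 100, [0.1]), Unit("HH", 140, [0.2]),
    Unit("AY", 110, [0.3, 0.3]), Unit("AY", 150, [0.4, 0.4]),
]
units = select_units([("HH", 135), ("AY", 140)], database)
print([u.pitch for u in units], concatenate(units))
```

Combining a target cost with a join cost is one common way to define "best option"; the specific weights and features here are placeholders.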
[Figure: The DSP component of a general concatenation-based synthesizer.]
[Figure: Operations of the natural language processing module of a TTS synthesizer.]
[Figure: TTSR interface when a text document is loaded into it.]
CONCLUSION
Text-to-speech synthesis is a rapidly growing area of computer technology and is playing an
increasingly important role in the way we interact with systems and interfaces across a variety of
platforms. We have identified the various operations and processes involved in text-to-speech
synthesis. We have also developed a very simple and attractive graphical user interface which
allows the user to type text into the text field provided in the application. Our system interfaces
with a text-to-speech engine developed for American English. In future, we plan to create engines
for localized languages so as to make text-to-speech technology accessible to a wider range of
people.
