
1. Business Case

1.1 Introduction

Tamil, along with Latin, Greek and Sanskrit, is one of the classical languages
of the world, with a history of over 2,500 years. Tamil is perceived to have
origins independent of Sanskrit and the Sanskrit-based Indian languages. This is
apparent from the absence in Tamil of a large set of syllables found in Sanskrit
and other Indian languages. Tamil is an official language in the Indian state of
Tamil Nadu, in Sri Lanka and in Singapore, and an estimated 70 million people
speak it as their first language. The motivation for a speech to text
application in Tamil is the preservation of linguistic heritage, and the
application is envisaged for use in education, particularly in pronunciation
training.

1.2 Problem

Speech recognition and speech to text form an exciting field. However, there
are practical challenges that designers need to take into account when
designing and developing a speech to text application in Tamil. The issues
identified in studies such as Fuller & Narasimhan (2014), Ramachandran (2018),
Nirukshi (2015) and Shanmugam (2015) point not only to the need for an
indigenous method of developing a software application but also to language
maintenance, code switching and code mixing (Shulman 2016, Schiffman 2002)
among native Tamil speakers, both in the native region and in the diaspora.
Ramachandran (2018) recommends an indigenous approach to designing, developing
and evaluating the user acceptance of language-based technology such as speech
to text. Ogunshile & Ramachandran (2019) discuss various approaches to
designing and building the application.

2. Solution

2.1 Aim

To design and develop a speech to text application in Tamil using the
conceptual framework of "what you speak is what you get".

2.2 Objectives

• To understand the structure of the Tamil language and the background of
  native Tamil speakers.
• To evaluate the feasibility and applicability in this context of the speech
  recognition and speech to text models and techniques that have been used in
  other languages.
• To create a very small speech corpus and use it in the development of the
  prototype and application (see Appendix).
• To create a prototype with a limited number of syllables and words (see
  Appendix).
• To create an application that recognizes a Tamil word and converts it into
  Tamil orthography as spoken by the user.
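
A prototype restricted to a limited number of syllables needs some way to break a written Tamil word into units. The following is a minimal sketch, not part of the proposal itself. It assumes Python, and it assumes that a base letter together with any dependent vowel sign or pulli counts as one unit. The function name split_units is hypothetical. The grouping is purely orthographic, based on Unicode character categories, and is not a phonological syllabifier.

```python
from __future__ import annotations

import unicodedata


def split_units(word: str) -> list[str]:
    """Group each base letter with the combining marks that follow it."""
    units: list[str] = []
    for ch in word:
        # Tamil vowel signs and the pulli have Unicode categories Mc or Mn,
        # so any character whose category starts with "M" joins the previous unit.
        if units and unicodedata.category(ch).startswith("M"):
            units[-1] += ch
        else:
            units.append(ch)
    return units


if __name__ == "__main__":
    # Example words taken from the Appendix word list.
    for word in ["அலை", "கல்லு", "கிளி"]:
        print(word, split_units(word))
```

A real recognizer would probably need a phonologically motivated unit inventory. This grouping only shows how the written forms in the Appendix could be counted and limited.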

2.3 Stakeholder

The project sponsor is the primary stakeholder.

References

Kadakara, S. 2015, "Status of Tamil language in Singapore: An analysis of
family domain", Education Research and Perspectives, vol. 42, no. 2015, pp.
25-64.

Perera, N. 2015, "The maintenance of Sri Lankan languages in Australia -
comparing the experience of the Sinhalese and Tamils in the homeland",
Journal of Multilingual and Multicultural Development, vol. 36, no. 3, pp.
297-312.

Schiffman, H.F. 2002, "Malaysian Tamils and Tamil linguistic culture",
Language and Communication, vol. 22, no. 2, pp. 159-169.

Shulman, D. 2016, Tamil, Harvard University Press.

Thangarajan, R., Natarajan, A.M. & Selvam, M. 2009, "Syllable modeling in
continuous speech recognition for Tamil language", International Journal of
Speech Technology, vol. 12, no. 1, pp. 47-57.

Appendix
Tamil orthography    Roman orthography
அலை    Alai
முகநூல்    Muganool
அலை (அழை)    Alai (Azhai)
அளைப்பிதல் (அழைப்பிதழ்)    ALaippithal (azhaippithazh)
விளுப்புரம் (விழுப்புரம்)    ViLuppuram (Vizhuppuram)
கல்லு    Kallu
கள்ளு    KaLLu
பிளை (பிழை)    PiLai (pizhai)
வாலு    Valu
வால்க்கை (வாழ்க்கை)    vaalkkai (vaazhkkai)
வன்னம் (வண்ணம்)    vannam (vaNNam)
விலக்கு (விளக்கு)    vilakku (viLakku)
பலனி (பழனி)    Palani (Pazhani)
மாட்டு    Maattu
மீட்டு    Meettu
காலை    Kaalai
தொழிலாளி    ThozhilaaLi
கலை (களை)    Kalai (kaLai)
முட்டு    Muttu
கிளை    KiLai
குட்டு    Kuttu
கிளி (கிழி)    KiLi (Kizhi)
மேட்டு    Maettu
தொகுப்பாலர் (தொகுப்பாளர்)    Thoguppaalar (ThoguppaaLar)
எண்ணை    ENNai
ஏற்றுக்கொள்    EttrukkoL
எளுதுகோள் (எழுதுகோல்)    ELuthugoL (Ezhuthugol)
மனப்பான்மை    Manappanmai
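
To use the word list above as a small corpus, each row would need to be loaded into a structured record. The following is a hedged sketch in Python. The names Entry, ROW and parse_row are hypothetical. It assumes each row holds a Tamil form, an optional bracketed Tamil form, a Roman form and an optional bracketed Roman form. The source does not state what the bracketed forms denote, so the sketch stores them without interpreting them.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional

# Tamil form, optional bracketed Tamil form, Roman form, optional bracketed Roman form.
ROW = re.compile(r"^(\S+)\s*(?:\((\S+?)\s*\))?\s+(\S+)(?:\s+\((\S+?)\s*\))?$")


@dataclass(frozen=True)
class Entry:
    tamil: str
    roman: str
    tamil_bracketed: Optional[str] = None  # meaning not stated in the source
    roman_bracketed: Optional[str] = None  # meaning not stated in the source


def parse_row(line: str) -> Entry:
    """Parse one row of the word list into an Entry."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    tamil, tamil_alt, roman, roman_alt = match.groups()
    return Entry(tamil, roman, tamil_alt, roman_alt)


if __name__ == "__main__":
    print(parse_row("கல்லு Kallu"))
    print(parse_row("பிளை (பிழை) PiLai (pizhai)"))
```

In an actual corpus each record would presumably also point to one or more audio recordings. The word list does not describe any, so that field is left out here.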
