Abstract
This project focuses on developing a state-of-the-art, end-to-end Amharic speech recognition
system.
Semawit Araya
Semawitaraya@gmail.com
Aug 30 2021
Contents
Introduction
Problem statement
Objectives
General objective
Specific objective
Material and Methods/Methodology
    Dataset
    Hardware and software requirement
    Methodology
Expected Benefits
Project Team Members’ Biography
Key Stakeholders
Timeline
Project Risks
    Estimated Budget
References
Introduction
Afro-Asiatic is one of the major language families, widely spoken in north and west Africa.
The Semitic languages belong to the Afro-Asiatic family, and Amharic is the second most widely
spoken Semitic language after Arabic. Amharic is also an official language of Ethiopia, spoken
by over 22 million people according to the Central Statistical Agency of Ethiopia. It is written
in its own script, Amharic-Fidel, which is built from 32 consonants and 7 vowels. The script is
shared with Tigrinya, another Semitic language of Ethiopia (and the main language of Eritrea).
Amharic also shares several linguistic features, including morphological structure and
vocabulary, with Arabic.
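The consonant-and-vowel structure of the Fidel script described above can be illustrated with a short sketch. It assumes the layout of the basic Unicode Ethiopic block, where each regular consonant occupies a row of 8 consecutive code points and the first 7 slots are the seven vowel orders. The function name `split_fidel` is hypothetical, and rows with irregular layouts (for example the labialized row starting at ቈ) are not handled correctly by this simplification.

```python
# Assumption: input is a single character from a regular consonant row of the
# Unicode Ethiopic block. Names and bounds below are illustrative only.
ETHIOPIC_START = 0x1200      # code point of ሀ, first order of the first row
LAST_ROW_END = 0x1357        # assumed end of the last regular syllable row (ፐ row)
ROW_WIDTH = 8                # code points reserved per consonant row
VOWEL_ORDERS = 7             # the 8th slot is a labialized form or unassigned


def split_fidel(ch):
    """Return (first-order form of the consonant, vowel order 1..7)."""
    code = ord(ch)
    if not ETHIOPIC_START <= code <= LAST_ROW_END:
        raise ValueError(f"{ch!r} is outside the assumed syllable range")
    row, slot = divmod(code - ETHIOPIC_START, ROW_WIDTH)
    if slot >= VOWEL_ORDERS:
        raise ValueError(f"{ch!r} is not one of the seven vowel orders")
    base = chr(ETHIOPIC_START + row * ROW_WIDTH)
    return base, slot + 1


if __name__ == "__main__":
    # Decompose each character of the word ሰላም into consonant row and order.
    for ch in "ሰላም":
        print(ch, split_fidel(ch))
```

A decomposition like this is one possible way to choose between whole Fidel characters and smaller consonant and vowel units as output symbols; the proposal itself does not specify which units are used.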
Although a large volume of Amharic content is available on the web, searching and
retrieving it is hard because it exists only in raw form, neither analyzed nor well
indexed. Building language-specific tools that analyze and index this content could therefore
make Amharic web content more accessible. In particular, automatic speech recognition (ASR)
greatly improves the searchability of audio and video content because it provides speech
transcriptions (Mezaris et al., 2010).
Existing Amharic ASR prototypes do not appear to have been used for other common
speech-oriented tasks such as language learning (Farzad and Eva, 1998), nor to solve practical
problems by being integrated into larger natural language processing systems such as
machine translation. This is mainly because training ASR models requires a fairly large amount
of annotated data and resources (e.g., speech transcriptions, language models, lexicons) of
sufficient quality.
Problem statement
The main goal of automatic speech recognition (ASR) is to build a system that can
simulate the human listener: one that can “understand” spoken language and respond. This
means the system can react appropriately to spoken words and convert the speech into
another medium such as text.
Objectives
General objective
The general objective of the project is to develop an end-to-end Amharic speech recognition
system that transcribes speech with high accuracy.
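The proposal does not name a metric for recognition quality. Word error rate (WER) is a common choice for ASR systems, so the following is a minimal sketch of how it could be computed, assuming whitespace tokenization and equal cost for substitutions, insertions, and deletions. The function name `word_error_rate` and the placeholder transcripts are illustrative and not part of the proposal.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder transcripts; real use would compare Amharic reference text
    # against the recognizer output.
    print(word_error_rate("a b c d", "a x c"))
```

Lower values are better, and a value of 0.0 means the hypothesis matches the reference word for word. For Amharic, a character-level variant may also be worth reporting, though that choice is an assumption rather than something the proposal states.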
Specific objective
Develop a good-quality dataset for Amharic speech recognition.
Methodology
Expected Benefits
Increase productivity by enabling a person to use his or her hands and mouth for different tasks, making hands-free work possible.
Give a rapid return on investment where ASR systems are applied to speed up tasks.
Give access to new markets (24-hour services).
Allow environment control for people with disabilities.
Enable us to dictate to our computers.
Automatically translate spoken languages.
Enable us to communicate with remote computers (telebanking, expert systems, database queries, information retrieval, etc.).
Make voice-controlled equipment possible.
Automatically receive service requests in different organizations, primarily telecommunications.
Assist people with disabilities.
Support teaching oral skills.
Develop a multimedia computer-assisted instruction system.
Analyze speech evidence for the police and the courts.
Project Team Members’ Biography
No.    Name    Profile
Key Stakeholders
Timeline
Sep Oct Nov Dec Jan Feb Mar Apr May Jun Jul Aug
Project Risks
Estimated Budget
References