Published by: api-3761762 on Oct 16, 2008
Copyright:Attribution Non-commercial

PAPER ON ENGLISH TO HINDI - TRANSLATION SYSTEM
By:
Prof. Lata Gore.
Prof. Nishigandha Patil
M.E.(Comp. Science & Engg)
B.E. (Comp. Science & Engg)

Dr.Babasaheb Ambedkar Marathwada University
Jawaharlal Nehru Engineering College
N-6, CIDCO, Aurangabad
431003-M.S, India

ABSTRACT

The goal of Natural Language Processing (NLP) is to design and build a computer system that will analyze, understand and generate the languages that humans use naturally, so that eventually you can address your computer as though you were addressing another person.

Machine Translation (MT) is one of the major areas of NLP. The objective of MT is to recognize the content of a document in order to render it in another language. We have designed and developed an English – Hindi translation system with special reference to the weather narration domain. The two objectives of this MT system are:

(a) To recognize the content in its source language.
(b) To generate a command to give the same content in the target language.

For recognition of the content of the source document, our MT system takes into account the structural properties of the source language. In the generation process, our system is able to resolve some of the ambiguities in the source text and produce the same content in the target language.

1. INTRODUCTION.

Due to the inherent ambiguities of natural language, making a computer translate natural language is a complex task, so we have to restrict it to a specific domain. We have selected the weather narration domain, and have designed and developed an English – Hindi translation system with special reference to it.

Translation from English to Hindi, i.e. from a foreign language to a regional language, poses many problems. Any natural language is a free language, i.e. its structure is not fixed; the structure can change as the user wishes. Hence a good translation system has to handle as many grammar constructs as possible.

Thus our purpose is to develop a Translation System that can translate English text into Hindi, with a special reference to “Weather Narration”.
2. BODY OF PAPER
Evaluation and synthesis

The problems faced in translation from English to Hindi are immense, and a lot of effort is being put into solving them. On our part, we have tried to handle the different ambiguities and other problems as far as possible. Considering the problems that may arise, we decided to store each word along with its attributes, or the attributes it governs, such as gender, number, person, etc.

The translation is text based, i.e. the input is a text file given by the user, while the output is a text file generated by the program, which can be saved as per the user's requirements. The input is processed sentence by sentence, and each sentence is converted into a linked list of words. A linked list turns out to be economical for the shifting of words required in the translation process. Disambiguation is carried out step by step to get the final correct output.
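The linked-list representation can be illustrated with a minimal Python sketch. The names WordNode, build_list, move_to_end and to_words are hypothetical and do not come from the paper; the sketch only shows why shifting a word is cheap in a linked list (a few pointer updates, no copying of the remaining words).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordNode:
    """One word of the sentence; `next` points to the following word."""
    word: str
    next: Optional["WordNode"] = None


def build_list(sentence: str) -> Optional[WordNode]:
    """Split a sentence on whitespace and chain the words into a linked list."""
    head = tail = None
    for token in sentence.split():
        node = WordNode(token)
        if head is None:
            head = tail = node
        else:
            tail.next = node
            tail = node
    return head


def move_to_end(head: Optional[WordNode], word: str) -> Optional[WordNode]:
    """Unlink the first node holding `word` and re-attach it at the tail."""
    prev, cur = None, head
    while cur is not None and cur.word != word:
        prev, cur = cur, cur.next
    if cur is None or cur.next is None:
        return head  # word not found, or it is already the last word
    # Unlink the node from its current position.
    if prev is None:
        head = cur.next
    else:
        prev.next = cur.next
    # Walk to the tail and append the node there.
    tail = head
    while tail.next is not None:
        tail = tail.next
    tail.next = cur
    cur.next = None
    return head


def to_words(head: Optional[WordNode]) -> List[str]:
    """Collect the words of the list in order."""
    words = []
    while head is not None:
        words.append(head.word)
        head = head.next
    return words
```

For example, move_to_end(build_list("rain will fall tomorrow"), "will") relinks the list so that "will" becomes the last word.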

To translate any text, it is first important to determine the nouns, verbs, and phrases in the sentence. This information is obtained from the database. Further, it is important to determine the subject and object of the sentence, as later steps need this information. It can be found by checking the positions of the verbs, etc. in the sentence.
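As a rough sketch of this step, the Python fragment below tags each word from a small in-memory lexicon and then picks the subject and object by their position relative to the verb. The lexicon contents, the tag names and the heuristic itself (last noun before the verb, first noun after it) are assumptions made for illustration; the paper does not spell out its exact procedure.

```python
# Hypothetical mini-lexicon; the paper keeps this information in its database.
LEXICON = {
    "the": "D",
    "wind": "N",
    "brings": "V",
    "rain": "N",
}


def tag(words):
    """Pair each word with its category, or "UNK" if it is not in the lexicon."""
    return [(w, LEXICON.get(w.lower(), "UNK")) for w in words]


def find_roles(tagged):
    """Return (subject, verb, object) using a simple position heuristic.

    Subject: the last noun before the first verb.
    Object:  the first noun after that verb.
    Any role that cannot be found is returned as None.
    """
    verb_idx = next((i for i, (_, t) in enumerate(tagged) if t == "V"), None)
    if verb_idx is None:
        return None, None, None
    subject = next((w for w, t in reversed(tagged[:verb_idx]) if t == "N"), None)
    obj = next((w for w, t in tagged[verb_idx + 1:] if t == "N"), None)
    return subject, tagged[verb_idx][0], obj
```

For the sentence "The wind brings rain", find_roles(tag("The wind brings rain".split())) yields "wind" as subject, "brings" as verb and "rain" as object.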

Grammar

In order to understand the syntactic structure of a sentence one must know two things: the grammar, which is a formal specification of the structures allowable in the language, and the parsing technique, which is the method of analyzing a sentence to determine its structure according to the grammar.

In our translation system a context-free grammar (CFG) is used.
Example of a CFG:
S -> NP VP
VP -> V NP
NP -> N
NP -> D NP
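The example grammar can be written down as data and used to build a parse tree. The sketch below is only an illustration: it uses a plain top-down, backtracking parser over part-of-speech tags, chosen for brevity, and the names GRAMMAR, TERMINALS, derive, derive_sequence and parse_sentence are hypothetical.

```python
# The four example rules, keyed by left-hand side.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["N"], ["D", "NP"]],
}
TERMINALS = {"N", "V", "D"}  # part-of-speech tags


def derive(symbol, tags, pos):
    """Yield (tree, next_pos) for every way `symbol` matches tags from `pos`."""
    if symbol in TERMINALS:
        if pos < len(tags) and tags[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        for children, end in derive_sequence(rhs, tags, pos):
            yield (symbol, children), end


def derive_sequence(rhs, tags, pos):
    """Yield (children, next_pos) for matching the symbols of `rhs` in order."""
    if not rhs:
        yield [], pos
        return
    for child, mid in derive(rhs[0], tags, pos):
        for rest, end in derive_sequence(rhs[1:], tags, mid):
            yield [child] + rest, end


def parse_sentence(tags):
    """Return the first tree for S that covers all tags, or None."""
    for tree, end in derive("S", tags, 0):
        if end == len(tags):
            return tree
    return None
```

A tag sequence such as ["D", "N", "V", "N"] (for instance "the wind brings rain") is accepted and yields a tree rooted at S, while ["V", "N"] is rejected because no NP precedes the verb.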

Based on such rules we have formalized the grammar rules for English as well as Hindi, which are stored in the database. If the developer wants to change the grammar rules, he can directly update the database without modifying the source code.

The parser uses the English grammar rules to determine whether the syntax of the input sentence is correct. If it is correct, an English tree is generated. The parsing technique used here is bottom-up parsing with top-down filtering. To generate the tree for the corresponding translated Hindi sentence, the English grammar rules are mapped to Hindi grammar rules.
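The mapping from an English tree to a Hindi tree can be sketched as a table that gives, for each English rule, the order of its children on the Hindi side. The table below is an assumption for illustration: it encodes only the well-known fact that Hindi places the verb after its object, and the names RULE_MAP, to_hindi_order and leaves are hypothetical. Leaves still carry English words here; word translation is a separate lookup.

```python
# For each English rule (lhs, rhs), the order of its children on the Hindi side.
RULE_MAP = {
    ("S", ("NP", "VP")): (0, 1),
    ("VP", ("V", "NP")): (1, 0),  # the verb moves after its object
    ("NP", ("N",)): (0,),
    ("NP", ("D", "NP")): (0, 1),
}


def to_hindi_order(node):
    """Reorder the children of every phrase according to RULE_MAP.

    A leaf is (tag, word) with `word` a string; a phrase is (label, [children]).
    Rules missing from RULE_MAP keep their English order.
    """
    label, body = node
    if isinstance(body, str):
        return node
    rhs = tuple(child[0] for child in body)
    order = RULE_MAP.get((label, rhs), tuple(range(len(body))))
    return (label, [to_hindi_order(body[i]) for i in order])


def leaves(node):
    """Return the words of a tree from left to right."""
    label, body = node
    if isinstance(body, str):
        return [body]
    return [word for child in body for word in leaves(child)]


english_tree = (
    "S",
    [
        ("NP", [("D", "the"), ("NP", [("N", "wind")])]),
        ("VP", [("V", "brings"), ("NP", [("N", "rain")])]),
    ],
)
hindi_tree = to_hindi_order(english_tree)
```

Reading the leaves of hindi_tree gives the word order "the wind rain brings", i.e. subject, object, verb.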

Analysis and Modeling
[Figure: data flow diagrams of the translation system. Labels recovered from the figure:
Level 0: English Text; 1 Translation System; Hindi Text; Text Editor; Updation Module; Database (fetching words from database).
Level 1: English Text; 1.1 Text Processing; 1.2 Translation according to Hindi rules; Hindi Text; Text Editor; Updation Module; Database (fetching words from database).
Level 2: English Text; Hindi Text; Updation Module; Database.]
File and Database Structure
The database for the project is designed in Microsoft Access 97. As the translated output is in Devanagari script, use of MS Access was felt necessary. The database is divided into four parts:
1) MAP-TABLE: This table stores all the English words and their corresponding Hindi translations, along with their features. All the words are stored here. This is the main table of our database.
2) VERB-TABLE: This table stores all the verbs, with the root Hindi word and the type corresponding to each verb.
3) AUXILLARY-TABLE: This table stores all the possible combinations of the auxiliaries occurring in English, along with their corresponding Hindi translations and their types. There are in total 8 types of verbs. The auxiliary combinations of all these types are stored here.
4) RULES-TABLE: This table stores all the formalized English grammar rules and the corresponding Hindi grammar rules.
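A possible layout of these four tables is sketched below. It uses Python's built-in sqlite3 module as a stand-in for Microsoft Access, and all column names are assumptions chosen to match the descriptions above; the paper does not give the actual schema. The helper hindi_rule_for is likewise hypothetical.

```python
import sqlite3

# Assumed column layout; only the table purposes come from the paper.
SCHEMA = """
CREATE TABLE map_table (
    english  TEXT PRIMARY KEY,   -- English word
    hindi    TEXT NOT NULL,      -- Hindi translation in Devanagari
    features TEXT                -- e.g. category, gender, number, person
);
CREATE TABLE verb_table (
    english_verb TEXT PRIMARY KEY,
    hindi_root   TEXT NOT NULL,  -- root form of the Hindi verb
    verb_type    INTEGER         -- one of the 8 verb types
);
CREATE TABLE auxiliary_table (
    english_aux TEXT NOT NULL,   -- auxiliary combination in English
    hindi_aux   TEXT NOT NULL,   -- corresponding Hindi form
    verb_type   INTEGER
);
CREATE TABLE rules_table (
    english_rule TEXT PRIMARY KEY,
    hindi_rule   TEXT NOT NULL
);
"""


def create_database():
    """Create an in-memory database with the four tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def hindi_rule_for(conn, english_rule):
    """Look up the Hindi rule mapped to an English rule, or None."""
    row = conn.execute(
        "SELECT hindi_rule FROM rules_table WHERE english_rule = ?",
        (english_rule,),
    ).fetchone()
    return row[0] if row else None
```

Because the rules live in a table, adding or changing a rule is an INSERT or UPDATE on rules_table, for example mapping "VP -> V NP" to "VP -> NP V", with no change to the program code.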
Software Modules
The translation system consists of two main modules:
1. Translation
2. Updation
[Figure: Level 2 data flow diagram. Labels recovered from the figure: Text Editor; 1.1.1 Text acceptance; 1.1.2 Tokenization; 1.1.3 Information Retrieval; 1.2.1 English Tree generator; 1.2.2 Hindi Tree generator.]
