Fig. 3. Word Representation

Similarity is a very useful concept in the case of rare words. Suppose we have an alignment A for a pair (q, p), where q belongs to the source sentence S and p belongs to the target sentence T. To find the target-language counterpart of a rare word r, we first look up the nearest/most similar word to r in S, and then find the most suitable word s in the target T such that a new alignment A' = (r, s) is generated; here, s is the required word in the target language [22].
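As a rough sketch of this nearest-word lookup, the snippet below picks the most similar in-vocabulary source word by cosine similarity over word vectors and reuses its known alignment. The embedding table, alignment dictionary, and all names are hypothetical illustrations, not the paper's implementation.

import numpy as np

def cosine(u, v):
    # Cosine similarity between two word vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def translate_rare_word(rare, embeddings, alignment):
    """Propose a target word s for a rare source word r: pick the most
    similar known source word q, then reuse its aligned target word.
    (Hypothetical helper, for illustration only.)"""
    if rare not in embeddings:
        return None
    best_q, best_sim = None, -1.0
    for q in alignment:                     # source words with known alignments
        if q == rare or q not in embeddings:
            continue
        sim = cosine(embeddings[rare], embeddings[q])
        if sim > best_sim:
            best_q, best_sim = q, sim
    return alignment.get(best_q)            # s: proposed target word

# Toy usage with made-up vectors and alignments:
emb = {"home": np.array([0.9, 0.1]), "house": np.array([0.85, 0.2]),
       "dog": np.array([0.1, 0.9])}
align = {"house": "घर", "dog": "कुत्ता"}
print(translate_rare_word("home", emb, align))   # -> "घर"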
RNN implementations for the word alignment task not only learn bilingual word embeddings but also acquire the similarity between words and make very effective use of wide contextual information.
B. Rule Selection and Reordering

Once the alignment process is done, the translation process moves to the rule selection/extraction phase. Here, rules are selected/extracted on the basis of the word alignment, and a reordering model is then trained on the word-aligned bilingual text. Choosing the right target phrase/word is a problem because of language sparseness, since a source sentence may have several different meanings. If we have a rule R → (S1, …, Sa; T1, …, Tb), it is first mapped to a vector representation and a similarity score is then calculated to select the most suitable rule.

An FNN can be used to optimize this score, which leads towards a better translation, but a bilingually constrained recursive auto-encoder outperforms it in this task because it tries to minimize both the reconstruction error and the semantic distance. The recursive auto-encoder is trained with reordering examples that are already generated from word-aligned bilingual sentences, and the RAE is capable enough to capture knowledge of a phrase's word-order information [5]. The RAE composes two child vectors k1 and k2 into a parent vector p, and its training minimizes the reconstruction error E_rec:

p = f(W^(1)[k1; k2] + b^(1))    (6)

E_rec(s; θ) = ΣΣ […]    (9)
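A minimal numpy sketch of the composition in Eq. (6) and of a per-node reconstruction error in the spirit of Eq. (9). Since the summand of Eq. (9) is not shown above, the squared-error form used here is an assumption, as are all weight shapes and names.

import numpy as np

rng = np.random.default_rng(0)
d = 4                                  # embedding dimension (illustrative)
W1 = rng.normal(size=(d, 2 * d))       # composition weights, Eq. (6)
b1 = np.zeros(d)
W2 = rng.normal(size=(2 * d, d))       # reconstruction weights (assumed)
b2 = np.zeros(2 * d)

def compose(k1, k2):
    # Eq. (6): parent vector p = f(W^(1)[k1; k2] + b^(1)), f = tanh here.
    return np.tanh(W1 @ np.concatenate([k1, k2]) + b1)

def reconstruction_error(k1, k2):
    # One node's contribution to Eq. (9), assuming the usual squared error
    # between the children and their reconstruction from the parent.
    p = compose(k1, k2)
    rec = np.tanh(W2 @ p + b2)
    children = np.concatenate([k1, k2])
    return 0.5 * np.sum((children - rec) ** 2)

k1, k2 = rng.normal(size=d), rng.normal(size=d)
print(reconstruction_error(k1, k2))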
The next step is to reorder and predict the structure of the sentence. A combination of a recursive neural network and a recurrent neural network (R2NN) is a good way to execute this. The two main concerns here are: 1) which two candidates are composed first, and 2) in which order they are composed. To work with a tree structure, a recursive neural network is the best choice, but if we use an RNN along with it, the two integrate their capabilities: the RNN maintains the history that is useful for language modelling, while the recursive neural network generates the tree structure bottom-up. The binary tree structure for a source sentence is shown in Fig. 4.
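To make the first concern concrete, here is a greedy bottom-up sketch: at every step, the adjacent pair with the highest composition score is merged, which decides which two candidates are composed first. This only illustrates the idea; it omits the recurrent-history part of R2NN, and the weights and scoring vector are made up.

import numpy as np

rng = np.random.default_rng(2)
d = 4
W = rng.normal(size=(d, 2 * d))        # composition weights (illustrative)
v = rng.normal(size=d)                 # scoring vector (illustrative)

def compose(a, b):
    # Merge two child vectors into one parent vector.
    return np.tanh(W @ np.concatenate([a, b]))

def build_tree(leaves):
    """Greedy bottom-up binary tree: repeatedly compose the adjacent
    pair with the highest score until one root vector remains."""
    nodes = list(leaves)
    while len(nodes) > 1:
        scores = [float(v @ compose(nodes[i], nodes[i + 1]))
                  for i in range(len(nodes) - 1)]
        i = int(np.argmax(scores))                 # best adjacent pair
        nodes[i:i + 2] = [compose(nodes[i], nodes[i + 1])]
    return nodes[0]                                # root vector

sentence = [rng.normal(size=d) for _ in range(5)]  # 5 toy word vectors
print(build_tree(sentence).shape)                  # (4,)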
C. Language Modelling

An FNN can be used to learn this model in continuous space. In this model, a concatenation of word vectors is fed to the input and hidden layers to find the probability of T_n given T_1, …, T_{n-1} [5].

A recurrent neural network can also be designed for language modelling because it performs very well in sequence-to-sequence learning tasks. Here we give a sequence of inputs (s_1, …, s_n) and, on the basis of this sequence, it predicts a sequence of outputs (t_1, …, t_{n'}). Input vectors enter the network one by one, are concatenated with the previous history at the hidden layer, and an output is calculated at each step [9]. The RNN computation can be explained by the following equations:

h_n = sigm(W^hs s_n + W^hh h_{n-1})    (3)

t_n = W^th h_n    (4)
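The two equations map directly to code. The sketch below runs one step per input vector; the dimensions, the random weights, and the choice of the logistic function for sigm are illustrative assumptions.

import numpy as np

def sigm(x):
    # Logistic sigmoid, the nonlinearity named in Eq. (3).
    return 1.0 / (1.0 + np.exp(-x))

def rnn_step(s_n, h_prev, Whs, Whh, Wth):
    # Eq. (3): new hidden state from the current input and previous history.
    h_n = sigm(Whs @ s_n + Whh @ h_prev)
    # Eq. (4): output (e.g., unnormalized word scores) from the hidden state.
    t_n = Wth @ h_n
    return h_n, t_n

# Toy dimensions: input 3, hidden 5, output vocabulary 7 (illustrative).
rng = np.random.default_rng(1)
Whs = rng.normal(size=(5, 3))
Whh = rng.normal(size=(5, 5))
Wth = rng.normal(size=(7, 5))
h = np.zeros(5)
for s in rng.normal(size=(4, 3)):      # run over a 4-step input sequence
    h, t = rnn_step(s, h, Whs, Whh, Wth)
print(t.shape)                          # (7,)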
Two RNNs are required: one for the encoding and another for the decoding process. If (S, T) is a source and target sentence pair, the encoder first maps the source words to hidden states, (h_1, …, h_n) = Encoder(s_1, …, s_n); by using the chain rule, the conditional probability can then be calculated as

P(T|S) = P(T | s_1, …, s_n)    (5)

The decoder is the combination of a recurrent neural network and a softmax layer [17]. A plain RNN is difficult to train due to long-term dependencies; LSTM networks avoid these problems and use the back-propagation-through-time algorithm to learn the model parameters [9][4].
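Spelling out the chain rule behind Eq. (5) may help: the target sentence probability factorizes word by word, with each factor produced by the decoder's softmax layer. This standard encoder-decoder factorization is added here as a clarifying expansion, not a formula printed in the paper:

P(T | S) = ∏_{i=1}^{m} P(t_i | s_1, …, s_n, t_1, …, t_{i-1})

where m is the length of the target sentence.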
D. Joint Translation

A joint language and translation model is used to predict the target word with the help of an unbounded history of source and target words, and an RNN is the best network for this. FNNs and CNNs are concerned only with learning through the network, whereas an RNN maintains the sequence whether the translation is generated left-to-right or right-to-left [14].

IV. METHODOLOGY

It is difficult to train an RNN for word alignment, so an alternative can be used in the form of a bilingual corpus. We have created an English-Hindi bilingual corpus that contains 1,20,000 words with their feature values. Hence, we can fetch the Hindi meaning of a given word and assign vector values to it based on its features [as in Figure 6]. That is, the vectors of an English word and its corresponding Hindi word will be the same, and after word alignment we can proceed to further processing [as in Table 3].

TABLE 2. DATABASE TABLE
English      | Hindi                | Vectors
C-DAC        | सी-डेक               | [0.123, 0.107…]
Was          | था                   | [-0.043, 0.0105…]
Established  | स्थापित किया गया था   | [-0.0123, 0.143…]
By           | द्वारा               | [-0.172, -0.231…]
Of           | के                   | [-0.442, -0.342…]
Electronics  | इलेक्ट्रॉनिक्स        | [-0.334, -0.344…]
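As a sketch of the lookup this table enables: assuming the corpus is held as a simple in-memory dictionary (the storage format, names, and truncated vectors below are illustrative, not the paper's actual system), word alignment reduces to fetching the Hindi entry and its shared vector.

# Hypothetical in-memory slice of the English-Hindi database table (Table 2).
corpus = {
    "c-dac":       ("सी-डेक",              [0.123, 0.107]),
    "was":         ("था",                  [-0.043, 0.0105]),
    "established": ("स्थापित किया गया था",  [-0.0123, 0.143]),
    "by":          ("द्वारा",              [-0.172, -0.231]),
}

def align_word(english_word):
    """Fetch the Hindi meaning and the shared vector for an English word.
    Both words carry the same vector, so alignment is a dictionary lookup."""
    entry = corpus.get(english_word.lower())
    if entry is None:
        return None, None          # out-of-vocabulary word
    hindi, vector = entry
    return hindi, vector

print(align_word("Established"))   # -> ('स्थापित किया गया था', [-0.0123, 0.143])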