Training data:
I am Sam
Sam I am
Sam I like
Sam I do like
do I like Sam
Assume that we use a bigram language model based on the above training data. What is the
most probable next word predicted by the model for each of the following word sequences? Show your calculations.
(1) Sam . . .
(2) Sam I do . . .
(4) do I like . . .
Solution:
The bigram probability of a next word w given the preceding word w(i-1) is estimated from counts in the training data as [2]

P(w | w(i-1)) = count(w(i-1), w) / count(w(i-1))
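As a sanity check, this maximum-likelihood estimate can be computed directly from the five training sentences with a few lines of Python (the variable names and the helper `p` below are illustrative choices, not part of the exercise; sentence boundaries are ignored, as in the hand calculation):

```python
from collections import Counter

# The five training sentences from the exercise.
corpus = [
    "I am Sam",
    "Sam I am",
    "Sam I like",
    "Sam I do like",
    "do I like Sam",
]

# Count bigrams, and how often each word occurs as a bigram history.
# Sentence boundaries are ignored, so a sentence-final word does not
# count as a history.
bigram = Counter()
history = Counter()
for sentence in corpus:
    words = sentence.split()
    for w1, w2 in zip(words, words[1:]):
        bigram[(w1, w2)] += 1
        history[w1] += 1

def p(w, prev):
    """MLE bigram probability P(w | prev) = count(prev, w) / count(prev)."""
    return bigram[(prev, w)] / history[prev]

print(p("I", "Sam"))  # 1.0 -- every within-sentence "Sam" is followed by "I"
```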
(1) Sam . . .
In the training data, "Sam" is followed by another word 3 times, and in all 3 cases that word is "I": count(Sam, I) = 3 and count(Sam) = 3 (ignoring sentence boundaries), so P(I | Sam) = 3/3 = 1. The most probable next word is "I".
(2) Sam I do . . .
In a bigram model only the last word of the history matters, so we look at what follows "do": count(do, I) = 1 and count(do, like) = 1, with count(do) = 2. Therefore "I" and "like" are equally probable next words, each with P = 1/2.
(4) do I like . . .
Again only the last word, "like", matters. The only word observed after "like" is "Sam": count(like, Sam) = 1 and count(like) = 1 as a bigram history, so P(Sam | like) = 1. The most probable next word is "Sam".
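The predictions for all three sequences can be checked end-to-end with a short script. Since this is a bigram model, the predictor below looks only at the last word of each sequence; `predict` is an illustrative helper name, and ties are returned together as a sorted list:

```python
from collections import Counter

corpus = [
    "I am Sam",
    "Sam I am",
    "Sam I like",
    "Sam I do like",
    "do I like Sam",
]

# Bigram counts over the training data (sentence boundaries ignored).
bigram = Counter()
for sentence in corpus:
    words = sentence.split()
    bigram.update(zip(words, words[1:]))

def predict(sequence):
    """Return the most probable next word(s); ties come back together."""
    prev = sequence.split()[-1]  # only the last word matters in a bigram model
    followers = {w2: c for (w1, w2), c in bigram.items() if w1 == prev}
    best = max(followers.values())
    return sorted(w for w, c in followers.items() if c == best)

print(predict("Sam"))        # ['I']
print(predict("Sam I do"))   # ['I', 'like'] -- a tie
print(predict("do I like"))  # ['Sam']
```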