Gathering Data
• Gathering correct and relevant data is the toughest part of this task. We can
use Reddit threads (free-form conversations between people), WhatsApp chat
exports, or movie dialogue corpora to build a large amount of conversational data.
• We can then add rule-based content such as the chatbot's name, age, and
other attributes to make the chatbot more realistic, so that it can give
factual answers to questions like "What is your name?" or "What do you do
for a living?".
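A rule-based layer like this can be sketched as a small set of pattern-to-answer rules checked before the neural model is consulted. The bot profile and the patterns below are hypothetical examples, not a fixed design:

```python
import re

# Hypothetical bot profile: facts the chatbot should answer consistently.
BOT_PROFILE = {
    "name": "Ava",
    "age": "2 years",
    "job": "chatting with people",
}

# Simple keyword rules checked before falling back to the neural model.
RULES = [
    (re.compile(r"what('?s| is) your name", re.I),
     f"My name is {BOT_PROFILE['name']}."),
    (re.compile(r"how old are you", re.I),
     f"I am {BOT_PROFILE['age']} old."),
    (re.compile(r"what do you do for a living", re.I),
     f"I spend my time {BOT_PROFILE['job']}."),
]

def rule_based_reply(message):
    """Return a canned factual answer if a rule matches, else None."""
    for pattern, reply in RULES:
        if pattern.search(message):
            return reply
    return None  # no rule matched: fall through to the seq2seq model

print(rule_based_reply("What is your name?"))  # My name is Ava.
```

Keeping these facts out of the training data and in rules means the bot never contradicts itself on questions about its own identity.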
• We can create a database, load our WhatsApp conversations into it, and
connect it to our chatbot using SQLite or another database server.
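A minimal sketch of that storage step with Python's built-in sqlite3 module; the table and column names are illustrative assumptions:

```python
import sqlite3

# Store exported WhatsApp-style messages in SQLite.
conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.execute("""
    CREATE TABLE messages (
        id      INTEGER PRIMARY KEY,
        sender  TEXT,
        sent_at TEXT,   -- ISO timestamp keeps the conversation sequential
        body    TEXT
    )
""")
rows = [
    ("alice", "2023-05-01T10:00:00", "Hey, are you free today?"),
    ("bob",   "2023-05-01T10:01:30", "Yes, after 5 pm. What's up?"),
]
conn.executemany(
    "INSERT INTO messages (sender, sent_at, body) VALUES (?, ?, ?)", rows)
conn.commit()

# The chatbot pipeline can later read the messages back in time order:
for sender, body in conn.execute(
        "SELECT sender, body FROM messages ORDER BY sent_at"):
    print(f"{sender}: {body}")
```

Ordering by timestamp at read time is what keeps the conversation sequential for the preprocessing step that follows.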
Preprocessing the data
• First, we need to preprocess our conversations so that we can use them as
training data.
• Initially, we link the messages exchanged between two people and associate
each message with its timestamp so that the conversation remains sequential.
• Once the data is linked and prepared, we break it into time steps so that
it can be fed into a Recurrent Neural Network unit.
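The steps above can be sketched as follows: sort a two-person chat by time, pair each message with its reply, then pad or truncate each side to a fixed number of time steps. The sequence length and `<pad>` token are assumptions for illustration:

```python
MAX_STEPS = 6        # assumed fixed sequence length for the RNN
PAD = "<pad>"

chat = [  # (timestamp, speaker, text) for one conversation
    (1, "A", "hi how are you"),
    (2, "B", "fine thanks and you"),
    (3, "A", "doing great"),
]
chat.sort(key=lambda m: m[0])           # keep the conversation sequential

# Pair each message with the reply that follows it.
pairs = [(chat[i][2], chat[i + 1][2]) for i in range(len(chat) - 1)]

def to_steps(sentence, n=MAX_STEPS):
    """Tokenise and pad/truncate to exactly n time steps."""
    tokens = sentence.split()[:n]
    return tokens + [PAD] * (n - len(tokens))

X = [to_steps(src) for src, _ in pairs]  # encoder inputs
Y = [to_steps(tgt) for _, tgt in pairs]  # decoder targets
print(X[0])  # ['hi', 'how', 'are', 'you', '<pad>', '<pad>']
```

A real pipeline would also build a vocabulary and map tokens to integer ids, but the pairing and fixed-length framing are the essential steps.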
Model
Encoder- Decoder (seq2seq)
The Encoder-Decoder LSTM is a recurrent neural network designed to address
sequence-to-sequence problems, sometimes called seq2seq.
Sequence-to-sequence prediction problems are challenging because the number
of items in the input and output sequences can vary. Text translation and
learning to execute programs are examples of seq2seq problems.
• Our conversational data is fed into an encoder, which encodes it into a
feature vector: a numerical representation of the input sentence that
lets us measure the similarity between two sentences.
• This feature vector is passed to the decoder, which uses it to capture
the meaning of the sentence and generate a response.
• The decoder output, together with the encoder outputs across all time
steps, is passed to an ATTENTION mechanism.
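As a toy sketch of the encoder-decoder idea, the following uses plain RNN cells in NumPy with random, untrained weights; a real system would use trained LSTM layers from a deep-learning framework, and all sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HID = 20, 8, 16

E = rng.normal(0, 0.1, (VOCAB, EMB))     # shared embedding table
Wx_e = rng.normal(0, 0.1, (EMB, HID))    # encoder input weights
Wh_e = rng.normal(0, 0.1, (HID, HID))    # encoder recurrent weights
Wx_d = rng.normal(0, 0.1, (EMB, HID))    # decoder input weights
Wh_d = rng.normal(0, 0.1, (HID, HID))    # decoder recurrent weights
Wo = rng.normal(0, 0.1, (HID, VOCAB))    # decoder output projection

def encode(token_ids):
    """Run the encoder; return per-step outputs and the final feature vector."""
    h, outputs = np.zeros(HID), []
    for t in token_ids:
        h = np.tanh(E[t] @ Wx_e + h @ Wh_e)
        outputs.append(h)
    # h summarises the whole sentence; `outputs` is what attention will use.
    return np.stack(outputs), h

def decode(feature_vec, max_len=5, start=0):
    """Greedy decoding seeded with the encoder's feature vector."""
    h, tok, reply = feature_vec, start, []
    for _ in range(max_len):
        h = np.tanh(E[tok] @ Wx_d + h @ Wh_d)
        tok = int(np.argmax(h @ Wo))     # pick the most likely next word
        reply.append(tok)
    return reply

enc_outputs, feature = encode([3, 7, 1])  # a 3-word input sentence
reply_ids = decode(feature)               # 5 output token ids
```

With random weights the reply is meaningless; the point is only the shape of the computation: the encoder compresses the input into one feature vector, and the decoder unrolls a response from it one time step at a time.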
ATTENTION Mechanism
Based on the varying importance of the different words in a sentence, the
attention mechanism creates a context vector that decides what matters most
in that sentence. The more important parts then carry more weight when the
model produces its output.
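One common way to build that context vector is dot-product attention, sketched below in NumPy: score each encoder time step against the current decoder state, softmax the scores into importance weights, and take the weighted mix of encoder outputs. The dimensions are illustrative:

```python
import numpy as np

def attention(decoder_state, encoder_outputs):
    """Dot-product attention: weigh encoder steps by relevance to the decoder."""
    scores = encoder_outputs @ decoder_state   # one score per input word
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax: importance of each word
    context = weights @ encoder_outputs        # weighted mix = context vector
    return context, weights

rng = np.random.default_rng(1)
enc_out = rng.normal(size=(6, 16))   # 6 input time steps, hidden size 16
dec_state = rng.normal(size=16)      # current decoder hidden state
context, weights = attention(dec_state, enc_out)
print(weights.round(2), context.shape)
```

The weights sum to 1, so the context vector is a convex combination of the encoder outputs: words that score higher against the decoder state contribute more to the response at that step.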