
Name – ISHAN CHAWLA
College – Delhi Technological University
Phone – 9958143475
Mail – ishchawla24b@gmail.com

Name – LAKSHAY MALHOTRA
College – PGDAV College, Delhi University
Phone – 8586908921
Mail – lakshaymalhotra3@gmail.com



Dynamic Chatbot

Deep Learning Aspect and Creation of Chatbot



Approach

The approach we are going to follow is based on neural machine
translation using an Encoder-Decoder LSTM (seq2seq) model. This
model requires a large corpus of conversational data.



Gathering Data

• Gathering correct and relevant data is the toughest job in this task. We can
use Reddit threads (free conversations between people), WhatsApp conversations
or movie dialogue to create a large amount of conversational data.
• We can then add rule-based content such as the chatbot's name, age and
other features to increase its realism, so that the chatbot can provide
factual answers to questions like "What is your name?" or
"What do you do for a living?".
• We can create a database, upload our WhatsApp conversations to it,
and then connect it to our chatbot using SQLite or any other database server.
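As a minimal sketch of the database idea above, conversation pairs (including the rule-based factual content) could be stored in SQLite via Python's built-in `sqlite3` module. The table and column names here are illustrative assumptions, not part of the project:

```python
import sqlite3

# Build an in-memory SQLite database of (prompt, reply) pairs.
# Table/column names are illustrative, not fixed by the project.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE pairs (
    id INTEGER PRIMARY KEY,
    prompt TEXT NOT NULL,
    reply  TEXT NOT NULL)""")

# Rule-based factual content mixed in with the conversational data.
seed = [
    ("What is your name?", "I am a demo chatbot."),
    ("What do you do for a living?", "I chat with people."),
]
conn.executemany("INSERT INTO pairs (prompt, reply) VALUES (?, ?)", seed)
conn.commit()

for prompt, reply in conn.execute("SELECT prompt, reply FROM pairs ORDER BY id"):
    print(prompt, "->", reply)
```

A real deployment would use a file-backed database and bulk-insert the exported WhatsApp history the same way.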

Preprocessing the data

• First, we need to preprocess our conversations so that we can use them as
training data.

• Initially we link the messages exchanged between two people and order
them by timestamp so that the conversation remains sequential.

• After we have linked and prepared our data, we break it into time
steps so that we can feed it into a Recurrent Neural Network unit.
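The steps above can be sketched in plain Python: pair each message with the reply that follows it in time, then encode each sentence as a fixed number of integer time steps. The vocabulary and padding scheme are illustrative assumptions:

```python
# Sketch: pair consecutive messages into (input, target) examples,
# then turn each sentence into fixed-length integer time steps.
messages = [  # (timestamp, speaker, text), already sorted by time
    (1, "A", "hi there"),
    (2, "B", "hello how are you"),
    (3, "A", "i am fine"),
]

# Link each message to the reply that follows it in time.
pairs = [(messages[i][2], messages[i + 1][2])
         for i in range(len(messages) - 1)]

# Build a word index (0 is reserved for padding).
vocab = {w: i + 1 for i, w in enumerate(
    sorted({w for _, _, text in messages for w in text.split()}))}

def to_steps(sentence, max_len=5):
    """Encode a sentence as max_len integer time steps for the RNN."""
    ids = [vocab[w] for w in sentence.split()][:max_len]
    return ids + [0] * (max_len - len(ids))  # pad short sentences

print(pairs[0])
print(to_steps("hi there"))
```

In practice the timestamps would come from the exported chat logs, and the vocabulary would be built over the whole corpus.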




Model

Encoder- Decoder (seq2seq)



The Encoder-Decoder LSTM is a recurrent neural network designed to address
sequence-to-sequence problems, sometimes called seq2seq.
Sequence-to-sequence prediction problems are challenging because the number
of items in the input and output sequences can vary. Text translation and
learning to execute programs are examples of seq2seq problems.


• Our conversational data will be fed into an encoder. The encoder
will encode the data into a feature vector: a mathematical
representation of the sentence that also lets us measure the
similarity between two sentences.
• This feature vector will be given as input to the decoder, which
interprets the meaning of the sentence and then generates a
response to it.
• The decoder's output, together with the encoder's outputs across
all time steps, is passed to an ATTENTION mechanism.
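A minimal NumPy sketch of the encoder idea: a simple (non-LSTM) RNN cell compresses a sequence of word vectors into one feature vector, and cosine similarity compares two such sentence vectors. The weights are random here; in the real model they would be trained:

```python
import numpy as np

# Sketch of the encoder: a simple RNN cell compresses a sequence of
# word vectors into one feature vector. Sizes/weights are assumptions.
rng = np.random.default_rng(0)
d_in, d_hid = 4, 6
W_x = rng.normal(size=(d_hid, d_in)) * 0.1
W_h = rng.normal(size=(d_hid, d_hid)) * 0.1

def encode(word_vectors):
    """Run the RNN over the sequence; the final hidden state is the
    feature vector summarising the whole sentence."""
    h = np.zeros(d_hid)
    for x in word_vectors:
        h = np.tanh(W_x @ x + W_h @ h)
    return h

def similarity(a, b):
    """Cosine similarity between two sentence feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

s1 = encode(rng.normal(size=(3, d_in)))   # a 3-word sentence
s2 = encode(rng.normal(size=(5, d_in)))   # a 5-word sentence
print(similarity(s1, s2))
```

An LSTM cell replaces the `tanh` update with gated updates, but the encoder's role, turning a variable-length sentence into a fixed-size vector, is the same.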

ATTENTION Mechanism

Based on the different importance of different words in a sentence,
the attention mechanism creates a context vector that decides what
is most important in that sentence. The more important parts then
carry more weight in the output of our model.

Before the attention mechanism, translation relied on reading a complete
sentence and compressing all of its information into a fixed-length vector.
As you can imagine, a sentence of hundreds of words squeezed into one short
vector will surely lead to information loss, inadequate translation, etc.

Attention partially fixes this problem. It allows the model to look over
all the information the original sentence holds, then generate the proper
word according to the current word it is working on and the context. It can
even allow the model to zoom in or out (focus on local or global features).
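The context-vector idea above can be sketched with dot-product attention in NumPy: score each encoder time step against the current decoder state, softmax the scores into weights, and take the weighted sum of encoder states. This is one common form of attention, chosen here for illustration:

```python
import numpy as np

# Sketch of dot-product attention: weight each encoder time step by
# its relevance to the current decoder state, then average.
def attention(decoder_state, encoder_states):
    scores = encoder_states @ decoder_state   # one score per input word
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()         # softmax: weights sum to 1
    context = weights @ encoder_states        # weighted sum = context vector
    return context, weights

rng = np.random.default_rng(1)
enc = rng.normal(size=(4, 8))   # 4 time steps, hidden size 8
dec = rng.normal(size=8)        # current decoder state
context, weights = attention(dec, enc)
print(weights)                  # larger weight = more "important" word
```

The weights make the "importance" explicit: instead of one fixed-length summary of the whole sentence, the decoder gets a fresh context vector at every output step.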


Implementation of Model

• We can build an RNN by stacking LSTM layers and simple dense
layers and use it in both our encoder and decoder.

• We can use a loss function such as cross-entropy with gradient descent
as the optimizer to train our model.

• The attention mechanism takes into account the encoder outputs from
several time steps, together with the decoder state, to make each prediction.

• The encoder will take our conversational data and our factual data, train
the word embeddings, and output an encoded matrix.

• The decoder will take this encoded matrix as input, along with the output
it is itself generating and the attention vector from the ATTENTION
mechanism, which helps us differentiate between the importance of
different words in a sentence.

• After the algorithm converges, we can get the chatbot's response
as a word sequence from our model.
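The cross-entropy-plus-gradient-descent training step can be sketched in NumPy. A full seq2seq model would back-propagate through the LSTM layers as well; here a single softmax output layer stands in for the decoder's output head, with made-up sizes:

```python
import numpy as np

# Sketch of one output layer trained with cross-entropy loss and
# plain gradient descent (stand-in for the decoder's softmax head).
rng = np.random.default_rng(2)
vocab_size, d_hid, lr = 5, 8, 0.5
W = rng.normal(size=(vocab_size, d_hid)) * 0.1

h = rng.normal(size=d_hid)   # decoder hidden state at one time step
target = 3                   # index of the correct next word

def step(W, h, target):
    logits = W @ h
    p = np.exp(logits - logits.max())
    p = p / p.sum()                       # softmax over the vocabulary
    loss = -np.log(p[target])             # cross-entropy loss
    grad = np.outer(p, h)                 # dL/dW = (p - onehot) h^T ...
    grad[target] -= h                     # ... subtract the target row
    return W - lr * grad, loss            # gradient-descent update

losses = []
for _ in range(50):
    W, loss = step(W, h, target)
    losses.append(loss)
print(losses[0], "->", losses[-1])        # the loss decreases
```

The same loss, summed over every time step of the target sentence, is what the optimizer minimises when training the whole encoder-decoder.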

Creation of API

After we have developed and trained our chatbot, we will structure our
program into an API that the Python app-development part can use as a
library, so that we don't have to change the chatbot code again
and again.
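One possible shape for that API is a small wrapper class, so the app depends only on a stable interface while the model behind it can change freely. `ChatbotAPI` and `get_reply` are assumed names for illustration, not part of any library:

```python
# Sketch of wrapping the trained chatbot behind a small API so the
# app code never touches model internals (names are assumptions).
class ChatbotAPI:
    def __init__(self, model=None):
        # In the real project this would load the trained seq2seq model.
        self.model = model

    def get_reply(self, text):
        """Return the chatbot's reply for one user message."""
        if self.model is None:
            return "echo: " + text   # placeholder until a model is wired in
        return self.model.predict(text)

# The app only imports ChatbotAPI, so the chatbot code can change
# without the app changing.
bot = ChatbotAPI()
print(bot.get_reply("hello"))
```

The same wrapper could later be exposed over HTTP (e.g. with a web framework) without changing the app-facing interface.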

Server

• We can create a backend server on SQLite, PostgreSQL or MySQL and
connect it to our Python software. The server will provide us with
information about job positions and other aspects relevant to our
chatbot from the internet, such as the day and date, temperature,
weather and more.
• We can create a database of the people communicating with the chatbot
so that we can keep track of what they have shared with it,
allowing the chatbot to communicate better with them.



DEMO CHATBOT
We have developed and implemented a small chatbot using the above approach;
we still need to improve it a lot and add a database. A sample chat
with our chatbot is as follows:
