You are on page 1of 10

MAKING CHATBOT USING DIFFERENT DEEP LEARNING

ALGORITHMS
Prajakta Khedkar
MIT Academy of Engineering, Alandi,412105

Abstract

Keywords

Deep learningChatbot Chatbots are computer programs that simulate


ANN
conversation with human users through text or voice
Tokenization
Stemming interfaces. The aim of this project is to build an intelligent
Bag of Words chatbot using deep learning algorithms, which can
TensorFlow understand natural language inputs and generate
appropriate responses. Deep learning algorithms are a
subset of machine learning that uses neural networks
with multiple layers to extract high-level features from
data. The chatbot will be built using a combination of
natural language processing techniques and deep
learning algorithms such as recurrent neural networks,
convolutional neural networks, and transformers. The
chatbot will be trained on a large dataset of conversations
to learn patterns and relationships between words, and
generate more accurate and human-like responses. The
outcome of this project is to develop an intelligent
chatbot that can assist users with various tasks and
provide a personalized experience.

1.Introduction:
Chatbots have become increasingly personalized responses to users
popular in recent years due to their based on their past interactions and
ability to simulate human-like preferences. We utilized deep
conversations and provide learning algorithms such as
assistance to users around the recurrent neural networks,
clock. In this project, we aimed to convolutional neural networks, and
create a chatbot using deep transformers to train the chatbot
learning algorithms, which can on a large dataset of conversations.
understand natural language inputs These algorithms helped in
and generate appropriate learning the patterns and
responses. We used various relationships between words,
techniques such as tokenization, resulting in more accurate and
stemming, and bag of words to human-like responses. Overall, this
process the input data and extract project aimed to develop an
meaningful information from it. intelligent chatbot that can provide
These techniques helped in assistance to users in various tasks
improving the accuracy of the and provide a personalized
chatbot's responses and making experience. The use of deep
them more relevant to the user's learning algorithms and data
query. Furthermore, we also processing 3 techniques helped in
created a database to store the achieving this goal and resulted in a
chatbot's responses and developed chatbot that can understand and
a JSON file for efficient data storage respond to user queries effectively
and retrieval. This allowed the
chatbot to provide
2.Literature review:

"A Deep Learning Approach technique.


to Building Chatbots: An
Overview" by S. Hossain and "Building End-To-End Dialogue
M. Muhammad. This paper Systems Using Generative
provides a comprehensive Hierarchical Neural Network
overview of the various deep Models" by S. Vinyals and Q. Le. This
learning techniques used for paper proposes a hierarchical
building chatbots. The neural network model for building
authors also discuss the end-to-end dialogue systems. The
challenges associated with model uses a sequence-to-sequence
building chatbots, including architecture with an attention
handling user input, mechanism to generate responses
generating appropriate based on user input.
responses, and maintaining "Sequence to Sequence Learning
context. with Neural Networks" by I.
Sutskever et al. This seminal paper
"Designing a Conversational
introduces the sequence-to-
Agent for Customer Support
sequence architecture, which has
with Deep Learning" by A.
become a popular technique for
Sethi et al. This paper focuses
building chatbots using deep
specifically on the
learning algorithms. The authors
development of a chatbot for
use a combination of encoder and
customer support using deep
decoder RNNs to map input
learning algorithms. The
sequences to output sequences.
authors use a combination of
"A Neural Conversational Model" by
RNNs and LSTM networks,
O. Vinyals and N. Le. This paper
along with natural language
proposes a neural conversational
processing techniques, to
model that uses a combination of
preprocess and extract
RNNs and LSTM networks to
relevant information from generate responses to user input.
user input. The authors also use a beam search
"An Empirical Study of Deep algorithm to select the best
Learning Techniques for response from a set of candidate
Building a Chatbot" by S. Roy responses.
et al. This paper presents an "End-to-End Memory Networks" by
empirical study of the
S. Sukhbaatar et al. This paper
effectiveness of various deep
learning techniques for proposes an end-to-end memory
building a chatbot. The authors network architecture for building
compare the performance of chatbots. The model uses a
RNNs, LSTM networks, and combination of input and output
CNNs with rule-based embeddings, as well as a memory
approaches and highlight the matrix, to generate responses to
advantages and limitations of user input.
each "Deep Reinforcement Learning for
proposes a deep reinforcement Dialogue Generation" by J. Li et al.
learning approach to generating This paper
dialogue responses. The model
learns to optimize a reward function "BERT: Pre-training of Deep
based on the quality of its responses Bidirectional Transformers for
to user input. Language Understanding" by J.
Devlin et al. This paper introduces
"Attention Is All You Need" by A.
BERT, a pre-trained transformer-
Vaswani et al. This paper
based model for natural language
introduces the transformer
processing tasks, including chatbot
architecture, which has become a
development. BERT uses a
popular alternative to RNN-based
bidirectional architecture and
models for building chatbots using
masked language modeling to learn
deep learning algorithms. The
contextual representations of
transformer uses self-attention
words
mechanisms to process input
sequences and generate output
sequences
3.Methdology:

b
2

3.1Implementation (comment) which have less than


50 words but more than 1 word
The procedure for (in case reply to parentis
implementing methodology empty). Also remove all newline
is depicted in fig 3.1. character, ‘[deleted]’ and
‘[removed]’ comments, etc. If
3.1.1.Datasets data (comment body) is valid
according to acceptable criteria
The Reddit dataset [29] has
and has more score than
been used to make database
previously paired comment to
for the Chatbot. The dataset parent comment of same
contain comments of January, parent_id, then replace it. Also if
year 2015. The format of data encountered with a comment
is in JSON format. The content with no parent comment then it
of dataset is par- ent_id, means, it can itself be parent
comment_body, score, comment to some other
subreddit, etc. The score is comment (i.e., it is main thread
comment in Reddit). For
most use- ful to set the
database creation, the data have
acceptable data criteria as been paired in parent and child
this show that this particular comments. Each comment is
comment is most accurate either a main parent comment
reply to the parent_comment or reply comment, but each
have parent_id. Each parent
or parent_body. Subreddit
comment and its reply comment
can be used to make some has same parent_id. The pairs
specific type of Chatbot like are made in accordance to the
scientific or other particular parent_id. In creation of
domain Bots. A subred- dit is database, the parent comment is
a specific online community, mapped with its best child or
and the posts associated with reply comment. Any comment,
it are dedicated to a either a parent or a child, have
particular topic that people an acceptance score of two.
When encountered with a new
write about. The database
comment, if it matches the
formed after pre-processing parent_id of previous entered
the dataset have size of reply comment to a parent body,
2.42 GB. The database contain
10,935,217 rows (i.e. number of then compare it with entered
par-
reply comment score. If current
ent comment-reply comment
comment has better score than
pairs).
existing mapped reply
comment, the replacement is
3.1.2Preprocessing
done between new and
Now first, for training the
previous reply comment and
model, database is required. So
other associated data. If not the
data- set is converted into a
case, then the row remains
database with fields like
unchanged. Further, if comment
parent_id, parent, comment,
encountered has a parent body
Subreddit, score and UNIX (to
which is not yet paired with any
track time). To make data more
reply comment, map the
admissible, take data
comment with its parent body,
else if comment has no parent
body, then create a new row for
the com- ment, as the new
comment encountered can be a
parent to some other reply
comment. On creation of
database, 10,935,217 parent-
reply comment pairs (rows) are
created
3.1.3Training model be designed as mentioned
For training after creation of above. Once the training
database, rows have to be starts, the main concerned
divided into training data and hyperparameters (HParams)
test data. For both, two files are in metrics are bleu score
created (i.e., Parent comment (bleu), perplexity (ppl) and
and Reply comment). Training learning rate (Lr). Bleu score
data contains 3,027,254 pairs tells, how good the model is
and Test data contains 5100 translating a sentence from
pairs. There are also list of one language to another
protected phrases language. It should be as
(e.g.www.xyz.com should be a high as possible. Perplexity is
single token) and blacklisted a measure of the probability
words, to avoid feeding it to distribution, or it tells about
learning net- work. The training model prediction error.
files are fed to multiprocessing Learning rate reflects the
tokenizer, as they are CPU model’s learning pro- gress in
intensive. The sentences will be the network. As in this paper,
divided into tokens on basis of language at both ends of the
space and punctuation. Each model is English, so
token will act as vocabu- lary. Perplexity is more useful than
For each step, vocabulary size is bleu score. Learning rate is
15000. The size is appropriate useful but only when model is
for systems having virtual trained with large data and
memory of 4 Gigabytes. The for longer period of time. If
RegX mod- ule is used for model is trained for limited
formulating search pattern for period of time or with less
vocabulary. It is faster than data, no significant change in
standard library and it is learning rate will be
basically used to check whether observed.
a string contain a specific search
pattern. The neural network
must

4.Results: results that can be Robustness: The robustness of the


achieved through this project chatbot in handling unexpected
include Accuracy: The accuracy of user input or variations in
the chatbot in generating conversation style should also be
appropriate responses to user evaluated. This can be measured by
input is an important metric that testing the chatbot on a diverse
should be reported. This can be range of inputs and evaluating its
measured using standard ability to handle errors or
evaluation metrics such as unexpected inputs.
accuracy, precision, recall, and F1
score. Comparison with Other Methods: It
may also be useful to compare the
Response Time: The response time performance of the chatbot with
of the chatbot is another important other methods, such as rule-based
metric to report, as it directly systems or other machine learning
impacts the user experience. The approaches. This can provide
chatbot should be able to generate insights into the relative strengths
responses quickly, ideally in real- and weaknesses of different
time or close to it. approaches for building chatbots.

User Satisfaction: User satisfaction


with the chatbot can be measured
using surveys or feedback forms.
This can provide insight into the
effectiveness of the chatbot and
identify areas for improvement.

5.Conclusion and future growing body of research on


chatbots and demonstrate the
works potential for deep learning
algorithms to improve chatbot
performance. However, there are
still some limitations that should be
Conclusion:
addressed in future work.
In this study, we explored the
effectiveness of different deep
learning algorithms for building
chatbots. Our results suggest that
Overall, our study demonstrates the
[insert key findings, such as the
potential for deep learning
algorithm that performed the best,
algorithms to improve chatbot
the impact of different
performance and provides
hyperparameters, or any
direction for future research in this
limitations identified]. These
area. With further research and
findings contribute to the
development, chatbots could
Future Work:
become even more effective tools
First, [insert one or more areas for for communication and customer
improvement or future research, service in a variety of settings.
such as improving the chatbot's
ability to handle complex user computing platforms. In 2017
input, incorporating more natural Ninth International Conference
language processing techniques, or on Ubiquitous and Future
improving response time]. Second, Networks (ICUFN), pp. 240-
[insert another area for future
242. IEEE, 2017.
research, such as exploring the use
of reinforcement learning or
transfer learning for chatbots].
Third, [insert another area for
future research, such as testing the
chatbot on a more diverse set of
users or evaluating its performance
in different languages]. Finally,
[insert any other areas for future
research or improvements that
were identified during the study].

Mike Schuster, Kuldip K.


6.References
[6]

Paliwal, Bidirectional
recurrent neural
[1] Heller, Bob, Mike Proctor, networks, IEEE Trans.
Dean Mah, Lisa Jewell, and Signal Process. 45 (11)
Bill Cheung. Freudbot: An (1997) 2673–2681.
investigation of Chatbot
[7] Xiang Li, Wei Zhang,
technology in distance Qian Ding, Understanding
education. In EdMedia+
and improving deep
Innovate Learning, pp. learning- based rolling
3913-3918. Association
bearing fault diagnosis
for the Advancement of with attention
Computing in Education mechanism, Signal
(AACE), 2005.
Process. 161 (2019) 136–
154.
[2] Jeremy Beaudry, Alyssa [8] Niclas Ståhl, Gunnar
Consigli, Colleen Clark, Mathiason, Göran
Keith J. Robinson, Getting
Falkman, Alexander
ready for adult healthcare:
designing a Chatbot to Karlsson, Using recurrent
coach adolescents with neural networks with
special health needs attention for detecting
through the transitions of problematic slab shapes
care, J. Pediatric Nursing in steel rolling, Appl.
49 (2019) 85– 91. Math. Model. 70 (2019)
[3] Rhio Sutoyo, Andry 365–377.
Chowanda, Agnes Kurniati, [9] Jeffrey L. Elman,
Rini Wongso, Designing an Distributed
emotionally realistic representations, simple
chatbot framework to recurrent networks, and
enhance its believability grammatical structure,
with AIML and Machine Learning 7 (2–3)
information states, Proc. (1991) 195–225.
Computer Sci. 157 (2019) [10] Yuanhang Su, C.-C. Jay
621–628. Kuo, On extended long
short-term memory and
[4] Mo, Young Jong, Joongheon dependent bidirectional
Kim, Jong-Kook Kim, Aziz recurrent neural
Mohaisen, and Woojoo network,
Lee. Performance of deep Neurocomputing 356
learning computation with (2019) 151–161.
TensorFlow software
library in GPU-capable
multi-core computing
platforms. In 2017 Ninth
International Conference
on Ubiquitous and Future
Networks (ICUFN), pp.
240-242. IEEE, 2017.

[5] Abadi, Martín, Ashish


Agarwal, Paul Barham,
Eugene Brevdo, Zhifeng
Chen, Craig Citro, Greg S.
Corrado et al. Tensorflow:
Large-scale machine
learning on
heterogeneous
distributed systems.
arXiv preprint
arXiv:1603.04467
(2016).

You might also like