Abstract: A chatbot is software that simulates a conversation with a human in a generalized form through various applications such as messaging, websites, and mobile applications. Traditionally, a search engine was used to answer any type of question, but with the advancement in technology the use of chatbots has increased, allowing the user to simply ask any question in the same manner they would communicate with humans. An interesting feature of chatbots is that they are self-learning and become more intelligent over time using technologies like Machine Learning and Natural Language Processing (NLP). This paper provides a brief overview of the use of NLP in chatbots as well as the working of different algorithms. It studies various machine learning algorithms used for developing chatbots.

Keywords: Chatbot, NLP (Natural Language Processing), Supervised Machine Learning, Entity, Classification.

I. INTRODUCTION

With the advancement of the World Wide Web, human-machine interaction plays a vital role in the digital world. In order to expand and elevate the simplicity of user communication with any system, interaction between human and machine is essential.

“Our Differentiation starts with the third dimension of Chatbots. We call it the Human Dialogue Theory.”
- Riza Berkan.

A chatbot is software designed to imitate an intelligent conversation with an end user. In recent years, interest in chatbots has risen significantly. A few current examples are as follows:
• DOM, The Pizza Bot, Domino's: It helps users order a Domino's pizza of their choice via their Facebook Messenger account, and lets them customize the pizza hassle-free.
• Allo, Google: It helps with various tasks such as finding scores and making dinner reservations.
• Tay, Microsoft: Developed by Microsoft for Twitter, it was capable of interacting with Twitter users.
• KLM Royal Dutch Airlines: This bot helps users get current updates on their flights and check-in notifications from the airline.

Although chatbots have gained popularity over the last few years, most of them are rule-based, meaning that the output of a linear conversation is predetermined. For most cases, this works fine. However, chatbots can do much more than this. With the introduction of NLP, developers are now able to put the "chat" in chatbots [1].

NLP is a technical process that allows machines to extract meaning from user text inputs. It aims to understand the intent of the input instead of just obtaining information about the input itself. With NLP, it is possible to train the chatbot on the various conversations it will go through, which helps it give responses. The training part consists of providing examples of the content it might come across. Providing a larger training set gives the chatbot a wider knowledge base with which it can interpret and answer questions [3].

In this paper, the working of a chatbot is described thoroughly. Section II covers the use of NLP in chatbots and Section III explains the steps in NLP. Section IV explains various supervised algorithms. Lastly, the conclusion is given in Section V.

II. NLP IN CHATBOT

A chatbot is a service or a tool that the user can communicate with via text messages. Using NLP, the chatbot understands what the user is trying to say and either replies with a coherent, relevant message or directly completes the desired task.

There are two components of NLP:
1. Natural Language Understanding
2. Natural Language Generation

1. Natural Language Understanding:

In NLU, the input is transformed into useful representations in order to analyze various aspects of the language. NLU has two specific concepts:

Entities: Entities are basically nouns that represent an idea to the chatbot.

Context: When an NLU algorithm considers a sentence, it does not have the historical data of the user's conversation. This implies that if the same question is asked again after the chatbot has already responded, it will not recall the previous inquiry. So, the phases of the conversation are stored separately. These can be banners like "Asking about admission process".

A chatbot must fulfill the customer's needs, which can be the same for different queries. For example, "I want to know about the college reading hall" and "Do you have a reading hall? I want to know about it" ask for the same thing: details of the reading hall. Hence, all such user inputs map to a single command, identified by the tag: reading hall.

i. POS Tagging: The part of speech of each word in each sentence is determined.

ii. Text Lemmatization: It is a linguistic process that attempts to determine the root of each word. The lemma of a word is its basic form, which stands for all of its inflected forms.
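The reading-hall example above, in which differently worded queries resolve to one tag, can be sketched as follows. This is only a minimal illustration: the `TAG_KEYWORDS` table, the `identify_tag` function, and the keyword-overlap scoring are assumptions made for this sketch and are not the method described in the paper. A trained classifier would normally replace the hand-written keyword sets.

```python
import re

# Hypothetical tag -> keyword table (illustrative only, not from the paper).
TAG_KEYWORDS = {
    "reading hall": {"reading", "hall"},
    "admission process": {"admission"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def identify_tag(text):
    """Return the tag whose keywords overlap most with the query, or None."""
    words = set(tokenize(text))
    best_tag, best_score = None, 0
    for tag, keywords in TAG_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


queries = [
    "I want to know about the college reading hall",
    "Do you have a reading hall? I want to know about it",
]
for query in queries:
    # Both phrasings resolve to the same tag.
    print(query, "->", identify_tag(query))
```

Both queries resolve to the tag "reading hall", while a query that matches no keyword returns `None` and would need a fallback response.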
Word Tokenization: Education, is, the, process, of, facilitating, learning.

Step 2: Syntactic Analysis (Parsing): It involves analyzing the words in the sentence to check grammar and arranging the words in a manner that shows the relationships among them. A sentence such as "The food like the girl" is rejected by an English syntactic analyzer.

Co-reference Resolution: In this example, “this” refers to “Computer Department”.

Step 5: Pragmatic Analysis: During this step, what was said is re-interpreted according to what it actually meant. It involves deriving those aspects of language which require real-world knowledge.
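The word tokenization output listed above can be reproduced with a short regular-expression sketch. The input sentence is an assumption reconstructed from the listed tokens, and `word_tokenize` is a hypothetical helper name. Production systems usually rely on a dedicated NLP library's tokenizer rather than a single pattern.

```python
import re


def word_tokenize(sentence):
    """Split a sentence into word tokens, dropping punctuation."""
    # Keep runs of letters, digits, and apostrophes; everything else separates tokens.
    return re.findall(r"[A-Za-z0-9']+", sentence)


# Assumed input sentence, reconstructed from the token list in the text.
tokens = word_tokenize("Education is the process of facilitating learning.")
print(tokens)
```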
i. Named Entity Recognition: It looks for different categories of words, such as the name of a specific entity or the user's address, depending on whatever information is required.

Example: "Apple CEO Tim Cook introduced two new larger iPhones at Cupertino, Flint Center Event."

NER:
Apple, Flint Center: Name of Organization
Tim Cook: Name of Person
Cupertino: Name of Location

IV. ALGORITHMS AND TECHNIQUES

A. Decision Tree:

Decision trees are how chatbots help customers find exactly what they are looking for: they follow a step-by-step process to discover the accurate answer to the customer's question in a conversational way. The decision tree algorithm ID3 (Iterative Dichotomiser 3) uses the Entropy function and Information Gain as metrics [11].

The Information Gain can be calculated as:

$Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, Entropy(S_v)$ ...(1)

Entropy can be calculated as:

$Entropy(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$ ...(2)

Here S is the sample set, A is the test attribute, S_v is the subset of S in which A takes the value v, c is the number of classes, and p_i is the proportion of samples in S that belong to class i.

Step 4: Create a branch for every known value of the chosen test attribute, and divide the sample data among the branches accordingly.

The recursive division stops only when one of the following conditions is true:
1. All sample data of a given node belong to the same class, that is, the values of the flag attribute are all the same.
2. No remaining candidate attribute can be used to divide the sample further.
3. After a split, a certain branch does not have any sample records.

In the decision tree, the first question is the root of the tree. From there, the chatbot might need additional information in order to solve the problem, moving the conversation from top to bottom. The more specific a query is, the more branches are needed to solve it, and the more branches you create, the more intelligent and helpful the chatbot will be.

B. Naïve Bayes Algorithm:

It is a probabilistic classification technique based on Bayes' theorem with an assumption of independence among the features. Basically, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. For example, if a human wants to know which engineering fields are offered in a particular college, the bot will first read the data, partition it into two parts such as engineering and non-engineering, and answer the human accordingly [14].

The Naive Bayes model is easy to build and is particularly useful for very large data sets. This algorithm is capable of categorizing the databases by itself in an efficient way.

Bayes' theorem provides a way of calculating the probability P(c|x) from P(c), P(x) and P(x|c). Refer to equation (3) [15].
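To make the Naive Bayes description above concrete, the following is a minimal sketch of a word-based classifier for the engineering versus non-engineering example. It uses Bayes' theorem in the form P(c|x) = P(x|c) P(c) / P(x) and drops P(x), since it is the same for every class. The class name `NaiveBayesText`, the tiny training set, the add-one (Laplace) smoothing, and the use of log-probabilities are all assumptions made for this illustration and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Toy multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.class_counts = Counter()            # number of samples per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            # log P(c): prior probability of the class
            score = math.log(count / total)
            n_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # log P(word | c) with add-one smoothing (an assumption);
                # words are treated as independent given the class.
                numerator = self.word_counts[label][word] + 1
                denominator = n_words + len(self.vocab)
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, for illustration only.
training = [
    ("computer engineering admission", "engineering"),
    ("mechanical engineering syllabus", "engineering"),
    ("civil engineering faculty", "engineering"),
    ("arts and commerce courses", "non-engineering"),
    ("commerce admission fees", "non-engineering"),
    ("arts faculty timings", "non-engineering"),
]

model = NaiveBayesText()
model.fit(training)
print(model.predict("which engineering branches are offered"))
print(model.predict("commerce admission fees"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and smoothing keeps an unseen word from forcing a class probability to zero.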
4. Retrieval from Database: Based on the query, the response is retrieved from the database and the output is given to the user.

5. NLG: If a relevant answer is not present in the database, then the answer is generated.

6. Output: The response is given to the user.

[1] A. M. Rahman, A. A. Mamun and A. Islam, "Programming challenges of chatbot: Current and future prospective," 2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC), Dhaka, 2017, pp. 75-78. doi: 10.1109/R10-HTC.2017.8288910

[8] A. Singh, N. Thakur and A. Sharma, "A review of supervised machine learning algorithms," 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, pp. 1310-1315.

[9] S. Angra and S. Ahuja, "Machine learning and its applications: A review," 2017 International Conference on Big Data Analytics and Computational Intelligence (ICBDAC), Chirala, 2017, pp. 57-60.

[13] J. Song et al., "TOLA: Topic-oriented learning assistance based on cyber-physical system and big data," Future Generation Computer Systems (2016), http://dx.doi.org/10.1016/j.future.2016.05.040

[14] F. Qin, X. Tang and Z. Cheng, "Application and research of multi-label Naïve Bayes Classifier," Proceedings of the 10th World Congress on Intelligent Control and Automation, Beijing, 2012, pp. 764-768.