Introduction
Procedure
1. Collecting data
• Data Cleaning
• Building Chatbot GUI
2. CHAPTER-2
3. CHAPTER-3
Problem Identification
Requirements and Specification
4. CHAPTER-4
Objective
Scope of System
5. CHAPTER-5
6. CHAPTER-6
Conclusion
Chapter:-1 INTRODUCTION
• Chatbot :- A chatbot is a piece of software that can communicate and perform actions in a way
similar to a human. Chatbots are widely used for customer interaction, marketing on social networking
sites, and instant messaging with clients. There are two basic types of chatbot models, based on how they
are built: retrieval-based and generative models.
• About the Python Project – Chatbot :- In this Python project, we build a chatbot using deep learning
techniques. The chatbot is trained on a dataset that contains categories (intents), patterns, and
responses. We use a recurrent neural network (LSTM) to classify which category the user’s message
belongs to, and then return a random response from that category's list of responses.
• Let’s create a retrieval-based chatbot using NLTK, Keras, and Python.
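The retrieval step described above (pick a random response once the intent is known) can be sketched as follows. This is a minimal illustration, not the project's actual code: the INTENTS dictionary, the fallback message, and the choose_response helper are all hypothetical names introduced here, and the intent tag is assumed to come from a classifier that is not shown.

```python
import random

# Hypothetical intents (an assumption for illustration); the project
# would read tags and responses from its intents file instead.
INTENTS = {
    "greeting": ["Hello!", "Hi there, how can I help?"],
    "goodbye": ["Goodbye!", "See you later."],
}

FALLBACK = "Sorry, I did not understand that."


def choose_response(tag, intents=INTENTS, rng=random):
    """Return a random response for the predicted intent tag."""
    responses = intents.get(tag)
    if not responses:
        # Unknown or empty intent: fall back to a fixed message.
        return FALLBACK
    return rng.choice(responses)


# The tag would normally be the classifier's prediction.
print(choose_response("greeting"))
```

Because the bot only selects from pre-written responses, it never generates new text, which is what makes it retrieval-based rather than generative.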
Procedure
We’ll take a step-by-step approach and break down the process of building a Python chatbot.
• Collecting Data & Libraries
• Data Cleaning
• Initializing Chatbot Training
• Training and Testing of Dataset
• Making Final Prediction
• Calculating Accuracy Score
• Building the Machine Learning Model
• Building Chatbot GUI
• Running Chatbot
• Areas of Improvement
Collecting data
• The dataset we will be using is ‘intents.json’. This is a JSON file that contains the patterns we need
to find and the responses we want to return to the user.
• The data for this exercise is taken from the Kaggle link below. The name of the dataset is
“drug_metadata.txt”.
• kaggle.com/fda/adverse-pharmaceuticals-events?select=aeolus_v1
Data Cleaning
• Tokenization
• Normalise Case
• Remove Punctuation
• Stem Each Token (maybe)
• Convert Non-alphabetic Tokens
• Fix Common Nuanced Errors
• Filter Out Stop Words
• And Finally, You Can Begin
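Several of the cleaning steps listed above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the stop-word set is a tiny hypothetical sample (NLTK's stopwords corpus is a fuller alternative), tokenization is a plain whitespace split done after normalisation, and stemming is left out because the list marks it as optional.

```python
import re

# Tiny illustrative stop-word set (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "you", "i"}


def clean_text(sentence):
    """Lowercase, strip punctuation, tokenize, and drop stop words."""
    # Normalise case.
    lowered = sentence.lower()
    # Remove punctuation: keep only letters, digits, and whitespace.
    no_punct = re.sub(r"[^a-z0-9\s]", "", lowered)
    # Tokenize on whitespace.
    tokens = no_punct.split()
    # Filter out stop words.
    return [t for t in tokens if t not in STOP_WORDS]


print(clean_text("Hello! How are you today?"))  # ['hello', 'how', 'today']
```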
Initializing Chatbot Training
• Initialize all of the lists where we’ll store our natural language data.
• The JSON file mentioned earlier contains the “intents”. Each intent has a category (tag), a list of
patterns, and a list of responses.
• Use the scikit-learn module of Python to split your data into training and testing sets.
• Apply a suitable algorithm to make predictions on your dataset.
• We use the json module to load the file and save it as the variable intents.
• We then use a nested for loop to extract all of the words within “patterns” and add them to
our words list. We also add to our documents list each pattern paired with its corresponding
tag.
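The loading and extraction steps can be sketched as below. The JSON layout (an "intents" list whose entries have "tag", "patterns", and "responses" keys) is an assumption based on the description of the dataset, and the SAMPLE string and the classes list are hypothetical names used so the example runs without the real file.

```python
import json

# Hypothetical sample standing in for the contents of intents.json.
SAMPLE = """
{"intents": [
  {"tag": "greeting", "patterns": ["Hi there", "Hello"], "responses": ["Hello!"]},
  {"tag": "goodbye", "patterns": ["Bye", "See you"], "responses": ["Goodbye!"]}
]}
"""

# With a real file this would be: json.load(open("intents.json"))
intents = json.loads(SAMPLE)

words, classes, documents = [], [], []
for intent in intents["intents"]:          # outer loop: one intent at a time
    for pattern in intent["patterns"]:     # inner loop: each example phrase
        tokens = pattern.lower().split()   # simple whitespace tokenization
        words.extend(tokens)
        documents.append((tokens, intent["tag"]))
    if intent["tag"] not in classes:
        classes.append(intent["tag"])

# Deduplicate and sort so that indices are stable between runs.
words = sorted(set(words))
classes = sorted(classes)

print(words)
print(classes)
print(len(documents), "documents")
```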
Training and Testing of Dataset
• Using scikit-learn, train on your dataset so that the machine learns which data it has to make
predictions on.
• Test the model on the data for which you want predictions.
• Initialize our training data with a variable training.
• We create a large nested list that contains a bag of words for each of our documents.
• We have a feature called output_row, which simply acts as a key for the list. We then shuffle our
training set and do a train-test split, with the patterns being the X variable and the intents being the
Y variable.
• Now that we have our training and test data ready, we use a deep learning model from Keras called
Sequential.
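A minimal sketch of the bag-of-words encoding and the split follows. It assumes output_row is a one-hot vector marking the document's intent, uses a hypothetical vocabulary and document list, and performs the shuffle and a 75/25 split with the standard library rather than scikit-learn so that it runs without extra packages. The resulting lists are the kind of input a Keras Sequential model would be trained on; the model itself is not shown.

```python
import random

# Hypothetical vocabulary, classes, and documents (assumptions for illustration).
words = ["bye", "hello", "hi", "see", "there", "you"]
classes = ["goodbye", "greeting"]
documents = [
    (["hi", "there"], "greeting"),
    (["hello"], "greeting"),
    (["bye"], "goodbye"),
    (["see", "you"], "goodbye"),
]


def bag_of_words(tokens, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in tokens."""
    return [1 if w in tokens else 0 for w in vocabulary]


training = []
for tokens, tag in documents:
    bag = bag_of_words(tokens, words)
    # One-hot row: 1 at the index of this document's intent, 0 elsewhere.
    output_row = [0] * len(classes)
    output_row[classes.index(tag)] = 1
    training.append((bag, output_row))

# Shuffle with a fixed seed so the split is reproducible.
random.Random(0).shuffle(training)

split = int(0.75 * len(training))
train_x = [bag for bag, _ in training[:split]]   # X: encoded patterns
train_y = [row for _, row in training[:split]]   # Y: encoded intents
test_x = [bag for bag, _ in training[split:]]
test_y = [row for _, row in training[split:]]

print(len(train_x), "training rows,", len(test_x), "test rows")
```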
Making final prediction
• Now that the data has been tested successfully, the machine is ready to show the final result (prediction).
• The result is displayed in your window.
• Then, the images will be classified with respect to their class or cluster.
• And then, the images are moved to separate folder with respect to their class.
Calculating accuracy score
• The final prediction has now been made by the machine, but we do not yet know whether the output is accurate.
• We calculate the accuracy score of the output with scikit-learn's accuracy_score()
function.
• Note: Do not expect an accuracy of 99% or 100%, since no model predicts perfectly on real data.
If your accuracy score comes out at 99% or 100%, treat it as a likely sign of overfitting.
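Accuracy is simply the fraction of predictions that match the true labels. The sketch below computes it by hand on made-up labels (the y_true and y_pred lists are hypothetical); scikit-learn's accuracy_score from sklearn.metrics returns the same fraction for inputs like these.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels for illustration: 3 of the 4 predictions are correct.
y_true = ["greeting", "goodbye", "greeting", "thanks"]
y_pred = ["greeting", "goodbye", "goodbye", "thanks"]

print(accuracy(y_true, y_pred))  # 0.75
```

Measuring this score on the held-out test set rather than on the training set is what makes an overfit model visible: it scores very high on data it has seen and noticeably lower on data it has not.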
Chapter:-2 Python
Python is a popular programming language. It was created by Guido van Rossum, and released in 1991.
Python is meant to be an easily readable language. Its formatting is visually uncluttered, and it often uses
English keywords where other languages use punctuation. Unlike many other languages, it does not use curly
brackets to delimit blocks, and semicolons after statements are optional. It has fewer syntactic exceptions and
special cases than C or Pascal.
It is interpreted.
It is a high level programming language.
• It is used for:
• web development (server-side),
• software development
• mathematics
• system scripting.
Natural Language Processing
Pre-requisites
• Hands-on knowledge of the scikit-learn library and NLTK is assumed. However, if you are new to NLP, you can
still read the article and then refer back to the resources.
• The field of study that focuses on the interactions between human language and computers is called Natural
Language Processing, or NLP for short. It sits at the intersection of computer science, artificial intelligence, and
computational linguistics [Wikipedia].
• NLP is a way for computers to analyze, understand, and derive meaning from human language in a smart and
useful way. By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic
summarization, translation, named entity recognition, relationship extraction, sentiment analysis, speech
recognition, and topic segmentation.
• The book Natural Language Processing with Python provides a practical introduction to programming for
language processing. I highly recommend it to people beginning in NLP with Python.
Chapter-2 Machine Learning
• Machine learning (ML) is the study of computer algorithms that improve automatically
through experience.[1] It is seen as a subset of Artificial Intelligence.
• Machine learning algorithms build a mathematical model based on sample data, known as
"training data", in order to make predictions or decisions without being explicitly
programmed to do so.
• Building a simple chatbot exposes you to a variety of useful skills for data science and general
programming. I feel that the best way (for me, at least) to learn anything is to just build and tinker
around. If you want to become good at something, you need lots of practice, and the best
way to practice is to get your hands dirty and build.
• In this Python data science project, we learned about chatbots and implemented a deep
learning version of a chatbot in Python. You can customize the data according to
business requirements and retrain the chatbot to improve its accuracy. Chatbots are widely used, and
many businesses are looking to implement bots in their workflows.