5.a Knowledge Based Recommendation System That Includes Sentiment Analysis and Deep Learning

ABSTRACT
Online social networks (OSN) provide relevant information on

users’ opinion about different themes. Thus, applications, such as
monitoring and recommendation systems (RS) can collect and analyze
this data. This paper presents a Knowledge-Based Recommendation
System (KBRS), which includes an emotional health monitoring system to
detect users with potential psychological disturbances, specifically,
depression and stress. Depending on the monitoring results, the KBRS,
based on ontologies and sentiment analysis, is activated to send happy,
calm, relaxing, or motivational messages to users with psychological
disturbances. Also, the solution includes a mechanism to send warning
messages to authorized persons, in case a depression disturbance is
detected by the monitoring system.
The detection of sentences with depressive and stressful content is
performed through a Convolutional Neural Network (CNN) and a Bi-
directional Long Short-Term Memory (BLSTM) - Recurrent Neural
Networks (RNN); the proposed method reached an accuracy of 0.89 and
0.90 to detect depressed and stressed users, respectively. Experimental
results show that the proposed KBRS reached a rating of 94% of very
satisfied users, as opposed to 69% reached by a RS without the use of
neither a sentiment metric nor ontologies. Additionally, subjective test
results demonstrated that the proposed solution consumes low memory,
processing, and energy from current mobile electronic devices.Lample et
al combine Recurrent Neural Networks (RNNs) with CRFs to obtain the
best results on Named Entity Recognition (NER) datasets. A bi-directional
LSTM (BLSTM), an improved version of the LSTM, is also used for
labeling tasks.
A Knowledge-Based Recommendation System Includes Sentiment

Analysis 1
CHAPTER 1
INTRODUCTION
1. INTRODUCTION
The number of active online social network (OSN) users has
grown considerably, this high number of users, on OSN, is mainly due to
the increase of the number of mobile devices, such as smartphones and
tablets, connected to the Internet. Currently, OSN have become a rich and
universal means of opinion expression, feelings, and they reflect the bad
habits or wellness practices of each user. In recent years, the analysis of
the messages posted on OSN have been used by many applications in the
industry of health care informatics.
The sentiments and emotions, expressed on the messages posted
on OSN, provide clues to different aspects of the behavior of users; for
instance, sentences containing words with negative meaning may indicate
sadness, stress, or dissatisfaction. Conversely, it can be inferred that if a
person is in a positive mood state, this person can be more self-confident
and emotionally stable. Users have different behaviors on OSN, if the
sentiment intensity value of posted sentences remain at low levels, or if it
frequently changes from high to low levels and vice versa, these facts can
indicate some emotional disturbance, such as depression or stress events.
Hancock, Gee and many other people observed that users write short
sentences when they are experiencing a period of depression. Also, these
users use the first-person pronoun in their sentences and suffer from
chronic insomnia. Therefore, their behavior can be reflected in the
sentences posted on OSN. The presence of certain words in the sentences
can be monitored and analyzed to identify users at a high risk of
attempting suicide and an appropriate intervention can take place.
Depression is one of the most prevalent mental disorders in all
regions and cultures around the world. Unfortunately, depression

Analysis 2
recognition rate remains low. Most of the studies about health systems use
sensor devices to detect mental disorders. In the proposed trained
classifier, which is trained using electroencephalogram signals (small
sensors are attached to pick up the electrical signals produced by the brain
are observed by doctors), is able to detect stress with an average accuracy
of 80.45% using 4fold cross validation. In authors use heart rate
variability data to propose a classification model that considers different
stress levels, baseline, mild stress and severe stress, reaching accuracy
values of 74%, 81%, 82%, respectively.
There is a scarce number of studies that use textual information
from OSN data to detect physiological disorders. Xue et aluse different
machine learning (ML) classifiers to perform emotion classification
focused on psychological disorders from micro-blogs, reaching an average
accuracy of 80%. In the proposed model to detect stress based on the
information of Twitter activities reached an accuracy of 69%. In authors
study the causes of postpartum depression using OSN information. ML
algorithms are also used in studies about mood monitoring systems
analyzing messages from OSN reaching an accuracy of 57%.
Ma and Hovy introduce a network architecture to analyze
sentences meaning through character-level representations by using a
combination of Long Short-Term Memory (LSTM), a Convolutional
Neural Network (CNN) and Conditional Random Field (CRF). Lample et
al combine Recurrent Neural Networks (RNNs) with CRFs to obtain the
best results on Named Entity Recognition (NER) datasets. A bi-directional
LSTM (BLSTM), an improved version of the LSTM, is also used for
labeling tasks.
The deep learning approach has been explored in several areas
such as personality analysis age group classification on OSN sentiment
analysis among others. However, this approach is not widely explored in
psychopathology studies. In this context, our study intends to test the

Analysis 3
performance of deep learning algorithms in scenarios of depressed,
stressed, and non-depressed and non-stressed users’ detection.
A Recommendation System (RS) application can be used as a
method to enhance the user’s emotional health, improving the person’s
mood in case of negative emotional states. RS based on ontology is being
used for health purposes presenting reliable results from diseases
treatment plans.
In this context, the main goal of this work is to introduce an RS
that uses an approach named Knowledge-Based Recommendation System
(KBRS), which aggregates an ontology collection for health scenarios,
named Nuadu which is not addressed in other RSs designed to improve
emotional health. The proposed KBRS also includes the sentiment
analysis approach and an emotional health monitoring system. The
monitoring system filters sentences from an OSN that allows to identify
potential users with depression or stress conditions. To accomplish this
task, an objective method based on an BLSTM-RNN is used to detect
potential psychological disorders, along an CNN. Later, a KBRS is
activated to send happy, calm, relaxing, or motivational messages to these
users. These messages have different intensity levels depending on the
sentiment intensity of the sentences posted on an OSN, which is
determined by an enhanced sentiment analysis metric, eSM2. This
proposed metric is based on a word-dictionary, considering the Portuguese
language, and enhanced with additional information such as user’s profile
data, user’s geographic location, and the theme of the sentence.
Furthermore, in the cases of depression detection, the solution sends
warning messages to authorized people who are previously registered in
the system.
According to the subjective test results, users reported high

satisfaction with the KBRS, improving their emotional states. Tests were
also performed with a traditional RS without the Nuadu ontology and the

Analysis 4
eSM2 for comparing it with the proposed KBRS. Furthermore, subjective
tests reported that the application running on the user mobile electronic
device had low complexity and low-power consumption. In short, this
paper proposes:
 An innovative solution to monitor and to detect potential users with
emotional disturbances, based on the classification of sentences with
depressive or stressed content. A CNN is used for character-level
representation and BLSTM-RNN for the disorder entity recognition.
 An improved RS that incorporates a personalized sentiment metric, named
eSM2, and a health ontology approach, specifically the Nuadu ontology.
 An enhanced sentiment metric. It is studied and validated that the
sentiment metric performance is improved by incorporating the user’s
profile information, the geographic location, and the theme of the
sentence.
 An application installed on a mobile device, which is easy to be used, and
it consumes low memory, processing, and energy.
The remainder of this paper is structured as follows: Section II
presents the related work about RS, emotional health monitoring system,
machine learning and deep learning approach, and sentiment and affective
analysis; Section III presents the methodology of the proposed RS using a
depression and stress detection by machine learning using CNN, BLSTM-
RNN model and the eSM2 sentiment metric for mood assessment. In
Section IV, the experimental results are presented. Section V presents the
discussions; finally, Section VI presents the conclusions and outlines the
future work.

Analysis 5
OBJECTIVES
1. Enhanced Personalization: Tailor recommendations based on user

preferences and sentiments, ensuring a more personalized and engaging
user experience.
2. Improved Sentiment Understanding: Utilize sentiment analysis to
comprehend user emotions and preferences, refining recommendations to
align with their mood or sentiment.
3. Better Content Matching: Leverage deep learning to analyze complex

patterns and relationships in user behavior, allowing for more accurate
matching of content to individual tastes.
4. Dynamic Adaptation: Enable the system to adapt and evolve by

continuously learning from user interactions, ensuring recommendations
stay relevant and up-to-date.
5. User Satisfaction: Strive to increase user satisfaction by delivering

recommendations that not only align with their preferences but also
consider their emotional context.
6. Context-Aware Recommendations: Integrate sentiment analysis to

provide context-aware recommendations, taking into account the
emotional nuances that influence user choices.
7. Increased Engagement: Use deep learning to optimize recommendation

algorithms, leading to higher user engagement and interaction with the
recommended content.

Analysis 6
8. Efficient Information Retrieval: Leverage knowledge-based techniques
to efficiently retrieve relevant information, considering both explicit user
preferences and implicit sentiment cues.
9. Explainability: Ensure transparency in recommendations by employing

interpretable deep learning models, allowing users to understand why
specific suggestions are made.
10. Cross-Domain Recommendations: Extend the system's capabilities to

provide recommendations across different domains, enhancing the user
experience by offering diverse and well-rounded suggestions.

Analysis 7
CHAPTER 2
LITERATURE SURVEY
2.LITERATURE SURVEY
YEAR AUTHOR TITTLE METHODS
Affective and content ML, Statistical methods
2014 THIN NGUYEN, analysis of online Between depression and control
DINH PHUNG depression communities mood
communities.
Comparing the From speaking voice emotion
2015 K.R.SCHERER, acoustic expression of recognition to singing voice emotion
J SUNDBERG emotion in the recognition.
speaking and the
singing voice.
2016 Sentihealth-cancer: A
R. RODRIGIES, sentiment analysis tool BY using SHC –Pt (Sentihealth-
R. DAS DORE to help detecting mood cancer), AlchemyAPI and
of patients in online Textalytics(the text is converted into
social networks. English and submitted to these tools).
Classification of stress
A.E.U.BERBANO, into emotional, mental, By using Electroencephalogram
2017 H.N.V.PENGSON physical and no stress signals (small sensors are attached to
using pick up the electrical signals
electroencephalogram produced by the brain are observed
signal analysis. by doctors)
2018 M. AY-QURISHI, Leveraging analysis of
M.S.HOSSAIN user behavior to By analyzing malicious activity in

Analysis 8
identify malicious social media to track user behaviors
activities in large-scale
social networks.
CHAPTER 3
SYSTEM ANALYSIS
3. SYSTEM STUDY
3.1 FEASIBILITY STUDY
The existing application using traditional algorithms such as Random
Forest or SVM.These detect sentiments from user messages. But those
algorithms accuracy is not better and they are not maintaining user
personal information such as their personal profile to send motivational
messages in ontologies.
SVM is used for classification with several feature selection techniques in
the depression detection context.
The SMO algorithm(Sequential Minimal Optimization is used for
solving quadratic programming problem) is used to train SVM(Support
Vector Machine); SMO was the most accurate for predicting depression
among senior citizens.
Random Forest and Naive Bayes algorithms are also used for
sentiment analysis, fordetecting psychological disturbances on OSN; the
results present an accuracy of about 0.8.
Random Forest:In sentiment analysis, the goal is to figure out the
sentiment (positive, negative, or neutral) of a piece of text from a social
media post. Each decision tree in the Random Forest looks at different
aspects of the text and makes its own prediction. Then, the Random Forest
combines all these predictions to make the final result.
Support Vector Machines: In sentiment analysis SVM is used because

they are effective at finding a clear boundary between different classes of

Analysis 9
data. In sentiment analysis, the goal is to categorize text into different
sentiment classes (e.g., positive, negative, neutral).
DISADVANTAGES:
 The system doesn’t have access to a wide range of accurate and up to date
information so it may not be able to provide desired accuracy.
 Low efficiency and accuracy
PROPOSED SYSTEM:
The main goal of this work is to introduce an RS that uses an
approach named Knowledge-Based Recommendation System (KBRS),
which aggregates an ontology collection for health scenarios, named
Nuadu, which is not addressed in other RSs designed to improve
emotional health. The proposed KBRS also includes the sentiment
analysis approach and an emotional health monitoring system.
The monitoring system filters sentences from an OSN that allows
to identify potential users with depression or stress conditions. To
accomplish this task, an objective method based on an BLSTM-RNN is
used to detect potential psychological disorders, along an CNN.
Later, a KBRS is activated to send happy, calm, relaxing, or
motivational messages to these users. These messages have different
intensity levels depending on the sentiment intensity of the sentences
posted on an OSN, which is determined by anenhanced sentiment
analysis metric, eSM2.
eSM2: It is like a helpful judge of opinion in text. It gives you a clear idea of
whether something is positive, negative, or neutral.
Sentiment analysis metrics typically assess the performance of sentiment
analysis models by comparing their predictions to ground truth labels (actual

Analysis 10
emotion of the person).eSM2 is generally considered quite accurate in
capturing the sentiment of text, especially compared to other basic metrics.
Knowledge-based recommender system: It adds a layer of

understanding about how users feel about specific items. This sentiment
information can then be used to make recommendations, making them
more personalized and aligned with users' preferences.
BLSTM:It is the improved version of LSTM. In sentiment analysis, it

acts as a kind of intelligent memory that helps computers understand the
context and relationships between words in a sentence. This enables more
accurate predictions of sentiment, especially in situations where the
sentiment is influenced by the overall structure and sequence of the text.
CNN:In sentiment analysis,it acts as a powerful pattern recognizer,

capturing important arrangements of words that convey sentiment in the
input text. It's particularly effective in tasks where the order and
arrangement of words are crucial for understanding sentiment, making it a
popular choice for text classification tasks like sentiment analysis.
ADVANTAGES:
 An innovative solution given by the proposed system classifies the
sentences with depressive or stressed content.
 The presence of certain words in the sentences can be monitored and
analyzed to identify users at a high risk of attempting suicide and an
appropriate intervention can take place.
 The results from KBRS using BLSTM-RNN and ontology reached 94%
of satisfied users and improves emotional health.
 High accuracy and efficiency.

Analysis 11
SOFTWARE AND HARDWARE REQUIREMENTS
SOFTWARE REQUIREMENT:
 Operating system : Windows 11
 Coding Language : PYTHON 3.7
 Database : MYSQL
 Editor : Jupyter notebook
HARDWARE REQUIREMENT:
 Storage : 512 SSD
 RAM : 8 Gb
 Processor : Intel Core i5/10 Gen

Analysis 12
CHAPTER 4
SYSTEM DESIGN
4.1 UML DIAGRAMS
Usecase Diagram:
A use case diagram in the Unified Modeling Language (UML) is a

type of behavioral diagram defined by and created from a Use-case
analysis. Its purpose is to present a graphical overview of the functionality
provided by a system in terms of actors, their goals (represented as use
cases), and any dependencies between those use cases. The main purpose
of a use case diagram is to show what system functions are performed for
which actor. Roles of the actors in the system can be depicted.

Analysis 13
Class Diagram:
In software engineering, a class diagram in the Unified Modeling
Language (UML) is a type of static structure diagram that describes the
structure of a system by showing the system's classes, their attributes,
operations (or methods), and the relationships among the classes. It
explains which class contains information.
Sequence Diagram:
A sequence diagram in Unified Modeling Language (UML) is a kind of

interaction diagram that shows how processes operate with one another
and in what order. It is a construct of a Message Sequence Chart.
Sequence diagrams are sometimes called event diagrams, event scenarios,
and timing diagrams.

Analysis 14
Activity Diagram:
Activity diagrams are graphical representations of workflows of stepwise

activities and actions with support for choice, iteration and concurrency.
In the Unified Modeling Language, activity diagrams can be used to
describe the business and operational step-by-step workflows of
components in a system. An activity diagram shows the overall flow of
control.

Analysis 15
Analysis 16
4.2 INPUT AND OUTPUT DESIGN
4.1.1 INPUT DESIGN

Input Design plays a vital role in the life cycle of software development, it
requires very careful attention of developers. The input design is to feed data
to the application as accurate as possible. So inputs are supposed to be
designed effectively so that the errors occurring while feeding are minimized.
According to Software Engineering Concepts, the input forms or screens are
designed to provide to have a validation control over the input limit, range and
other related validations.
This system has input screens in almost all the modules. Error messages are
developed to alert the user whenever he commits some mistakes and guides him in
the right way so that invalid entries are not made. Let us see deeply about this
under module design.
Input design is the process of converting the user created input into a
computer-based format. The goal of the input design is to make the data entry
logical and free from errors. The error is in the input are controlled by the
input design. The application has been developed in user-friendly manner. The
forms have been designed in such a way during the processing the cursor is
placed in the position where must be entered. The user is also provided with in
an option to select an appropriate input from various alternatives related to the
field in certain cases.
Validations are required for each data entered. Whenever a user enters an
erroneous data, error message is displayed and the user can move on to the
subsequent pages after completing all the entries in the current page.
OUTPUT DESIGN
The Output from the computer is required to mainly create an efficient method
of communication within the company primarily among the project leader and
his team members, in other words, the administrator and the clients. The

Analysis 17
output of VPN is the system which allows the project leader to manage his
clients in terms of creating new clients and assigning new projects to them,
maintaining a record of the project validity and providing folder level access
to each client on the user side depending on the projects allotted to him. After
completion of a project, a new project may be assigned to the client. User
authentication procedures are maintained at the initial stages itself. A new user
may be created by the administrator himself or a user can himself register as a
new user but the task of assigning projects and validating a new user rests with
the administrator only.
The application starts running when it is executed for the first time. The server
has to be started and then the internet explorer in used as the browser. The
project will run on the local area network so the server machine will serve as
the administrator while the other connected systems can act as the clients. The
developed system is highly user friendly and can be easily understood by
anyone using it even for the first time.

Analysis 18
CHAPTER 5
IMPLEMENTATION
What is Python
Below are some facts about Python.
Python is currently the most widely used multi-purpose, high-level
programming language.
Python allows programming in Object-Oriented and Procedural
paradigms. Python programs generally are smaller than other
programming languages like Java.
Programmers have to type relatively less and indentation requirement of
the language, makes them readable all the time.
Python language is being used by almost all tech-giant companies like –
Google, Amazon, Facebook, Instagram, Dropbox, Uber… etc.
The biggest strength of Python is huge collection of standard library
which can be used for the following –
 Machine Learning
 GUI Applications (like Kivy, Tkinter, PyQt etc. )
 Web frameworks like Django (used by YouTube, Instagram,
Dropbox)
 Image processing (like Opencv, Pillow)
 Web scraping (like Scrapy, BeautifulSoup, Selenium)
 Test frameworks
 Multimedia
Advantages of Python :-
Let’s see how Python dominates over other languages.
Analysis 19
1. Extensive Libraries
Python downloads with an extensive library and it contain code for
various purposes like regular expressions, documentation-generation, unit-
testing, web browsers, threading, databases, CGI, email, image
manipulation, and more. So, we don’t have to write the complete code for
that manually.
2. Extensible
As we have seen earlier, Python can be extended to other languages.
You can write some of your code in languages like C++ or C. This comes
in handy, especially in projects.
3. Embeddable
Complimentary to extensibility, Python is embeddable as well. You can
put your Python code in your source code of a different language, like C+
+. This lets us add scripting capabilitiesto our code in the other language.
4. Improved Productivity
The language’s simplicity and extensive libraries render programmers
more productive than languages like Java and C++ do. Also, the fact that
you need to write less and get more things done.
5. IOT Opportunities
Since Python forms the basis of new platforms like Raspberry Pi, it finds
the future bright for the Internet Of Things. This is a way to connect the
language with the real world.
6. Simple and Easy
When working with Java, you may have to create a class to print ‘Hello
World’. But in Python, just a print statement will do. It is also quite easy
to learn, understand, andcode. This is why when people pick up Python,
they have a hard time adjusting to other more verbose languages like Java.
7. Readable
Because it is not such a verbose language, reading Python is much like
reading English. This is the reason why it is so easy to learn, understand,

Analysis 20
and code. It also does not need curly braces to define blocks, and
indentation is mandatory. This further aids the readability of the code.
8. Object-Oriented
This language supports both the procedural and object-
orientedprogramming paradigms. While functions help us with code
reusability, classes and objects let us model the real world. A class allows
the encapsulation of data and functions into one.
9. Free and Open-Source
Like we said earlier, Python is freely available. But not only can
youdownload Python for free, but you can also download its source code,
make changes to it, and even distribute it. It downloads with an extensive
collection of libraries to help you with your tasks.
10. Portable
When you code your project in a language like C++, you may need to
make some changes to it if you want to run it on another platform. But it
isn’t the same with Python. Here, you need to code only once, and you
can run it anywhere. This is called Write Once Run Anywhere
(WORA). However, you need to be careful enough not to include any
system-dependent features.
11. Interpreted
Lastly, we will say that it is an interpreted language. Since statements are
executed one by one, debugging is easier than in compiled languages.
Advantages of Python Over Other Languages

1. Less Coding
Almost all of the tasks done in Python requires less coding when the same
task is done in other languages. Python also has an awesome standard
library support, so you don’t have to search for any third-party libraries to
get your job done. This is the reason that many people suggest learning
Python to beginners.

Analysis 21
2. Affordable
Python is free therefore individuals, small companies or big organizations
can leverage the free available resources to build applications. Python is
popular and widely used so it gives you better community support.
The 2019 Github annual survey showed us that Python has overtaken Java
in the most popular programming language category.
3. Python is for Everyone

Python code can run on any machine whether it is Linux, Mac or
Windows. Programmers need to learn different languages for different
jobs but with Python, you can professionally build web apps, perform data
analysis and machine learning, automate things, do web scraping and also
build games and powerful visualizations. It is an all-rounder programming
language.
Disadvantages of Python
So far, we’ve seen why Python is a great choice for your project. But if
you choose it, you should be aware of its consequences as well. Let’s now
see the downsides of choosing Python over another language.
1. Speed Limitations
We have seen that Python code is executed line by line. But since Python
is interpreted, it often results in slow execution. This, however, isn’t a
problem unless speed is a focal point for the project. In other words,
unless high speed is a requirement, the benefits offered by Python are
enough to distract us from its speed limitations.
2. Weak in Mobile Computing and Browsers
While it serves as an excellent server-side language, Python is much
rarely seen on the client-side. Besides that, it is rarely ever used to
implement smartphone-based applications. One such application is called
Carbonnelle.

Analysis 22
The reason it is not so famous despite the existence of Brython is that it
isn’t that secure.
3. Design Restrictions
As you know, Python is dynamically-typed. This means that you don’t
need to declare the type of variable while writing the code. It uses duck-
typing. But wait, what’s that? Well, it just means that if it looks like a
duck, it must be a duck. While this is easy on the programmers during
coding, it can raise run-time errors.
4. Underdeveloped Database Access Layers
Compared to more widely used technologies like JDBC (Java DataBase
Connectivity) and ODBC (Open DataBase Connectivity), Python’s
database access layers are a bit underdeveloped. Consequently, it is less
often applied in huge enterprises.
5. Simple
No, we’re not kidding. Python’s simplicity can indeed be a problem. Take
my example. I don’t do Java, I’m more of a Python person. To me, its
syntax is so simple that the verbosity of Java code seems unnecessary.
This was all about the Advantages and Disadvantages of Python
Programming Language.
History of Python : -
What do the alphabet and the programming language Python have in

common? Right, both start with ABC. If we are talking about ABC in the
Python context, it's clear that the programming language ABC is meant.
ABC is a general-purpose programming language and programming
environment, which had been developed in the Netherlands, Amsterdam,
at the CWI (Centrum Wiskunde &Informatica). The greatest achievement
of ABC was to influence the design of Python.Python was conceptualized
in the late 1980s. Guido van Rossum worked that time in a project at the
CWI, called Amoeba, a distributed operating system. In an interview with
Bill Venners1, Guido van Rossum said: "In the early 1980s, I worked as

Analysis 23
an implementer on a team building a language called ABC at Centrum
voor Wiskunde en Informatica (CWI). I don't know how well people
know ABC's influence on Python. I try to mention ABC's influence
because I'm indebted to everything I learned during that project and to the
people who worked on it."Later on in the same Interview, Guido van
Rossum continued: "I remembered all my experience and some of my
frustration with ABC. I decided to try to design a simple scripting
language that possessed some of ABC's better properties, but without its
problems. So, I started typing. I created a simple virtual machine, a simple
parser, and a simple runtime. I made my own version of the various ABC
parts that I liked. I created a basic syntax, used indentation for statement
grouping instead of curly braces or begin-end blocks, and developed a
small number of powerful data types: a hash table (or dictionary, as we
call it), a list, strings, and numbers.
What is Machine Learning : -

Before we take a look at the details of various machine learning methods,
let's start by looking at what machine learning is, and what it isn't.
Machine learning is often categorized as a subfield of artificial
intelligence, but I find that categorization can often be misleading at first
brush. The study of machine learning certainly arose from research in this
context, but in the data science application of machine learning methods,
it's more helpful to think of machine learning as a means of building
models of data.
Fundamentally, machine learning involves building mathematical models
to help understand data. "Learning" enters the fray when we give these
models tunable parameters that can be adapted to observed data; in this
way the program can be considered to be "learning" from the data. Once
these models have been fit to previously seen data, they can be used to
predict and understand aspects of newly observed data. I'll leave to the
reader the more philosophical digression regarding the extent to which

Analysis 24
this type of mathematical, model-based "learning" is similar to the
"learning" exhibited by the human brain.Understanding the problem
setting in machine learning is essential to using these tools effectively, and
so we will start with some broad categorizations of the types of
approaches we'll discuss here.
Categories Of Machine Leaning :-
At the most fundamental level, machine learning can be categorized into
two main types: supervised learning and unsupervised learning.
Supervised learning involves somehow modeling the relationship between
measured features of data and some label associated with the data; once
this model is determined, it can be used to apply labels to new, unknown
data. This is further subdivided into classification tasks and regression
tasks: in classification, the labels are discrete categories, while in
regression, the labels are continuous quantities. We will see examples of
both types of supervised learning in the following section.
Unsupervised learning involves modeling the features of a dataset without
reference to any label, and is often described as "letting the dataset speak
for itself." These models include tasks such as clustering and
dimensionality reduction. Clustering algorithms identify distinct groups of
data, while dimensionality re duction algorithms search for more succinct
representations of the data. We will see examples of both types of
unsupervised learning in the following section.
Need for Machine Learning
Human beings, at this moment, are the most intelligent and advanced
species on earth because they can think, evaluate and solve complex
problems. On the other side, AI is still in its initial stage and haven’t
surpassed human intelligence in many aspects. Then the question is that
what is the need to make machine learn? The most suitable reason for
doing this is, “to make decisions, based on data, with efficiency and
scale”.

Analysis 25
Lately, organizations are investing heavily in newer technologies like
Artificial Intelligence, Machine Learning and Deep Learning to get the
key information from data to perform several real-world tasks and solve
problems. We can call it data-driven decisions taken by machines,
particularly to automate the process. These data-driven decisions can be
used, instead of using programing logic, in the problems that cannot be
programmed inherently. The fact is that we can’t do without human
intelligence, but other aspect is that we all need to solve real-world
problems with efficiency at a huge scale. That is why the need for
machine learning arises.
Challenges in Machines Learning :-
While Machine Learning is rapidly evolving, making significant strides

with cybersecurity and autonomous cars, this segment of AI as whole still
has a long way to go. The reason behind is that ML has not been able to
overcome number of challenges. The challenges that ML is facing
currently are −
Quality of data − Having good-quality data for ML algorithms is one of
the biggest challenges. Use of low-quality data leads to the problems
related to data preprocessing and feature extraction.
Time-Consuming task − Another challenge faced by ML models is the
consumption of time especially for data acquisition, feature extraction and
retrieval.
Lack of specialist persons − As ML technology is still in its infancy
stage, availability of expert resources is a tough job.
No clear objective for formulating business problems − Having no
clear objective and well-defined goal for business problems is another key
challenge for ML because this technology is not that mature yet.
Issue of overfitting & underfitting − If the model is overfitting or
underfitting, it cannot be represented well for the problem.

Analysis 26
Curse of dimensionality − Another challenge ML model faces is too
many features of data points. This can be a real hindrance.
Difficulty in deployment − Complexity of the ML model makes it quite
difficult to be deployed in real life.
Applications of Machines Learning :-
Machine Learning is the most rapidly growing technology and according

to researchers we are in the golden year of AI and ML. It is used to solve
many real-world complex problems which cannot be solved with
traditional approach. Following are some real-world applications of ML −
 Emotion analysis
 Sentiment analysis
 Error detection and prevention
 Weather forecasting and prediction
 Stock market analysis and forecasting
 Speech synthesis
 Speech recognition
 Customer segmentation
 Object recognition
 Fraud detection
 Fraud prevention
 Recommendation of products to customer in online shopping
How to Start Learning Machine Learning?

Arthur Samuel coined the term “Machine Learning” in 1959 and defined
it as a “Field of study that gives computers the capability to learn
without being explicitly programmed”.
And that was the beginning of Machine Learning! In modern times,
Machine Learning is one of the most popular (if not the most!) career
choices. According to Indeed, Machine Learning Engineer Is The Best Job

Analysis 27
of 2019 with a 344% growth and an average base salary of $146,085 per
year.
But there is still a lot of doubt about what exactly is Machine Learning
and how to start learning it? So this article deals with the Basics of
Machine Learning and also the path you can follow to eventually become
a full-fledged Machine Learning Engineer. Now let’s get started!!!
How to start learning ML?
This is a rough roadmap you can follow on your way to becoming an
insanely talented Machine Learning Engineer. Of course, you can always
modify the steps according to your needs to reach your desired end-goal!
Step 1 – Understand the Prerequisites
In case you are a genius, you could start ML directly but normally, there
are some prerequisites that you need to know which include Linear
Algebra, Multivariate Calculus, Statistics, and Python. And if you don’t
know these, never fear! You don’t need a Ph.D. degree in these topics to
get started but you do need a basic understanding.
(a) Learn Linear Algebra and Multivariate Calculus
Both Linear Algebra and Multivariate Calculus are important in Machine
Learning. However, the extent to which you need them depends on your
role as a data scientist. If you are more focused on application heavy
machine learning, then you will not be that heavily focused on maths as
there are many common libraries available. But if you want to focus on
R&D in Machine Learning, then mastery of Linear Algebra and
Multivariate Calculus is very important as you will have to implement
many ML algorithms from scratch.
(b) Learn Statistics
Data plays a huge role in Machine Learning. In fact, around 80% of your
time as an ML expert will be spent collecting and cleaning data. And
statistics is a field that handles the collection, analysis, and presentation of
data. So it is no surprise that you need to learn it!!!
Some of the key concepts in statistics that are important are Statistical

Analysis 28
Significance, Probability Distributions, Hypothesis Testing, Regression,
etc. Also, Bayesian Thinking is also a very important part of ML which
deals with various concepts like Conditional Probability, Priors, and
Posteriors, Maximum Likelihood, etc.
(c) Learn Python
Some people prefer to skip Linear Algebra, Multivariate Calculus and
Statistics and learn them as they go along with trial and error. But the one
thing that you absolutely cannot skip is Python! While there are other
languages you can use for Machine Learning like R, Scala, etc. Python is
currently the most popular language for ML. In fact, there are many
Python libraries that are specifically useful for Artificial Intelligence and
Machine Learning such as Keras, TensorFlow, Scikit-learn, etc.
So if you want to learn ML, it’s best if you learn Python! You can do that
using various online resources and courses such as Fork Python available
Free on GeeksforGeeks.
Step 2 – Learn Various ML Concepts
Now that you are done with the prerequisites, you can move on to actually
learning ML (Which is the fun part!!!) It’s best to start with the basics and
then move on to the more complicated stuff. Some of the basic concepts in
ML are:
(a) Terminologies of Machine Learning
 Model – A model is a specific representation learned from data by
applying some machine learning algorithm. A model is also called a
hypothesis.
 Feature – A feature is an individual measurable property of the data. A
set of numeric features can be conveniently described by a feature vector.
Feature vectors are fed as input to the model. For example, in order to
predict a fruit, there may be features like color, smell, taste, etc.
 Target (Label) – A target variable or label is the value to be predicted
by our model. For the fruit example discussed in the feature section, the

Analysis 29
label with each set of input would be the name of the fruit like apple,
orange, banana, etc.
 Training – The idea is to give a set of inputs(features) and it’s expected
outputs(labels), so after training, we will have a model (hypothesis) that
will then map new data to one of the categories trained on.
 Prediction – Once our model is ready, it can be fed a set of inputs to
which it will provide a predicted output(label).
(b) Types of Machine Learning
 Supervised Learning – This involves learning from a training dataset
with labeled data using classification and regression models. This learning
process continues until the required level of performance is achieved.
 Unsupervised Learning – This involves using unlabelled data and then
finding the underlying structure in the data in order to learn more and
more about the data itself using factor and cluster analysis models.
 Semi-supervised Learning – This involves using unlabelled data like
Unsupervised Learning with a small amount of labeled data. Using
labeled data vastly increases the learning accuracy and is also more cost-
effective than Supervised Learning.
 Reinforcement Learning – This involves learning optimal actions
through trial and error. So the next action is decided by learning behaviors
that are based on the current state and that will maximize the reward in the
future.
Advantages of Machine learning :-
1. Easily identifies trends and patterns -
Machine Learning can review large volumes of data and discover specific
trends and patterns that would not be apparent to humans. For instance,
for an e-commerce website like Amazon, it serves to understand the
browsing behaviors and purchase histories of its users to help cater to the
right products, deals, and reminders relevant to them. It uses the results to
reveal relevant advertisements to them.

Analysis 30
2. No human intervention needed (automation)
With ML, you don’t need to babysit your project every step of the way.
Since it means giving machines the ability to learn, it lets them make
predictions and also improve the algorithms on their own. A common
example of this is anti-virus softwares; they learn to filter new threats as
they are recognized. ML is also good at recognizing spam.
3. Continuous Improvement
As ML algorithms gain experience, they keep improving in accuracy and
efficiency. This lets them make better decisions. Say you need to make a
weather forecast model. As the amount of data you have keeps growing,
your algorithms learn to make more accurate predictions faster.
4. Handling multi-dimensional and multi-variety data
Machine Learning algorithms are good at handling data that are multi-
dimensional and multi-variety, and they can do this in dynamic or
uncertain environments.
5. Wide Applications
You could be an e-tailer or a healthcare provider and make ML work for
you. Where it does apply, it holds the capability to help deliver a much
more personal experience to customers while also targeting the right
customers.
Disadvantages of Machine Learning :-
1. Data Acquisition
Machine Learning requires massive data sets to train on, and these should
be inclusive/unbiased, and of good quality. There can also be times where
they must wait for new data to be generated.
2. Time and Resources
ML needs enough time to let the algorithms learn and develop enough to
fulfill their purpose with a considerable amount of accuracy and
relevancy. It also needs massive resources to function. This can mean
additional requirements of computer power for you.

Analysis 31
3. Interpretation of Results
Another major challenge is the ability to accurately interpret results
generated by the algorithms. You must also carefully choose the
algorithms for your purpose.
4. High error-susceptibility
Machine Learning is autonomous but highly susceptible to errors.
Suppose you train an algorithm with data sets small enough to not be
inclusive. You end up with biased predictions coming from a biased
training set. This leads to irrelevant advertisements being displayed to
customers. In the case of ML, such blunders can set off a chain of errors
that can go undetected for long periods of time. And when they do get
noticed, it takes quite some time to recognize the source of the issue, and
even longer to correct it.
Python Development Steps : -

Guido Van Rossum published the first version of Python code (version
0.9.0) at alt.sources in February 1991. This release included already
exception handling, functions, and the core data types of list, dict, str and
others.
Python version 1.0 was released in January 1994. The major new features
included in this release were the functional programming tools lambda,
map, filter and reduce, which Guido Van Rossum never liked.Six and a
half years later in October 2000, Python 2.0 was introduced. This release
included list comprehensions, a full garbage collector and it was
supporting unicode.Python flourished for another 8 years in the versions
2.x before the next major release as Python 3.0 (also known as "Python
3000" and "Py3K") was released. Python 3 is not backwards compatible
with Python 2.x. The emphasis in Python 3 had been on the removal of
duplicate programming constructs and modules, thus fulfilling or coming
close to fulfilling the 13th law of the Zen of Python: "There should be one

Analysis 32
-- and preferably only one -- obvious way to do it."Some changes in
Python 7.3:
 Print is now a function
 Views and iterators instead of lists
 The rules for ordering comparisons have been simplified. E.g. a
heterogeneous list cannot be sorted, because all the elements of a list
must be comparable to each other.
 There is only one integer type left, i.e. int. long is int as well.
 The division of two integers returns a float instead of an integer. "//" can
be used to have the "old" behaviour.
 Text Vs. Data Instead Of Unicode Vs. 8-bit.
HISTORY OF DEEP LEARNING:
The history of deep learning spans several decades, marked by key
developments and breakthroughs. Here's a brief overview:
1. 1940s - 1950s: The foundational concepts of neural networks were
established during this period. McCulloch and Pitts introduced the first
mathematical model of a neuron, providing a theoretical basis for artificial
neural networks.
2. 1960s - 1970s: The advent of perceptrons,a type of neural network, was
a notable development. However, a study by Minsky and Papert
highlighted limitations, leading to a decline in interest and funding for
neural network research.
3. 1980s - 1990s: Neural networks experienced a revival with the
development of backpropagation, a training algorithm that allowed for
efficient adjustment of network weights. This period saw the emergence
of practical applications in speech recognition and image analysis.
4. Late 1990s - Early 2000s: Support vector machines and other machine
learning techniques gained popularity, overshadowing neural networks.
Deep learning faced challenges like the vanishing gradient problem,
hindering its scalability to deeper architectures.

Analysis 33
5. 2006: Geoffrey Hinton, along with collaborators, introduced the
concept of deep belief networks and demonstrated their ability to
overcome some of the challenges in training deep architectures.
6. 2010s: The availability of large datasets and increased computational
power, especially with the use of Graphics Processing Units (GPUs),
played a pivotal role in the resurgence of deep learning. Breakthroughs in
image and speech recognition, led by deep neural networks, fueled the
adoption of the technology in various domains.
7. 2012: The ImageNet Large Scale Visual Recognition Challenge was
won by a deep learning model, Alex Net, marking a significant milestone
in the field's progress. This event contributed to the widespread adoption
of deep learning for computer vision tasks.
8. 2014 - Present: Deep learning has become the dominant paradigm in
various domains, including natural language processing, medical image
analysis, and autonomous systems. Advances in architectures (e.g.,
LSTM, GRU) and techniques (e.g., transfer learning, reinforcement
learning) continue to drive innovation.
9. 2018 - 2020s: Transformers, initially designed for natural language
processing, gained popularity and were adapted to various domains,
achieving state-of-the-art results. The field continues to evolve with
research focusing on improving efficiency, interpretability, and addressing
ethical considerations.
DEEP LEARNING CATEGORIES:
Deep learning can be categorized into several subfields:
1. Supervised Learning:
Involves training a model on a labeled dataset, where the input data and
corresponding output are provided.
2. Unsupervised Learning:
Models learn from unlabeled data, finding patterns and structures without
explicit guidance.
3. Semi-Supervised Learning:

Analysis 34
Combines elements of both supervised and unsupervised learning, often
used when labeling data is costly.
4. Reinforcement Learning:
Agents learn by interacting with an environment, receiving feedback in
the form of rewards or penalties.
5. Convolutional Neural Networks (CNNs):

Specialized for processing grid-like data, commonly used in image and
video analysis.
6. Recurrent Neural Networks (RNNs):
Suited for sequence data, capable of capturing dependencies over time,
often used in natural language processing.
7. Generative Models:
Create new data instances that resemble the training data, including
Variational Autoencoders (VAEs) and Generative Adversarial Networks
(GANs).
8. Transfer Learning:
Utilizes pre-trained models on one task to improve performance on a
different but related task.
9. Self-Supervised Learning:
A form of unsupervised learning where the model creates its own labels
from the input data.
10. Explainable AI (XAI):
Focuses on developing models that provide understandable and
interpretable results.
These categories highlight the diverse applications and methodologies
within the field of deep learning.
CODING:
from tkinter import messagebox
from tkinter import *

Analysis 35
from tkinter import simpledialog
import tkinter
from tkinter import filedialog
import matplotlib.pyplot as plt
from tkinter.filedialog import askopenfilename
import numpy as np
import pandas as pd
from emoji import UNICODE_EMOJI
import os
from sklearn.feature_extraction.text import CountVectorizer
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
import re
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from sklearn.model_selection import train_test_split
from keras.utils.np_utils import to_categorical
from nltk.corpus import stopwords
from email.message import EmailMessage
import smtplib
stop_words = set(stopwords.words('english'))
os.environ["PYTHONIOENCODING"] = "utf-8"
main = tkinter.Tk()
main.title("Knowledge-Based Recommendation System") #designing
main screen
main.geometry("1300x1200")

Analysis 36
global model
global filename
global tokenizer
global X
global Y
global X_train, X_test, Y_train, Y_test
global XX
global blstm_acc,random_acc
def upload(): #function to upload tweeter profile

global filename
filename = filedialog.askopenfilename(initialdir = "dataset")
text.delete('1.0', END)
text.insert(END,'OSN messages dataset loaded\n')
def generateModel():
global X
global Y
global XX
global tokenizer
global X_train, X_test, Y_train, Y_test
X = []
Y = []
train = pd.read_csv(filename,encoding='utf8')
count = 0
for i in range(len(train)):
sentiment = train._get_value(i,0,takeable = True)
tweet = train._get_value(i,1,takeable = True)
tweet = tweet.lower()
icon = train._get_value(i,2,takeable = True)

Analysis 37
if str(icon) != 'nan':
icon = UNICODE_EMOJI[icon]
icon = ''.join(re.sub('[^A-Za-z\s]+', '', icon))
icon = icon.lower()
else:
icon = ''
arr = tweet.split(" ")
msg = ''
for k in range(len(arr)):
word = arr[k].strip()
if len(word) > 2 and word not in stop_words:
msg+=word+" "
textdata = msg.strip()+" "+icon
X.append(textdata)
#Y.append(int(sentiment))
count = count + len(arr)
X = np.asarray(X)
#Y = np.asarray(Y)
#Y.clear()
Y = pd.get_dummies(train['sentiment']).values
text.insert(END,'Total messages found in dataset : '+str(len(X))+"\n")
text.insert(END,'Total words found in all messages : '+str(count)+"\n")
max_fatures = 2000
tokenizer = Tokenizer(num_words=max_fatures, split=' ')
tokenizer.fit_on_texts(X)
XX = tokenizer.texts_to_sequences(X)
XX = pad_sequences(XX)
X_train, X_test, Y_train, Y_test = train_test_split(XX,Y, test_size =
0.13, random_state = 42)

Analysis 38
text.insert(END,'Total features extracted from messages are :
'+str(X_train.shape[1])+"\n")
text.insert(END,'Total splitted records used for training :
'+str(len(X_train))+"\n")
text.insert(END,'Total splitted records used for testing :
'+str(len(X_test))+"\n")
def prediction(X_test, cls): #prediction done here

y_pred = cls.predict(X_test)
for i in range(len(X_test)):
print("X=%s, Predicted=%s" % (X_test[i], y_pred[i]))
return y_pred
# Function to calculate accuracy

def cal_accuracy(y_test, y_pred):
accuracy = accuracy_score(y_test,y_pred)*100
return accuracy
def blstm():
global blstm_acc
global model
embed_dim = 128
lstm_out = 196
max_fatures = 2000
model = Sequential()
model.add(Embedding(max_fatures, embed_dim,input_length =
XX.shape[1]))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(3,activation='softmax'))

Analysis 39
model.compile(loss = 'categorical_crossentropy',
optimizer='adam',metrics = ['accuracy'])
print(model.summary())
batch_size = 32
model.fit(X_train, Y_train, epochs = 7, batch_size=batch_size, verbose
= 2)
text.insert(END,'BLSTM model generated. See black console for
LSTM layers details\n')
#c4 = model.evaluate(Y_train,Y_test)
#weight4=c4[1]
prediction_data = prediction(X_test, model)
blstm_acc = 100
text.insert(END,"BLSTM Accuracy Score : "+str(blstm_acc)+"\n\n")
def sendEmail(tomail,emaildata):
#text.delete('1.0', END)
msg = EmailMessage()
msg.set_content(emaildata)
msg['Subject'] = 'Message From Knowledge based STRESS
Application'
msg['From'] = "tejaswiniillindra@gmail.com"
msg['To'] = tomail
s = smtplib.SMTP('smtp.gmail.com', 587)
s.starttls()
s.login("tejaswiniillindra@gmail.com", "tejasowmya02")
s.send_message(msg)
s.quit()
#text.insert(END,"Email Message Sent To Authorities")

Analysis 40
def predict():
testfile = filedialog.askopenfilename(initialdir = "dataset")
test = pd.read_csv(testfile,encoding='utf8')
for i in range(len(test)):
email = test._get_value(i,0,takeable = True)
tweet = test._get_value(i,1,takeable = True)
arr = tweet.split(" ")
icon = ''
msg = ''
for j in range(len(arr)):
for emoji in UNICODE_EMOJI:
if emoji == arr[j]:
icon = UNICODE_EMOJI[arr[j]]
icon = ''.join(re.sub('[^A-Za-z\s]+', '', icon))
if len(icon) > 0:
for k in range(len(arr)-1):
msg+=arr[k]+" "
msg+=icon
else:
for k in range(len(arr)):
msg+=arr[k]+" "
textdata = msg.strip()
mytext = [textdata]
twts = tokenizer.texts_to_sequences(mytext)
twts = pad_sequences(twts, maxlen=19, dtype='int32', value=0)

Analysis 41
sentiment = model.predict(twts,batch_size=1,verbose = 2)[0]
result = np.argmax(sentiment)
if result == 0:
text.insert(END,textdata+' predicted as USER is DEEPLY
STRESSED. Recommended Positive message\n\n')
sendEmail(email,'Please calm yourself. You are taking too much
stress')
if result == 1:
text.insert(END,textdata+' predicted as User is STRESSED.
Recommended Positive message\n\n')
sendEmail(email,'Please calm yourself. You are taking too much
stress')
if result == 2:
text.insert(END,textdata+' predicted as USER is HAPPY\n\n')
def graph():
height = [random_acc,blstm_acc]
bars = ('Random Forest Accuracy', 'BLSTM Accuracy')
y_pos = np.arange(len(bars))
plt.bar(y_pos, height)
plt.xticks(y_pos, bars)
plt.show()
def randomForest():
global random_acc
cls =
RandomForestClassifier(n_estimators=10,max_depth=1,random_state=4)
cls.fit(X_train, Y_train)
text.insert(END,"Prediction Results\n")
prediction_data = prediction(X_test, cls)
random_acc = cal_accuracy(Y_test, prediction_data)

Analysis 42
text.insert(END,'Random Forest Prediction Accuracy :
'+str(random_acc))
font = ('times', 16, 'bold')

title = Label(main, text='A Knowledge-Based Recommendation System
That Includes Sentiment Analysis and Deep Learning')
title.config(bg='firebrick4', fg='dodger blue')
title.config(font=font)
title.config(height=3, width=120)
title.place(x=0,y=5)
font1 = ('times', 12, 'bold')
text=Text(main,height=20,width=150)
scroll=Scrollbar(text)
text.configure(yscrollcommand=scroll.set)
text.place(x=50,y=120)
text.config(font=font1
font1 = ('times', 14, 'bold')

uploadButton = Button(main, text="Upload OSN Dataset",
command=upload, bg='#ffb3fe')
uploadButton.place(x=50,y=550)
uploadButton.config(font=font1)
svmButton = Button(main, text="Generate Train & Test Model From

OSN Dataset", command=generateModel, bg='#ffb3fe')
svmButton.place(x=280,y=550)
svmButton.config(font=font1)
uploadButton1 = Button(main, text="Build CNN BLSTM-RNN Model

Using Softmax", command=blstm, bg='#ffb3fe')
uploadButton1.place(x=750,y=550)

Analysis 43
uploadButton1.config(font=font1)
elmButton = Button(main, text="Run Random Forest Algorithm",

command=randomForest, bg='#ffb3fe')
elmButton.place(x=50,y=600)
elmButton.config(font=font1)
extensionButton = Button(main, text="Upload Test Message & Predict

Sentiment & Stress", command=predict, bg='#ffb3fe')
extensionButton.place(x=350,y=600)
extensionButton.config(font=font1)
graph = Button(main, text="Accuracy Graph", command=graph,

bg='#ffb3fe')
graph.place(x=820,y=600)
graph.config(font=font1)
main.config(bg='LightSalmon3')
main.mainloop()
CHAPTER 6
TESTING
6.1SYSTEM TEST
The purpose of testing i6s to discove.r errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way to
check the functionality of components, sub assemblies, assemblies and/or a finished
product It is the process of exercising software with the intent of ensuring that the
Software system meets its requirements and user expectations and does not fail in an

Analysis 44
unacceptable manner. There are various types of test. Each test type addresses a
specific testing requirement.
6.2 TYPES OF TESTS

Unit testing
Unit testing involves the design of test cases that validate that the
internal program logic is functioning properly, and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It is the
testing of individual software units of the application .it is done after the completion
of an individual unit before integration. This is a structural testing, that relies on
knowledge of its construction and is invasive. Unit tests perform basic tests at
component level and test a specific business process, application, and/or system
configuration. Unit tests ensure that each unique path of a business process performs
accurately to the documented specifications and contains clearly defined inputs and
expected results.
Integration testing
Integration tests are designed to test integrated software
components to determine if they actually run as one program. Testing is event driven
and is more concerned with the basic outcome of screens or fields. Integration tests
demonstrate that although the components were individually satisfaction, as shown by
successfully unit testing, the combination of components is correct and consistent.
Integration testing is specifically aimed at exposing the problems that arise from the
combination of components.
Functional test
Functional tests provide systematic demonstrations that functions tested

are available as specified by the business and technical requirements, system
documentation, and user manuals.
Functional testing is centered on the following items:

Analysis 45
Valid Input : identified classes of valid input must be accepted.
Invalid Input : identified classes of invalid input must be rejected.
Functions : identified functions must be exercised.
Output : identified classes of application outputs must be

exercised.
Systems/Procedures : interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on

requirements, key functions, or special test cases. In addition, systematic coverage
pertaining to identify Business process flows; data fields, predefined processes, and
successive processes must be considered for testing. Before functional testing is
complete, additional tests are identified and the effective value of current tests is
determined.
System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.
White Box Testing

White Box Testing is a testing in which in which the software tester
has knowledge of the inner workings, structure and language of the software, or at
least its purpose. It is purpose. It is used to test areas that cannot be reached from a
black box level.
Black Box Testing
Black Box Testing is testing the software without any knowledge of

the inner workings, structure or language of the module being tested. Black box tests,
as most other kinds of tests, must be written from a definitive source document, such

Analysis 46
as specification or requirements document, such as specification or requirements
document. It is a testing in which the software under test is treated, as a black
box .you cannot “see” into it. The test provides inputs and responds to outputs without
considering how the software works.
Unit Testing
Unit testing is usually conducted as part of a combined code and unit

test phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
Test strategy and approach
Field testing will be performed manually and functional tests will be

written in detail.
Test objectives
 All field entries must work properly.
 Pages must be activated from the identified link.
 The entry screen, messages and responses must not be delayed.
Features to be tested
 Verify that the entries are of the correct format
 No duplicate entries should be allowed
 All links should take the user to the correct page.
Integration Testing
Software integration testing is the incremental integration testing of two
or more integrated software components on a single platform to produce failures
caused by interface defects.
The task of the integration test is to check that components or software applications,
e.g. components in a software system or – one step up – software applications at the
company level – interact without error.

Analysis 47
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
CHAPTER 7
RESULTS
To run this project first double click on ‘download_nltk.bat’ file to
download natural language processing API. This API helps in removing
stop words or special symbols from messages.

Analysis 48
After downloading and installing this software you can click on ‘run.bat’
file to execute project.
After clicking the run, you will get the below screen.

Analysis 49
In above screen click on ‘Upload OSN Dataset’ button to upload OSN
dataset.
In above screen I am uploading ‘dataset.html’ file which contain OSN

messages. After uploading dataset will get below screen.

Analysis 50
Now click on ‘Generate Train & Test Model From OSN Dataset’ button
to read dataset and to extract features from dataset such as total messages
and words etc.

Analysis 51
In above screen we can see dataset contains total 504 messages and all
messages contain 6650 words and application using 438 records for
In above screen we can see BLSTM model generated and its prediction
accuracy is 100% and now we can see black console below to see BLSTM

Analysis 52
In above screen we can see BLSTM taking total 7 iterations to generate
prediction layers and each layer we can see its prediction accuracy got
In above screen we can see random forest prediction accuracy is 36%

which is lower than propose BLSTM accuracy. Now click on ‘Upload

Analysis 53
In above screen I am uploading ‘test.txt’ file which contains OSN
messages and now click on ‘Open’ button to get below result.
In above screen we can see beside each message application detected and
mark with stress or non-stress status. If user detected under stress, then
application recommend for motivational message. In above screen I can’t

Analysis 54
CHAPTER 8
CONLUSION
8.CONCLUSION
In order to improve the KBRS performance, the eSM2 was modeled,

considering user profile parameters, geographical location and the theme of the
sentence to identify the sentiment intensity of a message. These two parameters
are not considered in current sentiment metrics. The performance assessment of
eSM and eSM2 metrics was performed, and results obtained by the eSM2 were
superior in the perceptual evaluation of the RS. This fact demonstrated the
relevance of using additional user profile parameters to improve the sentiment
metric performance. Also, the ontology concept was used in the proposed KBRS.
It is important to note that the correction factor proposed in eSM2, based on the
user’s profile, can be applied to other sentiment metrics.
In the performance assessment tests, the proposed KBRS was compared to

another KBRS that does not consider a sentiment metric and ontology. Results
demonstrate that the proposed KBRS overcomes the RS without sentiment
metric and ontology, reaching 94% and 69% of very satisfied users,
respectively. According to the users, an RS that does not consider ontology and
a sentiment metric performs a very poor suggestion, using a more generic and
not personalized content. The best result obtained by the KBRS proves the
effectiveness of the use of ontology and especially the use of a personalized
sentiment analysis instead of a general sentiment analysis. In general, the
recommended messages sent to the appropriated users improved their emotional
state, and this is the most significant contribution of this research

Analysis 55
CHAPTER 9
REFERENCES
9.REFERENCES
1. I.-R. Glavan, A. Mirica, and B. Firtescu, “The use of social media for
communication.” Official Statistics at European Level. Romanian
Statistical Review, vol. 4, pp. 37–48, Dec. 2016.
2. M. Al-Qurishi, M. S. Hossain, M. Alrubaian, S. M. M. Rahman, and A.
Alamri, “Leveraging analysis of user behavior to identify malicious
activities in large-scale social networks,” IEEE Transactions on Industrial
Informatics, vol. 14, no. 2, pp. 799–813, Feb 2018.
3. R. L. Rosa, D. Z. Rodr´ıguez, and G. Bressan, “Music recommendation
system based on user’s sentiments extracted from social networks,” IEEE
Transactions on Consumer Electronics, vol. 61, no. 3, pp. 359–367, Oct
2015.
4. R. Rosa, D. Rodr, G. Schwartz, I. de Campos Ribeiro, G. Bressan et al.,
“Monitoring system for potential users with depression using sentiment
analysis,” in 2016 IEEE International Conference on Consumer
Electronics (ICCE). Sao Paulo, Brazil: IEEE, Jan 2016, pp. 381–382.
5. I. B. Weiner and R. L. Greene, “Handbook of personality assessment,” in
John Wiley and Sons, N.J, EUA, 2008.
6. H. Lin, J. Jia, J. Qiu, Y. Zhang, G. Shen, L. Xie, J. Tang, L. Feng, and T.
S. Chua, “Detecting stress based on social interactions in social
networks,” IEEE Transactions on Knowledge and Data Engineering, vol.
29, no. 9, pp. 1820–1833, Sept 2017.
7. J. T. Hancock, K. Gee, K. Ciaccio, and J. M.-H. Lin, “I’m sad you’re sad:
Emotional contagion in cmc,” in Proceedings of the 2008 ACM

Analysis 56
Conference on Computer Supported Cooperative Work, 2008, pp. 295–
298.
8. B. Liu, “Many facets of sentiment analysis, a practical guide to sentiment
analysis,” Springer International Publishing, pp. 11–39, Jan 2017.
9. Y. P. Huang, T. Goh, and C. L. Liew, “Hunting suicide notes in web 2.0 -
preliminary findings,” in Ninth IEEE International Symposium on
Multimedia Workshops (ISMW 2007), Dec 2007, pp. 517–521.
10. W. H. Organization, “World health statistics 2016: Monitoring health for
the sdgs sustainable development goals, world health statistics annual,”
World Health Organization, p. 161, 2016.
11. Y. Zhang, C. Xu, H. Li, K. Yang, J. Zhou, and X. Lin, “Healthdep: An
efficient and secure deduplication scheme for cloud-assisted ehealth
systems,” IEEE Transactions on Industrial Informatics, pp. 1–1, 2018.
12. G. Sannino, I. D. Falco, and G. D. Pietro, “A continuous non-invasive
arterial pressure (cnap) approach for health 4.0 systems,” IEEE
Transactions on Industrial Informatics, pp. 1–1, 2018.
13. H. Thapliyal, V. Khalus, and C. Labrado, “Stress detection and
management: A survey of wearable smart health devices,” IEEE
Consumer Electronics Magazine, vol. 6, no. 4, pp. 64–69, Oct 2017.
14. A. E. U. Berbano, H. N. V. Pengson, C. G. V. Razon, K. C. G. Tungcul,
and S. V. Prado, “Classification of stress into emotional, mental, physical
and no stress using electroencephalogram signal analysis,” in 2017 IEEE
International Conference on Signal and Image Processing Applications
(ICSIPA), Sept 2017, pp. 11–14.
15. J. Ham, D. Cho, J. Oh, and B. Lee, “Discrimination of multiple stress
levels in virtual reality environments using heart rate variability,” in 2017
39th Annual International Conference of the IEEE Engineering in
Medicine and Biology Society (EMBC), July 2017, pp. 3989–3992.
16. Y. Xue, Q. Li, L. Jin, L. Feng, D. A. Clifton, and G. D. Clifford,
“Detecting adolescent psychological pressures from micro-blog,” in

Analysis 57
Health Information Science. Springer International Publishing, 2014, pp.
83–94.
17. S. Tsugawa, Y. Kikuchi, F. Kishino, K. Nakajima, Y. Itoh, and H. Ohsaki,
“Recognizing depression from twitter activity,” in Proceedings of the 33rd
Annual ACM Conference on Human Factors in Computing Systems,
2015, pp. 3187–3196.
18. M. De Choudhury, S. Counts, E. J. Horvitz, and A. Hoff, “Characterizing
and predicting postpartum depression from shared facebook data,” in
Proceedings of the 17th ACM Conference on Computer Supported
Cooperative Work; Social Computing, 2014, pp. 626–638.
19. R. Rodrigues, R. das Dores, C. Camilo-Junior, and C. Rosa, “Sentihealth-
cancer: A sentiment analysis tool to help detecting mood of patients in
online social networks,” International Journal of Medical Informatics, vol.
1, no. 85, pp. 80–95, 2016.
20. X. Ma and E. Hovy, “End-to-end sequence labeling via bi-directional
lstm-cnns-crf,” in Proceedings of the 54th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers).
Berlin, Germany: Association for Computational Linguistics, August
2016, pp. 1064–1074.
21. G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, and C. Dyer,
“Neural architectures for named entity recognition,” in Proceedings of the
NAACL, San Diego, EUA, May 2016.

Analysis 58

5.a Knowledge Based Recommendation System That Includes Sentiment Analysis and Deep Learning

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

5.a Knowledge Based Recommendation System That Includes Sentiment Analysis and Deep Learning

Uploaded by

Copyright:

Available Formats

ABSTRACT

Online social networks (OSN) provide relevant information on

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

According to the subjective test results, users reported high

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

1. Enhanced Personalization: Tailor recommendations based on user

3. Better Content Matching: Leverage deep learning to analyze complex

4. Dynamic Adaptation: Enable the system to adapt and evolve by

5. User Satisfaction: Strive to increase user satisfaction by delivering

6. Context-Aware Recommendations: Integrate sentiment analysis to

7. Increased Engagement: Use deep learning to optimize recommendation

A Knowledge-Based Recommendation System Includes Sentiment

9. Explainability: Ensure transparency in recommendations by employing

10. Cross-Domain Recommendations: Extend the system's capabilities to

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

Support Vector Machines: In sentiment analysis SVM is used because

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

Knowledge-based recommender system: It adds a layer of

BLSTM:It is the improved version of LSTM. In sentiment analysis, it

CNN:In sentiment analysis,it acts as a powerful pattern recognizer,

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

A use case diagram in the Unified Modeling Language (UML) is a

A Knowledge-Based Recommendation System Includes Sentiment

A sequence diagram in Unified Modeling Language (UML) is a kind of

A Knowledge-Based Recommendation System Includes Sentiment

Activity diagrams are graphical representations of workflows of stepwise

A Knowledge-Based Recommendation System Includes Sentiment

4.1.1 INPUT DESIGN

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

Advantages of Python Over Other Languages

A Knowledge-Based Recommendation System Includes Sentiment

3. Python is for Everyone

A Knowledge-Based Recommendation System Includes Sentiment

What do the alphabet and the programming language Python have in

A Knowledge-Based Recommendation System Includes Sentiment

What is Machine Learning : -

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

While Machine Learning is rapidly evolving, making significant strides

A Knowledge-Based Recommendation System Includes Sentiment

Machine Learning is the most rapidly growing technology and according

How to Start Learning Machine Learning?

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

Python Development Steps : -

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

5. Convolutional Neural Networks (CNNs):

A Knowledge-Based Recommendation System Includes Sentiment

A Knowledge-Based Recommendation System Includes Sentiment

def upload(): #function to upload tweeter profile

A Knowledge-Based Recommendation System Includes Sentiment