Sentiment Analysis: Understanding the Challenges of NLP

PROJECT-I REPORT
On
SENTIMENT ANALYSIS
Submitted to MAHARAJA RANJIT SINGH PUNJAB TECHNICAL
UNIVERSITY in partial fulfillment of the requirement for the award of the
degree of
B. TECH
In
COMPUTER SCIENCE & ENGINEERING

Submitted By
SATISH KUMAR JAISWAL, Roll No. 170280571
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

GIANI ZAIL SINGH CAMPUS COLLEGE OF ENGINEERING &
TECHNOLOGY, MRSPTU, BATHINDA-151001
JULY - DECEMBER 2020

ACKNOWLEDGEMENT
We would like to express our special thanks and gratitude to our project guide, Dr. Swati Jindal,
who gave us the golden opportunity to do this wonderful project on Sentiment Analysis. It also
led us to do a lot of research, through which we came to know about many new things
about analysis. We are really thankful for this guidance.
I express my sincere gratitude to Dr. Dinesh Kumar, worthy HOD, and to Er. Naresh Garg and Er.
Manpreet Kaur, Training & Placement In-charge, for providing me an opportunity to undergo
Project-I.
SATISH KUMAR JAISWAL
CANDIDATE’S DECLARATION
I, Satish Kumar Jaiswal, Roll No. 170280571, B.Tech (Semester-VII) of the Giani Zail Singh
Campus College of Engineering & Technology, Bathinda,
hereby declare that the Training Report entitled “SENTIMENT ANALYSIS” is an original
work and that the data provided in the study is authentic to the best of my knowledge. This report
has not been submitted to any other Institute for the award of any other degree.
SATISH KUMAR JAISWAL

(170280571)
Place: BATHINDA
Date:
CONTENTS
1. ABSTRACT
2. INTRODUCTION
3. LITERATURE REVIEW
4. METHODOLOGY OF THE PROJECT
5. INCLUDING TECHNOLOGY
6. SCREENSHOTS OF PROJECT
7. LIMITATION OF PROJECT
8. CONCLUSION
ABSTRACT
Sentiment analysis and opinion mining is the field of study that analyzes people's opinions,
sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active
research areas in natural language processing and is also widely studied in data mining, Web
mining, and text mining. In fact, this research has spread outside of computer science to the
management sciences and social sciences due to its importance to business and society as a
whole. The growing importance of sentiment analysis coincides with the growth of social media
such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social networks. For the first
time in human history, we now have a huge volume of opinionated data recorded in digital form
for analysis. Sentiment analysis systems are being applied in almost every business and social
domain because opinions are central to almost all human activities and are key influencers of our
behaviors. Our beliefs and perceptions of reality, and the choices we make, are largely
conditioned on how others see and evaluate the world. For this reason, when we need to make a
decision, we often seek out the opinions of others.
1.1 Introduction
Natural language processing (NLP) is an area of computer science and artificial
intelligence concerned with the interaction between computers and humans in natural
language. The ultimate goal of NLP is to help computers understand language as well as we
do. It is the driving force behind things like virtual assistants, speech recognition, sentiment
analysis, automatic text summarization, machine translation and much more. In this report,
we cover the basics of natural language processing, dive into some of its techniques and
also look at how NLP has benefited from recent advances in deep learning.
Natural language processing (NLP) is the intersection of computer science, linguistics
and machine learning. The field focuses on communication between computers and humans
in natural language and NLP is all about making computers understand and generate human
language. Applications of NLP techniques include voice assistants like Amazon's Alexa and
Apple's Siri, but also things like machine translation and text-filtering.
NLP has benefited heavily from recent advances in machine learning, especially from deep
learning techniques. The field is divided into three parts:
● Speech Recognition — The translation of spoken language into text.

● Natural Language Understanding — The computer's ability to understand what we
say.
● Natural Language Generation — The generation of natural language by a computer.
Problem of Project
WHY NLP IS DIFFICULT
NLP is a subfield of computer science and machine learning that attempts to derive meaning
from textual data and can help with problems such as sentiment analysis and chatbot
creation.
Human language is special for several reasons. It is specifically constructed to convey the
speaker/writer's meaning. It is a complex system, although little children can learn it pretty
quickly.
Another remarkable thing about human language is that it is all about symbols. According to
Chris Manning, a machine learning professor at Stanford, it is a discrete, symbolic, categorical
signaling system. This means we can convey the same meaning in different ways (e.g., speech,
gesture, or signs). The encoding in the human brain is a continuous pattern of activation, and
the symbols are transmitted via continuous signals of sound and vision.
Understanding human language is considered a difficult task due to its complexity. For
example, there is an infinite number of different ways to arrange words in a sentence. Also,
words can have several meanings and contextual information is necessary to correctly
interpret sentences. Every language is more or less unique and ambiguous. Just take a look at
the following newspaper headline "The Pope’s baby steps on gays." This sentence clearly has
two very different interpretations, which is a pretty good example of the challenges in NLP.
Note that a perfect understanding of language by a computer would result in an AI that can
process all the information that is available on the internet, which in turn would probably
result in artificial general intelligence.
Sentiment analysis is one of the hardest tasks in natural language processing because
even humans struggle to analyze sentiments accurately.
Data scientists are getting better at creating more accurate sentiment classifiers, but
there’s still a long way to go. Let’s take a closer look at some of the main challenges
of machine-based sentiment analysis:
Subjectivity and Tone
There are two types of text: subjective and objective. Objective texts do not contain
explicit sentiments, whereas subjective texts do. Say, for example, you intend to
analyze the sentiment of the following two texts:
The package is nice.
The package is red.
Most people would say that sentiment is positive for the first one and neutral for the
second one, right? Not all predicates (adjectives, verbs, and some nouns) should be
treated the same with respect to how they create sentiment. In the examples
above, nice is more subjective than red.
Context and Polarity
All utterances are uttered at some point in time, in some place, by and to some people,
you get the point. All utterances are uttered in context. Analyzing sentiment without
context gets pretty difficult. However, machines cannot learn about contexts if they are
not mentioned explicitly. One of the problems that arise from context is changes
in polarity. Look at the following responses to a survey:
Everything of it.
Absolutely nothing!
Imagine the responses above come from answers to the question What did you like
about the event? The first response would be positive and the second one would be
negative, right? Now, imagine the responses come from answers to the question What
did you DISlike about the event? The negative in the question will make sentiment
analysis change altogether.
A good deal of preprocessing or postprocessing will be needed if we are to take into
account at least part of the context in which texts were produced. However, how to
preprocess or postprocess data in order to capture the bits of context that will help
analyze sentiment is not straightforward.
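The survey example above can be sketched as a small postprocessing step. This is only an illustration under stated assumptions: the tiny word lexicon, the list of negative question cues, and the function name response_sentiment are all hypothetical, and a real system would need far richer context handling.

```python
# Hypothetical toy lexicon: polarity of a few response words (assumption).
RESPONSE_POLARITY = {"everything": 1, "nothing": -1}

# Hypothetical cues that mark a question as asking about something negative.
NEGATIVE_QUESTION_CUES = ("dislike", "hate", "worst")


def response_sentiment(question, response):
    """Score a short survey response, flipping polarity for negative questions."""
    words = [w.strip(".,!?").lower() for w in response.split()]
    score = sum(RESPONSE_POLARITY.get(w, 0) for w in words)

    # Context step: the same answer means the opposite under a negative question.
    if any(cue in question.lower() for cue in NEGATIVE_QUESTION_CUES):
        score = -score

    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


liked = "What did you like about the event?"
disliked = "What did you DISlike about the event?"
print(response_sentiment(liked, "Absolutely nothing!"))
print(response_sentiment(disliked, "Absolutely nothing!"))
```

The point of the sketch is that the response text alone is not enough; the question has to be passed in as part of the input.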
Irony and Sarcasm
When it comes to irony and sarcasm, people express their negative sentiments using
positive words, which can be difficult for machines to detect without having a thorough
understanding of the context of the situation in which a feeling was expressed.
For example, look at some possible answers to the question, Did you enjoy your
shopping experience with us?
Yeah, sure. So smooth!
Not one, but many!
What sentiment would you assign to the responses above? The first response with an
exclamation mark could be negative, right? The problem is there is no textual cue that
will help a machine learn, or at least question that sentiment since yeah and sure often
belong to positive or neutral texts.
How about the second response? In this context, sentiment is positive, but we’re sure
you can come up with many different contexts in which the same response can
express negative sentiment.
1.2 Semantic analysis
The way we understand what someone has said is an unconscious process relying on
our intuition and knowledge about language itself. In other words, the way we
understand language is heavily based on meaning and context. Computers need a
different approach, however. The word “semantic” is a linguistic term and means
"related to meaning or logic."
Semantic analysis is the process of understanding the meaning and interpretation of

words, signs and sentence structure. This lets computers partly understand natural
language the way humans do. I say partly because semantic analysis is one of the
toughest parts of NLP and it's not fully solved yet.
Speech recognition, for example, has gotten very good and works almost flawlessly,
but we still lack this kind of proficiency in natural language understanding. Your phone
basically understands what you have said, but often can’t do anything with it because
it doesn’t understand the meaning behind it. Also, some of the technologies out there
only make you think they understand the meaning of a text. An approach based on
keywords or statistics or even pure machine learning may be using a matching or
frequency technique for clues as to what the text is “about.” These methods are limited
because they are not looking at the real underlying meaning.
Emojis
According to Guibon et al., there are two types of emojis. Western emojis (e.g. :D) are
encoded in only one or two characters, whereas Eastern emojis (e.g. ¯\_(ツ)_/¯) are
a longer combination of characters of a vertical nature. Emojis play an important role
in the sentiment of texts, particularly in tweets.
You’ll need to pay special attention to the character level, as well as the word level, when
performing sentiment analysis on tweets. A lot of preprocessing might also be needed.
For example, you might want to preprocess social media content and transform both
Western and Eastern emojis into tokens and whitelist them (i.e. always take them as
a feature for classification purposes) in order to help improve sentiment analysis
performance.
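The preprocessing idea above, turning emojis into tokens so that they are always kept as features, might look roughly like the following. The emoticon table and the token names (EMO_LAUGH and so on) are assumptions made for illustration, not a standard mapping.

```python
import re

# Hypothetical mapping from emoticons to feature tokens (assumption).
EMOTICON_TOKENS = {
    ":D": "EMO_LAUGH",
    ":)": "EMO_SMILE",
    ":(": "EMO_SAD",
    "¯\\_(ツ)_/¯": "EMO_SHRUG",
}

# Match longer emoticons first so a short one never splits a long one.
_EMOTICON_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_TOKENS, key=len, reverse=True))
)


def replace_emoticons(text):
    """Replace each known emoticon with its token, padded by spaces."""
    return _EMOTICON_PATTERN.sub(lambda m: f" {EMOTICON_TOKENS[m.group(0)]} ", text)


def tokenize(text):
    """Whitespace tokenization after emoticons have been turned into tokens."""
    return replace_emoticons(text).split()


print(tokenize("Great service :D"))
print(tokenize("Late again ¯\\_(ツ)_/¯"))
```

Because the emoticons become ordinary tokens, a later feature-selection step can whitelist everything that starts with EMO_.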
Defining Neutral
Defining what we mean by neutral is another challenge to tackle in order to perform

accurate sentiment analysis. As in all classification problems, defining your categories
(and, in this case, the neutral tag) is one of the most important parts of the problem.
What you mean by neutral, positive, or negative does matter when you train sentiment
analysis models. Since tagging data requires that tagging criteria be consistent, a good
definition of the problem is a must.
Here are some ideas to help you identify and define neutral texts:
1. Objective texts. So-called objective texts do not contain explicit sentiments, so
you should include those texts in the neutral category.
2. Irrelevant information. If you haven’t preprocessed your data to filter out
irrelevant information, you can tag it neutral. However, be careful! Only do this
if you know how this could affect overall performance. Sometimes, you will be
adding noise to your classifier and performance could get worse.
3. Texts containing wishes. Some wishes, like I wish the product had more
integrations, are generally neutral. However, those including comparisons, like I
wish the product were better, are pretty difficult to categorize.
Human Annotator Accuracy
Sentiment analysis is a tremendously difficult task even for humans. On average, inter-
annotator agreement (a measure of how well two or more human labelers can make
the same annotation decision) is pretty low when it comes to sentiment analysis. And
since machines learn from the data they are fed, sentiment analysis classifiers might
not be as precise as other types of classifiers.
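One common way to quantify inter-annotator agreement is Cohen's kappa, which corrects the raw agreement rate for the agreement two labelers would reach by chance. The report does not say which measure it has in mind, so the choice of kappa here is an assumption, and the label lists are made-up illustration data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labeled at random
    # according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:  # both annotators used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up annotations for four texts (illustration only).
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "pos"]
print(cohen_kappa(annotator_1, annotator_2))
```

A kappa near 1 means the annotators agree far more than chance would predict, while a value near 0 means their agreement is no better than chance, even if the raw agreement rate looks respectable.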
Still, sentiment analysis is worth the effort, even if your sentiment analysis predictions
are wrong from time to time. By using MonkeyLearn’s sentiment analysis model, you
can expect correct predictions about 70-80% of the time you submit your texts for
classification.
If you are new to sentiment analysis, then you’ll quickly notice improvements. For
typical use cases, such as ticket routing, brand monitoring, and VoC analysis, you’ll
save a lot of time and money on tedious manual tasks.
Methodology of the Project

2.1 TECHNIQUES TO UNDERSTAND TEXT
Let's look at some of the most popular techniques used in natural language processing. Note
how some of them are closely intertwined and only serve as subtasks for solving larger
problems.
PARSING
Parsing refers
to the formal analysis of a sentence by a computer into its constituents, which
results in a parse tree showing their syntactic relation to one another in visual
form, which can be used for further processing and understanding.
Below is a parse tree for the sentence "The thief robbed the apartment."
Included is a description of the three different information types conveyed by
the sentence.
The letters directly above the single words show the parts of speech for each word (noun,
verb and determiner). One level higher is some hierarchical grouping of words into phrases.
For example, "the thief" is a noun phrase, "robbed the apartment" is a verb phrase and when
put together the two phrases form a sentence, which is marked one level higher.
But what is actually meant by a noun or verb phrase? Noun phrases are one or more words
that contain a noun and maybe some descriptors, such as determiners or adjectives. The idea is to group nouns
with words that are in relation to them.
A parse tree also provides us with information about the grammatical relationships of the
words due to the structure of their representation. For example, we can see in the structure
that "the thief" is the subject of "robbed."
By structure, I mean that we have the verb ("robbed"), which is marked with a "V" above it
and a "VP" above that, which is linked by an "S" to the subject ("the thief"), which has an "NP"
above it. This is like a template for a subject-verb relationship and there are many others for
other types of relationships.
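The three kinds of information described above (parts of speech, phrase grouping, and grammatical relationships) can be read off a parse tree stored as nested tuples. This is a minimal sketch: the tree is written by hand rather than produced by a parser, and the leaf labels "D" and "N" for determiner and noun are assumed, since the text only names "V", "VP", "NP" and "S".

```python
# Hand-written parse tree for "The thief robbed the apartment."
# Phrases are (label, child, child, ...); leaves are (tag, word).
PARSE_TREE = (
    "S",
    ("NP", ("D", "The"), ("N", "thief")),
    ("VP", ("V", "robbed"), ("NP", ("D", "the"), ("N", "apartment"))),
)


def is_leaf(node):
    """A leaf pairs a part-of-speech tag with a single word."""
    return len(node) == 2 and isinstance(node[1], str)


def leaves(node):
    """Return the (tag, word) leaves under a node, left to right."""
    if is_leaf(node):
        return [node]
    found = []
    for child in node[1:]:
        found.extend(leaves(child))
    return found


def words(node):
    """Join the words covered by a node into one string."""
    return " ".join(word for _, word in leaves(node))


def phrases(node, label):
    """Collect the text of every phrase with the given label."""
    if is_leaf(node):
        return []
    found = [words(node)] if node[0] == label else []
    for child in node[1:]:
        found.extend(phrases(child, label))
    return found


def subject_and_verb(sentence):
    """Apply the S -> NP VP template: the NP is the subject of the VP's verb."""
    _, noun_phrase, verb_phrase = sentence
    verb = next(child for child in verb_phrase[1:] if child[0] == "V")
    return words(noun_phrase), verb[1]


print(leaves(PARSE_TREE))             # parts of speech
print(phrases(PARSE_TREE, "NP"))      # phrase grouping
print(subject_and_verb(PARSE_TREE))   # grammatical relationship
```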
STEMMING
Sentiment analysis (or opinion mining) is a natural language processing technique

used to determine whether data is positive, negative or neutral. Sentiment analysis is
often performed on textual data to help businesses monitor brand and product
sentiment in customer feedback, and understand customer needs.
Stemming is a technique that comes from morphology and information retrieval which is used
in NLP for pre-processing and efficiency purposes. The dictionary defines the verb "stem" as to "originate
in or be caused by."
Basically, stemming is the process of reducing words to their word stem. A "stem" is
the part of a word that remains after the removal of all affixes. For example, the stem
for the word "touched" is "touch." "Touch" is also the stem of "touching," and so on.
You may be asking yourself, why do we even need the stem? Well, the stem is
needed because we're going to encounter different variations of words that actually
have the same stem and the same meaning. For example:
I was taking a ride in the car.

I was riding in the car.
These two sentences mean the exact same thing and the use of the word is
identical.
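To make the idea concrete, here is a toy suffix-stripping stemmer. It is a deliberately simplified sketch loosely inspired by rule-based stemmers, not the Porter algorithm or any library implementation, and the two suffixes and the "add a final e" rule are assumptions chosen to cover the examples in this section.

```python
VOWELS = set("aeiou")


def _has_vowel(text):
    return any(ch in VOWELS for ch in text)


def _is_short_cvc(stem):
    """True for three-letter consonant-vowel-consonant stems such as 'rid'."""
    return (
        len(stem) == 3
        and stem[0] not in VOWELS
        and stem[1] in VOWELS
        and stem[2] not in VOWELS
        and stem[2] not in "wxy"
    )


def toy_stem(word):
    """Strip 'ing' or 'ed' and restore a final 'e' on short stems."""
    word = word.lower()
    for suffix in ("ing", "ed"):
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # Only strip when a plausible stem remains ("sing" stays "sing").
            if len(stem) >= 3 and _has_vowel(stem):
                if _is_short_cvc(stem):
                    stem += "e"  # "rid" -> "ride", "tak" -> "take"
                return stem
    return word


for w in ("touched", "touching", "riding", "ride"):
    print(w, "->", toy_stem(w))
```

With this sketch, "riding" and "ride" reduce to the same stem, which is what lets a later step treat the two example sentences as using the same word.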
3.1 TEXT SEGMENTATION

Text segmentation in NLP is the process of transforming text into meaningful units like words,
sentences, different topics, the underlying intent and more. Mostly, the text is segmented
into its component words, which can be a difficult task, depending on the language. This is
again due to the complexity of human language. For example, it works relatively well in
English to separate words by spaces, except for words like "ice box" that belong together but
are separated by a space. The problem is that people sometimes also write it as "ice-box."
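A short sketch of word segmentation shows where the difficulty comes from. The regular expression and the one-entry compound table below are assumptions for illustration; real tokenizers use much larger rule sets or learned models.

```python
import re

# Words (optionally hyphenated) or single punctuation marks.
TOKEN_PATTERN = re.compile(r"\w+(?:-\w+)*|[^\w\s]")

# Hypothetical table of compounds that should be treated as one word.
COMPOUNDS = {("ice", "box"): "icebox"}


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


def merge_compounds(tokens):
    """Map 'ice box' and 'ice-box' onto the single token 'icebox'."""
    merged, i = [], 0
    while i < len(tokens):
        hyphen_parts = tuple(tokens[i].lower().split("-"))
        next_pair = tuple(t.lower() for t in tokens[i:i + 2])
        if hyphen_parts in COMPOUNDS:
            merged.append(COMPOUNDS[hyphen_parts])
            i += 1
        elif next_pair in COMPOUNDS:
            merged.append(COMPOUNDS[next_pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


print(tokenize("The ice box is new."))
print(merge_compounds(tokenize("The ice box is new.")))
print(merge_compounds(tokenize("The ice-box is new.")))
```

Splitting on spaces alone yields two tokens for the spaced spelling, so the variants only line up after the extra merging step.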
3.2 Named Entity Recognition

Named entity recognition (NER) concentrates on determining which items in a text (i.e. the
"named entities") can be located and classified into pre-defined categories. These categories
can range from the names of persons, organizations and locations to monetary values and
percentages.
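As a rough illustration, named entities can be located with a lookup list (a gazetteer) for names and regular expressions for monetary values and percentages. This is a minimal sketch and not how modern NER systems work, since they are usually trained models; the gazetteer entries and the patterns are assumptions.

```python
import re

# Hypothetical gazetteer of known names and their categories (assumption).
GAZETTEER = {
    "United Airlines": "ORGANIZATION",
    "Expedia": "ORGANIZATION",
    "Brazil": "LOCATION",
}

# Simple patterns for values such as $1,350 and 156%.
VALUE_PATTERNS = [
    ("MONEY", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
]


def find_entities(text):
    """Return (entity text, category) pairs in order of appearance."""
    hits = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            hits.append((match.start(), match.group(0), label))
    for label, pattern in VALUE_PATTERNS:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), label))
    return [(entity, label) for _, entity, label in sorted(hits)]


print(find_entities("In Brazil, federal public spending rose by 156% from 2007 to 2015."))
```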
3.3 Relationship Extraction

Relationship extraction takes the named entities of NER and tries to identify the semantic
relationships between them. This could mean, for example, finding out who is married to
whom, that a person works for a specific company and so on. This problem can also be
transformed into a classification problem and a machine learning model can be trained for
every relationship type.
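The simplest stand-in for a trained relationship classifier is a set of text patterns, one per relationship type. The sketch below uses that pattern-based shortcut rather than the machine learning approach described above; the name pattern, the relation labels, and the example people and company are all invented for illustration.

```python
import re

# Very rough pattern for a capitalized name of one or more words (assumption).
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

# One hypothetical pattern per relationship type.
RELATION_PATTERNS = [
    ("married_to", re.compile(rf"({NAME}) is married to ({NAME})")),
    ("works_for", re.compile(rf"({NAME}) works for ({NAME})")),
]


def extract_relations(text):
    """Return (entity, relation, entity) triples found by the patterns."""
    triples = []
    for relation, pattern in RELATION_PATTERNS:
        for match in pattern.finditer(text):
            triples.append((match.group(1), relation, match.group(2)))
    return triples


# Invented example sentence (not real data).
sample = "Alice Smith is married to Bob Jones. Bob Jones works for Acme Corp."
print(extract_relations(sample))
```

Patterns like these miss any phrasing they were not written for, which is one reason a trained classifier per relationship type is usually preferred.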
3.4 Sentiment Analysis

With sentiment analysis we want to determine the attitude (i.e. the sentiment) of a speaker
or writer with respect to a document, interaction or event. Therefore it is a natural language
processing problem where text needs to be understood in order to predict the underlying
intent. The sentiment is mostly categorized into positive, negative and neutral categories.
With the use of sentiment analysis, for example, we may want to predict a customer's opinion
and attitude about a product based on a review they wrote. Sentiment analysis is widely
applied to reviews, surveys, documents and much more.
Sentiment analysis is the process of determining whether a piece of writing is positive,

negative or neutral, and then assigning a weighted sentiment score to each entity, theme,
topic, and category within the document. This is an incredibly complex task that varies wildly
with context. For example, take the phrase “sick burn.” In the context of video games, this
might actually be a positive statement.
Creating a set of NLP rules to account for every possible sentiment score for every possible
word in every possible context would be impossible. But by training a machine learning
model on pre-scored data, it can learn to understand what “sick burn” means in the context
of video gaming, versus in the context of healthcare. Unsurprisingly, each language requires
its own sentiment classification model.
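To illustrate how a model trained on pre-scored data can pick up context, the sketch below trains one tiny Naive Bayes classifier per domain. Naive Bayes is chosen here only because it fits in a few lines; the report does not name a specific algorithm, and the eight training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count labels and words from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented pre-scored data for two domains (illustration only).
gaming_model = train_naive_bayes([
    ("sick burn dude", "positive"),
    ("that was a sick play", "positive"),
    ("laggy boring match", "negative"),
    ("terrible laggy servers", "negative"),
])
healthcare_model = train_naive_bayes([
    ("patient has a severe burn", "negative"),
    ("feeling sick and weak", "negative"),
    ("recovered well and healthy", "positive"),
    ("treatment worked great", "positive"),
])

print(predict(gaming_model, "sick burn"))
print(predict(healthcare_model, "sick burn"))
```

The same two words receive opposite labels because each model learned its word statistics from a different domain.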
Sentiment Analysis Use Cases & Applications

Sentiment analysis has a wide range of applications and can be applied to almost any industry,
from finance and retail to hospitality and technology. Below, we’ve listed some of the
most popular ways that sentiment analysis is being used in business:
1. Social Media Monitoring

2. Brand Monitoring
3. Voice of customer (VoC)
4. Customer Service
5. Market Research
Social Media Monitoring
Sentiment analysis is used in social media monitoring, allowing businesses to gain

insights about how customers feel about certain topics, and detect urgent issues in
real time before they spiral out of control.
On the fateful evening of April 9th, 2017, United Airlines forcibly removed a
passenger from an overbooked flight. The nightmare-ish incident was filmed by other
passengers on their smartphones and posted immediately. One of the videos, posted
to Facebook, was shared more than 87,000 times and viewed 6.8 million times by 6pm
on Monday, just 24 hours later.
The fiasco was only magnified by the company’s dismissive response. On Monday
afternoon, United’s CEO tweeted a statement apologizing for “having to re-
accommodate customers.”
This is exactly the kind of PR catastrophe you can avoid with sentiment analysis. It’s
an example of why it’s important to care, not only about if people are talking about
your brand, but how they’re talking about it. More mentions don't equal positive
mentions.
Brands of all shapes and sizes have meaningful interactions with customers, leads,
even their competition, all across social media. By monitoring these conversations you
can understand customer sentiment in real time and over time, so you can detect
disgruntled customers immediately and respond as soon as possible.
Most marketing departments are already tuned into online mentions as far as volume
– they measure more chatter as more brand awareness. But businesses need to look
beyond the numbers for deeper insights.
Brand Monitoring
Brands have a wealth of information available not only on social media, but also across
the internet, on news sites, blogs, forums, product reviews, and more. Again, we can
look at not just the volume of mentions, but the individual and overall quality of those
mentions.
In our United Airlines example, for instance, the flare-up started on the social media
accounts of just a few passengers. Within hours, it was picked up by news sites and
spread like wildfire across the US, then to China and Vietnam, as United was accused
of racial profiling against a passenger of Chinese-Vietnamese descent. In China, the
incident became the number one trending topic on Weibo, a microblogging site with
almost 500 million users.
And again, this is all happening within mere hours of the incident.
Brand monitoring offers a wealth of insights from conversations happening about your
brand from all over the internet. Analyze news articles, blogs, forums, and more
to gauge brand sentiment, and target certain demographics or regions, as desired.
Automatically categorize the urgency of all brand mentions and route them instantly to
designated team members.
Get an understanding of customer feelings and opinions, beyond mere numbers and
statistics. Understand how your brand image evolves over time, and compare it to that
of your competition. You can tune into a specific point in time to follow product
releases, marketing campaigns, IPO filings, etc., and compare them to past events.
Real-time sentiment analysis allows you to identify potential PR crises and take
immediate action before they become serious issues. Or identify positive comments
and respond directly, to use them to your benefit.
Example: Expedia Canada
Around Christmas time, Expedia Canada ran a classic “escape winter” marketing
campaign. All was well, except for the screeching violin they chose as background
music. Understandably, people took to social media, blogs, and forums. Expedia
noticed right away and removed the ad.
Then, they created a series of follow-up spin-off videos: one showed the original actor
smashing the violin; another invited a real negative Twitter user to rip the violin out of
the actor’s hands on screen. Though their original campaign was a flop, Expedia were
able to redeem themselves by listening to their customers and responding.
Sentiment analysis allows you to automatically monitor all chatter around your brand
and detect and address this type of potentially-explosive scenario while you still have
time to defuse it.
Voice of Customer (VoC)
Social media and brand monitoring offer us immediate, unfiltered, and invaluable
information on customer sentiment, but you can also put this analysis to work on
surveys and customer support interactions.
Net Promoter Score (NPS) surveys are one of the most popular ways for businesses
to gain feedback with the simple question: Would you recommend this company,
product, and/or service to a friend or family member? These result in a single score
on a number scale.
Businesses use these scores to identify customers as promoters, passives, or

detractors. The goal is to identify overall customer experience, and find ways to
elevate all customers to “promoter” level, where they, theoretically, will buy more, stay
longer, and refer other customers.
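The grouping into promoters, passives, and detractors can be sketched as follows. The thresholds and the formula follow the usual NPS convention on a 0 to 10 scale (9 to 10 promoter, 7 to 8 passive, 0 to 6 detractor, score equals percentage of promoters minus percentage of detractors); these details are not stated in this report, and the survey scores are made up.

```python
def classify_respondent(score):
    """Map a 0-10 survey answer to the conventional NPS group."""
    if not 0 <= score <= 10:
        raise ValueError("NPS answers are expected on a 0-10 scale")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(scores):
    """Percentage of promoters minus percentage of detractors."""
    if not scores:
        raise ValueError("at least one survey answer is required")
    groups = [classify_respondent(s) for s in scores]
    promoters = groups.count("promoter")
    detractors = groups.count("detractor")
    return 100 * (promoters - detractors) / len(scores)


# Made-up survey answers (illustration only).
answers = [10, 9, 8, 7, 6, 3, 10, 9, 5, 8]
print(net_promoter_score(answers))
```

The single number hides why detractors are unhappy, which is where sentiment analysis of the open-ended follow-up answers comes in.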
Numerical (quantitative) survey data is easily aggregated and assessed. But the next
question in NPS surveys, asking why survey participants left the score they did, seeks
open-ended responses, or qualitative data.
Open-ended survey responses were previously much more difficult to analyze, but
with sentiment analysis these texts can be classified into positive and negative (and
everywhere in between) offering further insights into the Voice of Customer (VoC).
Sentiment analysis can be used on any kind of survey – quantitative and qualitative –
and on customer support interactions, to understand the emotions and opinions of
your customers. Tracking customer sentiment over time adds depth to help
understand why NPS scores or sentiment toward individual aspects of your business
may have changed.
You can use it on incoming surveys and support tickets to detect customers who are
‘strongly negative’ and target them immediately to improve their service. Zero in on
certain demographics to understand what works best and how you can improve.
Real-time analysis allows you to see shifts in VoC right away and understand the
nuances of the customer experience over time beyond statistics and percentages.
Example: McKinsey City Voices project
In Brazil, federal public spending rose by 156% from 2007 to 2015, while satisfaction
with public services steadily decreased. Unhappy with this counterproductive
progress, the Urban Planning Department recruited McKinsey to help them focus on
user experience, or “citizen journeys,” when delivering services. This citizen-centric
style of governance has led to the rise of what we call Smart Cities.
McKinsey developed a tool called City Voices, which conducts citizen surveys across
more than 150 metrics, and then runs sentiment analysis to help leaders understand
how constituents live and what they need, in order to better inform public policy. By
using this tool, the Brazilian government was able to uncover the most urgent needs
– a safer bus system, for instance – and improve them first.
If this can be successful on a national scale, imagine what it can do for your company.
Customer Service
We already looked at how we can use sentiment analysis in terms of the broader VoC,
so now we’ll dial in on customer service teams.
We all know the drill: stellar customer experiences mean a higher rate of returning
customers. Leading companies know that how they deliver is just as important as, if not more
important than, what they deliver. Customers expect their experience with companies to
be immediate, intuitive, personal, and hassle-free. If not, they’ll leave and do business
elsewhere. Did you know that one in three customers will leave a brand after just one
bad experience?
You can use sentiment analysis and text classification to automatically organize
incoming support queries by topic and urgency to route them to the correct department
and make sure the most urgent are handled right away.
Analyze customer support interactions to ensure your employees are following

appropriate protocol. Increase efficiency, so customers aren’t left waiting for support.
Decrease churn rates; after all it’s less hassle to keep customers than acquire new
ones.
4.1 DEEP LEARNING AND NLP
Central to deep learning and natural language is "word meaning," where a word and
especially its meaning are represented as a vector of real numbers. With these vectors that
represent words, we are placing words in a high-dimensional space. The interesting thing
about this is that the words, which are represented by vectors, will act as a semantic space.
This simply means the words that are similar and have a similar meaning tend to cluster
together in this high-dimensional vector space. You can see a visual representation of word
meaning below:
You can find out what a group of clustered words means by doing principal component analysis
(PCA) or dimensionality reduction with t-SNE, but this can sometimes be misleading because
these methods oversimplify and discard a lot of information. It's a good way to get started (like
logistic or linear regression in data science), but it isn’t cutting edge and it is possible to do
much better.
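The idea that words with similar meaning sit close together in vector space can be checked with cosine similarity. The three-dimensional vectors below are invented purely for illustration; real word embeddings are learned from large corpora and typically have hundreds of dimensions.

```python
import math

# Hypothetical 3-dimensional word vectors (assumption, not learned embeddings).
WORD_VECTORS = {
    "good": [0.9, 0.8, 0.1],
    "great": [0.8, 0.9, 0.2],
    "terrible": [-0.8, -0.7, 0.1],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word):
    """Return the other word whose vector points in the most similar direction."""
    others = (w for w in WORD_VECTORS if w != word)
    return max(
        others,
        key=lambda w: cosine_similarity(WORD_VECTORS[word], WORD_VECTORS[w]),
    )


print(most_similar("good"))
print(cosine_similarity(WORD_VECTORS["good"], WORD_VECTORS["terrible"]))
```

Cosine similarity works on the full vectors, so it avoids the information loss that comes with projecting down to two dimensions for a plot.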
We can also think of parts of words as vectors which represent their meaning. Imagine the
word "undesirability." Using a morphological approach, which involves the different parts a
word has, we would think of it as being made out of morphemes (word parts) like this: "Un +
desire + able + ity." Every morpheme gets its own vector. From this we can build a neural
network that can compose the meaning of a larger unit, which in turn is made up of all of the
morphemes.
Deep learning can also make sense of the structure of sentences with syntactic parsers.
Google uses dependency parsing techniques like this, although in a more complex and larger
manner, with "Parsey McParseface" and "SyntaxNet."
By knowing the structure of sentences, we can start trying to understand the meaning of
sentences. We start off with the meaning of words being vectors but we can also do this with
whole phrases and sentences, where the meaning is also represented as vectors. And if we
want to know the relationship of or between sentences, we train a neural network to make
those decisions for us.
Deep learning is also good for sentiment analysis. Take this movie
review, for example: "This movie does not care about cleverness, wit or any other kind of
intelligent humor." A traditional approach would have fallen into the trap of thinking this is a
positive review, because "cleverness, wit or any other kind of intelligent humor" sounds like a
positive intent, but a neural network would have recognized its real meaning. Other
applications are chatbots, machine translation, Siri, Google Inbox suggested replies and so
on.
There have also been huge advancements in machine translation through the rise of
recurrent neural networks, about which I also wrote a blog post.
Including Technology
5.1 Spyder: A powerful weapon for Machine Learning in Python
First of all, you need to install the Anaconda distribution, which can be
downloaded from https://www.anaconda.com/download/ (for
Windows users only).
The installation is pretty simple: just keep clicking Next and agree to the terms and
conditions. The reason for installing Anaconda is that it comes with a lot of
preinstalled packages, and Spyder is one of them. After installing the software, just
click on the Anaconda icon on the desktop, or go to the search option in Windows
10 and type in Anaconda Navigator. Ubuntu users can install Anaconda
using the terminal. When you open the Navigator you will see the Anaconda GUI,
which looks like this:
Screenshot-0
From here just click on the launch button below the Spyder and a new Spyder
GUI will be opened in a separate window:
Screenshot-1
As you can see, by default a new .py file named untitled2.py has been
created. This is the file in which you will be writing your Python
code.
Here, I highlight some of the important basic features
and explain them:
Screenshot-2
The portion marked in sky blue is used to set the working directory. In the
previous article (Linear Regression in Machine Learning) I mentioned in my code
that I had stored the source code and the .csv file in the same folder. After you
save those two files in the same folder, go to this set-directory option and
select the folder in which you stored them.
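Setting the directory matters because a bare file name in your code is looked up in the current working directory. The following is a minimal sketch of that behaviour using only the standard library; the temporary folder and the file name data.csv are hypothetical stand-ins for your own project folder and .csv file.

```python
import csv
import os
import tempfile

# Hypothetical folder and file name; in practice this is the folder that
# holds both your script and the .csv file.
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "data.csv"), "w", newline="") as f:
    csv.writer(f).writerows([["x", "y"], [1, 2], [3, 4]])

# Change the working directory from code (roughly what selecting the
# folder in Spyder does for the console).
os.chdir(project_dir)

# A bare file name is now resolved against the working directory.
with open("data.csv", newline="") as f:
    rows = list(csv.reader(f))

print(os.getcwd())
print(rows)
```

If the working directory pointed somewhere else, the second open() call would raise FileNotFoundError, which is the usual symptom of forgetting this step.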
The portion marked in orange is the variable explorer, which shows information
about all the variables that we have created. After selecting your code using
Ctrl+A and running it using Shift+Enter, click on this option and you will
see the following:
Screenshot-3
On the upper right of this screen you will see a box listing some names under
the Name column, such as x_test and x_train. These are the variables I have
used; under the Type column you'll see their data types. Further right you
can see their size as well as the values stored.
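As a rough illustration of where variables such as x_train and x_test come from, here is a small sketch of a train/test split written with the standard library only. The toy data and the 80/20 ratio are assumptions for this example and are not taken from the project code.

```python
import random

random.seed(0)  # fixed seed so the split is repeatable

# Hypothetical toy data: 10 samples with one feature each.
x = [[i] for i in range(10)]
y = [2 * i + 1 for i in range(10)]

indices = list(range(len(x)))
random.shuffle(indices)
cut = int(0.8 * len(indices))  # 80% train, 20% test (an assumed ratio)

x_train = [x[i] for i in indices[:cut]]
x_test = [x[i] for i in indices[cut:]]
y_train = [y[i] for i in indices[:cut]]
y_test = [y[i] for i in indices[cut:]]

# Roughly the Name / Type / Size columns of the variable explorer.
for name, value in [("x_train", x_train), ("x_test", x_test),
                    ("y_train", y_train), ("y_test", y_test)]:
    print(f"{name:8} {type(value).__name__:5} {len(value)}")
```

After running a script like this in Spyder, the same four names would show up in the variable explorer along with their type and size.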
If you go back to screenshot-2 you will see another portion marked in dark blue.
This area is known as the file explorer, and its main purpose is to select
files and load them into Spyder. It also lets you have
a glimpse of the files present in the directory that you have selected.
Last but not least, you'll see the IPython console option marked in
black in screenshot-2. IPython is a command shell for interactive
computing in multiple programming languages, originally developed for
the Python programming language, that offers introspection, rich media, shell
syntax, tab completion, and history.
These are some of the basic features that Spyder offers; there are many
more, so install Anaconda and see for yourself.
SCREENSHOTS OF THE PROJECT
LIMITATION OF PROJECT
Sentiment analysis tools can identify and analyse many pieces of text
automatically and quickly. But computer programs have problems recognizing
things like sarcasm and irony, negations, jokes, and exaggerations: the sorts
of things a person would have little trouble identifying. Failing to recognize
these can skew the results.
"Disappointed" may be classified as a negative word for the purposes of
sentiment analysis, but within the phrase "I wasn't disappointed" it should be
classified as positive. We would find it easy to recognize as sarcasm the
statement "I'm really loving the enormous pool at my hotel!" if this statement is
accompanied by a photo of a tiny swimming pool, whereas an automated
sentiment analysis tool probably would not, and would most likely classify it as
an example of positive sentiment.
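The two cases above can be sketched with a small negation-aware scorer. The lexicon, the list of negators and the two-word window are all hypothetical choices made for this illustration; real tools use far larger resources.

```python
# Hypothetical mini-lexicon and negator list, for illustration only.
LEXICON = {"disappointed": -1, "loving": 1, "great": 1, "terrible": -1}
NEGATORS = {"not", "never", "no", "wasn't", "isn't", "don't", "didn't"}

def negation_aware_score(text, window=2):
    """Sum lexicon scores, flipping the sign of a word that appears
    within `window` tokens after a negator."""
    tokens = [t.strip('.,!?"').lower() for t in text.split()]
    score = 0
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            preceding = tokens[max(0, i - window):i]
            sign = -1 if any(p in NEGATORS for p in preceding) else 1
            score += sign * LEXICON[tok]
    return score

print(negation_aware_score("I was disappointed"))
print(negation_aware_score("I wasn't disappointed"))
print(negation_aware_score("I'm really loving the enormous pool at my hotel!"))
```

The negated phrase is now scored as positive, but the sarcastic hotel sentence is still scored as positive too, because nothing in the text itself reveals that the pool is tiny.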
With short sentences and pieces of text, such as those you find on
Twitter and sometimes on Facebook, there might not be enough
context for a reliable sentiment analysis. However, in general, Twitter has a
reputation for being a good source of information for sentiment analysis, and
with the increased character limit for tweets it is likely to become even more
useful. So, automated sentiment analysis tools do a good job of analysing
text for opinion and attitude, but they're not perfect.
When you're using a tool like Typely to analyse your text to see if it conveys the
sentiment you want for your readers/audience, combine the results it gives you
with your human judgement to identify anything the tool may not be able to
easily determine.
Typely highlights phrases in your text by positive and negative sentiment,
making it easy for you to see where your document is either expressing
exactly the sentiments you want it to, or where you may need to make some
changes.
Conclusions and Future Scope
Conclusions
The field of sentiment analysis is an exciting new research direction due to the large number of
real-world applications where discovering people’s opinions is important for better
decision-making. The development of techniques for document-level sentiment analysis is one of
the significant components of this area. Recently, people have started expressing their
opinions on the Web, which has increased the need to analyze opinionated online content for
various real-world applications. A lot of research on detecting sentiment from text is present
in the literature. Still, there is huge scope for improving these existing
sentiment analysis models.
Sentiment analysis or opinion mining is a field of study that analyzes people’s sentiments,
attitudes, or emotions towards certain entities. This project tackles a fundamental problem
of sentiment analysis, sentiment polarity categorization. Online product reviews from
Amazon.com are selected as data used for this study. A sentiment polarity categorization
process has been proposed along with detailed descriptions of each step. Experiments for
both sentence-level categorization and review-level categorization have been performed.
The future scope of sentiment analysis
Sentiment analysis is a useful tool for any organization or group for which public
sentiment or attitude towards them is important for their success - whichever way that
success is defined.
On social media, blogs, and online forums millions of people are busily discussing and
reviewing businesses, companies, and organizations. And those opinions are being
‘listened to’ and analysed.
Those being discussed are making use of this enormous amount of data by using
computer programs that don’t just locate all mentions of their products, services, or
business, but also determine the emotions and attitudes behind the words being used.
The results from sentiment analysis help businesses understand the conversations
and discussions taking place about them, and help them react and take action
accordingly.
They can quickly identify any negative sentiments being expressed, and turn poor
customer experiences into very good ones.
They can create better products and services, and they can formulate the marketing
messages they send out according to the sentiments being expressed by their target
audience or customers.
All of which adds up to increased sales and revenue.
By listening to and analysing comments on Facebook and Twitter, local government
departments can gauge public sentiment towards their department and the services
they provide, and use the results to improve services such as parking and leisure
facilities, local policing, and the condition of roads.
Universities can use sentiment analysis to analyze student feedback and comments
garnered either from their own surveys, or from online sources such as social media.
They can then use the results to identify and address any areas of student
dissatisfaction, as well as identify and build on those areas where students are
expressing positive sentiments.
And by analysing the sentiment behind customer reviews on sites like TripAdvisor and
Yelp, hotels and restaurants can not only manage their reputations by improving the
services offered, but can also gauge the general customer attitude to their business
or brand.
Businesses can compare their results with those of their competitors to better
understand people’s attitude to their business. They can identify where they may be
excelling, or identify where there’s room for improvement compared to the competition.
They can also conduct market research into general sentiment around key issues,
topics, products, and services, before developing and launching their own new
services, products or features.