AI Coursework 1 (20%)
I confirm that I understand my coursework needs to be submitted online via Google Classroom under the relevant module page
before the deadline in order for my assignment to be accepted and marked. I am fully aware that late submissions will be treated
as non-submission and a mark of zero will be awarded.
Table of Contents
1. Introduction
2. Background
2.2. Review and analysis of existing work in the problem domain
3. Solution
4. Conclusion
References
Table of Figures
Figure 1: Bayes Theorem
Figure 2: Flowchart
1. Introduction
With the rapid advancement of technology, new technologies emerge day by day. One of the trending
topics in computer science is Artificial Intelligence (AI), which is creating a revolution by
making machines intelligent. It currently spans a variety of subfields, ranging from general to
specific, such as self-driving cars, playing chess, proving theorems, playing music and painting
(javaTpoint, 2020).
Artificial Intelligence is a branch of computer science mainly concerned with the automation of
intelligent behavior. AI is a machine’s capacity to understand its environment and take actions
that resemble human behavior and are highly likely to succeed. It is not a system in itself but is
applied to understand and address challenges within a system (Sharma, 2018).
The aim of AI is to improve computer functions related to human knowledge, i.e. learning and
problem-solving, and thereby make human life easier. It is implemented to make computers
intelligent (Selvamanikkam, 2018). Insight gathering and task automation would be impossible
without strategically applying AI to certain processes. By parsing through the mountains of data
created by humans, AI systems perform intelligent searches, interpret both text and images to
discover patterns in complex data, and then act on what they learn (Otte, 2020).
Another technology within AI is Natural Language Processing (NLP). NLP is a fundamental feature of
AI for interacting with an autonomous system using natural language. Some well-known applications
of NLP are speech recognition, text translation and sentiment analysis. Essentially, NLP is about
building a system that can understand human language. For a machine to understand a language, it
must first learn how to do so, and this is where machine learning is used within NLP (Expert
System, 2016).
Machine learning’s primary role in sentiment analysis is to improve and automate the low-level
text analytics functions that sentiment analysis relies on, including part-of-speech tagging. For
example, a machine learning model can be trained to identify positive or negative feedback by
feeding it a large dataset of feedback. Using supervised and unsupervised machine learning
techniques, the model learns to determine whether a given piece of feedback is positive or
negative.
Sentiment analysis of text combines natural language processing (NLP) and machine learning
techniques to assign weighted sentiment scores to the entities, topics and categories within a
sentence or phrase (Lexalytics, 2020).
Sentiment analysis is one of the applications of natural language processing. It is the process of
analyzing whether a piece of text is positive, negative or neutral. Besides identifying the
sentiment of a text, these systems extract attributes of the expression (MonkeyLearn, 2020), e.g.
polarity: whether a person or entity expresses a positive or negative opinion.
Sentiment analysis is essentially a text classification task that seeks to estimate the polarity
of a body of text based solely on its content, i.e. the text is assigned a value that indicates
whether the opinion expressed is positive (polarity = 1), negative (polarity = 0) or neutral. To
derive sentiment from pieces of text, the computer has to be trained with a pre-labeled dataset of
positive and negative content. This means techniques from both natural language processing and
machine learning are required for a system to perform sentiment analysis.
Nowadays there are many online platforms such as Amazon and eBay, and learning platforms such as
Coursera and Udemy. They offer thousands of brands, products and courses and have thousands of
users and customers. Customers leave feedback on brands and products based on their experiences,
and this feedback also runs into the thousands. Manually determining whether each piece of
feedback is positive or negative at this scale is not feasible. Reviews and feedback are very
important for determining the performance of a product or service, and tracking them helps in
future business decisions. Sentiment analysis is a suitable solution for this problem domain: it
can be used to identify and extract subjective information, which helps a business understand the
social sentiment around its products and brands.
2. Background
In one study, Alec Go and his team performed a sentiment search on Twitter to collect training
data. Various classifiers were applied to a corpus (collection of texts) constructed from positive
and negative samples of feedback and emotions. Among the classifiers used, the Naïve Bayes
classifier gave the best result, with an accuracy of 81% on their test set. However, this method
performed poorly when used with three classes (negative, positive and neutral) (Go, Bhayani and
Huang, 2009).
In research by Alexander Pak and Patrick Paroubek, Twitter was used as a corpus for sentiment
analysis and opinion mining. Their paper covers procedures for automatically collecting a large
body of texts and approaches to performing linguistic analysis of the collected texts. The authors
then built a sentiment classifier from the collected texts that is able to determine the polarity
(positive, negative or neutral) of a text (Pak and Paroubek, 2010).
A good example of brand monitoring using sentiment analysis is KFC. KFC chose to use sentiment
analysis for brand building and monitoring. By combining sentiment analysis with social network
monitoring and campaign management, they engage users with their brand and ultimately lead them to
engage with the product.
Amazon also uses sentiment analysis to monitor its brand and keep track of the performance of its
products. In essence, Amazon uses this application to gain insight into what its customers are
looking for in its products. Apart from brand monitoring, it also uses this application for market
research and competitor analysis: for better planning and business decisions, it analyzes
competitors and their movements in the market with the help of this application.
3. Solution
P(A|B) – posterior
P(A) – prior
P(B) – evidence
P(B|A) – likelihood
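The four terms listed above combine in the standard statement of Bayes theorem. It is written here in LaTeX as a reference, using the same symbols A and B as the list above:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

For sentiment classification, A plays the role of the class (positive or negative) and B the observed review text. Since P(B) is the same for every class, comparing classes only requires the numerator P(B|A) * P(A).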
The first step in the Naïve Bayes algorithm is creating a frequency table containing word
frequencies. Every document is treated as a set of the words it contains, ignoring word order and
sentence construction. The training data can be represented using the bag-of-words approach, in
which each sentence is split into words and the occurrences of each word in that sentence are
counted. For example:
Review                        | helpful | course | and | materials | boring | don’t | waste | time | in | this | useful | content | helped | lot | thanks | a | Label
Helpful course and materials. | 1       | 1      | 1   | 1         | 0      | 0     | 0     | 0    | 0  | 0    | 0      | 0       | 0      | 0   | 0      | 0 | +
Boring.                       | 0       | 0      | 0   | 0         | 1      | 0     | 0     | 0    | 0  | 0    | 0      | 0       | 0      | 0   | 0      | 0 | -
Don’t waste time in this.     | 0       | 0      | 0   | 0         | 0      | 1     | 1     | 1    | 1  | 1    | 0      | 0       | 0      | 0   | 0      | 0 | -
Useful materials and content. | 0       | 0      | 1   | 1         | 0      | 0     | 0     | 0    | 0  | 0    | 1      | 1       | 0      | 0   | 0      | 0 | +
Helped a lot. Thanks          | 0       | 0      | 0   | 0         | 0      | 0     | 0     | 0    | 0  | 0    | 0      | 0       | 1      | 0   | 0      | 1 | +
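A minimal Python sketch of how one row of such a frequency table could be built is given below. It is an illustration only, not the report's own implementation: the names VOCABULARY, tokenize and bag_of_words are hypothetical, and the sketch assumes that apostrophes are dropped so that "Don't" is counted as "dont".

```python
import re

# Vocabulary in the column order of the frequency table (16 words).
# Assumption: apostrophes are removed, so "don't" is stored as "dont".
VOCABULARY = [
    "helpful", "course", "and", "materials", "boring", "dont", "waste", "time",
    "in", "this", "useful", "content", "helped", "lot", "thanks", "a",
]


def tokenize(text):
    """Lowercase the text, drop apostrophes, and split it into words."""
    cleaned = text.lower().replace("'", "").replace("\u2019", "")
    return re.findall(r"[a-z]+", cleaned)


def bag_of_words(text, vocabulary=VOCABULARY):
    """Return one count per vocabulary word, ignoring word order."""
    tokens = tokenize(text)
    return [tokens.count(word) for word in vocabulary]


row = bag_of_words("Helpful course and materials.")
print(row)  # [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

Each review becomes a fixed-length vector of counts, so word order and punctuation play no part in the later probability estimates.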
To classify a new review such as “I dont like it”, we compute P(“I dont like it” | +) * P(+) and
P(“I dont like it” | -) * P(-). Comparing these two probabilities determines whether the given
review is positive or negative.
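Because we are using the Naïve Bayes algorithm, we assume every word in a sentence is independent
of the others, so we no longer look at entire sentences but at individual words.
So, for P(“I dont like it” | +) * P(+) we write P(+) * P(I | +) * P(don’t | +) * P(like | +) *
P(it | +), and for the negative class P(“I dont like it” | -) * P(-) we write P(-) * P(I | -) *
P(don’t | -) * P(like | -) * P(it | -).
Each word probability is estimated with add-one (Laplace) smoothing so that unseen words do not
make the product zero: P(word | class) = (count of the word in the class + 1) / (total words in
the class + vocabulary size). In the table above, the positive reviews contain 10 words, the
negative reviews contain 6 words, and the vocabulary has 16 words.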
For positive:
P(+) = 3/5 = 0.6
P(I | +) = (0+1)/(10+16) ≈ 0.0385
P(don’t | +) = (0+1)/(10+16) ≈ 0.0385
P(like | +) = (0+1)/(10+16) ≈ 0.0385
P(it | +) = (0+1)/(10+16) ≈ 0.0385
y+ = P(+) * P(I | +) * P(don’t | +) * P(like | +) * P(it | +) = 0.6 * (1/26)^4 ≈ 1.31 × 10^-6
For negative:
P(-) = 2/5 = 0.4
P(I | -) = (0+1)/(6+16) ≈ 0.0455
P(don’t | -) = (1+1)/(6+16) ≈ 0.0909
P(like | -) = (0+1)/(6+16) ≈ 0.0455
P(it | -) = (0+1)/(6+16) ≈ 0.0455
y- = P(-) * P(I | -) * P(don’t | -) * P(like | -) * P(it | -) = 0.4 * (1/22)^3 * (2/22) ≈ 3.42 × 10^-6
As the value of y- is greater than y+, the review is classified as negative. This is how Bayes
theorem is used in the Naïve Bayes classifier.
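The worked example can also be checked with a short Python sketch. This is an illustration under stated assumptions rather than the report's implementation: the names WORD_COUNTS, PRIORS, VOCABULARY_SIZE, word_likelihood, class_score and classify are hypothetical, the word counts are taken as the column sums of the frequency table in this section, and "don’t" is stored as "dont".

```python
from math import prod

# Word counts per class, taken as the column sums of the frequency table
# ("+" = positive reviews, "-" = negative reviews). Assumed from the table.
WORD_COUNTS = {
    "+": {"helpful": 1, "course": 1, "and": 2, "materials": 2,
          "useful": 1, "content": 1, "helped": 1, "a": 1},
    "-": {"boring": 1, "dont": 1, "waste": 1, "time": 1, "in": 1, "this": 1},
}
PRIORS = {"+": 3 / 5, "-": 2 / 5}  # 3 positive and 2 negative reviews out of 5
VOCABULARY_SIZE = 16               # number of word columns in the table


def word_likelihood(word, label):
    """P(word | label) with add-one (Laplace) smoothing."""
    total_words = sum(WORD_COUNTS[label].values())  # 10 for "+", 6 for "-"
    count = WORD_COUNTS[label].get(word, 0)
    return (count + 1) / (total_words + VOCABULARY_SIZE)


def class_score(words, label):
    """Class prior multiplied by the product of the per-word likelihoods."""
    return PRIORS[label] * prod(word_likelihood(w, label) for w in words)


def classify(words):
    """Return the label with the larger score, together with both scores."""
    scores = {label: class_score(words, label) for label in PRIORS}
    return max(scores, key=scores.get), scores


label, scores = classify(["i", "dont", "like", "it"])
print(label, scores)
```

For the review "I dont like it" the sketch gives a positive score of about 1.31e-06 and a negative score of about 3.42e-06, so the review is labelled negative. For longer texts the product of many small probabilities can underflow, so a practical version would usually add logarithms of the probabilities instead of multiplying them.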
Figure 2: Flowchart
4. Conclusion
Of the machine learning classifiers available for text classification, the Naïve Bayes classifier
was selected for sentiment analysis. A brief discussion of the approach to selecting this
classifier has also been included in this report. The Naïve Bayes classifier uses Bayes theorem to
predict sentiment. How this theorem is used to predict the sentiment of a text has been explained
with each step of the algorithm, demonstrating how the sentiment of a text can be predicted using
the Naïve Bayes classifier. For proper understanding of the implementation of the algorithm,
pseudocode and a flowchart of the algorithm have been included in this report.
Sentiment analysis has empowered all kinds of market research and competitive analysis. Whether
exploring a new market or keeping an edge on the competition, sentiment analysis can make a real
difference. It makes this possible by analyzing product reviews of a brand and comparing them with
those of competitors, comparing sentiment across international markets, and so on
(Stecanella, 2017).
Sentiment analysis can be used to monitor social media and product reviews. Tweets, Facebook
posts or product reviews can be analyzed over a period of time to see the sentiment of a
particular audience or public sentiment about a product. This gives deep insight into the current
status of the product in the market and helps to prioritize actions and track trends over time
(Stecanella, 2017).
For any type of service, the feedback and opinions of the public are crucial. Surveys can be
conducted to gather feedback and gauge public sentiment. Sentiment analysis can be applied to
survey responses to assess the performance of the services, how well they are benefiting people,
and what changes are required to improve them.
These are some real-world scenarios that have benefited from sentiment analysis. It can be applied
to many other areas, such as market research and competitor analysis, social network monitoring
and so on. Many businesses are able to work with more accuracy by incorporating sentiment analysis
into their existing systems.
References
Expert System, 2016. Examples of natural language processing systems in artificial intelligence.
[Online]
Available at: https://www.expert.ai/blog/examples-natural-language-processing-systems-artificial-intelligence/
[Accessed 29 December 2020].
Go, A., Bhayani, R. and Huang, L., 2009. Twitter sentiment classification using distant
supervision, s.l.: Stanford.
javaTpoint, 2020. Artificial Intelligence Tutorial. [Online]
Available at: https://www.javatpoint.com/artificial-intelligence-tutorial
[Accessed 27 December 2020].
Kathait, S. S., 2017. Intelligent System for Analyzing Sentiments of Feedback. International
Journal of Scientific & Technology Research, VIII(2), pp. 588-594.
Lexalytics, 2020. Sentiment Analysis Explained. [Online]
Available at: https://www.lexalytics.com/technology/sentiment-analysis#machine-learning-sentiment
[Accessed 29 December 2020].
MonkeyLearn, 2020. Sentiment Analysis: A Definitive Guide. [Online]
Available at: https://monkeylearn.com/sentiment-analysis/#the-basics-of-sentiment-analysis
[Accessed 29 December 2020].
Otte, S., 2020. How does Artificial Intelligence work?. [Online]
Available at: https://www.innoplexus.com/blog/how-artificial-intelligence-works/
[Accessed 30 December 2020].
Pak, A. and Paroubek, P., 2010. Twitter as a Corpus for Sentiment Analysis and Opinion Mining.
Proceedings of the International Conference on Language Resources and Evaluation, I(1),
pp. 1321-1326.
Pang, B. and Lee, L., 2008. Opinion Mining and Sentiment Analysis. 1st ed. USA: now Publishers
Inc.