Ankit Adhikari 2 PDF

Module Code & Module Title
CU6051NT- Artificial Intelligence
Assessment Weightage & Type
AI Coursework 1 (20%)
Year and Semester
2020-21 Autumn Year Long
Student Name: Ankit Adhikari
London Met ID: 18028880
College ID: NP05CP4A1800004
Assignment Submission Date: January 17, 2020
Submitted To: Mr. Prateek Kokh Shrestha
Word Count: 3231
I confirm that I understand my coursework needs to be submitted online via Google Classroom under the relevant module page
before the deadline in order for my assignment to be accepted and marked. I am fully aware that late submissions will be treated
as non-submission and a mark of zero will be awarded.
Table of Contents
1. Introduction
1.1. Explanation of the topic/AI concepts used
1.1.1. Explanation of the AI concepts used
1.1.2. Explanation of the topic
1.2. Introduction of the chosen topic problem domain
2. Background
2.1. Research work done on the chosen topic
2.2. Review and analysis of existing work in the problem domain
3. Solution
3.1. Approach to solving the problem
3.2. Explanation of the AI algorithm used
3.3. Pseudocode of the solution
3.4. Diagrammatical representations of the solution
4. Conclusion
4.1. Analysis of the work done
4.2. How the solution addresses real world problems
4.3. Further work
References
Table of Figures
Figure 1: Bayes Theorem
Figure 2: Flowchart
Table of Tables
Table 1: Labeled training data
Table 2: Bag of Words
1. Introduction
With the rapid advancement of technology, new technologies are emerging every day. One of the
trending topics in computer science is Artificial Intelligence (AI), which is driving a new
revolution by making machines intelligent. It currently spans a variety of subfields, ranging from
general to specific, such as self-driving cars, playing chess, proving theorems, playing music and
painting (javaTpoint, 2020).
Artificial Intelligence is a branch of computer science mainly concerned with the automation of
intelligent behavior. AI is a machine’s capacity to understand its environment and take actions,
comparable to human behavior, that have a high likelihood of success. It is not a system in itself
but is applied within a system to understand and address challenges (Sharma, 2018).
The aim of AI is to improve computer functions related to human knowledge, such as learning and
problem-solving, and thereby make human life easier. It is not a system in itself but is implemented
to make computers intelligent (Selvamanikkam, 2018). Insight gathering and task automation would
be impossible without strategically applying AI to certain processes. Parsing through the mountains
of data created by humans, AI systems perform intelligent searches, interpret both text and images
to discover patterns in complex data, and then act on those learnings (Otte, 2020).

1.1. Explanation of the topic/AI concepts used
1.1.1. Explanation of the AI concepts used

AI is a broad field of science that is embedded across a number of technologies, and machine
learning is one of them. Machine learning is a subfield of AI that enables software programs to
learn and improve from experience automatically without being explicitly programmed. It focuses
on developing algorithms that receive input data, analyze it to predict a result, and update that
result whenever new data becomes available. The main objective of machine learning is to allow
computers to learn automatically, without being explicitly programmed and without human
intervention (Reese, 2017).
Another technology within AI is Natural Language Processing (NLP). NLP is a fundamental
feature of AI for interacting with an autonomous system using natural language. Some well-known
applications of NLP are speech recognition, text translation and sentiment analysis. In essence,
NLP is about building a system that can understand human language. To understand a language,
the machine must first learn how to do so, and this is where machine learning is used within NLP
(Expert System, 2016).
Machine learning’s primary role in sentiment analysis is to improve and automate the low-level
text analytics functions that sentiment analysis relies on, including part-of-speech tagging. For
example, a machine learning model can be trained to identify positive or negative feedback by
feeding it a large dataset of feedback. Using supervised or unsupervised machine learning
techniques, the model learns to determine whether a given piece of feedback is positive or
negative.
Sentiment analysis of text combines natural language processing (NLP) and machine learning
techniques to assign weighted sentiment scores to the entities, topics and categories within a
sentence or phrase (Lexalytics, 2020).

1.1.2. Explanation of the topic

Because of technological advancements, there is so much data on the internet today that it is
humanly impossible to monitor all of it. In business, online selling platforms like Amazon and
eBay generate huge amounts of data daily, and it is impossible to go through each piece of it. To
obtain clear and reliable information about consumer preferences, these platforms perform
high-level sentiment analysis. This analysis is done to understand the social sentiment of
customers towards a brand or a product. The topic “Sentiment Analysis of text” has been chosen
in order to understand the positive or negative responses of people from textual data.
Sentiment analysis is one of the applications of natural language processing. It is the process
of analyzing whether a piece of text is positive, negative or neutral. Besides identifying the
sentiment of a text, these systems extract attributes of the expression (MonkeyLearn, 2020), for
example:
• Polarity: whether the person or entity expresses a positive or negative opinion.
Sentiment analysis is essentially a text classification task that seeks to estimate the polarity of a
body of text based solely on its content, i.e. the text can be characterized by a value that indicates
whether the opinion expressed is positive (polarity = 1), negative (polarity = 0) or neutral. To
derive sentiment from pieces of text, the computer has to be trained with a pre-labeled dataset of
positive and negative content. This means that techniques from both natural language processing
and machine learning are required for a system to perform sentiment analysis.

1.2. Introduction of the chosen topic problem domain

With the growth of the internet, data is being generated on such a huge scale that examining each
piece of it is humanly impossible. In today’s businesses, data is very useful for identifying
problems, and studying it helps to plan the next steps for improving the business. One of the most
important concerns of any business is public sentiment and customer feedback on its products,
brands and services. With such an immense amount of customer feedback, it becomes difficult to
determine whether products and services are doing well or whether customers dislike them.
Customer views on individual products and services are what drive those products to improve, and
when feedback arrives in huge volumes, it is a very challenging task for a human to determine
whether each piece of feedback is positive or negative (Stecanella, 2017).
Nowadays there are many online selling platforms, such as Amazon and eBay, and learning
platforms, such as Coursera and Udemy. They offer thousands of brands, products and courses and
have thousands of users and customers. Customers give feedback on brands and products based on
their experiences, and this feedback is also generated in the thousands. Determining by hand
whether each of these thousands of reviews is positive or negative is not feasible. Reviews and
feedback are very important for determining the performance of a particular product or service;
they can be tracked and help in future business decisions. Sentiment analysis is a suitable solution
for this problem domain: it can be used to identify and extract subjective information, which helps
a business understand the social sentiment towards its products and brands.

2. Background
2.1. Research work done on the chosen topic

Many research works have been carried out on the chosen topic, “sentiment analysis”. In one of
them, Pang and Lee describe the existing techniques and approaches for the extraction of
opinion-oriented content. Their work covers material on the summarization of evaluative text and
on broader issues regarding privacy, manipulation and economic impact that the growth of
opinion-oriented information-access services gives rise to (Pang and Lee, 2008).
In another study, Alec Go and his team collected training data by searching Twitter for sentiment.
Various classifiers were applied to a corpus (collection of texts) constructed from positive and
negative samples of feedback and emoticons. Among the classifiers used, the Naïve Bayes
classifier gave the best result, with an accuracy of 81% on their test set. However, this method
performed poorly when used with three classes (negative, positive and neutral) (Go, Bhayani and
Huang, 2009).
In research done by Alexander Pak and Patrick Paroubek, Twitter was used as a corpus for
sentiment analysis and opinion mining. Their paper focuses on using Twitter for the task of
sentiment analysis. It describes procedures for automatically collecting a large body of texts and
approaches for performing linguistic analysis of the collected texts. The authors then built a
sentiment classifier from the collected texts that is able to determine the polarity (positive,
negative or neutral) of a text (Pak and Paroubek, 2010).

2.2. Review and analysis of existing work in the problem domain

Sentiment analysis has become a crucial method for interpreting data at a time when 2.5 quintillion
bytes of data are generated every day. It has helped organizations gain important insights and
automate all sorts of procedures and analytics for improving their business. Sentiment analysis is
used in different fields for various purposes. It has helped organizations in planning and future
business decisions by collecting sentiment from customer feedback and reviews and tracking the
performance of a particular product in the market (Stecanella, 2017).
Some common uses of sentiment analysis are:
• Customer Feedback
• Product Analytics
• Spam Filtering
• Brand Monitoring
• Market Research and Analysis
• Social Network Monitoring
An example of brand monitoring using sentiment analysis is KFC, which chose to use sentiment
analysis for brand building and monitoring. By combining sentiment analysis with social network
monitoring and campaign management, KFC engages users with its brand and ultimately leads
them to engage with its products.
Amazon also uses sentiment analysis to monitor its brand and keep track of the performance of its
products. In essence, Amazon uses this application to gain insights into what its customers are
looking for in a product. Apart from brand monitoring, it also uses this application for market
research and competitor analysis. For better planning and future business decisions, it analyzes
competitors and their movements in the market with the help of this application.

3. Solution
3.1. Approach to solving the problem

Considering the above research, facts and explanation, it is clear that sentiment analysis can be
used for various purposes, such as:
• Customer Feedback
• Customer Support
• Brand Monitoring
• Product Analysis
The chosen approach for achieving the above is to use machine learning techniques and algorithms,
with NLP techniques incorporated into data pre-processing. The preferred approach for the task of
predicting sentiment is supervised learning. Kaggle hosts many datasets for sentiment analysis, and
for this particular task a labeled dataset of product reviews is to be used as the training dataset.
There are numerous algorithms available to fit the model, but for this task the Naïve Bayes
algorithm is used for predicting the sentiment. It is chosen as the classifier for the following
reasons (Kathait, 2017):
• Frequently used for working with natural language text documents.
• Probabilistic model.
• Fast, reliable and accurate.
• Highly practical method.

3.2. Explanation of the AI algorithm used

Naïve Bayes is a probabilistic algorithm that takes advantage of probability theory and Bayes
theorem to predict the sentiment of a text. In this algorithm, the probability of each tag for a given
text is calculated, and the output is the tag with the highest probability. In probability theory, Bayes
rule describes the probability of an event based on prior knowledge of conditions that might be
related to that event (Stecanella, 2017).
Figure 1: Bayes Theorem
P(A|B) = P(B|A) * P(A) / P(B)
where:
P(A|B): posterior
P(A): prior
P(B): evidence
P(B|A): likelihood
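As a quick numerical check of Bayes theorem, P(A|B) = P(B|A) * P(A) / P(B), the short Python sketch below computes a posterior from a likelihood, a prior and the evidence. The function name `posterior` and all the numbers are hypothetical and chosen only for illustration; they do not come from the training data used in this report.

```python
def posterior(likelihood, prior, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("evidence P(B) must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers: 60% of reviews are positive (prior), the word
# "useful" appears in 50% of positive reviews (likelihood) and in 35%
# of all reviews (evidence).
p_positive_given_useful = posterior(likelihood=0.5, prior=0.6, evidence=0.35)
print(round(p_positive_given_useful, 3))
```

Under these assumed numbers, seeing the word "useful" raises the probability that a review is positive from the prior of 0.6 to roughly 0.86.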
The first step in the Naïve Bayes algorithm is creating a frequency table containing word
frequencies. Every document is treated as a bag of the words it contains, ignoring word order and
sentence construction. The text in the training data can be represented using the bag of words
approach. In this approach, each word in a sentence is separated and the number of times it occurs
in that sentence is counted. For example:
Table 1: Labeled training data
Training Data Label

Helpful course and materials +
Boring -
Useful materials and content +
Don’t waste time in this -
Helped a lot. Thanks +

Vocabulary of unique words, ignoring case and punctuation:

(helpful, course, and, materials, boring, don’t, waste, time, in, this, useful, content, helped, lot,
thanks, a)
Table 2: Bag of Words
Review                         helpful course and materials boring don’t waste time in this useful content helped lot thanks a Label
Helpful course and materials.  1       1      1   1         0      0     0     0    0  0    0      0       0      0   0      0 +
Boring.                        0       0      0   0         1      0     0     0    0  0    0      0       0      0   0      0 -
Don’t waste time in this.      0       0      0   0         0      1     1     1    1  1    0      0       0      0   0      0 -
Useful materials and content.  0       0      1   1         0      0     0     0    0  0    1      1       0      0   0      0 +
Helped a lot. Thanks           0       0      0   0         0      0     0     0    0  0    0      0       1      1   1      1 +
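The bag of words representation can be sketched in a few lines of Python. This is a minimal illustration using only the standard library and the five reviews from Table 1; the helper names `tokenize` and `to_vector` are assumptions made for this example, and the column order follows the first appearance of each word, so it may differ from the order used in the table.

```python
import re

# The labeled reviews from Table 1.
reviews = [
    ("Helpful course and materials", "+"),
    ("Boring", "-"),
    ("Useful materials and content", "+"),
    ("Don't waste time in this", "-"),
    ("Helped a lot. Thanks", "+"),
]


def tokenize(text):
    """Lowercase the text, drop apostrophes, and keep alphabetic tokens."""
    cleaned = text.lower().replace("'", "").replace("\u2019", "")
    return re.findall(r"[a-z]+", cleaned)


# Build the vocabulary in order of first appearance.
vocabulary = []
for text, _ in reviews:
    for word in tokenize(text):
        if word not in vocabulary:
            vocabulary.append(word)


def to_vector(text):
    """Count how often each vocabulary word occurs in the text."""
    tokens = tokenize(text)
    return [tokens.count(word) for word in vocabulary]


vectors = [to_vector(text) for text, _ in reviews]
print(vocabulary)
for (text, label), vector in zip(reviews, vectors):
    print(label, vector)
```

Each review becomes a vector with one count per vocabulary word, which is the form the classifier works with.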

To predict whether a review is positive or negative, Bayes theorem can be used.
Let’s take a review: “I dont like it.”
We compute P(“I dont like it” | +) * P(+) and P(“I dont like it” | -) * P(-). Comparing these two
values tells us whether the given review is positive or negative. The evidence P(“I dont like it”) is
the same for both classes, so it can be left out of the comparison.
As we are using the Naïve Bayes algorithm, we assume that every word in a sentence is independent
of the others, so we no longer look at entire sentences but at individual words.
So, for the positive class, P(“I dont like it” | +) * P(+) is written as
P(+) * P(I | +) * P(dont | +) * P(like | +) * P(it | +), and for the negative class,
P(“I dont like it” | -) * P(-) is written as P(-) * P(I | -) * P(dont | -) * P(like | -) * P(it | -).
Each word probability is estimated with Laplace (add-one) smoothing, so that a word that never
occurs in a class does not make the whole product zero:
P(word | class) = (count of word in class + 1) / (total words in class + vocabulary size)
From Table 1, the positive reviews contain 12 words, the negative reviews contain 6 words, and the
vocabulary contains 16 unique words. Ignoring punctuation, “dont” matches “Don’t”, which occurs
once in the negative reviews.
For positive:
P(+) = 3/5 = 0.6
P(I | +) = (0+1)/(12+16) = 0.0357
P(dont | +) = (0+1)/(12+16) = 0.0357
P(like | +) = (0+1)/(12+16) = 0.0357
P(it | +) = (0+1)/(12+16) = 0.0357
y+ = P(+) * P(I | +) * P(dont | +) * P(like | +) * P(it | +) = 0.6 * (1/28)^4 ≈ 9.76 × 10^-7
For negative:
P(-) = 2/5 = 0.4
P(I | -) = (0+1)/(6+16) = 0.0455
P(dont | -) = (1+1)/(6+16) = 0.0909
P(like | -) = (0+1)/(6+16) = 0.0455
P(it | -) = (0+1)/(6+16) = 0.0455
y- = P(-) * P(I | -) * P(dont | -) * P(like | -) * P(it | -) = 0.4 * (1/22)^3 * (2/22) ≈ 3.42 × 10^-6
As the value of y- is greater than y+, the review is classified as negative. This is how Bayes theorem
is used in the Naïve Bayes classifier.
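The worked example can be reproduced with a short Python sketch. It assumes add-one (Laplace) smoothing and takes its word counts directly from the five reviews in Table 1 (12 words in the positive reviews, 6 in the negative reviews, 16 unique words in total); the function names `tokenize`, `train` and `score` are hypothetical and used only for this illustration.

```python
import re
from collections import Counter

# The labeled reviews from Table 1.
training = [
    ("Helpful course and materials", "+"),
    ("Boring", "-"),
    ("Useful materials and content", "+"),
    ("Don't waste time in this", "-"),
    ("Helped a lot. Thanks", "+"),
]


def tokenize(text):
    """Lowercase the text, drop apostrophes, and keep alphabetic tokens."""
    cleaned = text.lower().replace("'", "").replace("\u2019", "")
    return re.findall(r"[a-z]+", cleaned)


def train(data):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = {}
    for text, label in data:
        class_docs[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocabulary


def score(text, label, class_docs, word_counts, vocabulary):
    """Return P(label) times the smoothed P(word | label) of every word."""
    total_docs = sum(class_docs.values())
    total_words = sum(word_counts[label].values())
    result = class_docs[label] / total_docs
    for word in tokenize(text):
        # Add-one smoothing: unseen words get a small non-zero probability.
        result *= (word_counts[label][word] + 1) / (total_words + len(vocabulary))
    return result


model = train(training)
y_pos = score("I dont like it", "+", *model)
y_neg = score("I dont like it", "-", *model)
print(y_pos, y_neg, "negative" if y_neg > y_pos else "positive")
```

Both scores are very small because they are products of several probabilities; only their relative size matters for the classification.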

3.3. Pseudocode of the solution

Import numpy
Import pandas
Import Naïve Bayes classifier
Input labeled dataset
Extract features from the dataset
Create training set
Train Naïve Bayes classifier to build the model
Input new data
Extract feature values from the new data
Apply the trained model
Check polarity
Display result
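The pseudocode above can be turned into a small runnable sketch. The version below is an assumption-laden illustration rather than the final implementation: it uses only the Python standard library instead of numpy and pandas, trains on the five reviews from Table 1 instead of a Kaggle dataset, and the class name `NaiveBayesSentiment` is hypothetical. Log probabilities are summed instead of multiplying raw probabilities, which avoids numerical underflow on longer texts.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Feature extraction: lowercase, drop apostrophes, keep alphabetic tokens."""
    cleaned = text.lower().replace("'", "").replace("\u2019", "")
    return re.findall(r"[a-z]+", cleaned)


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        """Train the model: count documents and words per class."""
        self.doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def log_score(self, text, label):
        """Return log P(label) plus the sum of log P(word | label)."""
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)
        for word in tokenize(text):
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(self.vocabulary)))
        return score

    def predict(self, text):
        """Polarity check: return the label with the highest score."""
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))


# Input dataset (Table 1).
texts = [
    "Helpful course and materials",
    "Boring",
    "Useful materials and content",
    "Don't waste time in this",
    "Helped a lot. Thanks",
]
labels = ["+", "-", "+", "-", "+"]

model = NaiveBayesSentiment().fit(texts, labels)

# Input new data, apply the model and display the result.
for review in ["I dont like it", "Helpful materials"]:
    print(review, "->", model.predict(review))
```

With a real dataset, the same `fit` and `predict` steps would be applied to a separate training set and test set so that the accuracy of the model can be measured.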

3.4. Diagrammatical representations of the solution
Figure 2: Flowchart

4. Conclusion
4.1. Analysis of the work done

Due to the increase in computational power and developments in big data, the field of AI is
flourishing and has brought revolutionary changes to current technologies. This report has given a
short explanation of AI, highlighting its impact on other fields. Making sense of raw data can be
achieved through different machine learning approaches, and the way machine learning techniques
make a machine smart has been explained in this report. Making a machine understand natural
language and act accordingly is one of the biggest achievements of AI, made possible by different
machine learning algorithms. For a business to succeed, it has to keep track of different aspects
such as brand monitoring, customer reviews and feedback. This report has discussed how this can
be achieved by implementing machine learning algorithms. The topic “Sentiment analysis” has
been introduced, together with an analysis of the approaches it takes to tackle different problem
domains.
From the machine learning classifiers available for text classification, the Naïve Bayes classifier
was selected for sentiment analysis. A brief discussion of the reasons for selecting this classifier
has also been included in this report. The Naïve Bayes classifier uses Bayes theorem to predict the
sentiment. How this theorem is used for predicting the sentiment of a text has been explained
through each step of the algorithm, demonstrating how the sentiment of a text can be predicted
using the Naïve Bayes classifier. For a proper understanding of the implementation of the
algorithm, pseudocode and a flowchart of the algorithm have been included in this report.

4.2. How the solution addresses real world problems

Sentiment analysis has become a crucial method for interpreting data at a time when 2.5 quintillion
bytes of data are generated every day. It has helped organizations gain important insights and
automate all sorts of procedures and analytics for improving their business. Sentiment analysis is
used in different fields for various purposes. It has helped organizations in planning and future
business decisions by collecting sentiment from customer feedback and reviews and tracking the
performance of a particular product in the market (Stecanella, 2017).
Sentiment analysis has empowered all kinds of market research and competitive analysis. Whether
a business is exploring a new market or keeping an edge over the competition, sentiment analysis
can make a difference. It does so by analyzing the product reviews of a brand and comparing them
with those of competitors, comparing sentiment across international markets, and so on
(Stecanella, 2017).
Sentiment analysis can be used to monitor social media and product reviews. Tweets, Facebook
posts or product reviews can be analyzed over a period of time to see the sentiment of a particular
audience, or public sentiment towards a product. This gives deep insight into the current status of
the product in the market and helps to prioritize actions and track trends over time
(Stecanella, 2017).
For any type of service, the feedback and opinions of the public are crucial. Surveys can be
conducted to collect feedback and public sentiment. Sentiment analysis can be performed on
survey responses to assess the performance of the services and how well they benefit people, and
to understand the changes required to improve them.
These are some of the real-world scenarios that sentiment analysis has been benefiting. It can be
applied to many other areas, such as market research and competitor analysis, social network
monitoring, and so on. Businesses are able to work with more accuracy by incorporating sentiment
analysis into their existing systems.
4.3. Further work

This report has only touched the surface of sentiment analysis. Predicting sentiment accurately
requires the combined use of rule-based approaches, such as lexicons, and automatic approaches,
such as machine learning techniques. Naïve Bayes is a basic model, but its performance can be
improved with different data pre-processing techniques to match the level of more advanced
techniques. Techniques such as lemmatization, n-grams, TF-IDF, Laplace correction, stemming,
emoticon handling, negation handling and sentiment dictionaries can significantly increase the
accuracy score (Ray, 2017).
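Two of the pre-processing techniques mentioned above, n-grams and negation handling, can be sketched as follows. This is a minimal standard-library illustration, not a complete pre-processing pipeline: the negation word list `NEGATIONS` and the `NOT_` prefix are assumptions chosen for this example.

```python
import re

# Hypothetical, deliberately small list of negation words (apostrophes removed).
NEGATIONS = {"not", "no", "never", "dont", "cant", "wont"}


def tokenize(text):
    """Lowercase the text, drop apostrophes, and keep alphabetic tokens."""
    cleaned = text.lower().replace("'", "").replace("\u2019", "")
    return re.findall(r"[a-z]+", cleaned)


def mark_negation(tokens):
    """Prefix the token that follows a negation word with NOT_."""
    marked = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            marked.append(token)
            negate = True
        elif negate:
            marked.append("NOT_" + token)
            negate = False
        else:
            marked.append(token)
    return marked


def ngrams(tokens, n=2):
    """Return all runs of n consecutive tokens joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("I dont like it")
print(mark_negation(tokens))  # ['i', 'dont', 'NOT_like', 'it']
print(ngrams(tokens))         # ['i dont', 'dont like', 'like it']
```

With negation marking, "like" and "NOT_like" become separate features, so the classifier can learn that they point to opposite sentiments; bigrams such as "dont like" capture the same information in a different way.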
For further work, the knowledge gained from this research will be applied to develop an actual
sentiment analysis system. This report, however, is aimed at gathering basic knowledge of the
topic, its working mechanisms and other related information. Further research will be carried out
for the actual implementation of a specific approach to address the problem domain.

References
Expert System, 2016. Examples of natural language processing systems in artificial intelligence. [Online]
Available at: https://www.expert.ai/blog/examples-natural-language-processing-systems-artificial-intelligence/
[Accessed 29 December 2020].
Go, A., Bhayani, R. and Huang, L., 2009. Twitter sentiment classification using distant supervision, s.l.: Stanford.
javaTpoint, 2020. Artificial Intelligence Tutorial. [Online]
Available at: https://www.javatpoint.com/artificial-intelligence-tutorial
Kathait, S. S., 2017. Intelligent System for Analyzing Sentiments of Feedback. International Journal of Scientific & Technology Research, VIII(2), pp. 588-594.
Lexalytics, 2020. Sentiment Analysis Explained. [Online]
Available at: https://www.lexalytics.com/technology/sentiment-analysis#machine-learning-sentiment
MonkeyLearn, 2020. Sentiment Analysis: A Definitive Guide. [Online]
Available at: https://monkeylearn.com/sentiment-analysis/#the-basics-of-sentiment-analysis
Otte, S., 2020. How does Artificial Intelligence work?. [Online]
Available at: https://www.innoplexus.com/blog/how-artificial-intelligence-works/
Pak, A. and Paroubek, P., 2010. Twitter as a Corpus for Sentiment Analysis and Opinion Mining. Proceedings of the International Conference on Language Resources and Evaluation, I(1), pp. 1321-1326.
Pang, B. and Lee, L., 2008. Opinion Mining and Sentiment Analysis. 1st ed. USA: now Publishers Inc.
Ray, S., 2017. Naive Bayes Explained. [Online]
Available at: https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/
[Accessed 12 January 2021].
Reese, H., 2017. Understanding the differences between AI, machine learning, and deep learning. [Online]
Available at: https://www.techrepublic.com/article/understanding-the-differences-between-ai-machine-learning-and-deep-learning/
Selvamanikkam, M., 2018. Introduction to Artificial Intelligence. [Online]
Available at: https://becominghuman.ai/introduction-to-artificial-intelligence-5fba0148ec99
Sharma, A., 2018. Difference between Machine learning and Artificial Intelligence. [Online]
Available at: https://www.geeksforgeeks.org/difference-between-machine-learning-and-artificial-intelligence/
Stecanella, B., 2017. A practical explanation of a Naive Bayes classifier. [Online]
Available at: https://monkeylearn.com/blog/practical-explanation-naive-bayes-classifier/
