
Department of Master of Computer Applications

• Course Code: MCA 61


• Project Work

• Presentation For I-Review


• On
• Sentiment Analysis for Product Review.

• Project Guide: Dr. Niranjan Murthy
• Presented By: Kavya Kumari
• 1MS17MCA19
1. Project Introduction

PROBLEM DEFINITION

Sentiment analysis is the process of mining data, views, reviews or sentences to predict the emotion they
express, using natural language processing (NLP). It classifies text into three classes: “Positive”,
“Negative” or “Neutral”, labelling ‘better’ sentiment as positive and ‘worse’ sentiment as negative. In recent
years, the World Wide Web (WWW) has become a huge source of raw, user-generated data. Through social media,
e-commerce and movie review websites such as Facebook, Twitter, Amazon and Flipkart, users share their views
and feelings in a convenient way. Millions of people express their sentiments and opinions about particular
things in their daily interactions on these sites. This growing volume of raw data is a rich source of
information for any kind of decision-making process. The field of sentiment analysis has emerged to analyse
such huge volumes of data automatically. Its main aim is to identify the polarity of data on the Web and
classify it accordingly. Sentiment analysis is text-based, and finding the accurate polarity of a sentence
poses certain challenges. There is therefore a need for solutions that give better results than previous
approaches, and a demand for automated data analysis techniques that find the polarity, or sentiment, of
users and customers.
2. HARDWARE AND SOFTWARE REQUIREMENTS

Hardware Requirements:
• Processor: i3 or higher
• RAM: 4 GB or higher

Software Requirements:
• Windows 8 or higher, or any Linux-based OS
• Django (web development)
• BeautifulSoup and Pyppeteer (web scraping)
• VADER Sentiment Analysis (analyses the sentiment of comments)
• Any browser
SOFTWARE REQUIREMENTS SPECIFICATION (SRS)
1. INTRODUCTION

1.1 PURPOSE

Every day, a huge amount of information, reviews and opinions is stored on social media and e-service
websites in the form of raw data. Working with this raw data requires proper methods. Most methods focus on
either verbs, nouns, adverbs or adjectives. A recent study has shown that combining adverbs and adjectives in
sentiment analysis works better than adjectives alone, but no work has focused on all possible combinations
of adverbs, adjectives and verbs. This project presents a theoretical analysis of some well-known sentiment
analysis methods and proposals. Both the advantages and disadvantages of the discussed methods are considered
in order to add new features to the proposed approach. The new approach applies machine learning at the
document level with combinations of adjectives, adverbs and verbs. The following feature sets are analysed:
adverbs-adjectives, adverbs-verbs, adjectives-verbs and adverbs-adjectives-verbs, along with adverbs,
adjectives and verbs on their own. Standard classifiers such as Naive Bayes (NB), Linear Model and Decision
Tree are used to derive and analyse the results. This section presents the classification of sentiment
analysis followed by a detailed review of the existing methods related to it.
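The idea of restricting features to a part-of-speech combination before classification can be sketched as follows. This is a minimal illustration, not the project's implementation: the reviews are hypothetical and hand-tagged with Penn Treebank tags (a real pipeline would use a POS tagger), the helper `select_features` and the small `NaiveBayes` class are assumptions made for this example, and only Naive Bayes is shown out of the classifiers listed above.

```python
import math
from collections import Counter, defaultdict

# Penn Treebank tag prefixes for the three parts of speech discussed above.
POS_PREFIX = {"adjective": "JJ", "adverb": "RB", "verb": "VB"}


def select_features(tagged_tokens, parts):
    """Keep only the words whose POS tag belongs to one of `parts`."""
    prefixes = tuple(POS_PREFIX[p] for p in parts)
    return [word.lower() for word, tag in tagged_tokens if tag.startswith(prefixes)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(documents, labels):
            self.word_counts[label].update(words)
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, words):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(doc_count / total_docs)  # log prior
            for word in words:
                if word in self.vocabulary:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-tagged toy reviews: (word, Penn Treebank tag) pairs.
train = [
    ([("battery", "NN"), ("lasts", "VBZ"), ("really", "RB"), ("long", "JJ")], "positive"),
    ([("camera", "NN"), ("works", "VBZ"), ("very", "RB"), ("good", "JJ")], "positive"),
    ([("screen", "NN"), ("broke", "VBD"), ("too", "RB"), ("quickly", "RB")], "negative"),
    ([("sound", "NN"), ("is", "VBZ"), ("extremely", "RB"), ("bad", "JJ")], "negative"),
]

# One of the combinations listed above: adverbs-adjectives-verbs.
combination = ("adverb", "adjective", "verb")
model = NaiveBayes().fit(
    [select_features(tokens, combination) for tokens, _ in train],
    [label for _, label in train],
)

test_review = [("charger", "NN"), ("broke", "VBD"), ("quickly", "RB")]
prediction = model.predict(select_features(test_review, combination))
print(prediction)
```

Changing `combination` to, say, `("adverb", "adjective")` trains the same classifier on a different feature set, which is how the combinations could be compared against each other.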
1.2 Product Scope:

The scope of the project is to provide a user-friendly, web-based product that extracts people’s sentiments
towards certain services, products and organisations. This project phase aims at developing product review
analysis, with graphs and dynamic outputs that help people by giving accurate product reviews.

The project aims to:

1. Provide accurate sentiment analysis results.

2. Cover a wide range of user reviews.

3. Be a smooth, fast, efficient, reliable and easy-to-use web-based tool.

5. Provide user-friendly menu and visualisation capabilities.

6. Offer plenty of options for filtering and viewing information according to the user’s needs.

1.3 References:

• Samaneh Moghaddam and Martin Ester, “Opinion Digger: An Unsupervised Opinion Miner from Unstructured
Product Reviews”, Proceedings of 19th ACM International Conference on Information and Knowledge Management,
pp. 1825-1828, 2010.

• G. Vinodhini and R.M. Chandrasekaran, “Sentiment Analysis and Opinion Mining: A Survey”, International Journal of
Advanced Research in Computer Science and Software Engineering, Vol. 2, No. 6, pp. 28-35, 2012.

• Mika V. Mantyla, Daniel Graziotin and Miikka Kuutila, “The Evolution of Sentiment Analysis-A Review of Research
Topics”, Computer Science Review, Vol. 27, No. 1, pp. 16-32, 2018.
2. Overall Description

2.1 Product Perspective:

Given the massive growth of social networks worldwide and the rapid flow of reviews over the internet, the
main aim is to develop a tool that lets users search for any product or item through a simple search bar. The
engine of the tool crawls the internet, collecting all comments, reviews and notes on websites related to the
user’s search keyword. It then applies an intelligent processing technique to extract the true meaning of
people’s comments and classify them as positive, negative or neutral, showing whether the majority of people
like or dislike the desired topic. Reporting people's feelings about certain topics with high accuracy leads
to better decision making. The purpose of the prototype is to demonstrate the concept and to deliver
operational and functional services for testing purposes. Initially, our website will be the only source of
data for the prototype; integration will then be needed to include more sources such as the Flipkart and
Amazon e-commerce websites.
2.2 Product Functions:

• Topic Extraction

This part is a cornerstone of the project, as it detects and extracts topic titles from the comments. The
approach is as follows. First, preprocessing removes stop words, which occur frequently in the comments but
carry no relevant meaning, and then the feature vector is generated. The features used are n-grams (unigrams,
bigrams and trigrams) and some named entities extracted from the crawled comments. The main step is to
cluster related comments together using similarity measures, giving multiple clusters that each cover one
topic. Afterwards, key-phrase extraction is applied to each cluster to extract the key phrases that are
candidate topic titles. Clusters that yield similar key phrases are merged, and the shared key phrase is
given a higher weight as the topic title.
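The pipeline described above (stop-word removal, n-gram features, similarity-based clustering, key-phrase extraction) could be sketched roughly as below. Everything here is an illustrative assumption rather than the project's code: the stop-word list is deliberately tiny, Jaccard similarity and a greedy single-pass clustering stand in for whatever similarity measure and clustering algorithm the project uses, and named-entity features and cluster merging are left out.

```python
from collections import Counter

# A tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "it", "this", "very", "was", "i", "my"}


def tokenize(comment):
    """Lowercase, strip surrounding punctuation, and drop stop words."""
    words = (w.strip(".,!?;:\"'()").lower() for w in comment.split())
    return [w for w in words if w and w not in STOP_WORDS]


def ngrams(tokens, max_n=3):
    """Return all unigrams, bigrams and trigrams as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def jaccard(a, b):
    """Similarity between two feature sets: shared features / all features."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster(comments, threshold=0.2):
    """Greedy clustering: put each comment in the first cluster it resembles."""
    clusters = []  # each cluster is a list of feature sets, one per comment
    for comment in comments:
        features = set(ngrams(tokenize(comment)))
        for members in clusters:
            if jaccard(features, members[0]) >= threshold:
                members.append(features)
                break
        else:  # no existing cluster was similar enough
            clusters.append([features])
    return clusters


def key_phrase(members):
    """Pick the n-gram shared by the most comments, preferring longer phrases."""
    counts = Counter(f for features in members for f in features)
    return max(counts, key=lambda p: (counts[p], len(p.split()), p))


comments = [
    "The battery life is great",
    "Battery life was very poor",
    "Screen resolution is sharp",
    "I love the screen resolution",
]
topics = [key_phrase(members) for members in cluster(comments)]
print(topics)
```

The `threshold` value is arbitrary; lowering it merges more comments into each cluster, while raising it produces more, narrower topics.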

• Sentiment Classification

Sentiment classification is the primary module of the product. Its objective is to classify, as accurately as
possible, the opinions embedded in sentences such as reviews as positive, negative or neutral. It also counts
the total numbers of positive, negative and neutral comments found in the data source for the specified
product.
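As a rough sketch of the labelling and counting step: VADER, listed in the software requirements, reports a `compound` score between -1 and 1 for each text, and its documentation suggests treating scores of 0.05 and above as positive and -0.05 and below as negative. The snippet below applies those thresholds and tallies the labels. The scores are hypothetical placeholders so that the example runs without the library installed, and the function names are assumptions made for this illustration.

```python
from collections import Counter

# Thresholds on VADER's compound score suggested in the VADER documentation.
POSITIVE_THRESHOLD = 0.05
NEGATIVE_THRESHOLD = -0.05


def label_from_compound(compound):
    """Map a compound score in [-1, 1] to one of the three sentiment classes."""
    if compound >= POSITIVE_THRESHOLD:
        return "positive"
    if compound <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"


def summarize(scored_reviews):
    """Count the labels for a list of (review text, compound score) pairs."""
    return Counter(label_from_compound(score) for _, score in scored_reviews)


# Hypothetical scores for illustration. With the vaderSentiment package
# installed, each score would come from
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"].
scored_reviews = [
    ("Great phone, works well!", 0.8),
    ("Stopped working after a week.", -0.4),
    ("It arrived on Monday.", 0.0),
    ("Good value for the price.", 0.6),
]
summary = summarize(scored_reviews)
print(summary["positive"], summary["negative"], summary["neutral"])
```

The resulting counts are the kind of totals that the graphs mentioned in the product scope could display.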

• Determining Influential Reviews

Influential members of a social network can be responsible for starting a buzz or a negative opinion about a
product, so we are interested in identifying which reviews of a product are influential and which are spam.
Manual analysis is only possible on a very limited scale. Automated, computer-supported analysis is needed to
identify spam reviews among a huge number of comments and reviews.
2.3 USER CLASSES and CHARACTERISTICS:

• This part identifies the various user classes that we anticipate will use the web
application. User classes are differentiated based on use, product functions
and features, technical expertise, security and privilege levels, and educational
level. The solution is intended to be used by three main user classes:
system administrators, system operators, and customers or regular users.

No special knowledge or skills should be assumed on the part of regular users.
Users are not expected to learn or remember a set of commands in order to start
using the application. The prototype application will be web-based only; the
product versions will then add desktop, smartphone and tablet versions.

2.4 ASSUMPTIONS and DEPENDENCIES:

• This part lists the factors that affect the requirements stated in the SRS. These factors are not
design constraints on the software but any changes to these factors can affect the
requirements in the SRS. For example, an assumption may be that a specific
operating system will be available on the hardware designated for the software
product. If, in fact, the operating system is not available, the SRS would then have
to change accordingly.
WEB SCRAPING
What is web scraping?
Web scraping is the process of gathering information from the internet.

Why scrape the web?


To grab the data required to analyse any model.

Web scraping steps:


• Explore the website

• Decipher the information in the URL (a lot of information can be encoded in a URL)

• Inspect the site using developer tools

• Grab the HTML content of a page

• Parse the HTML code with BeautifulSoup


• Find elements by HTML class name

• Extract text from HTML elements

• Extract attributes from HTML elements

• Then go on to data cleaning
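The later steps in this list (deciphering the URL, parsing HTML, finding elements by class name, extracting text and attributes, cleaning) can be sketched as follows. To keep the example runnable without extra packages, it swaps in the standard library's `html.parser` for BeautifulSoup; with BeautifulSoup the equivalent calls would be `soup.find_all(class_="review")`, `tag.get_text()` and `tag["data-rating"]`. The URL, the HTML snippet, the `review` class name and the `data-rating` attribute are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class ReviewParser(HTMLParser):
    """Collect the text and attributes of elements with a given class name.

    Assumes well-nested markup without unclosed tags inside the matched elements.
    """

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0     # greater than 0 while inside a matching element
        self.reviews = []  # one dict per matching element

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.depth:
            self.depth += 1  # nested tag inside a matching element
        elif self.class_name in (attributes.get("class") or "").split():
            self.depth = 1
            self.reviews.append({"attrs": attributes, "text": ""})

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.reviews[-1]["text"] += data


# Decipher the URL: the query string often encodes the product, page and sort order.
url = "https://shop.example.com/product-reviews?product=1234&page=2&sort=recent"
query = parse_qs(urlparse(url).query)

# Hypothetical page content; a real scraper would download it from the URL.
page = """
<div class="review" data-rating="5"><p>  Great   phone,
 works well! </p></div>
<div class="ad">Buy now</div>
<div class="review verified" data-rating="2"><p>Stopped working.</p></div>
"""

parser = ReviewParser("review")
parser.feed(page)

# Data cleaning: collapse whitespace and convert the rating attribute to an integer.
cleaned = [
    {"rating": int(r["attrs"]["data-rating"]), "text": " ".join(r["text"].split())}
    for r in parser.reviews
]
print(query["page"], cleaned)
```

Matching on one class among several (as in `class="review verified"`) mirrors how BeautifulSoup's `class_` filter treats multi-valued class attributes.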


• Find element by html class name

• Extract text from html elemtent

• Extract attribute from html element

• Then go for data cleaning


NLP (NATURAL LANGUAGE PROCESSING)
Natural Language Processing

Natural Language Processing (NLP) is all about leveraging tools, techniques and algorithms to process and understand natural language-based
data, which is usually unstructured like text, speech and so on. In this series of articles, we will be looking at tried and tested strategies,
techniques and workflows which can be leveraged by practitioners and data scientists to extract useful insights from text data.

Outline for this Series

The nature of this series will be a mix of theoretical concepts but with a focus on hands-on techniques and strategies covering a wide variety of
NLP problems. Some of the major areas that we will be covering in this series of articles include the following.

1. Processing & Understanding Text

2. Feature Engineering & Text Representation

3. Supervised Learning Models for Text Data

4. Unsupervised Learning Models for Text Data

5. Web Scraping

NLP in detail:

1. Data Retrieval with Web Scraping

2. Text Wrangling and Pre-processing

3. Parts of Speech Tagging

4. Shallow Parsing

5. Constituency and Dependency Parsing

6. Named Entity Recognition

7. Emotion and Sentiment Analysis


CONCLUSION

The key consideration of the newly proposed technique is part of speech. It was tested on the benchmark
Stanford dataset using six well-known supervised classifiers. The combination of adjective, adverb and verb
turned out to be the best among the various part-of-speech combinations.
