
Department of Master of Computer Applications

• Course Code: MCA 61


• Project Work

• Presentation For I-Review


• On
• Informative Website for EDS BU.

• Project Guide: Dr. Niranjan Murthy

• Presented By: Kavya Kumari (1MS17MCA21)
1. Project Introduction

PROBLEM DEFINITION

Sentiment analysis is the process of mining data, views, reviews or sentences to predict the emotion they express, using natural
language processing (NLP). It classifies text into three classes: “Positive”, “Negative” or “Neutral”; for example, ‘better’ is labelled
positive and ‘worse’ negative. In recent years the World Wide Web (WWW) has become a huge source of raw, user-generated data.
Through social media, e-commerce and movie review websites such as Facebook, Twitter, Amazon and Flipkart, users share their
views and feelings conveniently. Millions of people express sentiments and opinions about particular things in their daily
interactions, either on social media or in e-commerce. This growing body of raw data is an extremely rich source of information for
any kind of decision-making process. The field of sentiment analysis has emerged to analyse such large volumes of data
automatically. Its main aim is to identify the polarity of data on the Web and classify it. Sentiment analysis is text-based, and there
are certain challenges in finding the accurate polarity of a sentence, so better solutions are needed than previous approaches
provide. There is therefore a demand for automated data analysis techniques that find the polarity or sentiment of users or
customers. This paper presents a detailed survey of the different techniques used in sentiment analysis and proposes a new
technique.
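The three-way labelling described above can be illustrated with a minimal sketch. It is a toy lexicon scorer written for this slide, not the algorithm of VADER or of any other tool named in this deck; the word list `TOY_LEXICON`, the function names and the 0.05 cut-off are all assumptions chosen for illustration.

```python
# Toy lexicon scorer: illustrative only, not the VADER algorithm.
# The word list and the 0.05 threshold are assumptions made for this sketch.
TOY_LEXICON = {
    "good": 1.0, "great": 1.0, "better": 1.0, "love": 1.0,
    "bad": -1.0, "worse": -1.0, "poor": -1.0, "hate": -1.0,
}


def polarity_score(text: str) -> float:
    """Average lexicon score over all tokens (unknown words score 0.0)."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(TOY_LEXICON.get(t, 0.0) for t in tokens) / len(tokens)


def classify(text: str, threshold: float = 0.05) -> str:
    """Map a polarity score to one of the three classes."""
    score = polarity_score(text)
    if score >= threshold:
        return "Positive"
    if score <= -threshold:
        return "Negative"
    return "Neutral"


print(classify("This phone is better than the old one"))
print(classify("The battery is worse"))
print(classify("It arrived on Monday"))
```

A real lexicon is far larger and also handles negation, intensifiers and punctuation, which this sketch deliberately ignores.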
2. HARDWARE AND SOFTWARE REQUIREMENTS
Hardware Requirements:
• Processor: i3 or higher
• RAM: 4 GB or higher

Software Requirements:
• Windows 8 or higher, or any Linux-based OS
• Django (web development)
• BeautifulSoup and Pyppeteer (web scraping)
• VADER Sentiment Analysis (analysing the sentiment of comments)
• Any browser
SOFTWARE REQUIREMENTS SPECIFICATION (SRS)
1. INTRODUCTION

1.1 PURPOSE

Every day a huge amount of information, reviews and opinions is stored on social media and e-service websites in the form of raw
data. Working with this raw data requires proper methods. Most methods focus on verbs, nouns, adverbs or adjectives individually.
A recent study has shown that combining adverbs and adjectives in sentiment analysis works better than adjectives alone [8], but
no work has focused on all the possible combinations of adverbs, adjectives and verbs. This paper presents a theoretical analysis of
some well-known sentiment analysis methods and proposals. Both the advantages and disadvantages of the discussed methods are
considered in order to add new features to the proposed approach. The new approach applies machine learning at document level
to combinations of adjectives, adverbs and verbs. The following combinations are analysed: adverbs-adjectives, adverbs-verbs,
adjectives-verbs and adverbs-adjectives-verbs, along with adverbs, adjectives and verbs on their own. Standard classifiers such as
Naive Bayes (NB), Linear Model and Decision Tree are used to derive results for analysis. This section presents the classification of
sentiment analysis, followed by a detailed review of the existing methods related to sentiment analysis.
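The seven part-of-speech feature sets listed above (three single classes, three pairs and one triple) can be enumerated mechanically. The sketch below is an illustration under stated assumptions: the names `POS_GROUPS`, `feature_sets` and `select_features` are hypothetical, and the tags are written by hand rather than produced by a real POS tagger. Training the classifiers on each feature set is not shown.

```python
from itertools import combinations

# Coarse POS classes considered by the approach; the tag names are assumptions.
POS_GROUPS = ("adverb", "adjective", "verb")


def feature_sets(groups=POS_GROUPS):
    """Return every non-empty combination of the POS classes (7 in total)."""
    return [
        combo
        for size in range(1, len(groups) + 1)
        for combo in combinations(groups, size)
    ]


def select_features(tagged_tokens, combo):
    """Keep only the words whose POS tag belongs to the chosen combination."""
    return [word for word, tag in tagged_tokens if tag in combo]


# Hand-tagged example; a real pipeline would obtain tags from a POS tagger.
tagged = [
    ("battery", "noun"), ("drains", "verb"), ("very", "adverb"),
    ("quickly", "adverb"), ("bad", "adjective"),
]

for combo in feature_sets():
    print("-".join(combo), select_features(tagged, combo))
```

Each resulting word list would then be vectorised and passed to the classifiers being compared, so that the combinations can be evaluated under the same conditions.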
1.2 Product Scope:

The scope of the project is to provide a user-friendly, web-based product that extracts people’s sentiments toward certain services,
products and organisations. This project phase aims at developing product review analysis, where graphs and dynamic outputs help
people by giving accurate product reviews.

The project aims to:

1. Provide accurate sentiment analysis results.

2. Cover a wide range of user reviews.

3. Offer a smooth, fast, efficient, reliable and easy-to-use web-based tool.

5. Provide a user-friendly menu and engaging visualisation capabilities.

6. Offer plenty of options for filtering and viewing information according to the user’s needs.

1.3 References:

• Samaneh Moghaddam and Martin Ester, “Opinion Digger: An Unsupervised Opinion Miner from Unstructured Product Reviews”,
Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1825-1828, 2010.

• G. Vinodhini and R.M. Chandrasekaram, “Sentiment Analysis and Opinion Mining: A Survey”, International Journal of Advanced Research
in Computer Science and Software Engineering, Vol. 2, No. 6, pp. 28-35, 2012.

• Mika V. Mantyla, Daniel Graziotin and Miikka Kuutila, “The Evolution of Sentiment Analysis-A Review of Research Topics”, Computer
Science Review, Vol. 27, No. 1, pp. 16-32, 2018.
2. Overall Description

2.1 Product Perspective:

Due to the massive worldwide growth of social networks and the rapid flow of news over the internet, Link Development and AUC came up with
the sentiment analysis tool for Arabic (SATA) research project. The main aim of SATA is to develop a tool that lets users search for any service,
product or political topic through a simple search bar. The engine of the tool crawls the internet, collecting all comments, reviews, tweets or
even blog notes related to the user’s search keyword. It then applies an intelligent processing technique to extract the true meaning of people’s
comments and classify them as positive, negative or neutral, and so to learn whether the majority of people like or dislike the topic. Providing
people's feelings about certain topics with high accuracy will lead to better decision making. The purpose of the prototype is to demonstrate the
concept and to deliver operational and functional services for testing purposes. Initially, Twitter will be the only source of data for the
prototype; later, integration will be needed to include more sources such as Facebook, news websites and blogs.
2.2 Product Functions:

• Topic Extraction

This part is considered a keystone of the project, as it detects and extracts topic titles from the comments. Our approach goes as follows. First,
we do preprocessing, which includes removing stop words that occur frequently in the comments but carry no relevant meaning, and then
generate the feature vector. The features used are n-grams (unigrams, bigrams and trigrams) and some named entities extracted from the
crawled comments. The main step is to cluster related comments together using similarity measures, so that we obtain multiple clusters, each
with one topic. Afterwards, key-phrase extraction is applied to each cluster to extract the key phrases that are candidate topic titles. Clusters
that yield similar key phrases are merged, and the shared key phrase is given a higher weight as the topic title.
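The first three steps of this approach (stop-word removal, n-gram features and similarity-based clustering) can be sketched as follows. This is only an illustration under stated assumptions: the stop-word list is a tiny made-up subset, Jaccard similarity and greedy single-pass clustering are stand-ins chosen for brevity, and the 0.2 threshold is arbitrary. Named-entity features, key-phrase extraction and cluster merging are not covered.

```python
# Illustrative subset of stop words; a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "of", "to", "this", "was"}


def preprocess(comment):
    """Lower-case, strip punctuation and drop stop words."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in comment.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def ngram_features(tokens, max_n=3):
    """Collect unigrams, bigrams and trigrams as a set of strings."""
    feats = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.add(" ".join(tokens[i:i + n]))
    return feats


def jaccard(a, b):
    """Jaccard similarity of two feature sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(comments, threshold=0.2):
    """Greedy clustering: join the first cluster whose seed comment is similar enough."""
    clusters = []  # list of (seed_features, [comment indices])
    for idx, comment in enumerate(comments):
        feats = ngram_features(preprocess(comment))
        for seed, members in clusters:
            if jaccard(feats, seed) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((feats, [idx]))
    return [members for _, members in clusters]


comments = [
    "The battery life is great",
    "Battery life was poor",
    "Delivery was late",
]
print(cluster(comments))
```

In this example the two battery comments share the features "battery", "life" and "battery life", so they fall into one cluster, while the delivery comment starts its own.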

• Sentiment Classification

Sentiment classification is the primary module of the product. Its objective is to classify, as accurately as possible, the opinions embedded in
sentences such as reviews as positive, negative or neutral. It also counts the total numbers of positive, negative and neutral comments found in
the data source for the specified product.
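The counting step can be sketched in a few lines. The sketch assumes each comment has already been labelled by a classifier; the function name `summarise` and the lower-case label strings are hypothetical choices for illustration.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def summarise(labels):
    """Count each class and report the most frequent one.

    Ties are resolved in the order of LABELS; an empty input has no majority.
    """
    counts = Counter(labels)
    summary = {label: counts[label] for label in LABELS}
    total = sum(summary.values())
    summary["majority"] = max(LABELS, key=lambda l: counts[l]) if total else None
    return summary


print(summarise(["positive", "negative", "positive", "neutral"]))
```

The per-class totals are the numbers a graph on the results page could plot directly.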

• Determining Influential Reviews

Since influential members in a social network can be responsible for starting a buzz or getting the community to notice a new trend, product, or
even adopt an opinion, we are interested in the problem of identifying which users are leaders. For companies, organisations and governments, it
is of great importance to learn about opinions in order to assess chances and risks. A manual analysis is only possible on a very limited scale, so
an automated, computer-supported analysis is necessary given the large number of virtual communities with huge amounts of postings.
2.3 User Classes and Characteristics:

• This part identifies the user classes that we anticipate will use the web application. User classes
are differentiated based on usage, product functions and features, technical expertise, security and
privilege levels, and educational level. The solution is intended to be used by three main user
classes: system administrators, system operators, and customers or regular users.

No special knowledge or skills should be assumed on the part of regular users. Users are
not expected to learn or remember a set of commands in order to start using the application. The
prototype application will be web-based only; the product versions will add desktop, smartphone
and tablet versions.

2.4 Assumptions and Dependencies:

• This section lists the factors that affect the requirements stated in the SRS. These factors are not design
constraints on the software, but any change to them can affect the requirements in the SRS. For example, an
assumption may be that a specific operating system will be available on the hardware designated for the
software product. If, in fact, the operating system is not available, the SRS would then have to change
accordingly.
NLP (NATURAL LANGUAGE PROCESSING)

Natural Language Processing (NLP) is all about leveraging tools, techniques and algorithms to process and understand natural language-based data, which is usually
unstructured like text, speech and so on. In this series of articles, we will be looking at tried and tested strategies, techniques and workflows which can be leveraged by
practitioners and data scientists to extract useful insights from text data.

Outline for this Series

This series mixes theoretical concepts with a focus on hands-on techniques and strategies covering a wide variety of NLP problems. Some of
the major areas covered in this series of articles include the following.

1. Processing & Understanding Text

2. Feature Engineering & Text Representation

3. Supervised Learning Models for Text Data

4. Unsupervised Learning Models for Text Data

5. Advanced Topics

NLP in detail:

1. Data Retrieval with Web Scraping

2. Text Wrangling and Pre-processing

3. Parts of Speech Tagging

4. Shallow Parsing

5. Constituency and Dependency Parsing

6. Named Entity Recognition

7. Emotion and Sentiment Analysis


CONCLUSION

The key consideration of the newly proposed technique is part of speech. It was tested on the benchmark Stanford dataset using six well-known
supervised classifiers. The combination of adjectives, adverbs and verbs turned out to be the best among the various combinations of parts of
speech.
