Index

DECLARATION
I hereby declare that this report submission is my own work and that, to
the best of my knowledge and belief, it contains no material previously published
or written by another person nor material which has been accepted for the
award of any other degree or diploma of the university or other institute of higher
learning, except where due acknowledgment has been made in the text.
Place:
Date:
Signature of the candidate
Name:
Reg. No. 2014013744
Roll. No. 140132018
CERTIFICATE
Certified that this thesis entitled "Sentiment Analysis Using Hybrid Cluster
and Predict Model" is the bona fide work of Mr. KAMAL SINGH, who carried out
the project work under my supervision. Certified further that, to the best of my
knowledge, the work reported herein does not form part of any other project
report or dissertation on the basis of which a degree or award was conferred on
an earlier occasion on this or any other candidate.
Signature of Supervisor
Mr. Mukul Varshney
The M.Tech. Viva-Voce Examination of Mr./Ms. ................................ has been
held on ................................
Signature of External Examiner
Head of the Department/Program Coordinator
ABSTRACT
Over the past decade, the use of online resources has grown exponentially, in
particular social media and microblogging websites such as Facebook, Twitter
and YouTube, as well as mobile applications such as WhatsApp and Line. Many
companies have identified these resources as a rich mine of marketing
knowledge. This knowledge provides valuable feedback that allows them to
develop the next generation of their products. In this report, sentiment
analysis of an Apple product is performed by extracting tweets about that
product and classifying each tweet as positive or negative feedback. We propose
a hybrid approach that uses k-medoids clustering to form clusters and a
supervised learning technique, the CART method, to make predictions on those
clusters.
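The cluster-then-predict idea can be illustrated with a minimal sketch. It is written in Python for illustration only (the contents list R Studio as the tool used in this work), and it makes several assumptions that are not taken from the report: the toy word-count data, the Manhattan distance, seeding the medoids with the first k points, and a depth-1 Gini-based split standing in for a full CART tree. The function names are hypothetical.

```python
from collections import Counter

def manhattan(a, b):
    """Manhattan distance between two equal-length count vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest(point, points, medoids):
    """Index (0..k-1) of the medoid closest to `point`."""
    return min(range(len(medoids)), key=lambda c: manhattan(point, points[medoids[c]]))

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids: assign points, then re-pick each cluster's medoid."""
    medoids = list(range(k))  # assumption: the first k points seed the medoids
    for _ in range(max_iter):
        labels = [nearest(p, points, medoids) for p in points]
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                updated.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda i: sum(
                manhattan(points[i], points[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    labels = [nearest(p, points, medoids) for p in points]
    return medoids, labels

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Depth-1 CART-style tree: the split with the lowest weighted Gini impurity."""
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X)):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:  # no usable split: always predict the majority class
        return (0, float("inf"), majority(y), majority(y))
    return best[1:]

def fit_hybrid(X, y, k):
    """Cluster the tweets, then fit one stump per cluster."""
    medoids, labels = k_medoids(X, k)
    stumps = []
    for c in range(k):
        # Fall back to all rows if a cluster happens to be empty.
        idx = [i for i, lab in enumerate(labels) if lab == c] or list(range(len(X)))
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return {"medoids": [X[m] for m in medoids], "stumps": stumps}

def predict_hybrid(model, x):
    """Route x to its nearest medoid, then apply that cluster's stump."""
    c = min(range(len(model["medoids"])), key=lambda j: manhattan(x, model["medoids"][j]))
    f, t, left, right = model["stumps"][c]
    return left if x[f] <= t else right

# Hypothetical toy data: counts of the words [iphone, mac, love, hate] per tweet.
X = [[2, 0, 1, 0], [0, 2, 1, 0], [2, 0, 0, 1], [0, 2, 0, 1], [2, 0, 1, 0], [0, 2, 0, 1]]
y = ["positive", "positive", "negative", "negative", "positive", "negative"]
model = fit_hybrid(X, y, k=2)
print(predict_hybrid(model, [1, 0, 1, 0]))  # positive
print(predict_hybrid(model, [0, 1, 0, 1]))  # negative
```

In this sketch a new tweet is first routed to the cluster whose medoid is closest, and only that cluster's tree decides the sentiment, so each tree has to separate positive from negative feedback only within a group of similar tweets.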
ACKNOWLEDGEMENT
I take this opportunity to thank Mr. Mukul Varshney, my project guide, whose
valuable inputs helped me to complete this report.
I express my profound gratitude and sincere thanks to Prof. Ishan Ranjan
(Head of the Department), Department of Computer Science and Engineering,
Sharda University, Greater Noida, U.P., INDIA. It was very inspiring and
enriching for me to work with such an enlightened and disciplined personality.
I also want to express sincere thanks to Dr. Manoj Kumar Gupta (Program
Coordinator) for his continued help and support in completing this
report. Last but not least, I wish to thank my friends for their continuous
support.
KAMAL SINGH
LIST OF TABLES
Table 2.1 Performance of lexical approach variants
Table 2.2 Performance of machine learning approach variants
Table 2.3 Summary of literature survey
Table 4.4 Comparison of various classification algorithms
LIST OF FIGURES
2.1 Generic architecture of a lexical approach classifier
2.2 Generic architecture of a machine learning approach classifier
3.1 Flow diagram of the proposed model
4.1 Confusion matrix
4.2 ROC curve for cluster 1
4.3 ROC curve for cluster 2
TABLE OF CONTENTS
Declaration
Certificate
Abstract
Acknowledgement
List of Tables
List of Figures
CHAPTER 1: INTRODUCTION
1.1 Background
1.2 Objective
1.3 Motivation and Goals
CHAPTER 2: LITERATURE SURVEY
2.1 Issues in Sentiment Analysis
2.2 Classification of Approaches
2.2.1 Knowledge-based Approach
2.2.2 Relationship-based Approach
2.2.3 Language Models Approach
2.2.4 Discourse Structures and Semantics Approach
2.3 Twitter Specific Approaches
2.3.1 Lexical Analysis Approach
2.3.2 Machine Learning Approach
2.3.3 Hybrid Approach
2.4 Performance Review
2.4.1 Lexical Approach Performance
2.4.2 Machine Learning Approach Performance
2.4.3 Hybrid Approach Performance
2.5 Research Gap
CHAPTER 3: METHODOLOGY
3.1 R Studio
3.2 Training Data
3.3 Test Data
3.4 Obtaining Raw Data
3.5 Process Flow
3.6 Steps of k-Medoids Clustering Algorithm
3.7 Prediction Algorithms
3.7.1 Logistic Regression
3.7.2 Random Forest Algorithm
3.7.3 CART Method
3.8 Feature Extraction
3.9 Evaluation
CHAPTER 4: EXPERIMENT
4.1 Data Sets
4.2 Experimental Results
CHAPTER 5: CONCLUSION
5.1 Conclusion
CHAPTER 6: FUTURE EXTENSIONS
6.1 Future Extensions
References
