Pushpa 10610624 PPT Presentation

This document summarizes a presentation on sentiment analysis of restaurant reviews. It discusses using machine learning algorithms like naive bayes, support vector classifiers, random forests and logistic regression to analyze sentiment from reviews and ratings. Evaluation shows support vector classifiers and logistic regression perform best at predicting sentiment polarity. While ratings provide good sentiment analysis, reviews contain more contextual information. The best model is noted to be logistic regression due to its performance and computational efficiency. Limitations discussed include the size and noisiness of the dataset used to evaluate the models.

Original Description:

text mining on hotel review

Original Title

Pushpa_10610624_ppt_presentation

Uploaded by

drpri15

SENTIMENT ANALYSIS OF RESTAURANT REVIEWS

PRESENTED BY: PUSHPA (10610624)

GUIDED BY: DR MEHRAN RAFIEE


Background & Motivation:
• Customer Insights
• Quality Improvement
• Competitor Analysis
• Menu Optimization
• Staff Performance
• Reputation Management
• Marketing Insights
• Predictive Analysis
• Customer Loyalty
• Identifying Trends
Research Questions:

• Are traditional machine learning algorithms suitable for sentiment analysis?
• Are hybrid algorithms better than traditional machine learning algorithms?
• Which is the better input for sentiment analysis: the rating or the review?
Methodology:

• CRISP-DM methodology
Business understanding

• It's important to understand the problem the business is trying to solve.
• Are we just going to infer the overall sentiment?
• Are we going to work on specific aspects for improvement?
• Are we going to use the findings as part of the business and marketing strategy?
Data understanding

[Figure: Rating vs. review count]
[Figure: Review count per state]
[Figure: Distribution of sentiment based on ratings]
[Figure: Distribution of sentiment based on reviews]
Data preparation

• Data preprocessing & data cleaning
• Tokenization
• Special character removal
• Punctuation & stopword removal
• Lemmatization
• Topic modelling
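
The cleaning steps listed above could be sketched roughly as follows. This is a minimal illustration, not the presenter's pipeline: the slides do not name a library, so the stopword set and the `LEMMAS` lookup table are small hypothetical stand-ins for a full stopword list and a real lemmatizer, and topic modelling is left out.

```python
import re

# Tiny illustrative stopword set (an assumption); a real pipeline would use
# a fuller list from an NLP library.
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "are",
             "but", "very", "it", "to", "of", "we"}

# Toy lookup table standing in for a real lemmatizer (hypothetical entries).
LEMMAS = {"dishes": "dish", "waiters": "waiter", "loved": "love"}


def preprocess(review):
    """Lowercase, strip special characters and punctuation, tokenize,
    drop stopwords, then map each remaining token to its lemma."""
    text = review.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # digits, punctuation, emoticons
    tokens = text.split()                   # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [LEMMAS.get(t, t) for t in tokens]


print(preprocess("The waiters were friendly, and we LOVED the dishes!!! :)"))
# ['waiter', 'friendly', 'love', 'dish']
```

Removing everything outside `a-z` also discards accented letters and digits, which may or may not be desirable for real review text.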
Modelling & Evaluation

• Naïve Bayes
• Support Vector Classifier
• Random Forest
• Logistic Regression
• Hybrid model
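
To make the first model in the list concrete, here is a minimal pure-Python sketch of a multinomial naive Bayes classifier with add-one smoothing. The slides do not say which naive Bayes variant or library was used, so the `TinyNaiveBayes` class and the toy reviews are hypothetical and only illustrate the idea.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for t in tokens:
                if t not in self.vocab:
                    continue  # skip words never seen in training
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews, already tokenized; not taken from the dataset.
train_docs = [
    ["great", "food", "friendly", "staff"],
    ["delicious", "pizza", "great", "service"],
    ["cold", "food", "rude", "staff"],
    ["slow", "service", "bland", "pizza"],
]
train_labels = ["positive", "positive", "negative", "negative"]

model = TinyNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["great", "pizza"]))            # positive
print(model.predict(["rude", "slow", "service"]))   # negative
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.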
Naïve Bayes

With Rating
                Precision  Recall  F1 score
Negative class  0.81       0.43    0.56
Positive class  0.82       0.96    0.88
Accuracy                           0.81
Macro avg       0.81       0.70    0.72
Weighted avg    0.81       0.81    0.79

With Review
                Precision  Recall  F1 score
Negative class  0.76       0.27    0.40
Positive class  0.85       0.98    0.91
Accuracy                           0.84
Macro avg       0.80       0.62    0.65
Weighted avg    0.83       0.84    0.81
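
As a reading aid for these tables, the F1 score is the harmonic mean of precision and recall, and the macro average is the unweighted mean over the two classes. The sketch below recomputes both from the precision and recall reported for Naïve Bayes with rating; the function names are illustrative, and because the inputs are already rounded to two decimals the results only match the slide values after rounding.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean across classes."""
    return sum(values) / len(values)


# Precision and recall as reported for Naive Bayes with rating.
negative = {"precision": 0.81, "recall": 0.43}
positive = {"precision": 0.82, "recall": 0.96}

f1_neg = f1_score(negative["precision"], negative["recall"])
f1_pos = f1_score(positive["precision"], positive["recall"])
macro_f1 = macro_average([f1_neg, f1_pos])

print(round(f1_neg, 2), round(f1_pos, 2))  # 0.56 0.88
print(round(macro_f1, 2))                  # 0.72
```

The gap between the negative-class and positive-class F1 scores is why the macro average sits well below the weighted average in every table.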
Support Vector Classifier

With Rating
                Precision  Recall  F1 score
Negative class  0.79       0.68    0.73
Positive class  0.88       0.93    0.91
Accuracy                           0.86
Macro avg       0.84       0.80    0.82
Weighted avg    0.86       0.86    0.86

With Review
                Precision  Recall  F1 score
Negative class  0.80       0.61    0.69
Positive class  0.91       0.96    0.94
Accuracy                           0.89
Macro avg       0.85       0.78    0.81
Weighted avg    0.89       0.89    0.89
Random Forest

With Rating
                Precision  Recall  F1 score
Negative class  0.81       0.59    0.68
Positive class  0.86       0.95    0.90
Accuracy                           0.85
Macro avg       0.84       0.77    0.79
Weighted avg    0.85       0.85    0.84

With Review
                Precision  Recall  F1 score
Negative class  0.87       0.33    0.48
Positive class  0.86       0.99    0.92
Accuracy                           0.86
Macro avg       0.86       0.66    0.70
Weighted avg    0.86       0.86    0.83
Logistic Regression

With Rating
                Precision  Recall  F1 score
Negative class  0.80       0.67    0.73
Positive class  0.88       0.93    0.91
Accuracy                           0.86
Macro avg       0.84       0.80    0.82
Weighted avg    0.86       0.86    0.86

With Review
                Precision  Recall  F1 score
Negative class  0.83       0.56    0.67
Positive class  0.90       0.97    0.94
Accuracy                           0.89
Macro avg       0.87       0.77    0.80
Weighted avg    0.89       0.89    0.88
Hybrid Model (SVC + NB)

                Precision  Recall  F1 score
Negative class  0.76       0.27    0.40
Positive class  0.85       0.98    0.91
Accuracy                           0.84
Macro avg       0.80       0.62    0.65
Weighted avg    0.83       0.84    0.81
Best Model

• Based on the results, SVC and logistic regression are the best-performing algorithms for this dataset, given its bias towards the positive class.
• SVC is computationally expensive, so logistic regression is the preferred model among all those tested.
Limitations

• Limited number of records.
• Negation words, contrasts, and sarcasm are not handled separately.
• The dataset is raw and has not been corrected for grammatical errors.
• The performance of the algorithms is tested without accounting for cultural and contextual factors.
• Language evolves over time, with new slang and expressions emerging, so the models used might not always keep up with the changes.
Thank you
