OF RESTAURANT REVIEWS
CRISP-DM methodology
Business understanding
Figure: Rating vs Review count
Data understanding
Figure: Review count per state
Distribution of sentiment based on Ratings
Distribution of sentiment based on Review
Data preparation
Naïve Bayes
Support Vector Classifier
Random Forest
Logistic Regression
Hybrid model
Naïve Bayes
Positive Class   0.88   0.93   0.91   |   Positive Class   0.91   0.96   0.94
Macro-avg        0.84   0.80   0.82   |   Macro-avg        0.85   0.78   0.81
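The three figures in each row appear to follow the usual classification-report order of precision, recall and F1 score; that reading is an assumption, since the slide does not label the columns. As a minimal sketch of how such per-class and macro-averaged values can be derived, the following computes them from confusion counts. The counts, the `ClassCounts` type and the function names are hypothetical and chosen only for illustration; they are not the results reported above.

```python
from dataclasses import dataclass


@dataclass
class ClassCounts:
    """Confusion counts for one class, treated as the 'target' class."""
    tp: int  # target reviews correctly predicted as target
    fp: int  # other reviews wrongly predicted as target
    fn: int  # target reviews wrongly predicted as other


def precision_recall_f1(c: ClassCounts) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for a single class."""
    predicted = c.tp + c.fp
    actual = c.tp + c.fn
    precision = c.tp / predicted if predicted else 0.0
    recall = c.tp / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(per_class: list[tuple[float, float, float]]) -> tuple[float, ...]:
    """Unweighted mean of each metric across classes (the macro average)."""
    n = len(per_class)
    return tuple(sum(metrics[i] for metrics in per_class) / n for i in range(3))


# Hypothetical counts for an imbalanced two-class problem (illustration only).
positive = ClassCounts(tp=90, fp=10, fn=5)
negative = ClassCounts(tp=20, fp=5, fn=10)

per_class = [precision_recall_f1(positive), precision_recall_f1(negative)]
macro = macro_average(per_class)

for name, metrics in [("Positive Class", per_class[0]),
                      ("Negative Class", per_class[1]),
                      ("Macro-avg", macro)]:
    print(f"{name:<15}" + "  ".join(f"{m:.2f}" for m in metrics))
```

Because the macro average weights every class equally, a weaker minority class pulls it below the positive-class scores, which matches the pattern seen in the rows above.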
Based on the results, SVC and Logistic Regression are the best-performing algorithms
for this dataset, given its bias towards the positive class.
Because SVC is computationally expensive, Logistic Regression is the preferred choice
among the models tested.
Limitations