NLP Submission
# Data visualization
import plotly.express as px
import plotly.graph_objects as go
import plotly.figure_factory as ff
import plotly.io as pio
# Evaluation metrics
from sklearn import metrics
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
# Disable warnings
import warnings
warnings.filterwarnings('ignore')
In [4]: print("")
print("Dataset Shape =>", df.shape) #get dataset shape
print("")
print("Dataset Info")
print("-"*18)
print(df.info()) #get dataset info
print("")
print("")
print("Dataset Summary")
print("-"*18)
print(df.describe()) #get dataset summary
Dataset Shape => (3473, 5)
Dataset Info
------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3473 entries, 0 to 3472
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Review Header 3473 non-null object
1 Review Date 3473 non-null object
2 Review 3473 non-null object
3 Class Type 3472 non-null object
4 Sentiment 3473 non-null object
dtypes: object(5)
memory usage: 135.8+ KB
None
Dataset Summary
------------------
                          Review Header        Review Date Sentiment
count                              3473               3473      3473
unique                             2464               1669         2
top     British Airways customer review  19th January 2015  Positive
freq                                956                 26      2746
print("")
print("Missing Values")
print("-"*18)
print(df.isna().sum())
Missing Values
------------------
Review Header 0
Review Date 0
Review 0
Class Type 1
Sentiment 0
dtype: int64
[Bar chart: per-column value counts - Review Header, Review Date, Review, Class Type, Sentiment]
Dataset Pre-Processing
1) Drop columns that are not needed
In [6]: '''
We do not require the columns Review Date and Class Type
for our sentiment analysis, so we drop them.
'''
newdf = df.copy()  # copy so the original DataFrame is left untouched
newdf = newdf.drop(["Review Date", "Class Type"], axis=1)
newdf.head(5)
3) Reduce Noise
In [7]: '''
Remove the "✅ Trip Verified |" prefix
from the Review column for noise removal and standardization.
'''
# str.lstrip strips a set of characters, not a prefix, so use a regex replace instead
newdf['Review'] = newdf['Review'].str.replace(r'^\s*✅?\s*Trip Verified \|\s*', '', regex=True)
newdf.head(5)
4) Lowercasing
In [8]: '''
Lowercasing:
For data normalisation. Allows for uniform word comparisons across cases,
conforms to language-model conventions, and decreases vocabulary size,
increasing efficiency and properly capturing the underlying semantics.
'''
newdf.loc[:, 'Review Header'] = newdf['Review Header'].str.lower()
newdf['Review'] = newdf['Review'].str.lower()
newdf.head(5)
5) Remove Punctuation
In [9]: '''
Remove Punctuation:
For data normalisation. Removes punctuation from the review text.
Applied to both columns - Review Header, Review.
'''
import re

def removeP(review):
    punct = r'[^\w\s]'
    return re.sub(punct, '', review)

newdf['Review Header'] = newdf['Review Header'].apply(removeP)
newdf['Review'] = newdf['Review'].apply(removeP)
newdf.head(5)
6) Tokenization
In [10]: '''
Tokenize:
The tokenizer function splits the review text into
individual words, creating tokens.
Each token is then converted to lowercase for consistency and
easier word comparisons.
Applying the tokenizer function to the "Review" column
creates a new "Tokenized Review" column with the tokenized text.
'''
from nltk.tokenize import word_tokenize

def tokenizer(review):
    tokens = word_tokenize(review)
    return [token.lower() for token in tokens]

newdf['Tokenized Review'] = newdf['Review'].apply(tokenizer)
6) Remove Stopwords
In [11]: '''
Stopwords are inconsequential words that carry little weight in sentiment analysis.
We remove them from the 'Tokenized Review' column.
The NLTK library is used to retrieve the set of English stopwords.
The removeSW function removes stopwords from each tokenized review,
resulting in a new filtered list of tokens.
'''
from nltk.corpus import stopwords

stopWords = set(stopwords.words('english'))

def removeSW(tok_review):
    return [word for word in tok_review if word.lower() not in stopWords]

newdf['Tokenized Review'] = newdf['Tokenized Review'].apply(removeSW)
7) Stemming
In [12]: '''
Stemming is applied to the tokenized reviews
in order to reduce words to their basic or root form.
The NLTK library's SnowballStemmer is used to stem English words.
The stemmer function stems each token in the tokenized review,
adding a new column called "Stemmed Review" to the DataFrame.
'''
from nltk.stem import SnowballStemmer

stemming = SnowballStemmer("english")

def stemmer(tok_review):
    return [stemming.stem(i) for i in tok_review]

newdf['Stemmed Review'] = newdf['Tokenized Review'].apply(stemmer)
8) Lemmatization
In [14]: """
Instead, we lemmatize our tokenized reviews in order
to reduce words to their simplest form.
The NLTK library's WordNetLemmatizer is used for lemmatization.
The lemmatizer function lemmatizes each token in the tokenized review,
inserting a new column called "Lemmatized Review" into the DataFrame.
"""
from nltk.stem import WordNetLemmatizer

lemm = WordNetLemmatizer()

def lemmatizer(tok_review):
    return [lemm.lemmatize(j) for j in tok_review]

newdf['Lemmatized Review'] = newdf['Tokenized Review'].apply(lemmatizer)
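Taken together, steps 4) to 6) form a single normalisation pipeline. A minimal stdlib-only sketch of the idea (a simple whitespace tokenizer and a toy stopword list stand in for NLTK here; the stemming/lemmatization steps are omitted because they need NLTK's models):

```python
import re

# Toy stopword list standing in for nltk.corpus.stopwords (illustration only)
STOPWORDS = {"the", "a", "an", "was", "is", "to", "and", "of"}

def preprocess(review):
    review = review.lower()                  # 4) lowercasing
    review = re.sub(r'[^\w\s]', '', review)  # 5) remove punctuation
    tokens = review.split()                  # 6) naive whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]  # remove stopwords

print(preprocess("The crew was friendly, and the seats were comfortable!"))
# ['crew', 'friendly', 'seats', 'were', 'comfortable']
```

Each step mirrors the corresponding notebook cell above, just applied to a single string instead of a DataFrame column.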
[Bar chart: sentiment counts - Count vs Sentiment (Positive, Negative)]
[Bar chart: Sentiment Percentages - Percentage vs Sentiment (Positive, Negative)]
Sentiment Analysis
Method 1: Multinomial Naive Bayes
About:
The Multinomial Naive Bayes model is a common choice for text classification problems because it deals effectively with discrete features
such as word frequencies. It applies Bayes' theorem under the assumption of feature independence to estimate conditional probabilities and classify
sentiment, particularly in customer reviews.
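The mechanics behind this can be sketched without sklearn. Below is a toy multinomial Naive Bayes over token counts with Laplace smoothing (the names and data here are illustrative; the notebook itself uses sklearn's MultinomialNB):

```python
import math
from collections import Counter

def train_mnb(docs, labels, alpha=1.0):
    """Return log-priors and Laplace-smoothed log-likelihoods per class."""
    vocab = {w for d in docs for w in d}
    log_prior, log_like = {}, {}
    for c in set(labels):
        c_docs = [d for d, l in zip(docs, labels) if l == c]
        log_prior[c] = math.log(len(c_docs) / len(docs))
        counts = Counter(w for d in c_docs for w in d)
        total = sum(counts.values())
        log_like[c] = {w: math.log((counts[w] + alpha) / (total + alpha * len(vocab)))
                       for w in vocab}
    return log_prior, log_like

def predict_mnb(doc, log_prior, log_like):
    """Pick the class maximising log P(c) + sum of log P(w|c)."""
    scores = {c: log_prior[c] + sum(log_like[c].get(w, 0.0) for w in doc)
              for c in log_prior}
    return max(scores, key=scores.get)

docs = [["great", "flight"], ["awful", "delay"], ["great", "crew"], ["awful", "food"]]
labels = ["Positive", "Negative", "Positive", "Negative"]
model = train_mnb(docs, labels)
print(predict_mnb(["great", "service"], *model))  # Positive
```

Unknown words ("service") simply contribute nothing to either class score, which is one simplification compared with sklearn's handling of a fixed fitted vocabulary.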
# Split dataset
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.2,
                                                    random_state=42)
# Create and train the NB classifier
from sklearn.naive_bayes import MultinomialNB

nbclf = MultinomialNB()
nbclf.fit(X_train, y_train)
# Plotting scores (nb_metrics is assumed to be the NB score dict, mirroring dt_metrics below)
fig = go.Figure(data=[
    go.Bar(x=list(nb_metrics.keys()),
           y=list(nb_metrics.values()),
           marker_color='lightskyblue')
])
fig.update_layout(
    title="Naive Bayes Performance Metrics",
    xaxis_title="Metrics",
    yaxis_title="Value",
    showlegend=False
)
fig.update_layout(width=800, height=500)
fig.show()
[Bar chart: Naive Bayes Performance Metrics - Accuracy, Precision, Recall, F1-Score]
[Heatmap: Confusion Matrix - Sentiment Analysis (Predicted Labels vs True Labels)]
In [18]: '''
I will use a pipeline to train my Decision Tree classifier.
CountVectorizer and TfidfTransformer preprocess the text data to make it suitable for the classifier.
After fitting the pipeline to the training data, we predict the sentiment of the test data and then evaluate its performance.
'''
# Employ and train the pipeline for Decision Tree classifier
dt_pipeline = Pipeline([("vect", CountVectorizer()),
("tfidf", TfidfTransformer()),
("clf_decisionTree", DecisionTreeClassifier())])
dt_pipeline.fit(X_train, y_train)
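Inside the pipeline, CountVectorizer produces raw term counts and TfidfTransformer reweights them so that terms common across all reviews count for less. A stdlib sketch of that reweighting, using sklearn's default smoothed idf, ln((1+n)/(1+df))+1, followed by l2 normalisation (sublinear-tf and other options are ignored here):

```python
import math

def tfidf(docs):
    """Smoothed tf-idf with l2 row normalisation (mirroring sklearn's defaults)."""
    n = len(docs)
    vocab = sorted({w for d in docs for w in d})
    df = {w: sum(w in d for d in docs) for w in vocab}          # document frequency
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    rows = []
    for d in docs:
        row = [d.count(w) * idf[w] for w in vocab]              # tf * idf
        norm = math.sqrt(sum(v * v for v in row)) or 1.0        # l2 normalise
        rows.append([v / norm for v in row])
    return vocab, rows

vocab, mat = tfidf([["good", "flight"], ["bad", "flight"], ["good", "crew"]])
```

A word like "flight" that appears in most documents gets a lower idf, and hence a lower weight, than a rarer word in the same row.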
# Predict on the test set and calculate evaluation metrics
decision_tree = dt_pipeline.predict(X_test)
dt_accuracy = accuracy_score(y_test, decision_tree)
dt_scores = classification_report(y_test,
                                  decision_tree,
                                  zero_division=0,
                                  output_dict=True)
dt_metrics = {
"Accuracy": dt_accuracy,
"Precision": dt_scores["macro avg"]["precision"],
"Recall": dt_scores["macro avg"]["recall"],
"F1-Score": dt_scores["macro avg"]["f1-score"]
}
# Plotting scores (dt_metrics is a dict, so pass its keys and values)
fig2 = go.Figure(data=[
    go.Bar(x=list(dt_metrics.keys()),
           y=list(dt_metrics.values()),
           marker_color='lightskyblue')
])
fig2.update_layout(
title="Decision Tree Performance Metrics",
xaxis_title="Metrics",
yaxis_title="Value",
showlegend=False
)
fig2.update_layout(width=800, height=500)
fig2.show()
[Bar chart: Decision Tree Performance Metrics - Accuracy, Precision, Recall, F1-Score]
[Heatmap: Decision Tree: Confusion Matrix (Predicted Labels vs True Labels)]
# Plotting scores (lr_metrics is a dict, so pass its keys and values)
fig3 = go.Figure(data=[
    go.Bar(x=list(lr_metrics.keys()),
           y=list(lr_metrics.values()),
           marker_color='lightskyblue')
])
fig3.update_layout(
title="Logistic Regression Performance Metrics",
xaxis_title="Metrics",
yaxis_title="Value",
showlegend=False
)
fig3.update_layout(width=800, height=500)
fig3.show()
[Bar chart: Logistic Regression Performance Metrics - Accuracy, Precision, Recall, F1-Score]
[Heatmap: Logistic Regression: Confusion Matrix (Predicted Labels vs True Labels)]
In [20]: # Employ and train the pipeline for Random Forest classifier
rf_pipeline = Pipeline([("vect", CountVectorizer()),
("tfidf", TfidfTransformer()),
("clf_randomForest", RandomForestClassifier())])
rf_pipeline.fit(X_train, y_train)
# Plotting scores (rf_metrics is a dict, so pass its keys and values)
fig4 = go.Figure(data=[
    go.Bar(x=list(rf_metrics.keys()),
           y=list(rf_metrics.values()),
           marker_color='lightskyblue')
])
fig4.update_layout(
title="Random Forest Performance Metrics",
xaxis_title="Metrics",
yaxis_title="Value",
showlegend=False
)
fig4.update_layout(width=800, height=500)
fig4.show()
[Bar chart: Random Forest Performance Metrics - Accuracy, Precision, Recall, F1-Score]
[Heatmap: Random Forest: Confusion Matrix (Predicted Labels vs True Labels)]
# Melt df
combined_metrics_melted = pd.melt(combined_metrics,
id_vars="Classifier",
var_name="Metric",
value_name="Value")
# Visualization
fig = px.bar(combined_metrics_melted,
x="Classifier",
y="Value",
color="Metric",
barmode="group")
fig.update_layout(title="Comparison of Metrics",
xaxis_title="Classifier",
yaxis_title="Value",
width=800,
height=500)
fig.show()
[Grouped bar chart: Comparison of Metrics - Accuracy, Precision, Recall, F1-Score per classifier (Naive Bayes, Random Forest, Logistic Regression, Decision Tree)]
fig.update_layout(
title='Comparison of Metrics',
xaxis_title='Classifier',
yaxis_title='Value',
width=800, height=500
)
[Grouped bar chart: Comparison of Metrics - Accuracy, Precision, Recall, F1-Score per classifier, y-axis 0.6 to 0.85]
Negative => the model detects negative sentiment fairly well, but there is room for improvement.
Positive => the model has excellent precision, recall, and f1-score, showing strong performance in recognising positive cases.
Overall, this model performs well, with strong accuracy, precision, recall, and f1-score. It outperforms the other models evaluated in this study, demonstrating its effectiveness in sentiment classification.
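The per-class and macro-averaged figures discussed above all derive from the confusion matrix. A stdlib sketch of how macro precision/recall/f1 are computed from binary predictions (matching what classification_report reports as "macro avg"; the example labels are made up):

```python
def macro_metrics(y_true, y_pred, classes=("Negative", "Positive")):
    """Per-class precision/recall/f1, then an unweighted (macro) average."""
    precs, recs, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec); recs.append(rec); f1s.append(f1)
    n = len(classes)
    return {"Precision": sum(precs) / n,
            "Recall": sum(recs) / n,
            "F1-Score": sum(f1s) / n}

y_true = ["Positive", "Positive", "Negative", "Negative", "Positive"]
y_pred = ["Positive", "Negative", "Negative", "Positive", "Positive"]
print(macro_metrics(y_true, y_pred))  # each metric ≈ 0.583 here
```

Because macro averaging weights both classes equally, a model that does well on the majority Positive class but poorly on Negative is penalised, which is why the Negative-class scores above pull the averages down.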
Contributions to the selected domain-specific area & potential scope of project (transferability)
The transferability of airline review sentiment analysis extends beyond industry-specific applications to many geographical locations. The approaches and models used here could be adapted to analyse
reviews of other airlines or of services elsewhere in the world. This portability enables not only British Airways but other organisations to acquire worldwide insights into customer sentiment and preferences, allowing them to
tailor their products and strategies to individual regions. Furthermore, the portability of sentiment analysis models can encourage partnerships and information exchange across sectors and geographies, leading to
breakthroughs in sentiment analysis methodologies and the development of more robust and accurate models.
The tools and methodology created for analysing sentiment in airline reviews may be extended and applied to a wide range of areas, including hospitality, e-commerce, healthcare, and others. In the hospitality industry,
sentiment analysis may assist hotels and resorts in gauging client satisfaction, identifying areas for improvement, and tailoring services to meet guest expectations. In e-commerce, analysing sentiment in customer reviews may give
significant insights into product quality, consumer preferences, and brand perception, assisting product development and marketing strategies. Similarly, in healthcare, sentiment analysis
may help medical institutions evaluate patient feedback, improve patient treatment, and increase overall satisfaction. Because sentiment analysis methodologies are versatile, firms across these areas may leverage
the power of consumer sentiment and make data-driven decisions to drive success and improve customer experiences.
Conclusion, Personal Reflection - what I think I could have done better?
Working on this project has been a really delightful experience for me, and it has given me valuable insights into how even relatively little data can have an enormous impact when paired with simple NLP approaches. It
felt like a marriage between two giants: mighty data and NLP's skills. This project has certainly opened my eyes to the limitless potential in this field and undoubtedly piqued my curiosity.
While I am pleased with the progress made, I recognise that certain areas of the project could have been handled more efficiently. I faced particular difficulties with dataset pre-processing, which I recognise as
a critical aspect affecting the project's outcomes, but despite this I have given my best efforts. I'm excited to return and improve on this after my midterm submission. I should also look at options beyond the
standard popular models; there may be better-performing models for this case study, and it is only right that I explore them.
The opportunity to further pursue my interest in the field of NLP was made possible by the professors' decision to set us this coursework, and I would like to thank them for that. Even more so, I'm curious to see what
the second half of the module has in store.