
Swiggy Sales Analysis Page|1

CHAPTER 1
INTRODUCTION

1.1 PROJECT AIM

In the digital age, user-generated reviews play a pivotal role in shaping the success of
applications. Understanding the sentiment expressed in these reviews can provide valuable
insights for developers and stakeholders. App Reviews Sentiment Analysis means evaluating
and understanding the sentiments expressed in user reviews of mobile applications (apps). This
project aims to harness the power of sentiment analysis using Python to analyze app reviews
and uncover the sentiments conveyed by users. It involves using data analysis techniques to
determine whether the sentiments in these reviews are positive, negative, or neutral.

LinkedIn is a leading professional networking platform designed to connect
individuals, businesses, and organizations worldwide. Launched in 2003, it has become a vital
tool for professionals across various industries to establish and nurture professional
relationships, share insights, and explore career opportunities. With over 774 million members
in more than 200 countries and territories, LinkedIn offers a diverse range of features, including
profiles, job postings, networking groups, and content sharing. It serves as a platform for
recruiters to source talent, for businesses to showcase their brand and expertise, and for
individuals to build their professional brand and advance their careers. LinkedIn's mission is to
create economic opportunity for every member of the global workforce by empowering
professionals to connect, learn, and grow. As a trusted and influential platform in the
professional world, LinkedIn continues to evolve and innovate to meet the changing needs of
its users and the dynamic landscape of work.

LinkedIn, being a prominent player in the professional networking space, accumulates
a wealth of user-generated reviews that encapsulate sentiments and experiences. This
Python-based sentiment analysis project is dedicated to unraveling the sentiments embedded within
LinkedIn app reviews, offering a nuanced understanding of user feedback. By leveraging
natural language processing (NLP) techniques and machine learning algorithms, we aim to
decipher the sentiments expressed by users – whether they convey satisfaction, dissatisfaction,
or neutrality.

East West Institute of Technology Department of MCA 2023-24



1.2 PROBLEM STATEMENT


Suppose a team of developers has launched a new feature on the LinkedIn app. Shortly
after its release, they start receiving a wide range of user reviews, from praise to complaints
about the feature's functionality and user experience. Because of the sheer volume and
diversity of this feedback, it is difficult for the developers to sift through the
reviews manually and understand the overall sentiment and the specific areas of concern. As a
result, they are unable to prioritize improvements effectively or gauge user satisfaction
accurately. In this context, our project focuses on developing a Python-based solution to
analyze and categorize the LinkedIn app reviews, enabling the developers to gain actionable
insights, enhance the feature, and improve the overall user experience.

1.3 OBJECTIVES
The primary objective of this project is to analyze the sentiment of user reviews for
the LinkedIn app to provide insights into user satisfaction, areas of improvement,
and overall sentiment trends.

• Gather a comprehensive dataset of user reviews for the LinkedIn app from
sources such as app stores (Google Play Store) to ensure a representative
sample for analysis.
• Implement preprocessing techniques to clean and prepare the textual data,
including tokenization, removal of stop words, and handling of special
characters.
• Utilize Natural Language Processing (NLP) techniques to perform sentiment
analysis on the reviews, categorizing them as positive, negative, or neutral.
• Extract meaningful insights from the sentiment analysis results, identifying
common themes, recurring issues, and areas of user satisfaction or
dissatisfaction.
• Create visualizations, such as charts and graphs, to present the findings in an
accessible and understandable format, facilitating easier interpretation and
decision-making for stakeholders.
• Use sentiment analysis to demonstrate responsiveness to user feedback,
fostering trust and loyalty among users by showing a commitment to
addressing their concerns and enhancing their experience.


1.4 SCOPE AND PURPOSE


Scope:
The scope of LinkedIn app review sentiment analysis encompasses a
comprehensive approach to gathering, analyzing, and utilizing user feedback to improve
the performance and user experience of the mobile application. This involves collecting
app reviews and ratings from various sources, including app stores and online platforms,
and applying natural language processing techniques to categorize feedback as positive,
negative, or neutral. Additionally, sentiment analysis involves extracting topics or themes
mentioned in reviews, such as usability, features, performance, and customer support, to
prioritize issues and areas for enhancement. Continuous monitoring of trends in user
sentiment over time allows for the identification of emerging issues and the assessment
of the impact of updates and improvements on user satisfaction. By integrating insights
from sentiment analysis into the app development process and product roadmap planning,
LinkedIn aims to drive iterative improvements, enhance user engagement, and maintain
competitiveness in the mobile app market.

Purpose:
The purpose of conducting sentiment analysis on LinkedIn app reviews is
multifaceted, aiming to extract valuable insights that inform decision-making processes
and drive continuous improvement in the mobile application. By analyzing user feedback,
LinkedIn seeks to understand the sentiments expressed by app users, ranging from
satisfaction to frustration, and identify recurring themes or issues affecting their
experience. This analysis enables LinkedIn to prioritize areas for enhancement, address
pain points, and optimize features to better meet user needs and expectations. Moreover,
sentiment analysis helps LinkedIn track trends in user sentiment over time, assess the
effectiveness of implemented changes, and benchmark against competitors. Ultimately,
the overarching purpose of sentiment analysis on LinkedIn app reviews is to foster a
user-centric approach to app development, enhance user satisfaction, and maintain LinkedIn's
position as a leading platform for professional networking and career development.


CHAPTER 2
LITERATURE SURVEY

• S Kurniawan et al. Text Mining Pre-Processing Using Gata Framework and
RapidMiner for Indonesian Sentiment Analysis. The GATA Framework for text
mining is one of the options for conducting text mining research in
Indonesian and has been used by several researchers. There are several known data
mining process methods, including KDD, CRISP-DM, and SEMMA, all three
of which are quite reliable. CRISP-DM, which consists of Business
Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and
Deployment, is a method that is quite widely used in research in the field of text
mining and can be combined with text pre-processing.

• Mahmud Isnan et al. Sentiment analysis for TikTok review using VADER
sentiment and SVM model. TikTok, a social networking site for uploading short
videos, has become one of the most popular platforms. Despite this, not all users are
happy with the app; there are criticisms and suggestions, some of which are posted as
reviews of the TikTok app on the Google Play Store. The reviews were extracted and
then used for training a sentiment analysis model.
• Hutto, C.J., & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for
Sentiment Analysis of Social Media Text. Eighth International Conference on
Weblogs and Social Media. The inherent nature of social media content poses
serious challenges to practical applications of sentiment analysis. This paper
presents VADER, a simple rule-based model for general sentiment analysis, and
compares its effectiveness to eleven typical state-of-practice benchmarks including
LIWC, ANEW, the General Inquirer, SentiWordNet, and machine learning
oriented techniques relying on Naive Bayes, Maximum Entropy, and Support
Vector Machine (SVM) algorithms.

• Abdi A, Shamsuddin SM, Hasan S, Piran J (2019) Deep learning-based sentiment
classification of evaluative text based on multi-feature fusion. Inf Process Manag
56(4):1245–1259. Sentiment analysis concerns the study of opinions expressed in a
text. Due to the huge number of reviews, sentiment analysis plays a basic role in
extracting significant information and the overall sentiment orientation of reviews. In
this paper, the authors present a deep-learning-based method to classify a user's opinion
expressed in reviews.

• Al Amrani Y, Lazaar M, El Kadiri KE (2018) Random forest and support vector
machine based hybrid approach to sentiment analysis. Procedia Comput Sci
127:511–520. Sentiment analysis is becoming more popular as a research area. It assigns
positive or negative polarity to an entity or item by using different natural language
processing tools, and also predicts the high or low performance of various sentiment
classifiers. The work focuses on sentiment analysis of product reviews
using original text search techniques.

2.1 EXISTING SYSTEM


The current approach to managing app reviews on LinkedIn relies heavily on
manual processes, lacking the efficiency and scalability that automated solutions can offer.
Users provide feedback directly through the app, which is then manually sorted and
categorized by a team to discern sentiments. This manual categorization is not only
time-consuming but also prone to human error, potentially leading to inconsistencies in the
analysis. Moreover, the visual representations used to present sentiment trends, such as
basic charts, provide limited insights and lack the depth necessary for informed
decision-making.

One of the major drawbacks of the existing system is the absence of automated sentiment
analysis, which would greatly expedite the review process and allow for more
comprehensive analysis of user feedback. Automated sentiment analysis tools can
efficiently categorize and analyze large volumes of data, providing valuable insights into
user sentiments in real-time. This would enable LinkedIn to identify emerging trends and
address issues promptly, ultimately enhancing user satisfaction and retention.

2.1.1 Limitations:

• Manual categorization introduces subjectivity and potential inconsistencies.
• Time-consuming manual processes for review tracking and analysis.
• Limited scalability with manual methods as user feedback volume grows.
• Real-time insights are unavailable due to the absence of automated
sentiment analysis.
• Dependency on human resources for manual analysis.
• Inefficiency compared to automated processes, leading to delays in
addressing user concerns.
• Increased risk of errors with human involvement in sentiment analysis.
• Limited integration with LinkedIn's backend, reducing the effectiveness of
feedback in product development.

2.2 PROPOSED SYSTEM


The proposed system aims to develop a Python-based application for sentiment analysis
of LinkedIn app reviews. Leveraging LinkedIn's API, the system will collect user reviews
and preprocess the data to clean and tokenize text. Using machine learning models like Naive
Bayes or SVM, the application will classify reviews into positive, negative, or neutral
sentiments. Visualizations, such as bar charts and word clouds, will depict sentiment
distributions and key feedback themes. A user-friendly dashboard will facilitate real-time
analysis and integration with LinkedIn's backend. Continuous feedback mechanisms will
ensure the model's relevance and ongoing improvement, empowering LinkedIn to make
informed decisions for enhancing app user experience.

2.2.1 Advantages:
• Increased efficiency through automation of sentiment analysis processes.
• Ability to handle large volumes of user reviews, ensuring scalability.
• Accurate classification of reviews into positive, negative, or neutral
sentiments using advanced machine learning algorithms.
• Intuitive visualizations, such as bar charts and word clouds,
offer comprehensive insights into sentiment distributions and key feedback
themes.
• Real-time access to sentiment analysis results via a user-friendly dashboard.
• Continuous feedback mechanisms for ongoing improvement and
adaptation to evolving user preferences.
• Empowerment of decision-making processes through timely and accurate
insights into user sentiments.


2.2.2 Disadvantages:
• Building and implementing a Python-based application with advanced
machine learning models requires significant initial investment in terms of
time, resources, and expertise.
• The accuracy of sentiment analysis heavily depends on the quality and
relevance of the data collected from user reviews. Poor-quality or biased
data can lead to inaccurate analysis results.
• Machine learning models may struggle to understand nuanced context or
sarcasm present in user reviews, potentially leading to misinterpretation of
sentiments.
• Relying solely on automated sentiment analysis may overlook the
importance of human judgment and qualitative insights, potentially
neglecting crucial aspects of user feedback analysis.

CHAPTER 3

SOFTWARE REQUIREMENTS AND ANALYSIS

3.1 PYTHON LIBRARIES:

Pandas:
• Pandas is a widely used open-source data manipulation and analysis library
in Python. It provides high-performance data structures and tools for
working with structured data, making it particularly useful for tasks such as
data cleaning, exploration, and analysis.
• Central to Pandas is the DataFrame, a two-dimensional labeled data
structure resembling a spreadsheet or SQL table, which allows for easy
handling and manipulation of data. With Pandas, users can load data from
various file formats such as CSV, Excel, SQL databases, and JSON, and
perform operations like indexing, slicing, filtering, grouping, and
aggregation efficiently.
• Its integration with other Python libraries such as NumPy, Matplotlib, and
Scikit-learn further enhances its capabilities in data analysis and
visualization. Pandas also offers powerful features for handling missing
data, time-series data, and categorical data, providing comprehensive
solutions for real-world data analysis tasks.

Matplotlib.pyplot:
• Matplotlib.pyplot is a comprehensive plotting interface in Python widely used
for creating static, interactive, and publication-quality visualizations. It is a
part of the Matplotlib library, which offers a wide range of plotting functions
and options to visualize data in various formats.
• Matplotlib.pyplot provides a MATLAB-like interface for generating plots,
allowing users to create line plots, scatter plots, bar plots, histograms, and
many other types of plots with ease. Its versatility and customization options
make it suitable for a wide range of applications, from simple exploratory
data analysis to complex data visualization tasks.

• Matplotlib.pyplot integrates seamlessly with other Python libraries such as
NumPy and Pandas, allowing for smooth data manipulation and
visualization. Additionally, Matplotlib.pyplot supports various output
formats, including PNG, PDF, SVG, and EPS, making it suitable for
generating publication-quality figures for scientific publications and
presentations.
• With its extensive documentation, active community support, and rich
ecosystem of extensions and plugins, Matplotlib.pyplot remains one of the
most popular and powerful plotting libraries in the Python ecosystem for
data visualization.

Seaborn:
• Seaborn is a Python data visualization library based on Matplotlib that
provides a high-level interface for creating informative and visually
appealing statistical graphics. It is built on top of Matplotlib and integrates
seamlessly with Pandas data structures, making it particularly well-suited
for data analysis tasks.
• Seaborn simplifies the process of creating complex visualizations by
providing a wide range of built-in themes and color palettes, as well as
functions for creating various statistical plots such as scatter plots, bar plots,
box plots, violin plots, and heatmaps.
• These plots are designed to reveal patterns, trends, and relationships within
data quickly and efficiently. Seaborn also offers tools for visualizing
categorical data and for visualizing distributions of univariate and bivariate
data. Additionally, Seaborn provides support for complex multi-plot grids
and facilitates the creation of faceted visualizations for exploring data across
multiple dimensions.
• With its user-friendly interface, elegant default styles, and powerful
capabilities for statistical visualization, Seaborn has become a popular
choice among data scientists and analysts for creating professional-quality
visualizations for data exploration and presentation.


3.2 TECHNOLOGIES:

Swiggy Sales Data:

The Swiggy project involved data collection, cleaning, and exploration using Python
and advanced data exploration using SQL. The data collection process involved creating a
collection of tables that stored important information such as orders, food items, restaurant
menus, restaurants, and user registration information. The data was then cleaned and analyzed
to understand various aspects of the business, including popular cuisines, average price per
dish, top restaurants, and monthly sales. The SQL queries allowed for more advanced
exploration, such as identifying loyal customers, popular dishes, and revenue growth. The
project culminated in the creation of a dynamic Tableau dashboard that provided insights into
customer behaviour, top restaurants, monthly sales, and revenue growth. Overall, this project
demonstrated the importance of data-driven decision-making in the food delivery industry
and the value of data analysis in improving customer experience and increasing revenue.

TextBlob:

TextBlob is a Python library commonly used for sentiment analysis tasks, including
analyzing LinkedIn app reviews. TextBlob provides a straightforward and intuitive interface
for performing sentiment analysis, making it accessible for developers without extensive
NLP expertise. The main technique used in TextBlob for sentiment analysis is a
lexicon-based polarity scoring system: words in the text are looked up in a sentiment
lexicon and assigned polarity scores indicating their sentiment orientation (positive,
negative, or neutral). These scores are then aggregated into an overall polarity between
-1.0 and 1.0, reported together with a subjectivity score between 0.0 and 1.0. Additionally,
TextBlob offers an optional machine learning analyzer, a Naive Bayes classifier pre-trained
on a labeled corpus of movie reviews, which classifies text as positive or negative.
Furthermore, TextBlob offers capabilities for handling various aspects of natural
language processing, such as tokenization, part-of-speech tagging, and noun phrase
extraction, which can be useful for preprocessing text data before sentiment analysis.
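The polarity scoring idea described above can be illustrated with a minimal sketch. It does not call TextBlob itself: the lexicon, its scores, and the 0.1 threshold are assumptions invented for illustration, and the functions `polarity` and `label` are hypothetical names.

```python
# Toy lexicon: the words and scores are illustrative assumptions,
# not values taken from TextBlob or any other library.
TOY_LEXICON = {
    "good": 0.7, "great": 0.8, "love": 0.9, "useful": 0.5,
    "bad": -0.7, "slow": -0.4, "crash": -0.8, "hate": -0.9,
}


def polarity(text, lexicon=TOY_LEXICON):
    """Average the scores of the words found in the lexicon.

    Returns a value in [-1.0, 1.0]; 0.0 when no word is recognized.
    """
    words = [w.strip(".,!?").lower() for w in text.split()]
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def label(score, threshold=0.1):
    """Map a polarity score to a sentiment label (threshold is an assumption)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

For example, `label(polarity("Great app, love it!"))` gives "positive", while a review containing no lexicon words is labeled "neutral". A production lexicon would be far larger and would also handle negation and intensifiers.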

Natural Language Processing (NLP) Libraries:

NLP libraries like NLTK (Natural Language Toolkit) provide tools and resources for
processing and analyzing text data. Natural Language Processing (NLP) techniques are
essential for conducting sentiment analysis on LinkedIn app reviews. These techniques
involve a range of processes aimed at understanding and interpreting human language. One
common approach is text preprocessing, which involves tasks like tokenization, stemming,
and lemmatization to break down text into its constituent parts and standardize word forms.
The text can be classified into positive, negative, or neutral sentiments based on features
extracted from the text. Additionally, techniques such as part-of-speech tagging and named
entity recognition may be employed to identify important entities or aspects mentioned in the
reviews, providing deeper insights into specific areas of user satisfaction or dissatisfaction.
Overall, leveraging these NLP techniques enables LinkedIn app developers to gain valuable
insights from user reviews, facilitating data-driven decisions for enhancing the app's
performance and user satisfaction.

3.3 SOFTWARE AND HARDWARE REQUIREMENTS:

Hardware requirements:

• RAM: 4GB

• CPU: Intel Core i5 processor

A laptop with a minimum of 4GB RAM and an Intel Core i3 processor or higher is
generally suitable for basic computing tasks, productivity, and light multitasking. The Intel
Core i3 processor, or a higher-tier processor, provides decent performance for everyday
computing tasks. It can handle activities such as web browsing, word processing, spreadsheet
work, and multimedia consumption smoothly. With a minimum of 4GB RAM, the laptop can
efficiently run basic applications and provide a responsive computing experience. However,
for more demanding tasks or improved multitasking performance, consider opting for a laptop
with 8GB or more RAM. Laptops with these specifications often come with a variety of
operating systems, including Windows, macOS, or Linux. The choice of the operating system
depends on personal preferences and specific application requirements. Storage capacity may
vary, but laptops in this category typically come with at least 128GB to 256GB of solid-state
drive (SSD) storage. SSDs contribute to faster boot times and improved overall system
responsiveness. These laptops typically come with a variety of connectivity options, including
USB ports, HDMI, audio jacks, and wireless connectivity (Wi-Fi and Bluetooth). Some models
may also include additional features such as USB-C ports.

Software Requirements:

• Programming language: Python
• Development environment: Google Colab
• Libraries: Pandas, Matplotlib, Seaborn, TextBlob, NLTK

Python:

Python is an interpreted, object-oriented, high-level programming language with
dynamic semantics. Its high-level built-in data structures, combined with dynamic typing
and dynamic binding, make it particularly attractive for Rapid Application Development as well
as for use as a scripting or glue language to tie existing components together. Python's
straightforward syntax prioritizes readability and makes it simple to learn, which lowers the
cost of program maintenance. Python's support for modules and packages promotes the
modularity and reuse of code in programs. The Python interpreter and the comprehensive
standard library are freely distributable and available in source or binary form on all
popular platforms.

Features:

• Python works on different platforms (Windows, Mac, Linux, Raspberry Pi,
etc.).
• Python has a simple syntax similar to the English language.
• Python has syntax that allows developers to write programs with fewer lines
than some other programming languages.
• Python runs on an interpreter system, meaning that code can be executed as
soon as it is written. This means that prototyping can be very quick.

Google Colab:

• Google Colab, short for Google Colaboratory, is a cloud-based platform
provided by Google that offers users the capability to write and execute Python
code in a collaborative environment.

• It provides a free, accessible solution that requires no setup or installation,
allowing users to run Python code directly from their web
browsers. Integrated with Google Drive, Colab enables users to save and share
their notebooks effortlessly. Building on the popularity of Jupyter notebooks,
Colab supports interactive documents that combine live code with explanatory text.


CHAPTER 4
SYSTEM DESIGN

4.1 DATA FLOW DIAGRAM

Fig 4.1.1: Data Flow Diagram

• Input Data:
Providing data for app review sentiment analysis on the Play Store involves users
submitting their feedback and ratings regarding their experiences with the application.
Users typically access the Play Store through the Google Play app on their Android
devices or via the web interface. Once on the app's page, users can navigate to the
review section where they have the option to rate the app on a scale from one to five
stars and provide written feedback. This feedback can include comments on various
aspects of the app such as usability, performance, features, bugs, and overall
satisfaction. Users may also express their likes, dislikes, suggestions for improvements,
or report any issues they encounter.


• Data Processing:

Data preprocessing for sentiment analysis of LinkedIn app reviews involves
several essential steps to ensure the raw text data is cleaned and structured appropriately
for analysis. The preprocessing begins with removing noise, which includes eliminating
special characters, emojis, URLs, and other irrelevant symbols that may not contribute
to sentiment analysis. Following this, the text is converted to lowercase to standardize
the text and reduce complexity. Stop words, commonly occurring words like "and" or
"the," are then removed to focus on more meaningful content. Tokenization is applied
to break down the text into individual words or tokens. Additionally, stemming or
lemmatization techniques may be utilized to reduce words to their base forms, ensuring
consistency in the dataset. Finally, any missing data points are handled appropriately,
either through imputation or removal, to ensure the integrity of the dataset. By
performing these preprocessing steps, the raw LinkedIn app review data is transformed
into a clean and structured format ready for sentiment analysis, facilitating the
extraction of valuable insights into user sentiments towards the app.
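The preprocessing steps above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the stop-word list is deliberately tiny, the function names `preprocess` and `preprocess_all` are hypothetical, lowercasing is applied first so the patterns only need to match lowercase text, missing reviews are handled by removal rather than imputation, and stemming or lemmatization is left out.

```python
import re

# A deliberately tiny stop-word list (an assumption for illustration);
# a real project would use a fuller list, e.g. the one shipped with NLTK.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "in", "this"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_PATTERN = re.compile(r"[^a-z\s]")


def preprocess(review):
    """Return the list of cleaned tokens for one review."""
    text = review.lower()                      # standardize case
    text = URL_PATTERN.sub(" ", text)          # drop URLs
    text = NON_LETTER_PATTERN.sub(" ", text)   # drop digits, punctuation, emojis
    tokens = text.split()                      # whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess_all(reviews):
    """Clean every review, skipping missing (None) or empty entries."""
    cleaned = []
    for review in reviews:
        if not review or not review.strip():
            continue                           # handle missing data by removal
        cleaned.append(preprocess(review))
    return cleaned
```

Calling `preprocess_all(["Good app", None, "The login is slow"])` returns `[["good", "app"], ["login", "slow"]]`: the missing entry is dropped and the stop words "the" and "is" are removed.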

• Feature Extraction:
Term Frequency-Inverse Document Frequency weighs the importance of each
word based on its frequency in the review and across the entire corpus of reviews, thus
prioritizing words that are more informative for sentiment analysis. Word embeddings,
on the other hand, represent words in a continuous vector space, capturing semantic
relationships between words. These numerical feature representations encode the
semantic meaning of the reviews, allowing machine learning algorithms to learn
patterns and relationships between words and sentiments. By performing feature
extraction, the LinkedIn app review data is transformed into a format suitable for
training sentiment analysis models, enabling accurate prediction of the sentiment
expressed in the reviews.
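As a worked illustration of the TF-IDF weighting described above, the sketch below computes the weights from already tokenized reviews. It uses the plain unsmoothed formula tf × log(N / df); libraries such as scikit-learn use a smoothed variant, so their numbers will differ. The function name `tf_idf` and the sample reviews are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf(tokenized_reviews):
    """Return one {term: weight} dict per review.

    tf  = count of the term in the review / number of tokens in the review
    idf = log(N / df), where N is the number of reviews and df is the
          number of reviews containing the term (unsmoothed variant).
    """
    n_reviews = len(tokenized_reviews)
    document_frequency = Counter()
    for tokens in tokenized_reviews:
        document_frequency.update(set(tokens))   # count each term once per review

    weights = []
    for tokens in tokenized_reviews:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_reviews / document_frequency[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized reviews, used only to show the effect of the weighting.
sample_reviews = [
    ["app", "crash", "login"],
    ["app", "great"],
    ["app", "slow", "crash"],
]
sample_weights = tf_idf(sample_reviews)
```

In this sample, "app" occurs in every review, so its weight is zero everywhere, while "login" (found in one review) outweighs "crash" (found in two). This is the sense in which TF-IDF prioritizes the more informative words.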

• Sentiment Analysis (NLP Technologies):


Sentiment analysis for LinkedIn app reviews utilizes Natural Language
Processing (NLP) techniques to understand and categorize the sentiments expressed by
users. NLP enables the analysis of textual data to discern whether the sentiment
conveyed in the reviews is positive, negative, or neutral. Techniques such as
tokenization, part-of-speech tagging, and syntactic parsing are applied to break down
the text into smaller units, identify the grammatical structure, and extract relevant
features. By leveraging NLP techniques and machine learning models, sentiment
analysis provides valuable insights into user perceptions and sentiments towards the
LinkedIn app, enabling developers and stakeholders to make data-driven decisions for
enhancing user experience and satisfaction.
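To make the machine learning step more concrete, here is a minimal sketch of a multinomial Naive Bayes classifier, one of the model families named in the proposed system, written with the standard library only. The class name `NaiveBayesSentiment`, the add-one smoothing choice, and the five hand-labeled training reviews are assumptions for illustration; a real model would be trained on a large labeled set, typically with a library such as scikit-learn.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tokenized_reviews, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(tokenized_reviews, labels):
            self.word_counts[label].update(tokens)
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_reviews = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, n_reviews in self.label_counts.items():
            # log P(label) + sum of log P(word | label)
            score = math.log(n_reviews / total_reviews)
            n_words = sum(self.word_counts[label].values())
            for word in tokens:
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (n_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-labeled training reviews (already tokenized).
training_reviews = [
    ["great", "app", "love"],
    ["useful", "features", "great"],
    ["app", "crash", "slow"],
    ["hate", "update", "crash"],
    ["updated", "profile", "today"],
]
training_labels = ["positive", "positive", "negative", "negative", "neutral"]

model = NaiveBayesSentiment().fit(training_reviews, training_labels)
```

With this toy data, `model.predict(["crash", "slow"])` returns "negative" and `model.predict(["love", "great", "features"])` returns "positive". Log-probabilities are summed instead of multiplying raw probabilities to avoid numerical underflow on longer reviews.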

• Analysis and Visualization:

The analysis and visualization step in sentiment analysis of LinkedIn app
reviews is pivotal for gaining actionable insights from the processed data. During
analysis, the sentiment labels assigned to each review are examined to identify trends,
patterns, and outliers. This involves calculating metrics such as the proportion of
positive, negative, and neutral reviews, as well as sentiment distribution over time or
across different user segments. For visualization, various graphical techniques can be
employed to present the analysis findings in an intuitive and comprehensible manner.
Bar charts are commonly used to depict the distribution of sentiments, with bars
representing the frequency or proportion of positive, negative, and neutral reviews.
Additionally, word clouds can be generated to highlight frequently occurring words in
positive and negative reviews, providing insights into the key themes and topics driving
user sentiments. By combining analysis and visualization techniques, stakeholders can
gain a deeper understanding of user perceptions towards the LinkedIn app and identify
areas for improvement to enhance user satisfaction.
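The metrics mentioned above (label proportions and the distribution of sentiment over time) can be computed as in the following sketch. The function names, the assumed ISO date format, and the plain-text bar chart are illustrative assumptions; in the project itself the chart would more naturally be drawn with Matplotlib or Seaborn.

```python
from collections import Counter, defaultdict


def sentiment_proportions(labels):
    """Return {label: share of reviews}; shares sum to 1 for non-empty input."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def sentiment_by_month(records):
    """Count labels per month from (date_string, label) pairs.

    Dates are assumed to be ISO strings such as "2024-03-15".
    """
    monthly = defaultdict(Counter)
    for date, label in records:
        monthly[date[:7]][label] += 1   # "YYYY-MM" prefix identifies the month
    return {month: dict(counts) for month, counts in sorted(monthly.items())}


def text_bar_chart(proportions, width=20):
    """Render proportions as a plain-text bar chart, one line per label."""
    lines = []
    for label, share in sorted(proportions.items()):
        bar = "#" * round(share * width)
        lines.append(f"{label:<9}{bar} {share:.0%}")
    return "\n".join(lines)
```

Given the labels `["positive", "positive", "negative", "neutral"]`, `sentiment_proportions` reports 50% positive and 25% each for negative and neutral. `sentiment_by_month` groups the same kind of labels by month so that the effect of an app update on user sentiment can be followed over time.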


CHAPTER 5
IMPLEMENTATION
5.1 CODING
# import python libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt # visualizing data
# IPython/Colab magic; remove this line when running as a plain script
%matplotlib inline
import seaborn as sns
# import csv file
# to avoid encoding error, use 'unicode_escape'
df = pd.read_csv(r'C:\Users\RAJU\Desktop\MY Recent files\Python_Diwali_Sales_Analysis-main\Diwali Sales Data.csv', encoding='unicode_escape')
df.shape
df.head(4)
df.info()
# drop unrelated/blank columns
df.drop(['Status', 'unnamed1'], axis=1, inplace=True)
# check for null values
pd.isnull(df).sum()
# drop null values
df.dropna(inplace=True)
# change data type
df['Amount'] = df['Amount'].astype('int')
# to check the datatype
df['Amount'].dtypes
df.columns
# rename column (returns a renamed copy; df keeps 'Marital_Status' because the result is not assigned)
df.rename(columns={'Marital_Status': 'Shaadi'})
# describe() method returns description of the data in the DataFrame (i.e. count, mean, std, etc.)
df.describe()


# use describe() for specific columns


df[['Age', 'Orders', 'Amount']].describe()
df.columns
# plotting a bar chart for Gender and it's count

ax = sns.countplot(x = 'Gender',data = df)

for bars in ax.containers:


ax.bar_label(bars)
# plotting a bar chart for gender vs total amount

sales_gen = df.groupby(['Gender'],
as_index=False)['Amount'].sum().sort_values(by='Amount', ascending=False)

sns.barplot(x = 'Gender',y= 'Amount' ,data = sales_gen)


# count of buyers per Age Group, split by Gender
ax = sns.countplot(data = df, x = 'Age Group', hue = 'Gender')
for bars in ax.containers:
    ax.bar_label(bars)
# Total Amount vs Age Group
sales_age = df.groupby(['Age Group'], as_index=False)['Amount'].sum().sort_values(by='Amount', ascending=False)
sns.barplot(x = 'Age Group',y= 'Amount' ,data = sales_age)


# total number of orders from top 10 states
sales_state = df.groupby(['State'], as_index=False)['Orders'].sum().sort_values(by='Orders', ascending=False).head(10)
sns.set(rc={'figure.figsize':(15,5)})
sns.barplot(data = sales_state, x = 'State',y= 'Orders')
# total amount/sales from top 10 states
sales_state = df.groupby(['State'], as_index=False)['Amount'].sum().sort_values(by='Amount', ascending=False).head(10)
sns.set(rc={'figure.figsize':(15,5)})
sns.barplot(data = sales_state, x = 'State',y= 'Amount')
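# count of buyers by marital status (figure size is set before the plot is drawn so that it takes effect)
sns.set(rc={'figure.figsize':(7,5)})
ax = sns.countplot(data = df, x = 'Marital_Status')
for bars in ax.containers:
    ax.bar_label(bars)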
# total amount by marital status and gender
sales_state = df.groupby(['Marital_Status', 'Gender'], as_index=False)['Amount'].sum().sort_values(by='Amount', ascending=False)
sns.set(rc={'figure.figsize':(6,5)})
sns.barplot(data = sales_state, x = 'Marital_Status',y= 'Amount', hue='Gender')
# count of buyers by occupation
sns.set(rc={'figure.figsize':(20,5)})
ax = sns.countplot(data = df, x = 'Occupation')
for bars in ax.containers:
    ax.bar_label(bars)
# total amount by occupation
sales_state = df.groupby(['Occupation'], as_index=False)['Amount'].sum().sort_values(by='Amount', ascending=False)
sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(data = sales_state, x = 'Occupation',y= 'Amount')
# count of orders by product category
sns.set(rc={'figure.figsize':(21,5)})
ax = sns.countplot(data = df, x = 'Product_Category')
for bars in ax.containers:
    ax.bar_label(bars)
# total amount from top 10 product categories
sales_state = df.groupby(['Product_Category'], as_index=False)['Amount'].sum().sort_values(by='Amount', ascending=False).head(10)
sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(data = sales_state, x = 'Product_Category',y= 'Amount')


# top 10 most sold products
sales_state = df.groupby(['Product_ID'], as_index=False)['Orders'].sum().sort_values(by='Orders', ascending=False).head(10)
sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(data = sales_state, x = 'Product_ID',y= 'Orders')
# top 10 most sold products (same thing as above)
fig1, ax1 = plt.subplots(figsize=(12,7))
df.groupby('Product_ID')['Orders'].sum().nlargest(10).sort_values(ascending=False).plot(kind='bar', ax=ax1)
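
The listing above repeats one pattern several times: group by a column, sum a measure, sort in descending order, and keep the top rows. A minimal sketch of that pattern as a reusable function follows. The function name `top_n_by_total` and the sample values are assumptions for illustration; they are not part of the project code.

```python
import pandas as pd


def top_n_by_total(df, group_col, value_col, n=10):
    """Sum value_col within each group_col value and return the n largest totals."""
    totals = df.groupby(group_col, as_index=False)[value_col].sum()
    return totals.sort_values(by=value_col, ascending=False).head(n)


# Small made-up frame standing in for the sales data (values are illustrative only).
sample = pd.DataFrame({
    "State": ["Uttar Pradesh", "Maharashtra", "Uttar Pradesh", "Karnataka"],
    "Orders": [3, 2, 4, 1],
})

top_states = top_n_by_total(sample, "State", "Orders", n=2)
print(top_states)
```

With such a helper, each chart would need only one call, for example `top_n_by_total(df, 'Product_ID', 'Orders')`, followed by the corresponding `sns.barplot` call.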

5.2 SNAPSHOTS:

Fig 5.2.1: Bar chart of Gender and its count.
From the above graphs we can see that most of the buyers are female, and that the
purchasing power of females is greater than that of males.


Fig 5.2.2: Total Amount vs Age Group


From the above graphs we can see that most of the buyers are females in the 26-35 years
age group.

Fig 5.2.3: Top 10 most sold products


Married women in the 26-35 years age group from UP, Maharashtra and Karnataka, working in
IT, Healthcare and Aviation, are more likely to buy products from the Food, Clothing and
Electronics categories.


CHAPTER 6
SOFTWARE TESTING
Testing is the process of executing a system with the aim of finding errors. It can also be
defined as the process of identifying, isolating, and correcting defects, so that the customer
can finally be assured that the system is free from defects. Software testing is a very important
element of quality assurance, and it serves as a review of the SRS, design, coding and
implementation of the proposed system.

6.1 LEVELS OF TESTING

Test Planning:

The test plan is the document that describes the procedure to be followed in performing the
various tests on the whole application. This document covers the scope and objectives of the
testing, the areas that are to be tested and those that should not be tested, the scheduling of
available resources, the areas that need to be automated, and the tools that are used for
testing.

Test Development:

Test development involves the development of test cases and their procedural preparation, i.e.
the description of the developed test cases.

6.2 TYPES OF TESTING

Various types of testing that are done on the system are as follows:
• Unit testing
• Integration testing
• System testing

• Unit Testing:

As the name itself says, this type of testing is done on small units of the system. A part
of the system is considered as a unit and is tested on its own. Take a login page as an
example: the user or the administrator can enter their respective home pages only after
giving a valid username and password. Validating this part of the system, with Login
considered as a unit, is an example of unit testing.

• Integration Testing:
This part of testing deals with combinations of units. It involves testing the various
integrations of several units, and checks whether the system functions correctly when
two or more units are integrated together. It gives information about the order of
arrangement of the various units, integrating modules, systems, sub-systems and
the entire system as a whole.

• System Testing:
This testing technique deals with the process of testing the system as a whole. At the
end of each project, the interface errors are uncovered and all defects are removed in
order to achieve good functioning of the whole system. This technique can be regarded
as the final part of the whole testing process.
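
The difference between unit and integration testing described above can be illustrated with Python's built-in `unittest` module. The sketch below is hypothetical: the credential store `USERS` and the functions `is_valid_login` and `home_page_for` are assumed names invented for the example and do not come from the project. The unit tests check the login validation on its own, while the integration tests check that validation and page routing work together.

```python
import unittest

# Hypothetical credential store and functions, assumed only for this illustration.
USERS = {"admin": "secret123"}


def is_valid_login(username, password):
    """Unit under test: accept only a known username with its matching password."""
    return username in USERS and USERS[username] == password


def home_page_for(username, password):
    """Combine login validation with page routing (used for the integration check)."""
    if not is_valid_login(username, password):
        return "login"
    return "admin_home" if username == "admin" else "user_home"


class LoginUnitTest(unittest.TestCase):
    def test_valid_credentials_accepted(self):
        self.assertTrue(is_valid_login("admin", "secret123"))

    def test_wrong_password_rejected(self):
        self.assertFalse(is_valid_login("admin", "wrong"))


class LoginIntegrationTest(unittest.TestCase):
    def test_valid_login_reaches_home_page(self):
        self.assertEqual(home_page_for("admin", "secret123"), "admin_home")

    def test_invalid_login_stays_on_login_page(self):
        self.assertEqual(home_page_for("guest", "nope"), "login")


# Collect both test classes into one suite and run it.
loader = unittest.defaultTestLoader
suite = unittest.TestSuite([
    loader.loadTestsFromTestCase(LoginUnitTest),
    loader.loadTestsFromTestCase(LoginIntegrationTest),
])
result = unittest.TextTestRunner().run(suite)
```

System testing would then exercise the complete application from end to end, which is outside the scope of this small sketch.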

6.3 TEST CASES


Data set collection Test case:

Test Case          1
Name of Test       Data set Collection
Input              Collection of fraud insurance dataset
Expected output    CSV data format file
Actual output      CSV data format file
Result             Successful

Table 6.3.1: Data set Collection
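
A test case of this kind can be automated. The following is a minimal sketch under stated assumptions: the required column list is inferred from the columns used in the Chapter 5 listing, the function name `check_csv_dataset` is hypothetical, and an in-memory sample replaces the real file so that the example runs without the dataset.

```python
import csv
import io

# Columns the analysis relies on; the list is an assumption based on the code in Chapter 5.
REQUIRED_COLUMNS = {"Gender", "Age Group", "State", "Orders", "Amount"}


def check_csv_dataset(handle, required=REQUIRED_COLUMNS):
    """Return (ok, message) after checking the header and that at least one row exists."""
    reader = csv.DictReader(handle)
    header = set(reader.fieldnames or [])
    missing = required - header
    if missing:
        return False, "missing columns: " + ", ".join(sorted(missing))
    if next(reader, None) is None:
        return False, "no data rows"
    return True, "CSV data format file"


# In-memory stand-in for the real file; the row values are illustrative only.
sample_csv = io.StringIO(
    "Gender,Age Group,State,Orders,Amount\n"
    "F,26-35,Maharashtra,2,23952\n"
)
outcome = check_csv_dataset(sample_csv)
print(outcome)
```

To check a real file, the same function can be called with an open file handle, for example `check_csv_dataset(open(path, newline=''))`, where `path` points to the collected dataset.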


Review and Analysis Test case:

Test Case          1
Name of Test       Data set Collection
Input              Collection of fraud insurance dataset
Expected output    CSV data format file
Actual output      CSV data format file
Result             Successful

Table 6.3.2: Review and Analysis Test
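
One possible automated check for the analysis stage is a consistency test: the totals per group must add up to the overall total, otherwise rows were lost or counted twice during grouping. The sketch below is an assumption about how such a check could look; the rows are made up and the helper name `total_by` is hypothetical.

```python
from collections import defaultdict

# Made-up rows standing in for the cleaned sales data.
rows = [
    {"Gender": "F", "Amount": 500},
    {"Gender": "M", "Amount": 300},
    {"Gender": "F", "Amount": 200},
]


def total_by(rows, key, value):
    """Sum `value` for each distinct `key`, like groupby(key)[value].sum() in pandas."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


by_gender = total_by(rows, "Gender", "Amount")
grand_total = sum(row["Amount"] for row in rows)

# The grouped totals must add up to the overall total.
assert sum(by_gender.values()) == grand_total
print(by_gender)
```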


CHAPTER 7
CONCLUSION

Integrating advanced sentiment analysis capabilities into the review system of the LinkedIn
app holds tremendous potential for enhancing user experience and driving continuous
improvement. Ultimately, leveraging sentiment analysis not only allows LinkedIn to better
understand and respond to user feedback but also empowers the platform to deliver more
personalized and satisfying experiences to its diverse user base, solidifying its position as a
leading professional networking platform in the digital landscape. Analyzing Swiggy sales data
reviews using Python presents a comprehensive approach to understanding user opinions and
experiences with the app. By deciphering the sentiment polarity of reviews, whether positive,
negative, or neutral, this analysis provides developers and marketers with actionable
information to enhance the app's user experience. On the other hand, there exists a cohort of
users who express dissatisfaction with certain aspects of the LinkedIn app. Common grievances
include occasional technical glitches, such as slow loading times or crashes, which can disrupt
the user experience. Some users also report limitations in messaging functionality, finding it
cumbersome or inadequate for effective communication with connections. Furthermore, there
are those who perceive a lack of significant innovation in the app's development, noting a
stagnation in features or improvements compared to other social networking platforms.


CHAPTER 8
FUTURE ENHANCEMENT

In the future, several enhancements can be integrated into the LinkedIn app review sentiment
analysis project to further refine its effectiveness and applicability.

• Instead of just categorizing reviews into positive, negative, or neutral, a more nuanced
approach could be adopted. This might involve identifying specific emotions like joy,
frustration, satisfaction, etc., allowing for a more detailed understanding of user
feedback.

• Analyzing different aspects of the LinkedIn app separately, such as user interface,
functionality, customer support, etc., can provide more targeted insights.

• LinkedIn has a global user base, so supporting sentiment analysis for reviews in
multiple languages would be beneficial. This would involve training models on data
in various languages and ensuring accurate sentiment analysis across different
linguistic contexts.

• Integrate sentiment analysis directly into LinkedIn's feedback loop, enabling faster
response to user concerns and facilitating continuous improvement of the app based
on real-time insights from user feedback.
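
As an illustration of the aspect-based idea in the list above, the sketch below tags a review with the aspects it mentions by simple keyword matching. This is only a rough starting point under stated assumptions: the keyword sets in `ASPECT_KEYWORDS` and the function name `aspects_in` are invented for the example, and a practical system would more likely rely on a trained model than on a fixed keyword list.

```python
import re

# Hypothetical aspect keywords; a real system would learn or curate these.
ASPECT_KEYWORDS = {
    "user interface": {"layout", "design", "interface", "ui"},
    "functionality": {"crash", "crashes", "slow", "messaging", "search"},
    "customer support": {"support", "helpdesk", "response"},
}


def aspects_in(review):
    """Return the sorted list of aspects whose keywords appear in the review text."""
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    return sorted(a for a, kws in ASPECT_KEYWORDS.items() if tokens & kws)


tagged = aspects_in("The new design is clean but messaging is slow")
print(tagged)
```

Once reviews are tagged this way, the sentiment of each review can be aggregated per aspect instead of per app.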


