
Starbucks Dataset Analysis

Submitted to: Prateek Gupta

Submitted by: Manan Verma, Shubham Dhasmana
Introduction
• Welcome! Today, we will clean the dataset using Jupyter Notebook. We
will then transform the messy data into actionable insights and craft a clear,
concise PowerPoint presentation that communicates our findings effectively.
Let's dive in!
Importing essential libraries: Pandas,
NumPy, and Matplotlib
Reading a CSV file named “drinks” and
printing it
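The import and read steps can be sketched as below. The slides do not show the file's contents, so the three columns and two rows here are hypothetical stand-ins, and the file name "drinks.csv" is an assumption based on the name “drinks”.

```python
import io

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the contents of the "drinks" file; the real
# columns and values are not shown in the slides.
csv_text = """Beverage_category,Beverage,Calories
Coffee,Brewed Coffee,3
Classic Espresso Drinks,Caffe Latte,70
"""

# With the real file this would likely be pd.read_csv("drinks.csv")
# (assumed file name).
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```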
Renaming selected columns in the
dataset
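A minimal sketch of the renaming step, assuming `DataFrame.rename` was used. The old and new column names below are hypothetical, since the slides do not list which columns were renamed.

```python
import pandas as pd

# Hypothetical original column names.
df = pd.DataFrame(
    {"Beverage_category": ["Coffee"], "Beverage_prep": ["Short"], "Calories": [3]}
)

# Only the columns listed in the mapping are renamed; others are untouched.
df = df.rename(
    columns={
        "Beverage_category": "Beverage_categories",
        "Beverage_prep": "Beverage_preparation",
    }
)
print(df.columns.tolist())
```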
Providing information about the dataset's
structure
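The structure summary is most likely produced with `DataFrame.info()`. The sketch below uses a small hypothetical DataFrame and also captures the report as text, which is optional.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the drinks data.
df = pd.DataFrame(
    {"Beverage": ["Latte", "Mocha", "Tea"], "Calories": [70, 100, 0]}
)

# Prints the index range, column names, non-null counts and dtypes.
df.info()

# Optional: capture the same report as a string.
buffer = io.StringIO()
df.info(buf=buffer)
report = buffer.getvalue()
```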
Generating descriptive statistics for the
dataset
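Descriptive statistics are typically obtained with `DataFrame.describe()`. The column names and values below are hypothetical.

```python
import pandas as pd

# Hypothetical numeric columns.
df = pd.DataFrame({"Calories": [100, 200, 300], "Sugars_g": [10.0, 20.0, 30.0]})

# count, mean, std, min, quartiles and max for each numeric column.
stats = df.describe()
print(stats)
```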
Grouping and counting the occurrences
in the 'Beverage_categories' column
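One way to group and count, assuming `groupby` followed by `size()`; `value_counts()` on the column would give the same counts. The category values are hypothetical.

```python
import pandas as pd

# Hypothetical category values.
df = pd.DataFrame(
    {
        "Beverage_categories": ["Coffee", "Smoothies", "Coffee"],
        "Calories": [3, 260, 5],
    }
)

# Number of rows in each beverage category.
counts = df.groupby("Beverage_categories").size()
print(counts)
```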

Getting the dimensions of the
dataset
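The dimensions come from the `shape` attribute, shown here on a hypothetical three-row DataFrame.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({"Beverage": ["Latte", "Mocha", "Tea"], "Calories": [70, 100, 0]})

# shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape
print(n_rows, n_cols)
```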
Counting null values in each column of the dataset
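A sketch of the null count, assuming the usual `isnull().sum()` idiom. The columns and the positions of the missing values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical columns with some missing entries.
df = pd.DataFrame(
    {"Calories": [100, np.nan, 300], "Caffeine_mg": [75, np.nan, np.nan]}
)

# isnull() marks missing cells; sum() counts them per column.
null_counts = df.isnull().sum()
print(null_counts)
```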
Filling missing values in the columns
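The slides do not say which fill value was used. The sketch below assumes the column mean as the replacement, which is only one common choice, and uses a hypothetical 'Caffeine_mg' column.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({"Caffeine_mg": [75.0, np.nan, 125.0]})

# Assumption: fill missing values with the column mean (100.0 here).
df["Caffeine_mg"] = df["Caffeine_mg"].fillna(df["Caffeine_mg"].mean())

# Re-check that no nulls remain.
remaining_nulls = int(df.isnull().sum().sum())
print(remaining_nulls)
```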
Checking if there are still null values in the
dataset

There are no null values
Checking for duplicate rows in the DataFrame and
dropping duplicate rows

The rows at index 154 and 210 have duplicate values
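Duplicate detection and removal can be sketched with `duplicated()` and `drop_duplicates()`. The rows below are hypothetical and unrelated to the index positions reported in the slides.

```python
import pandas as pd

# Hypothetical rows; the last row repeats the first.
df = pd.DataFrame(
    {"Beverage": ["Latte", "Mocha", "Latte"], "Calories": [70, 100, 70]}
)

# True for every row that repeats an earlier row in all columns.
dup_mask = df.duplicated()
print(df[dup_mask])

# Keep the first occurrence of each row and drop the repeats.
df_clean = df.drop_duplicates()
```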
Reindexing the dataset
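Dropping rows leaves gaps in the index. A minimal sketch of renumbering it, assuming `reset_index(drop=True)` was the method used:

```python
import pandas as pd

# Hypothetical frame whose index has a gap after a row was dropped.
df = pd.DataFrame({"Calories": [70, 100, 150]}, index=[0, 1, 3])

# drop=True discards the old index instead of keeping it as a column.
df = df.reset_index(drop=True)
print(df.index.tolist())
```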
Filtering rows where the 'Beverage_categories' column matches 'Frappuccino® Blended
Crème' and the 'Calories' value is greater than or equal to 300
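The two conditions can be combined with a boolean mask, as in this sketch. The sample rows are hypothetical; the column names and the category label follow the slide.

```python
import pandas as pd

# Hypothetical rows.
df = pd.DataFrame(
    {
        "Beverage_categories": [
            "Frappuccino® Blended Crème",
            "Frappuccino® Blended Crème",
            "Coffee",
        ],
        "Calories": [350, 200, 400],
    }
)

# Each condition needs its own parentheses when combined with &.
mask = (df["Beverage_categories"] == "Frappuccino® Blended Crème") & (
    df["Calories"] >= 300
)
filtered = df[mask]
print(filtered)
```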
Retrieving an array of unique values present in the 'beverages' column.
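`Series.unique()` returns a NumPy array of distinct values in order of first appearance. The values below are hypothetical.

```python
import pandas as pd

# Hypothetical values for the 'beverages' column.
df = pd.DataFrame({"beverages": ["Latte", "Mocha", "Latte", "Tea"]})

# Distinct values, in the order they first appear.
unique_beverages = df["beverages"].unique()
print(unique_beverages)
```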
Arranging the rows in the DataFrame from the lowest to the highest 'Calories' value.
Sorting the DataFrame in descending order
based on the 'Calories' column
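Both sort orders can be sketched with `sort_values`, using hypothetical calorie values.

```python
import pandas as pd

# Hypothetical rows.
df = pd.DataFrame(
    {"Beverage": ["Mocha", "Tea", "Latte"], "Calories": [200, 50, 120]}
)

# Ascending is the default: lowest to highest 'Calories'.
ascending = df.sort_values(by="Calories")

# ascending=False reverses the order: highest to lowest.
descending = df.sort_values(by="Calories", ascending=False)
print(descending)
```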
Extracting all rows and columns from index 0 up to (but not including) index 4.
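The slicing described above matches position-based indexing with `iloc`, which is assumed here. The stop position is excluded, so `0:4` returns positions 0 through 3.

```python
import pandas as pd

# Hypothetical frame with six rows.
df = pd.DataFrame({"Calories": [10, 20, 30, 40, 50, 60]})

# Rows at positions 0, 1, 2 and 3, with all columns.
first_four = df.iloc[0:4]
print(first_four)
```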
Visualizing 'Calories' column using a line plot
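A minimal Matplotlib sketch of the line plot with hypothetical values. The `Agg` backend line is only there so the code runs without a display; in a Jupyter notebook it can be omitted and the figure renders inline.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical calorie values.
df = pd.DataFrame({"Calories": [3, 70, 180, 110]})

fig, ax = plt.subplots()
# The row index goes on the x-axis and 'Calories' on the y-axis.
ax.plot(df.index, df["Calories"])
ax.set_xlabel("Row index")
ax.set_ylabel("Calories")
```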
Creating a bar plot to visualize the relationship
between 'Beverage_categories' and 'Calories'
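The slides do not show how calories were aggregated per category. This sketch assumes the mean per category and plots it with Matplotlib on hypothetical data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows.
df = pd.DataFrame(
    {
        "Beverage_categories": ["Coffee", "Coffee", "Smoothies", "Smoothies"],
        "Calories": [4, 6, 260, 280],
    }
)

# Assumption: one bar per category showing its mean calories.
mean_calories = df.groupby("Beverage_categories")["Calories"].mean()

fig, ax = plt.subplots()
ax.bar(list(mean_calories.index), list(mean_calories.values))
ax.set_xlabel("Beverage_categories")
ax.set_ylabel("Mean calories")
```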
Creating a categorical plot using Seaborn's catplot function. This plot displays the
count of different beverage preparations
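A sketch of a count-style `catplot`. The column name 'Beverage_prep' and its values are hypothetical, since the slide only mentions "beverage preparations".

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical preparation values.
df = pd.DataFrame({"Beverage_prep": ["Short", "Tall", "Tall", "Grande", "Tall"]})

# kind="count" draws one bar per category with its number of rows.
g = sns.catplot(data=df, x="Beverage_prep", kind="count")
```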
Creating a heatmap using the Seaborn library
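The slide does not state what the heatmap displays. A common choice, assumed here, is the correlation matrix of the numeric columns; the column names and values are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical columns.
df = pd.DataFrame(
    {
        "Beverage": ["Latte", "Mocha", "Frappuccino", "Smoothie"],
        "Calories": [100, 200, 300, 250],
        "Sugars_g": [10, 25, 40, 30],
        "Protein_g": [5, 3, 8, 6],
    }
)

# Assumption: the heatmap shows pairwise correlations of numeric columns.
corr = df.select_dtypes("number").corr()

# annot=True writes each correlation value inside its cell.
ax = sns.heatmap(corr, annot=True)
```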
TABLEAU DASHBOARDS
