Shubham Dhasmana

Introduction
• Welcome! Today, we will clean the dataset using Jupyter Notebook. Here we will transform messy data into actionable insights and craft a clear, concise PowerPoint presentation that communicates the findings effectively. Let's dive in!

Importing essential libraries: Pandas, NumPy, and Matplotlib
Reading a CSV file named "drinks" and printing it
Renaming selected columns in the dataset
Providing information about the dataset's structure
Generating descriptive statistics for the dataset
Grouping and counting the occurrences in the 'Beverage_categories' column
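The loading and inspection steps listed above can be sketched roughly as follows. The slides do not show the actual file contents, so the inline CSV text, the original column names (`Beverage_category`, `Beverage`), and all values are hypothetical stand-ins; only the renamed `Beverage_categories` and `beverages` columns come from the slides.

```python
import io

import pandas as pd

# Hypothetical stand-in for the "drinks" CSV; the real columns and values
# are not shown in the slides.
csv_text = """Beverage_category,Beverage,Beverage_prep,Calories
Coffee,Brewed Coffee,Short,3
Coffee,Brewed Coffee,Tall,4
Classic Espresso Drinks,Caffe Latte,Short Nonfat Milk,70
Smoothies,Banana Chocolate Smoothie,Whole Milk,300
"""

# With a real file this would be something like pd.read_csv("drinks.csv").
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Rename only the columns we want to change (assumed original names).
df = df.rename(
    columns={"Beverage_category": "Beverage_categories", "Beverage": "beverages"}
)

# Structure of the dataset: column names, dtypes, non-null counts.
df.info()

# Descriptive statistics for the numeric columns.
print(df.describe())

# Group by category and count the rows in each group.
category_counts = df.groupby("Beverage_categories").size()
print(category_counts)
```

`df["Beverage_categories"].value_counts()` would give the same counts, sorted by frequency rather than by category name.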
Getting the dimensions of the dataset
Counting null values in each column of the dataset
Filling missing values in the columns
Checking if there are still null values in the dataset
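A minimal sketch of the null-handling steps above, on a small made-up DataFrame. The slides do not say which fill value was used, so filling numeric columns with their mean is an assumption, as is the `Caffeine_mg` column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with a few missing values.
df = pd.DataFrame(
    {
        "beverages": ["Brewed Coffee", "Caffe Latte", "Vanilla Bean"],
        "Calories": [3.0, np.nan, 310.0],
        "Caffeine_mg": [175.0, 75.0, np.nan],
    }
)

# Dimensions of the dataset as (rows, columns).
print(df.shape)

# Number of null values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# Fill missing numeric values with the column mean (an assumed strategy).
df = df.fillna(df.mean(numeric_only=True))

# Check again: every count should now be zero.
remaining_nulls = df.isnull().sum()
print(remaining_nulls)
```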
There are no null values
Checking for duplicate rows in the DataFrame and dropping them
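The duplicate check and removal might look like this; the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample in which the third row repeats the first.
df = pd.DataFrame(
    {
        "beverages": ["Brewed Coffee", "Caffe Latte", "Brewed Coffee", "Vanilla Bean"],
        "Calories": [3, 70, 3, 310],
    }
)

# duplicated() marks each row that repeats an earlier row.
dup_mask = df.duplicated()
print(df[dup_mask])

# drop_duplicates() keeps the first occurrence and removes the repeats.
deduped = df.drop_duplicates()
print(deduped)
```

Note that the surviving rows keep their original index labels, so the index has gaps after the drop.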
Rows at index 154 and 210 have duplicate values
Reindexing the dataset
Filtering rows where the 'Beverage_categories' column matches 'Frappuccino® Blended Crème' and the 'Calories' value is greater than or equal to 300
Retrieving an array of unique values present in the 'beverages' column
Arranging the rows in the DataFrame from the lowest to the highest 'Calories' value
Sorting the DataFrame in descending order based on the 'Calories' column
Extracting all rows and columns from index 0 up to (but not including) index 4
Visualizing the 'Calories' column using a line plot
Creating a bar plot to visualize the relationship between 'Beverage_categories' and 'Calories'
Creating a categorical plot using Seaborn's catplot function. This plot displays the count of different beverage preparations
Heatmap using the Seaborn library

TABLEAU DASHBOARDS
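For reference, the notebook steps listed above (reindexing, filtering, unique values, sorting, and slicing) can be sketched in pandas as follows. The sample data is hypothetical, and reading the slicing step as `df.iloc[0:4]` (first four rows, all columns) is an assumption about what the notebook did.

```python
import pandas as pd

# Hypothetical sample; the index has gaps, as it would after dropping duplicates.
df = pd.DataFrame(
    {
        "Beverage_categories": [
            "Coffee",
            "Frappuccino® Blended Crème",
            "Frappuccino® Blended Crème",
            "Classic Espresso Drinks",
            "Frappuccino® Blended Crème",
        ],
        "beverages": [
            "Brewed Coffee",
            "Vanilla Bean",
            "Strawberries & Crème",
            "Caffè Latte",
            "Vanilla Bean",
        ],
        "Calories": [3, 310, 170, 70, 300],
    },
    index=[0, 1, 3, 4, 6],
)

# Reindex so the labels run 0..n-1 again.
df = df.reset_index(drop=True)

# Rows in the given category with at least 300 calories.
high_cal = df[
    (df["Beverage_categories"] == "Frappuccino® Blended Crème")
    & (df["Calories"] >= 300)
]

# Array of the distinct values in the 'beverages' column.
unique_beverages = df["beverages"].unique()

# Sort by 'Calories', lowest to highest, then highest to lowest.
ascending = df.sort_values("Calories")
descending = df.sort_values("Calories", ascending=False)

# Rows at positions 0 up to (but not including) 4, with all columns.
first_four = df.iloc[0:4]

print(high_cal)
print(unique_beverages)
print(first_four)
```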