
McDonald's Analysis
SUDHANVA SARALAYA
Introduction and Purpose
This presentation analyses the given data and answers the questions below.

Plot graphically which food categories have the highest and lowest varieties.

Which variables have outliers?

Which variables have the highest correlation? Plot them and find the value.

Which category contributes to the maximum % of Cholesterol in a diet (% daily value)?

Which item contributes maximum to the Sodium intake?

Which 4 food items contain the most Saturated Fat?


Application and packages used

Application

▪ For analysis: Jupyter Notebook

▪ For presentation: PowerPoint

Packages

▪ import numpy as np, for NumPy-related calculations

▪ import pandas as pd, for DataFrame-related operations

▪ import matplotlib.pyplot as plt, for plotting the graphs

▪ %matplotlib inline, for showing the graphs directly in the notebook

▪ import seaborn as sns, for plotting the graphs

▪ from warnings import filterwarnings, to ignore unnecessary warnings

▪ filterwarnings("ignore")
Plot graphically which food
categories have the highest
and lowest varieties.
Answering this question is straightforward. First, we count the items in each category, and then we plot the counts to make sense of them.

As both the graph and the counts next to it show, Coffee and Tea has the highest variety and Salad has the lowest.
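The counting step described above can be sketched as follows. This is a minimal illustration, not the notebook behind the slides: the column names "Category" and "Item" and the six sample rows are assumptions, since the deck does not show the data file.

```python
import pandas as pd

# Hypothetical miniature menu; names and rows are illustrative only.
menu = pd.DataFrame({
    "Category": ["Coffee & Tea", "Coffee & Tea", "Coffee & Tea",
                 "Breakfast", "Breakfast", "Salads"],
    "Item": ["Latte", "Mocha", "Iced Tea", "Muffin", "Hotcakes", "Side Salad"],
})

# Number of items (the "variety") in each category, largest first.
category_counts = menu["Category"].value_counts()
most_varied = category_counts.idxmax()
least_varied = category_counts.idxmin()
print(category_counts)
print("Highest variety:", most_varied, "| Lowest variety:", least_varied)


def plot_category_counts(counts):
    """Draw a horizontal bar chart of items per category (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so the counting runs without it

    ax = counts.sort_values().plot(kind="barh")
    ax.set_xlabel("Number of items")
    ax.set_ylabel("Category")
    plt.tight_layout()
    return ax
```

In a notebook, calling plot_category_counts(category_counts) would draw the chart; on the real data the same two steps would be applied to the full menu table.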
Which variables have outliers?

Before we look at which variables have outliers, we need to understand what an outlier is.

▪ An outlier is an extreme value in a data set. It can be extremely high or extremely low relative to the rest of the data. Outliers can only be checked for numerical values. For example, in 0, 100, 200, 300, 10000, the value 10000 is an outlier.

▪ The following slides show all the graphs, each with a short inference on outliers.

Columns with numerical values are as below:
Calories
Calories from Fat
Total Fat
Total Fat (% Daily Value)
Saturated Fat
Saturated Fat (% Daily Value)
Trans Fat
Cholesterol
Cholesterol (% Daily Value)
Sodium
Sodium (% Daily Value)
Carbohydrates
Carbohydrates (% Daily Value)
Dietary Fiber
Dietary Fiber (% Daily Value)
Sugars
Protein
Vitamin A (% Daily Value)
Vitamin C (% Daily Value)
Calcium (% Daily Value)
Iron (% Daily Value)
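The slides judge outliers visually from boxplots. A numerical counterpart is the usual boxplot rule, which flags points lying more than 1.5 times the interquartile range beyond the quartiles. The sketch below assumes that rule; the helper names and the small demo table are hypothetical. Under this rule, only 10000 is flagged in the five-number example.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying more than k * IQR outside the quartiles."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]


def columns_with_outliers(df: pd.DataFrame) -> list:
    """List the numerical columns that contain at least one flagged value."""
    numeric = df.select_dtypes(include="number")
    return [col for col in numeric.columns if not iqr_outliers(numeric[col]).empty]


# The example from the slide.
example = pd.Series([0, 100, 200, 300, 10000])
flagged = iqr_outliers(example)
print(flagged.tolist())

# Hypothetical demo table; the values are illustrative, not taken from the dataset.
demo = pd.DataFrame({
    "Category": ["A", "A", "B", "B", "C"],
    "Calories": [250, 300, 350, 400, 1880],
    "Trans Fat": [0.0, 0.0, 0.0, 0.0, 0.0],
})
outlier_columns = columns_with_outliers(demo)
print(outlier_columns)
```

Text columns such as the category are skipped automatically, which matches the point that outliers can only be checked for numerical values.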
Here we can see outliers in Calories, so we can assume that outliers are also present in Calories from Fat, Total Fat and Total Fat (% Daily Value), as these variables are all closely related to Calories. The boxplots confirm that this assumption is correct.
An interesting inference is that even though Total Fat and Total Fat (% Daily Value) have outliers, Saturated Fat and Saturated Fat (% Daily Value) do not have any.
When we come to Trans Fat, we see some outliers. As most of the values are 0, there is no proper box in the plot, only many outliers.
As we see in the 2 plots, Carbohydrates and Carbohydrates (% Daily Value) both have outliers.
Outliers are present in Cholesterol and Cholesterol (% Daily Value).
Outliers are present in Sodium and Sodium (% Daily Value).
Dietary Fiber and Dietary Fiber (% Daily Value) are related to each other, yet outliers are present only in Dietary Fiber (% Daily Value); Dietary Fiber itself does not have any.
Sugars also has outliers.
We see outliers in Protein as well.
Here we see outliers in the vitamins, A and C, as well as in Iron and Calcium.
Which variables have the highest correlation?
Plot them and find the value.

What is correlation? It is the level of dependence between variables.

To find the correlation between different variables, we use the function .corr().

In the upcoming slides we will see the correlation between each pair of variables in a table format, as well as a heat map showing the level of correlation.
Item 1               Item 2                         Corr Value
Sodium               Sodium (% Daily Value)         0.999929
Cholesterol          Cholesterol (% Daily Value)    0.999855
Total Fat            Total Fat (% Daily Value)      0.999765
Calories from Fat    Total Fat (% Daily Value)      0.999725
Calories from Fat    Total Fat                      0.999663

How to read a heat map: the lighter the box, the higher the correlation. The darker the box, the lower the correlation.

In the above table we see the pairs of variables with the highest level of correlation.
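One way to get from .corr() to a ranked table of pairs is sketched below. It is an illustration under assumptions: the helper name top_correlations is hypothetical, and the four-row table is invented so that the % Daily Value column is an exact multiple of Sodium, which is why such pairs sit so close to 1.

```python
import numpy as np
import pandas as pd


def top_correlations(df: pd.DataFrame, n: int = 5) -> pd.Series:
    """Return the n most correlated pairs of numerical columns."""
    corr = df.select_dtypes(include="number").corr()  # Pearson by default
    # Keep only the upper triangle so each pair appears once and the
    # diagonal (every column against itself) is dropped.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack().dropna()
    return pairs.sort_values(ascending=False).head(n)


# Hypothetical values, not taken from the dataset.
demo = pd.DataFrame({
    "Sodium": [240, 480, 960, 1200],
    "Sodium (% Daily Value)": [10, 20, 40, 50],
    "Protein": [12, 3, 9, 4],
})
top = top_correlations(demo, n=3)
print(top)
```

For the heat map itself, seaborn's sns.heatmap(demo.corr(), annot=True) draws the full matrix; how light or dark a given value appears depends on the colour map in use.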
Which category
contributes to the
maximum % of
Cholesterol in a diet
(% daily value)?

As we see in the table, the category that contributes the most % cholesterol in a diet is Breakfast with 50.95%, followed by Beef and Pork with 28.93%.
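The slides do not show how the percentages were computed. A plausible reading, assumed here, is each category's share of the summed Cholesterol (% Daily Value) across the whole menu. The rows below are hypothetical and chosen to give round numbers.

```python
import pandas as pd

# Hypothetical rows; only the column name follows the slides.
menu = pd.DataFrame({
    "Category": ["Breakfast", "Breakfast", "Beef & Pork", "Salads"],
    "Cholesterol (% Daily Value)": [87, 13, 60, 40],
})

# Sum the column per category, then express each sum as a share of the total.
totals = menu.groupby("Category")["Cholesterol (% Daily Value)"].sum()
share = (totals / totals.sum() * 100).sort_values(ascending=False)
print(share.round(2))
```

Under this reading the shares add up to 100%, so the leading category can be read directly off the first row.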
Which item
contributes
maximum to the
Sodium intake?

From the table we can infer that Chicken McNuggets contributes the most to sodium intake.

Further, we can see that each piece of Chicken McNuggets contributes as much as 90 mg of sodium.
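Picking the top item and the per-piece figure can be sketched as below. The sodium values are assumptions: 3600 is chosen only so that a 40-piece serving works out to the 90 per piece quoted above, and the other two rows are invented.

```python
import pandas as pd

# Hypothetical rows; values are illustrative, not quoted from the dataset.
menu = pd.DataFrame({
    "Item": ["Chicken McNuggets (40 piece)", "Hypothetical Burger", "Hypothetical Salad"],
    "Sodium": [3600, 1100, 300],
})

# Row with the largest sodium value.
top_item = menu.loc[menu["Sodium"].idxmax()]

# Per-piece contribution, assuming the serving holds 40 pieces.
pieces = 40
sodium_per_piece = top_item["Sodium"] / pieces
print(top_item["Item"], sodium_per_piece)
```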
Which 4 food items
contain the most
Saturated Fat?
The top 4 food items by saturated fat are McFlurry with M&M’s Candies (Medium), Big Breakfast with Hotcakes (Large Biscuit), Chicken McNuggets (40 piece) and Frappé Chocolate Chip (Large), with exactly 20 g each.
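A minimal sketch of the top-4 selection, assuming the data sits in a pandas DataFrame with "Item" and "Saturated Fat" columns. The four named items carry the value 20 stated above; the two remaining rows are hypothetical filler.

```python
import pandas as pd

menu = pd.DataFrame({
    "Item": [
        "McFlurry with M&M's Candies (Medium)",
        "Big Breakfast with Hotcakes (Large Biscuit)",
        "Chicken McNuggets (40 piece)",
        "Frappé Chocolate Chip (Large)",
        "Hypothetical Item A",
        "Hypothetical Item B",
    ],
    "Saturated Fat": [20, 20, 20, 20, 12, 5],
})

# The four rows with the largest saturated fat values.
top4 = menu.nlargest(4, "Saturated Fat")
print(top4)
```

When several items tie at the cut-off, nlargest keeps the first ones it meets; passing keep="all" returns every tied item instead.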
Thank You
Feedback will be much appreciated.
