Why Python?
Good to know
DATA
Based on the imported FMCG goods data, we examined the dimensions
and datatypes of the data: the number of rows and columns, the
header names and the datatype of each column. For the sample data
we displayed the top 10 rows and the bottom 10 rows.
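The checks described above can be sketched with pandas as follows. This is a minimal illustration only: the slides do not give the file name or the exact column names, so the small DataFrame below is a hypothetical stand-in for the imported FMCG data.

```python
import pandas as pd

# Hypothetical stand-in for the imported FMCG data; the real file and
# column names are not given in the slides.
df = pd.DataFrame({
    "Order ID": range(1, 26),
    "Order Quantity": [5, 12, 7, 3, 9] * 5,
    "Price": [275.0, 310.5, 120.0, 980.25, 45.0] * 5,
})

n_rows, n_cols = df.shape          # dimensions: number of rows and columns
header_names = list(df.columns)    # header names
column_types = df.dtypes           # datatype of each column

top_10 = df.head(10)               # top 10 rows
bottom_10 = df.tail(10)            # bottom 10 rows

print(n_rows, n_cols)
print(header_names)
print(column_types)
```

With a real file, the DataFrame would typically come from a reader such as `pd.read_csv`, and the remaining calls would stay the same.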
CORRELATION, SKEWNESS AND KURTOSIS OF DATA
On the above data we calculated the correlation between discount,
order ID, order quantity, profit, sales and price, and obtained
correlations above 0.6 and below 1.
Using the same data we also calculated skewness and kurtosis.
Skewness was positive for discount, order ID, profit, sales and
price, and negative for order quantity. Kurtosis was positive for
profit, sales and price, and negative for discount, order ID and
order quantity.
Kurtosis = 17.8386060
Skewness = 3.463269
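A rough sketch of how these statistics can be computed with pandas is shown below. The data is synthetic and the column names are assumptions, so the printed values will not match the figures reported on this slide.

```python
import numpy as np
import pandas as pd

# Synthetic, hypothetical data standing in for the FMCG columns.
rng = np.random.default_rng(0)
price = rng.lognormal(mean=5.0, sigma=1.0, size=200)   # right-skewed values
qty = rng.integers(1, 20, size=200)
df = pd.DataFrame({
    "Price": price,
    "Order Quantity": qty,
    "Sales": price * qty,
})

corr = df.corr()        # pairwise Pearson correlation matrix
skewness = df.skew()    # skewness of each column
kurt = df.kurt()        # excess kurtosis of each column (normal data is near 0)

print(corr)
print(skewness)
print(kurt)
```

Note that pandas reports excess kurtosis, so a "negative kurtosis" here means flatter tails than a normal distribution.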
VARIABLE PRICE
On the imported data we performed several operations on the PRICE
variable: we calculated its minimum value, sum, mean, mode and
standard deviation.
Mean = 3060.916
Standard deviation = 5167.5817
Mode = 275.0363
Shape of Boston data set = (7811, 9)
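The same operations can be sketched on a small, made-up price series. The values below are illustrative only and are not taken from the data set described on this slide.

```python
import pandas as pd

# Hypothetical PRICE values for illustration.
price = pd.Series([275.0, 275.0, 120.0, 980.0, 450.0], name="PRICE")

price_min = price.min()            # minimum value
price_sum = price.sum()            # sum
price_mean = price.mean()          # mean
price_mode = price.mode().iloc[0]  # mode (first one if there are ties)
price_std = price.std()            # sample standard deviation (ddof=1)

print(price_min, price_sum, price_mean, price_mode, price_std)
```

`Series.mode()` returns a Series because several values can tie for most frequent, which is why the first element is taken here.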
SCATTER PLOT
This graph represents the relationship between
QTY and price in the data.
A scatter plot is a diagram where each value in the data set is
represented by a dot.
The Matplotlib module has a method for drawing scatter plots.
It needs two arrays of the same length, one for the values of
the x-axis and one for the values of the y-axis.
import matplotlib.pyplot as plt
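A minimal sketch of such a scatter plot is given below. The QTY and price values are invented for illustration, and the output file name is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical values; both arrays must have the same length.
qty = [1, 3, 5, 7, 9, 11, 13]
price = [120.0, 275.0, 310.0, 450.0, 610.0, 700.0, 980.0]

fig, ax = plt.subplots()
points = ax.scatter(qty, price)   # one dot per (QTY, price) pair
ax.set_xlabel("QTY")
ax.set_ylabel("Price")
ax.set_title("QTY vs. price")
fig.savefig("scatter_qty_price.png")
```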
HISTOGRAM PLOT
• It shows the frequency distribution of every column of the data.
• It is a graph showing the number of observations within each
given interval.
• Using the data, we plotted the graph for discount, order ID,
order quantity, profit, sales and price.
• import seaborn as sns
DISTRIBUTION PLOT
Seaborn is a Python data visualization library based on
Matplotlib. It provides a high-level interface for drawing
attractive and informative statistical graphics. Its distribution
plots are used for examining univariate and bivariate
distributions.
HEATMAP
A heatmap is a two-dimensional graphical representation of
data where the individual values that are contained in a matrix
are represented as colors. The Seaborn package allows the
creation of annotated heatmaps which can be tweaked using
Matplotlib tools as per the creator's requirements.
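A minimal sketch of an annotated heatmap follows, here applied to a correlation matrix. The data is synthetic, and the column names and color map are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, hypothetical data.
rng = np.random.default_rng(3)
price = rng.normal(300.0, 50.0, 200)
qty = rng.integers(1, 20, 200)
df = pd.DataFrame({
    "Price": price,
    "Order Quantity": qty,
    "Sales": price * qty,
})

corr = df.corr()  # matrix whose values are shown as colors

fig, ax = plt.subplots(figsize=(5, 4))
# annot=True writes each value into its cell; vmin/vmax fix the color scale.
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation heatmap")  # tweaked with a regular Matplotlib call
fig.tight_layout()
fig.savefig("heatmap.png")
```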
THANK YOU