
● Presentation by Arpan Nyati

● Department : Electrical Engineering


● Roll No : 18EJCEE008
Intro to Exploratory Data Analysis
Machine Learning and AI
In statistics, exploratory data analysis (EDA) is an approach to analyzing data
sets to summarize their main characteristics, often using statistical
graphics and other data visualization methods. A statistical model can be
used or not, but primarily EDA is for seeing what the data can tell us
beyond the formal modeling or hypothesis testing task.
How to perform EDA

EDA can be performed either with programming languages popular for
statistics, such as R or Python, or with Business Intelligence (BI) tools
such as Tableau, IBM Cognos, and Qlik Sense. BI tools provide interactive
dashboards for understanding data. They are easy to use, and some of them
also support building machine learning models without writing any code.
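As a minimal sketch of what a first EDA pass in Python might look like, the snippet below uses pandas on a small made-up table. The column names and values are hypothetical and only serve the illustration; a real analysis would load its own dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset: names and values are invented for illustration only.
df = pd.DataFrame({
    "height_cm": [170.0, 165.0, np.nan, 180.0, 175.0],
    "weight_kg": [65.0, 59.0, 72.0, 88.0, 77.0],
    "group": ["a", "b", "a", "b", "a"],
})

shape = df.shape              # (number of rows, number of columns)
missing = df.isna().sum()     # count of missing values per column
summary = df.describe()       # count, mean, std, quartiles for numeric columns
group_means = df.groupby("group")["weight_kg"].mean()  # mean weight per group

print(shape)
print(missing)
print(summary)
print(group_means)
```

Checking shape, missing values, and summary statistics first is a common starting point before any plotting or modeling.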
Data Analysis
Data analysis is a process of inspecting, cleansing, transforming and
modeling data with the goal of discovering useful information, informing
conclusions and supporting decision-making.
NumPy
NumPy is used for comprehensive mathematical functions, random number
generators, linear algebra routines, Fourier transforms, and more.
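As a rough illustration of the NumPy capabilities listed above (mathematical functions, random number generation, linear algebra, Fourier transforms), here is a short sketch. All data in it is synthetic and chosen only for the example.

```python
import numpy as np

# Random number generation: 1,000 synthetic draws, seeded for repeatability.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=5.0, scale=2.0, size=1000)

# Mathematical functions: basic summary statistics of the sample.
mean, std = samples.mean(), samples.std()

# Linear algebra: solve the 2x2 system A @ x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# Fourier transform: find the dominant frequency of a 5 Hz sine wave
# sampled at 100 Hz for one second.
t = np.arange(100) / 100.0
signal = np.sin(2 * np.pi * 5 * t)
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / 100.0)
dominant = freqs[np.argmax(spectrum)]

print(mean, std)
print(x)
print(dominant)
```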
Pandas
Pandas provides a fast and efficient DataFrame object for data
manipulation with integrated indexing.
Pandas is in use in a wide variety of academic and commercial domains,
including Finance, Neuroscience, Economics, Statistics, Advertising,
Web Analytics, and more.
Time series functionality: date range generation and frequency
conversion, moving window statistics, date shifting and lagging. It can
even create domain-specific time offsets and join time series without
losing data.
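The time series features mentioned above can be sketched as follows. The dates and values are hypothetical, and the outer join via `pd.concat` is just one possible way to combine series without dropping rows.

```python
import numpy as np
import pandas as pd

# Date range generation: ten consecutive days (hypothetical dates and values).
idx = pd.date_range("2024-01-01", periods=10, freq="D")
daily = pd.Series(np.arange(10, dtype=float), index=idx)

# Frequency conversion: downsample daily data into 5-day bins, taking the mean.
binned = daily.resample("5D").mean()

# Moving window statistics: 3-day rolling mean (first two entries are NaN).
rolling = daily.rolling(window=3).mean()

# Date shifting and lagging: move every value one day later.
lagged = daily.shift(1)

# Joining time series without losing data: an outer join keeps every
# timestamp from both series and fills the gaps with NaN.
other = pd.Series([100.0, 200.0],
                  index=pd.to_datetime(["2024-01-09", "2024-01-12"]))
joined = pd.concat([daily.rename("daily"), other.rename("other")],
                   axis=1, join="outer")

print(binned)
print(rolling.head())
print(lagged.head())
print(joined.tail())
```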
Data Visualization
Data visualisation libraries in Python, such as seaborn and matplotlib,
aim to make visualization a central part of exploring and understanding
data. Seaborn's dataset-oriented plotting functions operate on dataframes
and arrays containing whole datasets and internally perform the necessary
semantic mapping and statistical aggregation to produce informative plots.
Some examples are shown.
Common Plots used for Visualization
● Histograms
● Scatter plots
● Pair plots
● Box plots
● Violin plots
● Distribution plots
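Four of the plot types listed above can be sketched with matplotlib alone. The data is synthetic, the output filename is hypothetical, and pair plots (typically drawn with seaborn) are left out to keep the example dependency-light.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: y depends linearly on x plus noise (illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].hist(x, bins=20)        # distribution of a single variable
axes[0, 0].set_title("Histogram")

axes[0, 1].scatter(x, y, s=10)     # relationship between two variables
axes[0, 1].set_title("Scatter plot")

axes[1, 0].boxplot([x, y])         # median, quartiles and outliers
axes[1, 0].set_title("Box plot")

axes[1, 1].violinplot([x, y])      # density shape of each variable
axes[1, 1].set_title("Violin plot")

fig.tight_layout()
fig.savefig("eda_plots.png")       # hypothetical output filename
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` would display the figure instead of saving it.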
Thank You
