
Data Analysis

Data analysis is a method that uses statistics and probability to identify trends in a data
set. It helps us separate the “real” trends from statistical noise.

Techniques
Several methods are available for analyzing data. The choice of method depends on the kind
of data to be analyzed. Some techniques are:

• General linear model: Useful for assessing how several variables affect a continuous
outcome variable. Examples: ANOVA tests, linear regression.
• Generalized linear model: Used when the outcome variable is discrete, such as counts or
yes/no responses. Example: logistic regression.
• Structural equation modelling: Used for abstract variables that cannot be measured
directly, like “Soap preference,” “Intelligence,” or “Future goals.”
• Item response theory: A way to analyze results from tests, exams, and
questionnaires.

It’s vital to use the right technique; using the wrong one can lead to faulty claims
about our data.
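
To make the ANOVA example from the list of techniques more concrete, here is a minimal sketch in
Python that computes the one-way ANOVA F statistic by hand, using only the standard library. The
function name one_way_anova_f and the sample numbers are illustrative assumptions; in practice a
statistics package would normally do this work.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples (hypothetical helper)."""
    k = len(groups)                                  # number of groups
    n_total = sum(len(g) for g in groups)            # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up measurements for three groups.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(samples)
print(round(f_stat, 3))  # 13.0
```

A large F value suggests that the group means differ by more than the within-group noise would
explain. Turning it into a p-value requires the F distribution, which this sketch leaves out.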

What is Exploratory Data Analysis?

Exploratory Data Analysis (EDA) is an approach to analyzing data. It’s where the
researcher takes a bird’s eye view of the data and tries to make some sense of it. It’s often the
first step in data analysis, implemented before any formal statistical techniques are applied.

We can use specific statistical techniques, such as creating histograms or box plots, but
EDA is not a set of techniques or procedures; EDA is a “philosophy.”
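
As a rough illustration of the histogram and box plot idea, the sketch below computes the numbers
behind both displays with Python's standard library: a five-number summary (the ingredients of a
box plot) and bin counts (the bars of a histogram). The function names, the bin width, and the
choice of the "inclusive" quartile method are assumptions made for this example; other quartile
conventions give slightly different values.

```python
from collections import Counter
from statistics import quantiles


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), the values a box plot is drawn from."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


def histogram_counts(data, bin_width):
    """Count observations per bin; each key is the lower edge of a bin."""
    counts = Counter((x // bin_width) * bin_width for x in data)
    return dict(sorted(counts.items()))


values = [1, 2, 3, 4, 5, 6, 7, 8, 9]      # made-up sample
print(five_number_summary(values))         # (1, 3.0, 5.0, 7.0, 9)
print(histogram_counts(values, 5))         # {0: 4, 5: 5}
```

Looking at these summaries before fitting any model is one way to get the “feel” for the data
that EDA is after.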

Exploratory data analysis is a complement to inferential statistics, which tends to be
fairly rigid with rules and formulas. EDA involves the analyst trying to get a “feel” for the data
set, often using their own judgment to determine what the most important elements in the data
set are.

Purpose of EDA
The purpose of exploratory data analysis is to:

• Check for missing data and other mistakes.
• Gain maximum insight into the data set and its underlying structure.
• Uncover a parsimonious model, one which explains the data with a minimum number
of predictor variables.
• Check assumptions associated with any model fitting or hypothesis test.
• Create a list of outliers or other anomalies.
• Find parameter estimates and their associated confidence intervals or margins of error.
• Identify the most influential variables.

Other, more specific knowledge can also be obtained through EDA, such as a ranked list
of relevant factors. You may not necessarily include all of the above items in your data analysis,
although it’s likely you’ll want to include at least a few. They should be viewed as guidelines,
rather than rigid rules.
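
Two of the purposes listed above, checking for missing data and creating a list of outliers, can
be sketched in a few lines of Python. This is only one possible approach: it assumes missing
values are stored as None and it flags outliers with the common 1.5 × IQR rule, neither of which
is prescribed by the text. The function names and the readings are made up for illustration.

```python
from statistics import quantiles


def count_missing(values):
    """Count entries recorded as None (assumed marker for missing data)."""
    return sum(1 for v in values if v is None)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    clean = sorted(v for v in values if v is not None)
    q1, _, q3 = quantiles(clean, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in clean if v < low or v > high]


readings = [10, 12, None, 11, 13, 12, 95, None, 11]  # made-up data
print(count_missing(readings))   # 2
print(iqr_outliers(readings))    # [95]
```

Whether a flagged value is a mistake or a genuine extreme observation is still a judgment call
for the analyst, in keeping with the guideline nature of the list.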
