
Lectures Review

Lectures 10, 11, 12

Lecture 10
Exploratory Data Analysis (EDA)
● Investigate missing values / outliers

● Search for patterns, e.g. linear or non-linear

● Calculate numerical summaries

● Ask and define questions

● Univariate / Bivariate / Multivariate
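
The first few EDA tasks above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the sample values are made up, `None` is assumed to mark a missing value, and the 1.5 * IQR rule is just one common convention for flagging outliers.

```python
import statistics

# Hypothetical sample; None marks a missing value (an assumption of this sketch).
values = [12.0, 15.0, None, 14.0, 13.0, 95.0, 16.0, None, 14.0]

# Investigate missing values: count them, then keep only observed values.
missing_count = sum(v is None for v in values)
observed = [v for v in values if v is not None]

# Calculate numerical summaries of the observed values.
summary = {
    "count": len(observed),
    "mean": statistics.mean(observed),
    "median": statistics.median(observed),
    "stdev": statistics.stdev(observed),
}

# Investigate outliers with the 1.5 * IQR rule (one convention among several).
q1, _, q3 = statistics.quantiles(observed, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in observed if v < lower or v > upper]

print(f"missing: {missing_count}, outliers: {outliers}")
print(summary)
```

The value 95.0 sits far above the rest of the sample, so it is the one point the rule flags here. Whether a flagged point is an error or a genuine observation still has to be decided from context.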

Dimensions of Analysis within EDA
1. Univariate: Seeking to explore, plot, and measure one variable
2. Bivariate: Seeking to explore, plot, and measure two variables
3. Multivariate: Seeking to explore, plot, and measure many variables
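
As a rough illustration of the three dimensions, the sketch below measures one variable, then a pair, then every pair in a small dataset. The dataset, the variable names, and the choice of covariance as the bivariate measure are all assumptions made for the example.

```python
import statistics
from itertools import combinations

# Hypothetical dataset: each key is a variable, each list holds one value per observation.
data = {
    "hours_studied": [1.0, 2.0, 3.0, 4.0, 5.0],
    "exam_score": [52.0, 58.0, 61.0, 70.0, 74.0],
    "hours_slept": [8.0, 7.5, 7.0, 6.0, 6.5],
}


def covariance(x, y):
    """Sample covariance of two equally long sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


# Univariate: measure one variable at a time (mean and standard deviation).
univariate = {
    name: (statistics.mean(col), statistics.stdev(col)) for name, col in data.items()
}

# Bivariate: measure how two variables move together.
cov_hours_score = covariance(data["hours_studied"], data["exam_score"])

# Multivariate: repeat the pairwise measure across all variables.
pairwise = {
    (a, b): covariance(data[a], data[b]) for a, b in combinations(data, 2)
}

print(univariate)
print(cov_hours_score)
print(pairwise)
```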
● Understand data properties

● Discover Patterns

● Ask and define questions

● Communicate results
Lecture 11
Communicating Data
Plot Types
Exploratory vs. Explanatory Visualization
Storytelling
Less is More
Lecture 12
Inferential Analysis
● Goals

○ Draw conclusions from a sample and generalize them to a population

● We can only use inference when the collected samples are representative of the population
Approaches to Inference
1. Correlation: association between variables
a. One common approach: Pearson’s r
b. Different types of correlations
i. Positive correlation
ii. Negative correlation
2. Comparison of Means: difference in means between groups
a. One common approach: t-test
3. Regression: relationship between the change in a variable and the change in another
a. One common approach: Linear Regression
4. Non-parametric Tests: used when the assumptions of the above approaches do not hold
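
The first two approaches can be sketched with the standard library alone. The data below are invented, and the t statistic uses Welch's unequal-variance form, which is an assumption of this sketch since the slides only say "t-test". Only the statistic is computed; turning it into a p-value needs the t distribution, which is left out here.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson's r: covariance of x and y scaled by both standard deviations."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two independent groups."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Hypothetical data for the two types of correlation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2.0, 4.0, 6.0, 8.0, 10.0]    # grows with x: positive correlation
falling = [10.0, 8.0, 6.0, 4.0, 2.0]   # shrinks as x grows: negative correlation
r_pos = pearson_r(x, rising)
r_neg = pearson_r(x, falling)

# Hypothetical measurements from two groups for the comparison of means.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2]
group_b = [5.8, 6.0, 5.7, 6.1, 5.9]
t_stat = welch_t(group_a, group_b)

print(r_pos, r_neg, t_stat)
```

The sign of r gives the direction of the association and its magnitude (between 0 and 1) gives the strength. A t statistic far from zero suggests the group means differ by more than sampling noise would explain.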
Linear Regression
● Model

○ A mathematical equation used to predict an output value for a new input

● Difference between linear regression and correlation

○ Correlation: the strength of the linear relationship between two variables

○ Linear regression: finds how a change in the independent variable relates to a change in the dependent variable
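
To make the idea of a model concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The helper names `fit_line` and `predict` and the data are hypothetical, chosen only for illustration.

```python
import statistics


def fit_line(x, y):
    """Least-squares slope and intercept for the model y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, new_x):
    """Use the fitted equation to find a value for a new input."""
    return intercept + slope * new_x


# Hypothetical data that follow y = 2x + 1 exactly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(x, y)
prediction = predict(slope, intercept, 6.0)

print(slope, intercept, prediction)
```

The slope says how much the dependent variable changes per unit change in the independent variable, which is what distinguishes regression from correlation: a correlation coefficient for these data would only report that the linear relationship is perfectly strong, not that y rises by 2 for each unit of x.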
Thank you!
