Unit 3: Statistical Analysis
Initial Data Analysis
Types of Data
Measuring Relationship between Attributes
• Correlation: A statistical measure of the strength of a linear
relationship between two variables. Its values range from -1 to 1.
✓ -1 means perfect negative or inverse correlation
✓ 1 means perfect positive or direct correlation
✓ 0 means no linear relationship.
• Chi-square: A statistical test that compares observed data
with the data expected under a hypothesis.
✓ It is also used to test whether two categorical
variables in our data are related.
✓ It helps to determine whether a difference between
observed and expected counts is due to chance or to a
relationship between the variables.
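As a rough illustration of the chi-square idea above, the sketch below computes the test statistic for a table of two categorical variables. The function name `chi_square_statistic` and the counts in `table` are made up for this example, and it assumes every expected count is greater than zero.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    `observed` is a list of rows; each row lists the counts for one category
    of the first variable across the categories of the second variable.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical 2x2 table of counts (illustrative numbers only).
table = [[30, 10],
         [20, 40]]
chi2, dof = chi_square_statistic(table)
print(f"chi-square = {chi2:.3f}, degrees of freedom = {dof}")
```

A large statistic relative to the chi-square critical value for the given degrees of freedom (3.841 at the 5% level for 1 degree of freedom) suggests the difference is unlikely to be due to chance alone.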
ρ(X, Y) = cov(X, Y) / (σ_X · σ_Y)
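The formula above is the Pearson correlation coefficient. A minimal sketch of it in plain Python follows; the function name `pearson_r` and the sample data are assumptions for illustration, and population (divide by n) covariance and standard deviations are used, which gives the same ratio as the sample versions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Population covariance and standard deviations (divide by n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)

    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sd_x * sd_y)


# Made-up data: hours studied and exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
print(f"r = {r:.4f}")
```

A value of r close to 1 indicates a strong positive linear relationship, and a value close to -1 indicates a strong negative one.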
Measure of Distribution
Skewness
Definition of Kurtosis
• Kurtosis is a numerical measure in statistics that describes
how sharply peaked and how heavy-tailed a data distribution is.
• Also called the tailedness of a distribution.
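One common way to compute kurtosis is as the fourth standardized moment, m4 / m2², optionally minus 3 so that a normal distribution scores 0 ("excess" kurtosis). The sketch below assumes that definition with population moments; the function name `kurtosis` and the two samples are illustrative only.

```python
def kurtosis(data, excess=True):
    """Fourth standardized moment of `data` (population moments).

    With excess=True, 3 is subtracted so a normal distribution gives 0.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    k = m4 / m2 ** 2
    return k - 3.0 if excess else k


flat = [1, 2, 3, 4, 5]                # evenly spread values, light tails
heavy = [0] * 8 + [-10, 10]           # mostly central values plus two extremes

print(kurtosis(flat))    # negative excess kurtosis
print(kurtosis(heavy))   # positive excess kurtosis
```

Negative excess kurtosis points to lighter tails than a normal distribution, while positive excess kurtosis points to heavier tails and more extreme values.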
Box & Whiskers Plot
Probability
• Fundamental concept in statistics and data analysis
Types of Probability
• Marginal Probability
• Joint Probability
• Conditional Probability
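To make the three types concrete, the sketch below derives a joint, a marginal, and a conditional probability from one small table of counts. The weather/commute data and all function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical counts of days: (weather, commute outcome) -> number of days.
counts = {
    ("rain", "late"): 12,
    ("rain", "on_time"): 8,
    ("dry", "late"): 10,
    ("dry", "on_time"): 70,
}
total = sum(counts.values())


def joint(weather, commute):
    """Joint probability P(weather AND commute)."""
    return Fraction(counts[(weather, commute)], total)


def marginal_weather(weather):
    """Marginal probability P(weather), summing over all commute outcomes."""
    return sum(joint(w, c) for (w, c) in counts if w == weather)


def conditional_commute_given_weather(commute, weather):
    """Conditional probability P(commute | weather) = joint / marginal."""
    return joint(weather, commute) / marginal_weather(weather)


print("P(rain and late) =", joint("rain", "late"))
print("P(rain)          =", marginal_weather("rain"))
print("P(late | rain)   =", conditional_commute_given_weather("late", "rain"))
```

The conditional probability is the joint probability divided by the marginal probability of the condition, which is why it is larger here than the joint probability.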
• Probability distributions fall into 2 categories:
– Continuous distributions, for variables like height or
weight.
– Discrete distributions, for variables like the number of
coin tosses needed to get a head.
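The coin-toss example above follows what is usually called the geometric distribution: the probability that the first head appears on toss k is (1 - p)^(k - 1) · p. A small sketch, assuming a fair coin (p = 0.5) by default and using the illustrative name `geometric_pmf`:

```python
def geometric_pmf(k, p=0.5):
    """Probability that the first head occurs on toss number k (k = 1, 2, ...)."""
    if not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    # k - 1 tails in a row, followed by one head.
    return (1 - p) ** (k - 1) * p


for k in range(1, 5):
    print(k, geometric_pmf(k))
```

Because the variable is discrete, each individual value of k has its own probability, and these probabilities sum to 1 over all k.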
• PDF stands for Probability Density
Function.
• CDF stands for Cumulative
Distribution Function.
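As an illustration of how a PDF and a CDF relate, the sketch below uses the normal distribution as an assumed example (the slides do not name a specific distribution). The function names `normal_pdf` and `normal_cdf` are illustrative; the CDF is written with the standard library's `math.erf`.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x (a height, not a probability)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution: the area under the PDF up to x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


# Probability of falling within one standard deviation of the mean,
# obtained as a difference of two CDF values.
within_one_sd = normal_cdf(1.0) - normal_cdf(-1.0)

print(f"pdf(0)          = {normal_pdf(0.0):.4f}")
print(f"cdf(0)          = {normal_cdf(0.0):.4f}")
print(f"P(-1 <= X <= 1) = {within_one_sd:.4f}")
```

For a continuous variable the PDF gives a density at a point, while probabilities of ranges come from the CDF, as in the last line of the example.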