
Descriptive Statistics and Inferential Statistics

CSC 426 Week 7 Dmitriy Zinovev

Agenda
Data Preparation
Descriptive Statistics
Inferential Statistics

Data Preparation
Logging the Data
Checking the Data for Accuracy
Data Transformations

Descriptive Statistics
Univariate Analysis
Assesses properties of a single variable
Distribution
Center
Spread

Correlation
Shows relationships between variables

Univariate Analysis (Distribution)

Univariate Analysis (Center)


Mean

Sensitive to extreme observations
Very useful in case of a normal distribution

Median

Great for visual comparison between distributions
Very useful in case of a skewed distribution

Mode
Most frequent value in the distribution
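The three measures of center can be compared on a small sample. The sketch below uses Python's standard `statistics` module; the data are hypothetical and chosen only to show how one extreme observation moves the mean but not the median.

```python
import statistics

# Hypothetical sample: nine typical scores, then the same scores plus one extreme value.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6]
scores_with_outlier = scores + [100]

mean_clean = statistics.mean(scores)
mean_outlier = statistics.mean(scores_with_outlier)      # pulled toward the extreme value

median_clean = statistics.median(scores)
median_outlier = statistics.median(scores_with_outlier)  # unchanged by the extreme value

mode_clean = statistics.mode(scores)                     # most frequent value

print("mean:  ", mean_clean, "->", mean_outlier)
print("median:", median_clean, "->", median_outlier)
print("mode:  ", mode_clean)
```

This is why the median is usually the safer summary for skewed data, while the mean works well for roughly symmetric, normal-like data.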

Univariate Analysis (Spread)


Five-number summary
Min: smallest observation
Q1: median of the first half of a distribution
Median: median of a distribution
Q3: median of the second half of a distribution
Max: biggest observation

1.5 IQR rule
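The rule is commonly stated as follows: with IQR = Q3 - Q1, any observation below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR is flagged as a suspected outlier. The sketch below computes the five-number summary with the median-of-halves definition and then applies the rule. The function names and the sample are hypothetical, and other quartile conventions give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves definition."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # first half (middle value excluded when n is odd)
    upper = data[half + n % 2:]   # second half
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

def iqr_outliers(values):
    """Return the observations flagged by the 1.5 x IQR rule."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

sample = [1, 3, 4, 5, 5, 6, 7, 8, 9, 30]   # hypothetical data
summary = five_number_summary(sample)
outliers = iqr_outliers(sample)
print(summary)
print(outliers)
```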

Univariate Analysis (Spread cont.)


Standard Deviation
Shows relation of observations to the mean of a distribution
Calculate the distance to the mean for each value
Square the results
Divide the sum of the squares by the size of the distribution minus 1 (variance)
Take the square root of the variance
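The steps can be followed one at a time in code. This is a minimal sketch with hypothetical data. It divides by n - 1, which gives the sample standard deviation, and compares the result with `statistics.stdev` from the standard library.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample

mean = sum(values) / len(values)

# Step 1: distance to the mean for each value.
deviations = [v - mean for v in values]

# Step 2: square the results.
squared = [d ** 2 for d in deviations]

# Step 3: divide the sum by (size - 1) to get the sample variance.
variance = sum(squared) / (len(values) - 1)

# Step 4: take the square root of the variance.
std_dev = math.sqrt(variance)

print(variance, std_dev)
print(statistics.stdev(values))     # library result for comparison
```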

Univariate Analysis (Spread cont.)


Standard Deviation
Empirical rule
approximately 68% of the scores in the sample fall within one standard deviation of the mean
approximately 95% of the scores in the sample fall within two standard deviations of the mean
approximately 99.7% of the scores in the sample fall within three standard deviations of the mean
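The percentages in the empirical rule come from the normal distribution, so they hold only to the extent that the data are approximately normal. As a quick check, the sketch below computes the exact shares for a standard normal distribution using `statistics.NormalDist` (Python 3.8+). The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```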

Correlation
Need to determine whether there is a relationship between variables

Correlation (cont.)
Magnitude
Direction

Correlation (cont.)
Calculation

Test significance of produced value


Significance level
Degrees of freedom
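The calculation and the significance test can be sketched as follows, assuming the coefficient in question is Pearson's r (the slides do not name it). The test statistic is t = r × sqrt((n - 2) / (1 - r²)) with n - 2 degrees of freedom, and it is compared with the critical t value for the chosen significance level. The data and function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic and degrees of freedom for testing r against zero."""
    df = n - 2
    return r * math.sqrt(df / (1 - r ** 2)), df

hours = [1, 2, 3, 4, 5, 6]          # hypothetical predictor
score = [52, 55, 61, 60, 68, 71]    # hypothetical outcome

r = pearson_r(hours, score)
t, df = t_statistic(r, len(hours))

# Two-sided critical value for df = 4 at the 0.05 level, taken from a t table.
critical_t = 2.776
print(r, t, df, abs(t) > critical_t)
```

The sign of r gives the direction of the relationship and its absolute value gives the magnitude.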

Correlation (cont.)
Situations with only one pair of variables in the model are rare in real life. With several variables, compute a correlation matrix that holds the correlation for every pair.
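A correlation matrix simply repeats the pairwise calculation for every pair of variables. The sketch below assumes Pearson's r and uses hypothetical column names and data. In practice a library such as NumPy or pandas would normally be used instead.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Map each pair of column names to its correlation coefficient."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

data = {                                   # hypothetical variables
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 55, 61, 60, 68, 71],
    "absences": [9, 7, 8, 4, 3, 1],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(f"{name:>9}", [round(row[other], 2) for other in data])
```

The matrix is symmetric and has ones on the diagonal, since every variable correlates perfectly with itself.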

Inferential Statistics
Used for drawing conclusions about the population from a sample
Estimation
Estimate true value of the parameter from a sample

Hypothesis testing
Determine if there is a difference in a parameter value for two groups.

Inferential Statistics (General linear model)


General linear model: a family of statistical models that underlies most of inferential statistics
y = b0 + bx + e
y: outcome
b0: intercept
x: predictors
b: coefficient estimates
e: error component
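For a single predictor, the coefficients in y = b0 + bx + e can be estimated by ordinary least squares. The sketch below assumes that estimation method (the slides do not specify one) and uses hypothetical data. The residuals play the role of the error component e.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b) for y = b0 + b*x + e with one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx                # slope: coefficient estimate
    b0 = mean_y - b * mean_x     # intercept
    return b0, b

x = [1, 2, 3, 4, 5]                  # hypothetical predictor
y = [3.1, 4.9, 7.2, 8.8, 11.0]       # hypothetical outcome

b0, b = fit_line(x, y)

# Error component: what the fitted line leaves unexplained for each observation.
residuals = [yi - (b0 + b * xi) for xi, yi in zip(x, y)]
print(b0, b)
print(residuals)
```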

Inferential Statistics (General linear model cont.)


Foundation for many statistical analyses
t-test
Checks whether the means of two groups differ from each other at a defined significance level

ANOVA
Checks whether there is a difference between the means of more than two groups

ANCOVA
Extends ANOVA by including covariates in the analysis

Regression analysis
Creates a model for predicting a dependent variable
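As an illustration of the first analysis in the list, the sketch below computes a two-sample t statistic. It assumes the pooled-variance (Student) form, which treats both groups as having equal variance. The group data and the function name `pooled_t` are hypothetical.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t, df

control = [20, 22, 19, 24, 25]   # hypothetical group without treatment
treated = [28, 27, 30, 26, 29]   # hypothetical group with treatment

t, df = pooled_t(treated, control)

# Two-sided critical value for df = 8 at the 0.05 level, taken from a t table.
critical_t = 2.306
print(t, df, abs(t) > critical_t)
```

When the group variances clearly differ, Welch's version of the test is the usual alternative.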

Inferential Statistics (Dummy variables)


Define different groups.
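A dummy variable codes group membership as 0 or 1 so that it can be used as the predictor x in y = b0 + bx + e. In the sketch below, which uses hypothetical data and a least-squares fit as an assumed estimation method, the intercept comes out as the mean of the group coded 0 and the coefficient as the difference between the two group means.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b) for y = b0 + b*x + e with one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b

control = [20, 22, 19, 24, 25]   # hypothetical group, coded 0
treated = [28, 27, 30, 26, 29]   # hypothetical group, coded 1

outcome = control + treated
dummy = [0] * len(control) + [1] * len(treated)

b0, b = fit_line(dummy, outcome)

mean_control = sum(control) / len(control)
mean_treated = sum(treated) / len(treated)
print(b0, mean_control)                  # intercept vs. mean of the group coded 0
print(b, mean_treated - mean_control)    # coefficient vs. difference in group means
```

This is how a comparison of two group means fits into the general linear model.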

Research design
Experimental Analysis
Quasi-Experimental Analysis

QUESTIONS?