Agenda

Data Preparation
Descriptive Statistics
Inferential Statistics

Data Preparation

Logging the Data
Checking the Data for Accuracy
Data Transformations

Descriptive Statistics

Univariate Analysis

Assesses properties of a single variable

Distribution
Center
Spread

Correlation

Shows relationships between variables

Mean

Median

Great for visual comparison between distributions
Very useful for skewed distributions

Mode

Most frequent value in the distribution
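
As a small illustration of the three measures of center, the sketch below uses Python's standard `statistics` module. The sample is invented for the example and deliberately right-skewed, so the mean is pulled above the median.

```python
import statistics

# Hypothetical right-skewed sample: one large value pulls the mean upward.
incomes = [20, 22, 22, 25, 27, 30, 200]

mean_value = statistics.mean(incomes)      # arithmetic average
median_value = statistics.median(incomes)  # middle value of the sorted data
mode_value = statistics.mode(incomes)      # most frequent value

print(mean_value, median_value, mode_value)
```

Here the median stays close to the bulk of the data while the mean does not, which is why the median is often preferred for skewed distributions.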

5 number summary

Min: smallest observation
Q1: median of the first half of the distribution
Median: median of the whole distribution
Q3: median of the second half of the distribution
Max: largest observation
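
A minimal sketch of the five number summary, assuming the "median of each half" rule for the quartiles and assuming that the middle value is left out of both halves when the sample size is odd (other quartile conventions exist and give slightly different results). The function name `five_number_summary` and the data are made up for illustration.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two observations.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # first half (middle value excluded when n is odd)
    upper = data[half + n % 2:]  # second half
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

summary = five_number_summary([7, 1, 3, 9, 5, 11, 13])
print(summary)  # (1, 3, 7, 11, 13)
```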

Standard Deviation

Shows how far observations typically lie from the mean of a distribution

Calculate the distance to the mean for each value
Square the results
Divide the sum of the squares by the size of the distribution minus 1 (this is the variance)
Take the square root of the variance
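
The steps above can be sketched in a few lines of Python. This assumes the sample form of the variance (divisor n - 1); the scores are made up, and the result is cross-checked against `statistics.stdev` from the standard library.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]      # distance to the mean for each value
squared = [d ** 2 for d in deviations]       # square the results
variance = sum(squared) / (len(scores) - 1)  # divide the sum by n - 1
std_dev = math.sqrt(variance)                # square root of the variance

# The library routine follows the same n - 1 definition.
print(std_dev, statistics.stdev(scores))
```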

Standard Deviation (cont.)

Empirical rule

For an approximately normal distribution:
approximately 68% of the scores in the sample fall within one standard deviation of the mean
approximately 95% of the scores in the sample fall within two standard deviations of the mean
approximately 99.7% of the scores in the sample fall within three standard deviations of the mean
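
The empirical rule can be checked by simulation. The sketch below draws a large normal sample (the mean of 100, the standard deviation of 15 and the seed are arbitrary choices for the example) and counts the share of scores within one, two and three standard deviations of the mean. For normal data the shares should come out near 68%, 95% and 99.7%.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
sample = [random.gauss(100, 15) for _ in range(100_000)]  # simulated normal scores

mean = statistics.fmean(sample)
sd = statistics.stdev(sample)

def share_within(k):
    """Fraction of the sample within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in sample) / len(sample)

shares = {k: share_within(k) for k in (1, 2, 3)}
print(shares)
```

The rule only holds for roughly bell-shaped data; a strongly skewed sample will not reproduce these shares.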

Correlation

Need to determine whether there is a relationship between variables

Correlation (cont.)

Magnitude
Direction

Correlation (cont.)

Calculation

Significance level
Degrees of freedom
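
A hedged sketch of the calculation, assuming the Pearson correlation coefficient is the one meant here. The helper `pearson_r` and the height and weight data are invented for illustration. Significance is judged by converting r to a t statistic with n - 2 degrees of freedom and comparing it with a tabulated critical value.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

height = [150, 160, 165, 170, 180]  # made-up data
weight = [52, 58, 63, 68, 77]

r = pearson_r(height, weight)
df = len(height) - 2                       # degrees of freedom for a correlation
t_stat = r * math.sqrt(df / (1 - r ** 2))  # undefined when r is exactly +1 or -1

# Two-tailed critical t value for df = 3 at the 0.05 significance level,
# taken from a standard t table.
T_CRITICAL = 3.182
significant = abs(t_stat) > T_CRITICAL
print(round(r, 3), df, significant)
```

The sign of r gives the direction of the relationship and its absolute value gives the magnitude.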

Correlation (cont.)

Situations with only two variables in the model are rare in real life.
With more variables, compute a correlation matrix covering every pair.
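
A correlation matrix simply applies the pairwise calculation to every combination of variables. The sketch below assumes Pearson correlation; the variable names and values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

data = {  # hypothetical variables measured on the same five cases
    "hours": [1, 2, 3, 4, 5],
    "score": [2, 4, 5, 4, 5],
    "errors": [9, 7, 6, 4, 2],
}

names = list(data)
matrix = [[pearson_r(data[a], data[b]) for b in names] for a in names]

for name, row in zip(names, matrix):
    print(f"{name:>7}", " ".join(f"{value:6.2f}" for value in row))
```

The matrix is symmetric with ones on the diagonal, so only the values below (or above) the diagonal need to be read.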

Inferential Statistics

Used for drawing conclusions about a population from a sample

Estimation

Estimate the true value of a parameter from a sample

Hypothesis testing

Determine if there is a difference in a parameter value for two groups.

General linear model

Family of statistical models that produce most inferential statistics
y = b0 + bx + e
y: outcome
b0: intercept
x: predictors
b: coefficient estimates
e: error component

Foundation for many statistical analyses
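
For the simplest case of one predictor, the model y = b0 + bx + e can be fitted by ordinary least squares. This is a sketch under that assumption (the slides do not name an estimation method); `fit_line` and the data are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b) for y = b0 + b*x + e with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # coefficient estimate (slope)
    b0 = mean_y - b * mean_x   # intercept
    return b0, b

x = [1, 2, 3, 4, 5]             # predictor (made-up data)
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # outcome

b0, b = fit_line(x, y)
residuals = [yi - (b0 + b * xi) for xi, yi in zip(x, y)]  # estimates of e
print(b0, b)
```

The residuals play the role of the error component e; with an intercept in the model they sum to zero up to rounding.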

t-test

Checks whether the means of two groups differ from each other at a defined confidence level
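
A minimal sketch of the test statistic for two independent groups, assuming the form that divides the difference in means by the standard error built from each group's own variance (the unpooled, Welch-style version; a pooled-variance variant also exists). The group names and values are invented.

```python
import math
import statistics

def t_statistic(a, b):
    """Difference in means divided by its standard error (unpooled variances)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

treatment = [23, 25, 28, 30, 32]  # made-up scores
control = [18, 20, 21, 24, 22]

t = t_statistic(treatment, control)
print(t)
```

The resulting value is compared with a critical value from a t table for the chosen significance level and the appropriate degrees of freedom.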

ANOVA

Checks whether there is a difference between the means of more than two groups
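
As an illustration, a one-way ANOVA compares the variation between group means with the variation inside the groups. The sketch below computes the F statistic by hand; the function name `one_way_f` and the three groups are hypothetical.

```python
import statistics

def one_way_f(groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]  # made-up data
f_stat = one_way_f(groups)
print(f_stat)
```

A large F value relative to the critical value for (k - 1, n - k) degrees of freedom suggests that at least one group mean differs from the others.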

ANCOVA

Extends ANOVA by including covariates in the analysis

Regression analysis

Creates a model for predicting a dependent variable

Define different groups.

Research design

Experimental Analysis
Quasi-Experimental Analysis

QUESTIONS?

