
Descriptive Statistics and Inferential Statistics

CSC 426 Week 7 Dmitriy Zinovev

Data Preparation
Descriptive Statistics
Inferential Statistics

Data Preparation
Logging the Data
Checking the Data for Accuracy
Data Transformations

Descriptive Statistics
Univariate Analysis
Assesses properties of a single variable
Distribution
Center
Spread

Correlation
Shows relationships between variables

Univariate Analysis (distribution)

Univariate Analysis (Center)


Mean
Sensitive to extreme observations
Very useful in case of a normal distribution

Median
Great for visual comparison between distributions
Very useful in case of a skewed distribution

Mode
Most frequent value in the distribution
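
The three measures of center described above (commonly called mean, median, and mode) can be compared on a small sample. This is a minimal sketch using Python's standard library; the `scores` list is hypothetical data chosen to include one extreme value.

```python
from statistics import mean, median, mode

# Hypothetical sample with one extreme observation (40).
scores = [2, 3, 3, 4, 5, 6, 40]

center_mean = mean(scores)      # pulled upward by the extreme value
center_median = median(scores)  # middle value of the sorted data
center_mode = mode(scores)      # most frequent value

print(center_mean, center_median, center_mode)
```

The mean lands well above most of the observations, while the median stays near the bulk of the data, which is why the median is often preferred for skewed distributions.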

Univariate Analysis (Spread)

5 number summary
Min: smallest observation
Q1: median of the first half of a distribution
Median: median of a distribution
Q3: median of the second half of a distribution
Max: biggest observation

1.5 IQR rule
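
The five-number summary and the 1.5 IQR rule can be sketched as follows. This is an illustrative sketch, not the course's own code: it assumes the "median of halves" definition of Q1 and Q3 given above, excludes the middle value from both halves when the sample size is odd (one common convention), and flags as suspected outliers the values lying more than 1.5 × IQR below Q1 or above Q3, where IQR = Q3 - Q1. The function names and the `sample` data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # first half (middle value excluded if n is odd)
    upper = data[half + n % 2:]   # second half
    return data[0], median(lower), median(data), median(upper), data[-1]

def iqr_outliers(values):
    """Return values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

sample = [1, 3, 4, 5, 6, 7, 9, 30]  # hypothetical data
summary = five_number_summary(sample)
outliers = iqr_outliers(sample)
print(summary, outliers)
```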

Univariate Analysis (Spread cont.)

Standard Deviation
Shows relation of observations to the mean of a distribution
Calculate the distance to the mean for each value
Square the results
Divide the sum of the squares by the size of the distribution minus 1 (variance)
Take the square root of the variance
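
The steps above can be followed one by one in code. This is a minimal sketch that assumes the sample form of the standard deviation (dividing by the size minus 1); the function name `sample_std` and the `data` list are hypothetical. The result is compared with `statistics.stdev` from the standard library, which uses the same definition.

```python
from math import sqrt
from statistics import stdev

def sample_std(values):
    m = sum(values) / len(values)                  # mean of the distribution
    squared = [(v - m) ** 2 for v in values]       # squared distance to the mean
    variance = sum(squared) / (len(values) - 1)    # divide by size minus 1
    return sqrt(variance)                          # square root of the variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
s = sample_std(data)
print(s, stdev(data))
```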

Univariate Analysis (Spread cont.)

Standard Deviation
Empirical rule
For an approximately normal distribution:
approximately 68% of the scores in the sample fall within one standard deviation of the mean
approximately 95% of the scores in the sample fall within two standard deviations of the mean
approximately 99.7% of the scores in the sample fall within three standard deviations of the mean
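
The shares in the empirical rule come from the normal distribution and can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library (available since Python 3.8); the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of a normal distribution lying within k standard deviations of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in coverage.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```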

Correlation

Need to determine whether there is a relationship between variables

Correlation (cont.)
Magnitude
Direction
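
Magnitude and direction can both be read off a single correlation coefficient. The sketch below assumes the Pearson coefficient, which the slides do not name explicitly; the function name `pearson_r` and the `hours` and `score` lists are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours = [1, 2, 3, 4, 5]        # hypothetical predictor
score = [52, 55, 61, 64, 70]   # hypothetical outcome

r = pearson_r(hours, score)
magnitude = abs(r)             # strength of the relationship, between 0 and 1
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(round(r, 3), direction)
```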

Correlation (cont.)

Test significance of produced value

Significance level
Degrees of freedom
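
One common way to test a correlation coefficient is to convert it to a t statistic with n - 2 degrees of freedom and compare it with a critical value for the chosen significance level. The sketch below assumes a Pearson coefficient; the values of `r` and `n` are hypothetical, and `critical_t` is the two-tailed critical value for a significance level of 0.05 and 3 degrees of freedom taken from a standard t table.

```python
from math import sqrt

def correlation_t_statistic(r, n):
    """Return (t, degrees_of_freedom) for a correlation r computed from n pairs."""
    df = n - 2
    t = r * sqrt(df / (1 - r * r))
    return t, df

r, n = 0.9, 5                  # hypothetical correlation and sample size
t, df = correlation_t_statistic(r, n)

critical_t = 3.182             # two-tailed, significance level 0.05, df = 3 (t table)
significant = abs(t) > critical_t
print(round(t, 3), df, significant)
```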

Correlation (cont.)
Models with only one variable are rare in real life. With several variables, compute a correlation matrix that holds the correlation for every pair of variables.
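
A correlation matrix simply repeats the pairwise calculation for every pair of variables. The following sketch assumes Pearson correlation and stores the matrix as a nested dictionary; the function names and the `data` columns are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Map each pair of column names to the correlation of their values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

data = {  # hypothetical variables measured on the same five cases
    "age": [23, 35, 41, 52, 60],
    "income": [30, 48, 55, 61, 58],
    "visits": [9, 7, 6, 4, 2],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(row[other], 2) for other in matrix])
```

The diagonal is always 1 (each variable correlates perfectly with itself) and the matrix is symmetric, so only the entries on one side of the diagonal need to be inspected.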

Inferential Statistics
Used for drawing conclusions about the population from a sample
Estimate true value of the parameter from a sample

Hypothesis testing
Determine if there is a difference in a parameter value for two groups.

Inferential Statistics (General linear model)

General linear model: a family of statistical models that produce most of inferential statistics
y = b0 + bx + e
y: outcome
b0: intercept
x: predictors
b: coefficient estimates
e: error component
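
For a single predictor, the terms of y = b0 + bx + e can be estimated from data. The sketch below assumes ordinary least squares as the estimation method, which the slides do not specify; the function name `fit_line` and the `xs` and `ys` values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates for y = b0 + b*x + e with one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))                   # coefficient estimate
    b0 = my - b * mx                                         # intercept
    errors = [y - (b0 + b * x) for x, y in zip(xs, ys)]      # error component
    return b0, b, errors

xs = [1, 2, 3, 4, 5]                 # hypothetical predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical outcomes

b0, b, e = fit_line(xs, ys)
print(round(b0, 3), round(b, 3))
```

With an intercept in the model, the least-squares errors always sum to zero (up to rounding), which is a quick sanity check on the fit.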

Inferential Statistics (General linear model cont.)

Foundation for many statistical analyses
t-test
Checks if the means of two groups differ from each other at a defined confidence level

ANOVA
Checks if there is a difference between more than two groups

ANCOVA
Adjusts the use of ANOVA by including covariates in the analysis

Regression analysis
Creates a model for predicting the dependent variable
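
The comparison of two group means listed above can be sketched with a two-sample t statistic. This sketch assumes the pooled-variance (Student) form with n1 + n2 - 2 degrees of freedom, which is only one of several variants; the function name and the `control` and `treated` samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df

control = [10, 12, 11, 13, 14]   # hypothetical group 1
treated = [14, 15, 17, 16, 18]   # hypothetical group 2

t, df = two_sample_t(treated, control)
print(round(t, 3), df)
```

The statistic is then compared with the critical value of the t distribution for the chosen significance level and the computed degrees of freedom.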

Inferential Statistics (Dummy variables)

Define different groups.
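
A dummy variable coded 0 or 1 lets the linear model y = b0 + bx + e express a group comparison. In the sketch below, which assumes a least-squares fit, the intercept equals the mean of the group coded 0 and the coefficient equals the difference between the two group means. The function name and the data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

control = [10, 12, 11, 13, 14]   # hypothetical group coded 0
treated = [14, 15, 17, 16, 18]   # hypothetical group coded 1

outcome = control + treated
dummy = [0] * len(control) + [1] * len(treated)   # dummy variable marks the group

b0, b = fit_line(dummy, outcome)
print(b0, b, mean(control), mean(treated) - mean(control))
```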

Research design
Experimental Analysis
Quasi-Experimental Analysis