Statistics Intro

Univariate Analysis, Central Tendency, Dispersion

Review of Descriptive Stats.

Descriptive statistics present quantitative descriptions in a manageable form by reducing large amounts of data to a simpler summary. Examples:

Batting average in baseball
Cornell’s grade-point system

Univariate Analysis
 

Univariate analysis is the examination, across cases, of one variable at a time. Frequency distributions are used to group data: one sets up ranges (margins) that group cases into categories. Examples include:

Age categories
Price categories
Temperature categories


 

Two ways to describe a univariate distribution:

A table
A graph (histogram, bar chart)

Distributions (cont.)

Distributions may also be displayed using percentages. For example, one could use percentages to describe the following:

 

Percentage of people under the poverty level
Over a certain age
Over a certain score on a

Distributions (cont.)
A Frequency Distribution Table
Category    Percent
Under 35      9%
36-45        21%
46-55        45%
56-65        19%
66+           6%
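
A table like this can be built by assigning each case to a category and counting. The sketch below is illustrative only: the ages are invented, the function name `age_category` is hypothetical, and an age of exactly 35 is assumed to fall in the first category, since the slide leaves that boundary unspecified.

```python
from collections import Counter

# Hypothetical ages, invented purely for illustration.
ages = [22, 31, 38, 41, 44, 47, 50, 52, 55, 58, 61, 70]

CATEGORIES = ["35 and under", "36-45", "46-55", "56-65", "66+"]

def age_category(age):
    """Map an age to a category (assumption: 35 goes in the first one)."""
    if age <= 35:
        return "35 and under"
    if age <= 45:
        return "36-45"
    if age <= 55:
        return "46-55"
    if age <= 65:
        return "56-65"
    return "66+"

# Count the cases in each category, then convert counts to percentages.
counts = Counter(age_category(a) for a in ages)
percents = {cat: 100 * counts[cat] / len(ages) for cat in CATEGORIES}

for cat in CATEGORIES:
    print(f"{cat:>12}  {percents[cat]:5.1f}%")
```

The percentages always sum to 100 because every case lands in exactly one category.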

Distributions (cont.)
A Histogram
[Figure: histogram; vertical axis runs from 0 to 45 in steps of 5]

Central Tendency

An estimate of the “center” of a distribution. Three different types of estimates:

Mean
Median
Mode


The mean is the most commonly used measure of central tendency. One totals all the results and then divides by the number of units, or “n”, of the sample. Example: The HSS 292 Quiz 1 mean was determined by the sum of all the scores divided by the number of students taking the

Working Example (Mean)

Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15. The scores total 167 and there are 8 of them, so the mean is 167/8 = 20.875.
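
The arithmetic in this working example can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
scores = [15, 20, 21, 20, 36, 15, 25, 15]

total = sum(scores)      # 167
n = len(scores)          # 8
mean = total / n

print(total, n, mean)    # 167 8 20.875
```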


The median is the score found at the exact middle of the set. One must list all scores in numerical order and then locate the score in the center of the sample. Example: If there are 501 scores in the list, score #251 would be the median; with an even count such as 500, the median is the average of scores #250 and #251. This is useful in weeding out

Working Example (Median)

  

Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15. First line up the scores: 15, 15, 15, 20, 20, 21, 25, 36. There are 8 scores, so scores #4 and #5 represent the halfway point. Both are 20, so the median is 20.
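
The same steps (sort, then take the middle score or the average of the two middle scores) can be sketched in Python. The variable names are illustrative; `statistics.median` from the standard library is printed alongside as a cross-check.

```python
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]

ordered = sorted(scores)   # [15, 15, 15, 20, 20, 21, 25, 36]
n = len(ordered)

if n % 2 == 1:
    # Odd count: a single middle score exists.
    median = ordered[n // 2]
else:
    # Even count: average the two scores that straddle the midpoint.
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median)                       # 20.0
print(statistics.median(scores))    # 20.0
```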


  

The mode is the most repeated score in the set of results. Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15. Again we first line up the scores: 15, 15, 15, 20, 20, 21, 25, 36. The score 15 appears three times, more than any other score, and is therefore the mode.
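
Counting repeats by hand gets tedious for larger sets. A minimal Python sketch of the same count, with illustrative variable names, uses `collections.Counter` from the standard library:

```python
from collections import Counter

scores = [15, 20, 21, 20, 36, 15, 25, 15]

counts = Counter(scores)                     # how often each score occurs
mode, frequency = counts.most_common(1)[0]   # the most repeated score

print(mode, frequency)                       # 15 3
```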

Central Tendency

If the distribution is normal (i.e., bell-shaped), the mean, median and mode are all equal. In our analyses, we’ll use the mean.


Two types of estimates of dispersion:

Range
Standard deviation

The standard deviation is the more accurate and detailed of the two, because a single outlier can greatly extend the range.


The range identifies the lowest and highest scores. Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15. The range would be 15 to 36, which means that 21 points separate the lowest score from the highest.
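
As a quick check of this example, the range can be computed in Python. The variable names are illustrative only.

```python
scores = [15, 20, 21, 20, 36, 15, 25, 15]

lowest, highest = min(scores), max(scores)
spread = highest - lowest          # points separating the two extremes

print(lowest, highest, spread)     # 15 36 21
```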

Standard Deviation

The standard deviation is a value that shows how far individual scores typically lie from the mean of the sample. If scores are standardized to a normal curve, there are several statistical manipulations that can be performed to analyze the data set.

Standard Dev. (cont.)

Assumptions may be made about the percentage of scores as they deviate from the mean. If scores are normally distributed, one can assume that approximately 68% of the scores in the sample fall within one standard deviation of the mean. Approximately 95% of the scores would then fall within two standard deviations of the mean.
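
These percentages follow from the shape of the normal curve and can be reproduced with the error function in Python's standard library. This is a sketch only; the function name `share_within` is hypothetical.

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
```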

Standard Dev. (cont.)

The standard deviation is found by summing the squared deviations from the mean for all the scores, dividing that sum by the number of scores, and taking the square root of the result. Squaring accounts for both positive and negative deviations from the mean, so that they do not cancel each other out.
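
Written as a formula, the description above corresponds to the following. The symbols are assumptions introduced here for illustration, not notation from the original slides: x_i is an individual score, x-bar is the mean, and n is the number of scores.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}
```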

Working Example (stand. dev.)

  

Let's take the set of scores: 15, 20, 21, 20, 36, 15, 25, 15. The mean of this sample was found to be 20.875. Round up to 21. Again we first line up the scores: 15, 15, 15, 20, 20, 21, 25, 36. Subtract the mean from each score: 15-21=-6, 15-21=-6, 15-21=-6, 20-21=-1, 20-21=-1, 21-21=0, 25-21=4, 36-21=15.

Working Example (stand. dev., cont.)
     

Square these values: 36, 36, 36, 1, 1, 0, 16, 225.
Total these values: 351.
Divide 351 by 8: 43.875.
Take the square root of 43.875: about 6.62.
6.62 is your standard deviation.
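
The whole calculation can be sketched in Python as a check. The helper `population_sd` is a hypothetical name introduced here. It divides by the number of scores, as the slides do, and is run once with the mean rounded to 21 and once with the exact mean of 20.875; both agree to two decimal places.

```python
import math
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]

def population_sd(values, center):
    """Square root of the average squared deviation from `center`."""
    squared = [(v - center) ** 2 for v in values]
    return math.sqrt(sum(squared) / len(values))

sd_rounded_mean = population_sd(scores, 21)                       # mean rounded to 21
sd_exact_mean = population_sd(scores, sum(scores) / len(scores))  # mean of 20.875

print(round(sd_rounded_mean, 2))             # 6.62
print(round(sd_exact_mean, 2))               # 6.62
print(round(statistics.pstdev(scores), 2))   # 6.62, standard-library cross-check
```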
