
WELCOME TO STATISTICS -160

Section L40
Statistics is………
 …a way to summarize and describe
information: not very interesting in itself
 …an important tool for research in my field,
and something I look forward to learning more
about
 … something that I should learn to earn my
degree
 … boring
What best describes your attitude towards statistics?
Statistics is…….
 How can we evaluate evidence against global
warming?
 Are cell phones dangerous?
 What are the chances of a tax return being
audited?
 How likely are we to win the lottery?
 Is there bias against women in appointing
managers?
Data
Data is information we
gather through
experiments and surveys.
1. Experiment on low carb
diet
 Data: weight of subjects
before and after
2. Survey on effectiveness of
a TV ad
 Data: percentage who
went to Starbucks since ad
aired
Statistics
Statistics is the art and science of
1. Designing studies,
2. Analyzing data that those studies produce.

The ultimate goal is to translate data into knowledge and


understanding.

Statistics is the art and science of learning from data.


Three aspects of a study
1. Design: Planning how
to obtain data

2. Description:
Summarizing the data

3. Inference: Making
decisions and
predictions
1st Aspect of a Study: Design
How do we conduct the
experiment or select people
for the survey to ensure
trustworthy results?
Design Examples:
1. Planning data collection
to study effects of
Vitamin E on athletic
strength

2. For a marketing survey,


selecting people to
provide proper coverage
2nd Aspect of a Study: Description

Summarize raw data and


present in useful formats
(e.g., average, charts or
graphs)
Description Examples:
 A graph showing total
precipitation in
Clarksville for each
month of 2005
 Average age of students
in a statistics class is 25
years
3rd Aspect of a Study: Inference

Make decisions or predictions


based on the data
Inference Examples:
 Relationship between
smoking cigarettes and
getting emphysema
 47% of the registered
voters in Regina will vote
in the primary
Ladder of Inference
Descriptive vs. Inferential Statistics

 Descriptive statistics
summarize data –
graphs and numbers such
as averages and
percentages
 Inferential statistics make
decisions or predictions
about a population
based on data obtained
from a sample of that
population.
Variable

A variable is any
characteristic that
changes or varies over
time and/or for different
individuals or objects
under consideration.

Example: Height, Weight,


IQ, Hair Color

Definitions (cont'd)
 Experimental unit: the individual or object on which
a variable is measured

 Measurement: results when a variable is actually


measured on an experimental unit
A set of measurements, called data, can be either a
sample or a population
Definitions (cont'd)
 Population: the set of all measurements of interest
to the investigator

 Sample: a subset of measurements selected from


the population of interest
Example:
 Variable
 Hair color

 Experimental unit
 Person

 Typical measurements
 Brown, black, blonde, etc.
How many variables have you measured?

 Univariate data: one variable is measured on a


single experimental unit

 Bivariate data: two variables are measured on a


single experimental unit

 Multivariate data: more than two variables are


measured on a single experimental unit
Types of Data
Qualitative Variable
 Measure a quality or characteristic on each
experimental unit

 Examples:
◼ Hair color (black, brown, blonde…)
◼ Make of car (Dodge, Honda, Ford…)
◼ Gender (male, female)
◼ Province of birth (Alberta, Ontario…)
Quantitative Variable

A variable is called quantitative if observations take


numerical values for different magnitudes of the
variable.

 Examples:
1. Age
2. Number of siblings
3. Annual Income
Quantitative Variables
 Discrete: if it can assume only a finite or countable
number of values

 Continuous: if it can assume the infinitely many


values corresponding to the points on a line interval
Discrete Quantitative Variable

 A quantitative variable
is discrete if its possible
values form a set of
separate numbers:
0,1,2,3,….
 Examples:
1. Number of pets in
a household
2. Number of children
in a family
3. Number of foreign
languages spoken
by an individual
Continuous Quantitative Variable

 A quantitative variable
is continuous if its
possible values form an
interval
 Measurements
 Examples:
1. Height/Weight
2. Age
3. Blood pressure
Graphing Qualitative Variables

Use a data distribution to describe:

 What values of the variable have been


measured

 How often each value has occurred:


 Frequency
 Relative frequency = Frequency/n
(where n = sample size)
 Percent = 100 x Relative frequency
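The three quantities above can be tabulated in a few lines. This is a minimal sketch: the `ratings` list is made up for illustration and is not the survey data from these slides.

```python
from collections import Counter

# Hypothetical ratings, invented for illustration (not data from the slides).
ratings = ["A", "B", "B", "C", "B", "A", "D", "C", "B", "C"]

n = len(ratings)                 # sample size
frequency = Counter(ratings)     # how often each value has occurred

# For each category store (frequency, relative frequency, percent).
table = {}
for value in sorted(frequency):
    rel = frequency[value] / n   # relative frequency = frequency / n
    table[value] = (frequency[value], rel, 100 * rel)

for value, (freq, rel, pct) in table.items():
    print(f"{value}: frequency={freq}, relative frequency={rel:.2f}, percent={pct:.0f}%")
```

The relative frequencies always sum to 1 and the percents to 100, which is a quick check on any frequency table.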
Graphs for Categorical Data

Example: In a survey concerning public education, 400


school administrators were asked to rate the quality of
education in Canada
GRAPH TYPES: BAR CHART
PIE CHART
Angle = Relative Frequency × 360°
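A short sketch of the angle rule, assuming a hypothetical frequency table (the counts below are invented, not taken from the survey):

```python
# Hypothetical category counts, invented for illustration.
frequencies = {"A": 2, "B": 4, "C": 3, "D": 1}

n = sum(frequencies.values())   # sample size

# Angle of each pie slice = relative frequency x 360 degrees.
angles = {category: (count / n) * 360 for category, count in frequencies.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")
```

Because the relative frequencies sum to 1, the slice angles sum to 360 degrees.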
Dot plots

The simplest graph for quantitative data, dot plots


plot the measurements as points on a horizontal
axis, stacking the points that duplicate existing
points
Example: the set 2, 3, 6, 6, 7, 9
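The stacking idea can be sketched as a text-only dot plot of the example set. This sketch assumes single-digit integer values so the columns line up; a plotting library would be the usual choice for real data.

```python
from collections import Counter

data = [2, 3, 6, 6, 7, 9]            # the example set from the slide
counts = Counter(data)               # missing values count as 0
lo, hi = min(data), max(data)
height = max(counts.values())        # tallest stack of dots

# Build the plot from the top row down; a '*' marks one measurement.
rows = []
for level in range(height, 0, -1):
    row = " ".join("*" if counts[v] >= level else " " for v in range(lo, hi + 1))
    rows.append(row.rstrip())

axis = " ".join(str(v) for v in range(lo, hi + 1))   # horizontal axis labels
print("\n".join(rows + [axis]))
```

The repeated value 6 is the only one that gets a second dot stacked above the first.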
Interpreting Graphs: Location and Spread

Where is the data centred on the horizontal axis,


and how does it spread out from the centre?
Interpreting Graphs: Shapes

Mound shaped and symmetric


(mirror images)

Skewed right: a few unusually


large measurements

Skewed left: a few unusually


small measurements

Bimodal: two local peaks


Outlier

An outlier falls far from the rest of the data


Outliers
Describing data with numerical
measures
 Graphical methods may not always be sufficient
for describing data

 Numerical measures can be created for both


populations and samples
Measures of Centre
 Measure of centre: a measure along the horizontal
axis of the data distribution that locates the centre
of the distribution
Median
 Median: the middle measurement when the
measurements are ranked from smallest to
largest
 The position of the median is

.5(n + 1)

once the measurements have been ordered


Example

 The set: 2, 4, 9, 8, 6, 5, 3 n = 7
 Sort: 2, 3, 4, 5, 6, 8, 9
 Position: .5(n + 1) = .5(7 + 1) = 4th
Median = 5, the 4th measurement in the sorted list
• The set: 2, 4, 9, 8, 6, 5 n=6
• Sort: 2, 4, 5, 6, 8, 9
• Position: .5(n + 1) = .5(6 + 1) = 3.5th
Median = (5 + 6)/2 = 5.5, the average of the 3rd and 4th
measurements
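The position rule .5(n + 1) can be sketched directly in code. The function name `median_by_position` is a hypothetical helper for illustration; Python's standard `statistics.median` gives the same answers.

```python
import statistics

def median_by_position(measurements):
    """Median via the .5(n + 1) position rule (positions are 1-based)."""
    if not measurements:
        raise ValueError("median of an empty set is undefined")
    ordered = sorted(measurements)
    n = len(ordered)
    position = 0.5 * (n + 1)
    lower = int(position)                 # whole part of the position
    if position == lower:                 # n odd: a single middle measurement
        return ordered[lower - 1]
    # n even: average the two measurements on either side of the position
    return (ordered[lower - 1] + ordered[lower]) / 2

odd_set = [2, 4, 9, 8, 6, 5, 3]    # n = 7, position 4
even_set = [2, 4, 9, 8, 6, 5]      # n = 6, position 3.5
print(median_by_position(odd_set), median_by_position(even_set))
print(statistics.median(odd_set), statistics.median(even_set))
```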
Mode

 Mode: the measurement which occurs most


frequently
 In the set: 2, 4, 9, 8, 8, 5, 3
◼ The mode is 8, which occurs twice

 In the set: 2, 2, 9, 8, 8, 5, 3
◼ There are two modes, 2 and 8 (bimodal)

 In the set: 2, 4, 9, 8, 5, 3
◼ There is no mode; each value is unique
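The three cases above can be checked with a small counting sketch. The helper name `modes` is hypothetical, and returning an empty list when every value is unique follows the convention on this slide (some software instead reports every value as a mode).

```python
from collections import Counter

def modes(measurements):
    """Return the most frequent value(s); empty list if no value repeats."""
    if not measurements:
        return []
    counts = Counter(measurements)
    top = max(counts.values())        # highest frequency in the set
    if top == 1:                      # every value is unique: no mode
        return []
    return sorted(value for value, count in counts.items() if count == top)

print(modes([2, 4, 9, 8, 8, 5, 3]))   # one mode
print(modes([2, 2, 9, 8, 8, 5, 3]))   # two modes (bimodal)
print(modes([2, 4, 9, 8, 5, 3]))      # no mode
```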
Extreme Values

Symmetric: Mean = Median

Skewed right: Mean > Median

Skewed left: Mean < Median
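A small numerical illustration of how extreme values pull the mean but not the median. The three data sets are invented for illustration and are not from the slides.

```python
import statistics

# Hypothetical data sets, invented for illustration.
symmetric    = [2, 4, 6, 8, 10]      # mirror image about 6
skewed_right = [2, 3, 4, 5, 26]      # one unusually large measurement
skewed_left  = [1, 22, 23, 24, 25]   # one unusually small measurement

summary = {}
for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    summary[name] = (statistics.mean(data), statistics.median(data))
    print(f"{name}: mean={summary[name][0]}, median={summary[name][1]}")
```

In each skewed set the single extreme value drags the mean toward it, while the median stays with the bulk of the data.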


Measures of Variability
 Measure of variability: a measure along the
horizontal axis of the data distribution that
describes the spread of the distribution from
the centre
The Range

 Range (R): the difference between the largest


and smallest measurements in a set
 Example: a botanist records the number of petals on
five flowers: 5, 12, 6, 8, 14
 The range is R = 14 – 5 = 9

It is quick and easy, but only uses 2 of the 5


measurements
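A quick sketch of the range for the petal data, together with a made-up second set (`altered`, an assumption for illustration) showing why using only two measurements can hide differences in spread:

```python
petals = [5, 12, 6, 8, 14]            # petal counts from the slide
data_range = max(petals) - min(petals)

# Hypothetical set with the same extremes but very different middle values.
altered = [5, 5, 5, 5, 14]
altered_range = max(altered) - min(altered)

print(data_range, altered_range)
```

Both sets have the same range even though their measurements are distributed very differently between the extremes.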
The Variance
 Variance: a measure of variability that uses all
the measurements; it is based on the average
squared deviation of the measurements about
their mean
 Example: a botanist records the number of petals
on five flowers: 5, 12, 6, 8, 14

\bar{x} = \frac{\sum x_i}{n} = \frac{45}{5} = 9

[Dot plot of the five petal counts on a horizontal axis from 4 to 14]
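The deviations about the mean can be worked through step by step for the petal data. The slides shown here do not state which divisor is used, so this sketch computes both common versions: dividing by n - 1 (sample variance) and by n (population variance).

```python
import statistics

petals = [5, 12, 6, 8, 14]                 # petal counts from the slide
n = len(petals)
mean = sum(petals) / n                     # 45 / 5

deviations = [x - mean for x in petals]    # distance of each value from the mean
squared = [d ** 2 for d in deviations]     # squaring removes the signs
sum_of_squares = sum(squared)

sample_variance = sum_of_squares / (n - 1)     # divisor n - 1 (assumed convention)
population_variance = sum_of_squares / n       # divisor n

print(deviations, sum_of_squares)
print(sample_variance, population_variance)
print(statistics.variance(petals), statistics.pvariance(petals))
```

The raw deviations always sum to zero, which is why they are squared before averaging.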
