Statistics

Descriptive Statistics

Shaheena Bashir

FALL, 2019
Outline

Introduction

Data

Graphic Presentation
Categorical Data
Numeric Data

Numeric Presentation

Introduction

Why Study Statistics


- The field of statistics uses numerical information obtained
  from samples to draw inferences about populations.
- Statistics is a general intellectual method that applies
  wherever data, variation, and chance appear, and all three are
  omnipresent in modern life.
- Being able to provide sound, evidence-based arguments and to
  critically evaluate data-based claims are important skills that
  all citizens should have.
- The study of statistics provides students with tools, ideas, and
  dispositions to react intelligently to information in the world
  around them. Reflecting this need to improve students’ ability
  to think statistically, statistical literacy and reasoning are
  becoming part of the mainstream school and university
  curricula in many countries.
As a consequence, statistics education is becoming a thriving field.
Data

Data, data everywhere, and we are forced to look at it:
- 63% of the people polled support the president’s decision
  to ...
- Scientists at a major research university report that treatment
  of Parkinson’s disease with a combination of drugs A and B
  has the potential to extend remission of the disease by an
  average of 2 years ...
- The nation’s trade deficit narrowed last month, for the first
  time in ...
- The Dow Jones Industrial Average rose again today to a new
  record high, marking the seventh consecutive day of record
  highs, but the broader market ...
- Despite the claims of a robust economy, the percentage of
  families living below the poverty line has not dropped
  substantially over the past six months ...

Data Presentation

Once you have collected data:
- What do the numbers indicate?
- What will you do with them?
For example, suppose you are interested in buying a house in a
particular area. You may have no clue about the house prices, so
you might ask your real estate agent to give you a sample data set
of prices. Looking at all the prices in the sample is often
overwhelming. A better way might be to look at the median price
and the variation of prices. The median and variation are just two
of the ways that you will learn to describe data. Your agent might
also provide you with a graph of the data.
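
As a rough illustration of describing a sample by its median and variation, the sketch below uses Python's standard `statistics` module. The prices are hypothetical values invented for this example (in thousands), not data from the lecture, and the range and sample standard deviation are only two of several possible measures of variation.

```python
import statistics

# Hypothetical house prices (in thousands); invented for illustration only.
prices = [250, 310, 275, 1200, 295, 330, 285]

median_price = statistics.median(prices)   # middle value of the sorted sample
price_range = max(prices) - min(prices)    # simplest measure of variation
spread = statistics.stdev(prices)          # sample standard deviation

print(f"median = {median_price}, range = {price_range}, stdev = {spread:.1f}")
```

The single very expensive house barely moves the median, but it inflates both the range and the standard deviation, which is one reason to report a measure of center and a measure of variation together.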

Data Types: Quantitative

- A quantitative variable is one for which the associated
  observations are numerical, so that the usual arithmetic
  manipulations make sense. Quantitative data are then the
  measured numerical values of a quantitative variable.
  1. Discrete: number of day-scholars, number of successful
     candidates in CSS, etc.
  2. Continuous: housing prices, rainfall, heights, etc.


Data Types: Categorical/Qualitative

- A categorical variable is one for which the associated
  observations are simply listings of physical characteristics or
  traits of the subjects or objects being studied. For example,
  eye color is a categorical variable, with categories brown, blue,
  green, etc. Categorical data then correspond to the observed
  sample counts in each of the possible categories.
  1. Nominal: having the disease vs. not having it
  2. Ordinal: pain on a Likert scale (0-10; no pain to
     excruciating pain), etc.


Data Presentation

The goal of statistics is to help researchers organize and interpret
the data. Data can be described and presented in many different
formats:
- Graphic presentation
- Numeric presentation

Graphic Presentation
Categorical Data

- The main purpose of some studies is to see how a set of data
  is distributed across a small set of categories or classes.
- If each observation falls into exactly one of the classes, we say
  that the classes partition the data collection.
- For example, the classes urban, suburban, and rural partition
  new housing construction, and the categories cats, dogs, and
  ’other’ partition domestic animals.
- The classes or categories in a partition are exhaustive and
  exclusive: they include every possible observation, and they
  do not overlap.

Contingency Tables

A contingency table is a tabular arrangement used to present
categorical data.

Degrees earned in foreign languages in 1992

          BS     MS    PhD   Total
M       3990    971    378    5339
F       9913   1955    472   12340
Total  13903   2926    850   17679

- What percent of females earned doctoral degrees?
- What percent of MS degree holders were men?
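
The two questions ask for a row percentage and a column percentage, respectively. A minimal sketch of both calculations, using the counts from the degrees table above; the function names `row_percent` and `col_percent` are hypothetical helpers introduced only for this example.

```python
# Counts copied from the degrees table: rows are sex, columns are degree level.
degrees = {
    "M": {"BS": 3990, "MS": 971, "PhD": 378},
    "F": {"BS": 9913, "MS": 1955, "PhD": 472},
}

def row_percent(table, row, col):
    """Percent of the given row total that falls in the given column."""
    return 100 * table[row][col] / sum(table[row].values())

def col_percent(table, row, col):
    """Percent of the given column total that falls in the given row."""
    col_total = sum(r[col] for r in table.values())
    return 100 * table[row][col] / col_total

pct_f_phd = row_percent(degrees, "F", "PhD")   # females who earned a PhD
pct_ms_men = col_percent(degrees, "M", "MS")   # MS holders who were men

print(f"{pct_f_phd:.1f}% of females earned a PhD")
print(f"{pct_ms_men:.1f}% of MS degrees went to men")
```

The first answer divides by the female row total (12340) and the second by the MS column total (2926); choosing the wrong total is the most common mistake when reading a contingency table.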

Gender × Treatment Type

         Controls    Treated       Total
M        42 (75%)    41 (73.2%)
F        14 (25%)    15 (26.8%)
Total

Bar graphs and pie charts can be used to summarize categorical
data visually.
- Bar graphs can be used to display either the actual frequency
  counts or the relative frequencies (sample percentages) in a
  sample.
- Pie charts, on the other hand, are designed solely to visualize
  relative frequencies (sample percentages). They are most
  commonly used when we wish to display the sample relative
  frequencies, or percentages, rather than the raw frequencies
  for the various classes of the partition. They are
  particularly effective for displaying differences between two
  populations with respect to the same categories.

Cars Data Set

                     mpg    cyl    disp     hp    drat   wt    qsec
Mazda RX4           21.00  6.00  160.00  110.00  3.90  2.62  16.4
Mazda RX4 Wag       21.00  6.00  160.00  110.00  3.90  2.88  17.0
Datsun 710          22.80  4.00  108.00   93.00  3.85  2.32  18.6
Hornet 4 Drive      21.40  6.00  258.00  110.00  3.08  3.21  19.4
Hornet Sportabout   18.70  8.00  360.00  175.00  3.15  3.44  17.0
...

Bar Chart

[Figure: bar chart "Car Distribution". Horizontal axis: Number of
Gears (3, 4, 5); vertical axis ticks from 0 to 14.]

Stacked Bar Chart

[Figure: stacked bar chart "Car Distribution by Gears and Engine
Shape". Horizontal axis: Number of Gears (3, 4, 5); vertical axis:
Counts (0 to 14); legend: s, v.]

Pie Chart

[Figure: pie chart "Car Distribution". 3 gears: 47%; 4 gears: 38%;
5 gears: 16%.]
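
The percentages in the pie chart are relative frequencies: each class count divided by the total number of observations. A minimal sketch follows. The gear counts used here (15, 12, and 5 cars) are an assumption taken from R's built-in `mtcars` data set, which the cars table appears to be drawn from; the slides themselves do not list the counts.

```python
# Assumed counts of cars by number of gears (from R's mtcars; not stated in the slides).
gear_counts = {3: 15, 4: 12, 5: 5}

total = sum(gear_counts.values())

# Relative frequency of each class, expressed as a percentage of the total.
relative_freq = {gears: 100 * n / total for gears, n in gear_counts.items()}

for gears, pct in relative_freq.items():
    print(f"{gears} gears: {pct:.1f}%")
```

Rounded to whole numbers, these values give roughly 47%, 38%, and 16%, in line with the chart. The rounded figures add up to 101% only because of rounding.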

Graphic Presentation
Numeric Data

Dot Plot

A dot plot displays a set of quantitative data by grouping
observations that are equal. The horizontal axis is the scale of the
variable being measured, and a dot is placed above the value of
each observation. Repeated values are represented by stacking the
dots vertically above that value. This form of graphical display is
only useful if there are a limited number of distinct values among
the sample data.
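
The construction just described can be sketched as a text dot plot. This is only an illustration under stated assumptions: the function name `dot_plot` is hypothetical, the sample values are invented, and the code assumes non-negative integer observations so that every value gets its own column on the axis.

```python
from collections import Counter

def dot_plot(values):
    """Return the rows of a text dot plot, top row first, with the axis row last."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    axis = list(range(lo, hi + 1))          # integer scale, one column per value
    width = len(str(hi)) + 1                # column width so dots line up with labels
    rows = []
    # Build the plot from the tallest stack down to the first level of dots.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("*" if counts[v] >= level else "").rjust(width) for v in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in axis))
    return rows

# Hypothetical sample with repeated values; invented for illustration only.
sample = [4, 6, 6, 7, 7, 7, 9]
print("\n".join(dot_plot(sample)))
```

Each `*` is one observation, so the height of a stack is the frequency of that value.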

Dot Chart

[Figure: two dot charts, "Collection A" and "Collection B", each on
a horizontal axis running from 4 to 12.]

Stem and Leaf Plot

Another way to display quantitative data when the number of
observations is not too large is the stem plot (or stem-and-leaf
display).
- The stem usually corresponds to the first digit (or digits) of a
  number.
- The leaf represents the final digit.

- To produce the stem plot for a given choice of stem and leaf,
  the stems are listed in a column from the smallest (at the top)
  to the largest (at the bottom).
- Then the leaf for each observation is recorded to the right, in
  the row of the display containing the observation’s stem.
- For ease of interpretation, the leaves are also usually sorted
  from smallest to largest within a given stem.

Example

Present the following scores of 10 students in a stem-and-leaf
display:
3, 4, 5, 6, 7, 9, 10, 11, 12, 13
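
One way to work this example is sketched below. It lists the stems in a column from smallest to largest and sorts the leaves within each stem. The function name `stem_and_leaf` is hypothetical, and the code assumes non-negative integers with the tens digit (or digits) as the stem and the units digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf display, smallest stem first."""
    groups = defaultdict(list)
    for v in sorted(values):                # sorting keeps leaves ordered within a stem
        stem, leaf = divmod(v, 10)          # stem = leading digit(s), leaf = final digit
        groups[stem].append(leaf)
    rows = []
    # Include every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in groups.get(stem, []))
        rows.append(f"{stem} | {leaves}".rstrip())
    return rows

scores = [3, 4, 5, 6, 7, 9, 10, 11, 12, 13]
print("\n".join(stem_and_leaf(scores)))
# 0 | 3 4 5 6 7 9
# 1 | 0 1 2 3
```

Single-digit scores get the stem 0, so the six scores below 10 share the first row and the four scores from 10 to 13 share the second.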

Histogram
A histogram is a graph of numerical data in which the observations
are grouped into interval classes.
Step 1: Divide the range of the observed data values into a
        reasonable number of interval classes of equal width.
Step 2: Record the number of observations in each class,
        either as a straight count or as a percentage of the
        total number of observations in the data collection.
        This step creates either a frequency or a relative
        frequency table for the data collection and our
        particular choice of interval classes.
Step 3: Graphically display the histogram. The horizontal
        axis for this display corresponds to the units of
        measurement for our observations, divided into the
        interval classes specified in Step 1. Either frequency
        or relative frequency is plotted on the vertical axis.
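
The three steps can be sketched as follows. This is an illustration under stated assumptions: the function name `frequency_table` is hypothetical, the weights are invented values, and each class is treated as including its lower limit but not its upper limit, a convention the slides do not specify.

```python
def frequency_table(values, start, width, n_classes):
    """Count observations in n_classes equal-width classes [lower, lower + width)."""
    counts = [0] * n_classes                # Step 1: the classes are fixed by start and width
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:          # ignore values outside the chosen range
            counts[index] += 1
    total = len(values)
    table = []
    for i, count in enumerate(counts):      # Step 2: frequency and relative frequency
        lower = start + i * width
        table.append((lower, lower + width, count, 100 * count / total))
    return table

# Hypothetical weights in kg; invented for illustration only.
weights = [41.0, 47.5, 52.3, 53.8, 54.1, 56.0, 58.9, 61.2, 66.4, 52.0]

for lower, upper, count, pct in frequency_table(weights, start=40, width=5, n_classes=7):
    # Step 3 as a text histogram: one '#' per observation in the class.
    print(f"{lower:>3}-{upper:<3} {'#' * count:<5} {count:>2} ({pct:.0f}%)")
```

Whichever boundary convention is chosen, it should be applied consistently so that every observation falls into exactly one class.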

[Figure: histogram "Distribution of Female Weights". Horizontal
axis: Weight in Kg (40 to 75); vertical axis: Frequency (0 to 60).]
Numeric Presentation

Frequency Distribution

- A frequency distribution is a tabular arrangement of data that
  gives the number of events, individuals, or objects in each
  separate category.
- A cumulative frequency distribution gives the running total of
  the number of events, individuals, or objects across the
  successive categories of the histogram. The final cumulative
  count equals the total number of observations, so the
  cumulative relative frequency always ends at 100%.

Example
Sam’s team has scored the following numbers of goals in recent
games:

2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3

Put the numbers in order:

1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5

Score  Frequency
1      2
2      5
3      4
4      2
5      1
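
The same frequency table, extended with cumulative frequencies, can be computed as in the sketch below. It uses the goals listed in the example; the variable names are illustrative only.

```python
from collections import Counter
from itertools import accumulate

goals = [2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3]

frequency = Counter(goals)                  # how often each score occurs
scores = sorted(frequency)                  # distinct scores in increasing order
counts = [frequency[s] for s in scores]
cumulative = list(accumulate(counts))       # running totals of the counts
cumulative_pct = [100 * c / len(goals) for c in cumulative]

print("Score  Frequency  Cumulative  Cumulative %")
for s, f, c, p in zip(scores, counts, cumulative, cumulative_pct):
    print(f"{s:<6} {f:<10} {c:<11} {p:.1f}")
```

The last cumulative count is 14, the number of games, so the last cumulative percentage is 100.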
https://www.mathsisfun.com/data/frequency-distribution.html

Example: Grouped Frequency Distribution


These are the numbers of newspapers sold at a local shop over the
last 10 days:

22, 20, 18, 23, 20, 25, 22, 20, 18, 20

Papers Sold  Frequency
18           2
19           0
20           4
21           0
22           2
23           1
24           0
25           1

Grouping the values into classes of width 5 gives the grouped
frequency distribution:

Papers Sold  Frequency
15-19        2
20-24        7
25-29        1

https://www.mathsisfun.com/data/frequency-distribution-grouped.html
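
A minimal sketch of the grouping step, using the newspaper sales from this example. The class width of 5 and lower limits at multiples of 5 are read off the classes 15-19, 20-24, and 25-29; a different choice of classes would need a different mapping.

```python
from collections import Counter

papers_sold = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]
width = 5  # class width matching the groups 15-19, 20-24, 25-29

# Map each value to the lower limit of its class, then count per class.
grouped = Counter((v // width) * width for v in papers_sold)

for lower in sorted(grouped):
    print(f"{lower}-{lower + width - 1}: {grouped[lower]}")
# 15-19: 2
# 20-24: 7
# 25-29: 1
```

Grouping trades detail for compactness: the table no longer shows that 20 was the most common daily sale, only that 7 of the 10 days fell in the 20-24 class.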

Frequency Distribution of Weights

Weight (kg)  Frequency
40-45        4
45-50        24
50-55        77
55-60        61
60-65        26
65-70        7
70-75        1
