
Introduction to statistics

Noorazlinda Yacob
MSc (Medical Statistic), USM;
BPharm (Hons), UKM
INTRODUCTION TO
STATISTICS

What Is Statistics?

Statistics: The science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.


Need for Statistics

Why study Statistics?

• May be called to conduct research in your field.
• To become better consumers and citizens.

TYPES OF
STATISTICS

Descriptive and Inferential Statistics

1 Descriptive statistics consists of the collection, organization, summarization, and presentation of data.

2 Inferential statistics consists of
• generalizing from samples to populations,
• performing estimations and hypothesis tests,
• determining relationships among variables,
• making predictions.

Classification of variables

Variables
  Qualitative/Categorical: Nominal, Ordinal
  Quantitative/Numerical: Discrete, Continuous

Qualitative/
Categorical

Nominal
• Unordered categories
• No implied order among the categories
• Eg: Race, Sex, Medical diagnosis

Ordinal
• Ordered categories
• Ranked according to some criteria
• Eg: BP (low, normal, high); Socioeconomic status (low, middle, high)

Quantitative/
Numerical

Discrete
• Values that can be assumed only by whole numbers (gaps between values)
• Eg:
  - number of students
  - no. of teeth extracted
  - no. of accidents

Continuous
• Can assume any numerical value over a certain interval (with decimal values)
• Eg:
  - Height
  - Weight
  - Age

Answer Exercise 1

Question                                        Variable         Measurement scale
1. What is the blood pressure level for this    Blood pressure   Ordinal
   patient? (Low / Normal / High)
2. What is the height of this patient?          Height           Continuous
   _______ cm
DESCRIPTIVE
STATISTICS

Descriptive statistics
• Simply describing the data
• Used to present quantitative descriptions in a manageable form
• Reduces lots of data into a simpler summary

Organizing & Displaying data

Variables
  Qualitative/Categorical (Nominal, Ordinal): Frequency (%)
  Quantitative/Numerical (Discrete, Continuous): Measure of centrality and measure of dispersion
    - Mean (SD)
    - Median (IQR)

Categorical variables
— Statistics
- Frequency
- Percentage (%)

— Graphical
- Pie chart
- Bar chart
Pie chart
— Graphical presentation of frequency
distribution of categorical data (usually
nominal).
— The circle represents 360°; slices start at 12 o'clock.
— Each slice represents one category.
— The size of a slice represents the frequency or percent.

[Figure: pie chart. Stone location among 111 cases in HKB, 2003-04: distal 55.9%, proximal 41.4%, both 2.7%]

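The relationship between slice size and frequency can be checked with a little arithmetic: each slice's angle is its share of the total times 360°. The following is a minimal Python sketch using the stone location counts from the frequency table in this deck; the variable names are illustrative only.

```python
# Stone location counts among the 111 cases (from the frequency table)
counts = {"proximal": 46, "distal": 62, "both": 3}
total = sum(counts.values())

# For each category, compute its percent of the total and its slice angle
slices = {}
for location, count in counts.items():
    percent = 100 * count / total
    angle = 360 * count / total  # degrees of the circle taken by this slice
    slices[location] = (percent, angle)

for location, (percent, angle) in slices.items():
    print(f"{location:<9} {percent:5.1f}%  {angle:6.1f} degrees")
```

The percentages agree with the labels on the chart, and the angles add up to the full circle.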
Bar graph or chart
— Graphical presentation of frequency distribution of categorical data (nominal or ordinal).
— Height of each bar represents the frequency or percent.
— Bars are of equal width and separated by equal gaps.
— Y axis: frequency or relative frequency.
— X axis: categorical variable.

[Figure 1: Gender distribution among 111 renal stone patients. Bar chart of frequency (Y axis) by sex (X axis: male, female)]

Frequency Tables
— Tables organize data into values and categories, with a title and caption.
— Title: which variables? when? where? sample size (n)?

— A frequency table may include:


— Categories - should be listed in some natural
order
— Frequency
— Cumulative Frequency
— Relative Frequency
— Proportion/Percent

Constructing a frequency
distribution table
— A survey was taken in Melaka. In each
of 20 homes, people were asked how
many cars were registered to their
households. The results were recorded
as follows:
1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2,
2, 3, 2, 3, 2, 1, 4, 0, 0
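
One way to tabulate these responses is sketched below in Python, using only the standard library. The function name `frequency_table` and the row layout are illustrative choices, not part of the original exercise.

```python
from collections import Counter

# Number of cars registered in each of the 20 surveyed households
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2,
        2, 3, 2, 3, 2, 1, 4, 0, 0]

def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):  # list categories in natural order
        cumulative += counts[value]
        rows.append((value, counts[value],
                     100 * counts[value] / n,
                     100 * cumulative / n))
    return rows

table = frequency_table(cars)
print("Cars  Frequency  Percent  Cumulative %")
for value, freq, pct, cum_pct in table:
    print(f"{value:>4}  {freq:>9}  {pct:>7.1f}  {cum_pct:>12.1f}")
```

Each row gives the category, its frequency, its relative frequency as a percent, and the cumulative percent, which are the columns a frequency table may include.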
Examples of Frequency Table_1
(SPSS output)-without missing
values
Gender distribution in a sample of 111 patients
                                                Cumulative
               Frequency Percent Valid Percent  Percent
Valid male          40    36.0      36.0         36.0
      female        71    64.0      64.0        100.0
      Total        111   100.0     100.0

stoneLocation

Cumulative
Frequency Percent Valid Percent Percent
Valid proximal 46 41.4 41.4 41.4
distal 62 55.9 55.9 97.3
both 3 2.7 2.7 100.0
Total 111 100.0 100.0
Examples of Frequency Table_1
(SPSS output)-with missing values

Examples of Frequency Table_2
(SPSS output)
Continuous data (age) were grouped and converted into ordinal data (age group).

age group

                                                   Cumulative
                   Frequency Percent Valid Percent Percent
Valid 20 and below      4      3.6       3.6         3.6
      21 - 30           6      5.4       5.4         9.0
      31 - 40          18     16.2      16.2        25.2
      41 - 50          30     27.0      27.0        52.3
      51 - 60          24     21.6      21.6        73.9
      61 - 70          17     15.3      15.3        89.2
      71 and above     12     10.8      10.8       100.0
      Total           111    100.0     100.0

Numerical variables
— Measures of central tendency
— Mean, Median, Mode

— Measures of dispersion/variability
— Variance
— Standard deviation
— Max, min, range, interquartile range
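
As a rough illustration of these measures, the sketch below computes them in Python for the car-ownership survey data used in the frequency table exercise. It relies on the standard `statistics` module; note that quartile conventions differ between packages, so the interquartile range from `statistics.quantiles` may not match what SPSS reports for the same data.

```python
import statistics

# Number of cars registered in each of the 20 surveyed households
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2,
        2, 3, 2, 3, 2, 1, 4, 0, 0]

# Measures of central tendency
mean = statistics.mean(cars)
median = statistics.median(cars)
mode = statistics.mode(cars)

# Measures of dispersion
variance = statistics.variance(cars)  # sample variance (divides by n - 1)
sd = statistics.stdev(cars)           # sample standard deviation
minimum, maximum = min(cars), max(cars)
data_range = maximum - minimum
q1, _, q3 = statistics.quantiles(cars, n=4)  # quartiles (default method)
iqr = q3 - q1

print(f"Mean {mean:.2f}, median {median}, mode {mode}")
print(f"Variance {variance:.2f}, SD {sd:.2f}")
print(f"Min {minimum}, max {maximum}, range {data_range}, IQR {iqr}")
```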
General rule
— FOR SYMMETRIC DATA, QUOTE THE MEAN AND STANDARD DEVIATION.
— FOR SKEWED DATA, QUOTE THE MEDIAN AND INTERQUARTILE RANGE.

Normality testing
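
The slide does not say which normality check to use. One common rule of thumb, assumed here purely for illustration, is to divide the skewness by its standard error and treat the data as roughly symmetric when the ratio lies within ±1.96. The function name `choose_summary` and the cutoff are assumptions; the example values are the skewness and standard error from the SPSS age output in this deck.

```python
def choose_summary(skewness, se_skewness, z_cutoff=1.96):
    """Suggest which summary to quote, using an assumed rule of thumb."""
    z = skewness / se_skewness  # skewness relative to its standard error
    if abs(z) < z_cutoff:
        return "mean (SD)"      # roughly symmetric
    return "median (IQR)"       # noticeably skewed

# Age among the 111 patients: skewness -0.139, standard error 0.229
print(choose_summary(-0.139, 0.229))
```

In practice this ratio would be read alongside a histogram or box plot rather than used on its own.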
DATA PRESENTATION
—Statistics
- Mean (SD)
- Median (IQR)

—Graphical
- Histogram
- Box Plot
Statistics Summary (SPSS)
Descriptive Statistics of Age among 111 patients, HKB 2004
                                          Statistic  Std. Error
AGE Mean                                    50.96      1.42
    95% Confidence       Lower Bound        48.15
    Interval for Mean    Upper Bound        53.78
    5% Trimmed Mean                         51.18
    Median                                  50.00
    Variance                               224.562
    Std. Deviation                          14.99
    Minimum                                 13
    Maximum                                 80
    Range                                   67
    Interquartile Range                     21.00
    Skewness                                -.139      .229
    Kurtosis                                -.114      .455
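
Several figures in this output follow from the others, which is a useful sanity check when reading SPSS tables. The sketch below, in Python, recomputes the standard error of the mean, the 95% confidence interval, the standard deviation, and the range from the reported values. The critical value `t_crit = 1.982` is an assumed approximation of the two-sided 95% point of the t distribution with 110 degrees of freedom; it is not shown in the output.

```python
import math

# Values reported in the SPSS output for age
n = 111
mean = 50.96
sd = 14.99
variance = 224.562
minimum, maximum = 13, 80

t_crit = 1.982  # assumed: approximate t critical value, 95% CI, 110 df

se = sd / math.sqrt(n)             # standard error of the mean
lower = mean - t_crit * se         # lower bound of the 95% CI
upper = mean + t_crit * se         # upper bound of the 95% CI
sd_from_variance = math.sqrt(variance)
data_range = maximum - minimum

print(f"SE {se:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print(f"SD from variance {sd_from_variance:.2f}, range {data_range}")
```

Small differences in the last decimal place against the table come from working with rounded inputs.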

Descriptive Statistics: presentation

Characteristics of renal stone patients, Hospital Melaka, 2003-2013 (n = 111)

                    Mean (SD)       Median (IQR)    No. (%)
Age (years)         50.9 (14.99)
Stone size (cm)                     50.0 (21)
Stone location
  Proximal                                          46 (41.4)
  Distal                                            62 (55.9)
  Both                                              3 (2.7)

INFERENTIAL
STATISTICS

Inferential statistics
• Reach a conclusion about a population on the basis of information obtained from a sample drawn from that population

Inferential statistics
POPULATION → SAMPLE → Statistical conclusion → Infer back to population
Dependent versus Independent variable

Variable               Definition
Dependent variable     Outcome of interest
Independent variable   Risk factor, Exposure

— Example
— How much will 1 g of salt change blood pressure (in mmHg) in the Melaka population?

      X                     Y
      Predictor variable    Outcome variable
      Amount of salt        Blood pressure
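
A question of this form is commonly answered with the slope of a simple linear regression of Y on X, although the slides do not name a method. The Python sketch below fits a least-squares line by hand. The five data points are hypothetical, invented only to show the calculation; they are not from any study, and the function name `least_squares` is illustrative.

```python
# HYPOTHETICAL illustration data, not measurements from the Melaka population
salt_g = [2, 4, 6, 8, 10]              # X: amount of salt (g)
bp_mmhg = [112, 116, 121, 124, 129]    # Y: blood pressure (mmHg)

def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = least_squares(salt_g, bp_mmhg)
print(f"Each extra 1 g of salt changes BP by about {slope:.1f} mmHg")
```

The slope is the estimated change in the outcome variable for a one-unit change in the predictor, which is the quantity the example question asks about.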