
# Introduction to biostatistics
Lecture plan

1. Basics
2. Variable types
3. Descriptive statistics:
   - Categorical data
   - Numerical data
4. Inferential statistics:
   - Confidence intervals
   - Hypothesis testing

DEFINITIONS
STATISTICS can mean 2 things:
- the numbers we get when we measure and count things (data)
- a collection of procedures for describing and analysing data.
BIOSTATISTICS: the application of statistics in the natural sciences, when biomedical problems are analysed.

## Basic parts of statistics:

Descriptive
Inferential

Terminology

Population
Sample

Variables

Variable types

Categorical (qualitative)

Numerical (quantitative)

Combined

Categorical data
Nominal

2 categories
>2 categories

Ordinal

Numerical data

Continuous
Discrete

Description of categorical data
Arranging data
Frequencies, tables
Visualization (graphical presentation)

Frequencies and contingency tables

| | Total | Males | Females |
|---|---|---|---|
| Satisfied | 40 (80%) | 14 (77.8%) | 26 (81.3%) |
| Unsatisfied | 10 (20%) | 4 (22.2%) | 6 (18.7%) |
| Total | 50 (100%) | 18 (100%) | 32 (100%) |

Of those who were unsatisfied, 4 were males and 6 were females.
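The percentages in a contingency table like this one are each cell divided by its column total. A minimal sketch in Python, using the counts from the satisfaction table; the function name `column_percentages` and the dictionary layout are illustrative assumptions, not part of the lecture:

```python
# Counts from the satisfaction table (rows: answer, columns: sex).
counts = {
    "Satisfied": {"Males": 14, "Females": 26},
    "Unsatisfied": {"Males": 4, "Females": 6},
}


def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    columns = list(next(iter(table.values())).keys())
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        label: {c: 100 * row[c] / totals[c] for c in columns}
        for label, row in table.items()
    }


percentages = column_percentages(counts)
for label, row in percentages.items():
    print(label, {c: round(v, 1) for c, v in row.items()})
```

Column percentages answer "what share of males (or females) was satisfied"; dividing by row totals instead would answer a different question.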

Graphical presentation
Other:
- Maps
- Chernoff faces
- Star plots, etc.

Description of numerical data

Arranging data
Frequencies (relative and cumulative), graphical presentation
Measures of central tendency and dispersion
Assessing normality

Grouping

Sorting data
Groups (5-17 groups) according to the researcher's criteria.

## Frequencies, their comparison and calculation

197 students were asked the amount of money (litas) they had in cash at the …
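Relative and cumulative frequencies for grouped data can be calculated as in the sketch below. The group limits and counts are hypothetical values invented for illustration; they are not the students' data from the slide.

```python
from itertools import accumulate

# Hypothetical grouped data: (interval label, number of observations).
# Invented for illustration only.
groups = [("0-9", 4), ("10-19", 8), ("20-29", 5), ("30-39", 3)]

n = sum(count for _, count in groups)
relative = [100 * count / n for _, count in groups]  # relative frequency, %
cumulative = list(accumulate(relative))              # cumulative frequency, %

for (label, count), rel, cum in zip(groups, relative, cumulative):
    print(f"{label:>6} {count:>3} {rel:6.1f}% {cum:6.1f}%")
```

The last cumulative frequency should always reach 100%, which is a quick check that no group was left out.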

Graphical presentation of frequencies

Normal distributions
Most observations lie around the center
Fewer observations lie above and below the central values, in approximately the same proportions
Most often a Gaussian distribution

Asymmetrical distribution
More observations fall in one part of the range.

## How would you describe/present your respondents if the data are numeric?

2 groups of measures:
1. Central tendency (central value, average)
2. Dispersion (variability)

MEASURES OF CENTRAL TENDENCY

Means/averages (arithmetic, geometric, harmonic, etc.)
Mode
Median
Quartiles

## Arithmetic mean ($\bar{X}$, $\mu$)

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

## Median (Me)

The middle value, or 50th percentile: the value of the observation that divides the sorted data into two almost equal parts.
It is found this way:

When n is odd: the median is the middle observation
When n is even: the median is the average of the values of the two middle observations
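The odd/even rule for the median can be written out directly. This is a minimal sketch; the function name `median` and the sample values are illustrative assumptions.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    middle = n // 2
    if n % 2 == 1:
        # n odd: the single middle observation
        return ordered[middle]
    # n even: average of the two middle observations
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_result = median([7, 1, 5])      # sorted: 1, 5, 7
even_result = median([7, 1, 5, 3])  # sorted: 1, 3, 5, 7
print(odd_result, even_result)
```

Python's standard library offers the same calculation as `statistics.median`.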

## Mode (Mo)

The most frequently occurring value or values
There can be more than one mode
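A short sketch of data with one mode and with two modes, using `statistics.multimode` from the Python standard library (Python 3.8 or later). Both data sets are hypothetical.

```python
import statistics

# Hypothetical data, invented for illustration.
one_mode = [1, 2, 2, 3]
two_modes = [1, 2, 2, 3, 3]

# multimode returns every value that shares the highest frequency.
modes_a = statistics.multimode(one_mode)
modes_b = statistics.multimode(two_modes)
print(modes_a, modes_b)
```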

## Quartiles (Q1, Q2, Q3, Q4)

The sorted sample is divided into 4 equal parts, with 25% of the observations in each of them.
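Quartile cut points can be obtained with `statistics.quantiles` from the Python standard library, as in this sketch. The data are hypothetical, and the function's default interpolation method is only one of several conventions, so other software may give slightly different values for the same data.

```python
import statistics

# Hypothetical sample, invented for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 asks for the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q1, q2, q3)
```

The second cut point is the median, so it can be checked against `statistics.median(data)`.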

Is a measure of central tendency enough to describe the respondents?

MEASURES OF DISPERSION
Min and max
Range
Standard deviation (SD): the square root of the variance
Variance: $V = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Interquartile range (IQRT): Q3 - Q1, or 75th percentile - 25th percentile
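The dispersion measures listed here can be computed step by step, as sketched below. The data set is hypothetical, and the quartiles rely on the default method of `statistics.quantiles`, which is an assumption about which quartile convention is wanted.

```python
import math
import statistics

# Hypothetical sample, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n
value_range = max(data) - min(data)

# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in data) / (n - 1)
sd = math.sqrt(variance)  # standard deviation is the square root of the variance

q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # interquartile range, Q3 - Q1

print(min(data), max(data), value_range, round(variance, 2), round(sd, 2), iqr)
```

The hand-written variance and SD agree with `statistics.variance` and `statistics.stdev`, which also divide by n - 1.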

## What measures are to be used for sample description?

If the distribution is NORMAL

Mean
Variance (or standard deviation)

If the distribution is NOT NORMAL

Median
IQRT or min/max

The median and IQRT (or min/max) are also used with numeric ordinal data

$\bar{X}$, Mo, Me
Mean ≈ Median ≈ Mode,
SD and the empirical rule

EMPIRICAL RULE

Percentage of observations within 1, 2 and 2.5 SD of the mean if the distribution is normal

Example

$\bar{X} = 8$
SD = 2.5

-2SD: 8 - 2 × 2.5 = 3
+2SD: 8 + 2 × 2.5 = 13
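The share of a normal distribution lying within 1, 2 and 2.5 SD of the mean can be computed with `statistics.NormalDist` from the Python standard library. The sketch below applies it to the example values (mean 8, SD 2.5); the variable names are illustrative assumptions.

```python
from statistics import NormalDist

mean, sd = 8.0, 2.5  # values from the example

standard_normal = NormalDist()  # mean 0, SD 1
coverage = {}   # k -> share of observations within k SD of the mean
intervals = {}  # k -> (mean - k*SD, mean + k*SD)

for k in (1, 2, 2.5):
    coverage[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    intervals[k] = (mean - k * sd, mean + k * sd)
    lower, upper = intervals[k]
    print(f"±{k} SD: {lower} to {upper}, about {100 * coverage[k]:.1f}% of observations")
```

If the observed percentages in a sample are far from these values, that is one informal sign that the data are not normally distributed.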

Normality assessment
Summary

Graphical
Comparison of measures of central tendency; empirical rule (mean and standard deviation)
Skewness and kurtosis (= 0 if Gaussian)
Kolmogorov-Smirnov test
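Skewness and kurtosis can be sketched with simple moment formulas. This is one possible definition (unadjusted moments, with 3 subtracted from kurtosis so that a Gaussian gives 0); statistical packages often apply small-sample corrections, so their values may differ. The function name and both data sets are illustrative assumptions.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both about 0 for Gaussian data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


# Hypothetical data, invented for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

skew_symmetric, _ = skewness_and_excess_kurtosis(symmetric)
skew_right, _ = skewness_and_excess_kurtosis(right_skewed)
print(skew_symmetric, skew_right)
```

A positive skewness indicates a longer tail towards large values, a negative one a longer tail towards small values.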

Boxplot
Elements shown: 75th percentile, mean (*), median, 25th percentile, outliers

Boxplot example
[Figure: boxplot, vertical axis from 14.00 to 26.00]
## Central limit theorem

Inferential statistics

Confidence intervals
Hypothesis testing

Confidence intervals
An interval in which the true value most likely lies.

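## The variability of samples and their measures

Sample 1: $\bar{X}_1$, $SD_1$; $p_1$
Sample 2: $\bar{X}_2$, $SD_2$; $p_2$
Sample 3: $\bar{X}_3$, $SD_3$; $p_3$
Sample 4: $\bar{X}_4$, $SD_4$; $p_4$

Population: $\mu$, $\sigma$; $p_0$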

## The variability of samples and confidence intervals

$\mu$, $p_0$
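The way sample means scatter around the population mean can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the population is taken to be normal with mean 8 and SD 2.5 (borrowed from the earlier example), and the sample size, number of samples and random seed are arbitrary.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population parameters.
population_mean, population_sd = 8.0, 2.5
sample_size, number_of_samples = 50, 1000

sample_means = []
for _ in range(number_of_samples):
    sample = [rng.gauss(population_mean, population_sd) for _ in range(sample_size)]
    sample_means.append(statistics.mean(sample))

# The spread of the sample means should be close to SD / sqrt(n).
observed_spread = statistics.stdev(sample_means)
expected_se = population_sd / sample_size ** 0.5
print(round(observed_spread, 3), round(expected_se, 3))
```

Each sample gives a slightly different mean; the standard error describes how much those means vary, and it is what a confidence interval is built from.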

Confidence interval
Statistical definition:
If the study were carried out 100 times, giving 100 results and 100 CIs, the true value would lie inside the interval in about 95 of the 100 cases and outside it in about 5.

Confidence intervals (general, most common calculation)

95% CI: $\bar{X} \pm 1.96 \cdot SE$

$X_{min}$; $X_{max}$

Note: for a normal distribution, when n is large

95% CI: $p \pm 1.96 \cdot SE$

$p_{min}$; $p_{max}$

Numeric data ($\bar{X}$): $SE = \frac{SD}{\sqrt{N}}$

Categorical data (p): $SE = \sqrt{\frac{p(1 - p)}{N}}$
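A minimal sketch of the 95% CI calculation for a mean and for a proportion, using the 1.96 multiplier and the two standard error formulas. The function names are illustrative; the mean and SD come from the earlier example, the proportion is the 40 satisfied respondents out of 50 from the earlier table, and the sample size of 100 for the mean is a hypothetical value.

```python
import math

Z_95 = 1.96  # multiplier for a 95% CI under the normal approximation


def ci_mean(mean, sd, n, z=Z_95):
    """CI for a mean: mean ± z * SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - z * se, mean + z * se


def ci_proportion(p, n, z=Z_95):
    """CI for a proportion: p ± z * sqrt(p(1 - p) / n)."""
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


# Mean 8, SD 2.5; n = 100 is a hypothetical sample size.
mean_low, mean_high = ci_mean(8.0, 2.5, 100)
# Proportion satisfied: 40 of 50.
prop_low, prop_high = ci_proportion(40 / 50, 50)

print(round(mean_low, 2), round(mean_high, 2))
print(round(prop_low, 3), round(prop_high, 3))
```

These formulas rely on the normal approximation, so they are best suited to large samples.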

## Width of the confidence interval depends on:

a) Sample size;
b) Confidence level (guarantee: usually 95%, but any % can be chosen);
c) Dispersion.
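The three influences on CI width can be checked numerically. The sketch below assumes a normal-approximation CI for a mean; the function name `ci_width` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def ci_width(sd, n, confidence=0.95):
    """Width of a normal-approximation CI for a mean."""
    # Two-sided multiplier, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sd / math.sqrt(n)


baseline = ci_width(sd=2.5, n=100)
larger_sample = ci_width(sd=2.5, n=400)                       # a) larger n: narrower
higher_confidence = ci_width(sd=2.5, n=100, confidence=0.99)  # b) higher level: wider
more_dispersion = ci_width(sd=5.0, n=100)                     # c) larger SD: wider

print(round(baseline, 3), round(larger_sample, 3),
      round(higher_confidence, 3), round(more_dispersion, 3))
```

Because the width is proportional to 1 / sqrt(n), the sample size has to be quadrupled to halve the interval.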

Hypothesis testing
H0: $\mu_1 = \mu_2$; $p_1 = p_2$ (RR = 1, OR = 1, difference = 0)
HA: $\mu_1 \neq \mu_2$; $p_1 \neq p_2$ (two sided, one sided)

Hypothesis testing
Significance level (agreed 0.05).
Test for the P value (t-test, $\chi^2$, etc.).
The P value is the probability of getting a difference (association) at least as large as the one observed if the null hypothesis is true.
OR: the P value is the probability of getting such a difference (association) due to chance alone, when the null hypothesis is true.
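As an illustration of how a P value is obtained, the sketch below runs a two-sided z test for two proportions on the earlier satisfaction table (14 of 18 males and 26 of 32 females satisfied). The choice of test and the function name are assumptions made for illustration; with counts this small an exact test such as the Fisher exact test would usually be preferred.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Probability of a |z| at least this large if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Satisfied males (14 of 18) versus satisfied females (26 of 32).
z, p_value = two_proportion_z_test(14, 18, 26, 32)
print(round(z, 3), round(p_value, 3))
```

The resulting P value is well above 0.05, so in this example the null hypothesis of equal proportions would not be rejected.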

Statistical conventions

If P < 0.05, we say that the results can't be explained by chance alone, therefore we reject H0 and accept HA.

If P ≥ 0.05, we say that the difference found can be due to chance alone, therefore we don't reject H0.

Tests
The choice of test depends on:

Study design,
Variable type,
Distribution,
Number of groups, etc.

## Tests (probability distributions):

z test
t test (one sample, two independent samples, paired)
$\chi^2$ (+ trend)
F test
Fisher exact test
Mann-Whitney
Wilcoxon and others.

Inferential statistics
Summary

The P value tells whether there is a statistically significant difference (association).

The CI gives the interval where the true value can be.

Neither the P value nor the CI addresses other explanations of the result (bias and confounding).

Neither the P value nor the CI tells anything about the biological, clinical or public health meaning of the results.