Pertemuan 08 PDF

Basic Biostatistics
By. Oczhinvia Dwitasari, M.Si.

Population vs. Sample
Basic Biostatistics (C) Jamalludin Ab Rahman 2015 28 December 2017

Parameter vs. Statistics
Parameter: characteristic of the whole population
Statistics: characteristic of a sample, presumably measurable.

Statistics estimate parameters
• Representative
• Sampling error
• Different samples yield different estimates
• Statistics ≈ Parameter if sampling is done properly
• How to prove?

Sample 1: N = 6, Red = 3/6 = 50%
Sample 2: N = 6, Red = 2/6 = 33.3%
Sample 3: N = 6, Red = 4/6 = 66.7%
Average % Red = (50% + 33.3% + 66.7%) / 3 = 50% (Statistics)
Population: N = 35, Red = 18/35 = 51.4% (Parameter)
50% ≈ 51.4% (Statistics ≈ Parameter)

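The red-ball example can be checked by simulation. The sketch below is illustrative only: it assumes a population of 35 items with 18 red (as on the slide), draws many random samples of 6 without replacement, and compares the average sample proportion with the population proportion. The names `POPULATION`, `sample_proportion_red` and `mean_of_sample_proportions`, the seed and the number of samples are assumptions, not part of the original slides.

```python
import random

# Hypothetical population mirroring the slide: 35 items, 18 of them red.
POPULATION = ["red"] * 18 + ["other"] * 17

def sample_proportion_red(rng, n=6):
    """Draw n items without replacement and return the fraction that is red."""
    sample = rng.sample(POPULATION, n)
    return sample.count("red") / n

def mean_of_sample_proportions(n_samples=10_000, n=6, seed=1):
    """Average the sample proportion (the statistic) over many random samples."""
    rng = random.Random(seed)
    total = sum(sample_proportion_red(rng, n) for _ in range(n_samples))
    return total / n_samples

parameter = POPULATION.count("red") / len(POPULATION)  # 18/35
estimate = mean_of_sample_proportions()
print(f"parameter = {parameter:.3f}, mean of statistics = {estimate:.3f}")
```

Individual samples of 6 vary widely, but their average settles close to the parameter, which is the point the slide makes with three samples.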
Variable & its role
• A named quantity whose associated value may change
• Roles: Independent, Dependent

Causation
• Relation of events (cause and effect)
• But correlation (between two events) does not (always) imply causation
• Rooster's crow does not cause the sun to rise
• Switch does not cause the bulb to light

Hill’s criteria
1. Strength of association
2. Consistency
3. Specificity
4. Temporality
5. Biological gradient
6. Plausibility
7. Coherence
8. Experiment
9. Analogy
Hill AB. The environment and disease: Association or causation? Proceedings of the Royal Society of Medicine. 1965;58:295–300.

Time & causation
[Figure: several Exposures preceding the Outcome along a Time axis]

Time & causation (example)
[Figure: Age, Exposure to silica and Smoking preceding Lung Cancer along a Time axis]

Causal web
• Web of causation
• Conceptual framework
• Path analysis/web
• Relationship between variables
• Cause and effect

[Figure: roles of a third variable in the Exposure–Outcome relationship: Mediator, Confounder, Effect modifier or Moderator]
Source: http://www.apa.org/science/about/psa/2008/06/ahn.aspx

Type of data (Level of measurement)
Categorical
  Nominal: e.g. Gender, Race
  Ordinal: e.g. Cancer staging, Severity of CXR for PTB
Numerical
  Discrete: e.g. Parity, Gravida
  Continuous: e.g. Hb, RBS, cholesterol

Distribution (shape) of data
• Applicable to numerical values
• Discrete or Continuous
• Discrete ~ Binomial, Poisson, Negative Binomial, Hypergeometric, Multinomial etc.
• Continuous ~ Normal, t, chi-square, F etc.

Central limit theorem
“Given a distribution with a mean μ and variance σ², the sampling distribution of the mean approaches a normal distribution with a mean (μ) and a variance σ²/N as N, the sample size, increases” (David M. Lane)

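A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions: it draws repeated samples of size N = 30 from an exponential distribution with rate 1 (mean 1, variance 1, clearly skewed) and checks that the sample means have mean close to μ and variance close to σ²/N. The function name `sample_means`, the seed and the number of replicates are hypothetical choices.

```python
import random
import statistics

def sample_means(n, replicates=5000, seed=42):
    """Means of `replicates` samples of size n from an exponential distribution.

    The exponential distribution with rate 1 has mean 1 and variance 1
    and is strongly right-skewed, so it is clearly not normal.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]

means = sample_means(n=30)
print("mean of sample means    :", round(statistics.fmean(means), 3))
print("variance of sample means:", round(statistics.variance(means), 4))
print("sigma^2 / N             :", round(1 / 30, 4))
```

A histogram of `means` would look roughly bell-shaped even though the underlying data are skewed.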
Normal Distribution
$$f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

Why Normal?
- Because many biological & psychological variables are approximately normally distributed
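
The normal density on this slide can be evaluated directly. The sketch below writes the density with mean μ and standard deviation σ as a function and compares it with `statistics.NormalDist` from the Python standard library. The function name `normal_pdf` and the Hb-like values (mean 13, SD 1.5) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Compare against the standard library for a hypothetical Hb-like variable.
print(normal_pdf(14.0, mu=13.0, sigma=1.5))
print(NormalDist(mu=13.0, sigma=1.5).pdf(14.0))
```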
Characteristics
• Bell-shaped curve
• Unimodal
• Symmetrical

Test of Normality
• Anderson–Darling Test
• Corrected Kolmogorov–Smirnov Test (Lilliefors Test)
• Cramér–von Mises Criterion
• D'Agostino's K-squared Test
• Jarque–Bera Test
• Pearson's Chi-square Test
• Shapiro–Francia Test
• Shapiro–Wilk Test

Use Normality test with caution
• Small samples almost always pass a normality test. Normality tests have little power to tell whether or not a small sample of data comes from a Gaussian distribution.
• With large samples, minor deviations from normality may be flagged as statistically significant, even though small deviations from a normal distribution won’t affect the results of a t test or ANOVA.

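The first caution can be illustrated with one of the listed tests. The sketch below implements the Jarque–Bera statistic by hand, using only the standard library, with its asymptotic chi-square (2 degrees of freedom) P-value. The function name `jarque_bera` and the five-point sample are hypothetical; the asymptotic P-value is known to be poorly calibrated for small samples, which is part of the point.

```python
import math

def jarque_bera(data):
    """Return (JB statistic, asymptotic p-value) for a list of numbers.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), with S the sample skewness and K the
    sample kurtosis. Under normality JB is approximately chi-square with
    2 degrees of freedom, whose upper tail probability is exp(-JB / 2).
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, math.exp(-jb / 2)

# A tiny, flat (not bell-shaped) sample is still not rejected at 0.05.
jb, p = jarque_bera([-2, -1, 0, 1, 2])
print(f"JB = {jb:.3f}, p = {p:.3f}")
```

With only five observations the test has essentially no power, so a non-significant result says little about the shape of the underlying distribution.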
Why run a statistical test?
1. Measure magnitude of event
2. Determine presence of difference (or similarity)
3. Determine degree of difference
4. Determine the direction of changes (trend)
5. Predict changes (outcomes)

[Figure: three figures A, B and C of different heights]
• Is there any difference between A & B?
• Which one is taller, A or B?
• How big is the difference between A & B?
• Is C different from A & B?
• Is there any pattern now?
• If there were a D, could you predict how tall D would be?

Statistical analysis
(IV = independent variable, DV = dependent variable)
Descriptive
  Univariable (one variable, IV or DV)
    e.g. Describe socio-demographic characteristics: Age, Sex, Race etc.
    e.g. Prevalence of hypertension.
Analytical
  Bivariable (one IV, one DV)
    e.g. Compare demographic characteristics between two populations: compare age between male & female
    e.g. Distribution of gender by hypertension status
  Multivariable (more than one IV, one DV)
    e.g. How demographic characteristics (more than one factor) explain hypertension

Descriptive statistics
Descriptive Statistics
• Explain one variable at a time
• Method based on type of measure
Categorical
Frequency (Percentage)
Numerical
Central measures (e.g. mean, median) & Dispersion (e.g. variance, standard deviation, range, min-max, interquartile range)

How to describe a data
Categorical: Frequency (count) & Percentage
Numerical
  Normal: Mean (SD)
  Not Normal: Median (Range/IQR)

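The decision tree above can be sketched in code. This is a minimal illustration using the Python standard library: the function names `describe_categorical` and `describe_numerical`, the `normal` flag (supplied by the analyst, not tested here) and the example data are all assumptions. Quartiles use the default method of `statistics.quantiles`; other software may use slightly different quartile definitions.

```python
import statistics
from collections import Counter

def describe_categorical(values):
    """Frequency (count) and percentage for each category."""
    counts = Counter(values)
    total = len(values)
    return {k: (c, round(100 * c / total, 1)) for k, c in counts.items()}

def describe_numerical(values, normal):
    """Mean (SD) if treated as normal, otherwise median (IQR).

    `normal` is supplied by the analyst; this sketch does not test normality.
    """
    if normal:
        return {"mean": statistics.fmean(values), "sd": statistics.stdev(values)}
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {"median": statistics.median(values), "iqr": (q1, q3)}

print(describe_categorical(["M", "F", "F", "M", "F"]))
print(describe_numerical([1, 2, 3, 4, 5, 6, 7], normal=True))
print(describe_numerical([1, 2, 3, 4, 5, 6, 7], normal=False))
```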
Analytical statistics
Comparing difference
Which of the following shows a true difference between two populations?
[Figure: three panels A, B and C]

3 methods to compare values
1. P-value
2. Confidence interval
3. Effect size
P value
• The P-value is the probability of obtaining a result at least as extreme as the one observed if Ho is true; it indicates whether the data are ‘likely’ or ‘unlikely’ under Ho
• Taking 0.05 as the cut-off point (α), if P ≤ 0.05 the data are ‘unlikely’ under Ho, therefore reject Ho

Hypothesis testing
Test result      | Truth: Ho True   | Truth: Ho False
Do not reject Ho | Correct          | Type II error (β)
Reject Ho        | Type I error (α) | Correct
α is the probability of making a Type I error; Ho is rejected when the P-value is at or below α (based on frequentist inference)

One-tailed vs. two-tailed
• Is there a difference between Hb 14 g% vs. Hb 12 g% in male & female respectively?
Ho: HbM – HbF = 0
H1: HbM ≠ HbF (two-sided)
H2: HbM > HbF (right-sided)
H3: HbM < HbF (left-sided)
• Note: The direction of the test should be determined a priori

One-tailed vs. two-tailed
[Figure: rejection regions for Two-sided, Left-sided and Right-sided tests]

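The three alternatives lead to different P-values for the same test statistic. The sketch below assumes a z statistic (standard normal under Ho) and uses `statistics.NormalDist` from the standard library. The function name `z_test_p_values` and the value z = 1.8 are hypothetical.

```python
from statistics import NormalDist

def z_test_p_values(z):
    """P-values for a z statistic under the three alternative hypotheses."""
    phi = NormalDist().cdf(z)
    return {
        "two-sided": 2 * min(phi, 1 - phi),
        "right-sided": 1 - phi,  # H1: parameter > null value
        "left-sided": phi,       # H1: parameter < null value
    }

for name, p in z_test_p_values(1.8).items():
    print(f"{name:12s} p = {p:.4f}")
```

With z = 1.8 the right-sided test falls below 0.05 while the two-sided test does not, which is why the direction has to be fixed before looking at the data.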
P & Sample Size

The truth about P value
• Used as a measure of effectiveness (even by the US FDA)
• A small P means statistically significant, NOT clinically significant
• But, be careful when interpreting P value
• P is affected BOTH by effectiveness AND sample size
• P can be < 0.05 even though the effectiveness is marginal when the sample size is huge
• Comparing Ps between studies is only appropriate if the sampling method & sample size are the same

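The dependence of P on sample size can be shown with a fixed, small effect. The sketch below is a simplified illustration: it assumes two equal-sized groups, a true difference of 0.1 standard deviations, and a z-test with known SD. The function name `two_sided_p` and the chosen sample sizes are hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_p(effect, n_per_group):
    """Two-sided P for a difference of `effect` standard deviations
    between two groups of equal size, using a z-test (SD assumed known)."""
    z = effect * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

for n in (50, 500, 5000):
    print(f"n per group = {n:5d}  P = {two_sided_p(0.1, n):.4g}")
```

The effect is the same marginal 0.1 SD in every row; only the sample size changes, yet P moves from clearly non-significant to far below 0.05.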
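P < 0.05
• Why 5%?
• Cut-off point proposed by Sir Ronald A. Fisher (1925) to reject or not to reject a hypothesis
• If P < 0.05: a difference at least this large would occur less than 5% of the time if Ho were true, so rejecting Ho at this cut-off keeps the probability of a Type I error at 5%
• If P > 0.05: a difference at least this large would occur by chance more than 5% of the time if Ho were true, so it cannot be attributed to a TRUE difference
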
Hypothesis Testing using bivariable analysis
• Try to show that the Exposure causes the Disease, e.g. Smoking causing Lung Cancer
• Ho: No difference in the risk of Lung Cancer between smokers and non-smokers

            | Lung Cancer | No Lung Cancer
Smoking     | 20 (18.2%)  | 90 (81.8%)
Not Smoking | 5 (4.5%)    | 105 (95.5%)

The occurrence of lung cancer is significantly higher (18.2%) among smokers compared to non-smokers (4.5%) (χ² (df=1) = 10.15, P = 0.001, OR = 4.7 (CI95% 1.7 – 13.0))

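The reported figures can be reproduced from the 2×2 table. The sketch below assumes the Pearson chi-square without continuity correction and a Woolf (log-based) 95% confidence interval for the odds ratio; the slides do not state which methods were used, so these are assumptions, as is the function name `two_by_two`.

```python
import math

def two_by_two(a, b, c, d):
    """Chi-square (no continuity correction), P, odds ratio and 95% CI
    for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf's method
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return chi2, p, odds_ratio, (lower, upper)

# Rows: Smoking, Not Smoking. Columns: Lung Cancer, No Lung Cancer.
chi2, p, odds_ratio, ci = two_by_two(20, 90, 5, 105)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}, OR = {odds_ratio:.1f}, "
      f"95% CI {ci[0]:.1f} to {ci[1]:.1f}")
```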
Confidence Interval
• Range of plausible values
• Narrow interval → high precision
• Wide interval → poor precision
• How narrow is narrow? And how wide is wide? Based on your clinical judgment

Interpret single CI
• Compare with the null value, e.g. 0 for a difference (in % or means) or 1 for a ratio (relative risk, odds ratio)
• Compare with practical significance or the clinical significance/indifference
[Figure: four confidence intervals A, B, C and D, each shown relative to the Null value]
Source: http://www.childrens-mercy.org/stats/journal/confidence.asp

Comparing multiple CIs
[Figure: confidence intervals A and B compared]

Effect size
• The measure of effect irrespective of sample size
• Cohen (1988) classifies effect size into
Low (<0.3)
Medium (0.3-0.7)
Large (> 0.7)
(Cohen's own benchmarks for the standardized mean difference d are 0.2 small, 0.5 medium and 0.8 large)
• Manual calculation or web based calculation

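A manual calculation of one common effect size, Cohen's d, is short. The sketch below assumes two independent groups and the pooled standard deviation; the slides do not say which effect size measure is meant, so d is an assumption here, as are the function name `cohens_d` and the example data.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

# Means 4 and 3, both groups with SD 2, so d = (4 - 3) / 2.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

Unlike P, this value does not shrink or grow just because more subjects are added at the same means and spread.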
Statistical Test
• Bivariable (univariate) ~ One dependent & one independent variable
• Multivariate ~ Multiple dependent & multiple independent variables

What test to use?
Variable 1 | Variable 2 | Test
Categorical | Categorical | Chi-square
Categorical (2 pop) | Numerical (Normal) | Independent sample t-test
Categorical (2 pop) | Numerical (Not Normal) | Mann-Whitney U test
Categorical (> 2 pop) | Numerical (Normal) | One-way ANOVA
Categorical (> 2 pop) | Numerical (Not Normal) | Kruskal-Wallis test
Numerical (Normal) | Numerical (Normal) | Pearson Correlation Coefficient Test
Numerical (Normal/Not Normal) | Numerical (Not Normal) | Spearman Correlation Coefficient Test
Numerical (Normal) | Numerical (Normal) – Paired | Paired t-test
Numerical (Not Normal) | Numerical (Not Normal) – Paired | Wilcoxon Signed Rank Test

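The table above can be read as a simple lookup. The sketch below encodes it as a function; the name `choose_test`, its parameters and the string labels are hypothetical, and the rules are only those listed in the table, not a complete guide to test selection.

```python
def choose_test(var1, var2, groups=2, normal=True, paired=False):
    """Suggest a bivariable test following the slide's table.

    var1, var2: "categorical" or "numerical" (hypothetical labels).
    groups: number of populations when var1 is categorical.
    normal: whether the numerical data are treated as normally distributed.
    paired: whether two numerical measurements are paired.
    """
    kinds = (var1, var2)
    if kinds == ("categorical", "categorical"):
        return "Chi-square"
    if kinds == ("categorical", "numerical"):
        if groups == 2:
            return "Independent sample t-test" if normal else "Mann-Whitney U test"
        return "One-way ANOVA" if normal else "Kruskal-Wallis test"
    if kinds == ("numerical", "numerical"):
        if paired:
            return "Paired t-test" if normal else "Wilcoxon Signed Rank Test"
        return "Pearson correlation" if normal else "Spearman correlation"
    raise ValueError("expected 'categorical' or 'numerical'")

print(choose_test("categorical", "numerical", groups=3, normal=False))
```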
Bivariable Analyses
• Compare means
Independent sample t-test (Unpaired t-test) ~ Two unrelated means
Paired t-test ~ Two related means
One-way ANOVA ~ More than 2 means
• χ² Test ~ Between categorical variables
• Non-parametric tests (Kruskal-Wallis, Mann-Whitney U tests) ~ If data are not normally distributed

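The difference between the unpaired and paired t-tests lies in what is averaged. The sketch below computes only the t statistics and degrees of freedom with the standard library; the equal-variance (Student) form is assumed for the unpaired test, and the function names and data are hypothetical. Converting t to a P-value needs the t distribution, which the standard library does not provide, so that step is left out.

```python
import math
import statistics

def independent_t(a, b):
    """Student's t statistic (equal variances assumed) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, na + nb - 2

def paired_t(before, after):
    """Paired t statistic on the within-pair differences and degrees of freedom."""
    diffs = [y - x for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / se, len(diffs) - 1

print(independent_t([2, 4, 6], [1, 3, 5]))
print(paired_t([10, 12, 14, 16], [11, 14, 15, 19]))
```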
Writing plan for statistical analysis
#1
Data were analyzed using the complex sample function of SPSS
(version 13.0). Sampling errors were estimated using the primary
sampling units and strata provided in the data set. Sampling
weights were used to adjust for nonresponse bias and the
oversampling of blacks, Mexican Americans, and the elderly in
NHANES. The prevalence of hypertension, as well as the
awareness, treatment, and control rates, were age adjusted by
direct standardization to the US 2000 standard population. To
analyze differences over time, the 2003–2004 data were compared
with the 1999–2000 data. Estimates with a coefficient of variation
>0.3 were considered unreliable. A 2-tailed P value <0.05 was
considered statistically significant.
(Ong et al. 2009)
Writing plan for statistical analysis
#2
To assess the effect of the selection process on the characteristics of the
cases, we compared cases included in the final analysis to the rest of the
cases. Since controls included in the present analysis were different from
the rest of the diabetes free participants by design, no similar comparisons
were performed for that group. To compare baseline characteristics of
cases and controls appropriate univariate statistics were used. Similar
binary logistic and multiple linear regression models were built with incident
diabetes or HbA1c as respective outcomes and additive block entry of
adiponectin and potential confounders. For linear regression CRP and
triglycerides were log transformed. Since HbA1c could be modified by drug
treatment, we ran a sensitivity analysis excluding all participants on
antidiabetic medication. A p-value of <0.05 was considered significant.
Analyses were performed with SPSS 14.0 for Windows.
Reporting analysis (example)

Summary
1. Identify & define variables
2. Type – independent vs. dependent
3. Level of measurement – nominal, ordinal, discrete or continuous
4. Check distribution – Normal vs. Not Normal
5. Decide what to do – descriptive vs. analytical
