Biostatistics and Epidemiology: Bioepi

MODULE IN
BIOSTATISTICS AND EPIDEMIOLOGY
BioEpi
Department of Medical Laboratory Science
School of Natural Sciences
Property of and for the exclusive use of SLU. Reproduction, storing in a retrieval system, distributing, uploading or posting online, or transmitting in any form or by
any means, electronic, mechanical, photocopying, recording, or otherwise of any part of this document, without the prior written permission of SLU, is strictly
prohibited.
DESCRIPTIVE STATISTICS
AND MEASURES OF DISEASE
FREQUENCY
Congratulations on making it to Module 3!
This module serves to provide you with the procedural knowledge on the
initial analysis of public health and medical quantitative data. It will cover the
measures of central tendency, dispersion, and location; and the measures of
disease frequency. Some of these were already introduced in the previous
modules you have read and, hopefully, understood and appreciated. Several
commonly used statistical tools that you might have encountered in your high
school mathematics and research classes will be covered with emphasis on how
such analytical test results will be made sense of within the context of public
health and medicine.
This module emphasizes that statistical analyses are not supposed to be
interpreted rigidly within the context of statistical analysis alone, but have to be
related to the data and the nature of the data to which the analysis is applied. You need
to read the main reference to have a prior understanding of the concepts,
definitions, and conditions of statistical tests.
Again, to help you keep track of your tasks for this module, you are
provided in the next page with a self-monitoring form. Take the time to tick
the “Yes” box for each activity that you finish, and be reminded about pending
activities that you are yet to do. Remember that your success in achieving the
module objectives depends entirely on how conscientious you are of your own
progress.
Happy learning!
MODULE SELF MONITORING FORM
ACTIVITIES | DONE? (YES / NO)
Read the Module Introduction, Module Contents, and Module Objectives ☐ ☐
Do Lec Activity 01 – What do I expect from this module? ☐ ☐
Read Lec Activity 02 ☐ ☐
Read Lec Activity 03 ☐ ☐
Do Lab Activity 01 ☐ ☐
MODULE CONTENTS
MODULE SELF MONITORING FORM
MODULE CONTENTS
MODULE OBJECTIVES
ENGAGE: MAKING CONNECTIONS
LECTURE ACTIVITY 01 – WHAT DO I EXPECT FROM THIS MODULE?
EXPLORE: LOOK UP
LECTURE ACTIVITY 02 – CONCEPTS AND RELATIONSHIPS
EXPLAIN: HOW DO THESE ALL COME TOGETHER?
LECTURE ACTIVITY 03 – READ THEN APPLY
UNIT 1: DESCRIPTIVE STATISTICAL PROCEDURES AND THEIR APPLICATIONS
UNIT 2: CALCULATIONS AND NARRATIONS IN DESCRIPTIVE STATISTICS
LABORATORY ACTIVITY 01 – DESCRIPTIVE STATISTICS IN EXCEL® USING ITS DATA SOLVER AND THE REALSTATISTICS® ADDIN
ELABORATE: ALTERNATIVE DESCRIPTIVE STATISTICAL ANALYSIS USING EXCEL
LABORATORY ACTIVITY 02 - ANALYSIS OF THE MEAN
EVALUATE: APPLICATION
LABORATORY ACTIVITY 03 – ANALYZE THESE
REFERENCES/SOURCE MATERIALS
MODULE OBJECTIVES
After studying this module, you should be able to:
1. Select appropriate descriptive statistical procedures for a given clinical and public health data set;
and,
2. Perform descriptive statistical analyses and measures of disease frequency procedures on clinical
and public health data.
This module is divided into two units as follows:
Unit 1: Descriptive statistical procedures and their applications
Unit objectives:
1. Recall definitions of descriptive statistics terminologies;
2. Relate descriptive statistical procedures with data characteristics;
3. Use appropriate statistical procedures for a given data set.
Unit 2: Calculations and narrations in descriptive statistics
Unit objectives:
1. Compute key descriptive statistics measures;
2. Make narratives for descriptive statistics analysis results in report format;
3. Compute measures of disease frequency;
4. Make narratives for measures of disease frequency results in report format.
ENGAGE: MAKING CONNECTIONS
LECTURE ACTIVITY 01 – WHAT DO I EXPECT FROM THIS MODULE?
Write your expectations about this module in the provided space below.
Why do we want to know your expectations from this module? So that we can have a basis for deciding
later if we do share the same expectations – on whether we are on the same page, so to speak. On top of
that, we do also have our expectations from you.
EXPLORE: LOOK UP
LECTURE ACTIVITY 02 – CONCEPTS AND RELATIONSHIPS
From the previous modules, you should be familiar with the following concepts. If the two concepts are
related, write “conceptually related” inside the box provided in each item.
Otherwise, write “conceptually unrelated”. After deciding whether the concepts are related or not,
provide a one-sentence explanation as to how these concepts are related or not.
Concept 1 | Concept 2 | Related? How?
Measures of central tendency | Mean |
Measures of dispersion | Mode |
Mean | Standard deviation |
Median | Interquartile range |
Prevalence | Counts |
If most of these terms are still unfamiliar to you, you may read Sections 2.4 and 2.5 (pages 38-52)
of Biostatistics: A Foundation for Analysis in the Health Sciences, 10th edition, by Daniel
and Cross. Make sure to take notes as needed, using the module and unit objectives as a guide when you
are reading this reference.
EXPLAIN: HOW DO THESE ALL COME TOGETHER?
LECTURE ACTIVITY 03 – READ THEN APPLY
UNIT 1: DESCRIPTIVE STATISTICAL PROCEDURES AND
THEIR APPLICATIONS
Remember that descriptive statistical procedures are applied to summarize large data sets into
a few values that describe the data as a whole and are easier to make sense of. For quantitative data, the
categories for summarizing them include measures of central tendency, measures of dispersion, and
measures of location. For categorical data, the summary procedures include counts and the relative
measures: ratio, proportion, and rate. The definitions and descriptions of these terminologies should
already be quite clear to you at this point after reading the previous modules and the identified references.
For now, let us look into the applications of these measures as shown in Table 1.
Table 1. Applications of descriptive statistical procedures

Measure | Application

QUANTITATIVE DATA
Measures of central tendency
Mean | Reported when the data is normally distributed (henceforth referred to simply as “normal”); reported together with the standard deviation (mean±SD); reported with the same unit as the original observation
Median | Reported when the data is not normally distributed (henceforth referred to as “nonnormal”); reported together with the range or interquartile range (IQR); reported with the same unit as the original observation
Mode | Reported when the objective is to focus on the most frequently occurring value regardless of level of measurement (the most common disease among the residents, the leading cause of depression, the most prescribed anti-inflammatory drug, etc.)
Measures of dispersion
Range | Reported when the objective is to emphasize the gap, or lack thereof, in observations made about a particular variable of interest
Variance | Reported as reference for determining further inferential statistical procedures applicable to data (in determining scedasticity of data)
Standard deviation | Reported together with the mean as reference for statistical inference
Coefficient of variation | Reported when the units of measurement of the variables being compared are different or the means being compared are markedly different
Measures of location
Quartiles | Reported when the emphasis is on a point in a distribution within a certain quarter of the distribution
Deciles | Reported when the emphasis is on a point in a distribution within a certain tenth of the distribution
Percentiles | Reported when the emphasis is on a point in a distribution within a certain hundredth of the distribution

CATEGORICAL DATA
Counts | Reported when a single occurrence of an event is important, such as an infectious disease in a community
Relative measures
Ratio | Reported when the emphasis is on the occurrence of an event over another event, such as the number of males infected with a disease in a community over the number of females infected with the same disease in that same community
Proportion | Reported when the emphasis is on a part relative to the whole, such as males infected with a disease in a community over all those who are infected with the same disease in that same community
Rate | Reported when the emphasis is on the relationship between an event and a defined population at risk over a specified time period
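As a rough illustration of several quantitative measures in Table 1, the Python sketch below uses only the standard library. The values in `weights_kg` are made up for illustration and do not come from this module. Note also that `statistics.quantiles` uses one of several possible interpolation rules, so its cut points may differ slightly from those reported by Excel.

```python
import statistics

# Hypothetical body weights in kilograms (illustrative values only).
weights_kg = [48, 52, 55, 57, 60, 61, 63, 66, 70, 74, 79, 85]

# Measures of central tendency and dispersion.
mean = statistics.mean(weights_kg)
sd = statistics.stdev(weights_kg)            # sample standard deviation
median = statistics.median(weights_kg)
value_range = max(weights_kg) - min(weights_kg)

# Coefficient of variation: the SD expressed as a percentage of the mean.
cv_percent = sd / mean * 100

# Measures of location: cut points that divide the sorted data into equal parts.
quartiles = statistics.quantiles(weights_kg, n=4)    # 3 cut points
deciles = statistics.quantiles(weights_kg, n=10)     # 9 cut points
iqr = quartiles[2] - quartiles[0]                    # interquartile range

print(f"mean ± SD: {mean:.2f} ± {sd:.2f} kg")
print(f"median: {median} kg, range: {value_range} kg, IQR: {iqr:.2f} kg")
print(f"coefficient of variation: {cv_percent:.2f}%")
print("quartile cut points:", quartiles)
```

The second quartile is the same value as the median, which is why the median and the IQR are naturally reported together.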
Notice that the characteristics of the data and the context for which the data were collected play
major roles in determining what descriptive statistical procedures can be used to summarize them. When
we talk about the characteristics of the data at this point, we mean three things: (1) whether the data was
collected from samples drawn via random or non-random sampling techniques; (2) whether the data is
normal or non-normal; and (3) whether the data exhibits homoscedasticity or heteroscedasticity. The first
item lays the foundation for the assumption of representativeness of the data. The second item deals with
the distribution of the data, and this is crucial since most common statistical procedures assume normally
distributed data. The third one has something to do with the distribution of error terms in the data,
that is, whether equal or unequal variances are assumed. These assumptions are vital
considerations when you work with inferential statistics in Module 4, but they are worth
mentioning now as you have just encountered how normal or non-normal data distribution affects the
choice of descriptive statistics that you should be using. The descriptive measures for categorical data
will be fully utilized in Module 5 when you learn about the health indicators and in Module 6 when
you look into the epidemiology of communicable and non-communicable diseases.
Regarding the normality of data, exploratory data analysis involving other statistical
procedures such as the Shapiro-Wilk test and the D’Agostino-Pearson test can be performed on raw
data to determine whether it is normally or non-normally distributed. The D’Agostino-Pearson test quantifies how far
the distribution of the data points is from the normal (Gaussian) curve in terms of asymmetry and shape by
computing skewness and kurtosis. An alternative is the Shapiro-Wilk test, which
works very well if every value is unique but does not work as well when several values are identical; in that
situation, the D’Agostino-Pearson test is preferred.
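To make the idea of "asymmetry and shape" concrete, here is a minimal Python sketch that computes skewness and excess kurtosis from their moment definitions. It is only an illustration of the two ingredients: it does not compute the D’Agostino-Pearson test statistic or its p-value, and both data sets are made up for this example.

```python
import statistics

def central_moment(values, order):
    """Average of (x - mean)**order over all values."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: 0 for perfectly symmetric data."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal curve scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical data sets (illustrative values only).
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_skewed = [1, 1, 2, 2, 3, 3, 4, 9, 15]

for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(f"{name}: skewness = {skewness(data):.2f}, "
          f"excess kurtosis = {excess_kurtosis(data):.2f}")
```

A positive skewness indicates a longer right tail, and values of both measures far from 0 suggest a departure from the normal curve.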
Descriptive data analysis sometimes involves graphical representation of data for ease of comprehending
it. Most often, graphical techniques include histograms and box plots. Analyzing these graphs may also
provide us with clues on whether we are working with normal or non-normal data.
The least that is expected of you after completing this module is to calculate and
report the appropriate measure of central tendency with its appropriate measure of dispersion, whether the
data is normal or non-normal. Let us focus on how we do these in the next unit.
UNIT 2: CALCULATIONS AND NARRATIONS IN
DESCRIPTIVE STATISTICS
LABORATORY ACTIVITY 01 – DESCRIPTIVE STATISTICS IN EXCEL® USING ITS
DATA SOLVER AND THE REALSTATISTICS® ADDIN
For this activity, you need a desktop or a laptop with Microsoft Excel® installed. To install the
RealStatistics® add-in, follow these steps:
1. Copy the add-in file XRealStats from your learning packet into
“Drive C > Users > [Your user name] > AppData > Roaming > Microsoft > AddIns”
a. If you open your user name under “Users” and AppData is not in there, it might be hidden, so
unhide it.
To unhide hidden files and folders, in any open folder window click on “View” then click on the
“Options” icon. When a pop-up menu appears, click on the tab “View”, select “Show hidden
files, folders, and drives” then click on “OK”.
To activate your Excel Data Solver (if it is not yet activated) and the newly placed add-in, follow these
steps:
1. Open an Excel worksheet then click on the tab “File”.
2. On the left panel, click on “Options” then you will see another pop-up menu. On the left panel of that
menu, click “Add-ins”.
3. Near the bottom area, click on “Go…”.
4. On the next pop-up menu, click on the boxes for the add-ins: “Analysis ToolPak”, “Analysis
ToolPak VBA”, “Solver Add-in” and “Xrealstats”. Click “OK”. If “Xrealstats” is not in the list, you
might have copied the file into the wrong folder.
To check if the add-ins were activated successfully, follow these steps:
1. Check the tab “Data”; the icon “Data Analysis” should be there.
2. Another way is to press Ctrl + M; a pop-up menu “Real Statistics” should appear.
Now, let us try doing some exercises. Consider Exercise 2.5.1 on page 53 of Daniel and Cross.
1. On cell A1 of your open Excel file, type the variable name “cell_counts”
then encode the data down column A. Save the file as
“BIOEPI_MODULE03_WORKSHEET”. By simply looking and
inspecting the individual values that you encoded, can you derive
information from them?
2. After saving, go to “Data” then click on “Data Analysis”. On the pop-up menu, select
“Descriptive Statistics” then click “OK”.
3. The Descriptive Statistics pop-up menu will appear. Click on the box for “Input Range” then
select and highlight cells A1 to A14. Check the checkbox to the left of “Labels in first row”.
Click on the box for “Output Range” then click on cell C1 or anywhere you wish your
output to appear. [Selecting “New Worksheet Ply” will make your output appear in a new
worksheet while choosing “New Workbook” will place it in another workbook.] Check the
checkbox to the left of “Summary statistics”. Click “OK”.
4. You now have the descriptive statistics for your data.
5. Since no numerical or graphical techniques were employed to determine normality of data,
you are left with reporting your results under both assumptions: that the data is normal and
that the data is non-normal.
Our sample narratives for the results section would be as follows:
Assuming normal data (report mean and standard deviation):
“The baseline CD4 T cell counts (×10^6/L) for the 13 study participants have a mean±SD equal to 193.62±74.62.”
or
“The mean baseline CD4 T cell count (×10^6/L) of the study participants is 193.62 (SD=74.62, n = 13).”
Assuming non-normal data (report median and range):
“The baseline CD4 T cell counts (×10^6/L) of the study participants ranged from 58 to 313 with a median of 205.”
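The two reporting styles can also be sketched as a small Python helper. This is only an illustration: the function name `narrate`, its parameters, and the values in `counts` are hypothetical and are not taken from Daniel and Cross.

```python
import statistics

def narrate(values, label, assume_normal):
    """Build a one-sentence results narrative for a list of observations."""
    n = len(values)
    if assume_normal:
        # Normal data: report the mean with the sample standard deviation.
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        return (f"The mean {label} of the study participants is "
                f"{mean:.2f} (SD={sd:.2f}, n = {n}).")
    # Non-normal data: report the median with the range.
    median = statistics.median(values)
    return (f"The {label} of the study participants ranged from "
            f"{min(values)} to {max(values)} with a median of {median}.")

# Hypothetical observations (illustrative values only).
counts = [120, 150, 180, 200, 240]

print(narrate(counts, "baseline count", assume_normal=True))
print(narrate(counts, "baseline count", assume_normal=False))
```

Keeping both sentences side by side makes it easy to pick the right one once normality has been checked.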
When reporting results, we are again guided by the characteristics of the data, as was
demonstrated here, and by the context in which we are doing the analysis. What was illustrated forms
the first part of a results format, which is the presentation of data. This is followed by interpretation, and
finally you corroborate by relating your results to the findings of other studies. In essence, narratives for
descriptive statistics provide the starting point for the results section.
Reporting results within the context of the study may
also be done to emphasize descriptive statistics values that are
crucial towards understanding the overall study. For example, the data may be
normally distributed, so you report the mean with the standard
deviation, but focusing on the mode helps you and your target
audience understand that the most ingested vitamin
supplement in a community is the one that is associated with the
prevalence of a certain condition.
Regarding the matter of how many decimal places you should be reporting, the convention is to
report statistical results with the same decimal places as the data. Journal editors do have different views
on this matter. For the purposes of this module, we shall stick with rounding off answers to two decimal
places.
Doing the analysis using the RealStatistics add-in starts with data encoding.
1. Start with Step 1 as above. After encoding, press Ctrl+M. The Real Statistics pop-up menu
should appear.
2. Select “Descriptive Statistics and Normality” then click “OK”. The “Descriptive Statistics and
Normality” pop-up menu should appear. On the “Input Range” box, select then highlight cells A1
to A14, then on the “Output Range” box select cell C1 or any cell where you want your output to
be shown. Click on the tick boxes to the left of “Column headings included with data”,
“Descriptive statistics”, “Box Plot w/ Outliers” and “Shapiro-Wilk”. Click “OK”.
3. You now have more complete descriptive statistics results. Added features of this add-in are the
numerical (Shapiro-Wilk and D’Agostino-Pearson tests) and graphical (boxplot) techniques to
determine normality of your data. Knowing this, you can now make the appropriate narrative for
your results section.
Are the results similar to those of the Data Analysis add-in? How then are you going to narrate
your results?
Regarding the numerical techniques, the Shapiro-Wilk test and the D’Agostino-Pearson test
operate under the null hypothesis that the data follow a normal distribution. A p-value is computed to
test this hypothesis at the 0.05 level of significance. Recall your understanding of Module 2: for the
Shapiro-Wilk test here, the computed p-value
is greater than the alpha. Is this statistically significant? No, it is
not. What should the action be? Fail to reject the null hypothesis,
and so it is retained. Is the data normally distributed? Yes, it is.
The same is true with the D’Agostino-Pearson test. Start working
your way around these somewhat confusing concepts since you
will be meeting and using a lot of them in Module 4.
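The chain of questions above boils down to a single comparison between the p-value and alpha. A minimal Python sketch follows; the function name `normality_decision` and the two p-values are hypothetical, chosen only to show both outcomes.

```python
def normality_decision(p_value, alpha=0.05):
    """Interpret the p-value of a normality test such as Shapiro-Wilk.

    The null hypothesis (H0) is that the data follow a normal distribution.
    """
    if p_value > alpha:
        # Not statistically significant: H0 is retained.
        return "Fail to reject H0: treat the data as normal."
    # Statistically significant: H0 is rejected.
    return "Reject H0: treat the data as non-normal."

# Hypothetical p-values (illustrative values only).
print(normality_decision(0.42))   # p > alpha
print(normality_decision(0.01))   # p <= alpha
```

The first outcome leads to reporting the mean with the standard deviation; the second leads to reporting the median with the range or IQR.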
For the boxplot, if it is symmetric with the median line in approximately the center of the box and
with symmetric whiskers somewhat longer than the subsections of the center box, then these suggest that
the data have come from a normal distribution. Should a histogram be made for the data, normal
distribution is suggested by a histogram shape that approximates a bell curve.
Sometimes, the numerical techniques may give conflicting results. In such cases, base your
decisions on the graphical techniques.
Now, try doing the other exercises on page 53 (items 2.5.2, 2.5.3, and 2.5.4) of Daniel and Cross.
Show the descriptive statistics results just like the examples above then come up with appropriate
narratives for each result. Do the descriptive measures provide you with a better understanding of the data
as opposed to inspecting them individually?
Most often, descriptive statistics results provide the basis for the application of further inferential
statistical procedures to the data. Doing descriptive statistics allows you to turn something like this…
…into something that makes sense, such as this.

Source: Egan, H., Isbister, G.K., Robinson, J., Downes, M., Chan, B.S, Vecellio, E. &
Chiew, A.L. (2019). Retrospective evaluation of repeated supratherapeutic ingestion (RSTI)
of paracetamol. Clinical Toxicology, 57:8, 703-711.
For the measures of disease frequency, remember that epidemiology is very much invested in
assessing the health of a population and it would want to know answers to questions such as how many
infants in a barangay had measles in April 2018, or what is the rate at which new cases of measles occur.
These statistics generally fall under incidence or prevalence.
Incidence describes the number of new cases of a condition that occurred in a defined time
period. Prevalence describes the total number of cases with the condition at a given point in time. These
measures of disease frequency are needed to generate measures of association (evaluation of the
association between exposure and outcomes), and both are needed to get measures of impact
(determination of the impact of removal of an exposure on the outcome). To learn more about measures
of disease frequency, read Chapter 3 (pages 161 to 212) of Epidemiology for Public Health Practice,
5th edition, by Friis and Sellers and Measures of Morbidity (pages 41-58) of Epidemiology, 5th
edition, by Gordis. Applications of these will be introduced in Modules 5 and 6.
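The difference between the two measures can be sketched in a few lines of Python. Expressing incidence as a cumulative incidence (new cases divided by the population at risk) is one common choice and an assumption here, and all the numbers are hypothetical, not taken from any of the references.

```python
def prevalence(existing_cases, population):
    """Proportion of the population that has the condition at a point in time."""
    return existing_cases / population

def cumulative_incidence(new_cases, population_at_risk):
    """Proportion of the at-risk population that developed the condition
    during the defined time period."""
    return new_cases / population_at_risk

# Hypothetical barangay figures (illustrative values only).
infants = 400                 # all infants in the barangay
cases_on_april_30 = 12        # infants with measles on that date
at_risk_on_april_1 = 392      # infants free of measles at the start of April
new_cases_in_april = 8        # infants who developed measles during April

print(f"Prevalence: {prevalence(cases_on_april_30, infants):.3f}")
print(f"Incidence in April: "
      f"{cumulative_incidence(new_cases_in_april, at_risk_on_april_1):.3f}")
```

Prevalence counts everyone who has the condition, while incidence counts only those who newly developed it among those who could have.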
Incidence and prevalence rates may be reported as percentages or proportions with the choice
depending on the clarity of presentation and the objective of the study. If these rates are reported as
percentages, sample sizes larger than 100 are reported to one decimal place, sample sizes between 20 and
100 are reported with no decimal places, and sample sizes less than 20 are never reported as percentages.
The following are some examples:
● “In a study involving 623 participants, 23.8% had pneumonia.”
● “In a study involving 54 out-of-school youths, 24% reported being bullied when they were
enrolled in school.”
● “In a study with 14 breast cancer survivor participants, 3 admitted to having used marijuana in their
treatment regimen.”
If the rates are reported as proportions, only one decimal place is used if the sample is less than
100 and two decimal places for samples 100 and above. The denominator can be changed to avoid several
decimal places for low rates. For example, 0.0039% can instead be reported as 3.9 cases per 100,000
smokers.
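The percentage rules above can be sketched as a small Python helper. The function names and the case counts (148, 13, and 3) are hypothetical; they were chosen so that the outputs resemble the examples given.

```python
def report_percentage(cases, sample_size):
    """Format a rate following the sample-size rules for percentages."""
    if sample_size < 20:
        # Too few participants: report the raw count, never a percentage.
        return f"{cases} of {sample_size}"
    percent = cases / sample_size * 100
    if sample_size > 100:
        return f"{percent:.1f}%"      # one decimal place
    return f"{percent:.0f}%"          # no decimal places

def percent_to_per_100k(percent):
    """Re-express a very small percentage as cases per 100,000."""
    return percent / 100 * 100_000

print(report_percentage(148, 623))    # large sample
print(report_percentage(13, 54))      # sample between 20 and 100
print(report_percentage(3, 14))       # sample below 20
print(f"{percent_to_per_100k(0.0039):.1f} cases per 100,000")
```

Changing the denominator keeps the reported number readable without changing the underlying rate.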
As with descriptive statistics results reporting, presentation is followed by interpretation then
possibly corroboration.
ELABORATE: ALTERNATIVE DESCRIPTIVE STATISTICAL
ANALYSIS USING EXCEL
LABORATORY ACTIVITY 02 - ANALYSIS OF THE MEAN
Part 1. Determination of the Mean. In this activity, you will need a laptop or desktop computer
installed with Microsoft Excel. Open a new file and save as Lab Activity 02_Analysis of the Mean.
1. The data below is based on the diastolic blood pressure of 40 male participants after following a
diet regimen. Group A was advised to follow Diet Plan A, while Group B followed Diet Plan B.
Diastolic blood pressures were obtained after 30 days and are presented below.
Open a new worksheet in Microsoft Excel and encode the data below in Columns A, B, and C.
Pay attention to the cells where your actual data are encoded to avoid errors in computations.
Diastolic Blood Pressure (mm Hg)
Participant Number | Group A (Diet A) | Group B (Diet B)
1 72 75
2 74 73
3 72 75
4 70 70
5 85 80
6 85 85
7 101 120
8 98 100
9 110 120
10 100 120
11 90 78
12 70 79
13 78 85
14 85 95
15 89 92
16 90 93
17 93 120
18 82 89
19 100 105
20 82 88
2. On Cell B22, input the command: =AVERAGE(B2:B21) then press Enter. The command
means you are getting the average of all data from Cell B2 to Cell B21.
NOTE: You can change the range of cells to be included in the command depending on where you placed
your data. In this example, my data on Column B is encoded in Row 2 to Row 21.
Answer:
3. Do the same for the cells in Column C by typing the command
=AVERAGE(C2:C21) on Cell C22. HINT! You can also drag the answer in Cell B22 to Cell C22 to arrive
at the same average.
Answer:
4. Based on this information, which group has a higher mean diastolic blood pressure? Is this
enough to say that Diet Plan B is a more effective intervention to lower diastolic blood pressure
based on the available data? Why or why not?
Answer:
Part 2. Determining Standard Deviation
1. Using the same set of data in Part 1, input =STDEV(B2:B21) on Cell B23. This is the standard
deviation of the data in Cells B2 to B21. Write your answer below.
Answer:
2. Input =STDEV(C2:C21) on Cell C23. This is the standard deviation of the data in Cells C2 to C21. Write the standard deviation below.
Answer:
3. The standard deviation shows how near or far the data are relative to the mean, or how far each
individual observation deviates from the mean of the sample. Make a statement comparing the standard
deviations of the two sets of data. In determining the effectiveness of the diet plan, which is
preferred: a small standard deviation or a large standard deviation? Why?
Answer:
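If you want to cross-check your Excel answers for Parts 1 and 2, the same calculations can be sketched in Python with the standard library. The lists below are copied from the table in Part 1; the variable names are only for illustration. `statistics.stdev` is the sample standard deviation, which is also what Excel's STDEV computes.

```python
import statistics

# Diastolic blood pressure (mm Hg) after 30 days, copied from the Part 1 table.
group_a = [72, 74, 72, 70, 85, 85, 101, 98, 110, 100,
           90, 70, 78, 85, 89, 90, 93, 82, 100, 82]
group_b = [75, 73, 75, 70, 80, 85, 120, 100, 120, 120,
           78, 79, 85, 95, 92, 93, 120, 89, 105, 88]

mean_a = statistics.mean(group_a)    # same role as =AVERAGE(B2:B21)
mean_b = statistics.mean(group_b)    # same role as =AVERAGE(C2:C21)
sd_a = statistics.stdev(group_a)     # same role as =STDEV(B2:B21)
sd_b = statistics.stdev(group_b)     # same role as =STDEV(C2:C21)

print(f"Group A: {mean_a:.2f} ± {sd_a:.2f} mm Hg")
print(f"Group B: {mean_b:.2f} ± {sd_b:.2f} mm Hg")
```

Run it only after you have written your own answers, then compare the rounded values with your worksheet.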
Part 3. Interpretation of mean and standard deviation in a data set.
In this part of the activity, the pre-intervention and post-intervention diastolic blood pressure data were
tabulated.
1. Open a new file and input the data below.
Diastolic Blood Pressure (mmHg)
Participant Number | Group A Pre | Group A Post | Group B Pre | Group B Post
1 75 72 78 75
2 74 74 78 73
3 74 72 78 75
4 72 70 80 70
5 83 85 85 80
6 86 85 84 85
7 103 101 123 120
8 100 98 105 100
9 105 110 118 120
10 105 100 120 120
11 82 90 83 78
12 70 70 82 79
13 80 78 92 85
14 83 85 94 95
15 89 89 94 92
16 95 90 92 93
17 90 93 123 120
18 85 82 90 89
19 103 100 107 105
20 85 82 90 88
2. This time, subtract the post-intervention diastolic blood pressure data from the pre-intervention
diastolic blood pressure data per participant. Let us assume that your pre and post data for Group
A are in Columns B and C, respectively, starting at Row 3 (please see the figure on the next page).
3. Choose another column to organize the differences between the pre-intervention and
post-intervention diastolic BP. In this case, I used Column F. To perform the operation, use the
command =(B3-C3) then press Enter. Drag the answer down to compute all diastolic BP
differences. Do the same for Group B and place your data in Column G.
4. Compute the mean and standard deviation of the differences in Group A and
Group B. Record your data below using the format mean ± standard deviation.
Mean Difference on Diastolic BP (Group A): _____ ± _____
Mean Difference on Diastolic BP (Group B): _____ ± _____
Given these data, which group had a higher decrease in diastolic blood pressure? Which group
had a more consistent decrease in diastolic blood pressure? Why do you say so?
Answer:
Is the decrease in diastolic blood pressure sufficient to claim that the diet regimen given in
the situation can be prescribed to lower diastolic blood pressure? Why or why not?
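The per-participant differences in Part 3 can likewise be cross-checked with a short Python sketch. The lists are copied from the Part 3 table, the variable names are illustrative, and each difference is computed as pre minus post so that a positive value means a decrease in diastolic BP.

```python
import statistics

# Diastolic BP (mmHg) before and after the diet, copied from the Part 3 table.
pre_a = [75, 74, 74, 72, 83, 86, 103, 100, 105, 105,
         82, 70, 80, 83, 89, 95, 90, 85, 103, 85]
post_a = [72, 74, 72, 70, 85, 85, 101, 98, 110, 100,
          90, 70, 78, 85, 89, 90, 93, 82, 100, 82]
pre_b = [78, 78, 78, 80, 85, 84, 123, 105, 118, 120,
         83, 82, 92, 94, 94, 92, 123, 90, 107, 90]
post_b = [75, 73, 75, 70, 80, 85, 120, 100, 120, 120,
          78, 79, 85, 95, 92, 93, 120, 89, 105, 88]

# Pre minus post for each participant, as in the Excel command =(B3-C3).
diff_a = [pre - post for pre, post in zip(pre_a, post_a)]
diff_b = [pre - post for pre, post in zip(pre_b, post_b)]

for name, diffs in [("Group A", diff_a), ("Group B", diff_b)]:
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample SD, like Excel's STDEV
    print(f"Mean difference ({name}): {mean_diff:.2f} ± {sd_diff:.2f} mmHg")
```

A larger mean difference points to a bigger average decrease, while a smaller standard deviation points to a more consistent change across participants.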
EVALUATE: APPLICATION
LABORATORY ACTIVITY 03 – ANALYZE THESE
Given:
WHERE:
➔ id: Participant Code
➔ group: Gender
0 = male
1 = female
➔ smo_stat: Smoking Status
0 = nonsmoker
1 = smoker
➔ diet_pat: Diet Pattern
0 = High Carbohydrates
1 = High Protein
2 = High Fat
3 = No Diet Pattern
➔ fam_his_NCD: Family History of Non-communicable Diseases
0 = Absent
1 = Present
➔ height_cm: Height in centimeters
➔ weight_kg: Weight in kilograms
➔ sbp_mmHg: Systolic Blood Pressure in millimeters of mercury
➔ dbp_mmHg: Diastolic Blood Pressure in millimeters of mercury
➔ hip_cm: Hip Circumference in centimeters
➔ waist_cm: Waist Circumference in centimeters
➔ chest_cm: Chest Circumference in centimeters
➔ wab_kg: Weight at Birth in kilograms
1. Encode the data set above in Sheet 1 of an Excel workbook. Save your file as Lab Activity 03_
Evaluate_[YOURFAMILYNAME,YOURGIVENNAME]
2. On Sheet 2, determine the percentages by gender, smoking status, diet pattern, and family history
of NCD. Prepare a tabular presentation of your data as shown in the example by Egan et al
(2019). Present your output in textual form in an inserted text box within the worksheet as shown
in the example below.
3. On separate worksheets (Sheet 3 = height; Sheet 4 = weight; Sheet 5 = sbp; Sheet 6 = dbp; Sheet
7 = hip; Sheet 8 = waist; Sheet 9 = chest; and Sheet 10 = wab), perform descriptive statistics
analysis using Real Statistics add-in for each of the indicated variables. Present your output in
textual form in an inserted text box within each worksheet.
4. Save your output in the appropriate folder in your OTG flash drive.
REFERENCES/SOURCE MATERIALS
Daniel, W.W. & Cross, C.L. (2013). Biostatistics: a foundation for analysis in the health sciences,
10th edition. New Jersey: John Wiley & Sons, Inc.
Friis, R.H. & Sellers, T.A. (2014). Epidemiology for public health practice, 5th edition. Burlington,
MA: Jones & Bartlett Learning.
Gordis, L. (2014). Epidemiology, 5th edition. Pennsylvania: Elsevier Saunders.
