BIOSTAT WEEK 1

Statistics
- Science dealing with the collection, organization, analysis, and interpretation of numerical data.
- Art of summarizing data so that a non-statistician can understand it.
- Tool in decision making: formulation of good judgment.
- Method or data - any information concerning a population or sample.
Ex. of data: interview, questionnaire (survey), observation, documents or records, focus group, oral history, case study

Uses of Statistics
• Data reduction technique - transformation of numerical or alphabetical digital information that is derived empirically or experimentally into a corrected, ordered, and simplified form.
• Tool for analyzing research projects and clinical trials
• Objective appraisal and evaluation of programs
• Used in the decision-making process and in policy making

Biostatistics
- Bio - life; Statistics - data
- A special branch of statistics that deals with the quantitative and qualitative aspects of vital phenomena
- Term used when the focus is on the biological and health sciences instead of on statistics only

Health Statistics - data required in the planning, administration, and evaluation of health programs

Uses of Biostatistics
• Epidemiology - distribution and determinants of health-related states and events
  - A method used to find the causes of health outcomes and diseases in populations
• Demography - study of human populations
  Ex: age, gender, education, nationality, ethnicity, and religion
• Health Economics - functioning of the health care system and health-affecting behaviors
  - Branch of economics concerned with issues related to efficiency, effectiveness, values, and behavior in the production and consumption of health and health care
• Genetics and Genomics - heredity; genes and function
  - Play a role in health sciences
  - Genetics - study of genes and the way that certain traits and conditions are passed down from one generation to another
  - Genomics - study of all of a person's genes (the genome)

Branches of Statistics
• Descriptive Statistics
  - Methods of summarizing and presenting data
  - Computation of measures of central tendency and variability
  - Tabulation and graphical presentation
  - Facilitates understanding, analysis, and interpretation of data
  - Devoted to the summarization and description of data; consists of methods for organizing and summarizing information
  - Calculation of various descriptive measures: means, measures of variation, and percentiles
  - Describes the most important characteristics of a given set of data
  - Contains exact numbers
• Inferential Statistics
  - Methods of arriving at conclusions and generalizations about a target population based on information from a sample
  - Estimation of parameters and hypothesis testing
  - Consists of methods for drawing conclusions about a population from information obtained from a sample, and for measuring the variability and reliability of those conclusions
  - Can make predictions
  - Branch of statistics concerned with using sample data to make an inference about a population of data

Terms in Biostatistics
• Population - all members of a specified group
• Sample - subset of a population
• Parameter - measure of a characteristic of a population
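A minimal sketch of the difference between descriptive and inferential statistics, using the terms population, sample, and parameter. The blood pressure values are simulated and purely hypothetical, and the 95% interval uses a normal approximation (z = 1.96), which is an assumption chosen for simplicity and not a method taken from these notes.

```python
import random
import statistics

# Hypothetical population: systolic blood pressure (mmHg) of 1,000 people.
# The values are simulated only for illustration.
rng = random.Random(42)
population = [rng.gauss(120, 15) for _ in range(1000)]

# Parameter: a measure of a characteristic of the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population.
sample = rng.sample(population, 50)


def describe(values):
    """Descriptive statistics: summarize the data actually in hand."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "q1": q1,
        "q3": q3,
    }


def mean_confidence_interval(values, z=1.96):
    """Inferential statistics: estimate the population mean from a sample.

    Assumes a normal approximation, with z = 1.96 for roughly 95% confidence.
    """
    m = statistics.mean(values)
    se = statistics.stdev(values) / len(values) ** 0.5
    return m - z * se, m + z * se


summary = describe(sample)
low, high = mean_confidence_interval(sample)
print(summary)
print(f"Estimated population mean: {low:.1f} to {high:.1f} mmHg")
print(f"Population mean (parameter): {population_mean:.1f} mmHg")
```

The summary describes only the 50 sampled values; the interval is a statement about the whole population, which is why it carries uncertainty.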
• Constant - value of a characteristic that remains the same from person to person, from time to time, or from place to place
• Variable - characteristic that takes on different values

Types of Data

According to Source
• Primary Data - comes from the researcher
  - via interview, experiment, questionnaires
• Secondary Data - comes from another source
  - book, journal, newspaper, thesis, and dissertation

According to Functional Relationship
• Independent - refers to any controlling data
  - anything that can be manipulated (treatment)
• Dependent - any data that is affected by the controlling data

Categories of Data

Types of Variables
• Qualitative - descriptions or labels that distinguish one group from another
  - Uses categories or attributes that distinguish non-numeric characteristics
  - Ex: gender, marital status, eye color, address, ethnicity, religion, etc.
• Quantitative - can be measured and ordered according to quantity or amount and expressed numerically
  - Consists of numbers that represent counts or measurements
  - Ex: height, weight, temperature, no. of years, etc.
  Types:
  • Discrete - can assume a finite or countable number of values
    Ex: no. of children, no. of students
  • Continuous - can assume infinitely many possible values, each corresponding to a point on a line interval
    Ex: weight, decimals, temperature, which can in principle be measured arbitrarily accurately

Scale of Measurement of Variables
• Nominal - simply used as names or identifiers of a category
  - Always qualitative
  - Does not represent any amount or quantity
  Ex: names, labels, categories
• Ordinal - represents an ordered series of relationships; may be qualitative or quantitative
  Ex: sequence, level, order, ranks, scaling
• Interval - has equal intervals between values but does not have a true zero as a starting point
  - Always quantitative
  Ex: temperature (F), standardized score
• Ratio - modified interval level that includes zero as a starting point
  - Always quantitative

Nominal     Named variables
Ordinal     + Ordered variables
Interval    + Proportionate interval between variables
Ratio       + Can accommodate absolute zero

(/ = yes, x = no)
Scale      True Zero   Equal Interval   Order   Category   Example
Nominal    x           x                x       /          Marital status, sex, gender, ethnicity
Ordinal    x           x                /       /          Student grade, NFL rankings
Interval   x           /                /       /          Temp (F), SAT score, IQ, year
Ratio      /           /                /       /          Age, height, weight

Data Processing
• Systematic procedure to ensure that the information/data gathered are complete, consistent, and suitable for analysis.
• Data Analysis Steps:
  1. Identify the problem
  2. Collect data
  3. Presentation of data
  4. Analysis of data
  5. Interpretation of data
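As a rough illustration of the scales of measurement covered in this section, the sketch below encodes the four properties from the comparison table (category, order, equal interval, true zero) and looks up the scale that matches a given combination. The function names are hypothetical, and the Kelvin conversion is added here only to show why ratios are not meaningful on an interval scale such as Fahrenheit.

```python
# Properties of each scale, following the comparison table in these notes.
SCALES = {
    "nominal":  {"category": True, "order": False, "equal_interval": False, "true_zero": False},
    "ordinal":  {"category": True, "order": True,  "equal_interval": False, "true_zero": False},
    "interval": {"category": True, "order": True,  "equal_interval": True,  "true_zero": False},
    "ratio":    {"category": True, "order": True,  "equal_interval": True,  "true_zero": True},
}


def classify_scale(order, equal_interval, true_zero):
    """Return the scale whose properties match the answers given."""
    wanted = (order, equal_interval, true_zero)
    for name, props in SCALES.items():
        if (props["order"], props["equal_interval"], props["true_zero"]) == wanted:
            return name
    raise ValueError("combination does not match any of the four scales")


def fahrenheit_to_kelvin(temp_f):
    """Convert an interval-scale temperature (F) to a ratio scale (Kelvin)."""
    return (temp_f - 32) * 5 / 9 + 273.15


# Marital status: categories only.
print(classify_scale(order=False, equal_interval=False, true_zero=False))  # nominal
# Weight: ordered, equal intervals, and a true zero.
print(classify_scale(order=True, equal_interval=True, true_zero=True))     # ratio

# 80 F is not "twice as hot" as 40 F, because 0 F is not a true zero.
ratio_in_kelvin = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)
print(round(ratio_in_kelvin, 2))  # 1.08
```

Each scale adds one property to the one before it, which is why the lookup only needs the three yes/no answers beyond "category".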
Data Processing Flowchart (figure not shown)

Data Coding
- Conversion of verbal/written information into numbers, which can be more easily encoded, counted, and tabulated.
- Assigning numerals or other symbols to answers so that responses can be put into a limited number of categories.

Codes - the rules for interpreting, classifying, and recording data in the coding process.

Types of Codes
• Field Code - actual value or information given by the respondent
• Bracket Code - recorded as a range of values rather than actual values
• Factual Code - codes are assigned to a list of categories of a given variable
• Pattern Code - applicable for questions with multiple responses

Rules in Code Construction
• Number of rules must be kept to a minimum (<8)
• Codes should be exhaustive and mutually exclusive
• Adopt a coding convention for questions with similar answers

Category codes
- Should be few; around 10% or less of the responses should fall into the "other" category.
- Should be assigned for critical issues even if no one has mentioned them; data should be coded to retain as much detail as possible.

Coding Problems
• No response
• Not applicable questions

Coding Manual
• A document that contains a record of all codes assigned to the responses to all questions in the data collection forms
• Minimum information that must be included in a coding manual:
  • Variable name - must consist of one string only, made up of letters (when useful), numbers, and underscores
    - Spaces are not allowed
    - Enter it at the top of each column
    - Long enough to be meaningful, short enough to be easy to read
  • Variable description/label - a textual description of the variable, or a reference to the number of the questionnaire item it arises from
    - Include a descriptive variable label for each variable in the file
    - Important for statisticians to understand the contents of each data item, and for researchers, as the table will help in understanding the output of the statistical analysis
  • Coding instructions
Code book - contains coding instructions and the necessary information about variables in a data set. It generally contains column no., record no., variable no., variable name, question no., and instructions for coding.
Note: In a coding manual, you can add additional information in subsequent columns.

Data Encoding - entering the data/responses in a spreadsheet
• MS Excel
• MS Access
• Epi Info

Data Editing
• Inspection and correction of any errors or inconsistencies in the information collected
  - Done during data collection, during encoding, and before data analysis
• Process of examining the collected raw data to detect errors/omissions and to correct them as soon as possible
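The coding ideas earlier in this section (bracket codes, exhaustive and mutually exclusive categories, variable names without spaces, and a coding manual entry) can be sketched roughly as follows. The bracket limits, the codes 88 and 99, the variable names, and the question numbers are all hypothetical choices made for illustration; the notes do not prescribe specific values.

```python
import re

# Hypothetical codes for the two coding problems (assumed values).
NO_RESPONSE = 99
NOT_APPLICABLE = 88

# Bracket code: ranges instead of actual values. The brackets are
# exhaustive (every age from 0 up is covered) and mutually exclusive.
AGE_BRACKETS = [
    (1, 0, 17),
    (2, 18, 39),
    (3, 40, 59),
    (4, 60, None),  # None means no upper limit
]

VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_variable_name(name):
    """One string of letters, numbers, and underscores; no spaces."""
    return VALID_NAME.fullmatch(name) is not None


def bracket_code_age(age):
    """Return the bracket code for an age in whole years."""
    if age is None:
        return NO_RESPONSE
    for code, low, high in AGE_BRACKETS:
        if age >= low and (high is None or age <= high):
            return code
    raise ValueError(f"age {age} falls outside every bracket")


# Coding manual entries: variable name, label, and coding instructions.
coding_manual = {
    "age_group": {
        "label": "Age of respondent in completed years (Question 1)",
        "codes": {1: "0-17", 2: "18-39", 3: "40-59", 4: "60 and over",
                  NO_RESPONSE: "No response"},
    },
    "trimester": {
        "label": "Trimester of current pregnancy (Question 2)",
        "codes": {1: "First", 2: "Second", 3: "Third",
                  NOT_APPLICABLE: "Not applicable (not pregnant)",
                  NO_RESPONSE: "No response"},
    },
}

print(is_valid_variable_name("age_group"))  # True
print(is_valid_variable_name("age group"))  # False
print([bracket_code_age(a) for a in (5, 18, 59, 60, None)])  # [1, 2, 3, 4, 99]
```

Keeping "no response" and "not applicable" as separate codes preserves the distinction between a skipped question and one that did not apply.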
Types of Editing
I. Field Editing
• Reviewing the accomplished data collection forms
• Decoding of abbreviations or special symbols
• Making callbacks/messages for verification/clarification of incomplete answers
• Raw files
II. Central Editing
• Checking of inconsistencies and incorrect entries after receiving the questionnaire from the field
• Checking of encoded data
• Computerized, consolidated, summarized

Importance of Data Editing
• Make corrections as early as possible
• Reduce non-response or incomplete answers
• Eliminate inconsistencies and incorrect information
• Make the entries clear, legible, and comprehensive
• Prepare data for analysis

What to check when editing data?
• Check for duplicate entries
• Check that the total for each variable is the same as the sample size
• For qualitative data, check that the categories are consistent with what is specified in the coding manual
• For quantitative data, check that the minimum and maximum are logical given the possible values of the variable
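The four checks above can be sketched in a few lines. The records, the allowed sex codes, the age range, and the sample size of 4 are hypothetical and chosen only so that each check finds something.

```python
from collections import Counter

# Hypothetical encoded records and coding manual values (assumptions).
SAMPLE_SIZE = 4                 # number of respondents actually surveyed
ALLOWED_SEX_CODES = {1, 2}      # 1 = male, 2 = female (assumed codes)
AGE_RANGE = (0, 120)            # plausible minimum and maximum, in years

records = [
    {"id": 1, "sex": 1, "age": 34},
    {"id": 2, "sex": 2, "age": 27},
    {"id": 2, "sex": 2, "age": 27},   # duplicate entry
    {"id": 3, "sex": 3, "age": 45},   # sex code not in the coding manual
    {"id": 4, "sex": 1, "age": 250},  # age outside the logical range
]


def duplicate_ids(rows):
    """Return the ids that appear more than once."""
    counts = Counter(row["id"] for row in rows)
    return sorted(i for i, c in counts.items() if c > 1)


def total_matches_sample_size(rows, variable, sample_size):
    """True when the number of recorded values equals the sample size."""
    recorded = sum(1 for row in rows if row.get(variable) is not None)
    return recorded == sample_size


def invalid_categories(rows, variable, allowed):
    """Return ids whose code is not listed in the coding manual."""
    return [row["id"] for row in rows if row[variable] not in allowed]


def out_of_range(rows, variable, low, high):
    """Return ids whose value falls outside the logical minimum and maximum."""
    return [row["id"] for row in rows if not low <= row[variable] <= high]


print(duplicate_ids(records))                                  # [2]
print(total_matches_sample_size(records, "sex", SAMPLE_SIZE))  # False
print(invalid_categories(records, "sex", ALLOWED_SEX_CODES))   # [3]
print(out_of_range(records, "age", *AGE_RANGE))                # [4]
```

Here the total check fails because the duplicate row pushes the count to 5 for a sample of 4, so removing duplicates first is a sensible order of work.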
