Data Sources
Primary
Survey
Observation
Experimental
Secondary
Existing Records
Registry
Census
Hospitals, Clinics, Laboratories and Physician’s Offices
Primary Data Sources
o Advantages:
The investigator collects data specific to the problem under study.
There is no doubt about the quality of the data collected (for the investigator).
If required, it may be possible to obtain additional data during the study period.
o Disadvantages:
The investigator has to contend with all the hassles of data collection: deciding why, what, how, and when to collect; getting the data collected (personally or through others); getting funding and dealing with funding agencies; and handling ethical considerations (consent, permissions, etc.).
The investigator must ensure the data collected is of a high standard: all desired data is obtained accurately and in the required format; there is no fake or fabricated data; and unnecessary data has not been included.
Cost of obtaining the data is often the major expense in studies.
Secondary Data Sources
o Advantages:
No hassles of data collection
It is less expensive
o Disadvantages:
Data collected in one location may not be suitable for another due to variable environmental factors.
With the passage of time, the data becomes obsolete.
Secondary data can distort the results of the research. Special care is required to amend or modify secondary data before use.
Data Collection
o Census
o Survey
o Observation
o Experiment
o Simulations
o Review of documents and records
Survey
o To query (someone) in order to collect data for the analysis of some aspect of a group or area (Merriam-Webster)
o Solicits information from people
o Steps:
Interview
Verbal communication between the researcher and the participant, during which
information is collected
Types:
UNSTRUCTURED
i. More conversational
ii. Allows flexibility in questioning
STRUCTURED
i. Operates within a formal interview schedule
ii. Order of questions is designed prior to the interview
Key Informant Interview
One-on-one interview with a point person
Focus Group Discussion
Small group of people interviewed at the same time to discuss specific topics under the
guidance of a moderator
Questionnaire
A series of questions designed to collect information
Most common type of instrument used
Typically filled out by participants
Types:
i. Open Ended
Can elicit more detailed responses
Responses require more effort to encode for data analysis
ii. Closed Ended
Easy to administer
Uniform and pre-coded
Can be encoded and analyzed in a short time
Observation
Experiment
o Apply a treatment and observe the response
Control group (a group receiving no treatment or a placebo)
Used to compare the effectiveness of a treatment
Simulation
o Uses a mathematical, physical, or computer model to replicate the conditions of a process or situation
o Frequently used when the actual situation is too expensive, dangerous, or impractical to replicate in real
life
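As a minimal sketch of the idea, the Python snippet below uses a computer model (random number draws) to estimate a probability instead of observing the real process. The scenario (a clinic with a 30% no-show rate and 10 booked patients) and all function names are hypothetical and chosen only for illustration.

```python
import random

def simulate_no_shows(n_patients, no_show_rate, rng):
    """Simulate one clinic day; return how many booked patients do not show up."""
    return sum(1 for _ in range(n_patients) if rng.random() < no_show_rate)

def estimate_prob_at_least(k, n_patients, no_show_rate, n_runs=10_000, seed=42):
    """Estimate P(at least k no-shows) by repeating the simulated day many times."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(
        1 for _ in range(n_runs)
        if simulate_no_shows(n_patients, no_show_rate, rng) >= k
    )
    return hits / n_runs

# Hypothetical inputs: 10 booked patients, 30% chance each one is a no-show.
p = estimate_prob_at_least(k=5, n_patients=10, no_show_rate=0.30)
print(f"Estimated P(at least 5 no-shows out of 10): {p:.3f}")
```

Increasing `n_runs` makes the estimate more stable at the cost of running time, which is usually still far cheaper than observing the real situation.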
Review of Records
o Collection of data from existing records using an abstraction form
o Examples:
Hospital or facility records
Computer databases
Government reports
Census data
Questionnaire
o Advantages:
Can assess a large group quickly
Easy to analyze if constructed correctly
o Disadvantages:
Requires a “good” language
Social desirability bias
Not very good in getting in-depth information
Interview
o Advantages:
Best when you want to know what people think, believe, or perceive
o Disadvantages:
Recall bias
Social desirability bias
Review of Records
o Advantages:
Relatively inexpensive
Faster than collecting the original data again
o Disadvantages:
Coding errors (missing or incomplete data)
Data may not be exactly what is needed
Difficulty in getting access
Needs to verify the validity and reliability of data
Observation
o Advantages:
Relatively inexpensive
Collects data on actual vs. self-reported behavior or perceptions
o Disadvantages:
Hawthorne effect
Observer bias
Can be labor intensive
Timeliness
o Data are up to date (current)
o Information is available on time
Validity
o Data measure what they are intended to measure
Reliability
o Data are measured and collected consistently according to standard definitions and methodologies
o Results are the same when measurements are repeated
Completeness
o All data elements are included (as per the definition and methodologies specified)
Precision
o Repeatability
o Data have sufficient detail
Integrity
o Data can be relied on to draw valid conclusions
o Data are protected from deliberate bias or manipulation for political or personal reasons
Data Processing
o Systematic procedure to ensure that the information/data gathered are complete, consistent and
suitable for data analysis (1: Data Coding, 2: Data Encoding & 3: Data Editing)
o Data Coding
Transforming collected information/observation into numbers (cohesive categories) which can
be more easily encoded, counted and tabulated
Allows rapid storage of data
Minimizes errors in encoding data
Sometimes necessary so that the statistical software can perform various analyses on the data
Guidelines: number of codes must be kept to minimum (preferably <8) and it should be
Exhaustive and Mutually Exclusive
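The coding guidelines above can be sketched in Python as follows. The codebook (civil status categories and their numbers) is a hypothetical example, not one given in these notes. Each answer maps to exactly one code (mutually exclusive), and a catch-all code keeps the scheme exhaustive.

```python
# Hypothetical codebook for a closed-ended question on civil status.
CIVIL_STATUS_CODES = {
    "single": 1,
    "married": 2,
    "widowed": 3,
    "separated": 4,
}
OTHER_CODE = 9  # catch-all code so every possible answer has a place (exhaustive)

def code_response(raw):
    """Return the single numeric code for one raw response."""
    cleaned = raw.strip().lower()  # normalize spacing and letter case first
    return CIVIL_STATUS_CODES.get(cleaned, OTHER_CODE)

responses = ["Single", " married", "WIDOWED", "divorced"]
coded = [code_response(r) for r in responses]
print(coded)  # [1, 2, 3, 9]
```

Keeping the codebook in one dictionary also documents the scheme for whoever encodes or analyzes the data later.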
o Data Encoding
Entering of data in a spreadsheet
Use computer programs for encoding
o Data editing
Inspection and correction of any errors or inconsistencies in the information collected
Purpose:
To make changes/corrections as early as possible
To ensure completeness, consistency and legibility of data entries
To prepare the data for analysis
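A minimal sketch of data editing in Python, assuming each encoded record is a dictionary. The field names, the valid codes, and the 0 to 120 age range are hypothetical choices made only for this illustration.

```python
REQUIRED_FIELDS = ("id", "age", "sex")  # hypothetical data elements
VALID_SEX_CODES = {1, 2}                # hypothetical codes: 1 = male, 2 = female

def find_problems(record):
    """Return a list of problems found in one encoded record."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # Consistency: values must fall within the allowed range or code list.
    age = record.get("age")
    if isinstance(age, (int, float)) and not 0 <= age <= 120:
        problems.append(f"age out of range: {age}")
    sex = record.get("sex")
    if sex not in (None, "") and sex not in VALID_SEX_CODES:
        problems.append(f"invalid sex code: {sex}")
    return problems

records = [
    {"id": 1, "age": 34, "sex": 1},
    {"id": 2, "age": 340, "sex": 2},   # likely a typing error
    {"id": 3, "age": None, "sex": 5},  # missing age and an invalid code
]
for rec in records:
    print(rec["id"], find_problems(rec))
```

The function only reports problems. Corrections are left to the editor, who can go back to the original questionnaire or record.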
Data Analysis
o The process of evaluating data using analytical and statistical tools to discover useful information
Objective of Analysis
o Relationship tests
Test for the significance of the relationship of variables
o Difference tests
Test for the significance of differences in the groups being compared
Level of measurement of the variable
o Parametric tests
Make assumptions about the parameters of the population distribution(s) from which the data
are drawn
o Non-Parametric tests
Make no assumptions about the parameters of the population distribution(s) from which the
data are drawn
Study Design
o Number of groups to be compared
o Whether the samples are independent or related
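To illustrate how these considerations combine when choosing a difference test, here is a small Python lookup. The test names are commonly taught pairings and are an assumption added for illustration; they are not listed in these notes, and the final choice should still be checked against the data and a statistician's advice.

```python
# Commonly taught difference tests, keyed by
# (number of groups, samples related?, parametric assumptions met?).
# This table is an illustrative assumption, not taken from the notes.
DIFFERENCE_TESTS = {
    ("two", False, True): "independent-samples t-test",
    ("two", False, False): "Mann-Whitney U test",
    ("two", True, True): "paired t-test",
    ("two", True, False): "Wilcoxon signed-rank test",
    ("three_or_more", False, True): "one-way ANOVA",
    ("three_or_more", False, False): "Kruskal-Wallis test",
    ("three_or_more", True, True): "repeated-measures ANOVA",
    ("three_or_more", True, False): "Friedman test",
}

def suggest_difference_test(n_groups, related, parametric):
    """Suggest a test from the study design and the parametric/non-parametric choice."""
    if n_groups < 2:
        raise ValueError("a difference test needs at least two groups")
    size = "two" if n_groups == 2 else "three_or_more"
    return DIFFERENCE_TESTS[(size, related, parametric)]

print(suggest_difference_test(n_groups=2, related=False, parametric=True))
print(suggest_difference_test(n_groups=3, related=False, parametric=False))
```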
Data Presentation
The method of summarizing, organizing and communicating information using a variety of tools
Purpose:
o Display data clearly and effectively
o Summarize large quantities of data for the reader
o Facilitate analysis of trends, comparisons or relationships between variables
Methods of Data Presentation
o Tabular Method
o Graphical Method
Guidelines in Table Construction
o All tables must be simple, direct and clear.
o It should appear immediately after the text where it is first cited.
o All tables should have a uniform style.
o Categories must be mutually exclusive.
o The unit of measurement should be well defined.
o Ideally, limit only to 3 or 4 variables per table.
o If the observations are large in numbers, they can be broken into 2 or 3 tables.
o Tables should be self-explanatory
Parts of Statistical Table
o Title
Explanatory
Gives a clear and concise description of the data
Answer the following questions:
What
Who
Where
When
o Box Head (Column Heading)
Indicates the basis of classification of the columns or vertical series
o Stubs (Row Heading)
Indicates the basis of classification in rows or horizontal series
o Body
main part of the table (composed of cells)
Figures within the cell should be aligned
Consistency in the number of decimal places
Align all plus, minus, and plus-minus signs
Empty cells should be indicated with a zero (0) or hyphen (-)
Should be uniform in terms of decimals
Include parenthesis for sign
Can add space instead of a symbol to avoid problems having to translate between languages
o Foot Notes
Appear immediately below the body of the table
Designated by letters instead of numbers
Provides additional information that cannot be easily understood from the title, box head or
stub
o Source of data
Exact reference of the information
Includes the information about compiling agency, publication, etc.
Source should not be placed as a footnote to the page
Master Table
Dummy Table
Skeleton tables that give a preview of what table outputs may be expected from the study
Purpose:
o Help researcher clarify instrument
o Help protocol reviewer
o Help statistician/computer programmer
Frequency Distribution Table
Shows either the actual number of observations falling in each range or the percentage of observations
Parts of Frequency distribution table
o Class interval
Width of class distribution
o Frequency
Record the number of times a result appears in class interval
o Cumulative frequency
Add the cumulative frequency of the previous row to the frequency of the current row
o Percentage
List the percentage of the frequency in each class interval
o Cumulative percentage
Add the cumulative percentage of the previous row to the percentage of the current row
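The parts listed above can be computed with a short Python sketch. It assumes whole-number data and equal-width classes. The ages and the class limits are made-up values for illustration, and the cumulative columns are running totals down the table.

```python
def frequency_table(values, lower, width, n_classes):
    """Build one row per class: interval, frequency, cumulative frequency,
    percentage, and cumulative percentage (equal-width classes)."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - lower) // width)  # which class the value falls into
        if 0 <= idx < n_classes:         # values outside all classes are skipped
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the class intervals")
    rows, cum_freq = [], 0
    for i, freq in enumerate(counts):
        cum_freq += freq                 # running total of frequencies
        start = lower + i * width
        rows.append({
            "interval": f"{start}-{start + width - 1}",
            "frequency": freq,
            "cum_frequency": cum_freq,
            "percent": round(100 * freq / total, 1),
            "cum_percent": round(100 * cum_freq / total, 1),
        })
    return rows

ages = [21, 25, 28, 30, 33, 35, 36, 41, 44, 52]  # hypothetical ages
table = frequency_table(ages, lower=20, width=10, n_classes=4)
for row in table:
    print(row)
```

The cumulative percentage is computed from the cumulative frequency rather than by adding rounded percentages, which avoids small rounding drift in the last row.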
Type of Graphs
Guidelines
o Should be self-explanatory
o Source should be cited if data is secondary
o Title may be placed at the top or bottom of the graph
o Vertical and horizontal scales should be properly labeled
o Properly identify trend lines or curves with labels or legend
o Frequencies are placed on vertical axis while basis for classifications is on the horizontal
o Vertical scale should always start at zero
o Use colors or degrees of shading for emphasis or to differentiate between
Pie chart
o Describe how a whole is divided into parts/slices
o Show the percentage of the total number of observations falling into each category of a qualitative variable
Bar Graph
o Compare data between different categories
o Qualitative variable or discrete quantitative variable
o Height of each bar is proportional to its value
o Bars should be of equal width and separated by gaps
Horizontal Bar Graph
o Qualitative variable
Vertical Bar Graph
o Discrete Quantitative Variable
Component Bar Diagram
o Compare the compositions of two or more different groups as opposed to pie chart
o Qualitative data
o Each bar shows how a whole is made up of its component parts
Histogram
o Represents frequency distribution of continuous quantitative variables
o Horizontal axis shows the unit of measurement
o Vertical scale gives the frequencies
o The area of each rectangle is proportional to the frequency of its class
Frequency Polygon
o Displays the frequency of continuous quantitative variable
o Advantageous when two or more distributions are depicted in a single graph
o Frequencies are plotted against the corresponding midpoints of the classes
Line Graph/ Line Diagram
o Time series or time charts
o Quantitative variable over a period of time
o Intended to show trends or changes in the variable with time
Scatterplot
o Relationship between two quantitative variables
o A graph in which the values of two variables are plotted along two axes
Box Plot
o Shows skewness of data through the position of the median within the box and the relative lengths of the whiskers
o Useful for showing description of large quantitative data including range, quartiles, spread, shape, tail
lengths and outliers
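The quantities a box plot displays can be computed with Python's standard library, as sketched below. The data values are made up, and flagging outliers beyond 1.5 times the interquartile range is a common convention assumed here, not a rule stated in these notes. Quartile values also vary slightly between calculation methods; this sketch uses the "inclusive" method of `statistics.quantiles`.

```python
import statistics

def box_plot_summary(values):
    """Return the five-number summary, the IQR, and outliers by the 1.5 x IQR rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1                      # spread of the middle half of the data
    low_fence = q1 - 1.5 * iqr         # assumed convention for flagging outliers
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "iqr": iqr,
        "outliers": outliers,
    }

scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data with one extreme value
summary = box_plot_summary(scores)
print(summary)
```

In this example the single large value is flagged as an outlier, which is the kind of long upper tail a box plot makes visible.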