
BIOSTAT LESSON 2 – Descriptive Statistics

Descriptive Statistics
- used to describe the basic features of the data in a study
- provide simple summaries about the sample and the measures
- present quantitative description in a manageable form
- help simplify large amounts of data in a sensible way

Types of Data Presentation
1. Narrative or Textual Method – data is simply narrated, story fashion, and is the most basic way of presenting data

disadvantages:
• can be boring if too long
• not applicable for very large data sets
• difficult to observe trends
• difficult to make comparisons and see interrelationships

2. Tabular Presentation – provides a more compact way of presenting large sets of detailed information
- it complements textual presentation, which can present highlights

guidelines in the construction of a table:
1. must be self-explanatory: clear, direct, simple
2. must be clear and descriptive: gives information about what, where, how and when the data were taken
e.g. (obscure) dengue cases
(better) increasing dengue cases
(too verbose) increasing cases of dengue in Central Visayas for 2019
3. each characteristic may be summarized and compared separately by using percentage or any appropriate procedure
4. if there is more than one kind of information, several columns must be constructed and properly labelled
5. footnotes (if necessary) should be placed at the bottom

parts of the table (7):
• table number
• title
• column headings
• row headings or stubs
• body
• footnotes
• sources of data

3. Graphical Presentation – more effective tool in data presentation than tables
- can show trends/patterns in large sets of data which could be missed in table form
- comparisons can be made
- more appealing to observers than table form

Types of Graphs (9):
a. Line Graph – primarily to portray trends
b. Bar Graph – consists of vertical or horizontal bars of equal width representing rates or frequencies, with lengths proportional to their values
- may be drawn vertically or horizontally depending on available space and/or numbers of categories or groupings of the variables being depicted
- most appropriate for comparing data taken at a particular time
c. Pie Chart or Circle Graph – appropriate for comparing the parts with the whole (100%) and thus is used to show how a whole is divided into its components, through the use of wedge-shaped figures
d. Pictograph – actual pictures or facsimiles (exact copy, especially of written or printed material) of the objects under study are used to represent values
- each figure is considered a unit representing a definite number
e. Component Bar Diagram – alternative to a pie chart and used to show how a whole is made of its parts
- bar representing the whole is divided into smaller rectangles representing the parts, proportional to their relative contribution to the whole
- preferably used in situations where the compositions of two or more different groups are combined
e.g. Gender Wise Frequency Distribution of Study Subjects According to the Places of Their Previous BP Check Up
f. Histogram – series of columns, each having as its base one class interval and the frequency or number of cases in that class as its height
- graphical representation of the frequency distribution of a continuous quantitative variable, including age
g. Frequency Polygon – advantageous if two or more distributions are to be depicted in a single graph
h. Scatterplot or Scatterpoint Diagram – shows the relationship between two quantitative variables
- gives a rough estimate of the type and degree of correlation between the variables
- used as a preliminary step to more detailed mathematical analysis

4. Frequency Distribution
Frequency – as applied in research, is the response of respondents on a given research problem
Frequency distribution – the process of organizing and classifying the data into the desired form

e.g. the entrance exam scores of the 20 incoming 1st year medical technology students of Section D in Velez College

MEASURES OF CENTRAL TENDENCY VALUES
- focus on the concentration of values in a set of data
- generally located towards the middle or center of the distribution where most of the data tend to be concentrated

Definition of Terms:
Population – collection of all cases the researcher is interested in in the study; the entity the researcher wishes to understand
Sample – portion or subset of the population from which the information is gathered; representation of the population
Parameter – numerical characteristic of a population

a. Mean: average or norm
b. Median: middle value
c. Mode: most frequent value
d. Range: difference between the lowest and highest value

Common Measures of Central Tendency:

Mean (arithmetic average) – sum of the values in the data group divided by the number of values

When to use Mean as a Measure of Central Tendency:
1. when the distribution consists of interval or ratio data that have no extreme values (too high or too low in comparison with the other scores in the set)
2. when other statistics like SD, coefficient of correlation, etc. are to be computed later
3. when the distribution is normal or not greatly skewed, the mean is usually preferred to either median or mode
4. when used to compare 2 or more sets of data where variations of values between or among the sets follow the same pattern, or the distributions have the same characteristics

Median – middlemost, the centermost, or the midpoint value in a distribution
- a point on the scale of measurements that divides a series of ranked observations into halves, such that half of the observations fall above it and the other half fall below it
- most representative value if the distribution is grossly asymmetrical or skewed

When to use the median as a Measure of Centrality:
1. when the values of measurements are in ordinal or in ranked scale
2. when the exact midpoint of the distribution is wanted, the 50% point
3. when there are extreme values or open-ended distribution which would markedly affect the mean; extreme values do not disturb the median

Formula:
Median = value of the ((n + 1) / 2)th observation in the ranked data
n = # of observations

Mode – observation which occurs the most often in a set of values
- a measure of central tendency for nominal data (unimodal, bimodal, multi-modal)
- quite possible for a set of data to have no mode when all values observed occur with equal frequency

When to use the mode as a Measure of Centrality:
1. when a quick and approximate measure of central value is all that is wanted
2. most appropriate value when we wish to know the most typical value
3. when the data are in nominal scale or categorical in nature

CONSIDERATIONS (MEASURES OF CENTRAL TENDENCY)
1. Stability of measures: the measures vary with respect to their consistency and stability from one sample to another, thus the mean is considered the most satisfactory and the mode the least satisfactory
2. Data manipulation: of the three, only the MEAN is determined by the algebraic value of every score, making it useful in higher order statistical computation
3. Time factors: the Mode can be determined easily, unlike the mean and median, which need computation
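
The three measures of central tendency described in these notes can be sketched in Python as below. This is only an illustration: the `scores` list is hypothetical (it is not the entrance exam data mentioned in the notes), and `modes` and `median_position` are helper names made up for this sketch.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return every most-frequent value, or [] when all values occur equally often."""
    counts = Counter(values)  # frequency of each distinct value
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # no mode: every value has the same frequency
    return sorted(v for v, c in counts.items() if c == highest)


def median_position(n):
    """Rank of the median in the ordered data: (n + 1) / 2."""
    return (n + 1) / 2


# Hypothetical scores, used only for illustration.
scores = [70, 75, 75, 80, 82, 85, 90]

print("mean:", mean(scores))
print("median:", median(scores), "at rank", median_position(len(scores)))
print("mode(s):", modes(scores))

# One extreme value pulls the mean much more than the median.
with_extreme = scores + [200]
print("mean with extreme value:", mean(with_extreme))
print("median with extreme value:", median(with_extreme))
```

Adding the extreme value 200 moves the mean far more than the median, which is why the median is preferred when a distribution has extreme values.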

MEASURE OF DISPERSION OR VARIABILITY


- describes how far apart the observations are
- values that indicate the degree of dispersion or scatter in the distribution of values
- identify whether the group is homogeneous or heterogeneous

Range – simplest measure of variability, which is the difference between the highest and the lowest values in a set of observations
- range = highest observation - lowest observation

Variance – average of the squared deviations from the mean
- the need to square the deviations arises from the fact that the sum of the deviations from the mean is equal to zero
- population variance = Σ(x - mean)² / n
- *for ungrouped data or sample variance, use n-1 as denominator

Coefficient of Variation – expresses the SD as a percent of the mean
- CV = (SD / mean) x 100

Standard Deviation – measures how far the values are from the mean
- square root of the variance
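
The measures of dispersion above can be sketched in a few lines of Python. The `data` list is hypothetical, and the function names and the `sample` flag (which switches the denominator from n to n-1) are choices made for this sketch, not part of the lesson.

```python
from math import sqrt


def value_range(values):
    """Highest observation minus lowest observation."""
    return max(values) - min(values)


def variance(values, sample=False):
    """Average squared deviation from the mean; n-1 denominator when sample=True."""
    n = len(values)
    m = sum(values) / n
    squared_deviations = sum((x - m) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def standard_deviation(values, sample=False):
    """Square root of the variance."""
    return sqrt(variance(values, sample))


def coefficient_of_variation(values, sample=False):
    """SD expressed as a percent of the mean: (SD / mean) x 100."""
    m = sum(values) / len(values)
    return standard_deviation(values, sample) / m * 100


# Hypothetical observations, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
data_mean = sum(data) / len(data)

# The raw deviations cancel out, which is why they are squared.
print("sum of deviations:", sum(x - data_mean for x in data))
print("range:", value_range(data))
print("population variance:", variance(data))
print("sample variance:", variance(data, sample=True))
print("SD:", standard_deviation(data))
print("CV (%):", coefficient_of_variation(data))
```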

NORMAL DISTRIBUTION CURVE (watch lecture vid)


- bell-shaped curve when the values are plotted on the x-axis and the frequency on the y-axis
- half (50%) of the values are less than the mean and half (50%) are greater than the mean

 68.26% of the values belong to +/-1SD


 95.44% of the values belong to +/- 2SD
 99.72% of the values belong to +/-3SD

ex. Total observations = 287
287 x 68% ≈ 195 observations fall within +/-1SD
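
As a rough check on these percentages, the share of a normal distribution lying within k SD of the mean can be computed from the error function in Python's standard library. The function name `share_within` is made up for this sketch; the total of 287 comes from the example above.

```python
from math import erf, sqrt


def share_within(k):
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    return erf(k / sqrt(2))


total = 287  # total observations from the example

for k in (1, 2, 3):
    share = share_within(k)
    print(f"+/-{k}SD: {share:.2%} of values, about {round(total * share)} of {total}")

# The worked example uses the rounded figure of 68%.
print("287 x 68% =", round(total * 0.68))
```

The exact shares come out to about 68.27%, 95.45%, and 99.73%, so figures quoted in lecture notes may differ in the last decimal place depending on rounding. With the rounded 68%, 287 observations give about 195 within +/-1SD; the unrounded share gives closer to 196.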

Confidence Limit is within +/- 2SD
(confidence limits show how accurate an estimation of the mean is, or is likely to be; they show the range within which the true value is likely to fall)

Outliers = values beyond +/-3SD
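
A minimal sketch of this outlier rule in Python is shown below. The `readings` list is hypothetical, `find_outliers` is a name made up for this sketch, and the population SD (`statistics.pstdev`) is used as an assumption; the sample SD could be substituted.

```python
from statistics import mean, pstdev


def find_outliers(values, k=3):
    """Return the values lying more than k SD away from the mean."""
    m = mean(values)
    sd = pstdev(values)  # population SD (assumption made for this sketch)
    if sd == 0:
        return []  # all values identical, so nothing can be an outlier
    return [x for x in values if abs(x - m) > k * sd]


# Hypothetical readings: twenty typical values and one extreme value.
readings = [48, 50, 52, 49, 51] * 4 + [120]

print("outliers beyond +/-3SD:", find_outliers(readings))
```

With very small data sets no value can lie 3 SD from the mean, so this rule is only informative when there are enough observations.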
