Collection of data.
Statistical Analysis:
Statistical analysis is concerned with making sense of data to reach valid
conclusions or inferences that enable us to make wise decisions in the
face of uncertainty.
There are two phases (types) of statistical analysis: the descriptive and
the inferential (analytic) phases.
• Classification (types) of Variables:
The first step in any statistical analysis is to identify
the type of data (variables) you have.
• There are three ways of classifying variables:
• Quantitative vs Qualitative.
• Continuous vs Discrete.
• Dependent vs Independent.
1- Quantitative versus Qualitative
Quantitative (Numerical, Metric) Variables:
• They are variables that yield measurements for which the value has numerical
meaning.
Height (1.83, 1.74… m), weight (48.72, 65.83… kg), temperature, time, blood pressure, and blood sugar level
are commonly used continuous variables.
Discrete Variables:
• A discrete variable is one that can take values only at specific points along its scale of measurement.
3- Dependent versus Independent
Dependent / Response.
Variable of primary interest (e.g. blood pressure in an antihypertensive drug trial).
Not controlled by the experimenter.
Independent / Predictor.
When an experiment is conducted, some variables are manipulated by the
experimenter (independent variables) and the effects of these are measured on a
response variable (dependent variable).
An easy way to distinguish the independent variables in an experiment is to ask
the question, “What would be the effect of (the independent variable) on (the
dependent variable)?” For example, a new statin drug may be given to patients to
see if it lowers their cholesterol levels. Here the drug is the independent
variable and the cholesterol level is the dependent variable.
Scales of Measurement
(Scales used to measure variables)
• Nominal - categories only.
• Ordinal - categories with some order.
• Interval - differences but not ratios (no natural starting point, i.e. no true zero).
• Ratio - differences and ratios (a natural starting point, i.e. a true zero).
Meaning of the measurement scales:
• Nominal - Is A different from B? Examples: marital status, eye colour, gender, religion.
• Ordinal - Is A bigger than B? Examples: stage of disease, severity of pain, level of satisfaction.
• Interval - By how many units do A and B differ? Examples: temperature, calendar date, IQ test.
• Ratio - How many times bigger than A is B? Examples: distance, height, weight.
Graphical presentation of data
(Graphs or Charts)
- Pie chart.
- Bar chart.
- Histogram.
- Frequency polygon.
- Scatter diagram.
- Box plot.
Pie chart:
Histogram:
• A histogram looks like a bar chart used with discrete data, except that each bar
represents an interval (category or class) of possible values rather than a single
value, and there are no gaps between adjacent bars. This emphasizes the continuous nature
of the underlying variable.
• The width of each bar represents the interval of its category, and the total area of each
bar is proportional to the corresponding frequency or percentage.
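To make the idea of interval-based bars concrete, here is a minimal Python sketch that counts how many values fall in each class interval and prints a text histogram. The function name `histogram_counts`, the sample heights, and the choice of 10 cm intervals are all illustrative assumptions, not part of the original slides.

```python
def histogram_counts(values, start, width, n_bins):
    """Count the values falling in each half-open interval [lo, lo + width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)  # index of the interval containing v
        if 0 <= idx < n_bins:            # ignore values outside the covered range
            counts[idx] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]


# Hypothetical heights in cm, invented for illustration only.
heights = [152, 158, 161, 163, 165, 166, 168, 171, 174, 183]

for lo, hi, count in histogram_counts(heights, start=150, width=10, n_bins=4):
    # Adjacent intervals share a boundary, so the bars have no gaps between them.
    print(f"{lo}-{hi}: {'#' * count} ({count})")
```

Because every interval here has the same width, the bar length (and hence its area) is proportional to the frequency of that class.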
Frequency Polygon
Scatter Diagrams
• The relationship between two variables can be shown graphically in a
scatter diagram, as shown in Fig. A scatter diagram is a graph in which each
individual or unit measured is entered as a point, the position of each point
being determined by the values for the two characteristics measured.
Box-plot
• Boxplots provide a graphical summary of a distribution based on the three quartile
values, the minimum and maximum values, and any outliers. Like the pie chart, a
boxplot can represent only one variable at a time, but several boxplots can be set
alongside each other, which makes them useful for comparing large sets of data.
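The quantities a boxplot draws can be computed directly. The sketch below uses Python's standard `statistics.quantiles` to obtain the three quartiles, then flags outliers with the common 1.5 × IQR convention. That outlier rule is an assumption here (the slides do not say which rule they use), and the data and function names are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def iqr_outliers(values):
    """Flag values beyond 1.5 * IQR from the quartiles (an assumed convention)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements, invented for illustration only.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]
print(five_number_summary(data))
print(iqr_outliers(data))
```

Other quartile definitions (for example `method="exclusive"`) can give slightly different values on small samples, so the numbers may not match every textbook or statistics package.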
Normal Distribution Curve
• mean=median=mode
• Symmetry about the center
• 50% of the values less than the mean and 50%
greater than the mean
The Standard Deviation:
• It is a measure of how spread out values are.
• 68% of values are within 1 standard deviation of the mean.
• 95% of values are within 2 standard deviations of the mean.
• 99.7% of values are within 3 standard deviations of the mean.
Why do we need to know Standard
Deviation?
• Any value is
– likely to be within 1 standard deviation of the
mean
– very likely to be within 2 standard deviations
– almost certainly within 3 standard deviations
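The "likely / very likely / almost certainly" wording corresponds to the 68%, 95% and 99.7% figures. As a quick check, the sketch below computes the share of a normal distribution lying within k standard deviations of the mean using `statistics.NormalDist` from the Python standard library. The helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The result does not depend on the particular mean or standard deviation, which is why the same three percentages apply to any normally distributed variable.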
LET’S RECAP!
The properties of a normal distribution:
• It is a bell-shaped curve.
• It is symmetrical about the mean, μ. (The mean, the mode and the median all have the same value).
• The total area under the curve is 1 (or 100%).
• 50% of the area is to the left of the mean, and 50% to the right.
• Approximately 68% of the area is within 1 standard deviation, σ, of the mean.
• Approximately 95% of the area is within 2 standard deviations of the mean.
• Approximately 99.7% of the area is within 3 standard deviations of the mean.
LET’S PRACTICE!
WE DO: 95 % of students at school are
between 1.1 m and 1.7 m tall.
• Assuming this data is normally distributed can
you calculate the mean and standard
deviation?
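One way to work this out, assuming the quoted interval is symmetric about the mean and that "95%" corresponds to 2 standard deviations either side of it, is to treat the two heights as μ - 2σ and μ + 2σ:

```latex
\begin{aligned}
\mu - 2\sigma &= 1.1, \qquad \mu + 2\sigma = 1.7 \\
\mu &= \frac{1.1 + 1.7}{2} = 1.4 \text{ m} \\
\sigma &= \frac{1.7 - 1.1}{4} = 0.15 \text{ m}
\end{aligned}
```

The mean is the midpoint of the interval, and the interval spans four standard deviations in total, which gives the divisor of 4.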
YOU DO: 68% of mothers’ own
children between 4 and 6 years old.
Assuming this data is normally distributed can you
calculate the mean?
WE DO: The reaction times for a hand-eye coordination test
administered to 1800 teenagers are normally distributed with a
mean of 0.35 seconds and a standard deviation of 0.05
seconds.
• Represent this information on a bell curve:
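A bell curve for this question is usually labelled at the mean and at 1, 2 and 3 standard deviations either side of it. The following sketch computes those axis labels. The function name `bell_curve_ticks` is a hypothetical helper, and rounding to two decimal places is an assumption made to keep the labels tidy.

```python
def bell_curve_ticks(mu, sigma, k_max=3):
    """Axis labels from mu - k_max*sigma to mu + k_max*sigma, one per standard deviation."""
    return [round(mu + k * sigma, 2) for k in range(-k_max, k_max + 1)]


# Mean 0.35 s and standard deviation 0.05 s, as given in the question.
ticks = bell_curve_ticks(0.35, 0.05)
print(ticks)
```

The middle label is the mean (0.35 seconds), and about 68%, 95% and 99.7% of the reaction times fall between the first, second and third pairs of labels around it.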