
Data distribution 1

• The three basics of statistics are:
1. Variability: making sense of variation
2. Inference: making generalizations
3. Probability: working with proportion and chance
Inferential Analysis
• Inferential analysis uses statistical tests to see
if a pattern we observe is just due to chance or
is due to the program or intervention effects.
• Research often uses inferential analysis to
determine if there is a relationship between
an intervention and an outcome, as well as
the strength of that relationship.
• One of the first steps in inferential analysis is
to answer the question: what does the
distribution of the data look like?
• The type of test you choose will be guided by
the distribution of the data.
• Distributions fall into two categories, normal
and non-normal.
• Always check the distribution of your data
before beginning inferential analysis.
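
One quick way to see what a distribution looks like is to group the values into class intervals and print a rough text histogram. The sketch below is only an illustration: the marks are invented, and the helper names `frequency_table` and `print_histogram` are hypothetical, not taken from the slides.

```python
from collections import Counter

def frequency_table(values, class_width):
    """Count how many values fall into each class interval of the given width."""
    counts = Counter((v // class_width) * class_width for v in values)
    return dict(sorted(counts.items()))

def print_histogram(table, class_width):
    """Print one row of '#' marks per non-empty class interval."""
    for lower, count in table.items():
        upper = lower + class_width - 1
        print(f"{lower:>3}-{upper:<3} {'#' * count}")

# Hypothetical exam marks, invented purely for illustration.
marks = [42, 55, 58, 61, 63, 64, 66, 67, 68, 70, 71, 72, 74, 77, 79, 85, 93]

table = frequency_table(marks, class_width=10)
print_histogram(table, class_width=10)
```

Empty intervals are simply not printed in this sketch, so gaps in the data have to be read off the interval labels.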
Data Distribution Shapes
• Normal distribution curve
• Skewed distribution curves
• Probability distribution

The Histogram and normal curve
(Figure: histogram of marks of Q2 with a normal curve; vertical axis in %)

Histogram of Q1, this time for a sample of 5,000 students:
notice how the shape of the histogram is more defined than
with the previous sample of 380 students.
Non-Normal Distributions
Skewed
(Figures: positively skewed data; negatively skewed data)
Normal Curve
• Symmetric (Right and left sides are mirror
images)
– Left tail looks like right tail
– Mean = Median = Mode

(Figure: normal curve with the mean, median and mode at the same central point)
Skewed Distribution
• Right skewed (positively skewed)
– Long right tail
– Mean > Median

(Figure: right-skewed curve; along the axis, the mode, then the median, then the mean)
Shapes of Distributions
• Left skewed (negatively skewed)
– Long left tail
– Mean < Median

(Figure: left-skewed curve; along the axis, the mean, then the median, then the mode)
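
The mean-versus-median comparisons on the skewed-distribution slides can be tried on small samples. This is a minimal sketch assuming the rule of thumb stated above (mean > median for right skew, mean < median for left skew); the function name `skew_direction`, the `tol` parameter and the three samples are hypothetical.

```python
from statistics import mean, median

def skew_direction(values, tol=0.0):
    """Classify skew by comparing mean and median (a rough rule of thumb only)."""
    m, md = mean(values), median(values)
    if abs(m - md) <= tol:
        return "symmetric"
    return "right (positive) skew" if m > md else "left (negative) skew"

# Hypothetical samples, invented for illustration.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]        # one large value pulls the mean up
left_tailed = [1, 17, 18, 18, 19, 19, 19, 20]   # one small value pulls the mean down
symmetric = [1, 2, 3, 4, 5]

for name, sample in (("right_tailed", right_tailed),
                     ("left_tailed", left_tailed),
                     ("symmetric", symmetric)):
    print(name, mean(sample), median(sample), skew_direction(sample))
```

With real measurements the mean and median are rarely exactly equal, so a non-zero `tol` (an assumption to be chosen per data set) is usually needed before calling a sample symmetric.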


Normal distribution curve
An important concept in statistical theory.
Suppose we collect the Hb level of a large number of people
and build a frequency distribution with narrow class
intervals.
We then draw the histogram and overlay the frequency polygon.
The result is a smooth, symmetrical curve (the bell or Gaussian
curve).
The shape of the curve depends on the mean and SD,
which ultimately depend on the number and nature of the
observations.
If we take several samples of Hb level, each gives its own
normal curve, so an infinite number of normal curves is possible.
Normal Distribution
• Fig A: high variability, so low reliability or consistency; more
accurate, since every value has a chance of being selected.
• Fig B: less variability, so high reliability or consistency; less
accurate, since some extreme values may be missed.

(Fig B and Fig A: normal curves with the mean and standard deviation marked on each)
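
How the SD changes the shape of a normal curve can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the mean of 100 and the SDs of 15 and 5 are invented values standing in for Fig A (high variability) and Fig B (low variability).

```python
from statistics import NormalDist

# Hypothetical curves with the same mean but different SDs (illustrative values only).
fig_a = NormalDist(mu=100, sigma=15)  # high variability: low, wide curve
fig_b = NormalDist(mu=100, sigma=5)   # low variability: tall, narrow curve

for label, curve in (("Fig A", fig_a), ("Fig B", fig_b)):
    peak_height = curve.pdf(curve.mean)               # height of the curve at the mean
    share_within_10 = curve.cdf(110) - curve.cdf(90)  # area within 10 units of the mean
    print(f"{label}: SD={curve.stdev}, peak={peak_height:.4f}, "
          f"area within 10 of mean={share_within_10:.3f}")
```

The curve with the smaller SD is taller at the mean and holds more of its area close to the mean, which is what the narrower curve in the figure depicts.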
Mean from a Positively Skewed Distribution

When data are positively skewed, analyses are
commonly done on the log scale.
This is done to minimize the effect of extreme
observations.
Method of obtaining the mean:
1. Take the log of each data value.
2. Calculate the mean on the log scale.
3. Take the antilog of the mean to return to the original
scale of measurement.
This is called the geometric mean.

Example of logarithmic transformation
for skewed data

No.     Drug threshold (mg/kg body wt)    Logarithm (base 10)
1       4                                 0.6021
2       6                                 0.7782
3       8                                 0.9031
4       9                                 0.9542
5       9                                 0.9542
6       14                                1.1461
7       15                                1.1761
8       16                                1.2041
9       18                                1.2553
10      25                                1.3979
11      28                                1.4472
12      70                                1.8451
13      160                               2.2041
Total   382                               15.8677
• The arithmetic mean is 382/13 = 29.4 mg/kg.
• Note that 11 of the 13 values are smaller than
the mean and only 2 are larger.
• The logarithm values showed a normal
distribution.
• The mean of the logarithms is 15.8677/13 = 1.2206.
• The antilog (the geometric mean) is 16.6 mg/kg.
• Note that the median (15 mg/kg) is close to the
geometric mean.
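
The figures in this example can be reproduced with a few lines of Python. The sketch follows the three steps described earlier (log, mean, antilog) on the 13 drug threshold values from the table; base-10 logarithms are assumed because they match the tabulated values, and the variable names are illustrative.

```python
import math
from statistics import mean, median

# Drug threshold values (mg/kg body wt) from the table.
thresholds = [4, 6, 8, 9, 9, 14, 15, 16, 18, 25, 28, 70, 160]

arithmetic_mean = mean(thresholds)            # 382 / 13
logs = [math.log10(x) for x in thresholds]    # step 1: log of each value
mean_log = mean(logs)                         # step 2: mean on the log scale
geometric_mean = 10 ** mean_log               # step 3: antilog back to mg/kg

below_mean = sum(1 for x in thresholds if x < arithmetic_mean)

print(round(arithmetic_mean, 1))   # 29.4
print(below_mean)                  # 11
print(round(mean_log, 4))          # 1.2206
print(round(geometric_mean, 1))    # 16.6
print(median(thresholds))          # 15
```

Python 3.8 and later also provide `statistics.geometric_mean`, which should give the same result up to floating-point rounding.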
Standard Normal Curve
Standard Normal Deviate or Variable
• Each mean and standard deviation defines a
different normal curve.
• The areas (probabilities) under the curve
are tabulated for only one standard curve.
• This is because, for any normal curve, it is
possible to express the distance between an
observed value and the mean in units of SD.
• This transformation gives the standard normal
deviate (z): z = (observed value - mean) / SD.
• The new variable (z) follows a normal distribution.
• The total area under the z distribution curve is 1.
• The mean of the standard normal curve is 0 and
the SD is 1.
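
The reason a single table is enough can be shown with a short check: converting an observed value to z and looking it up on the standard curve gives the same probability as working on the original curve. This is a sketch using the standard-library `statistics.NormalDist`; the mean of 14.0 and SD of 1.5 are hypothetical values, not figures from the slides.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance between an observed value and the mean, in units of SD."""
    return (x - mean) / sd

standard = NormalDist(mu=0, sigma=1)   # the standard normal curve

# Hypothetical normal curve (values invented for illustration).
mean_value, sd_value = 14.0, 1.5
observed = 17.0

z = z_score(observed, mean_value, sd_value)                  # 2.0
p_from_z = standard.cdf(z)                                   # area to the left of z
p_direct = NormalDist(mean_value, sd_value).cdf(observed)    # same area on the original curve

print(z, round(p_from_z, 4), round(p_direct, 4))
```

Both probabilities agree, so areas tabulated for the standard curve can be reused for any mean and SD once the value is expressed as z.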
(Figure: standard normal curve)
