AhmedRefat AG Refat
FOMZU
AhmedRefatZU
Definition of Statistics
Statistics is the science of dealing with numbers.
It is used for the collection, summarization, presentation and analysis of data.
Statistics provides a way of organizing data to get information on a wider and more formal (objective) basis than relying on personal experience (subjective).
Uses of medical statistics
Medical statistics are used in:
1. Planning, monitoring and evaluating community health care programs.
2. Epidemiological research studies.
3. Diagnosis of community health problems.
4. Comparison of health status and diseases in different countries and in one country over years.
5. Forming standards for the different biological measurements such as weight and height.
6. Differentiating between diseased and normal groups.
Types of data
Any aspect of an individual that is measured is called a variable. Variables are either:
1. Quantitative, or
2. Qualitative.
1 Quantitative data
Quantitative data are numerical data. There are two types:
A. Discrete data: usually whole numbers, such as the number of cases of a certain disease or the number of hospital beds (no decimal fraction).
B. Continuous data: imply measurement on a continuous scale, e.g. height, weight, age (a decimal fraction can be present).
2 Qualitative data
Qualitative data are non-numerical data and are subdivided into two types:
A. Categorical data: purely descriptive and imply no ordering of any kind, such as sex or area of residence.
B. Ordinal data: imply some kind of ordering, like:
Level of education
Socioeconomic status
Degree of severity of disease
Presentation of Data
The first step in statistical analysis is to present data in a way that is easy to understand.
The two basic ways for data presentation are:
1. Tabular presentation.
2. Graphical presentation.
Tabulation
Some rules for the construction of tables:
1. The table must be self-explanatory.
2. Title: written at the top of the table to define precisely the content, the place and the time.
3. Clear headings for the columns and rows, with units of measurement.
4. The size of the table depends on the number of classes, which usually lies between 2 and 10 rows (classes). The choice depends on the form of the data and the requirements of the distribution. Too few classes may obscure some information, and too many will not differ from the raw data.
Types of tables
For qualitative data, draw a simple table, e.g. a list table: count the number of observations (frequencies) in each category.
For quantitative data, we have to form a frequency distribution table.
List:
A table consisting of two columns, the first giving an identification of the observational unit and the second giving the value of the variable for that unit.
Example: number of patients in each hospital department:
Department       Number of patients
Medicine         100
Surgery          80
ENT              28
Ophthalmology    30
Frequency Distribution tables
Frequency distribution tables (FDTs) are used for presentation of qualitative (and quantitative discrete) data by recording the number of observations in each category. These counts are called frequencies. No classes or intervals are needed.
An FDT for quantitative continuous data consists of a series of classes (intervals) together with the number of observations (frequency) whose values fall within the interval of each class.
Example (1): Assume we have a group of 20 individuals whose blood groups were as follows: A, AB, AB, O, B, A, A, B, B, AB, O, AB, AB, A, B, B, B, A, O, A. We want to present these data in a table.
What is the type of data?
How to Construct a Frequency Distribution table
Four steps (title, table, number, %):
1. Put a title.
2. Draw columns and rows.
3. Enumerate the individuals in each category.
4. Calculate the relative frequency (%).
Applying the four steps to Example (1):
1. Put a title, e.g. "Distribution of the studied individuals according to their blood group".
2. Draw a table (columns and rows):
First column heading: the studied variable (blood group).
Second column heading: frequency (number).
Third column heading: percentage (%).
3. Enumerate the individuals in each blood group, i.e. individuals with blood group A are 6, those with blood group B are 6, AB are 5 and blood group O are 3.
4. Calculate the relative frequency (%) of each blood group by dividing the frequency of that group by the total number of individuals and multiplying by 100, i.e. the percentage of group A = 6/20 x 100 = 30%, and the same for group B; group AB = 5/20 x 100 = 25% and group O = 3/20 x 100 = 15%. The final table will be:
Distribution of the studied individuals according to their blood group
Blood group    Number    %
A              6         30
B              6         30
AB             5         25
O              3         15
Total          20        100
What is your conclusion?
We can conclude from this table that blood groups A and B are the most common groups and the rarest is group O (depending on the percentage of each group).
So presenting data in a table is beneficial for deducing facts and gives simpler information than raw data.
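The counting and percentage steps of Example (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_table` and the column layout are assumptions made here, not part of the lecture.

```python
from collections import Counter

# Blood groups of the 20 individuals in Example (1)
blood_groups = ["A", "AB", "AB", "O", "B", "A", "A", "B", "B", "AB",
                "O", "AB", "AB", "A", "B", "B", "B", "A", "O", "A"]

def frequency_table(values):
    """Return a list of (category, frequency, percentage) rows."""
    counts = Counter(values)           # step 3: enumerate each category
    total = len(values)
    return [(cat, n, 100 * n / total)  # step 4: relative frequency (%)
            for cat, n in sorted(counts.items())]

table = frequency_table(blood_groups)
print("Blood group".ljust(14) + "Number".ljust(8) + "%")
for cat, n, pct in table:
    print(cat.ljust(14) + str(n).ljust(8) + f"{pct:.0f}")
print("Total".ljust(14) + str(len(blood_groups)).ljust(8) + "100")
```

The rows are sorted alphabetically by category, which is a presentation choice; any other order carries the same information.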
Example (3): The following data are systolic blood pressure measurements (mmHg) of 30 patients with hypertension. Present these data in a frequency table:
150, 155, 160, 154, 162, 170, 165, 155, 190, 186, 180, 178, 195, 200, 180, 156, 173, 188, 173, 189, 190, 177, 186, 177, 174, 155, 164, 163, 172, 160.
What is the type of data?
Four steps:
1. Put a title, e.g. "Frequency distribution of blood pressure measurements (mmHg) among a group of hypertensive patients".
2. Draw a table (columns and rows).
3. In the first column we have to classify blood pressure into categories or classes, because we have a large sample (N = 30) and the measured variable is of continuous type (not discrete as in the previous examples).
Construction of classes:
Calculate the range of the observations: subtract the lowest value of blood pressure from the highest value (the highest was 200 and the lowest was 150); the difference is 50.
Let the class interval be 10, so we will have 50/10 = 5 classes.
Enumerate the frequency by the tally method.
Calculate the exact frequency and the relative frequency.
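The class construction for Example (3) can be sketched as follows. The class boundaries are an assumption made for illustration: classes start at the lowest value, and the highest value (200) is placed in the last class so that exactly five classes result; the original table may have used different limits.

```python
# Systolic blood pressure (mmHg) of the 30 patients in Example (3)
sbp = [150, 155, 160, 154, 162, 170, 165, 155, 190, 186, 180, 178,
       195, 200, 180, 156, 173, 188, 173, 189, 190, 177, 186,
       177, 174, 155, 164, 163, 172, 160]

def grouped_frequencies(values, width, n_classes):
    """Count values per class of equal width, starting at the minimum.

    Assumption: the highest value is placed in the last class, so the
    last class is closed at both ends (190 to 200 here).
    """
    low = min(values)
    counts = [0] * n_classes
    for v in values:
        index = min((v - low) // width, n_classes - 1)
        counts[index] += 1
    return [(low + i * width, counts[i]) for i in range(n_classes)]

value_range = max(sbp) - min(sbp)   # 200 - 150 = 50
n_classes = value_range // 10       # 50 / 10 = 5 classes
rows = grouped_frequencies(sbp, 10, n_classes)

for lower, freq in rows:
    pct = 100 * freq / len(sbp)
    print(f"class starting at {lower}: {freq} patients ({pct:.1f}%)")
```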
2 Graphical Presentation
The diagram should be:
Simple
Easy to understand
Able to save a lot of words
Self-explanatory
Given a clear title indicating its content
Fully labeled
The y axis (vertical) is usually used for frequency.
Graphic presentations are used to illustrate and clarify information. Tables are essential in the presentation of scientific data, and diagrams are complementary: they summarize these tables in an easy, attractive and simple way.
1 Bar chart
There are three types:
Simple
Multiple
Component
[Figure: simple bar chart. Y axis: mean age in years; x axis: the studied groups (group I, group II, group III).]
Multiple bar chart:
[Figure: multiple bar chart comparing males and females for cancer and anemia.]
Component bar chart:
[Figure: component bar chart, "Comparison between Egypt and USA in socioeconomic standard of living". Y axis: percentage of population (0% to 100%); components: high, moderate, low; bars: Egypt, USA.]
2 Pie diagram
Consists of a circle whose area represents the total frequency (100%), which is divided into segments. Each segment represents a proportional composition of the total frequency.
[Figure: pie diagram, "Percentage of causes of child death in Egypt": diarrhea 50%, chest infection 30%, congenital 10%, accident 10%.]
3 Histogram
[Figure: histogram, "Distribution of studied group according to their height". Y axis: number of individuals; x axis: height in cm (100 to 150).]
4 Frequency polygon
Derived from a histogram by connecting the midpoints of the tops of the rectangles in the histogram. The line connecting the centers of the histogram rectangles is called the frequency polygon.
We can draw the polygon without the rectangles, so we get a simpler form of line graph.
A special type of frequency polygon is the Normal Distribution Curve.
5 Scatter diagram
It shows the relationship between two numeric measurements, each observation being represented by a point corresponding to its value on each axis.
The following scatter diagram shows a positive or direct relationship between NAG and albumin/creatinine ratio among diabetic patients.
[Figure: scatter diagram, "Correlation between NAG and albumin creatinine ratio in group of early diabetics". Y axis: NAG; x axis: albumin creatinine ratio.]
In negative correlation, the points will be scattered in a downward direction, meaning that the relation between the two studied measurements is inverse, i.e. if one measure increases the other decreases, as shown in the following graph.
[Figure: scatter diagram, "Correlation between Doppler velocimetry (RI) and baby birth weight". Y axis: RI; x axis: baby weight in kg.]
6 Line graph
It is a diagram showing the relationship between two numeric variables (as in the scatter diagram), but the points are joined together to form a line (either a broken line or a smooth curve).
[Figure: line graph, "Changes in body temperature of a patient after use of antibiotic". Y axis: temperature; x axis: time in hours.]
Normal Distribution Curve
The Normal Distribution Curve (NDC) is a graphical presentation (frequency polygon) of any quantitative biologic variable measured in large numbers.
It is a form of presentation of the frequency distribution of biologic variables such as weights, heights, hemoglobin level and blood pressure, or any continuous data.
It occupies a major role in the techniques of statistical analysis.
Characteristics of the Normal Distribution Curve
1. It is a bell-shaped, continuous curve.
2. It is symmetrical, i.e. it can be divided vertically into two equal halves.
3. The tails never touch the base line but extend to infinity in either direction.
4. The mean, median and mode values coincide.
5. It is described by two parameters: the arithmetic mean determines the location of the center of the curve, and the standard deviation represents the scatter around the mean.
Skewed data
If we represent collected data by a frequency polygon graph and the resulting curve does not resemble the normal distribution curve (with all its characteristics), then these data are not normally distributed.
The curve may be skewed to the right or to the left side.
This is because the data collected are from:
1.
2.
Therefore the results obtained from these data cannot be applied or generalized to the whole population.
The NDC can be used to distinguish between normal and abnormal measurements.
Example:
Suppose we have the NDC for hemoglobin levels for a population of normal adult males with mean ± SD = 11 ± 1.5.
We obtain a hemoglobin reading for an individual = 8.1 and we want to know if he is normal or anemic.
If this reading lies within the area under the curve covering 95% of normal values (i.e. mean ± 2 SD), he will be considered normal. If his reading is lower, then he is anemic.
Here mean ± 2 SD = 11 ± 3, i.e. the normal range of hemoglobin of adult males is from 8 to 14.
Our sample (8.1) lies within this range. Therefore this individual is normal, because his reading lies within the 95% of his population.
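The hemoglobin example can be checked with a short Python sketch. The function names `normal_range` and `is_within_normal` are hypothetical helpers introduced here, and the multiplier k = 2 follows the rule of thumb used in the example (the exact 95% limit of a normal curve is about 1.96 SD).

```python
def normal_range(mean, sd, k=2):
    """Return the (lower, upper) limits of mean ± k SD."""
    return mean - k * sd, mean + k * sd

def is_within_normal(reading, mean, sd, k=2):
    """True when the reading lies inside mean ± k SD (limits included)."""
    lower, upper = normal_range(mean, sd, k)
    return lower <= reading <= upper

lower, upper = normal_range(11, 1.5)
print(f"normal range: {lower} to {upper}")
print("reading 8.1 is normal:", is_within_normal(8.1, 11, 1.5))
```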
Data Summarization
To summarize data, we need to use one or two parameters that can describe the data:
1. Measures of central tendency, which describe the center of the data.
2. Measures of dispersion, which show how the data are scattered around the center.
1 The arithmetic mean
The sum of the observations divided by the number of observations:
$$\bar{x} = \frac{\sum x}{n}$$
Where:
$\bar{x}$ = the mean
$\sum$ denotes "the sum of"
$x$ = the values of the observations
$n$ = the number of observations
Example: In a study, the ages of 5 students were: 12, 15, 10, 17, 13.
Mean = sum of observations / number of observations.
Then the mean $\bar{x}$ = (12 + 15 + 10 + 17 + 13) / 5 = 67 / 5 = 13.4 years.
Calculation of the mean for frequency distribution data
In the case of frequency distribution data, we calculate the mean by this equation:
$$\bar{x} = \frac{\sum f x}{n}$$
where $f$ = frequency.
For example: we want to calculate the mean incubation period of this group.
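A minimal Python sketch of the frequency-weighted mean follows. The incubation periods and frequencies below are hypothetical values chosen only for illustration, since the lecture's own table is not available; `mean_from_frequencies` is likewise a name introduced here.

```python
# Hypothetical frequency distribution (illustrative values only):
# incubation period in days -> number of cases with that period
incubation_freq = {2: 3, 3: 5, 4: 8, 5: 4}

def mean_from_frequencies(freq):
    """Mean = sum(f * x) / n, where n is the sum of the frequencies."""
    n = sum(freq.values())
    total = sum(x * f for x, f in freq.items())
    return total / n

mean_incubation = mean_from_frequencies(incubation_freq)
print(f"n = {sum(incubation_freq.values())}, mean = {mean_incubation:.2f} days")
```

For grouped continuous data the same function can be used with the class midpoints as the x values, which is an approximation.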
2 Median
It is the middle observation in a series of observations after arranging them in ascending or descending order.
The rank of the median is (n + 1)/2 if the number of observations is odd, and n/2 if the number is even.
Example: calculate the median of the following data: 5, 6, 8, 9, 11 (n = 5, odd).
The rank of the median = (n + 1)/2, i.e. (5 + 1)/2 = 3.
The median is the third value in this group when the data are arranged in ascending (or descending) order.
So the median is 8 (the third value).
If the number of observations is even, the median is calculated as follows:
e.g. 5, 6, 8, 9 (n = 4).
The rank of the median = n/2, i.e. 4/2 = 2, so the median is the second value of the group. If the data are arranged in ascending order the second value is 6, and if arranged in descending order it is 8; therefore the median is the mean of both observations, i.e. (6 + 8)/2 = 7.
For simplicity we can apply the same equation used for odd numbers, i.e. (n + 1)/2. The median rank will be (4 + 1)/2 = 2.5, i.e. the median lies between the second and the third values, 6 and 8; take their mean = 7.
3 Mode
The most frequently occurring value in the data is the mode.
Example: 5, 6, 7, 5, 10. The mode in this data is 5, since the number 5 is repeated twice.
Sometimes there is more than one mode, and sometimes there is no mode, especially in a small set of observations.
Example: 20, 18, 14, 20, 13, 14, 30, 19. There are two modes, 14 and 20.
Example: 300, 280, 130, 125, 240, 270. This set has no mode.
Data can therefore be unimodal, bimodal or have no mode.
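The median and mode examples can be reproduced with a small Python sketch. The helper `modes` is written here for illustration; it returns an empty list when no value repeats, matching the "no mode" convention of the lecture (the standard library's own mode functions follow a different convention for that case).

```python
from collections import Counter
from statistics import median

def modes(values):
    """Return the most frequent value(s), or [] when every value
    occurs only once (the 'no mode' case in the text)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(median([5, 6, 8, 9, 11]))   # odd number of observations
print(median([5, 6, 8, 9]))       # even number: mean of the two middle values
print(modes([5, 6, 7, 5, 10]))
print(modes([20, 18, 14, 20, 13, 14, 30, 19]))
print(modes([300, 280, 130, 125, 240, 270]))
```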
Measures of Dispersion
The measures of dispersion describe the degree of variation, scatter or dispersion of the data around its central value (dispersion = variation = spread = scatter):
1. Range (R)
2. Variance (V)
3. Standard deviation (SD)
4. Coefficient of variation (CoV)
1 Range
It is the difference between the largest and smallest values.
It is the simplest measure of variation.
Disadvantages: it is based on only two of the observations and gives no idea of how the other observations are arranged between these two. Also, it tends to be larger as the size of the sample increases.
2 Variance
A first attempt would be to average the differences between the mean and each value:
$$\frac{\sum (\bar{x} - x)}{n}$$
The value of this expression will always be equal to zero, because the differences between each value and the mean have negative and positive signs that cancel to zero on algebraic summation.
To overcome this zero, we square the difference between the mean and each value so the sign will always be positive. Thus we get:
$$V = \frac{\sum (\bar{x} - x)^2}{n - 1}$$
3 Standard Deviation (SD)
The main disadvantage of the variance is that it is expressed in the square of the units used. So it is more convenient to express the variation in the original units by taking the square root of the variance. This is called the standard deviation (SD).
Therefore $SD = \sqrt{V}$, i.e.
$$SD = \sqrt{\frac{\sum (\bar{x} - x)^2}{n - 1}}$$
4 Coefficient of variation (CoV)
$$\text{C.V.} = \frac{SD}{\bar{x}} \times 100$$
The C.V. is useful when we are interested in the relative size of the variability in the data.
Example: calculate the mean, variance, SD and C.V. from the following measurements: 5, 7, 10, 12 and 16.
Mean = (5 + 7 + 10 + 12 + 16)/5 = 50/5 = 10.
Variance = (25 + 9 + 0 + 4 + 36)/(5 - 1) = 74/4 = 18.5.
SD = √18.5 = 4.3.
C.V. = 4.3/10 x 100 = 43%.
Example
Another set of observations is 2, 2, 5, 10 and 11. Their mean = 30/5 = 6.
SD = √[(16 + 16 + 1 + 16 + 25)/(5 - 1)] = √(74/4) = 4.3.
C.V. = 4.3/6 x 100 = 71.7%.
Both sets of observations have the same SD but they differ in C.V., because the data in the first set are homogeneous (so the C.V. is not high), while the data in the second set are heterogeneous (so the C.V. is high).
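The comparison of the two sets can be sketched in Python with the standard library. The function name `coefficient_of_variation` is introduced here for illustration; `stdev` uses the n - 1 denominator, as in the formulas of this section.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """C.V. = SD / mean x 100, using the sample SD (n - 1 denominator)."""
    return stdev(values) / mean(values) * 100

first = [5, 7, 10, 12, 16]
second = [2, 2, 5, 10, 11]
for name, data in (("first", first), ("second", second)):
    print(f"{name}: mean={mean(data)}, SD={stdev(data):.1f}, "
          f"C.V.={coefficient_of_variation(data):.1f}%")
```

Both sets share the same sum of squared deviations (74), which is why their SDs coincide while the C.V. differs with the mean.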
Example
In a study where age was recorded, the observed values were 6, 8, 9, 7, 6 (5 observations). Calculate the mean, SD, range, mode and median.
The mean = sum of observations / their number = 36/5 = 7.2.
The variance = sum of the squared differences (mean minus observation) / (number of observations - 1)
= [(7.2 - 6)² + (7.2 - 8)² + (7.2 - 9)² + (7.2 - 7)² + (7.2 - 6)²] / (5 - 1)
= [(1.2)² + (-0.8)² + (-1.8)² + (0.2)² + (1.2)²] / 4 = 6.8/4 = 1.7.
So the variance = 1.7.
The SD = √1.7 = 1.3.
Range = 9 - 6 = 3.
The mode is 6.
The median: first we have to arrange the data in ascending order, i.e. 6, 6, 7, 8, 9. The rank of the median = (n + 1)/2, i.e. (5 + 1)/2 = 3; therefore the median is the third value, i.e. median = 7.
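The same worked example can be verified with Python's standard `statistics` module. This is a sketch for checking the hand calculation; the dictionary name `summary` is arbitrary.

```python
from statistics import mean, median, mode, stdev, variance

ages = [6, 8, 9, 7, 6]

summary = {
    "mean": mean(ages),
    "variance": variance(ages),      # sample variance, n - 1 denominator
    "sd": stdev(ages),               # square root of the sample variance
    "range": max(ages) - min(ages),
    "mode": mode(ages),
    "median": median(ages),
}
for name, value in summary.items():
    print(f"{name}: {value}")
```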
Inferential statistics
Inference involves making a generalization about a larger group of individuals on the basis of a subset or sample.
Hypothesis Testing
In hypothesis testing we want to find out whether the observed variation among samples is explained by chance alone (i.e. the chance of random sampling variations), or is due to a real difference between groups.
It involves conducting a test of statistical significance, quantifying the chance of random sampling variations that may account for the observed results.
In hypothesis testing, we are asking whether the sample mean, for example, is consistent with a certain hypothesized value for the population mean.
The method of assessing the hypothesis is known as a significance test.
Steps:
1. Formulate the hypothesis.
2. Collect the data.
3. Test your hypothesis.
4. Accept or reject your hypothesis.
General principles of significance tests
1. Set up a null hypothesis and its alternative.
2. Find the value of the test statistic.
3. Refer the value of the test statistic to a known distribution which it would follow if the null hypothesis were true.
4. Conclude that the data are consistent or inconsistent with the null hypothesis.
If the data are not consistent with the null hypothesis, the difference is said to be statistically significant. If the data are consistent with the null hypothesis, it is said that we accept it, i.e. the difference is statistically insignificant.
P < 0.05
In medicine, we usually consider that differences are significant if the probability is less than 0.05. This means that if the null hypothesis is true, we shall make a wrong decision less than 5 in a hundred times.
Tests of significance
The selection of the test of significance depends essentially on the type of data that we have.
a Quantitative data (means & SD): t test, paired t test
Comparison of means:
1 Comparing two means of large samples using the normal distribution (z test or SND, standard normal deviate):
If we have a large sample size, i.e. 60 or more, and it follows a normal distribution, then we have to use the z test.
z = (population mean - sample mean) /
The normal range for any biological reading lies between the mean value of the population reading ± 2 SD (this range includes 95% of the area under the normal distribution curve).
Student's t test
2 Comparing two means of small samples using the t test:
If we have a small sample size (less than 60), we can use the t distribution instead of the normal distribution.
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}}}$$
The value of t will be compared to the values in the specific table of the t distribution at the corresponding degrees of freedom. If the value of t is less than that in the table, then the difference between the samples is insignificant. If the t value is larger than that in the table, the difference is significant, i.e. the null hypothesis is rejected.
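A minimal Python sketch of the t statistic from summary data follows. It assumes the denominator of the formula is the square root of SD1²/n1 + SD2²/n2; the summary values passed in are hypothetical, and the comparison with the tabulated t value is left to the reader.

```python
from math import sqrt

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t = (mean1 - mean2) / sqrt(SD1^2/n1 + SD2^2/n2)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical summary statistics (illustrative values only)
t = two_sample_t(mean1=12.0, sd1=2.0, n1=16, mean2=10.0, sd2=3.0, n2=9)
print(f"t = {t:.2f}")
```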
Paired t test
3 Paired t test:
If we are comparing repeated observations in the same individual, or the difference between paired data, we have to use the paired t test, where the analysis is carried out using the mean and standard deviation of the differences between each pair.
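The paired analysis can be sketched in Python as follows. The lecture does not spell out the formula; the sketch uses the usual paired form, t = mean difference / (SD of differences / √n), and the before/after readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic: mean difference / (SD of differences / sqrt(n))."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    return mean(differences) / (stdev(differences) / sqrt(n))

# Hypothetical paired readings for the same 5 individuals
before = [140, 150, 145, 160, 155]
after = [135, 148, 140, 150, 152]
t = paired_t(before, after)
print(f"t = {t:.2f}, degrees of freedom = {len(before) - 1}")
```

The sign of t only reflects the direction of the change (here the readings fall); its absolute value is what is compared with the table.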
ANOVA
4 Comparing several means:
Sometimes we need to compare more than two means. This can be done by the use of several t tests, which is not only tedious but can lead to spurious significant results. Therefore we have to use what we call analysis of variance, or ANOVA.
There are two main types: one-way analysis of variance and two-way analysis of variance. One-way analysis of variance is appropriate when the subgroups to be compared are defined by just one factor, for example comparison between means of different socioeconomic classes. Two-way analysis of variance is used when the subdivision is based upon more than one factor.
The main idea in the analysis of variance is that we have to take into account the variability within the groups and between the groups. The value of F is equal to the ratio between the mean sum of squares (MS) between the groups and that within the groups:
F = between-groups MS / within-groups MS
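The F ratio for a one-way layout can be sketched in Python as below. The degrees of freedom used for the mean squares (k - 1 between groups, N - k within groups) are the standard ones and are not stated in the lecture; the three groups are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """F = between-groups MS / within-groups MS for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)        # between-groups mean square
    ms_within = ss_within / (n_total - k)    # within-groups mean square
    return ms_between / ms_within

# Hypothetical measurements in three groups (illustrative values only)
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value = one_way_f(groups)
print(f"F = {f_value:.1f}")
```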
Chi-Squared Test
b Qualitative variables:
1) Chi-squared test:
Qualitative data are arranged in a table formed by rows and columns. The categories of one variable define the rows and the categories of the other variable define the columns.
A chi-squared test is used to test whether there is an association between the row variable and the column variable or, in other words, whether the distribution of individuals among the categories of one variable is independent of their distribution among the categories of the other.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Degrees of freedom = (rows - 1) x (columns - 1)
O = observed value in the table
E = expected value, calculated as follows:
E = Rt x Ct / GT
i.e. total of row x total of column / grand total
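The calculation of expected values, the chi-squared statistic and the degrees of freedom can be sketched in Python. The 2 x 2 table of observed counts is hypothetical and the function name `chi_squared` is introduced here for illustration; the result still has to be compared with the tabulated critical value.

```python
def chi_squared(table):
    """Return (chi-squared, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # E = row total x column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical observed counts (illustrative values only)
observed = [[10, 20],
            [30, 40]]
chi2, df = chi_squared(observed)
print(f"chi-squared = {chi2:.3f}, d.f. = {df}")
```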
From tables of χ² significance, at degrees of freedom = (3 rows - 1) x (3 columns - 1) = 2 x 2 = 4, the critical value at the 0.05 level of significance is 9.49. Therefore we conclude that there is a significant relation between socioeconomic level and the degree of intelligence (because the value of χ² is greater than that of the table).
Z Test
2) Z test for comparing two percentages:
$$z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$$
where $q = 100 - p$.
Example: the number of anemic patients in group 1, which includes 50 patients, is 5, and the number of anemic patients in group 2, which contains 60 patients, is 20. To find whether groups 1 and 2 are statistically different in the prevalence of anemia, we calculate the z test.
p1 = 5/50 = 10%, p2 = 20/60 = 33%, q1 = 100 - 10 = 90, q2 = 100 - 33 = 67
z = (10 - 33) / √(10 x 90/50 + 33 x 67/60)
z = -23 / √(18 + 36.85) = -23/7.4
|z| = 3.1
Therefore there is a statistically significant difference between the percentages of anemia in the studied groups (because |z| > 2).
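The anemia example can be checked with a short Python sketch. It uses the rounded percentages from the example (10% and 33%) and assumes the denominator is the square root of p1q1/n1 + p2q2/n2, which is what reproduces the value 7.4; the function name `z_two_percentages` is introduced here.

```python
from math import sqrt

def z_two_percentages(p1, n1, p2, n2):
    """z = (p1 - p2) / sqrt(p1*q1/n1 + p2*q2/n2), p in percent, q = 100 - p."""
    q1, q2 = 100 - p1, 100 - p2
    return (p1 - p2) / sqrt(p1 * q1 / n1 + p2 * q2 / n2)

z = z_two_percentages(p1=10, n1=50, p2=33, n2=60)
print(f"|z| = {abs(z):.1f}")
print("significant at the rule of thumb |z| > 2:", abs(z) > 2)
```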
Correlation & regression
c Correlation and regression:
Correlation measures the closeness of the association between two continuous variables, while linear regression gives the equation of the straight line that best describes the association and enables the prediction of one variable from the other.
1 Correlation:
In correlation, the closeness of the association is measured by the correlation coefficient, r. The values of r range between +1 and -1.
A value of one (+1 or -1) means perfect correlation, while 0 means no correlation. If the r value is near zero it means weak correlation, while near one it means strong correlation. The sign (- or +) denotes the direction of correlation: positive (+ve) correlation means that if one variable increases the other one increases similarly, while negative (-ve) correlation means that when one variable increases the other one decreases.
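A correlation coefficient can be computed with the Python sketch below. The lecture does not say which coefficient is meant; the sketch assumes Pearson's product-moment r, and the data are hypothetical series chosen to show the two extreme cases.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (illustrative values only)
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # increases with x
falling = [10, 8, 6, 4, 2]    # decreases as x increases

print(pearson_r(x, rising))   # perfect positive correlation
print(pearson_r(x, falling))  # perfect negative correlation
```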
Linear regression
2 Linear regression:
Similar to correlation, linear regression is used to determine the relation between two variables and to predict the change in one variable due to changes in the other. For linear regression, the independent variable has to be distinguished from the dependent variable.
Linear regression not only allows assessment of the presence of an association between the independent and dependent variables, but also allows prediction of the dependent variable for a particular value of the independent variable. However, regression should not be used for prediction outside the range of the original data. A t test is also used for the assessment of the level of significance. The dependent variable in linear regression must be a continuous one.
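A least-squares line and a guarded prediction can be sketched in Python as follows. The data are hypothetical, the function names `fit_line` and `predict` are introduced here, and the range check simply illustrates the warning about predicting outside the original data.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept, x_min, x_max):
    """Predict y, refusing to extrapolate outside the observed x range."""
    if not x_min <= x <= x_max:
        raise ValueError("x is outside the range of the original data")
    return intercept + slope * x

# Hypothetical data lying exactly on y = 1 + 2x (illustrative values only)
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(f"y = {intercept} + {slope} x")
print(predict(2.5, slope, intercept, min(xs), max(xs)))
```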
[Figure: scatter diagram, "Correlation between Doppler velocimetry (RI) and baby birth weight". Y axis: RI; x axis: baby weight in kg.]
Multiple regression
3 Multiple regression:
Situations frequently occur in which we are interested in the dependency of a dependent variable on several independent variables, not just one. The test of significance used is the analysis of variance (F test).
outpatient clinic
4. Random 20 females and 20 males out of a group of 100 persons
5. All workers in a factory chosen from all factories in a certain governorate
The weight (kg) of a pregnant