Explorative Data Analysis
Outline
Data Description
Simple Inference for Continuous Data
Simple Inference for Categorical Data
Graphical Presentation of a Data Set
Measures of Dispersion
Absolute Measures
Range
Quartile Deviation
Mean Deviation
Standard Deviation
Relative Measures
Coefficient of Variation
Shape Characteristics: Skewness & Kurtosis
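As a cross-check outside SPSS, the absolute and relative measures listed above can be sketched in Python with the standard library. The function name `dispersion_summary` and the sample values are illustrative assumptions, not part of the original data. The quartile rule used by `statistics.quantiles` may differ slightly from the one a given SPSS procedure applies.

```python
import statistics

def dispersion_summary(values):
    """Return common absolute and relative dispersion measures."""
    n = len(values)
    mean = statistics.fmean(values)
    # Quartiles with Python's default "exclusive" interpolation rule.
    q1, _, q3 = statistics.quantiles(values, n=4)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return {
        "range": max(values) - min(values),
        "quartile_deviation": (q3 - q1) / 2,
        "mean_deviation": sum(abs(x - mean) for x in values) / n,
        "standard_deviation": sd,
        "cv_percent": 100 * sd / mean,  # coefficient of variation
    }

# Hypothetical values, not taken from the LIFESPAN data set.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = dispersion_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The coefficient of variation is unit-free, which is why it is classed as a relative measure: it allows variables measured on different scales to be compared.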
Manually:
Analyze > Descriptive Statistics > Descriptives > Select
Variable > Click on Options Button > Choose the statistics >
Continue > OK.
Syntax:
DESCRIPTIVES VARIABLES=LIFESPAN
/STATISTICS=MEAN STDDEV VARIANCE RANGE MIN MAX
SEMEAN KURTOSIS SKEWNESS.
The standard deviation of the sampling distribution of the mean is the standard error.
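The standard error reported by SEMEAN can be reproduced by hand as the sample standard deviation divided by the square root of the sample size. The following is a minimal Python sketch. The helper name `standard_error_of_mean` and the sample values are assumptions made for illustration.

```python
import math
import statistics

def standard_error_of_mean(values):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical values, not taken from the LIFESPAN data set.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error_of_mean(sample)
print(f"standard error of the mean: {se:.4f}")
```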
How can you calculate the Geometric Mean and Harmonic Mean for
a continuous variable?
Ans.: Both are available in the Compare Means options. The procedure
requires a categorical independent variable for the comparison, but it
also reports sub-totals and a total, so we can easily read the GM and
HM of the dependent variable of interest from the output.
Let's see.
For the given data, let us run the Compare Means command using
LIFESPAN as the dependent variable and DIET as the independent
variable.
Manually:
Analyze > Compare Means > Means > Select Dependent
Variable > Select Independent Variable > Click on Options
Button > Choose the options > Continue > OK / Paste
Syntax:
MEANS TABLES=LIFESPAN BY DIET
/CELLS MEAN COUNT STDDEV MEDIAN GMEDIAN SEMEAN SUM
MIN MAX RANGE FIRST LAST VAR KURT SEKURT SKEW
SESKEW HARMONIC GEOMETRIC SPCT NPCT.
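To check the GEOMETRIC and HARMONIC cells by hand, the same quantities can be computed per group and for the total in Python. This is only a sketch. The function name `gm_hm_by_group`, the values, and the group labels are hypothetical and do not come from the LIFESPAN/DIET data.

```python
import statistics

def gm_hm_by_group(values, groups):
    """Return {group: (geometric mean, harmonic mean)} plus a 'Total' entry."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    # Assumes no group is itself labelled "Total".
    by_group["Total"] = list(values)
    return {
        group: (statistics.geometric_mean(vals), statistics.harmonic_mean(vals))
        for group, vals in by_group.items()
    }

# Hypothetical positive values and group labels.
values = [1, 4, 2, 8]
groups = ["a", "a", "b", "b"]
result = gm_hm_by_group(values, groups)
for group, (gm, hm) in result.items():
    print(f"{group}: GM={gm:.4f}  HM={hm:.4f}")
```

Both means are defined only for positive values, so any zero or negative observations would need separate handling.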
To see the descriptive statistics for different groups, use the
Explore command of Descriptive Statistics.
It needs one or more dependent variables
It needs one categorical variable in the Factor List
Manually:
Analyze > Descriptive Statistics > Explore > Select Dependent Variable > Select
Categorical Variable in Factor List > Click on Statistics & Plots Buttons >
Choose the options > Continue > OK
Syntax:
EXAMINE VARIABLES=LIFESPAN BY DIET
/PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
/COMPARE GROUP
/PERCENTILES(5,10,25,50,75,90,95) HAVERAGE
/STATISTICS DESCRIPTIVES EXTREME
/CINTERVAL 95
/MISSING LISTWISE
/NOTOTAL.
The 5% trimmed mean is the mean of the observations after excluding the lower and upper 5% of the observations.
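The idea of trimming can be sketched in Python as follows. This simple version drops a whole number of observations from each tail. SPSS may treat fractional cases differently, so the result is an approximation rather than a reproduction of the Explore output. The function name `trimmed_mean` and the data are assumptions.

```python
import statistics

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping the lowest and highest `proportion` of the cases."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # cases dropped from each tail
    kept = ordered[k:len(ordered) - k]
    return statistics.fmean(kept)

# Hypothetical data: 1..19 plus one extreme value.
sample = list(range(1, 20)) + [1000]
plain = statistics.fmean(sample)
trimmed = trimmed_mean(sample)
print(f"mean={plain}, 5% trimmed mean={trimmed}")
```

A large gap between the ordinary mean and the trimmed mean suggests that extreme values are pulling the mean.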
[Box plot annotations: lower quartile, median, upper quartile]
Basics on Cross-tabulation
Cross-tabulation is the basic technique for examining the
relationship between two or more categorical (nominal or
ordinal) variables (attributes), possibly controlling for additional
layering variables.
The Crosstabs procedure offers tests of independence and
measures of association for nominal and ordinal data.
Additionally, you can obtain estimates of the relative risk of an
event given the presence or absence of a particular characteristic.
Manually:
Analyze > Descriptive Statistics > Crosstabs... > Select Column Variable >
Select Row Variable > Choose other options > Continue > OK
Syntax:
CROSSTABS
/TABLES=X3 BY X9
/FORMAT=AVALUE TABLES
/STATISTICS=CHISQ
/CELLS=COUNT ROW
/COUNT ROUND CELL.
Cross-tabulation
Crosstab: Age of children for ordinal regression by Undernutrition status

Age      Statistic       Nourish   Underweight    Total
12-23    Count               482           688     1170
         % within Age      41.2%         58.8%   100.0%
24+      Count              1764          1931     3695
         % within Age      47.7%         52.3%   100.0%
0-11     Count               917           222     1139
         % within Age      80.5%         19.5%   100.0%
Total    Count              3163          2841     6004
         % within Age      52.7%         47.3%   100.0%
Test of Independence between Undernutrition Status & Children Age
Chi-Square Tests

                               Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square             451.927a    2    .000
Likelihood Ratio               482.072     2    .000
Linear-by-Linear Association   353.863     1    .000
N of Valid Cases               6004

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 538.96.
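The Pearson chi-square value and the minimum expected count can be recomputed from the observed counts in the crosstab. Below is a minimal Python sketch of that calculation. The function name `pearson_chi_square` is an assumption, while the counts are the ones reported in the table.

```python
def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom, minimum expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, min_expected

# Rows: age 12-23, 24+, 0-11. Columns: Nourish, Underweight.
observed = [[482, 688], [1764, 1931], [917, 222]]
chi2, df, min_expected = pearson_chi_square(observed)
print(f"chi-square={chi2:.3f}, df={df}, min expected={min_expected:.2f}")
```

With 2 degrees of freedom, a statistic this large corresponds to a p-value far below .001, which is why SPSS prints .000. The hypothesis of independence between undernutrition status and age group is rejected.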
In syntax mode:
After selecting all the options in the dialog boxes, click
"Paste" instead of "OK". This opens a new window called the
Syntax window, which contains the syntax for the requested
analysis, for example:
CROSSTABS
/TABLES=ord_age interval mot_edu wealth icfi
care_ind ca_bmi ari_ord fever_ord diar_ord ord_1 wt
BY undernut
/FORMAT=AVALUE TABLES
/STATISTICS=CHISQ
/CELLS=COUNT ROW
/COUNT ROUND CELL.
Graphical Presentation
Bar Diagram
Histogram
Pie Diagram
Stem and Leaf Plot
Box Plot
Scatter Plot
Population Pyramid