Without understanding how to analyze data, a researcher will not be able to interpret the data, nor draw any conclusions or
recommendations from it.
The following plan outlines the process required to identify the most efficient statistical test or tests for any
research-derived quantitative data. The first five points should be addressed prior to data collection; they are
important because they may frame the measurement approach you adopt to collect the data.
They will also feed into the later selection of an analytic strategy. The last three points apply following data collection
and lead directly into the selection of an analytical technique.
1) Identify what type of research you are undertaking, e.g. hypothesis testing, inferring statistical robustness
of a psychometric tool, exploring data, evaluating an intervention of some sort;
2) Identify the type of data you have obtained - are these data categorical, ordinal or interval in nature? (see
glossary for a definition of 'levels of measurement');
3) If hypothesis testing, decide whether one- or two-tailed tests are most appropriate to answer the research
question;
4) Identify dependent and independent variables (DVs and IVs);
5) Conduct a power analysis (see Chapter 7 for more on this) prior to collecting your data - this will
provide you with a good idea of the sample sizes necessary for the statistical tests to detect an effect reliably;
6) Screen the data for missing values, outliers and types of distribution (normal, skewed, kurtotic, etc.)
and clean data where necessary via transformation and/or exclusion;
7) Identify sample sizes contained within your data (total sample size, group sizes, etc.);
8) Decide on the test you wish to use.
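Points 5 and 6 of the plan can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than a prescribed procedure: the function names (n_per_group, screen), the cut-off of three standard deviations for outliers and the example scores are all hypothetical. The sample size formula uses the normal approximation for a two-tailed comparison of two independent groups, which tends to give slightly smaller numbers than an exact t-based calculation, and skewness is estimated simply as the mean of the cubed z-scores.

```python
import math
from statistics import NormalDist, mean, stdev


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-tailed, two-group comparison.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def screen(values, z_cutoff=3.0):
    """Count missing values, flag z-score outliers and estimate skewness."""
    present = [v for v in values if v is not None]  # None marks a missing value
    m, s = mean(present), stdev(present)
    z_scores = [(v - m) / s for v in present]
    outliers = [v for v, z in zip(present, z_scores) if abs(z) > z_cutoff]
    skewness = sum(z ** 3 for z in z_scores) / len(present)
    return {
        "n_missing": len(values) - len(present),
        "outliers": outliers,
        "skewness": skewness,
    }


# Medium effect (d = 0.5), alpha = .05, power = .80
print(n_per_group(0.5))  # 63

# Hypothetical ratings with one missing value and one extreme score
ratings = [4, 5, 5, 6, 5, 4, 6, 5, 5, 4, 6, 40, None]
print(screen(ratings))
```

A positive skewness value here would suggest a tail of high scores, which is one of the cases where transformation or exclusion might be considered. Note that in small samples a z-score cut-off of 3 can be impossible to exceed, so a lower cut-off or a different rule may be needed.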
Fundamental questions which will narrow the search for a suitable test include:
Whether your data are categorical, ordinal or interval level. Only interval-level data can draw on parametric
tests; other data types are restricted to non-parametric tests;
Whether you are interested in only one variable (requiring univariate tests), two variables (requiring bivariate
tests) or more than two variables (requiring multivariate tests).
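The two narrowing questions above amount to a simple lookup, sketched here in Python. The function name narrow_search and the string labels are hypothetical, and the rule encoded is only the one stated in this section (interval-level data may draw on parametric tests, other levels may not).

```python
def narrow_search(level, n_variables):
    """Return (test family, scope) for a level of measurement and variable count."""
    if level not in ("categorical", "ordinal", "interval"):
        raise ValueError("level must be 'categorical', 'ordinal' or 'interval'")
    if n_variables < 1:
        raise ValueError("at least one variable is required")

    # Only interval-level data can draw on parametric tests
    family = "parametric" if level == "interval" else "non-parametric"

    if n_variables == 1:
        scope = "univariate"
    elif n_variables == 2:
        scope = "bivariate"
    else:
        scope = "multivariate"
    return family, scope


print(narrow_search("ordinal", 2))
print(narrow_search("interval", 4))
```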
Having answered these questions, you should be able to identify the test or tests most appropriate to your aims and
data. You can do this by answering the following questions:
1. Are you interested in differences between groups, e.g. pre- and post-intervention, or between work teams? If so,
possible tests will include:
t-tests - if data are interval (unrelated samples t-tests are used where the samples are unmatched; related
samples t-tests, where samples are matched or include the same participants);
Mann-Whitney U or Wilcoxon - if data are ordinal (Mann-Whitney U for independent groups, Wilcoxon
for related groups);
McNemar change test or Chi-square test of independence - if data are nominal (McNemar change test
for related groups, Chi-square test of independence for independent groups).
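The practical difference between unrelated and related samples t-tests can be seen by computing both statistics by hand. The following Python sketch makes several assumptions: the function names and the scores are hypothetical, the unrelated version uses the classic pooled-variance formula (which assumes equal variances in the two groups), and only the t statistic and degrees of freedom are returned, so a p-value would still have to be looked up in a t table or a statistics package.

```python
import math
from statistics import mean, variance


def unrelated_t(group_a, group_b):
    """Pooled-variance t statistic and df for two unmatched samples."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, na + nb - 2


def related_t(before, after):
    """t statistic and df for matched samples, based on pairwise differences."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    se = math.sqrt(variance(diffs) / n)
    return mean(diffs) / se, n - 1


# Hypothetical pre- and post-intervention scores for the same five participants
pre = [10, 12, 9, 11, 13]
post = [12, 14, 10, 13, 14]

print(related_t(pre, post))    # treats the scores as matched pairs
print(unrelated_t(post, pre))  # treats the same scores as two separate groups
```

With these scores the related samples statistic is considerably larger than the unrelated one, because pairing removes the variation between participants from the error term.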
2. Are you interested in relationships between factors/variables, e.g. between job satisfaction and performance
at work? If so, possible tests include:
3. Are you exploring patterns in the data set, e.g. in a questionnaire measure which purports to tap into various
different 'constructs'? If so, possible tests include:
Exploratory principal components factor analysis - looks for groups of variables that share common variance,
from the assumption that these groupings are 'caused' by the same unobservable (latent) factors. Has some tight
restrictions in terms of type, level of data and sample size;
Cluster analysis - groups variables together on the basis of similarity of the patterns of scores on them.
Less restrictive than factor analysis in terms of property requirements of the data;
Multi-dimensional scaling - looks for variables that share similar patterns of scores across respondents,
and draws a plot of variables so that those responding most similarly are located proximally on the plot.
Again, less restrictive than factor analysis. A variation of this technique with even fewer restrictions on data
type is multi-dimensional scalogram analysis (MSA).
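To give a feel for how cluster analysis groups variables by the similarity of their score patterns, here is a minimal Python sketch. It is one possible approach among many and rests on assumptions not made in the text: similarity is measured with the Pearson correlation, the distance between variables is taken as 1 - r, clusters are merged by single linkage, and the item names and scores are hypothetical.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def cluster_variables(scores, n_clusters):
    """Merge the two closest clusters of variables until n_clusters remain."""
    clusters = [[name] for name in scores]

    def distance(c1, c2):
        # Single linkage: distance between the closest pair of members
        return min(1 - pearson(scores[a], scores[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical questionnaire items answered by five respondents
items = {
    "q1": [1, 2, 3, 4, 5],
    "q2": [2, 3, 3, 5, 5],
    "q3": [5, 1, 4, 2, 3],
    "q4": [4, 1, 5, 2, 3],
}
print(cluster_variables(items, 2))
```

Items whose scores rise and fall together across respondents end up in the same cluster, which is the intuition behind using such groupings to explore possible 'constructs'.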
4. Are you interested in categorizing participants according to certain characteristics, e.g. whether two groups
of workers are best distinguished by variations in commitment, or by job and organizational tenure?
Discriminant function analysis - requires two or more continuous predictor variables and attempts to
categorize cases according to these predictor variables into a categorical dependent variable.
5. Do you wish to predict one or more outcome factors using the data you have collected, e.g. predicting
individual performance by examining workers’ affective reactions to the workplace?
Simple regression - if you have one interval-level predictor variable and one interval-level outcome
variable;
Multiple regression - if you have one interval-level outcome variable and more than one interval-,
ordinal- or categorical-level predictor variable and wish to determine which predictor variable(s) best
predict(s) the outcome variable;
Logistic regression - if you have one categorical outcome variable and two or more categorical or
interval-level predictor variables and need to determine the best predictor variable(s);
Discriminant function analysis - if you have more than two interval-level predictor variables and a
categorical outcome variable.
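As an illustration of the simplest case in this list, the Python sketch below fits a simple regression by ordinary least squares. The function name and the data (affect ratings as the predictor, performance scores as the outcome) are hypothetical, and the sketch omits the significance tests and assumption checks that a statistics package would normally report.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Least-squares slope and intercept for one predictor and one outcome."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical interval-level data for five workers
affect = [2, 3, 4, 5, 6]
performance = [50, 55, 63, 66, 74]

slope, intercept = fit_simple_regression(affect, performance)
# Predicted performance for a worker with an affect rating of 7
predicted = intercept + slope * 7
print(slope, intercept, predicted)
```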
Confirmatory factor analysis - if the factor structure is established, e.g. with the 16-PF's sixteen
personality factors, the number of expected factors can be specified;
Exploratory factor analysis - if the factor structure is unknown, or alternative factor solutions are
suspected, can be used to identify patterns of scale items.
Spearman rank correlation coefficient
Application: Examine relationships between two variables
Restrictions: Ordinal-level variables
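The Spearman rank correlation coefficient can be computed by ranking each variable and then correlating the ranks. The Python sketch below takes that route; the function names and the ratings are hypothetical, and tied scores are given the average of the ranks they span, which is a common convention but an assumption here.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two sets of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return num / den


# Hypothetical ordinal ratings for five employees
job_satisfaction = [1, 2, 3, 4, 5]
performance = [5, 6, 7, 8, 7]
print(spearman_rho(job_satisfaction, performance))
```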
Table 2 Multivariate data analysis: applications, restrictions and interpretation
Multiple regression
Application: Predict outcomes, using one or more predictor variables, e.g. predicting staff turnover from age, organizational commitment, rating of supervisor
Restrictions: Interval-level DV
Linear relationship between variables
Homoscedasticity (residuals and variables should be normally distributed)
Large sample (see Chapter 7 for more on sampling issues)

Hierarchical cluster analysis
Application: Identify homogeneous groups of cases based on selected attributes, e.g. attempting to predict whether or not someone is married according to their age, sociability rating, and total number of friends
Restrictions: Predictor variables must be interval-level and may need to be standardized so that they fall on the same scale

K-means cluster analysis
Application: As for hierarchical cluster analysis, but you must know beforehand how many clusters you expect
Restrictions: Knowledge of expected number of clusters

Multidimensional scaling (not MSA)
Application: Determine the pattern or structure contained in a matrix of observations and present this in a psychologically meaningful way, e.g. by exploring people's representations of their psychological contract at work
Restrictions: Predictor variables must be interval-level
Multi-dimensional scalogram analysis (MSA) is a variant of this approach and does not require interval-level data

MANOVA
Application: Examine differences between more than two groups of participants on more than one variable, i.e. on a combination of variables, thereby addressing variable combinations or interactions by group
Restrictions: DV interval- or ratio-level
Sensitive to outliers
Dependent variable normally distributed
Each group randomly sampled from the population
Linear relationships between all dependent variables
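
Two of the restrictions listed for cluster analysis, standardizing the variables and knowing the number of clusters in advance, are easy to see in a small K-means sketch. The Python code below is a minimal illustration under stated assumptions: the function names and the data (age, sociability rating and number of friends for six people) are hypothetical, the starting centroids are simply the first k cases rather than a random or optimized start, and Euclidean distance is used.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Convert each variable (column) to z-scores so all share the same scale."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]


def k_means(rows, k, n_iter=20):
    """Assign each case to one of k clusters; k must be chosen beforehand."""
    points = standardize(rows)
    centroids = points[:k]  # naive start: the first k cases
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each case to its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its assigned cases
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(mean(col) for col in zip(*members)) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels


# Hypothetical cases: (age, sociability rating, number of friends)
people = [(25, 8, 30), (62, 3, 4), (27, 7, 28), (60, 2, 6), (24, 9, 33), (65, 3, 5)]
print(k_means(people, k=2))
```

Without the standardization step, the variable with the largest numerical range would dominate the distance calculation, which is why the table notes that variables may need to fall on the same scale.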