Article

Principle Assumptions of Regression Analysis: Testing, Techniques, and Statistical Reporting of Imperfect Data Sets

Advances in Developing Human Resources, 2019, Vol. 21(4), 484–502
© The Author(s) 2019
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/1523422319869915
journals.sagepub.com/home/adh

Candace Flatt1 and Ronald L. Jacobs2

Abstract
The Problem.
Journal pages are filled with articles that scarcely mention the assumptions behind
the chosen statistical techniques and models. Based on questionable foundations,
the ultimate conclusions are intended to shape academia and guide practitioners.
Violations of the underlying assumptions can result in biased and misleading forecasts,
confidence intervals, and scientific insights.
The Solution.
The field of human resource development (HRD) is equipped to present these
assumptions clearly and concisely to ensure the integrity of statistical analysis and
subsequent conclusions. Testing the principal assumptions of regression analysis is a
process. As such, the presentation of this process in a systems framework provides
a comprehensive plan with step-by-step guidelines to help determine the optimal
statistical model for a particular data set. The goal of this article is to provide
practitioners a Regression Development System that can be adapted to organizational
performance as well as information that can be used to evaluate the strength of
journal articles.
The Stakeholders.
Quantitative researchers, practitioners, instructors, and students.

Keywords
regression, quantitative, assumptions

1Illinois Department of Employment Security, Springfield, USA
2University of Illinois at Urbana–Champaign, USA

Corresponding Author:
Candace Flatt, Economist, Illinois Department of Employment Security, 607 E. Adams, 8th Floor,
Springfield, IL 62701, USA.
Email: candace.flatt@illinois.gov

Introduction
Journal pages are filled with articles that scarcely mention the assumptions behind the
chosen statistical techniques and models. Based on questionable foundations, the ulti-
mate conclusions are intended to shape academia and guide practitioners. To gain a
sense of pervasiveness, Ernst and Albers (2017) systematically reviewed clinical psy-
chology journals in 2013 for discussions on the underlying assumptions of linear
regression. Of the 172 articles, only 2% both addressed and correctly tested the
assumptions; another 6% addressed the assumptions but tested for them incorrectly.
The field of human resource development (HRD) fares slightly better. Garavan et al.
(2019) conducted a literature review
of 219 articles examining the relationship between training and organizational out-
comes since 2007. They found 14% addressed independence, 5% reported issues
related to variance, 34% referenced normality, and 14% commented on linearity.
The omission is most likely due to a lack of basic statistical knowledge. Hoekstra,
Kiers, and Johnson (2012) asked 30 PhD students to analyze data sets using familiar
statistical techniques and to complete a questionnaire on their approach. Approximately
90% of the participants were unfamiliar with the normality assumption of regression,
and 70% were unfamiliar with the assumption of constant variances.
Violations of the underlying assumptions can result in biased and misleading forecasts,
confidence intervals, and scientific insights (Nau, 2018). Readers are unable to trust or
replicate the results of a study if the model assumptions are violated. Nontransparency in
the analysis of the underlying assumptions reduces the informational value of the study
(Ernst & Albers, 2017). Today’s research standards require a thorough description and
defense of the chosen research design. “Good designs have a beneficial side effect: they
typically lend themselves to a simple explanation and empirical methods and a straight-
forward presentation of results” (Angrist & Pischke, 2010, p. 17).
Each assumption can be analyzed in two ways—graphically or statistically. Graphical
analysis offers a visual overview of the data set. This overview hints at the general
relationships and provides clues to determine the optimal statistical model. However, a
graphical analysis stops short of definitive conclusions. Graphical methods show viola-
tions of assumptions rather than proof of adherence (Smith, 2012). A statistical test
provides an objective determination of the hypothesis. A thorough evaluation of
assumptions includes both types of analyses—graphical and statistical. This article pro-
vides a systematic approach to testing the assumptions of regression analysis. The first
three assumptions describe the requirements of the error terms—statistical indepen-
dence, constant variance, and normal distribution. The last assumption specifies the
linear and additive relationships between dependent and independent variables.

Literature Review
Journal articles, course notes, statistical resources, textbooks, and statistical software
guides were reviewed for content relating to the assumptions of regression analysis.

Table 1. Tests for Statistical Independence of Error Terms.

Graphical - Scatterplot of the residuals against any time or spatial variables: Time and spatial variables are common sources of dependence. If the plots on the graph are not random, then the assumption of independence is likely violated.

Numerical - Durbin–Watson coefficient: The Durbin–Watson coefficient is a measurement of the residual differences over time. Its value lies between 0 and 4. Values less than 1 imply that successive error terms are positively related. Conversely, a Durbin–Watson coefficient greater than 3 implies that successive error terms are negatively related. The test is available in SAS, Stata, SPSS, and Python.

Note. SAS: Statistical Analysis System; SPSS: Statistical Package for the Social Sciences.

This literature review is divided into four sections—each section focusing on a spe-
cific assumption of regression analysis. Each section outlines the potential problems
arising from a violation, the common tests, and the possible remedies.

Assumption 1: Error Terms are Statistically Independent


The independence assumption requires observations to be independent of each other
(Nimon, 2012). Violation of this assumption often occurs in time series regression
models. In essence, serial correlation (autocorrelation) is the result of time series data
that are influenced by past values. A violation of statistical independence indicates that
the model could be improved. In extreme cases, this violation signals that the model is
mis-specified. In non–time series models, a violation of statistical independence can
be present if the model systematically underpredicts or overpredicts the coefficient
estimates (Nau, 2018).
Table 1 provides the graphical and numerical types of tests for statistical indepen-
dence of error terms. The graphical analysis consists of a scatterplot of the residuals
against time (or a variable characterized by patterned spatial points). The scatterplot of
residuals versus time should take on a rectangular shape as an indication of random-
ness. The Durbin–Watson statistic is a test for significant residual autocorrelation at
Lag 1. Ideally, the statistic should be close to 2.
Minor violations of this assumption indicate that improvements could be made with
small adjustments to the model. A small adjustment could be the addition of lagged
independent or dependent variables. Some statistical software packages provide an
ARIMA + regressor procedure, which offers the option of adding lagged variables and
error terms. A Durbin–Watson value between 1.2 and 1.6 is an example of a minor violation. A major violation of statistical independence of error terms (Durbin–Watson < 1) signals a fundamental structural problem with the model. A violation of this assumption is occasionally due to a violation of the linearity assumption. In these cases, the transformations described under Assumption 4 are possible remedies (Nau, 2018).

Table 2. Tests for Constant Variance of Error Terms.

Graphical - Plot residuals versus predicted values: A scatterplot of the residuals versus predicted values provides insight on homoscedasticity at a glance. The residuals (and the variance of the residuals) should be the same for all predicted values (Tabachnick & Fidell, 2007).

Numerical - White test: The White test is a test for heteroscedasticity, model mis-specification, or both. It is a special case of the Breusch–Pagan test that relaxes the assumption that errors are normally distributed. The test is available in SAS, Stata, and Python.
In some cases, ordinary least squares (OLS) may not be the best choice. Linear
mixed models (LMM) are statistical models that can be applied to continuous outcome
variables characterized by normally distributed residuals that may not be independent
(violation of Assumption 1) or have constant variances (violation of Assumption 2). In
particular, LMM is an appropriate choice for study designs using longitudinal data
sets. Longitudinal data are data sets with dependent variables repeatedly measured
over time for each unit of analysis (West, Welch, & Galecki, 2015). If the chosen
model does not address a nested data set, then the risk of rejecting a null hypothesis
(Type I error) is increased (Nimon, 2012). Turner (2015) provides detailed instructions
and uses of LMM in the field of HRD.
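As a sketch of how an LMM relaxes the independence assumption, the hypothetical example below fits a random-intercept model in statsmodels to simulated repeated performance measurements; the variable names (`score`, `training`, `employee`) are illustrative, not drawn from any cited study:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical longitudinal data: 30 employees, 4 performance scores each.
rng = np.random.default_rng(1)
n_emp, n_rep = 30, 4
emp = np.repeat(np.arange(n_emp), n_rep)
emp_effect = rng.normal(size=n_emp)[emp]          # shared within each employee
training = rng.normal(size=n_emp * n_rep)
score = 50 + 2.0 * training + emp_effect + rng.normal(size=n_emp * n_rep)
df = pd.DataFrame({"score": score, "training": training, "employee": emp})

# The random intercept per employee absorbs the within-employee dependence
# that would violate Assumption 1 under plain OLS.
lmm = smf.mixedlm("score ~ training", df, groups=df["employee"]).fit()
```

The fixed-effect coefficient on `training` is then interpreted as in OLS, while the group variance captures the nesting.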

Assumption 2: Error Terms Have Constant Variance (Homoscedasticity)


Nonconstant variances (heteroscedasticity) can originate from violations of the other
assumptions. Even when all other assumptions are met, coefficients from a heteroscedastic
regression are not BLUE (best linear unbiased estimators) under OLS. When
heteroscedasticity is present, OLS gives equal weight to all observations regardless of
the magnitude of the variance. The standard errors in the presence of heteroscedasticity
are biased, which leads to biased test statistics and confidence intervals (Williams,
2015). Slight violations of the homoscedasticity assumption are generally acceptable,
but larger violations can result in an increased risk of Type I error (Osborne
& Waters, 2002).
Table 2 provides common graphical and numerical tests for the constant variance of
error terms. The points on the scatterplot of residuals versus predicted values should
take on a rectangular shape, with most points concentrated around zero. The mean
residual (zero) is constant across all predicted values. The White test is available in
SAS, Stata, and Python; the null hypothesis is that the error terms have a constant
variance. Although the test is not readily available in SPSS, IBM (2016) provides
instructions on SPSS code to produce the White test.
Although severe heteroscedasticity is problematic, the parameter estimates are not
biased. In other words, OLS still yields unbiased coefficient estimates, although the
standard errors (and the tests based on them) are distorted.
When heteroscedasticity is a concern, there are two ways to address the problem. First,
heteroscedasticity is often due to a violation of another assumption. There are several
tests available to determine if a model is mis-specified (see Assumption 4). A model
re-specification or transformation of variables may eliminate heteroscedasticity.
Second, robust standard errors can be used to obtain unbiased standard errors. Robust
standard errors relax the assumption that errors are independent and identically distrib-
uted (Williams, 2015).

Assumption 3: Error Terms are Normally Distributed


Park (2008) describes normality testing and provides the associated codes in SAS,
Stata, and SPSS. A common misconception is that the variables should be normally
distributed (Osborne & Waters, 2002); however, the correct assumption is that the
error terms are normally distributed. This particular assumption needs to be met for
the p-values of the t-tests to be valid (Chen, Ender, Mitchell, & Wells, 2003a; 2003b;
2003c). According to Nau (2018), a violation of normality can distort confidence inter-
vals for forecasts and cause difficulties in determining the significance of model coef-
ficients. A violation of normally distributed error terms can signal the existence of
unusual data points or that the model can be improved.
Table 3 provides common tests for normality. Smith (2012) cautions against using
stem-and-leaf plots and histograms. The P-P and Q-Q plots of the residuals are the
preferred graphical tests for normality. These plots provide a visual comparison
between the error distribution and a normal distribution with the same mean and vari-
ance. A bow-shaped pattern indicates skewness, and an S-shaped pattern indicates
kurtosis. Normality tests vary by statistical package. Nau (2018) states that the
Anderson–Darling test is considered to be superior because it takes into consideration
the entire normal distribution rather than only skewness and kurtosis. All normality tests
are considered overly “picky” (Nau, 2018, p. 6). In reality, error terms seldom exhibit a perfectly
normal distribution.
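Several of the numerical tests in Table 3 are available in Python through scipy; the sketch below applies them to two simulated sets of error terms, one normal and one deliberately skewed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
normal_errors = rng.normal(size=100)
skewed_errors = rng.exponential(size=100)   # clearly nonnormal

# Shapiro-Wilk (valid for sample sizes between 7 and 2,000).
sw_ok = stats.shapiro(normal_errors)
sw_bad = stats.shapiro(skewed_errors)

# Jarque-Bera compares skewness and kurtosis against a normal distribution.
jb_bad = stats.jarque_bera(skewed_errors)
# Small p-values reject the null hypothesis of normally distributed errors.
```

For the graphical checks, `statsmodels.api.qqplot(resid, line="45")` produces the Q-Q plot described above.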
Violations of normality can stem from a violation of linearity (Assumption 4). A
transformation of variables may fix the problem. Outliers are another possible source
of this particular violation. Finally, the dependent and independent variables may pro-
duce error terms that are not normally distributed (Nau, 2018).

Table 3. Tests for Normal Distribution of Error Terms.

Graphical - Stem-and-leaf plot: A stem-and-leaf plot presents the shape of the data by displaying the numbers (i.e., error terms) in two columns. The left column (stem) contains the largest place value, with a different value in each row. The right column (leaf) contains the remaining portion of each number. This graph is useful for small sample sizes.

Graphical - Histogram: A histogram displays the observations (i.e., error terms) proportionally by intervals or categories. According to Smith (2012), a histogram is not a good way to check for normality because different intervals influence the shape of the graph.

Graphical - Probability–Probability plot (P-P plot): The P-P plot is a comparison of an empirical distribution function with a theoretical cumulative distribution function (i.e., the normal distribution function).

Graphical - Quantile–Quantile plot (Q-Q plot): The Q-Q plot is a quantile comparison of a probability distribution with a specific theoretical distribution (i.e., the normal distribution).

Numerical - Skewness: The right tail is longer for a positively skewed probability distribution; the left tail is longer for a negatively skewed probability distribution. Skewness is a descriptive statistic that captures the direction and magnitude by which a probability distribution deviates from normality.

Numerical - Kurtosis: Kurtosis measures the peakedness of the distribution. A normal distribution has a kurtosis of 3. A distribution with kurtosis greater than 3 is leptokurtic; a distribution with kurtosis less than 3 is platykurtic.

Numerical - Normality tests: The Shapiro–Wilk test can be performed for sample sizes ≥ 7 and ≤ 2,000. It is the ratio of the best estimator of the variance to the usual sum of squares estimator of the variance; a value of 1 indicates normality. The test is available in SAS, Stata, and SPSS.
The Shapiro–Francia test is a modification of the Shapiro–Wilk test. It can be used for sample sizes ≥ 5 and ≤ 5,000. The test is available in Stata.
The Kolmogorov–Smirnov test compares the empirical distribution function of a sample (i.e., error terms) with the cumulative distribution function of a specific distribution (i.e., the normal distribution). The test is available in SAS, SPSS, and Python. This test is useful when the sample size is greater than 2,000.
The Anderson–Darling test is considered one of the most powerful tools for determining if the error term is normally distributed. Compared with the Cramér–von Mises test, the Anderson–Darling test places greater weight on the distribution tails. Both tests belong to the class of quadratic empirical distribution function tests and are available in SAS. These tests are useful when the sample size is greater than 2,000.
The Jarque–Bera test determines if the skewness and kurtosis of the error term match a normal distribution. It is a good alternative for large sample sizes, because other tests tend to reject the null hypothesis (the error term is normally distributed) when samples are large. The Jarque–Bera test can be performed in Stata using the .sktest command or in Python using jarque_bera.

Assumption 4: The Relationships Between Dependent and Independent Variables Are Linear and Additive

The linearity assumption requires a straight-line relationship between two variables (Nimon, 2012). Nonlinear or nonadditive data fitted to a linear model result in incorrect estimations or predictions. Violations of linearity or additivity are considered extremely serious (Nau, 2018). Table 4 describes several methods that can be used to detect nonlinearity or nonadditivity. Although tests on the relationships between variables are insightful, the model specification should primarily be based on theory.
A violation of this assumption can be addressed through transformations or addi-
tional variables. When independent variables are nonadditive, the inclusion of an
interaction term can solve the problem. For example, if the relationship of X on Y is
dependent on gender, then including a term X*gender is a good option. A nonlinear
relationship can be addressed by transforming an existing variable into a new variable
that is linearly related to the dependent variable. Common transformations include
reciprocals, logarithms, cube root, square root, and squares (Williams, 2015).

Implications
The consequences of ignoring the assumptions of regression analysis impact both
practitioners and scholars. This section outlines the implications for practice, research,
and theory development.

Implications for Practice


Scholars should indicate the test results and limitations associated with the assump-
tions of regression analysis, but often this information is missing in journal articles.
HRD practitioners can still evaluate the strength of the article by comparing the data
used in the study versus the chosen statistical method. Fortunately, the types of data
and statistical methods found in HRD literature are limited.

Is the data nested?  The error terms of longitudinal or other nested data sets most likely
violate the first two assumptions—independence and homoscedasticity. Hierarchical
structures are naturally occurring in organizations. For instance, employees are nested
within departments and departments are nested within organizations. An example of
this type of study is Choi, Lee, and Jacobs (2015). To address the violations, the
authors used LMM to examine the relationship between employee traits, organiza-
tional traits, and structured on-the-job (S-OJT) training.

Does the dependent variable represent a probability?  Performance improvement can be
represented by the probability that an individual improves performance through an
HRD intervention. Likewise, the measurement for employment is often the probability
of employment. If the dependent variable is a probability, then Assumption 3—error
terms are normally distributed—is violated. Logistic regression or generalized linear
mixed models (GLMMs) relax the assumption of normally distributed error terms.
For example, Moore and Gorman (2009) used a logistic regres-
sion to analyze the impact of training and demographics in Workforce Investment Act
(WIA) program performance. They found on-the-job training was positively associ-
ated with employment and reported an odds ratio of 2.188. The Workforce Innovation
and Opportunity Act of 2014 (WIOA) places greater emphasis on performance measures and evidence-based practices than the previous WIA. An increasing number of studies will predict employment probabilities to meet WIOA requirements.

Table 4. Tests for Linear and Additive Relationships Between Variables.

Graphical - Scatterplots with smoother curves (such as lowess): Scatterplots of the dependent versus independent variables provide an indication of the type of relationships and the potential problems that might be encountered in the regression analysis.

Graphical - Plot standardized residuals versus predicted values: A scatterplot of the residuals versus predicted values provides insight on linearity at a glance. In a linear relationship, the line constructed from joining the residual means at each predicted value will be a horizontal line through zero. If the mean residual varies depending on the predicted value, then the relationship between dependent and independent variables is nonlinear.

Numerical - Correlations between dependent and independent variables: The Pearson correlation measures the linear relationship between two variables. A value close to zero indicates either no relationship or a nonlinear relationship. The variables must be measured on interval scales. Other correlation measurements include Spearman's correlation, which can be used for ordinal variables, and Hoeffding's D correlation, which can measure monotonic (one-directional curve), nonmonotonic (curve with hills and valleys), and linear relationships (Bhalla, 2015).

Numerical - Specification error tests: The incremental F test is not the easiest approach, but it is available in most statistical software. This test estimates both a full model (including the nonlinear terms) and a constrained model (excluding the nonlinear terms). The R-squares of the two models are used to calculate the incremental F statistic (Williams, 2015).
The Wald test uses estimated coefficients and variances/covariances to determine specification errors. For a linearity test, the coefficients of nonlinear terms in the model would be equal to zero under the null hypothesis. If the null hypothesis cannot be rejected, then the variables can be removed from the model. The Wald test is an easier alternative to the incremental F test, requiring only one estimation model through the test command in Stata (Williams, 2015).
The Ramsey RESET test can be used to determine if a nonlinear combination of terms can help explain a model. The test uses an original model with no nonlinear terms and an expanded model with nonlinear terms. The null hypothesis is that the coefficients of the nonlinear terms are equal to zero. If the null hypothesis is rejected, the RESET test offers no further guidance; it is purely a functional form test (Wooldridge, 2013).

Does the dependent variable represent income?  In HRD, workforce development studies
that measure income typically take the logarithm of income to address violation of
Assumption 4—linearity. The transformation captures the diminishing utility of
income and simplifies the interpretation of results, which are similar to elasticities
(Flatt & Jacobs, 2018).

Is there an intervening variable?  Song and Lim (2015) examine the use of mediation
analysis in HRD. The increasing complexity of organizations has resulted in the need
for more sophisticated models. A mediating variable is an intervening variable that
alters the relationship between dependent and independent variables. Based on studies
published between 2000 and 2014, they found 66 articles in HRDQ, 17 articles in
HRDI, and one article in ADHR that were mediation studies. A specific example of a
mediation study was conducted by Fletcher (2016). The author found personal role
engagement to be a strong mediator of the relationship between training perceptions
and task proficiency. In addition, both work engagement and personal role engage-
ment served as mediators between training perceptions and task proactivity. Failure to
include a mediating relationship in a model would result in a model mis-specification
and possibly violation of Assumption 4.

Is there a moderating variable?  Whereas a mediator alters the relationship itself, mod-
erators alter the scale of the relationship between independent and dependent vari-
ables. If an interaction term is not included to capture a moderating relationship, then
the additivity requirement of Assumption 4 is violated. Ismail (2016) provides an
example of a moderating relationship. The author found that the relationship between
training and organizational commitment was higher for individuals oriented toward
higher learning goals. In other words, learning goal orientation was found to be a mod-
erator between training and organizational commitment.

Implications for Research


A true causal relationship can rarely be established statistically, but the goal of quantitative
research is to come as close as possible to inferring causality. Violations of the underlying
assumptions can result in incorrect conclusions and false claims of causality.
Numerous editorials and articles have appeared in HRD journals calling for an increase
in the rigor of quantitative articles (Garavan et al., 2019; Nimon, 2011, 2012, 2017;
Osborne, 2013). Specifically, studies need to acknowledge the following: (a) the statistical
assumptions, (b) the methods that were used to address violations, and (c) the resulting
limitations.
Organizations embrace HRD as a practice when its value is clearly demonstrated.
Garavan et al. (2019) stress the need for greater methodological rigor in measuring the
impact of training on organizations, including the need to report statistical assumptions.
Similarly, Zientek, Nimon, and Hammack-Brown (2016) emphasize the impor-
tance of the underlying assumptions in pretest–posttest control group designs. Using a
simulated data set, the authors demonstrated that violations of statistical assumptions
lead to misestimating the benefits of an intervention. Scholars need to produce research
studies that are both trustworthy and relevant to HRD practitioners. In conjunction,
practitioners need to be able to identify research studies that are statistically sound.

Implications for Theory Development


Finally, neglecting to test the assumptions of regression analysis impedes theory
development. If assumptions are violated, then theory development can be derailed
indefinitely. Even in the unlikely event that all assumptions are met, scholars can
trust the results only after the assumptions have been thoroughly tested.

Regression Development System


The field of HRD is uniquely equipped to present these assumptions clearly and
concisely to ensure the integrity of statistical analysis and subsequent conclusions.
System theory is one of the foundational theories of HRD. A system consists of
unique parts (or elements) that interact and interrelate with each other within a given
environment. The system, in its entirety, functions as a whole. It is made up of
inputs, processes, and outputs as well as feedback or feedforward. Mapping out each
component of the system ensures the inclusion of all critical components in the sys-
tem (Jacobs, 2014).
Testing the principal assumptions of regression analysis is a process. As such, the
presentation of this process in a systems framework provides quantitative research-
ers a comprehensive plan to help determine the optimal statistical model for a par-
ticular data set. The input of the Regression Development System includes a
description of each assumption under regression analysis. The process entails the
most common tests performed for each assumption. The outputs are the limitations
of the study under OLS and the potential techniques that can be used to address
violations of the assumptions.
Through the use of this Regression Development System, the information needed
to demonstrate the statistical strength of the study is presented clearly and concisely.
This system is designed to check the assumptions in a specific order. The order takes
into consideration the robustness associated with the techniques to check each assump-
tion. According to Smith (2012), the statistical techniques for testing assumptions
should be performed from least to most robust. The four principal assumptions under
OLS should be analyzed in the following order: (a) the error terms are statistically
independent, (b) the error terms have a constant variance, (c) the error terms are nor-
mally distributed, and (d) the relationships between dependent and independent vari-
ables are linear and additive. For most data sets, one or more of these assumptions are
not valid.

Figure 1.  Regression Development System: Independence.

The Application of the Regression Development System to an Imperfect Data Set
With the availability of free statistical software, such as R or Python, practitioners
from any size organization can run regression analyses. Perktold, Seabold, and Taylor
(2018) provide an overview of the regression diagnostic tests and codes available in
Python. Python developers freely share code and advice to novice users.
An example of a study with an imperfect data set found in all organizations is one
capturing the change in job performance resulting from HRD interventions (e.g., S-OJT
training, coaching, or apprenticeships). Measuring the same individual’s job
performance before and after an HRD intervention indicates that the data most likely
violate Assumption 1 (independence of error terms) and Assumption 2 (homogeneity
of error terms). In essence, performance is influenced by past performance.
In addition, the dependent variable representing job performance could be the proba-
bility that the individual improved performance. The probability distribution violates
Assumption 3 (normal distribution of error terms). The Regression Development System
in Figures 1 to 4 reveals information that is needed to determine the limitations of the
study under OLS and construct the most appropriate model. The application of the
Regression Development System is exemplified in Table 5 using SAS. The coding and
results of a similar model are available in SAS, SPSS, Stata, and Python (Flatt, 2019). For
simplicity, the codes and examples are limited to dependent and independent variables. In
practice, the full model—including controls—should be evaluated.
From Table 5, the error terms are statistically independent. A graph of the residual
over time shows a random, rectangular shape. The Durbin–Watson D value of 1.824
confirms that the error terms are statistically independent.

Figure 2.  Regression Development System: Homogeneity.

Figure 3.  Regression Development System: Normality.

Second, the error terms do not have a constant variance. The heteroscedasticity can
be seen through the negative shape of the residual versus predicted value graph. In
comparison, the graph of a homoscedastic model would show a random, rectangular
shape. The White test confirms the result. The null hypothesis—the error terms have a
constant variance—is rejected at the alpha level of .05.

Figure 4. Regression Development System: Linearity.
Third, the Q-Q plot shows an S-shape, which indicates kurtosis. The normality tests
in SAS confirm that the error terms are not normally distributed. The null hypothe-
sis—the error terms are normally distributed—is rejected at the alpha level of .05.
Finally, the tests for linear and additive relationships find that the relationships between
dependent and independent variables are not linear.
The intent of the study and the magnitude of the violations determine the next
course of action. Minor violations of the OLS assumptions that do not interfere with
the integrity of the results are acceptable. With minor violations, the study should
include an acknowledgment of the limitations as well as a justification of the results
under the limitations.
For this specific example, three out of the four assumptions of OLS are violated.
The limitations under OLS include the following: (a) the regression results are not
BLUE, (b) the significance of the coefficients is in question, and (c) the coefficient
estimates are likely incorrect. Due to the seriousness of the violations, a different
model could be chosen rather than acquiescing to the constraints of OLS.
Choosing an LMM would address the lack of independence and homogeneity of the
error terms. Extending the model to a GLMM can accommodate models suspected to
have nonlinear relationships or residuals with nonnormal distributions, such as binomial
or gamma distributions (Anderson, Verkuilen, & Johnson, 2012). A GLMM accommodates
multi-level data and is particularly useful with longitudinal data sets. When
job performance is measured before and after an HRD intervention, each training
participant is associated with multiple measurements, creating two levels of data.
Table 5.  Example of the Regression Development System.

Assumption 1 (input): Error terms are statistically independent.
   Process (numerical): Durbin–Watson D = 1.824.
   Outcome (OLS limitations): None.
   Outcome (address): None.

Assumption 2 (input): Error terms have constant variance.
   Process (numerical): White test, test of first and second moment specification:
   DF = 9, χ2 = 23.94, Pr > χ2 = .0044.
   Outcome (OLS limitations): Regression results are not BLUE.
   Outcome (address): Consider using alternative model (GLMM).

Assumption 3 (input): Error terms are normally distributed.
   Process (numerical): Tests for normality: Kolmogorov–Smirnov D = 0.158, p < .010;
   Cramer–von Mises W2 = 40.111, p < .005; Anderson–Darling A2 = 268.399, p < .005.
   Outcome (OLS limitations): A better model exists; difficulty determining
   significance of coefficients.
   Outcome (address): Consider using alternative model (GLMM).

Assumption 4 (input): The relationships between dependent and independent variables
are linear and additive.
   Process (numerical): Correlations with job performance: Job performance = 1.00000;
   Literacy courses = −.05758 (p < .0001); Structured on-the-job
   training = .06443 (p < .0001); Coaching = .06731 (p < .0001).
   Outcome (OLS limitations): Incorrect estimations.
   Outcome (address): Transform variables (accomplished through the link function
   in GLMM).

Note. OLS = ordinary least squares; BLUE = best linear unbiased estimators;
GLMM = generalized linear mixed models. Graphical diagnostics for each assumption
appear in Figures 1 to 4.

Pictorially, the results of GLMM produce a unique line for each training participant
through the random effects (random intercept and random slope).
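A minimal sketch of such a model is a linear mixed model with a random intercept and random slope per participant, shown here in Python with statsmodels. The longitudinal training data (participant IDs, time points, performance scores) are simulated for illustration, not from the study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_participants, n_obs = 40, 6

# Hypothetical longitudinal data: repeated performance measurements
# per training participant
participant = np.repeat(np.arange(n_participants), n_obs)
time = np.tile(np.arange(n_obs), n_participants)

# Each participant gets their own intercept and slope (the random effects)
intercepts = rng.normal(0, 1, n_participants)[participant]
slopes = rng.normal(0.5, 0.2, n_participants)[participant]
performance = 10 + intercepts + slopes * time + rng.normal(0, 0.5, len(time))

df = pd.DataFrame({"participant": participant, "time": time,
                   "performance": performance})

# Random intercept and random slope on time for each participant
lmm = smf.mixedlm("performance ~ time", df, groups=df["participant"],
                  re_formula="~time").fit()
print(lmm.fe_params["time"])  # fixed-effect slope
```

The random effects give each participant a unique fitted line, while the fixed effect recovers the average slope across participants (here simulated as 0.5).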
The construction of a GLMM involves choosing a link function. The link function
is chosen based on theory and the distribution of the data. The distribution of the ran-
dom component of the dependent variable determines the type of GLMM and the link
function. Common link functions include logit, probit, logarithm, and multinomial
logit (Liao, 1994). For this particular example, a logit link function with a binomial
distribution would be a good option to accommodate the dependent variable (the prob-
ability of good job performance).

Conclusion
The field of HRD is uniquely equipped with the theory and models to present statistical
methods clearly, concisely, and completely. Based on systems theory, the Regression
Development System provides academics and practitioners a comprehensive overview
of model development as well as the details in each development phase. This
system provides authors and practitioners a tool to ensure that each principle assumption
of OLS is analyzed. In addition, the system is designed to assist with developing
models and identifying limitations.
The intent of the study and the severity of the assumption violations determine if
OLS is the appropriate type of regression. Minor violations of the assumptions might
be acceptable. For example, heteroscedastic models (violation of Assumption 2) are
problematic in forecasting, but coefficient estimates remain unbiased. Similarly, the
assumption of normality (Assumption 3) is technically only required for forecasting.
If the purpose of a study is to examine relationships, then violations of Assumptions 2
and 3 will not impact the results of the study. In such cases, the test of the assumptions
should still be performed, and the reason the violations are acceptable should be noted.
A minimum reporting requirement should include a list of assumptions, the diagnostic
tools, and the criteria. If the manuscript length is an issue, then the statistical
checks can be provided as supplemental information (Ernst & Albers, 2017). The
Regression Development System provides a tool to promote standard testing of OLS
assumptions and to adhere to the minimum reporting requirements. The principle
assumptions form the basis of OLS regression analysis. An increase in the awareness
and understanding of these assumptions by both authors and audience is necessary to
improve the quality of research.

Acknowledgment
C.F. thanks Eastern Illinois University’s Provost Office and Library.

Declaration of Conflicting Interests


The author(s) declared no potential conflicts of interest with respect to the research, authorship,
and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of
this article.

References
Anderson, C., Verkuilen, J., & Johnson, T. (2012). Applied generalized linear mixed models:
Continuous and discrete data. Retrieved from https://education.illinois.edu/docs/default-
source/carolyn-anderson/edpsy587/GLM_GLMM_LMM.pdf
Angrist, J., & Pischke, J. (2010). The credibility revolution in empirical economics: How better
research design is taking the con out of Econometrics. Journal of Economic Perspectives,
24, 3-30. doi:10.1257/jep.24.2.3
Bhalla, D. (2015). Detect non-linear and non-monotonic relationship between variables. Retrieved
from https://www.listendata.com/2015/03/detect-non-linear-and-non-monotonic.html
Chen, X., Ender, P., Mitchell, M., & Wells, C. (2003a). Regression with SAS. Retrieved from
https://stats.idre.ucla.edu/stat/sas/webbooks/reg/default.htm
Chen, X., Ender, P., Mitchell, M., & Wells, C. (2003b). Regression with Stata. Retrieved from
https://stats.idre.ucla.edu/stat/stata/webbooks/reg/default.htm
Chen, X., Ender, P., Mitchell, M., & Wells, C. (2003c). Regression with SPSS. Retrieved from
https://stats.idre.ucla.edu/stat/spss/webbooks/reg/default.htm
Choi, Y. J., Lee, C., & Jacobs, R. (2015). The hierarchical linear relationship among struc-
tured on-the-job training activities, trainee characteristics, trainer characteristics, training
environment characteristics, and organizational characteristics of workers in small and
medium-sized enterprises. Human Resource Development International, 18, 499-520. doi:
10.1080/13678868.2015.1080046
Ernst, A., & Albers, C. (2017). Regression assumptions in clinical psychology research prac-
tice—A systematic review of common misconceptions. PeerJ, 5, e3323. Retrieved from
https://peerj.com/articles/3323.pdf
Flatt, C. (2019). Selected works of Candace Flatt. Retrieved from https://works.bepress.com/
candace-flatt/
Flatt, C., & Jacobs, R. (2018). The relationship between participation in different types of
training programs and gainful employment for formerly incarcerated individuals. Human
Resource Development Quarterly, 29, 263-286. doi:10.1002/hrdq.21325
Fletcher, L. (2016). Training perceptions, engagement, and performance: Comparing work
engagement and personal role engagement. Human Resource Development International,
19, 4-26. doi:10.1080/13678868.2015.1067855
Garavan, T., McCarthy, A., Sheehan, M., Lai, Y., Saunders, M., Clarke, N., & Shanahan, V.
(2019). Measuring the organizational impact of training: The need for greater methodologi-
cal rigor. Human Resource Development Quarterly, 2019, 1-19.
Hoekstra, R., Kiers, H., & Johnson, A. (2012). Are assumptions of well-known statistical tech-
niques checked, and why (not)? Frontiers in Psychology, 3, 137. Retrieved from https://
www.ncbi.nlm.nih.gov/pmc/articles/PMC3350940/pdf/fpsyg-03-00137.pdf
IBM. (2016). Retrieved from http://www-01.ibm.com/support/docview.wss?uid=swg21476748
Ismail, H. (2016). Training and organizational commitment: Exploring the moderating role of
goal orientation in the Lebanese context. Human Resource Development International, 19,
152-177. doi:10.1080/13678868.2015.1118220

Jacobs, R. L. (2014). Chapter two: System theory and HRD. In N. Chalofsky, T. Rocco, & L.
Morris (Eds.), Handbook of human resource development (pp. 21-39). New York, NY:
Wiley.
Liao, T. (1994). Interpreting probability models: Logit, probit, and other generalized linear
models. Thousand Oaks, CA: SAGE Publications.
Moore, R., & Gorman, P. (2009). The impact of training and demographics in WIA program
performance: A statistical analysis. Human Resource Development Quarterly, 20, 381-396.
doi:10.1002/hrdq.20029
Nau, R. (2018). Statistical forecasting: Notes on regression and time series analysis. Retrieved
from http://people.duke.edu/~rnau/testing.htm
Nimon, K. (2011). Improving the quality of quantitative research reports: A call for action.
Human Resource Development Quarterly, 22, 387-393. doi:10.1002/hrdq.20091
Nimon, K. (2012). Statistical assumptions of substantive analyses across the general linear
model: A mini review. Frontiers in Psychology, 3, 322. doi:10.3389/fpsyg.2012.00322
Nimon, K. (2017). HRDQ submissions of quantitative research reports: Three common com-
ments in decision letters and a checklist. Human Resource Development Quarterly, 28,
281-298. doi:10.1002/hrdq.21290
Osborne, J., & Waters, E. (2002). Four assumptions of multiple regression that researchers
should always test. Practical Assessment, Research, & Evaluation, 8(2), 1-5. Retrieved
from https://pareonline.net/getvn.asp?n=2&v=8
Osborne, J. (2013). Is data cleaning and the testing of assumptions relevant in the 21st century?
Frontiers in Psychology, 4, 370. doi:10.3389/fpsyg.2013.00370
Park, H. (2008). Univariate analysis and normality test using SAS, Stata, and SPSS (Working
Paper). Retrieved from http://download.brookerobertshaw.com/normality_testing_sas_
spss.pdf
Perktold, J., Seabold, S., & Taylor, J. (2018). Regression diagnostics and specification tests.
Retrieved from https://www.statsmodels.org/dev/diagnostic.html
Smith, M. (2012). Common mistakes in using statistics. Retrieved from https://web.ma.utexas.
edu/users/mks/statmistakes/TOC.html
Song, J. H., & Lim, D. H. (2015). Mediating analysis approaches: Trends and implications for
advanced applications in HRD research. Advances in Developing Human Resources, 17,
57-71. doi:10.1177/1523422314559807
Turner, J. (2015). Hierarchical linear modeling: Testing multilevel theories. Advances in
Developing Human Resources, 17, 88-101. doi:10.1177/1523422314559808
West, B., Welch, K., & Galecki, A. (2015). Linear mixed models: A practical guide using
statistical software. Boca Raton, FL: CRC Press.
Williams, R. (2015). Nonlinear relationships. Retrieved from https://www3.nd.edu/~rwilliam/
stats2/l61.pdf
Wooldridge, J. (2013). Introductory econometrics: A modern approach. Mason, OH:
South-Western, Cengage Learning. Retrieved from http://economics.ut.ac.ir/documents/3030266/14100645/Jeffrey_M._Wooldridge_Introductory_Econometrics_A_Modern_Approach__2012.pdf
Zientek, L., Nimon, K., & Hammack-Brown, B. (2016). Analyzing data from a pretest-post-
test control group design: The importance of statistical assumptions. European Journal of
Training and Development, 40, 638-659. doi:10.1108/EJTD-08-2015-0066

Author Biographies
Candace Flatt, PhD, is an Economist with the Illinois Department of Employment Security and
an instructor at Millikin University. She earned her doctorate from the University of Illinois at
Urbana–Champaign in Education Policy, Organization, & Leadership with concentrations in
Human Resource Development and Higher Education.
Ronald L. Jacobs, PhD, is a professor of human resource development at the University of
Illinois at Urbana–Champaign.
