Professional Documents
Culture Documents
In novometric theory the minimum denominator 100/(ESS/s)-s. For example, for a model with
selection algorithm identifies the descendant s=3 and ESS=60, D=100/(60/3)-3=2: two more
family (DF), consisting of one or more optimal effects of equivalent efficiency are needed to
models, within which the globally optimal (GO) achieve an ideal statistical solution. As seen, the
model for an application resides. In applications distance D of an empirical model (asterisk) from
with a DF consisting of multiple models, the set a theoretically ideal model (upper right-hand
of optimal models differ with respect to their corner) for an application is a function of both
predictive accuracy normed vs. chance by the accuracy and parsimony of the empirical model.
ESS statistic (0=predictive accuracy expected Normed over accuracy and parsimony, D may
by chance; 100=errorless prediction), and their thus be used to directly compare the quality of
complexity defined as the number of sample competing models regardless of their underlying
strata (endpoints) in the model. Observations are architecture—each model considered relative to
homogeneous with respect to the attributes that its corresponding theoretically ideal model.
define the endpoint within a given endpoint, and
are heterogeneous with respect to the defining Figure 1: ESS by Complexity Space, the
attributes between model endpoints.1,2 Theoretically Ideal Statistical Model, and D
Novometric analysis identifies the GO
ESS/Strata Theoretically
model in the DF representing the “best” combi- Ideal Model
nation of predictive accuracy and parsimony M D
o
(the antithesis of complexity), defined as ESS d
divided by the number of model strata (i.e., the e
Empirical
l
mean ESS-per-endpoint obtained by the model).
This assessment requires creating a square E
f
Model
*
Cartesian space, crossing ESS by efficiency as f
i
abscissa and ordinate, respectively. In Figure 1, c
i
for example, for a two-strata model the effi- e
ciency axis ranges from 0 to 100/2=50; for a n
c
three-strata model the efficiency axis ranges y
from 0 to 100/3=33.3; and for a model with s
strata the efficiency axis ranges from 0 to 100/s. 0 25 50 75 100
The distance of an empirical model from Model ESS
a theoretically ideal model1 is computed as: D=
171
Optimal Data Analysis Copyright 2016 by Optimal Data Analysis, LLC
Vol. 5 (November 4, 2016), 171-174 2155-0182/10/$3.00
172
Optimal Data Analysis Copyright 2016 by Optimal Data Analysis, LLC
Vol. 5 (November 4, 2016), 171-174 2155-0182/10/$3.00
173
Optimal Data Analysis Copyright 2016 by Optimal Data Analysis, LLC
Vol. 5 (November 4, 2016), 171-174 2155-0182/10/$3.00
10
References Schwarz G (1978). Estimating the dimension
1
of a model. Annals of Statistics, 6, 461–464.
Yarnold PR, Soltysik RC (2016). Maximizing
predictive accuracy. Chicago, IL: ODA Books. Author Correspondence
DOI: 10.13140/RG.2.1.1368.3286
2
Mail: Optimal Data Analysis, LLC
Yarnold PR (1996). Discriminating geriatric 6348 N. Milwaukee Ave., #163
and non-geriatric patients using functional status Chicago, IL 60646
information: An example of classification tree
analysis via UniODA. Educational and Psycho-
logical Measurement, 56, 656-667.
3
Yarnold PR, Soltysik RC (2005) Optimal data
analysis: A Guidebook with Software for Win-
dows. Washington, DC: APA Books.
4
Rosenbaum PR, Rubin DB (1983). The central
role of propensity score in observational studies
for causal effects. Biometrika, 70, 41–55.
5
Linden A, Yarnold PR (In Press). Combining
machine learning and propensity score
weighting to estimate causal effects in multi-
valued treatments. Journal of Evaluation in
Clinical Practice.
6
Linden A, Yarnold PR (In Press). Combining
machine learning and matching techniques to
improve causal inference in program evaluation.
Journal of Evaluation in Clinical Practice.
7
Linden A, Yarnold PR (In Press). Using
machine learning to assess covariate balance in
matching studies. Journal of Evaluation in
Clinical Practice. DOI: 10.1111/jep.12538
8
Linden A, Yarnold PR (In Press). Using data
mining techniques to characterize participation
in observational studies. Journal of Evaluation
in Clinical Practice. DOI: 10.1111/jep.12515
9
Akaike H (1973). Information theory and an
extension of the maximum likelihood principle.
In: Second International Symposium on Infor-
mation Theory, BN Petrov and F Csaki (Eds.),
Budapest: Akailseoniai–Kiudo (pp. 267–281).
174