
Does Size Matter?

Exploring the
Small Sample Properties of
Maximum Likelihood Estimation
Robert A. Hart, Jr.
Department of Political Science
University of North Texas
Denton, Texas 76203
hart@unt.edu

David H. Clark
Department of Political Science
Florida State University
Tallahassee, Florida 32306-2230
dclark@garnet.acns.fsu.edu

April 20, 1999

 We are grateful to Chris Mooney, Evan Ringquist and Mitch Sanders for helpful con-
versations during the course of this project. Prepared for presentation at the Annual
Meeting of the Midwest Political Science Association, Chicago, IL, March 1999. An ear-
lier version of this project was presented at the Midwest Conference, 1997.
Abstract
The last two decades have witnessed an explosion in the use of
computationally intensive methodologies in the social sciences as com-
puter technology has advanced. Among these empirical methods are
Maximum Likelihood (ML) procedures. ML estimators possess a va-
riety of desirable qualities, perhaps most prominent of which is the
asymptotic efficiency of the standard errors. However, the behavior
of the estimators in general, of the estimates of the standard errors
in particular, and thus of inferential hypothesis tests is uncertain in
small sample analyses. In political science research, small samples
are routinely the subject of empirical investigation using ML meth-
ods, yet little is known regarding what effect sample size has on a
researcher’s ability to draw inferences.
This paper explores the behavior of ML estimates in probit mod-
els across differing sample sizes and with varying numbers of inde-
pendent variables in Monte Carlo simulations. Our experimental re-
sults allow us to conclude that: a) the risk of making Type I errors
does not change appreciably as sample size descends; b) the risk of
making Type II errors increases dramatically in smaller samples and
as the number of regressors increases.
1 Introduction
Maximum Likelihood (ML) Estimation is a class of procedures that has
been recognized since the 1920s and that received particular attention in
the work of R.A. Fisher as he searched for a solution to the “inverse prob-
ability” question. But ML estimation, being computationally intensive,
has only become widely used since computer technology has advanced
far enough to make its use tractable. With ML’s emergence as a viable and
valuable tool for estimation, questions regarding the properties of ML esti-
mators are more pressing. While its asymptotic properties are recognized
and can be demonstrated, little is known about the small sample proper-
ties of ML estimators, though one might suspect intuitively that efficiency
and unbiasedness of parameters are no longer given at smaller values of
n.
How estimators of any kind behave in small samples is of practical
importance in political science given the frequency with which political
scientists examine empirical phenomena in limited samples. Students of
American politics, for example, are often limited in their empirical exam-
inations to 50 states or fewer (see, for example, Erikson, Wright & McIver
1987, Erikson, Wright & McIver 1989, Gray & Lowery 1988, Hall 1992,
Hill & Leighley 1992, Hill & Leighley 1993, Lowery & Gray 1993, Lowery
& Gray 1994, Patterson & Caldeira 1984, Quinn & Shapiro 1991, Ringquist
1993, Smith & Meier 1995), unless time-series data
are available and cross-sections can be pooled. Though international re-
lations researchers often benefit from lengthy time-series data on a large
number of cross-sections, a variety of important studies, especially those
testing domestic politics hypotheses, have relatively small samples as well
(see Fearon 1994, Morgan & Bickers 1992, Morrow 1991, Ostrom & Job
1986, Wang & Ray 1994). Likewise, cross-national studies of voting behav-
ior often are limited in scope and suffer from small samples (e.g. Jackman
1987, Radcliff 1992). Other studies examine single time-series (Williams
1990) or a phenomenon with limited cross-sections like legislators’ votes
(Bartels 1991).
In the following pages we attempt to identify more clearly the behavior
of ML estimates at varied sample sizes. We design a Monte Carlo proce-
dure to examine coefficient and standard error estimates in probit mod-
els, paying special attention to the issue of inference. Most commonly,
researchers express concern that sample size problems may influence the
ratio of coefficient to standard error upon which statistical inference is
based. Our experiments are designed so that we can specifically exam-
ine the frequency with which our models, at varying sample sizes and
with varying numbers of independent variables, lead us to make inferen-
tial mistakes. Finally, we examine a limited number of bootstrap standard
error estimates in order to assess the improvement bootstrap resampling
procedures provide when small samples reduce our confidence in statisti-
cal estimates and the corresponding inferences.

2 The Problem
The statistical properties of maximum likelihood estimators and conven-
tional wisdom collectively suggest that sample size should be important
both to estimation and to inference. Yet, political science research rou-
tinely reports models examining samples of 50 cases or fewer (see the par-
tial list in the section above for examples). What constitutes an adequate
sample size and avoids sample size-related problems is not at all clear.
In fact, the problem has generally been dismissed as minimal because of
the desirable asymptotic properties of ML estimates (Greene, 1993: 306ff).
Discussing the problem, one scholar writes “in the typical ML estimation
procedure, one would want to have a large sample size because the de-
sirable properties of the MLE . . . are justified only in large sample situ-
ations” (Eliason 1993, 8). Eliason goes on to say in a footnote that “with
few parameters to estimate (i.e., 1 to about 5), a sample size of more than
60 is usually large enough” (1993, fn. 2). However, most of the litera-
ture dealing with maximum likelihood estimation does not even provide
these guidelines and rather cryptically refers to the asymptotic properties
of maximum likelihood estimates (for such discussions see Davidson &
Mackinnon 1993, Greene 1993, Kmenta 1986).
Researchers generally acknowledge the possibility, however, that their
estimates may suffer in small samples. In areas of research where sample
size and population size are nearly the same and are small, few solutions
are apparent, though pooling cross-sections across time has provided one
increasingly-used outlet. State politics research is such a subfield, the sam-
ple and population of 50 states limiting the faith some researchers may
have in inferential statistics.¹

¹ Of course, this raises the much debated question of whether tests of significance are
appropriate in such cases. If the study comprises the population of cases, inference
is unnecessary and thus, so are significance tests. However, some argue that the popu-
lation under study is still a sample representing all possible cross-sections, so inferential
statistics are appropriate. Generally, state politics research reports significance tests, thus
implying that inferences are important to the interpretation of results.

While many of these small sample analyses employ least squares, a
growing number are turning to maximum likelihood methods. More than
least squares, ML relies heavily upon the data themselves as a source
of information regarding the distributional process that is most likely to
have generated them. This suggests immediately that the consequences
of small sample size may differ between LS and ML methods. Whereas
least squares has a simple decision rule by which it determines the shape
of the relationship between Y and X, ML’s decision rule is somewhat more
complex. Least squares simply minimizes the sum of the squared error
terms across observations. ML, on the other hand, seeks to maximize the
probability that the observed data are the result of a particular set of dis-
tributional parameters.
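To state the two decision rules compactly (our notation, summarizing the
contrast just described): least squares chooses

β̂_LS = argmin_β Σᵢ (yᵢ − x′ᵢβ)²

while maximum likelihood chooses

θ̂_ML = argmax_θ Σᵢ log f(yᵢ | xᵢ; θ).

ML’s answer thus depends on how sharply the likelihood surface discrim-
inates among candidate values of θ, and that discrimination is exactly
what sparse data erode.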
The key difference between these two estimation strategies is the extent
to which the amount of information in the data facilitates the production of
estimates. While least squares can minimize the sum squared error vir-
tually without regard to the number of data points available to it, ML’s
maximization problem is made increasingly difficult when it lacks data
points from which to derive information. From the point of view of ML,
a paucity of data leaves open the possibility that any number of processes
generated the data. For instance, an ML procedure conducted on a sam-
ple of 20 cases ultimately assumes that the 20 cases are representative of
the population and that the shape of the curve is adequately represented
by those 20 cases. On the other hand, a similar procedure conducted on
a sample of 200 gains a substantial amount of information regarding the
shape of the curve and leaves less uncertainty between data points. The
larger sample allows ML to distinguish the probabilities that any particu-
lar process generated the data.
Least squares does not really suffer from this problem since it is not
a process based in likelihood theory, but again, relies on the simple min-
imization decision rule. Though LS, especially in small samples, is sus-
ceptible to the effects of outlying data points, the procedure does not have
to discern between the probabilities that different lines best fit the data.
Rather, it simply establishes which line simultaneously minimizes the dis-
tances between the observed and predicted data points without regard to
the relative likelihoods of different lines.
The informational aspect of ML as an estimation process has impor-
tant implications for drawing inferences based on ML results. Though
ML is known to produce asymptotically efficient standard errors, the ex-
tent to which standard error estimates begin to fail in small samples is
largely unknown.² Of course, the underpinnings of statistical theory and
conventional wisdom suggest that standard errors should become ineffi-
cient (inflated) as sample size declines. This has important implications for
classical hypothesis testing, especially for the two types of errors against
which researchers try to protect themselves.

² Important theoretical work on the behavior of ML in small samples can be found in
Shenton & Bowman (1977). However, we are not aware of research that directly addresses
the practical implications of sample size for inference.

2.1 Expectations about Type I errors


Type I errors occur when the ratio of β̂ to its standard error produces a test
statistic large enough to permit the rejection of the null hypothesis even
though the null hypothesis is true. The result of a Type I error is a false positive finding.
Since ML’s asymptotically efficient standard errors likely become inflated
in smaller sample sizes, the likelihood of rejecting a null hypothesis de-
clines. As a result, the likelihood that small samples will induce Type I
errors is small. In fact, researchers often express more concern regarding
false positive findings when sample sizes are exceptionally large. In sum,
we anticipate that declining sample size will not increase the occurrence
of Type I errors.

2.2 Expectations about Type II errors


Type II errors, on the other hand, occur when the ratio of β̂ to its standard
error produces a test statistic too small to allow the rejection of the null
hypothesis even though the null hypothesis is false. A Type II error leads us to a false
negative finding. Again, because we generally expect ML’s standard error
estimates to become inflated at smaller sample sizes where the asymptotic
property is not operative, the resulting test statistics should diminish in
size as sample sizes decline. In the end, rejecting the null hypothesis be-
comes less probable and we should expect an elevation in the number of
Type II errors we encounter.

3 Monte Carlo Design


Since our primary concern at this time is to investigate the inferential prob-
lems that may occur when scholars utilize ML methods in small samples,
our experimental design is fairly straightforward. We proceed in two
steps, first investigating Type I errors and then exploring Type II errors.
We do so by generating sets of data such that we know the relationship be-
tween the independent and dependent variables. We then reverse the pro-
cess and calculate estimates of β and the variance-covariance matrix such
that we can calculate z-scores and see how often Type I and II errors are
made at various sample sizes. Since the most common ML models used
by political scientists involve binary dependent variables we restrict our
focus for now to models where the error-process is normally distributed
(Probit models). In an effort to get as much out of our design as possible,
we run 1,000 models at each sample size and measure the rate of Type I/II
errors at various sample sizes. All Monte Carlo simulations are run using
the GAUSS statistical modeling package (Aptech Systems).

3.1 Data Generation


To measure the rate of Type I errors, data generation is simple. Since a
Type I error occurs when we falsely reject the null hypothesis of no relation-
ship between the dependent and independent variables, we create both the
dependent and independent variables as random draws from the normal
distribution. Since we are investigating binary response models we must
then dichotomize the dependent variable according to some decision rule.
We generate the data as follows:

x ∼ N(0, 1)
y* ∼ N(0, 1)
y = 1 if y* > 0,   y = 0 if y* < 0

For Type II errors, we must create a model wherein there is a relation-
ship between the dependent and independent variables. The x data are
generated as above (x ∼ N(0, 1)), but now y* is generated as a function of
the x data and some error process:

y* = Xβ + u

where X is a matrix of normally distributed random variables, β is a vec-
tor of “true” parameters for the experiment, and u ∼ N(0, 1). The depen-
dent variable is dichotomized as above:

y = 1 if y* > 0,   y = 0 if y* < 0
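The generation scheme is easy to replicate. Here is a minimal sketch in
Python with NumPy (a stand-in for the GAUSS code we actually use; the
variable names and seed are ours):

import numpy as np

rng = np.random.default_rng(0)                 # illustrative seed
n, k = 50, 3

# Type I design: X and y* are independent draws, so the true betas are
# zero and any "significant" coefficient is a false positive.
X = rng.standard_normal((n, k))
y_null = (rng.standard_normal(n) > 0).astype(int)

# Type II design: y* = X @ beta + u with u ~ N(0, 1), then dichotomized,
# so the true betas are nonzero and a missed coefficient is a false negative.
beta = np.array([2.0, -4.0, 5.0])              # "true" parameters, as in Table 5
y_alt = (X @ beta + rng.standard_normal(n) > 0).astype(int)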

3.2 Probit
As we indicate above, we use Probit models to investigate the more gen-
eral properties of ML estimators at various sample sizes (for the Type II
errors, Probit is the appropriate method given the distribution of u; for
Type I errors, Probit is appropriate given the distribution of y*). Estima-
tion is standard ML for Probit, where the following log-likelihood function
is maximized numerically:

log L = Σᵢ [ yᵢ log Φ(x′ᵢβ) + (1 − yᵢ) log(1 − Φ(x′ᵢβ)) ]
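In code, the numerical maximization can be sketched as follows (again
Python standing in for our GAUSS routine, using SciPy’s optimizer; the
two data lines simply recreate a small Type II design like the one above):

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
y = (X @ np.array([2.0, -4.0]) + rng.standard_normal(100) > 0).astype(int)

def neg_loglik(beta):
    """Negative probit log-likelihood; norm.logcdf(-xb) equals log(1 - Phi(xb))."""
    xb = X @ beta
    return -np.sum(y * norm.logcdf(xb) + (1 - y) * norm.logcdf(-xb))

# Maximize numerically from null (zero) starting values, as in our experiments.
res = minimize(neg_loglik, x0=np.zeros(X.shape[1]), method="BFGS")
beta_hat = res.x
se_hat = np.sqrt(np.diag(res.hess_inv))  # rough SEs from BFGS's inverse-Hessian approximation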

3.3 Sample Sizes


Previous research (Hart and Clark 1997) shows that with one independent
variable problems of inference only appear in very small samples (n < 30).
As such, we conduct our present analyses by looking at samples ranging
from n = 10 to n = 200 at intervals of 10 (i.e., n = 200, 190, 180, . . . , 10). It is our
belief that a sample size of 200 should be sufficient for a small number of
independent variables, so that is our upper limit for the experiment.

3.4 Number of Independent Variables


Building on our previous research, we expand by increasing the number
of independent variables in the experiment in a stepwise fashion. In this
study we sequentially increase the number of independent variables from
one to five. Our goal is to see what impact on inference occurs with the
addition of each independent variable.
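Putting Sections 3.1 through 3.4 together, the experiment is a set of nested
loops. The following skeleton (ours; the actual runs use GAUSS) shows
the structure, with the per-trial work left as comments:

import numpy as np

TRIALS = 1000
SAMPLE_SIZES = range(200, 0, -10)      # n = 200, 190, ..., 10

for k in range(1, 6):                  # one to five independent variables
    for n in SAMPLE_SIZES:
        errors = np.zeros(k)
        for trial in range(TRIALS):
            # 1. generate X and y for the design at hand (Section 3.1)
            # 2. estimate the probit by ML (Section 3.2)
            # 3. compute z-scores and tally inferential errors (Section 3.5)
            pass
        rate = errors / TRIALS         # per-variable error rate for this (k, n) cell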

3.5 Tests for Significance


Finally, the main thrust of our results centers on errors of inference. For
Type I errors, we calculate z-scores from our estimates of β and of the stan-
dard error of β̂. Then, since Type I errors are any false rejections of the
null, we calculate the rate of Type I error-making by counting the number
of times the z-score falls outside the range −1.96 < z < 1.96. Thus we are
setting our alpha rate at .05 for the two-tailed test, which is fairly standard
for the discipline. We then divide the number of errors by the number of
trials to get an estimate of the rate of Type I error occurrence (which should
be, of course, .05).

For Type II errors we also calculate z-scores as above. Since we set the
values of β prior to each experiment, in this case we are explicitly testing
directional hypotheses. Thus, depending upon the sign of the elements
of β, we count the number of times that the z-score is either greater than
−1.65 or less than 1.65 (falsely failing to reject the null at the .05 level)
and divide by the number of trials.
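Given the vector of z-scores from one trial, the tallies are mechanical. A
sketch of the two counting rules (our own illustration, assuming NumPy):

import numpy as np

def type1_flags(z):
    """Null true: any |z| > 1.96 is a false rejection (two-tailed, alpha = .05)."""
    return np.abs(z) > 1.96

def type2_flags(z, beta_true):
    """Null false: failing to reach 1.65 in the direction of the true beta
    is a false acceptance (directional test, alpha = .05)."""
    b = np.asarray(beta_true)
    return np.where(b > 0, z < 1.65, z > -1.65)

Summing each indicator over the 1,000 trials and dividing by the number
of trials, column by column, gives the per-variable rates reported in the
tables below.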
There is one more problem which must be addressed. Oftentimes (es-
pecially in smaller samples) parameter estimates go haywire and the Hes-
sian matrix, whose negative inverse provides the variance-covariance ma-
trix, fails to invert. In these cases (where, effectively, the iterative search
procedure fails and the estimates do not converge) we set the standard
error of the estimate to a very large number (1 × 10²⁰). This accounts for
the lack of Type I errors (which are not being made when the model does
not converge) and for the occurrence of Type II errors (which are being
made when the model does not converge).
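In the simulation loop, this rule is a simple guard around the estimation
call. A sketch (assuming NumPy and a fit_probit helper like the optimizer
sketched above; both names are ours):

import numpy as np

def safe_z(X, y, fit_probit, k):
    """Return z-scores, substituting a huge SE when estimation breaks down."""
    try:
        beta_hat, se_hat = fit_probit(X, y)
    except Exception:                   # Hessian fails to invert / no convergence
        beta_hat = np.zeros(k)
        se_hat = np.full(k, 1e20)       # our rule: SE set to 10^20
    return beta_hat / se_hat            # z is then ~0: no Type I, certain Type II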
Finally, for our graphic presentation and most of our discussion we will
use the average error rate over the k independent variables in the models.
In general the error rate does not differ much from one variable to another
but some interesting things do happen when we have four and five inde-
pendent variables that we will discuss below.

4 Monte Carlo Results and Discussion


4.1 Type I Errors
We begin by analyzing Type I errors as sample size decreases, remember-
ing our above conjecture that the rate of Type I errors should not be abnor-
mal. The results for a single independent variable are shown in Table One
and graphically in Figure One.

*****Table One and Figure One About Here*****

Although there is some variation in the results, they generally fluctuate
around .05 and there is no real pattern with respect to sample size. The
only exception to this is at n=10, where the rate of Type I error-making
drops significantly. We suggest that the reason for this is exactly the same
reason that we expect to observe significantly more Type II errors as sam-
ple sizes get small. Namely, in a sample with only ten observations, it
would seem difficult to uncover relationships that exist. Uncovering rela-
tionships that do not exist with so little information would seem extremely
unlikely. When we consider the number of times that models fail to con-
verge in very small samples, this result is easily explained.
The results with two independent variables are very similar. For the
graphical display we take the average of the Type I error rates across the
independent variables. The results are given in Table Two and Figure Two.

*****Table Two and Figure Two About Here*****

The rates of Type I error-making are slightly higher, generally staying just
above .05. Still, the results do not seem drastic or to warrant concern.
Without expending a great deal of computing time, we conduct a quick
check on a much higher number of independent variables. With sample
sizes of 60, 50 and 40 and k = 7, the Type I error rate across all of the vari-
ables is zero. No Type I errors are made in the 3,000 trials and we suspect
that there is little or no problem with Type I errors in small-sample ML
models.

4.2 Type II Errors


4.2.1 One Independent Variable
The results for Type II errors are much less comforting. We will present
and discuss the results beginning with one independent variable and mov-
ing sequentially up to five. Table Three and Figure Three present the first
results.

*****Table Three and Figure Three About Here*****

With one independent variable, the Type II error rate does not move above
.05 until the sample size moves below 40. At n=30 the probability of mak-
ing a Type II error is .11, more than twice the acceptable rate. This increases
dramatically as n goes to 20 and 10. It is comforting, though, to note that
the Type II error rate is essentially nonexistent with larger samples, con-
firming our beliefs in the desirable asymptotic properties of ML.

4.2.2 Two Independent Variables


With two independent variables, the Type II error rate moves above .05
at a larger sample size. Table Four and Figure Four present the two-
independent variable results.

*****Table Four and Figure Four About Here*****

At n=50, the rate of Type II errors is a little more than .06, roughly five times
as high as the rate at n=50 with one independent variable. As was seen
in the one variable case, the rate of error increases sharply as the sample
size moves to ten. Still, the problem does not seem so severe just yet as
most researchers have samples larger than 50 (although note the particular
problem facing state politics researchers).

4.2.3 Three Independent Variables


When a third independent variable is added to the model, things get sig-
nificantly worse. The Type II error-rate crosses the .05 threshold at n=130.

*****Table Five and Figure Five About Here*****

Even worse, in very small samples (n ≤ 30), the likelihood of miss-
ing true relationships is above .9. Considering the fact that most political
science researchers have far more than three independent variables (and
the call of many to add variables to avoid misspecification) the nature and
seriousness of the problem is beginning to crystallize. We are also be-
ginning to get some sense as to the number of observations needed per
independent variable to avoid making Type II errors. It would appear
that about 30-50 observations per variable might be necessary and this is
something we will look more carefully at in the next two sections. One in-
teresting aspect of these results is how the true value of β affects the Type
II error rate. It would seem that the larger β is, the fewer errors are
made, even if only marginally so. We will address this again below.

4.2.4 Four Independent Variables


As expected, the rate of Type II errors increases again with the addition of
another independent variable.

*****Table Six and Figure Six About Here*****

The Type II error-rate threshold is crossed between n=160 and n=150,
marginally higher than in the three variable case. Additionally, researchers
will make Type II errors almost half of the time with a sample of reason-
able size (n=80). With the addition of a fourth independent variable where
β₄ = -1, we see that the null hypothesis is falsely retained at a higher rate
than for the other variables. Given that we use null starting values for the
estimation procedure this is somewhat surprising.

4.2.5 Five Independent Variables
Our threshold level of .05 is crossed between n=190 and n=180 with five
independent variables.

*****Table Seven and Figure Seven About Here*****

This supports the very general conclusion that somewhere between 30 and
50 cases are needed for each independent variable in the model. Clearly
this is not a standard that is easy to meet for many researchers. And once
again we see that with β₅ = 1, the error rates for both β₄ and β₅ are substan-
tially higher than for the other variables. It may be the case that since all
x variables are standard normal, ML is simply better able to find relation-
ships with a bigger substantive impact than those with marginal impacts
(since this is basically what β represents). If so, this is not a terribly trou-
bling finding but one worth consideration.
To get a better feel for how the Type II error rate increases as sample
size dwindles and the number of independent variables increase, we com-
bine Figures Three through Seven into a single figure which represents all
five error rate patterns.

*****Figure Nine About Here*****

The only puzzling aspect of the distribution of error rates is the large gap
between k = 2 and k = 3. Above and below this break there are fairly uniform
increases in the Type II error rates for each independent variable added to
a model. Before discussing the implications of this research and some po-
tential limitations we would like to present the results from a very limited
investigation of bootstrap methods with small samples.

5 Bootstrap Estimates
Since our original experimental design uses 1,000 trials for each sample
size, we have so far presented the results of 140,000 trials. One possi-
ble solution to researchers faced with small samples is to generate es-
timates and standard errors empirically by applying the bootstrap tech-
nique (Mooney 1996). When we bootstrap a sample we essentially inflate
the sample artificially by drawing a predetermined number of cases from
the original sample, but doing so with replacement. Ideally we could eval-
uate the utility of the bootstrap by drawing samples of, say, 500 to 1,000
(for each trial) and still running the 1,000 trials as was done for the above
experiments. Unfortunately, this would result in between 10,000,000 and
20,000,000 trials (for each value of k) and would take several weeks. As
such, we have conducted a very brief exploration of the bootstrap by set-
ting the bootstrap sample size at 200 and the number of trials at 100, and
by investigating only n = 50, 40, 30, 20 and 10 with k = 5. Thus with only 1,000,000 trials
we are able to deliver preliminary results after only a few days of comput-
ing time. While the efficiency of the estimates might improve greatly with
a larger number of trials (and larger bootstrap samples) the results offer at
least a modicum of hope. The results are shown in Table Eight and Figure
Eight.

*****Table Eight and Figure Eight About Here*****

Looking back at Table Seven and Figure Seven we see that the probability
of making a Type II error with five independent variables is above .9 for
n < 50. For the bootstrap models, the error rate ranges from .85 to .67.
While we must remember that these results are much less robust since only
100 trials are performed at each sample size, the reduction in Type II error
rate is encouraging. More research will have to be done to see if increasing
the bootstrap sample size reduces the Type II error rate to an acceptable
level. We must also remember that running a bootstrap estimate of very
large size for a single sample would not take much time at all.
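The resampling scheme itself is simple. A minimal sketch of the proce-
dure (Python again, with the statsmodels Probit estimator standing in for
our GAUSS code; the function and argument names are ours):

import numpy as np
import statsmodels.api as sm

def bootstrap_probit(X, y, m=200, reps=100, seed=0):
    """Refit a probit on `reps` resamples of size m drawn with replacement,
    then summarize the coefficient draws (mean and empirical SE)."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(reps):
        idx = rng.choice(len(y), size=m, replace=True)   # inflate n up to m
        try:
            draws.append(sm.Probit(y[idx], X[idx]).fit(disp=0).params)
        except Exception:
            continue                    # skip resamples that fail to converge
    draws = np.asarray(draws)
    return draws.mean(axis=0), draws.std(axis=0, ddof=1)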

6 Discussion
Though these results are preliminary we can learn a great deal from these
experiments. The most encouraging results we find are the lack of Type
I error problems in small samples. Essentially, this tells us that the dis-
cipline is not being moved in false directions through the publication of
results that are generated by small sample problems with the estimation
technique. Additionally, since the problem with Type II errors seems so
pernicious we might actually have more faith in the positive findings that
do exist in small sample ML projects.
The bad news is that it is very likely that there are many true polit-
ical relationships that scholars have been unable to discover because of
the limitations of ML in small samples. Also, we need to be very wary
of conclusions being drawn by researchers based on non-results when us-
ing small sample ML designs. Our brief literature search did not produce
such a paper but it was by no means comprehensive and scholars should
be careful, especially in areas where non-results might seem to be substan-
tively significant (such as in major debates in the discipline).

The next area of concern is with the sterility of our design. All variables
are distributed normally, as are the errors. There is no multicollinearity,
heteroskedasticity, correlation between errors, underspecification, over-
specification, skewness of the dependent variable, or any of the other
problems that more generally characterize real data. We can only guess
that if these inferential problems exist in models featuring such clean data
and designs, the problem could be much more serious in practical ap-
plications.

7 Conclusions
Our goal is to provide an initial glimpse into the nature of inferential prob-
lems when using ML in small samples. Our results are suggestive and can
hopefully guide us to be careful when designing research projects and also
in consuming others’ research. We can say with some reservation that it
appears that scholars might need about 30 to 50 cases per independent
variable to avoid the Type II problems we observe. Of course, the peren-
nial advice to researchers that more is better rings true here. More data is,
of course, better than less data but we hope to find ways to aid researchers
who simply cannot get more data. In the future we plan to extend this
project in several ways. First, we will address the problems listed in the
previous paragraph and see if more realistic and problematic data exacer-
bate the sample size problems identified here. Second, we will investigate
other classes of ML models such as count models, ordered models and
time-series ML models. Third, we will investigate the link between sam-
ple size, collinearity and skewness as all seem to have a similar impact on
data sets in that they reduce the available amount of information (which
we think may be the critical problem in ML models with small samples).
Finally, we will perform a more intensive investigation of bootstrap meth-
ods, which may provide the best solution to the problem.

References
Bartels, Larry M. 1991. “Constituency Opinion and Congressional Policy
Making: The Reagan Defense Buildup.” American Political Science Re-
view 85:457–474.

Davidson, Russell & James G. MacKinnon. 1993. Estimation and Inference
in Econometrics. New York: Oxford University Press.

Eliason, Scott R. 1993. Maximum Likelihood Estimation: Logic and Practice.
Sage University Paper on Quantitative Applications in the Social Sci-
ences. Thousand Oaks, CA: Sage.

Erikson, Robert S., Gerald C. Wright, Jr. & John P. McIver. 1987. “State Po-
litical Culture and Public Opinion.” American Political Science Review.

Erikson, Robert S., Gerald C. Wright, Jr. & John P. McIver. 1989. “Polit-
ical Parties, Public Opinion, and State Policy in the United States.”
American Political Science Review.

Fearon, James D. 1994. “Signaling versus the Balance of Power and Inter-
ests: An Empirical Test of a Crisis Bargaining Model.” Journal of Conflict
Resolution 38(2):236–269.

Gray, Virginia & David Lowery. 1988. “Interest Group Politics and Eco-
nomic Growth in the U.S. States.” American Political Science Review
82(1):109–131.

Greene, William H. 1993. Econometric Analysis. 2nd ed. New York: Prentice
Hall.

Hall, Melinda Gann. 1992. “Electoral Politics and Strategic Voting in State
Supreme Courts.” Journal of Politics 54(2):427–446.

Hill, Kim Quaile & Jan E. Leighley. 1993. “Party Ideology, Organization,
and Competitiveness as Mobilizing Forces in Gubernatorial Elec-
tions.” American Journal of Political Science 37(4):1158–1178.

Hill, Kim Quaile & Jan Leighley. 1992. “The Policy Consequences of Class
Bias in State Electorates.” American Journal of Political Science 36:351–
365.

Jackman, Robert W. 1987. “Political Institutions and Voter Turnout in the
Industrial Democracies.” American Political Science Review 81(2):405–
423.

Kmenta, Jan. 1986. Elements of Econometrics. New York: Macmillan.

Lowery, David & Virginia Gray. 1993. “The Density of State Interest Group
Systems.” Journal of Politics 55(1):191–206.

Lowery, David & Virginia Gray. 1994. “Interest Group System Density and
Diversity.” Social Science Quarterly 75(2):368–377.

Mooney, Christopher Z. 1996. “Bootstrap Statistical Inference: Examples
and Evaluations for Political Science.” American Journal of Political Sci-
ence 40(2):570–602.

Morgan, T. Clifton & Kenneth Bickers. 1992. “Domestic Discontent and
the Use of Force.” Journal of Conflict Resolution 36:25–52.

Morrow, James D. 1991. “Alliances and Asymmetry: An Alternative to
the Capability Aggregation Model of Alliances.” American Journal of
Political Science 35(4):904–933.

Ostrom, Charles W. & Brian Job. 1986. “The President and the Political Use
of Force.” American Political Science Review 80:541–566.

Patterson, Samuel C. & Gregory A. Caldeira. 1984. “The Etiology of Parti-
san Competition.” American Political Science Review 78(1):691–707.

Quinn, Dennis P. & Robert Y. Shapiro. 1991. “Economic Growth Strate-
gies: The Effects of Ideological Partisanship on Interest Rates and
Business Taxation in the United States.” American Journal of Political
Science 35(3):656–685.

Radcliff, Benjamin. 1992. “The Welfare State, Turnout, and the Economy:
A Comparative Analysis.” American Political Science Review 86(2):444–
454.

Ringquist, Evan J. 1993. “Does Regulation Matter?: Evaluating the Effects
of State Air Pollution Control Programs.” Journal of Politics 55:1022–
1045.

Shenton, L.R. & K.O. Bowman. 1977. Maximum Likelihood Estimation in
Small Samples. New York: Macmillan.

Smith, Kevin B. & Kenneth J. Meier. 1995. “Politics and the Quality of Edu-
cation: Improving Student Performance.” Political Research Quarterly
48(2).

Wang, Kevin & James Lee Ray. 1994. “Beginners and Winners: The Fate
of Initiators of Interstate Wars Involving Great Powers Since 1495.”
International Studies Quarterly 38(1):139–154.

Williams, John T. 1990. “The Political Manipulation of Macroeconomic
Policy.” American Political Science Review 84(3):767–795.
Table 1: Type I Error Rate: One Independent Variable†
N      β₁

200 .067
190 .049
180 .053
170 .059
160 .05
150 .056
140 .045
130 .055
120 .048
110 .05
100 .047
90 .053
80 .045
70 .053
60 .031
50 .053
40 .049
30 .064
20 .053
10 .01
† α = .05

Table 2: Type I Error Rate: Two Independent Variables†
N      β₁     β₂

200    .054   .064
190 .053 .051
180 .054 .06
170 .054 .053
160 .061 .049
150 .051 .042
140 .055 .053
130 .062 .066
120 .055 .05
110 .058 .052
100 .059 .055
90 .054 .064
80 .055 .05
70 .05 .06
60 .063 .047
50 .062 .051
40 .061 .047
30 .069 .052
20 .048 .062
10 .006 .009
† α = .05

Table 3: Type II Error Rate: One Independent Variable†
N      β₁ = 2

200 0
190 0
180 0
170 0
160 0
150 0
140 0
130 0
120 0
110 0
100 0
90 0
80 0
70 .002
60 .002
50 .012
40 .042
30 .109
20 .353
10 .923
† α = .05

Table 4: Type II Error Rate: Two Independent Variables†
N      β₁ = 2    β₂ = -4    Average Rate
200 0 0 0
190 0 0 0
180 0 0 0
170 0 0 0
160 0 0 0
150 0 0 0
140 0 0 0
130 0 0 0
120 0 0 0
110 0 0 0
100 0.001 0.001 0.001
90 0.003 0.002 0.0025
80 0.006 0.006 0.006
70 0.015 0.015 0.015
60 0.033 0.032 0.0325
50 0.063 0.067 0.065
40 0.162 0.168 0.165
30 0.349 0.356 0.3525
20 0.717 0.737 0.727
10 0.988 0.991 0.9895
† α = .05

Table 5: Type II Error Rate: Three Independent Variables†
N      β₁ = 2    β₂ = -4    β₃ = 5    Average Rate
200 0.002 0.002 0.002 0.002
190 0.004 0.005 0.004 0.004333
180 0.008 0.007 0.009 0.008
170 0.019 0.019 0.019 0.019
160 0.019 0.019 0.019 0.019
150 0.029 0.029 0.029 0.029
140 0.051 0.049 0.048 0.049333
130 0.058 0.055 0.055 0.056
120 0.101 0.103 0.101 0.101667
110 0.12 0.116 0.117 0.117667
100 0.186 0.182 0.177 0.181667
90 0.272 0.267 0.261 0.266667
80 0.366 0.345 0.349 0.353333
70 0.436 0.415 0.41 0.420333
60 0.615 0.573 0.564 0.584
50 0.756 0.709 0.707 0.724
40 0.86 0.825 0.825 0.836667
30 0.951 0.915 0.903 0.923
20 0.992 0.977 0.975 0.981333
10 0.999 0.999 0.999 0.999
† α = .05

Table 6: Type II Error Rate: Four Independent Variables†
N      β₁ = 2    β₂ = -4    β₃ = 5    β₄ = -1    Average Rate
200 0.007 0.005 0.006 0.021 0.00975
190 0.012 0.011 0.01 0.026 0.01475
180 0.018 0.02 0.019 0.036 0.02325
170 0.029 0.029 0.03 0.064 0.038
160 0.048 0.048 0.049 0.093 0.0595
150 0.064 0.064 0.062 0.116 0.0765
140 0.083 0.081 0.081 0.153 0.0995
130 0.118 0.116 0.112 0.193 0.13475
120 0.163 0.159 0.159 0.266 0.18675
110 0.207 0.209 0.208 0.331 0.23875
100 0.281 0.267 0.269 0.411 0.307
90 0.355 0.337 0.339 0.536 0.39175
80 0.479 0.455 0.456 0.633 0.50575
70 0.605 0.586 0.594 0.776 0.64025
60 0.737 0.702 0.706 0.852 0.74925
50 0.803 0.77 0.767 0.911 0.81275
40 0.92 0.892 0.887 0.979 0.9195
30 0.981 0.949 0.954 0.992 0.969
20 0.997 0.993 0.993 0.999 0.9955
10 0.999 0.999 0.999 0.999 0.999
† α = .05

Table 7: Type II Error Rate: Five Independent Variables†
N      β₁ = 2    β₂ = -4    β₃ = 5    β₄ = -1    β₅ = 1    Average Rate
200 0.015 0.013 0.012 0.03 0.026 0.0192
190 0.026 0.027 0.026 0.046 0.047 0.0344
180 0.052 0.051 0.05 0.064 0.078 0.059
170 0.061 0.057 0.059 0.093 0.09 0.072
160 0.089 0.088 0.089 0.144 0.127 0.1074
150 0.1 0.101 0.102 0.156 0.15 0.1218
140 0.132 0.132 0.134 0.188 0.2 0.1572
130 0.202 0.195 0.194 0.28 0.281 0.2304
120 0.26 0.258 0.26 0.368 0.362 0.3016
110 0.305 0.298 0.297 0.43 0.42 0.35
100 0.38 0.377 0.376 0.499 0.541 0.4346
90 0.489 0.481 0.479 0.626 0.624 0.5398
80 0.627 0.607 0.606 0.762 0.746 0.6696
70 0.693 0.671 0.673 0.818 0.813 0.7336
60 0.821 0.798 0.799 0.913 0.924 0.851
50 0.899 0.865 0.861 0.954 0.951 0.906
40 0.965 0.95 0.946 0.984 0.987 0.9664
30 0.99 0.977 0.979 0.998 0.996 0.988
20 0.999 0.998 0.999 0.999 0.999 0.9988
10 0.999 0.999 0.999 0.999 0.999 0.999
† α = .05

Table 8: Type II Error Rate: Bootstrap Estimates, Five Independent Variables†
N      β₁ = 2    β₂ = -4    β₃ = 5    β₄ = -1    β₅ = 1    Average Rate
50 0.83 0.78 0.77 0.94 0.93 0.85
40 0.83 0.68 0.65 0.93 0.93 0.804
30 0.76 0.64 0.65 0.95 0.92 0.784
20 0.75 0.47 0.36 0.89 0.84 0.662
10 0.78 0.48 0.43 0.89 0.93 0.702
† α = .05

Figure One, One Independent Variable
[Line plot: P(Type I Error, .05 Level) by N, for N = 200 down to 10.]
Figure Two, Two Independent Variables
[Line plot: P(Type I Error, .05 Level) by N, for N = 200 down to 10.]
Figure Three, One Independent Variable
[Line plot: P(Type II Error, .05 Level) by N, for N = 200 down to 10.]
Figure Four, Two Independent Variables
[Line plot: P(Type II Error, .05 Level) by N, for N = 200 down to 10.]
Figure Five, Three Independent Variables
[Line plot: P(Type II Error, .05 Level) by N, for N = 200 down to 10.]
Figure Six, Four Independent Variables
[Line plot: P(Type II Error, .05 Level) by N, for N = 200 down to 10.]
Figure Seven, Five Independent Variables
[Line plot: P(Type II Error, .05 Level) by N, for N = 200 down to 10.]
Figure Eight, Bootstrap Estimates, Five Independent Variables
[Line plot: P(Type II Error, .05 Level) by N, for N = 50 down to 10.]
Figure Nine, One to Five Independent Variables (and Bootstrap)
[Line plot: P(Type II Error, .05 Level) by N, for N = 200 down to 10; series: One IV, Two IV, Three IV, Four IV, Five IV, Boot 5 IV.]
