Finite sample criteria for autoregressive model order selection

Published by RhysU, 09/21/2015
https://www.scribd.com/doc/171085567/Finite-sample-criteria-for-autoregressive-model-order-selection

This document details autoregressive model selection criteria following Broersen.[1] Emphasis is placed on converting the formulas into forms efficiently computable when evaluating a single model. When evaluating a hierarchy of models, computing using intermediate results may be more efficient.

Setting

An AR(K) process and its AR(p) model are given by

$$
x_n + a_1 x_{n-1} + \cdots + a_K x_{n-K} = \epsilon_n
\qquad\qquad
x_n + \hat{a}_1 x_{n-1} + \cdots + \hat{a}_p x_{n-p} = \hat{\epsilon}_n
\tag{1}
$$

in which $\epsilon_n \sim N(0, \sigma^2)$ and $\hat{\epsilon}_n \sim N(0, \hat{\sigma}^2)$. Model selection criteria for evaluating which of several candidates most parsimoniously fits an AR(K) process generally have the form

$$
\operatorname{criterion}(v_{\text{method}}, N, p, \alpha)
= \ln \operatorname{residual}(p, v_{\text{method}})
+ \operatorname{overfit}(\text{criterion}, v_{\text{method}}, N, p, \alpha) .
\tag{2}
$$

Among all candidates and using a given criterion, the "best" model minimizes the criterion. Here, N represents the number of samples used to estimate model parameters, p denotes the order of the estimated model, $v_{\text{method}} = v_{\text{method}}(N, i)$ is the method-specific estimation variance for model order i, and α is an optional factor with a criterion-dependent meaning. When estimating $\hat{a}_1, \ldots, \hat{a}_p$ given sample data $x_n$, the residual variance is

$$
\operatorname{residual}(v_{\text{method}}, p) = \operatorname{residual}(p) = \hat{\sigma}^2 .
$$

Therefore the left term in (2) penalizes misfitting the data independently of the estimation method used, and one may distinguish among criteria using only the overfitting penalty term $\operatorname{overfit}(\text{criterion}, v_{\text{method}}, N, p, \alpha)$.

In Broersen's work, the penalty term depends upon the model estimation method used through the estimation variance v:

$$
\begin{aligned}
v_{\text{Yule–Walker}}(N, i) &= \frac{N - i}{N (N + 2)} & i &\neq 0 \\
v_{\text{Burg}}(N, i) &= \frac{1}{N + 1 - i} & i &\neq 0 \\
v_{\text{LSFB}}(N, i) &= \frac{1}{N + 1.5 - 1.5 i} & i &\neq 0 \\
v_{\text{LSF}}(N, i) &= \frac{1}{N + 2 - 2 i} & i &\neq 0
\end{aligned}
\tag{3}
$$

[1] Broersen, P. M. T. "Finite sample criteria for autoregressive order selection." IEEE Transactions on Signal Processing 48 (December 2000): 3550–3558. http://dx.doi.org/10.1109/78.887047

Here "LSFB" and "LSF" are shorthand for least squares estimation minimizing both the forward and backward prediction or only the forward prediction, respectively. The estimation variance for i = 0 depends only on whether or not the sample mean has been subtracted:

$$
v(N, 0) =
\begin{cases}
1/N & \text{sample mean subtracted} \\
0 & \text{sample mean retained}
\end{cases}
\tag{4}
$$
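The variances in (3) and (4) are simple enough to tabulate directly in code. A minimal Python sketch follows; the function name and string-keyed interface are illustrative choices, not anything prescribed by Broersen:

```python
# Estimation variance v_method(N, i) following Broersen's formulas.
# The i == 0 case encodes the sample-mean convention separately.
def estimation_variance(method, N, i, mean_subtracted=True):
    if i == 0:
        return 1.0 / N if mean_subtracted else 0.0
    if method == "YuleWalker":
        return (N - i) / (N * (N + 2.0))
    if method == "Burg":
        return 1.0 / (N + 1.0 - i)
    if method == "LSFB":
        return 1.0 / (N + 1.5 - 1.5 * i)
    if method == "LSF":
        return 1.0 / (N + 2.0 - 2.0 * i)
    raise ValueError("unknown method: %s" % method)
```
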

Infinite sample overfit penalty terms

The method-independent generalized information criterion (GIC) has overfitting penalty

$$
\operatorname{overfit}(\text{GIC}, N, p, \alpha) = \alpha \frac{p}{N}
$$

independent of $v_{\text{method}}$. The Akaike information criterion (AIC) has

$$
\operatorname{overfit}(\text{AIC}, N, p) = \operatorname{overfit}(\text{GIC}, N, p, 2)
\tag{5}
$$

while the consistent criterion BIC and the minimally consistent criterion (MCC) have

$$
\operatorname{overfit}(\text{BIC}, N, p) = \operatorname{overfit}(\text{GIC}, N, p, \ln N)
\tag{6}
$$

$$
\operatorname{overfit}(\text{MCC}, N, p) = \operatorname{overfit}(\text{GIC}, N, p, 2 \ln \ln N) .
\tag{7}
$$

Additionally, Broersen uses α = 3 with GIC, referring to the result as GIC(p,3). The asymptotically-corrected Akaike information criterion (AIC$_\text{C}$) of Hurvich and Tsai[2] is

$$
\operatorname{overfit}(\text{AIC}_\text{C}, N, p) = \frac{2p}{N - p - 1} .
$$
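The infinite sample penalties above are one-liners in any language. A Python sketch for concreteness; the function names are illustrative:

```python
from math import log

# Infinite sample overfit penalties: GIC with free factor alpha, plus
# the classical special cases AIC, BIC, MCC, and the corrected AICc.
def overfit_gic(N, p, alpha):
    return alpha * p / N

def overfit_aic(N, p):
    return overfit_gic(N, p, 2.0)

def overfit_bic(N, p):
    return overfit_gic(N, p, log(N))

def overfit_mcc(N, p):
    return overfit_gic(N, p, 2.0 * log(log(N)))

def overfit_aicc(N, p):
    return 2.0 * p / (N - p - 1.0)
```
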

Finite sample overfit penalty terms

Finite information criterion

The finite information criterion (FIC)[3] is an extension of GIC meant to account for finite sample size and the estimation method employed. The FIC overfit penalty term is

$$
\operatorname{overfit}(\text{FIC}, v_{\text{method}}, N, p, \alpha)
= \alpha \sum_{i=0}^{p} v_{\text{method}}(N, i)
= \alpha \left( v(N, 0) + \sum_{i=1}^{p} v_{\text{method}}(N, i) \right)
$$

[2] Hurvich, Clifford M. and Chih-Ling Tsai. "Regression and time series model selection in small samples." Biometrika 76 (June 1989): 297–307. http://dx.doi.org/10.1093/biomet/76.2.297

[3] FIC is mistakenly called the "finite sample information criterion" on page 3551 of Broersen 2000 but referred to correctly as the "finite information criterion" on page 187 of Broersen's 2006 book.

where v(N, 0) is evaluated using (4) and $v_{\text{method}}(N, i)$ from (3). The factor α may be chosen as in (5), (6), or (7). Again, Broersen uses α = 3, calling the result FIC(p,3).
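When evaluating a single model, the FIC penalty can be computed by the direct summation above. A self-contained Python sketch, here specialized to the Burg variance from (3) with the sample mean subtracted (helper names are illustrative):

```python
# FIC overfit penalty by direct summation of the estimation variances.
def v_burg(N, i, mean_subtracted=True):
    if i == 0:
        return (1.0 / N) if mean_subtracted else 0.0
    return 1.0 / (N + 1.0 - i)

def overfit_fic(v, N, p, alpha):
    # alpha * sum_{i=0}^{p} v(N, i)
    return alpha * sum(v(N, i) for i in range(p + 1))
```

For example, `overfit_fic(v_burg, 100, 4, 3.0)` evaluates FIC(4,3) for a Burg-estimated model from 100 samples.
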

By direct computation one finds the following:

$$
\begin{aligned}
\operatorname{overfit}(\text{FIC}, v_{\text{Yule–Walker}}, N, p, \alpha)
&= \alpha \left( v(N, 0) - \frac{p \, (1 - 2N + p)}{2 N (N + 2)} \right) \\
\operatorname{overfit}(\text{FIC}, v_{\text{Burg}}, N, p, \alpha)
&= \alpha \left( v(N, 0) + \psi(N + 1) - \psi(N + 1 - p) \right) \\
\operatorname{overfit}(\text{FIC}, v_{\text{LSFB}}, N, p, \alpha)
&= \alpha \left( v(N, 0) + \frac{2}{3} \left( \psi\!\left(\frac{3 + 2N}{3}\right) - \psi\!\left(\frac{3 + 2N}{3} - p\right) \right) \right) \\
\operatorname{overfit}(\text{FIC}, v_{\text{LSF}}, N, p, \alpha)
&= \alpha \left( v(N, 0) + \frac{1}{2} \left( \psi\!\left(\frac{2 + N}{2}\right) - \psi\!\left(\frac{2 + N}{2} - p\right) \right) \right)
\end{aligned}
$$

The simplifications underneath the Burg, LSFB, and LSF results use that

$$
\sum_{i=1}^{p} \frac{1}{N + a - a i}
= \sum_{i=0}^{p-1} \frac{1}{N - a i}
= \frac{1}{a} \sum_{i=0}^{p-1} \frac{1}{\frac{N}{a} - i}
= \frac{1}{a} \left( \psi\!\left(\frac{N}{a} + 1\right) - \psi\!\left(\frac{N}{a} - p + 1\right) \right)
$$

holds for all nonzero $a \in \mathbb{R}$ because the digamma function ψ telescopes according to

$$
\psi(x + 1) = \frac{1}{x} + \psi(x)
\implies
\psi(x + k) - \psi(x) = \sum_{i=0}^{k-1} \frac{1}{x + i} .
$$

For strictly positive abscissae, ψ may be numerically evaluated following Bernardo.[4]
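As an illustration of that kind of evaluation (a sketch only, not Bernardo's published AS 103 routine), ψ can be computed for x > 0 by applying the recurrence above to lift the argument until a truncated asymptotic expansion applies:

```python
from math import log

def digamma(x):
    """Approximate psi(x) for x > 0 via recurrence plus an asymptotic series."""
    assert x > 0.0
    result = 0.0
    # psi(x) = psi(x + 1) - 1/x lifts x into the asymptotic regime.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    # Truncated expansion psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    inv2 = 1.0 / (x * x)
    result += log(x) - 0.5 / x \
        - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result
```

The telescoping identity then provides a cheap consistency check: ψ(N+1) − ψ(N+1−p) must equal the partial harmonic sum it replaces.
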

Finite sample information criterion

The finite sample information criterion (FSIC) is a finite sample approximation to the Kullback–Leibler discrepancy.[5] FSIC has the overfit penalty term

$$
\operatorname{overfit}(\text{FSIC}, v_{\text{method}}, N, p)
= \prod_{i=0}^{p} \frac{1 + v_{\text{method}}(N, i)}{1 - v_{\text{method}}(N, i)} - 1
= \frac{1 + v(N, 0)}{1 - v(N, 0)}
\cdot \prod_{i=1}^{p} \frac{1 + v_{\text{method}}(N, i)}{1 - v_{\text{method}}(N, i)} - 1 .
\tag{9}
$$

[4] Bernardo, J. M. "Algorithm AS 103: Psi (digamma) function." Journal of the Royal Statistical Society. Series C (Applied Statistics) 25 (1976). http://www.jstor.org/stable/2347257

[5] Presumably FSIC could be related, through the Kullback symmetric divergence, to the KICc and AKICc criteria proposed by Seghouane, A. K. and M. Bekara. "A Small Sample Model Selection Criterion Based on Kullback's Symmetric Divergence." IEEE Transactions on Signal Processing 52 (December 2004): 3314–3323. http://dx.doi.org/10.1109/TSP.2004.837416
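When evaluating a single model, (9) can be computed by the direct product over the variances. A self-contained Python sketch using the Burg variance from (3) as a concrete example (helper names are illustrative):

```python
# FSIC overfit penalty via the direct product in (9).
def v_burg(N, i, mean_subtracted=True):
    if i == 0:
        return (1.0 / N) if mean_subtracted else 0.0
    return 1.0 / (N + 1.0 - i)

def overfit_fsic(v, N, p):
    product = 1.0
    for i in range(p + 1):
        vi = v(N, i)
        product *= (1.0 + vi) / (1.0 - vi)
    return product - 1.0
```
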

The product in the context of the Yule–Walker estimation may be reexpressed as

$$
\prod_{i=1}^{p} \frac{1 + v_{\text{Yule–Walker}}(N, i)}{1 - v_{\text{Yule–Walker}}(N, i)}
= \prod_{i=1}^{p} \frac{N^2 + 3N - i}{N^2 + N + i}
= (-1)^p \frac{(1 - 3N - N^2)_p}{(N^2 + N + 1)_p}
= \frac{(N^2 + 3N - p)_p}{(N^2 + N + 1)_p}
\tag{10}
$$

where the "rising factorial" is denoted by the Pochhammer symbol

$$
(x)_k = \frac{\Gamma(x + k)}{\Gamma(x)} .
$$

When x is a negative integer and Γ is therefore undefined, the limiting value of the ratio is implied. The product in the context of the Burg, LSFB, or LSF estimation methods becomes

$$
\prod_{i=1}^{p} \frac{1 + v_{\text{Burg|LSFB|LSF}}(N, i)}{1 - v_{\text{Burg|LSFB|LSF}}(N, i)}
= \prod_{i=1}^{p} \frac{N + a (1 - i) + 1}{N + a (1 - i) - 1}
= \frac{\left(-\frac{1 + N}{a}\right)_p}{\left(\frac{1 - N}{a}\right)_p}
\tag{11}
$$

where $a \in \mathbb{R}$ is a placeholder for a method-specific constant. Routines for computing the Pochhammer symbol may be found in, for example, SLATEC[6] or the GNU Scientific Library.[7] In particular, both suggested sources handle negative integer input correctly.
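Lacking those libraries, the finite products needed here can be handled by a direct-product Pochhammer routine, which takes the limiting value at negative integer arguments with no special casing (a sketch; the name mirrors the convention but is otherwise illustrative):

```python
# Rising factorial (x)_k = x (x+1) ... (x+k-1) as a direct product,
# avoiding the Gamma-ratio form that breaks at negative integers.
def pochhammer(x, k):
    result = 1.0
    for j in range(k):
        result *= x + j
    return result
```

For modest p this direct product is also as cheap as any Γ-based evaluation; a log-gamma formulation would additionally need sign bookkeeping for negative arguments.
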

By direct substitution of (10) or (11) into (9) one obtains:

$$
\begin{aligned}
\operatorname{overfit}(\text{FSIC}, v_{\text{Yule–Walker}}, N, p)
&= \frac{1 + v(N, 0)}{1 - v(N, 0)} \cdot \frac{(N^2 + 3N - p)_p}{(N^2 + N + 1)_p} - 1 \\
\operatorname{overfit}(\text{FSIC}, v_{\text{Burg}}, N, p)
&= \frac{1 + v(N, 0)}{1 - v(N, 0)} \cdot \frac{(-1 - N)_p}{(1 - N)_p} - 1 \\
\operatorname{overfit}(\text{FSIC}, v_{\text{LSFB}}, N, p)
&= \frac{1 + v(N, 0)}{1 - v(N, 0)} \cdot \frac{\left(\frac{-2 - 2N}{3}\right)_p}{\left(\frac{2 - 2N}{3}\right)_p} - 1 \\
\operatorname{overfit}(\text{FSIC}, v_{\text{LSF}}, N, p)
&= \frac{1 + v(N, 0)}{1 - v(N, 0)} \cdot \frac{\left(\frac{-1 - N}{2}\right)_p}{\left(\frac{1 - N}{2}\right)_p} - 1
\end{aligned}
$$

Combined information criterion

The combined information criterion (CIC) takes the behavior of FIC(p,3) at low orders and of FSIC at high orders. For any estimation method, CIC has the overfit penalty term

$$
\operatorname{overfit}(\text{CIC}, v_{\text{method}}, N, p)
= \max \left\{
\operatorname{overfit}(\text{FSIC}, v_{\text{method}}, N, p) ,\;
\operatorname{overfit}(\text{FIC}, v_{\text{method}}, N, p, 3)
\right\} .
$$
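Computationally, CIC is then just a maximum over the two penalties already described. A self-contained Python sketch specialized to the Burg variance with the sample mean subtracted (helper names are illustrative):

```python
# CIC overfit penalty: the larger of FSIC and FIC(p,3), each computed
# by direct evaluation over the estimation variances.
def v_burg(N, i):
    return 1.0 / N if i == 0 else 1.0 / (N + 1.0 - i)

def overfit_fic(v, N, p, alpha):
    return alpha * sum(v(N, i) for i in range(p + 1))

def overfit_fsic(v, N, p):
    product = 1.0
    for i in range(p + 1):
        vi = v(N, i)
        product *= (1.0 + vi) / (1.0 - vi)
    return product - 1.0

def overfit_cic(v, N, p):
    return max(overfit_fsic(v, N, p), overfit_fic(v, N, p, 3.0))
```
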

[6] Vandevender, W. H. and K. H. Haskell. "The SLATEC mathematical subroutine library." ACM SIGNUM Newsletter 17 (September 1982): 16–21. http://dx.doi.org/10.1145/1057594.1057595

[7] Galassi, M. et al. GNU Scientific Library Reference Manual (3rd Ed.). ISBN 0954612078. http://www.gnu.org/software/gsl/
