# Inference for Point Pattern Spatial Statistics

N. Bert Loosmore

nhl@u.washington.edu

QERM 550 University of Washington May 11 & 13, 2005

Inference for Point Pattern Spatial Statistics – p.1/49

**Outline**

- Use of Point Pattern Statistics in Ecology
- The Failure of the Simulation Envelope
- Diggle’s (1983, 2003) ‘Goodness of Fit’ Test
- Unresolved Implementation Issues
- Parameterization Based on the Ecological Research Question
- Characterizing Type I, II Error Rate Performance

**Point Pattern Statistics in Ecology**

Spatial processes, ecological processes

[Figure: mapped point pattern; axes Easting (m) and Northing (m), each 0 to 200.]

- What pattern for the green points?
- What pattern for the red points?
- Do we see (or expect) stationarity?

**Point Pattern Spatial Stats: How?**

- Evaluate the observed pattern against ideas of aggregation, CSR (complete spatial randomness), and inhibition.
  - Aggregation example: rMatClust() with 105 points, radius = 0.1
  - CSR example: CSR pattern with 100 points
  - Inhibition example: rSSI() with 100 points, radius = 0.05
- Analyze distances between events: G (nearest neighbor), F (grid to nearest point), K/L (all neighbors).
- The analysis is typically performed using a ‘Simulation Envelope’.
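The slide's example patterns come from named simulators (rMatClust(), rSSI()). As a minimal sketch of what CSR and simple sequential inhibition mean, the following plain-Python stand-ins can be used. The function names `simulate_csr` and `simulate_ssi`, the unit-square window, and the `max_tries` guard are assumptions made for illustration; they are not the implementations behind the slides.

```python
import math
import random


def simulate_csr(n, rng):
    """Place n points independently and uniformly on the unit square (CSR given n)."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def simulate_ssi(n, r, rng, max_tries=100_000):
    """Simple sequential inhibition: propose uniform points one at a time and
    keep a proposal only if it lies at least r from every point already kept."""
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        cand = (rng.random(), rng.random())
        if all(math.dist(cand, p) >= r for p in points):
            points.append(cand)
    return points  # may hold fewer than n points if r is too large for n


rng = random.Random(1)
csr = simulate_csr(100, rng)
ssi = simulate_ssi(100, 0.05, rng)
# Smallest inter-point distance in the inhibited pattern (never below r).
min_gap = min(math.dist(p, q) for i, p in enumerate(ssi) for q in ssi[:i])
print(len(csr), len(ssi), round(min_gap, 3))
```

The inhibited pattern differs from CSR only through the minimum-distance rule, which is exactly the kind of departure the G statistic is sensitive to at small distances.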

**Definition of the G and F Statistics**

The G statistic uses the nearest neighbor distances ($d_i$) for each of $n$ sample points as:

$$\hat{G}(t) = \frac{1}{n} \sum_{i=1}^{n} I(d_i \le t)$$

The F statistic uses the distances ($e_j$) from each of $m$ sample points (typically located on a grid) to their nearest event as:

$$\hat{F}(t) = \frac{1}{m} \sum_{j=1}^{m} I(e_j \le t)$$

Under CSR with intensity $\lambda$, both the G and F statistics are approximated as

$$G(t) = F(t) = 1 - \exp(-\lambda \pi t^2)$$
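A minimal sketch of the G statistic as an empirical distribution function of nearest neighbor distances, compared with the CSR expression. The helper names, the unit-square window, and the choice of 100 points are assumptions for illustration, and no edge correction is applied.

```python
import math
import random


def nearest_neighbor_distances(points):
    """Distance from each event to its nearest other event (no edge correction)."""
    return [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


def g_hat(nn_dists, t):
    """Empirical G: fraction of nearest neighbor distances that are <= t."""
    return sum(d <= t for d in nn_dists) / len(nn_dists)


def g_csr(t, lam):
    """Theoretical G (and F) under CSR with intensity lam."""
    return 1.0 - math.exp(-lam * math.pi * t * t)


rng = random.Random(2)
pts = [(rng.random(), rng.random()) for _ in range(100)]  # CSR on the unit square
d = nearest_neighbor_distances(pts)
for t in (0.02, 0.05, 0.10):
    print(t, round(g_hat(d, t), 3), round(g_csr(t, lam=100), 3))
```

The F statistic follows the same pattern, with the distances taken from grid points to their nearest event instead of from events to their nearest neighbor.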

**Definition of the K and L Statistics**

The K statistic uses the distances between all neighbors ($d_{ij}$) as:

$$\hat{K}(t) = \frac{|A|}{n^2} \sum_{i=1}^{n} \sum_{j \ne i} I(d_{ij} \le t)$$

where $|A|$ is the area of the study region and $n$ the number of events (edge correction weights omitted).

Under CSR, the K statistic can be approximated by

$$K(t) = \pi t^2$$

The L statistic is used to set the mean (under CSR, $L(t) = t$) and (supposedly) stabilize the variance:

$$\hat{L}(t) = \sqrt{\hat{K}(t) / \pi}$$
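The K and L definitions can be sketched as follows. This is a naive version under stated assumptions: a unit-square window, no edge correction (so the estimate is biased low near the boundary), and hypothetical helper names `k_hat` and `l_hat`.

```python
import math
import random


def k_hat(points, t, area=1.0):
    """Naive K estimate: area / n^2 times the number of ordered pairs (i, j),
    i != j, separated by at most t. No edge correction is applied."""
    n = len(points)
    pairs = sum(1 for i in range(n) for j in range(i + 1, n)
                if math.dist(points[i], points[j]) <= t)
    return area * 2 * pairs / (n * n)  # each unordered pair counts twice


def l_hat(points, t, area=1.0):
    """L transform of K; under CSR, L(t) is approximately t."""
    return math.sqrt(k_hat(points, t, area) / math.pi)


rng = random.Random(3)
pts = [(rng.random(), rng.random()) for _ in range(200)]
for t in (0.05, 0.10):
    print(t, round(k_hat(pts, t), 4), round(math.pi * t * t, 4), round(l_hat(pts, t), 4))
```

Printing $\hat{K}(t)$ next to $\pi t^2$ shows why the L transform is convenient: on the L scale the CSR reference is simply the line $L(t) = t$.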

**Building the Simulation Envelope**

1. Compute $\hat{G}(t)$ for a single simulated CSR pattern.
2. Repeat for 99 CSR patterns with the same parameters.

[Figure: $\hat{G}(t)$ (0 to 1) against distance (0 to 0.20) for the simulated CSR patterns.]

**Using the Simulation Envelope**

Plot after subtracting $\bar{G}(t)$, the mean of the simulated curves. Observed pattern: rSSI(r=0.03, n=100).

[Figure: $\hat{G}(t) - \bar{G}(t)$ (−0.3 to 0.3) against distance (0 to 0.20), showing the observed pattern against the simulation envelope.]

**Perceived $\alpha$ Level Performance**

- Using all results from 19 simulations is taken to yield $\alpha = 0.05$, or a ‘95%’ envelope.
- Throwing out the upper and lower 2 simulations at each distance ($t$) from 99 simulations is also taken to yield $\alpha \approx 0.05$.

**Kenkel (1988) Methods**

- Evaluated spatial locations of all live trees, and of all (live + standing dead) trees, in a jack pine (*Pinus banksiana*) forest.
- The map of live + standing dead trees represents the distribution following early sapling mortality, but prior to the onset of density-dependent mortality.
- Methods: used MC techniques for the G and L statistics to evaluate observed results against $H_0$ of i) random locations (CSR) and ii) random mortality.

**Kenkel (1988) Conclusions**

- G: live + dead shows no departure from randomness, whereas live trees only shows significant regularity.
- L: live + dead shows no departure from CSR at small scales; live trees show regularity at smaller scales.
- But is this interpretation correct?

**Examples in Ecological Research**

| Author (Year) | Statistics Used | Patterns in Sim Env (s) | “CI” (%) | Marginal Results (y/n) |
|---|---|---|---|---|
| Batista and Maguire (1998) | G, K | 19 | 95% | n |
| Dolezal et al. (2004) | K | 99 | 95% | y |
| Freeman and Ford (2002) | G, K | 99 | 99% | n |
| Grassi et al. (2004) | K | 99 | 95% | n |
| Hirayama and Sakimoto (2003) | K | 19, 99 | 95%, 99% | n |
| Martens et al. (1997) | L | 99 | 95% | n |
| Moeur (1997) | G, K | 200 | 90% | n |
| Parish et al. (1999) | G, K | 19 | 95% | n |
| Salvador-Van Eysenrode et al. (2000) | G, K | 1000 | 95% | y |
| Srutek et al. (2002) | L | 99 | 95% | y |
| Tirado and Pugnaire (2003) | K | 1000 | 99% | n |


**Sim Env $\alpha$ Level Performance**

- Simulation study with independent ‘trials’ of a CSR pattern against a CSR envelope.
- Designate ‘failure’ if the pattern exceeds the envelope at any distance (Type I error).
- Expected Type I error rate: 0.05 ...
- ... actual Type I error rate: 0.5-0.7
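The kind of simulation study described above can be sketched in a few lines. Everything specific here is an assumption chosen to keep the run short: 30 points per pattern, a min/max envelope from 19 simulations, a grid of 40 distances, 100 trials, and the helper names. It is not the study behind the slide's figures, so the rate it prints should not be expected to match them.

```python
import math
import random


def g_curve(n, grid, rng):
    """Empirical G on a fixed distance grid for one CSR pattern of n points."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    nn = [min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
          for i, p in enumerate(pts)]
    return [sum(d <= t for d in nn) / n for t in grid]


def envelope_rejects(n, s, grid, rng):
    """True if a CSR 'observed' curve leaves the min/max envelope of s CSR
    curves at ANY distance on the grid."""
    sims = [g_curve(n, grid, rng) for _ in range(s)]
    obs = g_curve(n, grid, rng)
    lo = [min(c[k] for c in sims) for k in range(len(grid))]
    hi = [max(c[k] for c in sims) for k in range(len(grid))]
    return any(o < l or o > h for o, l, h in zip(obs, lo, hi))


rng = random.Random(4)
grid = [0.005 * k for k in range(1, 41)]  # distances 0.005 ... 0.20
trials = 100
rate = sum(envelope_rejects(30, 19, grid, rng) for _ in range(trials)) / trials
print("empirical Type I error rate:", rate)
```

Because the null hypothesis is true in every trial, any rejection is a Type I error; checking the envelope at many distances at once is what pushes the rate above the level perceived at a single distance.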

**Monte Carlo Simulation Theory**

For a univariate continuous distribution, let $u_1$ be the observed value and $u_2, \ldots, u_{s+1}$ the values from $s$ simulations under $H_0$. Under $H_0$ every rank of $u_1$ is equally likely:

$$P\left(u_1 = u_{(j)}\right) = \frac{1}{s+1}, \quad j = 1, \ldots, s+1$$

But does the simulation envelope comprise a univariate distribution?

**How the Envelope is Really Made**

Simulation envelope built from 100 patterns: 55 patterns comprise the simulation envelope.

[Figure: $\hat{G}(t) - \bar{G}(t)$ (−0.3 to 0.3) against distance (0 to 0.25), highlighting the 55 patterns comprising the simulation envelope.]

**Failure of the Simulation Envelope**

Although built from $s$ patterns, the complexity of both

1. the G, F, and/or K statistics, and
2. the spatial patterns

yields a multivariate result. Since evaluation of the observed pattern occurs at many distances, we are performing simultaneous inference and thus $\alpha$ is increased. Further, if the simulation envelope is invalid, then how can we use it to determine scale?


**Proper Statistical Methods**

From Diggle (1983, 2003), for a given $H_0$:

1. At a single a priori distance: use the upper and lower simulated values.
2. Across a range of distances: use the Goodness of Fit test.

**The Goodness of Fit Test - 1**

1. Represent the empirical results as $\hat{H}_1(t)$ for the observed pattern, and $\hat{H}_i(t)$, $i = 2, \ldots, s+1$, for the $s$ simulated patterns, where $\hat{H}$ stands for the chosen summary statistic (G, F, K or L).

**The Goodness of Fit Test - 2**

2. Calculate, with $\hat{H}_i(t)$ the summary statistic of pattern $i$ ($i = 1$ for the observed pattern):

$$u_i = \int_{t_{min}}^{t_{max}} \left( \hat{H}_i(t) - H(t) \right)^2 dt \quad \text{for } i = 1, \ldots, s+1 \tag{1}$$

a summary statistic indicative of the total deviation of the given pattern from the theoretical result $H(t)$, but use

$$\bar{H}_i(t) = \frac{1}{s} \sum_{j \ne i} \hat{H}_j(t)$$

in place of $H(t)$ to reduce bias.

**The Goodness of Fit Test - 3**

3. Reject (or fail to reject) $H_0$ based on the rank of $u_1$ (the observed pattern) among all $u_i$, with the p-value calculated as

$$\hat{p} = 1 - \frac{\operatorname{rank}[u_1] - 1}{s + 1}$$

using the $u_i$ for $i = 1, \ldots, s+1$. So, if $u_1 = u_{(s+1)}$ (the largest), then $\hat{p} = 1 - \frac{s}{s+1} = \frac{1}{s+1}$.

Now we have quantitative results to evaluate a pattern’s significance based on an “exact” $\alpha$ level test because of proper MC methods.
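The three steps of the Goodness of Fit test can be sketched end to end as below. Several choices are assumptions made for illustration rather than the author's implementation: the G statistic without edge correction, an equally spaced distance grid with the integral replaced by a sum, 99 CSR simulations, and a deliberately extreme hypothetical ‘observed’ pattern (a regular 10 x 10 lattice).

```python
import math
import random


def g_curve(pts, grid):
    """Empirical G of one pattern evaluated on a fixed distance grid."""
    nn = [min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
          for i, p in enumerate(pts)]
    return [sum(d <= t for d in nn) / len(pts) for t in grid]


def csr(n, rng):
    """n uniform points on the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def gof_p_value(observed, simulated, dt):
    """Rank-based p-value from the curves of 1 observed and s simulated patterns."""
    curves = [observed] + simulated          # index 0 is the observed pattern
    m = len(curves)                          # m = s + 1
    totals = [sum(c[k] for c in curves) for k in range(len(observed))]
    u = []
    for c in curves:
        # Leave-one-out mean of the other s curves, then summed squared deviation.
        u.append(sum((c[k] - (totals[k] - c[k]) / (m - 1)) ** 2
                     for k in range(len(c))) * dt)
    rank = 1 + sum(ui < u[0] for ui in u[1:])   # ascending rank of the observed u
    return 1 - (rank - 1) / m


rng = random.Random(5)
dt = 0.005
grid = [dt * k for k in range(1, 41)]
# Hypothetical 'observed' pattern: a 10 x 10 lattice, i.e. extreme regularity.
lattice = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
sims = [g_curve(csr(100, rng), grid) for _ in range(99)]
p_lattice = gof_p_value(g_curve(lattice, grid), sims, dt)
p_csr = gof_p_value(g_curve(csr(100, rng), grid), sims, dt)
print(p_lattice, p_csr)
```

The test reduces each whole curve to a single number $u_i$ before ranking, which is what restores the univariate setting that the Monte Carlo argument requires.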


**Unresolved Implementation Issues**

What is the optimal method to calculate $u_i$?

$$u_i = \int_{t_{min}}^{t_{max}} \left( \hat{H}_i(t) - \bar{H}_i(t) \right)^2 dt$$

How to:

- replace integration with summation
- incorporate edge correction methods
- choose limits ($t_{min}$, $t_{max}$) and distance list ($t_k$)
- simulate patterns from the null process $H_0$

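**Replacing Integration with Summation**

We can rewrite Eqn (1) as

$$u_i = \int_{t_{min}}^{t_{max}} \left( \hat{H}_i(t) - \bar{H}_i(t) \right)^2 dt \approx \sum_{k} \left( \hat{H}_i(t_k) - \bar{H}_i(t_k) \right)^2 \delta t_k \tag{2}$$

where the $t_k$ are the distances in the distance list and $\delta t_k = t_{k+1} - t_k$.

But how accurate is this approximation?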

**Edge Correction**

Used to eliminate bias from the edge interfering with detecting a point’s neighbor.

Reduced Sample edge correction approach:

- Let $b_i$ be the distance for point $i$ to the closest boundary.
- Remove point $i$ from the calculation at distance $t$ where $b_i < t$.

Other approaches (toroidal, isotropic, etc.)
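The reduced sample rule can be sketched for the G statistic as follows. The rectangular window, the function name `reduced_sample_g`, and the choice to return NaN when no point survives are assumptions for illustration.

```python
import math
import random


def reduced_sample_g(points, t, width=1.0, height=1.0):
    """Reduced-sample estimate of G(t) on a [0, width] x [0, height] window.
    Points closer than t to the boundary are dropped from the calculation."""
    kept = 0
    hits = 0
    for i, (x, y) in enumerate(points):
        b = min(x, y, width - x, height - y)     # distance to closest boundary
        if b < t:
            continue                             # point removed at this distance
        kept += 1
        d = min(math.dist((x, y), q) for j, q in enumerate(points) if j != i)
        hits += d <= t
    return hits / kept if kept else float("nan")


rng = random.Random(6)
pts = [(rng.random(), rng.random()) for _ in range(200)]
for t in (0.02, 0.05, 0.08):
    print(t, round(reduced_sample_g(pts, t), 3))
```

Dropped points still serve as potential neighbors of the retained points; only their own nearest neighbor distances are excluded. The cost of the correction is a shrinking sample as $t$ grows.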

**Choice of Limits ($t_{min}$, $t_{max}$), Distance List ($t_k$)**

- A recommended default exists for $t_{max}$, but the choice is application dependent!
- $\hat{G}(t)$ and $\hat{K}(t)$ are discrete: they change where a new neighbor is detected, or a point is removed from the sample.
- Use the empirical distance list for exact results from a single pattern.
- Because of the $u_i$ calculation, especially $\bar{H}_i(t)$, an exact solution needs the complete empirical distance list (i.e. from all patterns) for the evaluation of each pattern.

**Resolution of Simulated Patterns**

- Complexity? The number of distances grows with $n$ and $s$.
- The resolution (i.e. coordinate precision) of the simulated patterns should be equivalent to that of the observed pattern.
- Limiting resolution helps constrain complexity.
- A limited resolution is still highly accurate for ecological data (Freeman and Ford, 2002).
- Combining resolution and the default $t_{max}$ leads to at most 25,000 distances in the G or K test statistic, regardless of $n$ and $s$, and provides an exact solution.


**Parameterization - 1**

“How to run any given test based on the ecological research question”

- Number of simulations ($s$)
- Choice of $H_0$, including choice of $t_{max}$

**$\hat{p}$ versus $p$**

- Uncertainty in the realized p-value ($\hat{p}$) results from the use of $s$ MC simulations.
- Ramifications of $s$? It affects the precision of $\hat{p}$ through the actual simulated patterns against which the observed pattern is tested, and the number of those patterns.
- Note about exact $\alpha$ level performance (across many tests vs. variation of the p-value for a single test).

**Distribution of $\hat{p}$**

Let $Y_i = I(u_i \ge u_1)$ for $i = 2, \ldots, s+1$, where $u_1$ belongs to the observed pattern. The p-value for the test is then:

$$P = \frac{1 + \sum_{i=2}^{s+1} Y_i}{s + 1}$$

The expected value of $P$ is:

$$E[P] = \frac{1 + \sum_{i=2}^{s+1} E[Y_i]}{s + 1}$$

Assuming each of the $Y_i$ comes from a Bernoulli($p$) distribution, then $E[Y_i] = p$. So,

$$E[P] = \frac{1 + s p}{s + 1} \approx p$$

**Variance of P ($\sigma_p^2$)**

Looking at the variance of $P$, and treating the indicators $Y_i$ (whether simulated $u_i$ is at least as large as the observed $u_1$) as independent Bernoulli($p$) draws, we have

$$\sigma_p^2 = \operatorname{Var}(P) = \frac{1}{(s+1)^2} \sum_{i=2}^{s+1} \operatorname{Var}(Y_i) = \frac{s\,p(1-p)}{(s+1)^2} \approx \frac{p(1-p)}{s}$$

Hence we can model the theoretical distribution of $\hat{p}$ as arising from a binomial($p$, $s$) distribution.

**Managing Uncertainty in $\hat{p}$**

- Remember that the binomial quickly converges to the Normal as $s$ increases.
- Create a 95% CI on $p$ (the true p-value) near $\alpha$:

$$\hat{p} \pm 1.96\,\sigma_p, \qquad \sigma_p = \sqrt{\frac{\hat{p}(1-\hat{p})}{s}}$$

- 95% of the CIs created this way should contain the true $p$, and so set a decision rule: e.g. reject $H_0$ if the CI contains, or lies fully below, 0.05.
- Choose an acceptable range of uncertainty for $\hat{p}$, which fixes an acceptable $\sigma_p$.
- Use the relationship between $\sigma_p$ and $s$ to find the value of $s$.

**$\sigma_p$ as a function of $s$**

[Figure: $\sigma_p$ (0.01 to 0.07) against the number of simulations $s$ (0 to 2000).]
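The relationship between $\sigma_p$ and $s$ can be worked through numerically. This sketch assumes the Normal approximation $\sigma_p = \sqrt{p(1-p)/s}$ evaluated at $p = 0.05$; the target half-width of 0.01 and the function names are hypothetical values chosen for illustration, not numbers taken from the slides.

```python
import math


def sigma_p(p, s):
    """Normal-approximation standard deviation of the MC p-value: sqrt(p(1-p)/s)."""
    return math.sqrt(p * (1.0 - p) / s)


def simulations_needed(p, half_width, z=1.96):
    """Smallest s for which z * sigma_p(p, s) <= half_width."""
    return math.ceil(p * (1.0 - p) * (z / half_width) ** 2)


for s in (19, 99, 999):
    print(s, round(sigma_p(0.05, s), 4))

# Hypothetical target: a 95% CI of roughly +/- 0.01 around p = 0.05.
print(simulations_needed(0.05, 0.01))
```

Under these assumptions, 19 simulations give $\sigma_p = 0.05$, as large as the p-value being estimated, while a half-width of 0.01 needs well over a thousand simulations.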

**Choice of $H_0$**

- Use all available ecological knowledge for a more informative test.
- The null point process just needs to be able to be simulated; many models are available (e.g. spatstat), or write your own!
- At the very least, choose a simple inhibition model based on physical separation.
- EDA vs. confirmatory analysis results in the iterative nature of research, with (hopefully) tests on independent data sets.
- Use the model to determine information on scale!

**Example of model fitting**

Attempt to fit a clustered model, representing establishment processes, to the lower SW quadrant of the WRCCRF data, for all trees within a given height class.

Used a Poisson cluster model, where $\rho$ represents the number of parents and $\mu$ represents the expected number of children per parent, and where the clustering of ‘children’ around each parent is described by a radially symmetric normal distribution with standard deviation $\sigma$:

$$h(x, y) = \frac{1}{2 \pi \sigma^2} \exp\left( -\frac{x^2 + y^2}{2 \sigma^2} \right)$$

How to choose values for $\rho$ and $\sigma$? ($\mu = n / \rho$)

Note that my null ‘model’ here describes not only the process, but also the parameter values.
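A minimal sketch of simulating such a Poisson cluster process, so that it can serve as a null model. The unit-square window, the rule of discarding children that fall outside it, the parameter values in the usage lines, and the function name are all assumptions for illustration; the author's own simulator is not shown in the slides.

```python
import random


def simulate_poisson_cluster(rho, mu, sigma, rng):
    """Sketch of a Poisson cluster (Thomas-type) pattern on the unit square.

    rho   : number of parents, placed uniformly at random
    mu    : expected number of children per parent (Poisson distributed)
    sigma : standard deviation of the normal displacement of each child
    Only children are returned; children falling outside the window are dropped.
    """
    children = []
    for _ in range(rho):
        px, py = rng.random(), rng.random()
        # Poisson(mu) count: number of unit-rate arrivals before time mu.
        count, clock = 0, rng.expovariate(1.0)
        while clock < mu:
            count += 1
            clock += rng.expovariate(1.0)
        for _ in range(count):
            x, y = rng.gauss(px, sigma), rng.gauss(py, sigma)
            if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
                children.append((x, y))
    return children


rng = random.Random(8)
pts = simulate_poisson_cluster(rho=20, mu=10, sigma=0.05, rng=rng)
print(len(pts))
```

Restricting parents to the window and discarding outside children is a simplification that thins clusters near the boundary; a more careful simulator would place parents on an enlarged window.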

**Example of model fitting - 2**

- This is Exploratory Data Analysis!
- If we knew the theoretical value of G, K for this model, we could use Diggle’s ‘Least Squares Estimation’ method.
- Otherwise, use the GoF test to estimate the parameter space: find $\hat{p}$ for different combinations of $\rho$ and $\sigma$, and ‘accept’ the model where $\hat{p} > \alpha$.

[Figure: results over the ($\rho$, $\sigma$) parameter space, with $\rho$ from 0 to 100 and $\sigma$ from 0.1 to 0.4; panel a) G statistic, panel b) K statistic.]

**Example of model fitting - 3**

- Inference? For the observed data, if this model fits, then a larger $\sigma$ suggests a lower $\rho$ (i.e. few parents) and so more children per parent.
- Conversely, a smaller clustering radius requires a higher $\rho$ and so fewer children per parent.
- Is this model a good fit? What might the physiological and/or ecological implications be?
- $\sigma$ gives us hints about scale.

**$t_{max}$, Variance stabilization**

- $t_{max}$ should be chosen before the test, and based on the research question (i.e. what is the interaction distance of interest?).
- Variance stabilization: to make the variance independent of $t$.

[Figure: K(t) deviations (−0.10 to 0.05) against distance (0 to 0.20).]


**Type I Error Rate ($\alpha$) - 1**

- Simulation study of Type I error rate performance.
- Evaluated different $\alpha$ levels, for different point pattern intensities ($\lambda$).
- Results within LRT boundaries.

**Type I Error Rate ($\alpha$) - 2**

Simulations of 1000 independent trials.

[Figure: estimated Type I error rate $\hat{\alpha}$ (0 to 0.15) against number of points $\lambda$ (0 to 250); panel a) Type I error rates for G, panel b) Type I error rates for K.]

**Type II Error Rate (1-Power)**

- The Type II error rate is the probability of accepting $H_0$ given that $H_1$ is really true.
- Requires a definition of $H_1$.
- Power will be a function of ‘how far’ $H_1$ is from $H_0$. (‘Easy’ to think of this distance when using the Normal distribution, but more difficult to conceptualize here.)
- Often overlooked for spatial point process analysis, but can be simulated.
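Since power can be simulated, here is a rough sketch of how such a simulation might look. It tests $H_0$ of CSR against a hypothetical $H_1$ of simple sequential inhibition using a G-based goodness of fit statistic. The pattern size, inhibition radius, 19 simulations per test, 20 trials, and all function names are assumptions chosen to keep the run short; none of them come from the slides.

```python
import math
import random


def g_curve(pts, grid):
    """Empirical G (no edge correction) on a fixed distance grid."""
    nn = [min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
          for i, p in enumerate(pts)]
    return [sum(d <= t for d in nn) / len(pts) for t in grid]


def csr(n, rng):
    return [(rng.random(), rng.random()) for _ in range(n)]


def ssi(n, r, rng, max_tries=100_000):
    """Simple sequential inhibition with minimum separation r."""
    pts = []
    for _ in range(max_tries):
        if len(pts) == n:
            break
        c = (rng.random(), rng.random())
        if all(math.dist(c, p) >= r for p in pts):
            pts.append(c)
    return pts


def gof_p(curves):
    """curves[0] is the tested pattern; the rest are simulations under H0.
    The constant grid spacing is omitted from u because it cannot change ranks."""
    m, k = len(curves), len(curves[0])
    tot = [sum(c[j] for c in curves) for j in range(k)]
    u = [sum((c[j] - (tot[j] - c[j]) / (m - 1)) ** 2 for j in range(k))
         for c in curves]
    less = sum(ui < u[0] for ui in u[1:])
    # Same value as 1 - (rank - 1) / (s + 1), written to avoid round-off at alpha.
    return (m - less) / m


def estimate_power(n, r, s, trials, alpha, rng):
    grid = [0.005 * j for j in range(1, 41)]
    rejections = 0
    for _ in range(trials):
        curves = [g_curve(ssi(n, r, rng), grid)]              # pattern from H1
        curves += [g_curve(csr(n, rng), grid) for _ in range(s)]  # H0 simulations
        rejections += gof_p(curves) <= alpha
    return rejections / trials


power = estimate_power(n=50, r=0.05, s=19, trials=20, alpha=0.05, rng=random.Random(9))
print("estimated power:", power)
```

Repeating the estimate over a range of $H_1$ parameters (here the radius `r`) traces out a power curve, which is one way to make ‘how far $H_1$ is from $H_0$’ concrete.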

**Analysis of Type II Error Rate
**

Analysis of power against of CSR for WRCCRF example for different parameterizations of .

Type II error rate tells us the ability to distinguish the pattern from CSR. As increases, larger clusters are more like CSR.

a)ρ=20

1.0 1.0

¢

b)ρ=40

0.8

0.6

Power

0.4

Power

0.2

0.0

0.05

0.15

0.25

0.35

0.0

0.2

0.4

0.6

0.8

0.05

0.15

0.25

0.35

σ

σ

Inference for Point Pattern Spatial Statistics – p.45/49
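Power against a clustered alternative can be estimated by simulation along the following lines. This is a sketch under stated assumptions rather than the analysis shown on the slide: it assumes ρ is the number of parent points and σ the standard deviation of Gaussian offspring displacements (the slide does not define them here), wraps offspring onto a unit-square torus, uses a K estimate without edge correction, and uses small illustrative values instead of the WRCCRF settings. All function names are hypothetical.

```python
import bisect
import math
import random


def csr(n, rng):
    """n points placed uniformly at random in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def clustered(n, rho, sigma, rng):
    """n offspring scattered with Gaussian spread sigma around rho uniform
    parents; coordinates are wrapped onto the unit square (an assumption)."""
    parents = csr(rho, rng)
    pts = []
    for _ in range(n):
        px, py = rng.choice(parents)
        pts.append(((px + rng.gauss(0.0, sigma)) % 1.0,
                    (py + rng.gauss(0.0, sigma)) % 1.0))
    return pts


def k_hat(pts, grid):
    """K estimate on the unit square (area 1), no edge correction."""
    n = len(pts)
    d = sorted(math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
               for i in range(n) for j in range(i + 1, n))
    return [2.0 * bisect.bisect_right(d, t) / (n * (n - 1)) for t in grid]


def gof_p_value(curves, dt):
    """Rank-based p-value; curves[0] is observed, the rest simulated under H0."""
    m, k = len(curves), len(curves[0])
    totals = [sum(c[j] for c in curves) for j in range(k)]
    u = [sum((c[j] - (totals[j] - c[j]) / (m - 1)) ** 2 for j in range(k)) * dt
         for c in curves]
    return sum(1 for v in u if v >= u[0]) / m


def estimate_power(n, rho, sigma, n_sims=19, alpha=0.05, n_trials=100, seed=1):
    """Fraction of clustered patterns for which the CSR null is rejected."""
    rng = random.Random(seed)
    dt = 0.02
    grid = [dt * (k + 1) for k in range(10)]  # distances 0.02 .. 0.20
    rejections = 0
    for _ in range(n_trials):
        observed = k_hat(clustered(n, rho, sigma, rng), grid)
        sims = [k_hat(csr(n, rng), grid) for _ in range(n_sims)]
        if gof_p_value([observed] + sims, dt) <= alpha:
            rejections += 1
    return rejections / n_trials


power_tight = estimate_power(n=30, rho=5, sigma=0.02)
power_diffuse = estimate_power(n=30, rho=5, sigma=0.5)
print("power, tight clusters:  ", power_tight)
print("power, diffuse clusters:", power_diffuse)
```

Sweeping `sigma` over a grid of values and plotting the estimated power would give a curve of the same kind as the figure panels: as the clusters spread out, the patterns become harder to distinguish from CSR.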

**Power of the G Statistic**

‘Large’ deviation at small distances may be swamped out

[Figure: Ĝ − G (−0.3 to 0.3) versus distance (0.00 to 0.20) for rSSI(r=0.02) and rSSI(r=0.03).]

**Parameters that may improve Power**

Rewriting Equation (2) in its ‘full’ form (Diggle, 2003):

$$u_i = \int_0^{t_0} w(t)\,\left\{ [\hat{K}_i(t)]^{c} - [\bar{K}_i(t)]^{c} \right\}^{2}\,dt$$

w(t), c as parameters to improve Power against certain H₁

Use of w(t) not well explored, but could be used to emphasize certain distances.

For my calculations, [setting lost in extraction]

For K, use c = 0.5 to obtain the L statistic.

Use c = 0.25 for power against clustered patterns (Diggle, 2003)

other?
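To make the role of a weight function and a power transformation concrete, here is a small sketch of a discretised statistic of this type: a weighted sum, over a grid of distances, of squared differences between a power-transformed observed curve and a power-transformed reference curve. The function name `u_statistic`, its arguments, the Riemann-sum discretisation, and the example curves are all assumptions made for illustration.

```python
import math


def u_statistic(curve, reference, grid, c=1.0, weight=None):
    """Discretised weighted discrepancy:
    sum over k of w(t_k) * (curve_k**c - reference_k**c)**2 * (t_k - t_{k-1}).
    Curve values must be non-negative when c is fractional."""
    if weight is None:
        weight = lambda t: 1.0  # unweighted form
    total = 0.0
    previous_t = 0.0
    for t, obs, ref in zip(grid, curve, reference):
        total += weight(t) * (obs ** c - ref ** c) ** 2 * (t - previous_t)
        previous_t = t
    return total


# Hypothetical example: K under CSR is pi * t^2; the "observed" curve has a
# constant excess, which is relatively largest at small distances.
grid = [0.05, 0.10, 0.15, 0.20]
reference = [math.pi * t * t for t in grid]
observed = [k + 0.004 for k in reference]

for c in (1.0, 0.5, 0.25):
    print("c =", c, "u =", u_statistic(observed, reference, grid, c=c))

# A weight that keeps only distances up to 0.10.
small_scale = u_statistic(observed, reference, grid,
                          weight=lambda t: 1.0 if t <= 0.10 else 0.0)
print("u with small-distance weight:", small_scale)
```

With c = 0.5 the comparison is made on the square-root scale, which is the scale of the L statistic up to a constant factor, and a weight that is zero beyond some distance restricts the test to the scales of interest.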

Conclusions

Simulation envelope does not result in expected Type I error rates. Limits are not confidence intervals.

For more precise, reliable results, implement Diggle’s goodness of fit test

Previous marginal results should be re-examined

Choice of test parameters, based on research question and previous knowledge

Evaluate the Power of your test

R software availability:

http://students.washington.edu/nhl/masters.html

R software resources

CRAN (Comprehensive R Archive Network) site

http://cran.r-project.org/

**A. Baddeley’s spatstat package**

http://www.maths.uwa.edu.au/~adrian/spatstat.html

**P. Diggle’s splancs package**

http://www.maths.lancs.ac.uk/~rowlings/Splancs/

**UW R and S-plus user support group**

http://mailman1.u.washington.edu/mailman/listinfo/s plus