New Image Processing Software For Analyzing Object Size-Frequency Distributions, Geometry, Orientation, and Spatial Distribution

1New image processing software for analyzing object size-frequency
2distributions, geometry, orientation, and spatial distribution*
3Ciarán Beggan1 and Christopher W. Hamilton2
41 Grant Institute, School of GeoSciences, University of Edinburgh, UK
52 Hawai‘i Institute of Geophysics and Planetology, University of Hawai‘i, USA
7Corresponding author: Ciarán Beggan, E-mail: ciaran.beggan@ed.ac.uk;
8Phone: +44 131 650 5916; Address: School of GeoSciences, Grant
9Institute, West Mains Road, Edinburgh, EH9 3JW, UK
10
11*Code available from server at http://www.geoanalysis.org
12
13Abstract
14Geological Image Analysis Software (GIAS) combines basic tools for calculating
15object area, abundance, radius, perimeter, eccentricity, orientation, and centroid
16location, with the first automated method for characterizing the aerial distribution of
17objects using sample-size-dependent nearest neighbor (NN) statistics. The NN
18analyses include tests for: (1) Poisson, (2) Normalized Poisson, (3) Scavenged k = 1,
19and (4) Scavenged k = 2 NN distributions. GIAS is implemented in MATLAB with a
20Graphical User Interface (GUI) that is available as pre-parsed pseudocode for use
21with MATLAB, or as a stand-alone application that runs on Windows and Unix
22systems. GIAS can process raster data (e.g., satellite imagery, photomicrographs, etc.)
23and tables of object coordinates to characterize the size, geometry, orientation, and
24spatial organization of a wide range of geological features. This information expedites
25quantitative measurements of 2D object properties, provides criteria for validating the
1 1
26use of stereology to transform 2D object sections into 3D models, and establishes a
27standardized NN methodology that can be used to compare the results of different
28geospatial studies and identify objects using non-morphological parameters.
29
30Key Words: Image analysis, software, size-frequency, spatial distribution, nearest
31neighbor, vesicles, volcanology
32
331. Introduction
34Geological image analysis extracts information from representations of natural objects
35that may either be captured by an imaging system (e.g., photomicrographs, aerial
36photographs, digital satellite imagery) or schematically rendered into visual form
37(e.g., geological maps). In addition to examining the properties of individual objects,
38spatial analysis may be used to quantify object distributions and investigate their
39formation processes.
40 ImageJ (Rasbald, 2005) is a commonly used image processing application that
41was developed as an open source Java-based program by the National Institutes of
42Health (NIH). Custom plug-in modules enable ImageJ to solve numerous tasks,
43including the analysis of vesicle size-frequency distributions within geological thin-
44sections (e.g., Szramek et al, 2006, Polacci et al., 2007). However, ImageJ and similar
45programs for Windows (e.g., Scion Image and ImageTool) and Macintosh (e.g., NIH
46Image) are not specifically designed for geological applications nor do they provide
47analytical tools for investigating patterns of spatial organization.
48 To take greater advantage of the information contained within geological
49images, we have developed Geological Image Analysis Software (GIAS). This
50program combines (1) an image processing module for calculating and visualizing
2 2
51object areas, abundance, radii, perimeters, eccentricities, orientations, and centroid
52locations, and (2) a spatial distribution module that automates sample-size dependent
53nearest neighbor (NN) analyses. Although other programs can perform the basic
54functions in the “Image Analysis" module, GIAS is the first program to automate
55sample-size-dependent analyses of NN distributions.
56
572. Motivation
58Nearest neighbor (NN) analysis is well-suited for investigating patterns of spatial
59distribution within intrinsically two-dimensional (2D) datasets, such as orthorectified
60aerial photographs and satellite imagery. Applications of NN analyses to remote
61sensing imagery include the study of volcanic landforms (Bruno et al., 2004; 2006;
62Baloga et al., 2007; Bishop, 2008; Hamilton et al., 2009; Bleacher et al., 2009),
63sedimentary mud volcanoes (Burr et al., 2009), periglacial ice-cored mounds (Bruno
64et al., 2006), glaciofluvial features (Burr et al., 2009), dune fields (Wilkins and Ford,
652007), and impact craters (Bruno et al., 2006). Despite widespread utilization of NN
66analyses, its value as a remote sensing tool and effectiveness as basis for comparison
67between different datasets is limited by the lack of a standardized NN methodology—
68particularly in terms of defining feature field areas, thresholds of significance, and
69criteria for apply higher-order NN methods.
70 In addition to remote sensing applications, NN analyses can be used to study
71objects in photomicrographs such as crystals (Jerram et al., 1996; 2003; Jerram and
72Cheadle, 2000) and vesicles. In this study, we emphasize the application of NN
73analyses to vesicle distributions to demonstrate how GIAS can be used to validate (or
74refute) the assumption of randomness, which is a prerequisite for effectively applying
3 3
75stereological techniques to derive vesicle volumes from photomicrographs and for
76selecting appropriate statistical models to characterize those vesicle populations.
77 Vesicle textures preserve information about the pre-eruptive history of
78magmas and can be used to investigate the dynamics of explosive and effusive
79volcanic eruptions (e.g., Mangan et al., 1993; Cashman et al., 1994; Mangan and
80Cashman 1996; Cashman and Kauahikaua, 1997; Polacci and Papale 1997; Blower et
81al., 2001; Blower et al., 2003; Gaonac'h et al., 2005; Shin et al., 2005; Lautze and
82Houghton, 2005; Adams et al., 2006; Polacci et al., 2006; Sable et al., 2006; Gurioli
83et al., 2008). Quantitative vesicle analyses stem from Marsh (1988), who explored the
84physics of crystal nucleation and growth dynamics to derive an analytical formulation
85for crystal size distributions. This research was then applied to numerous bubble size-
86frequency distribution studies (e.g., Sarda and Graham, 1990; Cashman and Mangan,
871994; Cashman et al., 1994; Blower et al., 2003). Early studies of vesicle distributions
88(e.g., Cashman and Mangan, 1994) were limited by their inability to characterize the
89full range of vesicle sizes because their methodology could not resolve the smallest
90vesicles. Nested photomicrographs solve this problem because photomicrographs
91captured at multiple magnifications enable the reconstruction of total vesicle size-
92frequency distributions (e.g., Adams et al., 2006; Gurioli et al., 2008).
93 In general, vesicle studies are limited by two major factors: transformations of
942D cross-sections into representative vesicle volumes (Mangan et al., 1993; Sahagian
95and Proussevitch, 1998; Higgins, 2000; Jerram and Cheadle, 2000); and development
96of reliable statistical characterizations of vesicle populations in terms of distribution
97functions and their spatial characteristics (Morgan and Jerram, 2007; Proussevitch et
98al., 2007a). These difficulties have been partially addressed by improved statistical
99techniques for investigating vesicle populations; however, a single cross-section
4 4
100cannot be used to reconstruct a representative 3D vesicle distribution unless the
101objects viewed in a 2D section can be characterized using 2D reference textures with
102known 3D distributions (Jerram et al., 1996; 2003; Jerram and Cheadle 2000;
103Proussevitch et al., 2007a). Although synchrotron X-ray tomography is increasingly
104being used to directly generate 3D vesicle distributions (e.g., Gualda and Rivers,
1052006; Polacci et al., 2007; Proussevitch et al., 2007b), nested datasets containing
106multiple Scanning Electron Microscope (SEM) images remain the most common
107input for vesicle studies because of their superior spatial resolution relative to X-ray
108tomography. To facilitate the analysis of vesicles in SEM imagery, GIAS can be used
109to determine the geometric properties of vesicles within the plane of a given
110photomicrograph and establish if objects fulfil the criteria of spatial randomness,
111which is required for transforming 2D sections into 3D models using stereology.
112
1133. Nearest neighbor (NN) analysis
114Clark and Evans (1954) proposed a simple test for spatial randomness in which the
115actual mean NN distance (ra) in a population of known density is compared to the
116expected mean NN distance (re) within a randomly distributed population of
117equivalent density. Following Clark and Evans (1954), re and expected standard error
118(σe) of the Poisson distribution are:
1 0.26136
119 re  ; e  , [1; 2]
2 0 N 0
120where the input population density, ρ0, equals the number of objects (N) divided by
121the area (A) of the feature field (ρ0 = N / A). The following two test statistics (termed
122R and c) are used to determine if ra follows a Poisson random distribution:
ra ra  re
123 R ; c . [3; 4]
re e
5 5
124 If a test population exhibits a Poisson random distribution, R should ideally
125have a value of 1, while c should equal 0. If R is approximately equal to 1, then the
126test population may have a Poisson random distribution. If R > 1, then the test
127population exhibits greater than random NN spacing (i.e., tends towards a maximum
128packing arrangement), whereas if R < 1, then the NN distances in the test case are
129more closely spaced than expected within a random distribution and thus exhibit
130clustering relative to the Poisson model.
131 To identify statistically significant departures from randomness at the 0.95 and
1320.99 confidence levels, |c| must exceed the critical values of 1.96 and 2.58,
133respectively (Clark and Evans, 1954); however, these critical values implicitly assume
134large sample populations (N > 104). Jerram et al. (1996) and Baloga et al. (2007) note
135that finite-sampling effects introduce biases into the variation of NN statistics. These
136biases become significant for small populations (N < 300) and thereby necessitate the
137use of sample-size-dependent calculations of R and c thresholds.
138 The NN methodology of Clark and Evans (1954) assumes that objects can be
139approximated as points. However, for frameworks of solid objects, NN statistics
140exhibit scale-dependent variations in randomness due to the restrictions imposed by
141avoiding overlapping object placements. For 2D sections through 3D distributions of
142equal-sized solid spheres, Jerram et al. (2003) define a porosity-dependent threshold
143for R, which they term the random sphere distribution line (RSDL). With increasing N
144(i.e., decreasing porosity), the RSDL threshold increases from a value just above 1 to
145a maximum of 1.65 at 37% porosity, based on the experimental spherical packing
146model of Finney (1970). Jerram et al. (1996, 2003) also note that compaction,
147overgrowth, and sorting can introduce systematic variations in R relative to the RSDL
6 6
148and thus, when correctly identified, can provide information relating to the relative
149importance of these processes during rock formation.
150 In addition to sample-size-dependent tests for Poisson randomness, Baloga et
151al. (2007) proposed the following three NN models:
152
153 1. Normalized Poisson NN distributions account for data resolution-limits, which
154 require normalization of re, σe, skewness, and kurtosis values based on a user-
155 defined minimum distance threshold.
156 2. Scavenged Poisson NN distributions assume features consume (or “scavenge”)
157 resources in their surroundings, which affects the kth nearest neighbor by
158 increasing re relative to the standard Poisson NN model. The order of the
159 scavenging model is given by the Poisson index, k, which identifies the
160 number of NN objects participating in the scavenging process and, if k = 0, the
161 standard Poisson NN distribution is obtained.
162 3. Logistic NN distributions apply the classical probability distribution for self-
163 limiting population growth due to the use of available resources. Under this
164 probability assumption, NN distances become more uniform, which leads to
165 smaller skewness, but larger kurtosis compared to the Poisson NN model.
166
167 Sample-size-dependent R and c test statistics, in addition to the expected
168skewness versus kurtosis values, allow for robust statistical tests of spatial
169organization. However, prior to this study, these extensions to the classical NN
170method have not been automated and freely available for use.
171
1724. Method
7 7
1734.1. Input data and processing parameters
174GIAS is designed to process two input data types: images and tables of object
175coordinates. In the first instance, images may include rectangular raster files (e.g.,
176*.tiff, *.jpg, *.gif, etc.) encoded in 8-bit grey-scale formats. All inputs are assumed to
177have an orthogonal projection with equal x and y pixel dimensions. If the input image
178pixels are not square, their area can be uniquely specified. To appropriately scale the
179output results, the pixel size must be defined by the analyst for each input image.
180 Image segmentation is achieved using Digital Number (DN) thresholding,
181such that inputs are converted to a binary threshold logical matrix. GIAS assumes
182input grey-scale pixel values >250 are white; however, the upper and lower thresholds
183can be adjusted in the input options. Within input images, black pixels are assumed to
184be the objects of interest (e.g., vesicles) whereas white pixels are assumed to be part
185of the background (e.g., non-vesicles). An option is provided for image inversion.
186 Other than performing the binary conversion, GIAS does not modify input
187images. Consequently, pre-processing may be necessary to remove image artifacts and
188segment objects into feature classes. To facilitate image analysis and pre-processing
189of inputs to the NN module, GIAS supplies labeling information and metadata for
190each object in the image analysis output file; however, the level of image modification
191will depend upon the particular scientific investigation and the preferences of the
192analyst.
193 Inputs to the NN module can be passed from the image analysis module or
194loaded as a two column table (tab- or space- delimited) with x-coordinates in the first
195column and y-coordinates in the second column. GIAS automatically checks input
196files for duplicate entries and retains only unique coordinate pairs.
8 8
197 Slider bars control the detection thresholds for the minimum and maximum
198object areas, in addition to the distance threshold used within the Normalized Poisson
199tests. A check-box excludes objects that touch the periphery of the image from
200subsequent processing. This option is enabled by default because meaningful cross-
201sectional properties (e.g., area, geometry, and orientation) cannot be calculated for
202truncated objects. Calculations of object abundance (as a percentage of total image
203area) are based on total image properties and are unaffected by peripheral object
204exclusion. Once the input options are satisfactorily set, the analyst simply presses the
205“Process” button to calculate the output graphs and statistics.
206
2074.2. Output results
208GIAS outputs its results to on-screen graphs and text boxes, which are presented in
209two tabbed panels. Statistics relating to the object area, geometry, and orientation are
210shown in the first panel, labeled “Image Analysis” (Figure 1). The frequency axis can
211be plotted in linear or log units with on-screen options for defining custom
212dimensions. Object areas, radii, perimeters, and eccentricities are presented as
213histograms, whereas object orientations are displayed using a rose plot. Object radii
214are estimated by calculating the radii of equal area circles. In addition to the graphical
215output, there are text-box summaries of the total number of objects, their abundance,
216minimum, maximum, mean, standard deviation (at 1 σ), skewness, and kurtosis based
217on the object area calculations.
218 Results of the spatial analyses are summarized in a second panel, labeled
219“Nearest Neighbor Analysis” (Figure 2). These results include histograms of the
220measured NN distances, in addition to graphs of R, c, and skewness versus kurtosis
221(Baloga et al., 2007). The R and c values are presented with sample-size-dependent
9 9
222thresholds for rejecting the null hypothesis (i.e., a given model of re) at significance
223levels of 1 and 2 σ. By default, the program assumes a Poisson random model for the
224expected spatial distribution; however, the user may also view results based on
225Normalized Poisson, or Scavenged (k = 1 or 2) models. Similarly, the skewness versus
226kurtosis plots are provided with sample-size-dependent significance levels of 0.90,
2270.95, and 0.99.
228 Textbox outputs provide the model-dependent value of re and summarize ra
229statistics, including the total number of objects, number within the convex hull, and
230NN distance statistics (minimum, maximum, mean, and standard deviation at 1 and 2
231σ). The convex hull is a polygon defined by the outermost points in a feature field
232such that each interior angle of the polygon measurers less than 180º (Graham, 1972).
233The convex hull thus forms a boundary around a feature field with the minimum
234possible perimeter length (see Appendix 1 for examples).
235 The input parameters and computed statistics from both tabbed panels can be
236output in separate user-named text files. The data are saved in a tab-delimited format,
237allowing the files to be viewed using a text editor or imported into a spreadsheet.
238
2394.3. Implementation
240GIAS is written and implemented in MATLAB with a Graphical User Interface (GUI)
241that is designed for use within the geological community. The program is available as
242preparsed pseudocode for use with MATLAB (provided both “Image Processing” and
243“Statistics” toolboxes are installed) or as a standalone application that can run on
244Windows and Unix systems without requiring MATLAB or its auxiliary toolboxes.
245These software products may be downloaded from the internet for free via
246http://www.geoanalysis.org.
10 10
247 The regionprops function within the MATLAB Image Processing toolbox
248calculates the following key properties of each object: area, centroid position,
249perimeter, eccentricity, orientation, equal-area circle equivalent diameter and a list of
250pixel positions within each vesicle. The pixel list is used to determine which objects
251are in contact with the image edge and the equal area circle diameters are used to
252calculate object radii. The eccentricity and orientation are calculated by fitting an
253ellipse about the object. The angle between the major axis of the ellipse and the x-axis
254of the image defines the orientation.
255 By default, the NN analysis utilizes centroid positions calculated in the image
256analysis module, however, object locations may be uploaded as a table of x- and y-
257coordinates. Within the NN module, the MATLAB function convhull is used to
258identify the vertices of the convex hull for all of the input objects. The function
259convhull is also used to calculate the area of the convex hull (Ahull) and to separate
260interior objects (Ni) from the hull vertices (see Appendix 1 for examples).
261 NN distances are determined by calculating the Euclidean distance between
262each interior object (centroid) and all other objects within the dataset, including the
263vertices of the hull. The results for each interior point are sorted in ascending order to
264identify the minimum NN distance and calculate the actual mean NN distance (ra) by
265averaging the NN distances for all interior objects (Ni). The interior population density
266(ρi = Ni / Ahull), provides a statistical estimate of the true density within an infinite
267domain and thus substitutes for ρ0. We then calculate R as the ratio of ra-to-re.
268 Using sample-size-dependent values of R, c, and their respective thresholds of
269significance, we may define the following scenarios. If the values of R and c fall
270within ±2 σ of their expected values, then the results suggest that the model correctly
271describes the input distribution. If the value of c is outside the range of ±2 σ, then the
11 11
272inference is that the input distribution is inconsistent with the model, regardless of the
273value of R. If c is outside ±2 σ, and R is greater than the +2 σ threshold, then the input
274distribution exhibits a statistically significant repelling relative to the model.
275Conversely, if c is outside the ±2 σ range, and R less then than the -2 σ threshold, then
276the input distribution is inferred to be clustered relative to the model. Lastly, if R is
277outside ±2 σ, and c is within the ±2 σ range, then the test results are ambiguous,
278which generally indicates that there is insufficient data to perform the analysis and/or
279the standard error (σe) of the input distribution is large.
280 One of the novel contributions of our application is the introduction of
281automated calculations of sample-size-dependent biases in NN statistics. Baloga et al.
282(2007) described how finite sampling would affect several NN models; however, they
283only implemented a solution for the Poisson NN case. In our application, we have
284automated the Poisson NN procedure, extended it to include the three other models of
285Baloga et al. (2007), and derived the k = 2 case for the scavenged NN distribution
286(Appendix 2). Automation of the NN method within GIAS enables users to rapidly
287determine if their data fulfills the size-dependent criteria of statistical significance for
288the following five spatial distribution models: (1) Poisson, (2) Normalized Poisson,
289(3) Scavenged k = 1; and (4) Scavenged k = 2. Table 2 summarizes the properties of
290these models.
291 To obtain sample-size-dependent values of R, c, and their confidence limits,
292we generated simulations in MATLAB. Following the method of Baloga et al. (2007),
293confidence limits for the Poisson and the Scavenged k = 1 and 2 models were
294simulated using 1000 random draws from a modeled distribution. However, for the
295Normalized Poisson model, limits for the standard Poisson case are employed because
296simulating unique limits for each user-defined distance threshold is computationally
12 12
297intensive in real-time (e.g., requiring approximately four minutes to perform the
298simulations using a computer with a 2 GHz processor and 1 GB of RAM).
299 GIAS does not include the logistic density distribution described by Baloga et
300al. (2007) because its parameters (i.e., the logistic mean (m) and the standard
301deviation (b), see Table 2) are estimated using a best-fit to the input data. The logistic
302case therefore differs substantially from the other NN analyses because they involve a
303comparison between input spatial distributions and expected model distributions.
304Additionally, the fit of a logistic curve to an input distribution does not necessarily
305imply an underlying logistic process because other processes could generate
306distributions that better fits the data. Thus given the differences between the logistic
307density distribution described by Baloga et al. (2007) and the other Poisson NN
308models, we have omitted the logistic case from the geospatial analysis tools
309implemented within GIAS.
310 To correctly interpret the results of the Scavenging NN models, it is important
311to recognize that the formulation of Baloga et al. (2007) is a scale-free distribution.
312Modification to expected NN distances depend upon the number of objects
313participating in the scavenging process and not on the location of the objects.
314Consequently, the mean and standard deviation statistics are not directly related to the
315radius of the scavenged area. This is an unrealistic condition for resource-scavenging
316phenomena in nature, which have a fixed upper limit to r. Alternative modeling
317scenarios, with a user-defined scavenging radius, may be more appropriate for some
318applications—provided that a scale for the scavenging process can be determined.
319 Distinguishing between Poisson NN and Scavenged Poisson NN models
320requires statistical separation of their respective NN distributions. The minimum
321number of objects required will depend on ρ0 and k. In Appendix 3, we show that
13 13
322separation of the Poisson and k = 1 Scavenged Poisson models, at the 95% confidence
323level, requires a minimum 50 to 60 objects for ρ0 values of 0.5 to 0.0005 objects / unit
324area, whereas 100 to 150 objects are required to distinguish between k = 1 and 2
325models.
326 In addition to sample-size-dependent R and c statistics, we expand upon the
327methods of Baloga et al. (2007) by simulating the sample-size-dependent range of
328expected skewness and kurtosis values for each NN hypothesis. Calculated
329confidence intervals—at 90%, 95% and 99%—are plotted within GIAS to illustrate
330the expected ranges of skewness versus kurtosis for each model. Plots of skewness
331versus kurtosis are also useful for discriminating between populations with otherwise
332similar R and c values, such as ice-cored mounds (e.g., pingos) and volcanic rootless
333cones (Bruno et al., 2006).
334
3355. Results
336To demonstrate the utility of GIAS, we have applied the program to vesicle patterns
337within proximal magmatic tephra deposits from Fissure 3 of the 1783-1784 Laki cone
338row, Iceland. Tables 3 and 4 summarize the results of the “Image Analysis” and
339“Nearest Neighbor Analysis” modules, respectively. In Appendix 1, we include
340additional examples of geospatial distributions that are clustered and repelled relative
341to the Poisson NN model. These examples are based on the Hamilton et al. (2010a,
3422010b) and demonstrate how statistical NN analyses can be combined with field
343observations to obtain information about the effects of environmental resource
344limitation on the spatial distribution of volcanic landforms. Appendix 1 also explores
345how the geospatial analysis tools that are provided within GIAS can be applied to
346remote sensing data to quantitatively compare the spatial distribution of landforms in
14 14
347different planetary environments, supply non-morphological evidence to discriminate
348between feature identities, and examine self-organization processes within geological
349systems.
350
3515.1. Vesicle analysis
352Vesicle characteristics can provide information about the pre-eruptive history of
353magmas, volatile exsolution dynamics, and conduit ascent processes—all of which are
354important factors in controlling the intensity of explosive volcanic eruptions.
355Quantification of these parameters relies largely upon the use of vesicle size, number,
356and volume distributions to establish links between natural volcanic samples and
357experimental data (Adams et al., 2006). Nested SEM images at several magnifications
358are typically used in combination with stereology (e.g., Sahagian and Proussevitch,
3591998) to transform 2D vesicle properties into 3D distributions. Although acquisition
360and analysis of a complete nested SEM dataset is outside the scope of this study, we
361present an example of how GIAS can be used to analyze vesicle properties within a
362single image. For this example, we have chosen a magmatic tephra sample (LAK-t-
36323c, Emma Passmore, unpublished data) from 1783-1784 Laki eruption in Iceland.
364The resolution of the image is 3.81 μm / pixel and we have followed the method of
365Gurioli et al. (2008) by using a minimum object area threshold of 20 pixels (76.2
366μm2) to avoid complications associated with poorly resolved objects near the limit of
367resolution within our data.
368 The output of the "Image Analysis" module (Table 3 and Figure 1) reveals that
369the sample has a vesicularity of 65.48% and—excluding objects in contact with the
370periphery of the image—there are 537 objects larger than the detection threshold of
37176.2 μm2. Assuming the properties of an equal area circle, the vesicle diameters range
15 15
372from 19.21 μm to 2.85 × 103 μm, with a mean of 3.01 × 102 μm ±6.89 × 102 (at 1 σ).
373The ratio of equal area circle circumference to object perimeter is 0.77 ±1.53 (at 1 σ);
374however; for the 21 vesicles larger than 3.10 × 105 μm2 the ratio decreases to 0.55
375±0.21, which indicates that the largest vesicles strongly deviate from a circular
376geometry (see the graph of “Objects Perimeters”). The vesicles have a mean
377eccentricity of 0.71 ± 0.18 and a mean orientation trending 7.27° to 187.27° ±43.91°
378(n.b., 90° points towards the top of the image; see “Object Eccentricity” and “Object
379Orientations” in Figure 1).
380 The NN module results (Table 4 and Figure 2) show that NN distances
381between vesicle centroids range from 28.05 to 1.53 ×103 μm, with a mean (ra) of 1.75
382× 102 ±1.15 ×102 (see “Nearest Neighbor Frequency Distribution” in Figure 2). The
383sample has R and c values within the 2 σ thresholds of significance (Figure 2) and
384thus we conclude that the vesicle centroids exhibit a Poisson (random) distribution.
385We have also analyzed the sample using a Normalized Poisson threshold of 5 pixels
386(19.24 μm), which corresponds to the diameter of a circle that is equal in area to our
387object area threshold of 20 pixels. Given this threshold, the sample exhibits a repelled
388distribution with R and c values greater than their respective upper 2 σ limits. The
389Scavenged k = 1 and 2 cases overestimate re relative to ra, and thus result in clustered
390results relative to both Scavenged models.
391 The GIAS outputs suggest that vesicles within this sample exhibit an overall
392Poisson distribution despite the presence of a vesicle fabric trending 7.27° to 187.27°
393±43.91°. The random distribution of vesicles supports the usage of a stereological
394transformation, however, deformation and coalescence have caused vesicles >3.10 ×
395105 μm2 to strongly deviate from a circular geometry. Consequently, if the vesicles
396represented within this thin-section were transformed into a 3D distribution using
16 16
397stereology, it would be necessary to acknowledge propagating errors associated with
398the non-circular geometry of the largest vesicles in the sample. Additionally, it is
399important to recognize the resolution limitations associated with analysis because we
400have applied a minimum object area threshold of 20 pixels, which excludes all
401vesicles <76.2 μm2 (i.e., 630 of 1167 objects). A detection threshold of 20 pixels is
402appropriate for studies of vesicle size-frequency distributions because an
403misclassification of 1 pixel results in an error of 5% of the estimate of total object area
404(Lucia Gurioli, personal communication, 2009). Nevertheless, this threshold appears
405insufficient for filtering artifacts associated with the geometric properties of the
406smallest vesicles in the input image. For instance, eccentricity values calculated for
407vesicles within the Laki magmatic tephra sample appear to be very high (0.71 ± 0.18),
408but this is a resolution-induced artifact resulting from pixel-scale asymmetries in a
409large population of small vesicles. Resolution limitations also affect the Normalized
410Poisson NN distribution because the large number of objects below the detection
411thresholds leads to underestimates of ρ0 and re, which in turn results in a repelled
412distribution relative to the model. In instances where the majority of objects are below
413the detection limit of analysis, it is therefore imperative to use nested image datasets,
414and/or high magnification mosaics to resolve the properties of the smallest objects in
415a given distribution.
416
4176. Discussion
418For randomly distributed spherical objects, stereological techniques may be used to
419convert cross-sectional areas into volumes (Sahagian and Proussevitch, 1998).
420Higgins (2000) and Morgan and Jerram (2007) have also applied 2D to 3D
421transformations for crystal shapes such as cubes and prisms based upon a large
17 17
422number of observed cross-sectional areas. Nevertheless, for irregular object
423geometries, transformations do not exist for generating an accurate 3D model from a
424single 2D image (Mock and Jerram, 2005; Sable et al., 2006). For vesicle studies,
425direct measurements of 3D vesicle distributions using X-ray microtomography can be
426used to circumvent the problems associated with stereological transformations (e.g.,
427Gualda and Rivers, 2006; Polacci et al., 2007; Proussevitch et al., 2007b), but the
428resolution of synchrotron-based X-ray tomography is insufficient to resolve micron-
429sized vesicle walls, which can result in erroneously large estimates of vesicle
430permeability (Gurioli et al., 2008). Tomographic resolution limitations similarly
431prohibit the detection of sub-micron vesicles, which decreases estimates of vesicle
432number densities relative to investigations of higher resolution SEM images (Gurioli
433et al., 2008). Consequently, even given the availability of 3D imaging techniques,
434thin-section analysis continues to supply unique information about vesiculation
435processes and, therefore, 2D image processing should continue to be explored.
436 Similarly, NN analysis of a single 2D image cannot resolve the spatial
437organization of the centroids in 3D, unless additional reference information is
438provided (e.g., Jerram et al., 1996; 2003; Jerram and Cheadle, 2000). Instead, 2D
439methods characterize the aerial distribution of objects within the plane of the image.
440When applying NN analyses to vesicle distributions in thin-sections, it is important to
441remember that object centroids are unlikely to coincide with the center of a vesicle in
4423D, especially when vesicles are irregularly shaped. To obtain a more meaningful
443statistical description of vesicle organization it may be necessary to analyze several
444photomicrographs—preferably three mutually perpendicular thin-sections that sample
445the principal component directions of the vesicle fabric. Vertical offsets in the position
446of objects in orthometric aerial photographs and satellite imagery may similarly
18 18
447introduce discrepancies between projected object separations and true NN distances,
448but the vertical offsets are generally small relative to the separation of objects in the
449plane of the image and hence the differences can be ignored.
450
4517. Conclusions
452We have developed Geological Image Analysis Software (GIAS) for calculating the
453geometric properties of objects and quantifying their spatial distribution. The program
454improves the efficiency and reproducibility of geological object characterizations by
455combining image processing tools for calculating object areas, abundance, radii,
456perimeters, eccentricity, orientation, and centroid location with sample-size-dependent
457NN tests for (1) Poisson; (2) Normalized Poisson; (3) Scavenged, k = 1; (4) and
458Scavenged k = 2 distributions. These geospatial analyses have not been implemented
459in other software packages and, consequently, GIAS represents a novel approach to
460characterizing vesicle distributions in thin-sections and examining the spatial
461organization of other geological features in datasets ranging from photomicrographs
462(e.g., Section 5) to remote sensing imagery of planetary surfaces (e.g., Appendix 1
463and Hamilton et al., 2010b).
464 Future work will focus on developing the following aspects of GIAS: (1)
465image segmentation using Kriging analysis of DN histograms (e.g., Oh and Lindquist,
4661999); (2) automated object recognition (e.g., Proussevitch and Sahagian, 2001;
467Lindquist et al., 1996; Shin et al, 2005); (3) geometric cluster analysis (e.g., Still et
468al., 2004); (4) statistical analyses to address truncated and multimodal populations
469(e.g., Proussevitch et al., 2007a); (5) processing nested image datasets (e.g., Gurioli et
470al., 2008); (6) incorporating stereological techniques for converting 2D data into 3D
471distributions and then using the 3D models to calculate vesicle number density (e.g.,
19 19
472Sahagian and Proussevitch, 1998; Proussevitch et al., 2007a); (7) developing NN
473analyses for 3D datasets (Jerram and Cheadle, 2000; Mock and Jerram, 2005); and (8)
474treatment of solid object interactions on NN statistics (e.g., Jerram et al.; 2003).
475
476Acknowledgements
477We thank Alexander Proussevitch, Steve Baloga, Sarah Fagents, Bruce Houghton,
478Lucia Gurioli, and Mark Naylor for insightful discussions relating to vesicle
479distributions and NN simulations; Emma Passmore for kindly providing the
480geological thin-sections examined within this study; and Dougal Jerram and
481Guilherme Gualda for their thorough reviews, which have greatly improved this
482manuscript. CDB wishes to acknowledge the assistance of Sue Rigby in providing
483funding for software purchases. CDB was funded under NERC studentship award
484NER/S/J/2005/13496. CWH is funded by the Icelandic Centre for Research
485(RANNÍS) and the NASA Mars Data Analysis Program (MDAP) grant
486NNG05GQ39G. SOEST publication number: 7808. HIGP publication number 1802.
487
488
489References
490Adams, N.K., Houghton, B.F., Fagents, S.A., Hildreth, W., 2006. The transition from
491explosive to effusive eruptive regime: The example of the 1912 Novarupta eruption,
492Alaska. Geological Society of America Bulletin 118, 620–634.
493doi:10.1130/B25768.1.
494
20 20
495Baloga, S.M., Glaze, L.S., Bruno, B.C., 2007. Nearest-neighbor analysis of small
496features on Mars: Applications to tumuli and rootless cones. Journal of Geophysical
497Research 112, E03002, 1-17. doi:10.1029/2005JE002652.
498
499Bishop, M.A., 2008. Higher-order neighbor analysis of the Tartarus Colles cone
500groups, Mars: The application of geographical indices to the understanding of cone
501pattern evolution. Icarus 197, 73–83.
502
503Bleacher, J.E., Glaze, L.S., Greeley, R., Hauber E., Baloga, S.M., Sakimoto, S.E.H.,
504Williams, D.A., Glotch, T.D., 2009. Spatial and alignment analyses for a field of small
505volcanic vents south of Pavonis Mons and implications for the Tharsis province,
506Mars. Journal of Volcanology and Geothermal Research 185, 96-102.
507doi:10.1016/j.jvolgeores.2009.04.008.
508
509Blower, J.D., Keating, J.P., Mader, H.M., Phillips, J.C., 2001. Inferring
510volcanic degassing processes from vesicle size distributions.
511Geophysical Research Letters 28, 347–350.
512
513Blower, J.D., Keating, J.P., Mader, H.M., Phillips, J.C., 2003. The
514evolution of bubble size distributions in volcanic eruptions. Journal of
515Volcanology and Geothermal Research 120, 1–23.
516
517Bruno B.C., Fagents S.A., Thordarson T., Baloga S.M., Pilger E., 2004.
518Clustering within rootless cone groups on Iceland and Mars: Effect of
21 21
519nonrandom processes. Journal of Geophysical Research 109,
520E07009, 1-11. doi:10.1029/2004JE002273.
521
522Bruno, B.C., Fagents, S.A., Hamilton, C.W., Burr, D.M., Baloga, S.M.,
5232006. Identification of volcanic rootless cones, ice mounds, and
524impact craters on Earth and Mars: Using spatial distribution as a
525remote sensing tool. Journal of Geophysical Research 111, E06017,
5261-16. doi:10.1029/2005JE002510.
527
528Burr, D.M., Bruno, B.C., Lanagan, P.D., Glaze, L.S., Jaeger, W.L., Soare, R.J., Wan
529Bun Tseung, J.M., Skinner, J.A., Baloga, S.M., 2009. Mesoscale raised-rim
530depressions (MRRDs) on Earth: A review of the characteristics, processes and spatial
531distributions of analogs for Mars. Planetary and Space Science 57, 579-596.
532
533Cashman, K.V., Mangan, M.T., 1994. Physical aspects of magmatic degassing: II.
534Constraints on vesiculation processes from textural studies of eruption products. In:
535Carroll, M.R. and Holloway, J.R. (Eds.), Volatiles in Magmas. Reviews in
536Mineralogy, Mineralogical Society of America, pp. 447–478.
537
538Cashman, K.V., Kauahikaua, J.P., 1997. Reevaluation of vesicle
539distributions in basaltic lava flows. Geology 25, 419–422.
540
541Clark, P.J., Evans, F.C., 1954. Distance to nearest neighbor as a measure of spatial
542relationships in populations. Ecology 35, 445–453.
543
22 22
544Finney J., 1970. The geometry of random packing. Proceedings of the Royal Society
545of London, Series A 318, 479-493.
546
547Gaonac'h, H., Lovejoy, S., Schertzer, D., 2005. Scaling vesicle
548distributions and volcanic eruptions. Bulletin of Volcanology 67, 350–
549357.
550
551Gualda, G.A.R., Rivers, M., 2006. Quantitative 3D petrography using X-ray
552tomography: Application to Bishop Tuff pumice clasts. Journal of Volcanology and
553Geothermal Research 154, 48–62.
554Gurioli, L., Harris, A.J.L., Houghton, B.F., Polacci, M., Ripepe, M., 2008. Textural
555and geophysical characterization of explosive basaltic activity at Villarrica volcano,
556Journal of Geophysical Research 113, B08206, 116. doi:10.1029/2007JB005328.
557
558Hamilton, C.W., Thordarson, T., Fagents, S.A., 2010a, Explosive lava-water
559interactions I: architecture and emplacement chronology of volcanic rootless cone
560groups in the 1783-1784 Laki lava flow, Iceland. Bulletin of Volcanology, (in press)
561
562Hamilton, C.W., Fagents, S.A., Thordarson, T., 2010b, Explosive lavawater
563interactions II: selforganization processes among volcanic rootless eruption sites in
564the 17831784 Laki lava flow, Iceland. Bulletin of Volcanology, (in press).
565
23 23
566Higgins, M.D., 2000. Measurement of crystal size distributions. American
567Mineralogist 85, 1105-1116.
568
569Jerram, D.A., Cheadle, M.J., Hunter, R.H., Elliot, M.T., 1996. The spatial distribution
570of crystals in rocks. Contributions to Mineralogy and Petrology 125(1), 60-74.
571
572Jerram, D. A., Cheadle, M. J., 2000. On the cluster analysis of rocks. American
573Mineralogist 84, 47-67.
574
575Jerram, D.A., Cheadle, M.J., Philpotts, A.R., 2003. Quantifying the building blocks of
576igneous rocks: are clustered crystal frameworks the foundation? Journal of Petrology
57744, 2033-2051. doi:10.1093/petrology/egg069.
578
579Lautze, N.C., Houghton, B.F., 2005. Physical mingling of magma and complex
580eruption dynamics in the shallow conduit at Stromboli volcano, Italy. Geology 33
581425–428.
582
583Lindquist, W.B., Lee, S-M., Coker, D.A., Jones, K.W., Spanne, P., 1996. Medial axis
584analysis of void structure in three-dimensional tomographic images of porous media.
585Journal of Geophysical Research 101, 8297–8310.
586
587Mangan, M.T., Cashman, K.V., 1996. The structure of basaltic scoria and reticulite
588and inferences for vesiculation, foam formation, and fragmentation in lava fountains.
589Journal of Volcanology and Geothermal Research 73, 1–18.
590
24 24
591Mangan, M.T., Cashman, K.V., Newman S., 1993. Vesiculation of basaltic magma
592during eruption, Geology 21, 157–160.
593
594Mangan, M.T., Mastin, M., Sisson, T., 2004. Gas evolution in eruptive conduits:
595combining insights from high temperature and pressure decompression experiments
596with steady-state flow modelling. Journal of Volcanology and Geothermal Research
597129, 23–36.
598
599Marsh, B.D., 1988. Crystal size distributions (CSD) in rocks and the
600kinetics and dynamics of crystallization: 1. Theory. Contributions to
601Mineralogy and Petrology 99, 277–291.
602
603Mock, A., Jerram, DA., 2005. Crystal size distributions (CSD) in three dimensions:
604Insights from the 3D reconstruction of a highly porphyritic rhyolite. Journal of
605Petrology 46, 1525-1541.
606
607Morgan, D., Jerram, D.A., 2006. On estimating crystal shape for crystal size
608distribution analysis. Journal of Volcanology and Geothermal Research 154, 1-7.
609
610Oh W., Lindquist W., 1999. Image thresholding by indicator Kriging. IEEE
611Transactions on Pattern Analysis and Machine Intelligence 21, 590–602.
612
613Polacci, M., Baker, D.R., Bai, L., Mancini L., 2007. Large vesicles record pathways
614of degassing at basaltic volcanoes. Bulletin of Volcanology 70, 1023-1029. doi
61510.1007/s00445-007-0184-8.
25 25
616
617Polacci, M., Baker, D.R., Mancini, L., Tromba, G., Zanini, F., 2006.
618Three-dimensional investigation of volcanic textures by X-ray
619microtomography and implications for conduit processes.
620Geophysical Research Letters 33, L13312, 1-5.
621doi:10.1029/2006GL026241.
622
623Polacci, M., Papale, P., 1997. The evolution of lava flows from
624ephemeral vents at Mount Etna: insights from vesicle distribution
625and morphological studies. Journal of Volcanology and Geothermal
626Research 76, 1–17.
627
628Proussevitch, A., Sahagian, D., Tsentalovich, E.P., 2007a. Statistical analysis of
629bubble and crystal size distributions: Formulations and procedures. Journal of
630Volcanology and Geothermal Research 164, 95-111.
631
632Proussevitch, A., Sahagian, D., Carlson, W., 2007b. Statistical analysis of bubble and
633crystal size distributions: Application to Colorado Plateau Basalts. Journal of
634Volcanology and Geothermal Research 164, 112-126.
635
636Proussevitch, A., Sahagian, D., 2001. Recognition and separation of discrete objects
637within complex 3D voxelized structures. Computers & Geosciences 27, 441–454.
638
639Rasband, W.S., 2005. ImageJ, U. S. National Institutes of Health, Bethesda,
640Maryland, USA, http://rsb.info.nih.gov/ij/ [accessed November 28, 2009].
26 26
641
642Sahagian, D., Proussevitch, A., 1998. 3D particle size distributions from 2D
643observations: stereology for natural applications. Journal of Volcanology and
644Geothermal Research 84, 173–196.
645
646Sable, J., Houghton, B.F., Del Carlo, P., Coltelli, M., 2006. Changing conditions of
647magma ascent and fragmentation during the Etna 122 BC basaltic Plinian eruption:
648evidence from clast microtextures. Journal of Volcanology and Geothermal Research
649158, 333–354.
650
651Sarda, P., Graham, D., 1990. Mid-ocean ridge popping rocks:
652implication for degassing at ridge crests. Earth and Planetary
653Science Letters 97, 268–289.
655Shin, H., Lindquist, W.B., Sahagian, D.L., Song, S-R., 2005. Analysis of the vesicular
656structure of basalts. Computers & Geosciences 31, 473–487.
657
658Stephenson, G., 1992. Mathematical Methods for Science Students, 2nd edn.,
659Longman Scientific & Technical, London, 528pp.
660
661Still, S., Bialek, W., Bottou, L., 2004. Geometric clustering using the Information
662Bottleneck method. In: Thrun, S., Saul, L., and Schölkopf, B. (Eds.), Advances in
663Neural Information Processing Systems, 16, MIT Press, Cambridge, MA, pp. 1165-
6641172.
665
27 27
666Szramek, L., Gardner, J.E., Larsen, J., 2006. Degassing and microlite crystallization
667of basaltic andesite magma erupting at Arenal Volcano, Costa Rica. Journal of
668Volcanology and Geothermal Research 157, 182–201.
669
670Wilkins, D.E., Ford, R.L., 2007. Nearest neighbor methods applied to dune field
671organization: The Coral Pink Sand Dunes, Kane County, Utah, USA. Geomorphology
67283, 48-57.
28 28
673Appendix 1: Additional nearest neighbor analysis examples
674Explosive interactions between lava and groundwater can generate volcanic rootless
675cones (VRCs). VRCs are well suited to nearest neighbor (NN) analyses because they
676occur in groups, which consist of a large number of landforms that share a common
677formation mechanism. Geological Image Analysis Software (GIAS) includes NN
678analysis tools that were used by Hamilton et al. (2010b) to investigate self-
679organization processes within VRC groups and quantitatively compare the spatial
680distribution of VRCs in the Laki lava flow in Iceland to analogous landforms in the
681Tartarus Colles Region of Mars.
682 Figure 1 presents examples of rootless eruption sites within the Hnúta and
683Hrossatungur groups of the Laki lava flow. In the field, Differential Global
684Positioning System (DGPS) measurements were used to delimit rootless crater floors
685and their centroids were assumed to represent the surface projection of underlying
686rootless eruption sites (Hamilton et al., 2010a, 2010b). Hamilton et al. (2010a) used
687tephrochronology and stratigraphic relationships to divide the complete study area
688(Figure 1A) into chronologically and spatially distinct groups, domains, and
689subdomains. Figure 1B provides an example of a VRC subdomain (Hnúta Subdomain
6901.1). Figures 1C and D show the convex hull boundaries of the complete study area
691and Subdomain 1.1 of the Hnúta Group, respectively. Figure 1 also shows the
692locations of the convex hull vertices (Nv) and interior rootless eruption sites (Ni).
693 Identification of internal boundaries within the Hnúta and Hrossatungur
694groups allowed Hamilton et al. (2010b) to examine the effects of scale and resource
695availability on the spatial distribution explosive lava-water interactions. Relative to
696the Poisson NN model, rootless eruption sites in both the complete study area (Ni =
6972193) and Hnúta Subdomain 1.1 (Ni = 98) have |c| and |R| values that exceed their
29 29
698respective sample-size-dependent limits of confidence at 2 σ. Consequently, neither of
699these distributions are constant with a Poisson random model. Within the complete
700study area, R is 0.72, which is lower than the 2 σ confidence limit of R (0.98) and
701implies that the distribution is clustered with respect to the Poisson NN model. In
702Hnúta Subdomain 1.1, R is 1.23, which is above the upper 2 σ confidence limit of
7031.16 and implies that the distribution is repelled relative to a Poisson NN model.
704Hamilton et al. (2010b) interpret these patterns of spatial distribution within the
705context of resource abundance and depletion though competitive utilization. On
706regional scales, VRCs cluster in locations that contain sufficient lava and groundwater
707to initiate rootless eruptions, but on local scales rootless eruption sites tend to self-
708organize into distributions that maximize the utilization of limited groundwater
709resources. Hamilton et al. (2010b) interpret topography to be the primary influence on
710regional clustering because it would control initial groundwater abundance and
711regions of lava inundation. In contrast, they attribute local repelling to the non-
712initiation, or early cessation, of rootless eruptions at unfavorable localities due to
713groundwater depletion in a competitive system that is analogous to an aquifer
714perforated by a network of extraction wells.
715 Baloga et al. (2007) showed that rootless eruption sites in the Tartarus Colles
716Region of Mars have R = 1.22 and c = 6.14. These NN statistics are similar to those
717observed within Hnúta Subdomain 1.1 (R = 1.23 and c = 4.45), which suggests that a
718similar formation process in both environments. Thus the statistical NN analysis tools
719provided within GIAS can be combined with field observations and remote sensing to
720quantify patterns of spatial distribution and infer underlying mechanisms of formation
721within a diverse range of geological systems.
722
30 30
723Appendix 1 Figure 1:
724A and B depict rootless eruption sites (red circles) superimposed on a 0.5 m/pixel
725color aerial photograph showing 1783-1784 Laki eruption lava flows in Iceland. A:
726Full extent of Hnúta and Hrossatungur groups and an inset (white rectangle) denoting
727Hnúta Subdomain 1.1 (shown in B). C and D correspond to identical regions shown in
728A and B, respectively, and provide examples of convex hull boundaries, rootless
729eruption sites that form convex hull vertices (Nv), and interior points (Ni).
730
731Appendix 1 Figure 1
31 31
732Appendix 2: Derivation for the Scavenged Nearest Neighbor (k = 2) case
733The probability of finding exactly k points within a circle of radius r on a plane can be
734found using the Poisson distribution:
( 0 r 2 ) k 0 r 2
735 P(r , k )  e , [0]
k!
736where ρ0 is the spatial density—defined by the number of objects (N) divided by the
737area of the feature field (A). If the process of creating a point on the plane consumes
738(or “scavenges”), resources that would have formed the kth nearest neighbor, then the
739process can be modelled as (Baloga et al., 2007):
740 P( r ; k scavenges)  1  P (0)  P (1)   P (k ) .
741 [0]
742For k = 2, the density distribution of the function can be written as:

2
743 p(r )  ( 0 ) 3 r 5 e 0 r . [0]
744For convenience, we write    0 , leading to:

2
745 p ( r )   3 r 5 e  r . [0]
746To find the mean of this distribution, we multiply by r, integrate from 0 to ∞, and
747normalise the resulting equation, to obtain first moment of the distribution r :
748

 3  r  r 5  e r dr
2
749 r  0
 . [0]
r
r 2
5
e dr
0
750
751 The integrals can be solved using integration by parts and two mathematical
752identities for the odd and even terms of r (e.g., Stevenson, 1992). For the even terms,
753the following identity can be used,
754
32 32
755

1.3.5 (2n  1) 
r
2
756 2n
 e r dr  , [0]
2 n 1  n  2
1
757while odd integrals of the type,


r
2
758
2 n 1
 e r dr [0]
0
759can be solved using the substitution x2 = u and solving using integration by parts,
760e.g.,
 
1
 r  e dr   ue
2
3 r u
761 2
du . [0]
0 0
762This leads to the following solution for the mean distribution of the nearest neighbour
763distances for k = 2:
1 3  5  
7
2    15    0  15
4 3 3
764 2 . [0]
r 
1 3 7 7 1 16 0
 16 0 2 2
765The second moment r2 can be calculated in a similar manner:
r
2
2
 r 2  r 3  e r dr
3
766 r2  0
 . [0]

 0
 r  e dr
2
5 r
767 The standard deviation is calculated using the identity:  :

2
 r2  r
2
3 15 0.27572
768    2  . [0]
 0 16  0 0
769Thus, the expected Standard Error (σe) is:
0.27572
770 e  , [0]
N  0
33 33
771where N is the number of points in the area of interest. The third and fourth moments
772can be derived as above, leading to:
773

r
2
3
 r 5  e r dr
105
774 r3  0

 3 ; [0]
32 0 2
r
r 2
5
e dr
0

r
2
4
 r 5  e r dr
12
775 r4  0
 . [14]

( 0 ) 2
 r  e dr
2
5 r
776Using the moment formulae for skewness and kurtosis:

3 2 4
777 skew  r 3 3 r r2 2 r ; kurt  r 4 4 r r3 6 r r2 3 r ,
778 [15]
779 we obtain:
0.006663 0.0174838
780 skew  3 ; kurt  . [16; 17]
 4 0
2
 0
3 2
34 34
781Appendix 3: Separation of population means
782To examine the approximate number of samples required to find a statistically
783significant separation of two Gaussian distributions, where the expected means and
784standard deviations are known, a commonly-used test is to examine if zero lies within
785the following confidence interval:
786
     
787 P s 0  s1  1.96 0  1   0  1  s 0  s 1  1.96 0  1   0.95 , [1]
 N N N N 
788where s0 and s1 are the sample means, μ0 and μ1 are the expected means and σ0 and σ1
789are the expected standard deviations. N can be varied to estimate the number of
790samples required to separate the distributions. In particular, if zero is within the
791interval, then there is no discernible difference between the means of the distributions
792at the 95% confidence level.
793 This method is applied to simulate two different means based upon the
794expected values of the Poisson and the Scavenged NN when k = 1 and k = 2. To
795simplify the problem, an approximation to a normal distribution is assumed.
796 Simulation from two normal distributions with increasing numbers of N was
797coded in MATLAB. We found the number of samples required to statistically
798differentiate between the distributions by varying ρ0.
799 For PNN and Scavenged NN, k = 1:
800
ρ0 Minimum N required to separate
(objects / unit area) Poisson NN and Scavenged NN (k = 1)
0.5 50
0.05 60
0.005 60
0.0005 60
801
802For Scavenged NN, k = 1 and Scavenged NN, k = 2:
35 35
ρ0 Minimum N required to separate
(objects / unit area) Poisson NN and Scavenged NN (k = 2)
0.5 100
0.05 120
0.005 130
0.0005 150
803
804 Given that these simulations assume Normal distributions, it would be wise to
805regard them as indicative of the minimum number of data points required to separate
806the competing models. A much larger number of samples should be used separate the
807Scavenged NN, k = 1 and k = 2 models because their expected means are relatively
808close together.
36 36
809Electronic Supplements
810(1) MATLAB preparsed pseudo code and standalone MATLAB executable file,
811available to the public at http://www.geoanalysis.org.
812(2) GIAS help file
813
814Figures and Tables
815Tables
816Table 1: Notation.
817
818Table 2: Nearest neighbor model properties (after Baloga et al., 2007).
819
820Table 3: “Image Analysis” module: summary of results.
821
822Table 4: “Nearest Neighbor Analysis” module: summary of results.
823
824Figures
825Figure 1: Image analysis program in “Image Analysis” mode, with maximum object
826areas truncated to ~104 pixels for visualization purposes which excludes very large
827(and least circular) vesicles from displayed plots and histograms. Note that object
828thresholding does not affect results presented in Tables 3 and 4.
829
830Figure 2: Image analysis program in “Nearest Neighbor Analysis” mode with outputs
831displaying results for a Poisson NN test.
37 37
832Table 1.
Variable Definition
A Area of a feature field
Ahull Area of the convex hull
C Statistic for comparing observed (actual) to expected mean NN distances
k Poisson index
N Number of features within a sample population
Ni Number of features within the convex hull
Nv Number of features forming the vertices of the convex hull
NN Abbreviation for nearest neighbor
P Probability
R Statistic for comparing actual (ra) to expected (re) mean NN distances
R Radial distance
ro Threshold distance used to calculate Normalized Poisson NN distributions
ra Mean NN distance observed in the data
re Mean NN distance calculated from a given NN model (e.g., Poisson)
P Probability density
ρ0 Population (spatial) density of the input features (ρ0 = N / A)
ρi Population (spatial) density of the input features (ρi = Ni / Ahull)
Σ Standard deviation
σe Standard error
833
38 38
834Table 2.
835
Statistic Poisson Scavenge k = 1 Scavenge k = 2 Normalized NN Logistic
Probability 2 0 re  0r
2 2
2(0 ) 2 r 3e  0r 2(0 ) 3 r 5 e  0r
2 2 2
e( r  m) b
Density (P)
20 re   0 ( r0 r )
b[1  e  ( r  m ) b ]2
Mean (re) 1 3 15 2 0  m
 r e 0 dr
2
2  r
2 0 4 0 16  0 2
e  0 r0 r0
Mode (rmode) 1 3 5 1 m
2 0 2 0 2 0 2 0
Standard Error 0.26136 0.272249 0.27572 r02  r 2  1  0 N b
(σe) N 0 N 0 N 0 3
Skewness (  3) 0.008187 0.0066638 1  2  3
r (r0  3r )  r 

 2r 2 
0
3  0
3 3 3    20 
4 3 0 2  30 2  30 2
Kurtosis 2  3 2 16 0.016807 0.0174838 1  2 2 2  3  6

 r  r  4 r r  6 r 2
   3r 4

 4 ( 0 ) 2  4 0 2
 4 0 2
 4  
0 0 0
0   0  2  5
39 39
836Table 3.
Image Analysis Magmatic vesicles

(base unit: μm)
Resolution (pixel size) 3.81 μm
Minimum object area 20 pixels = 76.2 μm2
Total number of objects 537
(excluding boundary objects)
Abundance (% of total) 65.48
Area (min. to max.) 2.90 ×10 to 6.40 ×106
2
Mean area (±1 σ) 7.11 ×104 (±3.73 ×105)

Eccentricity (min. to max.) 0.00 to 0.97
Mean eccentricity (±1 σ) 0.71 (±0.18)
Perimeter (min. to max.) 56.50 to 3.55 ×104
Mean perimeter (±1 σ) 8.64 ×102 (±2.29 ×103)
Orientation (min. to max.) -89.32 to 90.00
Mean orientation (±1 σ) 7.27 (±43.91)
40 40
837Table 4.
Nearest Neighbor (NN) Magmatic vesicles

Analysis (base unit: μm)
Input distribution properties
Resolution (pixel size) 3.81 μm
Minimum object area 20 pixels = 76.2 μm2
Objects inside convex hull (Ni) 520
Interior object density (ρi) 7.79 ×10-6
Range of NN distances 28.05 to 1.53 ×103
Mean NN distance (ra; ±1 σ) 1.75 ×102 (±1.15 ×102)
Skewness 2.48
Kurtosis 16.13
Poisson model
Expected mean NN distance (re) 1.79 ×102
R with thresholds at 2 σ 0.98 (0.97 and 1.07)
c with thresholds at 2 σ -0.91 (-1.33 and 2.83)
Conclusions (c value) Model supported (<2 σ)
Normalized Poisson model

Distance threshold 5 pixels = 19.05 μm
c with thresholds at 2 σ 3.51 (-1.34 and 2.88)
Conclusions (c value) Model rejected (>2 σ)
Repelled vs. Normalized
Scavenged (k = 1) model
Clustered vs. k = 1
Scavenged (k = 2) model
Clustered vs. k = 2
838
41 41
839Figure 1
840
42 42
841Figure 2
842
43 43

New Image Processing Software For Analyzing Object Size-Frequency Distributions, Geometry, Orientation, and Spatial Distribution

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

New Image Processing Software For Analyzing Object Size-Frequency Distributions, Geometry, Orientation, and Spatial Distribution

Uploaded by

Copyright:

Available Formats

1New image processing software for analyzing object size-frequency

2distributions, geometry, orientation, and spatial distribution*

3Ciarán Beggan1 and Christopher W. Hamilton2

41 Grant Institute, School of GeoSciences, University of Edinburgh, UK

52 Hawai‘i Institute of Geophysics and Planetology, University of Hawai‘i, USA

7Corresponding author: Ciarán Beggan, E-mail: ciaran.beggan@ed.ac.uk;

8Phone: +44 131 650 5916; Address: School of GeoSciences, Grant

9Institute, West Mains Road, Edinburgh, EH9 3JW, UK

11*Code available from server at http://www.geoanalysis.org

15object area, abundance, radius, perimeter, eccentricity, orientation, and centroid

17objects using sample-size-dependent nearest neighbor (NN) statistics. The NN

19and (4) Scavenged k = 2 NN distributions. GIAS is implemented in MATLAB with a

21with MATLAB, or as a stand-alone application that runs on Windows and Unix

24spatial organization of a wide range of geological features. This information expedites

25quantitative measurements of 2D object properties, provides criteria for validating the

27standardized NN methodology that can be used to compare the results of different

28geospatial studies and identify objects using non-morphological parameters.

30Key Words: Image analysis, software, size-frequency, spatial distribution, nearest

31neighbor, vesicles, volcanology

34Geological image analysis extracts information from representations of natural objects

35that may either be captured by an imaging system (e.g., photomicrographs, aerial

36photographs, digital satellite imagery) or schematically rendered into visual form

37(e.g., geological maps). In addition to examining the properties of individual objects,

40 ImageJ (Rasbald, 2005) is a commonly used image processing application that

41was developed as an open source Java-based program by the National Institutes of

43including the analysis of vesicle size-frequency distributions within geological thin-

47analytical tools for investigating patterns of spatial organization.

48 To take greater advantage of the information contained within geological

49images, we have developed Geological Image Analysis Software (GIAS). This

55sample-size-dependent analyses of NN distributions.

58Nearest neighbor (NN) analysis is well-suited for investigating patterns of spatial

59distribution within intrinsically two-dimensional (2D) datasets, such as orthorectified

60aerial photographs and satellite imagery. Applications of NN analyses to remote

67between different datasets is limited by the lack of a standardized NN methodology—

68particularly in terms of defining feature field areas, thresholds of significance, and

69criteria for apply higher-order NN methods.

70 In addition to remote sensing applications, NN analyses can be used to study

72Cheadle, 2000) and vesicles. In this study, we emphasize the application of NN

74refute) the assumption of randomness, which is a prerequisite for effectively applying

76selecting appropriate statistical models to characterize those vesicle populations.

77 Vesicle textures preserve information about the pre-eruptive history of

84physics of crystal nucleation and growth dynamics to derive an analytical formulation

90vesicles. Nested photomicrographs solve this problem because photomicrographs

91captured at multiple magnifications enable the reconstruction of total vesicle size-

92frequency distributions (e.g., Adams et al., 2006; Gurioli et al., 2008).

93 In general, vesicle studies are limited by two major factors: transformations of

96of reliable statistical characterizations of vesicle populations in terms of distribution

99techniques for investigating vesicle populations; however, a single cross-section

101objects viewed in a 2D section can be characterized using 2D reference textures with

103Proussevitch et al., 2007a). Although synchrotron X-ray tomography is increasingly

110photomicrograph and establish if objects fulfil the criteria of spatial randomness,

111which is required for transforming 2D sections into 3D models using stereology.

1133. Nearest neighbor (NN) analysis

115actual mean NN distance (ra) in a population of known density is compared to the

116expected mean NN distance (re) within a randomly distributed population of

118(σe) of the Poisson distribution are:

122R and c) are used to determine if ra follows a Poisson random distribution:

125have a value of 1, while c should equal 0. If R is approximately equal to 1, then the

130clustering relative to the Poisson model.

137use of sample-size-dependent calculations of R and c thresholds.

139approximated as points. However, for frameworks of solid objects, NN statistics

140exhibit scale-dependent variations in randomness due to the restrictions imposed by

141avoiding overlapping object placements. For 2D sections through 3D distributions of