Abstract
Univariate diagrams such as binned frequency histograms and probability density distributions are often used for the
initial assessment and communication of geochronological data. Both diagram types are estimates of the sample
distribution and both have inherent limitations that are not widely appreciated. Binned frequency histograms are
effective at conveying frequency information, but analytical error is discarded and appearance is vulnerable to bias due
to arbitrary decisions about bin width. A method for assessing the efficiency of bin widths is presented. Probability
density distributions use a variable Gaussian kernel method that accounts for the analytical error of each datum.
While providing standardization of the display, these diagrams are limited by the lack of visually accessible frequency
information. An approach combining elements of both histograms and probability density distributions is proposed.
All methods are implemented in an Excel workbook, and the procedures for using it are explained.
© 2003 Elsevier Ltd. All rights reserved.
0098-3004/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cageo.2003.09.006
22 K.N. Sircombe / Computers & Geosciences 30 (2004) 21–31
effect, although the mean of the age estimate distribution is 2015 Ma and that value lies within the bin limits of 2000 and 2020 Ma, there is a 70.98% probability that the “true” value of the age lies outside the bin limits. In comparison, 99.38% of the age estimate defined by the second analysis (2015 ± 2 Ma) lies within the bin limits, so the bin may be considered representative. However, if the bin width was the same, but the limits were, for example, 1996 and 2016 Ma, then even the relatively more precise second analysis would have 30.85% of its age estimate outside the bin limits.

The “efficiency” of how well each age estimate is captured by a particular bin can be derived by calculating the proportion of the Gaussian distribution within the bounds of the bin:

$$E_i = \frac{1}{\sqrt{2\pi}} \int_{z_{L_i}}^{z_{U_i}} \exp\left(-x^2/2\right)\,dx, \qquad (2)$$

where the bounds of bin $j$ about $x_i$ are given by

$$z_{L_i} = \frac{(x_0 + jh) - x_i}{e_i}, \qquad z_{U_i} = \frac{(x_0 + (j+1)h) - x_i}{e_i}, \qquad (3)$$

where $e_i$ is the standard deviation of each Gaussian distribution.

The mean of the individual efficiency values is defined as a proxy for the efficiency of a particular bin width at representing the age data. Fig. 2 illustrates the relationship between histogram efficiency and bin width for a variety of age data, both randomly generated with given mean errors and real data from the Slave Province (Sircombe et al., 2001). For data with ~1 Myr errors (more typical of thermal ionization mass spectrometry) a bin width of 10 Myr is sufficient to capture 90% of the age estimates. For the Dwyer Lake and George Lake samples, representing typical SHRIMP zircon data, a bin width of 20 Myr is needed for 50% efficiency and over 75 Myr for 90% efficiency. Table 1 lists further bin widths required for 50% and 90% efficiency levels depending upon mean error.

Table 1
Bin widths (in Myr) required to reach 50% and 90% efficiency in a set of age data with various mean errors based on empirical analysis described in text

Mean error (Myr)    Bin width (Myr) required for
                    >50% efficiency    >90% efficiency
 1                   2                  10
 2                   5                  20
 5                  10                  40
10                  15                 100
15                  20                 120
20                  30                 160
30                  45                 220
50                  75                 360

3.3. Bin size limitation

The illustration of a histogram bin’s efficiency at capturing age estimates also highlights the second limitation of histogram use, i.e. the size and location of the bins themselves. A histogram’s appearance, and thus its potential interpretation, is a balance between too much detail with narrow bin widths (undersmoothing) and too little detail with wide bin widths (oversmoothing). In a variety of published cases (Table 2) the choice of bin width ranges from 5 to 100 Myr. Using SHRIMP derived data, Morton et al. (1996, p. 917) defined bin
Fig. 2. Relationship between bin width and bin width efficiency (mean proportion of age estimates within same bin as age) for a variety
of real and simulated age data with mean errors ranging from 1 to 30 Myr.
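The efficiency calculation of Eqs. (2) and (3) can be sketched in a few lines of Python. This is an illustrative sketch rather than the workbook's own macro: the function names are hypothetical, and $x_0$ and $h$ are assumed to be the histogram origin and bin width, as the notation suggests. The integral in Eq. (2) is evaluated as a difference of standard normal cumulative probabilities.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bin_efficiency(age, error, origin, width):
    """Proportion of one Gaussian age estimate inside the bin holding its mean.

    Assumption: `origin` plays the role of x0 and `width` the role of h in Eq. (3).
    """
    j = math.floor((age - origin) / width)               # index of the bin containing the age
    z_lower = (origin + j * width - age) / error         # z_L in Eq. (3)
    z_upper = (origin + (j + 1) * width - age) / error   # z_U in Eq. (3)
    return normal_cdf(z_upper) - normal_cdf(z_lower)     # Eq. (2)

def histogram_efficiency(ages, errors, origin, width):
    """Mean of the individual efficiencies, used as a proxy for the bin width."""
    values = [bin_efficiency(a, e, origin, width) for a, e in zip(ages, errors)]
    return sum(values) / len(values)

# The 2015 +/- 2 Ma analysis from the text, with two choices of bin limits.
inside_a = bin_efficiency(2015.0, 2.0, origin=2000.0, width=20.0)
inside_b = bin_efficiency(2015.0, 2.0, origin=1996.0, width=20.0)
print(f"{100 * inside_a:.2f}% inside 2000-2020 Ma")         # 99.38% inside 2000-2020 Ma
print(f"{100 * (1 - inside_b):.2f}% outside 1996-2016 Ma")  # 30.85% outside 1996-2016 Ma
```

Both printed values reproduce the percentages quoted in the text for the more precise analysis, which also confirms that its ± 2 Ma error is treated as one standard deviation.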
Table 2
Examples of bin width and other details of binned frequency histogram displays from a variety of references

Bin width (Myr)   Range (Myr)   Bin width as % range   Mean error (Myr)   Efficiency (E)   Reference
  5                 70          7.14                    2^a                70%              Davis et al. (1994), Fig. 8B
 20               3400          0.588                   5^a                80%              Gehrels and Dickinson (1995), Fig. 6
 25               4000          0.625                  20^b                60%              Morton et al. (1996), Fig. 3
 33.33            1900          1.75                   50^c                30%              Scott and Gauthier (1996), Fig. 4
100               2800          3.57                    3^a                98%              Roback and Walker (1995), Fig. 8^d

Mean error and efficiency (E; explained in text) calculated for a subset of presented data and intended only for broad indicative purposes.
^a Thermal ionization mass-spectrometer analyses.
^b SHRIMP analyses.
^c Laser-ablation microprobe inductively coupled mass spectrometer analyses.
^d Roback and Walker (1995) histograms compiled from data in Ross et al. (1991, 1992).
Fig. 4. Example of individual and accumulated age estimates. (a) Individual age estimate with a small error, (b) age estimate with a large error, (c) accumulated density distribution with six contributing age estimates in two modes. Although it is the area beneath the curve that is important (see text), the probability distribution can be read as follows: the probability of an age of 255 Ma within the distribution is 0.015, or 1.5%.
the Dwyer Lake sample data discussed here, Eqs. (4) and (8) yield optimal bin widths of 136 and 210 Myr. In comparison, another example set of detrital zircon age data (George Lake metagreywacke) yields optimal bin widths of 36 and 22 Myr respectively. This illustrates that such optimal binning methods do not necessarily produce a standard bin width for easy visual comparison between different sets of data. Critically, because these calculations assume an underlying Gaussian distribution in the sample data, they may not be applicable to sets of data that have non-Gaussian distributions (Scott, 1979, 1992). Detrital zircon analyses are typically widely dispersed and the optimal bin width calculations may only be relevant for zircon analyses producing single age modes.

Finally, the selection of an origin in a binned histogram display can also significantly alter the appearance and potential interpretation of the data (Simonoff and Udina, 1997). This effect should be minimal for age data because, for the general aesthetic sense of ‘rounded’ limits proposed by Doane (1985), the origin should either be 0 Ma or an integer multiple of the bin width. For example, with a bin width of 25 Myr and minimum value at 2037 Ma, the histogram origin would not be 2036 Ma, rather it would be 2000 or 2025 Ma. Any selection of a less orthodox origin would require detailed justification in terms of potential instability in the appearance of the histogram (Simonoff and Udina, 1997).

4. Probability density distributions

4.1. Mathematical definition

A graphical approach that attempts to address the limitations of binned frequency histograms is the probability density distribution (PDD) diagram (e.g. Fig. 3; Dodson et al., 1988).^2 Technically, the age PDD is another estimate of the “true” sample distribution. The PDD produces a density estimate of the sample distribution using a Gaussian kernel (Silverman, 1986) that varies with each individual age estimate. The shape of these age estimates, and thus the kernel, will vary from a narrow, tall distribution if the error is small (Fig. 4a), to a wide shallow distribution if the error is large (Fig. 4b). These individual distributions are summed together to form the age PDD function, $f(t)$, for the sample being examined (e.g. Fig. 4c) using the following formula:

$$f(t) = \sum_{i=1}^{N} \frac{1}{e_i\sqrt{2\pi}} \exp\left(-\frac{(t - x_i)^2}{2e_i^2}\right), \qquad (9)$$

where $x_i$ is the $i$th age measurement and $e_i$ is the $i$th analytical error, $t$ is the age and $N$ is the sample size. In practice, the distribution function can be approximated by assessing the value of $f(t)$ in fixed increments (typically 1 Myr) across a range that encompasses the required data. Because geochronological results are typically reported to a round Ma value, standard increments of 1 Myr are recommended as a suitable estimation for the distribution function. In some cases, for instance younger ages, the size of these increments may be reduced further to ensure a smooth appearance. For the distribution to be a true probability distribution it must be scaled so that the cumulative total sums to one, i.e. the distribution function Eq. (9) should be divided by $N$. This approach also ensures that the diagram is standardized for comparative purposes. It is recommended that the probability scale on the y-axis is retained to allow meaningful comparison between sets of data. The number of individual analyses contributing to the distribution should also be clearly indicated on the diagram.

^2 The original probability plots applied to detrital zircon age data were presented in Dodson et al. (1988) and were the product of a technique and program (“Nouveau Stats”) developed by Dr. P. Zeitler then at the Research School of Earth Sciences of the Australian National University (I.S. Williams, written comment). The application of the method to 40Ar–39Ar data had been demonstrated earlier by Jessberger et al. (1980).

4.2. PDD application

proportion information. The height of the curve in a PDD diagram is a function of both quantity and precision rather than simply quantity, i.e. a precise analysis will have a tall peak (e.g. Fig. 4a) that may compare, height-wise, with a peak of accumulated less precise analyses (e.g. Fig. 4c). Area is not an easily recognized attribute of a diagram and thus frequency information may be lost to the observer. For instance in Fig. 4c, the distribution contains six individual analyses, three in each mode, but the left-hand peak is higher, suggesting that it is the dominant mode.
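The behaviour described here can be checked numerically. The sketch below evaluates Eq. (9), divided by $N$, in 1 Myr increments. It is an illustration under stated assumptions, not the workbook's macro: the function name `pdd` and the six ages and errors are hypothetical values chosen to mimic a two-mode case like Fig. 4c, with three precise analyses in the younger mode and three imprecise analyses in the older one.

```python
import math

def pdd(ages, errors, origin, end, increment=1.0):
    """Evaluate Eq. (9), scaled by 1/N, on a regular grid from origin to end."""
    n = len(ages)
    steps = int(round((end - origin) / increment))
    grid = [origin + k * increment for k in range(steps + 1)]
    density = []
    for t in grid:
        total = 0.0
        for x, e in zip(ages, errors):
            # One Gaussian kernel per analysis; its width is that analysis's error.
            total += math.exp(-((t - x) ** 2) / (2.0 * e ** 2)) / (e * math.sqrt(2.0 * math.pi))
        density.append(total / n)  # dividing by N makes the total area one
    return grid, density

# Hypothetical data (not from the paper): two modes of three analyses each.
ages = [250.0, 252.0, 254.0, 400.0, 403.0, 406.0]
errors = [3.0, 3.0, 3.0, 15.0, 15.0, 15.0]
increment = 1.0
grid, density = pdd(ages, errors, origin=150.0, end=550.0, increment=increment)

split = 320.0  # arbitrary age separating the two modes
left = [f for t, f in zip(grid, density) if t < split]
right = [f for t, f in zip(grid, density) if t >= split]
left_area, right_area = sum(left) * increment, sum(right) * increment
left_peak, right_peak = max(left), max(right)

print(f"area:  left {left_area:.3f}, right {right_area:.3f}")
print(f"peak:  left {left_peak:.4f}, right {right_peak:.4f}")
```

With these inputs each mode carries about half of the total area, matching its share of the analyses, yet the peak of the precise mode is several times taller. Multiplying a curve value by the increment also gives the kind of reading described in the Fig. 4 caption, i.e. the probability assigned to a 1 Myr step.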
Fig. 5. Illustration of relative heterogeneity values. (a) Strongly unimodal sample, (b) polymodal sample.
5. Combined display
Calculations and chart production are a combination of automatic and macro-based procedures. User input cells are indicated by a white background and a yellow background indicates unalterable cells where the application has made calculations. Newly generated charts will replace previous charts on the same worksheet, and, after being produced, charts can be altered, if required,

Age data is entered in the Data Entry worksheet as age, error and concordance (Fig. 7). A spot/analysis identification is optional. Error must be 1 s.e. The sample name can be added, and will be used as the basis for chart titles and output filenames. Data is filtered at two stages. Firstly, the data will be filtered by the concordance value according to the value given in the Filter cell. For instance, entering 90 will exclude all data with concordance below 90 or above 110 from further processing. The data is also filtered via the fifth data column headed Use? Any character entered in this column will ensure the age data is used in further calculations, assuming it passes the concordance filter. This provides a means for the user to concentrate processing on data of interest without deleting information.

This worksheet automatically calculates parameters such as mean age and mean error, along with optimal bin width as calculated by the Doane (1985) and Scott (1979) methods (Eqs. (8) and (4) respectively). The worksheet also contains a button Create Report linked to a macro that will create a copy of the workbook without the attached objects, formulae and macros. This can be used when the user is satisfied with the results of the analysis and wishes to capture a final version.

Fig. 7. Screen shot of Data Entry worksheet illustrating areas for entering age data and errors along with automatic calculations.

range displayed on the chart), end (Ma, end of the range displayed on the chart) and binwidth (Myr). The Create Histogram button is linked to a macro that uses the user-input values and the concordant age estimates in the Data Entry worksheet to produce a binned frequency histogram chart.

The validity of the user input values is checked prior to calculation. If any age estimates are either below the origin or above the end of the user-specified range these will be flagged immediately below the input cells. It is at the user’s discretion whether this indicates that the specified range is not adequate for the data being charted. The user-input values will also be overwritten to the Histogram Efficiency worksheet and an efficiency value calculated.

6.4. Probability Density Dist.
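The histogram and combined-display macros work from the concordant age estimates in the Data Entry worksheet, so the two-stage filter described for that worksheet decides which analyses reach any chart. A minimal sketch of that filter follows. The function name, field names and sample records are hypothetical, and the symmetric window about 100% is an assumption generalized from the single example in the text (a filter value of 90 keeps concordance from 90 to 110).

```python
def passes_filters(concordance, use_flag, filter_level):
    """Return True if an analysis survives both filtering stages."""
    # Assumption: the window is symmetric about 100%, generalizing the
    # text's example in which a filter value of 90 keeps 90 to 110.
    upper = 200.0 - filter_level
    concordant = filter_level <= concordance <= upper
    # Any character in the Use? column marks the analysis for use.
    selected = use_flag != ""
    return concordant and selected

# Hypothetical records standing in for rows of the Data Entry worksheet.
analyses = [
    {"spot": "z1", "age": 2015.0, "error": 2.0, "concordance": 99.0, "use": "y"},
    {"spot": "z2", "age": 2680.0, "error": 12.0, "concordance": 86.0, "use": "y"},
    {"spot": "z3", "age": 2710.0, "error": 9.0, "concordance": 104.0, "use": ""},
    {"spot": "z4", "age": 2590.0, "error": 15.0, "concordance": 112.0, "use": "y"},
    {"spot": "z5", "age": 2655.0, "error": 10.0, "concordance": 110.0, "use": "x"},
]

kept = [a["spot"] for a in analyses if passes_filters(a["concordance"], a["use"], 90.0)]
print(kept)
```

Here z2 and z4 fail the concordance window and z3 is concordant but unmarked, so only z1 and z5 remain. Nothing is deleted from the list, which mirrors the stated aim of concentrating on data of interest without losing information.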
Fig. 8. Screen shot of Regular Histogram worksheet illustrating user-input cells for origin, end and bin-width, Create Histogram
button and automatic calculation of histogram efficiency.
Fig. 9. Screen shot of Probability Density Dist. worksheet illustrating user inputs origin, end and increment. Create PDD and Create
DXF buttons also shown.
flagged immediately below the user input cells. It is at the user’s discretion whether this indicates that the specified range is not adequate for the data being charted. The macro also calculates the local maxima of the distribution following the approach of Scott (1992) and outputs these results beside the distribution results.

The worksheet also has a Create DXF button linked to a macro that produces a rudimentary DXF format file of the PDD along with simple lines representing the x- and y-axis based on the scale of the chart. This DXF file can be imported into a variety of graphing/drawing packages. Because the PDD is represented as a single and continuous curve, it can help avoid some of the problems associated with directly cutting and pasting the Excel chart.

Fig. 10. Screen shot of Combined Display worksheet illustrating user inputs: origin, end, increment and bin width. Create combined diagram and Create DXF buttons also shown.

6.5. Combined display

The Combined Display worksheet produces a display combining two PDDs of all data and concordance-filtered data along with a histogram of concordance-filtered data (Fig. 10). The user inputs origin (Ma), end (Ma), increment (Myr), bin width (Myr) and level of concordance filtering (%). The increment is typically in 1 Myr steps, but this can be altered if required. Data is copied from the Data Entry worksheet, but the user does have the option of altering the Use? column as required. The Create combined diagram button is linked to a macro that performs the calculations based on user input and concordant age estimate data in the Data Entry worksheet. The left y-axis of the combined chart records the probability value of the PDDs and the right
y-axis records the frequency value of the histogram. The title of the chart also records the concordant number of data in the range against the total number of data (“n = ”) and the concordance filter level. The macro also flags any data beyond the user-specified range.

The worksheet also links to another macro via the Create DXF button to produce a rudimentary DXF format file for importing into graphics applications.

Acknowledgements

Portions of this work were developed while supported by a NSERC Canadian Laboratories Visiting Fellowship at the Geological Survey of Canada, Ottawa (1998–2000). Recent work and updates have been supported by a University of Western Australia Postdoctoral Fellowship (2001–2002) and Australian Research Council Discovery Grant DP0208797. P. Cawood and two anonymous reviewers provided helpful comments. TSRC Contribution no. 256.

References

Davis, D.W., Hirdes, W., Schaltegger, U., Nunoo, E.A., 1994. U–Pb age constraints on deposition and provenance of Birimian and gold-bearing Tarkwaian sediments in Ghana, West Africa. Precambrian Research 67, 89–107.
DeGraaff-Surpless, K., Graham, S.A., Wooden, J.L., McWilliams, M.O., 2002. Detrital zircon provenance analysis of the Great Valley Group, California: evolution of an arc–forearc system. Geological Society of America Bulletin 114, 1564–1580.
Doane, D.P., 1985. Aesthetic frequency classifications. The American Statistician 30, 181–183.
Dodson, M.H., Compston, W., Williams, I.S., Wilson, J.F., 1988. A search for ancient detrital zircons in Zimbabwean sediments. Journal of the Geological Society, London 145, 977–983.
Fergusson, C.L., Carr, P.F., Fanning, C.M., Green, T.J., 2001. Proterozoic-Cambrian detrital zircon and monazite ages from the Anakie Inlier, central Queensland: Grenville and Pacific-Gondwana signatures. Australian Journal of Earth Sciences 48, 857–866.
Gehrels, G.E., Dickinson, W.R., 1995. Detrital zircon provenance of Cambrian to Triassic miogeoclinal and eugeoclinal strata in Nevada. American Journal of Science 295, 18–48.
Harley, S.L., Black, L.P., 1997. A revised Archaean chronology for the Napier Complex, Enderby Land, from SHRIMP ion-microprobe studies. Antarctic Science 9, 74–91.
Jessberger, E.K., Dominik, B., Staudacher, T., Herzog, G.F., 1980. 40Ar–39Ar Ages of Allende. Icarus 42, 380–405.
Morton, A.C., Claoué-Long, J.C., Berge, C., 1996. SHRIMP constraints on sediment provenance and transport history in the Mesozoic Statfjord Formation, North Sea. Journal of the Geological Society, London 153, 915–929.
Nutman, A.P., 2001. On the scarcity of >3900 Ma detrital zircons in ≥3500 Ma metasediments. Precambrian Geology 105, 93–114.
Pell, S.D., Williams, I.S., Chivas, A.R., 1997. The use of protolith zircon-age fingerprints in determining the protosource areas for some Australian dunes sands. Sedimentary Geology 109, 233–260.
Pelto, C.R., 1954. Mapping of multicomponent systems. Journal of Geology 62, 501–511.
Rainbird, R.H., McNicoll, V.J., Thériault, R.J., Heaman, L.M., Abbott, J.G., Long, D.G.F., Thorkelson, D.J., 1997. Pan-continental river system draining Grenville Orogen recorded by U–Pb and Sm–Nd geochronology of Neoproterozoic quartzarenites and mudrocks, Northwestern Canada. The Journal of Geology 105, 1–17.
Roback, R.C., Walker, N.W., 1995. Provenance, detrital zircon U–Pb geochronology, and tectonic significance of Permian to Lower Triassic sandstone in southeastern Quesnellia, British Columbia and Washington. Geological Society of America Bulletin 107, 665–675.
Ross, G.M., Parrish, R.R., Dudás, F.O., 1991. Provenance of the Bonner Formation (Belt Supergroup), Montana: insights from U–Pb and Sm–Nd analyses of detrital minerals. Geology 19, 340–343.
Ross, G.M., Parrish, R.R., Winston, D., 1992. Provenance and U–Pb geochronology of the Mesoproterozoic Belt Supergroup (northwestern United States): implications for age of deposition and pre-panthalassa plate reconstructions. Earth and Planetary Science Letters 113, 57–76.
Sambridge, M.S., Compston, W., 1994. Mixture modelling of multi-component data sets with application to ion-probe zircon ages. Earth and Planetary Science Letters 128, 373–390.
Scott, D.W., 1979. On optimal and data-based histograms. Biometrika 66, 605–610.
Scott, D.W., 1992. Multivariate Density Estimation: Theory, Practice and Visualization. Wiley, New York, 376pp.
Scott, D.J., Gauthier, G., 1996. Comparison of TIMS (U–Pb) and laser ablation microprobe ICP-MS (Pb) techniques for age determination of detrital zircons from Paleoproterozoic metasedimentary rocks from northeastern Laurentia, Canada, with tectonic implications. Chemical Geology 131, 127–142.
Shannon, C.E., Weaver, W., 1949. The Mathematical Theory of Communication. University of Illinois Press, Illinois, 117pp.
Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman and Hall, London, 175pp.
Simonoff, J.S., Udina, F., 1997. Measuring the stability of histogram appearance when the anchor position is changed. Computational Statistics and Data Analysis 23, 335–353.
Sircombe, K.N., 1999. Tracing provenance through the isotope ages of littoral and sedimentary detrital zircon, eastern Australia. Sedimentary Geology 124, 47–67.
Sircombe, K.N., 2000. The usefulness and limitations of binned frequency histograms and probability density distributions for displaying absolute age data. Radiogenic age and isotopic studies, Report 13, Geological Survey of Canada, Current Research 2000-F2, 11pp.
Sircombe, K.N., Bleeker, W., Stern, R.A., 2001. Detrital zircon geochronology and grain-size analysis of a ~2800 Ma Mesoarchean proto-cratonic cover succession, Slave Province, Canada. Earth and Planetary Science Letters 189, 207–220.
Sircombe, K.N., Stern, R.A., 2002. An investigation of artificial biasing in detrital zircon U–Pb geochronology due to magnetic separation in sample preparation. Geochimica et Cosmochimica Acta 66, 2379–2397.
Smosna, R., Bruner, K.R., Burns, A., 1999. Numerical analysis of sandstone composition, provenance, and paleogeography. Journal of Sedimentary Research 69, 1063–1070.
Sturges, H.A., 1926. The choice of a class interval. Journal of the American Statistical Association 21, 65–66.
Wand, M.P., 1996. Data-based choice of histogram bin width. University of New South Wales, Australian Graduate School of Management Working Paper Series 95-011, 14pp.