
Minerals Engineering 34 (2012) 70–77


Statistical methods to compare batch flotation grade-recovery curves and rate constants
T.J. Napier-Munn ⇑
Julius Kruttschnitt Mineral Research Centre, The University of Queensland, Australia

Article info

Article history:
Received 5 February 2012
Accepted 30 March 2012
Available online 27 May 2012

Keywords:
Froth flotation
Modelling
Statistics

Abstract

Grade-recovery curves obtained from kinetic batch flotation testing are, like any other measurement, subject to experimental error. This leads to uncertainty in the true position of each cumulative grade-recovery point, the curve itself, and the kinetics. This uncertainty is rarely if ever taken into account when interpreting such curves, in particular when comparing curves obtained under different conditions. This paper proposes a methodology to deal with this problem.
The standard formula is used to establish true confidence intervals for the grade and recovery at each replicated timed concentrate point, and the 2-sample t-test is used to compare these point values between tests conducted under different conditions. The properties of the grade-recovery curves can be compared by fitting an appropriate model to the two data sets and using a bootstrap to create distributions of differences between the model parameters and the model predictions of recovery at any chosen concentrate grade, reflecting the uncertainty in the original data. It is then easy to construct hypothesis tests on the parameter differences and on the mean difference at the chosen grade(s) between the two curves. The same approach can be used to construct confidence intervals on the fitted curves and to test differences in estimated flotation rates. An extra sum of squares test can be used to compare the fitted grade-recovery curves as a whole. Details of the methods are presented, suitable for spreadsheets.
These methods are relatively easy to apply but require that all batch flotation tests be replicated. The alternative (single tests under each condition) ignores the existence of experimental error and renders the data susceptible to subjective and perhaps erroneous interpretation. Using these methods it is not unusual to find that grade-recovery curves thought to represent truly different flotation performance are not in fact statistically different, especially at the longer flotation times (high recoveries).
© 2012 Elsevier Ltd. All rights reserved.

⇑ Address: Julius Kruttschnitt Mineral Research Centre, University Mine, Isles Road, Indooroopilly, Queensland 4068, Australia. Tel.: +61 7 3365 5888; fax: +61 7 3365 5999.
E-mail address: t.napier-munn@uq.edu.au

0892-6875/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.mineng.2012.03.036

1. Introduction and problem statement

Batch flotation testing is widely used to assess the flotation response of ores under particular conditions. The test usually comprises some standard procedure for grinding the ore and floating it in a batch cell. Sequential timed concentrates are taken to allow a cumulative concentrate grade-recovery curve to be constructed. This curve contains much useful information including the trade-off between grade and recovery and the kinetic characteristics implied in the shape and location of the curve. Batch flotation testing is described by Runge (2010).
Grade-recovery curves are used for many purposes. For example, in future ores testing on a mine site it is common practice to compare the flotation response of future ores to the currently treated material to determine whether the grind or flotation conditions will need to be changed. In testing new reagents or grinding media it is necessary to compare the flotation response of the new condition with the standard or current alternative to determine whether there has been an improvement. Mineral engineers seek to move the whole curve to a higher grade-recovery position through modifications to conditions. The ultimate theoretical curve is that prescribed by the liberation characteristics of the ore (Nice and Brown, 1995).
Fig. 1 shows two grade-recovery curves in which a copper ore was floated with a standard collector (A) and then with an alternative collector (B), to determine if collector B could give an improved performance. Four sequential timed concentrates were taken in each case. Test B appears to show faster flotation initially, but the curves approach each other as the flotation time increases (from right to left), with final recoveries which are similar but with Test B giving a higher concentrate grade.
It is almost inevitable in such testing that grade-recovery curves will be compared in some way. Comparisons are usually made by eye with no reference to the inevitable experimental uncertainties inherent in the construction of the curve. Curves are implicitly
deemed to be truly different, and elaborate explanations are then concocted for the observed difference, especially where the nature of the difference coincides with the natural prejudices of the experimenter. In this case it seems that Test B demonstrates a superior performance, with higher recoveries for a given concentrate grade over most of the range, though the curves appear to converge at the final concentrate where recovery is around 88%.

Fig. 1. Batch flotation grade-recovery curves for a copper ore with two different collectors. (Axes: cumulative concentrate grade, % Cu, against cumulative Cu recovery, %; series Test A and Test B.)

In fact, as will be shown, many such comparisons are invalid because there is no statistically significant difference between the curves, over all or some of the range. This is due to the inherent uncertainties in the position of the curve due to the malign though inevitable influence of experimental error. This error arises from many sources, including sampling (Gy's (1979) guidelines should be followed), feed preparation, the flotation process itself (especially if manual scraping of the concentrate froth is used), assaying, and metallurgical balancing including the method of calculating recovery (for example whether or not the measured head grade is included in the balance). Experimental error can be minimised by good experimental practice, but it can never be eliminated and is often larger than the experimenter expects.
Two questions then arise:

• How can we report the uncertainty in the grade-recovery curve?
• How can we compare two such curves to determine if they are truly different?

This paper suggests that point-by-point uncertainty is best expressed by the standard formula for a confidence interval. Grade-recovery points at the same flotation time can be compared statistically using the t-test. The significance of the difference in recovery at any given concentrate grade between two tests conducted under different conditions can be determined by constructing a hypothesis test around the values predicted by a model fitted to the data many times using bootstrapping. A similar approach can be used to compare model parameters including rate constants, and to construct confidence intervals around the full curves.
These methods require that batch flotation testing be replicated in order to measure experimental error. Some enlightened organisations already do this routinely, though these are the exceptions and one might also argue that the best use is not yet being made of the extra information in the replicates. Most however do not.
This paper presents methods to allow statistically valid decisions to be made when comparing data from different batch flotation tests.

2. Confidence intervals and t-tests for grade-recovery points

The tests shown in Fig. 1 were in fact conducted in triplicate, that is, each collector was tested three separate times with the same ore under identical conditions, generating three sets of grade-recovery points. Fig. 1 shows the curves connecting the mean grade-recovery points for each of the four timed concentrates. Fig. 2 shows all the replicated data.

Fig. 2. Batch flotation grade-recovery points for a copper ore with two different collectors – triply replicated. (Axes: cumulative concentrate grade, % Cu, against cumulative Cu recovery, %; series Test A and Test B.)

Because the tests were replicated, confidence intervals can be calculated separately for each timed mean grade and recovery. The formula for the confidence interval, CI, is the usual one:

\[ \mathrm{CI} = \pm \frac{t_\alpha \, s}{\sqrt{n}} \qquad (1) \]

where t_α is the t-value for a 2-sided confidence level of 100(1 − α)% with n − 1 degrees of freedom, s the standard deviation of the sample, and n is the number of replicates in the sample.
The t-value can be obtained from a table of the t-distribution or from the Excel function =TINV(α, n − 1). Here n = 3 and t = 2.92 for 90% confidence (t = 4.30 for 95% confidence).
Eq. (1) should be applied separately to the cumulative recovery and grade at each flotation time. Table 1 shows the means, standard deviations and 90% confidence intervals for the recovery and enrichment ratio (ER) for the first and last concentrates for Tests A and B. ER is the ratio of concentrate grade to head feed grade and is used in preference to concentrate grade to reduce the effect of variations in feed grade. Fig. 3 shows the two curves with the mean grade-recovery points and the 90% confidence intervals in both recovery and grade expressed as error bars in the y-direction and x-direction respectively.
It is clear that there is considerable uncertainty at each point, in both ER and recovery. This is a product of the relatively high standard deviations, and the low value of n. The standard deviations for both recovery (in the range 1–2%) and ER (in the range 1–3) are however normal for base metal batch flotation, so confidence intervals such as these are not unusual. Smaller confidence intervals can be achieved, as Eq. (1) demonstrates, by reducing experimental error (s) and/or increasing the number of replicates (n). The number of replicates required to achieve a desired bound on an experimental mean with a given confidence can be calculated from Eq. (2).

\[ n = \left( \frac{z_\alpha \, \sigma}{B} \right)^{2} \qquad (2) \]

where n is the number of replicates required (sample size), z_α the normal ordinate for 2-sided confidence of 100(1 − α)% (from tables or from the Excel function =NORMSINV(1 − α/2)), σ the expected standard deviation (assumed stable and 'known' from extensive prior testing), and B is the 2-sided bound desired on the mean.
Thus if we wished to determine the mean recovery at a particular time to ±2% with 90% confidence and the standard deviation of recovery under these conditions was known to be 2%, then three replicates would be sufficient (z_α = 1.64 for 90% confidence). However if we wished to determine mean recovery to ±1% then 11 replicates would be required. Fig. 4 shows the size of sample required as a function of the ratio B/σ. The size of sample increases rapidly
Table 1
Statistics for mean recovery and ER for concentrates 1 and 4, Tests A and B.

Statistic    Concentrate 1                        Concentrate 4
             ER A    ER B    Rec A %   Rec B %    ER A    ER B    Rec A %   Rec B %
Mean         26.63   27.12   68.69     75.00      15.49   17.77   88.27     87.44
Std. dev.     1.30    1.21    1.05      1.37       2.24    1.07    2.14      1.20
90% CI        2.18    2.03    1.78      2.32       3.78    1.80    3.60      2.03
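
As an illustrative check, the "90% CI" row of Table 1 and the replicate counts quoted for Eq. (2) can be reproduced from Eqs. (1) and (2) in a few lines of Python. This is a minimal sketch rather than the spreadsheet used in the paper: the function names are hypothetical, and the t-values are simply the two quoted in the text for n = 3 (2.92 and 4.30) rather than being computed from the t-distribution.

```python
import math
from statistics import NormalDist

# t-values quoted in the text for n = 3 replicates (2 degrees of freedom).
T_VALUES_N3 = {0.90: 2.92, 0.95: 4.30}

def ci_half_width(std_dev, n, t_value):
    """Eq. (1): half-width of the 2-sided confidence interval on a mean."""
    return t_value * std_dev / math.sqrt(n)

def replicates_required(sigma, bound, confidence=0.90):
    """Eq. (2): replicates needed to fix the mean to +/- bound, rounded up."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # 2-sided normal ordinate
    return math.ceil((z * sigma / bound) ** 2)

if __name__ == "__main__":
    # Concentrate 1 standard deviations from Table 1 (ER A and Rec A).
    for label, s in [("ER A", 1.30), ("Rec A", 1.05)]:
        print(label, round(ci_half_width(s, 3, T_VALUES_N3[0.90]), 2))
    # Worked example in the text: sigma = 2%, bounds of 2% and 1%.
    print(replicates_required(2.0, 2.0), replicates_required(2.0, 1.0))
```

The half-widths agree with Table 1 to within about 0.01, a gap that presumably comes from the tabulated standard deviations being rounded to two decimals.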

Table 2
t-tests for differences in ER and recovery between A and B for concentrates 1 and 4.

Value                  Concentrate 1       Concentrate 4
                       ER       Rec. %     ER       Rec. %
Difference (B–A)       0.49     6.31       2.29     -0.83
t-Value                0.48     6.31       1.59     0.58
1-sided P(t)           0.328    0.002      0.093    0.295
Confidence level (%)   67.2     99.8       90.7     70.5
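
The entries of Table 2 can be approximated from the summary statistics in Table 1 alone. The sketch below assumes the equal-variance (pooled) form of the 2-sample t-test, which is an assumption about how the Excel test was configured, and it reports the tail area beyond |t|, which is what Table 2 appears to tabulate (the concentrate 4 recovery difference is negative yet P = 0.295). The helper names are hypothetical, and the t-distribution tail is obtained by simple numerical integration so that only the standard library is needed.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t-distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, by Simpson's rule on the density from 0 to t."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def pooled_t_test(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Equal-variance 2-sample t-test (B minus A) from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    t = (mean_b - mean_a) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    # 1-sided tail area beyond |t|
    return t, t_upper_tail(abs(t), df)

if __name__ == "__main__":
    # Concentrate 1 recovery: Test A (68.69, 1.05) against Test B (75.00, 1.37), n = 3
    t, p = pooled_t_test(68.69, 1.05, 3, 75.00, 1.37, 3)
    print(f"t = {t:.2f}, 1-sided P = {p:.3f}")
```

Because the inputs are the rounded means and standard deviations of Table 1, the t-values may differ from Table 2 in the second decimal.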
as the desired bound decreases and the standard deviation increases.

Fig. 3. Grade-recovery curves, including 90% confidence intervals on the mean enrichment ratio and recoveries for each flotation time. (Axes: cumulative enrichment ratio against cumulative Cu recovery, %; series Test A and Test B.)

Fig. 4. Sample size required to achieve a given confidence in the mean. (Axes: ± bound/std. dev. against sample size; curves for 90% and 95% confidence.)

The question now arises as to whether the mean ER and the mean recovery at a given flotation time are the same for the two collectors. This question can be asked and answered for each flotation time separately, because the replicated cumulative grade and recovery estimates at each time can be considered as an independent sample of the population of cumulative grades and recoveries at that time. The question cannot be asked simultaneously for all the flotation times (i.e. for the whole curve) because the error model for the cumulative curve is not known. This problem is discussed further below.
Some indication can be obtained by inspecting the confidence intervals at the points to be compared (e.g. Fig. 3). This is important anyway to remind us of the experimental uncertainties when interpreting the data. However the rigorous method of comparing mean values under these circumstances is the 2-sample t-test, which is available in Excel either as a tool (Tools > Data Analysis > t-test) or as a function (=TTEST). Table 2 shows the results of t-tests of the difference in mean recovery and ER between Tests A and B for the first and last concentrates. A 1-sided test is appropriate because we are testing the alternative hypothesis that B > A, i.e. we are expecting an improvement in flotation with the new collector. The null hypothesis is that B = A, i.e. there is no difference.
The P-values and corresponding confidence levels suggest that at concentrate 1 there is no difference in ER but the recovery with collector B is significantly higher than that with collector A (about 6%). At the final concentrate, there is some evidence for a higher ER with collector B but there is no difference in recovery. Similar conclusions hold for the intermediate concentrates (not shown), with no difference in recovery but P-values of 0.063 and 0.075 for the difference in ER for concentrates 2 and 3 respectively. These conclusions are not unexpected in the light of the confidence intervals shown in Fig. 3. They suggest a real difference in flotation rate early in the process with the B curve displaced somewhat to higher recoveries at the longer times. This difference disappears at longer times.
The comparison of grade-recovery points at a particular flotation time is valid only for that time. Its conclusion is independent of all other parts of the curve. A problem arises in considering particular points in the context of other points on the curve. To what do we attribute differences between points at a particular time? It may be that any significant difference truly reflects an improvement in flotation conditions with the new collector. However the difference may be due to other factors, and the two points, though apparently different, may in fact be on the same grade-recovery curve. This can only be tested by considering the curves as a whole. Equally we may want to test the differences in rate constants which requires a rate model to be fitted to the kinetic data.

3. Comparing the full grade-recovery curve

3.1. The general approach

A rigorous comparison of two full grade-recovery curves is a different matter to comparing the grade or recovery at the same timed points. There are various ways to do this but some are complicated by the fact that the curves are cumulated, meaning that the results for each time are not independent and the error model is unclear. In addition, the approach will depend on what the user wishes to achieve with the comparison. As examples we shall consider four useful possibilities:

• Comparing the parameters of the first order rate models for the timed data.
• Comparing the parameters of models fitted to the grade-recovery data.
• Comparing predicted recovery at selected values of ER (or concentrate grade). This is equivalent to generalising the comparisons we made at the experimental flotation times, for which we used the t-test.
• Comparison of the two fitted grade-recovery curves as a whole.
We first fit an appropriate model to the data by the usual method of least squares, in this case using Excel's Solver routine. We then generate a large number of parameter estimates by bootstrapping, that is by fitting the model many times (say 1000 times) to the original model predictions perturbed by random noise reflecting the inherent uncertainty in these predictions. Because the tests were replicated, this uncertainty will reflect the real experimental error in the data. Press et al. (1989) discuss the method and its mathematical justification.
We will then have 1000 values of the quantities we are interested in, such as rate constant or the recovery at ER = 17, for both test conditions, A and B, whose variability reflects the uncertainty in the original fit of the model. We hope that the model form is appropriate because then the uncertainty in the fit will reflect only the experimental error in the original data rather than any shortcoming in the model. We can then make our deductions simply by inspecting the distribution of these quantities of interest, in particular the distribution of the differences in key values between Tests A and B.
This strategy requires calling Solver 1000 times. Each time, normally distributed random numbers with mean zero and standard deviation equal to the standard error of the original model fit are added to the original model predictions. These form the new 'data' for Solver to use to fit the model. To perform this task we use the Excel add-in MCSimSolver (Barreto and Howland, 2005) which has three important features for this application:

1. It can repeatedly call Solver to perform the same fit, a defined number of times (e.g. 1000).
2. It can store the resulting 1000 values of quantities that we choose to define, such as the model parameters and predictions.
3. It includes a non-volatile random number generator called RANDOMNV() which is needed because once the set of random numbers is generated for a given fit we need to leave Solver to iterate to a solution; if we used a standard volatile Excel function such as RAND() then the values of recovery would change for each iteration and Solver would fail.

3.2. Comparing first order rate parameters

It is often helpful to compare directly the kinetic parameters of the two separations. The four times at which concentrates were taken in Tests A and B were 1, 3, 6 and 11 min. We will fit the simple first order rate equation to these data, separately for the two tests (a more complex model incorporating fast and slow components could also be used):

\[ R_t = R_m \left( 1 - e^{-kt} \right) \qquad (3) \]

where Rt = recovery at time t, Rm is the maximum recovery at infinite time, and k is the first order rate constant (min⁻¹). Eq. (3) is non-linear in the parameters. The parameters can be estimated by using Excel's Solver to minimise the sums of the squares, SS, of the differences between the observed recovery values and those predicted by the fitted model:

\[ SS = \sum_{i=1}^{n} \left( R_i - \hat{R}_i \right)^{2} \qquad (4) \]

where Ri is the observed recovery for ERi, R̂i is the recovery predicted by the model for ERi, and n is the number of observations. The standard error of the fit is given by:

\[ SE = \sqrt{\frac{SS}{n - p}} \qquad (5) \]

where p is the number of parameters (in this case n = 12 and p = 2). The Solver fit is made subject to the constraint that Rm ≤ 100%.
Table 3 shows the estimated parameters and statistics for the fit of Eq. (3) to the two data sets. The standard errors of the parameters and the fit, and the coefficient of multiple determination (R²), were estimated using the Excel macro SolvStat (Billo, 2001).

Table 3
Fitted parameters and statistics for first order rate model.

Quantity               Test A             Test B
                       Rm       k         Rm       k
Parameter              86.80    1.556     86.27    2.029
Parameter std. error   0.696    0.0715    0.500    0.0847
Std. error of fit      2.004              1.466
R²                     0.948              0.930

The difference in the values of Rm is small but the k-values differ by about 0.5 min⁻¹. The question is: are these differences statistically significant in the context of the uncertainties in their estimation?
Table 4 shows an Excel worksheet set up to solve the problem, using the data of Test A.
Column A contains the triply replicated concentrate times, and Column B the corresponding experimental recoveries. Column C contains the recoveries predicted by the original fit of the rate model with the parameters shown in Table 3. Column D generates the normally distributed random numbers with zero mean and standard deviation = 2.004 (the standard error of the original model fit – see Table 3) and adds them to Column C. The first row of this column thus contains the function =C2 + NORMINV(RANDOMNV(),0,$H$6) where H6 contains the standard error of the original fit. Cells H1 and H2 contain the parameters estimated by the current run of Solver, and Column E contains the recoveries predicted by those parameters. The sum of squares to be minimised by Solver is that between Columns D and E and is calculated in cell H4 from the Excel function =SUMXMY2(col.D,col.E).
MCSimSolver can be set up to run any number of simulations (1000 is the default), and to record any cell selected in the worksheet. In this case cells H1 and H2 are selected because we wish to interrogate the distribution of the two model parameters. The output is 1000 values of Rm and k (and any other quantity that the worksheet may compute and that we may select).
The question we want to answer concerns the differences in model parameters between Tests A and B. We therefore construct 1000 differences between the values of the two parameters in Tests A and B and inspect the properties of these differences, as shown in Table 5.
The upper and lower 95% confidence limits are calculated simply as the 0.025 and 0.975 percentiles of the 1000 values of the differences, using Excel's =PERCENTILE function. The z hypothesis test is conducted by calculating z = mean/std.dev. and then calculating the 1-sided P-value from the function =1-NORMSDIST(z); the 2-sided value is twice this.
As we are searching for an improvement in the performance of Test B over that of Test A it is a 1-sided test. Clearly there is no significant difference in the values of Rm: P = 0.26 and the confidence interval includes zero. However there is a highly significant difference in the values of the rate constant, k: P = 1.5 × 10⁻⁵ (>99.99% confidence that the difference is not zero). The magnitude of the difference is 0.47 min⁻¹ and the 95% confidence interval of the difference is 0.26–0.69 min⁻¹.
We can obtain a visual impression of these hypothesis tests by inspecting the distribution of bootstrapped parameter differences as histograms, shown in Fig. 5.
26% of the distribution of Rm differences lie above zero and 74% below. We would normally require 5% or less to be above (or below) zero before we would accept the alternative hypothesis
Table 4
Excel worksheet for bootstrapped fit of rate model (Eq. (3)).
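
The logic of the Table 4 worksheet can be sketched outside Excel. The Python below is an illustration under stated assumptions, not the MCSimSolver implementation: Solver is replaced by a simple substitute (for a trial k the best Rm follows in closed form, so only k is searched, here by golden section), `random.gauss` stands in for RANDOMNV(), and all function names are hypothetical. Because the bootstrap perturbs the original model predictions, only the Table 3 parameters and standard errors of fit are needed.

```python
import math
import random

TIMES = [1.0, 3.0, 6.0, 11.0] * 3   # four concentrate times, triplicated

def rate_model(rm, k, t):
    """Eq. (3): first order recovery at time t."""
    return rm * (1.0 - math.exp(-k * t))

def fit_rate_model(times, recoveries, k_lo=0.2, k_hi=6.0):
    """Least-squares fit of Eq. (3) with Rm <= 100 (stand-in for Solver)."""
    def best_rm(k):
        g = [1.0 - math.exp(-k * t) for t in times]
        rm = sum(y * gi for y, gi in zip(recoveries, g)) / sum(gi * gi for gi in g)
        return min(rm, 100.0)

    def ss(k):
        rm = best_rm(k)
        return sum((y - rate_model(rm, k, t)) ** 2 for y, t in zip(recoveries, times))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = k_lo, k_hi
    while b - a > 1e-6:                      # golden-section search on k
        c, d = b - ratio * (b - a), a + ratio * (b - a)
        if ss(c) < ss(d):
            b = d
        else:
            a = c
    k = (a + b) / 2.0
    return best_rm(k), k

def bootstrap(rm, k, std_err, n_sims, rng):
    """Refit the model n_sims times to predictions perturbed by N(0, std_err)."""
    predicted = [rate_model(rm, k, t) for t in TIMES]
    fits = []
    for _ in range(n_sims):
        noisy = [p + rng.gauss(0.0, std_err) for p in predicted]
        fits.append(fit_rate_model(TIMES, noisy))
    return fits

def percentile(values, q):
    """Percentile (q between 0 and 1) by linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

rng = random.Random(2012)                            # fixed seed for repeatability
fits_a = bootstrap(86.80, 1.556, 2.004, 1000, rng)   # Table 3, Test A
fits_b = bootstrap(86.27, 2.029, 1.466, 1000, rng)   # Table 3, Test B
k_diff = [fb[1] - fa[1] for fa, fb in zip(fits_a, fits_b)]
mean_diff = sum(k_diff) / len(k_diff)
sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in k_diff) / (len(k_diff) - 1))
print(mean_diff, sd_diff, percentile(k_diff, 0.025), percentile(k_diff, 0.975))
```

The printed mean, standard deviation and percentile limits of the difference in k play the role of the k column of Table 5; being a random simulation, the values will not match the published figures digit for digit.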

that there is a real difference between the tests, so we must conclude that there is no difference. The situation for k is entirely different. None of the 1000 random values of differences lie below zero. We can thus be extremely confident that there is a real difference in the rate constants for the two tests, that for B being higher than that for A.

Table 5
Properties of rate model parameter differences between Tests A and B (B–A).

Quantity                Difference in Rm (%)   Difference in k (min⁻¹)
Mean                    -0.550                 0.471
Std. dev.               0.856                  0.113
95% lower conf. limit   -2.179                 0.257
95% upper conf. limit   1.114                  0.694
z-test                  -0.643                 4.182
1-sided P(z)            0.260                  1.45E-05
2-sided P(z)            0.520                  2.89E-05

Fig. 5. Histograms of differences in parameters Rm and k between Tests A and B. (Panels: difference in Rm (B-A), annotated P = 0.260; difference in k (B-A), annotated P = 0.000; y-axes: frequency.)

3.3. Comparing the grade-recovery curves

We can apply exactly the same tools to comparing the grade-recovery curves as we did to comparing the rate curves in Section 3.2. We first need a model to fit to the data. An appropriate choice is the hyperbolic sine function proposed by Vera et al. (2000) which has been shown to fit a wide variety of grade-recovery curves from many mineral commodities including base metals, precious metals, industrial minerals and coal. The Vera model is:

\[ R = R_\infty - a \sinh\left[ b \left( ER - 1 \right) \right] \qquad (6) \]

where R is the cumulative recovery (%), ER the cumulative enrichment ratio, and R∞, a and b are parameters of the function which must be estimated from data.
R∞ can be thought of as the maximum possible recovery at ER = 1, i.e. with no concentration. The parameters a and b describe the shape of the curve. The function behaves as one would expect a grade-recovery relationship to behave, extrapolating in a plausible way. Eq. (6) is non-linear in the parameters but can be fitted using Excel's Solver to estimate R∞, a and b against the usual least squares criterion (Eq. (4)). Eq. (6) fitted the present data very well.
Bruey (2010) has suggested an alternative parameterisation of Eq. (6), as follows:

\[ R = R_{max} - \exp(10 - c) \, \sinh\left[ \frac{ER}{ER_{50}} \, \operatorname{asinh}\left( \frac{R_{max} - 50}{\exp(10 - c)} \right) \right] \qquad (7) \]

where c and ER50 are parameters to be estimated from data and Rmax is the maximum possible recovery (equivalent to R∞ in the Vera model and Rm in the rate curve). ER50 is the value of cumulative ER for which the cumulative recovery is 50%, and c describes the shape of the curve; c = 0 gives a straight line, and increasing positive values give increasing curvature.
This formulation has two advantages. The parameters c and ER50 have physical meanings which are more easily understood, and tests with a range of data have shown that the parameters are better determined in non-linear regression than those of the Vera model. In particular the standard error of parameter b in the Vera model is usually of the same order as b itself, suggesting that it is very poorly estimated. This is not true of any of the Bruey function parameters. Accordingly we will use the Bruey formulation, though the choice of formulation is unlikely to materially change the conclusions of this work; both fit the data well and produce almost identical predicted curves.
Table 6 shows the estimated parameters and statistics for the fit of the Bruey model to the two grade-recovery data sets, and Fig. 6 shows the experimental data and the fitted curves.

Table 6
Fitted Bruey model parameters for Tests A and B.

Parameters    Test A                Test B
              Value    Std. error   Value    Std. error
Rmax          97.28    5.41         91.81    3.56
c             8.60     1.21         10.62    1.86
ER50          32.11    1.38         33.55    1.74
R²            0.951                 0.931
Std. error    2.037                 1.528

Fig. 6. Batch flotation grade-recovery points with fitted Bruey model curves. (Axes: cumulative enrichment ratio against cumulative Cu recovery, %; series Test A and Test B.)

The question 'are the curves different?' has several possible interpretations. Two important ones are: 'are the parameters different?' (as in the case of the rate curve) and 'are the predicted recoveries at chosen values of ER different?'. This last question generalises the comparison of recoveries at the specific cumulative concentrates that we dealt with in Section 2 using the t-test.
Both these questions can be solved using the methods of Section 3.2. It is simply a question of selecting the calculated quantities we wish MCSimSolver to record as it fits Eq. (7) 1000 times. The standard errors which drive the random number generator are now those of the original fit of Eq. (7) to the two data sets. We will select for inspection the three Bruey model parameters, plus the predicted recovery at two ER values at each end of the curves: ER = 17 (at a longer flotation time) and 25 (at a shorter flotation time). As before we calculate the 1000 differences of these values between Tests A and B, and conduct a z-test to assess the statistical significance of the differences.
We will also construct 95% confidence intervals for the fitted curves. This requires a column of ER values covering the range of interest in short intervals and a column of corresponding predicted recoveries from the current fit of the model. By selecting the column of recoveries prior to invoking MCSimSolver we will obtain 1000 estimates of recovery at each selected ER, from which we can extract the 95% confidence limits simply by computing the 2.5% and 97.5% percentiles. These can then be plotted as a smooth line on the grade-recovery graph.
Table 7 shows the statistics for the differences in the model parameters and the recoveries at ER = 17 and 25.
The comparisons are instructive. None of the three parameters are statistically different between the two tests (1-sided P > 0.05, and the confidence intervals include zero). This suggests that the fitted model cannot distinguish between them. The confidence intervals interestingly are not symmetrical. The mean difference in recovery between Tests A and B at ER = 17 is 1.6% with a P-value for the hypothesis test of 0.077. This reaches significance at 90% but not 95% confidence, an equivocal result. There is no such equivocation in the case of the recoveries at ER = 25. The mean difference is 6.3% and P = 2 × 10⁻⁷. We are very confident that this difference is real, with confidence limits of 3.8–8.6%. Test B has a significantly higher recovery at ER 25 than Test A.
Fig. 7 shows the 95% confidence limits of the fitted grade-recovery curves. They are non-symmetric as we would expect, with the limits widening at each end of the curves where there are no data. They also overlap strongly at each end which is consistent with the non-significance in the differences in the Rmax and ER50 parameters as both these values are well outside the range of the data (at ER = 1 and ER ≈ 32–33). However there is no overlap in the confidence limits in the approximate ER range 20–29, which again is consistent with the finding that the predicted recoveries at ER = 25 are truly different, Test B having a higher recovery than Test A at this grade.
We have learned a lot about the contrasts in these two curves and their predictions. However we have not yet unequivocally answered the global question: are the curves different? One way to do this is by applying the 'extra sums of squares' principle. This compares the total sum of squares for the separate fits of the two data sets with that obtained by fitting the same model to the combined data set (a so-called global fit). The null hypothesis is that there is no significant reduction in the sum of squares using two fits compared to the global fit. The alternative hypothesis is that there is a significant reduction in sum of squares, and that the data are better represented by separate fits, which implies that the fitted curves are indeed different. The hypothesis is evaluated with an F-test. The calculations are quite simple and are shown in Table 8.
The global fit has only 3 parameters whereas the fit to the two separate data sets has 6, 3 for each model. The degrees of freedom are the number of data points minus the number of parameters. The sums of squares are calculated in the course of using Solver, and are as defined in Eq. (4). In the 2-fit case they are the sum of the values for the two fits. F is defined as:

\[ F = \frac{(SS_1 - SS_2) / (DF_1 - DF_2)}{SS_2 / DF_2} \qquad (8) \]

where the subscripts 1 and 2 define the global fit and 2-fit models respectively. The significance of F is determined from the Excel function =FDIST(F, 3, 18) where 3 and 18 are the respective degrees of freedom. The resulting P-value of 0.00013 allows us to reject the null hypothesis with great confidence (99.99%) and accept that the two curves are indeed different.

4. Some comments on the comparison of curves

Let us first summarise what we have learned about these particular data in Sections 3.2 and 3.3. Recall that we are looking for differences in flotation behaviour attributed to the change in flotation collector from Test A to Test B:

• There is a statistically significant difference in the Tests A and B rate constants. The magnitude of the difference is 0.47 min⁻¹ with a 95% confidence interval of 0.26–0.69 min⁻¹, Test B having a higher k than Test A.
• The values of Rm in the rate equation are not significantly different.
• The three Bruey parameters, Rmax, c and ER50 in the grade-recovery curve model, show no significant difference between Tests A and B. The lack of difference between Rmax and ER50 can probably be attributed to the fact that these parameters define points on the curve which are far removed from the data and where the fitted curves therefore have large uncertainties.

Table 7
Statistics for differences in Bruey model parameters between Tests A and B.

Values                         Rmax       c         ER50      Rec. at ER = 17   Rec. at ER = 25
Mean difference (B–A)          3.727      1.771     1.908     1.57              6.34
Standard deviation             4.981      2.061     2.104     1.10              1.25
95% lower confidence limit    −11.686    −1.959    −1.833    −0.607             3.844
95% upper confidence limit     7.106      5.850     6.423     3.588             8.606
z-test                         0.748      0.859     0.907     1.428             5.07887
1-sided P                      0.227      0.195     0.182     0.077             1.9E-07
2-sided P                      0.454      0.390     0.365     0.153             3.8E-07
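
The rows of Table 7 are all summaries of a set of bootstrap differences. The following is a minimal Python sketch of that calculation, not the author's spreadsheet, which uses Excel Solver with MCSimSolver. To stay self-contained it makes several assumptions that are not in the paper: a first-order rate model R = Rm(1 − exp(−kt)) stands in for the fitted model, a golden-section search stands in for Solver, the recovery data are invented, and all function names are illustrative.

```python
import math
import random
import statistics


def fit_first_order(times, recoveries):
    """Least-squares fit of R = Rm * (1 - exp(-k * t)); returns (Rm, k, SSE)."""
    def profile(k):
        # For a fixed k the best Rm has a closed form (linear least squares).
        g = [1.0 - math.exp(-k * t) for t in times]
        rm = sum(gi * ri for gi, ri in zip(g, recoveries)) / sum(gi * gi for gi in g)
        sse = sum((ri - rm * gi) ** 2 for gi, ri in zip(g, recoveries))
        return rm, sse

    lo, hi = 1e-3, 10.0                     # assumed search range for k (1/min)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0    # golden-section ratio
    for _ in range(60):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if profile(a)[1] < profile(b)[1]:
            hi = b
        else:
            lo = a
    k = 0.5 * (lo + hi)
    rm, sse = profile(k)
    return rm, k, sse


def bootstrap_k(times, recoveries, n_boot, rng):
    """Refit the model n_boot times to data perturbed by N(0, SE) noise."""
    _, _, sse = fit_first_order(times, recoveries)
    se = math.sqrt(sse / (len(times) - 2))  # standard error of the original fit
    ks = []
    for _ in range(n_boot):
        # Assumption: the noise is added to the measured values.
        noisy = [r + rng.gauss(0.0, se) for r in recoveries]
        ks.append(fit_first_order(times, noisy)[1])
    return ks


def summarise(diffs):
    """Table 7 style statistics for a list of bootstrap differences."""
    ordered = sorted(diffs)
    n = len(ordered)
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    z = mean / sd
    # Tail probability in the direction of the observed difference.
    p_one = 1.0 - statistics.NormalDist().cdf(abs(z))
    return {
        "mean": mean,
        "sd": sd,
        "lower95": ordered[int(0.025 * n)],      # approximate 2.5% percentile
        "upper95": ordered[int(0.975 * n) - 1],  # approximate 97.5% percentile
        "z": z,
        "p_one_sided": p_one,
        "p_two_sided": 2.0 * p_one,
    }


# Hypothetical cumulative recoveries (%) at common flotation times (min).
times = [0.5, 1.0, 2.0, 4.0, 8.0]
test_a = [36.1, 56.0, 78.5, 88.0, 90.3]
test_b = [47.2, 71.5, 86.0, 91.2, 90.6]

rng = random.Random(1)
k_a = bootstrap_k(times, test_a, 1000, rng)
k_b = bootstrap_k(times, test_b, 1000, rng)
stats = summarise([b - a for a, b in zip(k_a, k_b)])
for name, value in stats.items():
    print(f"{name:12s} {value:.4g}")
```

The same `summarise` step applies to any recorded quantity, such as a model parameter or a predicted recovery at a chosen ER. As a check against the table itself, the c column gives z = 1.771/2.061 = 0.859, and the upper tail of the standard normal distribution at that value is about 0.195, which agrees with the 1-sided P shown.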

Fig. 7. 95% confidence limits for model fits for Tests A and B grade-recovery curves. [Plot of cumulative Cu recovery (%) against cumulative enrichment ratio, showing the model fits to experimental data and the 95% CLs for Test A and Test B.]

• The difference in predicted recovery at an ER of 17 is 1.6%. The significance of this difference is equivocal, reaching significance at 90% confidence but not 95%.
• The difference in predicted recovery at an ER of 25 is 6.3%, which is highly significant.
• A better fit is obtained by fitting the model separately to the two data sets rather than globally to the combined data set. This implies that the two grade-recovery curves are really different overall.

This particular data set illustrates well the need for careful interpretation of the question ‘are they different?’ It depends on what we mean by ‘different’ and the answers must be interpreted accordingly. For some data sets all the answers are clearly ‘no’ and for others ‘yes’. In this case, which is not at all unusual, it depends on how the question is formulated. This does not detract from the utility of the process. It simply reminds us, if we needed reminding, that these curves contain a lot of information, and are subject to the usual experimental uncertainties. Rate constants tell us different things from grade-recovery relationships, and the latter may themselves differ over the range of the data. Asking whether the parameters are different is a different question from asking if the curves as a whole are different. A systematic assessment of the differences between two batch flotation data sets, using the tools suggested in this paper, can give rigorous conclusions of value to the experimenter.

The present grade-recovery example has recovery plotted on the y-axis and enrichment ratio on the x-axis. There is no reason why they should not be plotted in reverse, and this is necessary if differences in grade (or ER) are to be tested at particular values of recovery. In this case the model should be fitted in that form, with ER (as y) being a function of recovery. Simply doing some algebra on the fitted form of Eq. (7) to get ER on the left hand side will lead to an incorrect solution.

Likewise there is no absolute need to use enrichment ratio rather than concentrate grade. It has been found however that ER is often more meaningful because it is normalised to feed grade which may vary even in closely controlled laboratory experiments. In plant data ER is often necessary to reproduce the conventional shape of a grade-recovery curve which is not possible using concentrate grade alone because of uncontrolled variations in feed grade.

The example given in this paper uses an unweighted least squares criterion for fitting models. It is also possible to apply weighted least squares, using for example the variances of the actual grade (ER) and recovery measurements at each time as the inverse weights for each time, if these are known.

The proposed bootstrap solution can be used to compare any replicated data sets for which appropriate models are available, including other forms of flotation performance curves from batch tests, pilot tests or production data, separability curves in general, and the comparison of the linear recovery-feed grade relationships in plant trials discussed by Napier-Munn (1998). The advantages of the present method in comparing grade-recovery curves are:

1. It makes no assumptions regarding the linearity of the data.
2. It is distribution-free, that is, it makes no assumptions as to the distribution of the original data. It only assumes that the random errors are normally distributed.
3. It does not require that the two fitted grade-recovery models give parallel curves. When they are not parallel then several comparisons may be necessary at different values of ER.

The bootstrap comparison method does depend on the assumption that the model forms chosen to represent the data, whether kinetic or grade-recovery, are correct in the sense that any lack of fit is attributable only to experimental error and not functional form. The use of 1000 repetitions is a compromise between the desired precision of estimate (10,000 might be a better number) and the time required for the bootstrap. The Bruey 3-parameter model took about 3 min to generate 1000 Solver fits on an average laptop PC.

5. Conclusions and Recommendations

Grade-recovery curves, like any other empirical measurement, are subject to experimental error. In order to ensure that informed

Table 8
Statistics for comparing the grade-recovery curves using extra sums of squares.

Model              No. data sets   No. params.   DF   SS        Mean square   F        P(F)
1: Global fit      24              3             21   178.373   –             –        –
2: Two fits        24              6             18   58.365    3.243         –        –
Difference (1–2)   –               –             3    120.008   40.003        12.337   0.00013

conclusions are drawn from inspection of these curves, this error should be embraced and managed. Standard statistical procedures such as confidence intervals and hypothesis tests can be used to calculate the uncertainty in each mean grade-recovery point and to compare such points. A new bootstrap method has been introduced to compare the kinetic curves in terms of rate parameters, and the full grade-recovery curves in terms of model parameters and recoveries at any selected values of enrichment ratio. This involves repeatedly fitting a model to each data set perturbed by random errors with standard deviation equal to the original model fit standard error, to generate a distribution of model parameters. These are then used to generate many estimates of the quantities of interest (parameters and predictions) and thus many estimates of the difference in those quantities between two data sets. These differences have statistical properties which correctly reflect the uncertainties in the original data. The distribution of differences is used to perform non-parametric hypothesis tests of the significance of the observed mean difference and to calculate confidence intervals on the differences. Confidence intervals on the fitted curves can be computed in a similar way. These procedures are relatively easy to implement in a spreadsheet.

No-one wants to be told to do three or four (or more) tests where historically one would have been regarded as the norm. However replicate testing is essential if these methods are to be utilised. The alternative is to ignore the inevitable presence of experimental error and perhaps draw misleading or erroneous conclusions from the data. It is therefore recommended that all batch flotation tests be replicated and that confidence intervals be reported such as those in Figs. 3 and 7, however awkward the consequences. The number of replications required to estimate mean point grades and recoveries to a given uncertainty can be calculated (Eq. (2) and Fig. 4). It is further recommended that the methods described in this paper be used wherever comparisons are to be made between batch flotation results collected under different conditions, and that the confidence intervals on the model parameters and curves also be routinely reported.

The fact that the two grade-recovery curves shown in Figs. 6 and 7 are not statistically different at the longer flotation times is not an unusual result in such cases, not least because at higher recoveries there is less room for a superior performance to express itself. This kind of analysis highlights potential deficiencies in the way kinetic batch flotation data are presented and interpreted as classical grade-recovery curves. Small errors in flotation start times and mass pulls can propagate into the whole curve, rendering interpretation difficult. Similarly there is a lack of important information about water recovery and entrainment. Other ways of formulating grade-recovery information could be considered, and this is likely to be a fruitful area of future research (Neethling and Cilliers, 2008). The modelling of the data in terms of the mass proportions of slow and fast floating species is an established alternative which may carry more useful information (Morrison and Alexander, 1998).

Acknowledgements

The author’s interest in this problem arose out of stimulating discussions with Dr. Neville Plint of Anglo Platinum, Deryck de Vaux of Anglo Research, Prof. Dee Bradshaw of the JKMRC, Dr. Chris Greet of Magotteaux Australia and Dr. Frank Bruey of Cytec. He thanks Dr. Bruey for permission to quote his re-parameterisation of Vera’s function. He also acknowledges helpful discussions with Prof. Bill Whiten of the JKMRC. Dr. Rob Morrison of the JKMRC kindly read an early version of the paper and made useful suggestions. Some of the ideas in the paper were first presented at the SAIMM Minerals Processing 09 conference in Cape Town in August 2009.

References

Barreto, H., Howland, F.M., 2005. Introductory Econometrics: Using Monte Carlo Simulation with Microsoft Excel. Cambridge Uni. Press, p. 798.
Billo, E.J., 2001. Excel for Chemists: A Comprehensive Guide. John Wiley and Sons, Ch.12 – Non-Linear Regression using the Solver.
Bruey, F., 2010. Private Communication (Cytec Inc.).
Gy, P.M., 1979. Sampling of Particulate Materials: Theory and Practice. Elsevier.
Morrison, R.D., Alexander, D.J., 1998. Rapid estimation of floatability components in industrial flotation plants. Miner. Eng. 11 (2), 133–143.
Napier-Munn, T.J., 1998. Analysing plant trials by comparing recovery-grade regression lines. Miner. Eng. 11 (10), 949–958.
Neethling, S.J., Cilliers, J.J., 2008. Predicting and correcting grade-recovery curves: theoretical aspects. Int. J. Miner. Proc. 89, 17–22.
Nice, R.W., Brown, P.J., 1995. The design of a base metals separation process. In: Proc. XIX Int. Miner. Proc. Cong., San Francisco, (SME), 137–143.
Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T., 1989. Numerical Recipes. Cambridge Uni. Press, p. 702.
Runge, K., 2010. Laboratory flotation testing – an essential tool for ore characterisation. In: Greet, C.J. (Ed.), Ch.9 in Flotation Plant Optimisation, AusIMM.
Vera, M.A., Franzidis, J.P., Manlapig, E.V., 2000. The locus of flotation performance. In: Proc. XXI Int. Miner. Proc. Cong. Rome, 23–28 July. Elsevier Science, pp. 74–82.
