PROCEEDINGS OF AUSTRALIAN SOCIETY OF SUGAR CANE TECHNOLOGISTS 1993
FIELD TECHNIQUES TO QUANTIFY THE YIELD-
DETERMINING PROCESSES IN SUGARCANE.
II. SAMPLING STRATEGY ANALYSIS
By
M.R. THOMAS*, R.C. MUCHOW*, A.W. WOOD**,
M.F. SPILLMAN***, M.J. ROBERTSON***
*CSIRO, St. Lucia; **CSR, Ingham; ***CSIRO, Townsville
Introduction
Sampling sugarcane crops for yield estimation presents many challenges owing
to crop size, the tendency of large crops to lodge, and stand variability. In another
paper, Muchow et al. (1993) outlined a procedure for sampling sugarcane crops
throughout growth to develop quantitative relationships for canopy development,
fresh weight accumulation, biomass accumulation, nutrient accumulation and
sucrose accumulation. The challenge remains to develop a sampling strategy for
field-grown sugarcane that balances manpower requirements and analytical cost
against data precision. This paper examines the consequences of different
field sampling and subsampling strategies for data precision.
In their pioneering work, Hogarth and Skinner (1967) developed a sampling
method for final yield based on counting the number of stalks in each plot and
the weight of a subsample of stalks. Their technique provides an efficient and
accurate method of estimating final yield, but variation in components other than
final yield has not been examined. It was difficult to obtain accurate stalk counts
for young crops, where the number of stalks per metre of row is much greater
than at harvest, and also to obtain stalk counts for older lodged crops. This paper
also examines sources of variation in each of the yield components with field
sampling.
In developing a sampling strategy for conducting growth analysis studies in
sugarcane, the conflicting demands of manpower requirements and analytical costs
with data precision must be balanced. The size of sample cut in the field is an
important manpower consideration, while the size of the sample for component
dry matter content, nutrient and sucrose concentration is both a manpower and
analytical services cost. Therefore, one objective of this paper is to assess the tradeoff
between increasing the size of the field quadrat that is cut and increasing the
number of whole stalks subsampled for partitioning into yield components. Another
objective is to examine how the optimal sampling strategy varies with the
stage of crop growth.
The intention is to design a sampling strategy which enables precise and accurate
estimation of a number of quantities: biomass per unit area (for each plant
component), sucrose accumulation per unit area and leaf area index. Nutrient
accumulation will not be discussed in this paper because of space limitations;
however, similar sampling considerations apply.
Statistical methods
Details of the experimental procedures involved, and calculations of biomass
per unit area, sucrose accumulation and leaf area index are given in another paper
KEYWORDS: Sugarcane Yield, Sampling Strategy, Biomass, Sucrose, Leaf Area Index
(Muchow et al., 1993). The sampling strategy is defined in terms of the number
of three metre by one row quadrats to be cut and weighed, the number of stalks
to be partitioned into components and weighed by component, and the volume
of each component to be dried (specified by the number of 850 ml aluminium foil
trays dried). These numbers are referred to as the design parameters, and are
represented by the symbols m, p and t respectively. For ease of interpretation, the
number of three metre quadrats required is expressed in terms of the number of
linear metres to be sampled, but it should be borne in mind that this is strictly
defined only for lengths made up of random three metre quadrats.
This restriction to random sampling of three metre quadrats should be
remembered throughout the following discussion. The precision of systematic
sampling plans (where the lengths of cane cut are adjacent, rather than randomly
placed) will also be affected by the degree of spatial correlation over the length
of row sampled. Positive spatial correlation will tend to decrease the precision of
estimates, and negative spatial correlation will tend to increase the precision. Thomas
and Muchow (unpublished data) have observed small positive spatial correlations
in stalk number, measured for one metre quadrats over 60 m of row. This suggests
that the application of the results for random samples of three metre quadrats
may provide a reasonable approximation to the precision of systematic samples,
made up of adjacent lengths of row.
The first stage of designing the sampling strategy is to identify the sources
of variation affecting each variable, and to estimate the appropriate covariance
components (covariance matrices of random deviations associated with each source
of variation). The second stage is to define the effects of the design parameters
on each source of variation, and to obtain an expression for the coefficient of
variation (CV) of the final estimate, as a function of the design parameters. This
is broadly similar to the approach adopted by Hogarth and Skinner (1967). The
focus is on the CV rather than the variance, in order to make more meaningful
comparisons across sample times. For most of the quantities of interest, the mean and
variance change markedly with time, but the CV is more stable. At any one sample
time, the strategy which minimises CV also minimises variance.
Estimates of the variance components (Snedecor and Cochran, 1989) were
obtained by the method of moments (Mood and Graybill, 1963), applied to the
relevant sums of squares and products matrices. These matrices were obtained by
multivariate analysis of variance. The analysis of variance was performed with
quadrats and subsamples treated as random effects, and variety treated as a fixed
effect. All statistical calculations were performed using the SAS system. Covariance
components are not presented here, for the sake of brevity. Full details are
provided by Thomas et al. (1993).
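As an illustration of the method of moments, the following minimal Python sketch
estimates two variance components for a single variable measured on stalks within
quadrats. It is a univariate simplification of the multivariate analysis described
above, not the SAS procedure used in this study. It assumes a balanced layout, and
the function name, the truncation of negative estimates to zero and the example
numbers are all hypothetical.

```python
from statistics import mean


def variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random-effects layout.

    groups: list of equal-length lists, one list of stalk measurements per quadrat.
    Returns (between_quadrat_variance, between_stalk_variance).
    """
    k = len(groups)      # number of quadrats
    n = len(groups[0])   # stalks per quadrat (balanced design assumed)
    if any(len(g) != n for g in groups):
        raise ValueError("balanced design required")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)

    # Mean squares from the one-way analysis of variance
    ms_between = n * sum((gm - grand_mean) ** 2 for gm in group_means) / (k - 1)
    ms_within = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    # Equate observed mean squares to their expectations:
    #   E[MS_within]  = s2_stalk
    #   E[MS_between] = s2_stalk + n * s2_quadrat
    s2_stalk = ms_within
    # Assumption: negative estimates are truncated to zero
    s2_quadrat = max(0.0, (ms_between - ms_within) / n)
    return s2_quadrat, s2_stalk


# Hypothetical data: three quadrats, three stalks each
print(variance_components([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```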
In this paper, expressions are used for the CV of a product of two correlated
random variables. Using the delta method (Kendall and Stuart, 1958), the first
order approximation is obtained:
$$\mathrm{CV}(Z) = \sqrt{\mathrm{CV}(X)^2 + \mathrm{CV}(Y)^2 + 2\rho\,\mathrm{CV}(X)\,\mathrm{CV}(Y)}$$    Equation 1
Where X and Y are random variables, Z is the random variable which is given
by the product of X and Y, CV() is the coefficient of variation, and $\rho$ is the
correlation of X and Y. This approximation is accurate for large sample sizes (when
the CVs of the variables are small). With the sign of the correlation term reversed,
the same expression approximates the CV of a ratio of two random variables. When
the random variables X and Y are uncorrelated, the CV of their product is obtained
as the square root of the sum of squares of their CVs. So, whenever the CVs of two
random variables differ markedly in size, the CV of the product is nearly equal to
the larger of the two CVs of the original variables. For example, if X has a CV of 20
and Y has a CV of 2, then the CV of the product (assuming independence) is 20.1. Positive
correlations inflate the CV of the product, and negative correlations decrease the
CV of the product.
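The first order approximation for the CV of a product can be checked numerically.
The short Python sketch below reproduces the worked example from the text (CVs of
20 and 2, assuming independence) and shows the direction of the correlation effect.
The function name and the correlation values of 0.5 and -0.5 are illustrative
assumptions only.

```python
import math


def cv_product(cv_x, cv_y, rho=0.0):
    """First-order (delta method) CV of Z = X * Y.

    cv_x, cv_y: coefficients of variation of X and Y (same units, e.g. percent).
    rho: correlation between X and Y (0.0 means uncorrelated).
    """
    return math.sqrt(cv_x ** 2 + cv_y ** 2 + 2.0 * rho * cv_x * cv_y)


# Worked example from the text: independent X and Y with CVs of 20 and 2
print(round(cv_product(20.0, 2.0), 1))  # 20.1

# Illustrative correlations: positive inflates, negative decreases the CV
print(cv_product(20.0, 2.0, rho=0.5) > cv_product(20.0, 2.0))   # True
print(cv_product(20.0, 2.0, rho=-0.5) < cv_product(20.0, 2.0))  # True
```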
Further details of the statistical methodology are described by Thomas et al.
(1993). Analyses were performed for each of the samplings described in the previous
paper (Muchow et al., 1993). Details are presented for the July sampling, and
comparisons are made with the February sampling.
Sampling strategy for biomass
The objective is to design a sampling strategy which enables precise and accurate
estimation of the biomass per unit area in each component of the plant: millable
stalks, green leaves and cabbage. For each component, the biomass per unit area
is estimated by the product of total fresh weight per unit area, proportion of total
fresh weight in each component, and dry matter content of the component. Results
are presented only for biomass in stalks, although similar analyses have been
performed for each plant component (Thomas et al., 1993). This analysis investigates
the effect of the design parameters on sampling for total fresh weight, stalk
proportion of fresh weight, and stalk dry matter content. The effect of the design
parameters on the estimation precision for the product of these three quantities,
the stalk biomass per unit area, is then considered.
Total fresh weight is modelled per three metre by one row quadrat, with only
one source of variation, operating at the level of the quadrat. The variance of the
fresh weight estimate is therefore inversely proportional to the number of three metre
by one row quadrats sampled. Figure 1a shows the coefficient of variation of the
fresh weight estimator, as a function of the linear metres cut in the field, for the
July sampling described in the previous paper (Muchow et al., 1993). Both February
and July samplings show a large coefficient of variation, of the order of 20%
for a single three metre quadrat. Nine linear metres (three quadrats) would produce a
coefficient of variation of 13% at the February sampling and 10% at the July
sampling.
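Because the variance of the fresh weight estimate is inversely proportional to the
number of quadrats, the CV falls with the square root of the number of quadrats.
The Python sketch below illustrates this scaling. The function names are
hypothetical, and the single-quadrat CV of 20% is the rounded order-of-magnitude
figure quoted above, not a fitted value.

```python
import math


def cv_for_quadrats(cv_single, n_quadrats):
    """CV of the mean of n randomly placed quadrats, given the single-quadrat CV."""
    return cv_single / math.sqrt(n_quadrats)


def quadrats_needed(cv_single, cv_target):
    """Smallest number of random quadrats whose mean has a CV at or below cv_target."""
    return math.ceil((cv_single / cv_target) ** 2)


# Assumed single-quadrat CV of 20% (rounded figure from the text)
for n in (1, 2, 3, 4, 6):
    # Each quadrat is three linear metres of row
    print(f"{3 * n:2d} m cut: CV = {cv_for_quadrats(20.0, n):.1f}%")

print(quadrats_needed(20.0, 10.0))
```

With a single-quadrat CV of exactly 20%, three quadrats give a CV of about 11.5%,
which lies between the 13% (February) and 10% (July) values reported above.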
The proportion of fresh weight in each plant component has two sources of
variation: between quadrat variation, and between individual stalks within quadrats.
For a sample of any given size, the effect of the between quadrat variation is inversely
proportional to the number of quadrats sampled, and the effect of the between
stalks variation is inversely proportional to the number of stalks subsampled. Figure
1b shows the predicted CV of estimated proportion of fresh weight in the stalks
as a function of the number of quadrats (expressed as linear metres of cane) and
of the number of stalks subsampled, for the July sampling. This was generated
using the relationship:
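$$\sigma_e^2 = \frac{\sigma_q^2}{m} + \frac{\sigma_s^2}{p}$$    Equation 2
Where: $\sigma_e^2$ is the variance of the estimator
$\sigma_q^2$ is the between quadrats variance
$\sigma_s^2$ is the between stalks (within quadrats) variance
$m$ and $p$ are the numbers of quadrats and of stalks sampled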
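The tradeoff expressed by Equation 2 (between quadrats variance divided by the
number of quadrats, plus between stalks variance divided by the total number of
stalks subsampled) can be explored with a short Python sketch. The variance
components and mean proportion used here are hypothetical placeholders chosen
only to show the calculation; they are not the estimates obtained in this study.

```python
import math


def cv_proportion(var_quadrat, var_stalk, mean_prop, m, p):
    """CV (%) of the estimated proportion of fresh weight in a component.

    var_quadrat: between quadrats variance
    var_stalk:   between stalks (within quadrats) variance
    mean_prop:   mean proportion of fresh weight in the component
    m:           number of three metre quadrats sampled
    p:           total number of stalks subsampled
    """
    var_estimator = var_quadrat / m + var_stalk / p  # Equation 2
    return 100.0 * math.sqrt(var_estimator) / mean_prop


# Hypothetical values, for illustration only
VAR_QUADRAT = 0.0001
VAR_STALK = 0.0025
MEAN_PROP = 0.8

for m in (1, 2, 3):
    for p in (10, 20, 30):
        cv = cv_proportion(VAR_QUADRAT, VAR_STALK, MEAN_PROP, m, p)
        print(f"{3 * m} m cut, {p} stalks: CV = {cv:.2f}%")
```

When the between stalks component dominates, as in these placeholder values,
adding stalks reduces the CV faster than adding quadrats.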
Note that p here represents the total number of stalks subsampled, not the number
of stalks subsampled per three metre quadrat. Similarly, the legend to Figure 1b
refers to the total number of stalks subsampled. From Figure 1b, it is clear that
the between stalk variation in the subsample is the more important determinant
of sampling precision for proportion of fresh weight in each component. Increasing
the number of stalks subsampled to 30 produces a larger reduction in CV than
does increasing the number of quadrats sampled. From a comparison of Figures