Professional Documents
Culture Documents
Original papers
a r t i c l e i n f o a b s t r a c t
Article history: Sugarcane is an important crop for tropical and sub-tropical countries. Like other crops, sugarcane agri-
Received 26 October 2015 cultural research and practice is becoming increasingly data intensive, with several modeling frameworks
Received in revised form 23 August 2016 developed to simulate biophysical processes in farming systems, all dependent on databases for accurate
Accepted 4 October 2016
predictions of crop production. We developed a computational environment to support experiments in
sugarcane agriculture and this article describes data acquisition, formatting, storage, and analysis. The
potential to support creation of new agricultural knowledge is demonstrated through joint analysis of
Keywords:
three experiments in sugarcane precision agriculture. Analysis of these case studies emphasizes spatial
Precision agriculture
Sugarcane
and temporal variations in soil attributes, sugarcane quality, and sugarcane yield. The developed compu-
Database tational framework will aid data-driven advances in sugarcane agricultural research.
Workflow Ó 2016 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.compag.2016.10.002
0168-1699/Ó 2016 Elsevier B.V. All rights reserved.
14 C. Driemeier et al. / Computers and Electronics in Agriculture 130 (2016) 13–19
workflows. WED-flow addresses the integration of three main flow The third experiment was conducted in a 10-ha commercial
paradigms: workflow, event-flow and data-flow. These three flow sugarcane field (Fig. 2c) that belongs to GRUPO USJ, a sugar and
paradigms combine the concepts of workflow composition, trans- ethanol plant located in Araras, São Paulo State in the southeast
actions (i.e., activities), events, and data states. At an abstract level, region of Brazil (22° 230 3800 S, 47° 180 0400 W). Sugarcane variety
the WED-flow is modeled as a SAGA (Garcia-Molina and Salem, SP80-3280 was planted in 2007 and was mechanically green har-
1987), composed from SAGA steps, each of which is enclosed in a vested along the cropping seasons. Liming and fertilization were
transaction. The main advantage of WED-flow approach is the inte- performed according to usual recommendations for sugarcane crop
gration of three main flow paradigms supporting modeling of anal- planting (Raij et al., 1997) at a fixed rate. For the subsequent ratoon
ysis workflows. More concretely, in this paper, we assume that: the crops (2009–2011), no fertilizers were applied. A total of 117 sam-
flow of work is a set of analysis steps; the flow of analysis event is a ple points were established on a 30 m regular grid to sample the
set of constraints (that once satisfied triggers the following analy- sugarcane quality parameters and physical and chemical soil attri-
sis step); and flow of data is a set of data states generated by each butes. Additional 14 refinement points were also taken. Plant sam-
analysis step. An analytical module was created within BDAgro by pling occurred just before the respective harvests at the peak of the
constructing a set of tables containing the data states resulting sugarcane maturity. The three sites (Fig. 2a–c) had been under con-
from analysis workflows. The statistical computing integrated to tinuous sugarcane cultivation for more than 30 years.
the database is performed with the R programming language. Annually the areas were harvested using sugarcane harvesters
equipped with auto-guidance system with a RTK signal (Trimble,
Navigation Ltd, Sunnyvale, CA, USA) and a yield monitor (Simpro-
3. Case studies
cana, Enalta, São Carlos, Brazil). Annually, after harvest, soil sam-
ples were taken again at same grid to diagnosis some deficiency
Data presented in this paper has been collected from 2007 to
and recommend fertilizer application using VRT when applicable
2014 in different sugarcane fields. Sugarcane is a semi-perennial
(field 1 and 2). At each field, the experiment has been conducted
crop, which typically grows in cycles of four to six years, being har-
for the whole sugarcane cycle, i.e. from planting, following through
vested and fertilized annually.
the consecutive ratoons, during 4 cycles.
Fig. 2. Photographs from land areas of experiments 1–3 (A-C). Grid points (black dots), grid refinement points (red crosses, A and C), and the track from the sugarcane
harvester (yellow lines in the rectangular area zoomed at the inset of (A) are shown). (For interpretation of the references to colour in this figure legend, the reader is referred
to the web version of this article.)
16 C. Driemeier et al. / Computers and Electronics in Agriculture 130 (2016) 13–19
collected from two soil layers: 0–0.20 and 0.20–0.40 m deep in removed by a filtering step. Any entry deviating from the mean
experiments 1 and 2; 0–0.20 and 0.20–0.50 m deep in experiment by more than three standard deviations (for a given attribute)
3. Sugarcane quality parameters (Fiber, Brix, and Pol) were mea- was treated as an outlier.
sured from sugarcane samples collected immediately before har- Following linearization and filtering, the next steps aim at rep-
vest. Sugarcane yield was measured by an on-the-go yield resenting all measurements of one experiment as a matrix of
monitor that determines sugarcane yield during the harvesting entries U = {uij}, where i is the index for grid points and j for mea-
operation (Magalhães and Cerri, 2007). In the database, yield is a sured attributes. Sampling at grid points straightforwardly gener-
type of static data associated with a characterization event. Each ates data in this matrix format. Nevertheless, in case of proximal
event of yield characterization is simultaneous to the intervention measurements, such as those performed in grid refinement points
event of harvest (Section 2.2). In this article, we analyze sugarcane (Fig. 2), uij is calculated as the average of the proximal points. On
yield, whereas we omit data from other on-the-go sensors (of the other hand, each attribute from on-the-go sensor is a function
apparent electrical conductivity, crop vegetation indexes). uj(x,y), where (x,y) are the spatial coordinates of the sensor track on
the field. The value of uij is estimated by employing linear regres-
sion to fit a plane to uj(x,y) points within a circle of 50 m diameter
3.3. Analysis workflow centered at the grid point i.
With attributes estimated at grid points, Moran0 s I spatial auto-
We developed a data analysis workflow aiming at providing correlation, Ij, is calculated by considering the connection matrix M
synthetic views of (i) spatiotemporal variability in each measured whose terms equal one for neighbor grid points, and zero other-
attribute and (ii) correlations among soil attributes that are pre- wise. The matrix L is obtained by normalizing M to have each
sumably associated with soil pH. The model of the analysis work- row summing to unit. Then, Ij = zjT L zj, where zj is the vector of
flow is sketched in Fig. 3, representing flow of analysis the mean-centered normalized attribute estimated at grid points,
(represented by boxes), flow of events (represented by arrows), j)/rj (Cliff and Ord, 1973).
zij = (uij u
and flow of data states (represented by databases). Note that Measured attributes vary widely concerning their range of val-
results of analysis (Fig. 3) are correlations, spatial autocorrelations, ues as well as their data acquisition technologies. Therefore, preci-
and outputs from principal component analysis (PCA). In essence, sion in attribute measurement also vary widely. Typical attribute
all these results are based on correlations, which would be detri- precisions were estimated as ‘‘noise” levels sj. For attributes deter-
mentally affected by nonlinearities and outliers. mined by sampling at grid points, sj was estimated as the standard
In order to avoid such issues, linearization and filtering steps deviation of measurements in proximal points. For attributes
are applied beforehand in the workflow. Linearization takes the obtained from on-the-go sensors, sj was estimated from the stan-
logarithm of concentrations of soil components (OM, macronutri- dard error of the linear regression coefficient that estimates uij.
ents and micronutrients), keeping other attributes unchanged. Principal Component Analysis (PCA) was employed to reduce
The logarithm scale reduced the positive skewness from concen- the dimensionality of the attribute space. Prior to PCA, missing
tration distributions and is additionally justifiable due to ubiquity data in matrix {uij} was imputed by the Expectation-
of linear relations with log (concentration) found in physical- Maximization (EM) algorithm associated with a multivariate nor-
chemistry (Atkins and Paula, 2010). Outliers in data sets are mal model (Johnson and Wichern, 2007), stopping EM iterations
when imputed uij change way less than sj. This imputation
approach preserves the data covariance structure, thus being well
suited as data preparation method for PCA. PCA was performed
from correlation matrixes, which means that each attribute con-
tributes with normalized, unit variance.
Fig. 4. Inter-year correlation against Moran0 s I spatial autocorrelation for attributes of soil chemistry (black; OM, P, K, B, Mg, S, Ca, Mn, Fe, Cu, and Zn), sugarcane quality (blue;
Fbr, Brix, Pol), and sugarcane yield (red; Yi). Results from experiments 1–3 are shown in plots A-C. Year of measurement is represented by the inclination of the labels, as
indicated in the legends. Distinct soil layers are represented by Bold (0–0.20 m) and Italic (either 0.20–0.40 m or 0.20–0.50 m) labels. Diagonal dashed lines (y = x) and
horizontal dotted line (zero inter-year correlation) are shown to guide the eyes. (For interpretation of the references to colour in this figure legend, the reader is referred to the
web version of this article.)
4.2. Spatiotemporal variability in soil attributes appreciable correlations and spatial autocorrelations are only occa-
sionally observed in one (experiment 3) out of three experiments
Statistical noise tends to reduce the magnitude of all types of suggests that significant spatiotemporal variability in sugarcane
correlations, including spatial autocorrelation and correlation quality may often remain undetected.
between sequential measurements of a given attribute. Fig. 4 plots
inter-year correlation against spatial autocorrelation Ij. The inter-
year correlation is an index for the temporal stability of the spatial
4.4. Spatiotemporal variability in sugarcane yield
variability of a given attribute. For attributes of soil chemistry
(black labels), a clear trend is observed in experiment 1 (Fig. 4A),
Yield is certainly a major aim in crop management. In other
with attributes clustered near the diagonal dashed line (y = x).
crops, yield spatial variations are commonly found to be rather
Indeed, inter-year correlation is positively correlated (r = 0.73)
stable across several harvests (Godwin et al., 2003). In sugarcane,
with spatial autocorrelation. This trend is consistent with inter-
however, we have been observing a lack of temporal stability in
year correlation and spatial autocorrelation being both substan-
yield spatial variability. In experiment 1, yield in 2014 has a posi-
tially reduced by statistical noise. The correlation between inter-
tive (0.2) correlation with yield in 2013. This level of correlation
year correlation and spatial autocorrelation is weaker but still pos-
locates yield close to the diagonal trend observed in Fig. 4A, which
itive in experiments 2 and 3 (r = 0.54, r = 0.32, respectively). This
would be consistent with temporal stability reduced mainly by sta-
result indicates that statistical noise is also a major factor reducing
tistical noise. On the other hand, yield in 2013 has a near zero cor-
inter-year correlation and spatial autocorrelations in experiments
relation with yield in 2012 (Fig. 4A). This yield data is quite below
2 and 3. In these two experiments, most attributes of soil chemical
the diagonal trend of Fig. 4A and the lack of inter-year correlation
attributes are above the diagonal dashed line (Fig. 4B and C), i.e., for
cannot be attributed to statistical noise. Even more extreme case is
most attributes inter-year correlation is greater than spatial auto-
observed in experiment 2, where correlation between 2012 and
correlation. This observation might result from inherently lower
2011 yields is negative (0.6).
spatial autocorrelations in experiments 2 and 3, due to the greater
A more detailed investigation indicated two major causes for
spacing between grid points in experiment 2, and perhaps due to
the zero or negative inter-year correlations in yield. One cause is
the more homogeneous, flat terrain of experiment 3.
occasional sensor failure. Readings from yield monitor are very
noisy (Fig. 5a), even after removal of outliers (Fig. 5b). The example
4.3. Spatiotemporal variability in sugarcane quality of Fig. 5b has mean yield of 93 Mg ha1 and standard deviation of
47 Mg ha1. Typical precision (sj) in yield is reduced to about 1–
In experiments 1 and 2, attributes of sugarcane quality have 4 Mg ha1 due to multiple readings used to estimate uij by regres-
spatial autocorrelation and inter-year correlation of approximately sion (Fig. 3). However, with such noisy signal, some sensor failures
zero (blue labels in Fig. 4A and B, respectively). This observation remain unfiltered, detrimentally affecting the precision of yield
indicates that sugarcane quality is consistent, at least approxi- estimates and reducing associated correlations. The second cause
mately, with random spatiotemporal variations. Experiment 3 of low inter-year correlations originates from real changes in yield,
shows a different behavior. In year 2011, Brix and Pol (quality attri- especially due to damages inflicted to the sugarcane crop. Harvest
butes associated with sugar content) have spatial autocorrelation operation may damage the roots of the sugarcane plants, occasion-
Ij 0.3 and correlation with previous year of 0.2 (Fig. 4C). These ally pulling roots out of the soil. The result of such damage is a
are significant correlations, demonstrating that within-field spatial localized yield reduction in the next harvests, thus reducing
variability and temporal stability of Brix and Pol are possible issues inter-year correlation. Such damages to roots may become visible
to be managed by PA techniques. Correlations of even higher mag- as gaps in the sugarcane field. Indeed, field observations by agrono-
nitudes are observed in sugarcane fiber contents (‘Fbr’ labels in mists have been suggesting that crop damaging during harvest is
Fig. 4C), reinforcing that spatiotemporal management of sugarcane the major cause for declining yield across the multiyear crop cycles
quality is possible, at least in principle. Nevertheless, the fact that (Zamykal and Everingham, 2009).
18 C. Driemeier et al. / Computers and Electronics in Agriculture 130 (2016) 13–19
Fig. 5. Reads from yield monitor before (a) and (b) after the filter step of the analysis workflow. Data from 2013 harvest of experiment 1.
Fig. 6. Loadings plot from principal component analysis of soil attributes presumably associated with soil pH (OM, Ca, Mg, and pH). Results from experiments 1–3 are shown
in plots A-C. Year of measurement is represented by the inclination of the labels, as indicated in the legends. Distinct soil layers are represented by Bold (0–0.20 m) and Italic
(either 0.20–0.40 m or 0.20–0.50 m) labels. Dotted lines are a pair of orthogonal axes rotated to have one axis aligned with the clusters of pH loadings (A and C).
4.5. PCA of attributes associated with soil pH clustering of loadings (Fig. 6B), which might be due to poorer
statistics arising from fewer grid points (Fig. 2).
The numerous soil and plant attributes exist in an abstract Experiments 1 and 3 also share the relative directions of loading
space of high dimensionality where visualization of the informa- clusters. The centroids of the pH clusters form an angle slightly
tion is impossible. Hence, reduction of dimensionality is often greater than 90° with the centroids of OM clusters
desirable and PCA is a statistical technique to do it. Fig. 6 presents (Fig. 6A and C). This relative direction indicates a tendency to have
the PC1& PC2 loadings from PCA applied to attributes judged to be slightly negative correlations between pH and OM, consistent with
potentially associated with soil pH. The attributes are pH itself, the decreasing pH due to higher levels of decomposing OM. Further-
main elements of lime (Mg and Ca, applied to soils to increase their more, the loading clusters of Ca and Mg are observed in-between
pH), and organic matter (OM, whose decomposition is thought to those of pH and OM (Fig. 6A and C). Proximity with loadings of
decrease soil pH). PCA was applied separately for each experiment. pH is consistent with Ca and Mg causing increases in soil pH. On
Analyzed attributes span sequential years as well as the two soil the other hand, the proximity between loadings of OM and of lime
layers that were characterized. Our interest in soil pH is due to elements (Mg and Ca) is consistent with OM acting as storage med-
the possibility of pH control using lime broadcasted on the soil sur- ium for Mg and Ca available in soils. These relative directions in
face with variable rate technology, which could reduce costs by loadings plots demonstrate the possibility of finding common
applying lime prescribed according to site-specific demands. behavior across multiple sugarcane agricultural experiments.
As a first observation, experiments 1 and 3 show loadings that
are clustered (Fig. 6A and C). In each experiment, there is one clus- 5. Conclusions
ter for pH, one for lime elements (Mg and Ca), and one for OM. Such
clustering indicates high correlation between soil top and bottom This paper described a computational environment to support
layers. The clustering also indicates that spatial variations of pH, research in sugarcane agriculture. Data acquisition, formatting,
Ca, Mg, and OM are mostly preserved along the years. This is and verification steps are performed prior to data insertion into a
instructive and perhaps surprising, considering that experiments dedicated database named BDAgro. Data analysis is integrated to
1 and 3 employed distinct strategies to control soil pH (Section 3.1). the database by recording data states generated through data anal-
Loadings of OM are comparatively more spread than loadings of ysis workflows. Such workflows can be tailored for the specific
pH, Ca, and Mg (Fig. 6A and C). This observation is consistent with aims of each study.
noisier OM measurements as well as with significant dynamics in The computational environment was employed to jointly ana-
spatial variability of soil OM. Experiment 2 does not present such lyze three experiments in sugarcane precision agriculture (PA).
C. Driemeier et al. / Computers and Electronics in Agriculture 130 (2016) 13–19 19
These experiments comprised characterization of soil attributes, Approach. In: Meersman, R., Dillon, T., Herrero, P. (Eds.), On the Move to
Meaningful Internet Systems: Otm 2010, PT I, Hersonissos, Greece, pp. 150–167.
sugarcane quality, and sugarcane yield. Our analysis showed low
Garcia-Molina, H., Salem, K., 1987. Sagas. In: ACM SIGMOD International Conference
spatial autocorrelations and low inter-year correlations for most on Management of Data. San Francisco, pp. 249–259.
of the sugarcane quality and yield attributes. This finding revealed Gebbers, R., Adamchuk, V.I., 2010. Precision agriculture and food security. Science
important limitations in measurements of sugarcane quality, while 327, 828–831. http://dx.doi.org/10.1126/science.1183899.
Godwin, R.J., Wood, G.A.A., Taylor, J.C., Knight, S.M., Welsh, J.P., 2003. Precision
yield seems to be primarily impaired by crop damages inflicted by farming of cereal crops : a review of a six year experiment to develop
the mechanical harvest processes, such as ratoon damage and soil management guidelines. Biosyst. Eng. 84, 375–391. http://dx.doi.org/10.1016/
compaction. Furthermore, our multivariate analysis of features S1537-5110(03)00031-X.
IRENA, 2015. Renewable energy capacity statistics 2015. Irena 44. http://dx.doi.org/
associated with soil pH revealed common behavior in two experi- 10.1016/j.renene.2014.09.059.
ments, evidencing the possibility of common underlying principles Johnson, R.A., Wichern, D.W., 2007. Statistical Analysis. Pearson - Prentice Hall,
to be identified across multiple sugarcane agricultural environ- Upper Saddle River.
Magalhães, P.S.G., Cerri, D.G.P., 2007. Yield monitoring of sugar cane. Biosyst. Eng.
ments. Overall, these case studies demonstrate the usefulness of 96, 1–6. http://dx.doi.org/10.1016/j.biosystemseng.2006.10.002.
the computational environment for supporting research in sugar- Marin, F.R., Jones, J.W., Royce, F., Suguitani, C., Donzeli, J.L., Filho, W.J.P., Nassif, D.S.
cane agriculture, which will positively impact the sustainability P., 2011. Parameterization and evaluation of predictions of DSSAT/CANEGRO for
Brazilian sugarcane. Agron. J. 103, 304. http://dx.doi.org/10.2134/
and competitiveness of the sugar-ethanol industry. agronj2010.0302.
Mulla, D.J., 2013. Twenty five years of remote sensing in precision agriculture: key
Acknowledgements advances and remaining knowledge gaps. Biosyst. Eng. 114, 358–371. http://dx.
doi.org/10.1016/j.biosystemseng.2012.08.009.
Pontes, A.O., Ling, L.Y., Sanches, G.M., Magalhães, P.S.G., Ferreira, J.E., Driemeier, C.E.,
Project funded by CNPq and Fundação de Amparo à Pesquisa do 2014. BDAgro – CTBE Database of Agricultural Experiments, Technical
Estado de São Paulo (FAPESP grants 2011/01817-9 and Memorandum, MeT 11. Campinas. available at <https://forms.cnpem.
2015/01587-0). br/formularios/prodbiblio/DB/2338/MeT%20112014.pdf>.
Portz, G., Molin, J.P., Jasper, J., 2011. Active crop sensor to detect variability of
nitrogen supply and biomass on sugarcane fields. Precis. Agric. 13, 33–44.
References http://dx.doi.org/10.1007/s11119-011-9243-4.
van Raij, B., Cantarella, H., Quaggio, J.A., Furlani, A.M.C., 1997. Recomendações de
Anselmi, A.A., Bredemeier, C., Federizzi, L.C., Molin, J.P., 2014. Factors related to adubação e calagem para o Estado de São Paulo. Instituto Agronômico,
adoption of precision agriculture technologies in southern Brazil. In: ISPA (Ed.), Fundação IAC, Campinas. 285p.
Proc. of the 12th International Conference on Precision Agriculture. ISPA, Rodrigues Jr., F.A., Magalhães, P.S.G., Franco, H.C.J., 2012. Soil attributes and leaf
Sacramento, California, p. 11. nitrogen estimating sugar cane quality parameters: brix, pol and fibre. Precis.
Atkins, P., Paula, J. de, 2010. Physical Chemistry. Oxford University Press, Oxford. Agric. 14, 270–289. http://dx.doi.org/10.1007/s11119-012-9294-1.
Avanzi, J.C., Borghi, E., Bortolon, L., Bortolon, E.S.O., Luchiari Jr., A., Inamasu, R.Y., Silva, C.B., de Moraes, M.A.F.D., Molin, J.P., 2011. Adoption and use of precision
Bernardi, A.C.C., 2014. Adoption Level of Precision Agriculture for Brazilian agriculture technologies in the sugarcane industry of São Paulo state, Brazil.
Farmers - 2011/2012 crop year. In: ISPA (Ed.), Proc. of the 12th International Precis. Agric. 12, 67–81. http://dx.doi.org/10.1007/s11119-009-9155-8.
Conference on Precision Agriculture. ISPA, Sacramento, California, pp. 1–2. Song, X., Wang, J., Huang, W., Liu, L., Yan, G., Pu, R., 2009. The delineation of
Bramley, R., Trengove, S.A.M., 2013. Precision agriculture in Australia: present agricultural management zones with high resolution remotely sensed data.
status and recent development. Eng. Agric. 33, 575–588. Precis. Agric. 10, 471–487. http://dx.doi.org/10.1007/s11119-009-9108-2.
Bramley, R.G.V., 2009. Lessons from nearly 20 years of Precision Agriculture Srinivasan, A., 2006. Precision Agriculture: An overview. Handbook of Precision
research, development, and adoption as a guide to its appropriate application. Agriculture, pp. 3–18.
Crop Pasture Sci. 60, 197–217. http://dx.doi.org/10.1071/CP08304. Tagarakis, A., Liakos, V., Fountas, S., Koundouras, S., Gemtos, T.A., 2012.
Chen, N., Zhang, X., Wang, C., 2015. Integrated open geospatial web service enabled Management zones delineation using fuzzy clustering techniques in
cyber-physical information infrastructure for precision agriculture monitoring. grapevines. Precis. Agric. 14, 18–39. http://dx.doi.org/10.1007/s11119-012-
Comput. Electron. Agric. 111, 78–91. 9275-4.
Cliff, A.D., Ord, J.K., 1973. Spatial Autocorrelation. Pion, London. Webster, R., Oliver, M.A., 2007. Geostatistics for Environmental Sciences. John Wiley
CONAB, Ministério da Agricultura, Abastecimento, P., 2015. SAFRA 2015/16 - & Sonns, Ltd. http://dx.doi.org/10.1002/9780470517277.
Terceiro levantamento. Whelan, B.M., McBratney, A.B., 2000. The ‘‘null hypothesis” of precision agriculture
Elmasri, R., Navathe, S.B., 2010. Fundamental of Database Systems. Addison- management. Precis. Agric. 2, 265–279.
Wesley. Zamykal, D., Everingham, Y.L., 2009. Sugarcane and Precision Agriculture:
FAO, Food and Agricultere Organization of the United Nations, 2015. Statistics Quantifying Variability Is Only Half the Story – A Review. In: Lichtfouse, E.
Division – FAOSTAT) [www Document]. URL http://faostat3.fao.org (accessed (Ed.), Climate Change, Intercropping, Pest Control and Beneficial
10.1.15). Microorganisms. Springer, Netherlands, Dordrecht, pp. 189–218. http://dx.doi.
Ferreira, J., Takai, O., Malkoviski, S., Pu, C., 2010. Reducing Exception Handling org/10.1007/978-90-481-2716-0.
Complexity in Business Process Modeling and Implementation: The WED-Flow