biosystems engineering 106 (2010) 97–102


Research Paper

Identification of the cleaning process on combine harvesters, Part II: A fuzzy model for prediction of the sieve losses
Geert Craessaerts a,*, Wouter Saeys a, Bart Missotten b, Josse De Baerdemaeker a
a Department of Biosystems, Katholieke Universiteit Leuven, Kasteelpark Arenberg 30, B-3001 Leuven, Belgium
b CNH (Case New Holland) Belgium N.V., Leon Claeysstraat 3A, B-8210 Zedelgem, Belgium

article info

Article history: Received 26 November 2007. Received in revised form 20 November 2009. Accepted 23 November 2009. Published online 18 April 2010.

The driving tasks of combine harvester operators can be described as being very exhausting because of the high environmental temperatures, long working days and time constraints operators have to deal with. In order to lighten the job, automation is being introduced. In this study, the focus lies on the automation of the cleaning unit. A nonlinear prediction model for the sieve losses has been established by means of fuzzy modelling techniques. Excessive sieve losses can be predicted by making use of a differential pressure measurement under the upper sieve section and the derived sieve load signal. The latter signal is a measure for the loadings by grain and chaff on the upper sieve and can thus be linked with the aeration level of the upper sieve section. Validation results revealed that the best position for mounting a pressure sensor is at the rear section of the upper sieve. Grain loss kernels will be blown out by the fan when the loadings on the upper sieve are very low, while a high loading on the upper sieve will result in sieve overload and thus very high sieve losses.

© 2009 IAgrE. Published by Elsevier Ltd. All rights reserved.



Introduction

Developments of new rotational threshing and separation systems in modern combine harvesters have resulted in higher crop throughputs. As a result, the performance of the grain cleaning system has to be improved because of the larger amount of material other than grain (MOG) in the grain–chaff mixture which is deposited on the cleaning system by the rotary threshing unit. Overloaded sieves should be avoided in order to keep the sieve losses within acceptable limits. From an economic point of view, grain losses at harvesting time result in a direct loss of income for the farmer. Acceptable limits for the losses of small grains such as wheat, barley and oats are often placed at 3% of total yield, being the harvest yield plus the losses, but this can vary between farms. It is usually very difficult to reduce total losses below 1–2%, so the operator must make a trade-off based on the value of the crop, the cost of combine harvesting and the time available for harvesting, which is influenced by climate conditions. Some harvest losses are unavoidable because the grain has to be properly cleaned within the available time. Therefore, an acceptable compromise has to be sought between the material throughput (feedrate), the level of sieve losses, and the proportion of impurities in the grain bin (Srivastava et al., 1993).

Another challenge the operator of a combine harvester faces is the intra- and inter-field variability. Although it has been known for a long time that yield and crop properties (like moisture, grain/straw ratio) can vary within and between fields due to variation in soils, years and soil/year interactions (Mercer and Hall, 1911), a field is usually assumed to be a homogeneous unit and uniformly cultivated (Joernsgaard and Halmoe, 2003). Due to a lack of time, the combine harvester operator usually estimates the most appropriate settings only once, when entering the field, and then applies these settings throughout the field, thereby neglecting the differences in temporal and site-specific conditions. Consequently, manual control usually fails to fully exploit the capabilities of the combine harvester.

In this study, the focus lies on the automation of the cleaning shoe. In Craessaerts et al. (2008) a prediction model for the MOG content in the grain bin was established by means of fuzzy modelling techniques. This led to a physical insight into the main causes of low/high impurity levels in the grain bin. However, as mentioned earlier, the performance of the cleaning shoe is determined by the combination of the proportion of MOG in the grain bin and the sieve loss level. In Craessaerts et al. (2007), a test machine was equipped with extra sensors in order to extract valuable information concerning the cleaning section operation. A non-linear genetic polynomial regression technique was proposed to rank the candidate input variables as possible regressor variables for the prediction of the sieve losses. It was found that sieve losses are affected in a non-linear way by differences in the pressure profile of the cleaning section and the upper sieve opening. Here the selected regressor variables will be used to establish a non-linear sieve loss prediction model by means of fuzzy modelling techniques in order to gain insight into the main causes of high and low sieve loss levels.

* Corresponding author. E-mail address: (G. Craessaerts).
1537-5110/$ – see front matter © 2009 IAgrE. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.biosystemseng.2009.11.009

Fig. 1 – Configuration of a cleaning section of a combine harvester and the four pressure sensors on the cleaning section with their corresponding names.


Materials and methods

2.1. Estimation of the sieve load by a pressure measurement
Different differential pressure sensors were placed under the upper sieve section in order to measure the loadings on the upper sieve and thus establish the aeration level of the upper sieve section. Their configuration is shown in Fig. 1. A drawback of these standard pressure signals is that they are influenced by the settings of the cleaning section, such as fan speed and lower and upper sieve openings. In order to make these pressure signals dependent solely on the amount of crop material on the sieve section and independent of the cleaning section settings, a correction algorithm was developed by Craessaerts et al. (2008). The corrected pressure signals are correlated with the aeration level of the cleaning shoe, from which the cleaning shoe efficiency can be predicted. Different steps were executed to determine the corrected pressure signals.

Firstly, the signals of the pressure sensors were acquired for zero biomass load and for different combinations of the cleaning section settings: fan speed, and lower and upper sieve opening. This resulted in a calibration curve for the relation between the upper sieve opening and the pressure sensor signals for each combination of lower sieve opening and fan speed. An example of such a calibration curve for the front right section of the upper sieve, a fan speed of 800 rpm and a lower sieve opening of 8 mm is shown in Fig. 2. It can be seen that reducing the upper sieve opening resulted in an increase of the measured total pressure, originating from an increase in the static pressure. When the cleaning section of the combine harvester is loaded with grain, chaff and straw, this can be interpreted as reducing the size of the holes on the upper sieve section. An increase in the amount of biomass on the upper sieve section can thus be simulated by decreasing the upper sieve opening. Since the upper sieve openings can only be set between 0 and 20 mm, an extrapolation of the linear part of the measured calibration curve in the negative direction was needed to cover pressure ranges above the pressure obtained with fully closed sieve louvres.

Secondly, when the combine harvester was loaded with grain, chaff and straw, the raw pressure signal was measured and the corresponding 'virtual upper sieve opening' (i.e. an equivalent sieve opening) was determined by making use of the calibration curve for the applied fan speed and lower sieve opening. This curve was taken from the database of calibration curves resulting from the above-mentioned experiments. The virtual upper sieve opening was the calculated upper sieve opening that corresponded to the measured pressure signal and it could therefore be negative. The virtual upper sieve opening was then subtracted from the real upper sieve



Table 1 – Overview of the different machine settings during wheat harvest (2005–2006) and tests on the stationary test rig (2007).

Machine setting        Minimum        Maximum            Interval
Lower sieve opening    5 mm           11 mm              2 mm
Upper sieve opening    11 mm          19 mm              2 mm
Fan speed              450 rpm        900 rpm            150 rpm
Capacity               10⁴ kg h⁻¹     4 × 10⁴ kg h⁻¹     10⁴ kg h⁻¹

discussion of the fuzzy model identification technique can be found in Craessaerts et al. (2008).

Fig. 2 – Estimation of the virtual upper sieve opening and sieve load by means of a pressure measurement. The calibration curve shows the relationship between the upper sieve opening and the pressure measured by the sensor which is placed at the front right section of the upper sieve, when the fan speed was set at 800 rpm and the lower sieve opening was set at 8 mm. The solid line indicates the points which are typically measurable on the machine by changing the upper sieve opening between 0 mm and maximum opening. The dotted line in the left half of the graph indicates the extrapolation of the linear part of the solid line calibration curve.


Experimental set-up

opening to obtain a difference L, which can be used as a measure of the biomass load on the upper sieve section. For a given raw pressure signal Praw, the sieve load L will be smaller if the upper sieve opening is smaller and vice versa.
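The two-step sieve load estimation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the calibration values and the choice of the first three points as the "linear part" of the curve are all hypothetical, and the calibration curve is assumed to decrease strictly with increasing sieve opening.

```python
import numpy as np


def virtual_opening(p_raw, openings_mm, pressures, n_linear=3):
    """Equivalent ('virtual') upper sieve opening for a raw pressure reading.

    openings_mm / pressures are zero-load calibration points for one fixed
    combination of fan speed and lower sieve opening.
    """
    openings = np.asarray(openings_mm, dtype=float)
    pressures = np.asarray(pressures, dtype=float)
    order = np.argsort(openings)
    openings, pressures = openings[order], pressures[order]

    if p_raw <= pressures[0]:
        # Inside the measured range. Pressure falls as the sieve opens, so
        # both arrays are reversed to give np.interp increasing x values.
        return float(np.interp(p_raw, pressures[::-1], openings[::-1]))

    # Above the pressure of the fully closed sieve: extend the linear part of
    # the curve (assumed here to be the first n_linear points) to negative
    # openings.
    slope, intercept = np.polyfit(openings[:n_linear], pressures[:n_linear], 1)
    return float((p_raw - intercept) / slope)


def sieve_load(p_raw, real_opening_mm, openings_mm, pressures):
    """Sieve load L = real upper sieve opening - virtual upper sieve opening."""
    return real_opening_mm - virtual_opening(p_raw, openings_mm, pressures)


# Hypothetical calibration curve (not measured data): opening in mm, pressure in V.
CAL_OPENINGS = [0.0, 5.0, 10.0, 15.0, 20.0]
CAL_PRESSURES = [4.0, 3.5, 3.0, 2.7, 2.5]

# Same real opening (14 mm), two different raw pressure readings.
light_load = sieve_load(3.25, 14.0, CAL_OPENINGS, CAL_PRESSURES)
heavy_load = sieve_load(4.50, 14.0, CAL_OPENINGS, CAL_PRESSURES)
print(light_load, heavy_load)
```

In this toy example the higher raw pressure maps to a negative virtual opening and therefore to a larger sieve load, which is the behaviour the extrapolated part of the calibration curve is meant to provide.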


Non-linear fuzzy identification

Earlier research (Berner and Grobler, 1986; Craessaerts et al., 2007) pointed out that the quality of the cleaning process is affected non-linearly by the settings of the cleaning unit. A common drawback of most standard modelling and control techniques is that they cannot make effective use of extra information, such as the knowledge of experienced engineers and operators, which is often imprecise and qualitative in nature. Fuzzy control methodology serves as a suitable framework for incorporating human and deductive knowledge. It combines the advantages of the white-box and black-box approaches, so that the known parts of the system are modelled using physical knowledge, and the unknown or less certain parts are approximated in a black-box manner, using process data. This is an important advantage when considering automation of agricultural machinery since much process knowledge is gained from experience. Therefore, non-linear Takagi–Sugeno fuzzy modelling techniques were used to model the relationship between the potential regressor variables and the output of the process, the total amount of sieve losses. This technique decomposes a non-linear relationship between the regressor and output variables into different local linear relations which can be easily interpreted (Babuska, 1998). Standard linear or fuzzy control design tools can be applied in a subsequent phase. A more in-depth

The tests performed in this study are similar to the tests performed in Craessaerts et al. (2008). The behaviour of the cleaning section was identified during the harvest seasons of 2005 and 2006 on different wheat fields in Belgium. The fields were harvested with a New Holland CX combine harvester. Each test field was divided into different strips of approximately 200–300 m length and every strip was harvested with a specific combination of machine settings (lower sieve opening, upper sieve opening and fan speed). Machine speed was kept constant at 4 km h⁻¹, but small variations in incoming biomass may have occurred, mainly due to variations in crop density. The database from 2005 and 2006 was extended with extra measurements, carried out on a Case New Holland stationary test rig at the beginning of 2007. Performing tests on a stationary test rig has the advantage that the amount of incoming biomass can be kept constant during a test run, whereas small changes in crop density can occur during field tests. During stationary tests, the amount of incoming biomass was varied between 1 and 4 × 10⁴ kg h⁻¹ grain. An overview of the range and intervals for the different machine settings is given in Table 1. In total 55 stationary regimes were monitored during the different test periods. Based on the results of the input selection study (Craessaerts et al., 2007), and physical insights into the cleaning process, a selection was made of the most important sensors in order to predict the sieve loss level. The performance of the cleaning system was monitored by means of three commercial sieve loss sensors, placed in cascade, one after the other. It was expected that this configuration could monitor not only the traditional sieve overloads, which are measured by a standard commercial sieve loss sensor, but also the grain kernels that are blown out by the fan. Their placement at the end of the sieve section is shown in Fig. 3.
A refined set of sensors, listed in Table 2, was monitored for each of the possible machine setting combinations. The sensor data were monitored and logged by a National Instruments PXI real-time hardware system with Labview®¹ software. Data collection was initiated approximately 30 s after the biological material started entering the machine to make sure that the combine harvesting process was at equilibrium. At the pre-processing stage, an average of each signal was taken for each stationary run, because the corresponding signals remained more or less constant during these runs.
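The pre-processing step can be illustrated with a short Python sketch. It assumes, purely for illustration, that a log starts when the material enters the machine and that the 30 s settling period is discarded in software; the function name `stationary_mean` and the synthetic signal are hypothetical.

```python
import numpy as np


def stationary_mean(time_s, signal, settle_s=30.0):
    """Mean of a logged signal after discarding the first settle_s seconds."""
    time_s = np.asarray(time_s, dtype=float)
    signal = np.asarray(signal, dtype=float)
    keep = time_s >= time_s[0] + settle_s
    if not keep.any():
        raise ValueError("run is shorter than the settling time")
    return float(signal[keep].mean())


# Synthetic run (not measured data): the signal ramps up for 30 s,
# then stays constant for the rest of the stationary regime.
t = np.arange(0.0, 90.0, 1.0)
load_rl = np.where(t < 30.0, t / 30.0 * 12.0, 12.0)
run_average = stationary_mean(t, load_rl)
print(run_average)
```

Each stationary regime is thereby reduced to one value per signal, which gives one data point per regime for model identification.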

¹ Labview, National Instruments, Austin, Texas, USA.



Table 2 – Overview of the measurement signals that are related to the cleaning process on a conventional combine harvester and their symbolic notation.

Signal (unit)                                                  Symbol
Lower sieve opening (mm)                                       X1
Upper sieve opening (mm)                                       X2
Fan speed (rpm)                                                X3
Pressure sieve rear right (V) / load rear right (load RR)      X4
Pressure sieve front right (V) / load front right (load FR)    X5
Pressure sieve rear left (V) / load rear left (load RL)        X6
Pressure sieve front left (V) / load front left (load FL)      X7
Sieve losses                                                   Yout

Fig. 3 – Configuration of the cascade sieve loss sensor.

encompasses the system response when the amounts of chaff and grain at the rear section of the upper sieve are high. Each cluster corresponds to a local linear regime of the form:

\[ \text{Sieve Loss} = (a \times \text{Load RL}) + b \qquad (1) \]

The total data set, which consists of 55 stationary regimes, was randomly split into a training set (2/3) and a validation set (1/3). Different fuzzy model structures were evaluated. The parameters of the non-linear fuzzy models were estimated from the training data set using least-squares techniques. The evaluation of the different fuzzy model structures was based on the prediction performance on the validation set. This whole procedure of parameter estimation on a training set and validation on a separate set was repeated forty times in order to reduce the influence of the random choice of training and validation sets.
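The repeated random split can be sketched as follows. This is a minimal illustration, not the procedure used in the study: a single straight line fitted by least squares stands in for the fuzzy model, the data are synthetic, and the names `r2`, `repeated_holdout`, `fit_line` and `predict_line` are hypothetical.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination on a held-out set."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def repeated_holdout(x, y, fit, predict, n_repeats=40, train_frac=2 / 3, seed=0):
    """Average validation R2 over repeated random training/validation splits."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * len(y)))
    scores = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(y))
        train, val = idx[:n_train], idx[n_train:]
        model = fit(x[train], y[train])
        scores.append(r2(y[val], predict(model, x[val])))
    return float(np.mean(scores)), scores


def fit_line(x, y):
    # Stand-in model: one global straight line (NOT a fuzzy model).
    return np.polyfit(x, y, 1)


def predict_line(coeffs, x):
    return np.polyval(coeffs, x)


# Synthetic data with 55 points, mirroring only the size of the data set.
data_rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 55)
y = 3.0 * x + 1.0 + 0.05 * data_rng.standard_normal(55)

mean_r2, all_r2 = repeated_holdout(x, y, fit_line, predict_line)
print(round(mean_r2, 3), len(all_r2))
```

Any model that exposes a fit and a predict function can be passed in, so different model structures can be compared on the same averaged validation score.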


Results and discussion

The average validation results (R²_val) of the different non-linear model structures are listed in Table 3. The best results (R²_val = 0.72–0.85) were obtained with a non-linear model which consisted of only one input variable (load rear left). For the determination of the optimal number of clusters or local linear submodels, the use of cluster validity measures was proposed by Babuska (1998). These validity measures will, however, only work well when the data are randomly sampled and evenly distributed over the continuous data space. Since different discrete fan speed levels have been applied in this study, it is not advisable to select the number of clusters based on these validity measures. Therefore, the optimal number of clusters was chosen based on the validation performance of the different models. Best results (R²_val = 0.85) were obtained for an input data space consisting of one input variable (load rear left) subdivided into three local linear submodels. The partitioning results of the 1-D antecedent space by the fuzzy Gustafson–Kessel algorithm, as described in Höppner et al. (1999), into three different clusters and their corresponding membership functions are shown in Fig. 4. A physical interpretation can be given to the obtained partition. The first cluster characterises the system response when the loadings on the upper sieve by chaff and grain are low. The second cluster corresponds to a process regime when loadings on the upper sieve are medium. The third and last cluster

The values for the consequent parameters of the resulting local linear models (a, b) were estimated by standard least-squares techniques and are listed in Table 4. The prediction of sieve losses by the global sieve loss model, which is a linear combination of the local linear models [Eq. (1)], is shown in Fig. 5. The validation results for this 'best' model are shown in Fig. 6, where the predicted sieve losses are plotted against the real sieve loss level for different repetitions of training and validation. Although only validation results for wheat harvest are shown in this study, it should be mentioned that the general model structure of Eq. (1) is applicable to other crops such as barley, corn, oats, and rapeseed. The optimum number of clusters and corresponding model parameters might, however, differ slightly between harvesting conditions because of the differences in process gain when harvesting different crops.

The main advantage of making use of fuzzy modelling techniques is the physical interpretability of the resulting local linear models. This will allow the data-based fuzzy model derived in this study to be extended using heuristic rules from experienced operators and engineers. Interpretation of the consequent parameters reveals that the three local regimes have different characteristics, which can be interpreted as follows:
- Work regime 1 corresponds to the system response when the load of grain and chaff on the upper sieve is too low because of an excessively high fan speed for a given biomass throughput. As a result, the aerodynamic forces on the grain kernels are too high and they are blown out. The resulting sieve losses are high. An increase in the load on the upper sieve will result in lower sieve losses.
- Work regime 2 corresponds to the normal and/or optimal machine response. Grain losses are low because of the optimum combination of settings for the cleaning unit.
- Work regime 3 corresponds to the system response when the loadings on the upper sieve become too high, mainly due to a low fan speed for a given biomass throughput which results in a badly aerated material layer and thus



Table 3 – Evaluation of the different model structures on a validation set.

R² validation, by number of clusters

Model input variables    2       3       4       5       6       7       8
X4                       0.74    0.76    0.73    0.68    0.69    0.64    0.62
X5                       0.76    0.8     0.77    0.76    0.75    0.68    0.63
X6                       0.78    0.85    0.79    0.78    0.79    0.78    0.72
X7                       0.4     0.5     0.41    0.46    0.4     0.33    0.38
X3, X4                   0.41    0.65    0.72    0.65    0.6     0.46    0.39
X3, X5                   0.5     0.82    0.78    0.64    0.5     0.5     0.31
X3, X6                   0.48    0.82    0.73    0.67    0.65    0.55    0.3
X3, X7                   0.29    0.64    0.58    0.55    0.44    0.35    0.17
X2, X4                   0.42    0.62    0.62    0.52    0.4     0.34    0.28
X2, X5                   0.49    0.64    0.64    0.58    0.37    0.36    0.23
X2, X6                   0.75    0.79    0.71    0.66    0.57    0.49    0.38
X2, X7                   0.26    0.42    0.4     0.29    0.27    0.21    0.12
X1, X4                   0.54    0.54    0.47    0.38    0.26    0.18    0.13
X1, X5                   0.51    0.6     0.45    0.34    0.29    0.19    0.18
X1, X6                   0.55    0.59    0.5     0.45    0.44    0.28    0.12
X1, X7                   0.52    0.44    0.49    0.31    0.29    0.14    0.1
X1, X3, X4               0.44    0.45    0.35    0.3     0.12    0.04    0.01
X1, X3, X5               0.55    0.62    0.57    0.29    0.3     0.19    0.01
X1, X3, X6               0.51    0.5     0.54    0.37    0.28    0.13    0.02
X1, X3, X7               0.46    0.46    0.39    0.25    0.2     0.09    0.02
X2, X3, X4               0.54    0.57    0.5     0.4     0.34    0.17    0.04
X2, X3, X5               0.5     0.74    0.66    0.48    0.32    0.21    0.07

high sieve losses. High biomass loadings can also occur if there is too narrow a sieve opening for a given biomass throughput. This work regime is characterised by a high static process gain such that a small increase in sieve load leads to a high increase in sieve losses. It should be noted that excessive amounts of sieve losses can be predicted by making use of a differential pressure measurement under the upper sieve section and the derived sieve load signal. Validation results revealed that the best position for placement of a pressure sensor was at the rear section of the upper sieve. These findings can be linked to the observation that sieve overloading can be predicted based on the throughput

distribution across the sieve length (L) measured using several grain separation sensors, as was reported by Somes (1982). Once this distribution is measured, an estimate of the grain loss can be found by integrating the tail of the distribution.
\[ \text{Loss} = \int_{L}^{+\infty} P(x,t)\,\mathrm{d}t \]
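As a rough numerical illustration of integrating the tail of a separation distribution, the sketch below fits an exponential decay to separation rates measured at a few positions along the sieve and integrates it analytically beyond the sieve length. The exponential shape, the integration over position, the function name `tail_loss` and all numbers are assumptions made here for illustration; they are not taken from Somes (1982) or from the present study.

```python
import numpy as np


def tail_loss(positions_m, separation_rate, sieve_length_m):
    """Integrate a fitted exponential separation profile beyond the sieve end.

    Assumed model: P(x) = A * exp(-k * x), so that the tail integral from
    sieve_length_m to infinity equals (A / k) * exp(-k * sieve_length_m).
    """
    x = np.asarray(positions_m, dtype=float)
    p = np.asarray(separation_rate, dtype=float)
    slope, log_a = np.polyfit(x, np.log(p), 1)  # log-linear least squares
    k = -slope
    if k <= 0:
        raise ValueError("separation rate does not decay along the sieve")
    return float(np.exp(log_a) / k * np.exp(-k * sieve_length_m))


# Hypothetical sensor positions (m) and separation rates (arbitrary units).
sensor_x = np.array([0.2, 0.6, 1.0, 1.4])
sensor_rate = 10.0 * np.exp(-2.0 * sensor_x)
estimated_loss = tail_loss(sensor_x, sensor_rate, sieve_length_m=1.5)
print(estimated_loss)
```

A slowly decaying profile leaves a larger tail beyond the sieve end, which corresponds to a large fraction of grain still being present at the rear of the sieve.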



A high separation rate at the rear section of the upper sieve means that a large fraction of grain kernels is still present at the rear section of the upper sieve and thus will result in a high sieve load. A drawback of measuring the separation rate by grain impact sensors is that the transport trajectory of the grain kernels is interrupted and thus accumulation of chaff, grain and short straw material can occur at these positions. This is not the case with sensitive differential pressure measurement. Another advantage of pressure measurement is that it allows the distinction to be made between a working regime where the kernels are blown out and the normal and/or optimal working regime. Considering these results and the results reported by Craessaerts et al. (2008), it can be concluded that this sieve load signal calculated from pressure measurements provides valuable information for automatic control of the cleaning unit.

Fig. 4 – Partitioning of the 1-D antecedent input space into three clusters by the fuzzy Gustafson–Kessel clustering algorithm; the numbers correspond to the three work regimes that have been identified.

Table 4 – Estimated values for the consequent parameters a, b of the three local linear submodels.

             a        b
Cluster 1    −4.4     78.1
Cluster 2    0.05     19.1
Cluster 3    15.4     −481.8



Fig. 5 – Response of the best fuzzy model to predict the total amount of sieve losses. The best model consists of one input variable, load rear left on the upper sieve, and is subdivided into three different local linear regimes. The fuzzy model is shown by the solid (red) line and the data points by the (blue) dots. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6 – Comparison of the measured sieve loss level versus the predicted sieve loss level with a validation set based on the best model structure using 1 input variable (load rear left) and 3 clusters. The validation results of different repetitions (40) of a random selection of training and validation sets are shown.

Conclusions and future research

The relationship between the settings of the cleaning unit and the sieve loss has been studied in detail. Sieve losses are an important quality parameter for the cleaning unit besides MOG content. It was found that the sieve losses can be predicted based on a differential pressure measurement under the upper sieve section and the sieve load signal derived from this. A non-linear fuzzy model was built which relates the sieve load signal to the sieve losses. Best validation results (R²_val = 0.85) were obtained for a non-linear model consisting of three local linear regimes. Two suboptimum working regimes can be distinguished from the optimal and normal working regime by monitoring the sieve load signal. Grain loss kernels will be blown out by the fan when the loadings on the upper sieve are too low, whilst too high a loading on the upper sieve will result in sieve overload and thus very high sieve loss levels. In a following paper, the development of a cleaning automation system using the knowledge from the data-based fuzzy models, described in this paper and in Craessaerts et al. (2008), in combination with the knowledge of experienced operators will be described.

Acknowledgements

The authors gratefully acknowledge the I.W.T.-Flanders (Institute for the Promotion of Innovation through Science and Technology, I.W.T.-Vlaanderen) for the financial support through project 'Automatische regeling van het reinigingsproces in maaidorsers', project number 020750, and CNH Belgium for its cooperation. Wouter Saeys is funded as a Postdoctoral Fellow of the Research Foundation – Flanders (FWO). Bart Missotten is with CNH Belgium.

References

Babuska R (1998). Fuzzy Modeling for Control. Kluwer Academic Publishers, Boston.
Berner D; Grobler W H (1986). Gesteuerte adaptive regelung einer mähdrescherreinigungsanlage. Landtechnik, 36.
Craessaerts G; Saeys W; Missotten B; De Baerdemaeker J (2007). A genetic input selection methodology for identification of the cleaning process on a combine harvester, Part I: selection of relevant input variables for identification of the sieve losses. Biosystems Engineering, 98, 166–175.
Craessaerts G; Saeys W; Missotten B; De Baerdemaeker J (2008). Identification of the cleaning process on combine harvesters Part I: A fuzzy model for prediction of the material other than grain content (MOG) in the grain bin. Biosystems Engineering, 101(1), 42–49. doi:10.1016/j.biosystemseng.2008.05.016.
Höppner F; Klawonn F; Kruse R; Runkler T (1999). Fuzzy cluster analysis. Wiley, Chichester.
Joernsgaard B; Halmoe S (2003). Intra-field yield variation over crops and years. European Journal of Agronomy, 19, 23–33.
Mercer W B; Hall A D (1911). The experimental error in field trials. Journal of Agricultural Science, 4, 107–132.
Somes (1982). Absolute grain loss monitor. United States Patent, Patent number 4,360,998.
Srivastava A K; Goering C E; Rohrbach R P; Buckmaster D R (1993). Engineering Principles of Agricultural Machines. ASABE, Michigan.