
Smart Agricultural Technology 3 (2023) 100071


Acoustic sensors for detecting cow behaviour


P.R. Shorten
AgResearch Limited, Ruakura Research Centre, Private Bag 3123, Hamilton, New Zealand

Keywords: Cow; Acoustic; Machine learning; Neural network; Behaviour; Livestock

Abstract

Acoustic technologies provide a non-invasive method to generate information about the health, welfare, and environmental impact of livestock. This study demonstrated that rear leg attached acoustic sensors can be used to differentiate between seven different acoustic classes (Grazing, Breathing, Walking, Lying Down, Dung, Vocalization, Other), six of which are cow behaviours, obtained from more than 150 cows under grazing conditions. The overall accuracy of the ensemble classification model was 96.2% based on a total of 700 acoustic recordings. The performance of the models for the duration, frequency (number of events per 10 seconds), or period (average time between events) of the six animal behaviours had an average coefficient of determination of R2 = 0.93. The model for respiration rate (Breathing class) while sleeping (R2 = 0.99) provides an alternative to more invasive differential pressure and thermistor-based methods. The model performance for bite rate (R2 = 0.91) was consistent with results obtained previously with collar and forehead attached microphones. There was ten-fold variation in the duration of dung events, and the model for the duration of dung events (R2 = 0.96) allows for estimation of the total amount of dung deposited per day. The acoustic technology also provides an alternative to accelerometer-based methods for stepping frequency (R2 = 0.91). Lying down events were characterised by scratching sounds generated by the microphone rubbing against the pasture, which provided good prediction of the time taken to lie down (R2 = 0.86). Models for vocalization duration (R2 = 0.92) and classification of the vocalization class (sensitivity 0.99; precision 0.95) demonstrate the feasibility of acoustic-based determination of vocalization traits, which provide information on the welfare and state of the animal.

1. Introduction

Acoustic technologies provide a non-invasive method to generate information about the health, welfare, and environmental impact of livestock [11, 18], which can be used for farm management, regulatory purposes, and evidence to consumers on the livestock-based products they consume. Acoustic sensors have been developed to predict cow rumination time [16, 24], oestrus [4, 25], intake [3, 7–9], urination time and duration [29, 30], proximity to their calf [20], emotional state [10], coughing [34], identity [10], social isolation (Green et al., 2018), hunger [15] and detection of lameness [35, 36]. Acoustic information has also been used to determine cow feed anticipation [14], respiratory diseases [27], rate of respiration [5, 6], and painful husbandry procedures [32]. Acoustic methods have also been utilized for gait analysis [13] and identification of individual steps in humans [28] and cows [35]. Bioacoustics also provides the opportunity to measure traits such as cow lying behaviour [26] and dung production [17], which are normally measured with inertial measurement units (IMU) and labour-intensive metabolism stalls respectively. Acoustic methods therefore have the potential to be deployed on-farm to measure a wide range of animal behaviours.

Acoustic-based prediction of animal traits requires 1) proximal recordings of the trait from cow-attached (which allows continuous cow monitoring) or milking shed located (which requires a single sensor per herd) sensors; 2) acoustic sensor data of sufficient quality; 3) expert labelling of the animal trait (which may need to be cross-referenced with other observations such as video recordings); 4) development of calibration algorithms to predict the traits; and 5) model validation in independent studies. A multitude of algorithms have been utilized for acoustic prediction of animal traits, including machine learning, neural networks and deep learning algorithms that interrogate the spectral and time features of the acoustic signals [21, 22], although the performance of the algorithm cannot exceed the performance of the data labelling by the expert assessor(s). However, there is no universal best algorithm for acoustic-based prediction of animal traits, although the development of long short-term memory (LSTM) networks and convolutional neural networks (CNN) [21, 22] may provide for improved classification and prediction of animal behaviour.

The primary hypothesis of this study was that algorithms can differentiate between seven different acoustic classes (Grazing, Breathing, Walking, Lying Down, Dung, Vocalization, Other) obtained from cow rear leg attached acoustic sensors that measure on-farm cow behaviour.

E-mail address: paul.shorten@agresearch.co.nz

https://doi.org/10.1016/j.atech.2022.100071
Received 2 March 2022; Received in revised form 11 May 2022; Accepted 16 May 2022
2772-3755/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/)

Nomenclature

BiLSTM   Bidirectional long short-term memory network
CNN      Convolutional Neural Network
F        Acoustic frequency (Hz)
F1       The F1 statistic
FN       False negative
FP       False positive
IMU      Inertial measurement unit
LSTM     Long short-term memory network
ML       Machine learning
MV       Majority vote
N        Nitrogen
RMSE     Root mean squared error
R2       Coefficient of determination
TN       True negative
TP       True positive

The secondary hypothesis was that algorithms can predict either the duration (Vocalization, Dung, Lying Down), frequency (Grazing, Walking), or period (Breathing) of the animal behaviour trait.

2. Methods

2.1. Study site and animals

An existing acoustic dataset obtained from rear leg attached recordings from more than 150 dairy cows at multiple locations and times of year was utilized in this study [29, 30]. The Olympus™ WS-853 digital voice recorder (Tokyo, Japan) was used for acoustic recording. The sensor was attached to a lightweight board and wired for operation with two AAA batteries and two AA batteries. The sensor was placed in a custom-made housing with dimensions 111.5 mm × 55 mm × 18 mm that had a VELCRO™ strap to wrap around the leg and back onto itself. The sensor was placed on the rear leg and positioned near the hock as illustrated in Shorten & Welten [29, 30]. Sensors were placed on cows for 2-5 days at the AgResearch Ruakura Research farm and the AgResearch Tokanui Research farm located in the Waikato, New Zealand, as detailed in Shorten & Welten [29, 30]. Dairy cows were managed according to usual New Zealand farm management procedures. The cows had free access to drinking water and grazed perennial ryegrass/white clover pasture. All animals were observed for abnormal health and behaviour during the studies. All animal manipulations were approved by the AgResearch Ruakura Animal Ethics Committee.

2.2. Acoustic dataset

A total of 700 representative acoustic signals of 10 second duration were obtained across the seven classes (Dung, Vocalization, Lying Down, Walking, Grazing, Breathing and Other), with 100 signals per class. All walking events had at least 2 steps in the 10 second interval. Dung events were obtained from a variety of seasons, which impacts dung moisture content and consequently the type of dung sound generated on the soil surface. A variety of cow vocalizations (Vocalization) were acquired from a range of cows. Lying down events were characterised by scratching sounds generated by the microphone rubbing against the pasture. Video recordings of animals lying down were utilized in combination with the acoustic recordings to confirm that lying down events (Lying Down) were correctly classified by the acoustic assessor from the acoustic recordings [29]. Grazing events were obtained over a range of bite frequencies (6-22 bites per 10 s interval). Breathing events were largely obtained over the period when cows were sleeping/resting from midnight to 5 a.m., when the acoustic microphone was more proximal to the head of the cow. Background (Other) events were selected to obtain a diverse dataset and included a multitude of farm environmental sounds including birds (magpies, plovers), insects (cicadas), wind, rain, thunder, road vehicles (cars, trucks), farm machinery (tractors, motorbikes), airplanes, gates, scratching and other animal and environmental sounds. Background events did not contain acoustic recordings of the six primary animal behaviours in this study. Urination events were excluded from this study as they have previously been more extensively studied in a 100-fold larger model calibration dataset (6122 urination events and 11413 background events) and validated in multiple independent datasets (12541 urination events and 14484 background events) with a model precision of 0.98 (the small number of false positive on-paddock urination events tended to be classified as dung events, lying down, or vehicle noise) in validation studies [31].

2.3. Acoustic labelling

Each 10 second acoustic event was classified as one of seven classes (Dung, Vocalization, Lying Down, Walking, Grazing, Breathing and Other) based on auditory assessment of the acoustic signal by a single expert. The acoustic samples for the six animal behaviour traits were also scored for either the duration (Vocalization, Dung, Lying Down), frequency (Grazing, Walking), or period (Breathing) of the trait. Duration traits were scored based on labelling the time of the first and last occurrence of the behaviour in the 10 s interval (e.g., some cows generated a sequence of separate vocalizations in the 10 s interval and for simplicity the duration was based on the start of the first vocalization and the end of the last vocalization). The time of each step was identified for the Walking trait, and the time of each bite was identified for the Grazing trait (the duration of the acoustic signal associated with each step/bite was short relative to the time between steps/bites). Times less than 0.5 s from the time of a step were denoted as stepping regions and times less than 0.1 s from the time of a bite were denoted as biting regions. The start and end time of each breath was labelled for the breathing trait to determine breathing regions. Each breath was also identified as an exhalation or an inhalation. No time labels occurred in the first or last 0.1 s of the 10 s interval.

2.4. Augmented dataset

An augmented dataset of 35700 signals was generated from the 700 labelled acoustic signals. For each acoustic signal, a further 50 signals were generated by signal stretch, pitch, volume, noise, and time-shift operations (10 signals per operation). Signals were stretched by up to 25%, pitch was modified by up to 25%, volume was modified by up to 50%, absolute and relative noise levels were added to signals with amplitudes of 5% and 10% respectively, and time-shifts of up to 0.1 seconds were used (an illustrative sketch of these operations is given at the end of Section 2.5).

2.5. Spectral analysis

All signals were acquired at a sampling rate of 44,100 Hz. A power spectrogram representation of each 10 s acoustic signal was generated [29, 30]. A total of 256 discrete sampling points were used to calculate the discrete Fourier transform, 100 samples were used per overlap between adjoining segments, and a Kaiser window of length 512 with a shape factor of 20 was utilized. Each power spectrum (p) was then converted to dB (10 × log10(p)). Spectrograms of size 128 × 1070 were generated (128 frequencies at a time-step of 0.0093 s) for each 10 s acoustic signal (see the illustrative sketch below).
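As an illustration of the augmentation operations described in Section 2.4, a minimal Python sketch is given below. The exact stretch, pitch and noise implementations used in the study are not specified beyond the ranges quoted, so the functions here (simple resampling for the stretch, amplitude scaling for the volume, Gaussian noise, and circular sample shifting) are assumptions, and the pitch operation is omitted because the method used is not described.

```python
import numpy as np

FS = 44100  # sampling rate (Hz), Section 2.5

def stretch(x, factor):
    """Change the signal duration by resampling (stretch of up to 25%)."""
    n = int(round(len(x) / factor))
    return np.interp(np.linspace(0, len(x) - 1, n), np.arange(len(x)), x)

def change_volume(x, gain):
    """Scale the amplitude (volume change of up to 50%)."""
    return gain * x

def add_noise(x, absolute=0.05, relative=0.10, rng=None):
    """Add absolute (5% amplitude) and relative (10% of signal s.d.) Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    return (x + absolute * rng.standard_normal(len(x))
              + relative * np.std(x) * rng.standard_normal(len(x)))

def time_shift(x, shift_s):
    """Circularly shift the signal by up to 0.1 s."""
    return np.roll(x, int(round(shift_s * FS)))

# Example: a few augmented copies of one (stand-in) 10 s signal
signal = np.random.randn(10 * FS)
augmented = [stretch(signal, 1.25), change_volume(signal, 1.5),
             add_noise(signal), time_shift(signal, 0.1)]
```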


Fig. 1. Representative spectrograms for the seven different acoustic classes (A) Grazing, B) Breathing, C) Walking, D) Lying Down, E) Dung, F) Vocalization, and G)
Other).
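The power spectrogram computation described in Section 2.5 (and illustrated in Fig. 1) can be sketched roughly as below using scipy. The mapping of the quoted parameters onto scipy's arguments is an assumption: scipy requires the DFT length to be at least the window length, so the sketch uses a 512-point DFT with the 512-sample Kaiser window (shape factor 20) and 100-sample overlap, and keeps 128 frequency rows to match the stated 128 × 1070 spectrogram size; the resulting time-step of about 0.0093 s matches the paper.

```python
import numpy as np
from scipy.signal import spectrogram
from scipy.signal.windows import kaiser

FS = 44100                           # sampling rate (Hz)
signal = np.random.randn(10 * FS)    # stand-in for one 10 s acoustic signal

# Kaiser window of length 512 with shape factor 20, 100-sample overlap
f, t, p = spectrogram(signal, fs=FS, window=kaiser(512, 20),
                      noverlap=100, nfft=512, mode='psd')

p_db = 10.0 * np.log10(p + 1e-12)    # convert power to dB (small offset avoids log(0))
spec = p_db[:128, :]                 # keep 128 frequency rows
print(spec.shape)                    # approximately (128, 1070)
```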

2.6. Classification models

Five individual Convolutional Neural Network (CNN) model architectures were constructed (Models 1-5), which were then used to create an ensemble classifier (AV5). The input layer for each model was a 128 × 1070 spectrogram matrix. The five model architectures are listed in the Supplementary Material. The dataset was split into calibration (54% of data), validation (13% of data), and test (33% of data) sets. The calibration dataset was used for model training and the validation dataset was used to decide when to terminate the training process. CNN training was conducted utilizing the stochastic gradient descent method with a mini-batch size of 128 signals, an initial learning rate of 0.00003, momentum 0.95, a validation frequency of once per epoch, and a validation patience of 4 epochs to terminate model training and avoid overfitting.

2.7. Regression models

The CNN architecture of Model 4 (with the removal of the softmax layer and the replacement of the classification layer by a regression layer) was used for the duration of the Dung, Vocalization and Lying Down traits. Training was conducted as per the classification models.

A bidirectional sequence-to-sequence long short-term memory network (BiLSTM) was used for the Walking, Breathing and Grazing traits. The BiLSTM architecture was: a sequence input layer (of dimension 128 with z-score normalization); a BiLSTM layer with 200 hidden units; a dropout layer (20% dropout); a BiLSTM layer with 100 hidden units; a dropout layer (20% dropout); a BiLSTM layer with 50 hidden units; a dropout layer (20% dropout); a fully connected layer (2 classes); a softmax layer; and a classification layer (a sketch of this architecture is given at the end of this section).

The BiLSTM was first used to determine the Walking, Breathing and Grazing regions (times close to a step, bite, or breath). For the Walking trait, the BiLSTM model was followed by a machine learning (ML) model to predict the number of steps in the 10 s interval, which was then rounded to the nearest natural number. The ML model was based on an ensemble of Gaussian Process Regression (GPR), Support Vector Machine Regression (SVMR), Partial Least Squares Regression (PLSR), and Random Forest (RF) models [29, 30]. For the Breathing trait the BiLSTM model was followed by erosion and dilation operations (each with a neighbourhood of 0.28 s), labelling of the connected components, and calculation of the average time between the start of each predicted breath. For the Grazing trait the BiLSTM model was followed by erosion and dilation operations (each with a neighbourhood of 0.09 s), labelling of the connected components, and calculation of the number of bites (a sketch of this post-processing is also given at the end of this section).
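As an illustration only, the stacked BiLSTM described above can be written out as follows. The study used MATLAB; this is a hedged PyTorch sketch, and the frame-wise two-class (event / no-event) output and the batch-first tensor layout are assumptions based on the architecture description.

```python
import torch
import torch.nn as nn

class BiLSTMRegions(nn.Module):
    """Frame-wise two-class sequence model following the Section 2.7 description:
    three stacked bidirectional LSTM layers (200, 100, 50 hidden units) with 20%
    dropout between them, then a fully connected layer and softmax per time step."""
    def __init__(self, n_features=128, n_classes=2):
        super().__init__()
        # z-score normalization of the 128 spectrogram frequencies is assumed to be
        # applied to the input beforehand (not shown).
        self.lstm1 = nn.LSTM(n_features, 200, bidirectional=True, batch_first=True)
        self.drop1 = nn.Dropout(0.2)
        self.lstm2 = nn.LSTM(400, 100, bidirectional=True, batch_first=True)
        self.drop2 = nn.Dropout(0.2)
        self.lstm3 = nn.LSTM(200, 50, bidirectional=True, batch_first=True)
        self.drop3 = nn.Dropout(0.2)
        self.fc = nn.Linear(100, n_classes)

    def forward(self, x):                         # x: (batch, 1070, 128)
        x, _ = self.lstm1(x)
        x = self.drop1(x)
        x, _ = self.lstm2(x)
        x = self.drop2(x)
        x, _ = self.lstm3(x)
        x = self.drop3(x)
        return torch.softmax(self.fc(x), dim=-1)  # (batch, 1070, 2)

model = BiLSTMRegions()
scores = model(torch.randn(4, 1070, 128))         # one mini-batch of spectrograms
```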
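The morphological post-processing of the frame-wise predictions (erosion and dilation followed by connected-component labelling) could look roughly like the sketch below. Converting the 0.28 s and 0.09 s neighbourhoods into frame counts via the 0.0093 s spectrogram time-step, and the use of scipy.ndimage, are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage

FRAME_STEP = 0.0093  # spectrogram time-step (s), Section 2.5

def count_events(frame_probs, neighbourhood_s, threshold=0.5):
    """Erode then dilate the thresholded frame-wise event probabilities and count
    the remaining connected components (predicted breaths or bites)."""
    size = max(1, int(round(neighbourhood_s / FRAME_STEP)))
    mask = frame_probs > threshold
    mask = ndimage.binary_erosion(mask, structure=np.ones(size))
    mask = ndimage.binary_dilation(mask, structure=np.ones(size))
    labels, n_events = ndimage.label(mask)
    # Start time (s) of each predicted event, e.g. to compute the average time
    # between the start of successive breaths for the Breathing trait.
    starts = np.array([np.argmax(labels == k) for k in range(1, n_events + 1)]) * FRAME_STEP
    return n_events, starts

# Example with three synthetic breath-like regions in a 10 s (1070 frame) interval
probs = np.zeros(1070)
probs[100:160] = probs[400:470] = probs[800:860] = 0.9
n_breaths, starts = count_events(probs, neighbourhood_s=0.28)
breath_period = np.mean(np.diff(starts)) if n_breaths > 1 else np.nan
```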



2.8. Analysis

The accuracy of classification was calculated for each classification model for the calibration (54% of data), validation (13% of data), and test (33% of data) augmented datasets. Each dataset contained an equal number of samples from each of the seven behaviour classes. Confusion matrix representations of the comparison between observed and AV5 model predicted classification of acoustic events were calculated for the augmented dataset split into calibration, validation, and test datasets. The number of correctly classified observations was calculated for each predicted class as a percentage of the number of observations in the predicted class, and for each true class as a percentage of the number of observations of the true class (an illustrative sketch of these per-class calculations is given before Table 1).

Linear regression models were fit to the relationships between the model predicted and measured duration, frequency (number of events per 10 seconds), or period (average time between events) of the six behaviours. Performance statistics of the models (R2 and RMSE) were calculated for the duration, frequency, or period of the six behaviours for the initial dataset and the augmented dataset (split into calibration, validation, and test datasets). The slope, intercept, and bias for the linear relationships between the predicted and measured duration, frequency, or period of the six behaviours were calculated for the test dataset (an illustrative sketch of these statistics is given after Table 4). All calculations were conducted in MATLAB (The MathWorks).

3. Results and discussion

3.1. Acoustic data

A total of 700 acoustic samples of 10 second duration were labelled in the original dataset (100 samples for each of the 7 classes), and 35700 acoustic samples of 10 second duration were available in the augmented dataset (an extra 50 samples per original sample). Representative spectrograms for the seven different acoustic classes (Grazing, Breathing, Walking, Lying Down, Dung, Vocalization, Other) are depicted in Fig. 1. The different acoustic classes can be readily differentiated by the characteristic features of the spectrograms. Grazing events were characterised by somewhat periodic, short duration, low intensity bites, whereby the pitch increases over the duration of the bite. Breathing events were characterised by periodic, long duration, low intensity breaths, whereby the duration of inhalation and exhalation were potentially different. Walking events were characterised by somewhat periodic, very high intensity steps with very low frequency. Lying down events were characterised by somewhat random scratching sounds generated by the microphone rubbing against grass. Dung events were characterized by somewhat regular pulses in sound generated by the dung flow hitting the ground, although the time between pulses was highly variable depending on the moisture content of the dung. Vocalization events were characterized by a sustained interval of resonant sound of a fixed pitch over a short timescale (with multiple overtones), whereby the pitch and volume of the signal could readily change over the interval. Other events included a multitude of farm environmental sounds including birds, insects, wind, rain, thunder, road vehicles, farm machinery, gates, animal scratching and other animal and environmental sounds.

3.2. Classification models

Models were trained using the calibration augmented dataset, the validation augmented dataset was used to determine when to cease training, and the test augmented dataset was used to evaluate the model performance in an independent dataset. The overall accuracy of the five individual classification models and the ensemble (AV5) model for the calibration, validation and test augmented datasets is listed in Table 1. Model performance was very similar between the five individual models, with accuracies of 93% - 95% in the test augmented dataset.
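Relating to the per-class percentages defined in Section 2.8 and summarised in Table 2, the sketch below shows one way these could be computed; it is an illustration with stand-in labels, not the MATLAB code used in the study.

```python
import numpy as np

CLASSES = ["Breathing", "Dung", "Grazing", "Lying Down", "Other", "Vocalization", "Walking"]

def classification_summary(y_true, y_pred, n_classes=7):
    """Overall accuracy plus, per class, the correctly classified observations as a
    fraction of the true class (sensitivity) and of the predicted class (precision)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)  # rows: true class, cols: predicted class
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    accuracy = np.trace(cm) / cm.sum()
    sensitivity = np.diag(cm) / cm.sum(axis=1)
    precision = np.diag(cm) / cm.sum(axis=0)
    return accuracy, sensitivity, precision, cm

# Example with stand-in labels for the seven classes (multiply by 100 for the
# percentages reported in Tables 1 and 2)
rng = np.random.default_rng(0)
y_true = rng.integers(0, 7, size=1000)
y_pred = y_true.copy()
errors = rng.random(1000) < 0.05
y_pred[errors] = rng.integers(0, 7, size=errors.sum())
acc, sens, prec, cm = classification_summary(y_true, y_pred)
```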


Table 1
Comparison of the accuracy of classification of acoustic events for the six classification models for the augmented dataset (35700 samples (5100 samples per behaviour class) split into calibration (19278 samples), validation (4641 samples) and test (11781 samples) datasets). Models 1-5 were used to construct the ensemble classifier (AV5).

Model   Calibration Accuracy   Validation Accuracy   Test Accuracy
1       98.7%                  90.0%                 94.6%
2       97.4%                  89.1%                 93.0%
3       95.3%                  91.3%                 94.1%
4       100%                   94.1%                 95.4%
5       96.9%                  92.8%                 93.7%
AV5     99.9%                  92.6%                 96.2%

The overall accuracy of the AV5 ensemble classification model was 99.9%, 92.6%, and 96.2% for the calibration, validation, and test augmented datasets, respectively (Table 1). The performance of the AV5 ensemble model was better than each individual classification model in the test augmented dataset, as expected.

Confusion matrix representations of the comparison between observed and AV5 predicted classification of acoustic events for the augmented dataset (35700 samples) split into calibration (19278 samples), validation (4641 samples) and test (11781 samples) datasets are shown in Table 2. Classification errors were typically less than 5%, although around 11% of Lying Down events were falsely predicted to be Dung events in the test augmented dataset. Similarly, falsely predicted Dung events were typically Lying Down events. Falsely predicted Other events were typically Vocalization events and falsely predicted Grazing events were typically Walking events. This highlights that prediction errors were relatively low in the augmented test dataset, some classes were more difficult to differentiate (e.g., Dung and Lying Down classes), and the model classification errors tended to be grouped in more closely related classes. The good classification performance for the Other category, which includes a diverse range of other farm environmental sounds, can be interpreted in terms of the classification performance of the other categories (Grazing, Breathing, Walking, Lying Down, Dung, Vocalization) and the process of elimination.

The classification accuracy for the 7 classes in the test dataset (96.2%) is consistent with the classification accuracy of 91% using a hidden Markov model for 210 acoustic vocalization signals (cow vocalization; calf vocalization; coughing; oestrus; milking delayed; noisy inhaling; hunger) collected in an independent test dataset [15].

3.3. Regression models

The acoustic samples for the six animal behaviour traits were scored for either the duration (Vocalization, Dung, Lying Down), frequency (Grazing, Walking), or period (Breathing) of the trait. There was approximately five-fold variation in each trait (Fig. 2). Models for each trait were trained using the calibration augmented dataset, the validation augmented dataset was used to determine when to cease training, and the test augmented dataset was used to evaluate the model performance in an independent dataset.

The performance of the models for the duration, frequency, or period of the six behaviours is depicted in Fig. 2 for the calibration, validation, and test augmented datasets (a total of 5100 samples per behaviour). The coefficient of determination ranged from R2 = 0.81 (Lying Down) to R2 = 0.98 (Breathing), with an average of R2 = 0.91 for the six different traits in the augmented test dataset (Table 3). The model residuals (predicted – measured) were approximately normally distributed and exhibited low heteroscedasticity (the error variance was relatively constant over the range of predicted values).

The performance of the models for the duration, frequency, or period of the six behaviours is depicted in Fig. 3 for the calibration, validation, and test datasets (a total of 100 samples per behaviour). The coefficient of determination ranged from R2 = 0.86 (Lying Down) to R2 = 0.99 (Breathing), with an average of R2 = 0.93 for the six different traits in the test dataset (Table 4). The slope, intercept, and bias for the linear relationships between the predicted and measured duration, frequency, or period of the six behaviours are listed in Table 5 for the test dataset (33 samples). The slope and intercept were close to unity and zero respectively for the Walking, Breathing and Grazing traits. However, there was a slight underprediction bias for the Dung, Vocalization and Lying Down traits, which all had slopes of 1.15 - 1.3 that were significantly greater than unity, but an intercept not significantly different to zero.

The performance of the model for respiration rate while sleeping in the test dataset (R2 = 0.99) is consistent with that obtained by manual-based labelling of halter audio recordings of respiration in heifers (R2 = 0.92) grazing pasture at 0800 h, 1000 h, 1200 h and 1500 h [5]. The acoustic method provides an alternative to 1) differential pressure-based methods (R2 = 0.85 for dozing; R2 = 0.96 for lying; and R2 = 0.99 for standing) that require a silicon tube to be inserted 10 cm into the nostril of the cow [33] and 2) thermistor-based sensors (R2 = 0.95) that require a thermistor to be near the nostril [19]. The acoustic-based average respiration rate was 21 full breaths (inhalation and exhalation) per minute in this study, which is consistent with previous studies for cows that were not under heat stress [19, 33].

The performance of the model for bite rate in the test dataset (R2 = 0.91) is consistent with that obtained by Delagarde et al. [7], who used a collar attached microphone (R2 = 0.95). Forehead attached microphones have also been used to predict bite rate and dry matter intake (R2 = 0.88) over a range of forage types [8, 9]. Acoustic sensors therefore have the potential to be combined with pasture nitrogen (N) measurements and animal metabolic models to estimate the N content of individual urination events [30] relative to the total N intake of individual cows. Acoustic sensors also have the potential to identify the feed composition of the diet consumed by individual animals based on forage-dependent bite sounds [8], although further trials are required.

The duration of dung events provides information about the quantity of dung deposited during each event. The frequency (number per day) and duration of dung events could also potentially provide information about the total amount of dung deposited per day and the distribution in the size of dung events. The ten-fold variation in the duration of dung events in this study is consistent with the 10-fold variability in dung pat area [12]. The size of the dung event is also a factor in the temperature, rate of ammonia emission [23], and rate of decomposition of dung [1]. Dung moisture content is also an important factor that impacts on the rate of decomposition of dung [37].

The acoustic method provides an alternative to pedometer and accelerometer-based methods for calculation of stepping frequency, distance travelled and detection of lameness. Acoustic sensors can also potentially provide information about surface walking type (pasture, laneway, and concrete), which could be combined with travelling distance, farm maps, environmental sounds (such as road noise) and changes in pasture cover from satellite images to provide approximate information on the location of animals on the farm without the use of GPS technologies. Stepping events (or the lack of events) are also associated with other cow behaviours such as urination, lying down and oestrus.

Cows that graze pasture typically spend 10 – 12 hours per day lying down [26] and lying behaviour is normally measured with inertial measurement units (IMU). Lying down events were characterised by somewhat random scratching sounds generated by the microphone rubbing against the pasture, and the acoustic recordings provided good prediction of the time taken to lie down (R2 = 0.86). This information, combined with times not walking, could provide predictions of time spent lying. Lying time is also reduced on wet and muddy surfaces [26] and there is potential to use the acoustic technology to determine the lying surface type (pasture, concrete, wood chip) and moisture content (wet, dry).


Table 2
Comparison between observed and AV5 predicted classification of acoustic events for the augmented dataset (35700 samples (5100 per behaviour class)) split into A) calibration (19278 samples), B) validation (4641 samples) and C) test (11781 samples) datasets. For each class, the percentage of correctly classified observations is given relative to the number of observations of the true class (sensitivity) and relative to the number of observations of the predicted class (precision).

Class          A) Calibration              B) Validation               C) Test
               Sensitivity   Precision     Sensitivity   Precision     Sensitivity   Precision
Breathing      100.00%       100.00%       99.60%        93.40%        99.20%        100.00%
Dung           100.00%       100.00%       100.00%       84.00%        96.70%        88.80%
Grazing        100.00%       99.90%        92.50%        99.80%        95.40%        99.80%
Lying Down     99.40%        100.00%       80.20%        99.10%        88.20%        95.50%
Other          100.00%       99.90%        76.40%        99.80%        94.50%        99.90%
Vocalization   99.90%        99.40%        99.60%        80.50%        99.60%        95.10%
Walking        100.00%       100.00%       100.00%       99.60%        100.00%       95.60%

Table 3
Performance statistics of the models for the duration, frequency, or period of the six behaviours for the calibration, validation, and test augmented datasets (5100 samples per behaviour split into calibration (2754 samples), validation (663 samples) and test (1683 samples) datasets).¹

(N=5100)      Calibration R2   Calibration RMSE   Validation R2   Validation RMSE   Test R2   Test RMSE
Walking       0.998            0.07               0.94            0.36              0.92      0.42
Lying Down    0.94             0.48               0.67            0.89              0.86      0.69
Breathing     0.986            0.06               0.99            0.03              0.98      0.06
Grazing       0.884            0.90               0.92            1.08              0.81      1.3
Vocalization  0.98             0.34               0.35            0.60              0.92      0.58
Dung          0.99             0.18               0.96            0.40              0.96      0.54

¹ RMSE units denote the duration (s) of the behaviour (Dung, Vocalization, Lying Down), the frequency (number per 10 s) of the behaviour (Walking, Grazing), or the average time (s) between the behaviour (Breathing).

Table 4
Performance statistics of the models for the duration, frequency, or period of the six behaviours for the calibration, validation, and test datasets (100 samples per behaviour split into calibration (54 samples), validation (13 samples) and test (33 samples) datasets).¹

(N=100)       Calibration R2   Calibration RMSE   Validation R2   Validation RMSE   Test R2   Test RMSE
Walking       1.0              0.0                0.96            0.29              0.92      0.42
Lying Down    0.94             0.48               0.67            0.89              0.86      0.69
Breathing     0.994            0.04               0.99            0.01              0.99      0.03
Grazing       0.985            0.30               0.94            0.30              0.91      0.90
Vocalization  0.98             0.34               0.35            0.60              0.92      0.58
Dung          0.99             0.18               0.96            0.40              0.96      0.54

¹ RMSE units denote the duration (s) of the behaviour (Dung, Vocalization, Lying Down), the frequency (number per 10 s) of the behaviour (Walking, Grazing), or the average time (s) between the behaviour (Breathing).
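For completeness, the performance statistics reported in Tables 3-5 (R2, RMSE, and the slope, intercept and bias of the predicted versus measured relationship) can be sketched as below. The specific definitions are assumptions: R2 is taken as 1 - SSres/SStot, the slope and intercept come from an ordinary least-squares fit of predicted on measured values, and the bias is the mean of (predicted - measured); the study's MATLAB implementation may differ in detail.

```python
import numpy as np

def regression_stats(measured, predicted):
    """R2 and RMSE of the predictions plus the slope, intercept and bias of the
    linear relationship between predicted and measured trait values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    resid = predicted - measured
    rmse = np.sqrt(np.mean(resid ** 2))
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((measured - measured.mean()) ** 2)
    slope, intercept = np.polyfit(measured, predicted, 1)  # predicted vs measured
    bias = np.mean(resid)
    return {"R2": r2, "RMSE": rmse, "slope": slope, "intercept": intercept, "bias": bias}

# Example with a stand-in bite-rate trait (bites per 10 s) for 33 test samples
rng = np.random.default_rng(1)
measured = rng.uniform(6, 22, size=33)
predicted = measured + rng.normal(0, 0.9, size=33)
print(regression_stats(measured, predicted))
```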


Fig. 2. Performance of the models for the duration (seconds), frequency (number of events per 10 seconds), or period (average time between events) of the six
behaviours (A) Dung, B) Vocalization, C) Lying Down, D) Walking, E) Breathing, and F) Grazing) for the calibration, validation, and test datasets (5100 samples per
behaviour split into calibration (2754 samples; blue square symbol), validation (663 samples; red plus symbol) and test (1683 samples; black cross symbol) datasets).
A small amount of noise (with a standard deviation of 0.1 bites per 10 s) was added to predictions for the Grazing trait to improve the visualisation of the distribution
in the model predictions. Lines denote the regression.


Fig. 3. Performance of the models for the duration, frequency (number of events per 10 seconds), or period of the six behaviours (A) Dung, B) Vocalization, C) Lying
Down, D) Walking, E) Breathing, and F) Grazing) for the calibration, validation, and test datasets (100 samples per behaviour split into calibration (54 samples; blue
square symbol), validation (13 samples; red plus symbol) and test (33 samples; black cross symbol) datasets). A small amount of noise (with a standard deviation of
0.1 bites per 10 s) was added to predictions for the Grazing trait to improve the visualisation of the distribution in model predictions.


Table 5
Slope, intercept, and bias for the linear relationships between the predicted and measured duration, frequency, or period of the six behaviours for the test dataset (33 samples).¹

(N=100)       Test Slope     Test Intercept   Test Bias
Walking       1.01 ± 0.05    0.12 ± 0.24      -0.12
Lying Down    1.20 ± 0.09    0.68 ± 0.32      -1.39
Breathing     1.01 ± 0.01    -0.03 ± 0.02     0.01
Grazing       1.03 ± 0.06    -0.46 ± 0.74     0.12
Vocalization  1.30 ± 0.07    0.11 ± 0.16      -0.64
Dung          1.15 ± 0.04    -0.47 ± 0.23     -0.27

¹ Units for the intercept and bias (predicted – measured) are the duration (s) of the behaviour (Dung, Vocalization, Lying Down), the frequency (number per 10 s) of the behaviour (Walking, Grazing), or the average time (s) between the behaviour (Breathing).

The vocalization of cows provides information on the welfare and state of the animal ([14, 15]; Green et al., 2018), although the vocalization patterns are complex and are influenced by animal, herd, age, and other factors. Some information is encoded in the duration (s), frequency (Hz) and rate (number per unit time) of vocalization [2]. Models for vocalization duration (R2 = 0.92) and classification of the vocalization class (sensitivity of 0.99 and precision of 0.95 in the test dataset) demonstrate the feasibility of acoustic-based determination of vocalization traits. Further studies are required to investigate the role of the animal environment on animal vocalization, which could contribute to on-farm based assessment of animal health and welfare.

4. Conclusions

This study demonstrated that rear leg attached acoustic sensors can be used to differentiate between seven different acoustic classes (Grazing, Breathing, Walking, Lying Down, Dung, Vocalization, Other) obtained from cows under grazing conditions. The overall accuracy of the ensemble classification model was 96.2% for the test augmented dataset. The performance of the models for the duration, frequency, or period of the six animal behaviours (Grazing, Breathing, Walking, Lying Down, Dung, Vocalization) had coefficients of determination of R2 = 0.91 (bite rate), R2 = 0.99 (Breathing), R2 = 0.91 (stepping frequency), R2 = 0.86 (time taken to lie down), R2 = 0.96 (duration of dung events), and R2 = 0.92 (vocalization duration), with an average of R2 = 0.93 for the six different behaviour traits in the test dataset. Further studies are required to refine these acoustic models and investigate the role of the animal environment on animal behaviour, which could contribute to on-farm based non-invasive assessment of animal health and welfare.

Funding

This work was funded by AgResearch, New Zealand.

Declaration of Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.atech.2022.100071.

References

[1] S. Aarons, C. O'Connor, H. Hosseini, C. Gourley, Dung pads increase pasture production, soil nutrients and microbial biomass carbon in grazed dairy systems, Nutr. Cycling Agroecosyst. 84 (2009) 81–92.
[2] E.F. Briefer, Vocal expression of emotions in mammals: mechanisms of production and evidence, J. Zool. 288 (2012) 1–20.
[3] J.O. Chelotti, S.R. Vanrell, D.H. Milone, S.A. Utsumi, J.R. Galli, H.L. Rufiner, L.L. Giovanini, A real-time algorithm for acoustic monitoring of ingestive behavior of grazing cattle, Comput. Electron. Agric. 127 (2016) 64–75.
[4] Y. Chung, J. Lee, S. Oh, D. Park, H.H. Chang, S. Kim, Automatic detection of cow's oestrus in audio surveillance system, Asian-Austr. J. Animal Sci. 26 (2013) 1030–1037.
[5] G.A. de Carvalho, A.K.D. Salman, P.G. da Cruz, E.C. de Souza, F.R.F. da Silva, E. Schmitt, Technical note: an acoustic method for assessing the respiration rate of free-grazing dairy cattle, Livestock Sci. 241 (2020) 104270.
[6] M.P. de la Torre, E.F. Briefer, T. Reader, A.G. McElligott, Acoustic analysis of cattle (Bos taurus) mother–offspring contact calls from a source–filter theory perspective, Appl. Animal Behav. Sci. 163 (2015) 58–68.
[7] R. Delagarde, J.-P. Caudal, J.-L. Peyraud, Development of an automatic bitemeter for grazing cattle, Annales de Zootechnie 48 (1999) 329–339.
[8] J.R. Galli, C.A. Cangiano, M.A. Pece, M.J. Larripa, D.H. Milone, S.A. Utsumi, E.A. Laca, Monitoring and assessment of ingestive chewing sounds for prediction of herbage intake rate in grazing cattle, Animal 12 (2018) 973–982.
[9] J.R. Galli, C.A. Cangiano, M.W. Demment, E.A. Laca, Acoustic monitoring of chewing and intake of fresh and dry forages in steers, Anim. Feed Sci. Technol. 128 (2006) 14–30.
[10] A. Green, C. Clark, L. Favaro, S. Lomax, D. Reby, Vocal individuality of Holstein-Friesian cattle is maintained across putatively positive and negative farming contexts, Nature Sci. Reports 9 (2019) 18468.
[11] A. Green, I.N. Johnston, C. Clark, Invited review: the evolution of cattle bioacoustics and application for advanced dairy systems, Animal 12 (2020) 1250–1259.
[12] M. Hirata, S. Ogura, M. Furuse, Fine-scale spatial distribution of herbage mass, herbage consumption and fecal deposition by cattle in a pasture under intensive rotational grazing, Ecol. Res. 26 (2011) 289–299.
[13] J. Huang, F. Di Troia, M. Stamp, Acoustic gait analysis using support vector machines, in: ICISSP 2018 - Proceedings of the 4th International Conference on Information Systems Security and Privacy, 2018, pp. 545–552.
[14] Y. Ikeda, Y. Ishii, Recognition of two psychological conditions of a single cow by her voice, Comput. Electron. Agric. 62 (2008) 67–72.
[15] G. Jahns, Call recognition to identify cow conditions—A call-recogniser translating calls to text, Comput. Electron. Agric. 62 (2008) 54–58.
[16] L. Kovács, F.L. Kézér, F. Ruff, O. Szenci, Rumination time and reticuloruminal temperature as possible predictors of dystocia in dairy cows, J. Dairy Sci. 100 (2017) 1568–1579.
[17] S.F. Ledgard, B. Welten, K. Betteridge, Salt as a mitigation option for decreasing nitrogen leaching losses from grazed pastures, J. Sci. Food. Agric. 95 (2015) 3033–3040.
[18] M.P. Mcloughlin, R. Stewart, A.G. McElligott, Automated bioacoustics: methods in ecology and conservation and their potential for animal welfare monitoring, J. R. Soc. Interface 16 (2019) 20190225.
[19] H.F.M. Milan, A.S.C. Maia, K.G. Gebremedhin, Technical note: device for measuring respiration rate of cattle under field conditions, J. Anim. Sci. 94 (2016) 5434–5438.
[20] M. Padilla de la Torre, E.F. Briefer, T. Reader, A.G. McElligott, Acoustic analysis of cattle (Bos taurus) mother–offspring contact calls from a source–filter theory perspective, Appl. Animal Behav. Sci. 163 (2015) 58–68.
[21] D. Palaz, M. Magimai-Doss, R. Collobert, End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition, Speech Commun. 108 (2019) 15–32.
[22] H. Purwins, B. Li, T. Virtanen, J. Schlüter, S. Chang, T. Sainath, Deep learning for audio signal processing, J. Selected Topics Signal Process. 13 (2019) 206–219.
[23] M.R. Redding, R. Lewis, P.R. Shorten, Simultaneous measurements of ammonia volatilisation and deposition at a beef feedlot, Animal Prod. Sci. 59 (2019) 160–168.
[24] S. Reith, S. Hoy, Relationship between daily rumination time and estrus of dairy cows, J. Dairy Sci. 95 (2013) 6416–6420.
[25] V. Röttgen, P.C. Schön, F. Becker, A. Tuchscherer, C. Wrenzycki, S. Düpjan, B. Puppe, Automatic recording of individual oestrus vocalisation in group-housed dairy cattle: development of a cattle call monitor, Animal 14 (2020) 198–205.
[26] K.E. Schütz, V.M. Cave, N.R. Cox, F.J. Huddart, C.B. Tucker, Effects of 3 surface types on dairy cattle behavior, preference, and hygiene, J. Dairy Sci. 102 (2019) 1530–1541.
[27] P.R. Scott, Clinical presentation, auscultation recordings, ultrasonographic findings and treatment response of 12 adult cattle with chronic suppurative pneumonia: case study, Ir. Vet. J. 66 (2013) 43.
[28] B. She, Framework of footstep detection in in-door environment, in: Proceedings of International Congress on Acoustics, Kyoto, Japan, 2004, pp. 715–718.
[29] P.R. Shorten, B.G. Welten, Assessment of a non-invasive acoustic sensor for detecting cattle urination events, Biosyst. Eng. 207 (2021) 177–187.
[30] P.R. Shorten, B.G. Welten, An acoustic sensor technology to detect urine excretion, Biosyst. Eng. 214 (2022) 90–106.
[31] P.R. Shorten, B.G. Welten, Acoustic sensor determination of repeatable cow urination traits in winter and spring, Comput. Electron. Agric. 196 (2022) 106846.
[32] G. Stilwell, M.S. Lima, D.M. Broom, Comparing plasma cortisol and behaviour of calves dehorned with caustic paste after non-steroidal-antiinflammatory analgesia, Livestock Sci. 119 (2008) 63–69.


[33] S. Strutzke, D. Fiske, G. Hoffmann, C. Ammon, W. Heuwiser, T. Amon, Technical note: Development of a noninvasive respiration rate sensor for cattle, J. Dairy Sci. 102 (2019) 690–695.
[34] J. Vandermeulen, C. Bahr, D. Johnston, B. Earley, E. Tullo, I. Fontana, M. Guarino, V. Exadaktylos, D. Berckmans, Early recognition of bovine respiratory disease in calves using automated continuous monitoring of cough sounds, Comput. Electron. Agric. 129 (2016) 15–26.
[35] N. Volkmann, B. Kulig, N. Kemper, Using the footfall sound of dairy cows for detecting claw lesions, Animals 9 (2019) 78.
[36] N. Volkmann, B. Kulig, S. Hoppe, J. Stracke, O. Hensel, N. Kemper, On-farm detection of claw lesions in dairy cows based on acoustic analyses and machine learning, J. Dairy Sci. 104 (2021) 5921–5931.
[37] S. Yoshitake, H. Soutome, H. Koizumi, Deposition and decomposition of cattle dung and its impact on soil properties and plant growth in a cool-temperate pasture, Ecol. Res. 29 (2014) 673–684.

