
The Veterinary Journal 196 (2013) 247–252

Contents lists available at SciVerse ScienceDirect

The Veterinary Journal


journal homepage: www.elsevier.com/locate/tvjl

Evaluation of a novel feather scoring system for monitoring feather damaging behaviour in parrots
Yvonne R.A. van Zeeland ⇑, Madeleine J. Bergers, Lisette van der Valk, Nico J. Schoemaker,
Johannes T. Lumeij
Division of Zoological Medicine, Department of Clinical Sciences of Companion Animals, Faculty of Veterinary Medicine, Utrecht University, Yalelaan 108, 3584 CM Utrecht,
The Netherlands

Article info

Article history:
Accepted 28 August 2012

Keywords:
Psittacine
Plumage
Feather picking
Feather plucking
Scoring system

Abstract

Feather damaging behaviour is common in captive psittacine birds and there is a need for reliable methods to evaluate the efficacy of therapeutic and preventive interventions. This study compared the inter- and intra-observer reliabilities of a novel feather scoring system with an existing system to assess the plumage of grey parrots (Psittacus erithacus). Regions of the body were photographed separately at 1 week intervals and shown at random to 35 examiners (avian veterinarians and veterinary students), who used the two scoring systems to assess plumage. Since the quality of the photographs was insufficient to allow accurate assessment of the individual flight and tail feathers, the novel scoring system was only evaluated for its reliability regarding covert and down feathers. Inter- and intra-observer reliabilities were determined using the intra-class correlation coefficient. Bland–Altman analysis was performed to determine absolute reliabilities for both systems.

Correlation coefficients were 0.90 and 0.95 for intra-observer reliability and 0.83 and 0.89 for inter-observer reliability for the existing and novel feather scoring systems, respectively. When using the novel system, a change in plumage condition of ≥10% was needed to ensure that the change reflected a real difference in 95% of cases, while a change of ≥15% was needed for the existing system. Since it may take from 4 weeks (covert or down feathers) to over 1 year (flight or tail feathers) for feathers to regrow, sufficient time should be allowed to elapse between two scoring sessions to reliably evaluate the efficacy of preventive or therapeutic interventions for feather damaging behaviour.

© 2012 Elsevier Ltd. All rights reserved.

Introduction

Feather damaging behaviour (also referred to as feather picking or plucking) is one of the most common behavioural problems in captive parrots, with an estimated prevalence of 10% (Grindlinger and Ramsay, 1991; Gaskins and Bergman, 2011). Feather damaging behaviour includes plucking, chewing, fraying and/or biting, resulting in loss of or damage to the feathers. Contour and down feathers (which are often plucked or pulled) are usually the primary targets, but tail and flight feathers (which are often chewed) may also be affected (Nett and Tully, 2003). Feather damage usually occurs in the readily accessible regions of the neck, chest, flank, inner thigh and wing web (Harrison, 1986; Van Hoek and King, 1997; Nett and Tully, 2003), while the feathers on the parrot’s head remain intact (Harrison, 1986).

A variety of causes of feather damaging behaviour have been suggested, including socio-environmental, neurobiological, genetic and medical causes (van Zeeland et al., 2009). Prior to addressing behaviour and environment, ‘underlying’ medical problems should be ruled out (van Zeeland et al., 2009). Management of (non-medical) feather damaging behaviour is challenging. Many treatment modalities have been proposed, including behavioural modification (Davis, 1995), environmental enrichment with toys and foraging activities (Meehan et al., 2003; Lumeij and Hommers, 2008), pharmacological intervention (Ramsay and Grindlinger, 1994; Seibert et al., 2004; Seibert, 2007) and local application of foul tasting substances and/or collars (Rosskopf and Woerpel, 1996; Kaleta, 2003).

It is necessary to assess changes in feather damaging behaviour over time to evaluate the efficacy of preventive or therapeutic interventions. Ramsay and Grindlinger (1994) quantified the time spent on feather damaging by direct behavioural observation, but this method does not appear to be reliable, since it is difficult to distinguish between preening and feather damaging behaviour (Van Hoek and King, 1997). Furthermore, bouts of feather damaging behaviour may not be observed clinically, since they often take place at night (Meehan et al., 2003). Feather scoring systems, which measure feather damaging behaviour indirectly by assessing

⇑ Corresponding author. Tel.: +31 30 2534542.
E-mail address: Y.R.A.vanZeeland@uu.nl (Y.R.A. van Zeeland).

1090-0233/$ - see front matter © 2012 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.tvjl.2012.08.020

Table 1
Ten-point feather scoring assessment (Meehan et al., 2003).

Score Description
(a) Scoring system used for chest/flank, back and legs
0 All or most feathers removed, down removed and skin exposed, evidence of skin or tissue injury
0.25 All or most feathers removed, down removed and skin exposed, no evidence of skin or tissue injury
0.5 All or most feathers removed, some down removed, patches of skin exposed
0.75 All or most feathers removed, down exposed and intact or feathers removed from more than half of the area, some down removed, patches of skin exposed
1.0 Feathers removed from less than half of the area, some down removed and skin exposed
1.25 Feathers removed from more than half of the area, down exposed and intact
1.5 Feathers removed from less than half of the area, down exposed and intact
1.75 Feathers intact with fraying or breakage
2.0 Feathers intact with little or no fraying or breakage
(b) Scoring system used for wings
0 All or most primary feathers, secondary feathers and coverts removed, down removed, skin exposed, evidence of skin or tissue injury
0.5 All or most primary feathers, secondary feathers and coverts removed, down removed, skin exposed, no evidence of injury
1.0 More than half of coverts removed, down exposed and intact or more than half of primary and secondary feathers removed, down exposed and intact
1.5 Fewer than half of coverts removed, down exposed and intact or fewer than half of primary and secondary feathers removed, down exposed and intact or
primary and secondary feathers intact with significant breakage and fraying
2.0 Feathers intact with little or no fraying or breakage
(c) Scoring system used for tail
0 All or most tail feathers removed or broken
1 Some tail feathers removed or broken or significant fraying of tail feathers
2 Feathers intact with little or no fraying or breakage

Total score (0–10) = (chest/flank) + (back) + (legs) + (wings) + (tail).
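
The total in Table 1 is a plain sum of five area scores, each restricted to the steps listed for that area. A minimal sketch of that arithmetic follows; the function name, the dictionary keys and the example bird are illustrative assumptions, not part of the published system.

```python
# Allowed score steps per body area, copied from Table 1 (a), (b) and (c).
# The key names are hypothetical labels chosen for this sketch.
ALLOWED_SCORES = {
    "chest_flank": {0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0},
    "back": {0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0},
    "legs": {0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0},
    "wings": {0, 0.5, 1.0, 1.5, 2.0},
    "tail": {0, 1, 2},
}


def meehan_total(scores):
    """Sum the five area scores after checking each against Table 1."""
    for area, allowed in ALLOWED_SCORES.items():
        if scores[area] not in allowed:
            raise ValueError(f"{scores[area]} is not a valid score for {area}")
    return sum(scores[area] for area in ALLOWED_SCORES)


# Hypothetical bird, for illustration only.
example = {"chest_flank": 1.5, "back": 2.0, "legs": 1.75, "wings": 1.5, "tail": 1}
print(meehan_total(example))  # 7.75
```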

plumage condition, may be used as an alternative to direct observations. Similar systems have been used in feather pecking laying hens (Bilcik and Keeling, 1999; Kjaer et al., 2011).

For any test that involves scoring by humans, there is a possibility that bias or error related to human judgement is introduced. This error may be present within a single person performing multiple ratings of a subject and/or between people scoring the same subject (Bruton et al., 2000). Prior to using a scoring method, it should be evaluated for inter- and intra-observer reliabilities (Bland and Altman, 1986).

Meehan et al. (2003) developed a 10-point feather scoring system for evaluation of feather damaging behaviour in parrots (Table 1); using the Pearson correlation coefficient, the inter-observer reliability for two observers was >0.76. However, this coefficient only considers relative position (i.e. linear correlation between ratings) rather than agreement in specific segments or absolute agreement of ratings (Keating and Matyas, 1998). Several drawbacks have also been encountered when using the scoring system of Meehan et al. (2003): (1) the use of descriptive terminology increases the degree of subjective interpretation by the observer; (2) the limited, incomplete, range of descriptions may result in choosing a suboptimal alternative; (3) the commonly involved ventral surfaces of the wings are not assessed; (4) the relative size of each body part is omitted, and (5) no distinction is made between the types of damage to the feathers, although a different time frame is needed for replacement of different feather groups. In the present study, a novel feather scoring system was developed and evaluated by comparison with the scoring system of Meehan et al. (2003).

Materials and methods

Observers

A total of 35 observers with a veterinary background were recruited from Europe and North America. The observers were either avian veterinarians examining at least one psittacine bird per week for a period >1 year (experienced in evaluating plumage; n = 16) or fourth-year veterinary students (inexperienced in evaluating plumage; n = 19). None of the observers had prior experience in using either feather scoring system.

Feather scoring systems

The feather scoring system of Meehan et al. (2003) involves a 10-point scoring system for assessment of the feather condition of five separate body areas, i.e. chest/flank, back, legs, tail and wings (Table 1).

The novel feather scoring system consists of two parts: (1) Part A assesses the covert and down feathers, and (2) Part B assesses the flight feathers; for both parts, a maximum score of 100 can be obtained (Table 2).

Birds for assessment

Twenty-five privately-owned grey parrots (Psittacus erithacus erithacus) with various degrees of feather damaging behaviour, ranging from almost perfectly feathered to almost completely bald, were selected at random from the patient population of the Utrecht University Veterinary Teaching Hospital. All birds were presented to the clinic for consultation regarding feather damaging behaviour, which included a behavioural assessment and a routine physical examination to assess the bird’s plumage condition and to exclude underlying medical problems.

During the physical examination, standardised photographs were taken of each body part to be assessed: (1) chest and neck (front); (2) back; (3) ventral surfaces of the left and right wings in fully extended position; (4) dorsal surfaces of the left and right wings in fully extended position; (5) front view of the left and right legs in extended position, and (6) tail with feathers spread.

To prevent bias from recognition, pictures were trimmed to show only the body part of interest (without background detail) and subsequently mixed and ordered at random (using a random sequence generator programme¹) to create a set of pictures representing a complete parrot.

Study design

At least 7 days prior to the start of the study, the examiners were asked to assess photographs of three parrots (not included in the trial) to familiarise themselves with the scoring procedure. For the assessment of intra-observer reliability, nine students and nine veterinarians were randomly selected from the groups using a random sequence generator programme¹. These examiners were asked to score the plumage condition of 10 parrots twice using both scoring systems. Photographs of the various body parts were distributed in batches of five parrots at a time. A minimum of 7 days elapsed between the first and second scoring dates to ensure that enough time had elapsed to minimise the influence of memory (Streiner and Norman, 2003). Of the observers participating in the intra-observer trial, three veterinarians and four students scored only 5/10 parrots twice.

To determine inter-observer reliability, the other 17 examiners used both scoring systems for a single assessment of the plumage condition of 15 parrots. Photographs of the parrots were distributed in batches of five parrots at a time, with an interval of 7 days between the first and second scoring dates.

Scoring sessions were offered digitally to the examiners in an on-line survey (Lime-survey, Carsten Schmitz²). Depending on the body part that was involved, 1–4 multiple choice questions were used to assess the plumage condition with both scoring systems, resulting in a total of 20 questions for each set of photographs (see Appendix A: Supplementary material 1). Based on the answers of the examiners, the corresponding scores were calculated for each body part and added up to a total score for each individual parrot.

¹ See: http://www.random.org.
² See: http://docs.limesurvey.org/.

Table 2
Novel feather scoring system.
(1) Score determination table for coverts and down feathers; used for chest/neck/flank, back, legs, dorsal and ventral surface of the wings.

Coverts Down feathers


No down removed <50% of down removed >50% of down removed All down removed
All coverts intact 100 85 70 60
Fraying or breakage of feathers 95 80 65 55
<25% of coverts removed 90 75 60 50
25–50% of coverts removed 80 65 50 40
50–75% of coverts removed 70 55 40 30
75–90% of coverts removed 60 45 30 20
>90% of coverts removed 50 35 20 10
The percentage of damage to the covert and down feathers is assessed for each body part separately.
Deduct 10 points from the score if skin damage is present.
Total body plumage score (0–100) = 0.25 × (chest/flank) + 0.17 × (back) + 0.10 × (legs) + 0.28 × (dorsal wings) + 0.20 × (ventral wings).
To determine the total body plumage score, the scores for each body part are corrected for their relative body surface percentage, similar to scoring systems used in
human burn victims (Lund and Browder, 1944). These percentages (expressed as percentage of the total body surface area excluding the surface area of the head and
unfeathered parts of the legs) were determined in six grey parrots according to Mitchell (1929). Mean (± standard deviation) values for the various body parts were
25 ± 1.2% (chest/neck/flank), 17 ± 1.5% (back), 10 ± 1.2% (legs), 28 ± 2.2% (dorsal surface of the wings, up to the level of the tertiary feathers) and 20 ± 1.9% (ventral
surface of the wings, up to the level of the tertiary feathers).
(2) Score determination for flight feathers; used for tail feathers and primary and secondary feathers (wing feathers).

Score Description
0 Flight feather with signs of fraying and/or breakage over >50% of the original length
1 Flight feather with signs of fraying and/or breakage over <50% of the original length
2 Flight feather with little or no damage present
Damage to individual flight feathers is assessed.
Total flight feather score (0–100) = (primary + secondary feathers left wing) + (primary + secondary feathers right wing) + (tail feathers).
The maximum score is dependent on total number of flight feathers of the bird. In general, each wing has 10 primary feathers and 10 secondary feathers (remiges),
whereas the tail has 10–12 flight feathers (rectrices). As each individual flight feather is awarded a score from 0–2, the score will range from 0–40 for each wing and
from 0–20 (or 0–24 in the case of 12 tail feathers) for the tail, respectively.
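
Part (1) of Table 2 amounts to a table lookup per body part, an optional 10-point deduction for skin damage, and a surface-weighted sum. The sketch below illustrates that calculation under stated assumptions: the category labels, dictionary keys and function names are hypothetical, and it is not the automated programme mentioned in the supplementary material.

```python
# Column order of Table 2, part (1): proportion of down removed.
DOWN_LEVELS = ("none", "<50%", ">50%", "all")

# Rows of Table 2, part (1): condition of the coverts.
# The short labels are illustrative names for the table rows.
PART_A_SCORES = {
    "intact": (100, 85, 70, 60),
    "frayed": (95, 80, 65, 55),
    "<25%": (90, 75, 60, 50),
    "25-50%": (80, 65, 50, 40),
    "50-75%": (70, 55, 40, 30),
    "75-90%": (60, 45, 30, 20),
    ">90%": (50, 35, 20, 10),
}

# Relative body surface weights from the total body plumage score formula.
WEIGHTS = {
    "chest_flank": 0.25,
    "back": 0.17,
    "legs": 0.10,
    "dorsal_wings": 0.28,
    "ventral_wings": 0.20,
}


def region_score(coverts, down, skin_damage=False):
    """Look up one body part in Table 2; deduct 10 points for skin damage."""
    score = PART_A_SCORES[coverts][DOWN_LEVELS.index(down)]
    return score - 10 if skin_damage else score


def total_body_plumage_score(region_scores):
    """Weight each body part score by its relative surface area and sum."""
    return sum(WEIGHTS[part] * region_scores[part] for part in WEIGHTS)


# Hypothetical bird, for illustration only.
bird = {
    "chest_flank": region_score("25-50%", "<50%"),
    "back": region_score("intact", "none"),
    "legs": region_score("<25%", "none"),
    "dorsal_wings": region_score("intact", "none"),
    "ventral_wings": region_score("50-75%", ">50%", skin_damage=True),
}
print(round(total_body_plumage_score(bird), 2))  # 76.25
```

Keeping the lookup and the weighting in separate functions mirrors the two steps in the table notes: each body part is scored on its own, and the surface correction is applied only when the total is formed.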

No questions were included on the status of the flight and tail feathers (needed to determine the score for part B of the novel scoring system), since it was impossible to spread the wings and tail sufficiently to allow accurate assessment of the individual flight and tail feathers from the photographs.

Statistical analysis

Statistical analyses were performed using SPSS version 16.0. The relative inter- and intra-observer reliabilities of each scoring system were assessed using the intra-class correlation coefficient (ICC). A two-way random effects model for single measurements (ICC[2,1]) was used to calculate the ICC and corresponding 95% confidence intervals (CIs) for the total plumage scores (Fleiss and Cohen, 1973; Shrout and Fleiss, 1979). For ICC, values range from 0 to 1, with a value of ‘0’ representing no agreement and a value of ‘1’ representing perfect agreement. In general, values >0.75 and >0.90 indicate good and excellent agreement, respectively (Fleiss, 1981; Portney and Watkins, 2000).

Bland–Altman analysis was performed to determine absolute reliability (Bland and Altman, 1996; Weir, 2005). The inter-observer standard error of measurement (SEM) was calculated using the scores of all examiners simultaneously. The intra-observer SEM was determined as the mean (± standard deviation) of the individual SEMs (which were calculated using the repeated scores of 10 parrots for each examiner separately). Subsequently, the SEM was used to determine the minimum real difference, with a reliability of 95% (SEM × 1.96 × √2), for qualitative interpretation of the results (Weir, 2005). Any change in a parrot’s plumage score above or below the previous plumage score that was greater than this minimum difference was considered to reflect a real difference in 95% of the cases.

An independent samples t test was conducted to compare the relative inter- and intra-observer reliabilities obtained with each scoring system. To compare absolute inter- and intra-observer reliabilities (SEM, minimum difference) an F test was performed. Normality of distribution for each parameter was evaluated using the Shapiro–Wilk test. Standard deviations, necessary for the t test, were calculated according to the method described by Kenney and Keeping (1962). P values <0.05 were considered to be statistically significant. For ‘post hoc’ comparisons of the scoring methods within the two subgroups, a Bonferroni correction was performed, leading to an adjusted significance level of 0.025.

Results

Relative reliability and inter- and intra-observer reliabilities

ICC[2,1] was 0.90 for Meehan’s scoring system and 0.95 for the novel scoring system. These values indicate excellent intra-observer reliability (>0.90). The novel scoring system had significantly higher reliability, especially when comparing the consistency in scoring for the group of veterinarians who were experienced in examining avian plumage (P < 0.01; Table 3). Both systems had overall good inter-observer reliability (Meehan’s system 0.83; novel scoring system 0.89) and there were no significant differences between the two scoring systems (Table 4). No significant differences in intra- or inter-observer reliabilities were found between the two observer groups for either scoring system (Tables 3 and 4).

Absolute reliability, measurement error and minimum change

The SEM for individual observers ranged from 0.19 to 0.89 (mean 0.53 ± 0.19) on a scale of 0–10 for Meehan’s scoring system and from 1.5 to 8.4 (mean 3.6 ± 1.7) on a scale of 0–100 for the novel scoring system (Table 3). The minimum real difference (i.e. the minimum difference between two measurements that reflects a real difference at a confidence level of 95%) was 1.5 ± 0.5 points (14.7 ± 5.3%) on the 10-point scale for the scoring system of Meehan. For the novel scoring system, the absolute intra-observer reliability was significantly higher (9.9 ± 4.7 points on a scale of 0–100; P < 0.01; Table 3; Fig. 1).

The SEM for inter-observer reliability was 0.72 for the 10-point scale of Meehan’s scoring system and 5.8 for the 100-point scale of the novel scoring system. There was no significant difference between the absolute inter-observer reliability of the novel scoring system (13.5%) and Meehan’s scoring system (17%; Table 4).

Discussion

The inter- and intra-observer reliabilities of both scoring systems in our study were >0.75 and >0.90, respectively, which is considered to reflect good and excellent agreement (Fleiss, 1981; Portney and Watkins, 2000). Similar to other studies (Leggin et al., 1996; Sokoloff and Blumberg, 2002), intra-observer reliabil-

Table 3
Relative and absolute intra-observer reliabilities (mean ± standard deviation) for the total feather score using Meehan’s feather scoring system and the novel feather scoring
system.

Observer                          Feather scoring system of Meehan et al. (2003)                   Novel feather scoring system
                                  ICC[2,1] (95% CI)   SEM (0–10)    Minimum difference (%)   ICC[2,1] (95% CI)    SEM (0–100)   Minimum difference (%)
Inexperienced observers (n = 9)   0.92 (0.88–0.95)    0.51 ± 0.24   14.0 ± 6.7               0.94 (0.90–0.98)     4.0 ± 2.1     11.0 ± 5.8
Experienced observers (n = 9)     0.89 (0.84–0.93)    0.56 ± 0.16   15.5 ± 4.3               0.96 (0.94–0.98)ᵃ    3.2 ± 1.2ᵃ    8.9 ± 3.3ᵃ
All observers (n = 18)            0.90 (0.87–0.93)    0.53 ± 0.19   14.7 ± 5.3               0.95 (0.93–0.97)ᵃ    3.6 ± 1.7ᵃ    9.9 ± 4.7ᵃ

ICC[2,1], Intra-class correlation coefficient; SEM, Standard error of measurement; CI, Confidence interval.
ᵃ Significant difference (P < 0.01) from the feather scoring system of Meehan et al. (2003).
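
The minimum difference column follows from the SEM through the formula given in the statistical analysis (SEM × 1.96 × √2). A minimal sketch of that arithmetic is shown here; the function names are illustrative assumptions. With the mean intra-observer SEM of 0.53 for Meehan's system it returns about 1.47 points, or 14.7% of the 10-point scale, in line with Table 3.

```python
import math


def minimum_real_difference(sem, z=1.96):
    """Minimum real difference at 95% confidence: SEM x 1.96 x sqrt(2)."""
    return sem * z * math.sqrt(2)


def as_percent_of_scale(points, scale_max):
    """Express a difference in points as a percentage of the scale range."""
    return 100.0 * points / scale_max


# Mean intra-observer SEM for Meehan's system (0-10 scale), from Table 3.
mrd_points = minimum_real_difference(0.53)
print(round(mrd_points, 2))                           # 1.47
print(round(as_percent_of_scale(mrd_points, 10), 1))  # 14.7
```

The √2 factor reflects that a change score combines the measurement error of two separate scorings.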

Table 4
Relative and absolute inter-observer reliabilities for the total feather score using Meehan’s scoring system (Meehan et al., 2003) and the novel feather scoring system.

Observers Feather scoring system of Meehan et al. (2003) Novel feather scoring system
ICC[2, 1] (95% CI) SEM (0–10) Minimum difference (%) ICC[2, 1] (95% CI) SEM (0–100) Minimum difference (%)
Inexperienced observers (n = 10) 0.82 (0.67–0.93) 0.70 16 0.88 (0.77–0.95) 5.86 13.6
Experienced observers (n = 7) 0.82 (0.65–0.93) 0.76 18 0.91 (0.81–0.96) 5.71 13.3
All observers (n = 17) 0.83 (0.70–0.93) 0.72 17 0.89 (0.80–0.95) 5.82 13.5

ICC[2,1], Intra-class correlation coefficient; SEM, Standard error of measurement; CI, Confidence interval.
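
For readers unfamiliar with ICC[2,1], the coefficient can be computed from the mean squares of a two-way analysis of variance, following the definition of Shrout and Fleiss (1979). The sketch below is a plain-Python illustration rather than the SPSS procedure used in the study, and the small ratings matrix is illustrative data, not data from this study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rating.

    ratings holds one row per subject and one column per rater.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-subjects mean square
    jms = ss_cols / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative ratings: six subjects scored by four raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))  # 0.29
```

Because the between-raters mean square enters the denominator, systematic differences between raters lower ICC(2,1), which is why it measures absolute agreement rather than correlation alone.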

Fig. 1. Intra-observer data for (A) Meehan’s feather scoring system (Meehan et al., 2003) and (B) the novel scoring system. The first and second score for each individual
parrot (n = 10) by the different observers (n = 18) are plotted against each other to depict intra-observer reliability for experienced observers (dots) and inexperienced
observers (asterisks). The graph furthermore depicts the ideal correlation between the two measurements (x = y; continuous line) and the minimum difference considered to
be real in 95% of cases (interrupted lines).

ities were higher than inter-observer reliabilities, indicating that the highest level of reliability can be found when consecutive scorings of a bird are performed by the same examiner. The significantly higher intra-observer reliability obtained with the novel scoring system may be linked to the use of objectively definable criteria to quantify the amount of feather damage compared to the more subjective terminology used in Meehan’s scoring system.

However, a high ICC does not always imply an acceptable reliability (Costa-Santos et al., 2011). When large ranges in scores are present, a high ICC may be obtained, while actual scores display large within-subject differences. In addition, factors such as the design of the study, variance within the sampled population and the number of observers and/or subjects may influence the height of the ICC, limiting the ability to extrapolate the findings more widely (Müller and Büttner, 1994; Walter et al., 1998; Saito et al., 2006). The complementary use of an absolute measure of reliability, e.g. the coefficient of variation (Portney and Watkins, 1993) or limits of reliability (Bland and Altman, 1986), is therefore recommended.

In our study, a Bland–Altman analysis was performed to determine the SEM and the minimum real difference in 95% of cases. The smallest minimum real differences (i.e. 9.9 ± 4.7 points or 10 ± 4.7% on a scale of 0–100) were achieved on consecutive scoring by the same observer using the novel scoring system. On the basis of our experience, this minimum difference of 10% is considered to be clinically important, rendering the scoring system suitable for use in practice. Using Meehan’s scoring system, a significantly greater difference between scores (at least 15 ± 5.3% instead of 10 ± 4.7%) is necessary to ascertain that this difference is real in 95% of cases. Therefore, the novel scoring system may be able to detect real changes at an earlier stage than Meehan’s system. The mean values obtained for each system may be used as a guide for interpretation of scores in clinical practice. However, since

the minimum real difference varies greatly between observers, it is advisable that the minimum real difference is assessed for each individual observer.

The minimum real differences for both scoring systems indicate that slight changes in feather damaging behaviour (<10% for the novel scoring system and <15% for Meehan’s scoring system) cannot be detected reliably. Since it may take from 4 weeks (covert or down feathers) (Wang, 1943) to >1 year (flight or tail feathers) (Juniper and Parr, 1998) for feathers to regrow, sufficient time should elapse between two scoring sessions to evaluate the efficacy of preventive or therapeutic interventions for feather damaging behaviour. Further studies are needed to determine whether smaller differences can be detected by direct evaluation of the parrot’s plumage condition instead of using photographs.

In addition to its higher relative and absolute intra-observer reliabilities, the novel scoring system may also result in a more accurate representation of the bird’s plumage condition due to (1) the assessment of feather damage in percentages; (2) the incorporation of relative body surface area of the affected body parts (a methodology that is routinely used in human burn victims); and (3) the use of a separate system to address damage to the flight and tail feathers, which usually involves a different type of damage (chewing or fraying) and a different time frame (until the next moult) to restore damage (Nett and Tully, 2003). Thus, when high accuracy and/or high consistency of the observations are needed, the novel scoring system may be preferable to the system of Meehan.

However, the choice for a specific scoring system may also depend on other factors, such as simplicity of the system or the time needed to complete the scoring, which are advantages of Meehan’s scoring system (although scoring time may be decreased considerably for the novel scoring system when using an automated programme to calculate the scores; see Appendix A: Supplementary file 2). The choice may also depend on the conditions under which the scoring can be performed or the purpose for which the scoring is used. The scoring system of Meehan is ideal for evaluation of the plumage condition of a bird from a distance (e.g. in the wild or in an aviary), which hinders the accurate estimation of percentage of feathers damaged, as well as assessment of the ventral surface of the wings. Conversely, the novel scoring system is designed to enable the feather condition to be evaluated either by using photographs or by direct visualisation while the bird is manually restrained during clinical examination.

Due to the inability to accurately assess the individual flight feathers from the photographs, evaluation of damage to the primary, secondary and tail feathers with the novel scoring system was excluded in this study, thereby omitting the reliability of this part of the scoring system. Direct visualisation of a manually restrained bird would be necessary to allow proper evaluation of the individual flight and tail feathers, but this was not feasible in the present study due to the use of assessors at different institutes. Although the novel scoring system has not been fully evaluated, the assessment of flight and tail feathers is relatively straightforward (i.e. no damage, <50% of the feather damaged or ≥50% of the feather damaged). Therefore no marked influence on the outcome of the present study would be expected by inclusion of these feathers in the analysis. Furthermore, the tail and flight feathers do not need to be routinely assessed when evaluating feather damaging behaviour, since the damage inflicted to these feathers (chewing, fraying) does not resolve until the next moult and therefore is not useful for evaluating short-term changes in behaviour.

Conclusions

In a comparison of a novel feather scoring system with the existing (Meehan) system, there was good to excellent agreement between total feather scores within and between observers for both scoring systems. The relative and absolute intra-observer reliabilities were highest for the novel system, demonstrating its value for monitoring feather damaging behaviour over time. Since both systems had an overall good consistency, the preference for a certain scoring system will be based on additional factors, such as the clinical setting in which a parrot’s plumage condition is assessed. Since intra-observer reliability was higher than inter-observer reliability for both scoring systems, it is recommended that individual birds should be assessed by the same observer.

Conflict of interest statement

None of the authors has any financial or personal relationships that could inappropriately influence or bias the content of the paper.

Acknowledgements

The authors would like to thank all the participants for their time and contribution to the study, J. Fama and the veterinary technicians of the Division of Zoological Medicine of Utrecht University for assistance with taking the photographs of the parrots and Dr J.C.M. Vernooij, Dr G.P.A. Bergers and Mrs M. de Kam for help with statistical analysis. This study was made possible with the help of a grant received from the Dutch Ministry of Economic affairs, Agriculture and Innovation (Grant No. 1400001780).

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.tvjl.2012.08.020.

References

Bilcik, B., Keeling, I.J., 1999. Changes in feather condition in relation to feather pecking and aggressive behaviour in laying hens. British Poultry Science 40, 444–451.
Bland, J.M., Altman, D.G., 1986. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 327, 307–310.
Bland, J.M., Altman, D.G., 1996. Measurement error. British Medical Journal 313, 744.
Bruton, A., Conway, J.H., Holgate, S.T., 2000. Reliability: What is it, and how is it measured? Physiotherapy 86, 94–99.
Costa-Santos, C., Bernardes, J., Ayres de Campos, D., Costa, A., Costa, C., 2011. The limits of agreement and the intraclass correlation coefficient may be inconsistent in the interpretation of agreement. Journal of Clinical Epidemiology 64, 264–269.
Davis, C., 1995. Behavior modification counselling – An alliance between veterinarian and behaviour consultant. Seminars in Avian and Exotic Pet Medicine 4, 39–42.
Fleiss, J.L., Cohen, J., 1973. The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement 33, 613–619.
Fleiss, J.L., 1981. Statistical Methods for Rates and Proportions, Second Ed. John Wiley, New York, USA, 352 pp.
Gaskins, L.A., Bergman, L., 2011. Surveys of avian practitioners and pet owners regarding common behavior problems in psittacine birds. Journal of Avian Medicine and Surgery 25, 111–118.
Grindlinger, H.M., Ramsay, E., 1991. Compulsive feather picking in birds. Archives of General Psychiatry 48, 857.
Harrison, G.J., 1986. Disorders of the integument. In: Ritchie, B.W., Harrison, G.J., Harrison, L.R. (Eds.), Avian Medicine and Surgery: Principles and Application. W.B. Saunders, Philadelphia, Pennsylvania, USA, pp. 509–524.
Juniper, T., Parr, M., 1998. Guide to Parrots of the World. Pica Press, Robertsbridge, East Sussex, UK, 584 pp.
Kaleta, E.F., 2003. Verhalten und verhaltensstorungen. In: Kaleta, E.F., Krautwald-Junghanns, M.E. (Eds.), Kompendium der Ziervogelkrankheiten. Schlutersche, Hannover, Germany, pp. 41–45.
Keating, J., Matyas, T., 1998. Unreliable inferences from reliable measurements. Australian Journal of Physiotherapy 44, 5–10.
Kenney, J.F., Keeping, E.S., 1962. Chapter 6.5: The standard deviation; Chapter 6.6: Calculation of the standard deviation. In: Kenny, J.F. (Ed.), Mathematics of Statistics, Pt. 1, Third Ed. Van Nostrand, Princeton, New Jersey, USA, pp. 77–80.

Kjaer, J.B., Glawatz, H., Scholz, B., Rettenbacher, S., Tauson, R., 2011. Reducing stress during welfare inspection: Validation of a non-intrusive version of the LayWel plumage scoring system for laying hens. British Poultry Science 52, 149–154.
Leggin, B.G., Neumann, R.M., Iannotti, J.P., Williams, G.R., Thompson, E.C., 1996. Intrarater and interrater reliability of three isometric dynamometers in assessing shoulder strength. Journal of Shoulder and Elbow Surgery 5, 18–24.
Lumeij, J.T., Hommers, C.J., 2008. Foraging ‘enrichment’ as treatment for pterotillomania. Applied Animal Behaviour Science 111, 85–94.
Lund, C.C., Browder, N.C., 1944. The estimation of areas of burns. Surgery, Gynecology and Obstetrics 79, 352–358.
Meehan, C.L., Millam, J.R., Mench, J.A., 2003. Foraging opportunity and increased physical complexity both prevent and reduce psychogenic feather picking by young Amazon parrots. Applied Animal Behaviour Science 80, 71–85.
Mitchell, H.H., 1929. The surface area of single comb white leghorn chickens. Journal of Nutrition 5, 443–449.
Müller, R., Büttner, P., 1994. A critical discussion of intraclass correlation coefficients. Statistics in Medicine 13, 2465–2476.
Nett, C.S., Tully, T.N., 2003. Anatomy, clinical presentation and diagnostic approach to feather-picking pet birds. Compendium on Continuing Education for the Practicing Veterinarian 25, 206–219.
Portney, L.G., Watkins, M.P., 1993. Foundations of Clinical Research: Applications to Practice. Appleton and Lange, Norwalk, Connecticut, USA, 722 pp.
Portney, L.G., Watkins, M.P., 2000. Foundations of Clinical Research: Applications to Practice, Second Ed. Prentice Hall Health, Upper Saddle River, New Jersey, USA, 742 pp.
Ramsay, E.C., Grindlinger, H., 1994. Use of clomipramine in the treatment of obsessive behavior in psittacine birds. Journal of the Association of Avian Veterinarians 8, 9–15.
Rosskopf, W.J., Woerpel, R.W., 1996. Feather picking and therapy of skin and feather disorders. In: Rosskopf, W.J., Woerpel, R.W. (Eds.), Diseases of Cage and Aviary birds, Third Ed. Williams and Wilkins, Baltimore, Maryland, USA, pp. 397–405.
Saito, Y., Sozu, T., Hamada, C., Yoshimura, I., 2006. Effective number of subjects and number of raters for inter-rater reliability studies. Statistics in Medicine 25, 1547–1560.
Seibert, L.M., 2007. Pharmacotherapy for behavioural disorders in pet birds. Journal of Exotic Pet Medicine 16, 30–37.
Seibert, L.M., Crowell-Davis, S.L., Wilson, G.H., Ritchie, B.W., 2004. Placebo-controlled clomipramine trial for the treatment of feather-picking disorder in cockatoos. Journal of the American Animal Hospital Association 40, 261–269.
Shrout, P.E., Fleiss, J.L., 1979. Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin 86, 420–428.
Sokoloff, G., Blumberg, M.S., 2002. Contributions of endothermy to huddling behavior in infant Norway rats (Rattus norvegicus) and Syrian gold hamsters (Mesocricetus auratus). Journal of Comparative Psychology 116, 240–246.
Streiner, D.L., Norman, G.R., 2003. Health Measurement Scales: A Practical Guide to their Development and Use. Oxford University Press, New York, USA, 296 pp.
Van Hoek, C.S., King, C.E., 1997. Causation and influence of environmental enrichment on feather picking of the crimson-bellied conure (Pyrrhura perlata perlata). Zoo Biology 16, 161–172.
Van Zeeland, Y.R.A., Spruijt, B.M., Rodenburg, T.B., Riedstra, B., van Hierden, Y.M., Buitenhuis, B., Korte, S.M., Lumeij, J.T., 2009. Feather damaging behaviour in parrots: A review with consideration of comparative aspects. Applied Animal Behaviour Science 121, 75–95.
Walter, S.D., Eliasziw, M., Donner, A., 1998. Sample size and optimal designs for reliability studies. Statistics in Medicine 17, 101–110.
Wang, H., 1943. The morphogenetic functions of the epidermal and dermal components of the papilla in feather regeneration. Physiological Zoology 16, 325–349.
Weir, J.P., 2005. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research 19, 231–240.
