
Automatic Detection of Pain Intensity

Zakia Hammal
Carnegie Mellon University, Robotics Institute
5000 Forbes Ave, Pittsburgh, PA 15213, USA
zhammal@andrew.cmu.edu

Jeffrey F. Cohn (1, 2)
(1) Carnegie Mellon University, (2) University of Pittsburgh
210 S. Bouquet Street, Pittsburgh, PA 15213, USA
jeffcohn@cs.cmu.edu

ABSTRACT
Previous efforts suggest that the occurrence of pain can be detected from the face. Can the intensity of pain be detected as well? The Prkachin and Solomon Pain Intensity (PSPI) metric was used to classify four levels of pain intensity (none, trace, weak, and strong) in 25 participants with previous shoulder injury (UNBC-McMaster Pain Archive). Participants were recorded while they completed a series of movements of their affected and unaffected shoulders. From the video recordings, the canonical normalized appearance of the face (CAPP) was extracted using active appearance modeling. To control for variation in face size, all CAPP were rescaled to 96x96 pixels. CAPP was then passed through a set of Log-Normal filters consisting of 7 frequencies and 15 orientations to extract 9216 features. To detect pain level, 4 support vector machines (SVMs) were separately trained for the automatic measurement of pain intensity on a frame-by-frame level, using both 5-folds cross-validation and leave-one-subject-out cross-validation. F1 for each level of pain intensity ranged from 91% to 96% for 5-folds cross-validation and from 40% to 67% for leave-one-subject-out cross-validation. Intra-class correlation, which assesses the consistency of continuous pain intensity between manual and automatic PSPI, was 0.85 and 0.55 for 5-folds and leave-one-subject-out cross-validation, respectively, which suggests moderate to high consistency. These findings show that pain intensity can be reliably measured from facial expression in participants with orthopedic injury.

Categories and Subject Descriptors
J. [Computer Applications]: J.3 [Life and Medical Sciences]; I.5.4 [Pattern Recognition Applications]; H.1.2 [User/Machine Systems]: Human information processing, Human Factors.

General Terms
Algorithms, Measurement, Performance, Design, Experimentation, Human Factors.

Keywords
Facial Expressions, Pain, Intensity, Active Appearance Models (AAMs), Log-Normal filters, Support Vector Machines (SVMs).

1. INTRODUCTION
Pain assessment and management are important across a wide range of disorders and treatment interventions. The assessment of pain is accomplished primarily through the subjective reports of patients, caregivers, or medical staff. While convenient and useful, subjective reports have several limitations. These include inconsistent metrics, reactivity to suggestion, efforts at impression management or deception, and differences between clinicians' and sufferers' conceptualizations of pain. Further, self-report cannot be used with children or with patients who have certain neurological impairments or dementia, or who are in transient states of consciousness or require breathing assistance [1]. Biomedical research has found that pain can be detected reliably from facial expression [26], [6]. Recent efforts in affective computing suggest that automatic detection of pain from facial expression is a feasible goal. Several groups have automatically distinguished pain from the absence of pain [14], [1], [16], [10]. For clinical or experimental utility, pain intensity needs to be measured as well.

Automatic measurement of pain from the face is challenging for at least two reasons. One is the lack of training and testing data of spontaneous (un-posed and unscripted) behavioral observations in individuals who have clinically relevant pain. The other is the difficulty of face and facial-feature analysis and segmentation in real-world settings, such as medical clinics. The recent distribution of the UNBC-McMaster Pain Archive addresses the need for well-annotated facial expression recordings during acute pain induction in a clinical setting. Using the UNBC-McMaster or other data sources, several approaches have been proposed to detect the occurrence of pain [23], [14], [1], [16], [10]. Notably, [11] proposed the more demanding task of detecting ordinal pain intensity.

In the current contribution, we extend the state of the art in pain recognition by automatically detecting four levels of pain intensity, consistent with the Prkachin and Solomon Pain Intensity (PSPI) metric, in participants with orthopedic injuries. The PSPI [27] is a well-validated approach to the manual measurement of pain that is based on the Facial Action Coding System (FACS) [9]. Pain is measured on an ordinal scale.

Previous work has found that active appearance models (AAMs) are a powerful means of analyzing spontaneous pain expression [1], [16]. These approaches use gray-scale features, however, which may be less robust to the head pose variation that is common in pain [17]. [3] found that biologically inspired features (e.g., Gabor magnitudes) are more robust to the registration error introduced by head rotation. [14] used biologically inspired features to discriminate real from feigned pain.

In the current paper we couple AAMs with biologically inspired features to assess pain intensity in participants with significant orthopedic injury. Using video from the recently released UNBC-McMaster Shoulder Pain Archive [18], AAMs are used to track and register rigid and non-rigid face motion [16] in each video frame. Based on this information, the canonical appearance of the face (CAPP) is extracted for each frame, rescaled to 96x96 pixels, and passed through a set of Log-Normal filters of 7 frequencies and 15 orientations [12]. The extracted spatial face representation is then aligned as a vector of 9216 features used by four SVMs trained separately for the automatic measurement of four levels of pain intensity.

The paper is organized as follows: Section 2 describes the pain intensity metric developed by Prkachin and Solomon [27]; Section 3 describes the UNBC-McMaster Shoulder Pain Expression Archive [18] used in the current paper; Section 4 describes the different steps of the proposed automatic model for pain intensity measurement; and Section 5 presents the obtained performances.

2. PRKACHIN AND SOLOMON PAIN INTENSITY METRIC (PSPI)
Many efforts have been made in human behavior studies to identify reliable and valid facial indicators of pain (e.g., [26], [6], [20], [27]). In these studies pain expression is widely characterized by the activation of a small set of facial muscles and coded by a set of corresponding action units (AUs): brow lowering (AU 4), orbital tightening (AU 6 and AU 7), levator labii raise (AU 9 and AU 10), and eye closure (AU 43) (see Figure 1). With the exception of AU 43, which is binary, each of these actions is measured on a six-point ordinal scale (0 = absent, 5 = maximum) using FACS. In a recent study, Prkachin and Solomon [27] confirmed that pain information is effectively contained in these AUs and defined pain intensity as the sum of their intensities. The Prkachin and Solomon FACS pain intensity (PSPI) scale is defined as:

Pain = AU4 + (AU6 || AU7) + (AU9 || AU10) + AU43     (1)

where A || B denotes taking the maximum of the two AU intensities.

Figure 1 shows an example of a face in pain from the UNBC-McMaster Shoulder Pain Expression Archive with the corresponding AUs and their intensities. In this example, pain intensity using the PSPI metric is computed as:

Pain = Intensity(AU4) + Max(Intensity AU6, Intensity AU7) + Max(Intensity AU9, Intensity AU10) + Intensity(AU43)

Taking into account the intensity of each AU, the PSPI in this example is equal to 12:

Pain = 4 + Max(3, 4) + Max(2, 3) + 1 = 4 + 4 + 3 + 1 = 12.

Figure 1. Example of a painful face from the UNBC-McMaster Shoulder Pain Expression Archive with the corresponding Action Units and their intensities (i = intensity of each AU).
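For illustration, Equation 1 can be written directly as code. The following minimal Python sketch is our own illustration, not code from the study; it reproduces the Figure 1 example:

```python
def pspi(au):
    """Prkachin and Solomon pain intensity, Equation 1:
    Pain = AU4 + max(AU6, AU7) + max(AU9, AU10) + AU43."""
    return (au.get(4, 0)
            + max(au.get(6, 0), au.get(7, 0))
            + max(au.get(9, 0), au.get(10, 0))
            + au.get(43, 0))        # AU43 (eye closure) is binary, 0 or 1

# Worked example from Figure 1: AU4=4, AU6=3, AU7=4, AU9=2, AU10=3, AU43=1
assert pspi({4: 4, 6: 3, 7: 4, 9: 2, 10: 3, 43: 1}) == 12
```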

3. DATABASE
The evaluation of the proposed model for the recognition of pain expression intensities is made on the UNBC-McMaster Shoulder Pain Expression Archive [18]. The database is composed of 129 participants (63 males and 66 females) who self-identified as having problems with shoulder pain. Participants were recorded during a series of movements to test their affected and unaffected shoulders under active and passive conditions. In the active condition, participants initiated shoulder rotation on their own. In the passive condition, a physiotherapist was responsible for the movement. Sony digital cameras recorded participants' facial expressions. In the active condition, camera orientation was initially frontal; in the passive condition, camera orientation was about 70° from frontal. In both conditions, moderate changes in pose were common as participants contorted in pain. Videos were captured at a resolution of 320x240 pixels, out of which the face area spanned an average of approximately 140x200 pixels [18].

Three FACS-certified coders coded facial action units on a frame-by-frame basis. Each AU was coded on a 6-level intensity dimension (from 0 = absent and 1 = trace to 5 = maximum) by one of three coders, each of whom had previously demonstrated proficiency by passing the FACS final test [27]. A fourth coder, who had also demonstrated proficiency on the final test, then reviewed all coding. Finally, to assess inter-observer agreement, 1738 frames selected from one affected-side trial and one unaffected-side trial of 20 participants were randomly sampled and independently coded. Inter-coder percent agreement as calculated by the Ekman-Friesen formula [8] was 95%, which compares favorably with other research in the FACS literature [27]. For a detailed description of the database, please refer to [27] and [18]. Pain intensity was finally annotated frame-by-frame using the PSPI metric described in Section 2 (Equation 1). Figure 2 shows examples from the UNBC-McMaster Shoulder Pain Expression Archive with the corresponding PSPI scores. In addition to the PSPI, participants completed three self-report pain measures after each test to rate the maximum pain they experienced: the sensory (SEN) and affective (AFF) verbal pain descriptors and the Visual Analog Scale (VAS).

The recently released version of the UNBC-McMaster Pain Archive includes 200 video sequences from 25 subjects in the active condition [18]. Of the 200 video sequences, 100 were for the affected shoulder, for which pain could be experienced. Of these, we used the first 53 sequences, which consisted of 16657 frames of FACS-coded video. Using Equation 1 and the six-point scale of AU intensity, the PSPI could theoretically vary from 0 to 16. The obtained observer scores ranged from 0 to 12. Because some gradations of pain were sparsely represented, the PSPI scores were pooled to create a 4-point scale, as shown in Table 1.

Table 1. Number of video frames for each level of pain.

  Pain severity       No pain   Trace   Weak   Strong
  PSPI score                0       1      2       ≥3
  Number of frames      12503    1263   1239     1652

Figure 2 shows examples of each of the four intensity levels; the pooling rule itself is sketched just below. The following section describes the proposed machine learning method for the automatic measurement of these four levels.
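As a concrete footnote to Table 1, the pooling amounts to clipping the raw PSPI score at 3. A hypothetical helper (the function name is ours, not the study's):

```python
def pool_pspi(score):
    """Map a raw PSPI score (0-16 in principle, 0-12 observed here) to the
    4-point scale: 0 = no pain, 1 = trace, 2 = weak, 3 = strong (PSPI >= 3)."""
    return min(score, 3)

assert [pool_pspi(s) for s in (0, 1, 2, 3, 12)] == [0, 1, 2, 3, 3]
```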
4. AUTOMATIC MEASUREMENT OF PAIN EXPRESSION INTENSITY
4.1 Preliminary analyses
An anatomically based measure of pain intensity, the PSPI, was used to train automatic detectors of pain intensity. A preliminary question was how well the PSPI correlated with the self-report measures of pain. To evaluate the correspondence between the PSPI and participants' self-reported pain intensity, the self-report measures were compared with each other and then with the PSPI. The three self-report measures were highly inter-correlated (r = 0.94 to r = 0.96, all p < 0.001), which represents high consistency between self-reports of pain intensity. To compare the PSPI to the self-report measures, we considered the PSPI score of a sequence as the maximum intensity over all of its images. The Pearson's correlations between the PSPI and the self-reported pain measures were r = 0.61 or higher, all p < 0.001. The obtained correlations suggest a moderate to strong association between the PSPI and self-reported pain intensity.
geomettric variations asssociated with thhe object shape (e.g., mouth
44.2 Overvieew openingg, eyes shutting,, etc.). Procrustees alignment [5] is employed
TTo automatically y measure the in ntensity of pain, active appearannce to estim
mate the base shaape s0 .
mmodels (AAMs) are first used to o extract the can
nonical appearan nce
oof the face (CAP PP) [19, 16]. The obtained CAP PP is then rescaleed For eacch participant, aapproximately 3%
% of frames weere manually
aand passed throu ugh a set of Log--Normal filters [12]. The extracteed labeledd in training the AAM. All frammes then were auutomatically
ffeatures are finallly given as inpu
uts to four separate support vecttor alignedd using a gradiennt descent AAMM fitting algorithm
m described
mmachines (SVMs) trained for th he automatic meeasurement of th he in [25].. Based on [19, 1 and 16] canonnical normalizedd appearance
ffour levels of paain intensity. A detailed descrip ption of the thrree CAPP (see Figure 3.bb) was derived ffrom shape and appearance
ssteps is given in the
t following su ubsections. parameeters of the AAM
Ms. Canonical nnormalized appearance a0
refers tto the situation w
where all the noon-rigid shape vvariation has
been nnormalized withh respect to thee base shape s0 . This is
accompplished by appplying a piece-w wise affine waarp on each
trianglee patch appearannce in the sourcee image so that itt aligns with
the bas e face shape s0 [16]. CAPP feaatures corresponnd to 87 × 93
synthessized grayscale face images. Figgure 3.b shows an example
of CAPPP [16].
In a reecent study of ppain analysis annd recognition uusing AAMs
Lucey and collaboratoors [16] found that canonical normalized
appearaance (CAPP) gives better performances for pain
recogniition compared to similarity shhape features. G Given these
results,, CAPP features are investigatedd in the current contribution
for paiin intensities m measurement. A Additionally, ennergy based
Fiigure 2. Examplles from the UNNBC-McMasterr Shoulder Pain
n represeentation (such aas Gabor amplituude) has been ffound to be
Expression Arcchive database with
w the corresponding pain highly discriminative ffor facial action rrecognition commpared to the
in
ntensity using th
he PSPI metric. grayscaale representatioon [7, 17]. Thhus, rather thann using the
grayscaale representatiion of the CA APP, the resuults of the
applicaation of a set off filters (using L
Log-Normal filteers), is used
44.3 Active Appearance
A e Model for the automatic recoggnition of 4 levells of pain intensiity.
AActive appearan nce models (AAAM) [4] have been
b successfullly
uused for face tracking and facial features extraction for f 4.4 L
Log-Normal Filters
sspontaneous facial expression an
nalysis and recog
gnition [19, 1, 16
6]. Pain faacial expressionn is characterizeed by the deformations of
AAAMs are defineed by a shape co omponent, s , and
a an appearance facial ffeatures (see secction 2). Figure 22a shows an exaample where
ccomponent, g , which jointly represent the shape
s and textu ure pain exxpression is chaaracterized by a deepening andd orientation
vvariability of thhe face [19]. The
T AAMs fit their shape an nd changee of the naso-labbial furrows as well as eyes cclosure. The
aappearance com mponents throuugh a gradientt-descent searcch, intensitty of pain can bbe characterizedd by the magnituude of such
aalthough other optimization
o meethods have beeen employed wiith deformmations (such as the strong or sooft appearance oof nasal root
ssimilar results [5]. wrinklees, the degree off eyes closure, ettc.). These deforrmations can
A 2D triangulateed mesh describes the shape s of an AAM. Th he be meaasured directly in the energy-bbased representaation by the
ccoordinates off the mesh vertices deffine the shap pe magnituude of energy aappearing in thee face after filterring process
comparred to the relaxaation state. For exxample, in Figurre 3 one can
s = [ x1 ,. y1 , x1 ,. y1 ,..., xn , yn ] , where n is the number of see thatt the appearancee of nasal root wwrinkles (Figure 33.b) leads to

49
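As a concrete reading of Equation 2, the following NumPy sketch reconstructs a shape from a base shape and weighted shape vectors; the dimensions and the random basis are illustrative assumptions, not the archive's actual AAM:

```python
import numpy as np

n, m = 68, 10                  # mesh vertices and shape modes (assumed values)
s0 = np.zeros(2 * n)           # base shape s0, e.g., from Procrustes alignment
S = np.random.randn(m, 2 * n)  # rows are the shape vectors s_1 .. s_m
p = np.random.randn(m)         # shape parameters p_1 .. p_m

s = s0 + p @ S                 # Equation 2: s = s0 + sum_i p_i * s_i
assert s.shape == (2 * n,)     # [x1, y1, ..., xn, yn]
```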
4.4 Log-Normal Filters
Pain facial expression is characterized by the deformation of facial features (see Section 2). Figure 2a shows an example where pain expression is characterized by a deepening and orientation change of the naso-labial furrows as well as eye closure. The intensity of pain can be characterized by the magnitude of such deformations (such as the strong or soft appearance of nasal root wrinkles, the degree of eye closure, etc.). These deformations can be measured directly in the energy-based representation by the magnitude of energy appearing in the face after the filtering process, compared to the relaxed state. For example, in Figure 3 one can see that the appearance of nasal root wrinkles (Figure 3.b) leads to a high-energy magnitude after the filtering process (see the whitest areas in Figure 3.c). Thus, instead of using the grayscale representation of CAPP [16], we investigated an additional filtering of the CAPP. To do so, a set of biologically based Log-Normal filters [12, 24] is applied to the extracted CAPP. Log-Normal filters were originally proposed by [24] for measuring frequencies and orientations in texture images for 3D shape detection. Log-Normal filters are a good model of the complex cells found in visual cortex V1. Notably, they share specific characteristics: they are symmetrical on a log-frequency scale; they are well defined even at very low frequency; and they sample the Fourier space completely (the interested reader can find a more detailed description of the Log-Normal filters in [24], and their application to facial feature extraction in [12]).

Figure 3. (a) Input frame, (b) Canonical Normalized Appearance (CAPP), (c) Log-Normal filtering.

Compared with the more commonly used Gabor filters, the Log-Normal filters are chosen because they may better sample the power spectrum and are easily tuned and separable in frequencies and orientations [12]. These attributes make them well suited for detecting features (such as the naso-labial furrows in the case of pain) at different scales and orientations [12]. The Log-Normal filters are defined as follows:

G_{i,j}(f, θ) = G_i(f) · G_j(θ) = A · (1/f) · exp(−(1/2)·(ln(f/f_i)/σ_f)²) · exp(−(1/2)·((θ − θ_j)/σ_θ)²)     (3)

where G_{i,j} is the transfer function of the filter at the ith frequency and the jth orientation; G_i(f) and G_j(θ) represent the frequency and orientation components of the filter, respectively; f_i is the central frequency; θ_j is the central orientation (θ_j = (180/15)·(j − 1)); σ_f is the frequency bandwidth (σ_f = ln(f_{i+1}/f_i)); σ_θ is the orientation bandwidth; and A is a normalization factor. The factor 1/f in Equation 3 accounts for the decrease of energy as a function of frequency, which on average follows a 1/f^α power law. This factor ensures that the sampling of the spectral information of the face takes into account the specific distribution of energy of the studied face at different scales. This property ensures that the facial information is optimally sampled over the different frequency bands [12].

To compute the magnitude of the response of the Log-Normal filters, the power spectrum of the normalized CAPP is multiplied by a bank of Log-Normal filters (15 orientations and 7 central frequencies; see [24, 12] for details on the filter bank design). The inverse Fourier transform (IFFT) is then computed to recover the spatial representation of the filtered CAPP. Figure 3.c shows an example of the magnitude response after Log-Normal filtering in the case of strong pain (PSPI ≥ 3).
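The filter bank and the frequency-domain filtering can be sketched as follows. This is a simplified illustration of Equation 3: the octave-spaced central frequencies, the orientation bandwidth, and the reduction of the 7 × 15 magnitude maps to a single 96x96 energy map are our assumptions, not the exact design of [12, 24]:

```python
import numpy as np

def lognormal_bank(size=96, n_freq=7, n_orient=15):
    """Build the 7 x 15 transfer functions of Equation 3 on an FFT grid."""
    fy, fx = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size), indexing="ij")
    f = np.hypot(fx, fy)                         # radial frequency of each bin
    theta = np.degrees(np.arctan2(fy, fx))       # orientation of each bin
    centers = 0.35 * 2.0 ** -np.arange(n_freq)   # f_i, one octave apart (assumed)
    sigma_f = np.log(2.0)                        # sigma_f = ln(f_{i+1} / f_i)
    sigma_t = 180.0 / n_orient                   # orientation bandwidth (assumed)
    bank = []
    with np.errstate(divide="ignore", invalid="ignore"):
        for fi in centers:
            radial = (1.0 / f) * np.exp(-0.5 * (np.log(f / fi) / sigma_f) ** 2)
            radial[f == 0] = 0.0                 # DC bin: log and 1/f undefined
            for j in range(n_orient):
                # angular distance to theta_j = (180/15)*j, modulo 180 degrees
                d = (theta - 12.0 * j + 90.0) % 180.0 - 90.0
                bank.append(radial * np.exp(-0.5 * (d / sigma_t) ** 2))
    return bank

def lognormal_features(capp):
    """Filter a 96x96 CAPP in the Fourier domain and return 9216 features.
    Summing the magnitude maps into one energy map is our assumption."""
    spectrum = np.fft.fft2(capp)
    responses = [np.abs(np.fft.ifft2(spectrum * g)) for g in lognormal_bank()]
    return sum(responses).ravel()                # 96*96 = 9216 features

features = lognormal_features(np.random.rand(96, 96))   # stand-in CAPP image
```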

4.5 Support Vector Machine Based Classification
To examine the possible description of pain intensities based on the Log-Normal filter responses, support vector machines (SVMs) [2] are used. Among other possible classifiers, SVMs are selected because they can cope with large representation spaces, are simple to train, and generalize well. SVMs are well suited to the high-dimensional representation of the Log-Normal filter responses, since their training complexity depends on the number of training examples rather than on the dimensionality of the input [22]. However, the training time complexity for an SVM is O(m³), where m is the number of training examples [2, 1]. A compromise between the best use of the training data and the training time is therefore necessary for an efficient use of SVMs. Previous work has shown the concurrent validity of SVMs for the recognition of spontaneous AU intensity (e.g., [22], [21]). In the current contribution, an SVM is built for each pain intensity level. Each SVM is trained on the corresponding images (see Section 3) using a linear basis function. Four linear-basis-function SVMs are thus employed to recognize separately the four levels of pain intensity defined in Section 3. Each SVM-based intensity detector is trained using positive examples, which consist of the frames that the PSPI metric labeled as equal to that particular intensity. The negative examples correspond to all the other frames, which the PSPI metric labeled with another intensity (see Table 1). The SVMs are trained on the (96x96 = 9216) pixel responses after the Log-Normal filtering process (see Figure 3.c). This representation takes into account the magnitude of the energy response as well as the distribution of this energy over the face (i.e., its spatial position). The generalization of the proposed SVM-based models for the recognition of pain intensity levels to new data is evaluated using both the standard 5-folds cross-validation process and the leave-one-subject-out validation process, in which all images of the test participant are excluded from training. Section 5 describes the obtained performances. In 5-folds cross-validation, train and test sets are independent with respect to frames (i.e., no frame included in training is used in testing). In leave-one-subject-out cross-validation, train and test are independent with respect to participants. Thus, leave-one-subject-out cross-validation affords a more rigorous test.
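A sketch of this training and evaluation protocol, using scikit-learn stand-ins (not the tooling used in the paper, and with random placeholder data):

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import LeaveOneGroupOut

# Placeholder data: 9216-dim Log-Normal responses per frame, pooled PSPI
# labels (0..3), and a participant id per frame (25 participants).
rng = np.random.default_rng(0)
X = rng.random((500, 9216))
y = rng.integers(0, 4, 500)
subject = rng.integers(0, 25, 500)

for train, test in LeaveOneGroupOut().split(X, y, groups=subject):
    # In the paper, training frames are additionally down-sampled to keep
    # the O(m^3) SVM training practical; that step is omitted here.
    for level in range(4):                      # one detector per pain level
        detector = LinearSVC().fit(X[train], y[train] == level)
        is_level = detector.predict(X[test])    # this level vs. all others
```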
5. PERFORMANCES
5.1 Classification Results
In this section, the performance of the proposed SVM classifiers for the four intensity levels is evaluated. Recall, precision, and F1 are used to quantify the performance of each classifier in comparison with ground truth, the PSPI. The 5-folds cross-validation, in which all video frames of the training set are removed from the testing set, is used first for the evaluation. The obtained performance of each SVM is reported in Table 2. The best results are obtained for no pain (PSPI = 0) and strong pain (PSPI ≥ 3). These results may be explained by the strong difference between these two intensity levels relative to the intermediate ones (see Figure 2). The obtained performances are encouraging given the task difficulty.

Table 2. Performance of pain intensity measurement compared to the manual PSPI (ground truth). CR = Classification Rate (%), PR = Precision (%), F1 = F1 measure (%).

              5-folds cross-validation   Leave-one-subject-out validation
  Intensity     CR    PR    F1              CR    PR    F1
  0             97    95    96              61    65    57
  1             96    97    92              72    37    67
  2             96    97    91              79    35    40
  ≥3            98    98    95              80    70    60

Five-folds cross-validation is not participant-independent: frames from the same participant may appear in both the training and testing sets. To control for this limitation, we next performed a leave-one-participant-out cross-validation in which each participant (i.e., all of the corresponding images) is removed from training. This validation explores how the proposed method for pain intensity measurement generalizes to a new set of participants who were not part of the training set. The leave-one-subject-out validation consists in building 25 classifiers for each of the four levels of pain intensity and iterating the process. In the leave-one-subject-out validation, the number of training frames from all the video sequences is prohibitively large for training an SVM, as the training time complexity for an SVM is O(m³), where m is the number of training examples [2, 1]. In order to make the learning process practical while making the best use of the training data, each video sequence was down-sampled [21]; training was thus performed on only 15% of the video frames, excluding one participant. The SVM testing is made on the left-out participant. Based on only 15% of the training data, the obtained F1 for each level of pain intensity (from 0 to ≥3) was 0.57, 0.67, 0.40, and 0.60, respectively.

5.2 Intra-Class Correlation Coefficient
The previous results are for category-level agreement. Here we compare the consistency of the 4-level automatic measurement with the 12-level PSPI. To do so, the reliability between the proposed method and the PSPI is quantified by the intra-class correlation coefficient (ICC) [28]. Values range within the interval [-1, 1]; an ICC of 1 would indicate perfect consistency. The ICC is typically used to calculate the reliability of judgments. Following Mahoor [21], the ICC is used to measure concurrent validity between automatic and manual coding of intensity. The ICC values between the automated measurements and the manually labeled pain intensity levels using the PSPI were 0.85 and 0.55 for 5-folds and leave-one-subject-out validation, respectively. The obtained ICC between the ground-truth PSPI and each of the estimated PSPIs suggests moderate to high consistency between manual and automatic measurement of pain intensity.
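For reference, one common variant, the ICC(3,1) of Shrout and Fleiss [28], can be computed as below. This is an illustrative implementation with placeholder ratings, not the paper's analysis code, and the specific ICC form used in the study is not stated:

```python
import numpy as np

def icc_3_1(ratings):
    """ICC(3,1) of Shrout & Fleiss [28] for an n_targets x k_raters matrix
    (here k = 2 'raters': manual PSPI and automatic estimate)."""
    r = np.asarray(ratings, dtype=float)
    n, k = r.shape
    grand = r.mean()
    bms = k * ((r.mean(axis=1) - grand) ** 2).sum() / (n - 1)    # between targets
    jms = n * ((r.mean(axis=0) - grand) ** 2).sum() / (k - 1)    # between raters
    ems = (((r - grand) ** 2).sum()
           - (n - 1) * bms - (k - 1) * jms) / ((n - 1) * (k - 1))  # residual
    return (bms - ems) / (bms + (k - 1) * ems)

manual = [0, 1, 3, 2, 0, 3]        # placeholder frame-level intensity labels
auto = [0, 1, 2, 2, 0, 3]
print(icc_3_1(np.column_stack([manual, auto])))
```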
6. CONCLUSION AND PERSPECTIVES
We used a combination of AAMs, Log-Normal filters, and SVMs to measure four levels of pain intensity in the UNBC-McMaster Pain Archive. Using both 5-folds cross-validation and leave-one-subject-out cross-validation, the results suggest that automatic pain intensity measurement in a clinical setting is a feasible task. Intra-class correlation results were within the acceptable range for behavioral measurement. Replication in shoulder-pain populations and applications to other types of pain would be next steps for future research.

The current work opens several additional directions for future investigation. One is to compare additional types of features (e.g., Gabor) and classifiers. Two is to evaluate whether pain intensity might be detected better by first detecting AU intensity and then calculating the PSPI from the result. In our work, classifiers were trained to detect PSPI scores directly, without first detecting individual AU intensities. Detection of AU intensity is in the early stages of research [21]. To our knowledge, no one has yet compared direct versus indirect measurement of the intensity of pain or other constructs.

Three, following previous work, we measured pain at the frame-by-frame level. However, pain expression is not static but results from the progressive deformation of facial features over time. A next investigation would be to include dynamics when measuring pain intensity.

And four, previous work in both pain and AU detection primarily regards head pose variation as a source of registration error. However, head pose is itself a potentially informative signal. In particular, head pose changes may themselves be a good indicator of pain [18] and pain intensity. We are currently exploring the dynamic characteristics of head orientation, such as (but not limited to) the speed, velocity, and acceleration of pain indicators. We believe explicit attention to dynamics is an exciting direction for further research.

7. REFERENCES
[1] Ashraf, A. B., Lucey, S., Cohn, J. F., Chen, T., Prkachin, K. M., and Solomon, P. E. (2009). The painful face: Pain expression recognition using active appearance models. Image and Vision Computing, 27, 1788–1796.
[2] Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2, 121–167.
[3] Chew, S. W., Lucey, P., Lucey, S., Saragih, J., Cohn, J. F., and Sridharan, S. In the pursuit of effective affective computing: The relationship between features and registration. IEEE Transactions on Systems, Man, and Cybernetics – Part B, in press.
[4] Cootes, T., Cooper, D., Taylor, C., and Graham, J. (1995). Active shape models – their training and application. Computer Vision and Image Understanding, 61(1), 38–59.
[5] Cootes, T., Edwards, G., and Taylor, C. (2001). Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681–685.
[6] Craig, K. D., Prkachin, K. M., and Grunau, R. V. E. (2001). The facial expression of pain. In D. C. Turk and R. Melzack (Eds.), Handbook of Pain Assessment (2nd ed.). New York: Guilford.
[7] Donato, G., Bartlett, M., Hager, J., Ekman, P., and Sejnowski, T. (1999). Classifying facial actions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(10), 974–989.
[8] Ekman, P., and Friesen, W. V. (1978). Manual for the Facial Action Coding System. Palo Alto, CA: Consulting Psychologists Press.
[9] Ekman, P., Friesen, W. V., and Hager, J. C. (2002). Facial Action Coding System. Salt Lake City, UT: Research Nexus, Network Research Information.
[10] Hammal, Z., and Kunz, M. (2012). Pain monitoring: A dynamic and context-sensitive system. Pattern Recognition, 45(4), 1265–1280.
[11] Hammal, Z. (2009). Context based recognition of pain expression intensities. The 5th Workshop on Emotion in Human-Computer Interaction – Real World Challenges, held at the 23rd BCS HCI Group conference, Cambridge University, Cambridge, UK, September 2009 (Fraunhofer VERLAG, 2010).
[12] Hammal, Z., and Massot, C. (2011). Gabor-like image filtering for transient feature detection and global energy estimation applied to multi-expression classification. In P. Richard and J. Braz (Eds.), Communications in Computer and Information Science (CCIS 229), pp. 135–153. Springer.
[13] Kunz, M., Scharmann, S., Hemmeter, U., Schepelmann, K., and Lautenbacher, S. (2007). The facial expression of pain in patients with dementia. Pain, 133(1-3), 221–228.
[14] Littlewort, G. C., Bartlett, M. S., and Kang, M. S. (2007). Faces of pain: Automated measurement of spontaneous facial expressions of genuine and posed pain. Proc. ICMI, Nagoya, Aichi, Japan, November 12–15, 2007.
[15] Lehr, V. T., Zeskind, P. S., Ofenstein, J. P., Cepeda, E., Warrier, I., and Aranda, J. V. (2007). Neonatal facial coding system scores and spectral characteristics of infant crying during newborn circumcision. Clinical Journal of Pain, 23(5), 417–424.
[16] Lucey, P., Cohn, J. F., Howlett, J., Lucey, S., and Sridharan, S. (2011). Recognizing emotion with head pose variation: Identifying pain segments in video. IEEE Transactions on Systems, Man, and Cybernetics – Part B, 41(3), 664–674.
[17] Lucey, P., Lucey, S., and Cohn, J. F. (2010). Registration invariant representations for expression detection. International Conference on Digital Image Computing: Techniques and Applications (DICTA), December 1–3, 2010.
[18] Lucey, P., Cohn, J. F., Prkachin, K. M., Solomon, P., and Matthews, I. (2012). Painful data: The UNBC-McMaster shoulder pain expression archive database. Image and Vision Computing, 30, 197–205.
[19] Lucey, S., Ashraf, A., and Cohn, J. (2007). Investigating spontaneous facial action recognition through AAM representations of the face. In K. Kurihara (Ed.), Face Recognition Book. Pro Literatur Verlag.
[20] Lints-Martindale, A. C., Hadjistavropoulos, T., Barber, B., and Gibson, S. J. (2007). A psychophysical investigation of the facial action coding system as an index of pain variability among older adults with and without Alzheimer's disease. Pain Medicine, 8(8), 678–689.
[21] Mahoor, M. H., Cadavid, S., Messinger, D. S., and Cohn, J. F. (2009). A framework for automated measurement of the intensity of non-posed facial action units. 2nd IEEE Workshop on CVPR for Human Communicative Behavior Analysis (CVPR4HB), Miami Beach, June 25, 2009.
[22] Bartlett, M. S., Littlewort, G., Frank, M., Lainscsek, C., Fasel, I., and Movellan, J. (2006). Fully automatic facial action recognition in spontaneous behavior. Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition (FGR'06).
[23] Monwar, M. M., and Rezaei, S. (2006). Pain recognition using artificial neural networks. Proc. of IEEE International Symposium on Signal Processing and Information Technology, 2006.
[24] Massot, C., and Herault, J. (2008). Model of frequency analysis in the visual cortex and the shape from texture problem. International Journal of Computer Vision, 76(2).
[25] Matthews, I., and Baker, S. (2004). Active appearance models revisited. International Journal of Computer Vision, 60(2), 135–164.
[26] Prkachin, K. (1992). The consistency of facial expressions of pain: A comparison across modalities. Pain, 51, 297–306.
[27] Prkachin, K. M., and Solomon, P. E. (2008). The structure, reliability and validity of pain expression: Evidence from patients with shoulder pain. Pain, 139, 267–274.
[28] Shrout, P. E., and Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420–428.