
Physiology & Behavior 143 (2015) 15–26

journal homepage: www.elsevier.com/locate/phb

The comparison question polygraph test: A contrast of methods and scoring
Charles R. Honts ⁎, Racheal Reavy 1
Boise State University, Psychology Department, 1910 University Drive, MS-1715, Boise, ID 83725-1715, USA

H I G H L I G H T S

• We examined two variants of the Comparison Question Test for deception detection.
• The directed-lie had equivalent validity with the traditional probable-lie CQT.
• The directed-lie variant did not evoke more countermeasure attempts.
• Vasomotor responses were found to be highly useful in discriminating deception.
• The directed-lie is more standardized and may offer application advantages.

a r t i c l e   i n f o

Article history:
Received 2 October 2014
Received in revised form 17 February 2015
Accepted 18 February 2015
Available online 20 February 2015

Keywords:
Deception detection
Comparison question test
Lie detection

a b s t r a c t

We conducted a mock crime experiment with 250 paid participants (126 females, Mdn age = 30 years) contrasting the validity of the probable-lie and the directed-lie variants of the comparison question test (CQT) for the detection of deception. Subjects were assigned at random to one of eight conditions in a Guilt (Guilty/Innocent) × Test Type (Probable-Lie/Directed-Lie) × Stimulation (Between Repetition Stimulation/No Stimulation) factorial design. The data were scored by an experienced polygraph examiner who was unaware of subject assignment to conditions and with a computer algorithm known as the Objective Scoring System Version 2 (OSS2). There were substantial main effects of guilt in both the OSS2 computer scores, F(1, 241) = 143.82, p < .001, ηp² = 0.371, and in the human scoring, F(1, 242) = 98.92, p < .001, ηp² = .29. There were no differences between the test types in the number of spontaneous countermeasure attempts made against them. Although under the controlled conditions of an experiment the probable-lie and the directed-lie variants of the CQT produced equivocal results in terms of detection accuracy, the directed-lie variant has much to recommend it as it is inherently more standardized in its administration and construction.
© 2015 Elsevier Inc. All rights reserved.
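
As an illustration of the eight-cell design named in the abstract, the following minimal sketch enumerates the Guilt × Test Type × Stimulation crossing and draws a cell for each participant. The factor labels come from the abstract; the uniform random draw, the seed, and the function name `assign_condition` are assumptions made for the example and are not a description of the randomization procedure actually used in the study.

```python
import itertools
import random

# Factor levels as named in the abstract.
GUILT = ("Guilty", "Innocent")
TEST_TYPE = ("Probable-Lie", "Directed-Lie")
STIMULATION = ("Between Repetition Stimulation", "No Stimulation")

# Fully crossing three two-level factors yields 2 x 2 x 2 = 8 cells.
CONDITIONS = list(itertools.product(GUILT, TEST_TYPE, STIMULATION))


def assign_condition(rng):
    """Draw one of the eight cells uniformly at random (assumed scheme)."""
    return rng.choice(CONDITIONS)


rng = random.Random(2015)  # hypothetical seed, only for reproducibility
# 250 is the number of participants who completed the protocol.
assignments = [assign_condition(rng) for _ in range(250)]

for guilt, test_type, stimulation in CONDITIONS:
    n = assignments.count((guilt, test_type, stimulation))
    print(f"{guilt:8s} | {test_type:12s} | {stimulation:30s} | n = {n}")
```

A simple uniform draw does not guarantee equal cell sizes; a blocked scheme could be substituted if balanced cells were required.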

⁎ Corresponding author.
E-mail addresses: chonts@boisestate.edu (C.R. Honts), rreavy@gmail.com (R. Reavy).
1 Pennsylvania State University, 320C Biobehavioral Health Building, University Park, PA 16802, USA.

http://dx.doi.org/10.1016/j.physbeh.2015.02.028
0031-9384/© 2015 Elsevier Inc. All rights reserved.

1. Introduction

Comparison question tests (CQTs) are by far the most commonly used type of psychophysiological deception detection (PDD) test in law enforcement, forensic practice, and national security screening settings [44]. Such tests play an important role in the United States Government's national security and law enforcement programs. World-wide, the interest in and use of PDD are growing rapidly, as evidenced by the growing number of professional polygraph associations around the world, with associations in Canada (Canadian Association of Police Polygraphists), Central and South America (Latin American Polygraph Association), European Union (British and European Polygraph Association; European Polygraph Association), Israel (Israel Polygraph Examiners Association), South Africa (South African Professional Polygraph Association), the United States (American Association of Police Polygraphists; American Polygraph Association; National Polygraph Association), and the United Kingdom (British Polygraph Association). In 2009 the American Polygraph Association reported members from 46 different countries [1]. It is believed that the CQT is the test of choice in the vast majority of those countries [17]. There is a substantial published literature on the criterion validity of the CQT that converges on estimates of decision accuracy around 90% (see reviews by [17,35,44]). However, some aspects of the CQT testing procedures are without substantial empirical support.

1.1. A brief background

The CQT contains two active stimuli in the form of test questions [43]. Relevant questions are direct accusatory questions that address the matter under investigation, e.g. “Did you shoot John Doe?” It is
expected that a person who is attempting deception about the topic of the relevant question will provide a substantial physiological reaction during and after its presentation. Comparison questions are stimuli that are designed to divert attention from the truthfully answered, but obviously important relevant questions. Through their design and creation it is assumed (or known) that all subjects' verbal responses to the comparison questions are deceptive. The most commonly given rationale of the CQT [43] is that the comparison questions will provide stimuli that will evoke a physiological reaction (hereinafter the term response refers to a verbal response and reaction refers to a physiological reaction) from persons who are not attempting deception to the relevant questions. The ability of comparison questions to evoke reactions is created by the examiner's control of an environmental context where the truthful person (with respect to the relevant questions) will be concerned that they may fail (or produce an inconclusive outcome) as a result of their reactions to the comparison questions. For those persons attempting deception to the relevant questions, the relatively innocuous comparison questions are likely unimportant.

Although reactions to the CQT stimuli are often categorized as emotional reactions, the data clearly show that such reactions occur in the absence of contextual circumstances that would be needed to create fear or other strong emotions. For example, laboratory studies with guilty subjects, where only small amounts of money are motivators, consistently produce high detection rates and low error rates [44]. Honts et al. [20,21] provided a specific example of this in a laboratory CQT study. In that study the motivation to produce a truthful outcome was one US dollar. With that level of motivation 91% of the guilty subjects in the control condition (representing standard field practices) were correctly classified as deceptive, 9% were inconclusive and there were no false negative errors. Innocent subjects under the $1 motivation in the standard field practices condition were correctly classified 91% of the time, there were no inconclusive outcomes and there were 9% false positive errors. It would be difficult to argue that, with only one US dollar at stake, the subjects in the Honts et al. study were producing emotional reactions.

1.2. The role of cognitive load

Recent theoretical conceptions of the CQT use the notions of Vrij and Gannis [54] about the underlying processes of deception in general and have stressed cognitive load as a primary explanatory construct for the production of reactions in the CQT [18]. Vrij and his colleagues (e.g., [53–55]) note that liars experience more cognitive load than do truth-tellers. Liars expend more finite cognitive resources than truth-tellers when being assessed for deception. Vrij and his colleagues attribute this increased demand to a number of factors. The basic process of formulating a plausible lie may be cognitively difficult. Liars, assuming that their credibility is suspect, will monitor and attempt to control their appearance and hide their emotions so that they appear truthful. To the extent that liars are actually experiencing emotion associated with the lie, cognitive load will increase. Liars are also likely to monitor the interviewer's reactions more carefully in order to assess their success in lying. Liars may focus on the task of acting and role-play as truthful. Moreover, liars must suppress the truth while they are lying, since speaking the truth often happens automatically. Finally, as compared to telling the truth, producing a lie is more intentional and deliberate, and thus requires mental effort.

Honts [18] argued that Vrij et al.'s theoretical conception provides a good model for understanding the CQT. Deceptive subjects experience high cognitive load while answering the relevant questions for all of the reasons noted above. At the same time deceptive subjects are much less likely to have high cognitive load during the comparison questions because, as compared to the relevant questions, the comparison questions are obviously less important from a guilty person's perspective. Innocent subjects are likely to experience higher cognitive load during the comparison questions because they are lying to the comparison questions in a setting where the examiner's manipulations have set the comparison questions up as important for the outcome of the test. However, the innocent subjects are responding truthfully to the apparently more important relevant questions. None of this is to say that emotions cannot play a role in, or even be sufficient for, deception detection, but simply that emotional responses are not necessary for the CQT to work.

The idea that pure cognitive load tasks can produce reactions that are indistinguishable from those produced by innocent subjects to comparison questions was demonstrated in the mental countermeasure condition in Honts et al. [23]. Honts et al.'s mental countermeasure subjects were instructed to count backward by 7 from a random number larger than 200 during each presentation of the comparison questions. Using this strategy to increase cognitive load, 55% of Honts et al.'s mental countermeasure subjects were able to produce sufficient reactions to comparison questions to avoid detection. The polygraph examiner was unable to detect the use of the mental countermeasures either from observation of the subject during the examination or from the physiological data.

1.3. Probable-lie vs. directed-lie

Currently there are two major categories of comparison questions in use within the polygraph profession. The older and more common form of the comparison question is known as the probable-lie [45]. Probable-lie comparison questions deal with either acts that are directly similar to the issue of the investigation or with general issues about honesty and lying. They are more general in nature than the relevant questions, are deliberately vague, and cover long periods of time in the life of the subject. Virtually every subject has difficulty in unequivocally answering them with a simple and truthful “No.” An example of a probable-lie question in an examination regarding a robbery is “Prior to 2013, did you ever take something that did not belong to you?” or “Prior to 2013, did you ever lie to someone who trusted you?” Probable-lie comparison questions are reviewed with the subject after the relevant questions are discussed and reviewed, and they are presented in a manner designed to encourage the subject to answer them with a denial (for additional details on probable-lie CQT presentation see [43]).

A newer form of the comparison question is known as the directed-lie [22]. The directed-lie comparison was developed within the U.S. Government to improve the accuracy of certain national security screening polygraph tests [4]. With directed-lie comparison questions the subject is instructed to answer certain questions with a lie. A typical directed-lie question is “Prior to 2012, did you ever tell even one lie in your entire life?” All subjects are told that they must show appropriate reactions when lying to the directed-lie questions, or the test will result in an inconclusive outcome (for additional details on directed-lie CQT presentation see [43]).

The rationale for using directed-lie comparison questions is similar to the rationale for probable-lie comparison questions. It is assumed that the subject's concern will be focused on the questions that pose the greatest risk of an undesirable test outcome. For guilty subjects, the focus will be on the relevant questions that are answered deceptively. It is reasoned that innocent subjects will focus on showing they are suitable subjects, and on clearly demonstrating that their reactions when lying are different from when they are truthful. This focus of concern is designed to enhance the cognitive load associated with the directed-lie questions for the truthful subjects, thus making them a stronger stimulus than the relevant questions. The directed-lie is easily standardized to a fixed script for presentation and a small set list of questions for use in application.

Compared to the probable-lie, there is relatively little published research on the directed-lie, but the research, from both laboratory and field studies, is consistent in showing no significant differences in accuracy from the probable-lie (see reviews by [2,43]). The American Polygraph Association [2] considers the form of the directed-lie test developed and validated at the University of Utah [22,26] to be a validated technique (a technique with two or more published empirical studies with an average accuracy of decisions of 90% or more).
Despite its wide application, the probable-lie version of the CQT has several inherent problems [53], and some suggest that the directed-lie is a remedy for most of these problems [4,22–24,26,43]. Probable-lie comparison questions can be difficult to administer in field settings and require psychological sensitivity, sophistication, and skill on the part of the examiner to obtain an accurate outcome (see the review by [43]). Unfortunately, many polygraph examiners lack adequate training in psychological methods and do not understand the basic concepts and requirements of using a standardized psychological test in a field setting. These problems are exacerbated when the examiner attempts to formulate individualized probable-lie comparison questions for each subject. Strong standardization of the wording and discussion of probable-lie comparison questions is thus difficult. Clearly, the validity of a probable-lie comparison question test depends on how the subject perceives and responds to the probable-lie questions when they are introduced and discussed during the pretest interview, and those perceptions are at least somewhat dependent upon the skill of the examiner.

The difficulties with probable-lie comparison questions may be compounded by problems related to the characteristics of examinees [43]. Examinees can be anxious about the subject matter of the probable-lie questions, making it difficult for the examiner to establish effective comparison questions. Probable-lie comparison questions may be personally intrusive and offensive to some subjects. For others, the probable-lie questions may encompass prior criminal behavior of a serious nature that poses problems for the subjects, some of whom may refuse to answer the questions. If a person is administered more than one test, or tested on multiple occasions, it may become difficult to formulate new probable-lie questions that continue to be effective for the subject. Finally, it is often difficult to explain the function of probable-lie questions and their role in interpreting the outcome of the test to those who use the results of polygraph tests (e.g., investigators, lawyers, judges, program managers, and jurors) and to laypersons. They often do not understand the rationale of the probable-lie and may interpret strong physiological reactions to probable-lie questions as an indication that the subject is dishonest and guilty. Given its standardized format, the simplicity and fixed nature of its presentation, and its unambiguous procedures, the directed-lie appears to offer many advantages over the probable-lie and may minimize the problems described above.

1.4. The role of stimulation

Another current difference in field practices concerns the stimulation of comparison questions between charts. In a typical CQT, the question series is repeated between three and five times. One area of marked divergence in field practice concerns what is said to subjects between those question repetitions. The U.S. Government's approach is to not discuss (stimulate) any of the test questions between question repetitions [9,10]. The University of Utah approach is to discuss the critical questions between charts. In the Utah system, after each presentation of the question sequence, the examiner asks the subject if there were any problems with the test questions and discusses any concerns that the subject expresses. The examiner then reviews the questions in order to ensure that the relevant questions are clear and straightforward and the comparison questions remain salient. If the subject makes an admission to a probable-lie question or provides additional information that changes the meaning of a relevant question, this is discussed and appropriate adjustments are made in the wording of the affected questions.

A meta-analysis by Honts [16] suggests that between chart stimulation offers a positive improvement in CQT accuracy. Moreover, a recent experiment by Offe and Offe [38] reported that between chart discussion had a positive effect on accuracy, but only when there was minimal explanation of comparison questions in the pretest. When there was a normal pretest discussion of the comparison questions, between chart stimulation produced non-significantly higher accuracy. However, the Offe and Offe study had relatively few subjects and thus had relatively low statistical power to find small effects.

1.5. Goals of the present study

The literature contrasting the probable-lie and the directed-lie is equivocal. A large evaluative review of laboratory and field data by the American Polygraph Association [2] looked at variants of the CQT conducted with probable-lie comparison questions and directed-lie comparison questions and found no reliable differences. However, even if the accuracy rates associated with the probable-lie and the directed-lie are equivalent, the directed-lie comparison question offers substantial advantages in standardization, face validity to lay audiences, and decreased intrusiveness. Additional data addressing possible validity differences between the directed-lie and the probable-lie approaches, and the impacts of between chart stimulation, acquired from a well-conducted experiment, would provide a substantial increase in our knowledge about the best practices to take in the field. Should the evidence support between chart stimulation and the directed-lie comparison question, those techniques could be widely added to field practice quickly and with minimal cost in retraining. Finally, it may be that the stimulation of questions between repetitions has differential effects on probable-lie and directed-lie comparison questions. Thus it makes the most sense to study these two variables in a factorial design where their possible interaction can be examined directly.

This study will also provide data to address criticisms concerning the strength of directed-lie comparison questions. For some, that concern focuses on the fact that because subjects are told to lie to the directed-lie questions they may not take them seriously. As a result of that lack of seriousness, directed-lie questions will not function as well as probable-lie questions in attracting and holding the focus of the actually innocent [57]. Although Matte's speculations were sharply refuted by [58], such concerns may persist in the polygraph profession and additional data will help clarify this issue.

One other issue that is sometimes raised about directed-lie comparison questions is that since their function in the CQT seems more obvious than probable-lie comparisons, the use of directed-lie questions might invite countermeasure attempts [27]. Although a substantial body of research concerning naive spontaneous countermeasures has failed to find significant effects [18], it would be important to know if the directed-lie did stimulate an increased frequency of spontaneous countermeasure attempts.

The present study represents a large N study contrasting the validity of probable-lie and directed-lie CQT examinations. In designing the study we attempted to address the following experimental questions:

1. Are there significant differences in the scores, accuracy or physiological reactions produced by probable-lie and directed-lie CQTs? Based upon the existing research, we expect to find no significant differences in the total scores or accuracy of decisions that are produced by the two approaches to comparison questions in the CQT. However, Kircher and Raskin [30,31] note conflicting data regarding physiological differences in the physiological component scores, most notably in respiration. They discuss two laboratory studies, Horowitz et al. [26] and Bell et al. [6] (later partially published as Bell et al. [7]), that failed to find that respiration component scores from directed-lie tests had significant correlations with the guilt criterion, while respiration scores from probable-lie tests did correlate significantly with the guilt criterion. Such differences in respiration scores were not observed in the field study by Honts and Raskin [22]. However, since our theoretical position is that both techniques work through the same underlying process, that is cognitive load, our prediction was that there would be no significant differences in the physiological component data produced by the two comparison question types. As in these previous studies, all of our analyses are based upon difference measures between the relevant and their respective comparison questions. We would also note that although we have chosen to describe our work within the cognitive load model for the CQT, other theoretical conceptions would, or could, make similar predictions (i.e. Relevant Issue Gravity [14],
Differential Salience and Psychological Set [49]). Differentiating between these various theoretical models was not the goal of this research.

2. Does between repetition stimulation of the test questions have a significant impact on the data produced by probable-lie and/or directed-lie CQTs? Between chart stimulation of the test questions was standardized at the University of Utah in an effort to keep the important questions in focus for the test subjects. In our current conceptual framework we expect between chart stimulation to increase cognitive load on the more salient category of questions for the test subjects. We thus predicted that between repetition stimulation should increase discrimination between guilty and innocent participants in this study.

3. Are there differences between computer generated scores with the OSS2 algorithm as compared to numerical scores from an experienced human examiner? Recent research [44] shows the Objective Scoring System Version 2 (OSS2; [34]) to be the most accurate of the currently available computer-based polygraph scoring algorithms. However, Raskin and Kircher [44] did not include a comparison of the various algorithms with human numerical scoring. We obtained a human numerical scoring of our data to test against the OSS2. We predicted that OSS2 should outperform a highly experienced human evaluator using numerical scoring.

4. Does the directed-lie CQT produce more spontaneous countermeasure attempts than the probable-lie CQT? The suggestion that directed-lie questions might invite more countermeasure attempts seemed plausible, but it did not fit with our (first author) clinical impressions. We did not make a prediction about possible outcomes to this question.

5. Does the addition of a measure of peripheral vasomotor reaction (PVR) increase the accuracy of polygraph scoring? The ability to measure PVR has been available for use in field polygraph instruments since the 1980s. However, this dependent measure is rarely used in field practice. Kircher and Raskin [29] reported that decreases in amplitude and increases in the duration of vasomotor reactions were significantly correlated with the guilt criterion but that the vasomotor reaction did not load into their discriminant analysis model. A vasomotor predictor was not included in the commercially available Computerized Polygraph System algorithm [30] that is sold with Stoelting polygraph instruments. Nevertheless, criteria for scoring vasomotor reaction were, and are, part of the University of Utah numerical scoring system [8]. Since the time of the Kircher and Raskin [29] analysis, our impression is that the quality of commercially available vasomotor measurement equipment has improved dramatically and that this improvement in instrumentation quality may have increased the value of the vasomotor component for detecting deception. We revisited this issue in these data without making a specific prediction about the outcome.

2. Materials and method

2.1. Participants

Participants were recruited via help-wanted ads placed in craigslist.com and a local alternative newspaper in a metropolitan area with a population of about 400,000. The advertisement stipulated an hourly wage of $15 for approximately 2 1/2 h of participation in a polygraph research study. When they called for an appointment, potential participants were screened for disqualifying factors. Individuals who were currently pregnant, taking prescription medication for high blood pressure, a heart condition, or to treat a psychological disorder, had previously taken a polygraph examination, or were under the age of 18 were deemed ineligible for participation in the study and were not invited to participate. Those who met the selection criteria were randomly assigned to one of eight experimental conditions. A number of participants began the protocol and were lost during the experiment for a variety of reasons unrelated to the experimental conditions, for example, failure to show up for the appointment or discovery after arrival that the person did not meet requirements for participation. There were 250 individuals (50.4% female) who participated in the study and completed the protocol. Participants ranged in age from 18 to 65 years (Mdn = 30, M = 34.4, SD = 13.4).

2.2. Examiners

An experienced polygraph examiner (32 years of field polygraph experience at project onset) used reference materials provided by the Department of Defense Polygraph Institute to train three individuals, none of whom was a practicing polygraph examiner, to conduct polygraph examinations. The experienced examiner was originally trained to conduct tests with probable-lie comparison questions, but began using directed-lie comparison questions in 1984, either in combination with probable-lie comparisons or exclusively one or the other type of comparison questions. The experienced examiner was also experienced in running polygraph examinations both with and without between chart stimulation. Two of the examiners were undergraduate research assistants; the third was the paid Research Assistant for the project and a recent college graduate with a B.A. in psychology. The latter individual had run polygraph examinations as part of a previous research project in our laboratory. The goal of the examiner training was to standardize and replicate field procedures as closely as possible. The training of the two undergraduates consisted of approximately 4 h of direct work with the instrument, a subject, and the experienced examiner. The undergraduate examiners also reviewed video recordings of two polygraph examinations: one conducted with the directed-lie and one with the probable-lie. The undergraduate examiners then ran two pilot participants, one directed-lie and one probable-lie, whose data were not included in the study. The videos and data from the pilot examinations were reviewed by the experienced examiner and the undergraduate examiners were provided with feedback. This was very similar to the training regimen the Research Assistant received before running examinations in the previous study, where she conducted 60 examinations.

The polygraph examiner and the assistants who greeted the participants were unaware, at all times, of the participants' assignment as guilty or innocent. The experienced polygraph examiner conducted 92 examinations in the project. The Research Assistant conducted 84 examinations. The female undergraduate examiner conducted 39 examinations while the male undergraduate examiner conducted 35 examinations.

2.3. Apparatus

Physiological data were collected with CPS II field polygraph instruments running the CPS II software [51]. The following physiological reactions were recorded: Thoracic and abdominal respiration were monitored with Pneumotrace strain gage sensors placed around the chest and abdomen; electrodermal activity was measured as skin conductance reaction from disposable Vermed GSR-13 electrodes placed on the palm in the area of the thenar and hypothenar eminences; relative blood pressure was monitored from a cuff placed on the participants' upper left arm and inflated to between 50 and 60 mm Hg of pressure; and peripheral vasomotor reaction was monitored with a photoelectric plethysmograph placed on the distal surface of the subject's right thumb. A Stoelting movement sensor was placed in the seat of the subject's chair. The movement sensor is marketed as a countermeasure and movement artifact detector and its use has become a standard in field examinations. Data from this sensor were not used in this study.

The data were initially evaluated with a computer implementation [30,32,51] of the OSS2 scoring system. The OSS2 scoring system was originally developed as an objective way to generate numerical scores based upon handmade measurements [33,34]. A guide for the calculation of OSS scores was published by Dutton [11]. In the Computerized Polygraph System used in this study, the software made measurements of the three reaction features retained in the discriminant analysis
C.R. Honts, R. Reavy / Physiology & Behavior 143 (2015) 15–26 19

reported in Kircher and Raskin [29]. Those features are the same as the computer-measured features used in Krapohl and McManus [33] for their Training sample. Those criteria are maximum amplitude of the electrodermal response, maximum amplitude of the rise in baseline of the relative blood pressure, and length of the respiration tracing for 10 s following question onset.

A clear description of how those measurements are turned into scores is provided by Raskin and Kircher [44]. Briefly, the objective feature measurements are converted to ratio values by dividing the feature measurement from a relevant question by the measurement for the same feature from the comparison question that preceded the relevant question in time. OSS scores are then assigned on a seven-point scale based upon the cutoff values determined in Krapohl and McManus [33] from a large training set of confirmed field cases from U.S. Government practice. The smallest ratios are assigned +3 (CQ strongest) and the largest ratios −3 (RQ strongest). Electrodermal activity scores are then doubled in an effort to mimic the discriminant weights from the Kircher and Raskin [29] analysis.

The data were subsequently scored with the Utah Scoring System [8]. The Utah Scoring System is a semi-objective (judgments and measurements) scoring system that also uses a 7-position scale that ranges from −3 to +3. Zero scores indicate equivalent responses to relevant and comparison questions. Negative scores indicate that the physiological reaction to the relevant question was larger. Positive scores indicate that the physiological reaction to the comparison question was larger. The magnitude of the score indexes the magnitude of the difference between the physiological reactions to relevant and comparison questions. Within each component, relevant question reactions are compared to the reaction to the preceding comparison question. With respiration, the examiner visually looks for suppression of amplitude, apnea (cessation of breathing), increase in respiration baseline, and slowing of respiration rate. With electrodermal activity, measurements are made of the amplitude of the response and scores are assigned with a ratio rule. Ratios of 2 to 1 are assigned a score of 1, ratios of 3 to 1 are assigned a score of 2, and ratios of 4 to 1, or greater, are assigned a score of 3. Duration and complexity of the electrodermal response are also examined and, at the discretion of the examiner, can impact the assignment of scores by no more than a reduction of the ratio requirement by .5. With relative blood pressure, measurements are made of the increases in the baseline. A ratio of 1.5 to 1 is required for a score of 1. Duration of the baseline increase is also considered by the evaluator and, in conjunction with a larger difference in baseline increase, may be used to increase the score to a 2 or 3. Scores for peripheral vasomotor activity are based upon the examiner's judgment of the size of the respective vasoconstriction durations and magnitudes. Duration is given more weight than magnitude. Full details of the Utah Scoring System are provided by Bell et al. [8]. A contemporary review of the validity studies of the Utah Scoring System is provided by Raskin and Kircher [44].

2.4. Design

The study was a Guilt (Guilty, Innocent) × Test Type (Probable-Lie, Directed-Lie) × Stimulation (Between Chart Stimulation, No Between Chart Stimulation) between subjects factorial design. Participants were randomly assigned to one of the eight design conditions with the constraint that each condition would be considered to be complete when 30 participants had been run in that condition.

2.5. Procedure

The design was implemented using a variation of the mock crime paradigm developed at the University of Utah [41]. All protocols and participant recruitment procedures were reviewed and approved by the University Institutional Review Board. Upon arriving at the laboratory, participants were directed to a room where they watched a video in private (the script of which was also presented in typewritten form). This script/video explained that their participation in the study might involve stealing some money and that they, regardless of their assigned condition, would be taking a polygraph examination during which they were to try to convince a polygraph examiner that they were truthful when they denied stealing the missing money. If they agreed to the described conditions of the study, participants signed an informed consent form. After their consent was obtained, participants were instructed to select an unmarked sealed envelope at random from a box of unmarked envelopes. That envelope contained instructions for watching another video that would describe their condition assignment and instructions for carrying out their task(s). Scheduling was controlled so that participants never had the opportunity to encounter any other participants.

Some participants (Innocent) were shown a video informing them that they were assigned to the innocent condition and thus they were not going to be stealing any money during the study. Those participants were told that they would be paid a $20 bonus if they successfully convinced the polygraph examiner that they were innocent of stealing $20 from the Education Building. Innocent participants were instructed to leave the laboratory building and go to the Education Building, where they were to deliver an envelope to the door of Dr. A's office and return to the laboratory 20 min later to take a polygraph examination.

Other participants (Guilty) watched a video informing them that they were assigned to the guilty condition and thus they were going to be stealing money during the study. Those participants were also informed that if they were successful in passing the polygraph examination, by producing a truthful outcome concerning the theft of $20 from the Education Building, they would be paid a $20 bonus. Guilty participants were instructed to leave the laboratory building and go to the Education Building. They were asked to find Dr. A's office and steal an envelope addressed to Sam Stone that was taped to the door. They were then asked to open the envelope and hide its contents (a $20 bill) on their person. They were asked to return to the laboratory 20 min later to take a polygraph examination.

Prior to each examination the examiner obtained a code from a randomized running order that indicated if this examination was to be a directed-lie or probable-lie examination and if there was to be between repetition stimulation of the test questions. When participants returned to the laboratory, an assistant introduced her or him to the polygraph examiner. The examiner reminded the participant that his or her polygraph examination would be videotaped and that the purpose of the examination was to identify the person who had stolen an envelope containing $20 from the door of Dr. A's office in the Education Building earlier that day. Examination sessions began with the examiner collecting some general information from the participant concerning things such as the participant's general health, how well they had slept the night before, and whether he/she had ever taken a polygraph exam. This was done using the built-in biographical form in the CPS II software. Participants were then told that they were a suspect in the theft of $20 from the Education Building and were asked if they had, in fact, stolen the envelope containing the money. After participants denied the accusation, the examiner asked them to explain where they had been and what they had been doing for approximately the last 2 h.

Next, the examiner briefly discussed the nature of the autonomic nervous system. Participants were told that although individuals are largely able to control their motor behavior, many functions of the body, such as temperature regulation, heart rate, and breathing are largely uncontrollable and vary automatically in response to physical and psychological stressors, such as lying. Next, the function of each sensor was described, and participants were told to expect that, due to the pressure applied from the blood pressure cuff, they might experience a tingling sensation in and/or some discoloration of the arm on which the blood pressure cuff was placed. At this point, participants were asked to sign another informed consent sheet giving permission for the polygraph examination to continue. Participants were told that the polygraph was approximately 90% accurate but that inconclusive and incorrect outcomes sometimes do occur.
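The electrodermal ratio rule of the Utah Scoring System described in the scoring section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure or the CPS software: the function name `utah_eda_score`, the `threshold_reduction` parameter (standing in for the examiner's discretionary reduction of the ratio requirement by up to .5), the treatment of ratios that fall between the stated cutoffs, and the handling of zero amplitudes are all hypothetical choices.

```python
def utah_eda_score(relevant_amp, comparison_amp, threshold_reduction=0.0):
    """Score one relevant/comparison pair of electrodermal amplitudes.

    Returns an integer from -3 to +3. Negative means the relevant-question
    response was larger; positive means the comparison-question response
    was larger; zero means no scorable difference.
    """
    if relevant_amp < 0 or comparison_amp < 0:
        raise ValueError("amplitudes must be non-negative")
    if not 0.0 <= threshold_reduction <= 0.5:
        raise ValueError("threshold_reduction must be between 0 and 0.5")

    larger = max(relevant_amp, comparison_amp)
    smaller = min(relevant_amp, comparison_amp)
    if larger == 0:
        return 0  # no response to either question

    # Assumption: a response paired with a flat tracing gets the maximum ratio.
    ratio = float("inf") if smaller == 0 else larger / smaller

    # Cutoffs from the text: 2:1 -> 1, 3:1 -> 2, 4:1 or greater -> 3.
    # Assumption: ratios between cutoffs take the lower score.
    magnitude = 0
    for cutoff, score in ((2.0, 1), (3.0, 2), (4.0, 3)):
        if ratio >= cutoff - threshold_reduction:
            magnitude = score

    if relevant_amp > comparison_amp:
        return -magnitude
    return magnitude


if __name__ == "__main__":
    # Hypothetical amplitudes in arbitrary units.
    print(utah_eda_score(2.0, 1.0))  # relevant response twice as large
    print(utah_eda_score(1.0, 4.0))  # comparison response four times as large
```

Passing a nonzero `threshold_reduction` models the case where the examiner judges duration and complexity to justify a lower ratio requirement; the default of zero applies the stated cutoffs unchanged.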
Participants were then told that a demonstration test was going to be conducted before the actual polygraph examination concerning the theft. The demonstration test (also known as a stimulation or acquaintance test) is a standard practice in most CQT examinations [44]. The use of a demonstration test has been shown to increase the accuracy of subsequent CQT examinations (see the review in [44]). In this study the demonstration test was introduced as a way for the participant to get used to taking a test while attached to the instrument and to allow the examiner to adjust the instrument to the participant's unique physiological baselines. Participants were told to pick a number between 2 and 6 inclusive and inform the examiner of the number that was chosen. It was explained that after the sensors were attached to the participant a series of questions would be posed, beginning with “Concerning the number that you chose, was it the number 1?” and continuing through to number 7. Participants were instructed to answer “no” to each of the seven questions, so that during the asking of the question regarding the number that was selected (and hence their deception was known) their physiological reactions to lying could be observed.

Participants were asked to wash their hands (so that the best possible recordings from the sensors could be obtained). At this point, the sensors were attached, and the demonstration test was conducted. As is standard practice in the field, all participants were told that they produced high quality recordings and that they were responding in a manner that was suitable for continuing the examination.

Next, each of the questions that would be asked during the polygraph examination concerning the theft of $20 was reviewed with the participants according to the testing approach, probable-lie or directed-lie, to which the participant was assigned. The reader can reference an example of how the comparison question presentations were made from transcripts supplied with this manuscript as Appendix 1 for a probable-lie presentation and as Appendix 2 for a directed-lie presentation. As the examiner read each question, the participant was instructed to answer with a “yes” or “no” just as they would during the actual examination. All participants were asked 3 relevant questions, 3 comparison questions, 2 neutral questions and 3 other questions (Table 1). Exactly the same questions were used for both the probable-lie and the directed-lie examinations. The only difference between the conditions was in the presentation of the comparison questions. After all the questions were reviewed and responses recorded, a comparison question test was conducted. The question list was repeated three times while physiological data were being recorded. The order of the questions for the first repetition was N1, SR, OI1, C1, R1, C2, R2, OI2, C3, R3, and N2. The order of the questions for the second repetition was N2, SR, OI1, C2, R1, C3, R2, OI2, C1, R3, and N1. The order of the questions for the third repetition was N1, SR, OI1, C3, R1, C1, R2, OI2, C2, R3, and N2. For half the participants the relevant and comparison questions were reviewed between question repetitions and for half they were not.

Table 1
Polygraph examination question list.

Question type   Questions
Neutral questions
  N1    Are we in the State of Idaho?
  N2    Are the lights on in this room?
Comparison questions
  C1    Prior to 2008, did you ever lie to someone who trusted you?
  C2    Prior to 2008, did you ever do anything that was dishonest or illegal?
  C3    Prior to 2008, did you ever lie to a person in a position of authority?
Relevant questions
  R1    Did you steal that missing envelope?
  R2    Did you steal that envelope from the door of Room 619 in the Education Building?
  R3    Do you know where the missing money is now?
Other questions
  SR    Regarding the envelope that was stolen from the Education Building, did you intend to truthfully answer each question about that?
  OI1   Do you believe that I will only ask you the questions we reviewed?
  OI2   Is there something else you are afraid I will ask you a question about?

After the examination was completed, participants were asked to fill out a debriefing questionnaire that asked them about their perceptions of the test and about their countermeasure use. After completing the questionnaire participants were debriefed by the examiner. During the debriefing they were told about the outcome of their examination (i.e., whether their reactions were scored as truthful or deceptive, based upon an initial computer analysis with OSS2 run by the examiner) and the various conditions that were being compared as part of the study. Participants were asked not to discuss their participation with others who might be participants in the study. Finally, participants were thanked and paid for their participation. The examiners in this study were not given any feedback about their performance until the end of the experiment. Full participation, from consent through debriefing, lasted approximately two and a half hours.

At a later time, the resulting physiological data were edited to remove artifacts by an experienced polygraph examiner who was not informed about subject assignment to conditions. Following editing, the data were reanalyzed with the Objective Scoring System Version 2 (OSS2; [34]) software that is part of the CPSII software. OSS2 was recently reported to be the most accurate of the commercially available algorithms for scoring the CQT [44]. The data were also numerically evaluated with the Utah Scoring System [8] by an experienced examiner (6 years) who was not otherwise involved in the study.

3. Results

3.1. Computer-generated OSS2 scores

3.1.1. Examiners

To see if there was an effect of examiner experience on OSS2 scores an Examiner (4 Examiners) × Guilt (Innocent, Guilty) × Test Type (Probable-Lie, Directed-Lie) × Stimulation (Between Repetition Stimulation, No Between Repetition Stimulation) ANOVA was conducted. That analysis failed to reveal any significant effects involving the examiner variable. Moreover, none of the effects involving examiners approached significance.

3.1.2. OSS2 total scores

OSS2 total scores were analyzed with a Guilt (Guilty/Innocent) × Test Type (Probable-Lie/Directed-Lie) × Stimulation (Between Repetition Stimulation/No Between Repetition Stimulation) ANOVA. As predicted by the rationale of the CQT the main effect of Guilt was significant with guilty participants producing negative scores (M = −23.6, SD = 25.5, 95% CI [−28.2, −19.1]) and innocent participants producing positive scores (M = 15.6, SD = 25.6, 95% CI [10.7, 20.0]), F(1, 241) = 143.82, p < .001, ηp² = 0.371. The OSS2 total score correlation with the Guilt criterion was significant, r = .61, p < .01. Review of the effect sizes and significance levels associated with the non-significant effects revealed that none of them approached significance or accounted for any appreciable amount of variance in the data. With the exception of the significant main effect of Guilt all of the ηp² values were less than .01.

3.1.3. OSS2 decisions

To provide some perspective on the effect of the independent variables on decisions, the OSS scores were turned into decisions using the simple ±8 rule described by Kircher and Raskin [32]. That is, examinations with OSS total scores of +8 or greater were classified as truthful. Examinations with OSS total scores of −8 or less were classified as deceptive and examinations with total scores between −8 and +8 were classified as inconclusive. The resulting decisions were coded as deceptive = −1, inconclusive = 0 and truthful = 1.
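The ±8 decision rule and the −1/0/1 coding just described can be sketched as follows. This is a minimal Python illustration, not the OSS2 software: the function names are hypothetical, and the example treats the detection efficiency coefficient reported later in this section as a Pearson correlation between the coded decisions and a guilt criterion coded guilty = −1, innocent = 1, which is an assumption about the computation. The counts are taken from the Human (Utah) rows of Table 2.

```python
from math import sqrt


def oss_decision(total_score, cutoff=8):
    """Apply the +/-8 rule: 1 = truthful, -1 = deceptive, 0 = inconclusive."""
    if total_score >= cutoff:
        return 1
    if total_score <= -cutoff:
        return -1
    return 0


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def expand(counts, guilt_code):
    """Turn (deceptive, inconclusive, truthful) counts into paired vectors."""
    decisions = []
    for code, count in zip((-1, 0, 1), counts):
        decisions.extend([code] * count)
    return [guilt_code] * len(decisions), decisions


if __name__ == "__main__":
    # Human (Utah) outcome counts from Table 2.
    guilt_g, dec_g = expand((94, 20, 11), guilt_code=-1)  # guilty
    guilt_i, dec_i = expand((27, 16, 82), guilt_code=1)   # innocent
    r = pearson_r(guilt_g + guilt_i, dec_g + dec_i)
    print(round(r, 2))  # about .60, in line with the value reported for human outcomes
```

Because the correlation is unaffected by linear rescaling, coding the guilt criterion as 0/1 instead of −1/1 would give the same coefficient.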
This coding scheme retains the full ordinal characteristics of the underlying interval scaling of the OSS values. Arguably the coding method used here is a simple transformation of the original interval scale that retains characteristics of an interval scale, albeit a truncated one. This data vector was submitted to a 2 (Guilt) × 2 (Test Type) × 2 (Stimulation) ANOVA. Similar to the ANOVA of the underlying OSS scores, this analysis revealed a large main effect of Guilt, F(1, 247) = 136.72, p < .001, ηp² = 0.363. The OSS2 outcome matrix is shown in Table 2. The detection efficiency coefficient [28] was r = .61, p < .001. None of the other effects approached or achieved significance.

Table 2
Outcomes based upon OSS2 and human numerical scores.

Scoring method   Guilt      Deceptive   Inconclusive   Truthful   Totals
OSS2             Guilty     94          17             14         125
OSS2             Innocent   24          17             82         125
Human (Utah)     Guilty     94          20             11         125
Human (Utah)     Innocent   27          16             82         125

3.2. Human scoring, Utah numerical scores

3.2.1. Examiners

To see if there was an effect of examiner experience on objective scores an Examiner (4) × Guilt (Guilty, Innocent) × Test Type (Probable-Lie, Directed-Lie) × Stimulation (Present, Absent) ANOVA was conducted. That analysis failed to reveal any significant effects involving the Examiner variable. Moreover, none of the effects involving Examiners approached significance.

3.2.2. Human-generated scores

The human generated total numerical scores were also analyzed with a Guilt (Guilty, Innocent) × Test Type (Probable-Lie, Directed-Lie) × Stimulation (Present, Absent) ANOVA. As predicted by the rationale of the CQT guilty participants produced negative scores (M = −9.5, SD = 11.5, 95% CI [−11.6, −7.2]) and innocent participants produced positive scores (M = 6.2, SD = 13.2, 95% CI [4.0, 8.4]), F(1, 242) = 98.92, p < .001, ηp² = .29. Total numerical score correlation with the Guilty/Innocent criterion was significant, r = .54, p < .001. Review of the effect sizes and significance levels associated with the non-significant effects revealed that none of them approached significance or accounted for any appreciable amount of variance in the data. With the exception of the Guilt effect all of the ηp² values were less than .01. Virtually all the systematic variance in these data was accounted for by the Guilt variable. Total numerical scores and total OSS2 scores correlated, r = 0.85, p < .001.

3.3. Human decisions

A decision vector was created from the human-generated total Utah numerical scores using the same coding as described above for OSS2 decisions. The outcome matrix of decisions generated from the Utah numerical scores is also displayed in Table 2. The detection efficiency coefficient for human outcomes was r = .60, p < .001. The human decision vector was submitted to a Guilt (Guilty, Innocent) × Test Type (Probable-Lie, Directed-Lie) × Stimulation (Present, Absent) ANOVA. Similar to the ANOVA of the underlying numerical scores, this analysis revealed a large main effect of Guilt, F(1, 242) = 136.38, p < .001, ηp² = 0.36. With the human generated numerical scores none of the other effects reached significance nor approached significance with all ηp² values being less than .01.

3.4. Human versus machine

We examined possible differences between our human generated numerical scores and our computer generated OSS2 scores by subjecting them to a Guilt (Guilty/Innocent) by Scoring Method (Human/OSS2) ANOVA. Scoring Method was coded as a within-subjects factor. As expected the main effect of Guilt was significant, F(1, 245) = 149.56, p < .001, ηp² = 0.379. The ANOVA also revealed a significant interaction between Guilt and Scoring Method, F(1, 245) = 118.02, p < .001, ηp² = 0.325. The means for that interaction are illustrated in Fig. 1. OSS2 generated more negative scores for the guilty and more positive scores for the innocent. However, the difference between the detection efficiency coefficients for outcomes based on those scores was not significantly different, z = .18, ns.

Fig. 1. Illustration of the interaction of Guilt and Scoring Method. [Plot of Mean Score (vertical axis, −26.25 to 26.25) by Guilt (Guilty, Innocent), with separate series for Human and OSS2 scoring.]

3.5. Physiological component scores

To test the hypothesis that probable-lie CQTs produce a different pattern of reactions across the components we analyzed the physiological component scores (respiration, electrodermal activity, relative blood pressure and peripheral vasomotor response). The Utah numerical scores were subjected to a Component (Respiration, Electrodermal Activity, Relative Blood Pressure, Peripheral Vasomotor Response) × Guilt (Guilty, Innocent) × Test Type (Probable-Lie, Directed-Lie) ANOVA. Component was analyzed as a within-subjects factor. The Sphericity assumption was violated in the Utah numerical scores, Mauchly's W (5) = .768, p < .001. Therefore all repeated measures tests were adjusted with the Greenhouse–Geisser approach. The Kircher and Raskin [30] hypothesis that component response patterns for probable-lie and directed-lie were different with innocent participants would be expressed and supported in this analysis by a significant interaction of Component, Guilt and Test Type. However, that three-way interaction did not approach significance, F(2.583, 632.87) = 0.367, ns, ηp² = 0.001.

There were two significant effects from the analysis of the Utah numerical score data. There was a main effect of Guilt/Innocence, F(1, 245) = 108.01, p < .001, ηp² = 0.306. The estimated marginal mean numerical component score for Guilty participants was −2.30, SE = .273, 95% CI [−2.84, −1.76], and for the Innocent participants, M = 1.72, SE = .275, 95% CI [1.18, 2.26]. There was also a significant interaction of Guilt and Component, F(2.58, 632.87) = 22.36, p = .001, ηp² =
.084, that reflects the well documented finding that the different components vary in discriminative power [8]. The estimated marginal means for the interaction of Guilt and Component are shown in Table 3.

Table 3
Mean (standard deviation) numerical scores illustrating the guilt/innocence × component interaction.

Guilt      Respiration     Electrodermal   Relative blood pressure   Peripheral vasomotor
Guilty     −0.91 (2.51)    −3.92 (5.68)    −1.90 (4.73)              −2.48 (3.55)
Innocent   0.42 (2.89)     3.00 (6.82)     2.26 (4.46)               1.21

The OSS2 scores were also subjected to a Component (Respiration, Electrodermal Activity, Relative Blood Pressure) × Guilt (Guilty, Innocent) × Test Type (Probable-Lie, Directed-Lie) ANOVA. Since the OSS2 approach does not evaluate the peripheral vasomotor response, that component was not included in this analysis. Component was analyzed as a within-subjects factor. The Sphericity assumption was violated in the OSS2 scores, Mauchly's W (2) = .669, p < .001. Therefore all repeated measures tests were adjusted with the Greenhouse–Geisser approach. The Kircher and Raskin [30,31] hypothesis that component response patterns for probable-lie and directed-lie were different with innocent participants would be expressed and supported in this analysis by a significant interaction of Component, Guilt and Test Type. However, that three-way interaction did not approach significance, F(1.50, 368.16) = 0.595, ns, ηp² = 0.002.

There were three significant effects from the analysis of the OSS2 scores. There was a main effect of Guilt/Innocence, F(1, 245) = 151.44, p < .001, ηp² = 0.38. The estimated marginal mean numerical score for Guilty participants was −7.77, SE = .774, 95% CI [−9.29, −6.24], and for the Innocent participants, M = 5.74, SE = .777, 95% CI [4.20, 7.27]. There was a significant main effect of Component, F(1.50, 368.16) = 5.34, p = .01, ηp² = .021. There was also a significant interaction of Guilt and Component, F(1.50, 368.16) = 47.03, p = .001, ηp² = .161, that reflects the well documented finding that the different components vary in discriminative power [8]. The estimated marginal means for the interaction of Guilt and Component and the associated main effect of Component are shown in Table 4.

Table 4
Mean (standard deviation) OSS2 scores illustrating the guilt × component interaction and the main effect of component.

Guilt      Respiration     Electrodermal     Relative blood pressure
Guilty     −6.29 (8.81)    −14.09 (20.02)    −2.92 (8.39)
Innocent   0.40 (8.89)     12.24 (20.03)     4.59 (8.68)
Total      −2.96 (6.45)    −0.98 (23.94)     0.82 (9.31)

3.6. Spontaneous countermeasures

Forty-eight percent of all participants reported spontaneously attempting a countermeasure while being debriefed. Most of the spontaneous countermeasure users were in the Guilty condition where 77% reported attempting a countermeasure, but 18% of the Innocent participants also reported attempting a countermeasure to help them pass their test. One criticism sometimes raised against the directed-lie approach is that because of its clear face validity it will invite more spontaneous countermeasure attempts from guilty subjects [27]. Table 5 provides a breakdown of countermeasure attempts by Guilt and Test Type. Table 5 clearly indicates that concerns that use of the directed-lie variant of the CQT would result in more countermeasure attempts by guilty participants are incorrect. With the probable-lie approach, 83.9% of the guilty attempted a countermeasure, whereas only 71.4% of the guilty tested with the directed-lie approach attempted a countermeasure. This difference approached but did not reach significance, z = 1.7, p = .09, but the direction of the difference is the opposite of what the critics of the directed-lie approach predicted. Innocent participants reported countermeasure use as follows: 16.4% with the probable-lie approach and 20.3% with the directed-lie approach, z = .56, ns.

Table 5
The frequency of spontaneous countermeasure attempts by guilt and test type.

                          Countermeasure attempted
Test type      Guilt      No            Yes
Probable-lie   Guilty     10 (16.1%)    52 (83.9%)
Probable-lie   Innocent   51 (83.6%)    10 (16.4%)
Probable-lie   Totals     61 (49.6%)    62 (50.4%)
Directed-lie   Guilty     18 (28.6%)    45 (71.4%)
Directed-lie   Innocent   51 (79.7%)    13 (20.3%)
Directed-lie   Totals     69 (54.3%)    58 (45.7%)

3.7. Vasomotor response

We examined the usefulness of the vasomotor response in the numerical scores generated by our human evaluator. A correlation matrix of the various component total scores, the total numerical score and the guilt criterion was calculated and is displayed here as Table 6. The component with the highest correlation with the Guilt criterion was electrodermal activity. The second highest component correlation with the criterion was with the vasomotor response. A stepwise discriminant analysis using the four numerical scores as predictors and Guilt as the criterion was then conducted using the SPSS Wilks' Λ method. Variable data entry was set at probability to enter of .05 and probability to remove at .10. The discriminant analysis resulted in a three variable solution retaining electrodermal, vasomotor, and relative blood pressure. The Canonical Correlation for that function was .559. The standardized discriminant function coefficients for the components were electrodermal = 0.541, vasomotor = 0.429, and relative blood pressure = 0.35. The function correctly classified 78.3% of the cases. On cross-validation with the N − 1 method 77.9% of the cases were correctly classified, suggesting that the model is stable.

Table 6
Correlation matrix of human generated numerical scores with the guilt criterion.

                  Criterion   Respiration   Electrodermal   Blood pressure   Vasomotor
Respiration       0.239
Electrodermal     0.485       0.432
Blood pressure    0.414       0.152         0.476
Vasomotor         0.449       0.321         0.510           0.448
Total numerical   0.553       0.545         0.881           0.734            0.748

Note. All correlations are significant p < .05, two-tailed.

4. Discussion

4.1. Experimental questions

In this study we examined the relative validity of the probable-lie and the directed-lie approaches to the CQT in one of the largest laboratory studies ever conducted on the CQT. We asked five experimental questions to be addressed by the study and analyses.

Are there significant differences in the data, accuracy or physiological patterns, produced by probable-lie and directed-lie CQTs? As expected, we found no significant differences between the total scores or the decisions produced by the probable-lie and directed-lie approaches to the CQT. This finding is consistent with the body of published research on the use of the directed-lie comparison question. Moreover, as we
predicted, we found no evidence for the Kircher and Raskin [30,31] suggestions that there was an interaction between Guilt and Test Type in the pattern of physiological responses. Given the high statistical power in our design and our use of directed-lie presentation techniques that are representative of field use, our results suggest that in terms of component reactions, numerical scores and decision accuracy there are no meaningful differences between the two techniques.

However, this leaves unanswered why different results were obtained in Horowitz et al. [26] and Bell et al. [7]. In the discussion of their results Bell et al. state:

    During PL tests, innocent subjects appear to inhibit their physiological responses to avoid detection when the probable-lie items are presented whereas innocent subjects seem to produce physiological responses to directed-lie items in response to being told that it is important to appear deceptive on these items. Knowingly or not, innocent subjects alter the one physiological measure over which they have the greatest control, their respiration, to accomplish the demands placed on them by the examiner. (p. 337).

Participants in our study were not told that they had to appear deceptive on these items; they were told that they should respond appropriately when lying. Admittedly this is a subtle difference but it may indicate that the emphasis on showing deception to the directed-lie questions was stronger in Bell et al. than in this study, resulting in a different response from innocent participants. This is an area that may need further study. However, since the techniques used in this study were reviewed and accepted by the U.S. Government's polygraph school as representative of the Federal approaches to these techniques it would be arguable that the results of this study should have a stronger impact on programs that use those approaches. Moreover, only two studies, Horowitz et al. [26] and Bell et al. [7], have shown a problem with scoring respiration with the directed-lie. Honts and Raskin [22] did not report a problem with respiration and the directed-lie (see the reanalysis by Honts & Handler, 2014). There are also five published studies of directed-lie polygraph screening tests and none of those studies have reported problems with the scoring of respiration [19,36,37,46,47]. It may be that the respiration findings from Horowitz et al. [26] and Bell et al. [7] are simply idiosyncratic to those studies.

Does between repetition stimulation of the test questions have a significant impact on the data produced by probable-lie and directed-lie CQTs? We did not find support for our hypothesis that between chart stimulation would increase accuracy. We also found no data to support the notion that between chart stimulation might decrease accuracy. Our results are consistent with the standard presentation condition of Offe and Offe [38] who also failed to find an effect of between chart stimulation.

Are there differences between computer generated scores with the OSS2 algorithm as compared to numerical scores from an experienced human examiner? Our prediction that the OSS2 algorithm should outperform the human evaluator was supported. The OSS2 produced more negative scores for guilty participants and more positive scores for innocent participants than did the human generated numerical

Does the directed-lie CQT produce more spontaneous countermeasure attempts than the probable-lie CQT? Our data contain no support for the Iacono and Lykken [27] prediction that the clear face validity of the directed-lie approach would invite increased spontaneous countermeasure attempts from the guilty. Our data showed a non-significant trend in the opposite direction with the directed-lie approach evoking fewer spontaneous countermeasures from the guilty.

Does the addition of a measure of the vasomotor response component increase the accuracy of polygraph scoring? Our data indicate that the inclusion of the peripheral vasomotor response channel significantly increased the accuracy of deception detection with numerical scores. Interestingly, our discriminant analysis gave its second highest weighting to the peripheral vasomotor response and did not include respiration in the discriminant function. This finding needs to be explored further, but it seems to have strong implications for field practice. A number of studies have included vasomotor data and found it to be a significant predictor in the CQT [5,7,23,26,29,41,42,48]. Given the large number of published positive outcomes supporting the use of the vasomotor response as a dependent measure in physiological deception detection it is curious that the measure receives so little use in actual practice. The apparent neglect of this highly useful dependent measure should be reconsidered by the polygraph profession.

4.2. General discussion

In his 2008 book on deception and deception detection Vrij offered a strong negative critique of the probable-lie CQT. One major area of that critique noted a lack of standardization in conducting the test and scoring the data. The results presented here directly address those criticisms. The directed-lie approach to the CQT provides a simple and standardized method for conducting a CQT and for preparing comparison stimuli. The directed-lie approach was found to be as valid as the probable-lie under the controlled conditions of our laboratory experiment, but in the field where the probable-lie test will necessarily be less standardized it seems that the directed-lie has the potential to be more accurate on the basis of decreased variability of the test administration alone.

We have also addressed the second of Vrij's standardization concerns. The computer-based OSS2 algorithm offers an open source, non-proprietary method for objective chart analysis that eliminates all issues of standardization in analysis. Raskin and Kircher [44] showed OSS2 to be more accurate than the other computer based algorithms currently available. We found OSS2 to be more accurate than a highly experienced human evaluator who used the Utah Scoring System. The Utah Scoring System is the most studied of the human-based systems in use today and has consistently been shown to be the most valid of the human-based systems [43].

Vrij [53] also criticizes the CQT for being subject to countermeasures. Vrij notes that other critics have suggested that the directed-lie approach would be even more vulnerable to countermeasures than the probable-lie approach. We failed to find any support in our data for
scores. These results are even more impressive when one considers the latter criticism. In this study the directed-lie approach showed a
that the human numerical scores included data from a vasomotor chan- trend toward fewer spontaneous countermeasure attempts from guilty
nel that was shown in this study to be a highly valuable valid addition to participants; the opposite of what was suggested by the critics. With re-
the deception detection process. OSS2 currently does not take advan- gard to the general criticism about countermeasures, although there are
tage of vasomotor data. These results combined with the laboratory no countermeasure training studies specifically focused on the directed-
and field results from Raskin and Kircher [44] that show the OSS2 as lie approach there is no reason to believe that it would be any more, or
the most accurate of the generally available computer algorithms, and less, vulnerable to informed and trained countermeasure users than are
that OSS2 generalizes to multi-facet and multi-issue examinations that the probable-lie CQT and the Guilty Knowledge Test [18]. However, the
provide evidence to suggest that OSS2 should be the preferred method concern about countermeasures is not a fatal flaw for the use of the CQT.
for scoring polygraph charts in the field. At the very least it provides a Honts [18] notes that vulnerability to countermeasures is a common
powerful quality control tool. Unfortunately, the advantage of the flaw for all applied psychological tests. Any test from which a subject
OSS2 scores was not expressed in the OSS2 outcomes. That finding sug- may gain a benefit or experience a loss is subject to a potential vulnera-
gests that current OSS2 decision rules may not be optimal and that those bility from deliberate manipulation or distortion, that is, countermea-
decision rules deserve additional scrutiny. sures. As with all applied psychological tests polygraph examiners
must be aware and vigilant about countermeasures, but in this they are no different from anyone applying a psychological test.

Vrij [53] criticizes the CQT for using deceptive practices. Vrij notes that in the probable-lie CQT subjects are deceived about the purpose of the probable-lie questions. He also claims that all CQT subjects are told that the CQT is infallible, and that subjects are also deceived about the purpose and nature of the demonstration test. We would disagree with Vrij about the accuracy of the latter two assertions for the polygraph profession, although those practices are used by some practitioners. In this study the deception about the nature of the probable-lie comparison questions was maintained, but in both conditions participants were told that the test could make errors and they were not deceived about the purpose of the demonstration test. Our results clearly indicate that the deceptions Vrij describes are not necessary for accuracy with the CQT, and they are not present at all with the directed-lie approach tested here.

Vrij [53] also notes that the CQT has a weak theoretical foundation. We agree that in 2008 this was the case. Research on the physiological detection of deception has tended to be applied and not theory driven. Although such research may sometimes lack a clear programmatic detection, the lack of a strong theoretical foundation does not diminish the validity of scientific findings nor the estimates they provide about the accuracy of techniques. If a strong theoretical foundation were required before applying the results of applied science, then humanity would have been denied the benefit of many important discoveries. For example, consider aspirin. The effective ingredient in aspirin was first discovered in 1763 [52] and was first synthesized in 1897 [50]. Aspirin quickly became one of the most widely used drugs in the world, with an estimated current consumption of 40,000 metric tons a year [56]. However, in 2002 Warner and Mitchell noted, “However, despite this long history and large volume of use, we still have an incomplete understanding of how the NSAIDs achieve their actions” (p. 13371). Warner and Mitchell also note that the first understanding of the mechanism of aspirin came only in the 1970s. If we were to follow Vrij's and others' [35] lack-of-theory critique of the polygraph to its logical extension, then we should abandon the use of all NSAIDs because we don't know how they work, despite knowing that they clearly do work. Finally, we would note that in 2014 two efforts, albeit disparate, at theory building for the CQT were published [18,39]. Those efforts have generated commentary and controversy, but they indicate that efforts toward a theory of the CQT are progressing.

The pre-test interviews in the examinations in this study and the presentation of the relevant and comparison questions described above and illustrated in Appendices 1 and 2 represent the approach and methods developed and validated at the University of Utah [43]. Other approaches may use somewhat different and in some cases much longer explanations in the presentation of the comparison questions. To the extent that field practices differ from those described here, the generalizability of these results may be limited.

The research reported here is from a laboratory mock crime experiment, and the generalizability of results from laboratory experiments is a legitimate concern. Kircher et al. [28] conducted a meta-analysis in which they identified three characteristics of laboratory studies of the polygraph that they describe as necessary to model the field setting. Those characteristics were: the use of explicit incentives associated with the test outcome, the use of non-student subjects, and the use of representative field techniques to conduct and score the polygraph examinations. This study was conducted to conform with those characteristics and used an explicit monetary incentive associated with passing the test, a community sample of subjects, and representative techniques to conduct and score the examinations.

However, some commentators have said that, regardless of the methods used, laboratory studies of deception detection are not useful for estimating field validity [12,13,27]. Iacono and Lykken [27] went so far as to say, “We have noted repeatedly that laboratory studies in which lie tests are administered to volunteer subjects who have (or have not) committed mock crimes, are not a valid basis for predicting lie detector accuracy in real life” (p. 181). Yet despite Iacono's and Lykken's dismissal of laboratory research, the vast majority (81%) of the members of the American Psychology-Law Society who responded to a telephone survey endorsed policy makers and courts of law giving some, moderate or considerable weight to the results of laboratory studies of CQT polygraph validity, and 49.1% endorsed giving moderate or considerable weight [25]. Pollina et al. [40] directly compared the results of polygraph data collected in randomized experiments to the results of field polygraph examinations confirmed by confession. Although they reported some differences in the tonic levels of arousal, they found no differences in the accuracy of classification.

Anderson et al. [3] looked at the correspondence between laboratory and field results across a broad range of applied psychological research. They report a high correspondence in effect sizes between laboratory and field studies, r = .728. Recently, Hartwig and Bond [15] looked specifically at laboratory and field research on deception detection (but not polygraph) and they report,

“Critics of the deception literature often raise the concern that common paradigms suffer from limited generalizability. Our meta-analysis does not provide support for this concern: The primary finding of our analysis is that lie detectability remains stable across contexts.” (p. 7 [online version])

Although Hartwig and Bond did not specifically address the generalizability of deception detection with the polygraph, the fact that lie detectability remains stable in other deception detection contexts is supportive of the use of high quality laboratory experiments for polygraph research. A meta-analysis of the current polygraph literature using the Hartwig and Bond approach would be very useful in bringing clarity to the issue of the generalizability of laboratory research on the polygraph.

Acknowledgements

This material is based upon work supported in part by the U.S. Army Research Laboratory and the U.S. Army Research Office under contract/grant number W9111NF-07-1-0670. The authors would like to recognize the following contributors who took part in conducting this research: Mark Handler, Kimberly Turnbloom, James Pitman, Flavia Pitman, and Scott McBride. We would also like to thank Maria Hartwig for her comments on a draft of this manuscript.

Appendix 1

Probable-lie example of presenting the relevant and comparison questions. This example includes the examiner responding to the subject giving an initial affirmative response to a comparison question. The transcript begins at the conclusion of the demonstration test.

Examiner: Before we get into the actual test we are going to go over all the questions because we are not going to have any surprises on the test, okay?
Subject: Okay.
Examiner: So what I am going to do is that I am going to read them to you as they will be on the test. I just want you to answer them as you will when you take the actual test, okay?
Subject: Okay.
Examiner: The first set of questions is in regard to the missing money, and the first one is: Regarding the envelope that was stolen from the Education Building, did you intend to truthfully answer each question about that?
Subject: Yes.
Examiner: Did you steal that missing envelope?
Subject: No.
Examiner: Did you steal that envelope from the door of Room 619 in the Education Building?
Subject: No.
Examiner: Do you know where the missing money is now?
Subject: No.
Examiner: Okay. Now the next set of questions are just about you as a person. These questions are designed to show that you are not the kind of person who would steal money and lie about it. I mean you are not that kind of person, right?
Subject: Right.
Examiner: Okay, good. Prior to 2008, did you ever lie to someone who trusted you?
Subject: [hesitates].
Examiner: Just things that would show that you are a thief, or would make me think that you are a thief.
Subject: Well, I think that everybody lies, that's kind of a silly question.
Examiner: Well, I am not looking for the little things that we all do. I am talking about, these are questions about big things that would show that you are the kind of person who would steal money and lie about it?
Subject: I haven't.
Examiner: Okay. Prior to 2008, did you ever do anything that was dishonest or illegal?
Subject: No.
Examiner: Prior to 2008, did you ever lie to a person in a position of authority?
Subject: No.
Examiner: Okay.

Appendix 2

Directed-lie example of presenting the relevant and comparison questions. The transcript begins at the conclusion of the demonstration test.

Examiner: Before we go into the actual test what we are going to do is to go over all the questions because we are not going to have any surprises on the test.
Subject: Okay.
Examiner: What I will do is read the questions to you as they will be on the test, and I just want you to answer them as you will when you are taking the test, okay?
Subject: Okay.
Examiner: The first set of questions are in regard to the missing money. The first one is: Regarding the envelope that was stolen from the Education Building do you intend to truthfully answer each question about that?
Subject: Yes.
Examiner: Did you steal that missing envelope?
Subject: No.
Examiner: Did you steal that envelope from the door of Room 619 in the Education Building?
Subject: No.
Examiner: Do you know where the missing money is now?
Subject: No.
Examiner: Okay. The next set of questions are kind of like our number test. They are questions I want you to lie to. They are the only questions on the test I want you to lie to. Just like the number 4, they are there to show that your body responds appropriately when you tell a lie. These are things we have all done before. One way to recognize that these are the questions I want you to lie to is that they all begin with the words, “Prior to 2008.” So when you hear the words, “Prior to 2008” that means I want you to lie, okay?
Subject: Okay.
Examiner: Okay. Prior to 2008, did you ever lie to someone who trusted you?
Subject: No.
Examiner: Prior to 2008, did you ever do anything that was dishonest or illegal?
Subject: No.
Examiner: Prior to 2008, did you ever lie to a person in a position of authority?
Subject: No.
Examiner: Okay, some people find it helpful to think of times that they did those things when they are answering the questions. You know, speed on the way to work or school, accidentally ran a stop sign, lied to your mom when you were a kid, whatever. Just something that comes to your mind, okay?

References

[1] American Polygraph Association, Membership Directory, American Polygraph Association, Chattanooga, 2009.
[2] American Polygraph Association, Meta-analytic survey of criterion accuracy of validated polygraph techniques, Polygraph 40 (2011) 196–305.
[3] C.A. Anderson, J.J. Lindsay, B.J. Bushman, Research in the psychological laboratory: truth or triviality? Curr. Dir. Psychol. Sci. 8 (1999) 3–9.
[4] G.H. Barland, A validation and reliability study of counterintelligence screening test, Unpublished manuscript, Security Support Battalion, 902nd Military Intelligence Group, Fort George G. Meade, Maryland, USA, 1981.
[5] G.H. Barland, D.C. Raskin, An evaluation of field techniques in detection of deception, Psychophysiology 12 (1975) 321–330.
[6] B.G. Bell, P.C. Bernhardt, J.C. Kircher, R.E. Packard, Effects of prior demonstrations on the accuracy of outcomes of probable-lie and directed-lie tests, Psychophysiology 37 (2000) S27 (Abstract).
[7] B.G. Bell, J.C. Kircher, P.C. Bernhardt, New measures improve the accuracy of the directed-lie test when detecting deception using a mock crime, Physiol. Behav. 94 (2008) 331–340.
[8] B.G. Bell, D.C. Raskin, C.R. Honts, J.C. Kircher, The Utah numerical scoring system, Polygraph 28 (1999) 1–9.
[9] Counterintelligence Field Activity, Federal psychophysiological detection of deception examiner handbook. Counterintelligence field activity technical manual. Reprinted in 2011 in Polygraph 40 (2006) 1–72.
[10] Department of Defense Polygraph Institute, Zone Comparison Test (ZCT), 1997.
[11] D.W. Dutton, Guide for performing the objective scoring system, Polygraph 29 (2000) 177–184.
[12] M.G. Frank, P. Ekman, The ability to detect deceit generalizes across different types of high-stake lies, J. Pers. Soc. Psychol. 72 (1997) 1429–1439. http://dx.doi.org/10.1037/0022-3514.72.6.1429.
[13] M.G. Frank, E. Svetieva, Lies worth catching involve both emotion and cognition, J. Appl. Res. Mem. Cogn. 1 (2012) 131–133. http://dx.doi.org/10.1016/j.jarmac.2012.04.006.
[14] A. Ginton, Relevant issue gravity (RIG) strength – a new concept in PDD that reframes the notion of psychological set and the role of attention in CQT polygraph examinations, Polygraph 38 (2009) 204–217.
[15] M. Hartwig, C.F. Bond, Lie detection from multiple cues: a meta-analysis, Appl. Cogn. Psychol. 28 (2014) 661–676. http://dx.doi.org/10.1002/acp.3052 (Published online 8 July 2014).
[16] C.R. Honts, The discussion of comparison questions between list repetitions (charts) is associated with increased test accuracy, Polygraph 28 (1999) 117–123.
[17] C.R. Honts, The psychophysiological detection of deception, in: P. Granhag, L. Strömwall (Eds.), Detection of Deception in Forensic Contexts, Cambridge University Press, London, 2004, pp. 103–123.
[18] C.R. Honts, Countermeasures and credibility assessment, in: D.C. Raskin, C.R. Honts, J.C. Kircher (Eds.), Credibility Assessment: Scientific Research and Applications, First edition, Academic Press, Oxford, UK, 2014, pp. 131–158. http://dx.doi.org/10.1016/B978-0-12-394433-7.00003-8.
[19] C.R. Honts, W. Alloway, Information does not affect the validity of a comparison question test, Leg. Criminol. Psychol. 12 (2007) 311–312 (Available online in 2006).
[20] C.R. Honts, S. Amato, A. Gordon, Validity of Outside-Issue Questions in the Control Question Test: Final Report on Grant No. N00014-98-1-0725. Submitted to the Office of Naval Research and the Department of Defense Polygraph Institute, Applied Cognition Research Institute, Boise State University, 2000. (DTIC# ADA376666).
[21] C.R. Honts, S. Amato, A. Gordon, Effects of outside issues on the control question test, J. Gen. Psychol. 151 (2004) 53–74.
[22] C.R. Honts, D.C. Raskin, A field study of the validity of the directed-lie control question, J. Police Sci. Adm. 16 (1988) 56–61.
[23] C.R. Honts, D.C. Raskin, J.C. Kircher, Mental and physical countermeasures reduce the accuracy of polygraph tests, J. Appl. Psychol. 79 (1994) 252–259.
[24] C.R. Honts, D.C. Raskin, J.C. Kircher, Scientific status: the case for polygraph tests, in: D.L. Faigman, M.J. Saks, J. Sanders, E. Cheng (Eds.), Modern Scientific Evidence: The Law and Science of Expert Testimony (Volume 5): 2008–2009 Edition, Eagan, Minnesota, Thompson West, 2008.
[25] C.R. Honts, S. Thurber, D. Cvencek, W. Alloway, General acceptance of the polygraph by the scientific community: two surveys of professional attitudes, Paper Presented at the American Psychology-Law Society Biennial Meeting, Austin, Texas, March 2002.
[26] S.W. Horowitz, J.C. Kircher, C.R. Honts, D.C. Raskin, The role of comparison questions in physiological detection of deception, Psychophysiology 34 (1997) 108–115.
[27] W.G. Iacono, D.T. Lykken, in: D.L. Faigman, D. Kaye, M.J. Saks, J. Sanders (Eds.), The scientific status of research on polygraph techniques: the case against polygraph tests: 1999 pocket part to Vol. 1, Modern Scientific Evidence: The Law and Science of Expert Testimony, 1999, pp. 174–184.
[28] J.C. Kircher, S.W. Horowitz, D.C. Raskin, Meta-analysis of mock crime studies of the control question polygraph technique, Law Hum. Behav. 12 (1988) 79–90.
[29] J.C. Kircher, D.C. Raskin, Human versus computerized evaluations of polygraph data in a laboratory setting, J. Appl. Psychol. 73 (1988) 291–302.
[30] J.C. Kircher, D.C. Raskin, Computer methods for the psychophysiological detection of deception, in: M. Kleiner (Ed.), Handbook of Polygraph Testing, Academic Press, London, 2002, pp. 287–326.
[31] J.C. Kircher, D.C. Raskin, The Computerized Polygraph System Version 3.21, Scientific Assessment Technologies, Inc., Salt Lake City, Utah, 2002.
[32] J.C. Kircher, D.C. Raskin, CPSpro Fusion Software Manual: The Computerized Polygraph System, Version 1.1, Stoelting Co., Wood Dale IL, 2011.
[33] D.J. Krapohl, B. McManus, An objective method for manually scoring polygraph data, Polygraph 28 (1999) 209–222.
[34] D.J. Krapohl, Short report: update for the objective scoring system, Polygraph 31 (2002) 298–302.
[35] National Research Council, The polygraph and lie detection, The Committee to Review the Scientific Evidence on the Polygraph. Division of Behavioral and Social Sciences and Education, The National Academies Press, Washington DC, 2003.
[36] R. Nelson, M. Handler, C. Morgan, Criterion validity of the directed-lie screening test and the empirical scoring system with inexperienced examiners and non-native examinees in a laboratory setting, Polygraph 41 (2012) 176–185.
[37] R. Nelson, M. Handler, B. Blalock, N. Hernandez, Replication and extension study of directed-lie screening tests: criterion validity with seven and three position models and the empirical scoring system, Polygraph 41 (2012) 186–198.
[38] H. Offe, S. Offe, The comparison question test: does it work and if so how? Law Hum. Behav. 31 (2007) 291–303.
[39] J.J. Palmatier, L. Rovner, Credibility assessment: preliminary process theory, the polygraph process, and construct validity, Int. J. Psychophysiol. 95 (2015) 3–13.
[40] D. Pollina, A. Dollins, S. Senter, D. Krapohl, A. Ryan, A comparison of polygraph data obtained from individuals involved in mock crimes and actual criminal investigations, J. Appl. Psychol. 89 (2004) 1099–1105.
[41] J.A. Podlesny, D.C. Raskin, Effectiveness of techniques and physiological measures in the detection of deception, Psychophysiology 15 (1978) 344–358.
[42] D.C. Raskin, R.D. Hare, Psychopathy and detection of deception in a prison population, Psychophysiology 15 (1978) 121–136.
[43] D.C. Raskin, C.R. Honts, The comparison question test, in: M. Kleiner (Ed.), Handbook of Polygraph Testing, Academic, London, 2002, pp. 1–49.
[44] D.C. Raskin, J.C. Kircher, Validity of polygraph techniques and decision models, in: D.C. Raskin, C.R. Honts, J.C. Kircher (Eds.), Credibility Assessment: Scientific Research and Applications, First edition, Academic Press, Oxford, UK, 2014, pp. 65–132. http://dx.doi.org/10.1016/B978-0-12-394433-7.00003-8.
[45] J.E. Reid, A revised questioning technique in lie detection tests, J. Crim. Law Criminol. Police Sci. 37 (1947) 542–547.
[46] Research Division Staff, A comparison of psychophysiological detection of deception accuracy rates obtained using the counterintelligence scope polygraph and the test for espionage and sabotage question formats. DTIC AD Number A319333. Department of Defense Polygraph Institute. Fort Jackson, SC. Reprinted in Polygraph 26 (2) (1995) 79–106.
[47] Research Division Staff, Psychophysiological detection of deception accuracy rates obtained using the test for espionage and sabotage. DTIC AD Number A330774. Department of Defense Polygraph Institute. Fort Jackson, SC. Reprinted in Polygraph 27 (3) (1995) 171–180.
[48] L.I. Rovner, The accuracy of physiological detection of deception for subjects with prior knowledge, Polygraph 15 (1986) 1–39.
[49] S. Senter, D. Weatherman, D. Krapohl, F. Horvath, Psychological set or differential salience: a proposal for reconciling theory and terminology in polygraph testing, Polygraph 39 (2010) 109–117.
[50] W. Sneader, The discovery of aspirin: a reappraisal, Br. Med. J. Clin. Res. Ed. 321 (2000) 1591–1594.
[51] Stoelting, CPS II 4.35 Software, Stoelting, Wood Dale, IL, 2008.
[52] E. Stone, An account of the success of the bark of the willow in the cure of agues, A Letter to the Right Honourable George Earl of Macclesfield, President of the Royal Society from the Rev. Mr. Edmund Stone, of Chipping-Norton in Oxfordshire, Philosophical Transactions (1683–1775), 53, 1763, pp. 195–200. http://dx.doi.org/10.1098/rstl.1763.0033.
[53] A. Vrij, Detecting Lies and Deceit: Pitfalls and Opportunities, Second edition, John Wiley and Sons, Chichester, UK, 2008.
[54] A. Vrij, G. Gannis, Theories in deception and lie detection, in: D.C. Raskin, C.R. Honts, J.C. Kircher (Eds.), Credibility Assessment: Scientific Research and Applications, First edition, Academic Press, Oxford, UK, 2014, pp. 303–374. http://dx.doi.org/10.1016/B978-0-12-394433-7.00007-5.
[55] A. Vrij, S. Leal, S. Mann, R. Fisher, Imposing cognitive load to elicit cues to deceit: inducing the reverse order technique naturally, Psychol. Crime Law 18 (2012) 579–594. http://dx.doi.org/10.1080/1068316X.2010.515987.
[56] T.D. Warner, J.A. Mitchell, Cyclooxygenase-3 (COX-3): filling in the gaps toward a COX continuum? Proc. Natl. Acad. Sci. U. S. A. 99 (2002) 13371–13373. http://dx.doi.org/10.1073/pnas.222543099.
[57] J.A. Matte, An analysis of the psychodynamics of the directed lie control questions in the control question technique, Polygraph 27 (1998) 56–67.
[58] C.R. Honts, A. Gordon, A critical analysis of Matte's analysis of the directed lie, Polygraph 27 (1998) 241–252.
