
U.S. Air Force T&E Days
5 - 7 February 2008, Los Angeles, California
AIAA 2008-1648

A Generalized Handling Qualities Flight Test
Technique Utilizing Boundary Avoidance Tracking

William R. Gray, III∗


United States Air Force Test Pilot School, Edwards AFB, CA, 93524, USA

The strategies used by operators to control machines have long been of keen interest to
the flight test community. Research into these strategies has, until recently, assumed that
pilots control their aircraft solely to maintain some condition. In 2004, boundary avoid-
ance tracking (BAT) was recognized as a strategy wherein pilots control their machines
to avoid a condition. While BAT has provided some insight into pilot-induced oscillations
(especially those involving hazardous boundaries) it has more recently been utilized at the
USAF Test Pilot School (TPS) to evaluate aircraft handling qualities using a subjective
build-up approach. An abbreviated history of BAT research at the USAF TPS is presented,
from initial computer modeling to in-flight testing on the variable stability NF-16D VISTA
aircraft. Earlier tests have demonstrated the existence of boundary avoidance tracking,
produced results that suggest modeling of pilot boundary avoidance response is possible,
and illustrated how tracking data gathered using boundaries may aid in characterizing air-
craft handling qualities. Furthermore, the VISTA test showed a good correlation between
the Cooper-Harper task results and a BAT flight test technique (FTT), strongly suggesting
that an FTT utilizing boundaries may be sufficient to objectively characterize an aircraft’s
pilot-in-the-loop capabilities for some tasks. This success and others have led to the incor-
poration of BAT flight test techniques into the USAF TPS handling qualities syllabus to
educate students about the effects of pilot aggressiveness and to provide an additional tool
for their future operator-in-the-loop tests.

Nomenclature
$x_b$       Displacement from a boundary
$t_b$       Instantaneous time to a boundary considering only displacement and rate
$K_{bm}$    Maximum value of boundary feedback
$t_{min}$   Minimum time-to-boundary for null feedback
$t_{max}$   Maximum time-to-boundary for maximum boundary feedback ($K_{bm}$)
$\tau_b$    Pure time delay before boundary feedback is applied

I. Introduction: What Are Boundaries?


Traditional analysis of pilot-in-the-loop handling qualities assumes that the pilot is attempting to main-
tain a condition, say a pitch attitude or bank angle. The pilot will use a varying amount of effort or
aggressiveness to maintain this condition. The aggressiveness of the pilot is routinely referred to as the
pilot’s “gain,” a term borrowed from linear feedback system analysis. The simplest method of modeling a
pilot in a tracking task is to represent the pilot as a feedback loop with a simple gain and time delay; this
method is often quite successful. There are many more models available but they all share the assumption
that the pilot is solely attempting to maintain one condition or another.1
The problem of controlling a moving object in relation to a boundary is applicable to many automatic
control systems and much work has been done to optimize this type of control for a machine. Brian Baker
and Scott Paynter of the Lockheed Martin Corporation received a US patent in August 2004 for just such
an algorithm.2 Their algorithm is typical in that it is designed to optimize machine control, not duplicate
or explain a pilot’s control actions.

∗ Chief Test Pilot, 200 S. Wolfe Ave., Member

This material is declared a work of the U.S. Government and is not subject to copyright protection in the
United States.

American Institute of Aeronautics and Astronautics

In 2004 the author published a paper challenging the idea that pilots only attempt to maintain conditions
and proposing that pilots also engage in a type of tracking he called “Boundary-Avoidance Tracking.”3 In
this type of tracking, the pilot controls the aircraft in relation to an undesired aircraft state. Perhaps the
best illustration for this is the bicycle/beam analogy.
The analogy of riding a bicycle across a beam has been a popular way to teach the concept of “gain” by
graphically illustrating how the difficulty of the task changes the rider’s inputs. Imagine the task of riding a
bicycle down a straight line painted in the middle of a parking lot; this illustrates the low-gain rider. If you
take that path and elevate it to a deadly height (i.e. a bridge without rails) then make the path narrower
and narrower, the rider will clearly increase his effort to stay on the painted line; an increase in gain. If the
path is narrow enough, the rider’s gain becomes so high that he can no longer remain stable and oscillates
off the narrow beam.
Traditional handling qualities analysis assumes that the risk of falling off the narrow beam increases the
rider’s gain, perhaps to the point of triggering a pilot-induced oscillation (PIO). BAT theory states that
the rider actually controls in relation to the edges. As the rider approaches an edge he is said to engage in
boundary avoidance tracking, where the boundary to be avoided is the edge of the beam. A singular instance
of BAT at one edge can drive the rider into the other edge where the process repeats until the bicycle and
rider tumble into the abyss.

I.A. Boundary Avoidance Tracking Theory


I.A.1. The Spectrum of Pilot Response
“Boundary-Escape Tracking, A New Conception of Hazardous PIO”3 concentrated on a specific type of BAT
where the boundary is a life-or-death threat. But boundaries can be anything from a minor annoyance (the
pilot would like to do better but it doesn’t matter whether or not he does) to a moderate concern (the
pilot needs to stay within limits to complete the task but safety is not in question) to the ultimate concern
(passing the boundary will kill the pilot). These levels of BAT significantly complicate the problem but must
be understood as part of the theory—not all boundaries are like the edge of the beam in the bicycle/beam
analogy. Figure 1 illustrates the spectrum of boundary avoidance tracking responses.

Figure 1. The boundary avoidance tracking spectrum.

I.A.2. Modeling Pilot Response


Many flying qualities engineers are accustomed to using pilot models in their analysis of aircraft designs.
BAT hypothesizes that pilot response to a boundary depends linearly on the time to the boundary (calculated
as the instantaneous distance to the boundary divided by the rate toward the boundary), but only within
a range of times between the time-to-boundary when the pilot starts to respond ($t_{min}$) and the
time-to-boundary when the pilot applies the largest response ($t_{max}$). The boundary feedback varies
linearly between 0 at $t_{min}$ and $K_{bm}$ at $t_{max}$ (see Table 1). The model is covered thoroughly in
“Boundary-Escape Tracking, A New Conception of Hazardous PIO.”3 While several studies have shown that this
model is a reasonable approximation of pilot response, some difficulties remain. Specifically, pilot
response when outside the boundary is difficult to predict but seems to range from returning attention to
the opposing boundary or point-tracking task as soon as the boundary is exceeded to maintaining full
control input until within the exceeded boundary.
Table 1. Pilot feedback model for boundary avoidance tracking.

Situation                                  Boundary Awareness                        Boundary Feedback
-----------------------------------------  ----------------------------------------  -----------------
Inside the boundary, moving away from it   No threat                                 null (0)
Inside the boundary, moving toward it:
  $t_b \ge t_{min}$                        No threat                                 null (0)
  $t_{max} < t_b < t_{min}$                Pilot response increases linearly with    Equation (1)
                                           decreasing $t_b$, where $t_b$ is given
                                           by Equation (2)
  $t_b \le t_{max}$                        Pilot response at maximum                 $K_{bm}$
Displacement outside boundary              Pilot response at maximum                 $K_{bm}$
  ($t_b \le t_{max}$)                      or                                        or
                                           Pilot abandons BAT for another tactic*    null (0)

$$K_{bm}\,\frac{t_{min} - t_b}{t_{min} - t_{max}} \qquad (1)$$

$$t_b = x_b \left(\frac{dx_b}{dt}\right)^{-1} \qquad (2)$$

* Pilot response outside of the boundary is known to be much more complex; the response seems to depend
primarily on the threat of the boundary.
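
The rows of Table 1 can be read as a piecewise function. The following Python fragment is a minimal sketch of that reading, not the author's implementation. The function names, the `closing_rate` sign convention (positive when moving toward the boundary) and the `hold_outside` switch are assumptions introduced here for illustration; the table itself leaves the response outside the boundary open.

```python
def time_to_boundary(x_b, closing_rate):
    """Equation (2): displacement from the boundary divided by the rate toward it.

    closing_rate > 0 means the state is moving toward the boundary (assumed convention).
    """
    if closing_rate <= 0.0:
        return float("inf")  # moving away from the boundary: it is never reached
    return x_b / closing_rate


def boundary_feedback(x_b, closing_rate, t_min, t_max, k_bm, hold_outside=True):
    """Boundary feedback magnitude following Table 1 (requires t_max < t_min)."""
    if not t_max < t_min:
        raise ValueError("the model requires t_max < t_min")
    if x_b < 0.0:
        # Outside the boundary: Table 1 lists two possible behaviors.
        # hold_outside is a hypothetical switch that selects between them.
        return k_bm if hold_outside else 0.0
    t_b = time_to_boundary(x_b, closing_rate)
    if t_b >= t_min:
        return 0.0                                   # no threat
    if t_b <= t_max:
        return k_bm                                  # maximum response
    return k_bm * (t_min - t_b) / (t_min - t_max)    # Equation (1)
```

For example, with `t_min=2.0`, `t_max=0.5` and `k_bm=1.0`, a state one unit from the boundary and closing at one unit per second has a time to boundary of 1.0 s, which lies inside the linear region and gives a feedback of about 0.67.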

I.B. The Path to the Current Understanding of BAT Theory and Application
The original computer and desktop simulator research into boundary avoidance tracking was conducted by
the author under the auspices of the USAF Test Pilot School. BAT has been the subject of several student
flight test projects at the USAF TPS and their associated master’s degree thesis work with the US Air
Force Institute of Technology (AFIT). The author has also been informed of some applications of BAT flight
test techniques in US Air Force and Navy developmental test programs, but these will not be addressed in
this paper. New test techniques were developed and fine-tuned over the course of the USAF TPS and AFIT
projects, and data from the projects encouraged the examination of new methods for quantifying pilot inputs
during a tracking task.
The development of hypotheses, conduct of research, and fine-tuning of test techniques were necessarily
interlaced and, aside from the research projects, were not neatly parsed in time. They will be presented in
the general order in which they inspired and supported each other. The current understanding of boundary
avoidance tracking and its use in handling qualities evaluation is the result of these several years’ work and
has now been successfully woven into the USAF TPS curriculum, providing both a theoretical basis and a
framework for clear and repeatable handling qualities demonstrations.



II. Computer Modeling of Boundary Avoidance Tracking
The BAT hypothesis started as the idea that pilots will occasionally control their aircraft in relation to
boundaries. Computer models were designed to examine different boundary tracking strategies so that the
results could be subjectively compared to actual pilot responses.3 The pilot model eventually settled upon
duplicated many known characteristics of PIOs. To illustrate this, Figure 2 shows the progression of BAT
feedback on the same initial system state through tighter and tighter boundaries. From singular “pulse”
inputs to explosively divergent PIO, the model results were consistent with the personal experience of the
author, including his observations of other pilots engaged in boundary avoidance.

Figure 2. The effect of tightening boundaries on a simple boundary-feedback system: (a) open-loop system;
(b) upper boundary creates a single input; (c) both boundaries create a single input; (d) stable
oscillation between boundaries; (e) unstable oscillation causes boundary excursions.

The conclusions from the modeling study were:

1. Unstable boundary escape oscillations tended to grow ‘explosively’ until reaching the boundary
tracker gain limits, $K_{bm}$.
2. Feedback inputs for a boundary escape oscillation that has diverged to the gain limits are
characterized by stop-to-stop inputs.
3. Boundary escape tracking produces extremely nonlinear (‘clifflike’) results. Very tiny vari-
ations in gain, time delay, or boundary awareness parameters in the boundary-tracking
feedback loop marked the transition from a moderately damped boundary escape response
to rapidly divergent oscillations.
4. Increased feedback delay was an especially powerful driver of boundary escape oscillations.
5. Unstable point-tracking oscillations can rapidly transition to catastrophic boundary avoid-
ance oscillations once boundary awareness is achieved. The transition may be marked by
an explosive increase in feedback (inceptor) inputs.
6. Boundary escape PIO can occur where point-tracking PIO is not present. If the boundaries
are sufficiently tight and/or the increase in gain brought by boundary escape tracking is
sufficiently greater than the normal gain for the point-tracking task, a boundary escape
PIO can quickly arise from a disturbance large enough to assault one of the boundaries.3
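
A boundary-feedback loop of the kind studied above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in the study: the plant (a lightly damped second-order system integrated with a simple Euler step), the parameter values, and the names `feedback` and `simulate` are all hypothetical. The feedback law is the piecewise-linear form of Table 1 with maximum response held outside the boundary, and the pure time delay is modeled as a queue of pending commands.

```python
from collections import deque


def feedback(dist, closing_rate, t_min, t_max, k_bm):
    """Table 1 feedback for one boundary; maximum response is held outside it."""
    if dist <= 0.0:
        return k_bm
    if closing_rate <= 0.0:
        return 0.0
    t_b = dist / closing_rate
    if t_b >= t_min:
        return 0.0
    if t_b <= t_max:
        return k_bm
    return k_bm * (t_min - t_b) / (t_min - t_max)


def simulate(half_width, x0=0.0, v0=1.0, t_min=1.0, t_max=0.2, k_bm=4.0,
             delay=0.2, omega=1.0, zeta=0.1, dt=0.01, duration=20.0):
    """Euler integration of x'' = -2*zeta*omega*x' - omega**2*x + u.

    Boundaries sit at +half_width and -half_width. Returns (positions, inputs).
    """
    n_delay = int(round(delay / dt))
    pending = deque([0.0] * n_delay)  # commands waiting out the pure time delay
    x, v = x0, v0
    xs, us = [], []
    for _ in range(int(round(duration / dt))):
        push_down = feedback(half_width - x, v, t_min, t_max, k_bm)   # upper boundary
        push_up = feedback(half_width + x, -v, t_min, t_max, k_bm)    # lower boundary
        pending.append(push_up - push_down)
        u = pending.popleft()         # the command issued n_delay steps ago
        a = -2.0 * zeta * omega * v - omega ** 2 * x + u
        x, v = x + v * dt, v + a * dt
        xs.append(x)
        us.append(u)
    return xs, us
```

Calling `simulate` with a very large `half_width` leaves the boundary feedback at zero throughout, which corresponds to the open-loop case; reducing `half_width` step by step, or increasing `delay`, is one way to explore the sensitivity to boundary tightness and feedback delay described in the conclusions.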

Perhaps the most important result from this work was the response of the test pilot community following
the author’s presentation of the results at the 2004 Symposium of the Society of Experimental Test Pilots.
Numerous test pilots made a point of telling the author their PIO stories and how boundary avoidance
tracking successfully described their focus during these events. Many other pilots provided descriptions
of different types of boundaries and how these boundaries affect their flying. One particularly intriguing
example offered was the task of landing a helicopter on a narrow trailer. The pilot relaying this task told
excitedly of how BAT neatly explained the common PIOs and handling qualities issues experienced by pilots
as they learned to accomplish this difficult task.
In addition to the time-based simulation and analysis conducted, considerable effort was put into exam-
ining frequency domain results for the models. The hypothesized form of pilot feedback produced by the
BAT model contains numerous non-linearities; little success was realized and the problem awaits those more
adept at frequency domain analysis.

III. The BAT Workload Buildup Flight Test Technique


III.A. Thoughts on “Pilot Gain”
The term “pilot gain” has become shorthand for “how hard a pilot is working to control his aircraft” but the
term started out as a mathematical concept: the gain of a pilot modeled using linear feedback theory. Pilot
gain is, literally, the term $K_p$ in the pilot-surrogate transfer function $Y_p$ for a pilot engaged in
compensatory tracking (acting “in response to errors or controlled output quantities only”4). But linear
compensatory tracking is, at best, only an analogy of a very small part of the complete human controller.
Although this analogy has proven very useful for improving aircraft design, the prevalence of the analogy
seems to have led to the false perception that the analogy is the behavior, and the expectation that pilots
can control their gain like they can control their respiration rate. The root causes of a pilot’s control
inputs are far more complex than “gain.” Test pilots intuitively understand this, which may explain in part
why (in the author’s experience) so many of them find the analysis of their control inputs using linear
feedback system methods so unenlightening. In short, “pilot gain” is a useful shorthand but there is
certainly no $K_p$ subroutine in the human brain.
Pilots can drive their pilot/aircraft system into an instability (typically called a PIO) by overcontrolling
or, in the shorthand, using excessive gain. Actually testing for this is another matter entirely! If careful
and skilled test pilots fail to find a dangerous PIO susceptibility in the course of a flight test program, a
less careful and/or less skilled operational pilot probably will. Ferreting out PIO susceptibility has been a
contentious issue. In most aircraft development programs, the flight control system is designed to be free of
known PIO causes. During flight test, test pilots are always alert for PIO but there is often little actual PIO
susceptibility testing.
There have been several flight test methodologies created for PIO, most notoriously the “handling qualities
during tracking” (HQDT) technique that requires test pilots to “track the precision aim point as aggressively
and as assiduously as possible, always striving to correct even the smallest of tracking errors as quickly as
possible.”5 “The pilot is asked to begin tracking with small-amplitude, low-frequency inputs, then increase
the frequency of the input at small amplitude, and finally, increase the input amplitude at high frequency.”6
If the test pilot can create the control inputs necessary to excite a PIO, should one exist, the PIO will show
itself. Understanding and teaching this technique proved almost impossible for even the most talented USAF
TPS students and instructors; it required them to reproduce high gain tracking with nothing more than the
admonition to make it so. Perhaps the most important shortcoming of this technique was its
success in creating PIO when properly conducted!7 Most aircraft in most tasks will PIO if flown aggressively
enough, so HQDT “found” PIO in almost all cases but left the test team with no information on whether or
not the PIO was operationally relevant; it could not answer the question, “Will this PIO affect operational
pilots and if so, will it matter?”

III.B. The Need for the Workload Buildup FTT


USAF TPS graduates, both pilots and non-pilots, must know more than how to execute a test point; they
must understand how to create one. A flight test technique (FTT) is an engineering methodology employed
by a test team to determine a specific characteristic of an aircraft sufficiently for comparison against a
pre-existing requirement. Test pilot schools are charged with building a base of knowledge and practical
experience in their students so that they may move into the crucible of the flight test environment prepared
to apply, modify, or create FTTs that balance programmatic goals and flight safety. Therefore, the FTTs
used at a test pilot school must do more than just “determine a specific characteristic”; they must be selected
for their utility in furthering the students’ understanding of the underlying systems, both man and machine.
The BAT workload buildup FTT is being developed with these goals in mind.



III.C. The Method
The BAT workload buildup FTT (hereafter “workload buildup”) was created and fine-tuned during the several
BAT research projects. It saw its first use in the 2005 desktop simulator study and has been steadily
improved with additional use. It started as a method to gather BAT data but experience showed that it could
be a valuable tool for determining aircraft handling characteristics for a specific task. The workload
buildup is a process where the tolerance for error (expressed with boundaries) is specified in a
step-by-step buildup fashion and test pilots fly, or role-play, as if those boundaries are safety critical.
For the sake of student understanding, applicable open-loop flying qualities test points are flown prior to
the workload buildup. Finally, in order to investigate the hazard that might be encountered during an
unrecognized PIO, a specialized maneuver called the “switch-induced simulated PIO” (SISPIO) is used. (This
maneuver is similar to the SAAB “Clonk” method8 and further discussion will be left for a future paper.)
The most difficult aspect of the workload buildup method for the test pilot is role-playing as if the
boundaries are safety critical. Experience teaches pilots that giving a task undeserved importance can cause
problems; overcoming this heavily trained (and perhaps instinctive) response is difficult but can be
learned. After all, “pushing the envelope” is what flight testing is about, and the intent and resulting
actions of an overly aggressive or fearful aviator are just another limit to explore.

Figure 3. Hypothetical tracking performance as the task becomes more demanding (i.e. as the boundaries are
tightened).

III.D. The Expected Results


Perhaps the most important characteristic of the workload buildup technique is its ability to produce
repeatable data. The task, including the tolerance for error, can be consistently defined from test-to-test
or pilot-to-pilot, reducing the importance of pilot comments in establishing a pilot’s ability to accomplish
a task. (However, comments provided by a trained test pilot remain absolutely critical for identifying the
cause of poor results!) For several years now, the USAF TPS has been using a figure created by the author to show the
generalized relationship between task demands and pilot performance. The first evolution of this figure is
depicted as Figure 3. It shows how increasing the difficulty of the task will improve performance until a
point of diminishing return is reached. It also depicts how tasks with increasing difficulty (as defined by
a workload buildup FTT) progress. In handling qualities, the point of diminishing returns appears to be
defined by the point at which PIO reduces the accuracy of the tracking through a combination of the motions
of the PIO itself and the need for the pilot to reduce inceptor workload to stop the oscillations. The BAT
workload buildup FTT is uniquely suited to create this data because it provides tracking results based on
very clear performance requirements. If the hypothetical shape of this relationship between desired and
achieved tracking performance is true, the workload buildup FTT can create the data to show it. For the
USAF TPS, it can also be used to show the students first-hand how increasing the difficulty of the task (and,
consequently, increases in pilot “gain”) can create PIO.



IV. Desktop Simulator Study
In 2005 eight subjects, ranging in flight experience from high-time fighter pilots
to non-flying engineers, were given a series of tracking tasks on a desktop simu-
lator.9 The objective of the study was to gather pilot feedback data in a task
that involved boundary avoidance tracking then compare the actual pilot tracking
data with the predictions of BAT theory. Each subject was given the task of con-
trolling a line moving between two boundaries in a regular but hard-to-memorize
pattern. As the center line moved up and down, depicting displacement from an
ideal center position, the subject would attempt to keep the line between the two
fixed boundaries on either side. Figure 4 depicts the subject’s view. The tracking
task consisted of an unbroken series of 60 second periods, where every 60 seconds
the boundaries would move 25% closer. Aside from the boundary displacement,
the tracking task was identical for each 60 second period. The simulation was set
to turn off if the controlled line touched either boundary. Simple competitiveness
seemed enough to make the subjects work hard to stay between the lines.
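
The boundary schedule of this task can be written as a short function. The sketch below assumes that "25% closer" means the boundary displacement shrinks to 75% of its current value at each 60 second step; the text would also admit a reading in which each step removes 25% of the initial displacement. The function name and its parameters are hypothetical.

```python
def boundary_half_width(t, initial_half_width, period=60.0, shrink=0.25):
    """Boundary displacement from the center at time t (seconds).

    Assumes each completed period scales the current displacement by (1 - shrink).
    """
    if t < 0.0:
        raise ValueError("t must be non-negative")
    steps = int(t // period)  # number of completed 60 second periods
    return initial_half_width * (1.0 - shrink) ** steps
```

Under this assumption a boundary that starts one unit from the center sits at 0.75 units during the second period and at 0.5625 units during the third.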
Each run was analyzed to estimate the boundary tracking parameters $t_{min}$,
$t_{max}$, $K_{bm}$, and $\tau_b$ by identifying a boundary avoidance PIO and adjusting the
boundary tracking parameters to produce the best fit to the pilot stick displace-
ment. Then the boundary tracking parameters were analyzed to find a relationship
between the subjects’ success and their boundary tracking strategies.
This study produced several important results. First, each subject clearly responded to the boundaries by
maneuvering to avoid them. They also knew when they transitioned between staying near the center (the point
tracking task) and boundary avoidance tracking and had no problem describing this cognitive transition.
Second, in most cases the task was terminated when a boundary was exceeded due to a PIO. Finally, in many
cases the correlation between the pilot feedback predicted by BAT theory and the actual feedback was quite
good. Figure 5 shows a typical result for a damped PIO and a few other BAT events near the end of a test run.

Figure 4. Simulator tracking task.

Figure 5. Boundary avoidance feedback modeling applied to a test case.



With BAT parameter data for all runs available, the correlation between individual pilot/run BAT
parameters and their run success (measured in run duration) was examined. Several observations were
made. First, if the subject overreacted to the boundary by going to the maximum boundary feedback early
(high $t_{max}$) then the subject had little success. Second, the level of success seemed to correlate best with
the difference of $t_{min}$ and $t_{max}$, divided by the maximum boundary gain. Thus the earlier the subject began
to respond to the boundary, the later the subject reached maximum feedback, and the smaller the subject
could keep their maximum response to the boundary, the more success the subject had.
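
The quantity described above can be computed directly from the fitted parameters. The following is a small sketch; the name `response_spread_index` is hypothetical, and the example values are invented for illustration and are not data from the study.

```python
def response_spread_index(t_min, t_max, k_bm):
    """(t_min - t_max) / K_bm: larger when the response starts early,
    saturates late, and has a small maximum feedback."""
    if k_bm <= 0.0:
        raise ValueError("k_bm must be positive")
    return (t_min - t_max) / k_bm


# Two invented subjects: one graded and gentle, one abrupt and strong.
gentle = response_spread_index(t_min=2.0, t_max=0.5, k_bm=1.0)
abrupt = response_spread_index(t_min=1.0, t_max=0.8, k_bm=2.0)
```

Here the first subject has the larger index, which is the pattern the study associated with longer run durations.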
This study confirmed the existence of BAT and encouraged further research into boundary avoidance
tracking.

V. A Limited Investigation of Boundary Avoidance Tracking (HAVE BAT)10
“HAVE BAT” was a student test management project conducted at the USAF Test Pilot School in the
Spring of 2006 to examine boundary avoidance tracking in flight. A T-38C modified with a data acquisition
system, including several video cameras, was used as the test aircraft. A target platform was created by
marking a T-38A with a set of visually clear boundaries for pitch tracking in close formation flying. Figure 6
shows the target aircraft. Pilots were tasked with maintaining a vertical position on the target aircraft by
keeping the wingtip between markings on the fuselage of the target. Four boundaries were available: the two
marked with unbroken and hashed lines, the upper and lower parts of the intake structure, and the center
of the USAF insignia (for tracking with null boundary displacement).

Figure 6. Markings for HAVE BAT target aircraft.

The objective of the HAVE BAT test was to duplicate the earlier simulator study in flight by giving pilots
a formation z-axis displacement tracking task. Numerous practical difficulties made that goal unachievable in
the time available. The test team intended to use video of the lead aircraft to determine the tracking pilot’s
view of the boundaries but the video quality could not support automated processing, leaving insufficient
time to analyze most of the data runs. For those data runs that were analyzed (by manually extracting
data from the video frame-by-frame), comparison of the pilot’s stick inputs to BAT theory was not possible
because the ideal position of the stick at each moment in time was unknown. Without knowing the point
from which to apply the boundary feedback, the BAT parameters could not be determined. The test was not
without significant successes, though; most notably that the student test pilots could condition themselves
to treat the artificial boundaries as if they were real and that the result on their task was consistent with
BAT theory. In all cases, as the boundaries were made more restrictive, each pilot would reach a point where
they would PIO. The pilots also noted that there was a clear cognitive transition between point tracking and
boundary avoidance tracking—finding that transition in the inceptor data is a different problem altogether.
Although somewhat disappointing in terms of the original objectives of the test, the HAVE BAT program
provided the first direct in-flight observation of boundary avoidance tracking and validated the need to
continue studying the phenomenon. The many lessons learned may be found in Randy Warren’s Master’s
Degree Thesis, An Investigation of the Effects of Boundary Avoidance on Pilot Tracking.11

VI. Dividing Pilot Inceptor Workload into Two Independent Variables


VI.A. Why “Pilot Inceptor Workload” Instead of “Gain”?
Measuring the amount of effort that a pilot is putting into controlling an aircraft is not a simple task. It is
often estimated through rating scales, where the pilot describes effort in terms of workload and compensation,
but numerical measurement has historically been something of a black art. It seems to take a very experienced
flying qualities engineer to tease a little useful information out of the frequency-domain analysis typically
used to examine a pilot’s inceptor inputs. More importantly to the USAF TPS, frequency-domain techniques
are almost impossible to teach in the time allotted and do little to help pilots understand how they fly
aircraft.
To avoid any suggestion that the following methods have anything to do with frequency analysis
or pilot gain as the mathematical entity $K_p$, the term “inceptor workload” is used to describe how workload is
measured as a combination of the independent variables “duty cycle” and “aggressiveness.” Consult Figure 7
for a visual depiction of how duty cycle and aggressiveness combine to measure inceptor workload.

VI.B. Duty Cycle


When a pilot is involved in a tracking task,
even a difficult one, he is unlikely to be con-
stantly moving the controls. He will occasion-
ally stop changing the magnitude of his input
to allow the aircraft to respond by itself; some-
times to let the aircraft finish the job, some-
times to allow his input time to take effect,
and sometimes because an input simply isn’t
necessary. “Duty cycle” is nothing more than
the percentage of time the pilot is changing
his input on the stick, whether through force
or position. (In data analysis, the inceptor is
assumed “steady” when its rate of motion is
below a threshold determined by observation
of the data.) In terms of workload, it is clear
that as duty cycle is increased, pilot incep-
tor workload is increased as well. Duty cycle
fully describes the time the pilot spends with
the inceptor held nearly motionless, so there
is nothing more to be gained out of examining
the inceptor down-time. All that remains is
to start the process of characterizing how the
pilot moves the inceptor.

Figure 7. Pilot inceptor workload in terms of duty cycle and aggressiveness.
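
Duty cycle as defined in Section VI.B can be estimated from a sampled inceptor trace. The sketch below assumes uniformly sampled data and a finite-difference rate estimate; the function name and the `rate_threshold` value are hypothetical, since the text says only that the threshold is chosen by inspecting the data. It does not treat time held against a physical stop as "moving", which Section VI.D does.

```python
def duty_cycle(samples, dt, rate_threshold):
    """Fraction of sample intervals in which the inceptor rate exceeds rate_threshold.

    samples: inceptor position or force at a uniform sample interval dt (seconds).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    rates = [(b - a) / dt for a, b in zip(samples, samples[1:])]
    moving = sum(1 for r in rates if abs(r) > rate_threshold)
    return moving / len(rates)
```

For the trace `[0, 0, 1, 2, 2, 2]` sampled once per second with a threshold of 0.5 units per second, the inceptor is moving in two of the five intervals, so the duty cycle is 0.4.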
VI.C. Aggressiveness
When the pilot is moving the inceptor, the movement can be characterized any number of ways. The
characterization the USAF TPS needed was a simple first approximation that captures the effort the pilot is
putting into the motion. There is no attempt in this method to explain why the motions are occurring when
they are or why they have the characteristics they have. (Such explanations might be found in linear pilot
model approximations or boundary avoidance tracking theory.) Using the simple assumption that the faster the
pilot is moving the inceptor, the harder he is working, aggressiveness is calculated as the
root-mean-squared, per-second average of the rate of change of the inceptor measurand (position or force).
No effort has been made to normalize this measure of aggressiveness so, for now, it can only be used to
compare pilot inceptor workload for identical tasks. It is readily apparent that increased aggressiveness
corresponds to increased workload. Finally, “aggressiveness” is easily conceived; pilots can easily imagine
how they might become more or less aggressive while moving the controls during tracking.
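
The aggressiveness measure can be sketched in the same way as duty cycle. The fragment below assumes uniformly sampled data and takes the RMS of the finite-difference inceptor rate over only those intervals in which the inceptor is moving. Restricting the average to the moving intervals is an assumption made here so that the measure stays roughly independent of duty cycle; the function name and `rate_threshold` parameter are hypothetical.

```python
import math


def aggressiveness(samples, dt, rate_threshold=0.0):
    """RMS inceptor rate (measurand units per second) over the moving intervals."""
    rates = [(b - a) / dt for a, b in zip(samples, samples[1:])]
    moving = [r for r in rates if abs(r) > rate_threshold]
    if not moving:
        return 0.0  # the inceptor never moved faster than the threshold
    return math.sqrt(sum(r * r for r in moving) / len(moving))
```

Because the result carries the units of the measurand per second and is not normalized, values from this sketch are comparable only between runs of the same task, as noted above.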

VI.D. The Two-Dimensional Picture of Pilot Inceptor Workload


With pilot inceptor workload divided into two roughly independent factors, it is convenient to plot them
on an x-y chart, with duty cycle on the x axis and aggressiveness on the y axis. This chart is depicted as
Figure 7 and includes several important generalizations based upon the author’s experience using the BAT
workload buildup FTT as an instructional tool and data from research conducted after the first illustration
was created. What can this figure tell us? First, as the measured workload for a given task moves away
from the origin, it can be said that the pilot’s workload is increasing. Hypothetically, this should correspond
to increasing pilot gain. Second, the proportion of duty cycle to aggressiveness tells us something about the
aircraft. For instance, when duty cycle is low and aggressiveness is high (the upper left-hand part of the
figure) the pilot’s inputs are consistent with lead compensation, where an input is made and the system is
allowed to respond in order to minimize the effects of a response delay. Third, the upper right-hand corner,
where duty cycle is 1.0 and aggressiveness is maximized, corresponds to the worst possible PIO: stop-to-stop
at maximum effort. (Holding the inceptor at the physical stop is counted as “moving” the inceptor for the
purpose of duty cycle and the inceptor rate at impact is used to calculate aggressiveness; after all, the pilot
only stopped at that point because of the physical limit!) Thus the two-dimensional picture may be used to
show the change in pilot inceptor workload (or “gain” in the shorthand sense) and to compare the
workload between different pilots and different attempts at the same task. Comparing inceptor workload
across tasks will require a method to normalize aggressiveness–an effort as yet unattempted.
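
As a rough illustration of how one point on such a chart might be computed from a recorded inceptor time history, the Python sketch below uses simple working definitions that are assumptions made here, not definitions taken from this paper: duty cycle is the fraction of sample intervals in which the inceptor is moving (or is held against a stop), and aggressiveness is the RMS inceptor rate over those intervals. The function name, the rate threshold, and the stop limit are all hypothetical.

```python
import math


def inceptor_workload(position, dt, rate_threshold=0.01, stop_limit=1.0):
    """Return (duty_cycle, aggressiveness) for a sampled inceptor position trace.

    Assumed definitions (hypothetical, for illustration only):
      duty cycle     = fraction of sample intervals counted as "moving"
      aggressiveness = RMS inceptor rate over the "moving" intervals
    """
    if len(position) < 2:
        raise ValueError("need at least two samples")
    moving_rates = []
    last_rate = 0.0
    for prev, cur in zip(position, position[1:]):
        rate = (cur - prev) / dt
        held_at_stop = abs(prev) >= stop_limit and abs(cur) >= stop_limit
        if abs(rate) > rate_threshold:
            last_rate = rate
            moving_rates.append(rate)
        elif held_at_stop:
            # Holding against the physical stop counts as moving; the rate
            # at impact (the last nonzero rate) stands in for the rate.
            moving_rates.append(last_rate)
    duty_cycle = len(moving_rates) / (len(position) - 1)
    if not moving_rates:
        return 0.0, 0.0
    mean_square = sum(r * r for r in moving_rates) / len(moving_rates)
    return duty_cycle, math.sqrt(mean_square)
```

Plotting the returned pair for each attempt at the same task, with duty cycle on the x axis and aggressiveness on the y axis, would give the kind of picture described above; a stop-to-stop input lands toward the upper right-hand corner.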

VII. Limited Investigation and Characterization of Boundary Avoidance Tracking Deterministic Analytical Rating Task (BAT DART)12

In early 2006 it was becoming clear that boundary avoidance tracking might become a powerful tool in the
handling qualities flight test technique arsenal. All studies prior to the student test management project
called “BAT DART” used a sequence of collapsing boundaries to allow examination of pilot response to
different boundaries and to ensure that the spectrum of BAT was seen during each run. It became apparent
that as the boundaries collapsed and the tracking task became more difficult, pilot workload would increase
until the pilot could no longer accomplish the task. As pilot workload was increasing, performance was
increasing as well, but only to a point. If performance is measured as average error (usually computed as
root-mean-squared, or RMS, error), the average error relative to the point at the center of the two boundaries
would steadily decrease until PIO became a problem, then the error would increase as the boundaries were
drawn closer. Figure 8 illustrates this progression. (This figure is an evolutionary improvement to Figure 3.
Note that the x axis has been reversed from the original to depict inceptor workload as increasing as the
desired tracking performance moves to the right.) There was plenty of circumstantial evidence for this
progression but little in-flight data to support it. The BAT DART test management project was designed, in
part, to examine this phenomenon. (“BAT DART” was managed by Jason Dotter, an AFIT/USAF TPS
Master’s Degree student, to meet his thesis requirement.13)

Figure 8. Hypothetical tracking performance as the task becomes more demanding (i.e. as the boundaries are
tightened).

BAT DART had another objective, to compare the performance of pilots in a BAT FTT with their
Cooper-Harper rating (CHR)14 of the same task. Cooper-Harper evaluations have been a mainstay of
handling qualities evaluations for decades. The subjective correlation between CHRs and BAT FTT results
would be an important indicator of the usefulness of BAT test techniques and give some indication of their
best use.
The test consisted of two phases: simulation at the Large Amplitude Multi-Mode Aerospace Research
Simulator (LAMARS) located at the USAF Research Laboratory at Wright-Patterson Air Force Base, Ohio,
and a flight test program on the Variable-stability In-flight Simulator Test Aircraft (VISTA) NF-16D at the
USAF TPS. Both assets are shown in Figure 9. In concept the LAMARS test was to be nothing more than the
VISTA test executed in the simulator but, as with all simulators, significant differences made it difficult to
compare data. The simulator sessions provided an outstanding opportunity for the test subjects to hone
their skills with Cooper-Harper evaluations and get exposure to boundary avoidance tracking.

(a) LAMARS simulator. (b) VISTA NF-16D.


Figure 9. Simulator assets for the BAT DART research project.

(a) CH task display. (b) Workload buildup task display.

Figure 10. HUD Displays for the BAT DART tracking tasks (taken from VISTA HUD video).

In the LAMARS and VISTA, each subject conducted 8 total test runs. Four different models were
flown using both a Cooper-Harper task and a boundary avoidance task. These models included two Level
I (acceptable) models, a Level II (tolerable) model, and a Level III (unacceptable but controllable) model.
With the exception of the pitch response, all tasks were identical flight path angle tracking tasks. A sum-of-sine
pitch input to an F-16 model was used to create a flight path that the subject tracked during the
task. Figure 10 illustrates the CH task and BAT task displays as seen by the subject in the head-up display
(HUD). For the workload buildup task, the task started with wide boundaries displayed in the HUD and the
pitch tracking task would repeat every minute, with the boundaries closing in by 25% every step. (Figure 10b
depicts the display at one of the intermediate boundary settings.) The task continued until either boundary
was exceeded, at which point the simulation would end (either automatically in the LAMARS or by the
safety pilot in the VISTA). The tracking task was very demanding, requiring the subject to quickly achieve
and hold approximately 2.5g and -1.0g twice every minute. These g levels were only apparent in VISTA so
subjects were much more aggressive in the LAMARS. Many of the runs were terminated when a PIO caused
the terminal boundary excursion. The subjects, including four pilots and three non-pilots, would often
describe their error as resulting from a PIO, in spite of the excursion occurring before one full oscillation!
(Figure 11 depicts the sixth minute of the Level II and Level Ia model runs in the VISTA NF-16D for the
same pilot.)
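
The collapsing-boundary logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "closing in by 25% every step" is read here as each step's boundary being 75% of the previous one (a geometric schedule), the tracking error is taken as a list of samples relative to the center of the boundaries, and the function names are hypothetical.

```python
def boundary_schedule(initial_half_width, steps, shrink=0.25):
    """Boundary half-width for each step, shrinking by `shrink` per step.

    Assumes a geometric schedule: each step is (1 - shrink) times the last.
    """
    return [initial_half_width * (1.0 - shrink) ** k for k in range(steps)]


def run_buildup(errors, dt, initial_half_width, step_duration=60.0, shrink=0.25):
    """Walk a tracking-error trace through the workload buildup.

    Returns (completed_steps, excursion_time). excursion_time is None when
    the trace never leaves the boundaries.
    """
    for i, err in enumerate(errors):
        t = i * dt
        step = int(t // step_duration)
        half_width = initial_half_width * (1.0 - shrink) ** step
        if abs(err) > half_width:
            # Boundary exceeded: the run ends here, partway through `step`.
            return step, t
    return int((len(errors) * dt) // step_duration), None
```

For example, with an initial half-width of 8.0 the first three steps use half-widths of 8.0, 6.0, and 4.5, so a sustained error of 5.0 would survive the first two steps and end the run at the start of the third.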

(a) Level II model data.

(b) Level Ia model data.

Figure 11. Comparison of the same tracking task (the sixth step in the workload buildup) between the Level II and
Level Ia model.


The BAT DART test was a complete success. Both the CH task and BAT task gave consistent results for
each system after accounting for test subject experience. As expected, performance improved as the
boundaries moved closer together, but instead of the performance beginning to degrade at the tightest
boundaries, there was no loss of performance as the smallest possible boundaries were approached. This was
attributed to the nature of the task, which automatically stopped the simulation when a boundary was
exceeded, and to the removal of less skilled subjects as the tasks became more difficult. The oscillations that
preceded many run completions were typically composed of only one or two overshoots (not enough to affect
the improved average at the tighter boundaries) and once individual subjects failed to remain within the
boundaries, there were no results to include in the average for the boundaries they were unable to attain. Run
results for individual subjects often showed the right side of the curve in Figure 8, especially for those that
encountered PIOs in the final stages of the task.

Figure 12. BAT DART tracking performance data; correlation of CHR and BAT task success.

The subjective correlation between CHRs and pilot success in the BAT FTT provided an interesting
challenge. Initially, there was not much correlation other than an unsurprising but barely evident tendency
for poorly rated systems in the CH task to produce relatively lower tracking times. With the CHRs plotted
against the BAT FTT time, there was far more data scatter than expected. Examination of the data points
made it clear that most of the scatter was due to subjects that had little or no pilot experience. When
the data for non-flyers were eliminated, the correlation emerged strongly. Figure 12 depicts these results.
(The authors of the report fully understood the ordinal nature of the CHR scale. The BAT FTT time was
originally plotted against the CHR on a linear scale out of curiosity. This presentation serves as a convenient
method to show the subjective correlation between the two results.) Note that the standard deviation of the
scatter in the tracking time data corresponds to about one CHR. This corresponds to the widely accepted
scatter experienced in a typical CHR test.
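
Because the CHR scale is ordinal, one way to put a number on this kind of agreement without treating ratings as linear is a rank correlation. The Python sketch below computes Spearman's rank correlation from scratch; this is a technique substituted here for illustration, not the method used in the BAT DART report, and the function names are hypothetical. Any ratings and tracking times fed to it would need to come from real test data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def spearman(ratings, tracking_times):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(ratings), average_ranks(tracking_times))
```

A value near -1 would indicate that worse (higher) ratings go with shorter BAT tracking times, which is the direction of the tendency described above.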
The BAT FTT and CH rating FTT provided roughly correlated results in spite of being significantly
different in application and analysis. Table 2 shows the advantages and disadvantages of each as experienced
by the author. The CH rating technique has been shown to be quite effective for many years and will remain
a mainstay of the USAF TPS curriculum, but the BAT FTT can provide a significant amount of information
that future test teams might find quite useful. It certainly proved useful for examining inceptor workload as
a function of duty cycle and aggressiveness as well as achieved vs. desired performance.

VIII. Workload Buildup Technique Evidence for Pilot Inceptor Workload Measurement

Pitch tracking data from the LAMARS and VISTA tests were used for the following analysis. The data
included a total of 49 workload buildup runs averaging about seven minutes each. For clarity, the following
figures only include the data from one of the Level I models (“Level Ib”) and the Level III model. The
results of one of the more experienced pilots are marked with a circle, while those of one of the non-pilots
are marked with a triangle.

VIII.A. Variation of Pilot Performance with Boundary Size


Data from the BAT DART test confirm the obvious: pilot performance improves as the requirements of the
task become more demanding. These results serve more as a baseline for performance analysis because the
only task the subjects were given was the flight path angle tracking task. The safety pilot handled all other
tasks including clearing, airspeed maintenance, and radio traffic. Figure 13 shows the root-mean-squared
tracking error for selected models (LAMARS and VISTA), but only for the steps where the subjects were
able to accomplish the full 60-second duration of the sub-task. (The final step from each run was excluded
because runs were typically terminated by a boundary excursion at the first significant challenge. This
resulted in a computed RMS error applicable only for the shortened task and not directly comparable to the
results from completed 60-second workload buildup steps.)

Table 2. Relative advantages and disadvantages of CHR and BAT workload buildup FTTs.

| | Cooper-Harper Rating | BAT Workload Buildup FTT |
|---|---|---|
| Task Performance | Establishes whether or not a pilot can accomplish a particular task. | Establishes the best performance the pilot was capable of in a particular task. |
| Pilot Workload | Directly addressed in the rating. | Not currently addressed, except when the workload becomes excessive. |
| Pilot Comments | Recognized as the most important aspect of the test. | Pilots must be trained to provide them; they are an important aspect of any handling qualities test. |
| PIO Susceptibility | Only insomuch as the task itself tends to cause a PIO. | PIO is often the cause of the pilot’s inability to remain within the boundaries; identifies when PIO becomes a risk. |
| Learning Curve Assessment | Learning curve is observed through improved ratings as the task is repeated. | Learning curve is observed through improved performance as the task is repeated. |
| Time to Accomplish the FTT | Generally completed within a minute or so. | Each boundary decrement requires about one minute but there is no delay between decrements. |

(a) LAMARS RMS error. (b) VISTA RMS error.


Figure 13. RMS point-tracking error for the BAT DART LAMARS and NF-16D VISTA workload buildup runs for the
Level Ib and Level III models.

Figure 13 shows how performance improved as the boundaries were brought together. Note that the
subjects allowed more error in flight than in the simulator; this is probably due to the consequences of tight
tracking. The effort required to remain within the boundaries, especially with the more difficult models,
could easily produce occasional spikes of almost -2.0g, as seen in Figure 11. In the simulator, instantaneous
motion was simulated using the motion system, but tight tracking was far less uncomfortable so the subjects
were not faced with trading comfort for performance.
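
The per-step RMS point-tracking error plotted in Figure 13 could be computed along the following lines. This Python sketch is an illustration only: it assumes the tracking error is a uniformly sampled list covering one run, uses a hypothetical function name, and drops any trailing partial step so that only completed 60-second steps contribute.

```python
import math


def rms_error_by_step(errors, dt, step_duration=60.0):
    """RMS tracking error for each fully completed step of one buildup run.

    A trailing partial step (for example, one cut short by a boundary
    excursion) is excluded, since its RMS is not comparable to full steps.
    """
    samples_per_step = int(round(step_duration / dt))
    if samples_per_step < 1:
        raise ValueError("dt is too large for the step duration")
    full_steps = len(errors) // samples_per_step
    result = []
    for s in range(full_steps):
        chunk = errors[s * samples_per_step:(s + 1) * samples_per_step]
        result.append(math.sqrt(sum(e * e for e in chunk) / len(chunk)))
    return result
```

Each element of the returned list pairs with the boundary size in force during that step, which is what allows error to be plotted against boundary displacement.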


As the boundary displacement approached the minimum attainable for each subject, all subjects were
fully engaged in their tasks and comments typically stopped except for the occasional grunt of frustration.
The VISTA results showed a clear difference between the different models, with the actual tracking success
corresponding quite nicely to the models’ handling qualities levels. In the simulator, there was not much
difference between the models in terms of skilled subjects’ ability to progress through the workload buildup
steps. This was certainly a surprise, and remains an interesting puzzle to solve.

VIII.B. Variation of Pilot Inceptor Workload with Boundary Size


Figures 14 and 15 show the reduced data for selected LAMARS and VISTA NF-16D runs, breaking out the
results by run (each line), by model (shading), and subject (two subjects are marked with unique symbols).
The results seem to confirm the hypothesis that duty cycle and aggressiveness increase as pilot “gain” would
be expected to increase. They also show how subjects tended to use different techniques with each model and
that the relative application of these techniques was different in the simulator and in the aircraft. Finally,
the duty cycle and aggressiveness of the least successful subjects tended to differ significantly from that
employed by successful subjects.
The most immediately evident result is the clear grouping of duty cycle and aggressiveness results for
particular models. This is clearly evidence of something already known; different flight control configurations
require different control “tactics.” The advantage of this method of characterizing inceptor workload comes
in describing the tactic. For instance, when comparing the types of tracking tactics employed in the VISTA
(see Figure 15(a)) it is clear that the Level Ib and III models required significantly higher duty cycles and
aggressiveness for successful tracking. The least successful subject was almost always using a tactic clearly
different from that of the other subjects in both the VISTA and LAMARS (specifically a much lower duty
cycle), showing how this method can clearly distinguish between tracking strategies.

(a) Aggressiveness vs duty cycle, illustrating different tracking “tactics”. (b) Aggressiveness increasing with boundary closure and task difficulty.
(c) Duty cycle increasing with boundary closure and task difficulty. (d) Looking down the average slope.

Figure 14. Selected results from the BAT DART LAMARS data.

(a) Aggressiveness vs duty cycle, illustrating different tracking “tactics”. (b) Aggressiveness increasing with boundary closure and task difficulty.
(c) Duty cycle increasing with boundary closure. (d) Looking down the average slope.

Figure 15. Selected results from the BAT DART NF-16D VISTA data.

The subfigures (b) and (c) for both figures show provisional confirmation of the hypothesis that duty
cycle and aggressiveness will tend to increase as the pilot is required to work for better performance. By
using a first-order curve fit to obtain the slope of aggressiveness and of duty cycle against the logarithm of
the boundary displacement and assuming a Gaussian distribution of slopes, the probability that aggressiveness
and duty cycle have an inverse relationship with boundary displacement can be estimated. The combination of
the two data sets gives a probability of this correlation of approximately 90%. Subfigures 14(d) and 15(d) also
show this by being aligned with the view angle down the average slope.
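
The estimate described in this paragraph can be sketched in Python as follows. This is a minimal reading of the procedure, not the report's actual analysis: it assumes one least-squares slope per run, pools the slopes from whatever runs are supplied, and uses the sample mean and sample standard deviation of those slopes as the Gaussian parameters. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_slope(x, y):
    """Least-squares slope of a first-order (straight line) fit of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def prob_inverse_relationship(runs):
    """Estimate P(slope < 0) for a metric versus log(boundary displacement).

    `runs` is a list of (boundary_displacements, metric_values) pairs, one
    pair per run, where the metric is duty cycle or aggressiveness.
    Assumes the per-run slopes are normally distributed.
    """
    slopes = [fit_slope([math.log(b) for b in bounds], metric)
              for bounds, metric in runs]
    mu, sigma = mean(slopes), stdev(slopes)
    if sigma == 0.0:
        raise ValueError("slopes are identical; the spread is undefined")
    # Normal cumulative distribution evaluated at zero.
    return 0.5 * (1.0 + math.erf((0.0 - mu) / (sigma * math.sqrt(2.0))))
```

A negative slope means the metric rises as the boundaries close, so a result near 1 supports the hypothesis and a result near 0.5 would indicate no consistent trend.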


IX. Use of the BAT Workload Buildup Flight Test Technique at the USAF TPS

For the past decade, the USAF TPS has put much effort into teaching the importance of considering
pilot-in-the-loop instabilities during handling qualities testing. If a test team hopes to identify potential
PIO situations, they must ensure that their test pilots achieve gains in excess of those which would normally
be expected in operational employment. Starting in 1996, the USAF TPS taught a methodology called
“Handling Qualities During Tracking” (HQDT) that relied upon the test pilot to conduct a tracking task at
steadily increasing gains until a PIO was established or it became clear that PIO was not possible. The US
Military Standard 1797B provides an excellent description of the technique:
    The key element of the HQDT technique is that the pilot must attempt to totally eliminate any
    error in the performance of the task; he adopts the most aggressive control strategy that he can.
    Adequate and desired performance objectives are not defined and Cooper-Harper ratings are not
    recommended. The reason for this is that, in the “operational” tasks, definition of adequate
    and desired performance encourages the pilot to adopt a control strategy which best meets these
    performance objectives. In the case of a PIO-prone air vehicle, attempting to totally eliminate
    any deviation may induce oscillations which reduce task performance, but by accepting small
    errors (reducing pilot gain) the pilot may be able to avoid these oscillations and still meet the
    performance objectives (which, by their definition, allow such a tactic). The HQDT technique
    does not allow the pilot to do this, thus exposing any possible handling qualities deficiencies.
    HQDT could be considered a “stress test” of handling qualities. For this reason, the HQDT
    technique is considered the best test of PIO tendencies.1
The intent of HQDT is beyond reproach, but the technique has proven extraordinarily difficult to teach
and has demonstrated limited utility in the flight test environment.15 Very few pilots can intentionally
attempt to aggressively eliminate all error because experience has taught them that tracking with no error
is not possible and that it is likely to cause problems. One of the most important complaints about HQDT
is that it almost always causes a PIO, thus making even excellent aircraft appear dangerous.7 In a sense,
it works too well! The problem with PIOs is not that they exist—they can be made to happen in most
aircraft and most tasks—PIOs become a problem when they occur during normal operations or normally
aggressive pilot tracking. Structural flutter provides a good analogy: flutter is not a problem if it will only
occur far enough outside the aircraft envelope that the operational pilot is highly unlikely to encounter it.
HQDT is like testing for flutter by pushing the aircraft until flutter is encountered. In real flutter testing,
the aircraft is only pushed as far as necessary to ensure that the envelope is clear for normal operations.
Aircraft designers know that flutter is possible in almost any airframe but they only need to show that it is
not a problem inside a safety margin around the aircraft envelope. Likewise with PIO; it is probably “out
there” so how does the test team ensure that it is not a factor inside a safety margin around the expected
pilot gain?
In the course of BAT testing, and especially during the development and testing of the collapsing boundary
BAT FTT, it became apparent that BAT could provide a repeatable method of pilot gain buildup. The
pilot is forced to higher gains as the boundaries are brought closer together. It has been shown that this
will increase the pilot’s performance at the cost of additional workload until the point at which incipient
instability (where “instability” here is, of course, PIO) limits his ability to improve.12 As the stability of the
closed-loop system decreases with tighter and tighter boundaries it will make the task impossible and the
PIO evident.
One particular advantage of this technique is that it uses a “build-up approach”, making it much less
likely that the pilot will misinterpret a PIO as an aircraft instability (perhaps the most important cause of
hazardous PIOs). It can also show what the pilot is capable of, not just whether or not the pilot is capable
of a particular task (as is the case with the Cooper-Harper handling qualities FTT).
Recent experience with the BAT FTT has also shown that it can identify those systems that are
performance-limited by something other than PIO. The author recently developed a desktop tracking sim-
ulator for student training that allows almost all aspects of the tracking task, from the system to the task
itself, to be changed at the whim of the student or instructor. It is a simple thing to tune the system and
task so that the pilot simply runs out of control authority before any hint of a PIO arises.
The collapsing boundary BAT FTT is in its infancy at the USAF TPS but it shows much promise. It
gives the pilot a series of tasks that will increase his gain instead of relying on the pilot to produce the desired
level of aggressiveness. Where most students and many instructors were essentially unable to understand or
apply the HQDT technique, very few have any difficulty with the BAT FTT. Where difficulties arise, they
are usually the result of pilots not appreciating the need to role-play as if the boundaries represent a real
threat; it seems that this often results when the increment from one boundary task to the next is too large.
Most pilots learn to use the technique quickly. The technique is also an excellent training tool for showing
the effects of a tracking task on pilot workload and the effects of excessive pilot gain.

X. Current and Future Research


Research continues on boundary avoidance tracking, the workload buildup technique, inceptor workload,
and their effect on tracking performance. As more students and instructors utilize the BAT workload
buildup FTT on different platforms, the USAF TPS is beginning to settle upon those techniques that best
illustrate and measure pilot-in-the-loop control and oscillations. Preliminary simulator work conducted at
the USAF Research Laboratory by a student in the USAF Institute of Technology/TPS Master’s Degree
Program has provided additional support for the ideas discussed in this paper. Beginning in mid-2008, this
student and a few of his classmates will begin their test management project, extending the simulator work
into the actual flight environment. This project, based on bank angle control and boundaries, will attempt
to gather data to support several areas of these theories and will, for the first time, include a secondary task
to help measure pilot concentration on the primary task. It is hoped that the relative simplicity of bank
angle tracking will provide data that can be used to further examine the mathematical model of boundary
avoidance tracking.

XI. Conclusion
In the three years since the first conception of boundary avoidance tracking, the USAF Test Pilot School
has supported and conducted a series of flight and simulator research programs to further define and understand
the phenomenon. These programs have largely supported the original boundary-avoidance tracking
feedback model while showing that there is much to learn, especially with respect to how pilots transition
out of boundary avoidance tracking when the boundaries are exceeded. All experimental subjects have con-
firmed their conscious intent to avoid boundaries and the transition from point tracking to boundary tracking.
Boundary avoidance tracking has also been shown to be responsible for stable and unstable pilot-induced
oscillations.
In conjunction with this research, a new flight test technique that uses a series of collapsing boundaries
around a common task has proven a very useful tool for driving pilot gain, estimating the best attainable
closed-loop performance for a task, and identifying PIOs safely. The results from these tests
also appear to subjectively correlate well with Cooper-Harper tracking task results. The USAF TPS has
incorporated the collapsing boundary BAT FTT into its curriculum as an evolutionary change to the HQDT
technique. Early results are very promising; the intent of HQDT is met by giving pilots tasks that increase
their gain, not by asking that they maximize their gain.
It seems, then, that boundary avoidance tracking has circled back on point tracking and can be used
to characterize the pilot/aircraft capability for conducting a point tracking task. This should not be a
surprise. From the start of manned flights, pilots have recognized that no handling task can be accomplished
perfectly. “Good enough” has always instinctively been characterized as a certain maximum amount of error;
the boundary between success and failure.

Acknowledgments
The author’s continuing research into boundary avoidance tracking and its application in flight test would
not be possible without the generous support of the USAF Test Pilot School, including the enthusiastic
feedback and support of its talented staff and students.


References
1 United States Department of Defense, MIL-STD-1797B, Flying Qualities of Piloted Aircraft, February 2006.
2 Baker, B. C. and Paynter, S. J., “US Patent #6,785,610: Spatial Avoidance Method and Apparatus,” Patent, August 2004.
3 Gray, III, W. R., “Boundary-Escape Tracking: A New Conception of Hazardous PIO,” 48th Society of Experimental Test Pilots Symposium Proceedings, 2004.
4 McRuer, D. T., Clement, W. F., Thompson, P. M., and Magdaleno, R. E., “Minimum Flying Qualities, Volume II: Pilot Modeling for Flying Qualities Applications,” Tech. Rep. F33615-85-C-3610, Wright Research and Development Center Flight Dynamics Laboratory, August 1989.
5 Twisdale, T. R. and Nelson, M. K., “A Method for the Flight Test Evaluation of PIO Susceptibility,” NASA Dryden PIO Workshop, 1999.
6 Mitchell, D. G. and Klyde, D. H., “Recommended Practices for Exposing Pilot Induced Oscillations or Tendencies in the Development Process,” Tech. rep., Systems Technology, Inc., 2004.
7 Prosser, K. E. and Thurling, A. A., “Handling Qualities Stress Testing,” 44th Society of Experimental Test Pilots Symposium Proceedings, 2000.
8 Angner, J., Jensen, C., and Seidl, M., “JAS 39 Gripen EFCS: How to Deal with Rate Limiting,” 40th Society of Experimental Test Pilots Symposium Proceedings, 1996.
9 Gray, III, W. R., “Boundary-Avoidance Tracking: A New Pilot Tracking Model,” AIAA Atmospheric Flight Mechanics Conference and Exhibit, 2005.
10 Warren, R. D., Abell, B., Heritsch, S., Kolsti, K., and Miller, B., “A Limited Investigation of Boundary Avoidance Tracking (HAVE BAT),” Tech. rep., United States Air Force Flight Test Center, Edwards AFB, CA, 2006.
11 Warren, R. D., An Investigation of the Effects of Boundary Avoidance on Pilot Tracking, Master’s thesis, US Air Force Institute of Technology, Wright-Patterson AFB, OH, December 2006.
12 Dotter, J. D., “An Investigation into Pilot Performance and PIOs as Influenced by Boundaries,” Tech. rep., United States Air Force Flight Test Center, Edwards AFB, CA, 2006.
13 Dotter, J. D., An Analysis of Aircraft Handling Quality Data Obtained From Boundary Avoidance Tracking Flight Test Techniques, Master’s thesis, US Air Force Institute of Technology, Wright-Patterson AFB, OH, March 2007.
14 Cooper, G. E. and Harper, Jr., R. P., “The Use of Pilot Rating in the Evaluation of Aircraft Handling Qualities,” Tech. Rep. NASA TN D-5153, Cornell Aeronautical Laboratory, 1966.
15 Oelker, H.-C. and Brieger, O., “Flight Test Experiences with Eurofighter Typhoon during High Bandwidth PIO Resistance Testing,” AIAA Atmospheric Flight Mechanics Conference and Exhibit, 2006.
