Quality Assurance Staffing Impacts in Military Aircraft Maintenance Units
Abstract
Purpose – This paper summarizes our research into the impact that current Air Force quality
assurance staffing practices have on key unit performance metrics.
Design/methodology/approach – Interviews and Delphi surveys culminated in the development
of a quality assurance staffing effectiveness matrix. The matrix was used to calculate historical
quality assurance staffing effectiveness at 16 Air Force combat aircraft units. Effectiveness scores
were then regressed with unit historical data for 25 metrics.
Findings – Nine metrics were deemed statistically significant: break rates, cannibalization
rates, flying schedule effectiveness rates, key task list pass rates, maintenance scheduling
effectiveness rates, quality verification inspection pass rates, repeat rates, dropped object counts and
safety/technical violation counts. An example benefit-cost analysis for changes in quality assurance
staffing effectiveness presents compelling evidence for maintenance managers to carefully weigh
decisions to leave quality assurance personnel slots empty, or to assign personnel possessing other
than authorized credentials.
Originality/value – Maintenance managers can use this method to help determine personnel
assignment strategies for improving quality assurance unit performance with respect to similar
metrics, and other managers could adopt this general approach to more effectively link personnel
resources to their organization’s key performance indicators.
Keywords Quality assurance, Maintenance, Personnel administration, Performance measures,
Aircraft industry, United States of America
Paper type Research paper
2. Introduction
A typical Air Force combat aircraft wing requires thousands of maintenance
technicians to safely launch, recover, and service aircraft between flights. Wing leaders
rely on a cadre of about 25 experts – the quality assurance unit – to provide daily
maintenance oversight, on-the-spot correction, training, an investigative capacity, and
a mechanism for formal feedback.
The key to an aircraft maintenance quality assurance unit’s effectiveness is the
skills and experience-based qualifications of personnel assigned to its limited
positions. Accordingly, the Air Force performs extensive manpower studies to
determine the authorized quality assurance unit composition that can best perform its
exercise, peacetime, and war duties. The resultant list of authorized positions and
associated required expertise for a particular quality assurance unit is called its Unit
Manning Document (UMD).
Such efforts are important because the Air Force relies heavily on inspection-based
quality assurance. Poorly executed quality assurance can have disastrous
consequences – for example, an F-15 aircraft crashed with loss of plane and pilot
after a mechanic incorrectly installed flight control rods, and another mechanic failed
to catch the miscue (Grier, 1998). In 1996, ValuJet Flight 592 caught fire in-flight and
crashed into the Florida Everglades, killing 110 people. The crash was attributed to
contract maintenance personnel improperly rendering safe and shipping oxygen
cylinders in the cargo hold of the aircraft. A key finding was that ValuJet failed to
adequately oversee the maintenance subcontractor, and this failure was the cause of
the accident (Langewiesche, 1998). Catching such issues and ensuring that critical
inspections are performed correctly are fundamental to the quality assurance function.
There exists a considerable literature on quality assurance staff selection and on
task-skill mismatching. McCormick et al. (1992) discuss the merits of contracting
consultants for skilled management oversight positions, versus using in-house
staffing. Srivastava (2005) shows how a systemic, up-front quality assurance planning
and execution approach enabled two major nuclear power plants to rapidly achieve
goals with 1/3 fewer quality assurance manpower positions. Vessely (1984) presents
guidelines for quality assurance manpower requirements throughout a nuclear power
plant’s production and operations life cycle. Van den Beukel and Molleman (2002)
argue that depending on managerial policies and team social dynamics, employee
multifunctionality can lead to both underutilization of skills and task overload. Zeng
and Zhao (2005) introduce an optimization-based approach for analyzing
role-resolution policies used in current workflow management practice and propose
new policies. However, they assume that all staffing decisions have already been made.
MacDonald et al. (1996) report the results of an internal quality assurance pilot
program at 16 Indonesian hospitals, with the goal of motivating hospital staff to
assume greater responsibility for their quality of service. They provide several
recommendations for success, including comprehensive (versus product-focused)
programs, use of dedicated coaches, and visible leadership from top executives.
A key assumption throughout the literature is that necessary personnel can be
readily acquired. Military aircraft maintenance organizations are characterized by the
need to build specialized technical and military skills that require years to develop, in
an environment with a high rate of personnel turnover (approximately five years per
maintenance technician). However, the military cannot generally hire this skilled labor,
but must develop it internally (Dahlman et al., 2002). Furthermore, managers must
provide the experts to fill the quality assurance jobs from within their own
organizations.
3. Methodology
Our research was completed in four distinct phases. In phase one, key maintenance
metrics were identified and a staffing effectiveness matrix was constructed. Phase two
consisted of acquiring historical staffing data for 16 Air Force bases and applying the
staffing effectiveness matrix to this data. In phase three, data on the subject aircraft
flying units’ key performance metrics were compiled and analyzed to determine any
relationship with the computed quality assurance staffing effectiveness. Phase four
introduces potential mitigating strategies for use by aircraft maintenance managers.
3.1 Phase one: identifying the metrics and building the effectiveness matrix
We first obtained a list of UMD-authorized manpower positions for all Air Force
combat aircraft flying units. Eliminating administrative skills not directly involved in
quality assurance and combining others led to a final list of 24 different specialty
codes, as shown in Table I. To establish and validate our effectiveness matrix, we used
a Delphi approach, which is a process in which a panel of geographically dispersed
experts deals systematically with a complex task in an iterative manner (The Delphi
Method, 2005). Thirty-four maintenance experts agreed to participate as our Delphi
panel of experts. This panel included at least two members from each Air Force combat
aircraft flying unit, and each possessed a minimum of six years of experience in the
aircraft maintenance field.
The panel assessed the impact they perceived the quality assurance function to
have on a list of 28 candidate wing- and unit-level metrics that we proposed. Our
web-based survey had the respondents rate their perceived quality assurance impact
on the candidate metrics using a six-point scale from Strongly Disagree to Strongly
Agree, and we encouraged them to suggest additional metrics. Twenty-five metrics
were deemed significant by the panel, and are introduced and defined in the Appendix.
Our next survey introduced the quality assurance staffing effectiveness matrix – a tool
Table I. Quality assurance specialty codes

AFSC     AFS title
2A0X1    Avionics Test Station and Computer Journeyman/Craftsman
2A3X0    Tactical Aircraft Superintendent
2A3X1    A10/F15/U2 Avionics Attack Journeyman/Craftsman
2A3X2    A10/F15/U2 Avionics Attack Journeyman/Craftsman
2A3X3    Tactical Aircraft Maintenance F-15 Journeyman/Craftsman
2A590    Maintenance Superintendent (Non-Tactical Aircraft)
2A5X1    Aerospace Maintenance Journeyman/Craftsman
2A5X2    Helicopter Maintenance Journeyman/Craftsman
2A5X3    Integrated Avionics Systems/INS Journeyman/Craftsman
2A6X0    Aircraft Systems Manager
2A6X1    Aerospace Propulsion Journeyman/Craftsman
2A6X2    Aerospace Ground Equipment Journeyman/Craftsman
2A6X3    Aircraft Egress Systems Journeyman/Craftsman
that can measure the average effectiveness of a person with mismatched qualifications
when assigned to quality assurance duty. This survey required two Delphi rounds. In
round 1, a web-based instrument was developed and sent to the Delphi panel. This
instrument asked each Delphi panelist to systematically rate, on a scale of one to five,
how effectively a person possessing a particular specialty code could be expected to
perform the duties and tasks of the 24 quality assurance staffing positions listed in
Table I (Moore, 2005). The panel repeated this ranking process for each of the 24
specialty codes, thus requiring 576 individual effectiveness rankings.
Completing this survey demanded several hours from each panelist, which proved
burdensome for many of them; round 1 thus took over three months to complete.
Therefore, we developed a filter designed to reduce the panel’s rating effort in the
second Delphi round, by identifying individual effectiveness rankings that could be
deemed as agreed upon (i.e. consistently scored) by the panel. This filter – a coefficient
of variation (CV) discriminator – gave us the ability to compare the variation of two or
more different variables and provided a standardized view across all 576 mean
responses to gain a better understanding of the variability present in the data. Our
hope was that we could reduce the number of individual rankings that would need to
be examined by the panel in round 2. CV discriminators between 0 and 1.0 were
iteratively applied to the mean data responses to illuminate the “fail” responses
(indicating a lack of agreement among the experts) that would need to be addressed by
the panel. After some trial and error, we subjectively settled on a CV factor of 0.29 as
our threshold, which resulted in 529 “fails” to be examined again by the panel. We then
used these “fails” to develop a spreadsheet-based instrument for use in the second
Delphi round.
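The CV filter can be sketched in a few lines; the following is our own minimal illustration (not part of the original study), where the hypothetical `ratings_by_cell` maps each (position, specialty code) matrix cell to its panelists' 1-5 ratings:

```python
from statistics import mean, stdev

def cv_fails(ratings_by_cell, threshold=0.29):
    """Flag matrix cells whose panelist ratings disagree too much.

    A cell "fails" (and must be re-examined in round 2) when its
    coefficient of variation (stdev / mean) exceeds the threshold.
    """
    fails = []
    for cell, ratings in ratings_by_cell.items():
        cv = stdev(ratings) / mean(ratings)
        if cv > threshold:
            fails.append(cell)
    return fails

# Hypothetical example: one cell with close agreement, one without.
cells = {
    ("pos1", "2A373"): [4, 4, 5, 4],   # low spread  -> passes
    ("pos2", "2A651"): [1, 3, 5, 2],   # high spread -> fails
}
print(cv_fails(cells))  # -> [('pos2', '2A651')]
```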
Panelists were instructed in round 2 to examine the aggregated staffing
effectiveness matrix together with anonymous comments derived from round 1. Of
the 14 respondents in round 2, thirteen modified their round 1 response in varying
degrees. The median data responses from round 2 were then used as a basis to develop
the final quality assurance staffing effectiveness matrix, shown in Figure 1. We
determined from e-mail and telephonic responses from the Delphi panel members that
a third survey round would result in no further adjustment to their individual ratings.
3.2 Phase two: determining unit historical quality assurance staffing effectiveness
We obtained monthly personnel assignment data for years 2003 and 2004 from each of
the 16 participating air bases. We then used this data together with the quality
assurance staffing effectiveness matrix to rate each authorized position.
After all individual quality assurance manpower positions were assigned staffing
effectiveness ratings, a simple average was computed to determine each quality
assurance unit’s overall staffing effectiveness rating for each month. (Table II depicts
an example calculation.) We then aggregated the monthly staffing effectiveness scores
for all 16 participating quality assurance units into one chart to develop a quality
assurance staffing effectiveness time-series, as shown in Table III.
Figure 1. Results of Delphi survey, part 2: quality assurance effectiveness matrix
Table II. Example staffing effectiveness calculation

Position ID   UMD      Assigned   Quantity   Score %
1             2A390    Unfilled   0          0
2             2A373    2A651      1          47
3             2W071    2W071      1          100
4             2A371    2A371      1          100
5             2A373    2A373      1          100
6             2A373    2E151      1          14
7             2W171    2W171      1          100
8             2A051D   2A071      1          100
9             2A351B   Unfilled   0          0
10            2A353    2A353      1          100
11            2A353    2A353      1          100
12            2A353    2A553      1          80
13            2A353    2A350      1          80
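The simple-average computation behind the Table II example can be sketched as follows (our own illustration using the Table II scores; unfilled positions count as zero):

```python
def unit_effectiveness(position_scores):
    """Overall monthly staffing effectiveness: the simple average of the
    per-position effectiveness scores, with unfilled positions scored 0."""
    return sum(position_scores) / len(position_scores)

# Per-position scores (percent) from the Table II example.
scores = [0, 47, 100, 100, 100, 14, 100, 100, 0, 100, 100, 80, 80]
print(round(unit_effectiveness(scores), 1))  # -> 70.8
```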
Table III. Quality assurance unit calculated staffing effectiveness (monthly scores, %)

Unit       Year   Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Barksdale 2003 95 95 95 95 95 95 95 95 95 95 95 95
2004 95 95 95 95 95 95 95 95 95 95 95 95
Beale 2003 66 66 61 63 63 63 59 59 59 56 56 56
2004 56 56 53 56 56 56 56 53 56 56 56 69
Cannon 2003 89 89 89 89 90 90 90 90 87 87 87 87
2004 85 85 85 85 85 85 85 85 86 86 86 86
D-M 2003 80 80 80 84 84 88 88 88 92 80 76 83
2004 79 82 84 84 84 84 84 84 88 81 84 84
Dyess 2003 100 100 100 100 100 100 100 100 100 100 100 100
2004 100 100 100 100 100 96 96 96 95 99 99 99
Ellsworth 2003 78 78 78 78 78 78 79 79 79 79 79 79
2004 79 79 79 78 81 81 81 79 79 79 82 81
Holloman 2003 100 100 100 100 100 100 100 100 100 100 100 100
2004 100 100 100 100 100 100 100 100 100 100 100 100
Langley 2003 85 85 85 85 85 85 85 85 85 85 85 85
2004 85 85 85 85 85 85 84 84 84 87 87 87
Minot 2003 88 88 88 88 88 88 88 88 88 88 88 88
2004 94 94 94 94 94 94 94 94 88 91 91 91
M-H 2003 83 83 80 80 80 79 79 79 79 79 81 81
2004 81 81 81 81 81 81 81 81 81 76 76 76
Nellis 2003 87 87 87 87 87 87 87 87 87 87 87 87
2004 87 87 87 87 87 87 93 93 93 93 93 93
Offutt 2003 64 64 64 64 64 64 64 64 64 68 68 68
2004 71 71 71 75 75 75 75 80 80 80 80 80
Pope 2003
2004 74 74 74 74 80 85 85 78 84 89 89 84
S-J 2003 92 92 92 90 90 90 90 90 90 90 90 90
2004 90 90 91 91 91 91 91 91 92 92 92 92
Shaw 2003 97 97 97 97 97 97 97 97 97 97 97 97
2004 97 97 97 97 97 97 100 100 100 100 100 100
Whiteman 2003 85 85 85 85 85 85 85 85 89 89 89 89
2004 91 91 91 93 91 91 100 91 91 91 91 91
Table IV. Unit-level correlations between quality assurance staffing effectiveness and metrics

Metric                         Correlation direction   Correlation strength          Number of aircraft types/units
Abort rate                     Positive                Weak to moderate              8
                               Negative                Weak to moderate              6
Break rate                     Positive                Weak to moderate              11
                               Negative                Weak to moderate              10
CANN rate                      Positive                Weak to moderate to strong    9
                               Negative                Weak to moderate              17
Dropped object counts          Positive                Weak to moderate              3
                               Negative                Weak to moderate              11
Deficiency reports submitted   Positive                Weak to moderate              4
                               Negative                Weak to moderate              9
study – there’s just too much process complexity in a typical Air Force combat aircraft
wing. However, the R-squared values do provide useful information nonetheless. One
final concern emerged in our tests for randomness of the regression residuals: in our
data we found five of nine metrics with Durbin-Watson test values that were outside of
the normally acceptable level (Kutner et al., 2005, p. 487), indicating a lack of
randomness. While data transformations could be made, the serial nature of our data
suggests they would not be successful (Oxley, 2000). However, recent studies indicate
that even when heteroscedasticity cannot be eliminated, valid inferences can still be
made (Oxley, 2000). Since we appended our base-level data sets into a single file, we
expect serial correlation. Eight metrics were deemed statistically significant (i.e.
having nonzero regression slope coefficients), as shown in Table V. An additional
metric, Dropped Objects, was significant at the p < 0.15 level. Since it also
demonstrated (in Table IV) a consistently moderate negative correlation with quality
assurance staffing effectiveness at 11 of the 16 bases studied, we included it in Table V
as well.
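The Durbin-Watson check mentioned above can be reproduced on any residual series. The sketch below is our own minimal implementation (not the study's code); values of d near 2 suggest uncorrelated residuals, while values well below 2 suggest the positive serial correlation one expects after appending base-level series into a single file:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: d = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Alternating residuals push d well above 2; a slowly drifting
# (serially correlated) series pushes d toward 0.
print(round(durbin_watson([1, -1, 1, -1, 1, -1]), 2))
print(round(durbin_watson([1, 1.1, 1.2, 1.3, 1.4]), 3))
```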
. All persons possessing the expertise in the reported staffing data are fully
capable, are assigned as assets under the maintenance group staffing structure,
and are not performing additional military duties outside of their expertise.
The quality assurance staffing variable can be interpreted as an elasticity value for
rate (non-count) dependent metrics. The elasticity formula is:
\frac{\Delta Y}{\Delta X} \cdot \frac{X}{Y} = \frac{\%\Delta Y}{\%\Delta X}
where Y, X are the variables being measured. For example, a one percent increase in
the quality assurance staffing value will yield a 0.7 percent decrease in the Break rate,
as shown in Table VI. Furthermore, when interpreting the impact on a count-type
metric, the marginal improvement is an amount as shown in Table VII – e.g. a −0.9
Dropped Object incremental change means that a 1 percent increase in quality
assurance staffing effectiveness will result in about one fewer dropped object at each
base per month. (Note that Dropped Objects are maintenance-attributable items that
separate from an aircraft in flight.)
Table VI. Percentage change in dependent variables for a one percent change in staffing effectiveness

Variable                                Elasticity
Break                                   −0.7
Cannibalization                         −2.4
Flying schedule effectiveness           −0.4
Key task list pass                      −0.1
Maintenance scheduling effectiveness    −0.2
Quality verification inspection pass    −0.1
Repeat                                  −1.7
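The elasticity interpretation can be applied mechanically to the Table VI values; the sketch below is our own illustration (the function name is ours, and the inputs are example values only):

```python
def pct_change_in_metric(elasticity, pct_change_in_staffing):
    """Elasticity interpretation: %change in Y = elasticity * %change in X."""
    return elasticity * pct_change_in_staffing

# Break-rate elasticity of -0.7 (Table VI): a 1% gain in staffing
# effectiveness implies roughly a 0.7% drop in the break rate.
print(pct_change_in_metric(-0.7, 1.0))
# Cannibalization elasticity of -2.4: a 5% gain implies ~12% fewer CANNs.
print(pct_change_in_metric(-2.4, 5.0))
```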
The Dropped Object benefit b is the product of the cost per incident, the number of
Dropped Objects avoided per month, the 16 bases, and the 12 months per year; note
that the total annual cost of adding 16 noncommissioned officers is $1,200,000.
We divide the Dropped Object benefit b by the cost of the additional 16 personnel to
derive the benefit-cost ratio:
Benefit-Cost Ratio = ($b Dropped Object Benefit) / ($1,200,000 noncommissioned officer cost)
To simply recover the added personnel investment, Dropped Objects need only cost
$1,740 per incident. At a more realistic but still conservative cost of $5,000 per incident,
the benefit-cost ratio is over 2.8, equating to an annual cost avoidance of over $2.2
million. Consider that in 2003-2004, the 16 air bases we studied collectively averaged
about 1.5 Dropped Objects per month, with two high-activity bases experiencing about
2.7 per month (Moore, 2005). Even at 1.5 counts per month, the benefit-cost ratio is still
greater than 1.0 if Dropped Object costs average at least $4,200 per event. This example
alone suggests that increasing quality assurance unit staffing effectiveness (i.e.
assigning one more employee to each organization’s quality assurance unit) is justified
solely on the basis of decreasing Dropped Object counts.
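The benefit-cost arithmetic above can be sketched directly from the paper's figures (our own illustration; the break-even check reproduces the roughly $4,200 value quoted for 1.5 incidents per base per month):

```python
BASES = 16
MONTHS = 12
ANNUAL_PERSONNEL_COST = 1_200_000  # 16 added noncommissioned officers

def benefit_cost_ratio(cost_per_incident, avoided_per_base_per_month):
    """Annual Dropped Object benefit divided by the added personnel cost."""
    benefit = cost_per_incident * avoided_per_base_per_month * BASES * MONTHS
    return benefit / ANNUAL_PERSONNEL_COST

def break_even_cost(avoided_per_base_per_month):
    """Per-incident cost at which the benefit-cost ratio is exactly 1.0."""
    return ANNUAL_PERSONNEL_COST / (avoided_per_base_per_month * BASES * MONTHS)

# At 1.5 avoided Dropped Objects per base per month, the break-even
# per-incident cost is about $4,167 -- the ~$4,200 figure in the text.
print(round(break_even_cost(1.5)))  # -> 4167
# At a conservative $5,000 per incident the ratio still exceeds 1.0.
print(round(benefit_cost_ratio(5000, 1.5), 2))  # -> 1.2
```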
4. Discussion
The magnitude and direction of the nine metric coefficients identified in Table V
generally agree with our intuition. Of note is the finding that Cannibalization rates
show a moderately strong negative relationship to unit quality assurance effectiveness
(cannibalization is the act of removing a serviceable part from one aircraft to replace an
unserviceable part on another). Cannibalization can be an effective supply strategy, but
it does create extra maintenance effort and thus should not be used to mask supply
system deficiencies. The regression result was supported by the correlation analysis
for individual units, which was most evident for one- to three-month lags (i.e. effective
quality assurance actions this month appear to mitigate cannibalization actions in the
next few months). On the other hand, Key Task List Pass rates, Maintenance Schedule
Effectiveness rates, and Quality Verification Inspection (QVI) Pass rates all showed
weak negative relationships to unit quality assurance effectiveness. These findings
were consistent with the mixed positive and negative individual unit correlation
analyses. This result could possibly be attributed to individual quality assurance unit
management strategies, as suggested by one Delphi panelist (Moore, 2005) – i.e. more
effective quality assurance units are “tougher” when performing these critical
inspections, and thus these rates would be expected at least initially to be lower. The
counter-argument is that more effective quality assurance units will influence
maintenance personnel to be more thorough in task accomplishment before calling the
inspectors, which should be reflected in a positive relationship between these metrics
and quality assurance staffing effectiveness. More research is needed.
Podsakoff and Organ (1986) mention potential problems with self-report based
research. For example, potential biases are inherent in perception-based surveys
because respondents can stage their responses in an overt or unintentional attempt to
affect the results. To mitigate this, we provided only basic instructional guidance on
our surveys. A second issue was the lengthy amount of time required to complete our
staffing effectiveness Delphi surveys. According to Podsakoff and Organ (1986),
respondents taking long surveys can experience “transient mood states” where a
consistent bias may be introduced across measures. To control for this, we provided a
“Save & Return Later” function in our Delphi surveys.
A final survey issue is the potential bias attributable to trait, source and methods.
For example, a Delphi panelist may have had consistently higher or lower expectations
on how effective somebody possessing their same specialty code may perform other
jobs (e.g. an electrician respondent may feel that an average electrician would be more
apt than people with other skills to expertly handle any type of job they are assigned, and
thus may inflate their quality assurance effectiveness ratings for electricians). We
controlled for this potential bias by stressing that the panelists base their inputs on
“average” personnel performance, and by ensuring that our Delphi panel was diverse:
our panel members possessed different military ranks, different job specialty codes,
had different maintenance backgrounds, and represented different organizations.
From our research, we propose several recommendations. First, managers should
use the staffing effectiveness metric to understand the relationship between quality
assurance skill-task mismatching and organization performance. Air Force combat
aircraft maintenance units can use our derived matrix; other organizations should first
determine their own effectiveness matrix and associated unit performance measures.
Second, military aircraft maintenance quality assurance unit managers should perform
analyses to determine the strength and direction of any relationships between their
unit’s measured quality assurance staffing effectiveness and their unit’s performance
on the nine identified metrics. This will help them assess quality assurance staffing
effectiveness as a potential contributing factor for deficient areas indicated by their
metrics, and enable them to make the appropriate personnel decisions. Finally, we
believe this general method has applicability to any organization that seeks to
understand how their personnel resources correlate to their key performance
indicators.
Our future research efforts will concentrate on replicating this method and
performing benefit-cost ratio analyses with other military or civilian service
organizations with high-demand, low-density personnel resources. This will help
validate our method’s robustness and range of application. Secondly, the metric
relationships that were indicated in this study will be investigated through a structural
equations modeling technique to uncover potential additional linkages. We were
unable to obtain sufficient data in our initial research to pursue this approach.
References
Dahlman, C., Kerchner, R. and Thaler, D. (2002), Setting Requirements for Maintenance
Manpower in the US Air Force, RAND Publishing, Santa Monica, CA.
“The Delphi Method” (2005), available at: www.iit.edu/~it/delphi.html (accessed 11 September
2006).
Grier, P. (1998), “Board partly clears airman in F-15 crash”, Aerospace World, Vol. 81 No. 6, June.
Kutner, M., Nachtsheim, C., Neter, J. and Li, W. (2005), Applied Linear Statistical Models, 5th ed.,
McGraw-Hill Irwin, New York, NY.
Langewiesche, W. (1998), “The lessons of Valujet 592”, Atlantic Monthly, Vol. 281 No. 3, pp. 81-98.
McCormick, E., Pratt, D., Haunschild, K. and Hegdal, J. (1992), “Staffing up for a major program”,
Civil Engineering, Vol. 62 No. 1, pp. 60-2.
MacDonald, P., Francisco, M. and Azwar, A. (1996), “Internal quality assurance: lessons learned
from the PKMI Hospital”, Quality Assurance Methodology Refinement Series, June.
Further reading
Air Force Instruction 21-101 (2004), Aerospace Equipment Maintenance Management, HQ
USAF/ILM, Pentagon, Washington, DC, available at: www.e-publishing.af.mil/pubfiles/af/
21/afi21-101/afi21-101.pdf
Linstone, H. and Turoff, M. (1975), The Delphi Method: Techniques and Applications,
Addison-Wesley Publishing Co., Reading, MA.
Appendix

Metric            Definition
Abort             Percentage of missions aborted in the air and on the ground (an abort
                  is a sortie that ends prematurely and must be re-accomplished)
Break             Percentage of aircraft that land “Code-3” (unable to complete at least
                  one of its assigned missions)
Cannibalization   Number of cannibalization (CANN) actions (removal of a serviceable
                  part from an aircraft or engine to replace an unserviceable part on
                  another aircraft or engine)
Combined mishap   Aggregated count of all Class A, B and C mishaps, both flight and
                  ground, that are specifically related to maintenance
Corresponding author
Alan W. Johnson can be contacted at: alan.johnson@afit.edu