Llis

llis.nasa.gov/lesson/1580

Subject

This document contains lessons learned from the NOAA-N Prime mishap
which occurred at Lockheed Martin Space Systems Company in Sunnyvale,
CA on September 6, 2003.

Abstract

The fully assembled NOAA-N Prime satellite was significantly damaged after sliding off
an improperly configured cart while being rotated from a vertical position to a horizontal
orientation for an instrument shimming operation at the prime contractor's facility on
September 6, 2003.

This document presents several lessons learned and recommendations for proactive
measures to prevent a recurrence of this unfortunate accident. They may be considered
applicable to any government/contractor collaborative relationship established to build,
test, and launch space flight hardware. The lessons learned relate to the Integration and
Test (I&T) phase of the satellite development life cycle and are particularly relevant to
long duration projects.

Driving Event
On Saturday, September 6, 2003, during an Integration and Test operation at Lockheed
Martin Space Systems Company in Sunnyvale, CA that required tilting the large (14' long
by 6' diameter) NOAA-N Prime satellite from a vertical to a horizontal position, the
satellite slipped from the turn over cart and fell to the floor. (The “turn over cart” is a
piece of mechanical ground equipment on which a NOAA satellite can be rotated as much
as 360 degrees and tilted as much as 90 degrees.) Although no injuries to personnel
occurred, extensive hardware damage was sustained to the structure and many of the
satellite bus and instrument components.

The operation scheduled for the day of the mishap was to shim the Microwave Humidity
Sounder (MHS) instrument by removing the MHS from NOAA-N Prime, installing the
shim, and reinstalling the MHS. This operation required the spacecraft to be rotated from
its usual vertical position and tilted to a horizontal position using the turn over cart. In
the accident the spacecraft fell to the floor as it reached 13 degrees of tilt while being
repositioned. From inspection of the turn over cart after the accident, it was immediately
clear that the satellite fell because the turn over cart adapter ring was not properly
secured to the cart with the required 24 bolts.

Mishap Investigation Board

The NOAA-N Prime Mishap Investigation Board (MIB) was established on September 9,
2003 by NASA Headquarters in the public interest to gather information, conduct
necessary analyses, and determine the facts of the mishap. The findings were published in
the NOAA-N Prime Mishap Investigation Final Report dated September 13, 2004 and are
available on the NASA website.

To identify the root causes at work in the NOAA-N Prime Mishap, the Board undertook
two approaches. The first was an extensive analysis of the sequence of events prior to and
on the day of the mishap; the planned operational scenario vs. the actual execution; and
the planning activities, including scheduling, crew assembly and test documentation
preparation. The second approach was to utilize the Human Factors Analysis and
Classification System (HFACS) 2000 to provide a comprehensive framework for
identifying and analyzing human error. Evidence from a number of sources, including
witness interviews, test and handling procedures, and project documents, was used to
develop the accident scenarios and populate the HFACS model.

The Board found that the direct cause of the mishap was a failure of the contractor
operations team to follow procedures to properly configure the turn over cart prior to
placing NOAA-N Prime on the cart. The 24 bolts needed to secure the adapter plate
were not in place, and the team relied on paperwork rather than on the visual and
mechanical verification required by the procedures.

The Board also discovered that the entire operation was flawed and it exposed issues that
were systemic in nature. The I&T team exhibited complacency toward the operation as it
was rushed, poorly planned and inadequately staffed. Safety was ignored and, in fact,
safety representatives were not notified about the operation. Red-lines to the procedures
were poorly implemented and controlled and the road map through the operation was not
clearly documented. Procedures were not properly stamped or steps waived because of
inadequate contractor and government quality assurance participation. Mechanical
ground support equipment was shared with another program and not configuration
controlled. Government oversight was inadequate and all parties assumed that handling
of flight hardware was routine.

Corrective Actions

The NOAA-N Prime Corrective Action Plan and Implementation Report was prepared by
the NASA GSFC Polar Operational Environmental Satellites (POES) Project, which is
responsible for developing NOAA-N Prime. The document, which was submitted to, and
approved by, NASA Headquarters in accordance with NPR 8621.A, responds to the
recommendations in the NOAA-N Prime Mishap Investigation Report. The NOAA-N
Prime Corrective Action Plan and Implementation Report details the specific actions
taken by GSFC and Lockheed Martin to ensure that no similar mishap ever occurs again.
Both the contractor and the POES Project recognize that the corrective actions are not
one-time events; they must be diligently implemented on a continuous basis in order to be
effective.

The Lockheed Martin corrective actions described in the NOAA-N Prime Corrective
Action Plan and Implementation Report include: establishing a formal training program
for all I&T personnel and certifying all I&T test conductors; providing training for
supervisors in monitoring employees and correcting poor process discipline; establishing
effective process guidelines for regulating I&T; staffing product assurance and safety
personnel according to requirements; establishing an effective safety program and
promoting safety awareness to all levels of the organization; establishing an effective I&T
monitoring, trending, audit, and verification program; using video monitoring as an aid to
supervision and performance monitoring; and establishing a training program to
disseminate lessons learned from this and other mishaps and near misses.

The NASA corrective actions described in the NOAA-N Prime Corrective Action Plan and
Implementation Report include: providing a full time, dedicated government civil servant
in-plant representative; establishing clear roles and responsibilities for the government
in-plant representatives; providing sufficient resources for the Defense Contract
Management Agency (DCMA) product assurance functions; establishing oversight
guidelines for I&T operations planning, procedure development, and procedure execution
discipline; implementing oversight guidelines for I&T operations; establishing an effective
safety oversight program; implementing a thorough I&T oversight program; and NASA
conducting periodic independent reviews of the GSFC POES Project, which manages the
Lockheed Martin spacecraft contract. In addition, DCMA evaluated the effectiveness of
their oversight process and more DCMA resources have been applied to the NOAA
contract.

Lesson(s) Learned

The lessons learned in the NOAA-N Prime mishap are listed below. They are not
profound; they are all obvious. The accident occurred because a long duration program
that had experienced several years of high performance success grew complacent, both at
the government and at the contractor. The rules were relaxed, shortcuts were taken,
product assurance grew sloppy, and proper oversight was not applied. The recommended
actions are equally straightforward. Obey the rules, follow the procedures proactively,
train the team, always consider systems safety, and provide effective independent
oversight.

In order to emphasize specific points, ten lessons are identified. The lessons are not
independent, stand alone items. All are interrelated facets of the overall NOAA-N Prime
mishap lesson, which is to perform I&T and I&T oversight diligently.

1. Past success does not guarantee current performance
2. Periodically update NASA contract requirements on long duration contracts
3. Conduct periodic training in I&T roles and responsibilities
4. Effective safety program is essential
5. Handling flight hardware is never routine
6. Procedure execution discipline is essential
7. Develop clear documentation, minimize use of redlines
8. Perform advanced I&T planning
9. Provide configuration control of ground support equipment that interfaces to flight
hardware
10. An effective government in-plant office is needed

The NOAA-N Prime lessons learned are applicable to spacecraft development contractors
and to the government organizations that hold the contracts. Under most NASA prime
contracts, the spacecraft vendors are wholly responsible for the development of the
satellite. The existence of government representatives who monitor contractor
performance does not relieve the contractor of their end item responsibilities. However,
since the independent assurance value of government oversight is diminished unless it is
rigorously applied, most of the recommendations in this document address both the
contractor and the government.

Each lesson learned is described below followed by the recommendation for that lesson.

1. Past success does not guarantee current performance

Lessons Learned:

The NOAA-N Prime accident occurred after three intense and productive years on the
spacecraft contract. Work completed during those three years included two successful
launches (NOAA-L in September 2000, NOAA-M in June 2002), dynamics testing of
NOAA-N (NOAA-N is a different satellite than NOAA-N Prime), thermal vacuum testing
of NOAA-N, dynamics testing of NOAA-N Prime and thermal vacuum testing of NOAA-N
Prime. Because so much had been accomplished over an extended period, both the
government and the contractor were lulled into believing that this high performance level
was normal and would be sustained on its own. It was assumed that I&T operations were
being conducted in compliance with requirements. It was assumed that contractor and
government oversight provided the necessary checks and balances. The contractor and
the NASA project office considered the NOAA-N Prime spacecraft tilt operation scheduled
for September 6, 2003 as routine because numerous tilts had been performed. Neither the
government nor the contractor worried about I&T operations on a Saturday because I&T
had been performed six days a week for years on the program. The accident caught
everyone by surprise; no one saw it coming. The government and the contractor were
overconfident and complacent based on past successes.

The lesson learned is that successful performance in the past does not predict success in
the future, especially on long duration projects. Even the most extraordinary effort in the
past does not guarantee that future efforts will be sustained at the same high quality level.
Proactive measures must be taken to maintain a highly functional I&T environment. This
requires discipline, enforcement, and leadership that guards against complacency and
takes action against it.

Formal guidelines are needed to regulate I&T. Each spacecraft operation must be planned
in advance, even if it has been performed many times. The I&T test procedures must be
correct and clearly written. Each team member must be trained in his responsibilities.
The correct make up of the I&T crew by skill mix and the required independent verifiers
must be present for each operation. Strict discipline must be enforced on the I&T floor.
Stamping policy must be enforced; procedure steps should not be bought off unless
personally performed or witnessed. System safety must be everyone's concern.
Operations planned for weekends require the full I&T team, just as any other day of the
week. The procedures must be rigorously followed during execution. The managers
should permit no shortcuts regardless of the number of times an operation has been
performed. Even if the I&T crew is completely familiar with the operation and has
performed it successfully many times before, procedure execution discipline must be
maintained. It is essential that the government members of the I&T team follow the same
rules as the contractor or else the value of independent oversight is severely reduced.

2. Periodically update generic NASA contract requirements on long duration contracts

Lessons Learned:

Overall NASA requirements, standards, and guidelines for space flight hardware
development are incrementally improved and evolve over time to incorporate advances in
aerospace engineering and management practices. Applicable requirements are imposed
on NASA contracts at contract start in order to improve the satellite design,
manufacturing, and test process to the most current requirements, standards, and
guidelines. This lesson learned refers to these general standards and requirements, such
as EEE parts control and spacecraft orbital debris analysis, not contract unique
requirements like satellite or instrument specifications.

The spacecraft contract was 15 years old at the time of the NOAA-N Prime accident. The
contract was not initially planned to be that long, but additional satellites including
NOAA-N Prime were added after 6 years, extending the performance period. Many of the
contract requirements had not been changed since the contract started. Deliverable plans
were submitted and approved in the first year of the contract and then, for the most part,
archived and rarely referenced. For example, the contractor's System Safety
Implementation Plan was delivered in the 1980s and never updated in spite of three
corporate turnovers and a change in the location of the satellite manufacturing plant in
1998 from the East Coast to the West Coast. The contract's system safety requirements
related mainly to launch site safety, which was the main NASA safety emphasis at the time
the contract started in 1988. Since then, NASA has significantly expanded system safety
requirements to apply throughout the spacecraft development life cycle. But the NOAA-N
Prime contractor was not asked to update the System Safety Implementation Plan.

The lesson learned is that long duration contracts should be periodically (perhaps every 5
years) reviewed by the responsible government organization to ensure that the generic
NASA requirements are up to date. If they do not meet current NASA policy relevant to
the contract, the contracts should be modified to add the new requirements, or delete
items no longer needed. The new requirements must first be reviewed to determine that
they are applicable to the contract; not all new requirements should be automatically
added. Examples of new requirements are IT security, specific system engineering
analysis tools, risk management requirements, new releases of applicable documents, and
independent review requirements. Implementing new requirements will add cost to the
contract, so budget needs to be allocated for this purpose.

Plans that were submitted early after contract start should be reviewed periodically by the
contractor and the government to determine whether they need updating. These plans
describe how a function, such as configuration management or software management,
will be performed during contract execution. Plans may need updating because
implementations can change over time due to things such as company internal
requirements changes, contract requirements changes, organizational changes, new
technologies applied, and corporate mergers. The review and update process will add cost
to the contract, so budget needs to be allocated for this purpose.

Major aerospace contractors have their own corporate requirements, standards, and
policies that are imposed on all the work performed by the company. The companies
should, on their own initiative, review their contracted work periodically to ensure that
their corporate requirements are being fulfilled.

3. Conduct periodic training in I&T roles and responsibilities

Lessons Learned:

Experience in I&T does not eliminate the need for training, as the NOAA-N Prime mishap
demonstrated. The I&T team on the floor at the time of the accident consisted
entirely of experienced workers on the NOAA satellites. In spite of their experience, they
did not correctly perform their jobs on the day of the accident. The instrument shimming
operation planned for the day of the mishap had been hurriedly put together; the test
procedure was scissored out of existing procedures without a clear top level road map.
The Responsible Test Engineer (RTE) did not have the full I&T team present as required
by the test procedure on the day of the accident, but he proceeded anyway. System safety
engineering was not called to participate in the operation as required by the procedure for
a hazardous operation, so no safety engineering representative was present when the tilt
began. The government quality representative was called and he gave verbal permission
to start without him. He was not present when the tilt began. The test procedure
statement about assuring the configuration of the turn over cart was ambiguous. The RTE
did not physically examine the configuration of the turn over cart, because he had used it
a few weeks earlier on NOAA-N, and relied on paper logs to verify the configuration.
(Unknown to the RTE, bolts securing the adapter plate had been removed from the cart
since he had last used it.) A member of the I&T team questioned the cart configuration
but the RTE, without examining the cart, told him that the cart was configured correctly.

The lesson learned is self evident: experience in I&T is not enough; the I&T team must
perform their work correctly and they need to know what is expected of them. Training is
needed to instruct the I&T team how to perform their work properly. Each member of the
team, including the government and company quality witnesses, must understand their
job and their roles around high value flight hardware. Lead test conductors must be
certified to have the appropriate qualifying experience as well as the supervisory
discipline and training needed to assume responsibility for the flight hardware. The
Responsible Test Engineer must understand that procedures must be followed; the proper
I&T crew, product assurance, safety, and government monitors must be present; and
hazardous operations need special care. The product assurance representatives must
understand their roles as witnesses and they must be trained to not buy off procedure
steps unless they have personally witnessed them. Everyone needs to be concerned about
the safety of people and high value flight hardware. If any member of the team has a
question or a doubt, it should be investigated, not dismissed. Each member of the team
should be empowered to halt an operation. Organizational practices must be established
to reinforce the role and responsibility of contractor and government inspectors as
independent verification agents.

Long duration contracts and/or contracts for multiple satellites need extra attention.
While there are advantages to being familiar with the spacecraft and the I&T process,
training must also address how to keep the workers mentally engaged given the repetitive
nature of some I&T work. Complacency must be avoided. Each operation must be
carefully planned and executed, regardless of how many times it was run before.

Training cannot be accomplished solely in a classroom setting. It should include various venues
such as discussion groups, lectures, on line training, use of hardware and software
simulators, documentation, mentoring, and on the job training by watching experienced
personnel. Training should also include stories of how things went wrong on other
projects with the lesson being: it could happen to us.

Training is not a one-time activity. It must be a continual proactive process, a constant
admonition to work correctly. The I&T team must be repeatedly reminded of their
responsibilities and everyone needs to take it seriously. Program training and need dates
for refresher training should be tracked. Certifications must be regularly renewed.
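
As an illustration only, tracking refresher need dates and certification renewals could be sketched in Python as follows. This is a hypothetical sketch, not a tool described in the report: the function name `certifications_due`, the 30-day warning window, and the sample names are all assumptions.

```python
from datetime import date, timedelta


def certifications_due(certs, today, warn_days=30):
    """Split personnel into those whose certification has expired and
    those whose certification expires within `warn_days` of `today`.

    `certs` maps a person's name to a certification expiry date.
    The 30-day default window is an assumed value, not a requirement.
    """
    horizon = today + timedelta(days=warn_days)
    expired = sorted(p for p, exp in certs.items() if exp < today)
    due_soon = sorted(p for p, exp in certs.items() if today <= exp <= horizon)
    return expired, due_soon


# Hypothetical example data.
sample = {
    "conductor_a": date(2024, 1, 1),
    "technician_b": date(2024, 6, 15),
    "inspector_c": date(2025, 1, 1),
}
expired, due_soon = certifications_due(sample, today=date(2024, 6, 1))
```

Anyone in the expired list would be held off certified roles until renewal; anyone in the due-soon list would be scheduled for refresher training.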

Contractor product assurance personnel need additional training in quality assurance and
safety. The requirement for their independent verification function must be repeatedly
emphasized. Stamping policy must be enforced. Quality should not buy off procedure
steps without witnessing them. Similarly, government quality personnel need training in
their oversight responsibilities.

Supervisors should receive additional training in the management of I&T personnel.
Supervisors must take an active role in monitoring the performance of their employees. If
any employees are not meeting the requirements, the supervisors must arrange for
further training or other appropriate corrective actions. Poor performance cannot be
tolerated. Supervisors must always follow the requirements in their own work and be a
role model for their employees.

The training and certification program must include emphasis on team roles and
responsibilities. Good communication among all team members is essential while
planning and executing I&T operations. The government and the contractor should work
together to define roles of each member of the I&T team, including the government team.
Contractor roles include the test conductor, technicians, engineering, systems safety, and
quality engineering. If NASA has delegated specific product assurance functions to the
Defense Contract Management Agency and NASA has also retained some independent
quality oversight responsibilities, the respective functions must be clarified to the
contractor so that the right people can be notified prior to the I&T operation. The in-plant
government I&T team members should participate in the same training as the
contractors, as well as additional training defined by DCMA and NASA.

The training program should be carefully developed to focus on the problem areas. Too
much training may be counterproductive. When training becomes overkill, people may
actually pay less attention. Training must be appropriate in both quantity and content in
order to be effective.

4. Effective safety program is essential

Lessons Learned:

System safety did not receive high priority at the contractor or the responsible
government organization prior to the NOAA-N Prime accident. System safety was not
fully integrated into the I&T team. System safety representatives were not contacted by
the I&T team on the day of the accident to witness the hazardous NOAA-N Prime tilt
operation as required by the procedure. Hence, there was no safety expert present on the
I&T floor at the time of the accident. Government safety was mainly concerned with
launch site safety and not with the day-to-day operations at Sunnyvale. Before the
accident, there was insufficient safety support on the program both at the contractor's
site and in the responsible government organization.

Safety of personnel and flight hardware is a serious matter and should be priority one.
Safety support must not be tailored to resources available. An effective system safety
program is needed at both the government and the contractor to protect people from
injury and to keep high value flight hardware secure. Every member of the contractor and
the government team needs to understand that safety is everyone's responsibility. The
contractor's safety program should meet current safety requirements. Safety related
training should be provided to all contractor and government team members. Satellite
development contractors need to have system safety involved in every step of the satellite
life cycle. Enough safety engineers must be assigned to the project to perform the
necessary work; safety engineering should not be a part time function drawn from a
centralized corporate safety engineering pool. Safety engineers should participate in
satellite design, manufacturing, and I&T. Safety engineers should be involved with the
design of flight hardware and ground support equipment to make them as safe as
possible, participate in materials and parts selection, evaluate manufacturing techniques
for compliance with safety requirements, perform hazard analysis, determine which I&T
operations are safety critical and how to best execute them, perform operational hazard
analysis, witness hazardous I&T operations to ensure that all safety precautions have been
taken, document launch site safety issues, and keep up with aerospace safety engineering
practices. I&T operations must not proceed without safety representatives present when
their presence is required by procedure. The contractor must rigorously adhere to safety
requirements.

The government must have a strong system safety program with sufficient safety
engineers to oversee the contractor's safety implementation. The government's system
safety program must implement up to date NASA safety policy and impose it on the
contractor. The government's safety program must work closely with the contractor to
monitor their performance and review safety related deliverable documentation. The
government's safety engineers must provide safety training to the government staff.
Government safety should perform safety audits periodically to evaluate the contractor's
critical processes and procedures.

5. Handling flight hardware is never routine

Lessons Learned:

Special care must always be taken when handling high value flight hardware. Regardless
of how many times a particular I&T operation has been performed, focused attention is
required when a critical or hazardous operation is performed. The NOAA-N Prime
accident occurred when a hazardous operation was treated as routine, a short handed I&T
crew was working, and the procedures were not followed. The Responsible Test Engineer
believed that the turn over cart was properly configured because he had used it a few weeks
earlier on NOAA-N, and he confirmed this only by checking the paperwork. He did not actually
examine the cart as required by procedure. Had he compared the cart against the
drawing, he should have noticed that bolts were missing from the adapter plate. The bolts are
large and clearly visible when properly installed. Even when one of the technicians
questioned the cart's configuration, the RTE did not look at the cart. In this case,
handling the flight hardware and using the turn over cart was considered so ordinary by
the RTE that he did not bother to physically look at the ground equipment on which he
planned to rotate a fully assembled satellite.

Rules should be developed by the government and the contractor on how to conduct I&T.
Each I&T operation should be fully planned in advance with system safety as the
paramount concern. I&T procedures must be developed to control and guide the proper
handling of flight hardware and these must be followed during I&T operations. I&T
procedures must be clearly written; those that contain hazardous operations must be
clearly marked on the document cover and within the pertinent procedural steps. Flight
hardware must be treated with respect; each operation should be executed with care as
though it were the first time it was being performed. Periodic training is needed to keep the
I&T personnel mentally alert for repetitive operations on long duration contracts. The I&T
team should not feel hampered or overly constrained, but they should always be cognizant
of the value of the flight hardware and the need to work with it cautiously. Procedure
execution discipline must be enforced so that the proper I&T crew is assembled for each
operation, procedures are followed, and each step is independently witnessed. Product
assurance personnel must function effectively as
independent verifiers. No assumptions should be made about the state of the spacecraft
and ground support equipment (within reason, ground support equipment does not need
to be torn down and built up after a shift change). Every team member should feel
empowered to voice concerns and not have them ignored. There must be open
communication among the team members.

6. Procedure execution discipline is essential

Lessons Learned:

Procedure execution discipline must be sustained by the I&T team. The NOAA-N Prime
accident occurred in part because the procedures were not followed. An incomplete I&T
crew was assembled to run the operation. System safety was not notified to send a
representative as required by the procedure. The government quality representative was
called when the operation started, but he told the Responsible Test Engineer to start
without him and he would get there later. He should have been present in the clean room
at the start of the hazardous spacecraft tilt operation, but he did not arrive until after the
accident occurred. The Responsible Test Engineer started the operation with these people
absent, in violation of procedure requirements. Quality representatives affixed their
stamps to attest that they witnessed certain procedure steps before the accident, when in
fact they had not. Quality assurance failed to function as an independent authority
carefully double checking all work. One member of the I&T team actually questioned the
configuration of the turn over cart, but the RTE did not go to look at it; instead he told the
technician he was wrong. These lapses of procedure execution discipline and the failure to
acknowledge the opinion of a member of the I&T crew resulted in the terrible NOAA-N
Prime accident.

Spacecraft I&T is difficult work with little margin for error, often involving flight
hardware worth hundreds of millions of dollars. I&T should be performed according to
established policies and configuration controlled procedures. The workers and their
management must be sincere about following these rules. Procedures should not be run
without the full I&T team required for the operation being present. A pre-operations meeting
should be held so that all personnel understand the operation to be performed. Operation
sign-off must occur with personal cognizance of the participants or independent
validators. Only those procedure steps actually performed and/or witnessed should be
stamped off. Procedure instructions must be clearly communicated and understood by all
personnel involved in the operation. I&T test procedures should be clearly written. Vague
words like “assure” and “verify” should not be used; they should be replaced with specific
actions having measurable results where possible. I&T team members should be able to
ask questions at any time and every I&T team member should know that he is authorized
to stop an operation if he feels there is a problem. Government presence required by a
procedure can only be waived by approval from the Project Office's System Assurance
Manager. Waiving mandatory government I&T presence should be rarely, if ever,
permitted.
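
One way to picture replacing a vague "verify" step with specific actions that have measurable results is a pre-operation go/no-go gate. The Python sketch below is hypothetical and is not taken from the NOAA-N Prime procedures: the role names, the `ReadinessCheck` type, and the `go_no_go` function are assumptions. Only the count of 24 adapter bolts and the need for safety and quality representatives to be present come from this document.

```python
from dataclasses import dataclass, field

# From the mishap description: the adapter ring required 24 bolts.
REQUIRED_ADAPTER_BOLTS = 24

# Assumed set of roles a hazardous-operation procedure might require.
REQUIRED_ROLES = frozenset({
    "test_conductor",
    "technician",
    "system_safety",
    "contractor_quality",
    "government_quality",
})


@dataclass
class ReadinessCheck:
    roles_present: set = field(default_factory=set)
    bolts_counted: int = 0                # counted on the cart itself
    counted_by_inspection: bool = False   # False means taken from paper logs


def go_no_go(check):
    """Return a list of blocking findings; an empty list means 'go'."""
    findings = []
    missing = REQUIRED_ROLES - check.roles_present
    if missing:
        findings.append("missing roles: " + ", ".join(sorted(missing)))
    if not check.counted_by_inspection:
        findings.append("bolt count not confirmed by physical inspection")
    if check.bolts_counted != REQUIRED_ADAPTER_BOLTS:
        findings.append(
            f"adapter bolts installed: {check.bolts_counted} "
            f"of {REQUIRED_ADAPTER_BOLTS}"
        )
    return findings


# Hypothetical short-handed crew relying on paperwork.
short_handed = ReadinessCheck(roles_present={"test_conductor", "technician"})
blocking = go_no_go(short_handed)
```

Each finding names a measurable condition (who is present, how many bolts were counted, how they were counted), so a non-empty result gives the team a concrete reason to hold the operation.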

The role of the government must be clear. Which government I&T responsibilities are
delegated by NASA to DCMA should be known to everyone in the contractor's and
government's organizations. NASA responsibilities on the I&T floor should be defined so
that the contractor understands each type of oversight. The contractor must track,
examine, categorize and trend the nature and closure of the contractor-identified
non-conformance reports or the DCMA-generated Corrective Action Requests. The
government must also monitor and track the closure of actions, deficiencies, and
recommendations resulting from outside audits and reviews. All results should be shared
with the contractor. Corrective measures to improve I&T should be jointly developed with
the contractor and jointly monitored for effectiveness.

There must be consequences if procedure execution discipline is not followed. I&T
employees need to be told by their supervisors what is expected of them. Poor performers
should not have access to high value flight hardware. The supervisors must lead by
example and follow all the rules themselves.

7. Develop clear documentation, minimize use of redlines

Lessons Learned:

The operation being run on the day of the NOAA-N Prime mishap was a unique, one-time
activity. It consisted of removing an instrument, installing a shim, and reinstalling the
instrument. Individual segments of the operation, such as the instrument installation,
were documented in released and previously run I&T test procedures. But, instead of
developing a new procedure for the shimming operation, portions of existing procedures
were cobbled together with red-lines. This made it difficult to follow the flow as it wove in
and out of various procedures. In addition, portions of the existing procedures were
red-lined. All of this out-of-sequence activity may have confused the I&T crew and contributed
to the mishap.

Red-lines are useful tools that enable the continuation of an I&T operation when minor
unexpected problems arise. But the use of red-lines must be minimized and carefully
controlled. Red-lines must be clearly understood and appropriate to the operation before
they are executed. If a procedure is red-lined during an I&T operation because of an
unforeseen circumstance, there must be concurrence by the independent product
assurance personnel and government quality witnesses. If any member of the I&T team
questions the red-line, then the operation should be stopped and the issue resolved with
the cognizant engineering personnel.

Red-lines must be documented as either one-time or permanent for future procedure use.
It is up to the contractor's product assurance organization and government oversight to
ensure that red-lines are not used on the floor repeatedly. If a red-line is determined to be
a permanent change, then the procedure should be formally updated through the
established configuration control review and approval process.

The contractor and the government should audit the use of red-lines to ensure that they
are used appropriately. If excessive use of red-lines is found, corrective measures should
be taken by the contractor.
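
The red-line rules above (classify each red-line as one-time or permanent, require product assurance concurrence, and formalize anything used repeatedly) can be illustrated with a minimal sketch. It is not taken from the contractor's or NASA's actual tooling; the `RedLine` record, its field names, and the `audit_red_lines` function are hypothetical and exist only to show how such an audit might be expressed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RedLine:
    """Hypothetical record of one red-line entry; field names are assumptions."""
    procedure_id: str     # identifier of the red-lined procedure
    step: str             # procedure step the red-line modifies
    kind: str             # "one-time" or "permanent"
    pa_concurrence: bool  # product assurance concurred before execution


def audit_red_lines(red_lines):
    """Return (procedure_id, step, finding) tuples that need follow-up."""
    findings = []
    # Count how often the same step of the same procedure was red-lined.
    uses = Counter((r.procedure_id, r.step) for r in red_lines)
    flagged = set()
    for r in red_lines:
        key = (r.procedure_id, r.step)
        if not r.pa_concurrence:
            findings.append(
                (r.procedure_id, r.step, "missing product assurance concurrence")
            )
        # Permanent or repeated red-lines belong in a formal revision.
        if key not in flagged and (r.kind == "permanent" or uses[key] > 1):
            findings.append(
                (r.procedure_id, r.step, "incorporate into a formal procedure revision")
            )
            flagged.add(key)
    return findings
```

In this sketch a red-line that appears twice against the same step is reported once for formal incorporation, and any red-line executed without concurrence is reported separately, which mirrors the two audit concerns described in this section.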

8. Perform advanced I&T planning

Lessons Learned:

Sufficient time must be allocated to prepare for each I&T operation according to approved
configuration management requirements. The NOAA-N Prime instrument shimming
operation was inserted into the I&T schedule on a Thursday with the intent to execute it
two days later on Saturday. There was a rush to develop the authorizing paperwork. The
preparation of the I&T procedure for an operation that had never been performed in this
sequence was hurried. It was difficult to assemble an I&T team on short notice; most
technicians who were approached on Friday about working on Saturday declined.

No I&T operation has such urgency that planning can be bypassed. All new I&T
operations must be carefully planned and communicated to all participants. It should not
be a race against time to prepare the paperwork and search out an I&T team. The focus of
a new I&T operation should be on accurately preparing it. System safety considerations
must be made. All configuration management steps must be followed to develop new
procedures. A full review should be made of the procedure by all appropriate groups
(system engineering, I&T, quality, safety, government, etc.) prior to the release of a new
procedure.

There are times when a complete procedure cannot be developed in advance, such as
while troubleshooting an intermittent problem where the follow on steps vary based on
the test results. Even in these cases, there should be time to plan the basic approach and
assemble the correct I&T team.

The government should be aware of the amount of time it takes to prepare I&T
documentation and should not allow the contractor to work with the flight hardware
without the required paperwork. The contractor's product assurance organization should
not allow an operation to proceed if it has not fully matured. The NASA in-plant
representatives or DCMA should stop any operation that does not appear to be properly
planned.

9. Provide configuration control of ground support equipment that interfaces to flight hardware

Lessons Learned:

Two different spacecraft programs were housed in the same high bay, clean room complex
when the NOAA-N Prime accident occurred. Each had multiple satellites in production.
The programs used similar mechanical ground equipment, including two functionally
identical satellite turn over carts that could be used by either program when configured
correctly. This shared equipment was not maintained under configuration control. The
prevailing philosophy at the time of the mishap was that the configuration of the
mechanical ground equipment should be checked prior to use in an I&T operation.

The accident occurred because the procedure to ensure that the turn over cart was in the
proper configuration was not followed. The root cause of the accident was that the bolts
attaching the adapter plate to the turn over cart were missing, so the satellite fell off the
cart as it was being tilted from vertical to horizontal. Had the turn over cart been under
configuration control, the bolts would not have been removed without the proper
authority and coordination with the I&T team that was using the turn over cart.

All ground equipment, mechanical and electrical, that interfaces to flight hardware should
be under configuration control. Electrical ground equipment, especially software systems,
must be configured items maintained in a known, reproducible state. Data base values,
limits, coefficients, flight software load images, ground software, calibration status, etc.
must all be controlled so that spacecraft test results can be evaluated and compared from
test to test. Similarly, mechanical ground equipment must be controlled in a known state.
Proof testing data, drawings and schematics, failure modes effects analysis, operational
hazard analysis, etc. must be available so that the equipment can be used with confidence.
Configuration control should also account for the whereabouts of mechanical equipment
so that it can be easily located when needed in a shared use environment. Note that
configuration management of mechanical ground equipment does not relieve the I&T
crew of the responsibility of verifying its configuration before use.
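
The pre-use verification described above can be sketched as a comparison between a controlled baseline and the state observed on the floor. This is only an illustration under stated assumptions: the item names (`adapter_plate_bolts`, `proof_test`) and both function names are hypothetical and do not come from any real ground support equipment database.

```python
def missing_items(required, observed):
    """Return baseline items that are absent from, or differ on, the equipment.

    Both arguments are plain dicts mapping a hypothetical configuration item
    name to its state.
    """
    return {
        name: expected
        for name, expected in required.items()
        if observed.get(name) != expected
    }


def cleared_for_use(required, observed):
    """Equipment is cleared only when every baseline item matches."""
    return not missing_items(required, observed)
```

For example, with a baseline of `{"adapter_plate_bolts": "installed", "proof_test": "current"}` and an observed state in which `adapter_plate_bolts` is `"missing"`, `cleared_for_use` returns `False` and `missing_items` names the bolts. The baseline is what configuration control maintains; the observed state is what the I&T crew still has to verify before each use.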

10. An effective government in-plant office is needed

Lessons Learned:

This section applies to a government in-plant resident office in the I&T phase of
spacecraft development. The Defense Contract Management Agency (DCMA) usually has
an office in the plant of major aerospace contractors. The NASA project also typically
maintains a small in-plant group of engineers who have product assurance and
engineering functions. The NASA project generally delegates specific spacecraft I&T
product assurance functions to DCMA.

The roles of the in-plant office and DCMA must be clearly established and communicated
to the entire I&T team. The contractor typically receives a copy of NASA's Letter of
Delegation of responsibilities to DCMA, but the contents may not always be provided to
the I&T team. It is confusing to the contractor when it is unclear which government
representative is responsible for what. The in-plant team members also need to know
their responsibilities in written form. Some may be assigned to product assurance
functions; others may serve as an engineering liaison to the NASA project office.

The in-plant government representatives must be vigilant in overseeing the contractor.
The government's in-plant office must be staffed adequately to do the job with the
needed number of people and skill mix. The government office needs to be actively
involved in the contractor's I&T process by reviewing test procedures, monitoring the
required I&T operations, and following the schedule. In addition, the in-plant staff should
audit the contractor's I&T performance and check for trends. The government's in-plant
staff should participate in the same roles and responsibilities training and I&T training
as the contractor.

The government is subject to the same procedure execution discipline as the contractor;
otherwise the benefits of independent verification are lost. When NASA requires its own
personnel to witness operations, the government representatives, whether they are from
DCMA or the NASA Project's own resident office, should be on the I&T floor when the
operation starts. The government I&T monitors should not delay the contractor's work
by being late or absent. In the NOAA-N Prime mishap, the government representative
verbally authorized the contractor to start without him and had not yet arrived when the
accident occurred. The contractor should not be permitted to waive mandatory
government presence on the I&T floor; only the NASA System Assurance Manager has
that authority, and it should be rarely invoked. Government witnessing of hazardous
spacecraft operations should be mandatory.

The NASA project's in-plant team should be managed by a senior engineer, preferably a
civil servant. A civil servant in-plant representative, serving as the lead for the in-plant
staff and representing the project office to the contractor, strengthens the authority of the
office and provides more effective supervision of the in-plant staff. A civil servant carries
more stature with the spacecraft contractor than a support services contractor as the
government's in-plant representative.

Recommendation(s)

The following recommendations address the above lessons learned respectively.

Recommendation #1:

Establish a disciplined approach to spacecraft I&T with appropriate checks and balances
that does not depend on past success. Do not believe that because I&T has recently
proceeded smoothly it will always continue to do so. The leadership must strongly
and continuously fight against complacency. Proactive action is needed to maintain a high
performance I&T environment. Procedure execution discipline must be constantly
enforced. The I&T team should be continuously trained in their roles and responsibilities.
Everyone needs to be accountable for their actions, including the supervisors. Everyone
should be empowered to halt operations if a problem is observed. Test procedures must
be well written to avoid ambiguity. I&T operations cannot be executed without the correct
I&T team present. Accurate I&T records must be kept and procedure steps should not be
stamped unless personally performed or witnessed. Government oversight must be
performed with the same rigor and discipline as the contractor.

Recommendation #2:

Long duration contracts should be reviewed periodically (for example, every 5 years) to
determine whether any applicable contract requirements need updating because NASA
requirements have changed. Deliverable plans older than 5 years should be reviewed
every 5 years to determine whether they should be updated. Changed requirements and
updated contractor plans will increase contract costs. Contractors should ensure that their
own corporate standards and policies are being implemented.
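
As a small worked illustration of the periodic review rule above, the check below flags a plan or contract requirement whose last review is at least five years old. The five-year figure is the example interval given in this recommendation; the function name and the dates passed to it are hypothetical.

```python
from datetime import date

REVIEW_INTERVAL_YEARS = 5  # example interval named in the recommendation


def review_due(last_reviewed, today, interval_years=REVIEW_INTERVAL_YEARS):
    """Return True when at least interval_years have passed since last review."""
    target_year = last_reviewed.year + interval_years
    try:
        due = last_reviewed.replace(year=target_year)
    except ValueError:
        # Last review fell on 29 February and the target year is not a leap
        # year, so treat 28 February as the due date.
        due = last_reviewed.replace(year=target_year, day=28)
    return today >= due
```

A plan last reviewed on 1 March 1998 would be flagged on 6 September 2003, while one reviewed on 1 January 2000 would not be. Both dates for the plans are made up for the example.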

Recommendation #3:

Provide a formal training program to certify all test conductors and to identify roles and
responsibilities to all I&T personnel. This needs to be a continual effort to train new
employees as well as to refresh existing employees. In addition, supervisory training is
essential to promote their role in identifying, monitoring and correcting poor process
discipline and ancillary deficiencies. Training should emphasize alertness for repetitive
I&T operations and long duration contracts. The government and the contractor should
work together to establish the I&T roles, which should include government oversight
responsibilities. Government I&T personnel should participate in the I&T training at the
contractor's facility. Training must be applied carefully in both quantity and content in
order to be effective.

Recommendation #4:

Establish an effective safety program with a well-defined safety policy and mandatory
requirements enforced by contractor and government personnel. The government's
safety program should define the safety requirements based on up to date NASA safety
policy. The government's safety team should work closely with the contractor's safety
team to assure that all requirements are met. The contractor's safety program must be
current. The contractor's safety engineering staff must have the proper number of people
and skill mix to perform all the work. Safety awareness must be promoted at all levels of
the program through training and operations participation.

Recommendation #5:

The satellite development contractor and the government should establish effective
process guidelines for regulating the I&T environment including configuration
management, operations planning, procedure development, redlining, open
communication, and execution discipline. The contractor and government I&T personnel
must be trained in the guidelines. The guidelines must be enforced on a continuous basis
to remind the team of the dangers of complacency and the uncertainties of a dynamic I&T
environment. Appreciation for flight hardware requires a full commitment by all to
develop an I&T environment where effective guidelines and safeguards serve to
discourage tendencies toward compromising process discipline.

Recommendation #6:

Adhere to strict guidelines for product assurance personnel support for all operations
according to established program requirements. Rules for procedure monitoring and
stamping, redlining, and waiver generation must be strictly followed. Mandatory
government inspection points shall not be waived. NASA in-plant oversight and quality
assurance delegation to DCMA must be clearly documented. NASA and DCMA must
routinely evaluate the effectiveness of their assessment processes and formulate
corrective measures as needed.

Recommendation #7:

All I&T operations should be planned ahead. Redlines should only be permitted one time
for minor adjustments to the approved procedures and only with the concurrence of
product assurance. If a redline is needed more than once, then it should be formally
incorporated into a new release of the procedure after appropriate review. Redline usage
should be audited by the contractor and the government.

Recommendation #8:

Enough time must be allocated to properly develop all paperwork needed for a new I&T
operation and to assemble the I&T team to execute it. Rushing through the appropriate
steps or taking short cuts may result in an inadequate product or, even worse, in an
accident. The I&T team should be composed of willing workers assembled ahead of time.

Recommendation #9:

All ground support equipment that interfaces to high value flight hardware should be
under configuration control. This recommendation applies to both electrical and
mechanical ground support equipment. Configuration control is particularly important
when multiple projects share I&T facilities and ground support equipment. It is always
incumbent on the user of the ground support equipment to verify that it is in the correct
configuration before it is applied to high value flight hardware.

Recommendation #10:

The government in-plant office should be adequately staffed to perform the work
required. The NASA in-plant role should be distinguished from the DCMA role so that
both the contractor and the government clearly understand their functions. The in-plant
representatives must adhere to the same procedure execution discipline as the contractor.
The in-plant quality representatives should not delay the contractor by being late or
absent when required on the I&T floor. A full time civil servant manager of the
government's in-plant staff provides an indication of government commitment to the
contractor.

Evidence of Recurrence Control Effectiveness

The corrective actions taken in response to the NOAA-N Prime accident by Lockheed
Martin and the NASA GSFC POES Project have been effective to date. Work is proceeding
well on the launch of NOAA-N and the rebuilding of NOAA-N Prime. Independent teams
have reviewed the POES Project and Lockheed Martin and have found both organizations
to be correcting the deficiencies that led to the accident.

It is understood by both the contractor and the government that the corrective actions
must be proactively applied for the remainder of the contract.

Program Relation

N/A

Program/Project Phase
None

Mission Directorate(s)
Science

Topic(s)

None
