
FAILURE RATE PATTERNS, BOEING 747,

MSG APPROACH

PROFESSIONAL PRACTICES | March 1, 2017


Table of Contents
FAILURE RATE PATTERNS

Question 01

What are the failure rate patterns? Give at least one example of any of the failure rates

BATHTUB CURVE

EXAMPLE 01: CAPACITOR LIFETIME EXPECTANCY

EXAMPLE 02: DISK FAILURES IN THE REAL WORLD

BOEING 747

QUESTION 02

Briefly describe the following in respect of Boeing 747

DEVELOPMENT

DESIGN

VARIANTS (BOEING 747 FAMILY)

MAINTENANCE SCHEDULE AND SALIENT CHECKLIST ITEMS

ACCIDENTS

MAINTENANCE STEERING GROUP (MSG) APPROACH

QUESTION 03

Discuss the maintenance steering group approach

MSG-1 / MSG-2

MSG-3

FAILURE RATE PATTERNS
QUESTION 01
WHAT ARE THE FAILURE RATE PATTERNS? GIVE AT LEAST ONE
EXAMPLE OF ANY OF THE FAILURE RATES.
A United Airlines report identified six distinct equipment failure patterns (shown in Figure 1
below). Understanding these patterns shows why a reduction in scheduled maintenance can
result in improved performance.

Figure 1. Six failure patterns

1. Failure Pattern A is known as the bathtub curve and has a high probability of failure 
when the equipment is new, followed by a low level of random failures, and followed by
a sharp increase in failures at the end of its life. This pattern accounts for approximately
4% of failures.
2. Failure Pattern B is known as the wear out curve and consists of a low level of random
failures, followed by a sharp increase in failures at the end of its life. This pattern
accounts for approximately 2% of failures.

3. Failure Pattern C is known as the fatigue curve and is characterized by a gradually
increasing level of failures over the course of the equipment’s life. This pattern accounts
for approximately 5% of failures.
4. Failure Pattern D is known as the initial break-in curve and starts off with a very low
level of failure followed by a sharp rise to a constant level. This pattern accounts for
approximately 7% of failures.
5. Failure Pattern E is known as the random pattern and is a consistent level of random
failures over the life of the equipment, with no pronounced increases or decreases related
to the age of the equipment. This pattern accounts for approximately 14% of failures.
6. Failure Pattern F is known as the infant mortality curve and shows a high initial failure
rate followed by a random level of failures. This pattern accounts for approximately 68% of
failures.

When looking at the failure patterns, the first three can be grouped together as equipment
with a defined life, in which the failure rate increases once the equipment has reached a certain
age. This age may be measured in time or in usage, such as hours or widgets produced. The
failures are usually related to wear, erosion or corrosion and often occur in simple components
which come into contact with the product. In total, these age-related failures account for only
11% of all failures.

The other three patterns (D, E and F) have no wear-out zone. The largest of them, pattern F,
shows that most failures occur during the initial start-up of the equipment. This could be due
to maintenance-induced failures or to manufacturing defects in the components. Once the
initial start-up period has passed, the failures are random. Together these patterns account
for 89% of failures.

These patterns show that the failures are random in nature, but that does not mean that the
failures cannot be predicted or mitigated. It means that overhauls and replacements conducted
at a fixed frequency are effective in only 11% of cases.

For the rest of the failures, the equipment can be monitored and the right time to conduct a
replacement or overhaul identified from the condition of the equipment. This is known
as Condition Based Maintenance, or Predictive Maintenance.
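
The grouping above is simple arithmetic, sketched here in Python. The shares for patterns A to D and F are the figures quoted in the list; the 14% share for pattern E is an assumption, chosen so that the six shares total 100%. The variable names are illustrative only.

```python
# Share of failures per pattern, in percent. A-D and F are quoted in the text;
# E is ASSUMED to be 14 so that the six shares add up to 100.
PATTERN_SHARE = {"A": 4, "B": 2, "C": 5, "D": 7, "E": 14, "F": 68}

AGE_RELATED = ("A", "B", "C")       # failure rate rises with age or usage
NOT_AGE_RELATED = ("D", "E", "F")   # no wear-out zone


def group_share(patterns):
    """Sum the percentage shares of the given patterns."""
    return sum(PATTERN_SHARE[p] for p in patterns)


age_related = group_share(AGE_RELATED)
random_like = group_share(NOT_AGE_RELATED)
print(f"age-related: {age_related}%, not age-related: {random_like}%")
```

On these figures, a fixed-interval overhaul policy can only address the first group, which is the argument for condition-based maintenance made above.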

BATHTUB CURVE
The bathtub curve, displayed in Figure 2 below, does not depict the failure rate of a single item, but
describes the relative failure rate of an entire population of products over time. Some individual
units will fail relatively early (infant mortality failures), others will last until wear-out, and some
will fail during the relatively long period typically called normal life.

Infant mortality is the time over which the failure rate of a product is decreasing, and may last
for years. Wear-out will not always happen long after the expected product life. It is a period
when the failure rate is increasing, and has been observed in products after just a few months of
use.

Failures during infant mortality are highly undesirable and are always caused by defects and
blunders: material defects, design blunders, errors in assembly, etc. Normal life failures are
normally considered to be random cases of "stress exceeding strength." However, many failures
that are considered normal life failures are actually infant mortality failures. Wear-out is a
fact of life due to fatigue or depletion of materials (such as lubrication depletion in bearings). A
product's useful life is limited by its shortest-lived component. A product manufacturer must
assure that all specified materials are adequate to function through the intended product life.

Figure 2. Bathtub curve
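
One common way to reproduce the shape of a bathtub curve is to add three Weibull failure rates: one decreasing (infant mortality), one constant (normal life) and one increasing (wear-out). The sketch below takes that approach; the shape and scale values are hypothetical, picked only to make the three regions visible, and do not describe any real product.

```python
def weibull_hazard(t, shape, scale):
    """Weibull failure rate h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Population failure rate as the sum of three competing failure causes.

    All parameter values are hypothetical and for illustration only.
    """
    infant = weibull_hazard(t, shape=0.5, scale=20_000.0)    # decreasing rate
    normal = weibull_hazard(t, shape=1.0, scale=50_000.0)    # constant rate
    wearout = weibull_hazard(t, shape=5.0, scale=30_000.0)   # increasing rate
    return infant + normal + wearout


for hours in (10, 100, 1_000, 10_000, 30_000, 40_000):
    print(f"{hours:>6} h  {bathtub_hazard(hours):.2e} failures/h")
```

The printed rate starts high, falls to a floor during normal life and rises again as the wear-out term takes over.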

EXAMPLE 01: CAPACITOR LIFETIME EXPECTANCY
The lifetime of a capacitor is the time to failure, where failure is defined as the inability of
a component to fulfil its specified function. The failure modes are classified into two main
categories, 'early failures' and 'wear out failures', which are reflected in the curve known as the
'bathtub' curve (shown in Figure 3). At the beginning of the component's existence, in its
'infancy', the failure rate decreases rapidly. These 'youth' failures are normally screened out by
routine tests performed by the manufacturer. They are due to design and process weaknesses
which have not been detected by the design and process failure modes and effects analysis
(FMEA) performed during development. They are more probably due to production process
variations or to changes in material quality. The process variations are due to tool wear,
operator changes, and lack of training. This early failure mode is not taken into account by the
Weibull model theory. In normal operation this failure process should not be observed in the
field. If it occurs, the capacitors are normally covered by the manufacturer's product warranty.

Figure 3. Bathtub curve of the failure rate function showing the infancy or early failures
occurring at the beginning of the component life (in red) and the wear out curve (in blue), which
is defined by a Weibull law with 2 parameters: the power factor p = 1.8 and the inverse of the
time necessary for 63% of the sample to fail, λ0 = 1/1,500,000 h⁻¹.

Once the 'early failures' regime is past, the failure rate starts to follow a statistical
prediction law which depends on several parameters that may be defined experimentally as a
function of the voltage, the temperature, and the environmental humidity. It has been shown
that a Weibull statistic can provide a good prediction of the capacitor lifetime expectancy.

Weibull failure rate is given by

\[ \lambda(t) = \lambda_0 \, p \, (\lambda_0 t)^{p-1} \]
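
As a rough illustration, the Weibull failure rate can be evaluated with the two parameters quoted in the Figure 3 caption (p = 1.8 and λ0 = 1/1,500,000 per hour). The reliability function below is not stated in the text; it is the standard result of integrating this failure rate, R(t) = exp(-(λ0·t)^p), and the function names are illustrative.

```python
import math

P = 1.8                     # power factor from the Figure 3 caption
LAMBDA_0 = 1 / 1_500_000    # per hour, inverse of the 63% failure time


def failure_rate(t_hours):
    """Weibull failure rate: lambda(t) = lambda_0 * p * (lambda_0 * t)**(p - 1)."""
    return LAMBDA_0 * P * (LAMBDA_0 * t_hours) ** (P - 1)


def reliability(t_hours):
    """Surviving fraction implied by that rate: R(t) = exp(-(lambda_0 * t)**p)."""
    return math.exp(-((LAMBDA_0 * t_hours) ** P))


for t in (100_000, 500_000, 1_500_000):
    print(f"t = {t:>9} h  rate = {failure_rate(t):.2e} /h  R = {reliability(t):.3f}")
```

At t = 1/λ0 the surviving fraction is exp(-1), about 36.8%, which matches the caption's definition of λ0 as the inverse of the time for 63% of the sample to fail.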

Table 1. Capacitor lifetime expectancy factor as a function of the required capacitance minimum
in an exponential model.

Reliability (%)    Lifetime (in units of 1/λ0, h)
36.8               1
50                 0.693
63.2               0.500
80                 0.223
90                 0.105
95                 0.051
98                 0.020
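
The lifetime factors in Table 1 can be checked with a short sketch, assuming the exponential model named in the caption means R(t) = exp(-λ0·t), so that the lifetime in units of 1/λ0 is -ln(R). The function name is illustrative. The 63.2% row is left out of the loop because this relation gives about 0.459 for it rather than the tabulated 0.500.

```python
import math


def lifetime_factor(reliability_percent):
    """Time, in units of 1/lambda_0, at which an exponential model
    R(t) = exp(-lambda_0 * t) has fallen to the given reliability."""
    return -math.log(reliability_percent / 100.0)


for r in (36.8, 50, 80, 90, 95, 98):
    print(f"{r:>5}%  {lifetime_factor(r):.3f}")
```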

EXAMPLE 02: DISK FAILURES IN THE REAL WORLD


Table 2 provides an overview of the five data sets used in this study. Data sets HPC1 and
HPC2 were collected in two large cluster systems at two different organizations using
supercomputers. Data sets COM1, COM2, and COM3 were collected at three different cluster
systems at a large internet service provider.

Table 2. Overview of the five failure data sets

Data set  Type of     Duration           Total    #Disk   #Servers  Disk    Disk Type      MTTF      System
          cluster                        #Events  events            Count                  (Mhours)  Deploym.
HPC1      HPC         08/2001 - 05/2006  1800     474     765       2,318   18GB 10K SCSI  1.2       08/2001
                                         463      124     64        1,088   36GB 10K SCSI  1.2
HPC2      HPC         01/2004 - 07/2006  14       14      256       520     36GB 10K SCSI  1.2       12/2001
COM1      Int. serv.  May 2006           465      84      N/A       26,734  10K SCSI       1         2001
COM2      Int. serv.  09/2004 - 04/2006  667      506     9,232     39,039  15K SCSI       1.2       2004
COM3      Int. serv.  01/2005 - 12/2005  104      104     N/A       432     10K FC-AL      1.2       1998
                                         2        2       N/A       56      10K FC-AL      1.2       N/A
                                         132      132     N/A       2,450   10K FC-AL      1.2       N/A
                                         108      108     N/A       796     10K FC-AL      1.2       N/A

In all cases, our data reports on only a portion of the computing systems run by each
organization. Below we describe each data set and the system it comes from in some more detail.

HPC1 is a five-year log of hardware failures collected from a 765-node high-performance
computing cluster. Each of the 765 nodes is a 4-way SMP with 4 GB of memory and 3-4 18GB
10K rpm SCSI drives. 64 of the nodes are used as file system nodes, each containing, in addition
to the 3-4 18GB drives, seventeen 36GB 10K rpm SCSI drives. The applications running on those systems are
typically large-scale scientific simulations or visualization applications. The data contains, for
each hardware failure that was recorded during the 5-year lifetime of this system, when the
problem started, which node and which hardware component was affected, and a brief
description of the corrective action.

HPC2 is a record of disk failures observed on the compute nodes of a 256-node HPC cluster.
Each node is a 4-way SMP with 16 GB of memory and contains two 36GB 10K rpm SCSI
drives, except for 8 of the nodes, which contain eight 36GB 10K rpm SCSI drives each. The
applications running on those systems are typically large-scale scientific simulations or
visualization applications. For each disk failure the data set records the number of the affected
node, the start time of the failure, and the slot number of the failed drive.

COM1 is a log of hardware failures recorded at a cluster at an internet service provider. Each
failure record in the data contains a timestamp on when the failure was repaired, information on
the failure symptoms, and a list of steps that were taken to repair the problem. Note that this data
does not contain information on when a failure actually happened, only when repair took place.
The data covers a population of 26,734 10K SCSI disk drives. The number of servers in the
environment is not known.

COM2 is also a vendor-created log of hardware failures recorded at a cluster at an internet
service provider. Each failure record contains a repair code (e.g. “Replace hard drive”) and the
time when the repair was finished. Again there’s no information on the start time of a failure.
The log does not contain entries for failures of disks that were replaced as hot-swaps, since the
data was created by the vendor, who doesn’t see those replacements. To account for the missing
disk replacements, we obtained numbers for disk replacements from the internet service
provider. The size of the underlying system changed significantly during the measurement
period, starting with 420 servers in 2004 and ending with 9,232 servers in 2006. We obtained
hardware purchase records for the system for this time period to estimate the size of the disk
population for each quarter of the measurement period.

The COM3 dataset comes from a large storage system at an internet service provider and
comprises four populations of different types of fiber-channel disks (see Table 2). While this
data was gathered in 2005, the system has some legacy components that date back to 1998.
COM3 differs from the other data sets in that it provides only aggregate statistics of disk failures,
rather than individual records for each failure. The data contains the counts of disks that failed
and were replaced in 2005 for each of the four disk populations.
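
The MTTF column of Table 2 can be turned into a nominal annual failure rate for comparison with field data. The sketch below assumes a constant failure rate (exponential lifetimes) and drives powered on all year. Neither assumption comes from the data sets themselves, and the function name is illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 h


def nominal_afr(mttf_hours):
    """Annual failure rate implied by an MTTF figure, ASSUMING a constant
    failure rate and continuous operation."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mttf_hours)


for mttf_mhours in (1.0, 1.2):  # MTTF values listed in Table 2
    afr = nominal_afr(mttf_mhours * 1e6)
    print(f"MTTF {mttf_mhours} Mhours -> nominal AFR {afr:.2%}")
```

Under these assumptions an MTTF of 1.0 to 1.2 million hours corresponds to less than 1% of drives failing per year, which gives a baseline against which the replacement counts in the data sets can be judged.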

Figure 4. Bathtub curves of Disk failures

BOEING 747
QUESTION 02
BRIEFLY DESCRIBE THE FOLLOWING IN RESPECT OF BOEING 747.
The Boeing 747 is a wide body commercial airliner, often referred to by the nickname "Jumbo
Jet". It is among the world's most recognizable aircraft, and was the first wide body ever
produced. Manufactured by Boeing's Commercial Airplane unit in the US, the original version of
the 747 was two and a half times the size of the Boeing 707, one of the common large
commercial aircraft of the 1960s. First flown commercially in 1970, the 747 held the passenger
capacity record for 37 years.

The four-engine 747 uses a double deck configuration for part of its length. It is available in
passenger, freighter and other versions. Boeing designed the 747's hump-like upper deck to serve
as a first class lounge or (as is the general rule today) extra seating, and to allow the aircraft to be
easily converted to a cargo carrier by removing seats and installing a front cargo door. Boeing
did so because the company expected supersonic airliners, whose development was announced in
the early 1960s, to render the 747 and other subsonic airliners obsolete, while demand for
subsonic cargo aircraft was expected to remain robust. The 747 in particular was expected to
become obsolete after 400 were sold but it exceeded its critics' expectations with production
passing the 1,000 mark in 1993. As of October 2008, 1,409 aircraft had been built, with 115
more in various configurations on order.

The 747-400, the latest version in service, is among the fastest airliners in service with a high-
subsonic cruise speed of Mach 0.85 (567 mph or 913 km/h). It has an intercontinental range of
7,260 nautical miles (8,350 mi or 13,450 km). The 747-400 passenger version can accommodate
416 passengers in a typical three-class layout or 524 passengers in a typical two-class layout. The
next version of the aircraft, the 747-8, is in development, and scheduled to enter service in 2010.
The 747 is to be replaced by the Boeing Y3 (part of the Boeing Yellowstone Project) in the
future.

DEVELOPMENT
The 747 was born from the explosion of air travel in the 1960s. The era of commercial jet
transportation, led by the enormous popularity of the Boeing 707, had revolutionized
long-distance travel and made possible the concept of the "global village." Boeing had already
developed a study for a very large fixed-wing aircraft while bidding on a US military contract for
a huge cargo plane. Boeing lost the contract to Lockheed's C-5 Galaxy but came under pressure
from its most loyal airline customer, Pan Am, to develop a giant passenger plane that would be
over twice the size of the 707. In 1966 Boeing proposed a preliminary configuration for the
airliner, to be called the 747. Pan Am ordered 25 of the initial -100 series for US$550 million,
becoming its launch customer. The original design was a full-length double-decker fuselage.
Issues with evacuation routes caused this idea to be scrapped in favor of a wide-body design.

At the time, it was widely thought that the 747 would be replaced in the future with an SST
(supersonic transport) design. In a shrewd move, Boeing designed the 747 so that it could easily
be adapted to carry freight. Boeing knew that if and when sales of the passenger version dried up
(see below regarding the future sales of the 747), the plane could remain in production as a cargo
transport. The cockpit was moved to a shortened upper deck so that a nose cone loading door
could be included, thus creating the 747's distinctive "bulge". The supersonic transports,
including the Concorde and Boeing's never-produced 2707, were not widely adopted. Such
planes were difficult to operate profitably at a time when fuel prices were soaring, and
regulations restricting supersonic flight over land made them harder still to operate.

The 747 was expected to become obsolete after sales of 400 units. But the 747 outlived many of
its critics' expectations and production passed the 1,000 mark in 1993. The expected slow-down
in sales of the passenger version in favor of the freighter model was only realized in the
early 2000s, around two decades later than expected. The development of the 747 was a huge
undertaking. Boeing did not have a facility large enough to assemble the giant aircraft, so the
company built an all-new assembly building near Everett, Washington. The factory is the largest
building by volume ever built, on over 780 acres of land.

The gamble paid dividends, however, and Boeing enjoyed a monopoly in the very large
passenger aircraft industry for decades. In fact, the record and benchmark set by the 747 would
only be surpassed, more than 35 years after its first delivery, by the A380, built by Boeing's rival,
Airbus. 

DESIGN
Ultimately, the high-winged CX-HLS Boeing design was not used for the 747, although
technologies developed for their bid had an influence. The original design included a full-length
double-deck fuselage with eight-across seating and two aisles on the lower deck and seven-
across seating and two aisles on the upper deck. However, concern over evacuation routes and
limited cargo-carrying capability caused this idea to be scrapped in early 1966 in favor of a wider
single deck design. The cockpit was, therefore, placed on a shortened upper deck so that a
freight-loading door could be included in the nose cone; this design feature produced the 747's
distinctive "bulge". In early models it was not clear what to do with the small space in the pod
behind the cockpit, and this was initially specified as a "lounge" area with no permanent seating.

Aluminum was the main material used in manufacturing the 747. One of the principal
technologies that enabled an aircraft as large as the 747 to be conceived was the high-bypass
turbofan engine. Pratt and Whitney developed a massive high-bypass turbofan engine, the JT9D,
which was initially used exclusively with the 747. Four of these engines, mounted in pods below
the wings, power the 747. To allay concerns about the safety and flyability of such a massive
aircraft, the 747 was designed with four backup hydraulic systems, split control surfaces,
redundant main landing gear, multiple structural redundancy, and sophisticated flaps that
allowed it to use standard-length runways. The wing was swept back at an unusually high angle
of 37.5 degrees, chosen in order to minimize the wing span and thus allow the 747 to use
existing hangars.

The 747's maximum takeoff weight ranges from 735,000 pounds (333,400 kg) to 970,000 lb
(439,985 kg). Its range has increased from 5,300 nautical miles to 8,000 nmi.

The 747 has redundant structures along with four redundant hydraulic systems and four main
landing gears with four wheels each, which provide a good spread of support on the ground and
safety in case of tire blow-outs. The main gear is redundant so that landing can be performed on
two opposing landing gears if the others do not function properly. In addition, the 747 has split
control surfaces and was designed with sophisticated triple-slotted flaps that minimize landing
speeds and allow the 747 to use standard-length runways. For transportation of spare engines,
747s can accommodate a non-functioning fifth-pod engine under the port wing of the aircraft,
between the inner functioning engine and the fuselage.

VARIANTS (BOEING 747 FAMILY)
1. The Boeing 747-100, the first plane in the series, made its maiden flight in February
1969 and completed its first commercial flight less than a year later, for Pan American
World Airways. Aside from the base model, the Boeing 747-100 was produced in two additional
variants: the Boeing 747-100SR (Short Range; featuring more compact fuel tanks but an
extended passenger space) and the Boeing 747SP (Special Performance; with a
shorter fuselage but capable of farther travel, intended for long-distance flights with
smaller passenger loads). As the first plane in the 747 family, the 747-100 was capable of
transporting 490 passengers up to 7,200 km. A variant of the base 747-100 model, the Boeing
747SP has a shorter fuselage and larger wings and empennage. It was developed for
long-distance flights of up to 11,000 km with smaller passenger loads.
2. 1971 saw the introduction of the Boeing 747-200, outfitted with more powerful engines and
capable of airlifting greater payloads across larger distances. This made it a formidable
freighter, and two specialized versions, the Boeing 747-200C and -200F, were optimized for
hauling cargo over long distances. A development of the 747-100 series, it features greater
maximum takeoff weight and a longer range of 9,500 to 10,500 km.
3. In 1983 the Boeing Corporation began production of the third generation of the 747,
the Boeing 747-300. The new series' primary distinction from previous jets was an
extended upper deck, which allowed it to fit up to 580 passengers (in a one-class
configuration). Like its predecessors, the 747-300 was produced in three versions:
the standard 747-300, the 747-300M (which included more freight space) and the
747-300SR (short-range version). A revised version of the 747-200 series, this plane has an

extended upper deck and increased passenger capacity. It can fit up to 580 people and has
a flight range of 10,500 km.
4. The most popular model of the family turned out to be the Boeing 747-400 series, which
was introduced in 1989. The newer jet was upgraded with better engines, improved
electronics and additional fuel tanks mounted on the tail-end, as well as a revised wing
design featuring winglets. The Boeing 747-400 series is still in production in several
different variations, including the 747-400D (Domestic) with increased passenger
capacity of up to 660 people, the 747-400ER (Extended Range) with a maximum range
of 14,000 km, the 747-400M, a combined passenger/freight version, the 747-400F
(Freighter) and the 747-400ERF (Extended Range Freighter). The most widespread
variant of the 747, models in this series can transport a maximum of 660 passengers over
14,000 km.
5. The Boeing Corporation is currently producing the latest version of the 747,
the Boeing 747-8 Intercontinental. This passenger jet has a longer fuselage, a revised
wing design and new and improved engines and avionics systems which allow it to be
quieter and more efficient than any of its predecessors. It is capable of airlifting up to 467
passengers over a distance of 14,800 km. The freighter version of the 747-8, which was
developed in parallel with the passenger version, made its maiden flight in February
2010. The latest in the 747 family features some of the latest technological innovations in
the field of airliners.

MAINTENANCE SCHEDULE AND SALIENT CHECKLIST ITEMS


Aircraft maintenance checks are periodic inspections that have to be done on all
commercial/civil aircraft after a certain amount of time or usage; military aircraft normally
follow specific maintenance programs which may or may not be similar to those of
commercial/civil operators. Airlines and other commercial operators of large or turbine-powered
aircraft follow a continuous inspection program approved by the Federal Aviation
Administration (FAA) in the United States,[1] or by other airworthiness authorities such
as Transport Canada or the European Aviation Safety Agency (EASA). Under FAA oversight,
each operator prepares a Continuous Airworthiness Maintenance Program (CAMP) under its

Operations Specifications or "OpSpecs". The CAMP includes both routine and detailed
inspections. Airlines and airworthiness authorities casually refer to the detailed inspections as
"checks", commonly one of the following: A check, B check, C check, or D check. A and B
checks are lighter checks, while C and D are considered heavier checks.

ACCIDENTS
1. The first crash of a 747 took place in November of 1974 when Lufthansa Flight 540
crashed in Nairobi killing 59 people.
2. The Tenerife disaster on March 27, 1977 claimed a total of 583 lives when two 747s
collided in heavy fog at Los Rodeos Airport, making it the highest death toll of any
accident in aviation history.

3. Air India Flight 855, a Boeing 747, crashed into the sea off the coast of Mumbai
(Bombay) on New Year's Day, 1978. All passengers and crew were killed. Many
residents of sea-front houses in Mumbai witnessed the incident.
4. On August 12, 1985, Japan Airlines Flight 123 (a 747SR) went out of control and crashed,
causing 520 fatalities. It remains the worst single-aircraft disaster in aviation history.
5. The aircraft destroyed in the Lockerbie bombing (Pan Am Flight 103) was a 747-100.
6. Air India Flight 182 was a 747-237B that exploded on June 23, 1985. All 329 on board
were killed. Up until September 11, 2001, the Air India bombing was the single deadliest
terrorist attack involving aircraft.
7. Korean Air Lines Flight 007 was a 747-230B which was shot down by the Soviet Air
Force on September 1, 1983. All 269 passengers and crew aboard were killed.
8. El Al Flight 1862 was a 747-200F which crashed shortly after take-off from
Amsterdam Schiphol on October 4, 1992. Engines no. 3 and 4 detached shortly after
take-off; as a result the flight crew lost control and the crippled 747 crashed into the
Klein-Kruitberg apartments in Bijlmermeer at high speed. All 3 crew members and the one
passenger on board were killed, as well as 39 people on the ground.
9. China Airlines Flight 611, a 747-209B, broke up in mid-flight on May 25, 2002, en route
to Hong Kong International Airport from Chiang Kai Shek International
Airport in Taipei, Taiwan. All passengers and crew on board lost their lives.
10. On October 31, 2000, Singapore Airlines Flight 006, a Boeing 747-400 flying a
Singapore to Los Angeles via Taipei route, rammed into construction equipment while
attempting to take off from a closed runway at Chiang Kai Shek International Airport,
caught fire and was destroyed, killing 79 passengers and 4 crew members. The accident
prompted the airline to change the flight number of this route from 006 to 030 and to
remove the "Tropical Megatop" livery on the accident aircraft's sister ship.

Despite these accidents, very few crashes have been attributed to design flaws of the 747. The
Tenerife disaster was a result of pilot error, ATC error and communications failure, while Japan
Airlines Flight 123 was the consequence of an improper aircraft repair. After United Airlines
Flight 811 suffered an explosive decompression in mid-flight on February 24, 1989, the NTSB
issued a recommendation to have all similar 747-200 cargo doors modified. TWA Flight 800, a
747-100 that exploded in mid-air on July 17, 1996, led to the Federal Aviation Administration

proposing a rule requiring the installation of an inerting system in the center fuel tank of
most large aircraft.

As of May 2006, there had been a total of 44 hull-loss occurrences involving 747s, with 3,707
fatalities.

MAINTENANCE STEERING GROUP (MSG) APPROACH
QUESTION 03
DISCUSS THE MAINTENANCE STEERING GROUP APPROACH.
In 1968 the Boeing Company developed the Maintenance Steering Group (MSG) maintenance
schedule to ensure the safety of its B747-100 aircraft. This approach moved away from the
tradition of "overhaul and replace at time intervals" to one that considered the type of tasks and
intervals needed to keep the aircraft safe. MSG proved very successful because it saved
time, money and unnecessary interference with components; the company therefore saw a need
to apply the same approach to all its aircraft. Thus, MSG was made applicable to more aircraft
by making it more general.

MSG-1 / MSG-2
MSG-1 and MSG-2 introduced a process-oriented approach, built on the following interval types
and maintenance processes.

1. Soft time interval: A soft time interval is one that is chosen by an operator to be done at
a specific interval but may be adjusted to fit their operational schedule. This interval may
or may not be recommended by the manufacturer.
2. Hard Time (HT) limit: Maximum interval for performing maintenance tasks on a part or unit.
Such intervals apply to overhaul, but also to the total life of the part or unit. A hard
time component is a component that requires a specific action at a specific interval
(overhaul, refurbishment, bench check, etc.) per the manufacturer’s recommendations. It
is a failure preventive process.
3. On-Condition:  Repetitive inspections or tests to determine the condition of units or
systems, comprising servicing, inspecting, testing, calibrating and replacement. On-
Condition (OC) is a failure preventive primary maintenance process that requires a
system, component, or appliance be inspected periodically or checked against some
appropriate physical standard to determine if it can continue in service. The standard
ensures that the unit is removed from service before failure during normal operation.
These standards may be adjusted based on operating experience or tests, as appropriate,
IAW a carrier's approved reliability program or maintenance manual. Examples include
tire tread and brake linings, scheduled borescope inspections of engines, engine oil

analysis. Others include brake wear indicator pins, control cables (measured for diameter,
tension, and broken strands), and linkages, control rods, pulleys etc. (measured for wear, end
or side play, or backlash).
4. Condition Monitoring: The part is left in service until it fails, it having been
determined that its failure is not of critical consequence. Condition Monitoring (CM) is a
process for systems, components, or appliances that have neither HT nor OC maintenance
as their primary maintenance process. It is accomplished by appropriate means available
to an operator for finding and solving problem areas. The user must control the reliability
of systems or equipment based on knowledge gained by analysis of failures or other
indications of deterioration. It is not a failure preventive process, and CM components are
operated until failure occurs (unscheduled maintenance).
The FAA states the following regarding CM items:
a. The item has no direct, adverse effect on safety.
b. The item must not have any “hidden function” (one not evident to the crew) that could affect safety.
c. The item must be included in a condition monitoring or reliability program.
d. Typical CM items are avionics and electronic components.
Basic elements include data on unscheduled removals, maintenance log entries, on-board
data systems, shop findings etc. This data can be used to adjust HT and OC intervals.
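
The way the three primary maintenance processes relate to one another can be sketched as a small decision function. This is a hypothetical simplification of the descriptions above, not the actual MSG-2 decision logic; the `Item` fields and the `REVIEW` outcome are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    has_fixed_interval_limit: bool   # needs a set action at a set interval
    condition_can_be_checked: bool   # measurable against a physical standard
    failure_affects_safety: bool
    has_hidden_function: bool        # failure would not be evident to the crew


def primary_process(item: Item) -> str:
    """Return 'HT', 'OC', 'CM' or 'REVIEW' (hypothetical simplification)."""
    if item.has_fixed_interval_limit:
        return "HT"   # hard time: overhaul or discard at the limit
    if item.condition_can_be_checked:
        return "OC"   # on-condition: inspect, remove before failure
    if not item.failure_affects_safety and not item.has_hidden_function:
        return "CM"   # condition monitoring: run to failure, track the data
    return "REVIEW"   # safety-relevant with no HT or OC task: needs analysis


print(primary_process(Item("brake lining", False, True, True, False)))
print(primary_process(Item("cabin reading light", False, False, False, False)))
```

The ordering of the checks reflects the text: CM applies only to items that have neither HT nor OC as their primary process and that meet the safety criteria listed above.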

MSG-1 was superseded by MSG-2 in the early 1970s. MSG-2 used a logic process which considered
failures starting at the component level and moving up, with a focus on the understanding that all
aircraft, engines and components reach a period when they should be discarded or overhauled
and returned to an “as new” condition.

Drawbacks of MSG-2
1. MSG-2 was not designed to consider economic effects; it aimed to maintain aircraft safety
irrespective of the costs involved.
2. Under MSG-2, failures hidden from the pilots do not receive appropriate consideration.
3. Because MSG-2 is a bottom-up approach, it was found to be more labor intensive.
4. There was some inherent contradiction in the terminology, for example “On Condition”
and “Condition Monitored”.
5. MSG-2 did not pay sufficient attention to modern Corrosion Prevention measures.

MSG-3
United Airlines made a significant contribution to further development through a project
sponsored by the US Department of Defense. The project considered a different approach to
delivering effective maintenance. This was again a logic-driven process, and it became the basis
for the MSG-3 process, introduced in 1980 and still in use today (with several revisions).

The major difference with MSG-3 is that it is a task-oriented approach to maintenance, using a
methodology which looks at the various failure modes from the system level, or “top down”. In
addition, economic considerations play a role. Maintenance tasks are performed for safety,
operational, or economic reasons. The MSG-3 process provides for both preventive
maintenance tasks and tasks intended to expose potential failures.

Advantages of MSG-3
1. MSG-3 is a top-down process, which enables a step-by-step systematic analysis.
2. MSG-3 delivers lower maintenance costs, with typical savings ranging from 15% to 25%
for the same aircraft type on conversion from MSG-2 to MSG-3.
3. MSG-3 typically delivers a substantial cost reduction in hard time component removal
and replacement.
4. MSG-3 results in fewer maintenance tasks, but it does not reduce the importance of managing
competencies.
5. Some MSG-3 tasks are carried out for economic reasons, while others are carried out to
deliver an improved safety level.

Major enhancements in MSG-3 include:

1. Expansion of the Systems/Power plant definition of inspection
2. Guideline for the development of a Corrosion Prevention and Control Program (CPCP)
3. Increased awareness of aging aircraft requirements
4. Extensive revision to the structure logic.

MSG-3 is widely used to develop the initial maintenance requirements for modern commercial
aircraft. These are published as a Maintenance Review Board Report (MRBR), which includes four
main sections:

1. Systems and Power plant (including components and APUs),
2. Aircraft Structures,
3. Zonal Inspections and
4. Lightning/High Intensity Radiated Field (L/HIRF).

Each section contains its methodology and specific decision logic diagrams. In particular, the
‘Systems & Power plant’ section requires the identification of Maintenance Significant
Items (MSI) before the logic diagrams are applied to determine the maintenance tasks and
intervals. In addition to the tasks developed through MSG-3 analysis, other maintenance tasks
may be identified as part of the certification process, which requires a ‘System Safety Assessment
(SSA)’ and the use of methods such as ‘Failure Modes and Effects Analysis (FMEA)’ (FAR/CS
25.1309). Such tasks are called ‘Certification Maintenance Requirements (CMR)’. Similarly, the
‘Aircraft Structures’ section describes the Structural Significant Items (SSI), which are different
from Principal Structural Elements (PSE) (FAR/CS 25.571), and it provides the methods and logic
diagrams to be used for the development of structural inspection tasks.

Categories of Task-Oriented Approach

The three categories of the task-oriented approach are:

1. Airframe systems tasks: These include lubrication, servicing, inspection, functional check,
operational check, visual check, restoration and discard.
2. Structural item tasks: These address Environmental Deterioration (caused by climate or
environment, and possibly time dependent), Accidental Damage (the result of human error or
impact with an object) and Fatigue Damage (a crack or cracks due to loading or stress).
Inspections for deterioration of structural items are General Visual Inspection, Detailed
Inspection and Special Detailed Inspection (use of non-destructive inspection).
3. Zonal tasks: These ensure that all systems, components, and installations within a specified
zone are adequately screened for security of installation and general condition.

MSG-3 Basis
This is a gross simplification, but it will serve to outline the basic process. An MSG-3 analysis is
essentially a top-down, system-by-system failure modes analysis, which is then used to derive the
minimum maintenance actions required for safe operation. Working Groups are formed to
review a system or group of systems. Each Working Group then reports to a Steering Committee,
which is responsible for compiling the MRBR.

1. Maintenance Significant Items (MSIs)
The first task is to identify each significant system and component. This creates a list of
Maintenance Significant Items (MSIs). Each MSI must be described in some detail, in a
predefined format. There will be many of them.
2. Reliability Data
The working group must then provide reliability data for the MSI, again in detail. This
can be inferred to some extent from test data, from operating data for other types and from
service experience with other aircraft. Any novel features or technology need special attention.
3. Functional Failure Analysis
After the MSI is identified and defined, the Functional Failure Analysis must be
performed. It lists all Functions, Functional Failures, related Failure Effects and
Failure Causes.
The Failure Effects are listed for each MSI and must state the final consequences of the
failure. Failure effects may be hidden or evident. There can only be one effect for a given
functional failure. If more than one failure effect has been identified, either the functional
failure or the failure effect is incorrectly defined.
4. The Failure Effect Analysis
The Failure Effect Analysis (known as the Level 1 Analysis) is performed to define the
criticality of the Functional Failure, based on its Failure Effect. The end result of the
Level 1 analysis is that each Failure Effect for each MSI is placed in one of the
following categories:
• Evident Safety (Category 5)
• Evident Operational (Category 6)
• Evident Economic (Category 7)
• Hidden Safety (Category 8)
• Hidden Non-Safety (Category 9)
The task requirement for each category is as follows:
• Evident Safety (Category 5).
This must be approached with the understanding that a task is required to assure
safe operation. If no applicable and effective task can be found, a re-design is required.
• Evident Operational (Category 6).
A task(s) is desirable if it reduces the risk of failure to an acceptable level.
• Evident Economic (Category 7).
A task(s) is desirable if the cost of the task is less than the cost of repair.
• Hidden Safety (Category 8).
A task(s) is required to assure the availability necessary to avoid the safety effect
of multiple failures. If no applicable and effective task can be found, a re-design is required.
• Hidden Non-Safety (Category 9).
A task(s) may be desirable to assure the availability necessary to avoid the
economic effects of multiple failures.
5. Failure Cause Analysis
The next task is to complete the Level 2 analysis. This Failure Cause Analysis is handled
in a similar manner for each of the five failure effect categories. The purpose is to
identify tasks that prevent the failure causes from occurring.
There are six possible task selection questions in the failure effect categories, as follows:
• Is a lubrication or servicing task applicable and effective?
• Is a check to verify operation applicable and effective?
• Is an inspection or functional check to detect degradation of a function applicable
and effective?
• Is a restoration task to reduce failure rate applicable and effective?
• Is a discard task to avoid failures or to reduce the failure rate applicable and
effective?
• Is there a task or combination of tasks applicable and effective?
Many criteria govern the use of these questions and how they must be
answered, but this is enough to give an overview of the basic concept.
6. Task Intervals
The next stage is to determine task intervals and then to combine tasks and remove
duplications. Typically, zonal inspections are created at this stage to combine tasks
within a geographic zone (the cockpit, for instance), where tasks are defined for particular
inspection levels. This again is quite a complex and highly structured process.
At this stage there is a minimum list of tasks that are applicable and effective, to be
carried out at specified intervals and combined where it is practical and effective to do so.
This analysis forms the core of the MRBR. From here each operator (airline) builds
its own maintenance program, based on the MRBR and its own operational
profile.
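
The Level 1 and Level 2 analyses described in steps 4 and 5 above can be sketched in code. This is a deliberately simplified illustration and not the official MSG-3 logic diagram: the function names, the boolean inputs, the short task labels and the rule of collecting every task answered "yes" are assumptions made for this sketch. Only the category numbers, the six questions and the re-design rule for the safety categories come from the text.

```python
from enum import IntEnum
from typing import Dict, List


class Category(IntEnum):
    """Failure effect categories, numbered as in the text."""
    EVIDENT_SAFETY = 5
    EVIDENT_OPERATIONAL = 6
    EVIDENT_ECONOMIC = 7
    HIDDEN_SAFETY = 8
    HIDDEN_NON_SAFETY = 9


class RedesignRequired(Exception):
    """Raised when a safety category has no applicable and effective task."""


# Short labels (hypothetical) for the six task selection questions, in order.
TASK_QUESTIONS: List[str] = [
    "lubrication/servicing",
    "operational check",
    "inspection/functional check",
    "restoration",
    "discard",
    "combination of tasks",
]

SAFETY_CATEGORIES = {Category.EVIDENT_SAFETY, Category.HIDDEN_SAFETY}


def level1_category(evident: bool, safety: bool, operational: bool = False) -> Category:
    """Level 1: place one failure effect in a category.

    For a hidden failure, `safety` is taken to mean that the failure, combined
    with a further failure, has a safety effect (assumption of this sketch).
    """
    if evident:
        if safety:
            return Category.EVIDENT_SAFETY
        if operational:
            return Category.EVIDENT_OPERATIONAL
        return Category.EVIDENT_ECONOMIC
    return Category.HIDDEN_SAFETY if safety else Category.HIDDEN_NON_SAFETY


def level2_tasks(category: Category, answers: Dict[str, bool]) -> List[str]:
    """Level 2: return the tasks judged applicable and effective.

    `answers` maps a task label to the analyst's yes/no answer; missing
    labels count as "no".
    """
    selected = [task for task in TASK_QUESTIONS if answers.get(task, False)]
    if not selected and category in SAFETY_CATEGORIES:
        # The text requires a re-design when no task can assure safety.
        raise RedesignRequired(f"no effective task for {category.name}")
    return selected


if __name__ == "__main__":
    category = level1_category(evident=False, safety=True)
    print(category.name, level2_tasks(category, {"operational check": True}))
```

For the non-safety categories an empty task list is an acceptable outcome in this sketch, which mirrors the wording above that a task is only "desirable" there, while for Categories 5 and 8 the lack of any effective task leads to a re-design.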
