CS 501: Software Engineering
Lecture 19
Reliability 1
1 CS 501 Spring 2005

Administration

Lectures on Reliability and Dependability
Lecture 19, Reliability 1: The development process
    Reviews
Lecture 20, Reliability 2: Different aspects of reliability
    Programming techniques
Lecture 21, Reliability 3: Testing and bug fixing
    Tools


Dependable and Reliable Systems: The Royal Majesty
From the report of the National Transportation Safety Board:
"On June 10, 1995, the Panamanian passenger ship Royal Majesty
grounded on Rose and Crown Shoal about 10 miles east of
Nantucket Island, Massachusetts, and about 17 miles from where
the watch officers thought the vessel was. The vessel, with 1,509
persons on board, was en route from St. George’s, Bermuda, to
Boston, Massachusetts."
"The Raytheon GPS unit installed on the Royal Majesty had been
designed as a standalone navigation device in the mid- to late
1980s, ...The Royal Majesty’s GPS was configured by Majesty
Cruise Line to automatically default to the Dead Reckoning mode
when satellite data were not available."
The Royal Majesty: Analysis
• The ship was steered by an autopilot that relied on position
information from the Global Positioning System (GPS).
• If the GPS could not obtain a position from satellites, it provided
an estimated position based on Dead Reckoning (distance and
direction traveled from a known point).
• The GPS failed one hour after leaving Bermuda.
• The crew failed to see the warning message on the display (or to
check the instruments).
• 34 hours and 600 miles later, the Dead Reckoning error was 17
miles.
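
The size of the final error is easy to reproduce with a little trigonometry. The sketch below is a flat-plane approximation of Dead Reckoning, not the algorithm in the ship's GPS unit; the function name `dead_reckon` and the 1.6 degree heading error are illustrative assumptions, not figures from the NTSB report.

```python
import math

def dead_reckon(x, y, heading_deg, distance):
    """Advance a position (x east, y north) by `distance` along a compass heading.

    Flat-plane approximation: adequate for illustration, not for navigation.
    """
    heading = math.radians(heading_deg)
    return x + distance * math.sin(heading), y + distance * math.cos(heading)

# Hypothetical numbers: the intended track is due north for 600 miles, but
# uncorrected drift is modeled as a constant 1.6 degree heading error.
intended = dead_reckon(0.0, 0.0, 0.0, 600.0)
actual = dead_reckon(0.0, 0.0, 1.6, 600.0)
error_miles = math.dist(intended, actual)

print(f"Position error after 600 miles: {error_miles:.1f} miles")
```

Under these assumptions the error comes out a little under 17 miles: a discrepancy too small to notice from hour to hour is enough to put a ship on a shoal after a day and a half.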

The Royal Majesty: Software Lessons
All the software worked as specified (no bugs), but ...

• Since the GPS software had been specified, the requirements
had changed (from a standalone system to part of an integrated system).
• The manufacturers of the autopilot and GPS adopted different
design philosophies about the communication of mode changes.
• The autopilot was not programmed to recognize valid/invalid
status bits in messages from the GPS (NMEA 0183).
• The warnings provided by the user interface were not
sufficiently conspicuous to alert the crew.
• The officers had not been properly trained on this equipment.
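
The status-bit lesson can be sketched as a receiver that refuses to act on data flagged as invalid. The example uses the status field of the NMEA 0183 RMC sentence ("A" for valid data, "V" for a receiver warning) purely as an illustration; the slides do not say which sentences the Royal Majesty's equipment exchanged. The helper names `nmea_checksum`, `make_sentence`, and `position_is_usable` are hypothetical, and the sample field values are placeholders.

```python
from functools import reduce

def nmea_checksum(payload: str) -> str:
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), payload, 0):02X}"

def make_sentence(payload: str) -> str:
    """Wrap a payload in the '$...*hh' framing used by NMEA 0183."""
    return f"${payload}*{nmea_checksum(payload)}"

def position_is_usable(sentence: str) -> bool:
    """Accept an RMC sentence only if its checksum matches and its status is 'A'."""
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    payload, _, checksum = sentence[1:].partition("*")
    if nmea_checksum(payload) != checksum.strip().upper():
        return False
    fields = payload.split(",")
    if len(fields) < 3 or not fields[0].endswith("RMC"):
        return False
    return fields[2] == "A"  # 'V' means the receiver itself warns the fix is suspect

# Placeholder sentences that differ only in the status field.
valid = make_sentence("GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W")
void = make_sentence("GPRMC,123519,V,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W")

print(position_is_usable(valid), position_is_usable(void))
```

A consumer written this way would have to decide what to do when the check fails (sound an alarm, disengage, hold the last heading); that decision is a requirement in its own right.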

Reliability
Reliability: Probability of a failure occurring in operational use.
Perceived reliability: Depends upon:
• user behavior
• set of inputs
• pain of failure

User Perception of Reliability
1. A personal computer that crashes frequently v. a machine
that is out of service for two days.
2. A database system that crashes frequently but comes back
quickly with no loss of data v. a system that fails once in
three years but data has to be restored from backup.
3. A system that does not fail but has unpredictable periods
when it runs very slowly.

Reliability Metrics
Traditional Measures
• Mean time between failures
• Availability (up time)
• Mean time to repair
Market Measures
• Complaints
• Customer retention
User Perception is Influenced by
• Distribution of failures
Hypothetical example: Cars are less safe than airplanes in
accidents per hour, but safer in accidents per mile.
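
The traditional measures are related. A common simplification, assumed here, treats mean time between failures as the mean operating time between failures, so that availability is MTBF / (MTBF + MTTR). The sketch below uses invented figures to show that two systems with very different failure distributions can report the same availability.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up, assuming MTBF counts operating time only."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical system A: fails about every four days, back within the hour.
frequent_short = availability(mtbf_hours=99.0, mttr_hours=1.0)

# Hypothetical system B: fails about twice a year, out of service for two days.
rare_long = availability(mtbf_hours=4752.0, mttr_hours=48.0)

print(f"A: {frequent_short:.2%}  B: {rare_long:.2%}")
```

Both systems are up 99% of the time, yet users are unlikely to perceive them as equally reliable, which is why the distribution of failures matters as well as the average.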
Reliability Metrics for Distributed Systems
Traditional metrics are hard to apply in multi-component systems:
• In a big network, at any given moment something will be giving
trouble, but very few users will see it.
• A system that has excellent average reliability may give
terrible service to certain users.
• There are so many components that system administrators
rely on automatic reporting systems to identify problem areas.
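
The second point, that a good average can hide terrible service to a few users, can be illustrated with a small calculation. The data are invented: 99 users see almost every request succeed and one user, perhaps behind a faulty component, sees half fail.

```python
from statistics import mean

# Hypothetical fraction of successful requests per user over one month.
success_rate = {f"user_{i}": 0.999 for i in range(99)}
success_rate["user_99"] = 0.50

average = mean(success_rate.values())
worst_user, worst_rate = min(success_rate.items(), key=lambda kv: kv[1])

print(f"average success rate: {average:.2%}")
print(f"worst served: {worst_user} at {worst_rate:.0%}")
```

An automatic reporting system that tracks only the average would show better than 99% and miss the user for whom the system is effectively unusable.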

Requirements Specification of System Reliability
Example: ATM card reader
Failure class              Example                                   Metric

Permanent, non-corrupting  System fails to operate with any card     1 per 1,000 days
                           -- reboot
Transient, non-corrupting  System can not read an undamaged card     1 in 1,000 transactions
Corrupting                 A pattern of transactions corrupts        Never
                           database
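
A specification in this form can be checked mechanically against operational data. The sketch below reads each metric as a maximum acceptable failure rate, which is an assumption about how the table is meant to be used; the `Observation` class, the function `meets_requirements`, and the sample counts are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """Counts gathered from an ATM card reader in service (invented structure)."""
    days_in_service: int
    transactions: int
    permanent_failures: int
    transient_failures: int
    corrupting_failures: int

def meets_requirements(obs: Observation) -> dict[str, bool]:
    """Compare observed failure counts with the three metrics in the table."""
    return {
        # At most 1 permanent failure per 1,000 days.
        "permanent": obs.permanent_failures * 1000 <= obs.days_in_service,
        # At most 1 transient failure in 1,000 transactions.
        "transient": obs.transient_failures * 1000 <= obs.transactions,
        # Corrupting failures must never occur.
        "corrupting": obs.corrupting_failures == 0,
    }

sample = Observation(days_in_service=2000, transactions=500_000,
                     permanent_failures=1, transient_failures=620,
                     corrupting_failures=0)
result = meets_requirements(sample)
print(result)
```

With these invented counts the reader meets the permanent and corrupting requirements but misses the transient one (620 failures against an allowance of 500).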

Cost of Improved Reliability
[Figure: graph with "Up time" axis running from 99% to 100%]
Will you spend your money on new functionality or improved reliability?
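
To see what is at stake between 99% and 100% up time, it helps to convert the percentages into hours of downtime. This is a small illustrative calculation; the function name and the particular up-time levels are assumptions, and it says nothing about what each improvement costs.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

def downtime_hours_per_year(up_time: float) -> float:
    """Expected hours out of service per year for a given up-time fraction."""
    return (1.0 - up_time) * HOURS_PER_YEAR

for up_time in (0.99, 0.999, 0.9999):
    hours = downtime_hours_per_year(up_time)
    print(f"{up_time:.2%} up time -> {hours:.1f} hours down per year")
```

At 99% up time the system is out of service for about 87.6 hours a year; each further step toward 100% removes a smaller absolute amount of downtime.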
Example: Central Computing System
A central computer serves the entire organization.
Any failure is serious.
Step 1: Gather data on every failure
• 10 years of data in a simple database
• Every failure analyzed:
hardware
software (default)
environment (e.g., power, air conditioning)
human (e.g., operator error)

Step 2: Analyze the data

• Weekly, monthly, and annual statistics
Number of failures and interruptions
Mean time to repair
• Graphs of trends by component, e.g.,
Failure rates of disk drives
Hardware failures after power failures
Crashes caused by software bugs in each module

Step 3: Invest resources where benefit will be maximum, e.g.,
• Orderly shut down after power failure
• Priority order for software improvements
• Changed procedures for operators
• Replacement hardware
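
The three steps can be sketched in a few lines: record each failure with a category, compute simple statistics, and rank components by the downtime they cause. The record layout, the function names, and every entry in `LOG` are invented for illustration; the slides describe the real data only in outline.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from statistics import mean

@dataclass
class Failure:
    year: int
    category: str       # "hardware", "software", "environment", or "human"
    component: str
    repair_hours: float

# Step 1: gather data on every failure (invented records).
LOG = [
    Failure(2003, "environment", "power", 6.0),
    Failure(2003, "hardware", "disk drive", 4.0),
    Failure(2004, "software", "scheduler", 1.0),
    Failure(2004, "hardware", "disk drive", 5.0),
    Failure(2004, "software", "scheduler", 0.5),
    Failure(2004, "human", "operator", 2.0),
]

# Step 2: analyze the data.
def failures_by_category(log):
    return Counter(f.category for f in log)

def mean_time_to_repair(log):
    return mean(f.repair_hours for f in log)

# Step 3: rank components by total downtime to guide investment.
def downtime_by_component(log):
    totals = defaultdict(float)
    for f in log:
        totals[f.component] += f.repair_hours
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

print(failures_by_category(LOG))
print(f"MTTR: {mean_time_to_repair(LOG):.2f} hours")
print(downtime_by_component(LOG))
```

In this invented log the disk drives account for the most downtime, so replacement hardware would be the first candidate for investment.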

Building Dependable Systems: Three Principles
For a software system to be dependable:

• Each stage of development must be done well.
• Changes should be incorporated into the structure as carefully
as the original system development.
• Testing and correction do not ensure quality, but dependable
systems are not possible without systematic testing.

Reliability: Modified Waterfall Model
[Figure: modified waterfall diagram. Stages, in order:]
Feasibility study
Requirements
System design
Program design
Coding
Testing
Acceptance
Operation & maintenance
[The diagram also carries the label "Changes".]
Key Factors for Reliable Software
• Organization culture that expects quality

• Approach to software design and implementation that hides
complexity (e.g., structured design, object-oriented
programming)
• Precise, unambiguous specification
• Use of software tools that restrict or detect errors (e.g.,
strongly typed languages, source control systems, debuggers)
• Programming style that emphasizes simplicity, readability,
and avoidance of dangerous constructs
• Incremental validation

Building Dependable Systems: Organizational Culture
Good organizations create good systems:

• Acceptance of the group's style of work (e.g., meetings,
preparation, support for juniors)
• Visibility
• Completion of a task before moving to the next (e.g.,
documentation, comments in code)

Complexity
The human mind can encompass only limited complexity:

• Comprehensibility
• Simplicity
• Partitioning of complexity
A simple system or subsystem is easier to get right than a
complex one.

Specifications for the Client
Specifications are of no value if they do not meet the
client's needs.
• The client must understand and review the
requirements specification in detail
• Appropriate members of the client's staff must review
relevant areas of the design (e.g., operations, training
materials, system administration)
• The acceptance tests must belong to the client

Building Dependable Systems: Quality Management Processes
Assumption:
Good processes lead to good software
The importance of routine:
Standard terminology (requirements,
specification, design, etc.)
Software standards (naming conventions, etc.)
Internal and external documentation
Reporting procedures

Building Dependable Systems: Change
Change management:
Source code management and version control
Tracking of change requests and bug reports
Procedures for changing requirements
specifications, designs and other documentation
Regression testing
Release control

Reviews: Process (Plan)
Objectives:
• To review progress against plan (formal or informal).
• To adjust plan (schedule, team assignments,
functionality, etc.).
Impact on quality:
Good quality systems usually result from plans that are
demanding but realistic.
Good people like to be stretched and to work hard, but
must not be pressed beyond their capabilities.

Reviews: Design and Code
DESIGN AND CODE REVIEWS ARE A FUNDAMENTAL
PART OF GOOD SOFTWARE DEVELOPMENT
Concept
Colleagues review each other's work:
can be applied to any stage of software development
can be formal or informal

Benefits of Design and Code Reviews
Benefits:
• Extra eyes spot mistakes, suggest improvements
• Colleagues share expertise; helps with training
• An occasion to tidy loose ends
• Incompatibilities between components can be identified
• Helps scheduling and management control
Fundamental requirements:
• Senior team members must show leadership
• Good reviews require good preparation
• Everybody must be helpful, not threatening
Review Team (Full Version)
A review is a structured meeting with the following people:

Moderator -- ensures that the meeting moves ahead steadily
Scribe -- records discussion in a constructive manner
Developer -- person(s) whose work is being reviewed
Interested parties -- people above and below in the software
process
Outside experts -- knowledgeable people who are not
working on this project
Client -- representatives of the client who are knowledgeable
about this part of the process
Example: Program Design
Moderator
Scribe
Developer -- the design team
Interested parties -- people who created the system design
and/or requirements specification, and the programmers who
will implement the system
Outside experts -- knowledgeable people who are not
working on this project
Client -- only if the client has a strong technical representative

Review Process
Preparation
The developer provides colleagues with documentation
(e.g., specification or design), or code listing
Participants study the documentation in advance
Meeting
The developer leads the reviewers through the
documentation, describing what each section does and
encouraging questions
Must allow plenty of time and be prepared to continue on
another day.
Static and Dynamic Verification
Static verification: Techniques of verification that
do not include execution of the software.
• May be manual or use computer tools.
Dynamic verification:
• Testing the software with trial data.
• Debugging to remove errors.

Static Validation & Verification
Carried out throughout the software development process.
[Figure: validation & verification, in the form of REVIEWS, applied to the
requirements specification, the design, and the program]

Static Verification: Program Inspections
Formal program reviews whose objective is to detect faults

• Code may be read or reviewed line by line.
• 150 to 250 lines of code in a 2-hour meeting.
• Use checklist of common errors.
• Requires team commitment, e.g., trained leaders
Inspections are claimed to be so effective that they can replace unit testing.

Inspection Checklist: Common Errors
Data faults: Initialization, constants, array bounds, character
strings
Control faults: Conditions, loop termination, compound
statements, case statements
Input/output faults: All inputs used; all outputs assigned a
value
Interface faults: Parameter numbers, types, and order;
structures and shared memory
Storage management faults: Modification of links,
allocation and de-allocation of memory
Exceptions: Possible errors, error handlers
Static Analysis Tools
Program analyzers scan the source of a program for possible
faults and anomalies (e.g., Lint for C programs).
• Control flow: loops with multiple exit or entry points
• Data use: Undeclared or uninitialized variables, unused
variables, multiple assignments, array bounds
• Interface faults: Parameter mismatches, non-use of
function results, uncalled procedures
• Storage management: Unassigned pointers, pointer
arithmetic
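
As a toy illustration of the "data use" category, the sketch below reports variables that are assigned but never read. It is not Lint: it analyzes Python rather than C, uses Python's standard `ast` module, and ignores scoping, so it is only a rough indication of how such a check works. The function name `unused_variables` and the sample program are invented.

```python
import ast

def unused_variables(source: str) -> list[str]:
    """Report names that are assigned somewhere in `source` but never read.

    Simplification: all names are treated as sharing one scope.
    """
    assigned, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)
    return sorted(assigned - used)

SAMPLE = """
total = 0
count = 0
for item in [1, 2, 3]:
    total = total + item
print(total)
"""

print(unused_variables(SAMPLE))
```

The analyzer never runs the sample program; it only inspects the parsed source, which is what makes this a static technique. Here it flags `count`, which is initialized and then forgotten.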

Static Analysis Tools (continued)
Static analysis tools

• Cross-reference table: Shows every use of a variable,
procedure, object, etc.
• Information flow analysis: Identifies input variables on which
an output depends.
• Path analysis: Identifies all possible paths through the
program.
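
A cross-reference table is the simplest of these to build. The sketch below lists, for each function, parameter, and variable in a small Python program, the lines on which it appears. It uses Python's standard `ast` module; the function name `cross_reference` and the sample program are invented, and real tools track far more (scopes, attributes, imports).

```python
import ast
from collections import defaultdict

def cross_reference(source: str) -> dict[str, list[int]]:
    """Map each name to the sorted line numbers where it is defined or used."""
    table = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            table[node.id].add(node.lineno)        # variable or function use
        elif isinstance(node, ast.arg):
            table[node.arg].add(node.lineno)       # function parameter
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            table[node.name].add(node.lineno)      # definition site
    return {name: sorted(lines) for name, lines in sorted(table.items())}

SAMPLE = """def area(width, height):
    return width * height

size = area(3, 4)
print(size)
"""

xref = cross_reference(SAMPLE)
for name, lines in xref.items():
    print(f"{name:8} {lines}")
```

For the sample, `area` appears on lines 1 and 4 and `size` on lines 4 and 5, which is the kind of listing a reviewer would use to find every place a change might have an effect.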