
Presentation on Reliability

Presented By Vivek Gupta 1081030128 IT-B 3rd Year

What is Reliability?
Reliability is a broad concept. It is applied whenever we expect something to behave in a certain way. Reliability is one of the metrics that are used to measure quality. It is a user-oriented quality factor relating to system operation. Intuitively, if the users of a system rarely experience failure, the system is considered to be more reliable than one that fails more often.

What is Reliability?
A system without faults is considered to be highly reliable. Constructing a correct system is a difficult task. Even an incorrect system may be considered to be reliable if the frequency of failure is acceptable.

Key concepts in discussing reliability

Fault
Failure
Time
Three kinds of time intervals: MTTR, MTTF, MTBF

Key concepts in discussing reliability

Failure
A failure is said to occur if the observable outcome of a program execution is different from the expected outcome.

Fault
The adjudged cause of a failure is called a fault. Example: a failure may be caused by a defective block of code.

Time
Time is a key concept in the formulation of reliability. If the time gap between two successive failures is short, we say that the system is less reliable.

Types of Reliability
Inter-Rater or Inter-Observer Reliability
Test-Retest Reliability
Parallel-Forms Reliability
Internal Consistency Reliability

Inter-Rater Reliability
Also called Inter-Observer Reliability. Used to assess the degree to which different raters or observers give consistent estimates of the same phenomenon. In short: different people, same test.
Example: Two people may be asked to categorize pictures of animals as dogs or cats. A perfectly reliable result would be that they both classify the same pictures in the same way.
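The dog/cat example above can be turned into a number by counting how often the two raters agree. The sketch below is a minimal illustration only: the function name `percent_agreement` and the labels are hypothetical, and simple percent agreement is just one possible measure (it does not correct for agreement that happens by chance).

```python
def percent_agreement(rater_a, rater_b):
    """Return the fraction of items on which two raters gave the same label."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must label the same items")
    if not rater_a:
        raise ValueError("at least one item is required")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical labels for five pictures (illustrative data, not from the slides).
rater_1 = ["dog", "cat", "dog", "dog", "cat"]
rater_2 = ["dog", "cat", "cat", "dog", "cat"]

agreement = percent_agreement(rater_1, rater_2)
print(agreement)  # the raters agree on 4 of the 5 pictures
```

A value of 1.0 corresponds to the perfectly reliable result described in the example.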

Parallel-Forms Reliability
Used to assess the consistency of the results of two tests constructed in the same way from the same content domain. In short: different people, same time, different test.
Example: An experimenter develops a large set of questions, splits them into two sets, and administers each set to a randomly selected half of a target sample. In the development of national tests, two different tests are used simultaneously in trials. The test that gives the most consistent results is used, whilst the other (provided it is sufficiently consistent) is kept as a backup.

Test-Retest Reliability
Used to assess the consistency of a measure from one time to another. In short: same people, different times.
Example: Various questions for a personality test are tried out with a class of students over several years. This helps the researcher determine which questions and combinations have better reliability.
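One way to quantify "same people, different times" is to correlate the scores from the two sittings. The sketch below assumes the Pearson correlation coefficient as the measure, which the slides do not name; `pearson_r` and the scores are hypothetical and for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of the same five students at two different times.
first_sitting = [12, 15, 9, 20, 17]
second_sitting = [13, 14, 10, 19, 18]

print(pearson_r(first_sitting, second_sitting))
```

A coefficient close to 1 suggests that the measure gives consistent results from one time to another.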

Internal Consistency Reliability

Used to assess the consistency of results across items within a test. In short: different questions, same construct.
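A widely used statistic for internal consistency is Cronbach's alpha. It is not named in the slides, so the sketch below is an assumption about how one might measure "different questions, same construct"; the function name and the data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: one row per respondent, one column per test item.
    """
    k = len(scores[0])                # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    items = list(zip(*scores))        # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of four respondents to three items on one construct.
answers = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [1, 2, 1],
]
print(cronbach_alpha(answers))
```

Values near 1 indicate that the items vary together, as expected when they measure the same construct.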

System Reliability Specification

Hardware reliability
probability that a hardware component fails

Software reliability
probability that a software component will produce an incorrect output
software does not wear out
software can continue to operate after a bad result

Operator reliability
probability that a system user makes an error

Software Reliability
To illustrate, suppose program X is estimated to have a reliability of 0.96 over 8 elapsed processing hours. This means that if program X were executed 100 times, each run requiring 8 hours of elapsed processing time, it would be likely to operate without failure in 96 of the 100 runs.
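The arithmetic behind this illustration can be sketched as follows. The first function only restates the example. The other two go one step further and assume a constant failure rate (the exponential model R(t) = exp(-λt)), which is an assumption of this sketch and not something the slides state; all function names are hypothetical.

```python
import math


def expected_failure_free_runs(reliability, runs):
    """Expected number of runs that complete without failure."""
    return reliability * runs


def constant_failure_rate(reliability, hours):
    """Failure rate per hour, ASSUMING the model R(t) = exp(-rate * t)."""
    return -math.log(reliability) / hours


def reliability_at(failure_rate, hours):
    """Reliability over a given number of hours under the same assumed model."""
    return math.exp(-failure_rate * hours)


# Program X from the example: reliability 0.96 over 8 processing hours.
print(expected_failure_free_runs(0.96, 100))   # about 96 of 100 runs

rate = constant_failure_rate(0.96, 8)
print(rate)                                    # failures per processing hour
print(reliability_at(rate, 16))                # reliability over 16 hours
```

Under the assumed model, doubling the processing time squares the reliability, so 16 hours gives 0.96 × 0.96.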

Measures of Reliability and Availability

Mean Time to Failure (MTTF)
average time the system operates before a failure is observed (often used interchangeably with MTBF when repair time is negligible)

Availability = MTBF / (MTBF + MTTR)

MTBF = Mean Time Between Failures
MTTR = Mean Time to Repair
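The availability formula can be applied to a failure log as in the sketch below. It assumes MTBF is measured as the average operating time between failures, which is the reading the formula above implies; the function names and the log values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def availability(mtbf, mttr):
    """Availability = MTBF / (MTBF + MTTR), both in the same time unit."""
    return mtbf / (mtbf + mttr)


# Hypothetical log: hours of operation before each failure,
# and hours spent repairing after each failure.
uptime_hours = [100.0, 140.0, 120.0]
repair_hours = [2.0, 4.0, 3.0]

mtbf = mean(uptime_hours)   # 120.0 hours
mttr = mean(repair_hours)   # 3.0 hours
print(availability(mtbf, mttr))  # 120 / 123, roughly 0.976
```

Both inputs must use the same time unit; the result is the fraction of time the system is available.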

Reliability = MTBF / (1+MTBF)

Time Units
Raw Execution Time
for non-stop systems

Calendar Time
for systems with regular usage patterns

Number of Transactions
for demand-type transaction systems

Availability
Measures the fraction of time the system is really available for use
Takes repair and restart times into account
Relevant for non-stop, continuously running systems (e.g. traffic signal)

Safety Specification
Each safety specification should be specified separately
These requirements should be based on hazard and risk analysis
Safety requirements usually apply to the system as a whole rather than to individual components
System safety is an emergent system property

Hazard Analysis Stages

Hazard identification
identify potential hazards that may arise

Risk analysis and hazard classification
assess the risk associated with each hazard

Hazard decomposition
seek to discover potential root causes for each hazard

Risk reduction assessment
describe how each hazard is to be taken into account when the system is designed

Fault-tree Analysis
Hazard analysis method that starts with an identified fault and works backwards to the cause of the fault
Can be used at all stages of hazard analysis
It is a top-down technique that may be combined with bottom-up hazard analysis techniques, which start with system failures that lead to hazards

Fault-tree Analysis Steps

Identify the hazard
Identify potential causes of the hazard
Link combinations of alternative causes using "or" or "and" symbols as appropriate
Continue the process until root causes are identified (the result will be an and/or tree or a logic circuit); the causes are the leaves
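The and/or tree produced by these steps can be sketched as a small data structure. This is a minimal illustration, not a full fault-tree tool: the class names `Event` and `Gate`, the helper functions, and the hazard with its three causes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """A leaf of the tree: a root cause that either occurred or did not."""
    name: str


@dataclass
class Gate:
    """An inner node that links child causes with 'and' or 'or'."""
    kind: str
    children: list = field(default_factory=list)


def occurs(node, observed):
    """Return True if the node is triggered by the set of observed root causes."""
    if isinstance(node, Event):
        return node.name in observed
    results = [occurs(child, observed) for child in node.children]
    if node.kind == "and":
        return all(results)
    if node.kind == "or":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind!r}")


def root_causes(node):
    """List the leaves of the tree, i.e. the identified root causes."""
    if isinstance(node, Event):
        return [node.name]
    causes = []
    for child in node.children:
        causes.extend(root_causes(child))
    return causes


# Hypothetical hazard: it arises if the operator makes an error,
# or if the sensor AND the alarm both fail.
hazard = Gate("or", [
    Gate("and", [Event("sensor_failure"), Event("alarm_failure")]),
    Event("operator_error"),
])

print(root_causes(hazard))  # ['sensor_failure', 'alarm_failure', 'operator_error']
print(occurs(hazard, {"sensor_failure"}))                   # False
print(occurs(hazard, {"sensor_failure", "alarm_failure"}))  # True
```

The root of the tree is the hazard, and walking down to the leaves mirrors the top-down search for root causes.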


RTL (Real-Time Logic) builds the system model by specifying events and corresponding actions. The event-action model can be analysed using logic operations to test safety assertions about system components and their timing. Petri net models can be used to determine the faults that are most hazardous. Once hazards are identified, safety-related requirements can be specified.