Week 9 - System Failures and Errors Flashcards
(16 cards)
Regulatory failures
Failures caused by lack of information/ under-trained personnel/ lack of regulation.
Managerial failures
Failures caused by safety climate/ lines of command responsibility/ quality control.
Hardware failures
Failures caused by design failure/ requirements failures/ implementation failure.
Software failures
Failures caused by requirements failures/ specification failures.
Human failures
Failures caused by slips/ lapses and mistakes/ team factors/ human error.
Cascading failures
When a failure in one part of a system triggers, or coincides with, failures in other parts - a domino effect of failures.
Many possible combinations of cascading failures in complex systems.
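The domino effect can be sketched as a traversal of a dependency graph. This is only an illustration: the `cascade` function, the `dependents` mapping and the component names are all hypothetical, and the sketch assumes a component fails as soon as anything it relies on fails.

```python
from collections import deque


def cascade(dependents, initial_failure):
    """Return every component that ends up failed after `initial_failure` fails.

    `dependents` maps a component to the components that rely on it
    (an assumed structure, chosen for illustration only).
    """
    failed = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        component = queue.popleft()
        # Assumption: any component relying on a failed one fails too.
        for dependent in dependents.get(component, []):
            if dependent not in failed:
                failed.add(dependent)
                queue.append(dependent)
    return failed


# Hypothetical system: power feeds cooling and network, which both feed servers.
dependents = {
    "power": ["cooling", "network"],
    "cooling": ["servers"],
    "network": ["servers"],
    "servers": [],
}

print(sorted(cascade(dependents, "power")))
# ['cooling', 'network', 'power', 'servers']
```

A single failure in "power" takes down everything, while a failure in "cooling" only reaches "servers"; real systems have many more such paths, which is why the combinations are hard to enumerate.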
Complex system characteristics
- Complex interactions: unfamiliar, unplanned or unexpected sequences which are not visible or immediately comprehensible.
- Tightly coupled: strong link between multiple parts of the system, rigidly ordered processes, very little slack.
What makes a system particularly prone to failure
A system is particularly prone to failure when it both has complex interactions and is tightly coupled.
Swiss Cheese Model
- A way of showing how mistakes in different barriers/ layers of defences of a system can turn a hazard into a loss.
- Each slice of cheese represents a layer of defences of the system.
- The holes in each slice of cheese represent mistakes/failures in this layer of defences in the system.
- If holes in every slice line up, there is a path from one end of the row of slices to the other, representing a hazard passing through the failures in each layer and becoming a loss.
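As a rough numerical sketch of the model: if each layer is assumed to fail independently with some probability, the chance that a hazard passes through every layer is the product of those probabilities. The function name and the probabilities below are hypothetical, and the independence assumption is itself one of the model's limitations.

```python
import math


def loss_probability(hole_probabilities):
    """Probability that a hazard passes through every layer of defence.

    Assumes each layer fails independently, so the probabilities multiply.
    """
    return math.prod(hole_probabilities)


# Hypothetical layers: each value is the chance that layer has a hole
# in the hazard's path.
layers = [0.1, 0.05, 0.2]
print(loss_probability(layers))  # roughly 0.001
```

Adding a layer, or shrinking the holes in any one layer, lowers the product. If the layers share a common cause of failure, the true risk can be much higher than this figure suggests.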
Limitations of the Swiss cheese model
- There is randomness in whether the holes line up.
- Independence of barriers is assumed.
- Doesn’t explain what the holes are or how they came about.
Dependability
The ability of a system to deliver service that can justifiably be trusted.
Dependability is the most important property for most complex socio-technical systems.
Laprie’s model
A model to represent dependability by splitting it up into 3 factors - impairments, means and attributes.
- Impairments: threats to dependability (faults, errors, failures), which should be minimised.
- Means: Made up of procurement (fault avoidance and tolerance) and validation (fault forecasting and removal).
- Attributes: Qualities of the system (availability, reliability, safety, security).
Laprie’s model - impairments
- System failure: when the system does not deliver the service its users expect
- System error: where the behaviour of the system does not conform to its specification
- System fault: incorrect system state not expected by the designers of the system
- Human error or mistake: human behaviour that results in faults being introduced into a system
Laprie’s model - means
- Fault avoidance: preventing the occurrence or introduction of faults
- Fault tolerance: delivering correct service, though faults are present
- Fault removal: reducing number or severity of faults
- Fault forecasting: estimating number of faults, future occurrence, consequences
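One common way to illustrate fault tolerance is redundancy with majority voting: several replicas compute the same result and the system returns the value most of them agree on, so one faulty replica does not affect the service. This is a minimal sketch of that idea, not something prescribed by Laprie's model; `majority_vote` and the replica outputs are hypothetical.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas.

    Raises ValueError if no value has a strict majority, meaning the
    faults can no longer be masked.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise ValueError("no majority: too many replicas disagree")
    return value


# Hypothetical outputs from three replicas, one of which is faulty.
print(majority_vote([42, 42, 17]))  # 42
```

Correct service is delivered although a fault is present; with two faulty replicas out of three there is no strict majority for the correct value and the fault can no longer be masked.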
Laprie’s model - primary attributes of dependability
- Availability: ability of system to deliver services when requested
- Reliability: ability of the system to deliver services as specified
- Safety: ability of the system to operate without catastrophic failure
- Security: ability of the system to protect itself against accidental or deliberate intrusion
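Availability is often quantified as the fraction of time a system is able to deliver service. A common steady-state estimate divides mean time to failure (MTTF) by the sum of MTTF and mean time to repair (MTTR). The function name and the figures below are hypothetical, and the formula is a standard approximation rather than part of the card's definition.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability: expected uptime over total time.

    mttf_hours: mean time to failure (average uptime between failures).
    mttr_hours: mean time to repair (average downtime per failure).
    """
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical system: fails on average every 999 hours, takes 1 hour to repair.
print(availability(999, 1))  # about 0.999, i.e. "three nines"
```

Note that this says nothing about reliability: a system that fails often but recovers almost instantly can be highly available while still failing to deliver services as specified.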
Laprie’s model - secondary attributes of dependability
- Timeliness: the ability of the system to respond in a timely way to user requests.
- Survivability: the ability of a system to continue to deliver its services to users in the face of deliberate or accidental attack.
- Recoverability: the ability of the system to recover from user or system errors.
- Maintainability: the ease of repairing the system after a failure has been discovered or changing the system to include new features.