
2022/11/07

A brief history of incident models

Accident analysis always implies an accident model, i.e., a set of assumptions about what the underlying "mechanisms" are.

This is a quick view of incident models, as taken from various sources. I found this useful to go over a long while ago, then forgot what I did with my notes, so here are new ones.

Do note that the models are not in a precise order. Some have lived and evolved for decades, and have been influenced by or borrowed from other models. The ordering here is arbitrary and as close to "first appearance" as possible, but not necessarily to "last time it was useful."

Accident-Proneness

This starts around WWI, when studies showed that some workers in munitions factories kept getting into more accidents than others.

The theory said that this is because some workers were just more careless or less skilled, or had other personal character traits that made them less safe. This model proved to be both very convenient (just fire the bad apples) and very sticky (it still influences models and insurance to this day), despite never having good evidence for it.

Incident models eventually moved away from it because it's not a very good one. Exposure to risk isn't uniform: some positions are inherently riskier, and attributing fault to the workers does not tend to improve the underlying factors.

Domino Theory

In 1931, Herbert W. Heinrich developed the first 'sequential accident model', often called the 'Domino Theory'.

It's a sort of evolution on the previous one, where faults in character are still at the base of the chain, but any interruption along the chain can also prevent incidents and errors.

Over the years, the first dominoes evolved and became organizational factors such as planning and management. The most charitable interpretations of the domino theory describe causal chains of linear effects.

Multiple issues may exist, but identifying any single one and acting on it can prevent accidents. This model still tended to say that humans were responsible for 80% of accidents via unsafe actions, and that the remaining 20% were caused by unsafe conditions.

Energy and Barriers models

Starting in the 60s and early 70s, the fancier sequential models are based on energy releases and barriers. There's a potential hazard somewhere that can lead to uncontrolled releases, and what is required is to put space, separations, or barriers in place to prevent issues.

These resulted in what a lot of classical factory layouts with dangerous chemicals, or nuclear power plants, may look like: various containment mechanisms, ways to cut off flows, and physical (or procedural) barriers between multiple risk factors. These models are still very useful to this day.

The major weakness all sequential models share, however, is that linear explanations have their limits. Complex systems have many things happening in parallel, and systemic approaches were eventually needed, because the linear models alone could no longer predict or help prevent other sorts of failures.

They nevertheless remain very sticky to this day, because they satisfy the human need for a causal explanation, especially a simple one that can be graphically represented and follows a linear timeline.

Man-Made Disasters

In 1978, Barry Turner instead came up with the idea that some of these hazards accumulate over time in a system, and that organizational decisions let them slowly grow and grow until conditions change. Early signals are ignored, organizations drift, and incidents happen.

This is one of the earliest models to frame incidents as socio-technical in nature, and as an emergent property of systems.

Normal Accident Theory

In 1984, following the Three Mile Island incident, Charles Perrow came up with Normal Accident Theory.

Perrow's conclusions from the incident were that some distributed systems require tight coupling and complex interactions to work and to control highly dynamic and complex situations. Tight coupling implies that disturbances propagate rapidly, and complex interactions mean that surprises are bound to happen.

Unfortunately, these properties put together result in accidents being something we need to consider "normal": an inevitable consequence of these systems existing and operating. The model guides us towards reducing coupling or complexity, but sometimes the task at hand does not allow this. Admitting that not all incidents can be prevented and that some are normal was a huge change in attitude.

It also represents a break because it abandons many causal explanations, becomes non-linear, and holds that systems can't be decomposed into individual parts: well-functioning interacting parts can nevertheless cause unexpected incidents.

Latent Failure Model

James Reason's work opened the door to a model that takes a sort of statistical approach: there are 'latent conditions' that stay dormant in a system and combine with one another. Once a trigger activates them (unsafe acts, wear and tear), they can overcome the system's defences and cause accidents.

This is generally known as the swiss cheese model, but I'm avoiding the term here because the swiss cheese model is one of these things where people saw a diagram and ran with it. A lot of swiss cheese models out there are just the domino model with new graphics.

Namely, Reason's latent failure model is meant to be non-linear and somewhat probabilistic. It is often used in epidemiological approaches (large population-based studies), such as around traffic and road safety: there are more risk factors than can be removed, but by studying elements common to many incidents, we can make high-level interventions that overall help prevent incidents.

One example of this could be the idea that SUVs are bigger and cause more pedestrian deaths. Reducing car sizes can reduce overall injuries and deaths in the system (mostly among pedestrians), despite the larger car often feeling safer to its driver.

Drift Model

The drift model came up later as a sort of refinement over Man-Made Disasters, other points made by Reason, and 20+ years of research.

In the drift model, competing pressures force people to operate closer and closer to the incident boundary (which is fuzzy and unknown until an incident happens). Systemic effects can therefore drive an organization towards risk and failure.

There is therefore an ongoing balance between the sharp end (practitioners) and the blunt end (policies, management, etc.) in moving the needle of practice.

This model is probably where I'd put terms you might have heard, such as "normalization of deviance." Note, however, that people tend to assume normalization of deviance means people breaking rules and becoming careless. In practice, safety practitioners and researchers avoid making the judgment that rules and procedures are right by default. In fact, what gets called normalization of deviance is often normal variation and adaptation to pressures.

The risk isn't in deviating from the procedures, but in the gap between how work is actually done and how people imagine it (and prescribe it) to be done. The wider the gap, the more you risk losing sight of how things work and operating on false pretenses.

Resilience Engineering

Resilience Engineering is a sort of newer approach, which states that variation and improvisation are actually frequent, daily, critical parts of running complex systems. Approaches that aim to remove all variation and risk also end up removing adaptive capacity, and may reduce the ability of practitioners to keep things working properly.

Resilience Engineering takes a strong focus on what usually goes right in order to find sources of adaptive capacity, and to reinforce these adaptive patterns within organizations. It favours deep dives and analyses of normal work to find the best points of intervention to give people more capacity.

A final note

All in all, these models aren't a matter of one being right and the others being wrong. I'm also missing a ton of models and variations of models. Some are far more dated than others (e.g., domino theory) and have fewer contexts in which they can be usefully applied.

Generally, the trend has been to go from simpler, linear causal models towards more complex and systemic perspectives, where people and components of socio-technical systems have to cope with emergent complexity in ways that cannot be removed from their context.

Explanations that rely on simple cause and effect will tend to focus on specific trees and miss critical ecosystemic interactions.