2022/11/23

Hiding Theory in Practice

I'm a self-labeled incident nerd. I very much enjoy reading books and papers about incidents, I hang out with other incident nerds, and I always look for ways to connect the theory I learn about with the events I see at work and in everyday life. As it happens, studying incidents tends to put you in close proximity to systems in various states of failure, which also tends to elicit all sorts of negative reactions from the people around them.

The sensitive nature of that work makes it perhaps unsurprising that incident investigation and review facilitation come with a large number of concepts and practices you are told to avoid because they are considered counterproductive. A tricky question I want to discuss in this post is how to deal with them when you see them come up.

A small sample of these undesirable concepts includes things such as:

- Root cause: the idea that a single underlying cause can explain the whole incident
- Human error: treating an operator's mistake as a satisfying end point for the analysis
- Counterfactuals: reasoning about the incident that could have happened ("everything would have been fine if...") rather than the one that did
- Normative judgments: stating what people should have done, measured against an idealized view of the work
- Blame: pinning the failure on specific individuals

There are more concepts than these, and each could be a post of its own. I've chosen this list because each item is an extremely common reaction, so intuitive that it feels self-evident to the people using it. Avoiding these requires a kind of unlearning: you have to let go of the usual framing you'd use to interpret events, and then gradually learn to reconstruct them differently.

This is challenging, and while you and other self-labeled incident nerds can extensively discuss and debate these ideas as peers, it is not something you can reasonably expect others to go through in a natural post-incident setting. Most of the people you interact with will never care about the theory as much as you do, almost by definition, since you're likely to represent the expertise on these topics for the whole organization.

In short, you need to figure out how to act in a way that is coherent with the theory you hold as an inspiration, while being flexible enough to avoid causing friction with others or requiring them to know everything you know for your own work to be effective.

Let's imagine that you're the investigator or facilitator, and a technical expert on the team comes to you during the investigation (before the review) and says: "I don't get why the on-call engineer couldn't find the root cause right away, since it was so obvious. All they had to do was follow the runbook and everything would have been fine!"

There are going to be times when it's okay to let comments like these go, to avoid doing a deep dive at every opportunity. In the context of a review based on thematic analysis, the themes you are focusing on should help direct where you put your energy, and guide you in figuring out whether emotionally charged comments are relevant or not.

But let's assume they are relevant to your themes, or that you're still trying to figure them out. Here are two reactions you can have, which may come up as easy solutions but are not very constructive:

- Ignoring the comment, treating it as biased noise that has no place in a proper investigation
- Correcting the person on the spot, lecturing them on why root causes, counterfactuals, or normative judgments are considered counterproductive

In either case, if behavior that clashes with theoretical ideals is not welcomed, the end result is that you lose precious data, either by omission or by making participants feel less comfortable talking to you.

Strong emotional reactions are data as good as any architecture diagram for your work. They can highlight significant dynamics in your organization. Ignoring them means ignoring potentially useful data, and it may damage the trust people put in you.

The approach I find more useful is one where the theoretical points you know and appreciate guide your actions. That statement is full of amazing hooks to grab onto:

- Why was the root cause "obvious" to this expert but not to the on-call engineer? The difference hints at how knowledge and context are distributed across the team.
- "Everything would have been fine" describes an imagined incident, not the real one; what made the runbook hard to apply as events actually unfolded?
- The frustration itself is a signal: it can reveal the expectations and pressures people operate under in your organization.

None of these requires interrupting or policing what the interviewee is telling you. The incident investigation itself becomes a place where various viewpoints are shared. The review should then be a place where everyone can broaden their understanding, and can form their own insights about how the socio-technical system works. Welcome the data, use it as a foothold for more discoveries.

If you do bring that testimony to the review (on top of having used it to inform the investigation), make sure you frame it in a way that feels safe and unsurprising for all participants involved. Respect the trust they've put in you.

How to do this, it turns out, is not something about which I have seen a lot of easily applicable theory. It's just hard. If I had to guess, I'd say a huge part of it is tacit knowledge, which means you probably shouldn't wait on theory to learn how to do it; it's way too contextual and specific to your situation. If this is indeed the case, theory can be, at most, a fuzzy guideline for you, not a clear instruction set.

This is how I believe theory is most applicable: as a hidden guide you use to choose which paths to take and which actions to prefer. There's a huge gap between idealized, higher-level models and the mess (or richness) of the real-world situations you'll be in. Navigating that gap is a skill you'll develop over time. Theory does not need to be complete to provide practical insights for problem resolution. It is more useful as a personal north star than as a map: others don't need to see it, and your work can succeed without ever exposing it.

Thanks to Clint Byrum for reviewing this text.