Dan Talks About Post-Mortems
Hello, Dan here. So: the Hut 8 Labs team is very excited to be firing up our blog. But, before we get down to new business, I wanted to post some links to other places I’ve written or talked. Of late, a bunch of that writing and talking has been about how to run effective post-mortems.
I’ve actually come to believe that, for many startups, spending a chunk of time improving how they approach post-mortems (and learning from failure more generally) has an incredible economic return. I suspect it’s one of the most profitable things they can do with their (incredibly scarce) time.
Why? Because the sort of default way groups of human beings respond to failures is with shame… and an attendant desire to quickly move on and pretend it never happened. Thus, that’s how most startups respond to multi-hour outages, or embarrassing bugs showing up in front of important early customers, or the like.
And if that’s what your team does after experiencing some nasty failure, you’re basically guaranteed to be missing simple, cheap, incredibly valuable improvements. It can be helpful to flip this around, and imagine those improvements not as “avoiding bad things”, but rather “making you piles of money” (aka, having strongly positive economic returns). Imagine there’s a big class of customers waiting to buy your product, but you’ve got a team-wide mental block which prevents everyone from seeing them. Improving how you run post-mortems is like discovering those customers are lined up outside your door, waiting to get in.
(I am not, of course, suggesting that making failures or outages go away is somehow simple and cheap. What I’m suggesting is that there are incremental improvements with outsized value, and post-mortems can help you find them. If you’re thinking “But early customers don’t care that much about outages”, you’re totally right: the big economic win comes not from avoiding showing bugs to customers, but from decreasing the frequency of fire drills for your team, which have an outsized opportunity cost.)
Well-run post-mortems can also serve as a very important release valve — again, because of the default response of shame. Unless there’s a structure to deal with failures, people tend to slip into very damaging patterns — searching for someone to blame, inserting slow-moving layers of review, etc.
Most recently, I gave a talk touching on a bunch of this at the Lean Startup Conference. The slides are up here:
How To Run a 5 Whys (With Humans, Not Robots)
You can also watch a 12-minute video of the talk (which has the added benefit of documenting for future-me that, in late 2012, I briefly experimented with a mustache).
Also, a ways earlier, I wrote up a blog post on my experiences running post-mortems at HubSpot:
Hope you enjoy, and do check back for more. As a teaser: I’ve been engaged in a very interesting post-mortem-themed email exchange with one John Allspaw (who will tell pretty much anyone who asks that he has some very serious concerns about the 5 Whys approach). I’ve promised to write up my take on that discussion, tentatively titled 5 Whys Baaaad, 5 Whys Gooood, aka “All The Things That Are Wrong With 5 Whys And Why I Think They’re Awesome Anyways”.
Assuming I actually get that written, it should be fun.