Picture by Anant S.
We’ve all had embarrassing moments in our careers when we inadvertently wreaked havoc on a production system. When it happens, for a second you (so desperately) want to believe it didn’t. You are afraid even to cross-check that it actually happened.
GitHub went through an outage yesterday, and Chris was brave enough to reveal how it happened. The Hacker News post about it then generated a good buzz around the subject. While reading the comments on both threads, I handpicked a few interesting stories about production mishaps. Here they are:
seldo: My worst was discovering I had written a unique ID generator which (due to me typing “==” instead of “!=”) was producing duplicate IDs – and not only that, it was producing them at exponentially increasing rates – and every duplicate ID was destroying an association in our database, making it unclear what records belonged to whom.
pixdamix: Mine was for a French social networking site 4 years ago. They used to send mails every day to say “hey, look at the people you might know”. The links in the mail would automatically log the user in to the website. When I sent the code live, it took 2 days (and more than 50000 mails) to find out that when I sent a mail to person Z about person Y, the link logged Z in ON Y’s account.
SkyMarshal: I sent a test email to thousands of customers in your prod database encouraging them to use web check-in for their non-existent flight tomorrow. Yeah, did that five years ago, talk about heart-attack-inducing. Quickly remedied by sending a second email to the same test set, thankfully, but that’s the kind of mistake you never forget.
I would love to hear about your production mishaps, if any :).