Not too long ago, I had a few rough days in support of a client project. The client had a big content release, complete with a media embargo and the like. I woke up on the day of the launch, and things were bad. I was staring straight into a wall of red.
Thanks to the intrinsic complexity of software engineering, these situations happen—I’ve been through them before, and I’ll certainly be through them again. While the particulars change, there are two guiding principles I rely on when I find myself looking up that hopelessly tall cliff of red.
You can’t be at the top of your game while stressed and nervous about the emergency, so unless there’s an obvious, quick-to-deploy resolution, you need to give yourself some cover to work.
What that means will be unique to every situation, but as strange as it may sound, don’t dive into work on the be-all and end-all solution right off the bat. Take a few minutes to find a way to provide a bit of breathing room for you to build and implement the long-term solution in a stable, future-friendly way.
Ideally, the cover you’re providing shouldn’t affect the users too much. Consider beefing up your caching policies to lighten the load on your servers as much as possible. If there’s any functionality that is particularly taxing on your hardware and isn’t mission critical, disable it temporarily. Even if keeping the servers alive means pressing a button every 108 minutes like you’re Desmond from Lost, do it.
After you’ve got some cover, work the problem slowly and deliberately. Think solutions through two or three times to be sure they’re the right course of action.
With the pressure eased, you don’t have to rush through a cycle of building, deploying, and testing potential fixes. Rushing leads to oversight of important details, and typically, that cycle ends the first time a change fixes (or seemingly fixes) the issue, which can lead to sloppy code and weak foundations for the future.
If the environment doesn’t allow you to ease the pressure enough to work slowly, go ahead and cycle your way to a hacky solution. But don’t forget to come back and work the root issue, or else temporary fixes will pile up and eat away at your system’s architecture like a swarm of termites.
Emergencies often require more thought and planning than everyday development, so be sure to give yourself the necessary time. Reactions alone may patch an issue, but thoughtfulness can solve it.