DevOps: Firefighters of the Technology World
November 23, 2015 8:00 am
I still remember the night I got the call. It was quite possibly the most perfect evening I had experienced in a while. The only sound I could hear was the humming of my desktop as it finished downloading the last gigabyte of Fallout 4.
It had been a crazy two months at the office as we pushed through sprint after sprint to complete the Big Project. You know the one. The one that always gets higher priority even though the damn mobile page still loads like it’s 95 during rush hour.
Kids were asleep. Dog had finally tired itself out from barking at invisible intruders. I was tired, but not exhausted. I still had a couple hours left in me with the world all to myself. Like I said, perfect.
“Greg? You there? Listen, Greg. We’ve got a problem.”
That voice; it was not what I wanted to hear at the start of my perfect evening. That was not the voice of the Fallout 4 NPCs telling me to get into my cryo chamber and prepare for post-apocalyptic Boston. I didn’t even try to hide my annoyance.
“A problem? Yeah, you bet we do. It’s 11:30, Larry. This had better be serious. What’s the problem? We had everything tested and ready to go at eight when I left so what could possibly-”
“It’s a code 4. Full scale.”
I nearly dropped my phone. ‘Code 4’ was the term one of our senior programmers had come up with for the worst possible scenario (a crash on the eve of launch day, for instance), because the number 4 in Chinese sounds the same as the word for death. It seemed appropriate. At that moment, though, I regretted the ominous association.
“Not sure. But it happened at least fifteen minutes ago when I got the call. It could have been longer than that. We have a couple of the team already on their way but you’re the project manager, Greg.”
I sighed. I looked at my Steam library. “Ready to Play” taunted me, as did the full glass of beer. Guess perfect evenings are even rarer than smooth launches.
“I’m on my way.”
I had been in this position before. Everything seems to be going smoothly and then bam! A hidden bug we didn’t catch. A scenario we didn’t test. Something that creeps up and wreaks havoc, causing more sleepless nights than I care to remember.
I wasn’t going to let that happen this time. This time, I would get backup. Before I even got to the office, I opened the Experts Exchange app on my phone. I knew someone would be up in another time zone. The hardcore guys have alerts set when new, frantic calls for help come into topics that they follow. I wasn’t sure what I was going to be up against exactly, but I wanted to make sure I had the extra ammo I would need.
Sure enough, the worst had happened. A full-on site crash. Everything down. No outside suspects to worry about. It was something internal.
“What do we got, guys? Who’s checked the logs? Is someone working on a hot patch?”
I didn’t mean to bark orders, but all this rested on me if it failed. Once we found the source of the problem, I could get my virtual backup team to help. On Experts Exchange, I had talked before with a guy who worked in the UK as a DevOps release manager.
Sure enough, he was on. I picked his brain, then he asked a couple other DevOps experts to double check the code I was working on for the patch. Suddenly, I was starting to feel better. I was beating this. We wouldn’t be here all night. Far from it.
Suddenly, it was done. The fire was out. I looked up at the clock. 1:30 AM, launch day. Not bad at all. My phone buzzed in my pocket. It could only be one person. The CIO.
“Greg? What’s going down there? Give me a report.”
“We figured it out. We’re up and running smoothly now.”
Immediately, his voice relaxed.
“Good, good. We were only down what — a couple hours?”
“About that, yeah. It was just me and Andy here from the team.”
“Thank you, Greg. You were our heroes tonight.”
Hero, huh? I could get used to that.