Monday, June 7, 2010

All failed deployments are anachronisms.

Your code doesn't care what day it was released on. If a deployment causes an extended outage or degradation, the code is always in the wrong place at the wrong time. Here are all the wrong days to deploy a broken release:

  1. Monday
  2. Tuesday
  3. Wednesday
  4. Thursday
  5. Friday
  6. Saturday
  7. Sunday
and the times:
  1. One O'Clock
  2. Two O'Clock
  3. ...
and obviously a bunch of holidays and unrelated business events (Black Friday, sales deadlines, etc.) that you also shouldn't deploy on.

However, if you really believe this, you should stop writing, managing or using software NOW! Unfortunately, you will deploy broken code, because:
"I’m sorry to say so but, sadly, it’s true that Bang-ups and Hang-ups can happen to you."
- Dr. Seuss (Oh, the Places You'll Go!)
And when you do deploy that broken code, the failure probably has nothing to do with when it was deployed. In fact, probably no one would have noticed if you had done just one thing...

ROLLBACK!
Three rules:
  1. Code must always be able to be rolled back.
  2. Rollback must be a single command.
  3. The rules for rollback must be simple, easy-to-follow and aggressive (e.g. a customer call related to an issue with the release, an exception related to the release, etc.)... then, just...
ROLLBACK!
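
Here is a rough sketch of what those three rules could look like in code. Everything in it is made up for illustration: the `Deployer` class, the `handle_signal` function and the trigger names are assumptions, not a real tool, and a real rollback would swap binaries or repoint a load balancer instead of popping a list.

```python
from dataclasses import dataclass, field


@dataclass
class Deployer:
    """Hypothetical deployer that remembers every version it has released."""
    history: list = field(default_factory=list)

    @property
    def current(self):
        # The live version is the most recent entry, if any.
        return self.history[-1] if self.history else None

    def deploy(self, version):
        # Rule 1: keep the previous version around so it can be restored.
        self.history.append(version)

    def rollback(self):
        # Rule 2: a single call with no arguments restores the previous version.
        if len(self.history) < 2:
            raise RuntimeError("nothing to roll back to")
        return self.history.pop()  # returns the version that was removed


# Rule 3: simple, aggressive triggers (assumed names for illustration).
ROLLBACK_TRIGGERS = {"customer_call", "exception"}


def handle_signal(deployer, signal, related_to_release):
    """Roll back immediately if a trigger is tied to the current release."""
    if related_to_release and signal in ROLLBACK_TRIGGERS:
        deployer.rollback()
        return True
    return False
```

The point of the sketch is that nobody debates or debugs before rolling back: one signal related to the release is enough, and the investigation happens afterwards.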

Then, figure out what went wrong and how to prevent it from happening again... rinse and repeat... any day... any time.

2 comments:

  1. Isn't our philosophy designed for roll forward?

  2. nah, we need to be able to rollback, turn features off, etc. you can only have a roll forward policy if you can guarantee that the fixes will take less time than the rollback and won't introduce new defects. since even you can't guarantee that... we need to be able to rollback.
