Stress-free production deployment


More frequent smaller releases help to manage the risk of deployment problems.

But they don’t reduce the risk of data migrations failing because production data is in an unexpected state.

I can’t think of a more stressful deployment moment than half-run migrations that failed in the middle. It’s usually easy to roll back the code, but the data will be in a half-migrated state that may not work with either the old or the new version of the code. If the app stayed up during deployment then rolling back the data may mean losing recent changes. Database-level constraints can help but there will be times when they’re missing accidentally.

Automated test deployment with real production data

Our continuous integration server (actually a hosted service; we use Semaphore) does this for us. Every time we change the master branch, and as long as the tests pass, it automatically deploys to a staging environment that is an (almost) exact copy of production, including a snapshot of the data.

Here’s how to do it for a Rails app on Heroku using Postgres:

  1. Set up automatic Postgres database snapshots for your production app: heroku addons:add pgbackups:auto-month

  2. Set up automatic deployment with Semaphore.

  3. Configure Semaphore to reset the database to the latest production data snapshot on every deployment. (Note that it’s just a daily snapshot. We could take a new one here, but we’re trading a bit of freshness for efficiency.)

The deployment commands for Semaphore:

git push --force heroku $BRANCH_NAME:master
heroku maintenance:on
heroku pgbackups:restore HEROKU_POSTGRESQL_<colour> `heroku pgbackups:url --app <production app name>` --confirm <staging app name>
heroku run rake db:migrate
heroku ps:restart
heroku maintenance:off

Replace <colour>, <production app name>, and <staging app name>. You can get the colour (part of the db name) by running heroku config --app <staging app name>. You’ll see something like HEROKU_POSTGRESQL_IVORY_URL. Use that, but without the _URL suffix.
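If you’d rather not read the colour off by eye, here is a minimal Ruby sketch that picks the database name out of the heroku config output. The helper name heroku_postgres_names is made up for illustration, and it assumes each config line starts with the variable name, as in the example above.

```ruby
# Hypothetical helper: find database names in `heroku config` output.
# Heroku lists the database as HEROKU_POSTGRESQL_<COLOUR>_URL, while
# pgbackups:restore wants the same name without the _URL suffix.
def heroku_postgres_names(config_output)
  config_output.scan(/^\s*(HEROKU_POSTGRESQL_[A-Z]+)_URL\b/).flatten.uniq
end

# Usage sketch (paste in the text printed by heroku config):
#   heroku_postgres_names(config_text)
```

If the staging app has more than one database attached, the list will contain more than one name and you’ll need to pick the right one yourself.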

Beware of external services

So now you have real production data on your staging server. Just one more thing: before you go entering “asdf” and “LOL” all over your staging server make sure it’s not going to email real people, send them iOS or Android notifications, post to Twitter or Facebook, etc.

If you’re using Rails, all API keys for external services should be defined in /config/environments/{production,staging}.rb instead of in initializers which are shared by all environments.

Email: Make sure that your app won’t send email to anyone except specific domains. For Rails you can use safety_mailer. Consider sanitizing the data (see below).
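To show the idea behind restricting recipients, here is a rough sketch of a mail interceptor in plain Ruby. This is not how safety_mailer is implemented; the class name and ALLOWED_DOMAINS are placeholders. It relies on the Action Mailer interceptor hook delivering_email and on the message’s perform_deliveries flag.

```ruby
# Sketch of a mail interceptor that only lets mail through to allowed domains.
# The class name and ALLOWED_DOMAINS are placeholders for illustration.
class StagingMailInterceptor
  ALLOWED_DOMAINS = ["yourdomain.com"].freeze

  # Action Mailer calls this hook on registered interceptors before delivery.
  def self.delivering_email(message)
    recipients = Array(message.to) + Array(message.cc) + Array(message.bcc)
    allowed = !recipients.empty? && recipients.all? { |address| allowed_domain?(address) }
    # Block the whole message if any recipient is outside the allowed domains.
    message.perform_deliveries = false unless allowed
  end

  def self.allowed_domain?(address)
    ALLOWED_DOMAINS.include?(address.to_s.split("@").last.to_s.downcase)
  end
end

# In a Rails initializer this could be registered for staging only:
#   ActionMailer::Base.register_interceptor(StagingMailInterceptor) if Rails.env.staging?
```

Blocking the whole message when any one recipient is outside the list is the cautious choice; stripping only the disallowed recipients would be an alternative.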

Facebook/Twitter: Don’t use the same API keys to connect to services like Facebook and Twitter. Set up new “apps” in those services and make sure your staging environment uses those instead.

S3/AWS: Use different S3 buckets at least, if not completely different credentials. You don’t want to be changing your production S3 data from staging.

iOS/Android notifications: You can wrap calls to those services with your own internal API that checks the environment and discards messages that aren’t to an allowed set of users. You might want to also sanitize your data after importing it to be safe.
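A wrapper like that might look roughly like the sketch below. Everything here is hypothetical: NotificationGate is an invented name, and sender stands for whatever push client your app uses, assumed to expose a deliver(user_id, text) method.

```ruby
# Sketch of an internal wrapper around a push-notification client.
# `sender` is any object with a deliver(user_id, text) method (an assumption),
# and the allowed user ids are placeholders.
class NotificationGate
  def initialize(sender:, environment:, allowed_user_ids: [])
    @sender = sender
    @environment = environment.to_s
    @allowed_user_ids = allowed_user_ids
  end

  # Returns true if the message was handed to the sender, false if discarded.
  def deliver(user_id, text)
    # Outside production, only users on the allow list receive anything.
    unless @environment == "production" || @allowed_user_ids.include?(user_id)
      return false
    end
    @sender.deliver(user_id, text)
    true
  end
end

# Usage sketch in a Rails app (PushClient is hypothetical):
#   gate = NotificationGate.new(sender: PushClient.new, environment: Rails.env, allowed_user_ids: [1, 2])
#   gate.deliver(user.id, "Hello")
```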

Sanitize your data after importing: You can add a step in your Semaphore deployment process to replace user data like email addresses, iOS device tokens, etc. Right after pgbackups:restore you could run something custom that you’ve created which does this, for example heroku run rake myapp:sanitize. You might want to make sure any background workers are stopped first so that nothing bad happens before this finishes.

For example:

UPDATE users SET device_token = 'XXX'

Here’s an example of how to update all email addresses to test+ID@yourdomain.com, where ID is the user’s actual id:

UPDATE users SET email = CONCAT('test+', id, '@yourdomain.com')

Bonus: if your email service supports plus addressing (most do), you can create the account test@yourdomain.com and it will receive any mail that’s delivered. This account will probably get too much mail, so you might still want to combine this with safety_mailer.
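The two UPDATE statements could sit behind the myapp:sanitize task along these lines. This is only a sketch: StagingSanitizer is an invented name, and connection is assumed to be anything with an execute(sql) method, which in a Rails rake task would be ActiveRecord::Base.connection. The production guard is an extra precaution added here, not something the deployment commands require.

```ruby
# Sketch of the sanitize step. `connection` is anything with an execute(sql)
# method; in a Rails rake task that would be ActiveRecord::Base.connection.
module StagingSanitizer
  STATEMENTS = [
    "UPDATE users SET device_token = 'XXX'",
    "UPDATE users SET email = CONCAT('test+', id, '@yourdomain.com')"
  ].freeze

  def self.run(connection, environment:)
    # Guard: never rewrite user data on production.
    raise "Refusing to sanitize production data" if environment.to_s == "production"
    STATEMENTS.each { |sql| connection.execute(sql) }
  end
end

# Inside lib/tasks/myapp.rake this might be called as:
#   StagingSanitizer.run(ActiveRecord::Base.connection, environment: Rails.env)
```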

External data storage, e.g. S3

The deployment above only restores your Postgres data, but you may have other data sources like S3 and, as recommended above, you’re not using the same credentials (or at least not the same S3 bucket).

If you want to avoid things like broken user profile pictures you’ll need to reset these too. We do S3 manually, only occasionally. You’ll probably find that you usually test with the same users or set of data in which case it’s not going to be an issue. But I’ll share a script one day when I get around to automating this.

Release early, release often

Finally, the other important thing for a stress-free deploy is knowing that the changes weren’t too huge. Release every time you merge a new feature branch to master, and if something does break at least you’ll have less to fix and you’ll probably know exactly what to do right away.
