In today’s landscape of web service deployment, it’s pretty clear that everyone has moved to the cloud. You’re either using a public cloud, a private cloud or are currently frustrated with your infrastructure. Containers have lately started to gain traction beyond development environments as well, and there are multiple efforts to take them into production.
But there’s a secret that none of the cloud providers tell you up front: if you want to confidently run your production-grade web service on a cloud, it’s expected to be immutable. Randy Bias popularized the analogy of pets vs. cattle to represent this well. Traditional servers are treated as pets: when they get sick you nurse them back to health, they have their own quirky personalities, and you know them by name (not always in a good way!). Cattle, on the other hand, all look the same, and the harsh reality of farms is that if one gets sick you just get rid of it and move on, because nursing it back to health costs more.
While clouds provide just enough features to let you carry over the practice of treating your servers as pets, to follow the analogy, you end up with a pet goldfish: it’s boring, you might be able to cure it if it gets sick but vets will usually laugh at you if you try, and it’ll often be unresponsive and may die at any time without warning.
Pros and cons of clouds
Clouds provide a wide range of features, but at the core of the offering you’re able to access what used to be prohibitively expensive:
- Affordable worldwide redundancy
- Fast delivery to users by serving content geographically close to them
- On-demand resources to always have a responsive service
- Pay for what you need at each minute of the day, not what you use at peak demand
- Experiment with new technologies and topologies without up-front investments
Of course, everything comes at a cost. In order to provide on-demand infrastructure you need to forgo guarantees about reliability:
- Your instance may disappear at any time without prior notice
- It might reboot unexpectedly and unreproducibly
- A new instance might not boot at all
- It might behave erratically at random times: draining CPU, performing slow disk I/O or suffering random network lag
In that context immutable services are a perfect fit:
- You can predictably deploy exactly the same bytes everywhere in the world
- There’s very little preparation (and often, none) to launch dozens, hundreds or even thousands of new instances driven by demand
- Automatic error recovery is built into your deployments: misbehaving instances can be automatically killed and new ones brought up. Debugging can happen whenever convenient instead of in the middle of an outage.
- Because images don’t have to be transformed after boot, they will generally either work or they won’t, making it easy to automate deployments
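The automatic recovery described above can be pictured as a small reconciliation loop. The following is only a sketch: `Instance`, `launch` and `reconcile` are hypothetical names, and `launch` stands in for whatever API your cloud provider actually offers.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Instance:
    """Hypothetical record describing one running server."""
    instance_id: str
    image: str
    healthy: bool = True


_ids = count(1)


def launch(image: str) -> Instance:
    """Stand-in for a cloud API call that boots a fresh instance from an image."""
    return Instance(instance_id=f"i-{next(_ids)}", image=image)


def reconcile(fleet: list[Instance], image: str, desired: int) -> list[Instance]:
    """Keep healthy instances of `image`, replace everything else with new ones."""
    # Sick instances are never repaired in place: they are simply dropped.
    survivors = [i for i in fleet if i.healthy and i.image == image][:desired]
    # Every missing slot is filled by booting the exact same image again.
    replacements = [launch(image) for _ in range(desired - len(survivors))]
    return survivors + replacements
```

Because every instance boots from the same image, a replacement is indistinguishable from the instance it replaces, which is what makes this loop safe to run unattended.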
There are many benefits to deploying your services immutably beyond working well with clouds:
- Reduce complexity
  - 3 services x 4 servers each = 12 pets. That’s a lot of pets to feed!
  - No bytes to move around, no bytes to transform. Deployments are as fast as you can bring up a server
- Safe Upgrades & Downgrades
  - If an upgrade fails, you can continue using the old version with no cost or disruption. An upgrade is the same as a downgrade, so you can roll back if needed
- The same code returns the same output on everyone’s laptop, in every CI run, on every server. There is no uncertainty as to what combination of code and configuration is being used
- Know exactly what’s on every server at any point in time: audit once, run everywhere
- Disaster recovery
  - In order to create immutable infrastructure, you need end-to-end automation of deployments. When (not if!) your infrastructure collapses you can recover in minutes, not hours or days
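One way to make "no uncertainty as to what combination of code and configuration is being used" concrete is to fingerprint the deployed tree. This is a minimal sketch, assuming the service lives in a single directory; `fingerprint` is a hypothetical helper, not part of any particular tool.

```python
import hashlib
from pathlib import Path


def fingerprint(root: Path) -> str:
    """Return a SHA-256 digest over the relative paths and contents under root."""
    digest = hashlib.sha256()
    # Sort by relative path so the result does not depend on directory order.
    files = sorted(
        (p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file()
    )
    for rel, path in files:
        name = rel.encode("utf-8")
        data = path.read_bytes()
        # Length prefixes keep (name, data) boundaries unambiguous.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two servers that report the same fingerprint are running the same bytes, so an audit of one of them covers both.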
How to get there
Creating immutable services isn’t always easy, especially if you already have a project that’s been around for a while and was built on common assumptions, like being able to write anywhere on the filesystem or overwrite existing files for fast in-place upgrades. However, you can move towards an immutable service in steps. Each step brings tangible benefits, so even if it takes you a while to walk up that ladder it’s still worth it. Let’s explore what walking up the immutable ladder looks like.
Step 0: No in-place upgrades
- Immutable deployments require you to stop doing in-place upgrades
- Choose the fastest path to get there (snapshot your current VM? Tweak your config management software to be able to fully configure a blank image?)
- Will require a bit of work at the DNS level, in load balancers, or both
- Even if you don’t go further than this, you are in a better state
Am I immutable yet? No.
What can fail? System packages can fail to download (or worse, download slowly!), configuration can fail, and versions can skew unexpectedly
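Replacing in-place upgrades with a swap at the load balancer could look roughly like the sketch below. `LoadBalancer` is a hypothetical in-memory stand-in for a real balancer or DNS record, and `is_healthy` is whatever health check you already have.

```python
from typing import Callable, Iterable


class LoadBalancer:
    """Hypothetical stand-in for a real load balancer or DNS record."""

    def __init__(self, backends: Iterable[str]):
        self.backends = tuple(backends)


def deploy(
    lb: LoadBalancer,
    new_backends: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> tuple:
    """Switch traffic to freshly built servers; return the old ones to retire.

    If any new server fails its health check, nothing changes and an
    empty tuple is returned, so the old version keeps serving traffic.
    """
    candidates = tuple(new_backends)
    if not candidates or not all(is_healthy(b) for b in candidates):
        return ()
    retired = lb.backends
    lb.backends = candidates  # the only mutation: where traffic points
    return retired
```

The old servers are never modified, so rolling back is the same call with the arguments reversed.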
Step 1: Isolate from the OS
- Start by moving all your dependencies out of system directories
- Languages & frameworks tend to play nicely here (Python’s virtualenv, Ruby’s Bundler and Gemfiles, PHP’s Composer, etc.)
- Compiled dependencies might be a bit trickier
- Move system-level data, such as logs, off the instance (rsyslog!)
Am I immutable yet? Still no.
What can fail? Dependencies can fail to download or download slowly, and you need to keep a tight grip on versions
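For a Python service, moving dependencies out of system directories can start with a virtual environment that lives next to the application. This sketch uses the standard library’s `venv` module; `build_isolated_env` and the `.venv` directory name are assumptions, and installing the actual dependencies into it is left to a later build step.

```python
import venv
from pathlib import Path


def build_isolated_env(app_dir: Path) -> Path:
    """Create a virtual environment inside the application directory."""
    env_dir = app_dir / ".venv"
    venv.create(
        env_dir,
        system_site_packages=False,  # do not fall back to OS-level packages
        clear=True,                  # always start from an empty environment
        with_pip=False,              # assumption: packages are installed in a later step
    )
    return env_dir
```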
Step 2: Isolate from the Runtime
- Runtimes don’t change that often
- You don’t always want to move forward
- Small configuration differences (an 8 MB vs 64 MB memory_limit) can cause mind-boggling problems
- Generally still plays nicely with being made immutable
Am I immutable yet? Uhm, hrm, not quite.
What can fail? Downloading and configuring can fail, you may end up downloading half of the internet (and not necessarily the good parts), and you need a tight grip on versions
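Small runtime configuration differences are easier to catch when they are compared explicitly. Here is a minimal sketch, assuming each environment’s runtime settings can be read into a plain dictionary; `config_drift` is a hypothetical helper.

```python
from typing import Dict, Optional, Tuple


def config_drift(
    expected: Dict[str, str], actual: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return every setting whose value differs, as (expected, actual) pairs.

    A setting missing on one side is reported with None for that side.
    """
    keys = sorted(expected.keys() | actual.keys())
    return {
        key: (expected.get(key), actual.get(key))
        for key in keys
        if expected.get(key) != actual.get(key)
    }
```

For example, comparing `{"memory_limit": "64M"}` against `{"memory_limit": "8M"}` reports that single setting with both values, which is the kind of difference that is painful to find by hand.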
Step 3: Isolate from the Framework
- Frameworks change a bit more often
- Again, you don’t always want to move forward
- Usually have more security issues (26 Python CVEs vs 48 Django CVEs)
- Starts to fight back on being made immutable (primarily in the separation between app code and framework)
Am I immutable yet? Getting close! Hard to distinguish now.
What can fail? It is often slow, can run out of disk, and frequently requires IO-intensive processes
Step 4: Immutability achieved!
- This is what most immutable services will look like
- You don’t need to re-build your service from scratch to get here, but it might take some extra effort to make sure it’s only writing in specific places
- Allows you to use auto-scale solutions
Am I immutable yet? YES! WINNER!
What can fail? Can’t always be accessed concurrently, race condition with service start up
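Making sure the service only writes in specific places can be enforced with a small guard around file writes. This is a sketch under assumptions: the writable roots are illustrative, and `is_write_allowed` is a hypothetical helper that only inspects the path itself (it does not resolve symlinks).

```python
from pathlib import PurePosixPath
from typing import Iterable


def is_write_allowed(target: str, writable_roots: Iterable[str]) -> bool:
    """Return True only if target lies inside one of the writable roots."""
    path = PurePosixPath(target)
    # Reject relative paths and any attempt to climb out with "..".
    if not path.is_absolute() or ".." in path.parts:
        return False
    for root in map(PurePosixPath, writable_roots):
        if path == root or root in path.parents:
            return True
    return False
```

Calling this before every write during testing is one way to discover the places where an older code base still assumes it can write anywhere.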
Step 5: Immutable + stateless
- Read-only data like static websites (no runtime in production)
- Workers that process data & farm out to a queue (same runtime, no local state)
Am I immutable yet? Super immutable, you can never be changed again
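A worker of this kind keeps nothing between jobs: everything it needs arrives on one queue and everything it produces leaves on another. The sketch below uses the standard library’s in-process `queue` module as a stand-in for a real message broker; `process` and `run_worker` are hypothetical names.

```python
import queue


def process(job: str) -> str:
    """Pure function: the result depends only on the job, never on local state."""
    return job.upper()


def run_worker(inbox: queue.Queue, outbox: queue.Queue) -> int:
    """Drain the inbox, push results to the outbox, return how many jobs ran."""
    handled = 0
    while True:
        try:
            job = inbox.get_nowait()
        except queue.Empty:
            return handled
        outbox.put(process(job))
        handled += 1
```

Since the worker holds no state of its own, killing it mid-run loses at most the job in flight, and any number of identical workers can share the same queues.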
The immutable ladder
What your development-to-production cycle looks like
Why climb the ladder?
- The higher in the immutable ladder, the less risk of failure to bring up an instance
- Cloud instances are expected to be even less reliable than servers in a data center, so start optimising for disappearing instances as early as you can
- The more you scale, the more these investments pay off
- Reduces the distance between dev & ops, making DevOps practices cheaper
- Move as much as you can to pre-deployment (image bake stage), including configuration management systems
What comes after you’ve climbed the ladder?
- Move logs out, have something to consume them
- Deployment technologies to help kill and bring up new cattle
- Looking back down, it doesn’t seem much more of a climb than traditional web service deployments, it just took a different mindset to get there
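Moving logs out usually starts with the service no longer writing log files locally. One possible sketch, using Python’s standard `logging` module, sends everything to a stream (standard output by default) so that an external collector can consume it; `make_logger` and the log format are assumptions.

```python
import logging
import sys
from typing import Optional, TextIO


def make_logger(name: str, stream: Optional[TextIO] = None) -> logging.Logger:
    """Return a logger that writes to a stream instead of a local file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # drop any previously configured handlers
    logger.propagate = False  # avoid duplicate lines via the root logger
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With nothing written to the instance’s disk, killing the instance never destroys the only copy of its logs.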
What Bitnami is doing to make it easy to embrace the future
- Working to configure popular web frameworks with smart defaults that lead naturally to building immutable services
- Building containers and cloud images in a consistent way, to make it easy to transition from development to production
- Writing and maintaining documentation on best practices on how to start a new project to make it easy to deploy immutably
- Other products to be announced ;-)