Engineering Improvement Runbook | Continuous Deployment
Dylan Etkin
June 2nd, 2020
Three Continuous Deployment mistakes and how to avoid them
If you are hearing all these wonderful things about Continuous Deployment (CD), and think, "There must be a catch," well, you are right. Here are three common mistakes made when adopting continuous deployment, how to avoid them, and how we solve them in Sleuth.
Mistake #1 - Thinking CD is an engineering problem
The biggest mistake is thinking continuous deployment is a technical problem, and one for engineering only. That is wrong on both counts. Continuous deployment is primarily a communication problem, not just among engineers and engineering teams but across the organization at large. For example, when you ship new features, the support team needs to be kept in the loop so they can respond to confused customers. The sales team needs to be able to sell these new features, and your marketing team wants to market them. Your technical writers have to update their documentation because maybe you changed a color or moved a button, and they might need to regenerate screenshots or recreate videos. When you're shipping infrequently, this isn't a problem, because every three months you get everyone in the same room and you say, "Here's what's gonna be happening." However, once you start deploying every week, every day, every hour, or even every twenty minutes, all of a sudden it becomes almost impossible for other teams to keep up.
There are a few solutions, none of them perfect, because there are no perfect solutions to communication. First, weekly demos: previously they were just a way to keep engineers accountable, but now they've become a useful tool for sharing the changes that hit production with other departments. Second, you can use a centralized system, like an issue tracker: record what happened and what went out, and have external teams watch it. Finally, separate the deployment from the enabling of the feature. Your engineering team can keep delivering value quickly, again and again, without those changes affecting customers until the marketing, support, and sales teams are ready to have the new feature enabled.
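Separating deployment from enablement is usually done with a feature flag. Here is a minimal sketch of the idea, assuming an in-memory flag store; the names FeatureFlags, render_checkout, and the "new-checkout" flag are hypothetical and only for illustration. A real setup would more likely read flags from configuration or a flag service.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store, used here only for illustration."""

    enabled: Set[str] = field(default_factory=set)

    def enable(self, name: str) -> None:
        # Turning a flag on is a separate step from deploying the code.
        self.enabled.add(name)

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled


def render_checkout(flags: FeatureFlags) -> str:
    # The new code path is already deployed, but customers only see it
    # once the (hypothetical) "new-checkout" flag has been enabled.
    if flags.is_enabled("new-checkout"):
        return "new checkout page"
    return "old checkout page"
```

With something like this in place, engineering can ship the new checkout code at any time, and the flag is switched on only once support, sales, and marketing are ready.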
Mistake #2 - Deployer is not the author of the change
When you're releasing every so often, you may have another team whose job is to take these code changes and ship them out into a production system. However, once you start deploying frequently, this arrangement breaks down very quickly, because when you disconnect a change from its impact, you break a key feedback loop for the author of that change. You want the person who feels the impact of a deployment to be the person who created that situation in the first place, so they can learn from their mistakes.
Mistake #3 - Not tracking deployments
Tracking deployments sounds like an obvious one, but it is often forgotten. People pay so much attention to automating the deployment that they forget to make that information easily available to other people. A continuous deployment tool, for example, may show which builds ran, but that's not very consumable by non-developers. A developer needs to know three key events:
- When the deployment is queued;
- When the deployment has completed;
- When the impact has been determined.
First, if you're using Slack as a chat communication tool, you can create a Slack webhook to send a message to a common channel. This also gives you a historical record of deployments, so if you're trying to figure out who deployed that thing last night that broke something, you can see the deployment history in Slack. Second, use the Slack channel topic to report the status of a deployment. Finally, you could link to impact dashboards, such as metrics or logging dashboards, so that the developer who just released a change can see its impact with a single click.
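As a rough sketch of the webhook idea, the snippet below builds a message for each of the three key events and posts it to a Slack incoming webhook, which accepts a JSON body with a "text" field. The DeployEvent enum, the function names, the message wording, and the example URLs are all assumptions made for illustration; the webhook URL itself would come from your own Slack workspace.

```python
import json
import urllib.request
from enum import Enum
from typing import Optional


class DeployEvent(Enum):
    # The three key events from the list above.
    QUEUED = "queued"
    COMPLETED = "completed"
    IMPACT_DETERMINED = "impact determined"


def build_message(event: DeployEvent, service: str, author: str,
                  dashboard_url: Optional[str] = None) -> dict:
    """Build the JSON payload for a Slack incoming webhook."""
    text = f"Deployment of {service} by {author}: {event.value}"
    if dashboard_url:
        # Slack renders <url|label> as a clickable link.
        text += f" (<{dashboard_url}|impact dashboard>)"
    return {"text": text}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A deploy script might call post_to_slack(webhook_url, build_message(DeployEvent.QUEUED, "web", "alice")) when the deployment is queued, then again on completion with a dashboard link, so the channel doubles as a deployment history.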
How we do CD at Sleuth
How do we at Sleuth do continuous deployment? Well, that's a bit of a trick question because in addition to using CD ourselves, Sleuth is actually a deployment tracker. We were building Sleuth to directly help teams adopt and maintain healthy continuous deployment practices.
For teams who have started the continuous deployment journey, how did you solve the communication problem? How did you communicate changes to not only other engineers but other parts of the organization?