Real World Stories | Stories from the Trenches
Don Brown
August 3rd, 2021
Why devs need to be involved in production - a story from the trenches at Atlassian
In 2006, I was hired to work at Atlassian. Atlassian at the time was a server software shop, meaning that they created their software and gave it to a customer, the customer ran it on their own hardware, and Atlassian was not involved at all in running the software. As a result, they tended to release software every two to three months, so not very frequently. The changes would go to the customer. In fact, the customer didn't want the changes any more frequently because it was a pain to upgrade and install the software. When I joined in 2006, what they wanted to do was to start selling software for the cloud, meaning that a user could go to their website, click a button and instantly be able to access the software without having to download anything or run it on a server. So when I think of, say, DORA metrics at that time for server software, your change lead time is going to be probably four months on average, because it just takes that long to get a change out.
Our mean time to recovery was not great, because if a customer found an issue, they would have to file a support issue and we would only find out about it later. So again, not a great way to deliver software, which is why we were so excited about delivering it on the cloud, because now we could do it quicker. So when the time came to start delivering on the cloud, the thinking was, "Atlassian is really good at writing software. So let's take this software and we'll just focus on writing it. And then we'll hand it over the wall to a hosting company to run it for us, because they're good at running software. We're good at writing software. Everyone does what they're best at." However, in practice, this didn't work at all.
When I think about one of the lessons I learned in my software career, which is to have developers involved in all parts of the software, this is actually, I think, where it stems from. What I learned in this experience is that when we wanted to deliver a change, it took a minimum of a week or two to give it to the hosting company, and then they would run it and it would take a while for them to do all the upgrades. If they discovered a problem, we wouldn't hear about it, sometimes until days or weeks later. And then we had to create a fix. And then it took another week or two to ship the fix out. So the customer experience wasn't great. And everybody lost, really.
So over time, this process of writing the code and handing it off to the hosting company to run got longer and longer and longer. As the amount of code grew, and as the hosting provider's upgrade process got more complicated, it took longer and longer to deliver code to the customer to fix the bugs the customer was having. And so we created a not-so-great experience for the customer, and it kept getting worse. I think the core issue here was that our developers were not running the software. We were assuming that another company would run it and it would work great, but that hand-off process created problems.
So when I think about the DORA metrics and the metrics that show how a high-performing team works, I immediately think back to those days when we were creating software and handing it off to another team to run it, because that didn't work very well from a customer experience standpoint or from a software delivery standpoint. It was only later in my career, when we started to involve developers in production, to have them actually do the deploys, to have them understand the impact of the deploys, that we were able to really get that change frequency going and the change lead time down, so that we were shipping our changes to production quicker. So again, the lesson that I learned here is to have your developers involved not just in writing the code but also in running the code if you want to deliver change at a high rate.