Automation is a game changer in the world of software development. It’s not just about making things faster; it’s about creating a more standardised and efficient process. However, I must admit, I have a bit of a love-hate relationship with the term “efficiency.” While it certainly fits the context, it can sometimes overshadow the human element that is so crucial in our field.
Let me share a couple of stories that illustrate the importance of automation and the pitfalls of relying solely on human execution.
The Human Element in Testing
Years ago, I had the opportunity to teach a group of testers how to use the Azure DevOps test tools. At that time, the tools were still quite new, and we had a series of workshops and labs designed to help them grasp the concepts. You’d think that testers, whose superpower is following test scripts, would excel at this. After all, they spend considerable time creating these scripts, right?
However, I kept receiving feedback that the labs were faulty. Testers were adamant that they were following the steps correctly, yet the tests were failing. The reality? They were missing steps or clicking the wrong buttons. This wasn’t a reflection of their skills; it was simply human error.
A Costly Mistake
On a more commercial note, consider the case of Knight Capital Group. They were brought to the brink of bankruptcy by a deployment error that stemmed from a simple oversight: the person running the deployment pushed the latest version of their trading software to only eight of their nine servers. With the servers out of sync, the system began behaving erratically, and it took until the end of the day to trace the problem back to the missed step. With $400 million in the bank at the start of the day, the firm was effectively finished by the end of it. The cost of human error in this instance was catastrophic.
These examples highlight a crucial point: when humans are involved, mistakes are inevitable. This is why automation is not just beneficial; it’s essential.
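This class of deployment mistake is exactly the kind of thing a machine can catch. As a minimal sketch, imagine a post-deployment check that refuses to sign off a rollout until every server reports the expected build version. The server names, version strings, and the idea that each server can report its version are all illustrative assumptions, not a real API:

```python
def verify_rollout(versions: dict, expected: str) -> list:
    """Return the names of any servers NOT running the expected version."""
    return [server for server, version in versions.items()
            if version != expected]

# Example: one of nine servers was missed during the deployment.
reported = {f"server-{i}": "2.4.1" for i in range(1, 9)}
reported["server-9"] = "2.3.7"  # the stale server nobody noticed

stale = verify_rollout(reported, expected="2.4.1")
assert stale == ["server-9"]  # rollout must not be signed off
```

A human eyeballing nine servers will eventually miss one; a check like this, wired into the deployment pipeline, never will.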
Embracing Automation
I often say, “If it can be automated, it should be automated. If it can’t be automated, it should be rearchitected.” This philosophy is about closing the feedback loop. Imagine writing a story and submitting it to a publisher, only to wait six months for feedback. By the time you receive it, you’ve moved on to other parts of the story, and now you have to revisit earlier sections with a different mindset. This cognitive load can be overwhelming, and it’s no different in software development.
To mitigate this, we need automation that provides rapid feedback. Ideally, we want tests that run in minutes, if not seconds. The Azure DevOps team faced exactly this challenge. Their automated tests took 48 to 72 hours to run, which is far too long. They realised that their testing pyramid was upside down: some 38,000 long-running system tests, and far too few fast-running unit tests.
By flipping that pyramid, they reduced their feedback time from days to just three and a half minutes, with the entire suite able to run locally on a developer’s workstation. This is the kind of speed we should strive for.
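The fast end of that pyramid looks something like this: a unit test that exercises one small piece of logic in isolation and completes in milliseconds. The function under test here is purely illustrative; the point is the shape and speed of the feedback:

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Illustrative unit under test: apply a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    # Each test runs in microseconds, so the whole suite gives
    # feedback in seconds rather than hours.
    def test_ten_percent_off(self):
        self.assertEqual(apply_discount(200.0, 10), 180.0)

    def test_zero_discount_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(99.99, 0), 99.99)

    def test_invalid_percent_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)
```

Run with `python -m unittest`: hundreds of thousands of tests of this shape can finish in the minutes-not-days window the Azure DevOps team achieved, because nothing here touches a server, a database, or a browser.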
The Cost of Manual Processes
Manual processes are the longest cycle in any development workflow. They are not only time-consuming but also prone to errors. I once worked with an organisation that had an astonishing ratio of testers to coders: 600 testers for just 300 coders. Their quality was so poor that validating any change the developers made required roughly 1,500 hours of manual QA. This is simply unsustainable.
Imagine the cost of maintaining such a workforce versus investing in robust automation that can validate software changes quickly and accurately. By embracing automation, you can build more features with higher quality, leading to happier customers and ultimately, greater profitability.
Conclusion
In conclusion, the path to effective software development lies in embracing automation. It’s not just about speeding up processes; it’s about reducing human error and creating a more reliable workflow. By automating where possible, we can focus on what truly matters—delivering high-quality software that meets the needs of our customers.
Let’s strive for a future where automation is at the forefront of our development practices, allowing us to innovate and grow without the burden of human error weighing us down.