Technical Debt Management for Long-Term Quality


Technical debt is a huge problem for organizations, so I want to quickly define it. Technical debt is the future cost you incur when you or your team prioritize quick, short-term solutions over more robust, long-term approaches. Any time you make a choice to do something fast but wrong, because you need it fast, you're knowingly introducing technical debt.

You can also unknowingly introduce technical debt, i.e. we made some architectural choices, they were good choices at the time, but now they're no longer good choices. Technical debt can appear over time. I'm thinking of a transaction system: we supported x number of transactions per second, and our platform of choice was reasonably priced and able to support well beyond what we thought we were ever going to transact, but now we're transacting a lot more than that, and we're reaching the limits of the system that we chose.

A great example of that is the Azure DevOps team and how they originally envisaged work item tracking fields. A work item was a row in a database, and fields were columns. Those of you who are software engineers have already figured out what the problem with that would be: you can only have 1,024 columns in a SQL Server table.

So they did, eventually, hit limitations on the number of columns available for custom fields. Because who would ever have more than a thousand fields on a work item? Well, people do, they exist, and that was a thousand in total across the whole system.

So they thought we'd never have a thousand fields, or somebody made that decision, just like the two-digit date decision back in the day. So they had a lot of work to go back and refactor: not just refactor the system, but build the capability to migrate their customers' data on upgrade into a format where each field is a row in the database rather than a column.
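As a rough sketch of that difference (this is not the actual Azure DevOps schema; it uses SQLite with invented table and column names, purely to show column-per-field versus row-per-field):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original shape: one column per field. Every new custom field is another
# column, and the database engine caps how many columns a table may have.
conn.execute("""
    CREATE TABLE work_items_wide (
        id INTEGER PRIMARY KEY,
        title TEXT,
        state TEXT,
        priority INTEGER
        -- ... one more column for every custom field, until the limit
    )
""")

# Refactored shape: one row per (work item, field) pair, so the number of
# fields is bounded by rows, not by the column limit.
conn.execute("""
    CREATE TABLE work_item_fields (
        work_item_id INTEGER,
        field_name   TEXT,
        field_value  TEXT,
        PRIMARY KEY (work_item_id, field_name)
    )
""")

conn.executemany(
    "INSERT INTO work_item_fields VALUES (?, ?, ?)",
    [(1, "Title", "Fix login bug"), (1, "State", "Active"), (1, "Custom.Team", "Blue")],
)

print(conn.execute(
    "SELECT field_name, field_value FROM work_item_fields WHERE work_item_id = 1"
).fetchall())
```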

Then you have unlimited capacity for fields and data. And those types of decisions, knowingly made decisions that result in something that's not quite the way it needs to be, or unknown ones, are where most technical debt comes from. There are other issues that people call technical debt which aren't necessarily technical debt, but most people lump it all together and say technical debt.

I think I often do as well. And that's: I've written bad code and shipped it. That's not technical debt, that's incompetence, right? So, within the context of a competent team, there's known technical debt and unknown technical debt. But there's another thing that we call technical debt, which is just shipping bad code, making poor choices,

knowing that they're poor choices and not doing anything about it, right? The way you pay that one back is simple: stop shipping bad code. But for technical debt, you need to pay it back; you need to prioritize paying back that technical debt. Think of it more as an unhedged fund than a debt like a credit card or a mortgage; most debt is secured against something, secured against an asset.

If you stop paying your mortgage, then the bank comes and repossesses your house and gets its money back, right? And maybe you get some leftovers because you've paid off some of your mortgage. But who's insuring your product quality? Who's insuring your product against your technical debt?

There's no insurance. From that perspective it's completely uninsured, and nobody can magically come along and pay back all the debt; there's no asset to reclaim or sell to pay it off. So it's something you're going to have to deal with, you can't let it get out of control, and there's a lot of unknown technical debt.

I use the Azure DevOps team a lot as an example, but they had been a waterfall team for many years, shipping once every two years, and then they moved to a more continuous-delivery, three-week model. And they found that they had made lots of poor decisions, decisions that weren't necessarily poor within the context of a two-year cycle, but they couldn't really see the impact of that technical debt, the choices that they'd made, deliberate and non-deliberate, on their ability to deliver product and their ability to deliver value.

But I have a graph, I think it's 2010 through to 2018, for that product team. So, eight years of development, and they effectively went from a two-yearly cycle to continuous delivery, to three-week sprints, to that faster cadence, running into issues along the way, and with every issue they ran into,

paying it back, right? Paying back the choices that they'd made, which were perhaps valid choices at the time, but you still need to pay it back; it doesn't matter whether the reason was valid or not. And by paying it back and doing the work, they went from 25 features to production each year in 2010 to something like 360 features to production in 2018.

So by focusing on paying back their technical debt, they enabled their engineers to close the feedback loops and then shorten them; the Three Ways of DevOps, right? Close the feedback loops first, then shorten them. And that act of shortening the feedback loops can massively increase the amount of value that you can deliver long term.

And that's the value of paying back technical debt, of managing technical debt well: you go from removing those limitations to maximizing the value that you deliver in your product with the same number of people. The Azure DevOps team literally went from 25 features to production each year in 2010 and worked very hard to pay back technical debt, and even in that first year of focusing on paying it back and getting their new way of working up and running,

they went from 25 features to production to 68 features to production within that one year. And they weren't even focused on delivering more features; they were focused on "let's deal with our crap and figure out how we deal with those problems." And they still delivered more features. That's the benefit of paying back technical debt.

That's the benefit of having a slick, easy system to add features to your product, and that's what everybody needs. Don't just manage technical debt; pay it back.

At NKD Agility, we help teams implement modern engineering practices, build robust testing strategies, and achieve engineering excellence. Ready to reduce errors and deliver faster? Visit us today to transform your software delivery pipeline. Let’s automate the future together!

#agile #agileproductdevelopment #agileprojectmanagement #agileproductmanagement #productdevelopment #productmanager #productowner #projectmanager #scrummasters

Automated testing is extremely important to our ability to use modern software engineering to benefit our organization and increase our profit, increase our margins, increase our capability, and deliver better quality, higher value software to our customers.

Automated testing comes in lots of different flavours, and I’m definitely going to stretch the term automated testing beyond your traditional concept of it. One of the things that automated testing does is reduce human error. I used to teach a training class (I still have it) for manual testers using the Azure DevOps test tools, and we had labs in it. You would think that if there was one group of people that were awesome at following a set of steps and validating whether they worked or not, a group of testers would be that magical group of people able to do that.

It’s absolutely not the case. Most of the groups that were doing the labs failed to follow the steps, which resulted in the lab not working. This lab doesn’t work, right? And it’s like, yeah, it does. Did you follow all the steps? Yes, yes, yes, we followed all the steps. Then you sit with them, and they walk through it, and I’m like, you missed step four. You didn’t do step four, or you didn’t do the second part of step four, or whatever it is, right? You didn’t follow the instructions.

This is just a human thing; this is not an assassination attempt on testers. It’s just how humans work, right? So you cannot expect somebody to follow a set of steps and do it the same every time. That’s not how humans function. That’s how computers function. So we want to take those things that make sense and convert them into automated tests.
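As a minimal sketch of what that conversion can look like, using an invented UserStore class and pytest as the runner (none of this is from a real product), the idea is simply to turn "follow these steps and verify the result" into code that runs identically every time:

```python
# test_signup_steps.py -- run with `pytest`
# Hypothetical translation of a manual test script ("step 1: create a user,
# step 2: activate them, step 3: check they can log in") into an automated
# check that performs the same steps identically on every run.

class UserStore:
    """Toy in-memory stand-in for the system under test."""
    def __init__(self):
        self.users = {}

    def create_user(self, name: str) -> None:
        self.users[name] = {"active": False}

    def activate(self, name: str) -> None:
        self.users[name]["active"] = True

    def can_log_in(self, name: str) -> bool:
        return self.users.get(name, {}).get("active", False)


def test_activated_user_can_log_in():
    store = UserStore()
    store.create_user("ada")        # step 1
    store.activate("ada")           # step 2
    assert store.can_log_in("ada")  # step 3: the verification a human would do


def test_unactivated_user_cannot_log_in():
    store = UserStore()
    store.create_user("ada")
    assert not store.can_log_in("ada")
```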

Now, we’re not looking for a particular level of code coverage, although no code coverage is probably bad. Chasing code coverage is always a bad idea because you’ll just have lots of people writing terrible tests that get you good code coverage but don’t actually validate your product, right? So don’t chase code coverage or test coverage, but it is a way for us to get faster feedback.
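To make the "good coverage, no validation" problem concrete, here is a deliberately bad, hypothetical example: the first test executes every line, so the coverage report looks great, but it would pass even if the function were completely wrong.

```python
import pytest


def calculate_total(prices: list[float], tax_rate: float) -> float:
    """Hypothetical function under test: subtotal plus tax."""
    return sum(prices) * (1 + tax_rate)


def test_calculate_total_runs():
    # Coverage-gaming test: 100% line coverage, zero assertions,
    # so it validates nothing about the product.
    calculate_total([10.0, 20.0], 0.2)


def test_calculate_total_is_correct():
    # The test we actually want: same coverage, but it checks the result.
    assert calculate_total([10.0, 20.0], 0.2) == pytest.approx(36.0)
```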

We want fast-running automated tests with which we can validate the changes that we make on a continuous basis. TDD results in some of those, right? Some tests out of TDD are like that. Most tests out of TDD validate that the product at least does what the software engineer intended it to do, and that we’ve got an architecture that is probably a little bit better.

That’s hopefully part of that testing mode, but the value in automated testing is it happens the same every time. You do need to balance this idea of test infrastructure, right? Because when we write test automation, we have a body of tests, and whenever we make a change to our product, it’s going to have an impact on those tests.

There used to be a great feature in Azure DevOps that had something called test impact analysis, and you could make a change in the code, and it would tell you exactly which code paths had changed and tell you which tests were impacted. I think there were a lot of false negatives, which is why it’s not well thought of, but it was a great idea, right? A great idea. How do we know what tests we need to run? Well, let’s look at what’s changed and what the tests hit, and are we missing something, and which tests need to be rerun in order to reduce your test matrix?
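As a toy sketch of the idea (not the Azure DevOps feature): if you had a map from tests to the files they exercise, selecting the impacted tests is straightforward; the hard part, and where the false negatives come from, is keeping that map accurate. All the file names and the coverage map below are invented for illustration:

```python
# Toy test-impact selection: given the files changed in a commit, run only
# the tests known to exercise them. The coverage map is hard-coded here; a
# real implementation derives it from per-test coverage data, which is
# exactly where the complexity and the false negatives creep in.
from typing import Dict, List, Set

COVERAGE_MAP: Dict[str, Set[str]] = {
    "tests/test_pricing.py": {"src/pricing.py", "src/rounding.py"},
    "tests/test_checkout.py": {"src/checkout.py", "src/pricing.py"},
    "tests/test_reporting.py": {"src/reporting.py"},
}


def impacted_tests(changed_files: List[str]) -> List[str]:
    changed = set(changed_files)
    return sorted(test for test, files in COVERAGE_MAP.items() if files & changed)


if __name__ == "__main__":
    # A change to pricing impacts the pricing and checkout tests,
    # while the reporting tests can be skipped.
    print(impacted_tests(["src/pricing.py"]))
```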

But that fundamentally doesn’t work because of complexity, right? That’s what got in the way: the complexity of software engineering. So we need to run them all, which means they all need to be super fast. We need unit tests, not end-to-end tests, not integration tests. We need unit tests that run really, really quickly, thousands of them in milliseconds.

The Azure DevOps team moved from long-running end-to-end tests to fast-running unit tests and took their test infrastructure from 72 hours down to around three and a half minutes to run in its entirety. That’s what you’re looking to be able to do, and there’s something like between 60,000 and 100,000 tests being run to validate that their product still works.

That’s the story that you’re looking for. You’re looking for small, lean, discrete tests that don’t have an impact across the entire platform so that when you make a change in this part of the system, you only need to change the tests in that part of the system. All the other tests should still pass because you’ve not broken what they’re expecting in other parts of the system, and it gives you a good indication.

So: having this tight test infrastructure, shifting left as much as you can, and automating everything. The phrase I usually say is, if it can be automated, it should be automated, and if it can’t be automated, it should be refactored so that it can be, right? Automate everything. You should not have any manual steps between a developer cutting code and production.

The only thing that I’d say is a valid place to put a human between those things is maybe an approval, but I would prefer for those approvals to be automated too. Right? On what basis does this human decide whether we’re a go or no-go for release? Well, they look at this data. Well, we can automate that. They look at the calendar, and they only do it on these days. Well, we can look at that too. We can say we only release on Monday mornings, right, to give us the maximum amount of time to deal with any problems.
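As a minimal sketch of encoding that go/no-go decision, with an invented Monday-morning release window and made-up health thresholds rather than anything from a real pipeline:

```python
# Hypothetical automated release gate: encode the questions a human approver
# would ask so the decision is repeatable and needs no manual sign-off.
from datetime import datetime


def release_allowed(now: datetime, error_rate: float, failing_tests: int) -> bool:
    in_window = now.weekday() == 0 and now.hour < 12     # Monday morning only (assumed policy)
    healthy = error_rate < 0.01 and failing_tests == 0   # assumed quality thresholds
    return in_window and healthy


if __name__ == "__main__":
    print(release_allowed(datetime(2024, 9, 2, 9, 30), error_rate=0.002, failing_tests=0))  # Monday -> True
    print(release_allowed(datetime(2024, 9, 6, 9, 30), error_rate=0.002, failing_tests=0))  # Friday -> False
```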

Never release on a Friday; don’t do that. CrowdStrike did that; don’t do that. We can do all of those things. We can automate everything so that we have that engineering excellence, modern software engineering excellence, built into our entire story, so that we can then spend the time we would have spent doing all those things manually and fixing all those problems on delivering the value that’s going to generate the revenue we need to grow and create more.

That’s what we should be focusing on, and test automation is a huge part of that: reducing the number of problems that make it through to production, and reducing the amount of time it takes to detect that you’ve injected problems into the system. We can help you create those strategies, build that engineering excellence within your organization, and ultimately build better software.

