Cool, so this came out of a training course that Jeff and I went to in London, and this was one of the bits of it I found most interesting.
So, measuring technical debt and why it matters.
So, it matters because you get what you measure. If you give a team a particular statistic and say you're going to be measured on it, the team will move that statistic in the direction you want, even if that means the code doesn't actually get any better.
So, we have to choose a measure for technical debt such that when we put in the effort to pay the debt back, the time investment is actually worth it.
So, the talk is basically about two measurement approaches.
So this is the first one, the delta from ideal: you define a whole bunch of metrics where you know what the ideal looks like, you score your code against each metric, and the gap between the two tells you where you stand.
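The delta-from-ideal idea can be sketched in a few lines. Everything here is illustrative: the metric names, ideal values, and current scores are made up for the sake of example (the warning and error counts echo the numbers mentioned later in the talk), not output from any real tool.

```python
# Sketch of "delta from ideal": pick metrics with a known ideal value,
# score the codebase, and report the gap. All numbers are illustrative.

IDEALS = {
    "compiler_warnings": 0,
    "resharper_errors": 0,
    "duplicated_blocks": 0,
    "test_coverage_pct": 100,
}

# Hypothetical current scores for a codebase.
CURRENT = {
    "compiler_warnings": 1000,
    "resharper_errors": 20000,
    "duplicated_blocks": 133,
    "test_coverage_pct": 35,
}

def delta_from_ideal(current, ideals):
    """Per-metric distance between where we are and where we'd like to be."""
    return {name: abs(ideals[name] - current[name]) for name in ideals}

for metric, gap in delta_from_ideal(CURRENT, IDEALS).items():
    print(f"{metric}: {gap} away from ideal")
```

The snag the talk goes on to describe is already visible here: a 1,000-warning gap and a 133-duplicate gap aren't comparable quantities, so rolling them into one combined score would be misleading.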
So, the first metric (these are just example metrics, you can pick your own) is how many compiler errors and warnings we've got. So, this is the SQL Compare UI, which includes the Compare Engine and everything else, and it compiles on my machine, brilliant, but there are 1,000 warnings. So, if we were to spend time reducing that warning count, would we actually be making the code better? Probably not.
ReSharper errors. There are nearly 20,000 ReSharper errors across 2,500 files. If we put the time in to reduce that count, are we actually making the code better? Probably not.
Code duplication. So this is using the code analysis tools in Visual Studio Ultimate across the Compare Engine: it found 58 exact matches, 75 strong matches, and so on. If we fix these, it clearly would make the code better, because with duplicated code the next bug fix often won't fix the bug properly: you'll fix it in one copy but miss the others. So there you are making the code better.
Next up is unit test coverage. This is SQL Source Control and the plugin that we're writing for Deployment Manager. Again, if you push these numbers up, you are clearly making the code better.
And finally, there's just our gut feel. The previous four we could measure automatically, but gut feeling, not so much. So this is the Work class in Compare: it's a partial class spread across four files, and the files are quite big. They're so big that ReSharper's IntelliSense gets slow when you're dealing with them. Clearly we could put time into fixing this, but again, would we actually be making the application better?
So, in summary, you have to pick the metrics carefully; otherwise there'll be a high opportunity cost to fixing that metric, like the compiler warnings we started with.
And next, in a large application with an awful lot of technical debt, this measure is basically useless, because you're going to end up with something like this, where the backlog of issues is just insurmountable. Or your unit test coverage will be basically 0, and that's just as insurmountable.
So, this is the approach the guy on the course recommended. Instead of looking at the entire application and computing metrics over all of it, you look at the things you're actually going to have to change. Some bits of the application will just sit there and never need changing; the interesting bits are the ones you're going to change, and what the technical debt in those bits costs you.
So, for this approach you need some stories, which can either come from the backlog, or they can be hypothetical. And then you estimate them just as Scrum has taught us to, so we give them points: 1, 2, 3, 5, 8, 13, 20.
And then, the assumption behind the approach is that technical debt inflates those estimates, because technical debt makes our code harder to change. So in 3 months' time we can re-estimate the same stories, and if the estimates have gone up, our technical debt has increased.
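As a sketch, the re-estimation idea might look like this. The story names and point values are invented for illustration; only the point scale comes from the talk.

```python
# Sketch: estimate the same stories at two points in time; if the total
# drifts upward, the code has got harder to change, i.e. debt has grown.
# Story names and point values are made up.

SCRUM_POINTS = {1, 2, 3, 5, 8, 13, 20}

def total_points(estimates):
    """Sum a set of story estimates, checking they use the agreed scale."""
    assert all(p in SCRUM_POINTS for p in estimates.values())
    return sum(estimates.values())

march = {"export to CSV": 3, "schema diff view": 8, "offline mode": 13}
june = {"export to CSV": 5, "schema diff view": 8, "offline mode": 20}

drift = total_points(june) - total_points(march)  # positive => debt grew
print(f"estimate drift over 3 months: {drift:+d} points")
```

The same comparison works for a before/after pair around a refactoring, real or hypothetical, rather than two calendar dates.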
We can also use it after a refactoring to see if the technical debt has been reduced. And interestingly, we can use it after a hypothetical refactoring: we can ask, if we were to make this change to our codebase, would it make that feature our product manager wants, or might want in the future, easier to do? If the answer is yes, then clearly you should do it before the feature is implemented. If the answer is no, and it's not actually making features easier to add, is there much point in making that change when you could spend your time on something else?
There are some interesting corollaries to this approach. The first is that to reduce your technical debt you don't actually have to make the code any better. Maybe there's an abstraction layer that doesn't quite work, it's too leaky or whatever; if you can increase the team's understanding of that code, the technical debt is reduced. This also captures the familiar observation that if you replace the software developers on the team with another set, the built-in mental model is lost and the technical debt goes up. So if you keep the team stable over time, technical debt is essentially lowered by having a team that understands the code.
The real problem with this approach: imagine the estimate used to be 5. We do some refactoring; it's now 3. But what does that mean in practice? Have we actually reduced the technical debt? What are the error bars on this? If it's 5 plus/minus 2 before and 3 plus/minus 2 after, has our code actually got any better? So you inherit all of the problems of estimation into the technical debt calculation. You can't really use it quantitatively, but you can use it qualitatively: this refactoring would reduce the debt, but by how much, or whether it would reduce it every single time, is quite hard to say.
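The error-bar worry can be made concrete with a tiny interval check; the plus/minus 2 bars are just the talk's hypothetical numbers.

```python
# Sketch: if estimates carry +/-2 error bars, a drop from 5 to 3 may not
# be a real improvement, because the two uncertainty intervals overlap.

def intervals_overlap(before, after, err=2):
    """True when [before-err, before+err] and [after-err, after+err] intersect."""
    return (before - err) <= (after + err) and (after - err) <= (before + err)

print(intervals_overlap(5, 3))   # overlapping: can't be sure debt fell
print(intervals_overlap(20, 3))  # disjoint: a drop this big is a clear signal
```

This is the sense in which the measure is qualitative rather than quantitative: only changes bigger than the noise in the estimates tell you anything.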
Cool, and I’ve over-run, any questions?