Category Archives: Tech Debt

Finding Functions That Are Risky to Change

In Tech Debt Detectors and Use Your First Commit to Fix CRAP, I explained how combining low code coverage with high code complexity highlights functions that are risky to change. I mostly do this in my IDE so that I get a warning before I change code, but the same combination is also useful for a global search for risky functions.

My main project is in TypeScript and uses Jest and ESLint. Here’s how I automated a search for risky functions.

Step 1: Get a list of high complexity functions

Note: by complexity, I am referring to the number of independent paths through a function, which is calculated by counting branches, loops, and boolean expressions.
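
To make that definition concrete, here is a rough sketch in TypeScript that approximates the count by scanning source text for branching tokens. The name estimateComplexity and the sample function are hypothetical, and this is not how ESLint computes its number (ESLint walks the parsed syntax tree), so treat it as an illustration of what gets counted rather than a replacement for the rule.

```typescript
// Rough approximation only: counts branching tokens in raw source text.
// It will miscount tokens that appear inside strings or comments, and it
// treats TypeScript optional markers (e.g. `x?: number`) as ternaries.
function estimateComplexity(source: string): number {
  // Branch keywords, boolean operators, and the ternary `?`
  // (but not optional chaining `?.`).
  const branchTokens = /\b(if|for|while|case|catch)\b|&&|\|\||\?\?|\?(?![.?])/g;
  const matches = source.match(branchTokens);
  // Every function starts with one path; each branch point adds another.
  return 1 + (matches ? matches.length : 0);
}

// Hypothetical function used only as input text.
const sampleSource = `
function classify(n: number, strict: boolean): string {
  if (n < 0 && strict) {
    return "negative";
  }
  for (let i = 0; i < n; i++) {
    if (i % 2 === 0) continue;
  }
  return n > 10 ? "big" : "small";
}`;

// Counted branch points: if, &&, for, if, and the ternary.
const sampleComplexity = estimateComplexity(sampleSource);
console.log(sampleComplexity); // 6
```

With a threshold of 5, a function like this sample would be flagged.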

ESLint has a rule that lets you set a maximum complexity. In my package.json, I call ESLint via yarn this way:

"lint": "eslint \"**/*.{ts,tsx}\""

I added a script that does the same thing, but with a complexity rule:

"complexity": "eslint \"**/*.{ts,tsx}\" --rule '\"complexity\": [\"warn\", 5]'"

If I run yarn complexity, I get warnings that look like this:

347:8  warning  Async function 'dbSetUserPlanBlock' has a complexity of 6. Maximum allowed is 5  complexity
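
Each warning has a regular shape, so the function name and its complexity can be pulled out with a pattern. As a minimal sketch in TypeScript (parseComplexityWarnings and the second sample line are hypothetical, and the pattern assumes the message format shown above), the extraction and sorting could look like this:

```typescript
interface ComplexFunction {
  name: string;
  complexity: number;
}

// Extracts { name, complexity } from ESLint complexity warnings and sorts
// them from most to least complex. Lines without a quoted function name
// (or from other rules) are skipped.
function parseComplexityWarnings(output: string): ComplexFunction[] {
  const pattern = /'([^']+)' has a complexity of (\d+)\. Maximum allowed is \d+/;
  const results: ComplexFunction[] = [];
  for (const line of output.split("\n")) {
    const match = pattern.exec(line);
    if (match === null) continue;
    results.push({
      name: match[1] ?? "",
      complexity: Number(match[2] ?? "0"),
    });
  }
  return results.sort((a, b) => b.complexity - a.complexity);
}

const sampleWarnings = [
  "  347:8  warning  Async function 'dbSetUserPlanBlock' has a complexity of 6. Maximum allowed is 5  complexity",
  // Hypothetical lines, for illustration only.
  "  12:1  warning  Function 'hypotheticalHelper' has a complexity of 9. Maximum allowed is 5  complexity",
  "  40:3  warning  Unexpected console statement  no-console",
].join("\n");

const parsedWarnings = parseComplexityWarnings(sampleWarnings);
console.log(parsedWarnings.map((f) => `${f.name} ${f.complexity}`));
```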

With a bunch of bash-fu, this line produces a list of the flagged function names, sorted from highest complexity to lowest:

"complexity": "eslint \"**/*.{ts,tsx}\" --rule '\"complexity\": [\"warn\", 5]'|grep Maximum |cut -d\"'\" -f2-|sort -t' ' -k6,6nr|cut -d' ' -f1,6|cut -d\"'\" -f1 "

Step 2: Find the code coverage of a function

What I need now is a way to find the coverage of a function given its name. If I run jest via yarn like this:

yarn test --coverage

it will generate a JSON file called coverage/coverage-final.json, which has all of the coverage data. It’s a complex file, but if you install jq via brew, you can use this script to check whether a given function has coverage lower than 80% (credit: ChatGPT):

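#!/bin/bash
# Usage: is-jscoverage-pct-low <coverage-final.json> <function-name>
# Prints the file, name, and overall coverage of the function if it is below the threshold.
jq -r -c --arg func "$2" --argjson threshold 0.80 '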
to_entries[] as $fe
| $fe.key as $file
| $fe.value as $v
| ($v.fnMap // {} | to_entries[] | select(.value.name == $func) | .value.loc) as $loc
| ($loc.start.line) as $sline
| ($loc.end.line)   as $eline
| {file:$file, name:$func, start:$sline, end:$eline} as $meta
| (reduce (( $v.statementMap // {} ) | to_entries[]
          | select(.value.start.line >= $sline and .value.end.line <= $eline)) as $s
         ({stmt_tot:0, stmt_hit:0};
          .stmt_tot += 1
          | .stmt_hit += ( ($v.s[$s.key] // 0) | tonumber | if . > 0 then 1 else 0 end ))) as $S
| (reduce (( $v.branchMap // {} ) | to_entries[]
          | select(.value.loc.start.line >= $sline and .value.loc.end.line <= $eline)) as $b
         ({br_tot:0, br_hit:0};
          .br_tot += ( ($v.b[$b.key] // []) | length )
          | .br_hit += ( ($v.b[$b.key] // []) | map( (.|tonumber) | if . > 0 then 1 else 0 end ) | add // 0 ))) as $B
| $meta + $S + $B
| .linePct    = (if .stmt_tot==0 then 0 else (.stmt_hit / .stmt_tot * 100) end)
| .branchPct  = (if .br_tot==0  then 0 else (.br_hit  / .br_tot  * 100) end)
| .overallPct = (if (.stmt_tot + .br_tot)==0 then 0 else ((.stmt_hit + .br_hit) / (.stmt_tot + .br_tot) * 100) end)
| select(.overallPct < ($threshold * 100))
| "file: \(.file)\nname: \(.name)\noverallPct: \(.overallPct)\n"
' "$1"

I put this in a file called is-jscoverage-pct-low, which I call like this

is-jscoverage-pct-low coverage/coverage-final.json dbSetUserPlanBlock
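
For readers who find the jq hard to follow, here is a minimal TypeScript sketch of the same calculation for a single file’s entry in coverage-final.json. The interfaces list only the fields the jq script reads, and the function name overallCoveragePct and the sample data are hypothetical. Unlike the script, it takes one file’s coverage object rather than searching every file.

```typescript
interface Position { line: number }
interface Range { start: Position; end: Position }

// Only the fields the jq script reads from each file entry.
interface FileCoverage {
  fnMap: Record<string, { name: string; loc: Range }>;
  statementMap: Record<string, Range>;
  branchMap: Record<string, { loc: Range }>;
  s: Record<string, number>;   // hit count per statement id
  b: Record<string, number[]>; // hit count per arm of each branch id
}

// Percentage of statements and branch arms inside the named function
// that were hit at least once; undefined if the function is not found.
function overallCoveragePct(cov: FileCoverage, funcName: string): number | undefined {
  const fn = Object.values(cov.fnMap).find((f) => f.name === funcName);
  if (fn === undefined) return undefined;
  const first = fn.loc.start.line;
  const last = fn.loc.end.line;
  const inside = (r: Range): boolean => r.start.line >= first && r.end.line <= last;

  let total = 0;
  let hit = 0;
  for (const [id, range] of Object.entries(cov.statementMap)) {
    if (!inside(range)) continue;
    total += 1;
    if ((cov.s[id] ?? 0) > 0) hit += 1;
  }
  for (const [id, branch] of Object.entries(cov.branchMap)) {
    if (!inside(branch.loc)) continue;
    for (const count of cov.b[id] ?? []) {
      total += 1;
      if (count > 0) hit += 1;
    }
  }
  return total === 0 ? 0 : (hit / total) * 100;
}

// Hypothetical coverage data: a function on lines 1-10 with two statements
// (one hit) and one two-armed branch (one arm hit).
const sampleCoverage: FileCoverage = {
  fnMap: { "0": { name: "hypotheticalFn", loc: { start: { line: 1 }, end: { line: 10 } } } },
  statementMap: {
    "0": { start: { line: 2 }, end: { line: 2 } },
    "1": { start: { line: 5 }, end: { line: 5 } },
    "2": { start: { line: 20 }, end: { line: 20 } }, // outside the function
  },
  branchMap: { "0": { loc: { start: { line: 3 }, end: { line: 6 } } } },
  s: { "0": 4, "1": 0, "2": 7 },
  b: { "0": [4, 0] },
};

const samplePct = overallCoveragePct(sampleCoverage, "hypotheticalFn");
console.log(samplePct); // 50
```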

Step 3: Combine the two

Given a list of functions (from step 1) I can filter it based on coverage using xargs like this:

"changerisk": "eslint \"**/*.{ts,tsx}\" --rule '\"complexity\": [\"warn\", 5]'|grep Maximum |cut -d\"'\" -f2-|sort -t' ' -k6,6nr|cut -d' ' -f1,6|cut -d\"'\" -f1 | xargs -n1 is-jscoverage-pct-low coverage/coverage-final.json"

I call it like this:

yarn changerisk

And it outputs this:

file: /Users/lou/project/momentum/backend/src/db/entities/plannedBlock.service.ts
name: dbDeleteUserPlannedBlocksForTimeSpan
overallPct: 55.55555555555556

file: /Users/lou/project/momentum/backend/src/db/entities/plannedBlock.service.ts
name: dbSetUserPlannedBlocksForTimeSpan
overallPct: 62.5

file: /Users/lou/project/momentum/backend/src/db/entities/user.service.ts
name: dbSetUserSettings
overallPct: 54.54545454545454
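
The filter itself is simple once both numbers are known for each function. As a sketch under stated assumptions (the FunctionMetrics shape, the name findRiskyFunctions, and every value in the sample are made up for illustration; only the thresholds of 5 and 80% come from the steps above), the same selection in TypeScript would be:

```typescript
interface FunctionMetrics {
  name: string;
  complexity: number;
  coveragePct: number;
}

// Keeps functions that are both over the complexity limit and under the
// coverage floor, most complex first.
function findRiskyFunctions(
  metrics: FunctionMetrics[],
  maxComplexity = 5,
  minCoveragePct = 80,
): FunctionMetrics[] {
  return metrics
    .filter((m) => m.complexity > maxComplexity && m.coveragePct < minCoveragePct)
    .sort((a, b) => b.complexity - a.complexity);
}

// Hypothetical data, not taken from the project above.
const sampleMetrics: FunctionMetrics[] = [
  { name: "fnA", complexity: 12, coveragePct: 40 },  // complex and undertested
  { name: "fnB", complexity: 7, coveragePct: 95 },   // complex but well tested
  { name: "fnC", complexity: 3, coveragePct: 10 },   // undertested but simple
  { name: "fnD", complexity: 6, coveragePct: 79.5 }, // just over both lines
];

const risky = findRiskyFunctions(sampleMetrics);
console.log(risky.map((m) => m.name)); // ["fnA", "fnD"]
```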

Given a list like this, you could use it to:

Onboard a new developer: To fix these functions, they need to refactor and unit-test them. They can likely do this without knowing much about your system. This allows them to concentrate on learning your PR processes.
Identify risky estimates: Anyone creating an estimate of a project that will change code should see if the files and functions they intend to change are risky.
Plan tech debt remediation projects: In my book, Swimming in Tech Debt, I outline a process for building and managing a tech debt backlog. You could use a list like this to build backlog items to tackle.
Build a dashboard: It would be nice to show the rest of the org that the number of risky functions you have is decreasing over time.

How Product Managers and Engineering Teams Can Work Together to Tackle Tech Debt

I participated in a discussion with Arin Sime and Kent McDonald about tech debt and how PMs and Engineers can work together to address it.

A lot of what I said is covered in detail in my forthcoming book, Swimming in Tech Debt. Here’s an overview of the topics we discussed:

What we mean by Technical Debt
How much time to allocate to it
Whether it’s sometimes ok to ignore it
What to do with code so bad everyone is afraid to touch it
Vibe Coding (of course) and whether it’s a tech debt creating machine

If you would like more detail on anything we talked about, reach out to me on LinkedIn.

Push, Kick, and Swim (through Tech Debt)

My swimming workouts are in a pool, so each lap starts with me pushing off the pool wall, kicking underwater for a bit, and then turning that momentum into a freestyle swim until I get to the opposite wall and start again. The speed of my lap is determined by the efficiency of my strokes, but the push and kicks overcome the water resistance and generate the initial momentum. That push-off is analogous to how I incorporate tech debt payments into my work and is the core idea in my book, Swimming in Tech Debt.

In a single lap, most of the distance is covered by swimming, and that’s the same in my programming. Most of what I do will be directly implementing the feature or fixing the bug, but I start with a small tech debt payment to get momentum. That small payment is improving the area I am about to change, which makes it easier and faster to do that change.

After the push comes underwater kicking, which is so effective that its use is limited to 15 meters in competitions. After that, the swimmer must begin normal strokes. The same principle applies to tech debt payments. They are effective, but they are not the goal. If all you do is pay down debt, you won’t deliver anything of real value. Paying tech debt makes me happy, so I have to limit how much time I spend on it and get back to my task.

Finally, while I am swimming, no matter how tired I am or how slow I am going, I know I’ll get to the other side eventually. When I do, I get to push and kick again to get some extra momentum. Similarly, when I am stuck on a coding task, I sometimes switch to an easy and productive task (like adding a test) while my brain works on the problem in the background. I know I will do this if I have to, so I keep coding on the main problem for as long as I can. I finish my lap.

Then, I push and kick to start a new lap. That cadence of pushes, kicks, and then a nearly full lap of coding is how I finish the task at hand but leave a series of tech debt payments in my wake.

A Good Pull Request Convinces You That it is Correct

My onboarding peer-mentor at Trello described a good pull request as telling a story. In practice this meant that you would edit and order the commits after you were done so that the reviewer could go commit-by-commit and understand the change you made in steps.

So, for example, let’s say you were working on a feature. In your second commit, you introduce a bug, but then in your fifth commit, you find and fix that bug. Instead of just PR’ing that, you would edit the commits so that the bug was never introduced.

This is analogous to sending a document with superfluous text deleted, not crossed out. If you don’t edit the commits, you will waste the reviewer’s time because they might see the error in the second commit, make a comment, and then have to go back and amend their comment in the fifth. If you did this a lot, they might not even finish reviews before rejecting them (which is what I suggest you do to PRs with obvious problems).

I like the story frame, but I have started to think of a PR as more of an argument of its own correctness. I am trying to teach the reviewer the details of the problem and convince them through evidence that the new code is correct.

In a PR, I might start by introducing a unit test into the area I intend to change. Then, to make things clearer, I might commit a small refactoring (that isolates the change). It’s now possible to add more tests, possibly including a failing one that shows what my intended fix will address. My small code clean-up commits are in service of that argument. Without them, it’s hard to tell whether my fix will break something else. With them, the fix feels like a natural and inevitable conclusion.

Like a philosophical argument, I will anticipate and address the cases the reviewer might think of. But it’s not enough to handle a case in the code; your whole PR needs to make it clear that you anticipated and addressed it (with tests, comments, screenshots or any other evidence you can think of).

But the most important reviewer to convince is myself, of course, and doing the work to write the argument gives me confidence that my code is correct.

Four Ways to Augment Code Coverage

Code coverage by itself is a hard metric to use because it can be gamed, which makes it especially vulnerable to Goodhart’s Law, usually summarized as “When a measure becomes a target, it ceases to be a good measure.” Goodhart’s Law observes that if you put pressure on people to hit a target, they will, but maybe not in the way you wanted.

This would happen with code coverage because we can always increase it with useless tests, tests of trivial functions, or tests of less valuable code.

I use these metrics in combination with coverage to make it harder to game:

Code Complexity: The simplest way to do this is to count the branches in a function. I use extensions in my code editor to help bring complex code to my attention. If coverage of the function is also low, I know that I can make the code less risky to change if I test it (or refactor it).
Usage analytics: If you tag your user analytics with the folder that the code generating it is in, you can later build reports that you can tie back to your coverage reports. See Use Heatmaps for iOS Beta Test Coverage. In that post, I used it to direct manual testing, but it would work for code coverage as well.
Recency of the code: To make sure that my PRs have high coverage, I use diff_cover. This makes it more likely that my tests are finding bugs in code that is going to be QA’d soon and has already been deemed valuable to write. Very old code is more likely to be working fine, so adding tests to it might not be worth it. If you find a bug in old code worth fixing, it will generate a PR (and become recent code).
Mutations: I am still trying to find a good tool for this, but this lets you test the quality of your assertions in addition to your coverage. I do it manually now.

Generally, the way to make a metric harder to game is to combine it with a metric that would be worse if it was gamed in ways you can predict (or have seen).

Mutation Testing

A couple of years ago, I wrote about a testing technique that I had learned, but didn’t remember the name of, so I called it code perturbance. I mentioned this technique in my book, and a helpful beta reader let me know the real name: mutation testing.

The idea is to intentionally change the code in a way that introduces a bug, but is still syntactically correct (so you can run it). Then, you run your test suite to see if it can find the problem. This augments code coverage, which will only let you know that code was run in a test, not if an assertion was made against it.
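
Here is a small hand-made illustration in TypeScript. Everything in it (isAdult, the mutant, and the two suites) is hypothetical and written for this example. Both suites execute the single line of the function, so both would show full coverage, but only one of them notices the mutation.

```typescript
type AgeCheck = (age: number) => boolean;

// Original code.
const isAdult: AgeCheck = (age) => age >= 18;

// Mutant: `>=` changed to `>`. Still valid code, but now wrong at the boundary.
const isAdultMutant: AgeCheck = (age) => age > 18;

// Weak suite: runs the code but never checks the boundary value.
function weakSuite(check: AgeCheck): boolean {
  return check(30) === true && check(5) === false;
}

// Strong suite: adds an assertion at the boundary.
function strongSuite(check: AgeCheck): boolean {
  return weakSuite(check) && check(18) === true;
}

console.log(weakSuite(isAdult));         // true: passes on the original
console.log(weakSuite(isAdultMutant));   // true: the mutant survives
console.log(strongSuite(isAdult));       // true: passes on the original
console.log(strongSuite(isAdultMutant)); // false: the mutant is caught
```

A surviving mutant is the signal to look for: the code ran in a test, but no assertion depended on the behavior that changed.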

Now that I know the name, I can find out more about it on Google. For example, there are tools that can do it for you automatically. The one that I’m most interested in is Stryker-mutator, because it supports TypeScript. I’ll report back when I try it.

Technical Debt Typology Research Paper

A few months ago, I got an email from Mark Greville that included a link to a research paper he coauthored, called A Triple Bottom-line Typology of Technical Debt: Supporting Decision-Making in Cross-Functional Teams.

In the paper, the authors identify several categories of tech debt. One category is internal vs. external effects. In my book, I also identified the external category, which I call visibility. The paper thinks of the entire business as the “internal”, but I think of the team itself as the internal part. My separation is driven by the difference between how the engineering team needs to communicate with itself and how it needs to communicate with the rest of the business. Customers and other public stakeholders would likely be similar to the non-engineering business teams.

Since a lot of my book is about how tech debt affects developer productivity, I break down internal debt into the various ways it could reduce productivity. I use misalignment to describe tech debt that doesn’t meet the documented standards of the team. When the code is hard to change, for example if there’s messy code all over the codebase (Marbleized Code Fat), I call that resistance. If the code does what it’s supposed to (so no external effects) and the customers highly depend on its behavior, I warn about the risk of regressions.

Another pair they describe is whether the tech debt is taken knowingly or unknowingly. This is useful from a taxonomy perspective and might contribute to tech debt avoidance, but in my book, I write:

I don’t think of tech debt as the result of an intentional shortcut borrowed from the future. Some debt starts that way, but the reality is that lots of tech debt happens because the world changes. Even if your system represents your best ideas of how to solve the problem at hand, your ideas will get better, and the problems will change. You can do everything right and still have bad code, so it doesn’t help to judge the decisions that got us there. Learn from them, but it’s counter-productive to dwell on them.

My chapters on these dimensions focus on using them to decide what to do about the debt, and I don’t think intention is a factor in deciding what to do next.

The paper is worth a look and also has quite a good bibliography if you are interested in research on tech debt. Since the methodology of the research included a literature review, the list of references reviewed is another treasure trove of research.

Adam Tornhill on Tech Debt’s Multiple Dimensions

In the research for my book on technical debt, I ran into this talk by Adam Tornhill:

Adam has a similar perspective to mine: technical debt is multi-faceted, and the right strategy should address its various dimensions.

One of his examples is combining a code complexity metric with data from your source code repository to define low code health hotspots: areas where code is both complex and frequently changed. To find the hotspots, he built a tool to calculate this metric and visualize it. In the video he shows data from big open-source projects (like Android and .NET Core) and pinpoints areas that would benefit from work to pay down debt.

Similarly, in my book, I identify eight dimensions of debt. Complexity is something I consider to be part of Resistance, which is how hard or risky it is to change the code. I would also incorporate low test coverage into resistance, as well as subjective criteria. Adam says that complexity is a good estimate of how many tests you need, which is true, and I give you credit for having the tests. I am mostly concerned by complex code that is undertested.

Like Adam, I believe that bad code only matters if you plan to change it. He believes that the repository history of changes is a good indication of future change, which I agree with, but to a lesser degree. In my book, I recommend that you look at the history, and I share this git log one-liner as a starting point:

git log --pretty=format: --name-only --since=3.months | sort | uniq -c | sort -rg | head -10

That line will show you the most edited files. To find the most edited folders, I use this:

git log --pretty=format: --name-only --since=3.months | sed -e 's/^/\//' -e 's/[^\/]*$//'  | sort|uniq -c|sort -rg|head -10
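
If the sort and uniq pipeline is hard to read, the counting step can be sketched in TypeScript. This assumes the output of the git log command has already been captured as a string; mostEdited, folderOf, and the sample paths are hypothetical names for illustration. One difference from the one-liners: blank lines between commits are skipped rather than counted.

```typescript
// Counts how often each path (or a key derived from it) appears in
// `git log --name-only` output and returns the most frequent entries.
function mostEdited(
  log: string,
  toKey: (path: string) => string = (path) => path,
  limit = 10,
): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const line of log.split("\n")) {
    const path = line.trim();
    if (path === "") continue; // blank separator between commits
    const key = toKey(path);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit);
}

// Mirrors the sed expressions: prefix "/" and drop the file name.
function folderOf(path: string): string {
  const lastSlash = path.lastIndexOf("/");
  return lastSlash === -1 ? "/" : "/" + path.slice(0, lastSlash + 1);
}

// Hypothetical log output covering three commits.
const sampleLog = [
  "src/db/user.service.ts",
  "src/db/plan.service.ts",
  "",
  "src/db/user.service.ts",
  "README.md",
  "",
  "src/api/routes.ts",
  "src/db/user.service.ts",
].join("\n");

const topFiles = mostEdited(sampleLog);
const topFolders = mostEdited(sampleLog, folderOf);
console.log(topFiles[0]);   // ["src/db/user.service.ts", 3]
console.log(topFolders[0]); // ["/src/db/", 4]
```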

This data contributes to a dimension I call volatility. It’s meant to be forward looking, so I would mostly base it on your near future roadmap. However, it is probably true that it is correlated with the recent past. In my case, this data is misleading because I just did a reorganization of my code to pull out a shared library to share between the web and mobile versions of my app. But, knowing this, I could modify the time period or perhaps check various time periods to see if there’s some stable pattern.

Generally, my opinions about tech debt and prioritization are very aligned with what’s in this video, especially the multi-faceted approach.

Marbleized Code Fat?

I go back and forth on whether the name “Tech Debt” is the most useful term. In my book, I decided I can’t fight the term, so I use it, but in my opening paragraphs I make an argument that the problem we call tech debt isn’t like other debt.

The most common form of tech debt I’ve seen is the kind that’s just everywhere. It’s not limited to a specific old module; it pervades the codebase like marbleized fat in a good steak, but not as delicious.

Follow-up on my Tech Debt Article in The Pragmatic Engineer Newsletter

About two weeks ago, The Pragmatic Engineer published an early excerpt from a book I’m writing on Tech Debt. This week, the newsletter published a follow-up that’s available for free. Read it here:

https://newsletter.pragmaticengineer.com/p/paying-down-tech-debt-further-learnings

If you missed the original article, it’s here:

https://newsletter.pragmaticengineer.com/p/paying-down-tech-debt

After that article came out, there were a few LinkedIn posts with good conversations about tech debt. This one by Ellery Jose has an interesting table that breaks down how to treat technical debt as a financing decision:

https://www.linkedin.com/posts/ejose_tpm-pmo-engineering-activity-7237750795348164610-5mKq/

Also, Guillaume Renoult sent me this post he wrote:

https://medium.com/elmo-software/how-to-sort-out-tech-debt-and-speed-up-development-9a1d27fdd39e

There is a section of my book that has similar ideas to these two posts, particularly team management of debt using evidence and data. I’m hoping to have a draft done of that part in a few weeks. If you want to see an early version, sign up to receive updates.