Finding Functions That Are Risky to Change

In "Tech Debt Detectors" and "Use Your First Commit to Fix CRAP", I explained the concept of combining low code coverage with high code complexity to highlight functions that are risky to change. I mostly do this in my IDE to warn me before I change code, but it’s also useful for running a global search for risky functions.

My main project is in TypeScript and uses Jest and ESLint. Here’s how I automated a search for risky functions.

Step 1: Get a list of high complexity functions

Note: by complexity, I am referring to cyclomatic complexity, the number of independent paths through a function, which is calculated by counting branches, loops, and boolean expressions.
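
To make the counting concrete, here is a small hypothetical function (not from my project) with each decision point marked. The tally follows the usual convention of starting at 1 and adding 1 per branch, loop, and boolean operator. ESLint's exact counting rules can differ between versions, so treat the total as an illustration, not a guaranteed report.

```typescript
// Hypothetical example function, annotated with the decision points
// that a cyclomatic-complexity count typically includes.
function classifyBlock(minutes: number, tags: string[]): string {
  if (minutes <= 0) {                          // +1 (if)
    return "invalid";
  }
  for (const tag of tags) {                    // +1 (loop)
    if (tag === "skip" || tag === "hidden") {  // +1 (if), +1 (||)
      return "ignored";
    }
  }
  return minutes > 60 ? "long" : "short";      // +1 (ternary)
}
// Base path (1) + 5 decision points = 6 by this tally, which would be
// one over a maximum of 5.
```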

ESLint has a rule that lets you set a maximum complexity. In my package.json, I call ESLint via yarn this way:

"lint": "eslint \"**/*.{ts,tsx}\""

I added a line that does the same thing, but with a complexity rule:

"complexity": "eslint \"**/*.{ts,tsx}\" --rule '\"complexity\": [\"warn\", 5]'"

If I run yarn complexity, I get warnings that look like this:

347:8  warning  Async function 'dbSetUserPlanBlock' has a complexity of 6. Maximum allowed is 5  complexity

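With a bunch of bash-fu, this line produces a list of function names sorted from highest complexity to lowest. It keeps only the warning lines, sorts them by the reported complexity, and strips everything except the function name: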

"complexity": "eslint \"**/*.{ts,tsx}\" --rule '\"complexity\": [\"warn\", 5]'|grep Maximum |cut -d\"'\" -f2-|sort -t' ' -k6,6nr|cut -d' ' -f1,6|cut -d\"'\" -f1 "

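If the cut/sort chain is hard to read, the same extraction can be sketched in TypeScript. This is an illustrative equivalent, not what my package.json runs. The function name parseComplexityWarnings and the ComplexFunction type are made up for this example, and it assumes ESLint's default text output, where a named function appears in single quotes as in the warning shown earlier.

```typescript
interface ComplexFunction {
  name: string;
  complexity: number;
}

// Pulls function names and complexity scores out of ESLint's text output
// and returns them sorted from highest complexity to lowest.
// Assumption: each warning contains "'<name>' has a complexity of <N>".
// Lines without a quoted name (e.g. anonymous functions) are skipped.
function parseComplexityWarnings(eslintOutput: string): ComplexFunction[] {
  const pattern = /'([^']+)' has a complexity of (\d+)/;
  const results: ComplexFunction[] = [];
  for (const line of eslintOutput.split("\n")) {
    const match = pattern.exec(line);
    if (match) {
      results.push({ name: match[1], complexity: Number(match[2]) });
    }
  }
  return results.sort((a, b) => b.complexity - a.complexity);
}
```

Unlike the shell version, this keeps the complexity number next to each name, which is handy if you want to rank the results later.
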
Step 2: Find the code coverage of a function

What I need now is a way to find the coverage of a function given its name. If I run jest via yarn like this:

yarn test --coverage

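it will generate a JSON file called coverage/coverage-final.json, which has all of the coverage data. It’s a complex JSON file, but if you install jq via brew, you can use this script to check whether a function’s coverage is lower than 80% (credit: ChatGPT). It counts the statements and branch arms that fall inside the function’s line range and reports the function only if fewer than 80% of them were hit: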

#!/bin/sh
# Usage: is-jscoverage-pct-low <coverage-final.json> <function-name>
jq -r -c --arg func "$2" --argjson threshold 0.80 '
to_entries[] as $fe
| $fe.key as $file
| $fe.value as $v
| ($v.fnMap // {} | to_entries[] | select(.value.name == $func) | .value.loc) as $loc
| ($loc.start.line) as $sline
| ($loc.end.line)   as $eline
| {file:$file, name:$func, start:$sline, end:$eline} as $meta
| (reduce (( $v.statementMap // {} ) | to_entries[]
          | select(.value.start.line >= $sline and .value.end.line <= $eline)) as $s
         ({stmt_tot:0, stmt_hit:0};
          .stmt_tot += 1
          | .stmt_hit += ( ($v.s[$s.key] // 0) | tonumber | if . > 0 then 1 else 0 end ))) as $S
| (reduce (( $v.branchMap // {} ) | to_entries[]
          | select(.value.loc.start.line >= $sline and .value.loc.end.line <= $eline)) as $b
         ({br_tot:0, br_hit:0};
          .br_tot += ( ($v.b[$b.key] // []) | length )
          | .br_hit += ( ($v.b[$b.key] // []) | map( (.|tonumber) | if . > 0 then 1 else 0 end ) | add // 0 ))) as $B
| $meta + $S + $B
| .linePct    = (if .stmt_tot==0 then 0 else (.stmt_hit / .stmt_tot * 100) end)
| .branchPct  = (if .br_tot==0  then 0 else (.br_hit  / .br_tot  * 100) end)
| .overallPct = (if (.stmt_tot + .br_tot)==0 then 0 else ((.stmt_hit + .br_hit) / (.stmt_tot + .br_tot) * 100) end)
| select(.overallPct < ($threshold * 100))
| "file: \(.file)\nname: \(.name)\noverallPct: \(.overallPct)\n"
' "$1"

I put this in a file called is-jscoverage-pct-low, which I call like this:

is-jscoverage-pct-low coverage/coverage-final.json dbSetUserPlanBlock
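
For readers who find the jq hard to follow, here is a rough TypeScript sketch of the same calculation. It is not the script I run. The names findLowCoverage, FileCoverage, and LowCoverageHit are invented for this example, the types cover only the fields the jq script reads, and it takes the already-parsed JSON as an argument so it has no dependencies.

```typescript
// Minimal types for the parts of coverage-final.json used below.
interface Position { line: number; column: number }
interface Range { start: Position; end: Position }
interface FileCoverage {
  fnMap: Record<string, { name: string; loc: Range }>;
  statementMap: Record<string, Range>;
  branchMap: Record<string, { loc: Range }>;
  s: Record<string, number>;   // statement id -> hit count
  b: Record<string, number[]>; // branch id -> hit count per arm
}
interface LowCoverageHit { file: string; name: string; overallPct: number }

// Returns every function named funcName whose combined statement and
// branch coverage is below thresholdPct.
function findLowCoverage(
  coverage: Record<string, FileCoverage>,
  funcName: string,
  thresholdPct = 80
): LowCoverageHit[] {
  const hits: LowCoverageHit[] = [];
  for (const file of Object.keys(coverage)) {
    const cov = coverage[file];
    for (const fnId of Object.keys(cov.fnMap)) {
      const fn = cov.fnMap[fnId];
      if (fn.name !== funcName) continue;
      // A statement or branch belongs to the function if its lines fall
      // inside the function's line range.
      const inside = (r: Range): boolean =>
        r.start.line >= fn.loc.start.line && r.end.line <= fn.loc.end.line;
      let total = 0;
      let hit = 0;
      for (const id of Object.keys(cov.statementMap)) {
        if (!inside(cov.statementMap[id])) continue;
        total += 1;
        if ((cov.s[id] ?? 0) > 0) hit += 1;
      }
      for (const id of Object.keys(cov.branchMap)) {
        if (!inside(cov.branchMap[id].loc)) continue;
        const arms = cov.b[id] ?? [];
        total += arms.length; // each arm of a branch counts separately
        hit += arms.filter((count) => count > 0).length;
      }
      const overallPct = total === 0 ? 0 : (hit / total) * 100;
      if (overallPct < thresholdPct) {
        hits.push({ file, name: funcName, overallPct });
      }
    }
  }
  return hits;
}
```

As in the jq version, a function with no statements or branches in its range gets 0%, so it is reported as low coverage.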

Step 3: Combine the two

Given a list of functions (from step 1), I can filter it based on coverage using xargs. This assumes yarn test --coverage has already been run, so that coverage/coverage-final.json exists:

"changerisk": "eslint \"**/*.{ts,tsx}\" --rule '\"complexity\": [\"warn\", 5]'|grep Maximum |cut -d\"'\" -f2-|sort -t' ' -k6,6nr|cut -d' ' -f1,6|cut -d\"'\" -f1 | xargs -n1 is-jscoverage-pct-low coverage/coverage-final.json"

I call it like this:

yarn changerisk

And it outputs this:

file: /Users/lou/project/momentum/backend/src/db/entities/plannedBlock.service.ts
name: dbDeleteUserPlannedBlocksForTimeSpan
overallPct: 55.55555555555556

file: /Users/lou/project/momentum/backend/src/db/entities/plannedBlock.service.ts
name: dbSetUserPlannedBlocksForTimeSpan
overallPct: 62.5

file: /Users/lou/project/momentum/backend/src/db/entities/user.service.ts
name: dbSetUserSettings
overallPct: 54.54545454545454
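
The logic of the whole pipeline boils down to one filter: keep a function only if its complexity is above the limit and its coverage is below the threshold. Here is a minimal sketch of that rule in TypeScript, assuming you already have both numbers for each function. FunctionMetrics, riskyFunctions, and the sample names in the usage lines are hypothetical. The defaults mirror the 5 and 80% used above.

```typescript
interface FunctionMetrics {
  name: string;
  complexity: number;
  coveragePct: number;
}

// Keeps functions that are both complex and under-tested, most complex first.
function riskyFunctions(
  metrics: FunctionMetrics[],
  maxComplexity = 5,
  minCoveragePct = 80
): FunctionMetrics[] {
  return metrics
    .filter((m) => m.complexity > maxComplexity && m.coveragePct < minCoveragePct)
    .sort((a, b) => b.complexity - a.complexity);
}

// Usage with made-up numbers:
const exampleMetrics: FunctionMetrics[] = [
  { name: "complexButTested", complexity: 9, coveragePct: 95 },
  { name: "simpleAndUntested", complexity: 2, coveragePct: 10 },
  { name: "complexAndUntested", complexity: 7, coveragePct: 40 },
];
const riskyNames = riskyFunctions(exampleMetrics).map((m) => m.name);
// riskyNames is ["complexAndUntested"]
```

Either way, the result is a short list of functions that are both complex and under-tested.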

Given a list like this, you could use it to:

  1. Onboard a new developer: To fix these functions, a new developer needs to refactor and unit-test them, which they can likely do without knowing much about your system. This lets them concentrate on learning your PR process.
  2. Identify risky estimates: Anyone creating an estimate of a project that will change code should see if the files and functions they intend to change are risky.
  3. Plan tech debt remediation projects: In my book, Swimming in Tech Debt, I outline a process for building and managing a tech debt backlog. You could use a list like this to build backlog items to tackle.
  4. Build a dashboard: It would be nice to show the rest of the org that the number of risky functions you have is decreasing over time.
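
For the backlog and dashboard ideas, it can help to collapse complexity and coverage into a single number per function so the list can be ordered. One option is the commonly cited CRAP formula, complexity² × (1 − coverage)³ + complexity. The sketch below is an assumption about how you might apply it. It is not part of my yarn scripts, and crapScore is a name chosen for this example.

```typescript
// CRAP score: grows with complexity and shrinks as coverage approaches 100%.
// coveragePct is a percentage from 0 to 100.
function crapScore(complexity: number, coveragePct: number): number {
  const uncovered = 1 - coveragePct / 100;
  return Math.pow(complexity, 2) * Math.pow(uncovered, 3) + complexity;
}

const fullyTested = crapScore(6, 100); // 6: with full coverage, only complexity remains
const untested = crapScore(6, 0);      // 42: 6 * 6 + 6
const halfTested = crapScore(10, 50);  // 22.5: 100 * 0.125 + 10
```

Sorting the risky list by this score, highest first, gives one possible ordering for backlog items, and the number of functions above a chosen score is one candidate metric to chart over time.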