Category Archives: Software Development

Patterns for Comparing Multidimensional Things

Dates and integers have a natural ordering. We all agree that January 1st is before January 2nd and ten comes after nine. But there is no natural ordering for things like vectors, complex numbers, and matrices because they are multi-dimensional. Unfortunately, most things in real life are multi-dimensional.

A common way to deal with ordering a list of things in software is to put their attributes in different columns. You see this in email clients, spreadsheets, and lots of other software. Then, when you click on a column, the list is sorted by that attribute. You can explore various orderings for different contexts. Some might be more useful than others. For example, in my WordPress backend, I sort my posts in reverse chronological order, but it’s valid to sort them by title if you need to. Good versions use the previous sort choice as a sub-sort that breaks ties in the main sort.

A grid listing blog posts sorted by date.
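A minimal sketch of the sub-sort idea in Python, assuming a hypothetical Post record and made-up rows. Python's sort is stable, so sorting an already-sorted list by a new column leaves ties in the order the previous click produced.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    # Hypothetical row type; the field names are assumptions for illustration.
    title: str
    author: str
    published: date


def click_column(rows, column, descending=False):
    """Sort by one column. Rows that tie keep the order they already had,
    because sorted() is stable (also when reverse=True)."""
    return sorted(rows, key=lambda row: getattr(row, column), reverse=descending)


posts = [
    Post("Tech Debt Detectors", "B", date(2023, 3, 2)),
    Post("A good reason to use TODO", "A", date(2023, 3, 9)),
    Post("Patterns for Comparing", "B", date(2023, 3, 16)),
]

# First click: newest first. Second click: by author, with date as the sub-sort.
by_date = click_column(posts, "published", descending=True)
by_author = click_column(by_date, "author")
```

After the second click, the two posts by author "B" are still newest-first relative to each other.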

Another way is to come up with some kind of function of the attributes that results in a single-dimensional attribute that is easy to compare. One that I’ve seen on flight search websites is an “Agony” score that takes into account the number of stops, the price, and the departure time. You could sort by ascending agony and hopefully see the best choice that considered all of the variables, rather than just sorting on price.
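Here is a rough sketch of what such a score could look like. The Flight fields, the weights, and the 8am cutoff are all made up for illustration; a real flight search site surely uses something more refined.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    # Hypothetical record; fields chosen to match the attributes named above.
    name: str
    price_usd: float
    stops: int
    departure_hour: int  # 0-23, local time


def agony(flight, stop_penalty=75.0, early_penalty=20.0):
    """Collapse several attributes into one comparable number (lower is better).
    The penalty weights are assumptions, not anyone's real formula."""
    # Penalize each hour the departure falls before 8am.
    hours_too_early = max(0, 8 - flight.departure_hour)
    return flight.price_usd + stop_penalty * flight.stops + early_penalty * hours_too_early


flights = [
    Flight("cheap red-eye", 200.0, 2, 5),
    Flight("direct midday", 320.0, 0, 12),
    Flight("one stop morning", 250.0, 1, 9),
]

# Ascending agony: the cheapest flight ends up last once stops and timing count.
ranked = sorted(flights, key=agony)
```

The interesting design choice is the weights: they encode how many dollars one stop or one early hour is worth to you, and changing them changes the ordering.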

I do something like this in my iOS app, Habits. For each habit, I look at your entire history with the habit. I weight recent adherence more heavily than older adherence and try to come up with a score normalized between 0 and 100. My intent is that you can use that to compare how well you are doing on different habits.
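One way to build a score like that is an exponentially decaying weighted average. This is only a sketch of the general idea, not the formula Habits actually uses; the function name and the decay value are assumptions.

```python
def habit_score(history, decay=0.9):
    """history: list of booleans, oldest first; True means the habit was done that day.
    Returns a score from 0 to 100 where recent days count more than older ones."""
    if not history:
        return 0.0
    n = len(history)
    # The most recent day gets weight 1, the day before gets decay, then decay**2, ...
    weights = [decay ** (n - 1 - i) for i in range(n)]
    earned = sum(w for w, done in zip(weights, history) if done)
    return 100.0 * earned / sum(weights)
```

With this shape, missing yesterday costs more than missing a day a month ago, and a perfect history always scores 100 no matter how long it is.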

A third way is to map attributes to elements of a chart. One attribute could be the x-axis and another could be the y-axis. You could map one attribute to the size of a dot and another to color along a gradient. If your x and y are categories rather than a continuous value, you might end up with a heatmap. This heatmap compares the amount of testing done on different iPhones and iOS versions.

A heat map of the test status of iOS devices across different features in an app

For continuous axes, you might end up with a chart like this one, which you can generate with Chart.js:

In that last chart, it matters which attribute you map to which chart element. It’s often the case that we filter for just the upper-right quadrant, so the x and y would override color and size. You might want to rotate through different choices of the mapping.

Lastly, you could generate a radar chart for each thing, putting the attributes along the spokes of a multi-dimensional graph like this one:

This works well when you want to combine things to form a balanced whole. By overlaying two radar graphs, you can see if the combination is complementary.

Blue radar chart overlaid on top of the yellow one to show the combination

But you could also get a sense of an ordering. You could calculate the covered area, which is a function of the attributes. You could size the spokes and normalize the data on them to express a priority and to dampen the effect of outliers.
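As a sketch of the covered-area idea: with equally spaced spokes, the polygon splits into triangles between neighboring spokes, and each triangle has area ½·a·b·sin(angle between them). The function name is mine, and it assumes the values are already normalized and non-negative.

```python
import math


def radar_area(values):
    """Area of the polygon a radar chart draws for values on equally spaced spokes."""
    n = len(values)
    if n < 3:
        raise ValueError("a radar chart needs at least three spokes")
    angle = 2 * math.pi / n
    # Sum the areas of the triangles between each pair of neighboring spokes.
    return 0.5 * math.sin(angle) * sum(
        values[i] * values[(i + 1) % n] for i in range(n)
    )
```

One caveat with this measure: the area depends on the order of the spokes, because only neighboring attributes are multiplied together. The same four values arranged as 1, 0, 1, 0 cover no area at all, while 1, 1, 0, 0 do, so the spoke order needs to be fixed before comparing things.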

I’m thinking a lot about multidimensional comparisons as I consider ways to prioritize projects. I’ll be writing more about this soon.

A good reason to use TODO

On my first job, there was a vestigial TODO that always bothered me. It said:

/* TODO: PSJB, Is this right? -AW */

I eventually figured out that “PSJB” were the initials of our CEO (who wrote a lot of code for the early versions). I knew who AW was, but he left before I started, so I couldn’t ask him what this meant. I wanted to just delete it, but I could never figure out if the code it referred to was right. I left the company before figuring it out, so the comment might still be there.

This was a bad way to use TODO.

To avoid this problem for others, for my last PR at Atlassian, I searched for every TODO that I left in the code, which was possible because we were forced to sign them. I resolved each one in some way and then removed it. Saying “toodle-oo” by removing “TODO: Lou” from our code made me smile.

None of these TODOs were there for good reason.

Since I’m 100% in charge of my code style guide these days, I don’t allow TODOs to be merged (it won’t pass linting). But, I do use TODO in the code while I am building a PR if it’s convenient—I’ll be forced to remove it before merge.

The only other TODOs I’m ok with are ones that are going to be resolved very soon (within the grace period), hopefully in a stacked PR.
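A check like this is easy to sketch. The following is not my actual lint rule, just a minimal Python stand-in that finds TODO comments in a piece of source text; the function name is an assumption.

```python
import re

# Matches TODO as a whole word, e.g. "// TODO: fix" but not "TODOS".
TODO_PATTERN = re.compile(r"\bTODO\b")


def find_todos(source_text):
    """Return (line_number, line) pairs for every line that contains a TODO."""
    return [
        (number, line.strip())
        for number, line in enumerate(source_text.splitlines(), start=1)
        if TODO_PATTERN.search(line)
    ]
```

A pre-merge step could run this over the changed files and fail when the list is not empty, which still leaves you free to use TODOs while the PR is in progress.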

If you have (tech) debt, inflation is good

I rent my apartment. I moved here in 2018, and over the last 5 years my rent has gone up about 30% (because of many factors, but mostly inflation). Inflation over that time was about 22%, so my rent has gone up even in constant dollars.

If you had a fixed-rate mortgage, your payments would have gone up 0%. When you have a mortgage, inflation makes housing costs relatively smaller in your total budget because everything else goes up. If your interest rate is lower than inflation, then you get to pay the debt back in lower-value dollars.
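To make the constant-dollar arithmetic concrete, here is a small sketch using the rounded figures above (30% rent increase, 22% inflation, 0% change in a fixed mortgage payment). The function name is mine.

```python
def real_change(nominal_change, inflation):
    """Change in constant dollars, given nominal change and inflation as fractions."""
    return (1 + nominal_change) / (1 + inflation) - 1


rent = real_change(0.30, 0.22)      # rent rose faster than prices
mortgage = real_change(0.00, 0.22)  # fixed payment, so it shrank in real terms
```

With those inputs, the rent is up roughly 6.6% in constant dollars, while the fixed mortgage payment is down roughly 18%.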

The same is true for tech debt, where inflation is the increasing size of the codebase and the team. Writing more code shrinks the relative cost of the debt you have. Having more team members makes paying tech debt a smaller proportion of your work.

If you had a metric of tech-debt, new good code would tend to lower it. This is true as long as interest on the tech debt is not too high. For tech debt, interest payments are only due if you want to change the code.

If your roadmap requires you to mostly change tech-debt-laden code, then inflation is low (no new code) and so the interest payments are high. This is a good time to prioritize paying tech debt down.

Conversely, code that has debt, but basically works and is not going to be changed, is like having a 0% loan. You have the loan. It may one day come due, but at least you don’t have to service it if you don’t want to. If your team and codebase double in size, that debt will feel smaller.

How to Lower Tech Debt with One Easy Trick

Yesterday, I wrote about the Tech Debt Detectors that I use in Visual Studio Code.

Here’s what it looks like for one of my CRAP-y functions. The red bars in the left-gutter show that I don’t test this function at all, and the red square at the end of line 4 shows that it has a lot of branches.

I wrote this function to make it easier to call GQL Mutate functions with boilerplate error handling. This function reduces the complexity of each calling function. I am ok with this being complex, so to reduce CRAP, I should be testing each branch. I was surprised that I didn’t already do this, because I test GQL calls with a mock server. I did a full text search for the function name, and I see that … I NEVER USE IT?!

Ah yes, now I remember. I didn’t like that this code wasn’t type-safe, so I generated type-safe variants from my queries (see Why I am Using Code Generation Again, Part I and Extending GraphQL Code Generation (Part II)).

I never removed this function after migrating all code to the new version, so I did it now. Deleting code is a great way to lower tech debt. No code—no debt.

Tech Debt Detectors

When I wrote Use Your First Commit to Fix CRAP, I said that “there are extensions for many IDEs to get you the [CRAP] metric directly”, but I hadn’t installed any. I thought that the two components of CRAP were easy enough to notice without them, but that’s only halfway true. Today, I use two extensions for Visual Studio Code to help make CRAP-y functions more evident to me.

Note: The CRAP metric indicates that a function is risky to change because it’s complex and undertested. To fix the function, you either need to break it into smaller functions or add tests—both actions are generally good, so it’s a metric that’s hard to game.
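For reference, the usual definition of the metric (from the original Crap4j work) is CRAP(m) = comp(m)² × (1 − cov(m)/100)³ + comp(m), where comp is the cyclomatic complexity of the function and cov is its test coverage as a percentage. A small sketch, with a function name of my choosing:

```python
def crap_score(complexity, coverage_percent):
    """CRAP = complexity^2 * (1 - coverage)^3 + complexity."""
    uncovered = 1 - coverage_percent / 100
    return complexity ** 2 * uncovered ** 3 + complexity
```

The shape of the formula shows why both fixes work: full coverage reduces the score to the complexity itself, and splitting a function shrinks the squared term.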

The first component of a CRAP-y function is its complexity, which you can estimate by counting its branches. So, count each if/else-if/else, case in a switch, loop/break/continue, and each or/and in your boolean expressions. You are trying to get an idea of how many paths there are. Since you want to keep function complexity very low, you really don’t need to count every branch; you can stop at some low (single-digit) number. It isn’t hard to estimate a YES/NO answer to the question of complexity for any particular function, but the problem is remembering to ask.
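The counting can be sketched in a few lines. This version works on Python source using the standard ast module (my own code is TypeScript, so treat it as an illustration of the idea, not the tool I use). It is a rough approximation: it counts if/elif, loops, exception handlers, conditional expressions, and and/or operators, and it stops early at a limit.

```python
import ast

# Node types that add a path through the code (a rough approximation).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def estimate_complexity(source, limit=10):
    """Count branch points in Python source, stopping once the limit is reached."""
    count = 1  # one path exists even with no branches
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has two operators, so it adds two paths.
            count += len(node.values) - 1
        if count >= limit:
            return limit
    return count
```

Stopping at the limit matches the YES/NO framing: once a function is past the threshold, the exact number doesn't change the answer.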

To get complexity in Visual Studio Code, I am using CodeMetrics by Kiss Tamás. For each function, the extension shows a green, yellow, or red indicator and a short message above the function.

The second component of CRAP is test coverage. To show that in my editor, I use Coverage Gutters. This extension shows red and green markers to the left of the code to indicate if a line was run during tests. It needs you to generate standard code coverage files, which Jest can do for me. It should support any language that has standard coverage support (e.g. in lcov format).
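To give a sense of what is in those coverage files: an lcov tracefile is plain text, with an SF: line naming each source file, DA:<line>,<hits> lines for each instrumented line, and end_of_record closing the file's section. The sketch below reads that subset and lists the lines that never ran. It is only an illustration, not how Coverage Gutters is implemented, and the file path in the sample is hypothetical.

```python
def uncovered_lines(lcov_text):
    """Map each source file in an lcov report to the line numbers that were never run."""
    result = {}
    current_file = None
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current_file = line[3:]
            result[current_file] = []
        elif line.startswith("DA:") and current_file is not None:
            # DA:<line number>,<execution count>[,<checksum>]
            fields = line[3:].split(",")
            if int(fields[1]) == 0:
                result[current_file].append(int(fields[0]))
        elif line == "end_of_record":
            current_file = None
    return result


# Hypothetical report for a single file.
sample_report = "TN:\nSF:src/mutate.ts\nDA:1,3\nDA:2,0\nDA:4,0\nend_of_record\n"
missed = uncovered_lines(sample_report)
```

The lines with zero hits are the ones an editor extension would mark red in the gutter.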

I’ll show some examples of what this looks like and how I fixed problem areas in upcoming posts.

Projects that fail never pay off tech debt

I just shut down a project I started in October 2021. It was code for a startup, but it turned out the idea didn’t have traction, and my partner and I decided that it wasn’t worth pursuing. The tech debt in this project will never be paid. If I had been paying it all along, it would have been a waste of time.

This was not a full-time project for me, and I was the only developer on it, so there’s not a ton of code. But even a three-month project can pick up a little debt, so although this one wasn’t that old, it had some.

Like most projects, it had dependencies. I just checked my yarn.lock files and I see that the last time I did an update was about a year ago. I consider all third-party dependencies to be tech debt, especially as they get out of date, so that’s one that’s always building on most projects. The only way to avoid dependency debt is to not have dependencies. Which, in a way, is true now.

The biggest codebase issue that I was wrestling with was authorization. The permission model was getting a little out of control, and the code wasn’t helping make sense of it. I had been planning something more attribute-based in the code, but well, now I don’t have to worry about it.

If there’s a lesson to learn here, it’s this: Don’t rush to pay off debt in projects that have a good chance of dying. The goal should be to get customers. To the extent that it’s not externally perceivable to customers, code health is usually not much of a factor in early traction.

Tech Debt in a 3 Month Old Project

I’ve been working on a new web app since October. The backend is a GraphQL API on top of a database schema. Each entry point unpacks the request and then calls a function that queries or mutates the database. There are some complex mutations—ones where there might be updates, inserts, and deletions that all need to succeed or fail together.

I know that I’m supposed to do this with transactions, but I was also using an ORM that I’m not very familiar with, and wanted to make progress quickly. I decided to pay lip-service to transactions, wrap some functions in them to indicate to myself that I needed to think about them, but mostly just ignore them. I would never not do transactions in a production app, but for this, it made sense.

Since the MVP a month or so ago, I’ve just been adding minor things while my partner onboards some trial users. Eventually, I tried to implement a feature where the API call could result in many inserts and deletions, and I could see in tests that I could cause this to fail in ways that corrupted the data.

So, I had to stop, learn how to do transactions in this ORM, and then implement them in my data code. It took a few hours, and now the debt is paid off.
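To show the kind of failure a transaction prevents, here is a sketch using Python's built-in sqlite3 module instead of my ORM. The tags table and the replace_tags function are hypothetical. The delete and the inserts either all happen or none do.

```python
import sqlite3


def replace_tags(conn, post_id, new_tags):
    """Delete the old tags and insert the new ones so that both succeed or neither does."""
    # Used as a context manager, the connection commits if the block finishes
    # and rolls back if it raises.
    with conn:
        conn.execute("DELETE FROM tags WHERE post_id = ?", (post_id,))
        conn.executemany(
            "INSERT INTO tags (post_id, name) VALUES (?, ?)",
            [(post_id, name) for name in new_tags],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (post_id INTEGER, name TEXT NOT NULL)")
conn.execute("INSERT INTO tags VALUES (1, 'old')")
conn.commit()

try:
    replace_tags(conn, 1, ["new", None])  # None violates NOT NULL partway through
except sqlite3.IntegrityError:
    pass

# The failed call was rolled back, so the original row is still there.
remaining = [row[0] for row in conn.execute("SELECT name FROM tags")]
```

Without the transaction, the delete and the first insert would have stuck and the data would be left half-updated, which is the kind of corruption my tests were showing.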

I do this kind of short-term borrowing/paying of debt all of the time. I am borrowing to keep myself in flow. I try to keep the debt top-of-mind by making it very evident in the code. It’s like using a credit card where you pay it off inside the grace period.

Be Skeptical of Points-based Productivity Claims

I do not personally use Story Points in my estimates because I know that everyone outside of development will translate them to time and I want to do that for them.

But, consider this claim: Adopting GitHub Copilot will increase productivity by 20%. I actually believe that to be true, but can you show it with Points? No, you cannot.

Here’s how it should go:

  1. You do 10 sprints, you see that you have a steady state velocity of 100 Points
  2. You introduce GitHub Copilot
  3. Maybe for 2 sprints, you see velocity go up because developers over-estimate their stories. Let’s say it’s 120 now. Better report that quick, because …
  4. Then, devs start to adjust their estimates and velocity goes back to where it was.

Productivity went up, but velocity should stay constant because points are just time. You don’t get more time because of productivity gains.
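The steps above can be sketched as a toy model. It assumes that, before the change, one unit of work took one hour and was estimated at one point; the function and parameter names are made up for illustration.

```python
def sprint(capacity_hours, speedup, estimates_recalibrated):
    """Return (velocity_in_points, units_of_work_done) for one sprint.

    Before any speedup, one unit of work takes one hour and is estimated at one point.
    """
    work_done = capacity_hours * speedup
    # Once devs adjust, a unit that now takes less time is estimated at fewer points.
    points_per_unit = 1 / speedup if estimates_recalibrated else 1
    return work_done * points_per_unit, work_done


before = sprint(100, 1.0, estimates_recalibrated=False)
honeymoon = sprint(100, 1.2, estimates_recalibrated=False)
steady_state = sprint(100, 1.2, estimates_recalibrated=True)
```

In this model, velocity reads about 120 during the two honeymoon sprints and then returns to about 100, while the work done stays at about 120 units. The productivity gain is real, but velocity can't see it.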

If you see sustained velocity improvements (without changing the number of team members), then I would suspect gaming or a misunderstanding of how points are supposed to work.

Points and Velocity are best for front-line managers to understand what is going on with their teams and to size sprints. They should not be reported outside of the team, because they will be misunderstood and misapplied.

Making My Own CMS Worked Out in the End

In 2013 I started App-o-Mat to host content I was creating to teach iOS Development.

App-o-Mat is a totally custom CMS built with Django and Python that I have kept up to date for 10 years. It has been migrated from version to version of Django, from Python 2.x to 3.x, and I just recently replaced all of the Bootstrap with Tailwind.

I had learned Django in 2006, which is also when I learned Python. At the time, Ruby on Rails was probably the more obvious choice, but I’m actually glad I chose Python. Python has proven to be more enduring and useful in more contexts (e.g. AI).

From 2013 to 2021, I was full-time on iOS Development. App-o-Mat was the only web development I did. But, now all of the things I want to make are either for the web or best to be done in Python.

I know that making your own CMS is kind of nuts, but if I hadn’t done that, I wouldn’t know Python and my webdev knowledge would have atrophied. I think using some back-burner (non-primary) tech stacks in side-projects might be a good idea, or at least it was for me.

Self-Hosting a Podcast: 2.5 Years Later

It’s been 2.5 years since I started working on a podcast and decided to self-host using the Blubrry WordPress Plugin and S3. Here are my original reasons and current thinking:

  • “I want the episodes to be available indefinitely, even if I stop making new ones”: This worked out great since I took a break for 2 years between episodes 15 and 16. It would have felt bad paying a bill for something I wasn’t actively doing.
  • “I don’t want to pay for just hosting”: The key is “just hosting”. I think there are potentially a lot of things I (as an amateur podcaster) might like to pay for, but I didn’t see anyone offering something I cared about.
  • “I don’t care about analytics”: This is the main downside to self-hosting. I rolled my own analytics, but I’m not 100% sure they are correct. The problem is that to get anyone else to do your analytics, you have to send listeners to their URLs, which I am unwilling to do, partly because of the privacy leakage, but mostly because I value my URLs too much. I haven’t found this, but I’d like a service I could upload weblogs to and get useful podcast-oriented analytics from.
  • “I have the skills and desire to learn how to self-host”: I don’t think you should self-host unless this is true for you.

Two and a half years later, I can say that my system is easy to use and never requires any intervention to keep running. I never think about the reliability of S3 or Blubrry. My analytics scripts have needed tweaking, but that has mostly settled down.