Category Archives: Software Development

Protecting Myself

I was recently on the Scrum Master Toolbox podcast with Vasco Duarte in a series about AI-assisted coding. In it, I said that I read every line of AI-generated code (and fix it, if necessary) before I commit it.

This isn’t exactly right.

I read all code, even my own, and fix it before I commit. Doing it for AI is just an extension of how I normally work. I do this for my own code because I often find problems. This is even more true for AI-generated code.

Another reason to do this is that it makes code go through code review and testing faster. I have written about that previously.

Now that I am the only programmer on my project, I don’t need to worry about code review, but I do have to worry about DevOps, and frankly, I am not willing to trust AI to write code that I have to run. I have already fixed AI-generated code that introduced beginner-level security problems, so pure Vibe Coding on a project meant to be used by others on the web is not an option for me.

Dependency Maintenance vs. Supply Chain Attacks

I am assuming that you basically know what a supply chain attack is, but briefly, it’s when the code you install as a dependency in your development project contains malware. Unfortunately, all dependencies are code, and this code is usually run at a high privilege without needing to be signed.

The main thing it will try to do is grab keys and secrets from your .env files or environment variables and exfiltrate them. Some are targeted at blockchain developers and will try to steal their coins.

This is not a comprehensive guide. I am documenting my decisions based on my needs.

I agree with William Woodruff that We Should All Be Using Dependency Cooldowns. The TL;DR is in the title: never install a dependency version that isn’t at least a few days old. The downside is that a cooldown also delays your access to fixes for 0-day vulnerabilities. If that ever becomes an issue, you can take the time to investigate and adopt the specific fix with an override.
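
If you want to automate a cooldown check, it’s simple to sketch. Here is a minimal Python example that asks PyPI’s public JSON API how old a package’s newest release is; the package name and the seven-day window are placeholders, and a real setup would cover your whole lockfile:

from datetime import datetime, timedelta, timezone
import json
import urllib.request

def newest_release_age(package):
    # PyPI lists every uploaded file for a package with its upload time.
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    uploads = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in data["releases"].values()
        for f in files
    ]
    return datetime.now(timezone.utc) - max(uploads)

# Enforce the cooldown before upgrading.
if newest_release_age("django") < timedelta(days=7):
    print("Newest release is inside the cooldown window; hold off.")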

The other broad advice with little downside is to not allow install scripts to run (for example, by setting npm’s ignore-scripts option). You might still install malware, but if you let the install scripts run, they own you immediately. Since you are likely about to run the code inside your project anyway, this isn’t much protection, but I do it regardless. The downside is when a dependency needs its post-install script to work. I used can-i-ignore-scripts to check for this issue when I used npm.

Ultimately, though, I have decided to leave the npm ecosystem and stop using node and React. Other ecosystems can have supply chain problems, but npm is having them on a regular basis because they are a prime target, and their practices have not scaled enough to deal with this.

I have also left Cursor and gone back to VSCode because Cursor’s fork cannot install the latest version of VSCode extensions. Extensions are also part of the supply chain and can be either malware or a hacking vector, so not being able to update them is not an option for me.

My next decision was to build a dedicated machine for software development. This machine does not have my personal data or information on it. It is not logged into any personal service (like my email). I have not yet dockerized all of my dev environments on it, but that’s a likely next step.

I also limit my dependencies. Another benefit of leaving the JS ecosystem is that Python isn’t as reliant on so many tiny dependencies. I was shocked at how many dependencies React, TypeScript, and node/Express installed (I counted tens of thousands of files in node_modules), and this was before I had written one line of application code. I like the batteries-included ethos of Django and Python. Most of what I need is built in.

I have written a lot about dependencies and how each one is tech debt the moment you install it.

My final defense against supply chain problems is to have a regular dependency-updating policy. Of course, this needs to be done with a cooldown, but my main reason to do it is that ignoring dependencies makes it very hard to do something about problems in the future. The more out of date you are, the harder everything is. Regular updating will also remind you of how bad it is to have dependencies.

To make this palatable, I timebox it. It really should take less than an hour for my project. Even at Trello, it only took a few hours to update the iOS project, which we did every three weeks. You also need extensive automated test suites and time to test manually.

If updating takes longer for some reason, then the dependency causing it is now suspect. I will probably plan to remove it. If I truly need it (like Django), then I treat the painful update as a dry run for a removal project I will need to plan.

How I Learned Pointers in C

I learned C in my freshman year of college, where we used K&R as our textbook. This was 1989, so that text and our professor were my only sources of information.

But, luckily, I had been programming for about six years on a PET, TRS-80, and Commodore 64. It was on that last computer that I learned 6502 Assembly. I had been experimenting with sound generation, and I needed more performance.

This was my first instance where Knowing Assembly Language Helps a Little.

When we got to pointers in the C class, the professor described them as memory addresses of variables. That’s all I needed to know. In Assembly, memory addresses are a first-class concept. I had a book called Mapping the Commodore 64 that told you what was at each ROM address. Pointer arithmetic is a common Assembly coding task. You can’t do anything interesting without understanding addresses.

So, I guess I learned about C pointers at some point while learning 6502 Assembly. Since C maps closely to Assembly, by the time we got to pointers in class, they felt natural to me. If you are having trouble with the concept, I’d try writing simple Assembly programs. Try a 6502 emulator, not something modern. Modern instruction sets are not designed for humans to code against easily, but older ones took that into account a little more.

Using Fuzzy Logic for Decision Making

In the ’90s, I read a book about fuzzy logic that would feel quaint now in our LLM-backed AI world. The hype wasn’t as big, but the claims were similar: fuzzy logic would bring human-like products because it mapped to how humans thought.

Fuzzy logic is relatively simple. The general idea is to replace the True and False of Boolean logic with a real number between 0 (absolutely false) and 1 (absolutely true). We treat these values more like degrees of certainty.

Then, we define operations that map to AND, OR, and NOT. Generally, you’d want ones that act like their Boolean versions for the absolute cases, so that if you set your values to 1 and 0, the fuzzy logic gates act Boolean. You often see min(x, y) for AND and max(x, y) for OR (which behave this way). The NOT operator is just: fuzzy_not(x) = 1.0 - x.
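
In Python, the min/max variants are one-liners. A minimal sketch:

def fuzzy_and(x, y):
    return min(x, y)

def fuzzy_or(x, y):
    return max(x, y)

def fuzzy_not(x):
    return 1.0 - x

# At the absolute values, these act like their Boolean versions.
assert fuzzy_and(1.0, 0.0) == 0.0  # True AND False is False
assert fuzzy_or(1.0, 0.0) == 1.0   # True OR False is True
assert fuzzy_not(1.0) == 0.0       # NOT True is False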

If you want to see a game built with this logic, I wrote an article on fuzzy logic for Smashing Magazine a few years ago that showed how to do this with iOS’s fuzzy logic libraries in GameplayKit.

I thought of this today because I’m building a tool to help with decision making about technical debt, and I’m skeptical about LLMs because I’m worried about their non-determinism. I think they’ll be fine, but this problem is actually simpler.

Here’s an example. In my book I present this diagram:

Diagram showing Pay and Stay Forces

The basic idea is to score each of those items and then use those scores to make a plan (Sign up to get emails about how to score and use these forces for tech debt).

For example, one rule in my book is: if a tech debt item has high visibility (i.e. customers value it), is low in the other forces that indicate it should be paid (i.e. low volatility, resistance, and misalignment), but has at least one force indicating that it should not be paid (i.e. any of the stay forces), then it might just be a regular feature request and not really tech debt. The plan should be to put it on the regular feature backlog for your PM to decide about.

A boolean logic version of this could be:

is_feature = visible && !misaligned && !volatile && !resistant && 
              (regressions || big_size || difficult || uncertain)

But to do this, you have to pick some threshold for each value. For example, on a scale of 0-5, a visible tech debt item might be one with a 4 or 5. That’s not exactly right, though, because even an item scored as a 3 for visibility might deserve the same treatment, depending on its scores for the other forces. You could definitely write a more complex logical expression that took all of this into account, but it would be hard to understand and tune.

This is where fuzzy logic (or some other probabilistic approach) works well. Unlike LLMs, though, this approach is deterministic, which allows for easier testing and tuning (not to mention, it’s free).

To do it, you replace the operators with their fuzzy equivalents and normalize the scores to a 0.0-1.0 scale. In the end, instead of a boolean is_feature, you get something more like a probability that the recommendation is appropriate. If you build up a rules engine with a lot of these, you can use that probability to sort the responses.
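
Using the operators from the sketch above, here is a minimal fuzzy version of that rule. The force names mirror the Boolean version, and the scores in the example call are made up:

def is_feature(visible, misaligned, volatile, resistant,
               regressions, big_size, difficult, uncertain):
    # High visibility, and low in the other pay forces...
    pay = fuzzy_and(visible, fuzzy_not(misaligned))
    pay = fuzzy_and(pay, fuzzy_not(volatile))
    pay = fuzzy_and(pay, fuzzy_not(resistant))
    # ...and at least one stay force present.
    stay = fuzzy_or(fuzzy_or(regressions, big_size),
                    fuzzy_or(difficult, uncertain))
    return fuzzy_and(pay, stay)

# A 3-out-of-5 visibility score (0.6 normalized) still contributes to
# the result instead of being cut off by a threshold.
print(is_feature(0.6, 0.1, 0.2, 0.1, 0.7, 0.3, 0.2, 0.4))  # prints 0.6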

Fuzzy logic also allows you to play with the normalization and gates to accentuate some of the values over others (for tuning). You could do this with thresholds in the boolean version, but with fuzzy logic you end up with simpler code and smoother response curves.

Moving from React to HTMX

I had been building web applications in React until very recently, but now I’ve moved to HTMX. Here are the main differences so far. For reference, my React style was client-side-only web applications.

  1. Client State: In React, I used Redux, which held almost all of the client state. Some state that I needed to be sticky went into cookies, and there was some transient state in useState variables in components. Redux was essentially a cache of the server database. In HTMX, the client state is mostly in the HTML itself, in a style called Hypermedia As the Engine of Application State (HATEOAS). Right now, I do cheat a little, sending a small JavaScript object in a script tag for my d3 visualizations to use.
  2. Wire Protocol: To feed the React applications, I had been using a GraphQL API. In HTMX, we expect HTML from REST responses and even over the websocket.
  3. DOM Updates: In React, I used a classic one-way cycle. Some event in React would trigger an API call and an optimistic update to Redux. The Redux change would trigger React component re-renders. If the API call failed, I would undo the Redux update, which would re-render the undo along with any notifications. In HTMX, the HTML partials sent back from the REST request or websocket use the element’s id and HTMX custom attributes to swap out parts of the DOM.
  4. Markup Reuse: In React, the way to reuse snippets of markup is by building components. In HTMX, you do this with whatever features your server language and web framework provide. I am using Django, so I use custom template tags or simple template inclusions (see the sketch after this list). Aesthetically, I prefer JSX over the {%%} syntax in templates, but it’s not a big deal. There are other affordances for reuse in Django/Python, but those are the two I lean on the most.
  5. Debugging: In React, I mostly relied on browser developer tools, but they required me to mentally map the markup to my source. This was mostly caused by my reliance on component frameworks, like React Native Paper and Material. In HTMX, the source and the browser page are a very close match because I am using simpler markup and Bulma to style it. It’s trivial to debug my JavaScript now because it’s all just code I wrote.
  6. UI Testing: In React, I used React Testing Library to test the generated DOM. In Django, I am using the built-in testing framework to test views and the HTML generated by templates. Neither approach does a snapshot to make sure the page “looks right”. These tests are more resilient than that and make sure the content and local state are correct. I could use Playwright for both (to test browser rendering), and it would be very similar.
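
To make the markup reuse item concrete, here is a minimal sketch of a Django inclusion tag (the file, tag, and template names are hypothetical):

# myapp/templatetags/cards.py
from django import template

register = template.Library()

@register.inclusion_tag("partials/debt_card.html")
def debt_card(item):
    # Renders partials/debt_card.html with this context wherever
    # {% debt_card item %} appears in a template.
    return {"item": item}

Any template can then {% load cards %} and drop the snippet in, which is roughly the role a small presentational component played in React.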

Also, in general, I should mention that the client-server architecture of my projects is quite different. In React, I was building a fat-client, mobile-web app that I was designing so that it could work offline. The GraphQL API was purely a data-access and update layer. All application logic was on the client side. All UI updates were done optimistically (i.e. on the client side, assuming the API update would succeed), so in an offline version, I could queue up the server calls for later.

My Django/HTMX app could never be offline. There is essentially no logic on the client side. The JavaScript I write is for custom visualizations that I am building in d3. They are generic and are fed with data from the server. Their interactions are not application-specific (e.g. tooltips or filters).

This difference has more to do with what I am building, but if I needed a future where offline was possible, I would not choose HTMX (or server rendered React).

Early Thoughts on HTMX

I found out about HTMX about a year ago from a local software development friend whose judgement I trust. His use case was that he had inherited a large PHP web application with a very standard request/response, page-based UI, and he needed to add some Single Page Application (SPA) style interactions to it. He was also constrained (by the user base) from changing the application drastically.

HTMX is perfect for this. It builds on what a <form> element already does by default: it gathers up inputs, creates a POST request, and then expects an HTML response, which it renders.

The difference is that, in HTMX, any element can initiate any HTTP verb (GET, POST, DELETE, etc.), and the response can replace just that element on the page (or be Out of Band and replace a different element). This behavior extends to websockets, which can send partial HTML to be swapped in.
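
On the server side, nothing special is needed to support this. Here is a minimal Django view sketch (the model and template names are hypothetical). HTMX sends an HX-Request header, so one view can serve both the full page and the partial:

from django.shortcuts import render

from .models import Task  # hypothetical model

def task_list(request):
    tasks = Task.objects.all()
    if request.headers.get("HX-Request"):
        # HTMX swaps this fragment into the element that triggered it.
        return render(request, "partials/task_list.html", {"tasks": tasks})
    return render(request, "task_list_page.html", {"tasks": tasks})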

To use HTMX on a page, you add a script tag. There is a JavaScript API, but mostly you add custom “hx-*” attributes to elements. It’s meant to feel like HTML. I would say more, but I can’t improve on the HTMX home page, which is succinct and compelling.

My app is meant to allow users to collaboratively score and plan technical debt projects. My intention is to improve on a Google Sheet that I built for my book. So, to start, it needs to have the same collaborative ability. Every user on a team needs to see what the others are doing in real-time. HTMX’s web socket support (backed by Django channels) makes this easy.

Since the wire protocol of an HTMX websocket is just HTML partials, I can use the same template tags from the page templates to build the websocket messages. Each HTML partial has an id attribute that HTMX will use to swap it in. I can send over just the elements that have changed.
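
Here is a sketch of what that looks like as a Django Channels consumer (the class, group, template, and event names are hypothetical). The handler renders the same partial the page templates use and sends it down the socket:

from channels.generic.websocket import AsyncWebsocketConsumer
from django.template.loader import render_to_string

class BoardConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        await self.channel_layer.group_add("board", self.channel_name)
        await self.accept()

    # Called for group_send events of {"type": "score.changed", ...};
    # keep the event data plain so it serializes through the channel layer.
    async def score_changed(self, event):
        html = render_to_string("partials/score_row.html",
                                {"score": event["score"]})
        # HTMX matches the partial's id attribute and swaps it in.
        await self.send(text_data=html)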

Tomorrow, I’ll compare this to React.

Dev Stack 2025, Part X: networking

This is part of a series describing how I am changing my entire stack for developing web applications. My choices are driven by security and simplicity.

This last part about my new dev stack environment will be about how my machines are set up to work together. As I mentioned in the introduction to this series, my primary concern was to get development off of my main machine, which is now all on a Framework desktop running Ubuntu.

Before this, my setup was simple. I had a monitor plugged into my laptop over Thunderbolt, and my keyboard and mouse were attached via the USB hub the monitor provided (keyboard) or Bluetooth (mouse). When I introduced the Framework, I moved to a USB mouse in the hub, and now I can switch my whole setup from the Mac to the Framework by unplugging/plugging in one USB-C cable.

But I had a few development use cases that this didn’t support well:

  1. I sometimes need to code with someone over Zoom. My webcam, mic, and headphones are staying connected to the Mac.
  2. I regularly program outside of my office in co-working environments.
  3. I need to support programming while traveling.
  4. I want to be able to go back and forth between the machines while working at my desk.

To start with, I tried using remote desktop. There’s an official Mac client made by Microsoft, and RDP support is built into Ubuntu. As I mentioned in my Linux post, I was surprised at how hard this was to troubleshoot. The issue is that you can’t RDP to a Linux box unless it is actively connected to a monitor. So, at first, I just left the Framework plugged into the monitor while taking the laptop outside. But this was not ideal.

There are a few solutions for this, but the easiest for me was just buying a virtual HDMI plug. They are cheap and fool the machine into thinking it has a monitor.

To get RDP to work at all, though, I needed some way for the two machines to see each other. Even in my home office, I put them on different networks on my router. And I would also need to solve this for when I’m using my laptop outside of my network. This is what Tailscale was made for.

Tailscale is a VPN, but what sets it apart is its UX. You install it on the two machines, log them in to Tailscale, and now they are on a virtual private subnet. I can RDP at my desk or from a café. I can share the Mac “space” that is running RDP over Zoom. The setup was trivial.

So far this has been fine. I don’t even notice the VPN when I am coding at home. When I am outside, it’s a little sluggish, but fine. AI coding makes it more acceptable, since I don’t have to type and navigate code as much.

Dev Stack 2025, Part IX: tooling

This is part of a series describing how I am changing my entire stack for developing web applications. My choices are driven by security and simplicity.

This part will be a catch-all for VSCode extensions and other tools. Some are new to me, some came over from other projects.

  1. coverage.py – I use this on Page-o-Mat, which is also in Python.
  2. Coverage Gutters – this shows coverage right in VSCode. It works with anything that produces standard coverage files, so it works well with coverage.py. I wrote about how I use that here and in my book.
  3. mutmut – This is something I have been playing around with in Page-o-Mat because of my interest in mutation testing. I contributed a feature a few months ago, which I’ll cover later this month. I’ll be using it more seriously now.
  4. flake8 and black – for linting and auto-formatting. This is more necessary as I use AI since its style adherence isn’t perfect.

I still haven’t figured out what I will do for JS testing. I used Jest before, but it doesn’t meet my low-dependency criterion. I might have to start with this gist for TestMan.

I also need to replace my code complexity extension (it was for JS/TS). I might see how long I can last without it because the main replacements don’t have enough usage to consider installing (VSCode extensions are another hacking vector, like supply chain attacks).

Dev Stack 2025, Part VIII: uv

This is part of a series describing how I am changing my entire stack for developing web applications. My choices are driven by security and simplicity.

This one is easy. Before my latest project, I used pyenv, virtualenv, and then pip with requirements.txt for Python projects. But since I am friends with Becky Sweger and had read this post about uv, I knew better (I just hadn’t yet overcome my inertia). Starting fresh meant that I could finally get on modern tools.

I could write more about why, but I am not going to do better than Becky, so go to her blog where she has a uv category with all of her thoughts on it.

Dev Stack 2025, Part VII: SQLite

This is part of a series describing how I am changing my entire stack for developing web applications. My choices are driven by security and simplicity.

Since Django uses an ORM, switching between databases is relatively easy. I usually pick MySQL, but I’m going to see how far I can get with SQLite.

The project I am working on is to make a better version of the tech debt spreadsheet that I share in my book (sign up for the email list to get a link and a guide for using it). The app is very likely to be open-source and to start out as something you host yourself. So, I think SQLite will be fine, but if it ever gets to the point where it won’t work, then switching to MySQL or Postgres shouldn’t be that hard. My DB needs are simple and well within the Django ORM’s capabilities.

Even if I decide to host a version, I might decide on a DB-per-tenant model, which might be ok for SQLite. Another possibility is that it becomes something in the Jira Marketplace; in that case, I’d have to rewrite the backend to use Jira for storage, but that wouldn’t be that bad because (given the Jira data model) I would only need to add some custom fields to an issue. Most of the app at that point would be the visualizations and an expert system.

One nice thing about SQLite is that it’s trivial to host. It’s just a few files (with WAL mode). It’s also trivial to run unit tests against during development. You can do it in-memory, which is what Django testing does by default. I can also run those test suites against more powerful databases to make sure everything works with them too.
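
One caveat: Django does not turn on WAL mode for you. A sketch of one way to do it (my assumption, not something Django requires) is to issue the PRAGMA whenever a connection is created:

from django.db.backends.signals import connection_created
from django.dispatch import receiver

@receiver(connection_created)
def enable_wal(sender, connection, **kwargs):
    # Only applies to the SQLite backend; other vendors are left alone.
    if connection.vendor == "sqlite":
        with connection.cursor() as cursor:
            cursor.execute("PRAGMA journal_mode=WAL;")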

One portability issue is that if I get used to running against SQLite, I will probably not notice performance issues. Since SQLite is just some local files, it’s incredibly fast. You can feel free to do lots of little queries to service a request and not notice any latency issues. The same style over a network, potentially to a different datacenter, won’t work as well.

But I have seen enough evidence of production SaaS products using SQLite that I think I can get to hundreds of teams without worrying too much. I would love to have a performance problem at that point.

In my book, I talk about how technical debt is the result of making correct decisions and then having wild success (which invalidates those choices). I don’t like calling these decisions “shortcuts” because that word is used as a pejorative in this context. Instead, I argue that planning for the future might have prevented the success. If this project is successful, it’s likely that SQLite won’t be part of it anymore, but right now it’s enabling me to get to a first version, and that’s good enough.