Category Archives: AI

Limiting the Chance of Code Agent Prompt Injections

Yesterday, I wrote about the Lethal Trifecta when using coding agents and how I am escaping it via sandboxing. I built a place to code where there is nothing valuable to lose. The agents might be poisoned by prompt injection and able to phone home, but there’s nothing to send. I can wipe the entire VM at any time and rebuild it from a snapshot or from scratch easily.

This deals with one leg of the trifecta, which is sufficient, but I don’t ignore the other two.

To limit the chance of an agent being exposed to a prompt injection, I build on an architecture with very few dependencies. My current project is building visualizations in JavaScript with D3. I only include D3 on pages in the browser (it’s not installed on my machine). I don’t use npm, and I have no other dependencies.

The thing I miss most is Jest, but I decided to build a minimal testing framework (I just need to run functions and make assertions). I run the tests in a browser, so I also get access to a DOM, which I can test against. All of the code for this project only makes sense inside a web page in the browser, which is another sandbox. It’s like Inception up in here.
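
A minimal framework like that can be little more than a registry of named functions plus an assertion helper. Here is a sketch of one way it could look; the names (test, assertEqual, runTests) are my own inventions, not from the actual project:

```javascript
// Minimal test "framework": register tests, run them, report results.
const tests = [];

function test(name, fn) {
  tests.push({ name, fn });
}

function assertEqual(actual, expected, message = "") {
  if (actual !== expected) {
    throw new Error(`${message} expected ${expected}, got ${actual}`);
  }
}

function runTests() {
  let passed = 0;
  for (const { name, fn } of tests) {
    try {
      fn();
      passed++;
      console.log(`PASS ${name}`);
    } catch (err) {
      console.log(`FAIL ${name}: ${err.message}`);
    }
  }
  return { passed, failed: tests.length - passed };
}
```

Running it in a browser page instead of Node is what gets you the free DOM to assert against.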

My other projects are Python-based and live in their own VM. I need some dependencies there (pandas, NumPy, Matplotlib, and more). The main thing I am doing is keeping that VM separate from the visualization project so that an issue in one doesn’t affect the other.

Nothing else that I need for the project (that I didn’t create) lives in that VM.

My main exposure to untrusted text is that I let the agent browse the web. I don’t see how I could avoid this, which is why this leg of the trifecta could never be the one I eliminate.

Escaping the Lethal Trifecta of AI Agents

The “Lethal Trifecta” is a term coined by Simon Willison for the observation that you are open to an attacker stealing your data using your own AI agent if that agent has:

  1. Access to your private data
  2. Exposure to untrusted content
  3. The ability to communicate externally

You need all three to be vulnerable, but Claw or coding agents will have them by default. I would say that the latter two are almost impossible to stop.

#2 Untrusted content includes all of your incoming email and messages, all documents you didn’t write, all packages you have downloaded (via pip, npm, or whatever) and every web page you let the agent read. I have no idea how to make an agent useful without some of these (especially web searching).

#3 External communication includes any API call you let it make, embedded images in responses, or just letting it read the web. Even if you whitelist domains, agents have found ways to piggyback communication because many URLs/APIs have a way of embedding a follow-up URL inside of them.

For my uses, I find it impossible to avoid these two. Reduce? Yes, but not eliminate.

So, my only chance to escape the trifecta is to not give agents access to my private data. This means that I would never let an agent process my email or messages. I also would never run agents on my personal laptop, and I would never let them log in to a service as me.

This is why I built hardware and software sandboxes to code in. Inside a VM on a dedicated machine, there is no private data at all. I use it while assuming that all code inside that VM is untrusted and that my agent is compromised. I do my best to try to make sure that won’t happen, but my main concern is that there is no harm if it does happen.

Incidentally, this same lethal trifecta also applies to every package you install into your coding projects. If an npm package (1) can read your secrets, (2) is untrusted, and (3) can communicate externally, then you may suffer from a supply chain attack. It’s obvious that code you install and run makes #2 and #3 impossible to safeguard against. Not having secrets in the VM is the best solution for supply chain attacks too.

Tomorrow, I’ll follow up with how I reduce the other two legs of the lethal trifecta.

What Makes a Good First Vibe Coding Project

Code can be dangerous to run. It could have security issues. It could leak secrets. If you don’t know what you are doing yet, vibe coding is a good way to encounter those problems fast.

Here are some aspects of a project that make it a good one for learning how to vibe code. They won’t make it perfectly safe (no code that you don’t read could be), but here’s where to start.

  1. It is a tool that only you will use
  2. It doesn’t need to deploy code to a server that is exposed to the public Internet
  3. It doesn’t need access to any services that require authentication
  4. It is meant to be a prototype
  5. It runs client-side only, in a sandbox. For example: a 100% in-browser JavaScript app or a mobile app.

Games fit most of these.

I’ve been having fun with my nephew writing JavaScript games using PhaserJS. Agents seem to know this library well and we almost never need to look at the code. The games run in a sandbox (the browser) and don’t require any server-side code (that could be hacked).

Joke Templates

Back in the nineties, I was interviewing someone and he mentioned the idea of joke templates. I can’t remember his example, but when I told my boss, he said, “Oh yeah, I love the one where someone says a number and then you multiply it by seven and say it’s that many in dog years.”

My favorite joke template is the two problems one. I think it was originally: “You have a problem and you think, ‘I know: I’ll use a regular expression’ — now you have two problems.”

I’m a sucker for any variant on this. I just posted this to LinkedIn:

You have a problem with your AI code generator undoing its own work when you add something new, so … “I know,” you think, “I’ll add another LLM to check the code of the first one.” Now you have two problems.

RegExes aren’t AI, but it felt that way sometimes because they are so good at what they are good at. But, just like LLMs, they suck at what they suck at. Generally, LLMs are great at finding answers that have an objective, verifiable truth, but they are not good at knowing a secret fact. The key is to provide the facts and the ability to verify, and then let the LLM iterate to the solution.

Vibe Coding vs. Vibe Engineering

I try to use Vibe Coding in Andrej Karpathy’s original sense:

There’s a new kind of coding I call “vibe coding”, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.

Which makes it hard to describe what I do, which is not that. I have been calling it AI-Assisted Programming, but that’s too long. Simon Willison proposed Vibe Engineering:

I feel like vibe coding is pretty well established now as covering the fast, loose and irresponsible way of building software with AI—entirely prompt-driven, and with no attention paid to how the code actually works. This leaves us with a terminology gap: what should we call the other end of the spectrum, where seasoned professionals accelerate their work with LLMs while staying proudly and confidently accountable for the software they produce?

I propose we call this vibe engineering, with my tongue only partially in my cheek.

He wrote this in October, but it only started to sink in with me recently when he wrote about JustHTML and how it was created. Read the author, Emil Stenström’s, account of how he wrote it with coding agents. This is not vibe coding. He is very much in the loop. I think his method will produce well-architected code with minimal tech debt. Like I said in my book: “The amount of tech debt the AI introduces into my project is up to me.” I think this is true for Emil too.

My personal workflow is to go commit by commit, because it’s the amount of code I can review. But, I see the benefit of Emil’s approach and will try it soon.

Protecting Myself

I was recently on the Scrum Master Toolbox podcast with Vasco Duarte in a series about AI assisted coding. In it, I said that I read every line of AI generated code (and fix it, if necessary) before I commit it.

This isn’t exactly right.

I read all code, even my own, and fix it before I commit. Doing it for AI is just an extension of how I normally work. I do this for my code because I often find problems. This is even more true for AI generated code.

Another reason to do this is that it makes code go through code review and testing faster. I have written about that previously:

Now that I am the only programmer on my project, I don’t need to worry about code review, but I do have to worry about DevOps, and frankly, I am not willing to trust AI to write code that I have to run. I have already fixed code that introduced beginner level security problems, so pure Vibe Coding on a project meant to be used by others on the web is not an option for me.

Moats and Fast Follow For Vibe Coded Projects

I wrote about how, at Atalasoft, I told my engineers to Be Happy When It’s Hard. Be Worried When It’s Easy. We competed against open-source and in-house solutions. When we found valuable problems that were hard to solve, I was relieved. The same is true for vibe coded solutions.

If you can create a valuable app in two weeks, then so could a competitor. If your secret sauce is your idea, then that’s hard to protect if you want people to use your app. We don’t even know if AI generated code is copyrightable, so it’s very unlikely to be patentable (i.e. inventors must be humans).

Here are three things you could do:

  1. Keep building on the idea – right now, someone following you has the benefit of seeing your solution and feeding that to the AI. So, it helps if you can keep building on the idea and hope they can’t keep up. If you do the minimum, the bar is too low.
  2. Build on secret data – once you have a working system, the biggest moat you have is the data inside the system. AI can’t see that or reproduce it from scratch. Build new (valuable) features that require secret data to work. This doesn’t need to be used as training data. This is like a network effect, but more direct and long-lasting.
  3. Use your unique advantages – If your app is a simple UI on CRUD operations, then it can be reproduced by anyone. But let’s say you have a personal brand in a space. Can you make an app that builds on it? Do you have access to hard-to-win customers? A mailing list, subscribers, etc.? Fast followers might be able to recreate your software, but your audience won’t care if they only trust you.

Of these, I am relying mostly on the last one. The software I am working on is an extension of Swimming in Tech Debt. It takes the spreadsheet that I share in Part 3 and builds on it with better visualizations than the built-in ones. Someone could clone this, I guess, but probably they would need to reference my book in order to explain it. I am indifferent to whose software they use if this is true.

Using Fuzzy Logic for Decision Making

In the ’90s, I read a book about fuzzy logic that would feel quaint now in our LLM-backed AI world. The hype wasn’t as big, but the claims were similar: fuzzy logic would bring human-like products because it mapped to how humans think.

Fuzzy logic is relatively simple. The general idea is to replace the True and False of Boolean logic with a real number between 0 (absolutely false) and 1 (absolutely true). We can think of these values as degrees of certainty.

Then, we define operations that map to AND, OR, and NOT. Generally, you’d want ones that act like their Boolean versions for the absolute cases, so that if you set your values to 1 and 0, the Fuzzy logic gates would act Boolean. You often see min(x, y) for AND and max(x, y) for OR (which behave this way). The NOT operator is just: fuzzy_not(x) => 1.0 - x.
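
In code, these operators are one-liners (the function names are mine):

```javascript
// Fuzzy truth values are numbers in [0, 1].
const fuzzyAnd = (x, y) => Math.min(x, y);
const fuzzyOr = (x, y) => Math.max(x, y);
const fuzzyNot = (x) => 1.0 - x;

// At the extremes, they behave like their Boolean counterparts:
fuzzyAnd(1, 0); // 0, like true && false
fuzzyOr(1, 0);  // 1, like true || false
fuzzyNot(1);    // 0, like !true

// In between, they degrade gracefully:
fuzzyAnd(0.7, 0.4); // 0.4 - a conjunction is only as certain as its weakest part
```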

If you want to see a game built with this logic, I wrote an article on fuzzy logic for Smashing Magazine a few years ago that showed how to do this with iOS’s fuzzy logic libraries in GameplayKit.

I thought of this today because I’m building a tool to help with decision making about technical debt, and I’m skeptical about LLMs because I’m worried about their non-determinism. I think they’ll be fine, but this problem is actually simpler.

Here’s an example. In my book I present this diagram:

Diagram showing Pay and Stay Forces

The basic idea is to score each of those items and then use those scores to make a plan (Sign up to get emails about how to score and use these forces for tech debt).

For example, one rule in my book is: if a tech debt item has high visibility (i.e. customers value it), is low in the other forces that indicate it should be paid (low volatility, resistance, and misalignment), but has some force indicating that it should not be paid (any of the stay forces), then it might just be a regular feature request and not really tech debt. The plan should be to put it on the regular feature backlog for your PM to decide about.

A boolean logic version of this could be:

is_feature = visible && !misaligned && !volatile && !resistant && 
              (regressions || big_size || difficult || uncertain)

But if you did this, you would have to pick a threshold for each value. For example, on a scale of 0-5, a visible tech debt item would be one with a 4 or 5. But that’s not exactly right, because even an item scored as a 3 for visibility should be treated this way, depending on the specific scores it got for the other values. You could definitely write a more complex logical expression that took all of this into account, but it would be hard to understand and tune.

This is where fuzzy logic (or some other probabilistic approach) works well. Unlike LLMs, this approach is deterministic, which allows for easier testing and tuning (not to mention, it’s free).

To do it, you replace the operators with their fuzzy equivalents and normalize the scores to a 0.0-1.0 scale. In the end, instead of a Boolean is_feature, you get something more like a probability that the recommendation is appropriate. If you build up a rules engine with a lot of these, you can use that probability to sort the responses.
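
For example, the is_feature rule could be rewritten with min for AND, max for OR, and 1 - x for NOT. The scores below are made-up example values, already normalized from the 0-5 scale:

```javascript
// Variadic fuzzy gates over values in [0, 1].
const fuzzyAnd = (...xs) => Math.min(...xs);
const fuzzyOr = (...xs) => Math.max(...xs);
const fuzzyNot = (x) => 1.0 - x;

// Hypothetical item scores, normalized from a 0-5 scale to 0.0-1.0.
const scores = {
  visible: 0.8, misaligned: 0.2, volatile: 0.1, resistant: 0.2,
  regressions: 0.6, bigSize: 0.3, difficult: 0.4, uncertain: 0.2,
};

// Same shape as the Boolean rule, but no thresholds needed.
const isFeature = fuzzyAnd(
  scores.visible,
  fuzzyNot(scores.misaligned),
  fuzzyNot(scores.volatile),
  fuzzyNot(scores.resistant),
  fuzzyOr(scores.regressions, scores.bigSize, scores.difficult, scores.uncertain)
);
// isFeature is a degree of confidence (0.6 for these scores), not a hard yes/no.
```

An item scored 3/5 on visibility just contributes 0.6 to the conjunction instead of being cut off by a threshold.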

Fuzzy logic also lets you play with the normalization and the gates to accentuate some values over others (for tuning). You could do this with thresholds in the Boolean version, but with fuzzy logic you end up with simpler code and smoother response curves.
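
One classic tuning tool is the fuzzy “hedge”: raising a value to a power accentuates or softens it before it goes through the gates. A sketch (the example values are mine):

```javascript
// Classic fuzzy hedges: "very x" sharpens a value, "somewhat x" softens it.
const very = (x) => Math.pow(x, 2);
const somewhat = (x) => Math.sqrt(x);

very(0.8);      // about 0.64 - demands stronger evidence to score high
somewhat(0.25); // 0.5 - lets weaker evidence count for more
```

Applying `very` to the visibility score, for instance, would make the feature-request rule fire confidently only for items that customers clearly care about, without touching any of the other terms.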

Make a Programmer, Not a Program

From what I have seen, pure vibe coding isn’t good enough to produce production software that is deployed to the public web. That is hard enough for humans. Even though nearly every major security incident or outage was caused by people, it’s clear that that’s just because we haven’t been deploying purely vibe coded programs at scale.

But it’s undeniable that vibe coding is useful, and it would be great if we could take it all the way to launch. Until then, it’s up to the non-programming vibe coder to level up and close the gap. Luckily, the same tools they use to make programs can also be used to make them into programmers.

Here’s what I suggest: Try asking for very small updates and then reading just that difference. In Replit, you would go to the git tab and click the last commit to see what changed. Then, read what the agent actually said about what it did. See if you can make a very related change yourself. For example, getting spacing exactly right or experimenting with different colors by updating the code yourself.

Do this to get comfortable reading the diffs and to eventually be able to read the code. The next step would be being able to notice that code is wrong, which is most of what I do these days.

Describing Tech Debt to Vibe Coders

In my book, Swimming in Tech Debt, I write that I don’t think we (engineers) should be explaining tech debt to our non-engineering peers. But that only applies to our tech debt (because it’s boring). Now that they are vibe coding, I do want them to understand their own.

(If you are reading this in August 2025, my book is probably still at its pre-sale price of $0.99.)

I talk to a lot of vibe coders who are running into the problems caused by tech debt in their projects. They don’t and can’t read code, so my definition of tech debt is hard to convey to them. But I’ve come up with an analogy that I think works.

Imagine that I “vibe design” a concert poster. I go to DALL-E, give it a prompt, and it generates an image for me. I look at it and think it’s 80% of the way there, but I want to make changes. So, I prompt again with more details and it gets closer. I try again, and again, and again, but as I go on, I start to see that some of the things that were right in early versions are gone now. I think to myself, maybe I should take the best version and try to fix it myself in a design tool.

But then I run into a problem. DALL-E generated pixels, not a design file. It doesn’t have layers. It’s not even using fonts and text components. I just want to rotate a background shape a few degrees and fix a typo, but that’s not possible. Or what if, instead of an InDesign file, it could only generate PageMaker files? They would be organized perfectly, but in an older technology that I can’t use.

Changes that should be easy are hard (or impossible). Choices that were sane then don’t make sense today. All of the aspects of this digital file that are hard to change are very similar to what coders experience with tech debt. It only matters if you want to make changes. It’s the resistance you feel when you try.

The irony is that the same things that make it hard for us make it hard for the AI too. I can’t tell it to rotate a red triangle in the background because there is no triangle there, just a bunch of pixels. It can’t fix the typo because there aren’t any letters. If it had generated a sane representation, we wouldn’t need to look at it, because it might have been able to change it for us.