Category Archives: AI

Limiting the Chance of Code Agent Prompt Injections

Yesterday, I wrote about the Lethal Trifecta when using coding agents and how I am escaping it via sandboxing. I built a place to code where there is nothing valuable to lose. The agents might be poisoned by prompt injection and able to phone home, but there’s nothing to send. I can wipe the entire VM at any time and rebuild it from a snapshot or from scratch easily.

This deals with one leg of the trifecta, which is sufficient, but I don’t ignore the other two.

To limit the chance of an agent being exposed to a prompt injection, I build on an architecture with very few dependencies. My current project is building visualizations in JavaScript with D3. I only include D3 on pages in the browser (it’s not installed on my machine). I don’t use npm, and I have no other dependencies.

The thing I miss most is Jest, but I decided to build a minimal testing framework (I just need to run functions and make assertions). I run the tests in a browser, so I also get access to a DOM, which I can test against. All of the code for this project only makes sense inside a web page in the browser, which is another sandbox. It’s like Inception up in here.
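
A minimal framework like that can be little more than a registry of named functions plus an assertion helper. Here is a sketch of one way it could look; the names (test, assertEqual, runTests) are my own inventions, not from the actual project:

```javascript
// Minimal test "framework": register tests, run them, report results.
const tests = [];

function test(name, fn) {
  tests.push({ name, fn });
}

function assertEqual(actual, expected, message = "") {
  if (actual !== expected) {
    throw new Error(`${message} expected ${expected}, got ${actual}`);
  }
}

function runTests() {
  let passed = 0;
  for (const { name, fn } of tests) {
    try {
      fn();
      passed++;
      console.log(`PASS ${name}`);
    } catch (err) {
      console.log(`FAIL ${name}: ${err.message}`);
    }
  }
  return { passed, failed: tests.length - passed };
}
```

Running it in a browser page instead of Node is what gets you the free DOM to assert against.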

My other projects are Python-based and live in their own VM. I need some dependencies there (pandas, NumPy, Matplotlib, and more). The main thing I am doing is keeping that VM separate from the visualization project so that an issue in one doesn’t affect the other.

Nothing else that I need for the project (that I didn’t create) lives in that VM.

My main exposure to untrusted text is that I let the agent browse the web. I don’t see how I could avoid this, which is why this leg of the trifecta could never be the one I eliminate.

Escaping the Lethal Trifecta of AI Agents

The “Lethal Trifecta” is a term coined by Simon Willison for the observation that you are open to an attacker stealing your data using your own AI agent if that agent has:

  1. Access to your private data
  2. Exposure to untrusted content
  3. The ability to communicate externally

You need all three to be vulnerable, but Claw or coding agents will have them by default. I would say that the latter two are almost impossible to stop.

#2 Untrusted content includes all of your incoming email and messages, all documents you didn’t write, all packages you have downloaded (via pip, npm, or whatever) and every web page you let the agent read. I have no idea how to make an agent useful without some of these (especially web searching).

#3 External communication includes any API call you let it make, embedded images in responses, or just letting it read the web. Even if you whitelist domains, agents have found ways to piggyback communication because many URLs/APIs have a way of embedding a follow-up URL inside of them.

For my uses, I find it impossible to avoid these two. Reduce? Yes, but not eliminate.

So, my only chance to escape the trifecta is to not give agents access to my private data. This means that I would never let an agent process my email or messages. I also would never run agents on my personal laptop, and I would never let them log in to a service as me.

This is why I built hardware and software sandboxes to code in. Inside a VM on a dedicated machine, there is no private data at all. I use it while assuming that all code inside that VM is untrusted and that my agent is compromised. I do my best to try to make sure that won’t happen, but my main concern is that there is no harm if it does happen.

Incidentally, this same lethal trifecta also applies to every package you install into your coding projects. If an npm package (1) can read your secrets, (2) is untrusted, and (3) can communicate externally, then you may suffer from a supply chain attack. It’s obvious that code you install and run makes #2 and #3 impossible to safeguard against. Not having secrets in the VM is the best solution for supply chain attacks too.

Tomorrow, I’ll follow up with how I reduce the other two legs of the lethal trifecta.

What Makes a Good First Vibe Coding Project

Code can be dangerous to run. It could have security issues. It could leak secrets. If you don’t know what you are doing yet, vibe coding is a good way to encounter those problems fast.

Here are some aspects of a project that make it a good one for learning how to vibe code. They won’t make it perfectly safe (no code that you don’t read could be), but here’s where to start.

  1. It is a tool that only you will use
  2. It doesn’t need to deploy code to a server that is exposed to the public Internet
  3. It doesn’t need access to any services that require authentication
  4. It is meant to be a prototype
  5. It runs client-side only, in a sandbox. For example: a 100% in-browser JavaScript app or a mobile app.

Games fit most of these.

I’ve been having fun with my nephew writing JavaScript games using PhaserJS. Agents seem to know this library well and we almost never need to look at the code. The games run in a sandbox (the browser) and don’t require any server-side code (that could be hacked).

Joke Templates

Back in the nineties, I was interviewing someone and he mentioned the idea of joke templates. I can’t remember his example, but when I told my boss, he said, “Oh yeah, I love the one where someone says a number and then you multiply it by seven and say it’s that many in dog years.”

My favorite joke template is the two problems one. I think it was originally: “You have a problem and you think, ‘I know: I’ll use a regular expression’ — now you have two problems.”

I’m a sucker for any variant on this. I just posted this to LinkedIn:

You have a problem with your AI code generator undoing its own work when you add something new, so … “I know,” you think, “I’ll add another LLM to check the code of the first one.” Now you have two problems.

RegExes aren’t AI, but it felt that way sometimes because they are so good at what they are good at. But, just like LLMs, they suck at what they suck at. Generally, LLMs are great at finding answers that have an objective, verifiable truth, but they are not good at knowing a secret fact. The key is to provide the facts and the ability to verify, and then let the LLM iterate to the solution.

Vibe Coding vs. Vibe Engineering

I try to use Vibe Coding in Andrej Karpathy’s original sense:

There’s a new kind of coding I call “vibe coding”, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.

Which makes it hard to describe what I do, which is not that. I have been calling it AI-Assisted Programming, but that’s too long. Simon Willison proposed Vibe Engineering:

I feel like vibe coding is pretty well established now as covering the fast, loose and irresponsible way of building software with AI—entirely prompt-driven, and with no attention paid to how the code actually works. This leaves us with a terminology gap: what should we call the other end of the spectrum, where seasoned professionals accelerate their work with LLMs while staying proudly and confidently accountable for the software they produce?

I propose we call this vibe engineering, with my tongue only partially in my cheek.

He wrote this in October, but it only started to sink in with me recently when he wrote about JustHTML and how it was created. Read the author, Emil Stenström’s, account of how he wrote it with coding agents. This is not vibe coding. He is very much in the loop. I think his method will produce well-architected code with minimal tech debt. Like I said in my book: “The amount of tech debt the AI introduces into my project is up to me.” I think this is true for Emil too.

My personal workflow is to go commit by commit, because it’s the amount of code I can review. But, I see the benefit of Emil’s approach and will try it soon.

Protecting Myself

I was recently on the Scrum Master Toolbox podcast with Vasco Duarte in a series about AI assisted coding. In it, I said that I read every line of AI generated code (and fix it, if necessary) before I commit it.

This isn’t exactly right.

I read all code, even my own, and fix it before I commit. Doing it for AI is just an extension of how I normally work. I do this for my code because I often find problems. This is even more true for AI generated code.

Another reason to do this is that it makes code go through code review and testing faster. I have written about that previously:

Now that I am the only programmer on my project, I don’t need to worry about code review, but I do have to worry about DevOps, and frankly, I am not willing to trust AI to write code that I have to run. I have already fixed code that introduced beginner level security problems, so pure Vibe Coding on a project meant to be used by others on the web is not an option for me.

Moats and Fast Follow For Vibe Coded Projects

I wrote about how, at Atalasoft, I told my engineers to Be Happy When It’s Hard. Be Worried When It’s Easy. We competed against open-source and in-house solutions. When we found valuable problems that were hard to solve, I was relieved. The same is true for vibe coded solutions.

If you can create a valuable app in two weeks, then so could a competitor. If your secret sauce is your idea, then that’s hard to protect if you want people to use your app. We don’t even know if AI generated code is copyrightable, so it’s very unlikely to be patentable (i.e. inventors must be humans).

Here are three things you could do:

  1. Keep building on the idea – right now, someone following you has the benefit of seeing your solution and feeding that to the AI. So, it helps if you can keep building on the idea and hope they can’t keep up. If you do the minimum, the bar is too low.
  2. Build on secret data – once you have a working system, the biggest moat you have is the data inside the system. AI can’t see that or reproduce it from scratch. Build new (valuable) features that require secret data to work. This doesn’t need to be used as training data. This is like a network effect, but more direct and long-lasting.
  3. Use your unique advantages – If your app is a simple UI on CRUD operations, then it can be reproduced by anyone. But let’s say you have a personal brand in a space. Can you make an app that builds on it? Do you have access to hard-to-win customers? A mailing list, subscribers, etc.? Fast followers might be able to recreate your software, but your audience won’t care if they only trust you.

Of these, I am relying mostly on the last one. The software I am working on is an extension of Swimming in Tech Debt. It takes the spreadsheet that I share in Part 3 and builds on it with better visualizations than the built-in ones. Someone could clone this, I guess, but probably they would need to reference my book in order to explain it. I am indifferent to whose software they use if this is true.

Using Fuzzy Logic for Decision Making

In the ’90s, I read a book about fuzzy logic that would feel quaint now in our LLM-backed AI world. The hype wasn’t as big, but the claims were similar: fuzzy logic would bring human-like products because it mapped to how humans think.

Fuzzy logic is relatively simple. The general idea is to replace the True and False of Boolean logic with a real number between 0 (absolutely false) and 1 (absolutely true). We can think of these values as degrees of certainty.

Then, we define operations that map to AND, OR, and NOT. Generally, you’d want ones that act like their Boolean versions for the absolute cases, so that if you set your values to 1 and 0, the Fuzzy logic gates would act Boolean. You often see min(x, y) for AND and max(x, y) for OR (which behave this way). The NOT operator is just: fuzzy_not(x) => 1.0 - x.
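
In code, these operators are one-liners (the function names are mine):

```javascript
// Fuzzy truth values are numbers in [0, 1].
const fuzzyAnd = (x, y) => Math.min(x, y);
const fuzzyOr = (x, y) => Math.max(x, y);
const fuzzyNot = (x) => 1.0 - x;

// At the extremes, they behave like their Boolean counterparts:
fuzzyAnd(1, 0); // 0, like true && false
fuzzyOr(1, 0);  // 1, like true || false
fuzzyNot(1);    // 0, like !true

// In between, they degrade gracefully:
fuzzyAnd(0.7, 0.4); // 0.4 - a conjunction is only as certain as its weakest part
```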

If you want to see a game built with this logic, I wrote an article on fuzzy logic for Smashing Magazine a few years ago that showed how to do this with iOS’s fuzzy logic libraries in GameplayKit.

I thought of this today because I’m building a tool to help with decision making about technical debt, and I’m skeptical about LLMs because I’m worried about their non-determinism. I think they’ll be fine, but this problem is actually simpler.

Here’s an example. In my book I present this diagram:

Diagram showing Pay and Stay Forces

The basic idea is to score each of those items and then use those scores to make a plan (Sign up to get emails about how to score and use these forces for tech debt).

For example, one rule in my book is: if a tech debt item has high visibility (i.e. customers value it), is low in the other forces that indicate it should be paid (low volatility, resistance, and misalignment), but has some force indicating that it should not be paid (any of the stay forces), then it might just be a regular feature request and not really tech debt. The plan should be to put it on the regular feature backlog for your PM to decide about.

A boolean logic version of this could be:

is_feature = visible && !misaligned && !volatile && !resistant && 
              (regressions || big_size || difficult || uncertain)

But if you did this, you would have to pick a threshold for each value. For example, on a scale of 0-5, a visible tech debt item would be one with a 4 or 5. But that’s not exactly right, because even an item scored as a 3 for visibility should be treated this way, depending on the specific scores it got for the other values. You could definitely write a more complex logical expression that took all of this into account, but it would be hard to understand and tune.

This is where fuzzy logic (or some other probabilistic approach) works well. Unlike LLMs, this approach is deterministic, which allows for easier testing and tuning (not to mention, it’s free).

To do it, you replace the operators with their fuzzy equivalents and normalize the scores to a 0.0-1.0 scale. In the end, instead of a Boolean is_feature, you get something more like a probability that the recommendation is appropriate. If you build up a rules engine with a lot of these, you can use that probability to sort the responses.
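
For example, the is_feature rule could be rewritten with min for AND, max for OR, and 1 - x for NOT. The scores below are made-up example values, already normalized from the 0-5 scale:

```javascript
// Variadic fuzzy gates over values in [0, 1].
const fuzzyAnd = (...xs) => Math.min(...xs);
const fuzzyOr = (...xs) => Math.max(...xs);
const fuzzyNot = (x) => 1.0 - x;

// Hypothetical item scores, normalized from a 0-5 scale to 0.0-1.0.
const scores = {
  visible: 0.8, misaligned: 0.2, volatile: 0.1, resistant: 0.2,
  regressions: 0.6, bigSize: 0.3, difficult: 0.4, uncertain: 0.2,
};

// Same shape as the Boolean rule, but no thresholds needed.
const isFeature = fuzzyAnd(
  scores.visible,
  fuzzyNot(scores.misaligned),
  fuzzyNot(scores.volatile),
  fuzzyNot(scores.resistant),
  fuzzyOr(scores.regressions, scores.bigSize, scores.difficult, scores.uncertain)
);
// isFeature is a degree of confidence (0.6 for these scores), not a hard yes/no.
```

An item scored 3/5 on visibility just contributes 0.6 to the conjunction instead of being cut off by a threshold.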

Fuzzy logic also lets you play with the normalization and the gates to accentuate some values over others (for tuning). You could do this with thresholds in the Boolean version, but with fuzzy logic you end up with simpler code and smoother response curves.
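
One classic tuning tool is the fuzzy “hedge”: raising a value to a power accentuates or softens it before it goes through the gates. A sketch (the example values are mine):

```javascript
// Classic fuzzy hedges: "very x" sharpens a value, "somewhat x" softens it.
const very = (x) => Math.pow(x, 2);
const somewhat = (x) => Math.sqrt(x);

very(0.8);      // about 0.64 - demands stronger evidence to score high
somewhat(0.25); // 0.5 - lets weaker evidence count for more
```

Applying `very` to the visibility score, for instance, would make the feature-request rule fire confidently only for items that customers clearly care about, without touching any of the other terms.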

Make a Programmer, Not a Program

From what I have seen, pure vibe coding isn’t good enough to produce production software that is deployed to the public web. That is hard enough for humans. Even though nearly every major security incident or outage was caused by people, it’s clear that that’s just because we haven’t been deploying purely vibe coded programs at scale.

But it’s undeniable that vibe coding is useful, and it would be great if we could take it all the way to launch. Until then, it’s up to the non-programming vibe coder to level up and close the gap. Luckily, the same tools they use to make programs can also be used to make them into programmers.

Here’s what I suggest: Try asking for very small updates and then reading just that difference. In Replit, you would go to the git tab and click the last commit to see what changed. Then, read what the agent actually said about what it did. See if you can make a very related change yourself. For example, getting spacing exactly right or experimenting with different colors by updating the code yourself.

Do this to get comfortable reading the diffs and to eventually be able to read the code. The next step would be being able to notice that code is wrong, which is most of what I do these days.

Describing Tech Debt to Vibe Coders

In my book, Swimming in Tech Debt, I write that I don’t think we (engineers) should be explaining tech debt to our non-engineering peers. But that only applies to our tech debt (because it’s boring). Now that they are vibe coding, I do want them to understand their own.

(If you are reading this in August 2025, my book is probably still at its pre-sale price of $0.99.)

I talk to a lot of vibe coders who are running into the problems caused by tech debt in their projects. They don’t and can’t read code, so my definition of tech debt is hard to convey to them. But I’ve come up with an analogy that I think works.

Imagine that I “vibe design” a concert poster. I go to DALL-E, give it a prompt, and it generates an image for me. I look at it and think it’s 80% of the way there, but I want to make changes. So, I prompt again with more details and it gets closer. I try again, and again, and again, but as I go on, I start to see that some of the things that were right in early versions are gone now. I think to myself, maybe I should take the best version and try to fix it myself in a design tool.

But then I run into a problem. DALL-E generated pixels, not a design file. It doesn’t have layers. It’s not even using fonts and text components. I just want to rotate a background shape a few degrees and fix a typo, but that’s not possible. Or what if, instead of an InDesign file, it could only generate PageMaker files? They would be organized perfectly, but in an older technology that I can’t use.

Changes that should be easy are hard (or impossible). Choices that were sane then don’t make sense today. All of the aspects of this digital file that are hard to change are very similar to what coders experience with tech debt. It only matters if you want to make changes. It’s the resistance you feel when you try.

The irony is that the same things that make it hard for us make it hard for the AI too. I can’t tell it to rotate a red triangle in the background because there is no triangle there, just a bunch of pixels. It can’t fix the typo because there aren’t any letters. If it had generated a sane representation, we wouldn’t need to look at it, because it might have been able to change it for us.