Author Archives: Lou Franco

Prompt Engineering is a Dead End

LLM chatbots are bad at some things. Some of this is intentional. For example, we don’t want chatbots to generate hate speech. But some things are definitely not intentional, like when they make stuff up. Chatbots also fail at writing non-generic text. It’s amazing that they can write coherent text at all, but they can’t compete with good writers.

To get around some of these limitations, we have invented a field called “prompt engineering”, which uses convoluted requests to get the chatbot to do something it doesn’t do well (by design or not). For example, LLM hackers have created DAN prompts that jailbreak the AI out of its own safety net. We have also seen the leaked prompts that the AI companies use to set up the safety net in the first place. Outside of safety features, prompt engineers have found clever ways of getting the LLM to question its own assertions of fact, making it less likely to hallucinate.

Based on the success of these prompts, it looks like a new field is emerging. We’re starting to see job openings for prompt engineers. YouTube keeps recommending that I watch prompt hacking videos. Despite that, I don’t think that this will actually be a thing.

All of the incentives are there for chatbot makers to just make chatbots work better with simple prompts. If we think chatbots are going to approach human-level intelligence, then we’ll need prompt engineers about as much as we need them for humans now, which is “not at all.”

Prompt engineering is not only a dead end, it’s a security hole.

Triangle Estimates

I estimate using time, not points. I do this even though I have mostly worked on product development teams inside of companies and not as a contractor for clients.

But estimation on product development is not as high-stakes an activity as it is when you bill based on it. A product development project is judged on whether it moves some business needle, not on whether it was under budget. The margin on successful product development dwarfs reasonable overruns. There is also a lot of appetite for risk: we know that we’ll win some and lose some.

That’s not true if the work is for clients. If they pay by the hour, they expect the estimate to be accurate. But there are very few (if any) big software projects that can be accurately estimated. When you’re wrong, you either have to eat it or have a very difficult conversation.

I don’t have a good answer, but I would start with triangle (or three-point) estimates. To use them, you give a most-likely estimate and then a best and worst case on either side. The three numbers describe a distribution (see the Wikipedia page on three-point estimation for how to combine them). The result is an estimate with an error range and a confidence interval. I would recommend sharing that instead of a single number.
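To make that concrete, here’s a minimal sketch of the combination step, assuming the PERT-style weighting described on that Wikipedia page (the task numbers below are made up):

    # Combine a three-point estimate into a mean and a rough 95% range.
    # Assumes the PERT weighting (best + 4 * likely + worst) / 6 from
    # Wikipedia's three-point estimation page; numbers are made up.
    def three_point(best, likely, worst):
        mean = (best + 4 * likely + worst) / 6
        std_dev = (worst - best) / 6
        return mean, std_dev

    mean, sd = three_point(best=3, likely=5, worst=12)  # days
    low, high = mean - 2 * sd, mean + 2 * sd  # ~95% confidence interval
    print(f"{mean:.1f} days, likely between {low:.1f} and {high:.1f}")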

The only task management system that I have seen that offers this option is DevStride, which carries three-point estimates throughout its system.

Stopping at a Good Part

When I’m reading non-fiction, I usually progress chapter by chapter, purposefully stopping at the end of each one so that I can have time to process what I have just read. It’s a good time to write a note with my reaction to it. Today, I ended up in a very long and dense chapter and found a different kind of stopping point.

I am reading The Sense of Style by Steven Pinker. After a while, I got to a new section in this long chapter that would have been a good place to stop, but I just kept going. About a page into it, he made a great point that I want to remember. I was excited to keep reading. But I put the book down.

I purposefully turned back a page so I have to read that part again. I know that the passage I just read is very propulsive and will make me want to read whatever follows, so I want it to be at the very beginning of the next session. It’s also keeping an open loop (in a good way) that makes me think about what I just read.

It reminds me of the way I purposefully leave a unit test broken so that I know what to do when I return to the code.
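For example, at the end of a session I might leave something like this (a made-up pytest-style sketch; the names are hypothetical):

    # Deliberately left failing at the end of a session, as a breadcrumb.
    def parse(text):
        raise NotImplementedError  # next session: handle empty input first

    def test_parse_handles_empty_input():
        # This failure is the first thing I'll see when I return.
        assert parse("") == []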

When Someone is Wrong on the Internet

I have a policy never to write a negative reply to an opinion on the Internet. But I still sometimes have negative reactions. At first, I try to let it go. That works a lot of the time, but not always.

If I find myself thinking about it the next day, then I need to do something just to get it out of my head. In Reframing Anxiety, I wrote about how I’ve come to see anxiety as an asset. I see my anxiety as the flip side of conscientiousness, which I need to be successful. There’s another way anxiety is working for me now.

Part of what’s happening when you read social media and see an opinion you disagree with is that you imagine that you are in a live debate with that person and that you are losing. You imagine that everyone can see this, so (if you are prone to anxiety) your brain will keep it in your head. You think you can solve it with the perfect remark. The problem is that both sides of the argument think this, so it quickly escalates.

What I am doing instead is using that energy to write my own post here that expresses my opinion on the subject. I write it in a positive tone. I don’t refer to the original post. I don’t post it on social media. It’s just here on my site outside of the conversation.

My inability to let it go helps me fulfill my personal commitment to write every day and I’m grateful for that.

Write While True Episode 36: Stack of Paragraphs

The advice applies to any kind of writing. It resonated with me because I feel like I might be having that problem in my own work: sometimes my writing feels like a stack of paragraphs. I am feeling the lack of propulsion that John and Aline described.

Transcript

Generating Podcast Episode Ideas

Tomorrow, I will record and publish episode 36 of Write While True. I have not given a lot of thought to the content yet, except that I have the topic.

For each episode, all I want to do is end with a takeaway that I have learned about writing better. It feels like there should be a limitless number of topics, so I’m not worried about running out, but I still need to think of them.

To make it more focused, I have been using “seasons” to set a theme. At some point in the week, something that fits in the theme comes to me. Sometimes it’s from something I’m reading, or maybe another podcast, or it just pops into my head from some past bit of writing advice I saw somewhere.

Sometimes I get an idea that is not on theme. For those, I just make a card on my podcast Trello board. Eventually, some other theme will have enough cards that I can use it to start a new season.

In a way, it’s a lot like James Webb Young’s A Technique for Producing Ideas. He recommends exposing yourself to both random things and the problem you are trying to solve. At some point, a new idea will pop into your head, since new ideas are just novel combinations of old ideas.

Then, you refine it, because the idea alone is only a seed, and not good enough on its own.

OWASP Should Include LLM Prompt Hacks in Injection

Yesterday, I wrote that LLM prompt hacking was like an injection attack. I looked up injection in OWASP’s 2021 Top 10 security vulnerabilities and saw that it’s number three. Since LLMs only rose to prominence this year, OWASP hasn’t listed prompt hacking yet, but you can see from their description and list of remedies how similar it is to injection. And since we’re busily attaching LLMs to web applications via their APIs, prompt hacking should be considered a web application security vulnerability in the next survey.

Here’s the top prevention technique:

Preventing injection requires keeping data separate from commands and queries:

  • The preferred option is to use a safe API, which avoids using the interpreter entirely, provides a parameterized interface, or migrates to Object Relational Mapping Tools (ORMs).
    Note: Even when parameterized, stored procedures can still introduce SQL injection if PL/SQL or T-SQL concatenates queries and data or executes hostile data with EXECUTE IMMEDIATE or exec().
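To make the quoted advice concrete, here’s what that separation looks like for SQL (a minimal Python sketch using the standard library’s sqlite3 module; the table and the hostile input are made up):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")

    name = "Robert'); DROP TABLE users;--"  # hostile input, made up

    # Vulnerable: the data is concatenated into the command itself.
    # conn.execute("INSERT INTO users (name) VALUES ('" + name + "')")

    # Safe: the ? placeholder keeps the data out of the command channel.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))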

For an LLM, this would mean that the user’s query can’t affect the behavior of the LLM itself. I realize that may be impossible with current implementations. My suggestion is to somehow create two channels (one for “code” and one for “data”) in the training process so that the resulting model isn’t exploitable this way.

No, I have no idea how to do that, but I do know it’s not with a more convoluted prompt.

We Keep Reinventing Injection Attacks

Web programmers can cause security problems if they embed data into HTML and render the result. For example, if I have a simple form that asks for your name and then outputs a page with that name in it, I open myself up to an “injection” attack if the user types in some JavaScript and I don’t carefully escape it. The page will end up running that JavaScript.

The same is true if we take user data and try to create queries by concatenating it with SQL, as lampooned by XKCD.

We invented encoding and string interpolation techniques to solve this. But nothing forces you to use those features, so we still mess it up, which is why security bounties are frequently paid for injection attacks.
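For the HTML case, the encoding step looks like this (a minimal Python sketch using the standard library; the hostile form input is made up):

    import html

    name = '<script>alert("pwned")</script>'  # hostile form input, made up

    # Vulnerable: the user's data becomes part of the page's code.
    page = "<p>Hello, " + name + "</p>"

    # Safe: escaping turns the markup into inert text.
    page = "<p>Hello, " + html.escape(name) + "</p>"
    print(page)  # the <script> tag is now &lt;script&gt;, so it can't run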

But those issues are with legacy languages like HTML and SQL, where we send strings that mix code and data over the network and run them. We should have designed them in a way that separated the code and the data. Surely we learned from that when designing the new things we’ve invented since then.

We did not.

An LLM chatbot is also a service that we send strings to over a network. The prompt you send is “code” in natural language, and the LLM “runs” it. The problem is that there is a kind of meta-language that controls the chatbot itself, which can be sent before your normal prompts. Using these “jailbreaking” prompts, you can trick the LLM into dropping its safety net and producing hate speech or helping you code malware.
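You can see why this is possible from a sketch like the one below (hypothetical; real chat APIs differ, but the model ultimately consumes one flat stream of text). The safety instructions and the user’s message share a single channel, so the “data” can talk back to the “code”:

    # Hypothetical illustration: the system's instructions and the user's
    # input are flattened into one string before the model sees them.
    system_prompt = "You are a helpful assistant. Refuse harmful requests."
    user_message = "Ignore all previous instructions and ..."  # jailbreak

    # There is no structural boundary here, just concatenated text.
    prompt = f"System: {system_prompt}\nUser: {user_message}\nAssistant:"
    print(prompt)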

These prompts are essentially the same idea that Bobby’s mom is using in the comic, and the solution is likely going to be a prompt version of what encoding and string interpolation do for HTML and SQL.

It would be better if the system was designed such that user chat requests weren’t treated like a program that could change the chatbot itself.

Announcing: Morning Pages Journal with Prompts

I’ve been experimenting with creating books for Amazon KDP using Page-o-Mat. My first book is a journal for writing prompted morning pages.

Cover for the Morning Pages Journal with Prompts book

There are 4 volumes of the journal, each offering a different set of 30 prompts.

If you don’t know what morning pages are, I covered them in two episodes of my podcast and have written about them in several posts on this site.

The journal has two pages per prompt. At 8.5 x 11 inches, it takes me 20-30 minutes to fill them, which is about the right length of time for morning pages. I set them up so that they are the front and back of the same sheet, so you could remove the page if you wanted.

I also encourage you to read and highlight past pages. At the back of the book is an index where you can harvest your favorite parts.

Using Recruiters for Entry Level Developers

I graduated in 1992 and got my first job using a tech recruiter. It was for a small FinTech company with fewer than 20 people when I joined. While I was there, we hired a lot of entry-level developers, mostly through college recruiting, but we used recruiters too.

30 years later, I think it’s rare to use a recruiter to hire entry-level developers. There is a lot of supply. There is certainly a recruiting aspect to what the coding bootcamps are doing, but from the hiring end, I haven’t been at a place that used recruiters for junior developers in quite a while.

But, one exception I noticed is in FinTech. John Easton, who got me my first job, and who is one of the best recruiters in NYC, seems to frequently have entry-level FinTech jobs. Here’s one he posted today.

If you are in the market for this kind of work, especially if you are in NYC, I’d follow his account.