LLM chatbots are bad at some things. Some of this is intentional. For example, we don’t want chatbots to generate hate speech. But some things are definitely not intentional, like when they make stuff up. Chatbots also fail at writing non-generic text. It’s amazing that they can write coherent text at all, but they can’t compete with good writers.
Based on the success of these prompts, it looks like a new field is emerging. We’re starting to see job openings for prompt engineers. YouTube keeps recommending that I watch prompt hacking videos. Despite that, I don’t think that this will actually be a thing.
All of the incentives are there for chatbot makers to just make chatbots better with simple prompts. If we think chatbots are going to approach human-level intelligence, then we’ll need prompt engineers as much as we need them now for humans, which is “not at all.”
I estimate using time, not points. I do this even though I have mostly worked on product development teams inside of companies and not as a contractor for clients.
But estimation on product development is not as high-stakes an activity as it is if you are billing based on it. A successful product development project is judged on whether it moves some business needle, not on whether it was under budget. The margin on successful product development dwarfs reasonable overruns. There is also a lot of appetite for risk, where we know that we’ll win some and lose some.
That’s not true if the work is for clients. If they pay by the hour, they expect the estimate to be accurate. But there are very few (if any) big software projects that can be accurately estimated. When you’re wrong, you either have to eat it or have a very difficult conversation.
I don’t have a good answer, but I would start with triangle (or three-point) estimates. To use them, you make a most-likely estimate and then best and worst cases on either side. The three numbers describe a distribution (see the Wikipedia page for how to combine them). The result is an estimate with an error range and a confidence interval. I would recommend sharing that instead of a single number.
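To make that concrete, here’s a minimal sketch in Python using the common beta-distribution (PERT) approximation from the Wikipedia article. The example numbers are made up:

```python
def three_point_estimate(best, likely, worst):
    """Combine best/likely/worst-case estimates into a mean and spread."""
    mean = (best + 4 * likely + worst) / 6  # weighted toward the likely case
    std_dev = (worst - best) / 6            # rough spread of the distribution
    return mean, std_dev

# Hypothetical task: best case 5 days, most likely 10, worst case 21
mean, sd = three_point_estimate(best=5, likely=10, worst=21)

# Roughly 95% of outcomes fall within two standard deviations
low, high = mean - 2 * sd, mean + 2 * sd
print(f"estimate: {mean:.1f} days, range: {low:.1f} to {high:.1f} days")
```

Sharing the range (here, about 5.7 to 16.3 days) communicates the uncertainty in a way that a single number can’t.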
The only task management system that I have seen that offers this option is DevStride, which carries three-point estimates throughout its system.
When I’m reading non-fiction, I usually progress chapter by chapter, purposefully stopping at the end of each one so that I can have time to process what I have just read. It’s a good time to write a note with my reaction to it. Today, I ended up in a very long and dense chapter and found a different kind of stopping point.
I am reading The Sense of Style by Steven Pinker. After a while, I got to a new section in this long chapter that would have been a good place to stop, but I just kept going. About a page into it, he made a great point that I want to remember. I was excited to keep reading. But I put the book down.
I purposefully turned back a page so I have to read that part again. I know that the passage I just read is very propulsive and will make me want to read whatever follows, so I want it to be at the very beginning of the next session. It’s also keeping an open loop (in a good way) that makes me think about what I just read.
I have a policy never to write a negative reply to an opinion on the Internet. But I still sometimes have negative reactions. At first, I try to let it go. That works a lot of the time, but not always.
If I find myself thinking about it the next day, then I need to do something just to get it out of my head. In Reframing Anxiety, I wrote about how I’ve come to see anxiety as an asset. I see my anxiety as the flip-side of conscientiousness, which I need to be successful. There’s another way anxiety is working for me now.
Part of what’s happening when you read social media and see an opinion you disagree with is that you imagine that you are in a live debate with that person and that you are losing. You imagine that everyone can see this, so (if you are prone to anxiety) your brain will keep it in your head. You think you can solve it with the perfect remark. The problem is that both sides of the argument think this, so it quickly escalates.
What I am doing instead is using that energy to write my own post here that expresses my opinion on the subject. I write it in a positive tone. I don’t refer to the original post. I don’t post it on social media. It’s just here on my site outside of the conversation.
My inability to let it go helps me fulfill my personal commitment to write every day and I’m grateful for that.
The advice applies to any kind of writing. It resonated with me because I feel like I might be having that problem in my own written work: sometimes my writing feels like a stack of paragraphs. I am feeling the lack of propulsion that John and Aline described.
Tomorrow, I will record and publish episode 36 of Write While True. I have not given a lot of thought to the content yet, except that I have the topic.
For each episode, all I want to do is end with a takeaway that I have learned about writing better. It feels like there should be a limitless number of topics, so I’m not worried about running out, but I still need to think of them.
To make it more focused, I have been using “seasons” to set a theme. At some point in the week, something that fits in the theme comes to me. Sometimes it’s from something I’m reading, or maybe another podcast, or it just pops into my head from some past bit of writing advice I saw somewhere.
Sometimes I get an idea that is not on theme. For that, I just make a card on my podcast Trello board. Eventually, there will be enough cards in some other theme that I can use to start a new season.
In a way, it’s a lot like James Webb Young’s A Technique for Producing Ideas. He recommends exposing yourself to both random things and the problem you are trying to solve. At some point, a new idea will pop into your head, since new ideas are just novel combinations of old ideas.
Then, you refine it, because the idea alone is only a seed, and not good enough on its own.
Preventing injection requires keeping data separate from commands and queries:
The preferred option is to use a safe API, which avoids using the interpreter entirely, provides a parameterized interface, or migrates to Object Relational Mapping Tools (ORMs). Note: Even when parameterized, stored procedures can still introduce SQL injection if PL/SQL or T-SQL concatenates queries and data or executes hostile data with EXECUTE IMMEDIATE or exec().
For an LLM this means that the LLM itself isn’t affected by the user query. I realize that that may be impossible with current implementations. My suggestion is to somehow create two channels (one for “code” and one for “data”) in the training process so that the resulting model isn’t exploitable this way.
No, I have no idea how to do that, but I do know that the answer is not a more convoluted prompt.
The same is true if we take user data and try to create queries by concatenating it with SQL, as lampooned by XKCD.
We invented encoding and string interpolation techniques to solve this. But nothing forces you to use those features, so we still mess it up, which is why security bounties are frequently paid for injection attacks.
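As a sketch of the difference, here’s what concatenation versus parameterization looks like with Python’s built-in sqlite3 module (the table and data are made up, and the hostile string is a nod to the XKCD comic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("INSERT INTO students VALUES ('Alice')")

user_input = "Robert'); DROP TABLE students;--"  # Bobby Tables

# Unsafe: concatenating user data into the command string mixes
# code and data in one channel.
#   query = "INSERT INTO students VALUES ('" + user_input + "')"

# Safe: the ? placeholder sends the data out-of-band, so the
# hostile string is stored as plain data and never interpreted.
conn.execute("INSERT INTO students VALUES (?)", (user_input,))

rows = conn.execute("SELECT name FROM students").fetchall()
print(rows)
```

Nothing in SQL forces you to use the placeholder, though, which is exactly the problem.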
But those issues are with legacy languages like HTML and SQL, where we send strings that mix code and data over the network and run them. We should have designed them in a way that separated the code and the data. Surely, we learned from that for the new things we invented since then.
We did not.
An LLM chatbot is also a service that we send strings to over a network. The prompt you send is “code” in natural language, and the LLM “runs it”. The problem is that there is a kind of meta-language that controls the chatbot itself, which can be sent before your normal prompts. Using these “jailbreaking” prompts, you can trick the LLM into dropping its safety net and producing hate speech or helping you code malware.
These prompts are essentially the same idea that Bobby’s mom is using in the comic, and the solution is likely going to be a prompt version of what encoding and string interpolation is doing.
It would be better if the system was designed such that user chat requests weren’t treated like a program that could change the chatbot itself.
The journal has two pages per prompt. At 8.5 x 11, it takes me 20-30 minutes to fill them, which is about the right length of time for morning pages. I set them up so that they are the front and back of the same page, so you could remove the page if you wanted.
I also encourage you to read and highlight past pages. At the back of the book is an index where you can harvest your favorite parts.
I graduated in 1992 and got my first job using a tech recruiter. It was at a small company in FinTech with fewer than 20 people when I joined. While I was there, we hired a lot of entry-level developers, mostly from college recruiting, but we did use recruiters too.
30 years later, I think it’s rare to use a recruiter to hire entry-level developers. There is a lot of supply. There is certainly an aspect of recruiting in what the code bootcamp schools are doing, but from the hiring end, I haven’t been at a place that used recruiters for junior developers for quite a while.
But one exception I noticed is in FinTech. John Easton, who got me my first job, and who is one of the best recruiters in NYC, seems to frequently have entry-level FinTech jobs. Here’s one he posted today.
If you are in the market for this kind of work, especially if you are in NYC, I’d follow his account.