The “think of a number” fallacy

Some time ago a colleague, commenting on the idea of iterative prompting, suggested asking GPT to “think about something” and then decide what to write or not to write.

The problem with this approach is that a session with an LLM has no memory outside the text produced in the chat itself, so it cannot “keep something in mind” while completing other tasks.

But it can pretend it does.

To test this, you can ask an LLM to “think of a number, but don’t tell me”. At the time of this writing, most models will respond by confirming that they have thought of a number. Of course they haven’t... but because they are trained to mimic human interactions, they pretend they have.

This is something to always keep in mind while prompting.

For example, it is not effective to prompt a system to “make a list and only show me the part matching a criterion”; instead, ask it to print the full output and then generate a final list from it (“print the list, then update it with the criterion”).
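A minimal sketch of the two styles, assuming the OpenAI Python SDK and a placeholder model name (any chat-style LLM and client would do):

```python
# Sketch only: contrasts "filter silently" vs "print, then filter" prompting.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Less effective: asks the model to "keep" the full list to itself and only
# reveal the matching part -- there is no hidden scratchpad for it to use.
hidden_prompt = (
    "Make a list of 20 common vegetables, but only show me the green ones."
)

# More effective: the intermediate list becomes part of the visible output,
# and the filtered list is derived from it.
visible_prompt = (
    "Step 1: print a list of 20 common vegetables. "
    "Step 2: from that printed list, print only the green ones."
)

for prompt in (hidden_prompt, visible_prompt):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content, "\n---")
```

The point is not the API call but the prompt shape: anything the model is supposed to “remember” has to live in the generated text itself.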

GroceriesGPT

This morning a friend shared a list of vegetables, noting how hard it is to eat 30 different ones in the same week.

I immediately turned to my AI chatbot and asked it to create a list of commonly eaten vegetables, and of course I got a very good one.

At that point I thought that it would be nice to add that list to my next grocery order on Ocado.

And this is where the magic ended.

My chatbot doesn’t talk to the Ocado app. And I actually use more than one bot: sometimes I go with ChatGPT, sometimes with Claude. They are both good and continuously improving, and I like to pit them against each other.

ChatGPT has a plug-in architecture that could potentially connect it to other applications through custom GPTs, but so far I haven’t seen any particularly good application of it. And what would be the idea there? That Ocado would have to build a custom GPT? And what about other chatbots? I don’t really want to be siloed again. I’m happy to pay for services, even Google’s, but leave me free to connect.

Meanwhile, I’m sure somebody at Ocado is already thinking about how to integrate an AI into their app (if you aren’t, call me), and while this will be a nice feature to have, it will be yet another AI agent unable to talk to my other agents.

Maybe the solution is similar to what Rabbit appears to be working on: teach AI to use the UI. Avoid altogether the challenge of getting companies and engineers to agree on open standards and just teach AIs to use the shitty, incompatible interfaces of our apps.

AI interoperability might be one of the most interesting problems we will face.

I want the AIs I pay for to collaborate, not to compete.