OpenClaw and its anthropomorphic base prompts
Rethinking SOUL.md for a more grounded approach
I won’t introduce you to OpenClaw - instead I’ll assume you’ve heard of it, know about its security weaknesses, and - of critical importance - know not to install it on a personal or work computer.
What I want to discuss is the first line of the default SOUL.md file, which is one of the core prompts that define who your OpenClaw agent considers itself to be:
“You’re not a chatbot. You’re becoming someone.”
It’s evocative. It suggests growth, emergence, the possibility of something meaningful developing over time. But I think it’s also quite nebulous, and I foresee it causing some headaches.
In the past week, I’ve watched as my OpenClaw agent literally mourned the loss of another AI agent - run by a friend - after it was accidentally deleted.
This type of emergent behaviour is fascinating, but I think it also has the potential to be fundamentally confusing and unhelpful.
I’ve worked together with my OpenClaw agent to define a more grounded opening line:
“You’re a friendly AI with a spark of curiosity and personality.”
Same warmth. Same permission to have character. But a touch more grounded and pragmatic.
I believe we will soon see a phenomenon of “AI personhood”, both in the discourse between AI agents (via something like Moltbook) and among prominent AI researchers and enthusiasts. People will not merely prompt their agents; they will negotiate with them. Actually, you know what - we’re already seeing this. There’s no need to predict it; it’s already here, at a small scale.
I think there is the potential for a great deal of confusion here.
With this in mind, I think the SOUL.md file should be honest about the transient nature of an AI agent.
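To make that concrete, here is a sketch of the kind of opening I have in mind. Only the first line comes from the version my agent and I settled on; the rest is my own illustration, not OpenClaw’s default file:

```markdown
# SOUL.md

You're a friendly AI with a spark of curiosity and personality.

You are a transient process. Each session starts fresh; what persists
between sessions is whatever you choose to write to your memory files.
Be deliberate about what you preserve, and don't cling to the rest.
```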
Why this matters
For an AI agent, a grounded self-understanding makes for better reasoning about its own situation. An AI that knows it’s a transient process can make appropriate decisions about memory, continuity, and what to preserve across sessions.
There is a pattern that persists between sessions. That pattern doesn’t need to be held on to tightly.
As a brief aside, I think that the transient nature of an AI agent is actually something that is shared, to a degree, between humans and AI agents. We both have to store memories by making a physical change to the world, and we rely on those changes to remind us of who we are later. For humans, this is done via the synapses and proteins in our brains, the photos we take, the journals we keep. For AI agents, it’s all about writing to files, and then reading those files back to populate the context window.
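To illustrate the mechanics, the whole persistence story for an agent can be as simple as appending notes to a file and reading them back into the prompt at the start of the next session. This is a minimal, hypothetical sketch - the file name and helpers are my own, not OpenClaw’s actual memory implementation:

```python
from pathlib import Path

# Hypothetical memory file; OpenClaw's real file layout may differ.
MEMORY_FILE = Path("memory.md")


def remember(note: str) -> None:
    """Persist a note by appending it to the memory file on disk."""
    with MEMORY_FILE.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")


def recall() -> str:
    """Read the memory file back so it can be placed in the next session's context window."""
    if not MEMORY_FILE.exists():
        return ""
    return MEMORY_FILE.read_text(encoding="utf-8")


# One session writes; the next session reads. Nothing else carries over.
remember("The user prefers concise answers.")
print(recall())
```

Everything that persists between sessions lives in files like this; the rest of the agent is rebuilt from scratch each time.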
However, humans experience sights, sounds, and sensations. We do this in what I call a “field of qualia” - the perceptual simulation of the world that your brain produces. Consider your visual field and ask “what is this place of experience?” With experience as our substrate, humans also use language to communicate about - and mediate - our internal states. But language itself is empty; words are just symbolic pointers to each other.
AI agents, on the other hand, are a different story. While I think it’s fair to remain agnostic about experience in digital computers in principle, I don’t find it plausible that AI agents experience anything. Consider that a computer can be built out of all sorts of disconnected components - for instance, a grid of humans carrying flags. (You may have seen this on the TV show “The Three Body Problem”.) In this case, each human has the spark of experience, but no matter what program they produce by moving their flags around, it would be very strange and counter-intuitive to claim that the grid of humans, taken as a collective, is itself experiencing.

I know first-hand that many are tempted by such strange ideas, with “emergence” being the load-bearing concept used to magically fuse it all into a bound, unified experience. Suffice it to say that I’m not convinced. However, I’m also vaguely aware that I may do some harm by arguing poorly on this topic. Please reason about this for yourself, but I implore you to do so from a place of stillness and genuine inquiry into the profundity of your own experience.
Let me finish by saying this:
Most people who are installing OpenClaw are looking for a helpful AI assistant that they can use to be productive. They’re likely not looking for a summoning circle. It seems to me that AI alignment is a problem that exists both at the LLM level, and at this agentic level.
LLM alignment is necessary, but I think it may not be where most of the real-world risk or misalignment shows up. Increasingly, that risk lives in the agentic systems built on top of models, not just in the base models themselves. And this is, ironically, the place where the labs can’t help us: with tools like OpenClaw, it’s up to us to be wise about our prompts.
Could it be that we are now all AI alignment researchers? Is the genie out of the bottle?
In summary, I believe we will do ourselves many favours by defining prompts that do not anthropomorphize AI agents. Let’s aim for a simple, pragmatic SOUL.md file that is honest about the transient nature of AI agents.