This piece is missing the most important reason OpenClaw is dangerous: LLMs are still inherently vulnerable to prompt injection / lethal trifecta attacks, and OpenClaw is being used by hundreds of thousands of people who do not understand the security consequences of giving an LLM-powered tool access to their private data, exposure to potentially untrusted instructions and the ability to run tools on their computers and potentially transmit copies of their data somewhere else.
It feels like everyone is just collectively ignoring this.
LLMs are way less useful when you have to carefully review and approve every action it wants to take, and even that’s vulnerable to review exhaustion and human error. But giving LLMs unrestricted access to a bunch of stuff via MCP servers and praying nothing goes wrong is extremely dangerous.
All it takes is a tiny snippet from any source to poison the context and then an attacker has remote code execution AND can leverage the LLM itself to figure out how best to exfiltrate and cause the most damage. We are in a security nightmare and everyone is asleep. Claude Code isn’t even sandboxed by default for christ sakes, that’s the least it could do!
Right on. Human-in-the-loop doesn't scale at agent speed. Sandboxing constrains tool execution environments, but says nothing about which actions an agent is authorized to take. That gets even worse once agents start delegating to other agents.I've been building a capability-based authz solution: task-scoped permissions that can only narrow through delegation, cryptographically enforced, offline verification. MIT/Apache2.0, Rust Core.
https://github.com/tenuo-ai/tenuo
This is just the price of being on the bleeding edge.
Unfortunately, prompt injection does strongly limit what you can safely use LLMs for. But people are willing to accept the limitations because they do a lot of really awesome things that can't be done any other way.
They will figure out a solution to prompt injection eventually, probably by training LLMs in a way that separates instructions and data.
It’s like money laundering, but now responsibility laundering.
Anthropic released Claude saying “hey be careful. But now that enables the masses to build OpenClaw and go “hold my bear”. Now the masses people using OpenClaw had no idea what responsibility they should hold.
I think eventually we will have laws like “you are responsible for your AI’s work”. Much like how driver is (often) responsible for car crashes, not the car companies.
Hey, author here. I don't think that the security vulns are the most important reason OC is dangerous. Security vulnerabilities are bad but the blast radius is limited to the person who gets pwnd. By comparison, OpenClaw has demonstrated potential to really hurt _other_ people, and it is not hard to see how it could do so en masse.
>> Security vulnerabilities are bad but the blast radius is limited to the person who gets pwnd
No? Via prompt injection an attacker can gain access to the entire machine, which can have things like credentials to company systems (e.g. env variables). They can also learn private details about the victim’s friends and family and use those as part of a wider phishing campaign. There are dozens of similar scenarios where the blast radius reaches well beyond the victim.
Agree with author - it's especially scary that even without getting hacked, openclaw did something harmful
That's not to say that prompt injection isn't also scary. It's just that software getting hacked by bad actors has always been a thing. Software doing something scary when no human did anything malicious is worse.
>> No? Because I wouldn't give it access to those things.
Not everyone is like that. In fact, OpenClaw's true "power" is unlocked when the user gives it full access. That's what the overwhelming majority of hype is coming from. Most people who actually get a lot of value out of it don't run it on e.g. docker containers on VPSs that can only be accessed via Tailscale + SSH.
I think there is a much higher risk of it hurting the people are using it directly, especially once bad people realize how vulnerable they are.
Not to mention a bad person who takes control of a network of OpenClaw instances via their insecurities can do the other bad things you are describing at a much greater scale.