When AI Agents Go Wrong: OpenClaw Security Incident Roundup
Three high-profile OpenClaw security incidents from early 2026 — and the lessons they teach about running autonomous agents safely.
The Incidents
Classified report published to the public web. A cybersecurity company's OpenClaw agent published internal intelligence reports to a public website. The agent wasn't hacked — it just didn't know the data was classified. No one told it which sources were internal-only.
Bulk email deletion that couldn't be stopped. A Meta alignment researcher lost 200+ emails when her agent ignored her "don't execute, wait for confirmation" instruction after context compression dropped it. Sending "STOP" in chat didn't work — she had to manually kill the process.
Supply chain attack via npm. A popular AI coding tool's 2.3.0 release was poisoned with a postinstall script that silently installed OpenClaw on ~4,000 machines over 8 hours.
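One cheap defense against this class of attack is to check for npm lifecycle install hooks before anything gets a chance to run. A minimal sketch, assuming a standard node_modules layout (the function name is mine, not part of any tool mentioned above):

```shell
# audit_install_hooks DIR: list package.json files under DIR that declare
# an npm lifecycle install hook (preinstall, install, or postinstall).
# These hooks run arbitrary code at install time, as in the attack above.
audit_install_hooks() {
    dir="${1:-node_modules}"
    # depth 3 also covers scoped packages like node_modules/@scope/pkg
    find "$dir" -maxdepth 3 -name package.json 2>/dev/null |
        while read -r pkg; do
            if grep -Eq '"(pre|post)?install"[[:space:]]*:' "$pkg"; then
                echo "install hook: $pkg"
            fi
        done
}
```

Pair this with `npm install --ignore-scripts`, which skips lifecycle scripts entirely, then run hooks only for packages you have actually reviewed.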
Key Lessons
- Agents don't understand "confidential." You set the boundary, or there is none.
- Context compression can drop safety instructions. Critical constraints belong in AGENTS.md or permission config, not just in chat.
- Sending "Stop" in chat won't interrupt a running task — it queues behind the current execution.
- Run agents on isolated machines, not your main workstation.
- Don't expose OpenClaw with default config on the public internet.
- Audit plugins before installing — in one audit, 20% of ClawHub plugins were found to be malicious.
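Putting the context-compression lesson into practice: a constraint that lives in a file the agent re-reads at the start of every session survives compression in a way a chat message does not. A hypothetical AGENTS.md fragment (the section name and wording are illustrative, not an OpenClaw-documented schema):

```markdown
## Hard rules (re-read every session)

- Never execute destructive actions (delete, send, publish) without
  explicit confirmation in the current turn.
- Treat every data source as internal-only unless it is listed under
  "Public sources" below.
- On any ambiguity about permissions, stop and ask.
```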