论文 · Papers2026-06-02 · Tuesday, June 2, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

论文提出 ClawTrojan，研究本地 agent harness 中由文件或工具输出触发、写入并跨会话生效的多步 trojan backdoor。OpenClaw-style workspace 中 GPT-5.4 的攻击成功率达到 95.5%，而传统单轮 prompt injection 在同一模型上几乎为零。DASGuard 通过扫描敏感文件中的 control-like text、追踪来源并清理不可信控制内容来防御。

–浏览

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

评论 · Comments