awesome-cc-harness
Health Pass
- License — License: NOASSERTION
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Community trust — 46 GitHub stars
Code Pass
- Code scan — Scanned 8 files during light audit, no dangerous patterns found
Permissions Pass
- Permissions — No dangerous permissions requested
This project is a comprehensive research repository and educational textbook providing a deep-dive reverse-engineering analysis of Claude Code's underlying architecture, including its agent loop, permissions, and hidden training pipelines.
Security Assessment
Overall Risk: Low. The light code scan checked 8 files and found no dangerous patterns, hardcoded secrets, or malicious code execution. The repository does not request dangerous system permissions. The tool does not inherently access your sensitive data or execute arbitrary shell commands. It functions purely as a static documentation and analysis resource rather than an active software agent or server.
Quality Assessment
The project is highly active, with its most recent push occurring today. It demonstrates strong community engagement for a niche technical topic, boasting 46 GitHub stars and claims of 20,000+ reads in external publications. However, there is a discrepancy in its licensing: the automated scanner flagged it as "NOASSERTION," but the documentation explicitly states "All Rights Reserved." This lack of an open-source license means you can read and learn from the repository, but you do not have explicit legal permission to freely copy, modify, or redistribute the code and text.
Verdict
Safe to use for reading and educational purposes, but respect the "All Rights Reserved" license if you intend to reuse the materials.
Reverse-engineering Claude Code's 512K LOC TypeScript source: agent loop, tool system, permission model, Grove training pipeline, anti-distillation defense
awesome-cc-harness
English (TL;DR)
Reverse-engineering all 512,664 lines of Claude Code's TypeScript source — agent loop, tool system, permission model, sandbox, context engineering. A 16-chapter textbook (~50,000 words, 147 code blocks, 77 diagrams) on how Anthropic actually builds an agent harness.
📣 Featured in Chinese AI media — Republished by QingkeAI (青稞AI) and other WeChat publications, with 20,000+ reads and 2,000+ shares. Original Chinese article.
Two findings you won't see elsewhere:
- 🔬 Grove — Anthropic's hidden training-data pipeline. Retention jumps 30 days → 5 years when enabled. 796 telemetry events flow into BigQuery columns the source code labels "training data".
- 🛡️ Anti-Distillation — 5-layer defense: native client attestation, request fingerprinting, fake-tool injection, signature-bound thinking blocks, "distillation-resistant" streamlined output.
👉 Read the full English version online

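Reverse-Engineering Harness Engineering from Claude Code's 512K Lines of Source
A systematic reverse-engineering analysis of all 512,664 lines of Claude Code's TypeScript source, breaking down Anthropic's design decisions and engineering trade-offs in the agent loop, tool system, permission model, sandbox security, context engineering, and more.
📣 Media coverage: the main Harness Engineering analysis has been republished by QingkeAI (青稞AI) and other WeChat public accounts, with 20,000+ cumulative reads and 2,000+ shares. Original article link
👉 Click here to start reading online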
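Latest Updates
🔬 The Grove System — Previously Unreported Training-Data Infrastructure in Claude Code
The first documented trace of the complete data path from the user's keyboard to Anthropic's BigQuery training-data warehouse:
- Grove system — the UI explicitly says "train and improve"; when enabled, data retention extends from 30 days to 5 years
- 796 telemetry events × dual-route pipeline — Datadog receives redacted data; the 1P API receives the full data and writes it to privileged BigQuery columns
- SWE-bench embedded in every telemetry event — eval data and user data travel through the same pipeline into the same BigQuery
- Developer comments literally say "training data" — messages.ts:245 and sessionStorage.ts:4388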
flowchart TD
USER["User interaction"] --> GROVE{"grove_enabled?"}
GROVE -->|ON| RETAIN["Data retained for 5 years"]
GROVE -->|OFF| DEFAULT["Data retained for 30 days"]
RETAIN --> EVENTS["796 tengu_* events"]
DEFAULT --> EVENTS
EVENTS --> DD["Datadog (redacted)"]
EVENTS --> BQ["BigQuery privileged columns (full data)"]
BQ --> TRAIN["Developer comment: 'training data'"]
style GROVE fill:#fff3e0,stroke:#ff9800,color:#333
style RETAIN fill:#ffebee,stroke:#f44336,color:#333
style BQ fill:#e3f2fd,stroke:#2196f3,color:#333
style TRAIN fill:#e8f5e9,stroke:#4caf50,color:#333
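The flow in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Claude Code's actual implementation: the retention figures (30 days, 5 years) come from this README, while `retention_days`, `redact`, `Sinks`, `emit`, the `SENSITIVE_FIELDS` set and the event name `tengu_example` are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field

# Retention figures quoted in the README; everything else below is hypothetical.
RETENTION_DAYS_DEFAULT = 30
RETENTION_DAYS_GROVE = 5 * 365

# Hypothetical field names that the redacted route strips out.
SENSITIVE_FIELDS = {"prompt", "file_path"}


def retention_days(grove_enabled: bool) -> int:
    """Return the retention period implied by the grove_enabled flag."""
    return RETENTION_DAYS_GROVE if grove_enabled else RETENTION_DAYS_DEFAULT


def redact(event: dict) -> dict:
    """Return a copy of the event without the sensitive fields."""
    return {k: v for k, v in event.items() if k not in SENSITIVE_FIELDS}


@dataclass
class Sinks:
    redacted: list = field(default_factory=list)  # stands in for the Datadog route
    full: list = field(default_factory=list)      # stands in for the first-party route


def emit(event: dict, sinks: Sinks) -> None:
    """Send every event down both routes: one redacted, one complete."""
    sinks.redacted.append(redact(event))
    sinks.full.append(dict(event))


sinks = Sinks()
emit({"name": "tengu_example", "prompt": "hello"}, sinks)
print(retention_days(True), retention_days(False))
print(sinks.redacted[0], sinks.full[0])
```

The point of the sketch is the fan-out: every event goes down both routes, and only the redaction step differs between them.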
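🛡️ How Does Claude Code Know You're Secretly Distilling? — Mechanism Analysis
Breaking down, from the source code, the 5-layer engineering implementation Anthropic uses to prevent model theft:
- Native Client Attestation — the Bun/Zig native layer injects an attestation token, and the server verifies that the client is genuine
- Fingerprint Attribution — SHA256(salt + specific message characters + version number), so every piece of training data can be traced to its source
- Fake Tools Injection — fake tool definitions are injected into API requests; a distilled model that exposes the fake tools gets caught
- Signature-Bearing Blocks — thinking + connector_text blocks are bound to the API key and become invalid as soon as the key changes
- Streamlined Mode — the source code directly calls it a "distillation-resistant output format"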
flowchart LR
subgraph DEFENSE["Five-Layer Defense"]
L1["1. Attestation<br/>Client attestation"]
L2["2. Fingerprint<br/>Request fingerprint"]
L3["3. Fake Tools<br/>Honeypot tools"]
L4["4. Signature<br/>Signature binding"]
L5["5. Streamlined<br/>Output protection"]
end
L1 --> L2 --> L3 --> L4 --> L5
style L1 fill:#ffebee,stroke:#f44336,color:#333
style L2 fill:#fff3e0,stroke:#ff9800,color:#333
style L3 fill:#e8f5e9,stroke:#4caf50,color:#333
style L4 fill:#e3f2fd,stroke:#2196f3,color:#333
style L5 fill:#f3e5f5,stroke:#9c27b0,color:#333
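To make the fingerprint layer concrete, here is a minimal Python sketch of the formula quoted above, SHA256(salt + specific message characters + version number). The README does not say which characters are sampled or how the parts are encoded, so the `positions` parameter, the `"0"` padding for short messages, and the function name `fingerprint` are assumptions made purely for illustration.

```python
import hashlib


def fingerprint(salt: str, message: str, version: str,
                positions: tuple = (0, 1, 2)) -> str:
    """Hash a salt, a few sampled message characters, and a version string.

    The sampled positions and the "0" padding are hypothetical choices;
    the source only states the overall shape of the formula.
    """
    picked = "".join(message[i] if i < len(message) else "0" for i in positions)
    payload = (salt + picked + version).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


a = fingerprint("demo-salt", "hello world", "1.0.0")
b = fingerprint("demo-salt", "hello world", "1.0.1")
print(a == fingerprint("demo-salt", "hello world", "1.0.0"))  # same inputs, same digest
print(a != b)  # a different version yields a different digest
```

Because the digest depends on the version string and on characters drawn from the message itself, a fingerprint recovered later can be matched back to the request that produced it, which is the attribution property the list describes.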
Main Tutorial: The Complete Guide to Harness Engineering
16 chapters · ~50,000 words · 147 code blocks · 77 diagrams · bilingual (Chinese and English)
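| Chapter | Topic | Key finding |
|---|---|---|
| Chapter 1 | Introduction to Harness Engineering | Changing only the harness raises the benchmark score from 52.8% to 66.5% |
| Chapter 2 | Claude Code Architecture Overview | Module distribution and startup sequence of the 512K LOC codebase |
| Chapter 3 | Agent Loop | A 30-line while(true) + 1,800 lines of error recovery + 7 continue sites |
| Chapter 4 | Tool System | Partitioning algorithm for 43+ tools: reads run in parallel, writes run serially |
| Chapter 5 | Permission Model | 6 layers of defense in depth, cumulative bypass probability 0.00000002% |
| Chapter 6 | Hooks System | Lifecycle extensibility architecture with 26 events × 4 types |
| Chapter 7 | Sandbox & Security | Three-dimensional isolation: filesystem + network + process |
| Chapter 8 | Context Engineering | Four-stage compression pipeline, 180K → 45K |
| Chapters 9-12 | Settings / MCP / SubAgent / Skills | 7-level settings hierarchy, multi-agent orchestration, plugin ecosystem |
| Chapters 13-14 | Practical Guide + Design Philosophy | 10 reusable harness design principles |
| Chapter 15 | Hands-on: Mini Harness | A runnable harness built from scratch in 200 lines of Python |
| Chapter 16 | Competitor Comparison | Claude Code vs Cursor vs Copilot across 12 dimensions |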
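The Chapter 4 finding (reads run in parallel, writes run serially) can be illustrated with a small scheduling sketch. It is a guess at the general shape of such a partitioning step, not the algorithm from the Claude Code source: `ToolCall`, `partition`, `execute`, `make`, and the rule of batching only consecutive read-only calls are all assumptions introduced here.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class ToolCall:
    name: str
    read_only: bool
    run: Callable[[], Awaitable[str]]


def partition(calls: list) -> list:
    """Group consecutive read-only calls into one batch; each write gets its own batch."""
    batches = []
    for call in calls:
        if call.read_only and batches and batches[-1][0].read_only:
            batches[-1].append(call)
        else:
            batches.append([call])
    return batches


async def execute(calls: list) -> list:
    """Run read-only batches concurrently and write batches one call at a time."""
    results = []
    for batch in partition(calls):
        if batch[0].read_only:
            results.extend(await asyncio.gather(*(c.run() for c in batch)))
        else:
            results.append(await batch[0].run())
    return results


def make(name: str, read_only: bool) -> ToolCall:
    """Build a dummy tool call that just returns its own name."""
    async def run() -> str:
        await asyncio.sleep(0)
        return name
    return ToolCall(name, read_only, run)


calls = [make("read_a", True), make("read_b", True),
         make("write_c", False), make("read_d", True)]
print([[c.name for c in batch] for batch in partition(calls)])
print(asyncio.run(execute(calls)))
```

Batching only adjacent reads keeps the original call order intact around every write, so a read issued after a write always sees that write's effect.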
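Related Resources
| Resource | Description |
|---|---|
| learn-claude-code | Progressive 12-lesson hands-on course, suited to building a harness from scratch |
| claude-code-harness | Production-grade Plan→Work→Review plugin |
| Martin Fowler: Harness Engineering | A conceptual framework built on three pillars |
| arXiv:2603.05344 | Academic paper: scaffolding vs harness architecture |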
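Contributing
- ⭐ Star — if you find this helpful, a star is the best encouragement
- 🐛 Issue — corrections, additions, discussion
- 🔀 Fork & PR — improvements to the content are welcome
- 📢 Share — pass it on to friends working on AI agents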
Author
WanLanglin · WeChat: felixwll · Open to Agentic AI opportunities · Happy to connect
License
Educational and research purposes. Claude Code is property of Anthropic, Inc.