Turn a model running on your own machine into a real, tool-using software agent.

No cloud. No API keys. No telemetry. Your code never leaves your computer.

📘 Use it · 🧩 How it works · 🤝 Contribute · ⚖️ License

🌟 Why qwen-harness?

🔒 Truly private & offline. The model, your files, and every transcript stay on
your machine. Zero network egress, no accounts, no API keys, no vendor lock-in.
⚡ Zero dependencies, zero build step. Pure TypeScript run directly by Node —
clone, point it at Ollama, go. Nothing to install from a registry.
🤖 A real agent, not a chatbot. It reads, searches, writes, and edits files and
runs commands in a permission-gated read → decide → act → observe loop —
including multi-agent delegation, long-term memory, and resumable sessions.
🛡️ Safe by default. Every action passes a permission gate, and clearly
dangerous commands are always blocked.
🪶 Built for small local models. Argument validation + repair, malformed
tool-call recovery, and automatic context compaction make modest models reliable.
✅ Battle-tested. 162 automated tests, plus a mandatory secret scan on every run.
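
The permission-gated read → decide → act → observe loop described above can be pictured roughly as follows. This is a minimal sketch, not qwen-harness's actual code: `ToolCall`, `ModelStep`, `AgentDeps`, `runAgent`, and the `decide` / `allow` / `tools` fields are all hypothetical names, and the model is stubbed as a plain async function.

```typescript
// Hypothetical sketch of a permission-gated agent loop; not the project's real code.
type ToolCall = { tool: string; args: Record<string, string> };
type ModelStep =
  | { kind: "call"; call: ToolCall }
  | { kind: "done"; answer: string };
type Tool = (args: Record<string, string>) => Promise<string>;

interface AgentDeps {
  decide: (transcript: string[]) => Promise<ModelStep>; // stands in for the local model
  allow: (call: ToolCall) => Promise<boolean>; // the permission gate
  tools: Record<string, Tool>;
  maxSteps?: number;
}

async function runAgent(task: string, deps: AgentDeps): Promise<string> {
  const transcript: string[] = [`user: ${task}`];
  const maxSteps = deps.maxSteps ?? 10;
  for (let step = 0; step < maxSteps; step++) {
    const next = await deps.decide(transcript); // decide, based on everything read so far
    if (next.kind === "done") return next.answer;
    const { call } = next;
    const tool = deps.tools[call.tool];
    let observation: string;
    if (tool === undefined) {
      observation = `error: unknown tool "${call.tool}"`;
    } else if (!(await deps.allow(call))) {
      observation = `denied: ${call.tool} was not permitted`; // gate runs before acting
    } else {
      observation = await tool(call.args); // act
    }
    transcript.push(`tool(${call.tool}): ${observation}`); // observe
  }
  return "stopped: step limit reached";
}
```

In this sketch a denied call is fed back to the model as an observation rather than aborting the run, so the model can choose a different approach. The step limit is an added safeguard against endless loops.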

✨ What it can do

🧩 Use real tools (read · grep · write · edit · multi-edit · bash · list-models) •
⚡ run independent tool calls in parallel •
👥 split work across 2 parallel agents •
🧠 remember facts across chats •
💾 resume any conversation •
🌊 stream answers live •
🔁 switch models on the fly •
🛑 cancel a running task with Ctrl+C •
🪶 auto-compact long chats.
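
This README does not say how auto-compaction works. One common approach, sketched here under assumptions, keeps the system prompt and the most recent messages, then folds everything older into a single summary once an estimated token budget is exceeded. `Message`, `estimateTokens`, `compact`, the 4-characters-per-token estimate, and the `summarize` callback are all hypothetical.

```typescript
// Hypothetical sketch of context compaction; not the project's real code.
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };

// Crude estimate: roughly 4 characters per token (an assumption, not a tokenizer).
const estimateTokens = (msgs: Message[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

function compact(
  history: Message[],
  budget: number,
  keepRecent: number,
  summarize: (old: Message[]) => string, // e.g. a call to the local model
): Message[] {
  if (estimateTokens(history) <= budget) return history; // still fits, leave untouched
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  if (rest.length <= keepRecent) return history; // nothing old enough to fold away
  const cut = rest.length - keepRecent;
  const summary: Message = {
    role: "assistant",
    content: `Summary of earlier conversation: ${summarize(rest.slice(0, cut))}`,
  };
  // System prompt first, then the summary, then the untouched recent turns.
  return [...system, summary, ...rest.slice(cut)];
}
```

Keeping the most recent turns verbatim matters for small models, which tend to lose track of the current step when only a summary remains.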

🎯 Built for

Developers who want a private, air-gapped coding agent · teams avoiding cloud AI
costs and data exposure · anyone running local LLMs who wants real actions, not
just chat · learners who want a clean, readable agent architecture to study.

📖 Documentation

This README is the front door. Everything else lives in a focused document:

Looking for…	Go to
Install, run & use it — step-by-step, plain-English	USER-GUIDE.md
Architecture + data-flow diagram, project structure, technology choices, dependencies	THIRD_PARTY_NOTICES.md
Contributing & the dev/test workflow	CONTRIBUTING.md
License (Apache-2.0)	LICENSE

🚀 Try it in 60 seconds

ollama serve     # start your local model server
npm start        # launch the agent (interactive)

Then just type what you want in plain English. Full walkthrough, options, and
examples → USER-GUIDE.md.

💡 Using it to the fullest

The small things that make it genuinely useful day to day:

🗣️ It holds the whole conversation. In the interactive REPL the agent remembers
everything said so far, so follow-ups just work:

> Read config.ts and tell me what it does.
> Now add a comment at the top explaining that.   ← it knows "that" = config.ts

⏸️ Stop now, continue later — right where you left off. Every chat is saved and the
REPL prints a session id at startup:

npm start -- --list-sessions       # find a past chat
npm start -- --resume <id>         # reopen it with full context restored
# inside the REPL:  /sessions  (list)    /new  (start fresh)

🧠 It remembers across different chats. Tell it a durable fact and it can recall
that fact later, even in a brand-new conversation:

> Remember that our build output goes in the dist folder.
# …a new chat, another day:
> Where does the build output go?        →  "the dist folder"

👥 Hand a big job to a team (max 2 at once).

npm start -- --multi "add a header comment to every file in src, in parallel"
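
A "max 2 at once" cap like the one above can be enforced with a small worker pool. The following is a minimal sketch, assuming each sub-agent is simply an async function. `runLimited` is a hypothetical helper, not the project's delegation API.

```typescript
// Hypothetical sketch: run async tasks with at most `limit` in flight.
// Results are returned in the same order as the input tasks.
async function runLimited<T>(
  tasks: Array<() => Promise<T>>,
  limit = 2,
): Promise<T[]> {
  const results = new Array<T>(tasks.length);
  let next = 0; // index of the next unclaimed task

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // claimed synchronously, so no two workers share a task
      const task = tasks[i];
      if (task !== undefined) results[i] = await task();
    }
  }

  // Start up to `limit` workers; each pulls tasks until none are left.
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

Each worker picks up its next task as soon as its current one finishes, so a slow task does not hold up the rest of the queue.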

🔁 Switch brains mid-chat. Use the fast 7B model for everyday work and the bigger
30B model for hard problems:

/model qwen3-coder:30b        (/model qwen2.5-coder:7b to switch back)

🎚️ Decide how often it checks with you:

/mode acceptEdits   # stop confirming each edit (dangerous commands still blocked)
/mode plan          # look only, change nothing
/mode default       # ask before every change (the safe default)
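
One way to read the three modes is as a small decision table. The sketch below is an interpretation of the comments above, not the project's implementation: `gate`, the `Action` shape, the choice to allow reads without asking, and the example dangerous-command patterns are all hypothetical.

```typescript
// Hypothetical sketch of a mode-aware permission gate; not the project's real code.
type Mode = "default" | "acceptEdits" | "plan";
type Action =
  | { kind: "read"; path: string }
  | { kind: "edit"; path: string }
  | { kind: "bash"; command: string };
type Decision = "allow" | "ask" | "block";

// Illustrative patterns only; a real blocklist would need to be far more thorough.
const DANGEROUS: RegExp[] = [/\brm\s+-rf\s+\//, /\bmkfs\b/, /\bdd\s+if=/];

function gate(mode: Mode, action: Action): Decision {
  if (action.kind === "bash") {
    const { command } = action;
    if (DANGEROUS.some((re) => re.test(command))) return "block"; // blocked in every mode
  }
  if (action.kind === "read") return "allow"; // assumption: looking never needs approval
  if (mode === "plan") return "block"; // look only, change nothing
  if (mode === "acceptEdits" && action.kind === "edit") return "allow";
  return "ask"; // default: confirm before every change
}
```

Checking the blocklist first means no mode can loosen it, which matches the note that dangerous commands stay blocked even under acceptEdits.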

⚡ One-and-done for a quick task:

npm start -- "list the files here and tell me which is largest"

⌨️ REPL commands: /model <tag> · /mode <mode> · /models · /sessions · /new · /exit

Prefer a full, step-by-step walkthrough? → USER-GUIDE.md.

⭐ At a glance

162 tests · 0 dependencies · no build step · 100% offline · Apache-2.0 licensed.

🔎 Keywords

local LLM agent · offline AI coding assistant · Ollama · Qwen ·
qwen2.5-coder · qwen3-coder · autonomous coding agent · tool-using LLM ·
agentic loop · ReAct · function calling · multi-agent · zero-dependency ·
TypeScript · Node.js · private AI · on-device AI · self-hosted AI ·
no API key · terminal / CLI AI · local inference · code generation ·
developer tool

💡 For real discoverability, also set these as the repository Description and
Topics in GitHub's “About” panel. That, plus this README, is what search engines
and GitHub search index.

Apache-2.0 Licensed · Designed to run entirely on your machine 🖥️
