Weld

A local codebase graph for AI coding agents. Weld scans code, docs, CI, build
files, runtime configs, and repo boundaries into a deterministic graph. Agents
can query this graph through CLI or MCP instead of rediscovering the repository
from scratch every session.

The graph lives on disk (.weld/graph.json), stays under your control, and
answers the questions agents and humans repeatedly ask about a codebase: where
a capability lives, which docs are authoritative, what build and test surfaces
a change touches, and what boundaries constrain the implementation.

Evaluators: start with v0.19.1, the current recommended starting
point. Headline features added since v0.14.0:
a 14-tool MCP server for graph-backed agent context
(weld_query, weld_find, weld_context, weld_path,
weld_brief, weld_stale, weld_callers, weld_references,
weld_export, weld_trace, weld_impact, weld_enrich,
weld_diff, weld_review); wd impact blast-radius queries
driven by node, file list, working tree, or git diff range, with a
stale-graph gate; wd review JSON-first triage for speculative
edges; an end-to-end C# strategy stack (solution/project parsing,
MSBuild targets, test-framework detection, ASP.NET routes, EF Core,
inheritance edges, per-method call graphs) that auto-wires on
wd init when matching artifacts are present; multi-language
origin classification, Bazel srcs / deps edges, Dockerfile and
Compose copy edges, and multi-language test-peer edges across
Python, Go, TypeScript / JavaScript, Rust, Java, C#, and C++;
wd communities topic-level navigation of large graphs; opt-in
eager inverted-index aggregation for faster cold-cache queries on
large federations; a C++ amalgamation-file rank boost so single-file
headers (e.g. nlohmann/json) surface ahead of incidental mentions;
alias-aware lookup that resolves legacy node IDs through one minor
version; and human-readable text output by default for the
retrieval surface, with --json available for tools and the MCP
server. ROS2 is labeled Tier 2 (preview) until its own harness pass
runs; the Tier 1 languages (Python, C#, Java, C++) follow
the Tier-1 language support contract for entrypoints, modules,
call graphs, test peers, and origin classification. See
CHANGELOG.md for the per-release entries from
v0.15.0 onward.

Try it in 5 minutes → docs/tutorial-5-minutes.md
walks through wd init, discover, brief, query, context, and path
against demo workspaces. Spin up a clean demo with one command:

scripts/create-polyrepo-demo.sh /tmp/weld-polyrepo-demo
# or
scripts/create-monorepo-demo.sh /tmp/weld-monorepo-demo

Each script materializes a self-contained demo directory with seeded source
files, .weld configs, and committed git history -- ready for wd discover.
If you have Weld installed but no source checkout, the same demos are
available through the CLI: wd demo list, wd demo monorepo --init <dir>,
wd demo polyrepo --init <dir>.

Use Weld when…

your repo is too large for an agent to understand in one pass
your system spans multiple repositories
architecture is spread across code, docs, CI, configs, and service contracts
you want reproducible repo context instead of ad-hoc chat memory

When not to use Weld

Your repo is small (under ~50 files). An agent can read it end-to-end;
a graph adds overhead without payoff.
grep plus your IDE already answers your questions. If nothing is
missing from that workflow, Weld has nothing to add.
You only need symbol navigation. Go-to-definition and find-references
are an LSP job. Weld covers architecture, contracts, docs, and CI -- not
IDE jump-to.
You expect compiler-grade static analysis. Weld is a pragmatic graph,
not a type checker or dataflow engine. It will not catch every reference
or prove correctness.
You do not want repo-local configuration. Weld writes config to
.weld/ (discover.yaml, workspaces.yaml, strategies/) and
expects that config to be committed alongside your code. Generated
graphs (graph.json, agent-graph.json) are gitignored by default;
the opt-in wd init --track-graphs team workflow commits them for
warm-CI / warm-MCP setups instead. If even committing config is
unacceptable, Weld is the wrong tool.

How Weld compares

Weld is not a replacement for the tools below -- it sits alongside them and
gives agents a persistent, queryable map of the repository. Each of these
tools is excellent at what it does; Weld adds the connected structure they
were not designed to provide.

Tool	Gives you	Weld adds
grep / ripgrep	Fast literal and regex search over file contents.	Typed nodes and edges -- a symbol, route, doc, or config is an addressable entity with neighbours, not a line of text.
ctags / LSP	Symbol navigation and go-to-definition inside one language.	A cross-language graph that also covers docs, CI, configs, service contracts, and repo boundaries -- surfaces an IDE was never meant to index.
Sourcegraph	Hosted code search and references across large fleets of repos.	A local, repo-local graph that lives next to your code. By default Weld tracks only config and lets you opt in (`wd init --track-graphs`) to commit the generated graph for warm-CI / warm-MCP setups. No server, no indexing fleet; agents query it offline through CLI or MCP.
vector DB / RAG	Embedding-based semantic recall over chunks of text.	Deterministic structure. Query results are exact nodes and edges with provenance, not top-k fuzzy matches, so agents can follow relationships instead of guessing.
Copilot / Claude Code / OpenCode	In-editor and agentic code generation and chat.	Shared repo context those agents can read through MCP -- the same graph across sessions and tools, instead of each agent rediscovering the repo on every run.

Key features

Whole-codebase discovery — not just source code. Covers docs, config,
CI workflows, infrastructure, and build files.
Startup and runtime flow — models common Python, C#/.NET, and C++ entrypoints
and connects them to services, boundaries, and deploy/runtime surfaces.
Config-driven — point .weld/discover.yaml at your repo and tune
what gets extracted.
Multi-language — tree-sitter strategies ship for Python, TypeScript/JS,
Go, Rust, C#, C++, Java, and ROS2. The tree-sitter grammar packages are an
optional extra (pip install configflux-weld[tree-sitter]); without
them only Python is extracted, using the built-in extractor. See
Supported languages for the per-language status
and the optional libclang path for C++.
Plugin architecture — drop a .py file in .weld/strategies/ to
extract anything repo-specific.
Agent Graph — discover agents, skills, prompts, commands, hooks,
instructions, MCP servers, and platform-specific copies into
.weld/agent-graph.json; see the
Agent Graph guide for node and edge types,
authority/drift, and limitations, and the
platform support matrix for tested surfaces.
Agent-native — generates MCP config snippets by default and ships an
optional stdio MCP server so Claude Code, Codex, and other agents can query
the graph directly.
Zero external dependencies — runs from a plain checkout with Python >= 3.10.
Tree-sitter is optional.

Quickstart

# Install (recommended — see the Install section for alternatives)
uv tool install configflux-weld

# Bootstrap config for your repo
wd init

# Run discovery in safe mode and save the graph (see Trust model below)
wd discover --safe --output .weld/graph.json

# Query the graph
wd query "authentication"
wd trace "how does this service start"
wd find "login"
wd context file:src/auth/handler
wd viz --no-open
wd stale

Drop --safe once you trust the repository's project-local strategies and
external-JSON adapters; the Trust model section below explains
what --safe disables and when it is appropriate to remove.

Try it on a real example:
examples/04-monorepo-typescript (monorepo) ·
examples/05-polyrepo (polyrepo federation).

Sample output (wd query "auth" — default human form, trimmed):

# query: auth
  matches (1):
    1. symbol:src/auth/handler.py:authenticate  [type: function]
       label: authenticate
       description: Validate a bearer token and return the caller identity.
  neighbors (1):
    - route:/login  [type: route]

All wd retrieval commands default to human-readable text and accept
--json for the stable JSON envelope. Pass --json when
piping to jq or other scripted consumers — the schema is unchanged
from the previous release. Sample wd query "auth" --json:

{
  "query": "auth",
  "matches": [
    {
      "id": "symbol:src/auth/handler.py:authenticate",
      "label": "authenticate",
      "type": "function",
      "props": {
        "file": "src/auth/handler.py",
        "exports": ["authenticate"],
        "description": "Validate a bearer token and return the caller identity."
      }
    }
  ],
  "neighbors": [{"id": "route:/login", "type": "route"}],
  "edges": [
    {"from": "route:/login", "to": "symbol:src/auth/handler.py:authenticate", "type": "calls"}
  ]
}
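
For scripted consumers, the envelope can be read with any JSON library. The following is a minimal sketch, not part of Weld: it embeds a trimmed copy of the sample envelope above and builds a lookup of incoming edges per node. The helper name `incoming_edges` is hypothetical, and only fields visible in the sample are assumed to exist.

```python
import json

# Trimmed copy of the sample `wd query "auth" --json` envelope shown above.
SAMPLE = """
{
  "query": "auth",
  "matches": [
    {"id": "symbol:src/auth/handler.py:authenticate",
     "label": "authenticate",
     "type": "function"}
  ],
  "neighbors": [{"id": "route:/login", "type": "route"}],
  "edges": [
    {"from": "route:/login",
     "to": "symbol:src/auth/handler.py:authenticate",
     "type": "calls"}
  ]
}
"""


def incoming_edges(envelope: dict) -> dict[str, list[tuple[str, str]]]:
    """Map each target node ID to (edge type, source node ID) pairs."""
    index: dict[str, list[tuple[str, str]]] = {}
    for edge in envelope.get("edges", []):
        index.setdefault(edge["to"], []).append((edge["type"], edge["from"]))
    return index


envelope = json.loads(SAMPLE)
index = incoming_edges(envelope)
for match in envelope["matches"]:
    for edge_type, source in index.get(match["id"], []):
        print(f"{source} --{edge_type}--> {match['id']}")
```

In real use the envelope would come from the standard output of `wd query ... --json` rather than an embedded string.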

See Install for alternatives (local checkout, pip, raw source).

Agent Graph for AI customizations

Weld also maps the AI customization layer around a repository: agents, skills,
instructions, prompts, commands, hooks, MCP servers, tool permissions, and
platform variants. The Agent Graph is static and repo-bound; discovery reads
known customization files and does not execute project code.

wd agents discover
wd agents list
wd agents audit
wd agents explain planner
wd agents impact .github/agents/planner.agent.md
wd agents plan-change "planner should always include test strategy"
wd agents viz --no-open

Use --json on list, explain, impact, audit, and plan-change for
agent-friendly output. Use wd agents rediscover when you want an explicit
refresh of .weld/agent-graph.json before inspecting the persisted graph.
Use wd agents viz after discovery to open a local read-only browser explorer
for the persisted Agent Graph.
Static discovery and configuration generation are available for several
agent platforms; runtime validation is tracked per client in the
platform support matrix. The
Agent Graph guide documents node and edge types,
authority and drift, and the read-only-first policy.

Agent-first onboarding

If an agent or coding assistant is driving setup, use the short bootstrap
path:

uv tool install configflux-weld   # recommended — see Install for alternatives
wd prime                  # show setup status + per-framework surface matrix
wd bootstrap claude       # writes .claude/commands/weld.md
wd bootstrap codex        # writes .codex/skills/weld/SKILL.md + .codex/config.toml
wd bootstrap copilot      # writes .github/skills/weld/SKILL.md + .github/instructions/weld.instructions.md
wd bootstrap cursor       # writes .cursor/rules/weld.mdc + .cursor/mcp.json
wd bootstrap aider        # writes CONVENTIONS.md + .aider.conf.yml (wiki fallback; no MCP)
wd bootstrap gemini-cli   # writes .gemini/skills/weld.md + .gemini/mcp.json
wd bootstrap copilot-cli  # writes .copilot/skills/weld.md + .copilot/config.json

Cursor, Gemini CLI, and Copilot CLI register the local weld stdio MCP server
in the host-native config file. Aider has no native MCP protocol, so its
CONVENTIONS.md stanza points at the agent-readable wiki export: run
wd export --format=wiki --output=.weld/wiki and read .weld/wiki/index.md
to navigate the graph.

All seven wd bootstrap frameworks accept opt-out flags:

--no-mcp — skip the MCP pair (.codex/config.toml for codex; the .mcp.json guidance block for copilot/claude).
--no-enrich — write the .cli.md variant that omits wd enrich.
--cli-only — shorthand for --no-mcp --no-enrich.

To upgrade existing bootstrap files after pulling a new weld release, use
the diff-aware upgrade path:

wd bootstrap <framework> --diff — print unified diffs between bundled
templates and your on-disk copies without writing. Exits 1 when any
file differs, 0 otherwise, so it composes with CI checks.
wd bootstrap <framework> --force — overwrite targeted files while
still honouring the opt-out (--no-mcp, --no-enrich, --cli-only)
and federation template behaviour.
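
Because `--diff` signals drift through its exit code, a CI job can loop over the frameworks a repo uses. The sketch below is one hedged way to do that from Python; the function name `drifted_frameworks` and the injectable `run` parameter are illustrative assumptions, and treating any exit code other than 0 or 1 as an error is a choice made here, not documented Weld behaviour.

```python
import subprocess
from typing import Callable, Iterable


def drifted_frameworks(
    frameworks: Iterable[str],
    run: Callable = subprocess.run,
) -> list[str]:
    """Return the frameworks whose on-disk bootstrap files differ from the templates."""
    drifted = []
    for name in frameworks:
        result = run(
            ["wd", "bootstrap", name, "--diff"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 1:
            # Documented meaning of exit code 1: at least one file differs.
            drifted.append(name)
        elif result.returncode != 0:
            # Assumption: anything else is an unexpected failure worth surfacing.
            raise RuntimeError(f"wd bootstrap {name} --diff exited {result.returncode}")
    return drifted
```

A CI step could call `drifted_frameworks(["claude", "codex"])` and fail the build when the returned list is non-empty.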

wd prime is idempotent and safe to re-run — it reports what is
already configured and what is still missing. Pass
--agent {auto,claude,codex,copilot,all} to force the active agent's row
into the matrix even when that framework has no files yet (e.g. a Codex user
in a Claude-only checkout sees codex: skill no, mcp no -> wd bootstrap codex
instead of silence). auto is the default and infers the agent from
environment variables such as CODEX_*.

Trust model

Weld's trust posture is explicit and narrow:

Default: bundled discovery reads source files and writes the local
graph (.weld/graph.json). It does not execute discovered application
code and does not open network connections.
Safe mode: passing --safe disables
project-local strategies (.weld/strategies/) and the external_json
adapter for wd discover, and refuses network/LLM enrichment providers
for wd enrich. Pass wd discover --safe to scan an untrusted
repository without executing any code from it; pass wd enrich --safe
to refuse network egress (every currently registered provider —
Anthropic, OpenAI, Ollama, Copilot CLI — is refused). Safe mode produces a stable
[weld] safe mode: ... stderr line for each refused path.
Advanced strategies: project-local strategies are Python modules
loaded at discovery time, and strategy: external_json executes
configured commands from discover.yaml. Only enable these on
repositories you trust.

See SECURITY.md for the full policy and reporting process.

Local telemetry

Weld records the success or failure of every wd CLI invocation and every
MCP tool call to a local-only file. There is no remote endpoint and no
upload — the file never leaves your machine unless you explicitly export
and share it.

What is recorded. Each event is one JSON line with a strict allowlist
of fields: subcommand or tool name, exit code, duration in milliseconds,
and the exception class name on failure. Paths, query strings, error
messages, flag values, and usernames are never recorded. The redaction is
enforced at write time, so the file on disk is already safe to attach to a
bug report.

Where it lives. In a single repo, the file is
<repo>/.weld/telemetry.jsonl. In a polyrepo workspace, every event from
the root and from any child repo aggregates into
<workspace_root>/.weld/telemetry.jsonl — one shareable artifact per
workspace. Invocations outside any project (for example wd --version in
/tmp) fall back to ${XDG_STATE_HOME:-~/.local/state}/weld/telemetry.jsonl.
The file is gitignored and rotates at 1 MiB to keep the trailing 500 events.
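
The fallback location follows the usual shell expansion rule, where an unset or empty `XDG_STATE_HOME` both select the default. A small sketch of the same rule in Python is shown here for scripts that want to locate the file; `fallback_telemetry_path` is a hypothetical helper, not Weld's code, and the `path` subcommand listed under `wd telemetry --help` remains the authoritative way to resolve the location.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def fallback_telemetry_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Mirror ${XDG_STATE_HOME:-~/.local/state}/weld/telemetry.jsonl."""
    env = os.environ if env is None else env
    # `:-` in shell falls back when the variable is unset OR empty, hence `or`.
    state_home = env.get("XDG_STATE_HOME") or str(Path.home() / ".local" / "state")
    return Path(state_home) / "weld" / "telemetry.jsonl"


print(fallback_telemetry_path())
```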

How to opt out. Any one of these disables recording:
WELD_TELEMETRY=off in the environment, the --no-telemetry flag on a
single invocation, or wd telemetry disable to write a persistent sentinel
at the resolved root. Run wd telemetry --help for the full subcommand
surface (status, show, path, export, clear, disable, enable),
and see docs/telemetry.md for the full event
schema and design rationale.

Supported languages

Weld's only built-in extractor is for Python. Every other language
listed below depends on the [tree-sitter] optional extra. Without
it, the tree-sitter strategies silently no-op on ImportError and the
graph will contain zero nodes for those languages — by design, so weld
still runs in a minimal environment. Install the extra to actually use
multi-language support:

uv tool install "configflux-weld[tree-sitter]"
# or
pip install "configflux-weld[tree-sitter]"
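
Since a missing grammar package produces zero nodes rather than an error, it can help to check what is importable before trusting an empty result. The sketch below is an assumption-laden helper, not a Weld command: it assumes each grammar package from the table below imports under its name with hyphens replaced by underscores (for example `tree_sitter_c_sharp`).

```python
import importlib.util

# Grammar package names as listed in the language table.
GRAMMAR_PACKAGES = [
    "tree-sitter-typescript",
    "tree-sitter-javascript",
    "tree-sitter-go",
    "tree-sitter-rust",
    "tree-sitter-c-sharp",
    "tree-sitter-cpp",
    "tree-sitter-java",
]


def module_name(package: str) -> str:
    """Assumed mapping from distribution name to importable module name."""
    return package.replace("-", "_")


def is_importable(module: str) -> bool:
    """True when the module can be located without importing it."""
    try:
        return importlib.util.find_spec(module) is not None
    except (ImportError, ValueError):
        return False


def grammar_status(packages=GRAMMAR_PACKAGES) -> dict[str, bool]:
    """Report, per grammar package, whether its module is importable."""
    return {pkg: is_importable(module_name(pkg)) for pkg in packages}


for package, available in grammar_status().items():
    print(f"{package}: {'available' if available else 'missing'}")
```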

Status ladder. Every language is classified on a single ladder:
Tier 1 (passes the binding tier-check harness criteria on the
pinned corpora; description-coverage is measured and reported as an
advisory signal rather than a gate, because enrichment quality reflects
LLM provider output rather than weld discovery) → Tier 2 (ships and
is usable; fails one or more binding criteria with disclosed gaps) →
Preview (ships with documented correctness issues; not for
production use) → Experimental (opt-in extra, off by default) →
Not supported. Languages move tiers only via tier-check harness
output, not by editorial claim. C#, Python, Java, and C++ are
currently the Tier 1 languages; the other languages remain at Tier 2
pending per-language harness runs.

Language	Extraction surface	Grammar package	Status
Python	modules, classes, functions, imports, call graph	built-in (no extra)	Tier 1
TypeScript	exports, classes, imports	`tree-sitter-typescript`	Tier 2
JavaScript	exports, classes, imports	`tree-sitter-javascript`	Tier 2
Go	exports, types, imports	`tree-sitter-go`	Tier 2
Rust	exports, types, imports	`tree-sitter-rust`	Tier 2
C#	types, methods, properties, attributes, namespaces, using dependencies, best-effort call graph	`tree-sitter-c-sharp`	Tier 1
C++	classes, structs, namespaces, functions, methods, inherits edges, includes, CMake build targets, best-effort call graph	`tree-sitter-cpp`	Tier 1
Java	classes, interfaces, methods, fields, constructors, annotations, imports, inherits / implements edges	`tree-sitter-java`	Tier 1

Frameworks (reuse a host language's extractor; status is listed per
framework and can trail the host language until the framework's own harness
pass runs):

Framework	Host language	Extraction surface	Status
ROS2	C++ / Python	packages, nodes, topics, services, actions, parameters	Preview

Discovery also adds deterministic closure edges from files to source-backed
symbols and from import/include/use declarations to local files or external
package nodes across every listed language.

For non-preview tree-sitter languages, exact identifier queries such as
wd query GetAsync prefer first-class definition symbol: nodes before
owning files or package-level fallbacks. File results remain available when
the graph has no exact symbol candidate.

C++ — Tier 1 details

Status: Tier 1. The C++ extraction surface passes the binding
tier-check harness criteria against the pinned C++ corpora
(nlohmann/json, googletest, abseil-cpp, Kitware/CMake,
grpc/grpc); see docs/bench/tier1-cpp-baseline.md
for the per-criterion measurement snapshot. Promotion is anchored by
the bundled fixture contract gate, which exercises a Shape / Circle
/ Rectangle / Drawable inheritance tree under a real CMake project
layout.

C++ has two extraction paths:

Tree-sitter (default once [tree-sitter] is installed).
Indexes .hpp, .cpp, .cc, .h, .hh, .hxx, .cxx,
.ipp, .tpp files into file: and symbol: nodes. Emits
inherits edges originating at the derived-class symbol (so a
wd context on a concrete class surfaces its base classes
directly, not via the owning file). Query patterns live in
weld/languages/cpp.yaml. This is the
fast path: it requires no compilation database and is the path the
tier-check harness measures.
libclang (optional, off by default). Adds macro-expansion,
template-instantiation, and cross-translation-unit call edges
that tree-sitter cannot resolve from a syntactic walk alone.
Requires:
- pip install "configflux-weld[cpp-libclang]" (Python bindings)
- A compile_commands.json at the repo root, e.g.
  cmake -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
- WELD_CPP_LIBCLANG=1 in the environment that runs wd discover
When any prerequisite is missing the libclang strategy silently
returns no nodes — tree-sitter still runs.
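
Because the libclang strategy fails silently, a preflight check can make a missing prerequisite visible. The following sketch restates the three requirements above as a function; it is not shipped with Weld, the helper name is hypothetical, and it assumes the Python bindings import under the top-level module name `clang`.

```python
import importlib.util
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def missing_libclang_prerequisites(
    repo_root, env: Optional[Mapping[str, str]] = None
) -> list[str]:
    """List which of the three documented prerequisites are not satisfied."""
    env = os.environ if env is None else env
    missing = []
    # Assumption: the [cpp-libclang] extra installs bindings importable as `clang`.
    if importlib.util.find_spec("clang") is None:
        missing.append("Python bindings (configflux-weld[cpp-libclang])")
    if not (Path(repo_root) / "compile_commands.json").is_file():
        missing.append("compile_commands.json at the repo root")
    if env.get("WELD_CPP_LIBCLANG") != "1":
        missing.append("WELD_CPP_LIBCLANG=1")
    return missing


# Demo against a throwaway directory: first empty, then with two prerequisites met.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_libclang_prerequisites(tmp, env={})
    (Path(tmp) / "compile_commands.json").write_text("[]", encoding="utf-8")
    after = missing_libclang_prerequisites(tmp, env={"WELD_CPP_LIBCLANG": "1"})

print("before:", before)
print("after:", after)
```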

A CMake build-graph strategy (cpp_cmake) parses each
CMakeLists.txt and emits project:, build-target:, and
package: nodes plus depends_on edges so internal target
dependencies (target_link_libraries) and find_package declarations
are queryable as first-class graph entries.

Framework markers. The C++ framework strategies declare
ros2, cmake, conan, gtest, and catch2 markers; the
tier-check harness reports them stub-by-design when a corpus is a
plain library that does not consume a C++ test or robotics
framework in its public surface. Corpora that do consume them
(downstream applications, services, ROS2 packages) light up
criterion 3 directly. If you adopt C++ support and measure it
against your own corpus, please share the numbers — public
measurements are the fastest way to keep the harness honest.

To use the built-in semantic enrichment providers:

uv tool install "configflux-weld[openai]"     # or [anthropic], [ollama], or [llm]

The copilot-cli provider needs no Python extra — install the standalone
GitHub Copilot CLI binary (copilot) and run
wd enrich --provider copilot-cli. Set WELD_COPILOT_BINARY to override
the binary path.

First-run enrichment prompt

When you run wd discover for the first time and weld detects an
enrichment provider through the usual environment variables (the
Anthropic API key, the OpenAI API key, an OLLAMA_HOST value, or
the copilot binary on PATH), discover prints a cost-honest
prompt with the estimated dollar range and asks whether to run
enrichment now. The answer is persisted to
.weld/.enrichment-prompted and the prompt is not re-shown.

wd discover --no-enrich skips the prompt for one invocation.
WELD_NO_ENRICH=1 skips it globally (CI-friendly).
wd discover --safe implies skip (network/LLM calls forbidden).
wd enrich --reset-prompt clears .weld/.enrichment-prompted so
the next wd discover re-asks (useful after configuring a
provider for the first time).

Graphs over 2,000 nodes are excluded from the automatic flow: the message
points you at the explicit wd enrich --batch=N path instead. Inside an
agent harness (Claude Code, Cursor, Codex, etc.) with no provider
configured, the prompt is replaced by a tip to run /enrich-weld.
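
The rules in this section can be summarized as a small decision function. This is only a reading aid under stated assumptions: the names `DiscoverRun` and `enrichment_action` are hypothetical, and the precedence between rules (skip flags first, then provider detection, then graph size) is a guess at a sensible ordering rather than Weld's documented logic.

```python
from dataclasses import dataclass

# Graphs above this size are pointed at `wd enrich --batch=N` instead.
AUTO_FLOW_NODE_LIMIT = 2000


@dataclass
class DiscoverRun:
    provider_detected: bool          # a provider was found via env vars or PATH
    already_prompted: bool = False   # .weld/.enrichment-prompted exists
    no_enrich_flag: bool = False     # wd discover --no-enrich
    no_enrich_env: bool = False      # WELD_NO_ENRICH=1
    safe: bool = False               # wd discover --safe
    in_agent_harness: bool = False   # e.g. Claude Code, Cursor, Codex
    node_count: int = 0


def enrichment_action(run: DiscoverRun) -> str:
    """Return one of: 'skip', 'tip', 'suggest-batch', 'prompt'."""
    if run.safe or run.no_enrich_flag or run.no_enrich_env or run.already_prompted:
        return "skip"
    if not run.provider_detected:
        # With no provider, an agent harness gets the /enrich-weld tip instead.
        return "tip" if run.in_agent_harness else "skip"
    if run.node_count > AUTO_FLOW_NODE_LIMIT:
        return "suggest-batch"
    return "prompt"


print(enrichment_action(DiscoverRun(provider_detected=True, node_count=350)))
```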

For a source-checkout install (contributors editing Weld itself), see
CONTRIBUTING.md.

Agents can also enrich nodes without provider extras or API keys by reading the
relevant source or documentation and writing reviewed enrichment manually:

wd stale
wd context "<node-id>"
wd add-node "<node-id>" --type "<node-type>" --label "<label>" --merge --props '{"description":"...","purpose":"...","enrichment":{"provider":"manual","model":"agent-reviewed","timestamp":"<ISO-8601 UTC timestamp>","description":"...","purpose":"...","suggested_tags":["lowercase","tags"]}}'
wd graph validate
wd graph stats
wd graph communities --format markdown

Manual enrichment writes .weld/graph.json directly and can be overwritten by
a later wd discover --output .weld/graph.json; refresh discovery before manual
edits. Manual inferred edges should use explicit provenance such as
{"source": "manual"} after the relationship is verified from source content.
wd graph communities --write derives .weld/graph-communities.json,
.weld/graph-community-report.md, and .weld/graph-community-index.md
from the existing graph without modifying .weld/graph.json.
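
Hand-writing the `--props` JSON for `wd add-node` is error-prone, so an agent or script may prefer to build it programmatically. The sketch below assembles the same argument list as the command above; the helper name `add_node_command` and the example node values are illustrative, and the props layout simply mirrors the fields shown in that command.

```python
import json
import shlex
from datetime import datetime, timezone


def add_node_command(node_id, node_type, label, description, purpose, tags):
    """Build the argv for a manual-enrichment `wd add-node --merge` call."""
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    props = {
        "description": description,
        "purpose": purpose,
        "enrichment": {
            "provider": "manual",
            "model": "agent-reviewed",
            "timestamp": timestamp,  # ISO-8601 UTC
            "description": description,
            "purpose": purpose,
            "suggested_tags": [tag.lower() for tag in tags],
        },
    }
    return [
        "wd", "add-node", node_id,
        "--type", node_type,
        "--label", label,
        "--merge",
        "--props", json.dumps(props),
    ]


argv = add_node_command(
    "symbol:src/auth/handler.py:authenticate",  # example node ID from the Quickstart
    "function",
    "authenticate",
    "Validate a bearer token and return the caller identity.",
    "Entry point for request authentication.",  # illustrative purpose text
    ["Auth", "security"],
)
print(shlex.join(argv))  # shell-quoted form, ready to paste
```

Passing `argv` to `subprocess.run` avoids shell quoting altogether; `shlex.join` is only used here to show the equivalent command line.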

Without tree-sitter, the built-in Python module strategy and non-language
strategies (markdown, YAML, config, frontmatter) still work.

MCP

Weld generates MCP config snippets for Claude Code, VS Code, Cursor, and
Codex in the default install:

wd mcp config --client=claude
wd mcp config --client=vscode
wd mcp config --client=cursor

Running the stdio MCP server requires the optional MCP SDK extra:

uv tool install "configflux-weld[mcp]"
python -m weld.mcp_server --help

Point your client at python -m weld.mcp_server:

{"mcpServers": {"weld": {"command": "python", "args": ["-m", "weld.mcp_server"]}}}
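
When a client config file already lists other MCP servers, the weld entry should be merged in rather than pasted over the whole file. A minimal sketch of that merge follows; `with_weld_server` is a hypothetical helper, the `other` server in the demo is invented for illustration, and where the resulting JSON is stored depends on the client (see docs/mcp.md).

```python
import copy
import json

# Server entry taken from the snippet above.
WELD_SERVER = {"command": "python", "args": ["-m", "weld.mcp_server"]}


def with_weld_server(config: dict) -> dict:
    """Return a copy of `config` with the weld server added under mcpServers."""
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})["weld"] = copy.deepcopy(WELD_SERVER)
    return merged


existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
print(json.dumps(with_weld_server(existing), indent=2))
```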

See docs/mcp.md for the full tool reference, per-client
configs, example prompts, troubleshooting, and the exact dependency model. See
the platform support matrix for per-client support
status and runtime validation.

Discovery configuration

Weld is driven by .weld/discover.yaml. Each entry maps a file pattern
to an extraction strategy:

sources:
  - glob: "src/**/*.py"
    type: file
    strategy: python_module

  - glob: "docs/**/*.md"
    type: doc
    strategy: markdown

  - glob: ".github/workflows/*.yml"
    type: workflow
    strategy: yaml_meta

Run wd init to generate a starter config, or write one by hand. See
the Strategy Cookbook for the full list
of bundled strategies.
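
To preview which files each entry would select before running discovery, the glob patterns can be evaluated directly. The sketch below uses `pathlib` globbing as a stand-in; it assumes Weld's pattern semantics are close to `Path.glob`, which this README does not confirm, and `plan_discovery` is a hypothetical helper. The demo builds a throwaway directory so the block runs on its own.

```python
import tempfile
from pathlib import Path

# The same entries as the discover.yaml example above, as Python dicts.
SOURCES = [
    {"glob": "src/**/*.py", "type": "file", "strategy": "python_module"},
    {"glob": "docs/**/*.md", "type": "doc", "strategy": "markdown"},
    {"glob": ".github/workflows/*.yml", "type": "workflow", "strategy": "yaml_meta"},
]


def plan_discovery(root: Path, sources: list[dict]) -> dict[str, list[str]]:
    """Map each strategy name to the sorted relative paths its glob selects."""
    plan = {}
    for entry in sources:
        plan[entry["strategy"]] = sorted(
            path.relative_to(root).as_posix()
            for path in root.glob(entry["glob"])
            if path.is_file()
        )
    return plan


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("src/app.py", "src/auth/handler.py", "src/notes.txt", "docs/guide.md"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("", encoding="utf-8")
    plan = plan_discovery(root, SOURCES)

for strategy, files in plan.items():
    print(f"{strategy}: {files}")
```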

`.weld/.gitignore`

wd init and wd workspace bootstrap write a managed .weld/.gitignore
the first time they touch a .weld/ directory (idempotent — never
overwrites an existing file). Three policies are available:

Default — config-only. Tracks the source-of-truth config
(discover.yaml, workspaces.yaml, agents.yaml, strategies/,
adapters/, README.md) and ignores everything else weld writes,
including the generated graphs (graph.json, agent-graph.json),
graph-community reports (graph-communities.json,
graph-community-report.md, graph-community-index.md),
and per-machine state (discovery-state.json, graph-previous.json,
workspace-state.json, workspace.lock, query_state.bin). A
fresh contributor gets a clean git status after the first run.
Track-graphs (opt-in team workflow for warm CI / warm MCP). Pass
--track-graphs to widen the default so the canonical graphs are
committed alongside config. Use this when every contributor should
share a pre-built graph:
```
wd init --track-graphs
wd workspace bootstrap --track-graphs
```
Ignore-all (opt-in). Pass --ignore-all for early experimentation
or test installs where no weld state should be committed yet:
```
wd init --ignore-all
wd workspace bootstrap --ignore-all
```
This writes a heavy-handed * / !.gitignore so every weld file is
ignored.

--track-graphs and --ignore-all are mutually exclusive; passing both
is a usage error.

Migration from earlier versions. Pre-existing .weld/.gitignore
files written by older wd init / wd workspace bootstrap runs are
not rewritten — the helper is idempotent. To pick up the new
default, delete the file and re-run init:

rm .weld/.gitignore
wd init                  # config-only default, generated graphs ignored
# or wd init --track-graphs   to keep tracking the graphs as before

To opt out entirely, delete .weld/.gitignore after init. Only init and
bootstrap write the file, so it stays absent until the next run of either
command.

Custom strategies

Drop a Python file in .weld/strategies/ to extract repo-specific
artifacts. The strategy signature:

def extract(root: Path, source: dict, context: dict) -> StrategyResult:
    ...

See examples/02-custom-strategy for a
working example that extracts TODO comments as graph nodes, and
docs/extending-discovery.md for the
full step-by-step guide (contract, capability matrix, fixtures, and
a worked end-to-end walkthrough).
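
To give a feel for the shape of a strategy, here is a rough sketch of a TODO extractor in the spirit of the example mentioned above. Several things are assumptions rather than Weld's contract: `StrategyResult` below is a local stand-in dataclass (the real type comes from Weld), the `todo:` node ID scheme is invented, and `source` is assumed to carry the `glob` key from its discover.yaml entry. Consult docs/extending-discovery.md for the actual contract.

```python
import re
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class StrategyResult:
    """Hypothetical stand-in for Weld's real result type."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)


TODO_RE = re.compile(r"#\s*TODO\b[:\s]*(.+)")


def extract(root: Path, source: dict, context: dict) -> StrategyResult:
    """Emit one node per `# TODO` comment found in files matching source['glob']."""
    result = StrategyResult()
    for path in sorted(root.glob(source["glob"])):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = TODO_RE.search(line)
            if match:
                # Invented ID scheme: todo:<relative path>:<line number>
                result.nodes[f"todo:{rel}:{lineno}"] = {
                    "type": "todo",
                    "label": match.group(1).strip(),
                    "props": {"file": rel, "line": lineno},
                }
    return result


# Demo on a throwaway file so the sketch runs without a Weld checkout.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "src").mkdir()
    (demo_root / "src" / "auth.py").write_text(
        "def login():\n    # TODO: rate-limit failed attempts\n    return True\n",
        encoding="utf-8",
    )
    demo = extract(demo_root, {"glob": "src/**/*.py"}, {})

print(demo.nodes)
```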

Polyrepo Federation

Weld supports federated polyrepo workspaces where a root directory contains
several child git repositories, each owning its own .weld/ directory. The
root maintains a meta-graph of cross-repo relationships without duplicating
child content. Children remain portable and independently publishable.

Prerequisites

Each child repo has been initialized with wd init and has a
.weld/graph.json.
The workspace root directory contains the child repos as subdirectories
(nested git repositories).

Setting up a workspace

Run wd init at the workspace root. When nested git repositories are
detected, weld automatically scaffolds .weld/workspaces.yaml alongside
the usual discover.yaml:

cd ~/workspace-root
wd init                    # detects children, writes workspaces.yaml
wd init --max-depth 2      # limit scan depth for large directory trees

The --max-depth flag controls how many directory levels deep the scanner
looks for nested .git directories (default: 4).

workspaces.yaml format

The workspace registry lists every child repo and declares which cross-repo
resolvers are active:

version: 1
scan:
  max_depth: 4
  respect_gitignore: false
  exclude_paths: [.worktrees, vendor, "scratch/**", "generated/**/*.tmp"]
children:
  - name: services-api
    path: services/api
    tags:
      category: services
  - name: services-auth
    path: services/auth
    tags:
      category: services
cross_repo_strategies: [service_graph]

version: Schema version (currently 1).
scan: Controls automatic child detection. max_depth sets how deep
the scanner walks; respect_gitignore opts scan-only children into Git
ignore rules; exclude_paths lists directory names, relative paths, or
glob patterns to skip. Explicit children entries remain authoritative
even when gitignored.
children: Each entry has a path (relative to the workspace root)
and an optional name (auto-derived from the path if omitted, e.g.
services/api becomes services-api). Optional tags provide
category metadata; optional remote records a clone URL.
cross_repo_strategies: Ordered list of resolvers that produce
cross-repo edges in the root graph. Currently available: service_graph.
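
The name derivation described for children can be illustrated in a few lines. This sketch only reproduces the documented example (services/api becomes services-api) by joining path segments with hyphens; how Weld treats unusual characters or name collisions is not stated here, so `derive_child_name` and `child_name` are hypothetical helpers.

```python
from pathlib import PurePosixPath


def derive_child_name(path: str) -> str:
    """Join the path segments with '-' (services/api -> services-api)."""
    return "-".join(PurePosixPath(path).parts)


def child_name(entry: dict) -> str:
    """An explicit name wins; otherwise derive one from the path."""
    return entry.get("name") or derive_child_name(entry["path"])


children = [
    {"path": "services/api"},
    {"name": "auth", "path": "services/auth"},
]
for entry in children:
    print(child_name(entry), "->", entry["path"])
```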

Running discovery at the workspace root

cd ~/workspace-root
wd discover --safe --output .weld/graph.json

When workspaces.yaml is present, wd discover operates in federation
mode. It reads each child's .weld/graph.json, builds repo:<name> nodes
for every present child, and runs the declared cross-repo resolvers to emit
edges between children. Children that are missing, uninitialized, or corrupt
degrade gracefully -- they are skipped and recorded in the workspace ledger
but do not block discovery.

Federation also re-tags props.origin on cross-child symbol references:
a Python target imported from a sibling child whose strategy saw it as
external is promoted to project, so "hide third-party" filters do
not lose cross-repo application code.

Discovery is safe to run from a linked git worktree of the workspace root:
the federation pass falls back to the main worktree's checkout when sibling
child repos are not present at the worktree itself. As a
defense-in-depth guard, federated discover refuses to overwrite an existing
non-empty graph.json with a 0-node meta-graph; pass --allow-empty to
intentionally tear the workspace graph down.

Workspace status

Inspect the state of every registered child:

wd workspace status          # human-readable summary
wd workspace status --json   # raw JSON ledger

Example output:

Workspace status (3 children)
Counts: present=2, missing=1, uninitialized=0, corrupt=0
services-api: present (refs/heads/main a1b2c3d4e5f6)
services-auth: present dirty (refs/heads/feature-x 7890abcdef01)
services-worker: missing

Each child shows its lifecycle status, git branch, HEAD SHA prefix, and
whether the working tree is dirty.

Sentinel files

Weld uses two sentinel files to distinguish workspace roots from
single-repo projects:

File	Purpose
`.weld/workspaces.yaml`	Workspace registry -- lists children and cross-repo strategies
`.weld/workspace-state.json`	Workspace ledger -- lifecycle status, git SHA, graph hash per child

The presence of workspaces.yaml activates federation mode in wd discover.
workspace-state.json is written automatically during discovery and read by
wd workspace status.

When .weld/workspaces.yaml is present at the bootstrap target, wd bootstrap
appends a federation paragraph to the copilot skill/instruction, codex skill,
and claude command directing agents to pick a child via wd workspace status
before querying inside it.

Cross-repo resolvers

Resolvers are plugins that analyze child graphs and emit typed edges across
repo boundaries. They are declared in the cross_repo_strategies list in
workspaces.yaml and run in declaration order during root discovery.

Resolver	Description
`service_graph`	Matches HTTP client call sites in one repo to API endpoint definitions in another. Emits `invokes` edges with host, port, and path metadata.

Resolvers are read-only with respect to child graphs -- they never modify
a child's .weld/graph.json. Output edges are deterministic: identical
input produces byte-identical edges across runs.
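
Byte-identical output generally comes down to a fixed ordering and a fixed serialization. The sketch below shows one common way to get that property in Python; it is not Weld's implementation. The `from` / `to` / `type` keys follow the sample envelope in the Quickstart, while the `props` key and the example hosts are assumptions for illustration.

```python
import hashlib
import json


def canonical_edges_bytes(edges: list[dict]) -> bytes:
    """Serialize edges so that input order does not affect the output bytes."""
    ordered = sorted(
        edges,
        key=lambda e: (
            e["from"],
            e["to"],
            e["type"],
            json.dumps(e.get("props", {}), sort_keys=True),
        ),
    )
    # sort_keys and fixed separators remove dict-order and whitespace variation.
    return json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode("utf-8")


edges = [
    {"from": "repo:services-api", "to": "repo:services-auth", "type": "invokes",
     "props": {"host": "auth.internal", "port": 8080, "path": "/login"}},
    {"from": "repo:services-api", "to": "repo:services-auth", "type": "invokes",
     "props": {"path": "/logout", "port": 8080, "host": "auth.internal"}},
]

digest_a = hashlib.sha256(canonical_edges_bytes(edges)).hexdigest()
digest_b = hashlib.sha256(canonical_edges_bytes(list(reversed(edges)))).hexdigest()
print(digest_a == digest_b)
```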

Performance: opt-in eager query aggregation

For high-QPS query callers (long-lived MCP servers, batch evaluators)
the federation can pre-aggregate every fresh-sidecar child's
inverted index into a single in-memory dict at construction time.
Per-query latency then drops by 40-90% on a 30-child workspace, at
the cost of ~17 ms construction overhead. Default is off so single-shot
wd query invocations do not pay the tax. Two opt-in knobs:

Constructor: FederatedGraph(root, eager_index=True).
Environment variable: WELD_FEDERATION_EAGER=1 (truthy values:
1, true, yes, on; case-insensitive). Lets operators flip
eager on without code changes.

Stale or missing-sidecar children keep the existing per-query fallback
path; the eager index covers only fresh-sidecar children. Match sets
are byte-identical to the lazy path.
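
The truthy-value rule for WELD_FEDERATION_EAGER can be sketched as follows. This is an illustration of the documented values only; the helper name `eager_enabled` is hypothetical, and how Weld treats surrounding whitespace or other values is not specified here.

```python
import os

# The four documented truthy spellings, compared case-insensitively.
_TRUTHY = {"1", "true", "yes", "on"}


def eager_enabled(environ=None) -> bool:
    """Return True when WELD_FEDERATION_EAGER holds a documented truthy value."""
    env = os.environ if environ is None else environ
    return env.get("WELD_FEDERATION_EAGER", "").lower() in _TRUTHY
```

Passing a plain dict instead of the process environment makes the rule easy to check in isolation, for example `eager_enabled({"WELD_FEDERATION_EAGER": "Yes"})`.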

Rollback

To disable federation and return to single-repo behavior, delete the
workspace registry:

rm .weld/workspaces.yaml

This returns Weld to legacy single-repo discovery at the root. Child
repositories are untouched -- each child's .weld/ directory, graph, and
configuration remain intact and continue to work independently.

Optionally, remove the generated ledger as well:

rm .weld/workspace-state.json

CLI reference

Command	Description
`wd init`	Bootstrap `.weld/discover.yaml` (and `workspaces.yaml` when nested repos are detected); seed managed `.weld/.gitignore` (config-only default ignores generated graphs)
`wd init --max-depth N`	Limit nested repo scan depth during init (default: 4)
`wd init --respect-gitignore`	Skip scan-only nested repos ignored by Git when writing `workspaces.yaml`; explicit children can still be added later
`wd init --track-graphs`	Seed `.weld/.gitignore` so canonical graphs (`graph.json` + `agent-graph.json`) stay tracked alongside config (warm-CI / warm-MCP workflow)
`wd init --ignore-all`	Write a fully-ignoring `.weld/.gitignore` instead of the config-only default; mutually exclusive with `--track-graphs`
`wd discover`	Run discovery, emit graph JSON (federation mode when `workspaces.yaml` is present); on success prints a one-line stderr summary `wrote N nodes / M edges -> path (T.Ts)`, suppressed by `--quiet`
`wd agents discover`	Scan AI customization assets and write `.weld/agent-graph.json`; text mode summarizes diagnostics per code and `--show-diagnostics` dumps the full list inline
`wd agents rediscover`	Refresh `.weld/agent-graph.json` from a new static scan
`wd agents list`	List discovered AI customization assets from `.weld/agent-graph.json`
`wd agents explain <asset>`	Explain one AI customization asset and its graph relationships
`wd agents impact <asset>`	Show affected Agent Graph assets for a proposed customization change
`wd agents audit`	Audit AI customization assets for static consistency issues
`wd agents plan-change "<request>"`	Plan a static AI customization behavior change
`wd agents viz`	Local read-only browser explorer for `.weld/agent-graph.json`
`wd workspace status`	Show workspace child ledger: lifecycle status, git ref, dirty state
`wd workspace status --json`	Emit the raw `workspace-state.json` payload
`wd workspace bootstrap`	One-shot polyrepo bootstrap: init root + every nested child, recurse-discover, rebuild root meta-graph (config-only `.weld/.gitignore` default)
`wd workspace bootstrap --respect-gitignore`	Skip scan-only child repos ignored by Git and persist `scan.respect_gitignore: true` into `workspaces.yaml`
`wd workspace bootstrap --track-graphs`	Bootstrap and seed `.weld/.gitignore` in root and every child to track canonical graphs alongside config
`wd workspace bootstrap --ignore-all`	Bootstrap and write a fully-ignoring `.weld/.gitignore` in root and every child; mutually exclusive with `--track-graphs`
`wd build-index`	Regenerate file index
`wd query <term>`	Hybrid-ranked tokenized graph search (strict-AND first; OR fallback when AND yields nothing on multi-word phrases — envelope is tagged with `degraded_match=or_fallback`)
`wd find <term> [--limit N]`	Broad file-token search, separate from graph discovery; each hit carries an integer `score` (default `--limit 20`)
`wd context <id>`	Node + neighborhood
`wd path <from> <to>`	Shortest path
`wd trace <term>`	Startup/runtime and interaction slice around a term or node
`wd impact <path-or-node>`	Reverse-dependency blast radius
`wd capabilities`	Runtime per-language / per-framework support matrix (`--json`, `--missing`)
`wd callers <symbol>`	Direct/transitive callers
`wd viz`	Local read-only browser graph explorer (sidebar toggles: Hide standard library, Hide third-party dependencies — see Filtering noise in `wd viz`)
`wd stale`	Check graph freshness
`wd <read-cmd> --no-refresh`	Skip the auto-refresh that runs when the graph is stale; a warning is emitted to stderr. Set `WELD_AUTO_REFRESH=0` to disable globally for CI / batch runs.
`wd graph stats`	Graph statistics
`wd graph communities [--format json\|markdown] [--top N] [--write]`	Detect deterministic graph communities, report top-level hubs, and optionally write derived JSON/report/index artifacts (unresolved-symbol nodes are excluded from the projected subgraph)
`wd stats`	Backward-compatible alias for `wd graph stats`
`wd graph validate`	Validate graph against the contract
`wd graph validate-fragment <file>`	Validate imported graph fragments and warn on trace-inert semantics
`wd validate`	Backward-compatible alias for `wd graph validate`
`wd migrate --add-confidence`	Backfill missing edge `confidence` props (`definite` / `inferred` / `speculative`) by classifying each edge from its `source_strategy`; strategies without a declared default land at `speculative`. Writes the graph back and emits a JSON report `{filled, unchanged, invalid}`.
`wd doctor`	Check setup health; exits 0 in directories that are not Weld projects yet
`wd prime`	Setup status + per-framework agent surface matrix (skill / instruction / mcp) with fix commands; `--agent {auto,claude,codex,copilot,all}` forces an agent row even when its framework files are absent
`wd scaffold`	Write starter templates
`wd bootstrap`	Agent onboarding files
`wd brief`	Agent context briefing
`wd enrich`	LLM-assisted semantic enrichment
`wd lint`	Lint the graph for architectural violations
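
The strict-AND-first, OR-fallback behavior listed for wd query in the table above can be pictured with a toy inverted index. This is a simplified sketch under stated assumptions: the index shape (token to set of node ids), the envelope keys, and the absence of ranking are all hypothetical, and only the `degraded_match=or_fallback` tag comes from the table.

```python
def search(index, phrase):
    """Match all tokens first; fall back to any-token matching for phrases."""
    tokens = phrase.lower().split()
    if not tokens:
        return {"matches": [], "degraded_match": None}

    sets = [index.get(token, set()) for token in tokens]
    strict = set.intersection(*sets)  # nodes that contain every token
    if strict or len(tokens) == 1:
        return {"matches": sorted(strict), "degraded_match": None}

    # Multi-word phrase with no strict hit: accept nodes matching any token
    # and tag the envelope so callers know the result is degraded.
    loose = set.union(*sets)
    return {"matches": sorted(loose), "degraded_match": "or_fallback"}
```

A caller that needs precise results can check the tag and treat `or_fallback` hits as hints rather than answers.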

wd doctor reports each finding at one of four levels: [ok ] for
healthy state, [note] for soft recommendations (a missing optional
provider, no MCP config), [warn] for a currently-degraded state (a
stale graph, missing tree-sitter grammars), and [fail] for invalid
setup. Only [fail] raises the exit code; notes and warnings are
visible but never fatal. Each note carries a stable id (e.g.
(id: optional-copilot-cli-missing)) that you can dismiss per project:

wd doctor --ack optional-copilot-cli-missing   # write to .weld/doctor.yaml
wd doctor --unack optional-copilot-cli-missing # restore
wd doctor --list-acks                          # list current dismissals

The valid note ids are mcp-config-missing, optional-mcp-missing,
optional-anthropic-missing, optional-openai-missing,
optional-ollama-missing, and optional-copilot-cli-missing. The
copilot-cli probe walks WELD_COPILOT_BINARY and PATH for the
standalone GitHub Copilot CLI binary, so its install hint points at
github.com/en/copilot
rather than a pip install line.
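
The relationship between finding levels, dismissals, and the exit code can be summarized in a short sketch. It is not wd doctor's implementation: the `(level, id)` tuple shape and the concrete exit value 1 are assumptions made for illustration.

```python
def summarize(findings, acked_ids=frozenset()):
    """Return the findings to display and the resulting exit code.

    findings is an iterable of (level, finding_id) pairs, where level is
    one of "ok", "note", "warn", or "fail".
    """
    shown = [
        (level, finding_id)
        for level, finding_id in findings
        # Only notes are dismissible; acknowledged notes are hidden.
        if not (level == "note" and finding_id in acked_ids)
    ]
    # Notes and warnings stay visible but never change the exit code.
    code = 1 if any(level == "fail" for level, _ in shown) else 0
    return shown, code
```

With this rule a project that has a stale graph and a missing optional provider still exits 0, while a single invalid-setup finding makes the run fail.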

wd lint also loads custom edge rules from .weld/lint-rules.yaml when
present:

rules:
  - name: no-api-to-internal
    deny:
      from: { type: file, path_match: "api/**" }
      to: { type: file, path_match: "internal/**" }

Rules can add an allow block with the same from / to selectors to
exempt specific edges from a broader deny match.
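
How a deny block and an optional allow block could combine is sketched below. This is an illustrative evaluator, not wd lint's code: the node dict shape is hypothetical, and `fnmatch` is used as a stand-in for whatever glob semantics `path_match` actually has (with `fnmatch`, `*` also matches `/`).

```python
from fnmatch import fnmatch


def _selects(selector, node):
    """True when a node dict satisfies a from/to selector."""
    if "type" in selector and node.get("type") != selector["type"]:
        return False
    pattern = selector.get("path_match")
    if pattern is not None and not fnmatch(node.get("path", ""), pattern):
        return False
    return True


def violates(rule, src, dst):
    """True when the src -> dst edge hits deny and is not exempted by allow."""
    def hit(block):
        return _selects(block["from"], src) and _selects(block["to"], dst)

    if not hit(rule["deny"]):
        return False
    allow = rule.get("allow")
    return not (allow is not None and hit(allow))
```

The allow block is consulted only after the deny block matches, so it can narrow a broad rule without widening it.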

Output is signal-first: the summary line counts violations per rule,
high-signal rules (no-circular-deps, boundary-enforcement) print
before noisier ones, and orphan-detection runs last. By default the
orphan rule suppresses doc, config, and test-file node types
(intentional leaves in nearly every codebase) and the suppressed count
is reported in the summary. Pass --include-noisy to surface every
orphan. Suppressed orphans on their own do not raise the exit code.

Run wd --help for the full list.

The repository includes a canonical Agent System Maintainer skill at
.agents/skills/agent-system-maintainer/SKILL.md and a GitHub Copilot
Agent Architect at .github/agents/agent-architect.agent.md. They are
ordinary Agent Graph assets, so wd agents discover, explain, and
impact can inspect them before future customization changes.

Edge provenance with `props.source`

wd add-edge accepts a strict set of edge types (see
weld.contract.VALID_EDGE_TYPES). When an agent, tool, or LLM emits an
edge, stamp its origin under props.source so downstream consumers can
filter, rank, or audit tool-generated relationships. The --props help
text carries the canonical example: --props '{"source":"llm","confidence":"inferred"}'.
The source value is free-form (agent name, tool name, llm,
manual, strategy id); confidence follows the existing vocabulary
(definite, inferred, speculative). This replaces the 0.3.0-era
--source and --relation flags.
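
When a tool generates the --props payload programmatically, building it from a dict avoids hand-written JSON. The helper below is a sketch; the name `make_props` is hypothetical, and it only checks the confidence vocabulary named above.

```python
import json

# Confidence vocabulary documented for edge props.
VALID_CONFIDENCE = {"definite", "inferred", "speculative"}


def make_props(source: str, confidence: str) -> str:
    """Serialize a provenance payload suitable for a --props argument."""
    if not source:
        raise ValueError("source must be a non-empty string")
    if confidence not in VALID_CONFIDENCE:
        raise ValueError(f"unknown confidence: {confidence!r}")
    # Compact separators reproduce the form shown in the --props help text.
    return json.dumps(
        {"source": source, "confidence": confidence},
        separators=(",", ":"),
    )
```

Calling `make_props("llm", "inferred")` yields the same string as the canonical example quoted above.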

Filtering noise in `wd viz`

A real codebase's graph mixes the application code you wrote with calls into
the language standard library (print, len, std::string) and third-party
dependencies (numpy, boost, npm packages, Cargo crates). When you open
wd viz, the sidebar gives you two checkboxes for collapsing that noise so
you can focus on application code:

Hide standard library — drops nodes classified as language built-ins
or stdlib (Python builtins and sys.stdlib_module_names; C++ std::
and toolchain libc++/libstdc++ headers; analogous lists per language).
Hide third-party dependencies — drops nodes resolved outside the
project tree but not part of the language stdlib (PyPI / npm / Cargo /
Go-module / vendored boost / vendored serde, etc.).

Each label shows a count next to it (for example "Hide standard library
(412)") so you can see how much each toggle would remove before applying it.
The two checkboxes are independent and compose: tick both to focus on
project-only code, tick neither to see the full graph.

Hiding is a presentation choice. The underlying graph is unchanged: every
node still exists in .weld/graph.json, and every other surface
(wd query, wd context, MCP) still returns the hidden nodes. For example,
wd query "print" continues to surface the stdlib print node even when "Hide
standard library" is ticked in the visualizer.

How a node is classified

Every symbol, file, and module node in the graph carries a
props.origin value taking one of four lower-case strings:

Value	Meaning
`project`	Defined in this repo, or in any federated child repo of the active workspace — the application code you wrote.
`stdlib`	Language standard library or built-in (Python builtins, `sys.stdlib_module_names`; C++ `std::` and toolchain headers; per-language equivalents).
`external`	Third-party dependency resolved outside the project tree but not part of the language stdlib (PyPI, npm, Cargo, vendored libraries).
`unresolved`	The discovery strategy could not determine the target's source — for example a `from foo import bar` whose `foo` does not exist. Hidden by default in the overview slice (the two checkboxes only toggle `stdlib` and `external`); a custom UI or scripted call can override this by passing an explicit `hide_origins` value to the API.

The four values are exhaustive and mutually exclusive. Strategies that
emit props.origin directly are authoritative; legacy graphs without
the field are classified deterministically from existing signals
(authority, resolved, symbol:unresolved: ID prefix, edge-side
props.resolution). Re-running wd discover upgrades a legacy graph
to explicit origin tags. The graph schema reference in
docs/graph-schema.md documents props.origin
alongside the other optional node props, including links to the design
notes and per-language detection rules.

Driving the same filter from the API or URL

wd viz exposes the same filter as a query parameter on its slice
endpoints, which is useful when scripting screenshots or driving the
visualizer from another tool:

GET /api/slice?hide_origins=stdlib,external
GET /api/slice?hide_origins=stdlib

hide_origins is a comma-separated list drawn from the four values
above. Omitting it falls back to the default overview behavior (hide
unresolved only). The /api/summary payload carries
nodes_by_origin (a per-origin count) so a custom UI can render the
same "(412)" hint next to its own toggles.
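
A script that drives the slice endpoint can build the query string with the standard library. This sketch returns only the path and query; the host and port of a running wd viz instance are not specified here, and the helper name `slice_path` is hypothetical.

```python
from urllib.parse import urlencode

# The four props.origin values, in the order used by this README.
ORIGINS = ("project", "stdlib", "external", "unresolved")


def slice_path(hide_origins=None) -> str:
    """Build the /api/slice path, optionally with a hide_origins filter."""
    if hide_origins is None:
        # Omitting the parameter leaves the default overview behavior in place.
        return "/api/slice"
    unknown = [origin for origin in hide_origins if origin not in ORIGINS]
    if unknown:
        raise ValueError(f"unknown origin values: {unknown}")
    ordered = [origin for origin in ORIGINS if origin in hide_origins]
    # safe="," keeps the comma-separated list readable instead of %2C.
    return "/api/slice?" + urlencode({"hide_origins": ",".join(ordered)}, safe=",")
```

Validating against the four known values up front catches typos such as `third-party` before a request is sent.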

Examples

01-python-fastapi — discover a FastAPI
project: routes, Pydantic models, module structure
02-custom-strategy — write a project-local
strategy plugin that extracts TODO/FIXME comments
04-monorepo-typescript — discover a
TypeScript monorepo: workspace packages, cross-package imports, shared types
05-polyrepo — set up a federated polyrepo
workspace: workspaces.yaml, cross-repo discovery, workspace status
agent-graph-demo — inspect mixed AI
customization assets with wd agents discover, list, audit,
explain, impact, and plan-change

For a tour of what each command above actually prints, see
Graph visualization examples — real
terminal snippets captured against wd 0.20.0.

Install

Recommended: `uv tool install`

uv tool install configflux-weld

# Verify
wd --version

This is the single recommended install path. uv tool install puts
wd on your PATH in an isolated environment, is fast, and gives you a
clear update story:

uv tool upgrade configflux-weld   # or: uv tool upgrade --all

Don't have uv yet? See the uv install
instructions.

To run the stdio MCP server, install the optional MCP extra:

uv tool install "configflux-weld[mcp]"
python -m weld.mcp_server --help

wd mcp config does not require the extra; only the server process does.

Alternative install paths

The paths below are supported but secondary. Prefer uv tool install unless
you have a concrete reason to pick one of these.

`pipx` (if you already standardize on pipx)

pipx install configflux-weld
wd --version

Functionally equivalent to uv tool install for end users. Use whichever
tool manager your team already has.

`install.sh` (zero-dependency bootstrap)

curl -fsSL https://raw.githubusercontent.com/configflux/weld/main/install.sh | sh

install.sh is a POSIX shell script that detects a compatible Python (3.10
through 3.13) and installs via uv, pipx, or pip --user, in that order
of preference. Use it only when you don't have uv or pipx available and
can't install them first — for example, on a minimal CI image or a
locked-down host. It is idempotent (re-running upgrades an existing install)
and honours a .weld-version file in the current directory or any ancestor
to pin a specific release tag.
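
The ancestor lookup for .weld-version follows a common pattern: start in the current directory and walk upward until the file is found. The Python sketch below illustrates that search only; install.sh itself is a shell script, and the assumption that the file holds a single release tag on one line is not confirmed here.

```python
from pathlib import Path


def find_pinned_version(start):
    """Return the tag from the nearest .weld-version file, or None."""
    start = Path(start).resolve()
    # Check the starting directory first, then each ancestor up to the root.
    for directory in (start, *start.parents):
        candidate = directory / ".weld-version"
        if candidate.is_file():
            text = candidate.read_text(encoding="utf-8").strip()
            return text or None
    return None
```

Because the nearest file wins, a nested project can pin a different release than the repository that contains it.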

From a local checkout (development)

If you want to edit Weld itself, use a source-checkout install. See
CONTRIBUTING.md for the full developer setup, including
editable installs and optional-extras commands for tree-sitter, mcp,
openai, anthropic, ollama, and llm.

From a Git URL

pip install "git+https://github.com/configflux/weld.git@main#subdirectory=weld"

Useful for pinning an unreleased commit or branch.

Raw source (no install)

If you cannot install anything, the module entrypoint works from a plain
checkout:

python -m weld --help

Python compatibility

Runtime installs support Python 3.10 through 3.13. Contributor builds and
Bazel tests use the Python 3.12 toolchain pinned in MODULE.bazel, so the
development toolchain can be narrower than the runtime support window.
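
A wrapper script that wants to fail early on an unsupported interpreter could check the runtime window like this. It is a small sketch; the helper name `runtime_supported` is hypothetical and Weld's own packaging metadata remains the authoritative constraint.

```python
import sys


def runtime_supported(version_info=sys.version_info) -> bool:
    """True when the interpreter is within the 3.10 through 3.13 window."""
    major_minor = tuple(version_info[:2])
    return (3, 10) <= major_minor <= (3, 13)
```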

Release policy

main is the source of truth for the next release: the version recorded
in VERSION and weld/pyproject.toml matches the latest
publish/vX.Y.Z git tag, except during a deliberately-staged window
where main is bumped ahead of the latest tag.

The drift shape that produced the v0.9.0 and v0.10.1 incidents -- main
silently regressing below the latest published wheel -- is now caught
post-release by tools/check_main_release_consistency.py (runs as part
of the /release-audit flow). To document a deliberate
"main is ahead of the latest tag" window, add a comment marker to
this README:

<!-- release-lag: 0.11.0 staged for 2026-05-12 launch window -->

The check then turns the lag into a WARN and surfaces the reason
instead of failing. Remove the marker when the matching tag is cut.
See docs/release.md for the full release
checklist (the post-release consistency check is step 9).
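
For readers who want to see how such a marker could be detected, the sketch below extracts the version and reason from a release-lag comment. It is not the logic in tools/check_main_release_consistency.py; the grammar (a semantic version followed by free-text reason) is inferred from the single example above.

```python
import re

# HTML comment of the form: release-lag: <X.Y.Z> <free-text reason>
_MARKER = re.compile(
    r"<!--\s*release-lag:\s*(?P<version>\d+\.\d+\.\d+)\s+(?P<reason>.*?)\s*-->"
)


def find_release_lag(readme_text: str):
    """Return (version, reason) for the first marker found, else None."""
    match = _MARKER.search(readme_text)
    if match is None:
        return None
    return match["version"], match["reason"]
```

Returning the reason alongside the version is what lets a checker print why main is ahead instead of just reporting a mismatch.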

Documentation

Full toolkit guide — architecture, design limits,
roadmap
Onboarding guide
Agent workflow — when to use each
retrieval surface
Agent Graph — static map of the AI
customization layer (agents, skills, prompts, hooks, MCP servers)
Graph visualization examples —
real terminal output: monorepo graph, polyrepo repo: nodes,
Agent Graph, MCP config snippet
Platform support matrix — per-platform
support and runtime-validation status
Performance notes — discovery and query
timings on synthetic 1k/10k/100k single repos and polyrepo workspaces,
with a reproducible recipe
Strategy cookbook
Glossary

Contributing

See CONTRIBUTING.md. Weld is currently maintainer-led.
Issues, bug reports, demo repos, documentation improvements, and strategy
proposals are welcome. For larger changes, please open an issue first so we
can align on scope before implementation.

License

Apache License, Version 2.0 — see LICENSE for details.