weld
Health: Warning
- License — License: Apache-2.0
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 5 GitHub stars
Code: Passed
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions: Passed
- Permissions — No dangerous permissions requested
Weld connected structure toolkit for agent-first repository discovery
Weld
A local codebase graph for AI coding agents. Weld scans code, docs, CI, build
files, runtime configs, and repo boundaries into a deterministic graph. Agents
can query this graph through CLI or MCP instead of rediscovering the repository
from scratch every session.
The graph lives on disk (.weld/graph.json), stays under your control, and
answers the questions agents and humans repeatedly ask about a codebase: where
a capability lives, which docs are authoritative, what build and test surfaces
a change touches, and what boundaries constrain the implementation.
Evaluators: start with v0.19.1, the current recommended starting point.
Headline features added since v0.14.0:
- a 14-tool MCP server for graph-backed agent context (weld_query, weld_find,
weld_context, weld_path, weld_brief, weld_stale, weld_callers,
weld_references, weld_export, weld_trace, weld_impact, weld_enrich,
weld_diff, weld_review)
- wd impact blast-radius queries driven by node, file list, working tree, or
git diff range, with a stale-graph gate
- wd review JSON-first triage for speculative edges
- an end-to-end C# strategy stack (solution/project parsing, MSBuild targets,
test-framework detection, ASP.NET routes, EF Core, inheritance edges,
per-method call graphs) that auto-wires on wd init when matching artifacts
are present
- multi-language origin classification, Bazel srcs/deps edges, Dockerfile and
Compose copy edges, and multi-language test-peer edges across Python, Go,
TypeScript / JavaScript, Rust, Java, C#, and C++
- wd communities topic-level navigation of large graphs
- opt-in eager inverted-index aggregation for faster cold-cache queries on
large federations
- a C++ amalgamation-file rank boost so single-file headers (e.g.
nlohmann/json) surface ahead of incidental mentions
- alias-aware lookup that resolves legacy node IDs through one minor version
- human-readable text output by default for the retrieval surface, with
--json available for tools and the MCP server
ROS2 is labeled Tier 2 (preview) until its own harness pass runs; every other
language family (Python, C#, Java, C++) follows the Tier-1 language support
contract for entrypoints, modules, call graphs, test peers, and origin
classification. See CHANGELOG.md for the per-release entries from v0.15.0
onward.
Try it in 5 minutes → docs/tutorial-5-minutes.md
walks through wd init, discover, brief, query, context, and path
against demo workspaces. Spin up a clean demo with one command:
scripts/create-polyrepo-demo.sh /tmp/weld-polyrepo-demo
# or
scripts/create-monorepo-demo.sh /tmp/weld-monorepo-demo
Each script materializes a self-contained demo directory with seeded source
files, .weld configs, and committed git history -- ready for wd discover.
If you have Weld installed but no source checkout, the same demos are
available through the CLI: wd demo list, wd demo monorepo --init <dir>, wd demo polyrepo --init <dir>.
Use Weld when…
- your repo is too large for an agent to understand in one pass
- your system spans multiple repositories
- architecture is spread across code, docs, CI, configs, and service contracts
- you want reproducible repo context instead of ad-hoc chat memory
When not to use Weld
- Your repo is small (under ~50 files). An agent can read it end-to-end;
a graph adds overhead without payoff.
- grep plus your IDE already answers your questions. If nothing is
missing from that workflow, Weld has nothing to add.
- You only need symbol navigation. Go-to-definition and find-references
are an LSP job. Weld covers architecture, contracts, docs, and CI -- not
IDE jump-to.
- You expect compiler-grade static analysis. Weld is a pragmatic graph,
not a type checker or dataflow engine. It will not catch every reference
or prove correctness.
- You do not want repo-local configuration. Weld writes config to
.weld/ (discover.yaml, workspaces.yaml, strategies/) and
expects that config to be committed alongside your code. Generated
graphs (graph.json, agent-graph.json) are gitignored by default;
the opt-in wd init --track-graphs team workflow commits them for
warm-CI / warm-MCP setups instead. If even committing config is
unacceptable, Weld is the wrong tool.
How Weld compares
Weld is not a replacement for the tools below -- it sits alongside them and
gives agents a persistent, queryable map of the repository. Each of these
tools is excellent at what it does; Weld adds the connected structure they
were not designed to provide.
| Tool | Gives you | Weld adds |
|---|---|---|
| grep / ripgrep | Fast literal and regex search over file contents. | Typed nodes and edges -- a symbol, route, doc, or config is an addressable entity with neighbours, not a line of text. |
| ctags / LSP | Symbol navigation and go-to-definition inside one language. | A cross-language graph that also covers docs, CI, configs, service contracts, and repo boundaries -- surfaces an IDE was never meant to index. |
| Sourcegraph | Hosted code search and references across large fleets of repos. | A local, repo-local graph that lives next to your code. By default Weld tracks only config and lets you opt in (wd init --track-graphs) to commit the generated graph for warm-CI / warm-MCP setups. No server, no indexing fleet; agents query it offline through CLI or MCP. |
| vector DB / RAG | Embedding-based semantic recall over chunks of text. | Deterministic structure. Query results are exact nodes and edges with provenance, not top-k fuzzy matches, so agents can follow relationships instead of guessing. |
| Copilot / Claude Code / OpenCode | In-editor and agentic code generation and chat. | Shared repo context those agents can read through MCP -- the same graph across sessions and tools, instead of each agent rediscovering the repo on every run. |
Key features
- Whole-codebase discovery: not just source code. Covers docs, config,
CI workflows, infrastructure, and build files.
- Startup and runtime flow: models common Python, C#/.NET, and C++ entrypoints
and connects them to services, boundaries, and deploy/runtime surfaces.
- Config-driven: point .weld/discover.yaml at your repo and tune
what gets extracted.
- Multi-language: tree-sitter strategies ship for Python, TypeScript/JS,
Go, Rust, C#, C++, Java, and ROS2. Tree-sitter Python packages are an
optional extra (pip install configflux-weld[tree-sitter]); without
them only Python is extracted natively. See
Supported languages for the per-language status
and the optional libclang path for C++.
- Plugin architecture: drop a .py file in .weld/strategies/ to
extract anything repo-specific.
- Agent Graph: discover agents, skills, prompts, commands, hooks,
instructions, MCP servers, and platform-specific copies into .weld/agent-graph.json; see the
Agent Graph guide for node and edge types,
authority/drift, and limitations, and the
platform support matrix for tested surfaces.
- Agent-native: generates MCP config snippets by default and ships an
optional stdio MCP server so Claude Code, Codex, and other agents can query
the graph directly.
- Zero external dependencies: runs from a plain checkout with Python >= 3.10.
Tree-sitter is optional.
Quickstart
# Install (recommended — see the Install section for alternatives)
uv tool install configflux-weld
# Bootstrap config for your repo
wd init
# Run discovery and save the graph (safe mode by default — see Trust model below)
wd discover --safe --output .weld/graph.json
# Query the graph
wd query "authentication"
wd trace "how does this service start"
wd find "login"
wd context file:src/auth/handler
wd viz --no-open
wd stale
Drop --safe once you trust the repository's project-local strategies and
external-JSON adapters; the Trust model section below explains
what --safe disables and when it is appropriate to remove.
Try it on a real example:
examples/04-monorepo-typescript (monorepo) ·
examples/05-polyrepo (polyrepo federation).
Sample output (wd query "auth" — default human form, trimmed):
# query: auth
matches (1):
1. symbol:src/auth/handler.py:authenticate [type: function]
label: authenticate
description: Validate a bearer token and return the caller identity.
neighbors (1):
- route:/login [type: route]
All wd retrieval commands default to human-readable text and accept --json for the stable JSON envelope. Pass --json when
piping to jq or other scripted consumers; the schema is unchanged
from the previous release. Sample wd query "auth" --json:
{
  "query": "auth",
  "matches": [
    {
      "id": "symbol:src/auth/handler.py:authenticate",
      "label": "authenticate",
      "type": "function",
      "props": {
        "file": "src/auth/handler.py",
        "exports": ["authenticate"],
        "description": "Validate a bearer token and return the caller identity."
      }
    }
  ],
  "neighbors": [{"id": "route:/login", "type": "route"}],
  "edges": [
    {"from": "route:/login", "to": "symbol:src/auth/handler.py:authenticate", "type": "calls"}
  ]
}
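A scripted consumer can walk this envelope with the standard library alone. The sketch below is illustrative rather than part of Weld: it embeds the sample envelope shown above instead of invoking wd, and the summarize helper is a hypothetical name.

```python
import json

# Sample envelope copied from the `wd query "auth" --json` example above.
SAMPLE = """
{
  "query": "auth",
  "matches": [
    {
      "id": "symbol:src/auth/handler.py:authenticate",
      "label": "authenticate",
      "type": "function",
      "props": {
        "file": "src/auth/handler.py",
        "exports": ["authenticate"],
        "description": "Validate a bearer token and return the caller identity."
      }
    }
  ],
  "neighbors": [{"id": "route:/login", "type": "route"}],
  "edges": [
    {"from": "route:/login", "to": "symbol:src/auth/handler.py:authenticate", "type": "calls"}
  ]
}
"""


def summarize(envelope: dict) -> list[str]:
    """Return one line per match, followed by the edges that touch it."""
    lines = []
    for match in envelope.get("matches", []):
        lines.append(f'{match["id"]} [{match["type"]}]')
        for edge in envelope.get("edges", []):
            if match["id"] in (edge["from"], edge["to"]):
                lines.append(f'  {edge["from"]} -{edge["type"]}-> {edge["to"]}')
    return lines


summary = summarize(json.loads(SAMPLE))
print("\n".join(summary))
```

In a real pipeline the string would come from the command's standard output; only the keys visible in the sample are relied on here.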
See Install for alternatives (local checkout, pip, raw source).
Agent Graph for AI customizations
Weld also maps the AI customization layer around a repository: agents, skills,
instructions, prompts, commands, hooks, MCP servers, tool permissions, and
platform variants. The Agent Graph is static and repo-bound; discovery reads
known customization files and does not execute project code.
wd agents discover
wd agents list
wd agents audit
wd agents explain planner
wd agents impact .github/agents/planner.agent.md
wd agents plan-change "planner should always include test strategy"
wd agents viz --no-open
Use --json on list, explain, impact, audit, and plan-change for
agent-friendly output. Use wd agents rediscover when you want an explicit
refresh of .weld/agent-graph.json before inspecting the persisted graph.
Use wd agents viz after discovery to open a local read-only browser explorer
for the persisted Agent Graph.
Static discovery and configuration generation are available for several
agent platforms; runtime validation is tracked per client in the
platform support matrix. The
Agent Graph guide documents node and edge types,
authority and drift, and the read-only-first policy.
Agent-first onboarding
If an agent or coding assistant is driving setup, use the short bootstrap
path:
uv tool install configflux-weld # recommended — see Install for alternatives
wd prime # show setup status + per-framework surface matrix
wd bootstrap claude # writes .claude/commands/weld.md
wd bootstrap codex # writes .codex/skills/weld/SKILL.md + .codex/config.toml
wd bootstrap copilot # writes .github/skills/weld/SKILL.md + .github/instructions/weld.instructions.md
wd bootstrap cursor # writes .cursor/rules/weld.mdc + .cursor/mcp.json
wd bootstrap aider # writes CONVENTIONS.md + .aider.conf.yml (wiki fallback; no MCP)
wd bootstrap gemini-cli # writes .gemini/skills/weld.md + .gemini/mcp.json
wd bootstrap copilot-cli # writes .copilot/skills/weld.md + .copilot/config.json
Cursor, Gemini CLI, and Copilot CLI register the local weld stdio MCP server
in the host-native config file. Aider has no native MCP protocol, so its CONVENTIONS.md stanza points at the agent-readable wiki export: run wd export --format=wiki --output=.weld/wiki and read .weld/wiki/index.md
to navigate the graph.
All seven wd bootstrap frameworks accept opt-out flags:
- --no-mcp: skip the MCP pair (.codex/config.toml for codex; the .mcp.json guidance block for copilot/claude).
- --no-enrich: write the .cli.md variant that omits wd enrich.
- --cli-only: shorthand for --no-mcp --no-enrich.
To upgrade existing bootstrap files after pulling a new weld release, use
the diff-aware upgrade path:
- wd bootstrap <framework> --diff: print unified diffs between bundled
templates and your on-disk copies without writing. Exits 1 when any
file differs, 0 otherwise, so it composes with CI checks.
- wd bootstrap <framework> --force: overwrite targeted files while
still honouring the opt-out flags (--no-mcp, --no-enrich, --cli-only)
and federation template behaviour.
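The --diff contract (print unified diffs, write nothing, exit 1 on any difference) can be illustrated with the standard library. This is a minimal sketch of the idea, not Weld's implementation: diff_exit_code, the file name weld.md, and the choice to treat a missing file as empty are all assumptions made for the example.

```python
import difflib
import tempfile
from pathlib import Path


def diff_exit_code(bundled: dict[str, str], target_dir: Path) -> int:
    """Print unified diffs for files that differ; return 1 if any differ, else 0.

    `bundled` maps a relative path to the expected content. Nothing is written.
    """
    differs = False
    for rel_path, expected in sorted(bundled.items()):
        on_disk = target_dir / rel_path
        # Assumption: a missing on-disk file is compared as empty content.
        actual = on_disk.read_text() if on_disk.exists() else ""
        if actual == expected:
            continue
        differs = True
        diff = difflib.unified_diff(
            actual.splitlines(keepends=True),
            expected.splitlines(keepends=True),
            fromfile=f"on-disk/{rel_path}",
            tofile=f"bundled/{rel_path}",
        )
        print("".join(diff), end="")
    return 1 if differs else 0


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "weld.md").write_text("old text\n")
    stale = diff_exit_code({"weld.md": "new text\n"}, root)  # differs
    fresh = diff_exit_code({"weld.md": "old text\n"}, root)  # identical
```

A non-zero result is what lets a CI job fail when bootstrap files have drifted from the bundled templates.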
wd prime is idempotent and safe to re-run: it reports what is
already configured and what is still missing. Pass --agent {auto,claude,codex,copilot,all} to force the active agent's row
into the matrix even when that framework has no files yet (e.g. a Codex user
in a Claude-only checkout sees codex: skill no, mcp no -> wd bootstrap codex
instead of silence). auto is the default and infers the agent from
environment variables such as CODEX_*.
Trust model
Weld's trust posture is explicit and narrow:
- Default: bundled discovery reads source files and writes the local
graph (.weld/graph.json). It does not execute discovered application
code and does not open network connections.
- Safe mode: when enabled with --safe, safe mode disables
project-local strategies (.weld/strategies/) and the external_json
adapter for wd discover, and refuses network/LLM enrichment providers
for wd enrich. Pass wd discover --safe to scan an untrusted
repository without executing any code from it; pass wd enrich --safe
to refuse network egress (every currently registered provider --
Anthropic, OpenAI, Ollama, Copilot CLI -- is refused). Safe mode produces
a stable [weld] safe mode: ... stderr line for each refused path.
- Advanced strategies: project-local strategies are Python modules
loaded at discovery time, and strategy: external_json executes
configured commands from discover.yaml. Only enable these on
repositories you trust.
See SECURITY.md for the full policy and reporting process.
Local telemetry
Weld records the success or failure of every wd CLI invocation and every
MCP tool call to a local-only file. There is no remote endpoint and no
upload — the file never leaves your machine unless you explicitly export
and share it.
What is recorded. Each event is one JSON line with a strict allowlist
of fields: subcommand or tool name, exit code, duration in milliseconds,
and the exception class name on failure. Paths, query strings, error
messages, flag values, and usernames are never recorded. The redaction is
enforced at write time, so the file on disk is already safe to attach to a
bug report.
Where it lives. In a single repo, the file is <repo>/.weld/telemetry.jsonl. In a polyrepo workspace, every event from
the root and from any child repo aggregates into <workspace_root>/.weld/telemetry.jsonl, one shareable artifact per
workspace. Invocations outside any project (for example wd --version in /tmp) fall back to ${XDG_STATE_HOME:-~/.local/state}/weld/telemetry.jsonl.
The file is gitignored and rotates at 1 MiB to keep the trailing 500 events.
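The fallback location uses the shell's ${VAR:-default} form, which applies the default when the variable is unset or empty. A minimal Python sketch of that lookup follows; fallback_telemetry_path is a hypothetical helper written for illustration, not a Weld API.

```python
from pathlib import Path


def fallback_telemetry_path(env: dict[str, str]) -> Path:
    """Mirror ${XDG_STATE_HOME:-~/.local/state}/weld/telemetry.jsonl."""
    # The ':-' form falls back when the variable is unset OR empty,
    # which `or` reproduces for a missing key and for an empty string.
    state_home = env.get("XDG_STATE_HOME") or "~/.local/state"
    return Path(state_home).expanduser() / "weld" / "telemetry.jsonl"


explicit = fallback_telemetry_path({"XDG_STATE_HOME": "/var/state"})
default = fallback_telemetry_path({})
print(explicit)
print(default)
```

Passing the environment as a plain mapping keeps the lookup easy to check without touching the real process environment.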
How to opt out. Any one of these disables recording: WELD_TELEMETRY=off in the environment, the --no-telemetry flag on a
single invocation, or wd telemetry disable to write a persistent sentinel
at the resolved root. Run wd telemetry --help for the full subcommand
surface (status, show, path, export, clear, disable, enable),
and see docs/telemetry.md for the full event
schema and design rationale.
Supported languages
Weld's only built-in extractor is for Python. Every other language
listed below depends on the [tree-sitter] optional extra. Without
it, the tree-sitter strategies silently no-op on ImportError and the
graph will contain zero nodes for those languages — by design, so weld
still runs in a minimal environment. Install the extra to actually use
multi-language support:
uv tool install "configflux-weld[tree-sitter]"
# or
pip install "configflux-weld[tree-sitter]"
Status ladder. Every language is classified on a single ladder:
Tier 1 (passes the binding tier-check harness criteria on the
pinned corpora; description-coverage is measured and reported as an
advisory signal rather than a gate, because enrichment quality reflects
LLM provider output rather than weld discovery) → Tier 2 (ships and
is usable; fails one or more binding criteria with disclosed gaps) →
Preview (ships with documented correctness issues; not for
production use) → Experimental (opt-in extra, off by default) →
Not supported. Languages move tiers only via tier-check harness
output, not by editorial claim. C#, Python, Java, and C++ are
currently the Tier 1 languages; the other languages remain at Tier 2
pending per-language harness runs.
| Language | Extraction surface | Grammar package | Status |
|---|---|---|---|
| Python | modules, classes, functions, imports, call graph | built-in (no extra) | Tier 1 |
| TypeScript | exports, classes, imports | tree-sitter-typescript | Tier 2 |
| JavaScript | exports, classes, imports | tree-sitter-javascript | Tier 2 |
| Go | exports, types, imports | tree-sitter-go | Tier 2 |
| Rust | exports, types, imports | tree-sitter-rust | Tier 2 |
| C# | types, methods, properties, attributes, namespaces, using dependencies, best-effort call graph | tree-sitter-c-sharp | Tier 1 |
| C++ | classes, structs, namespaces, functions, methods, inherits edges, includes, CMake build targets, best-effort call graph | tree-sitter-cpp | Tier 1 |
| Java | classes, interfaces, methods, fields, constructors, annotations, imports, inherits / implements edges | tree-sitter-java | Tier 1 |
Frameworks (reuse a language's extractor; status inherits from the
host language):
| Framework | Host language | Extraction surface | Status |
|---|---|---|---|
| ROS2 | C++ / Python | packages, nodes, topics, services, actions, parameters | Preview |
Discovery also adds deterministic closure edges from files to source-backed
symbols and from import/include/use declarations to local files or external
package nodes across every listed language.
For non-preview tree-sitter languages, exact identifier queries such as wd query GetAsync prefer first-class definition symbol: nodes before
owning files or package-level fallbacks. File results remain available when
the graph has no exact symbol candidate.
C++ — Tier 1 details
Status: Tier 1. The C++ extraction surface passes the binding
tier-check harness criteria against the pinned C++ corpora
(nlohmann/json, googletest, abseil-cpp, Kitware/CMake, grpc/grpc); see docs/bench/tier1-cpp-baseline.md
for the per-criterion measurement snapshot. Promotion is anchored by
the bundled fixture contract gate, which exercises a Shape / Circle
/ Rectangle / Drawable inheritance tree under a real CMake project
layout.
C++ has two extraction paths:
- Tree-sitter (default once [tree-sitter] is installed).
Indexes .hpp, .cpp, .cc, .h, .hh, .hxx, .cxx, .ipp, .tpp files into file: and symbol: nodes.
Emits inherits edges originating at the derived-class symbol (so a
wd context on a concrete class surfaces its base classes
directly, not via the owning file). Query patterns live in
weld/languages/cpp.yaml. This is the fast path: it needs no
compilation database and is the path the tier-check harness measures.
- libclang (optional, off by default). Adds macro-expansion,
template-instantiation, and cross-translation-unit call edges
that tree-sitter cannot resolve from a syntactic walk alone.
Requires:
  - pip install "configflux-weld[cpp-libclang]" (Python bindings)
  - A compile_commands.json at the repo root, e.g. cmake -B build -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
  - WELD_CPP_LIBCLANG=1 in the environment that runs wd discover
When any prerequisite is missing the libclang strategy silently
returns no nodes; tree-sitter still runs.
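Because the libclang path fails silently, it can help to check the three prerequisites before running discovery. The following is a hypothetical pre-flight sketch, not something Weld ships; it assumes the Python bindings are importable as the top-level clang package, which may not hold for every installation.

```python
import importlib.util
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def missing_libclang_prereqs(
    root: Path,
    env: Mapping[str, str],
    bindings_available: Optional[bool] = None,
) -> list[str]:
    """Return the documented prerequisites that are not satisfied."""
    if bindings_available is None:
        # Assumption: the bindings expose a top-level `clang` package.
        bindings_available = importlib.util.find_spec("clang") is not None
    missing = []
    if not bindings_available:
        missing.append("python bindings (configflux-weld[cpp-libclang])")
    if not (root / "compile_commands.json").is_file():
        missing.append("compile_commands.json at the repo root")
    if env.get("WELD_CPP_LIBCLANG") != "1":
        missing.append("WELD_CPP_LIBCLANG=1")
    return missing


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    before = missing_libclang_prereqs(repo, {}, bindings_available=True)
    (repo / "compile_commands.json").write_text("[]")
    after = missing_libclang_prereqs(
        repo, {"WELD_CPP_LIBCLANG": "1"}, bindings_available=True
    )
```

An empty list means all three documented conditions hold; anything else names what to fix before expecting libclang-derived edges.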
A CMake build-graph strategy (cpp_cmake) parses each CMakeLists.txt and emits project:, build-target:, and package: nodes plus depends_on edges so internal target
dependencies (target_link_libraries) and find_package declarations
are queryable as first-class graph entries.
Framework markers. The C++ framework strategies declare ros2, cmake, conan, gtest, and catch2 markers; the
tier-check harness reports them stub-by-design when a corpus is a
plain library that does not consume a C++ test or robotics
framework in its public surface. Corpora that do consume them
(downstream applications, services, ROS2 packages) light up
criterion 3 directly. If you adopt C++ support and measure it
against your own corpus, please share the numbers — public
measurements are the fastest way to keep the harness honest.
To use the built-in semantic enrichment providers:
uv tool install "configflux-weld[openai]" # or [anthropic], [ollama], or [llm]
The copilot-cli provider needs no Python extra — install the standalone
GitHub Copilot CLI binary (copilot) and run wd enrich --provider copilot-cli. Set WELD_COPILOT_BINARY to override
the binary path.
First-run enrichment prompt
When you run wd discover for the first time and weld detects an
enrichment provider through the usual environment variables (the
Anthropic API key, the OpenAI API key, an OLLAMA_HOST value, or
the copilot binary on PATH), discover prints a cost-honest
prompt with the estimated dollar range and asks whether to run
enrichment now. The answer is persisted to .weld/.enrichment-prompted and the prompt is not re-shown.
- wd discover --no-enrich skips the prompt for one invocation.
- WELD_NO_ENRICH=1 skips it globally (CI-friendly).
- wd discover --safe implies skip (network/LLM calls forbidden).
- wd enrich --reset-prompt clears .weld/.enrichment-prompted so
the next wd discover re-asks (useful after configuring a
provider for the first time).
Graphs over 2,000 nodes are out of the auto-flow: the message points
you at the explicit wd enrich --batch=N path instead. Inside an
agent harness (Claude Code, Cursor, Codex, etc.) with no provider
configured, the prompt is replaced by a tip to run /enrich-weld.
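The skip rules above can be read as a small decision function. The sketch below restates them for illustration only: the function name, parameter names, return labels, and the order in which the rules are checked are assumptions, and Weld's actual logic may differ.

```python
from typing import Mapping, Optional

AUTO_FLOW_NODE_LIMIT = 2000  # graphs over 2,000 nodes leave the auto-flow


def first_run_action(
    *,
    provider_detected: bool,
    already_prompted: bool,
    node_count: int,
    no_enrich_flag: bool = False,
    safe_flag: bool = False,
    in_agent_harness: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return one of: "skip", "tip", "suggest-batch", "prompt"."""
    env = env or {}
    # --no-enrich, --safe, and WELD_NO_ENRICH=1 all suppress the prompt.
    if no_enrich_flag or safe_flag or env.get("WELD_NO_ENRICH") == "1":
        return "skip"
    # The persisted marker means the question was already answered.
    if already_prompted:
        return "skip"
    if not provider_detected:
        # Inside an agent harness the prompt becomes a tip instead.
        return "tip" if in_agent_harness else "skip"
    if node_count > AUTO_FLOW_NODE_LIMIT:
        return "suggest-batch"
    return "prompt"


print(first_run_action(provider_detected=True, already_prompted=False, node_count=150))
```

Here already_prompted stands for the presence of .weld/.enrichment-prompted, and "suggest-batch" for the message pointing at wd enrich --batch=N.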
For a source-checkout install (contributors editing Weld itself), see
CONTRIBUTING.md.
Agents can also enrich nodes without provider extras or API keys by reading the
relevant source or documentation and writing reviewed enrichment manually:
wd stale
wd context "<node-id>"
wd add-node "<node-id>" --type "<node-type>" --label "<label>" --merge --props '{"description":"...","purpose":"...","enrichment":{"provider":"manual","model":"agent-reviewed","timestamp":"<ISO-8601 UTC timestamp>","description":"...","purpose":"...","suggested_tags":["lowercase","tags"]}}'
wd graph validate
wd graph stats
wd graph communities --format markdown
Manual enrichment writes .weld/graph.json directly and can be overwritten by
a later wd discover --output .weld/graph.json; refresh discovery before manual
edits. Manual inferred edges should use explicit provenance such as {"source": "manual"} after the relationship is verified from source content.
wd graph communities --write derives .weld/graph-communities.json, .weld/graph-community-report.md, and .weld/graph-community-index.md
from the existing graph without modifying .weld/graph.json.
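Hand-writing the --props JSON for wd add-node is error-prone, so an agent may prefer to build it programmatically. The sketch below only assembles the command string; it mirrors the props shape from the manual enrichment example, while the helper names, the node ID symbol:x, and the Z-suffixed timestamp format are illustrative assumptions.

```python
import json
import shlex
from datetime import datetime, timezone


def manual_enrichment_props(
    description: str, purpose: str, tags: list[str], now: datetime
) -> dict:
    """Build the props payload used by the manual enrichment example."""
    # Assumption: a Z-suffixed second-resolution string is an acceptable
    # ISO-8601 UTC timestamp.
    timestamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "description": description,
        "purpose": purpose,
        "enrichment": {
            "provider": "manual",
            "model": "agent-reviewed",
            "timestamp": timestamp,
            "description": description,
            "purpose": purpose,
            "suggested_tags": [tag.lower() for tag in tags],
        },
    }


def add_node_command(node_id: str, node_type: str, label: str, props: dict) -> str:
    """Return a shell-quoted `wd add-node ... --merge --props ...` command."""
    argv = [
        "wd", "add-node", node_id,
        "--type", node_type,
        "--label", label,
        "--merge",
        "--props", json.dumps(props),
    ]
    return shlex.join(argv)


props = manual_enrichment_props(
    "Validate a bearer token.", "Authentication entrypoint.", ["Auth"],
    datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
command = add_node_command("symbol:x", "function", "authenticate", props)
print(command)
```

Using shlex.join keeps the JSON payload safely quoted when the command is pasted into a POSIX shell.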
Without tree-sitter, the built-in Python module strategy and non-language
strategies (markdown, YAML, config, frontmatter) still work.
MCP
Weld generates MCP config snippets for Claude Code, VS Code, Cursor, and
Codex in the default install:
wd mcp config --client=claude
wd mcp config --client=vscode
wd mcp config --client=cursor
Running the stdio MCP server requires the optional MCP SDK extra:
uv tool install "configflux-weld[mcp]"
python -m weld.mcp_server --help
Point your client at python -m weld.mcp_server:
{"mcpServers": {"weld": {"command": "python", "args": ["-m", "weld.mcp_server"]}}}
See docs/mcp.md for the full tool reference, per-client
configs, example prompts, troubleshooting, and the exact dependency model. See
the platform support matrix for per-client support
status and runtime validation.
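If a client already has an MCP config, the weld entry needs to be merged rather than pasted over the file. A minimal sketch, assuming the client uses the mcpServers shape shown above (per-client formats may differ; see docs/mcp.md); with_weld_server and the other-server entry are hypothetical.

```python
import json

# Server entry copied from the one-line config shown above.
WELD_SERVER = {"command": "python", "args": ["-m", "weld.mcp_server"]}


def with_weld_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the weld entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["weld"] = dict(WELD_SERVER)  # replaces any existing "weld" entry
    merged["mcpServers"] = servers
    return merged


# Hypothetical pre-existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
merged = with_weld_server(existing)
print(json.dumps(merged, indent=2))
```

The helper copies the nested mapping so the caller's original config is left unmodified.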
Discovery configuration
Weld is driven by .weld/discover.yaml. Each entry maps a file pattern
to an extraction strategy:
sources:
  - glob: "src/**/*.py"
    type: file
    strategy: python_module
  - glob: "docs/**/*.md"
    type: doc
    strategy: markdown
  - glob: ".github/workflows/*.yml"
    type: workflow
    strategy: yaml_meta
Run wd init to generate a starter config, or write one by hand. See
the Strategy Cookbook for the full list
of bundled strategies.
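To see which files a config like the one above would select, the glob-to-strategy mapping can be previewed with pathlib. This is an illustrative sketch only: it hard-codes the three example entries instead of parsing YAML, plan is a hypothetical helper, and pathlib's glob semantics are assumed to approximate Weld's.

```python
import tempfile
from pathlib import Path

# The three entries from the example config, as plain Python data.
SOURCES = [
    {"glob": "src/**/*.py", "type": "file", "strategy": "python_module"},
    {"glob": "docs/**/*.md", "type": "doc", "strategy": "markdown"},
    {"glob": ".github/workflows/*.yml", "type": "workflow", "strategy": "yaml_meta"},
]


def plan(root: Path, sources: list[dict]) -> list[tuple[str, str, str]]:
    """Return (relative path, node type, strategy) for each matched file."""
    planned = []
    for entry in sources:
        for path in sorted(root.glob(entry["glob"])):
            if path.is_file():
                rel = path.relative_to(root).as_posix()
                planned.append((rel, entry["type"], entry["strategy"]))
    return planned


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("src/auth/handler.py", "docs/guide.md",
                ".github/workflows/ci.yml", "notes.txt"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    planned = plan(root, SOURCES)

for row in planned:
    print(row)
```

Files that match no entry, such as notes.txt here, simply do not appear in the plan.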
.weld/.gitignore
wd init and wd workspace bootstrap write a managed .weld/.gitignore
the first time they touch a .weld/ directory (idempotent — never
overwrites an existing file). Three policies are available:
- Default (config-only). Tracks the source-of-truth config
(discover.yaml, workspaces.yaml, agents.yaml, strategies/, adapters/, README.md) and ignores everything else weld writes,
including the generated graphs (graph.json, agent-graph.json),
graph-community reports (graph-communities.json, graph-community-report.md, graph-community-index.md),
and per-machine state (discovery-state.json, graph-previous.json, workspace-state.json, workspace.lock, query_state.bin). A
fresh contributor gets a clean git status after the first run.
- Track-graphs (opt-in team workflow for warm CI / warm MCP). Pass
--track-graphs to widen the default so the canonical graphs are
committed alongside config. Use this when every contributor should
share a pre-built graph:
  wd init --track-graphs
  wd workspace bootstrap --track-graphs
- Ignore-all (opt-in). Pass --ignore-all for early experimentation
or test installs where no weld state should be committed yet:
  wd init --ignore-all
  wd workspace bootstrap --ignore-all
This writes a heavy-handed * / !.gitignore so every weld file is
ignored.
--track-graphs and --ignore-all are mutually exclusive; passing both
is a usage error.
Migration from earlier versions. Pre-existing .weld/.gitignore
files written by older wd init / wd workspace bootstrap runs are
not rewritten — the helper is idempotent. To pick up the new
default, delete the file and re-run init:
rm .weld/.gitignore
wd init # config-only default, generated graphs ignored
# or wd init --track-graphs to keep tracking the graphs as before
To opt out entirely, just delete .weld/.gitignore after init — the
skip-if-exists guard means it won't be recreated until the next init
or bootstrap.
Custom strategies
Drop a Python file in .weld/strategies/ to extract repo-specific
artifacts. The strategy signature:
def extract(root: Path, source: dict, context: dict) -> StrategyResult:
    ...
See examples/02-custom-strategy for a
working example that extracts TODO comments as graph nodes, and
docs/extending-discovery.md for the
full step-by-step guide (contract, capability matrix, fixtures, and
a worked end-to-end walkthrough).
Polyrepo Federation
Weld supports federated polyrepo workspaces where a root directory contains
several child git repositories, each owning its own .weld/ directory. The
root maintains a meta-graph of cross-repo relationships without duplicating
child content. Children remain portable and independently publishable.
Prerequisites
- Each child repo has been initialized with wd init and has a .weld/graph.json.
- The workspace root directory contains the child repos as subdirectories
(nested git repositories).
Setting up a workspace
Run wd init at the workspace root. When nested git repositories are
detected, weld automatically scaffolds .weld/workspaces.yaml alongside
the usual discover.yaml:
cd ~/workspace-root
wd init # detects children, writes workspaces.yaml
wd init --max-depth 2 # limit scan depth for large directory trees
The --max-depth flag controls how many directory levels deep the scanner
looks for nested .git directories (default: 4).
workspaces.yaml format
The workspace registry lists every child repo and declares which cross-repo
resolvers are active:
version: 1
scan:
  max_depth: 4
  respect_gitignore: false
  exclude_paths: [.worktrees, vendor, "scratch/**", "generated/**/*.tmp"]
children:
  - name: services-api
    path: services/api
    tags:
      category: services
  - name: services-auth
    path: services/auth
    tags:
      category: services
cross_repo_strategies: [service_graph]
- version: Schema version (currently 1).
- scan: Controls automatic child detection. max_depth sets how deep
the scanner walks; respect_gitignore opts scan-only children into Git
ignore rules; exclude_paths lists directory names, relative paths, or
glob patterns to skip. Explicit children entries remain authoritative
even when gitignored.
- children: Each entry has a path (relative to the workspace root)
and an optional name (auto-derived from the path if omitted, e.g. services/api becomes services-api). Optional tags provide
category metadata; optional remote records a clone URL.
- cross_repo_strategies: Ordered list of resolvers that produce
cross-repo edges in the root graph. Currently available: service_graph.
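The name auto-derivation can be pictured as joining path components with hyphens. The sketch below reproduces only the documented services/api to services-api case; how Weld treats other characters is not stated here, and both helper names are hypothetical.

```python
from pathlib import PurePosixPath


def derive_child_name(path: str) -> str:
    """Join the path components with '-' (services/api -> services-api)."""
    return "-".join(PurePosixPath(path).parts)


def child_name(entry: dict) -> str:
    """Prefer an explicit name; otherwise derive one from the path."""
    return entry.get("name") or derive_child_name(entry["path"])


print(child_name({"path": "services/api"}))
print(child_name({"name": "auth", "path": "services/auth"}))
```

An explicit name always wins, which matches the description of name as optional with a path-derived default.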
Running discovery at the workspace root
cd ~/workspace-root
wd discover --safe --output .weld/graph.json
When workspaces.yaml is present, wd discover operates in federation
mode. It reads each child's .weld/graph.json, builds repo:<name> nodes
for every present child, and runs the declared cross-repo resolvers to emit
edges between children. Children that are missing, uninitialized, or corrupt
degrade gracefully -- they are skipped and recorded in the workspace ledger
but do not block discovery.
Federation also re-tags props.origin on cross-child symbol references:
a Python target imported from a sibling child whose strategy saw it as external is promoted to project, so "hide third-party" filters do
not lose cross-repo application code.
Discovery is safe to run from a linked git worktree of the workspace root:
the federation pass falls back to the main worktree's checkout when sibling
child repos are not present at the worktree itself. As a
defense-in-depth guard, federated discover refuses to overwrite an existing
non-empty graph.json with a 0-node meta-graph; pass --allow-empty to
intentionally tear the workspace graph down.
Workspace status
Inspect the state of every registered child:
wd workspace status # human-readable summary
wd workspace status --json # raw JSON ledger
Example output:
Workspace status (3 children)
Counts: present=2, missing=1, uninitialized=0, corrupt=0
services-api: present (refs/heads/main a1b2c3d4e5f6)
services-auth: present dirty (refs/heads/feature-x 7890abcdef01)
services-worker: missing
Each child shows its lifecycle status, git branch, HEAD SHA prefix, and
whether the working tree is dirty.
Sentinel files
Weld uses two sentinel files to distinguish workspace roots from
single-repo projects:
| File | Purpose |
|---|---|
| .weld/workspaces.yaml | Workspace registry -- lists children and cross-repo strategies |
| .weld/workspace-state.json | Workspace ledger -- lifecycle status, git SHA, graph hash per child |
The presence of workspaces.yaml activates federation mode in wd discover. workspace-state.json is written automatically during discovery and read by wd workspace status.
When .weld/workspaces.yaml is present at the bootstrap target, wd bootstrap
appends a federation paragraph to the copilot skill/instruction, codex skill,
and claude command directing agents to pick a child via wd workspace status
before querying inside it.
Cross-repo resolvers
Resolvers are plugins that analyze child graphs and emit typed edges across
repo boundaries. They are declared in the cross_repo_strategies list in workspaces.yaml and run in declaration order during root discovery.
| Resolver | Description |
|---|---|
| service_graph | Matches HTTP client call sites in one repo to API endpoint definitions in another. Emits invokes edges with host, port, and path metadata. |
Resolvers are read-only with respect to child graphs -- they never modify
a child's .weld/graph.json. Output edges are deterministic: identical
input produces byte-identical edges across runs.
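One common way to get byte-identical output is to canonicalize each edge and sort before serializing. The sketch below shows that general technique; it is not taken from Weld's resolver code. The from/to/type keys follow the JSON sample earlier in this README, while the props key and the repo:<name> endpoints are illustrative assumptions.

```python
import json


def serialize_edges(edges: list[dict]) -> bytes:
    """Serialize edges so that input order and key order do not matter."""
    # Canonical form per edge: sorted keys and fixed separators.
    canonical = sorted(
        json.dumps(edge, sort_keys=True, separators=(",", ":")) for edge in edges
    )
    return ("[" + ",".join(canonical) + "]").encode("utf-8")


edge_a = {"from": "repo:services-api", "to": "repo:services-auth", "type": "invokes",
          "props": {"host": "auth", "port": 8080, "path": "/login"}}
edge_b = {"type": "invokes", "to": "repo:services-api", "from": "repo:services-auth",
          "props": {"path": "/users", "port": 8081, "host": "api"}}

first = serialize_edges([edge_a, edge_b])
second = serialize_edges([edge_b, edge_a])
print(first == second)
```

Sorting the canonical strings, rather than sorting on a few chosen fields, keeps the result stable even when two edges share the same endpoints and type.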
Performance: opt-in eager query aggregation
For high-QPS query callers (long-lived MCP servers, batch evaluators)
the federation can pre-aggregate every fresh-sidecar child's
inverted index into a single in-memory dict at construction time.
Per-query latency then drops by 40-90% on a 30-child workspace, at
the cost of ~17 ms construction overhead. Default is off so single-shot wd query invocations do not pay the tax. Two opt-in knobs:
- Constructor: FederatedGraph(root, eager_index=True).
- Environment variable: WELD_FEDERATION_EAGER=1 (truthy values: 1, true, yes, on; case-insensitive). Lets operators flip
eager on without code changes.
Stale or missing-sidecar children keep the existing per-query fallback
path; the eager index covers only fresh-sidecar children. Match sets
are byte-identical to the lazy path.
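The truthy-value rule for WELD_FEDERATION_EAGER can be pictured as a small case-insensitive lookup. This is a sketch under assumptions: the function name eager_enabled is made up for illustration, and trimming surrounding whitespace is a guess about behavior the text does not specify.

```python
import os

# Truthy spellings listed in the README; matching is case-insensitive.
_TRUTHY = {"1", "true", "yes", "on"}


def eager_enabled(environ=None):
    """Return True when WELD_FEDERATION_EAGER holds a truthy value.

    Hypothetical helper; whitespace stripping is an assumption.
    """
    env = os.environ if environ is None else environ
    raw = env.get("WELD_FEDERATION_EAGER", "")
    return raw.strip().lower() in _TRUTHY


if __name__ == "__main__":
    print(eager_enabled({"WELD_FEDERATION_EAGER": "Yes"}))
    print(eager_enabled({"WELD_FEDERATION_EAGER": "0"}))
```

Anything outside the listed spellings, including an unset variable, leaves the default lazy path in place.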
Rollback
To disable federation and return to single-repo behavior, delete the
workspace registry:
rm .weld/workspaces.yaml
This returns weld to legacy single-repo discovery at the root. Child
repositories are untouched -- each child's .weld/ directory, graph, and
configuration remain intact and continue to work independently.
Optionally, remove the generated ledger as well:
rm .weld/workspace-state.json
CLI reference
| Command | Description |
|---|---|
| wd init | Bootstrap .weld/discover.yaml (and workspaces.yaml when nested repos are detected); seed managed .weld/.gitignore (config-only default ignores generated graphs) |
| wd init --max-depth N | Limit nested repo scan depth during init (default: 4) |
| wd init --respect-gitignore | Skip scan-only nested repos ignored by Git when writing workspaces.yaml; explicit children can still be added later |
| wd init --track-graphs | Seed .weld/.gitignore so canonical graphs (graph.json + agent-graph.json) stay tracked alongside config (warm-CI / warm-MCP workflow) |
| wd init --ignore-all | Write a fully-ignoring .weld/.gitignore instead of the config-only default; mutually exclusive with --track-graphs |
| wd discover | Run discovery, emit graph JSON (federation mode when workspaces.yaml is present); on success prints a one-line stderr summary wrote N nodes / M edges -> path (T.Ts), suppressed by --quiet |
| wd agents discover | Scan AI customization assets and write .weld/agent-graph.json; text mode summarizes diagnostics per code and --show-diagnostics dumps the full list inline |
| wd agents rediscover | Refresh .weld/agent-graph.json from a new static scan |
| wd agents list | List discovered AI customization assets from .weld/agent-graph.json |
| wd agents explain <asset> | Explain one AI customization asset and its graph relationships |
| wd agents impact <asset> | Show affected Agent Graph assets for a proposed customization change |
| wd agents audit | Audit AI customization assets for static consistency issues |
| wd agents plan-change "<request>" | Plan a static AI customization behavior change |
| wd agents viz | Local read-only browser explorer for .weld/agent-graph.json |
| wd workspace status | Show workspace child ledger: lifecycle status, git ref, dirty state |
| wd workspace status --json | Emit the raw workspace-state.json payload |
| wd workspace bootstrap | One-shot polyrepo bootstrap: init root + every nested child, recurse-discover, rebuild root meta-graph (config-only .weld/.gitignore default) |
| wd workspace bootstrap --respect-gitignore | Skip scan-only child repos ignored by Git and persist scan.respect_gitignore: true into workspaces.yaml |
| wd workspace bootstrap --track-graphs | Bootstrap and seed .weld/.gitignore in root and every child to track canonical graphs alongside config |
| wd workspace bootstrap --ignore-all | Bootstrap and write a fully-ignoring .weld/.gitignore in root and every child; mutually exclusive with --track-graphs |
| wd build-index | Regenerate file index |
| wd query <term> | Hybrid-ranked tokenized graph search (strict-AND first; OR fallback when AND yields nothing on multi-word phrases; the envelope is then tagged with degraded_match=or_fallback) |
| wd find <term> [--limit N] | Broad file-token search, separate from graph discovery; each hit carries an integer score (default --limit 20) |
| wd context <id> | Node + neighborhood |
| wd path <from> <to> | Shortest path |
| wd trace <term> | Startup/runtime and interaction slice around a term or node |
| wd impact <path-or-node> | Reverse-dependency blast radius |
| wd capabilities | Runtime per-language / per-framework support matrix (--json, --missing) |
| wd callers <symbol> | Direct/transitive callers |
| wd viz | Local read-only browser graph explorer (sidebar toggles: Hide standard library, Hide third-party dependencies; see Filtering noise in wd viz) |
| wd stale | Check graph freshness |
| wd <read-cmd> --no-refresh | Skip the auto-refresh that runs when the graph is stale; a warning is emitted to stderr. Set WELD_AUTO_REFRESH=0 to disable globally for CI / batch runs. |
| wd graph stats | Graph statistics |
| wd graph communities [--format json\|markdown] [--top N] [--write] | Detect deterministic graph communities, report top-level hubs, and optionally write derived JSON/report/index artifacts (unresolved-symbol nodes are excluded from the projected subgraph) |
| wd stats | Backward-compatible alias for wd graph stats |
| wd graph validate | Validate graph against the contract |
| wd graph validate-fragment <file> | Validate imported graph fragments and warn on trace-inert semantics |
| wd validate | Backward-compatible alias for wd graph validate |
| wd migrate --add-confidence | Backfill missing edge confidence props (definite / inferred / speculative) by classifying each edge from its source_strategy; strategies without a declared default land at speculative. Writes the graph back and emits a JSON report {filled, unchanged, invalid}. |
| wd doctor | Check setup health; exits 0 in directories that are not Weld projects yet |
| wd prime | Setup status + per-framework agent surface matrix (skill / instruction / mcp) with fix commands; --agent {auto,claude,codex,copilot,all} forces an agent row even when its framework files are absent |
| wd scaffold | Write starter templates |
| wd bootstrap | Agent onboarding files |
| wd brief | Agent context briefing |
| wd enrich | LLM-assisted semantic enrichment |
| wd lint | Lint the graph for architectural violations |
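The wd query row describes a strict-AND pass with an OR fallback. As a rough illustration of that control flow (not Weld's ranking code), the sketch below runs both passes over a toy inverted index. The index shape, the search function, and the envelope keys other than degraded_match are assumptions made for the example.

```python
def search(index, query):
    """Strict-AND token search with an OR fallback for multi-word queries.

    index is a hypothetical mapping of token -> set of node ids.
    """
    tokens = query.lower().split()
    if not tokens:
        return {"matches": [], "degraded_match": None}
    hits = [index.get(token, set()) for token in tokens]
    # First pass: a node must contain every token.
    strict = set.intersection(*hits)
    if strict or len(tokens) == 1:
        return {"matches": sorted(strict), "degraded_match": None}
    # Fallback: any token may match; tag the envelope so callers can tell.
    loose = set.union(*hits)
    return {"matches": sorted(loose), "degraded_match": "or_fallback"}


if __name__ == "__main__":
    toy = {"auth": {"n1", "n2"}, "token": {"n2"}, "cache": {"n3"}}
    print(search(toy, "auth token"))
    print(search(toy, "auth cache"))
```

A caller that cares about precision can check degraded_match and treat fallback results as weaker evidence.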
wd doctor reports each finding at one of four levels: [ok ] for
healthy state, [note] for soft recommendations (a missing optional
provider, no MCP config), [warn] for a currently-degraded state (a
stale graph, missing tree-sitter grammars), and [fail] for invalid
setup. Only [fail] raises the exit code; notes and warnings are
visible but never fatal. Each note carries a stable id
(e.g. (id: optional-copilot-cli-missing)) that you can dismiss per project:
wd doctor --ack optional-copilot-cli-missing # write to .weld/doctor.yaml
wd doctor --unack optional-copilot-cli-missing # restore
wd doctor --list-acks # list current dismissals
The valid note ids are mcp-config-missing, optional-mcp-missing,
optional-anthropic-missing, optional-openai-missing,
optional-ollama-missing, and optional-copilot-cli-missing. The
copilot-cli probe walks WELD_COPILOT_BINARY and PATH for the
standalone GitHub Copilot CLI binary, so its install hint points at
github.com/en/copilot
rather than a pip install line.
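The level and dismissal rules can be read as a small reduction over findings. The sketch below is illustrative only: the finding dict shape, the summarize function, and the concrete exit value 1 are assumptions, since the text says only that [fail] raises the exit code.

```python
def summarize(findings, acked=frozenset()):
    """Drop acknowledged notes and compute an exit code.

    Hypothetical model: each finding is a dict with a level of
    ok, note, warn, or fail, and notes carry an id.
    """
    visible = [
        f for f in findings
        if not (f["level"] == "note" and f.get("id") in acked)
    ]
    # Only fail is fatal; notes and warnings stay visible but return 0.
    code = 1 if any(f["level"] == "fail" for f in visible) else 0
    return code, visible


if __name__ == "__main__":
    report = [
        {"level": "ok"},
        {"level": "note", "id": "optional-copilot-cli-missing"},
        {"level": "warn"},
    ]
    print(summarize(report, acked={"optional-copilot-cli-missing"}))
```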
wd lint also loads custom edge rules from .weld/lint-rules.yaml when
present:
rules:
  - name: no-api-to-internal
    deny:
      from: { type: file, path_match: "api/**" }
      to: { type: file, path_match: "internal/**" }
Rules can add an allow block with the same from / to selectors to
exempt specific edges from a broader deny match.
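To make the deny / allow interaction concrete, here is a minimal sketch of how one such rule could be evaluated against a single edge. It is not Weld's matcher: the node dict shape and function names are invented for the example, and fnmatchcase stands in for whatever glob semantics path_match really uses.

```python
from fnmatch import fnmatchcase


def selector_matches(selector, node):
    """Check a from/to selector against a node (hypothetical dict shape)."""
    return (
        node["type"] == selector["type"]
        and fnmatchcase(node["path"], selector["path_match"])
    )


def edge_violates(rule, src, dst):
    """Return True when the deny block matches and no allow block exempts it."""
    def hits(block):
        return selector_matches(block["from"], src) and selector_matches(block["to"], dst)

    if not hits(rule["deny"]):
        return False
    allow = rule.get("allow")
    return not (allow is not None and hits(allow))


if __name__ == "__main__":
    rule = {
        "name": "no-api-to-internal",
        "deny": {
            "from": {"type": "file", "path_match": "api/**"},
            "to": {"type": "file", "path_match": "internal/**"},
        },
    }
    src = {"type": "file", "path": "api/routes.py"}
    dst = {"type": "file", "path": "internal/db.py"}
    print(edge_violates(rule, src, dst))
```

Evaluating deny first and allow second keeps the allow block a pure exemption: it can only remove violations, never create them.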
Output is signal-first: the summary line counts violations per rule,
high-signal rules (no-circular-deps, boundary-enforcement) print
before noisier ones, and orphan-detection runs last. By default the
orphan rule suppresses doc, config, and test-file node types
(intentional leaves in nearly every codebase) and the suppressed count
is reported in the summary. Pass --include-noisy to surface every
orphan. Suppressed orphans on their own do not raise the exit code.
Run wd --help for the full list.
The repository includes a canonical Agent System Maintainer skill at
.agents/skills/agent-system-maintainer/SKILL.md and a GitHub Copilot
Agent Architect at .github/agents/agent-architect.agent.md. They are
ordinary Agent Graph assets, so wd agents discover, explain, and
impact can inspect them before future customization changes.
Edge provenance with props.source
wd add-edge accepts a strict set of edge types (see
weld.contract.VALID_EDGE_TYPES). When an agent, tool, or LLM emits an
edge, stamp its origin under props.source so downstream consumers can
filter, rank, or audit tool-generated relationships. The --props help
text carries the canonical example: --props '{"source":"llm","confidence":"inferred"}'.
The source value is free-form (agent name, tool name, llm, manual,
strategy id); confidence follows the existing vocabulary
(definite, inferred, speculative). This replaces the 0.3.0-era
--source and --relation flags.
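A tool that emits edges can build the --props payload programmatically rather than hand-writing JSON. The helper below is a sketch: the name edge_props is hypothetical, and the client-side check against the confidence vocabulary is a convenience added for the example, not something the README says wd add-edge requires.

```python
import json

# Confidence vocabulary named in the README.
VALID_CONFIDENCE = ("definite", "inferred", "speculative")


def edge_props(source, confidence):
    """Return a compact JSON string suitable for the --props argument."""
    if confidence not in VALID_CONFIDENCE:
        raise ValueError(f"unknown confidence: {confidence!r}")
    # source is free-form: agent name, tool name, llm, manual, strategy id.
    payload = {"source": source, "confidence": confidence}
    return json.dumps(payload, separators=(",", ":"))


if __name__ == "__main__":
    print(edge_props("llm", "inferred"))
```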
Filtering noise in wd viz
A real codebase's graph mixes the application code you wrote with calls into
the language standard library (print, len, std::string) and third-party
dependencies (numpy, boost, npm packages, Cargo crates). When you open
wd viz, the sidebar gives you two checkboxes for collapsing that noise so
you can focus on application code:
- Hide standard library: drops nodes classified as language built-ins
  or stdlib (Python builtins and sys.stdlib_module_names; C++ std::
  and toolchain libc++/libstdc++ headers; analogous lists per language).
- Hide third-party dependencies: drops nodes resolved outside the
  project tree but not part of the language stdlib (PyPI / npm / Cargo /
  Go-module / vendored boost / vendored serde, etc.).
Each label shows a count next to it (for example "Hide standard library
(412)") so you can see how much each toggle would remove before applying it.
The two checkboxes are independent and compose: tick both to focus on
project-only code, tick neither to see the full graph.
Hiding is a presentation choice. The underlying graph is unchanged: every
node still exists in .weld/graph.json, and every other surface
(wd query, wd context, MCP) still returns the hidden nodes. wd query "print" continues to surface the stdlib print node even when "Hide
standard library" is ticked in the visualizer.
How a node is classified
Every symbol, file, and module node in the graph carries a
props.origin value taking one of four lower-case strings:
| Value | Meaning |
|---|---|
| project | Defined in this repo, or in any federated child repo of the active workspace -- the application code you wrote. |
| stdlib | Language standard library or built-in (Python builtins, sys.stdlib_module_names; C++ std:: and toolchain headers; per-language equivalents). |
| external | Third-party dependency resolved outside the project tree but not part of the language stdlib (PyPI, npm, Cargo, vendored libraries). |
| unresolved | The discovery strategy could not determine the target's source -- for example a from foo import bar whose foo does not exist. Hidden by default in the overview slice (the two checkboxes only toggle stdlib and external); a custom UI or scripted call can override this by passing an explicit hide_origins value to the API. |
The four values are exhaustive and mutually exclusive. Strategies that
emit props.origin directly are authoritative; legacy graphs without
the field are classified deterministically from existing signals
(authority, resolved, symbol:unresolved: ID prefix, edge-side
props.resolution). Re-running wd discover upgrades a legacy graph
to explicit origin tags. The graph schema reference in
docs/graph-schema.md documents props.origin
alongside the other optional node props, including links to the design
notes and per-language detection rules.
Driving the same filter from the API or URL
wd viz exposes the same filter as a query parameter on its slice
endpoints, which is useful when scripting screenshots or driving the
visualizer from another tool:
GET /api/slice?hide_origins=stdlib,external
GET /api/slice?hide_origins=stdlib
hide_origins is a comma-separated list drawn from the four values
above. Omitting it falls back to the default overview behavior (hide
unresolved only). The /api/summary payload carries
nodes_by_origin (a per-origin count) so a custom UI can render the
same "(412)" hint next to its own toggles.
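For scripted use, the slice URL and the count hint can be assembled with the standard library. This is a sketch under assumptions: the base URL and port are placeholders, and slice_url and toggle_label are hypothetical helpers rather than part of Weld.

```python
from urllib.parse import urlencode

# The four origin values documented above.
ORIGINS = ("project", "stdlib", "external", "unresolved")


def slice_url(base, hide_origins=None):
    """Build an /api/slice URL; None keeps the server-side default."""
    if hide_origins is None:
        return f"{base}/api/slice"
    unknown = [o for o in hide_origins if o not in ORIGINS]
    if unknown:
        raise ValueError(f"unknown origin values: {unknown}")
    # safe="," keeps the comma-separated list readable in the query string.
    query = urlencode({"hide_origins": ",".join(hide_origins)}, safe=",")
    return f"{base}/api/slice?{query}"


def toggle_label(text, nodes_by_origin, origin):
    """Render a label such as 'Hide standard library (412)'."""
    return f"{text} ({nodes_by_origin.get(origin, 0)})"


if __name__ == "__main__":
    base = "http://127.0.0.1:8000"  # placeholder address, not a documented default
    print(slice_url(base, ["stdlib", "external"]))
    print(toggle_label("Hide standard library", {"stdlib": 412}, "stdlib"))
```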
Examples
- 01-python-fastapi -- discover a FastAPI project: routes, Pydantic models,
  module structure
- 02-custom-strategy -- write a project-local strategy plugin that extracts
  TODO/FIXME comments
- 04-monorepo-typescript -- discover a TypeScript monorepo: workspace
  packages, cross-package imports, shared types
- 05-polyrepo -- set up a federated polyrepo workspace: workspaces.yaml,
  cross-repo discovery, workspace status
- agent-graph-demo -- inspect mixed AI customization assets with
  wd agents discover, list, audit, explain, impact, and plan-change
For a tour of what each command above actually prints, see
Graph visualization examples — real
terminal snippets captured against wd 0.20.0.
Install
Recommended: uv tool install
uv tool install configflux-weld
# Verify
wd --version
This is the single recommended install path. uv tool install puts
wd on your PATH in an isolated environment, is fast, and gives you a
clear update story:
uv tool upgrade configflux-weld # or: uv tool upgrade --all
Don't have uv yet? See the uv install
instructions.
To run the stdio MCP server, install the optional MCP extra:
uv tool install "configflux-weld[mcp]"
python -m weld.mcp_server --help
wd mcp config does not require the extra; only the server process does.
Alternative install paths
The paths below are supported but secondary. Prefer uv tool install unless
you have a concrete reason to pick one of these.
pipx (if you already standardize on pipx)
pipx install configflux-weld
wd --version
Functionally equivalent to uv tool install for end users. Use whichever
tool manager your team already has.
install.sh (zero-dependency bootstrap)
curl -fsSL https://raw.githubusercontent.com/configflux/weld/main/install.sh | sh
install.sh is a POSIX shell script that detects a compatible Python (3.10
through 3.13) and installs via uv, pipx, or pip --user, in that order
of preference. Use it only when you don't have uv or pipx available and
can't install them first — for example, on a minimal CI image or a
locked-down host. It is idempotent (re-running upgrades an existing install)
and honours a .weld-version file in the current directory or any ancestor
to pin a specific release tag.
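The ancestor lookup for .weld-version amounts to walking up from the current directory until the file is found. install.sh does this in POSIX shell; the Python rendering below is only a sketch of the same idea, and it assumes the file holds a single release tag on one line.

```python
from pathlib import Path


def find_pinned_version(start):
    """Return the tag from the nearest .weld-version, or None if absent.

    Hypothetical helper; the single-line file format is an assumption.
    """
    start = Path(start).resolve()
    # Check the starting directory first, then each ancestor up to the root.
    for directory in (start, *start.parents):
        candidate = directory / ".weld-version"
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8").strip()
    return None


if __name__ == "__main__":
    print(find_pinned_version(Path.cwd()))
```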
From a local checkout (development)
If you want to edit Weld itself, use a source-checkout install. See
CONTRIBUTING.md for the full developer setup, including
editable installs and optional-extras commands for tree-sitter, mcp,
openai, anthropic, ollama, and llm.
From a Git URL
pip install "git+https://github.com/configflux/weld.git@main#subdirectory=weld"
Useful for pinning an unreleased commit or branch.
Raw source (no install)
If you cannot install anything, the module entrypoint works from a plain
checkout:
python -m weld --help
Python compatibility
Runtime installs support Python 3.10 through 3.13. Contributor builds and
Bazel tests use the Python 3.12 toolchain pinned in MODULE.bazel, so the
development toolchain can be narrower than the runtime support window.
Release policy
main is the source of truth for the next release: the version recorded
in VERSION and weld/pyproject.toml matches the latest
publish/vX.Y.Z git tag, except during a deliberately-staged window
where main is bumped ahead of the latest tag.
The drift shape that produced the v0.9.0 and v0.10.1 incidents -- main
silently regressing below the latest published wheel -- is now caught
post-release by tools/check_main_release_consistency.py (runs as part
of the /release-audit flow). To document a deliberate
"main is ahead of the latest tag" window, add a comment marker to
this README:
<!-- release-lag: 0.11.0 staged for 2026-05-12 launch window -->
The check then turns the lag into a WARN and surfaces the reason
instead of failing. Remove the marker when the matching tag is cut.
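To show what detecting the marker might involve, the sketch below pulls the version and reason out of a release-lag comment with a regular expression. It is not the logic in tools/check_main_release_consistency.py; the pattern and the function name find_release_lag are assumptions based only on the example marker above.

```python
import re

# Assumed shape: <!-- release-lag: <version> <free-form reason> -->
_MARKER = re.compile(
    r"<!--\s*release-lag:\s*(?P<version>\S+)\s+(?P<reason>.*?)\s*-->"
)


def find_release_lag(readme_text):
    """Return (version, reason) for the first marker, or None if absent."""
    match = _MARKER.search(readme_text)
    if match is None:
        return None
    return match.group("version"), match.group("reason")


if __name__ == "__main__":
    sample = "<!-- release-lag: 0.11.0 staged for 2026-05-12 launch window -->"
    print(find_release_lag(sample))
```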
See docs/release.md for the full release
checklist (the post-release consistency check is step 9).
Documentation
- Full toolkit guide -- architecture, design limits, roadmap
- Onboarding guide
- Agent workflow -- when to use each retrieval surface
- Agent Graph -- static map of the AI customization layer (agents, skills,
  prompts, hooks, MCP servers)
- Graph visualization examples -- real terminal output: monorepo graph,
  polyrepo repo: nodes, Agent Graph, MCP config snippet
- Platform support matrix -- per-platform support and runtime-validation
  status
- Performance notes -- discovery and query timings on synthetic 1k/10k/100k
  single repos and polyrepo workspaces, with a reproducible recipe
- Strategy cookbook
- Glossary
Contributing
See CONTRIBUTING.md. Weld is currently maintainer-led.
Issues, bug reports, demo repos, documentation improvements, and strategy
proposals are welcome. For larger changes, please open an issue first so we
can align on scope before implementation.
License
Apache License, Version 2.0 — see LICENSE for details.