graphlens

Name: graphlens
Author: Neko1313

Extensible polyglot code analysis framework that parses source projects, normalizes their structure into a shared graph IR, and exposes it for dependency analysis, navigation, and code intelligence tooling.


Architecture

Repository → Language Adapter → GraphLens (IR) → Graph Backend

| Layer | Responsibility |
| --- | --- |
| Language Adapter | Parses source files, produces `GraphLens` |
| GraphLens | Typed nodes + directed relations (the IR) |
| Graph Backend | Persists or queries the graph (Neo4j, in-memory, …) |

Adapters are pure data producers — they never write to any backend. The graph is the only output.

Why graph IR?

Language-agnostic — one shared model for Python, TypeScript, Go, Rust, PHP, …
Plugin-based adapters — each language is a separate package, registered via Python entry points
Tree-sitter powered — all adapters use tree-sitter for CST parsing and exact span positions, combined with type-aware resolution (ty for Python, TypeScript Compiler API for TypeScript, gopls for Go, rust-analyzer for Rust, PHPantom for PHP)
Cross-language aware — adapters emit language-agnostic BOUNDARY ports (HTTP, queues, gRPC, Temporal); graphlens-link connects a consumer in one language to a provider in another
Monorepo aware — can_handle() and find_*_roots() handle multi-language repos correctly
Deterministic node IDs — SHA-256 hash of project::kind::qualified_name → stable across re-scans
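
The deterministic ID scheme can be illustrated with a short sketch. `make_node_id` is a hypothetical helper written for this example; the UTF-8 encoding, the spelling of `kind`, and the use of the full hex digest are assumptions, and the library's own function may differ.

```python
import hashlib


def make_node_id(project: str, kind: str, qualified_name: str) -> str:
    """Hash the project::kind::qualified_name key with SHA-256 (hex digest)."""
    key = f"{project}::{kind}::{qualified_name}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# The same inputs always give the same ID, so a re-scan maps onto the same node.
first = make_node_id("my-project", "FUNCTION", "pkg.mod.my_function")
second = make_node_id("my-project", "FUNCTION", "pkg.mod.my_function")

# Changing any component (here the kind) gives a different ID.
other = make_node_id("my-project", "CLASS", "pkg.mod.my_function")
```

Because the ID depends only on these three strings, two scans of an unchanged symbol can be compared by ID without any positional matching.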

Benchmarks

Analysis throughput on large real-world projects, refreshed automatically on
every release — one cold run per project inside the published Docker image
(so the numbers reflect exactly the toolchain users get). See
benchmarks/ to reproduce locally or add a project.

Last run: 2026-06-23 13:48 UTC · image latest · runner Linux x86_64 · single cold run, indicative only.

| Project | Lang | Commit | LOC | Files | Nodes | Relations | Time | Peak RSS | KLOC/s | Resolver | Resolved |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| apache/superset | python | `c83fb2b` | 399 519 | 1 886 | 156 253 | 379 813 | 150.2s | 2,050 MB | 2.7 | ok | 84% of 281 667 (81s) |
| colinhacks/zod | typescript | `1fb56a5` | 74 194 | 404 | 8 741 | 25 258 | 19.6s | 645 MB | 3.8 | ok | 91% of 15 771 (16s) |
| gin-gonic/gin | go | `73726dc` | 23 672 | 98 | 7 227 | 11 882 | 14.3s | 2,089 MB | 1.7 | ok | 100% of 8 920 (13s) |
| casdoor/casdoor | go | `696bcf0` | 86 898 | 458 | 14 987 | 28 276 | 133.6s | 14,434 MB | 0.7 | ok | 100% of 19 421 (130s) |
| gohugoio/hugo | go | `4d22555` | 224 821 | 897 | 34 809 | 72 225 | 110.0s | 9,487 MB | 2.0 | ok | 99% of 49 013 (103s) |
| BurntSushi/ripgrep | rust | `4649aa9` | 50 275 | 98 | 5 365 | 15 087 | 18.5s | 1,524 MB | 2.7 | ok | 99% of 11 435 (1s) |
| tokio-rs/axum | rust | `c59208c` | 43 653 | 296 | 8 093 | 14 799 | 83.7s | 4,426 MB | 0.5 | ok | 88% of 9 662 (1s) |
| astral-sh/ruff | rust | `6686f63` | 687 409 | 1 870 | 69 708 | 217 127 | 255.6s | 8,570 MB | 2.7 | ok | 100% of 155 276 (14s) |
| laravel/framework | php | `bd8aeb6` | 441 358 | 2 478 | 97 048 | 169 733 | 1133.8s | 2,831 MB | 0.4 | ok | 45% of 191 435 (945s) |
| Total | | | 2 031 799 | | 402 231 | | 1919.3s | | 1.1 | | 79% of 742 600 |

Peak RSS measured via cgroup v2 (whole process tree, incl. LSP resolver subprocesses). KLOC/s = analysed thousands-of-lines per second. Generated by benchmarks/run_benchmarks.py.
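
As a worked check of the KLOC/s column, the figure is LOC divided by 1000 and by the wall-clock time. The sketch below recomputes two table entries; rounding to one decimal place is an assumption that happens to match the published values, and `kloc_per_second` is a hypothetical helper, not part of benchmarks/run_benchmarks.py.

```python
def kloc_per_second(loc: int, seconds: float) -> float:
    """Thousands of analysed lines per second, rounded to one decimal."""
    return round(loc / 1000 / seconds, 1)


# apache/superset row: 399 519 LOC in 150.2 s
superset = kloc_per_second(399_519, 150.2)

# Total row: summed LOC over summed time, not the mean of the per-project rates
total = kloc_per_second(2_031_799, 1919.3)

print(superset, total)  # prints: 2.7 1.1
```

The Total figure is dominated by the slowest project, which is why it sits well below most individual rows.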

Documentation

Full product documentation lives at https://Neko1313.github.io/graphlens/
(built with Docusaurus from website/):

Getting Started — install, quick start, core concepts
Guides — library API, CLI, querying, visualization, Neo4j, cross-language, MCP
CI Integration — strict mode, GitHub Actions, Docker, local hooks
Adapters — Python, TypeScript, Go, Rust, PHP, and writing your own
Graph Model — nodes, relations, boundaries, serialization
API Reference — exact signatures

To run the docs locally: cd website && pnpm install && pnpm start.

Installation

# Core library only (models, contracts, registry)
pip install graphlens

# Core + Python adapter
pip install "graphlens[python]"

# Core + TypeScript adapter
pip install "graphlens[typescript]"

# Core + Go / Rust / PHP adapters
pip install "graphlens[go]"
pip install "graphlens[rust]"
pip install "graphlens[php]"

# CLI (graphlens analyze / visualize / query / neo4j)
pip install "graphlens-cli[python]"          # with Python adapter
pip install "graphlens-cli[all]"             # Python + TS + Go + Rust + PHP + Neo4j

With uv:

uv add graphlens
uv add "graphlens[python]"
uv add "graphlens[typescript]"
uv add "graphlens-cli[all]"

Docker (all adapters + toolchains pre-installed)

For CI, the published image bundles the CLI with every adapter and the
toolchains their resolvers drive (ty, Node, Go + gopls, Rust + rust-analyzer,
PHP + PHPantom), so no local setup is required. It is also the supported way
to get the Go, Rust and PHP adapters, which are not published to PyPI. Mount
your project at /workspace:

docker run --rm -v "$PWD:/workspace" ghcr.io/neko1313/graphlens \
    analyze /workspace --output /workspace/graph.json

The image is published to the GitHub Container Registry on each release
(:latest plus :X.Y.Z / :X.Y version tags).

Quick start

from pathlib import Path
from graphlens import adapter_registry

# Load and instantiate the Python adapter
adapter = adapter_registry.load("python")()

# Analyze a project — returns a GraphLens
graph = adapter.analyze(Path("./my-project"))

print(f"Nodes:     {len(graph.nodes)}")
print(f"Relations: {len(graph.relations)}")

# Inspect nodes by kind
from graphlens import NodeKind

modules = [n for n in graph.nodes.values() if n.kind == NodeKind.MODULE]
classes = [n for n in graph.nodes.values() if n.kind == NodeKind.CLASS]

# Check the resolver actually ran (don't trust a silently degraded graph)
from graphlens import RESOLVER_STATUS_KEY
assert graph.metadata[RESOLVER_STATUS_KEY] == "ok"

# Query the graph (indexed lookups, no manual scanning)
fn = next(n for n in graph.nodes.values() if n.name == "my_function")
callers = graph.callers(fn.id)          # who calls it
callees = graph.callees(fn.id)          # what it calls
near = graph.neighbors(fn.id, depth=2)  # 2-hop neighbourhood

# Serialize for pipelines / agents (round-trippable JSON), then reload
text = graph.to_json(indent=2)
graph2 = type(graph).from_json(text)

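# Diff two scans (e.g. before/after a change); `old_graph` is a GraphLens
# from an earlier scan, for example one reloaded with from_json as above
diff = old_graph.diff(graph)
print(diff.added_nodes, diff.removed_relations, diff.is_empty)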
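
Building on the "inspect nodes by kind" step, a per-kind tally is often the first sanity check after a scan. The sketch below relies only on the `graph.nodes` mapping and the `kind` attribute used above; `count_by_kind` is a hypothetical helper, and the stand-in objects exist only so the example runs without graphlens installed.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_kind(graph) -> Counter:
    """Tally nodes per kind for any object exposing a `nodes` mapping."""
    return Counter(node.kind for node in graph.nodes.values())


# Stand-in graph: with the real library, pass the GraphLens returned by
# adapter.analyze(...) and the keys will be NodeKind members instead of strings.
demo = SimpleNamespace(
    nodes={
        "a": SimpleNamespace(kind="MODULE"),
        "b": SimpleNamespace(kind="CLASS"),
        "c": SimpleNamespace(kind="CLASS"),
    }
)
counts = count_by_kind(demo)
```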

CLI (`graphlens-cli`)

Install graphlens-cli to get the graphlens entry point:

# Print node/relation statistics
graphlens analyze <project_root>
graphlens analyze ~/myrepo --lang python,typescript,go,rust

# Serialize the graph to JSON (CI indexing step); --strict fails on a
# degraded resolver so a pipeline never feeds agents an incomplete graph
graphlens analyze ~/myrepo --output graph.json
graphlens analyze ~/myrepo --format json
graphlens analyze ~/myrepo --strict

# Query a saved graph (callers | callees | references | neighbors)
graphlens query my_function --graph graph.json --op callers
graphlens query MyClass.method --graph graph.json --op neighbors --depth 2

# Interactive HTML graph viewer (opens in browser)
graphlens visualize <project_root>
graphlens visualize ~/myrepo --lang python --show-external --max-nodes 500
graphlens visualize . --output graph.html --no-open

# Export to Neo4j
graphlens neo4j <project_root> --uri bolt://localhost:7687 --user neo4j --password secret
graphlens neo4j . --wipe --batch-size 200

# Serve the graph to agents over the Model Context Protocol (needs the
# optional `mcp` extra: pip install "graphlens-cli[mcp]")
graphlens mcp --graph graph.json
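
For a CI indexing step driven from Python, the command can be assembled and run with `subprocess`. This is a sketch under two assumptions: that `--output` and `--strict` may be combined in one invocation, and that a degraded resolver under `--strict` surfaces as a non-zero exit code. `build_analyze_command` and `run_indexing` are hypothetical names.

```python
import subprocess
from pathlib import Path


def build_analyze_command(
    project_root: Path, output: Path, strict: bool = True
) -> list[str]:
    """Assemble the argv for `graphlens analyze`, using only flags shown above."""
    cmd = ["graphlens", "analyze", str(project_root), "--output", str(output)]
    if strict:
        cmd.append("--strict")
    return cmd


def run_indexing(project_root: Path, output: Path) -> int:
    """Run the CLI and return its exit code so the caller can fail the job."""
    completed = subprocess.run(
        build_analyze_command(project_root, output), check=False
    )
    return completed.returncode


cmd = build_analyze_command(Path("repo"), Path("graph.json"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.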

`mcp` — Model Context Protocol server

Exposes a saved graph to LLM agents as MCP tools: graph_stats,
find_nodes, callers, callees, references, neighbors,
boundaries, and communicates_with. Install with the mcp extra and
point it at a JSON graph produced by graphlens analyze --output.

`visualize` — interactive HTML graph viewer

Produces a self-contained HTML file powered by vis.js and opens it in the browser.

| Flag | Description |
| --- | --- |
| `--lang auto\|python\|typescript\|python,typescript` | Adapters to use (default: auto-detect all) |
| `--show-external` | Include stdlib / third-party external symbol nodes |
| `--show-structure` | Add `CONTAINS` / `DECLARES` structural edges |
| `--max-nodes N` | Prune low-degree nodes above N (default: 1500) |
| `--output PATH` | Write HTML to PATH instead of `graph-<name>.html` |
| `--no-open` | Do not open the browser automatically |

Click behaviour — click any node to see its info panel. For FUNCTION
and METHOD nodes the panel has a "Show callers" button that switches the
graph into focus mode: only the selected node and every node that calls or
references it are shown, with the caller list in the sidebar. Click empty
space or ← Back to return to the full graph.
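
The focus-mode selection described above can be modelled in a few lines. The viewer itself runs in the browser with vis.js, so this Python sketch only illustrates the filtering rule; the `(source, kind, target)` tuple shape and the `focus_on` helper are assumptions made for the example.

```python
Relation = tuple[str, str, str]  # (source_id, kind, target_id)


def focus_on(node_id: str, relations: list[Relation]) -> set[str]:
    """Return the selected node plus every node that calls or references it."""
    visible = {node_id}
    for source, kind, target in relations:
        if target == node_id and kind in {"CALLS", "REFERENCES"}:
            visible.add(source)
    return visible


relations = [
    ("main", "CALLS", "helper"),
    ("Config", "REFERENCES", "helper"),
    ("helper", "CALLS", "log"),        # outgoing edge: not a caller
    ("module", "DECLARES", "helper"),  # structural edge: ignored
]
visible = focus_on("helper", relations)
```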

`neo4j` — export to Neo4j

Uses UNWIND … MERGE Cypher (no APOC required). Every node gets a :Code
label plus a kind-specific label (:Function, :ExternalSymbol, …).
Relations are created grouped by type. Install the optional neo4j extra:

pip install "graphlens-cli[neo4j]"
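
To make the UNWIND … MERGE pattern concrete, the sketch below groups node rows by their kind-specific label and builds one query per label. The property names (`id`, `name`), the row shape, and both helper functions are assumptions for illustration; the Cypher the exporter actually emits may differ.

```python
from collections import defaultdict


def build_node_batches(nodes: list[dict]) -> dict[str, list[dict]]:
    """Group node rows by kind-specific label, one batch per label."""
    batches: dict[str, list[dict]] = defaultdict(list)
    for node in nodes:
        batches[node["label"]].append({"id": node["id"], "name": node["name"]})
    return dict(batches)


def merge_nodes_query(label: str) -> str:
    """Build an UNWIND ... MERGE statement for one label."""
    # The label is interpolated into the query text, so validate it first.
    if not label.isidentifier():
        raise ValueError(f"unsafe label: {label!r}")
    return (
        "UNWIND $rows AS row "
        f"MERGE (n:Code:{label} {{id: row.id}}) "
        "SET n.name = row.name"
    )


nodes = [
    {"id": "1", "name": "foo", "label": "Function"},
    {"id": "2", "name": "os.path.join", "label": "ExternalSymbol"},
    {"id": "3", "name": "bar", "label": "Function"},
]
batches = build_node_batches(nodes)
query = merge_nodes_query("Function")
```

Each batch would then be sent as the `$rows` parameter of its query. Using MERGE on the node id keeps repeated exports idempotent.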

Graph model

Node kinds

| Kind | Description |
| --- | --- |
| `PROJECT` | Root project node |
| `MODULE` | Python/TS/… module (directory or file) |
| `FILE` | Source file |
| `CLASS` | Class declaration |
| `FUNCTION` | Top-level function |
| `METHOD` | Method inside a class |
| `PARAMETER` | Function/method parameter |
| `VARIABLE` | Module-level or local variable |
| `ATTRIBUTE` | Class attribute |
| `TYPE_ALIAS` | Type alias declaration |
| `IMPORT` | Import statement |
| `DEPENDENCY` | Declared package dependency |
| `EXTERNAL_SYMBOL` | External symbol (stdlib, third-party, or unknown); carries `metadata["origin"]` |
| `BOUNDARY` | Cross-language interface port (HTTP route, queue topic, gRPC method, Temporal activity); shared id collapses matching server/client across languages |

Relation kinds

| Kind | Description |
| --- | --- |
| `CONTAINS` | Structural containment (project → module → file → class) |
| `DECLARES` | Declaration (file declares function, class declares method) |
| `IMPORTS` | Import edge (file → import node) |
| `RESOLVES_TO` | Import resolved to a module or external symbol |
| `CALLS` | Function/method call (resolved to declaration node) |
| `REFERENCES` | Value reference (variable/attribute used as a value) |
| `INHERITS_FROM` | Class inheritance (resolved to declaration node) |
| `HAS_TYPE` | Type annotation/inference edge (function/param/variable → class or external) |
| `DEPENDS_ON` | Package dependency |
| `EXPOSES` | A server/provider exposes a `BOUNDARY` (e.g. an HTTP route handler) |
| `CONSUMES` | A client/consumer consumes a `BOUNDARY` (e.g. an HTTP call) |
| `COMMUNICATES_WITH` | Consumer → provider, added by `graphlens-link` from matching `EXPOSES`/`CONSUMES` |
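
Since relations are plain directed edges between node ids, a depth-limited neighbourhood such as `graph.neighbors(fn.id, depth=2)` amounts to a breadth-first search. The sketch below is a standalone model of that idea, not the library's implementation; in particular, following edges in both directions is an assumption.

```python
from collections import defaultdict, deque


def neighbors(start: str, edges: list[tuple[str, str]], depth: int) -> set[str]:
    """Return every node within `depth` hops of `start`, ignoring direction."""
    adjacency: dict[str, set[str]] = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand past the hop limit
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))

    seen.discard(start)
    return seen


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "a")]
near = neighbors("a", edges, depth=2)  # "d" is three hops away and is excluded
```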

Cross-language boundaries

Adapters emit BOUNDARY ports for the interfaces a service exposes or
consumes — HTTP/REST routes and clients, message-queue topics, gRPC
methods, and Temporal activities. Each port has a language-agnostic id
(make_boundary_id(mechanism, key)), so a Python FastAPI route and a
TypeScript fetch call to the same path collapse onto one BOUNDARY
node when their graphs are merged. The graphlens-link package then pairs
CONSUMES with EXPOSES into COMMUNICATES_WITH edges:

from graphlens_link import link_graph

merged = python_graph.merge(ts_graph, allow_shared=True)
result = link_graph(merged)          # adds COMMUNICATES_WITH edges

See examples/demo_cross_language.py for a Python-server ↔ TypeScript-client
walkthrough.
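
The pairing step can be modelled without the library. In the sketch below, `toy_boundary_id` is a hypothetical stand-in for `make_boundary_id` (the real normalisation and hashing are not shown in this README), and `pair_ports` illustrates the idea behind `link_graph`: every consumer of a boundary is connected to every provider of the same boundary.

```python
import hashlib
from collections import defaultdict

Relation = tuple[str, str, str]  # (source_id, kind, target_id)


def toy_boundary_id(mechanism: str, key: str) -> str:
    """Language-agnostic id: depends only on the mechanism and the key."""
    return hashlib.sha256(f"{mechanism.lower()}::{key}".encode("utf-8")).hexdigest()


def pair_ports(relations: list[Relation]) -> list[Relation]:
    """Derive COMMUNICATES_WITH edges from matching CONSUMES/EXPOSES edges."""
    providers: dict[str, list[str]] = defaultdict(list)
    consumers: dict[str, list[str]] = defaultdict(list)
    for source, kind, boundary in relations:
        if kind == "EXPOSES":
            providers[boundary].append(source)
        elif kind == "CONSUMES":
            consumers[boundary].append(source)
    return [
        (consumer, "COMMUNICATES_WITH", provider)
        for boundary, users in consumers.items()
        for consumer in users
        for provider in providers.get(boundary, [])
    ]


route = toy_boundary_id("http", "GET /users/{id}")
relations = [
    ("py:get_user", "EXPOSES", route),     # e.g. a FastAPI route handler
    ("ts:fetchUser", "CONSUMES", route),   # e.g. a fetch call to the same path
    ("ts:fetchOrders", "CONSUMES", toy_boundary_id("http", "GET /orders")),
]
links = pair_ports(relations)
```

The unmatched `ts:fetchOrders` consumer produces no edge, since no provider exposes that boundary in the merged graph.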

Adapter plugin system

Language adapters register themselves via Python entry points — no changes to the core needed:

# packages/graphlens-python/pyproject.toml
[project.entry-points."graphlens.adapters"]
python = "graphlens_python:PythonAdapter"

The registry discovers installed adapters automatically at runtime:

from graphlens import adapter_registry

adapter_registry.available()          # ["python", ...]
adapter_cls = adapter_registry.load("python")
adapter = adapter_cls()

Adapters can also be registered manually (useful for testing):

adapter_registry.register("python", MyPythonAdapter)

Implementing an adapter

Subclass LanguageAdapter and implement four methods:

from pathlib import Path
from graphlens import GraphLens, LanguageAdapter

class MyLangAdapter(LanguageAdapter):
    def language(self) -> str:
        return "mylang"

    def file_extensions(self) -> set[str]:
        return {".ml", ".mli"}

    def can_handle(self, project_root: Path) -> bool:
        return (project_root / "dune-project").exists()

    def analyze(
        self, project_root: Path, files: list[Path] | None = None
    ) -> GraphLens:
        graph = GraphLens()
        if files is None:  # an explicit empty list means "analyze nothing"
            files = self.collect_files(project_root)
        # ... parse and populate graph ...
        return graph
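
File collection is the part of an adapter that is easiest to get wrong in a monorepo. The sketch below shows one way to gather files by extension while skipping build and vendor directories; `collect_source_files` and the default skip list are assumptions for this example and do not describe what `collect_files` does in the base class.

```python
import tempfile
from pathlib import Path

DEFAULT_SKIP = frozenset({".git", "node_modules", "_build"})


def collect_source_files(
    project_root: Path,
    extensions: set[str],
    skip_dirs: frozenset[str] = DEFAULT_SKIP,
) -> list[Path]:
    """Return matching files under project_root in a stable, sorted order."""
    found = []
    for path in project_root.rglob("*"):
        if not path.is_file() or path.suffix not in extensions:
            continue
        # Only the directory components below project_root are checked.
        parents = path.relative_to(project_root).parts[:-1]
        if any(part in skip_dirs for part in parents):
            continue
        found.append(path)
    return sorted(found)


# Small throwaway project to exercise the helper.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "main.ml").write_text("let () = ()\n")
    (root / "src" / "main.mli").write_text("")
    (root / "_build").mkdir()
    (root / "_build" / "gen.ml").write_text("")
    (root / "README.md").write_text("")
    names = [
        p.relative_to(root).as_posix()
        for p in collect_source_files(root, {".ml", ".mli"})
    ]
```

Returning a sorted list keeps the traversal order stable between runs, which helps when comparing two scans.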

Project structure

graphlens/                      ← uv workspace root (core library)
  src/graphlens/                ← models, contracts, registry, exceptions, utils
  packages/
    graphlens-python/           ← Python adapter (tree-sitter + ty)
    graphlens-typescript/       ← TypeScript adapter (tree-sitter + Compiler API)
    graphlens-go/               ← Go adapter (tree-sitter + gopls)
    graphlens-rust/             ← Rust adapter (tree-sitter + rust-analyzer)
    graphlens-php/              ← PHP adapter (tree-sitter + PHPantom)
    graphlens-link/             ← cross-language linker (COMMUNICATES_WITH)
    graphlens-cli/              ← CLI (typer): analyze, query, visualize, neo4j, mcp
  tests/                         ← core tests (100% coverage)
  examples/                      ← standalone usage examples

Development

Requires Python 3.13+, uv, task.

task install        # uv sync --all-groups
task lint           # ruff + ty + bandit for all packages
task tests          # all tests with coverage

Individual package tasks:

task core:lint           task core:test
task python:lint         task python:test
task typescript:lint     task typescript:test
task cli:lint            task cli:test

License

MIT