mathematical-discovery-engine
Health Uyari
- License — License: MIT
- Description — Repository has a description
- Active repo — Last push 0 days ago
- Low visibility — Only 9 GitHub stars
Code Gecti
- Code scan — Scanned 12 files during light audit, no dangerous patterns found
Permissions Gecti
- Permissions — No dangerous permissions requested
Bu listing icin henuz AI raporu yok.
This is a mathematical discovery engine, which searches new mathematics applying techniques to existing results
Mathematical Discovery Engine (MDE)
Treat mathematical proof as path-finding through a graph. A knowledge graph of 13,920 mathematical states and 873 proof techniques — from Pythagoras to Perelman — that an LLM searches to discover new theorems.
13,920 nodes · 23,397 edges · 873 proof techniques · 4,262 theorems — explore the full graph in 3D
What This Repository Does
This repository encodes the structure of mathematical knowledge as a graph and uses it to discover new theorems automatically.
It starts from a simple observation: every major theorem in mathematics was discovered by applying known techniques to known facts. Galois used symmetry reduction on polynomial roots. Cantor used diagonalization on real numbers. Wiles used a bridge between modular forms and elliptic curves. The techniques are reusable — the same method of proof appears across centuries and subdisciplines.
This project does three things:
Collects the knowledge. A detailed report covering ~100 of the most consequential theorems in mathematics — from Pythagoras through Perelman — with the discovery context, motivation, thought process, and proof ideas behind each. The report spans 13 chapters organized chronologically, drawing from primary sources (Euler's letters, Gauss's diaries, Ramanujan's notebooks, Wiles's interviews) where they survive.
Builds a knowledge graph. The report is distilled into a structured directed graph: 13,920 nodes (3,791 axioms, 4,994 intermediate states, 4,262 theorems, 873 techniques) connected by 23,397 edges. Each technique has a formal signature — inputs, process, outputs, preconditions — organized into 12 clusters (algebraic manipulation, symmetry & invariants, cross-field transfer, topology & obstruction, etc.). The graph captures not just what is true, but how each truth was reached from earlier ones.
Proposes and implements a discovery engine. Mathematical proof is reframed as path-finding through this finite graph. An orchestrator LLM selects which techniques to try; worker LLMs attempt each technique application in parallel; a pruner eliminates impossible branches using known impossibility theorems (Abel-Ruffini, Gödel, Turing). The search tree grows until a path from the starting axioms to the goal theorem is found — or the frontier is exhausted.
The key insight: instead of searching over all possible mathematical statements (infinite), the engine searches over compositions of 873 known techniques applied to 13,920 known states (finite, tractable). Each solved problem adds new edges back to the graph, creating a self-improving system.
How It Works
Mathematical proof is path-finding through a finite directed graph. Nodes are mathematical states (axioms, intermediate results, theorems). Edges are techniques (the 873 operations cataloged across 12 clusters). An orchestrator LLM directs the search; worker LLMs attempt each technique application; a pruner eliminates impossible branches.
┌──────────────────────┐
│ ORCHESTRATOR LLM │ (Sonnet — strategic decisions)
│ │
│ - Parses problem │
│ - Picks techniques │
│ - Evaluates results │
│ - Decides next path │
└──────────┬───────────┘
│
┌─────────────┼─────────────┐
│ │ │
┌──────▼──────┐ ┌───▼────────┐ ┌──▼───────────┐
│ WORKER A │ │ WORKER B │ │ WORKER C │ (Haiku — cheap, parallel)
│ Tries │ │ Tries │ │ Tries │
│ Technique 1│ │ Technique 2│ │ Technique 3 │
└──────┬──────┘ └───┬────────┘ └──┬───────────┘
│ │ │
▼ ▼ ▼
┌────────────────────────────────────────┐
│ RESULT COLLECTOR │
│ Each worker returns: │
│ - new_state: mathematical result │
│ - confidence: 0.0 - 1.0 │
│ - proof_sketch: reasoning │
│ - impossible: true/false │
└────────────────────┬───────────────────┘
│
┌──────▼──────┐
│ PRUNER │ Kills only branches that are 100% dead
└──────┬──────┘
│
┌──────▼──────┐
│ SEARCH TREE │ Priority queue — expands most promising node
└─────────────┘
The Search Loop
- Parse problem — identify start nodes in the knowledge graph and define the goal state
- Select techniques — score all 873 techniques using a weighted heuristic
- Dispatch workers in parallel — each worker applies ONE technique to ONE state, returns a new state with confidence and proof sketch
- Prune — kill branches violating impossibility theorems (Abel-Ruffini, Gödel, Turing, etc.); demote low-confidence branches
- Check goal — does the new state match the target?
- Repeat — expand the most promising frontier node next
Technique Scoring Heuristic
The orchestrator ranks techniques using:
score = 0.30 * proximity_to_goal
+ 0.25 * precondition_match
+ 0.15 * historical_success
+ 0.20 * bridge_potential ← cross-cluster novelty bonus
- 0.10 * depth_penalty
The bridge_potential term is critical: techniques that connect nodes across different clusters (e.g., number theory ↔ probability) get a bonus because cross-field transfers are where novel proofs live.
Pruning Rules
Hard prunes (100% certain — kill the branch):
- No radical formula for degree ≥ 5 (Abel-Ruffini)
- No decision procedure for arbitrary Diophantine equations (MRDP)
- No consistent complete axiomatization of arithmetic (Gödel)
- No algorithm deciding program halting (Turing)
- Type mismatch or logical contradiction with a known theorem
Soft demotions (uncertain — deprioritize but keep alive):
- Worker confidence < 0.2 → multiply priority by 0.3
- Worker says impossible but reason isn't definitive → multiply by 0.3
Anti-pruning (never prune these):
- Cross-cluster bridge techniques
- Any branch at depth < 3
Setup
Requirements
- Python 3
- Two pip packages
pip install anthropic networkx
Or from the requirements file:
pip install -r discovery_engine/requirements.txt
Knowledge Graph
The graph ships as knowledge_graph.json (5 MB) in the repo root. No separate database or server needed — the engine loads it directly into memory via NetworkX.
| Metric | Count |
|---|---|
| Axiom nodes | 3,791 |
| State nodes | 4,994 |
| Theorem nodes | 4,262 |
| Technique nodes | 873 |
| Total edges | 23,397 |
Configuring Worker APIs
Workers are the LLMs that attempt each technique application. Three modes are available:
Mode 1: Dry Run (free, no API)
Uses mock workers that return synthetic results based on graph structure. Good for testing the search logic.
python3 -m discovery_engine.discover --dry-run \
"Prove that for any primitive set A, f(A) <= f(primes)"
Mode 2: CLI Workers (uses Claude Code subscription)
Spawns claude -p subprocesses. No API key needed — uses your existing Claude Code subscription billing.
python3 -m discovery_engine.discover --use-cli \
--start s_divisibility_definition,s_antichain_in_boolean_lattice \
--goal "f(A) = sum 1/(a log a) is maximized when A is the set of primes" \
"Erdős primitive set conjecture"
Default: 2 parallel workers. Override with --workers N.
Mode 3: API Workers (Anthropic API key)
Direct API calls with prompt caching. The cheapest per-call option for large searches.
export ANTHROPIC_API_KEY=sk-ant-...
python3 -m discovery_engine.discover \
"Prove the Erdős primitive set conjecture"
Default worker model: claude-haiku-4-5-20251001 (~20x cheaper than Sonnet). Override with --worker-model.
Default: 5 parallel workers. Override with --workers N.
Token Optimizations
Three optimizations reduce API cost by ~90-95%:
- Haiku workers — workers apply a single technique to a single state, a focused task that Haiku handles well
- Prompt caching — the static worker system prompt is marked with
cache_control: {"type": "ephemeral"}, so after the first call all subsequent calls within 5 minutes pay only ~10% for that block - Trimmed prompts — user prompts are ~200-400 tokens (3 example inputs/outputs, 8 context nodes, no verbose instructions)
| Scenario | Naive (Sonnet, no cache) | Optimized (Haiku, cached) | Savings |
|---|---|---|---|
| 50 worker calls | ~$0.50 | ~$0.03 | 94% |
| 200 worker calls | ~$3.00 | ~$0.25 | 92% |
| 500 worker calls | ~$8.00 | ~$0.50 | 94% |
Configuring the Orchestrator (Master AI)
The orchestrator is the "brain" that parses problems, selects techniques, and evaluates results. It runs separately from the workers.
Heuristic Mode (default)
No LLM calls for technique selection or goal checking — uses keyword matching and graph structure. Free.
python3 -m discovery_engine.discover --dry-run \
--start s_divisibility_definition \
--goal "f(A) is maximized by primes" \
"Erdős primitive set conjecture"
LLM Orchestration Mode
The orchestrator uses Claude to parse the problem, select techniques, and verify goals. Smarter but costs orchestrator tokens.
python3 -m discovery_engine.discover --llm-orchestrate \
--model claude-sonnet-4-20250514 \
"Prove the Erdős primitive set conjecture"
Default orchestrator model: claude-sonnet-4-20250514. Override with --model.
Running a Discovery
Basic Examples
# Dry run — test the search logic for free
python3 -m discovery_engine.discover --dry-run \
"Prove the Erdős primitive set conjecture"
# CLI workers — uses your Claude Code subscription
python3 -m discovery_engine.discover --use-cli \
--start s_divisibility_definition,s_antichain_in_boolean_lattice \
--goal "f(A) = sum 1/(a log a) is maximized when A is the set of primes" \
"Erdős primitive set conjecture"
# Full API mode with LLM orchestration
export ANTHROPIC_API_KEY=sk-ant-...
python3 -m discovery_engine.discover --llm-orchestrate \
--model claude-sonnet-4-20250514 \
--worker-model claude-haiku-4-5-20251001 \
--workers 5 --max-depth 7 --max-iterations 200 --candidates 4 \
"Prove the Erdős primitive set conjecture"
Checkpointing and Resume
Long runs save periodic checkpoints. Ctrl+C triggers a graceful save before exiting.
# Run with auto-checkpointing (saves every 3 iterations)
python3 -m discovery_engine.discover --use-cli \
--checkpoint-dir checkpoints --checkpoint-every 3 \
--start s_divisibility_definition,s_antichain_in_boolean_lattice \
--goal "density f(c) exists and is continuous" \
"Prime gap density continuity conjecture"
# Resume from last checkpoint
python3 -m discovery_engine.discover --use-cli \
--resume checkpoints/checkpoint_latest.json \
"ignored — problem is restored from checkpoint"
Example Output
Loading knowledge graph from knowledge_graph.json...
Loaded in 0.8s — {'nodes': 13920, 'edges': 23397, 'techniques': 873}
Mode: LIVE (API)
Orchestrator model: claude-sonnet-4-20250514
Worker model: claude-haiku-4-5-20251001
Prompt caching: ON
Problem: Prove the Erdős primitive set conjecture
Start nodes: ['s_divisibility_definition', 's_antichain_in_boolean_lattice']
Goal: f(A) = sum 1/(a log a) is maximized when A is the set of primes
--- Iteration 1 | Expanding sn_0001 (depth 0) | Frontier: 0 ---
State: Erdős primitive set conjecture
Candidates: ['t_axiomatize_from_instances', 't_compose_with_identity',
't_complex_analysis_to_integers', 't_probabilistic_existence']
○ t_axiomatize -> sn_0002 (conf=95%) Primitive set = antichain in (Z>1, |)
○ t_compose -> sn_0003 (conf=80%) FTA + multiplicative structure
○ t_complex -> sn_0004 (conf=65%) Von Mangoldt weights
○ t_probabilistic -> sn_0005 (conf=50%) Erdős-Kac connection
--- Iteration 2 | Expanding sn_0004 (depth 1) | Frontier: 3 ---
...
★ GOAL REACHED at sn_0042!
=== DISCOVERED PATH ===
Layer 0: [START] -> Erdős primitive set conjecture (conf=100%)
Layer 1: [t_axiomatize_from_instances] -> antichain formalized (conf=95%)
Layer 2: [t_compose_with_identity] -> FTA + multiplicative structure (conf=80%)
Layer 3: [t_complex_analysis_to_integers] -> von Mangoldt weights (conf=65%)
Layer 4: [t_reduce_to_canonical_form] -> Markov chain model (conf=55%)
Layer 5: [t_probabilistic_existence] -> stationary distribution bound (conf=60%)
Layer 6: [t_exhaustion_squeeze] -> f(A) ≤ f(primes). QED. (conf=75%)
=== END ===
============================================================
Discovery complete in 42.3s
Stats: {"iterations": 15, "worker_calls": 47, "pruned": 12, "goal_found": true}
--- Token Usage ---
Input tokens: 12,450
Output tokens: 8,200
Cache reads: 7,350 (saved ~90% on these)
Cache creates: 150
Worker calls: 47
Cache hit rate: 37%
All CLI Flags
| Flag | Default | Description |
|---|---|---|
--dry-run |
— | Mock workers, zero API cost |
--use-cli |
— | Use claude -p subprocess as worker (subscription billing) |
--llm-orchestrate |
— | LLM picks techniques + checks goal (more expensive) |
--start <ids> |
auto | Comma-separated start node IDs |
--goal <text> |
auto | Goal description |
--graph <path> |
auto-detect | Path to knowledge_graph.json |
--max-depth N |
7 | Maximum search tree depth |
--max-iterations N |
200 | Maximum search iterations |
--candidates N |
4 | Techniques to try per step |
--workers N |
2 (cli) / 5 (api) | Number of parallel workers |
--model <id> |
claude-sonnet-4-20250514 |
Orchestrator model |
--worker-model <id> |
claude-haiku-4-5-20251001 |
Worker model |
--save-tree <path> |
— | Save search tree to JSON |
--print-tree |
— | Print tree structure at the end |
--checkpoint-dir <path> |
off | Directory for periodic checkpoints |
--checkpoint-every N |
5 | Save checkpoint every N iterations |
--resume <path> |
— | Resume from a checkpoint file |
--quiet |
— | Suppress progress output |
Interactive 3D Graph Viewer
The entire knowledge graph is explorable in an interactive 3D visualization — orbit, zoom, click, and trace the connections between axioms, techniques, and theorems.
https://github.com/user-attachments/assets/e6afe79e-9c0c-41e5-a049-af1aba319a31
TL;DR: Every node is a mathematical object (axiom, state, theorem, or technique). Every edge is a logical dependency. Hover a node to see its connections light up. Double-click to explore its neighborhood. Particles flowing along edges show the direction of mathematical reasoning — from axioms through techniques to theorems.
Quick start
The simplest way is the bundled launcher — it frees the port, starts a local server, and opens the 3D viewer automatically:
./run_graph.sh # serves on port 8765 and opens the 3D viewer
./run_graph.sh 9000 # use a custom port
Prefer the 2D viewer (lighter)? Use ./serve_viewer.sh instead, or start a server by hand:
python3 -m http.server 8765
open http://localhost:8765/graph_viewer_3d.html # 3D viewer (recommended)
open http://localhost:8765/graph_viewer.html # 2D viewer (lighter)
Opening HTML files directly (file://) won't work — fetch() of knowledge_graph.json is blocked by the browser's same-origin policy. Use a local server.
What you can do
| Action | How |
|---|---|
| Orbit & zoom | Drag to rotate, scroll to zoom, right-drag to pan |
| Inspect a node | Click — details panel slides in with description, connections, cluster |
| Explore neighborhood | Double-click any node — see it + all its connections in a focused view |
| Filter by kind | Toggle axiom / state / theorem / technique checkboxes |
| Filter by cluster | Select one of 12 technique clusters (e.g. "Symmetry & invariants") |
| Search | Type any name or ID, press Enter |
| Neighborhood mode | Switch mode dropdown — search a node to center the graph on it |
Visual encoding
| Element | Meaning |
|---|---|
| Gray dots | Axioms — foundational definitions and objects |
| Blue dots | States — intermediate results |
| Green stars | Theorems — proved results |
| Colored diamonds | Techniques — sized by number of connections, colored by cluster |
| Flowing particles | Direction of logical dependency (axiom → technique → theorem) |
| Bright edges on hover | Direct connections to the hovered node |
The Report
The knowledge graph is built from a companion report tracing ~100 pivotal theorems with their discovery context, motivation, and thought process.
Chapters
| # | File | Era | Focus |
|---|---|---|---|
| 0 | 00_introduction.md |
— | What is a theorem, and how are theorems discovered? |
| 1 | 01_ancient.md |
prehistory – 1400 CE | Babylonian, Greek, Indian, Chinese, Islamic foundations |
| 2 | 02_renaissance_17c.md |
1400 – 1700 | Cubic equations, analytic geometry, calculus |
| 3 | 03_eighteenth_century.md |
1700 – 1800 | Euler, the Bernoullis, Lagrange, Laplace |
| 4 | 04_nineteenth_century.md |
1800 – 1900 | Gauss, Cauchy, Galois, Riemann, Cantor |
| 5 | 05_early_twentieth_century.md |
1900 – 1950 | Hilbert, Noether, Gödel, Turing |
| 6 | 06_modern_contemporary.md |
1950 – present | Wiles, Perelman, computer-assisted proofs |
| 7 | 07_brief_catalog.md |
all | Abbreviated reference for remaining theorems |
| 8 | 08_epilogue.md |
— | Patterns in mathematical discovery |
| 9 | 09_discovery_techniques.md |
cross-cutting | 873 techniques across 12 clusters |
| 10 | 10_toolbox.md |
cross-cutting | Function-style technique dictionary with Mermaid diagrams |
| 11 | 11_knowledge_graph.md |
cross-cutting | Knowledge graph documentation |
| 12 | 12_ai_solvable_discoveries.md |
2026 | Recently solved problems traced through the graph |
| 13 | 13_workflow.md |
cross-cutting | Discovery engine architecture and worked examples |
Read in order for the narrative arc, or jump by area using INDEX.md.
Roadmap
Where this is headed — see future_works.md for the full list:
- Expand the graph beyond ~100 theorems toward full subdiscipline coverage
- Formal-verification backend (Lean/Coq) to machine-check discovered proof paths
- Richer impossibility-theorem pruner with more hard-prune rules
- Feed each solved problem's path back as new edges (self-improving loop)
- Benchmark suite of open problems with reproducible discovery runs
Contributing
Contributions are welcome — issues and PRs both. Good first contributions:
- Add nodes/edges to the knowledge graph (new theorems, techniques, or connections)
- Report a discovery run — open an issue with the problem, flags used, and the path found
- Improve the viewer — interactions, performance, or visual encoding in
graph_viewer_3d.html - Fix or extend the report — corrections and citations in the chapter files
Please open an issue first for larger changes so we can align on direction. Use GitHub Issues for bugs and ideas, and GitHub Discussions for open-ended questions.
Citation
If you use this work in your research, please cite:
@software{das2026mathsdiscovery,
author = {Das, Ansuman},
title = {Mathematical Discovery Engine: Automated Theorem Discovery via Knowledge Graphs},
year = {2026},
url = {https://github.com/ansumandas441/mathematical-discovery-engine},
note = {ORCID: 0000-0001-9695-9408. Knowledge graph of 13,920
mathematical nodes and 23,397 edges with an
LLM-orchestrated proof search engine}
}
Or in prose:
Ansuman Das, Mathematical Discovery Engine: Automated Theorem Discovery via Knowledge Graphs, 2026. Available at https://github.com/ansumandas441/mathematical-discovery-engine
License
This project is licensed under the MIT License.
Yorumlar (0)
Yorum birakmak icin giris yap.
Yorum birakSonuc bulunamadi