Foundry

Health: Pass
- License: Apache-2.0
- Description: Repository has a description
- Active repo: Last push 0 days ago
- Community trust: 813 GitHub stars

Code: Fail
- `rm -rf`: Recursive force-deletion command in examples/approval/auto_classifier.py

Permissions: Pass
- No dangerous permissions requested
This tool is a production framework for building agentic AI systems. It provides a model-agnostic stack for LangChain and LangGraph agents powered entirely by MCP tools over HTTP/SSE.
Security Assessment
The tool requires no explicitly dangerous permissions. However, the automated code scan flagged a recursive force deletion command (`rm -rf`) within an example script (`examples/approval/auto_classifier.py`). While this is likely an innocuous file-cleanup mechanism limited to an example rather than core library code, developers should inspect the script before running it to ensure it doesn't accidentally delete unintended directories. No hardcoded secrets were identified. Because the framework functions over HTTP/SSE, it inherently makes network requests to communicate with external tools and models. Overall risk is rated as Medium due to the forceful file deletion command found in the codebase.
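If you want to vet example scripts yourself before running them, a quick static check along these lines flags `rm -rf`-style commands. This is a hypothetical helper for illustration, not part of Promptise Foundry:

```python
import re
from pathlib import Path

# Matches forceful recursive deletions such as "rm -rf", "rm -fr", "rm -rvf".
DANGEROUS = re.compile(r"rm\s+-\w*r\w*f|rm\s+-\w*f\w*r")

def flag_destructive(root: str) -> list[str]:
    """Return paths of Python files under `root` that contain rm -rf-style commands."""
    return sorted(
        str(p)
        for p in Path(root).rglob("*.py")
        if DANGEROUS.search(p.read_text(errors="ignore"))
    )
```

Running it against a checkout's `examples/` directory gives a short list of files to inspect by hand before execution.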
Quality Assessment
The project demonstrates strong health and maintenance signals. It is licensed under the permissive Apache-2.0. With over 800 GitHub stars and repository activity as recent as today, it shows solid and growing community trust. The documentation highlights high-quality engineering standards, including strict typing, comprehensive test coverage (over 3,000 tests), and clean security scans (zero high-severity issues on Bandit).
Verdict
Use with caution—while the core framework is well-maintained and high-quality, you should review the example scripts for potentially destructive shell commands before executing them in your environment.
Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE.
Promptise Foundry
The production framework for agentic AI systems.
Every other framework gives you an LLM wrapper.
Promptise Foundry gives you the stack behind it.
Documentation · Quick Start · Showcase · Discussions
Agents that survive production need more than a prompt and a tool list.
They need MCP-native tool discovery. A reasoning engine you can shape. Memory you can trust. Guardrails that actually fire. Governance that enforces budgets. A runtime that recovers from crashes. Promptise Foundry ships all of it as one coherent framework — built for engineering teams who are done assembling AI infrastructure from ten half-finished libraries.
Get started in 30 seconds
```shell
pip install promptise
```

```python
import asyncio

from promptise import build_agent, PromptiseSecurityScanner, SemanticCache
from promptise.config import HTTPServerSpec
from promptise.memory import ChromaProvider

async def main():
    agent = await build_agent(
        model="openai:gpt-5-mini",
        servers={
            "tools": HTTPServerSpec(url="http://localhost:8000/mcp"),
        },
        instructions="You are a helpful assistant.",
        memory=ChromaProvider(persist_directory="./memory"),
        guardrails=PromptiseSecurityScanner.default(),
        cache=SemanticCache(),
        observe=True,
    )
    result = await agent.ainvoke({
        "messages": [{"role": "user", "content": "What's the status of our pipeline?"}]
    })
    print(result["messages"][-1].content)
    await agent.shutdown()

asyncio.run(main())
```
Guardrails block injection and redact PII. Semantic cache serves similar queries instantly. Full observability.
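As an illustration of the redaction idea only: Promptise's guardrails use local ML models, but a minimal regex-based PII redactor conveys what "redact PII" means in practice. All names and patterns below are hypothetical, not the Promptise API:

```python
import re

# Two toy patterns: emails and US-style phone numbers. Real guardrails
# cover far more categories (credentials, NER entities, etc.).
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each PII match with a bracketed label, e.g. [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```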
Five pillars. One framework.
Each pillar replaces an entire category of libraries you would otherwise assemble yourself.
01 🤖 Agent
Turn any LLM into a production-ready agent with one function call.
Replaces: LangChain + a guardrails library + an output validator + a vector-store wrapper + a retry helper.

02 🧠 Reasoning Engine
Compose reasoning the way you compose code. Not a black box.
Replaces: hand-rolled LangGraph wiring, bespoke planner/executor loops, ReAct-from-scratch.

03 🔧 MCP Server SDK
Production server and native client for the Model Context Protocol.
Replaces: rolling your own tool server. What FastAPI is to REST, this is to MCP.

04 ⚡ Agent Runtime
The operating system for autonomous agents.
Replaces: Celery + cron + a state store + your own crash recovery + a governance layer.
5 trigger types (cron, webhook, file watch, event, message) · crash recovery via journal replay · 5 rewind modes · 14 lifecycle hooks · budget enforcement with tool costs · health monitoring (stuck, loop, empty, error rate) · mission tracking with LLM-as-judge · secret scoping with TTL and zero-fill revocation · 14 meta-tools for self-modifying agents · 37-endpoint REST API with typed client · live agent inbox · distributed multi-node coordination.

05 ✨ Prompt Engineering
Prompts built like software. Not strings.
Replaces: f-strings.
8 block types with priority-based token budgeting · conversation flows that evolve per phase · 5 composable strategies (…)
Why Promptise Foundry?
Honest comparison. ✅ native · ⚠️ partial or via adapter · ❌ not supported
| | Promptise | LangChain | LangGraph | CrewAI | AutoGen | PydanticAI |
|---|---|---|---|---|---|---|
| MCP-first tool discovery | ✅ Native | ⚠️ via adapter | ⚠️ via adapter | ⚠️ via adapter | ⚠️ via adapter | ⚠️ via adapter |
| Native MCP server SDK (auth · middleware · queue · audit) | ✅ Full | ❌ | ❌ | ❌ | ❌ | ❌ |
| Composable reasoning graph | ✅ 20 nodes · 7 patterns · agent-assembled | ❌ | ✅ Graph-native | ⚠️ Crew/Flow | ⚠️ GroupChat | ❌ |
| Semantic tool optimization (ML selects relevant tools per query) | ✅ 40–70% savings | ❌ | ❌ | ❌ | ❌ | ❌ |
| Local ML security guardrails (prompt-injection · PII · creds · NER · content) | ✅ 6 heads | ❌ external | ❌ external | ❌ | ❌ | ❌ |
| Semantic response cache | ✅ Per-user isolated | ⚠️ Basic (shared) | ⚠️ via LangChain | ❌ | ❌ | ❌ |
| Human-in-the-loop | ✅ 3 handlers + ML classifier | ⚠️ Basic | ✅ interrupt_before/after | ⚠️ human_input=True | ✅ UserProxyAgent | ❌ |
| Sandboxed code execution | ✅ Docker · seccomp · gVisor | ⚠️ PythonREPL | ❌ | ❌ | ✅ Docker executor | ❌ |
| Crash recovery / replay | ✅ 5 rewind modes | ❌ | ✅ Checkpointer | ❌ | ❌ | ❌ |
| Autonomous runtime (triggers · lifecycle · messaging) | ✅ Full OS | ❌ | ⚠️ Persistence only | ❌ | ❌ | ❌ |
| Budget / health / mission governance | ✅ Built-in | ❌ | ❌ | ❌ | ❌ | ❌ |
| Live agent conversation (inbox · ask) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Orchestration REST API | ✅ 37 endpoints + typed client | ❌ | ❌ | ❌ | ❌ | ❌ |
Promptise unifies every row above — one dependency, one type-checked API, one runtime.
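To illustrate what "semantic tool optimization" means in general, here is a deliberately simplified sketch that ranks tools by the similarity of their descriptions to the query, using bag-of-words cosine as a stand-in for real embeddings. All names here are hypothetical, not the Promptise API:

```python
import math
from collections import Counter

def _vec(text: str) -> Counter:
    """Tokenize into a bag-of-words vector (stand-in for an embedding)."""
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_tools(query: str, tools: dict[str, str], k: int = 3) -> list[str]:
    """Keep only the top-k tools whose descriptions best match the query.

    Shipping fewer, more relevant tool schemas to the model is where the
    claimed token savings come from.
    """
    q = _vec(query)
    ranked = sorted(tools, key=lambda name: _cosine(q, _vec(tools[name])), reverse=True)
    return ranked[:k]
```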
Benchmarks
Apples-to-apples. Same model, same 40-tool MCP server, same prompts, fresh agent per run.
6 frameworks × 30 tasks × 5 repeats = 900 real agent runs
Promptise · LangGraph · LangChain · PydanticAI · CrewAI · AutoGen
— all driven by openai:gpt-5-mini at temperature=0.
| Tier | Measures | Count |
|---|---|---|
| T1 Direct lookup | Can the agent pick the right tool and quote the result? | 6 |
| T2 Multi-step | Can it chain 5–7 tools with state carried across calls? | 6 |
| T3 Synthesis | Can it reason across 3+ tool outputs? | 6 |
| T4 Tool selection | Can it disambiguate across 40 tools in 7 namespaces? | 6 |
| T5 Autonomous reasoning | Can it decompose a goal, branch on intermediate results, re-plan on failure, and synthesize evidence-grounded answers? | 6 |
We measure latency (median, p95), tokens (in/out), cost, tool-call count, tool precision, factual accuracy (LLM-as-judge, 0–5), and hallucination rate — for every framework, on every task, every time. The full trace (answers, tool calls, judge rationales) is committed as raw JSON under benchmarks/results/. Nothing is cherry-picked.
```shell
export OPENAI_API_KEY=sk-...
./benchmarks/reproduce.sh   # end-to-end: start server, run 900 agents, regenerate RESULTS.md
```
→ benchmarks/RESULTS.md · benchmarks/README.md (fairness protocol + honesty guarantees)
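If you want to aggregate your own medians and p95s from the raw JSON, a small script along these lines works. It assumes a hypothetical layout where each file holds a list of `{"framework": ..., "latency_s": ...}` runs; the actual schema under benchmarks/results/ may differ:

```python
import json
import statistics
from pathlib import Path

def latency_summary(results_dir: str) -> dict[str, dict[str, float]]:
    """Compute per-framework median and approximate p95 latency from raw runs."""
    by_framework: dict[str, list[float]] = {}
    for f in Path(results_dir).glob("*.json"):
        for run in json.loads(f.read_text()):
            by_framework.setdefault(run["framework"], []).append(run["latency_s"])
    return {
        name: {
            "median_s": statistics.median(vals),
            # 19th of 19 cut points at n=20 approximates the 95th percentile.
            "p95_s": statistics.quantiles(vals, n=20)[-1],
        }
        for name, vals in by_framework.items()
    }
```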
Model-agnostic
Any LLM, one string. Or any LangChain BaseChatModel. Or a FallbackChain across providers.
```python
build_agent(model="openai:gpt-5-mini", ...)
build_agent(model="anthropic:claude-sonnet-4-20250514", ...)
build_agent(model="ollama:llama3", ...)
build_agent(model="google:gemini-2.0-flash", ...)
```
Deploy autonomous agents
Triggers, budgets, health checks, missions, secrets — all in Python.
```python
from promptise.runtime import (
    AgentRuntime, ProcessConfig, TriggerConfig,
    BudgetConfig, HealthConfig, MissionConfig,
)

async with AgentRuntime() as runtime:
    await runtime.add_process("monitor", ProcessConfig(
        model="openai:gpt-5-mini",
        instructions="Monitor data pipelines. Escalate anomalies.",
        triggers=[
            TriggerConfig(type="cron", cron_expression="*/5 * * * *"),
            TriggerConfig(type="webhook", webhook_path="/alerts"),
        ],
        budget=BudgetConfig(max_tool_calls_per_day=500, on_exceeded="pause"),
        health=HealthConfig(detect_loops=True, detect_stuck=True, on_anomaly="escalate"),
        mission=MissionConfig(
            objective="Keep uptime above 99.9%",
            success_criteria="No P1 unresolved for more than 15 minutes",
            evaluate_every_n=10,
        ),
    ))
    await runtime.start_all()
```
Documentation
| Section | What it covers |
|---|---|
| Quick Start | Your first agent in 5 minutes |
| Key Concepts | Architecture, design principles, the five pillars |
| Building Agents | Step-by-step, simple to production |
| Reasoning Engine | Graphs, nodes, flags, patterns |
| MCP Servers | Production tool servers with auth and middleware |
| Agent Runtime | Autonomous agents with governance |
| Prompt Engineering | Blocks, strategies, flows, guards |
| Showcase | Working patterns, end-to-end |
| API Reference | Every class, method, parameter |
Ecosystem
Promptise plugs into what your team already runs.
Models
+ any LangChain BaseChatModel · FallbackChain for automatic failover
Memory & Vectors
Local embeddings · air-gapped model paths · prompt-injection mitigation built in
Conversation Storage
Session ownership enforced · per-user isolation for cache and guardrails
Observability
8 transporters: OTel · Prometheus · Slack · PagerDuty · Webhook · HTML · JSON · Console
Sandbox & Infrastructure
Docker + seccomp + gVisor + capability dropping · Kubernetes-native health probes
Protocols
stdio · streamable HTTP · SSE · HMAC-chained audit logs
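HMAC-chained audit logs, in general, bind each entry's MAC to the previous one so that tampering with or reordering any entry breaks the chain. A minimal sketch of the technique follows; this is not Promptise's actual log format:

```python
import hashlib
import hmac
import json

def append_entry(log: list[dict], key: bytes, event: dict) -> list[dict]:
    """Append an audit entry whose MAC covers the event plus the previous MAC."""
    prev_mac = log[-1]["mac"] if log else ""
    payload = json.dumps(event, sort_keys=True) + prev_mac
    mac = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    log.append({"event": event, "mac": mac})
    return log

def verify_chain(log: list[dict], key: bytes) -> bool:
    """Recompute every MAC in order; any edit or reorder fails verification."""
    prev_mac = ""
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True) + prev_mac
        expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True
```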