Foundry

mcp
Security Audit: Fail

Health: Pass
  • License — Apache-2.0
  • Description — Repository has a description
  • Active repo — Last push 0 days ago
  • Community trust — 813 GitHub stars

Code: Fail
  • rm -rf — Recursive force deletion command in examples/approval/auto_classifier.py

Permissions: Pass
  • Permissions — No dangerous permissions requested
Purpose
This tool is a production framework for building agentic AI systems. It provides a model-agnostic stack for LangChain and LangGraph agents powered entirely by MCP tools over HTTP/SSE.

Security Assessment
The tool requires no explicitly dangerous permissions. However, the automated code scan flagged a recursive force deletion command (`rm -rf`) within an example script (`examples/approval/auto_classifier.py`). While this is likely an innocuous file-cleanup mechanism limited to an example rather than core library code, developers should inspect the script before running it to ensure it doesn't accidentally delete unintended directories. No hardcoded secrets were identified. Because the framework functions over HTTP/SSE, it inherently makes network requests to communicate with external tools and models. Overall risk is rated as Medium due to the forceful file deletion command found in the codebase.

Quality Assessment
The project demonstrates strong health and maintenance signals. It is licensed under the permissive Apache-2.0. With over 800 GitHub stars and repository activity as recent as today, it shows solid and growing community trust. The documentation highlights high-quality engineering standards, including strict typing, comprehensive test coverage (over 3,000 tests), and clean security scans (zero high-severity issues on Bandit).

Verdict
Use with caution—while the core framework is well-maintained and high-quality, you should review the example scripts for potentially destructive shell commands before executing them in your environment.
SUMMARY

Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE.

README.md

Promptise Foundry

The production framework for agentic AI systems.

Every other framework gives you an LLM wrapper.
Promptise Foundry gives you the stack behind it.




Documentation  ·  Quick Start  ·  Showcase  ·  Discussions




Agents that survive production need more than a prompt and a tool list.

They need MCP-native tool discovery. A reasoning engine you can shape. Memory you can trust. Guardrails that actually fire. Governance that enforces budgets. A runtime that recovers from crashes. Promptise Foundry ships all of it as one coherent framework — built for engineering teams who are done assembling AI infrastructure from ten half-finished libraries.


 


Get started in 30 seconds


pip install promptise

import asyncio
from promptise import build_agent, PromptiseSecurityScanner, SemanticCache
from promptise.config import HTTPServerSpec
from promptise.memory import ChromaProvider

async def main():
    agent = await build_agent(
        model="openai:gpt-5-mini",
        servers={
            "tools": HTTPServerSpec(url="http://localhost:8000/mcp"),
        },
        instructions="You are a helpful assistant.",
        memory=ChromaProvider(persist_directory="./memory"),
        guardrails=PromptiseSecurityScanner.default(),
        cache=SemanticCache(),
        observe=True,
    )

    result = await agent.ainvoke({
        "messages": [{"role": "user", "content": "What's the status of our pipeline?"}]
    })
    print(result["messages"][-1].content)
    await agent.shutdown()

asyncio.run(main())

One call. Auto tool discovery from MCP servers. Memory auto-searched before every invocation.
Guardrails block injection and redact PII. Semantic cache serves similar queries instantly. Full observability.

 


Five pillars. One framework.

Each pillar replaces an entire category of libraries you would otherwise assemble yourself.



01

🤖

Agent

Turn any LLM into a production-ready agent with one function call.

Replaces: LangChain + a guardrails library + an output validator + a vector-store wrapper + a retry helper.

build_agent() · auto MCP tool discovery · semantic tool optimization (40–70% fewer tokens) · 3 memory providers with auto-injection · 4 conversation stores · 6-head security scanner · semantic cache with per-user isolation · sandboxed code execution · auto-approval classifier · pluggable RAG · streaming · model fallback · adaptive strategy.

Agent docs →



02

🧠

Reasoning Engine

Compose reasoning the way you compose code. Not a black box.

Replaces: hand-rolled LangGraph wiring, bespoke planner/executor loops, ReAct-from-scratch.

PromptGraph with 20 node types — 10 standard (PromptNode, ToolNode, RouterNode, GuardNode, ParallelNode, LoopNode, HumanNode, TransformNode, SubgraphNode, AutonomousNode) and 10 reasoning (ThinkNode, PlanNode, ReflectNode, CritiqueNode, SynthesizeNode, ValidateNode, ObserveNode, JustifyNode, RetryNode, FanOutNode). 7 prebuilt patterns (react, peoatr, research, autonomous, deliberate, debate, pipeline). 18 node flags for typed capabilities. Agent-assembled paths from a node pool. Lifecycle hooks. Skill registry. JSON serialization.

Reasoning docs →
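The core idea of a composable reasoning graph — nodes that transform a shared state and hand off to a successor — can be sketched in a few lines. This is a deliberately minimal model for illustration; PromptGraph's actual node classes, flags, and execution semantics are documented separately, and none of the names here are its API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Node:
    name: str
    run: Callable[[dict], dict]   # transforms the shared state
    next: Optional[str] = None    # name of the successor node, if any

class Graph:
    """Run nodes in sequence, threading one state dict through all of them."""
    def __init__(self, nodes: list[Node], entry: str):
        self.nodes = {n.name: n for n in nodes}
        self.entry = entry

    def invoke(self, state: dict) -> dict:
        current = self.entry
        while current is not None:
            node = self.nodes[current]
            state = node.run(state)
            current = node.next
        return state

# A two-node "think then validate" path, analogous in spirit to
# chaining a ThinkNode into a ValidateNode.
graph = Graph(
    nodes=[
        Node("think", lambda s: {**s, "plan": f"answer: {s['question']}"}, next="validate"),
        Node("validate", lambda s: {**s, "valid": s["plan"].startswith("answer")}),
    ],
    entry="think",
)
result = graph.invoke({"question": "pipeline status?"})
```

Because each node is data plus a function, paths can be assembled, serialized, and swapped without rewriting the loop that runs them.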



03

🔧

MCP Server SDK

Production server and native client for the Model Context Protocol.

Replaces: rolling your own tool server. What FastAPI is to REST, this is to MCP.

@server.tool() with auto-schema from type hints · JWT + OAuth2 + API key auth · role/scope guards · 12+ middleware (rate limit, circuit breaker, audit, cache, OTel) · HMAC-chained audit logs · priority job queue with retries and progress · versioning + transforms · OpenAPI import · MCPMultiClient federation · live 6-tab dashboard · TestClient for in-process testing · 3 transports (stdio, HTTP, SSE).

MCP docs →
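The "auto-schema from type hints" idea works because a Python signature already carries everything a tool schema needs. Here is a generic sketch of that derivation — not the SDK's implementation, just the underlying mechanism, with an assumed annotation-to-JSON-Schema mapping.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Build a JSON-Schema-style parameter spec from a function signature."""
    hints = get_type_hints(fn)
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": _JSON_TYPES.get(hints.get(name, str), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": props, "required": required},
    }

def get_pipeline_status(pipeline_id: str, include_logs: bool = False) -> str:
    """Report the status of a data pipeline."""
    return f"{pipeline_id}: ok"

schema = tool_schema(get_pipeline_status)
```

A decorator like `@server.tool()` can run this derivation once at registration time, so the docstring and type hints are the single source of truth for what clients see.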



04

Agent Runtime

The operating system for autonomous agents.

Replaces: Celery + cron + a state store + your own crash recovery + a governance layer.

5 trigger types (cron, webhook, file watch, event, message) · crash recovery via journal replay · 5 rewind modes · 14 lifecycle hooks · budget enforcement with tool costs · health monitoring (stuck, loop, empty, error rate) · mission tracking with LLM-as-judge · secret scoping with TTL and zero-fill revocation · 14 meta-tools for self-modifying agents · 37-endpoint REST API with typed client · live agent inbox · distributed multi-node coordination.

Runtime docs →



05

Prompt Engineering

Prompts built like software. Not strings.

Replaces: f-strings + instructor + ad-hoc few-shot files + prompt sprawl across a codebase.

8 block types with priority-based token budgeting · conversation flows that evolve per phase · 5 composable strategies (chain_of_thought + self_critique) · 4 perspectives · 14 context providers auto-injected every turn · SSTI-safe template engine with opt-in shell · 5 guards · SemVer registry with rollback · inspector that traces every assembly decision · test helpers (mock_llm(), assert_schema()) · chain, parallel, branch, retry, fallback.

Prompts docs →
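Priority-based token budgeting, the first item above, reduces to a simple policy: when the budget is tight, high-priority blocks survive and low-priority ones (few-shot examples, long context) are dropped first. A minimal sketch of that policy, with tokens approximated as words — illustrative only, not the library's block API:

```python
def assemble_prompt(blocks: list[dict], budget: int) -> str:
    """Greedy priority-based assembly: keep the highest-priority blocks
    that fit within a token budget (tokens approximated as words)."""
    chosen, used = [], 0
    for block in sorted(blocks, key=lambda b: b["priority"], reverse=True):
        cost = len(block["text"].split())
        if used + cost <= budget:
            chosen.append(block)
            used += cost
    # Re-emit in original declaration order so the prompt stays coherent.
    chosen.sort(key=lambda b: blocks.index(b))
    return "\n".join(b["text"] for b in chosen)

blocks = [
    {"priority": 10, "text": "You are a helpful assistant."},
    {"priority": 1,  "text": "Example: Q: 2+2? A: 4."},
    {"priority": 5,  "text": "Answer concisely."},
]
prompt = assemble_prompt(blocks, budget=8)
# The few-shot example is dropped; instructions survive.
```

The same shape generalizes to real tokenizers: only the `cost` function changes.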


 


Why Promptise Foundry?

Honest comparison. ✅ native  ·  ⚠️ partial or via adapter  ·  ❌ not supported


Promptise LangChain LangGraph CrewAI AutoGen PydanticAI
MCP-first tool discovery ✅ Native ⚠️ via adapter ⚠️ via adapter ⚠️ via adapter ⚠️ via adapter ⚠️ via adapter
Native MCP server SDK (auth · middleware · queue · audit) ✅ Full
Composable reasoning graph ✅ 20 nodes · 7 patterns · agent-assembled ✅ Graph-native ⚠️ Crew/Flow ⚠️ GroupChat
Semantic tool optimization (ML selects relevant tools per query) ✅ 40–70% savings
Local ML security guardrails (prompt-injection · PII · creds · NER · content) ✅ 6 heads ❌ external ❌ external
Semantic response cache ✅ Per-user isolated ⚠️ Basic (shared) ⚠️ via LangChain
Human-in-the-loop ✅ 3 handlers + ML classifier ⚠️ Basic ✅ interrupt_before/after ⚠️ human_input=True ✅ UserProxyAgent
Sandboxed code execution ✅ Docker · seccomp · gVisor ⚠️ PythonREPL ✅ Docker executor
Crash recovery / replay ✅ 5 rewind modes ✅ Checkpointer
Autonomous runtime (triggers · lifecycle · messaging) ✅ Full OS ⚠️ Persistence only
Budget / health / mission governance ✅ Built-in
Live agent conversation (inbox · ask)
Orchestration REST API ✅ 37 endpoints + typed client

LangGraph's checkpointer gives it genuine replay; AutoGen ships a real Docker code executor; LangChain has a basic semantic cache.
Promptise unifies every row above — one dependency, one type-checked API, one runtime.

 


Benchmarks

Apples-to-apples. Same model, same 40-tool MCP server, same prompts, fresh agent per run.


6 frameworks  ×  30 tasks  ×  5 repeats  =  900 real agent runs

Promptise · LangGraph · LangChain · PydanticAI · CrewAI · AutoGen
 —  all driven by openai:gpt-5-mini at temperature=0.


Tier Measures Count
T1 Direct lookup Can the agent pick the right tool and quote the result? 6
T2 Multi-step Can it chain 5–7 tools with state carried across calls? 6
T3 Synthesis Can it reason across 3+ tool outputs? 6
T4 Tool selection Can it disambiguate across 40 tools in 7 namespaces? 6
T5 Autonomous reasoning Can it decompose a goal, branch on intermediate results, re-plan on failure, and synthesize evidence-grounded answers? 6

We measure latency (median, p95), tokens (in/out), cost, tool-call count, tool precision, factual accuracy (LLM-as-judge, 0–5), and hallucination rate — for every framework, on every task, every time. The full trace (answers, tool calls, judge rationales) is committed as raw JSON under benchmarks/results/. Nothing is cherry-picked.
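For reference, the two latency statistics reported above can be computed like this. The nearest-rank p95 convention is an assumption for illustration; the benchmark's exact percentile method is not stated in this README.

```python
import math
import statistics

def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile: the value at ceil(0.95 * n) in sorted order."""
    ordered = sorted(samples)
    rank = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return ordered[rank]

latencies = [1.2, 0.9, 1.1, 4.8, 1.0, 1.3, 0.8, 1.1, 1.2, 5.0]
median = statistics.median(latencies)  # robust central latency, ignores outliers
tail = p95(latencies)                  # tail latency, dominated by the slow runs
```

Reporting both matters: a framework can have a fine median while retries or slow tool calls blow out the tail.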

export OPENAI_API_KEY=sk-...
./benchmarks/reproduce.sh           # end-to-end: start server, run 900 agents, regenerate RESULTS.md

benchmarks/RESULTS.md · benchmarks/README.md (fairness protocol + honesty guarantees)


 


Model-agnostic

Any LLM, one string. Or any LangChain BaseChatModel. Or a FallbackChain across providers.


build_agent(model="openai:gpt-5-mini", ...)
build_agent(model="anthropic:claude-sonnet-4-20250514", ...)
build_agent(model="ollama:llama3", ...)
build_agent(model="google:gemini-2.0-flash", ...)

 


Deploy autonomous agents

Triggers, budgets, health checks, missions, secrets — all in Python.


from promptise.runtime import (
    AgentRuntime, ProcessConfig, TriggerConfig,
    BudgetConfig, HealthConfig, MissionConfig,
)

async with AgentRuntime() as runtime:
    await runtime.add_process("monitor", ProcessConfig(
        model="openai:gpt-5-mini",
        instructions="Monitor data pipelines. Escalate anomalies.",
        triggers=[
            TriggerConfig(type="cron", cron_expression="*/5 * * * *"),
            TriggerConfig(type="webhook", webhook_path="/alerts"),
        ],
        budget=BudgetConfig(max_tool_calls_per_day=500, on_exceeded="pause"),
        health=HealthConfig(detect_loops=True, detect_stuck=True, on_anomaly="escalate"),
        mission=MissionConfig(
            objective="Keep uptime above 99.9%",
            success_criteria="No P1 unresolved for more than 15 minutes",
            evaluate_every_n=10,
        ),
    ))
    await runtime.start_all()

 


Documentation


Section What it covers
Quick Start Your first agent in 5 minutes
Key Concepts Architecture, design principles, the five pillars
Building Agents Step-by-step, simple to production
Reasoning Engine Graphs, nodes, flags, patterns
MCP Servers Production tool servers with auth and middleware
Agent Runtime Autonomous agents with governance
Prompt Engineering Blocks, strategies, flows, guards
Showcase Working patterns, end-to-end
API Reference Every class, method, parameter

 


Ecosystem

Promptise plugs into what your team already runs.


  Models  

OpenAI
Anthropic
Gemini
Ollama
Mistral
Hugging Face

+ any LangChain BaseChatModel · FallbackChain for automatic failover



  Memory & Vectors  

ChromaDB
Mem0
Sentence Transformers

Local embeddings · air-gapped model paths · prompt-injection mitigation built in



  Conversation Storage  

PostgreSQL
Redis
SQLite

Session ownership enforced · per-user isolation for cache and guardrails



  Observability  

OpenTelemetry
Prometheus
Slack
PagerDuty

8 transporters: OTel · Prometheus · Slack · PagerDuty · Webhook · HTML · JSON · Console



  Sandbox & Infrastructure  

Docker
gVisor
Kubernetes
seccomp

Docker + seccomp + gVisor + capability dropping · Kubernetes-native health probes



  Protocols  

MCP
OpenAPI
JWT
OAuth 2.0

stdio · streamable HTTP · SSE · HMAC-chained audit logs



