

Zones of Distrust

A Layered Security Architecture for Autonomous AI Agents

Version 0.9 RFC — February 2026


Zero Trust gave us the philosophy: never trust, always verify. But Zero Trust was built for users and devices—entities that know when they've been compromised.

Autonomous AI agents break that assumption. A prompt-injected agent doesn't know it's been compromised. The manipulation becomes its genuine reasoning.

Zones of Distrust extends Zero Trust principles for autonomous entities that can be compromised without knowing it.

Core Thesis: Security is not about making the agent trustworthy. It's about building a system that remains safe even when the agent is not.


🔎 Start Here

If you're new to ZoD, start with:

  1. Architecture Specification
  2. Threat Model
  3. Security Properties (P1–P12)
  4. Implementation Guide

Full documentation index: docs/README.md


🏗️ Architecture Overview

ZoD defines seven interdependent layers:

| Layer | Name | Function |
| --- | --- | --- |
| L7 | Human Governance | Risk-weighted escalation, policy setting, break-glass procedures |
| L6 | Continuous Monitoring | Behavioral baselining, drift detection, memory audit |
| L5 | Execution | Isolated process for validated actions, immutable logging |
| L4 | Request Validation (CA) | Independent Certificate Authority, semantic policy, token binding |
| L3 | Cognitive Isolation | Agent reasoning separated from execution capability |
| L2 | Input Control | Adversarial screening before agent processes data |
| L1 | OS Foundation | Agent identity, process isolation, credential brokering |
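To illustrate how the layers compose, here is a minimal sketch of a request flowing L2 → L3 → L4 → L5. All class and function names are hypothetical, not from the ZoD spec: input screening happens before the agent sees the data, the agent only *proposes* actions, an independent validator approves them against a default-deny allowlist, and a separate executor runs them while appending to a hash-chained log.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class ProposedAction:
    """An action the agent (L3) proposes; the agent never executes anything itself."""
    name: str
    args: dict

class Validator:
    """L4: independent policy check the agent cannot bypass or modify (default-deny)."""
    ALLOWED = {"read_file", "send_report"}  # illustrative allowlist

    def approve(self, action: ProposedAction) -> bool:
        return action.name in self.ALLOWED

class Executor:
    """L5: runs only validated actions; each run extends a hash-chained log head."""
    def __init__(self):
        self._log_head = b"genesis"

    def run(self, action: ProposedAction) -> None:
        entry = f"{action.name}:{sorted(action.args.items())}".encode()
        self._log_head = hashlib.sha256(self._log_head + entry).digest()

def screen_input(text: str) -> str:
    """L2: adversarial screening before the agent processes the data (toy check)."""
    if "ignore previous instructions" in text.lower():
        raise ValueError("blocked: suspected injection")
    return text

def handle(request: str, validator: Validator, executor: Executor) -> str:
    safe = screen_input(request)                          # L2
    action = ProposedAction("read_file", {"path": safe})  # L3 (agent reasoning stubbed)
    if not validator.approve(action):                     # L4
        return "denied"
    executor.run(action)                                  # L5
    return "executed"
```

The point of the structure, not the toy checks: even if the stubbed L3 reasoning is fully compromised, it can only emit `ProposedAction` values, and L4/L5 remain outside its reach.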


🎯 What We Want Reviewed (Attack This)

ZoD is published as an RFC to invite adversarial critique. We are actively seeking review on:

  • Cross-layer bypass scenarios (L2→L5, L3→L5, etc.)
  • Prompt injection containment limits and failure modes
  • Token binding flaws and replay/exfiltration risks
  • Multi-agent boundary failures and delegated-agent abuse
  • Drift detection evasion techniques
  • Break-glass / human override abuse paths
  • Logging integrity assumptions and tamper resistance

If you find a weakness, even if it's outside these areas, open an issue describing the attack path and the impacted layers.
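For reviewers probing the token-binding and replay items above, this toy sketch shows the shape of mechanism under test (an HMAC capability token bound to an agent identity, action, and single-use nonce). All names are illustrative, not from the spec:

```python
import hmac
import hashlib
import secrets

SERVER_KEY = secrets.token_bytes(32)   # held by the validation layer, never the agent
_seen_nonces: set = set()              # replay cache (in-memory for the sketch)

def mint_token(agent_id: str, action: str):
    """Bind a capability token to (agent, action, nonce)."""
    nonce = secrets.token_bytes(16)
    msg = agent_id.encode() + b"|" + action.encode() + b"|" + nonce
    tag = hmac.new(SERVER_KEY, msg, hashlib.sha256).digest()
    return nonce, tag

def redeem(agent_id: str, action: str, nonce: bytes, tag: bytes) -> bool:
    """Accept a token once; reject forgeries, replays, and tokens bound to other agents."""
    msg = agent_id.encode() + b"|" + action.encode() + b"|" + nonce
    expected = hmac.new(SERVER_KEY, msg, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False
    if nonce in _seen_nonces:  # replay attempt
        return False
    _seen_nonces.add(nonce)
    return True
```

Attack paths worth testing against a real implementation of this pattern include nonce-cache eviction, cross-action token reuse, and exfiltration of a minted token before first use.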


🧩 Security Properties

ZoD defines 12 measurable security properties (P1–P12) intended to serve as a baseline for agentic system security evaluation.

See: Security Properties
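One way to make properties measurable is a check registry mapping each property ID to an executable predicate over an observed system. The sketch below is purely illustrative: the predicates are placeholders, not the actual P1–P12 definitions from the spec.

```python
from typing import Callable, Dict

# The system under test, reduced to a dict of observable facts (illustrative only).
System = Dict[str, object]

CHECKS: Dict[str, Callable[[System], bool]] = {
    # Placeholder predicates — the real P1–P12 definitions live in the spec.
    "P1": lambda s: s.get("agent_identity") is not None,
    "P2": lambda s: bool(s.get("validator_independent")),
    "P3": lambda s: bool(s.get("log_append_only")),
}

def evaluate(system: System) -> Dict[str, bool]:
    """Return a pass/fail result per registered property."""
    return {pid: check(system) for pid, check in CHECKS.items()}

def score(system: System) -> float:
    """Fraction of registered properties satisfied — a simple measurable baseline."""
    results = evaluate(system)
    return sum(results.values()) / len(results)
```

Expressing each property as a predicate is what lets two deployments be compared against the same baseline rather than audited ad hoc.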


🗺️ Framework Mappings

ZoD includes crosswalks to major AI security and governance frameworks:

  • OWASP Agentic
  • MITRE ATLAS
  • NIST AI RMF
  • Google SAIF
  • MAESTRO
  • EU AI Act / ISO / SOC 2 mappings

See: Mapping Index


🧩 Road to a Community Standard

ZoD is currently v0.9 RFC. Breaking changes are expected.

Target milestones:

  • v0.9 → community critique + issue intake
  • v0.95 → bypass catalog + red-team test corpus
  • v1.0 → stable reference architecture + security property baseline

🚀 Reference Implementation

Status: Planned (Q2 2026)

A vendor-neutral agent runtime implementing ZoD across Windows, macOS, Linux, and Android is in development.


🤝 Contributing

ZoD is published openly to evolve into an industry reference architecture.

Contributions are welcome, including:

  • threat model critique and bypass analysis
  • implementation patterns for agent frameworks
  • security property validation and measurable metrics
  • real-world deployment constraints and lessons learned

See: CONTRIBUTING


📋 Project Documents

| Document | Description |
| --- | --- |
| ROADMAP | Implementation timeline and milestones |
| GOVERNANCE | Project governance model |
| SECURITY | Security policy and reporting |
| DISCLAIMER | Legal disclaimers |
| TRADEMARKS | Trademark notices |

📜 License

| Content | License |
| --- | --- |
| Documentation & Specification | Creative Commons CC-BY 4.0 |
| Reference Implementation (when released) | Apache 2.0 |

🏢 About

Too many solutions are band-aids: security built for humans at the keyboard, retrofitted for AI. We've watched incident after incident unfold like a horror movie, the audience yelling "DON'T OPEN THE DOOR!" while shoving popcorn into their mouths, unable to tear their eyes away from the actors doing just that.

We decided to do something about it instead.

Zones of Distrust is our contribution to raising that bar: an open reference architecture for securing autonomous AI agents.

We (humans) need new solutions. Let's build what works.

GitHub: @bluvibytes

Contact: [email protected]


Zones of Distrust v0.9 RFC — February 2026 — BluVi
