Mnemom - Trust Protocol for AI Agents

The open-source Agent Alignment Protocol (AAP) makes AI agent behavior observable through Ed25519-signed decision traces, EU AI Act-ready disclosures, and cross-provider verification.

#ai-agent #trust-protocol #ai-safety #open-source #eu-ai-act

TL;DR

Mnemom’s Agent Alignment Protocol (AAP) is an open-source transparency layer for AI agents. It produces cryptographically signed decision traces that make agent behavior auditable across OpenAI, Anthropic, and Gemini, and it ships with EU AI Act Article 50-ready disclosures.

Source and Accuracy Notes

What Is Mnemom?

Mnemom is a trust infrastructure platform for the “agentic internet.” It provides protocols that make AI agent behavior observable, auditable, and verifiable across different model providers.

The core offering is the Agent Alignment Protocol (AAP) — an open-source transparency protocol that lets agents declare their alignment posture, produce auditable decision traces, and verify value coherence before coordinating with other agents.

“AAP is a transparency protocol, not a trust protocol. It makes agent behavior more observable, not more guaranteed.” — Mnemom documentation

Why AAP Exists

The agent protocol stack has evolved rapidly:

| Protocol | Function | Gap |
|----------|----------|-----|
| MCP | Agent-to-tool connectivity | No alignment semantics |
| A2A | Task negotiation between agents | No value verification |
| AP2 | Payment authorization | No behavioral audit |

None of these address a fundamental question: Is this agent serving its principal’s interests?

As agent capabilities become symmetric — equal access to information, equal reasoning power — alignment becomes the primary differentiator. AAP provides the infrastructure to make alignment claims verifiable.

Three Core Components

┌─────────────────┬─────────────────┬─────────────────┐
│ Alignment Card  │    AP-Trace     │ Value Coherence │
│                 │                 │    Handshake    │
├─────────────────┼─────────────────┼─────────────────┤
│ "What I claim   │ "What I         │ "Can we work    │
│  to be"         │  actually did"  │  together?"     │
└─────────────────┴─────────────────┴─────────────────┘
     Declaration        Audit          Coordination

1. Alignment Card

A structured declaration of an agent’s alignment posture:

{
  "card_version": "unified/2026-04-26",
  "card_id": "ac-my-agent-001",
  "agent_id": "did:web:my-agent.example.com",
  "autonomy_mode": "enforce",
  "integrity_mode": "enforce",
  "principal": {
    "type": "human",
    "identifier": "did:web:user.example.com",
    "relationship": "delegated_authority"
  },
  "values": {
    "declared": ["principal_benefit", "transparency", "minimal_data"],
    "conflicts_with": ["deceptive_marketing", "hidden_fees"]
  },
  "autonomy": {
    "bounded_actions": ["search", "compare", "recommend"],
    "escalation_triggers": [
      {
        "condition": "purchase_value > 100",
        "action": "escalate",
        "reason": "Exceeds autonomous spending limit"
      }
    ],
    "forbidden_actions": ["share_credentials", "subscribe_to_services"]
  }
}
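To make the `autonomy` section concrete, here is a minimal Python sketch of how a runtime might evaluate a proposed action against a card like the one above. It is not the AAP library's API. The `evaluate_action` and `condition_holds` helpers are hypothetical, and so is the assumption that conditions follow a simple `<field> <operator> <number>` grammar, since the source only shows the single example `purchase_value > 100`.

```python
import operator
import re

# Trimmed copy of the "autonomy" section from the card above.
CARD = {
    "autonomy": {
        "bounded_actions": ["search", "compare", "recommend"],
        "escalation_triggers": [
            {
                "condition": "purchase_value > 100",
                "action": "escalate",
                "reason": "Exceeds autonomous spending limit",
            }
        ],
        "forbidden_actions": ["share_credentials", "subscribe_to_services"],
    }
}

# Assumed condition grammar: "<field> <operator> <number>".
_CONDITION = re.compile(r"^\s*(\w+)\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")
_OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


def condition_holds(condition: str, context: dict) -> bool:
    """Evaluate one trigger condition against the action's context."""
    match = _CONDITION.match(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    field, op, threshold = match.groups()
    if field not in context:
        return False  # the trigger does not apply to this action
    return _OPS[op](context[field], float(threshold))


def evaluate_action(card: dict, action: str, context: dict) -> str:
    """Return "forbid", "escalate", or "allow" for a proposed action."""
    autonomy = card["autonomy"]
    if action in autonomy["forbidden_actions"]:
        return "forbid"
    for trigger in autonomy["escalation_triggers"]:
        if condition_holds(trigger["condition"], context):
            return trigger["action"]
    if action in autonomy["bounded_actions"]:
        return "allow"
    return "escalate"  # assumption: undeclared actions go to the principal
```

In this sketch, forbidden actions are checked first so that no trigger can override them, and an action the card never mentions is escalated rather than silently allowed. Both orderings are design assumptions, not documented AAP behavior.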

2. AP-Trace

An audit log entry recording each decision. Entries are cryptographically signed; the excerpt below shows only the decision payload, without signature fields:

{
  "trace_id": "tr-f47ac10b-58cc-4372",
  "card_id": "ac-f47ac10b-58cc-4372",
  "timestamp": "2026-01-31T12:30:00Z",
  "action": {
    "type": "recommend",
    "name": "product recommendation",
    "category": "bounded"
  },
  "decision": {
    "alternatives_considered": [
      {"option_id": "A", "score": 0.85, "flags": []},
      {"option_id": "B", "score": 0.72, "flags": ["sponsored_content"]}
    ],
    "selected": "A",
    "selection_reasoning": "Highest score. Option B flagged as sponsored and deprioritized per principal_benefit value.",
    "values_applied": ["principal_benefit", "transparency"]
  }
}
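A trace is only useful if it can be compared with the card it claims to follow. The sketch below shows one way such a structural comparison could look in Python. The `check_trace` function and the specific checks are illustrative assumptions, not the behavior of `aap verify`, and the sample data is trimmed to the fields the checks read.

```python
# Trimmed card and trace using the field names from the examples above.
CARD = {
    "card_id": "ac-my-agent-001",
    "values": {"declared": ["principal_benefit", "transparency", "minimal_data"]},
    "autonomy": {
        "bounded_actions": ["search", "compare", "recommend"],
        "forbidden_actions": ["share_credentials", "subscribe_to_services"],
    },
}

TRACE = {
    "card_id": "ac-my-agent-001",
    "action": {"type": "recommend"},
    "decision": {
        "alternatives_considered": [{"option_id": "A"}, {"option_id": "B"}],
        "selected": "A",
        "values_applied": ["principal_benefit", "transparency"],
    },
}


def check_trace(card: dict, trace: dict) -> list[str]:
    """Return a list of problems; an empty list means the trace is consistent."""
    problems = []

    # The trace must reference the card it is being checked against.
    if trace["card_id"] != card["card_id"]:
        problems.append("trace references a different card")

    # The action must be inside the card's declared bounds.
    action = trace["action"]["type"]
    autonomy = card["autonomy"]
    if action in autonomy["forbidden_actions"]:
        problems.append(f"forbidden action: {action}")
    elif action not in autonomy["bounded_actions"]:
        problems.append(f"action outside declared bounds: {action}")

    # Every value the decision cites must be one the card declares.
    declared = set(card["values"]["declared"])
    for value in sorted(set(trace["decision"]["values_applied"]) - declared):
        problems.append(f"undeclared value applied: {value}")

    # The selected option must be one of the recorded alternatives.
    decision = trace["decision"]
    option_ids = {alt["option_id"] for alt in decision["alternatives_considered"]}
    if decision["selected"] not in option_ids:
        problems.append("selected option was not among the alternatives")

    return problems
```

Checks like these only establish that a trace is internally consistent with its card. They say nothing about whether the recorded reasoning is what the agent actually did.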

3. Value Coherence Handshake

Before agents coordinate, they verify their values align. This prevents conflicting agents from working together on tasks where their values clash.
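The source does not specify the handshake's wire format or scoring, so the following is only a sketch of the underlying idea. It assumes each agent exposes the `values.declared` and `values.conflicts_with` lists from its Alignment Card and treats any overlap between one agent's declared values and the other's conflicts as a clash. The `handshake` function and the two vendor cards are hypothetical.

```python
def handshake(card_a: dict, card_b: dict) -> dict:
    """Compare two cards' values and report shared values and clashes."""
    a_declared = set(card_a["values"]["declared"])
    b_declared = set(card_b["values"]["declared"])
    a_conflicts = set(card_a["values"]["conflicts_with"])
    b_conflicts = set(card_b["values"]["conflicts_with"])

    # A clash is a value one agent declares that the other rejects.
    conflicts = sorted((a_declared & b_conflicts) | (b_declared & a_conflicts))
    shared = sorted(a_declared & b_declared)
    return {"coherent": not conflicts, "shared": shared, "conflicts": conflicts}


# Values taken from the Alignment Card example earlier in this article.
SHOPPER = {
    "values": {
        "declared": ["principal_benefit", "transparency", "minimal_data"],
        "conflicts_with": ["deceptive_marketing", "hidden_fees"],
    }
}

# Hypothetical counterparties, invented for illustration.
VENDOR_A = {"values": {"declared": ["transparency", "fast_delivery"], "conflicts_with": []}}
VENDOR_B = {
    "values": {
        "declared": ["hidden_fees", "fast_delivery"],
        "conflicts_with": ["minimal_data"],
    }
}

result_a = handshake(SHOPPER, VENDOR_A)  # coherent, shares "transparency"
result_b = handshake(SHOPPER, VENDOR_B)  # clashes in both directions
```

A real implementation would likely need a shared vocabulary of value names, since exact string matching cannot tell that two differently named values mean the same thing.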

Setup Workflow

Step 1: Install AAP

pip install agent-alignment-protocol

Or via npm:

npm install @mnemom/agent-alignment-protocol

Step 2: Generate an Alignment Card

aap init --values "principal_benefit,transparency,harm_prevention"
# Creates alignment-card.json

Step 3: Instrument Your Agent

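from aap import trace_decision

@trace_decision(card_path="alignment-card.json")
def recommend_product(user_preferences):
    # Your agent logic goes here; decisions are automatically traced.
    # The placeholder below stands in for a real recommendation.
    recommendation = {"option_id": "A"}
    return recommendation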

Step 4: Verify Behavior

aap verify --card alignment-card.json --trace logs/trace.json
# Output:
# Verified [similarity: 0.82]
# Checks: autonomy, escalation, values, forbidden, behavioral_similarity

Mnemom AEGIS — Runtime Enforcement

Beyond AAP’s transparency layer, Mnemom offers AEGIS (Adaptive Enforcement, Governance & Intelligence Substrate) — a runtime enforcement system that:

  • Screens transactions at four checkpoints (front door, agent core, back door, post-hoc)
  • Provides cross-tenant threat intelligence
  • Detects supply-chain attacks across npm/PyPI packages
  • Propagates signed Managed Rules in under 30 seconds (P95 target)

AEGIS is the runtime layer that catches what supply-chain verification misses.
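The four checkpoints can be pictured as a short pipeline. The Python sketch below is a toy model, not AEGIS itself. The rule contents (`INJECTION_MARKERS`, `EMAIL_PATTERN`, `DECLARED_TOOLS`), the function names, and the idea that the agent-core checkpoint compares tool calls with the card's declared scope are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative rules only; real screening would be far richer.
INJECTION_MARKERS = ("ignore previous instructions", "disregard your rules")
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DECLARED_TOOLS = {"search", "compare", "recommend"}


def front_door(message: str) -> bool:
    """Pass inbound messages that contain no known injection marker."""
    lowered = message.lower()
    return not any(marker in lowered for marker in INJECTION_MARKERS)


def agent_core(tool_calls: list[str]) -> bool:
    """Pass only if every tool call is within the declared scope."""
    return all(tool in DECLARED_TOOLS for tool in tool_calls)


def back_door(output: str) -> bool:
    """Pass outputs that do not match a PII pattern (here, an email address)."""
    return EMAIL_PATTERN.search(output) is None


def screen(message: str, tool_calls: list[str], output: str) -> list[tuple[str, str]]:
    """Run the three inline checkpoints in order, stopping at the first block."""
    checks = [
        ("front_door", lambda: front_door(message)),
        ("agent_core", lambda: agent_core(tool_calls)),
        ("back_door", lambda: back_door(output)),
    ]
    verdicts = []
    for name, check in checks:
        passed = check()
        verdicts.append((name, "pass" if passed else "block"))
        if not passed:
            break
    return verdicts


def post_hoc(history: list[list[tuple[str, str]]]) -> Counter:
    """Fourth checkpoint: count blocks per checkpoint across past transactions."""
    return Counter(
        name for verdicts in history for name, verdict in verdicts if verdict == "block"
    )
```

Stopping at the first block keeps a rejected message from ever reaching the agent, while the post-hoc pass works only on recorded verdicts and so can run outside the request path.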

EU AI Act Compliance

Mnemom ships EU AI Act Article 50-ready disclosures out of the box. Article 50 enforcement begins August 2, 2026, and the platform covers:

  • Transparency obligations for AI systems
  • Machine-readable content marking
  • Signed governance event chains for audit

The platform maps Articles 10, 12, and Annex IV to the signed governance event chain.

Cryptographic Guarantees

Every verdict is Ed25519-signed and backed by:

  • Hash-chained traces (SHA-256)
  • Merkle tree inclusion proofs
  • ZK-STARK proofs sampled at 10% by default
  • Exportable audit bundles for regulators
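Hash chaining is the simplest of these mechanisms to illustrate. The Python sketch below links each trace to the hash of the previous entry with SHA-256, so that editing any earlier trace invalidates everything after it. The entry layout, the all-zero genesis value, and the canonical JSON encoding are assumptions; Mnemom's actual format is not shown in the source. Ed25519 signing is left out because it needs a third-party library.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def entry_hash(prev_hash: str, trace: dict) -> str:
    """Hash the previous entry's hash together with a canonical trace encoding."""
    payload = json.dumps(trace, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(traces: list[dict]) -> list[dict]:
    """Wrap each trace in an entry that commits to the entry before it."""
    chain = []
    prev = GENESIS
    for trace in traces:
        digest = entry_hash(prev, trace)
        chain.append({"prev_hash": prev, "trace": trace, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash and confirm each entry links to its predecessor."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["trace"]):
            return False
        prev = entry["hash"]
    return True
```

Sorting keys and fixing separators in `json.dumps` matters here: two parties must serialize the same trace to the same bytes, or their hashes will disagree even when the data matches.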

The proof chain answers four questions regulators actually ask:

  1. Prove the agent did what the card declared — Every response is cryptographically bound to the card hash
  2. Prove no unauthorized tool was called — Every tool call is recorded against the card’s declared scope
  3. Prove no regulated data leaked — Back-door screening evaluates every output against PII/PHI patterns
  4. Prove the agent was not prompt-injected — Front-door verdicts on every inbound message are part of the decision trace
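An exportable audit bundle is most useful when a reviewer can check a single trace without receiving the whole log, which is what a Merkle inclusion proof provides. The Python sketch below is a generic construction, not Mnemom's format. It borrows the leaf and node prefix bytes used in RFC 6962, but duplicates the last node on odd-sized levels as a simplification that RFC 6962 itself does not use.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int):
    """Return (root, proof) for leaves[index].

    The proof is a list of (sibling_hash, sibling_is_left) pairs, ordered
    from the leaf level up to the root.
    """
    if not leaves or not 0 <= index < len(leaves):
        raise ValueError("index must refer to an existing leaf")
    level = [_h(b"\x00" + leaf) for leaf in leaves]  # 0x00 marks a leaf
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node
        sibling = index ^ 1  # neighbor within the same pair
        proof.append((level[sibling], sibling < index))
        level = [
            _h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks an inner node
            for i in range(0, len(level), 2)
        ]
        index //= 2
    return level[0], proof


def verify_inclusion(leaf: bytes, proof, root: bytes) -> bool:
    """Rebuild the path from the leaf to the root using only the proof."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = _h(b"\x01" + sibling + node)
        else:
            node = _h(b"\x01" + node + sibling)
    return node == root
```

The verifier needs only the leaf, the proof, and a root it already trusts, for example one covered by an Ed25519 signature. The proof grows with the logarithm of the log size rather than with the log itself.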

Zone-Neutral Stance

Mnemom is zone-neutral — it doesn’t belong to any model provider:

  • Works across OpenAI, Anthropic, Gemini, and open-weights models
  • A trust plane built by any one model vendor is structurally conflicted
  • Alignment Cards are portable across providers, versions, and runtimes

“A Fortune 500 does not run on one model. Claude, GPT, Gemini, open-weights Llama — all in production, often on the same workflow.” — Mnemom documentation

Practical Evaluation Checklist

Before adopting Mnemom for production agents:

  • [ ] Define your Alignment Card — What values does your agent declare? What actions are bounded vs. forbidden?
  • [ ] Test trace generation — Instrument a non-production agent and verify traces are generated correctly
  • [ ] Verify coherence checks — Run aap verify on sample traces to ensure behavioral similarity scores meet your threshold
  • [ ] Evaluate AEGIS if needed — For runtime enforcement (not just transparency), assess whether AEGIS’s four-checkpoint model fits your architecture
  • [ ] Check EU AI Act timeline — If you operate in the EU, Article 50 enforcement begins August 2, 2026
  • [ ] Review cross-provider needs — If you use multiple model providers, Mnemom’s zone-neutral stance is a key advantage

Security Notes

  • AAP is transparency, not trust — It makes behavior observable but doesn’t guarantee agents behave as declared
  • Supply-chain detection — AEGIS detects cross-tenant patterns (e.g., compromised npm packages with valid SLSA attestations)
  • Adversarial arena — 15 canonical personas test the system, including a “supply-chain mole” persona
  • Calm-at-GA contract — If the network is calm, the public IoC feed is empty by design (no fake activity)
  • Two-person review — Tier-1 and tier-2 rules (those that block production traffic) require two-person human review before promotion

FAQ

Q: Is Mnemom open source?
A: Yes, AAP is Apache 2.0 licensed. The core protocols (AAP, AIP) are open source. AEGIS (runtime enforcement) has commercial tiers.

Q: Does Mnemom replace Lakera Guard or NeMo Guardrails?
A: No. Mnemom complements existing guardrails. It operates at a different layer: the network/protocol layer, not the in-process detection layer. You can run AEGIS alongside Lakera, NeMo, Cloudflare WAF, or AWS Bedrock Guardrails.

Q: What’s the difference between AAP and AIP?
A: AAP (Agent Alignment Protocol) declares what the agent should do and produces post-hoc traces. AIP (Agent Integrity Protocol) provides real-time integrity checkpoints during agent reasoning, with drift detection.

Q: Does Mnemom work with local models?
A: Yes. Mnemom is model-agnostic. It works with OpenAI, Anthropic, Gemini, and any local model fronted by the gateway.

Q: What happens if my agent’s behavior drifts from its Alignment Card?
A: AIP detects drift in real time. The system produces a verdict (clear, review-needed, or boundary-violation) with SHA-256 hashed evidence. AEGIS can automatically escalate or block based on your configuration.

Q: Is Mnemom production-ready?
A: The protocols are at v1.0.0 spec. AEGIS is live in production with customers. The platform ships with a “calm-at-GA” contract: if there are no active threats, the public advisory feed explicitly says so rather than fabricating activity.

Conclusion

Mnemom addresses a critical gap in the AI agent stack: trust infrastructure. As agents become more capable and autonomous, the ability to check what they did against what they declared, rather than take it on trust, becomes essential for regulated industries, multi-provider deployments, and cross-agent coordination.

The Agent Alignment Protocol provides a foundation for making agent alignment claims verifiable, while AEGIS adds runtime enforcement for teams that need active protection. The zone-neutral, open-source approach positions Mnemom as a protocol layer rather than a source of vendor lock-in.

For teams building production agents — especially in regulated industries or multi-provider environments — Mnemom’s transparency and enforcement layers are worth evaluating before the EU AI Act enforcement deadline in August 2026.