The Canonical Data Protocol

Memory Condensation System

The definitive technical reference. The implementation standard.

Condensate is a central, platform-agnostic memory operating system for AI Agents. Define your state once, and let agents in GCP, OpenAI, or anywhere else connect to it. You own the final insights, the data, and the graphs, all ready for future fine-tuning.

Why Condensate Exists

Modern LLM deployments suffer from escalating costs and latency. Traditional systems push entire conversation histories into models, hoping relevance emerges. We built Condensate because injecting full transcripts wastes tokens, bloated contexts lead to the "lost in the middle" phenomenon, and redundant retrieval cycles burn resources.

Design Philosophy

  • Stop Escalating Token Costs

    Reduce prompt size by 60–90% by injecting structured semantic summaries instead of verbatim text.

  • Cure Context Window Saturation

    Enforce dynamic context ceilings to eliminate latency and confusion associated with massive context windows.

  • Structured Before Embedded

    Extract structured Intent, Entities, Decisions, and Outcomes. Only condensed semantic units are stored.

  • Absolute Data Sovereignty & Portability

    You own the final insights. Whether your agents live in GCP, AWS, or OpenAI, they can all plug into your self-hosted, centralized memory system. Your distilled graphs act as high-quality datasets for future model fine-tuning.
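
The structured units described under "Structured Before Embedded" can be pictured as a small record type. This is an illustrative sketch; the field names are assumptions, not Condensate's wire format:

```python
from dataclasses import dataclass, field

@dataclass
class SemanticUnit:
    # A condensed memory record distilled from a raw transcript.
    intent: str                                     # why the user engaged
    entities: list = field(default_factory=list)    # people, systems, artifacts mentioned
    decisions: list = field(default_factory=list)   # commitments made during the exchange
    outcome: str = ""                               # how the interaction resolved

unit = SemanticUnit(
    intent="schedule_meeting",
    entities=["Alice", "Q3 roadmap review"],
    decisions=["meet Tuesday 10:00"],
    outcome="confirmed",
)
# A record like this replaces the verbatim transcript in the prompt,
# which is where the 60-90% token reduction comes from.
```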

Canonical Definitions

The Definition

Condensate is a local-first, peer-to-peer data synchronization protocol designed for AI Agent memory structures. It utilizes deterministic Merkle-DAGs for state tracking and SEC (Strong Eventual Consistency) to ensure conflict-free convergence across decentralized, offline-capable environments.

In short: It is git-like version control for an AI's structured memory, built natively for edge runtimes.

Defining Properties

  • Decentralized Concurrency: Multiple agents can mutate local memory state simultaneously without coordination.
  • Cryptography-First: Every state change is hashed and signed, forming an immutable provenance chain.
  • Deterministic Merge: Concurrent divergent states represent multiple realities that merge without conflict upon data sync (CRDT-like behavior).
  • Offline-First: Local reads and writes have zero network latency. Syncing happens asynchronously.
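
The offline-first property can be modeled as a local append-only op log that flushes when peers reconnect. This is a toy sketch of the behavior, not the daemon's actual implementation:

```python
class Replica:
    def __init__(self):
        self.log = []        # local, append-only op log
        self.pending = []    # ops not yet gossiped to peers

    def write(self, op):
        # Local write: appended immediately, no network round-trip.
        self.log.append(op)
        self.pending.append(op)

    def sync(self, peer):
        # Asynchronous exchange: both replicas end up with the union of logs.
        for op in self.pending:
            if op not in peer.log:
                peer.log.append(op)
        for op in peer.log:
            if op not in self.log:
                self.log.append(op)
        self.pending.clear()

a, b = Replica(), Replica()
a.write("set intent=refund")       # succeeds with zero network latency
b.write("set entity=invoice_42")   # concurrent write on another replica
a.sync(b)                          # later, both converge on the union
```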

What Condensate is NOT

  • Not a Vector Database: It does not perform semantic similarity search. It manages the causal graph of memory elements.
  • Not a Blockchain: There is no global consensus mechanism, no token, and no wasteful mining. It relies on peer-to-peer Byzantine fault-tolerant replication.
  • Not a relational DB: It does not use SQL or enforce rigid tables. It maps an evolving cognitive ontology.

Formal Technical Specification

The Condensate protocol operates on a causal directed acyclic graph (DAG). All interactions generate structured semantic deltas, which are gossiped among nodes.

01 System Model & Data Structures

The core data structure is an immutable Directed Acyclic Graph. Nodes represent memory snapshots; edges represent delta operations.

interface CondensateNode {
  hash: string;         // SHA-256 hash of the node payload
  parents: string[];    // Array of parent hashes
  payload: Operation[]; // The semantic diff applied
  signature: string;    // Ed25519 signature of the author
}
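
The same structure can be sketched in Python to show how the content hash makes the DAG content-addressable. The canonical JSON serialization below is an illustrative assumption; the interface above does not pin down an encoding:

```python
import hashlib
import json
from dataclasses import dataclass

@dataclass
class CondensateNode:
    parents: list          # parent hashes
    payload: list          # the semantic diff applied
    signature: str = ""    # Ed25519 signature (omitted in this sketch)

    @property
    def hash(self) -> str:
        # Canonical serialization: sorted parents, sorted keys, fixed
        # separators, so equal content always yields equal bytes.
        canonical = json.dumps(
            {"parents": sorted(self.parents), "payload": self.payload},
            sort_keys=True, separators=(",", ":"),
        )
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Identical content yields an identical hash regardless of parent order,
# which is what makes the DAG content-addressable.
a = CondensateNode(parents=["p1", "p2"], payload=[{"op": "add", "key": "intent"}])
b = CondensateNode(parents=["p2", "p1"], payload=[{"op": "add", "key": "intent"}])
assert a.hash == b.hash
```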

02 Conflict Resolution & Guarantees

Strong Eventual Consistency (SEC)

All nodes that have received the same set of updates compute an identical memory state, without requiring a central coordinator or leader election.

Tie-breaking Algorithm

For operations with identical logical timestamps, a deterministic tie-breaking rule (e.g., lexical ordering of author public keys paired with Lamport clocks) dictates the final merged state.
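
A toy model of this rule in Python, assuming last-writer-wins registers and illustrative field names (not the protocol's actual operation format):

```python
def canonical_order(ops):
    # Sort by Lamport clock, then lexically by author public key, so
    # every replica orders the same concurrent operations identically.
    return sorted(ops, key=lambda op: (op["lamport"], op["author"]))

def apply_ops(ops):
    # Last-writer-wins over the canonical order: replicas holding the
    # same op set converge to the same state (the SEC guarantee).
    state = {}
    for op in canonical_order(ops):
        state[op["key"]] = op["value"]
    return state

ops = [
    {"lamport": 3, "author": "b2aa", "key": "topic", "value": "billing"},
    {"lamport": 3, "author": "a1ff", "key": "topic", "value": "support"},  # same timestamp
]
# "b2aa" sorts after "a1ff", so its write wins on every peer,
# regardless of the order in which the ops arrived.
assert apply_ops(ops) == apply_ops(list(reversed(ops))) == {"topic": "billing"}
```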

Implementation & SDKs

Condensate provides native SDKs for the languages where AI agents live. All SDKs are strictly typed and communicate with the local Condensate daemon via gRPC.

Python SDK

pip install condensate

from condensate.client import CondensateClient

client = CondensateClient("http://localhost:8000", "api_key")
client.add_item(
    project_id="proj_1",
    source="api",
    text="User requested scheduling"
)

TypeScript SDK

npm install @condensate-io/sdk

import { CondensateClient } from '@condensate-io/sdk';

const client = new CondensateClient("http://localhost:8000", "api_key");
await client.addItem({
    project_id: "proj_1",
    source: "api",
    text: "User requested scheduling"
});

Integration Examples

Google Agentic Development Kit (ADK)

Condensate integrates natively with Google ADK via the Model Context Protocol (MCP) server.

from google_adk import Agent
from condensate.mcp import CondensateMCP

# Initialize Condensate as an MCP tool provider
memory_tools = CondensateMCP(project_id="adk-agent-1").get_tools()

agent = Agent(
    model="gemini-2.5-pro",
    tools=memory_tools,
    system_instruction="You are a helpful assistant with long-term canonical memory."
)

OpenAI Swarm & Assistants API

Inject Condensate operations directly into OpenAI function calls for persistent state across threads.

import openai
from condensate.client import CondensateClient

client = CondensateClient("http://localhost:8000", "api_key")

def save_memory(text: str):
    # Persist the distilled detail in Condensate for future sessions.
    return client.add_item(project_id="openai-agent", source="tool", text=text)

# Expose the helper as an OpenAI tool definition
tools = [{
    "type": "function",
    "function": {
        "name": "save_memory",
        "description": "Save critical user details for future sessions.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "The detail to persist."}
            },
            "required": ["text"]
        }
    }
}]

AWS Bedrock & Lambda

Deploy Condensate alongside serverless AWS workloads for highly available inference memory.

import boto3
from condensate.client import CondensateClient

bedrock = boto3.client('bedrock-runtime')
condensate = CondensateClient(base_url="https://internal-condensate-lb", api_key="aws_secret")

def lambda_handler(event, context):
    # Retrieve contextual memory prior to Bedrock invocation
    context_docs = condensate.retrieve(query=event["user_input"], project_id="aws-prod")
    
    # ... inject context_docs into Bedrock prompt ...

Real-World Adoption

Trusted by highly scalable agentic frameworks, including AutoGen, CrewAI, and LangChain.

Threat Model & Security

Attacker Capabilities Considered

  • Byzantine Peers: Peers in the network can send maliciously structured DAG segments or attempt temporal isolation. Note that because Condensate relies on deterministic hash-chaining, peers cannot forge another agent's history without the private key.
  • Network Listening: All synchronization happens over untrusted networks.
  • Data Poisoning (Prompt Injection): Malicious payloads may attempt to poison the AI's long-term memory to alter future inference behavior.

Encryption & Key Management

By default, Condensate relies on AES-256-GCM for at-rest and transport encryption. Synchronization connections are negotiated using an X25519 elliptic curve Diffie-Hellman key exchange.

Keys are managed via external KMS integration or local secure enclaves. The protocol assumes keys are held securely by the host OS.

Trust Assumptions & Limitations

Condensate assumes the local Agent runtime is not compromised. If an attacker gains root access to the node running Condensate, they can extract the local private key.

Furthermore, human-in-the-loop (HITL) assertions are highly recommended for untrusted edge environments to prevent data poisoning via injection.

Comparison Matrices

How Condensate positions itself against existing technologies for AI state management.

Feature / Guarantee  | Condensate                | Standard CRDTs (Automerge) | Relational DBs (Postgres)  | Vector DBs (Pinecone)
-------------------- | ------------------------- | -------------------------- | -------------------------- | ---------------------
Primary Use Case     | AI Memory Graphs          | Collaborative Text/JSON    | Structured CRUD Data       | Semantic Search
Conflict Resolution  | Deterministic Merge (SEC) | Deterministic Merge        | Last-Write-Wins (or locks) | N/A (Append-mostly)
Network Model        | P2P Decentralized         | P2P Decentralized          | Client-Server              | Client-Server
Cryptography-First   | Yes (Merkle-DAGs)         | No                         | No                         | No
Context Optimization | Semantic Distillation     | None                       | None                       | Retrieval only

Academic & Technical References

Conflict-Free Replicated Data Types

Shapiro, M., Preguiça, N., Baquero, C., & Zawirski, M. (2011). Conflict-free replicated data types. In Symposium on Self-Stabilizing Systems (pp. 386-400).


CAP Theorem & Eventual Consistency

Brewer, E. A. (2000). Towards robust distributed systems. ACM Symposium on Principles of Distributed Computing (PODC).


Merkle DAGs for Content Addressing

Benet, J. (2014). IPFS - Content Addressed, Versioned, P2P File System. arXiv preprint arXiv:1407.3561.


Governance & Roadmap

Open Governance Model

Condensate operates under a strict Open Governance model. All protocol upgrades, cryptographic primitive swaps, and schema changes must pass through the public RFC process.

  • License: Apache 2.0 (Permissive, open source).
  • RFC Process: Proposals require a technical spec and a minimum quorum of core maintainers.
  • Core Maintainers: The initial steering committee is elected based on GitHub contributions.

Roadmap & Version History

v0.1.0
Initial Protocol Release

Launch of TS/Python SDKs, basic deterministic DAG sync, and AES encryption.

v0.5.0
HITL Assertions

Manual review pipelines and instruction injection heuristics.

v1.0.0 (Q4)
Federation Support

Multi-orchestrator native syncing without centralized hubs.

Comprehensive FAQ

Is Condensate a CRDT?

Condensate behaves similarly to a Conflict-free Replicated Data Type (CRDT) by ensuring that concurrent operations yield the same final state regardless of the order in which they are received. However, it operates on a richer data ontology specifically designed for agent intent and entities, not just plaintext or JSON properties.

Can it work completely offline?

Yes. Condensate is an offline-first architecture. Read and write operations occur instantly against the local database node. When a network connection is established, the local DAG synchronizes asynchronously with peers to reach state convergence.

How does it handle conflicting realities?

If two agents update the same cognitive node simultaneously, Condensate captures both states as divergent branches in the Merkle-DAG. Through Strong Eventual Consistency (SEC) and lexical Lamport tie-breaking, the daemon resolves the conflict deterministically, so all nodes eventually converge on the same merged state.

Glossary of Terms

Conflict-free
A property indicating that data structures can be concurrently modified and independently merged without manual conflict resolution or data corruption.
Replica
A full, independent copy of the agent's memory DAG hosted locally by an individual orchestrator or edge node.
Merkle structure
A tree or graph where every node is labeled with the cryptographic hash of its child nodes, allowing for rapid and secure verification of entire data sets.
Causal ordering
A logical ordering of events such that if event A caused event B, all nodes in the system will observe A before B.
Deterministic merge
An algorithmic guarantee that merging the same sets of conflicting state changes will produce the identical resulting state on every peer.
End-to-end encryption
Security paradigm where data is encrypted on the sender's client and only decrypted on the recipient's client, opaque to intermediate transport networks.
Identity authority
The root cryptographic keypair that proves the authenticity of an agent or user. Ed25519 signatures validate all DAG operations.
Federation
The capability for disparate agent memory networks to communicate and share specific DAG sub-graphs securely across organizational boundaries.