[ PROTOCOL SPECIFICATION // CAUSAL MERKLE-DAG ]

Current AI Memory is Built for Search.
Condensate is Built for Cognition.

Move past flat vector spaces and vendor lock-in. Give your autonomous multi-agent swarms a sovereign, cryptographically-signed causal graph that actually learns over time.

Interactive Paradigm Simulator

10:00 AM: "Server X is down"
10:05 AM: "Server X is up"
Resolves via CRDT Merge -> Canonical State Resolved

Each update is a node linked to its predecessor by a causal, chronologically ordered edge. Queries automatically return the latest node whose cryptographic signature verifies.
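A minimal sketch of how "return the latest node" could work over the two updates in the simulator. The `Node` fields and the `latest` helper are illustrative assumptions, not the Condensate API, and signature verification is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    node_id: str
    claim: str
    observed_at: str          # wall-clock label, for display only
    parent_id: Optional[str]  # causal edge to the node this one supersedes


def latest(nodes: list[Node]) -> Node:
    """Return the head of the chain: the one node that no other node supersedes."""
    superseded = {n.parent_id for n in nodes if n.parent_id is not None}
    heads = [n for n in nodes if n.node_id not in superseded]
    if len(heads) != 1:
        raise ValueError(f"expected exactly one head, found {len(heads)}")
    return heads[0]


history = [
    Node("n1", "Server X is down", "10:00 AM", parent_id=None),
    Node("n2", "Server X is up", "10:05 AM", parent_id="n1"),
]
print(latest(history).claim)  # Server X is up
```

The head is found from the causal edges alone, so the answer does not depend on wall-clock timestamps or on the order in which nodes arrived.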

[ Bottleneck Analysis: Status Quo AI Memory ]

1. Vector RAG

[ 01 // NO TRUTH VALUE ]

Relies blindly on embedding proximity. Cannot distinguish a verified fact from a hallucination or an outdated piece of data. Leads to token-wasting runtime sorting and contradiction blindness.

2. Vendor Threads

[ 02 // THE SILOED HORIZON ]

Locked in walled gardens (e.g. OpenAI Memory). Zero cross-model interoperability. An OpenAI planner and a local Llama agent cannot share the same cognitive graph.

3. DB Logic Layers

[ 03 // BRITTLE CONCURRENCY ]

Centralized master-slave architecture. Last-Write-Wins resolution silently overwrites nuanced edge mutations during offline-first or concurrent agent deployments.

[ Core Architecture & Distributed Protocols ]

[ Cryptographically-Signed Merkle-DAGs ]

Instead of a bag of text chunks, Condensate stores memory as a verifiable graph. Entities are extracted, canonicalized, and explicitly linked via semantic edges.

  • Every state change is hashed and signed.
  • Immutable provenance chain.
  • Logically resolved ground truth, not just proximity.

Hash: 0x1A2B [Agent_A // Merkle Root]
Hash: 0x3C4D [Agent_B // Mutation Node] <-- Parent: 0x1A2B
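The parent-linked hashes above can be illustrated with a small sketch. The node layout and the `make_node` / `verify_node` helpers are assumptions made for this example, and an HMAC tag stands in for a real public-key signature so that the code runs on the standard library alone.

```python
import hashlib
import hmac
import json
from typing import Optional


def make_node(agent: str, payload: dict, parent: Optional[str], key: bytes) -> dict:
    """Build a node whose hash covers its content and its parent's hash."""
    body = {"agent": agent, "payload": payload, "parent": parent}
    encoded = json.dumps(body, sort_keys=True).encode()
    return {
        **body,
        "hash": hashlib.sha256(encoded).hexdigest(),
        # Assumption: HMAC stands in for a real signature scheme in this sketch.
        "sig": hmac.new(key, encoded, hashlib.sha256).hexdigest(),
    }


def verify_node(node: dict, key: bytes) -> bool:
    """Recompute the hash and tag; any change to content or parent breaks both."""
    body = {k: node[k] for k in ("agent", "payload", "parent")}
    encoded = json.dumps(body, sort_keys=True).encode()
    expected_sig = hmac.new(key, encoded, hashlib.sha256).hexdigest()
    return (
        node["hash"] == hashlib.sha256(encoded).hexdigest()
        and hmac.compare_digest(node["sig"], expected_sig)
    )


key_a, key_b = b"agent-a-demo-key", b"agent-b-demo-key"  # demo keys only
root = make_node("Agent_A", {"fact": "Server X is down"}, None, key_a)
child = make_node("Agent_B", {"fact": "Server X is up"}, root["hash"], key_b)

tampered = {**child, "payload": {"fact": "Server X is on fire"}}
print(verify_node(child, key_b), verify_node(tampered, key_b))  # True False
```

Because the child's hash covers the parent's hash, rewriting any ancestor changes every descendant hash, which is what makes the provenance chain tamper-evident.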

[ Universal Semantic Bus ]

The daemon runs on-premise or within a private VPC. It ties disparate AI models together into a shared cognitive space.

An OpenAI planner, an Anthropic coder, and a local Llama data-extractor can all concurrently read from and write to the exact same memory substrate.

[ Diagram: VPC Memory Substrate shared by OpenAI Planner, Claude Coder, and Llama Extractor ]

Conflict-Free Replicated Data Types (CRDTs) [ RE-MERGE SPECS ]

Concurrent agents generate divergent "branches" of reality that merge deterministically when the network syncs.
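One way to picture a deterministic merge is a grow-only set of immutable, hash-keyed nodes, where merging is a plain union. The sketch below assumes that model; the `merge` and `heads` helpers and the hash `0x5E6F` are hypothetical and not taken from the Condensate protocol.

```python
def merge(replica_a: dict[str, dict], replica_b: dict[str, dict]) -> dict[str, dict]:
    """Union two replicas keyed by node hash. Nodes are immutable, so equal
    keys always carry equal values and the union is order-independent."""
    merged = dict(replica_a)
    for node_hash, node in replica_b.items():
        merged.setdefault(node_hash, node)
    return merged


def heads(replica: dict[str, dict]) -> list[str]:
    """Hashes that no other node names as a parent, in sorted order."""
    parents = {n["parent"] for n in replica.values() if n["parent"]}
    return sorted(h for h in replica if h not in parents)


shared = {"0x1A2B": {"parent": None, "fact": "Server X is down"}}
agent_a = {**shared, "0x3C4D": {"parent": "0x1A2B", "fact": "Server X is up"}}
agent_b = {**shared, "0x5E6F": {"parent": "0x1A2B", "fact": "Server X is rebooting"}}

ab, ba = merge(agent_a, agent_b), merge(agent_b, agent_a)
print(ab == ba, heads(ab))  # True ['0x3C4D', '0x5E6F']
```

Both merge orders yield the same replica, and both divergent branches survive as heads. Choosing a single canonical head is a separate, equally deterministic step.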

[ ACTIVE LEARNING ENGINE ]

[ The Synapse Engine: Causal Feedback Reinforcement ]

Traditional memory waits to be queried. Condensate learns. Using Causal Feedback Networks based on Hebbian principles ("Neurons that fire together, wire together"), the graph structurally reinforces semantic pathways independent of LLM weights.

1. Synapse Creation

When the Condenser extracts entities and assertions, it emits candidate synapses based on `co_occurs`, `same_entity`, or `same_goal` signals.

Synapse {
  from_memory_id: UUID, to_memory_id: UUID
  relation: "same_goal"
  weight: 0.1
}
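As a rough illustration of candidate emission, the sketch below pairs every two memories extracted from the same chunk and emits a low-weight `co_occurs` synapse for each pair. The `Synapse` dataclass mirrors the fields shown above; the `co_occurrence_candidates` helper is an assumption for this example, not part of the Condenser.

```python
from dataclasses import dataclass
from itertools import combinations
from uuid import UUID, uuid4


@dataclass
class Synapse:
    from_memory_id: UUID
    to_memory_id: UUID
    relation: str
    weight: float = 0.1  # initial weight taken from the example above


def co_occurrence_candidates(memory_ids: list[UUID]) -> list[Synapse]:
    """Emit one low-weight `co_occurs` synapse per unordered pair of memories."""
    return [
        Synapse(a, b, relation="co_occurs")
        for a, b in combinations(memory_ids, 2)
    ]


chunk_memories = [uuid4() for _ in range(3)]
candidates = co_occurrence_candidates(chunk_memories)
print(len(candidates))  # 3 pairs from 3 memories
```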

2. Hebbian Strengthening

Synapses are reinforced when connected memories are successfully retrieved together, proving their utility to the swarm.

# retrieved_together is a placeholder predicate supplied by the retrieval layer
if retrieved_together(memory_a, memory_b):
    synapse.weight += learning_rate * relevance

3. Adaptive Decay & Pruning

Prevents unbounded graph growth. Unused connections decay over time, and synapses whose weight falls below a threshold are archived.

synapse.weight *= decay_rate
if synapse.weight < prune_threshold: archive(synapse)
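The strengthening and decay rules can be run end to end in a small simulation. Everything below is a sketch under stated assumptions: the parameter values, the weight cap at 1.0, and the `reinforce` / `decay_and_prune` helpers are illustrative and not taken from the Synapse Engine.

```python
from dataclasses import dataclass


@dataclass
class Synapse:
    relation: str
    weight: float = 0.1


def reinforce(s: Synapse, relevance: float, learning_rate: float = 0.5) -> None:
    """Hebbian step: grow the weight when both memories were retrieved together."""
    s.weight = min(1.0, s.weight + learning_rate * relevance)  # cap is an assumption


def decay_and_prune(synapses: list[Synapse], decay_rate: float = 0.9,
                    prune_threshold: float = 0.05) -> tuple[list[Synapse], list[Synapse]]:
    """Apply one decay tick, then split synapses into (kept, archived)."""
    kept, archived = [], []
    for s in synapses:
        s.weight *= decay_rate
        (kept if s.weight >= prune_threshold else archived).append(s)
    return kept, archived


used, unused = Synapse("same_goal"), Synapse("co_occurs")
reinforce(used, relevance=0.8)           # retrieved together once

live = [used, unused]
archived_total = []
for _ in range(10):                      # ten idle ticks with no further retrievals
    live, archived = decay_and_prune(live)
    archived_total.extend(archived)

print([s.relation for s in live], [s.relation for s in archived_total])
# ['same_goal'] ['co_occurs']
```

With these values the reinforced synapse is still live after ten ticks, while the never-retrieved one drops below the threshold on the seventh tick and is archived.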

4. Memory Consolidation

Using `networkx` Louvain clustering, dense subgraphs are identified. Local LLMs synthesize these into higher-order "Policies" and "Learnings".

New Policy Synthesized: "User prefers local-first AI infra + verifiable memory over opaque SaaS memory."
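The clustering half of consolidation can be sketched with the `networkx` package named above (`louvain_communities` requires networkx 2.7 or later). The memory IDs and edge weights are hypothetical, and the LLM synthesis step that would turn each cluster into a policy is not shown.

```python
import networkx as nx

# Nodes are memory IDs; edge weights are synapse weights (hypothetical values).
graph = nx.Graph()
graph.add_weighted_edges_from([
    ("m1", "m2", 0.9), ("m2", "m3", 0.8), ("m1", "m3", 0.7),   # dense cluster A
    ("m4", "m5", 0.9), ("m5", "m6", 0.8), ("m4", "m6", 0.7),   # dense cluster B
    ("m3", "m4", 0.05),                                         # weak bridge
])

# louvain_communities returns a list of node sets; seed fixes the random order.
communities = nx.community.louvain_communities(graph, weight="weight", seed=7)

for members in sorted(communities, key=min):
    print(sorted(members))
```

Each resulting node set is a candidate for synthesis into a single higher-order memory.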

[ Telemetry, Throughput & Performance Specs ]

The Parser Specification

Enterprise architects require minimal inference latency. Condensate uses a lightweight, locally executed extraction engine (Rust-based bindings wrapped in Python) to parse raw text into explicit semantic edges before committing to the Merkle-DAG.

  • Extraction Overhead: < 15ms per chunk
  • Local Model Support: Natively supports fast sub-8B models for parsing.
  • Throughput: Handles 10k+ concurrent edge mutations per second via async batching.
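To illustrate what async batching of edge mutations might look like, here is a minimal `asyncio` sketch. The `batch_writer` coroutine, the batch size of 100, and the mutation dictionaries are assumptions for the example and do not describe the daemon's actual write path.

```python
import asyncio


async def batch_writer(queue: asyncio.Queue, commit, max_batch: int = 100) -> None:
    """Drain queued mutations into batches and commit each batch in one call."""
    while True:
        batch = [await queue.get()]              # block until work arrives
        while len(batch) < max_batch and not queue.empty():
            batch.append(queue.get_nowait())     # take whatever is already queued
        commit(batch)
        for _ in batch:
            queue.task_done()


async def main() -> list[int]:
    queue: asyncio.Queue = asyncio.Queue()
    batch_sizes: list[int] = []
    writer = asyncio.create_task(
        batch_writer(queue, lambda batch: batch_sizes.append(len(batch)))
    )
    for i in range(250):                         # 250 hypothetical edge mutations
        queue.put_nowait({"edge_id": i})
    await queue.join()                           # wait until every mutation is committed
    writer.cancel()
    return batch_sizes


print(asyncio.run(main()))  # [100, 100, 50]
```

Batching amortizes the per-commit cost: 250 mutations reach the store in three commits instead of 250.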

Benchmark Metrics

Reducing runtime context stuffing lowers per-request token cost and retrieval latency.

Metric                                     Value
Context Token Overhead (Vector RAG)        100% (Baseline)
Context Token Overhead (Condensate)        ~15-20%
Retrieval Latency (Pinecone Baseline)      ~80ms
Retrieval Latency (Condensate Local Read)  < 5ms

[ Zero-Lockin SDK Integration ]

Condensate provides native SDKs for the languages where AI agents live. All SDKs are strictly typed and communicate with the local Condensate daemon via gRPC.

Python SDK

pip install condensate

from condensate.client import CondensateClient

client = CondensateClient("http://localhost:8000", "api_key")
client.add_item(
    project_id="proj_1",
    source="api",
    text="User requested scheduling"
)

TypeScript SDK

npm install @condensate-io/sdk

import { CondensatesClient } from '@condensate-io/sdk';

const client = new CondensatesClient("http://localhost:8000", "api_key");
await client.addItem({
    project_id: "proj_1",
    source: "api",
    text: "User requested scheduling"
});

Integration Examples

Google Agent Development Kit (ADK)

Condensate integrates natively with Google ADK via the Model Context Protocol (MCP) server.

from google.adk.agents import Agent
from condensate.mcp import CondensateMCP

# Initialize Condensate as an MCP tool provider
memory_tools = CondensateMCP(project_id="adk-agent-1").get_tools()

agent = Agent(
    name="memory_agent",
    model="gemini-2.5-pro",
    tools=memory_tools,
    instruction="You are a helpful assistant with long-term canonical memory."
)

OpenAI Swarm & Assistants API

Inject Condensate operations directly into OpenAI function calls for persistent state across threads.

import openai
from condensate.client import CondensateClient

client = CondensateClient("http://localhost:8000", "api_key")

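def save_memory(text: str):
    """Persist a user detail to Condensate so later threads can recall it."""
    return client.add_item(project_id="openai-agent", source="tool", text=text)

# Expose save_memory to the model as an OpenAI tool definition
tools = [{
    "type": "function",
    "function": {
        "name": "save_memory",
        "description": "Save critical user details for future sessions.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "The detail to remember."}
            },
            "required": ["text"]
        }
    }
}]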

AWS Bedrock & Lambda

Deploy Condensate alongside serverless AWS workloads for highly available inference memory.

import boto3
from condensate.client import CondensateClient

bedrock = boto3.client('bedrock-runtime')
condensate = CondensateClient(base_url="https://internal-condensate-lb", api_key="aws_secret")

def lambda_handler(event, context):
    # Retrieve contextual memory prior to Bedrock invocation
    context_docs = condensate.retrieve(query=event["user_input"], project_id="aws-prod")
    
    # ... inject context_docs into Bedrock prompt ...

Real-World Adoption

Integrates natively with highly scalable agentic frameworks: AutoGen, CrewAI, LangChain.

[ Threat Model & Cryptographic Verification ]

Attacker Capabilities Considered

  • Byzantine Peers: Peers in the network can send maliciously structured DAG segments or attempt temporal isolation. Because every node is hash-chained and signed, a peer cannot forge another agent's history without that agent's private key.
  • Network Eavesdropping: All synchronization is assumed to happen over untrusted networks where an adversary can observe traffic.
  • Data Poisoning (Prompt Injection): Malicious payloads may attempt to poison the AI's long-term memory to alter future inference behavior.

Encryption & Key Management

By default, Condensate relies on AES-256-GCM for at-rest and transport encryption. Session keys for synchronization connections are negotiated using X25519 elliptic-curve Diffie-Hellman.

Keys are managed via external KMS integration or local secure enclaves. The protocol assumes keys are held securely by the host OS.

Trust Assumptions & Limitations

Condensate assumes the local Agent runtime is not compromised. If an attacker gains root access to the node running Condensate, they can extract the local private key.

Furthermore, human-in-the-loop (HITL) assertions are highly recommended for untrusted edge environments to prevent data poisoning via injection.

[ Architectural Positioning & Guarantees ]

How Condensate positions itself against existing technologies for AI state management.

Feature / Guarantee  | Condensate                | Standard CRDTs (Automerge) | Relational DBs (Postgres)  | Vector DBs (Pinecone)
Primary Use Case     | AI Memory Graphs          | Collaborative Text/JSON    | Structured CRUD Data       | Semantic Search
Conflict Resolution  | Deterministic Merge (SEC) | Deterministic Merge        | Last-Write-Wins (or locks) | N/A (Append mostly)
Network Model        | P2P Decentralized         | P2P Decentralized          | Client-Server              | Client-Server
Cryptography First   | Yes (Merkle-DAGs)         | No                         | No                         | No
Context Optimization | Semantic Distillation     | None                       | None                       | Retrieval only

[ Academic Foundations & Citations ]

Conflict-Free Replicated Data Types

Shapiro, M., Preguiça, N., Baquero, C., & Zawirski, M. (2011). Conflict-free replicated data types. In Symposium on Self-Stabilizing Systems (pp. 386-400).


CAP Theorem & Eventual Consistency

Brewer, E. A. (2000). Towards robust distributed systems. ACM Symposium on Principles of Distributed Computing (PODC).


Merkle DAGs for Content Addressing

Benet, J. (2014). IPFS - Content Addressed, Versioned, P2P File System. arXiv preprint arXiv:1407.3561.


[ Open-Source Governance & Teleological Path ]

Open Governance Model

Condensate operates under a strict Open Governance model. All protocol upgrades, cryptographic primitive swaps, and schema changes must pass through the public RFC process.

  • License: Apache 2.0 (Permissive, open source).
  • RFC Process: Proposals require a technical spec and minimum quorum by core maintainers.
  • Core Maintainers: The initial steering committee is elected based on GitHub contributions.

Roadmap & Version History

v0.1.0
Initial Protocol Release

Launch of TS/Python SDKs, basic deterministic DAG sync, and AES encryption.

v0.5.0
HITL Assertions

Manual review pipelines and instruction injection heuristics.

v1.0.0 (Q4)
Federation Support

Multi-orchestrator native syncing without centralized hubs.

[ Operations & Architecture FAQ ]

Is Condensate a CRDT?

Condensate behaves similarly to a Conflict-Free Replicated Data Type (CRDT): concurrent operations yield the same final state regardless of the order in which they are received. However, it operates on a more complex data ontology designed specifically for agent intent and entities, not just plaintext or JSON properties.

Can it work completely offline?

Yes. Condensate is an offline-first architecture. Read and write operations occur instantly against the local database node. When a network connection is established, the local DAG synchronizes asynchronously with peers to reach state convergence.

How does it handle conflicting realities?

If two agents update the same cognitive node simultaneously, Condensate captures both states as divergent branches in the Merkle-DAG. Through Strong Eventual Consistency (SEC) and lexicographic tie-breaking on Lamport timestamps, the daemon resolves the conflict deterministically so that all nodes eventually adopt the exact same branch.
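A common way to make such a tie-break deterministic is to compare `(lamport_clock, agent_id)` pairs. The sketch below assumes that scheme; the `Branch` fields and the `resolve` helper are hypothetical and may differ from the daemon's actual rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    lamport: int      # logical clock value of the branch head
    agent_id: str     # tie-breaker when clocks are equal
    value: str


def resolve(branches: list[Branch]) -> Branch:
    """Pick the head with the highest (lamport, agent_id) pair.

    Tuples compare lexicographically, so the agent ID only matters on a tie.
    """
    return max(branches, key=lambda b: (b.lamport, b.agent_id))


a = Branch(lamport=7, agent_id="agent-a", value="Server X is up")
b = Branch(lamport=7, agent_id="agent-b", value="Server X is rebooting")

# Every replica picks the same winner regardless of arrival order.
print(resolve([a, b]).value, "|", resolve([b, a]).value)
# Server X is rebooting | Server X is rebooting
```

The rule is arbitrary but total: any two replicas holding the same set of branches select the same one without exchanging further messages.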

[ Glossary of Cryptographic & Consensus Terms ]

Conflict-free
A property indicating that data structures can be concurrently modified and independently merged without manual conflict resolution or data corruption.
Replica
A full, independent copy of the agent's memory DAG hosted locally by an individual orchestrator or edge node.
Merkle structure
A tree or graph where every node is labeled with the cryptographic hash of its child nodes, allowing for rapid and secure verification of entire data sets.
Causal ordering
A logical ordering of events such that if event A caused event B, all nodes in the system will observe A before B.
Deterministic merge
An algorithmic guarantee that merging the same sets of conflicting state changes will definitively produce the identical resulting state on every peer.
End-to-end encryption
Security paradigm where data is encrypted on the sender's client and only decrypted on the recipient's client, opaque to intermediate transport networks.
Identity authority
The root cryptographic keypair that proves the authenticity of an agent or user. Ed25519 signatures validate all DAG operations.
Federation
The capability for disparate agent memory networks to communicate and share specific DAG sub-graphs securely across organizational boundaries.