Technical Architecture

One binary. Zero dependencies.
Total control.

Echo is a single Go binary (~15MB) that runs anywhere. SQLite for storage, FTS5 for search, optional vector embeddings, and a knowledge graph. No Docker required. No database server. No runtime dependencies.

FTS5 search: <10ms
Hybrid search: ~50ms
Memory insert: <5ms
Cold start: <100ms
Binary size: ~15MB

System Architecture

Five layers, one binary. Everything from client ingestion to persistent storage runs through a single process with zero external dependencies.

[Diagram: Echo Memory System Architecture]
Clients: Claude Desktop, Claude Code, Cursor, GPT / Gemini, Voice Notes, Web Dashboard
Entry Points: MCP Server (JSON-RPC / stdio), REST API (HTTP :3002 / Fly.io), Audio Pipeline (watch / transcribe / store)
Engine: Echo Memory Engine (single Go binary, ~15MB, zero external dependencies) with Auth, Tenant Isolation, Rate Limiter, Router
Services: Memory CRUD (store / search / delete), Knowledge Graph (links / backlinks / related), Hybrid Search (FTS5 + vector + RRF), Agent Tools (handoffs / tasks / traces)
Storage: SQLite + WAL (memories, tenants, usage), FTS5 Index (full-text search), Vector Store (512-dim embeddings, optional), memory_links (typed graph edges)

Knowledge Hierarchy

Echo organizes information at three levels, from individual memories to interconnected knowledge to agent-ready context.

Memories

The atomic unit. A text blob with optional tags, metadata, and vector embedding. Indexed by FTS5 for instant keyword search and optionally by vector for semantic search.

// Schema
id, content, tags[], domain,
category, embedding[512],
created_at, tenant_id
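
As a rough illustration, the schema above could map to a Go struct like the one below. This is a sketch, not Echo's actual source: the Go types, JSON field names, and the `Validate` helper are assumptions made for the example; only the field list and the 512-value embedding size come from the schema.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// EmbeddingDim is the embedding length given in the schema above.
const EmbeddingDim = 512

// Memory mirrors the schema fields. Types and JSON names are assumptions.
type Memory struct {
	ID        string    `json:"id"`
	Content   string    `json:"content"`
	Tags      []string  `json:"tags,omitempty"`
	Domain    string    `json:"domain,omitempty"`
	Category  string    `json:"category,omitempty"`
	Embedding []float32 `json:"embedding,omitempty"` // optional
	CreatedAt time.Time `json:"created_at"`
	TenantID  string    `json:"tenant_id"`
}

// Validate is a hypothetical helper that checks the invariants the
// schema implies: content and tenant are required, and an embedding,
// when present, has exactly EmbeddingDim values.
func (m Memory) Validate() error {
	if m.Content == "" {
		return errors.New("content is required")
	}
	if m.TenantID == "" {
		return errors.New("tenant_id is required")
	}
	if n := len(m.Embedding); n != 0 && n != EmbeddingDim {
		return fmt.Errorf("embedding has %d values, want %d", n, EmbeddingDim)
	}
	return nil
}

func main() {
	m := Memory{
		ID:        "m1",
		Content:   "Prefer WAL mode for concurrent reads",
		Tags:      []string{"sqlite"},
		CreatedAt: time.Now().UTC(),
		TenantID:  "t1",
	}
	if err := m.Validate(); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	out, _ := json.Marshal(m)
	fmt.Println(string(out))
}
```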

Knowledge Graph

Memories link to each other through typed relationships: informs, contradicts, supports, supersedes, and related. Traverse up to depth 2 to surface connections that keyword search would miss.

// Link types
source_id -> target_id
relation: informs | contradicts
| supports | supersedes | related
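
The depth-limited traversal described above can be sketched as a breadth-first search over the link table. This is an illustration only: the `Link` struct, the `Related` function, and the choice to follow links in both directions (links and backlinks) are assumptions, not Echo's confirmed implementation.

```go
package main

import "fmt"

// Relation names come from the link types listed above.
type Relation string

const (
	Informs     Relation = "informs"
	Contradicts Relation = "contradicts"
	Supports    Relation = "supports"
	Supersedes  Relation = "supersedes"
	RelatedTo   Relation = "related"
)

// Link is one typed edge: source_id -> target_id.
type Link struct {
	SourceID string
	TargetID string
	Relation Relation
}

// Related walks the graph breadth-first from start and returns every
// memory reachable within maxDepth hops, mapped to its hop count.
// Edges are followed in both directions (an assumption).
func Related(links []Link, start string, maxDepth int) map[string]int {
	adj := make(map[string][]string)
	for _, l := range links {
		adj[l.SourceID] = append(adj[l.SourceID], l.TargetID)
		adj[l.TargetID] = append(adj[l.TargetID], l.SourceID)
	}
	depth := map[string]int{start: 0}
	queue := []string{start}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if depth[cur] == maxDepth {
			continue // do not expand past the depth limit
		}
		for _, next := range adj[cur] {
			if _, seen := depth[next]; !seen {
				depth[next] = depth[cur] + 1
				queue = append(queue, next)
			}
		}
	}
	delete(depth, start) // the starting memory is not its own neighbor
	return depth
}

func main() {
	links := []Link{
		{"a", "b", Informs},
		{"b", "c", Supports},
		{"c", "d", Supersedes},
	}
	for id, d := range Related(links, "a", 2) {
		fmt.Printf("%s at depth %d\n", id, d)
	}
}
```

With a depth limit of 2, memory `d` (three hops from `a`) is left out of the result.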

Agent Context

Purpose-built tools for AI agents: exit reports, decision traces, handoffs between agents, session kickoff with relevant context. Your agents inherit institutional memory.

// Agent flow
session_kickoff -> recall_traces
-> work -> exit_report
-> handoff -> next_agent

Tool Inventory

10 MCP tools organized by category. Every tool is also available via REST API.

Memory Operations

POST /memories                   Store with tags + metadata
GET /memories/search             Full-text search (FTS5)
GET /memories/semantic-search    Vector similarity search
GET /memories/hybrid-search      Combined FTS + vector
GET /memories/{id}               Fetch by ID
DELETE /memories/{id}            Remove permanently

Knowledge Graph

POST /links      Create typed relationship
GET /related     Graph traversal (depth 1-2)

Organization

GET/POST /domains    Knowledge domains
GET/POST /tasks      Task management

Infrastructure

GET /health           Server status check
GET /stats            Memory counts + sizes
GET /usage            API usage tracking
GET /token-savings    Token savings metrics
GET /account          Tier + limits info
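
To give a feel for the REST side, here is a minimal Go sketch that builds a store request for `POST /memories`. The `/api/v1` prefix and the `X-API-Key` header follow the curl example on this page; the JSON body fields (`content`, `tags`), the `NewStoreRequest` helper, and the assumption that a self-hosted server uses the same path prefix are illustrative guesses, so check the API docs before relying on them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// storeRequest is an assumed request body for POST /memories.
type storeRequest struct {
	Content string   `json:"content"`
	Tags    []string `json:"tags,omitempty"`
}

// NewStoreRequest builds (but does not send) a request that stores one
// memory. baseURL is e.g. "http://localhost:3002" for a local server.
func NewStoreRequest(baseURL, apiKey, content string, tags []string) (*http.Request, error) {
	body, err := json.Marshal(storeRequest{Content: content, Tags: tags})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/memories", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-API-Key", apiKey)
	return req, nil
}

func main() {
	req, err := NewStoreRequest("http://localhost:3002", "echo_sk_example",
		"Ship the beta on Friday", []string{"planning"})
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```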

Security Architecture

Recently hardened with 25 security fixes. Every layer is designed for multi-tenant isolation and defense in depth.

Authentication

  • API keys hashed with SHA-256, never stored raw
  • Constant-time comparison (prevents timing attacks)
  • Raw key shown once at creation, then discarded
  • ADMIN_SECRET fail-closed (empty = deny all admin)
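
The points above can be sketched with Go's standard library. This is an illustration under assumptions, not Echo's code: the function names, the key length, and the way the admin secret is compared are hypothetical; only SHA-256 hashing, constant-time comparison, the `echo_sk_` prefix, and the fail-closed rule come from this page.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// NewKey generates a random API key. The 24-byte length is an assumption.
func NewKey() (string, error) {
	buf := make([]byte, 24)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return "echo_sk_" + hex.EncodeToString(buf), nil
}

// HashKey returns the SHA-256 digest that would be stored instead of
// the raw key.
func HashKey(raw string) [32]byte {
	return sha256.Sum256([]byte(raw))
}

// KeyMatches hashes the presented key and compares digests in constant
// time, so the comparison does not leak how many bytes matched.
func KeyMatches(presented string, stored [32]byte) bool {
	got := HashKey(presented)
	return subtle.ConstantTimeCompare(got[:], stored[:]) == 1
}

// AdminAllowed fails closed: an empty configured secret denies every
// admin request, including one that presents an empty secret.
func AdminAllowed(configured, presented string) bool {
	if configured == "" {
		return false
	}
	a := sha256.Sum256([]byte(configured))
	b := sha256.Sum256([]byte(presented))
	return subtle.ConstantTimeCompare(a[:], b[:]) == 1
}

func main() {
	raw, err := NewKey()
	if err != nil {
		fmt.Println("generate key:", err)
		return
	}
	stored := HashKey(raw)
	fmt.Println("show once:", raw)
	fmt.Println("stored hash:", hex.EncodeToString(stored[:]))
	fmt.Println("match:", KeyMatches(raw, stored))
}
```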

Tenant Isolation

  • Every query scoped by tenant_id
  • Cross-tenant link validation prevents data leaks
  • PRAGMA foreign_keys = ON enforced
  • _txlock=immediate for TOCTOU protection

Rate Limiting

  • Per-tenant token bucket algorithm
  • Tier-aware: Free 60 / Pro 1,000 / Team 5,000 req/min
  • Body size limits on all POST endpoints
  • CORS with preflight caching
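
A per-tenant token bucket with tier-aware limits might look roughly like this. It is a sketch under assumptions: the type names, the tier keys, the burst capacity (one full minute of requests), and the `ManualClock` helper used to keep the demo deterministic are all hypothetical; only the per-minute limits come from the list above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Requests per minute by tier, from the list above.
var tierLimits = map[string]float64{"free": 60, "pro": 1000, "team": 5000}

// ManualClock is a hypothetical clock for deterministic demos and tests.
type ManualClock struct{ t time.Time }

func (c *ManualClock) Now() time.Time       { return c.t }
func (c *ManualClock) AdvanceSeconds(n int) { c.t = c.t.Add(time.Duration(n) * time.Second) }

type bucket struct {
	tokens float64
	last   time.Time
}

// Limiter keeps one token bucket per tenant.
type Limiter struct {
	mu      sync.Mutex
	now     func() time.Time
	buckets map[string]*bucket
}

func NewLimiter(now func() time.Time) *Limiter {
	return &Limiter{now: now, buckets: make(map[string]*bucket)}
}

// Allow refills the tenant's bucket for the time elapsed since its last
// request, then spends one token if available. Unknown tiers are denied.
func (l *Limiter) Allow(tenantID, tier string) bool {
	perMin, ok := tierLimits[tier]
	if !ok {
		return false
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	now := l.now()
	b, ok := l.buckets[tenantID]
	if !ok {
		b = &bucket{tokens: perMin, last: now} // new tenants start full
		l.buckets[tenantID] = b
	}
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * perMin / 60
		if b.tokens > perMin {
			b.tokens = perMin // cap at bucket capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	clock := &ManualClock{}
	lim := NewLimiter(clock.Now)
	allowed := 0
	for i := 0; i < 100; i++ {
		if lim.Allow("tenant-a", "free") {
			allowed++
		}
	}
	fmt.Println("allowed in burst:", allowed)
	clock.AdvanceSeconds(1)
	fmt.Println("allowed after 1s:", lim.Allow("tenant-a", "free"))
}
```

Because each tenant has its own bucket, one tenant exhausting its quota does not affect another.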

Production Hardening

  • Dev mode blocked in production (FLY_APP_NAME check)
  • Path normalization prevents trailing-slash bypass
  • Stripe webhook signature verification
  • DB-backed idempotency (survives restarts)

Data Flow & Privacy

Three deployment paths. You choose how your data moves. We never see it unless you ask us to host it.

Cloud Hosted

Client -> HTTPS -> Fly.io (US-East) -> Echo Binary -> SQLite (persistent volume)

Data stored on Fly.io persistent volumes in US-East. Encrypted in transit via HTTPS. Fly.io shared-cpu-1x, 256MB RAM.

Self-Hosted

Client -> localhost:3002 -> Echo Binary -> SQLite (local disk)

Data never leaves your machine. No network calls. No telemetry. Run on Mac, Windows, Linux, or a Raspberry Pi.

Voice Pipeline

Phone -> Cloud Sync -> Watch Folder -> Transcribe -> Extract -> Echo Store

Talk into your phone. Audio syncs to a watch folder, gets compressed (ffmpeg), transcribed (Whisper), extracted (Claude), and stored. You get a Telegram notification when it lands.

Privacy Guarantees

Self-hosted = air-gapped

Data never leaves your machine. No outbound network calls. No phone-home. No telemetry.

No usage tracking sent

Usage stats are per-tenant for your own billing visibility. We never send analytics externally.

Keys hashed at rest

API keys are SHA-256 hashed before storage. The raw key is shown exactly once at creation.

Embeddings are optional

Vector search uses OpenAI text-embedding-3-small. If you opt out, FTS5 keyword search works without any external calls.

No card numbers stored

Stripe handles all billing. We never see, process, or store credit card information.

HTTPS everywhere

Cloud tier encrypts all data in transit. Fly.io persistent volumes for data at rest in US-East region.

Deployment Options

From a single command to a Docker container. Pick the path that fits your stack.

Go Binary

Download. Run. Done. SQLite database auto-created on first launch.

# That's it. No install.
$ ./echo-memory-server
listening on :3002
Mac, Windows, Linux, ARM (Raspberry Pi)

Docker

Multi-stage Alpine build. Final image is ~8MB. Mount a volume for persistence.

$ docker build -t echo .
$ docker run -p 3002:3002 \
-v echo-data:/data echo
~8MB image, Alpine-based

Fly.io (Managed)

We run it for you. Free tier included. Upgrade when you need more.

# Sign up, get a key
$ curl https://echo-memory.fly.dev/api/v1/memories \
-H "X-API-Key: echo_sk_..."
shared-cpu-1x, 256MB RAM, persistent volume

Built for Production

Echo runs 24/7 on real infrastructure. These are not aspirational features.

0
External dependencies
No Postgres. No Redis. No Docker required. Just the binary and a filesystem.
WAL
Concurrent reads
SQLite Write-Ahead Logging enables concurrent readers without blocking writes.
RRF
Hybrid ranking
Reciprocal Rank Fusion merges the FTS5 keyword ranking with the vector-similarity ranking for best-of-both results.
Multi
Tenant isolation
Every query is scoped. Every link is validated. Tenants cannot access each other's data.
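
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal Go sketch of the standard formula: each result earns 1 / (k + rank) from every ranked list it appears in, and the sums decide the final order. The function name, the `Scored` type, and the constant k = 60 (a common default in the RRF literature) are assumptions; Echo's actual parameters are not stated on this page.

```go
package main

import (
	"fmt"
	"sort"
)

// Scored is one fused result with its combined RRF score.
type Scored struct {
	ID    string
	Score float64
}

// FuseRRF merges any number of ranked ID lists (best first). Ranks are
// 1-based, so the top item of each list contributes 1 / (k + 1).
func FuseRRF(k float64, rankings ...[]string) []Scored {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	fused := make([]Scored, 0, len(scores))
	for id, s := range scores {
		fused = append(fused, Scored{ID: id, Score: s})
	}
	// Highest score first; ties broken by ID for a stable order.
	sort.Slice(fused, func(a, b int) bool {
		if fused[a].Score != fused[b].Score {
			return fused[a].Score > fused[b].Score
		}
		return fused[a].ID < fused[b].ID
	})
	return fused
}

func main() {
	fts := []string{"a", "b", "c"}    // keyword ranking
	vector := []string{"b", "d", "a"} // semantic ranking
	for _, r := range FuseRRF(60, fts, vector) {
		fmt.Printf("%s %.5f\n", r.ID, r.Score)
	}
}
```

Because only ranks are used, the keyword and vector scores never need to be normalized onto a common scale.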

Ready to integrate?

Read the docs, grab an API key, or download the binary. Five minutes to first API call.