Plate I · fig. 1

Memoriæ custos.

the keeper of memory.

A local-first memory layer for your shell, your agentic coding sessions (Claude Code, Codex, Cursor, opencode), and your browsing — within an allowlist you control. Hippo runs as a daemon on your Mac, captures activity locally, and answers questions about it with a model that lives on your GPU.

curl -fsSL https://hippobrain.org/install.sh | sh
Plate II · fig. 2

What hippo captures, exactly.

Three source families today, grouped by where the data comes from. Nothing is captured that isn't on this list.

Shell

A zsh hook (shell/hippo.zsh) writes a JSON line per command to the daemon's Unix socket. PWD, command, exit status, duration, redacted command text. No environment variables, no terminal output.

technical detail →

Agentic CLIs

On-disk session transcripts from Claude Code, Codex, Cursor, and opencode. Claude Code is tailed via FSEvents on JSONL; Codex and Cursor are polled JSONL transcripts; opencode is polled from its own SQLite store at ~/.local/share/opencode/opencode.db. All four are segmented and deduped on (session_id, segment_index).

technical detail →

Firefox

A WebExtension speaks Native Messaging to the daemon. Only allowlisted domains (currently 44, edited by you) yield events; everything else is ignored before it's written down.

technical detail →

Plate III · fig. 3 — daemon · brain · MCP

How it works.

Hippo architecture: capture sources feed the daemon, the brain enriches with a local LLM, the MCP server answers questions. Three capture-source families on the left — zsh hook, agentic-CLI session watchers for Claude Code, Codex, Cursor, and opencode, and the Firefox WebExtension — write events to the hippo-daemon Rust process via a Unix socket. The daemon redacts secrets and writes to a SQLite database with FTS5 plus sqlite-vec. The hippo-brain Python process reads enrichment queues from SQLite, calls a local OpenAI-compatible inference server (oMLX by default; LM Studio also works), and writes knowledge nodes back to SQLite. An MCP server exposes ask, search_knowledge, search_events, and get_entities tools to client editors over stdio. zsh hook Agentic CLIs Claude Code Codex Cursor opencode Firefox extension hippo-daemon Rust · redact · write SQLite FTS5 · vec0 hippo-brain Python · inference MCP server ask · search · entities

fig. 3 — sectio architectonica. Capture sources feed the Rust daemon; the brain reads enrichment queues; the MCP server answers questions over stdio.

Two processes share a SQLite database. The daemon captures, redacts, and writes; the brain enriches; the MCP server answers questions. Inference runs on your GPU via a local OpenAI-compatible server (oMLX by default; LM Studio also works). The architecture is documented at /docs/reference/lifecycle.

Plate IV · fig. 4

Privacy, by construction.

All inference local.

Hippo's brain calls a local OpenAI-compatible server (oMLX by default; LM Studio also works) on localhost. The model file lives on your disk; the inference runs on your GPU. There is no network call to a hosted LLM, ever.

No telemetry by default.

Hippo ships an OpenTelemetry stack you can opt into, and when you do, it points at localhost:4317. There is no telemetry endpoint upstream. Nothing leaves your machine unless you stand up a remote collector yourself.

Secrets redacted before storage.

The daemon runs every captured event through a redaction pass before it touches the database. Patterns are user-editable in ~/.config/hippo/redact.toml. Review them at /privacy.

Plate V · fig. 5

See it work.

A real session, redacted. Run hippo doctor to confirm the wiring, then hippo ask to query the knowledge base.

$ hippo doctor
[OK]   daemon       v0.27.0 — pid 5142, uptime 3d 14h
[OK]   brain        v0.27.0 — http://127.0.0.1:8841
[OK]   schema       v16
[OK]   socket       /Users/you/.local/share/hippo/hippo.sock
[OK]   shell hook   ~/.config/zsh/hippo.zsh present
[OK]   claude hook  ~/.claude/settings.json hook entry matches repo
[OK]   browser NM   ~/Library/Application Support/Mozilla/NativeMessagingHosts/hippo_daemon.json
[OK]   probes       last canary round-trip 247ms (shell)
[OK]   alarms       0 unacknowledged
[OK]   inference    http://127.0.0.1:8000 reachable; loaded model qwen3.6-35b-a3b-ud-mlx
ten checks complete in 0.61s — all green.
$ hippo ask "what was that anti-pattern about probes leaking into search?"

  AP-6: probe events MUST NOT appear in user-facing queries. Synthetic
  canaries from `hippo probe` carry a probe_tag column on every row;
  every consumer (RAG, MCP search, hippo ask) filters on probe_tag IS NULL
  upstream. A Semgrep rule enforces this at review time.

  See:
  · docs/capture/anti-patterns.md (3.91)
  · crates/hippo-daemon/src/probe.rs:142-178 (3.42)
  · brain/src/hippo_brain/search.py:88-112 (3.18)