Plate I · custodia

Privacy.

— a data-flow story, not a marketing claim.

Hippo runs locally. There is no hosted service, so the interesting question isn’t whether one has your data; it’s how the daemon handles the data on your machine. This page documents the entire pipeline, source by source.

Data flow: capture sources to enriched knowledge, all on-device. Three capture-source families on the left (shell hook, agentic-CLI session watchers for Claude Code, Codex, Cursor, and opencode, and Firefox browsing) flow through a redactor in the daemon. Redacted events land in a local SQLite database. The brain reads enrichment queues from that database, calls a local OpenAI-compatible inference server (oMLX by default; LM Studio also works) bound to localhost, and writes knowledge nodes back. The MCP server reads only from SQLite to answer questions over stdio. Nothing in this diagram crosses the machine boundary, drawn as a dashed sepia border around the entire flow and labelled “stays on machine”.

Diagram labels: shell hook; agentic watchers; Firefox (Native Messaging); redactor; daemon (Rust); SQLite (FTS5, vec0); inference (localhost, GPU); MCP server (stdio).

fig. — fasciculus: every arrow stays inside the dashed boundary.

Shell capture (zsh)

The hook lives at shell/hippo.zsh. On every command, it writes a single JSON line to the daemon’s Unix socket. It captures: PWD, command text, exit status, duration, timestamp. It does not capture environment variables, terminal output, or anything that wasn’t typed at the prompt.

The hook is fire-and-forget: a single write to the socket, no acknowledgement, no blocking. If the daemon is down, your shell isn’t slowed down; the event is simply not captured.
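To make the fire-and-forget behaviour concrete, here is a minimal Python sketch of the same idea. It is not the zsh hook itself: the socket path, the JSON field names, and the 50 ms timeout are all assumptions made for illustration, since this page does not specify them.

```python
import json
import socket
import time

# Hypothetical socket path; the real location is not stated on this page.
SOCKET_PATH = "/tmp/hippo-example.sock"


def build_event(pwd, command, exit_status, duration_ms, timestamp=None):
    """Serialize one shell event as a single JSON line (field names are assumed)."""
    event = {
        "pwd": pwd,
        "command": command,
        "exit_status": exit_status,
        "duration_ms": duration_ms,
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    # json.dumps escapes embedded newlines, so the payload is always one line.
    return (json.dumps(event) + "\n").encode("utf-8")


def send_event(payload, path=SOCKET_PATH):
    """Try to deliver the payload once; never raise and never wait for a reply."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.settimeout(0.05)  # assumed bound so a stalled daemon cannot hold up the caller
        sock.connect(path)
        sock.sendall(payload)
        return True
    except OSError:
        # Daemon down or socket missing: the event is dropped, not retried.
        return False
    finally:
        sock.close()
```

Note what the payload leaves out: there is no field for environment variables or terminal output, so a dropped event costs one history entry and nothing else.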

Agentic CLI session capture

Hippo watches four agentic CLIs today: Claude Code, Codex, Cursor, and opencode. Claude Code is tailed via an FSEvents watcher on ~/.claude/projects/**/*.jsonl; Codex and Cursor are polled JSONL transcripts (~/.codex/sessions, ~/.cursor/projects); opencode is polled from its own SQLite store at ~/.local/share/opencode/opencode.db. For all four, hippo parses new segments whenever a transcript grows, deduplicates them on (session_id, segment_index), and inserts them. Tool-use blocks and assistant turns become individual rows. The transcript files themselves are owned by each CLI; hippo doesn’t modify them.
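One way to get deduplication on (session_id, segment_index) is to let SQLite enforce it with a uniqueness constraint, so re-reading a transcript after it grows inserts only the new segments. The sketch below assumes a hypothetical `segments` table and column layout; hippo’s actual schema is not shown on this page.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a database with a hypothetical segments table keyed for dedup."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS segments (
            session_id    TEXT    NOT NULL,
            segment_index INTEGER NOT NULL,
            kind          TEXT    NOT NULL,  -- e.g. 'assistant' or 'tool_use'
            body          TEXT    NOT NULL,
            UNIQUE (session_id, segment_index)
        )
        """
    )
    return conn


def insert_segments(conn, segments):
    """Insert segments, silently skipping ones already stored.

    Returns the number of rows that were actually new.
    """
    before = conn.total_changes
    with conn:  # one transaction per batch
        conn.executemany(
            "INSERT OR IGNORE INTO segments "
            "(session_id, segment_index, kind, body) VALUES (?, ?, ?, ?)",
            segments,
        )
    return conn.total_changes - before
```

With this shape, a watcher can re-parse a whole transcript on every growth and hand all of it to `insert_segments`; rows it has already stored are ignored rather than duplicated.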

Firefox browsing (allowlist)

A WebExtension speaks Native Messaging to the daemon. Only allowlisted domains produce events; everything else is filtered before anything crosses the extension boundary. The default allowlist contains 44 documentation sites (MDN, Rust std, Python docs, etc.). You edit it in the extension’s options. There is no “capture all” mode.
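The allowlist check can be pictured as a simple host comparison that runs before any event is produced. The following is a sketch under assumptions: the three entries are illustrative rather than the shipped list, and treating subdomains of an allowlisted domain as allowed is a choice made for this example, not documented behaviour.

```python
from urllib.parse import urlsplit

# Illustrative entries only; not the real 44-site default list.
ALLOWLIST = {"developer.mozilla.org", "doc.rust-lang.org", "docs.python.org"}


def is_allowed(url, allowlist=ALLOWLIST):
    """Return True only if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False  # about:, file:, and malformed URLs never produce events
    # Require an exact match or a dot-separated suffix, so that
    # "docs.python.org.example.com" does not pass as "docs.python.org".
    return any(host == domain or host.endswith("." + domain) for domain in allowlist)
```

The default answer is "no": a URL that fails to parse, or whose host merely resembles an allowlisted one, is dropped rather than captured.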

Redaction

Every event, from every source, runs through the redactor in the same transaction as the database insert. The redactor is a regex pipeline whose patterns live at ~/.config/hippo/redact.toml; you can read and edit them.

Default patterns include AWS access keys, GitHub PATs, generic Bearer tokens, OpenAI keys, npm tokens, and a long tail of cloud-provider patterns. The catalogue is documented at /docs/reference/redaction; if you find a pattern you wish were redacted by default, please file an issue.
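A regex redaction pipeline of this kind can be sketched in a few lines. The patterns, the replacement marker, and the `events` table below are assumptions chosen for illustration; the shipped patterns are the ones in redact.toml and may differ from these.

```python
import re
import sqlite3

# Illustrative patterns only; the real catalogue lives in redact.toml.
PATTERNS = [
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("github_pat", re.compile(r"\bghp_[A-Za-z0-9]{36}\b")),
    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]+=*", re.IGNORECASE)),
]


def redact(text):
    """Run every pattern over the text in order, replacing each match with a marker."""
    for name, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


def create_events_table(conn):
    """Create a hypothetical events table for the example."""
    conn.execute("CREATE TABLE IF NOT EXISTS events (source TEXT NOT NULL, body TEXT NOT NULL)")


def insert_event(conn, source, text):
    """Redact and insert inside one transaction, so raw text is never stored."""
    with conn:
        conn.execute(
            "INSERT INTO events (source, body) VALUES (?, ?)",
            (source, redact(text)),
        )
```

Because `redact` is called on the value bound into the INSERT, there is no code path in this sketch that writes the unredacted string to disk.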

What stays on your machine

What you can turn off