跳到主要内容
本文尚未翻译

本文暂时显示英文原文,中文翻译正在进行中。翻译完成后将自动更新。

原文链接:English Version

Session Storage

Hermes Agent uses a SQLite database (~/.hermes/state.db) to persist session metadata, full message history, and model configuration across CLI and gateway sessions. This replaces the earlier per-session JSONL file approach.

Source file: hermes_state.py

Architecture Overview

~/.hermes/state.db (SQLite, WAL mode)
├── sessions — Session metadata, token counts, billing
├── messages — Full message history per session
├── messages_fts — FTS5 virtual table for full-text search
└── schema_version — Single-row table tracking migration state

Key design decisions:

  • WAL mode for concurrent readers + one writer (gateway multi-platform)
  • FTS5 virtual table for fast text search across all session messages
  • Session lineage via parent_session_id chains (compression-triggered splits)
  • Source tagging (cli, telegram, discord, etc.) for platform filtering
  • Batch runner and RL trajectories are NOT stored here (separate systems)

SQLite Schema

Sessions Table

CREATE TABLE IF NOT EXISTS sessions (
id TEXT PRIMARY KEY,
source TEXT NOT NULL,
user_id TEXT,
model TEXT,
model_config TEXT,
system_prompt TEXT,
parent_session_id TEXT,
started_at REAL NOT NULL,
ended_at REAL,
end_reason TEXT,
message_count INTEGER DEFAULT 0,
tool_call_count INTEGER DEFAULT 0,
input_tokens INTEGER DEFAULT 0,
output_tokens INTEGER DEFAULT 0,
cache_read_tokens INTEGER DEFAULT 0,
cache_write_tokens INTEGER DEFAULT 0,
reasoning_tokens INTEGER DEFAULT 0,
billing_provider TEXT,
billing_base_url TEXT,
billing_mode TEXT,
estimated_cost_usd REAL,
actual_cost_usd REAL,
cost_status TEXT,
cost_source TEXT,
pricing_version TEXT,
title TEXT,
FOREIGN KEY (parent_session_id) REFERENCES sessions(id)
);

CREATE INDEX IF NOT EXISTS idx_sessions_source ON sessions(source);
CREATE INDEX IF NOT EXISTS idx_sessions_parent ON sessions(parent_session_id);
CREATE INDEX IF NOT EXISTS idx_sessions_started ON sessions(started_at DESC);
CREATE UNIQUE INDEX IF NOT EXISTS idx_sessions_title_unique
ON sessions(title) WHERE title IS NOT NULL;

Messages Table

CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL REFERENCES sessions(id),
role TEXT NOT NULL,
content TEXT,
tool_call_id TEXT,
tool_calls TEXT,
tool_name TEXT,
timestamp REAL NOT NULL,
token_count INTEGER,
finish_reason TEXT,
reasoning TEXT,
reasoning_details TEXT,
codex_reasoning_items TEXT
);

CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, timestamp);

Notes:

  • tool_calls is stored as a JSON string (serialized list of tool call objects)
  • reasoning_details and codex_reasoning_items are stored as JSON strings
  • reasoning stores the raw reasoning text for providers that expose it
  • Timestamps are Unix epoch floats (time.time())
CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts USING fts5(
content,
content=messages,
content_rowid=id
);

The FTS5 table is kept in sync via three triggers that fire on INSERT, UPDATE, and DELETE of the messages table:

CREATE TRIGGER IF NOT EXISTS messages_fts_insert AFTER INSERT ON messages BEGIN
INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;

CREATE TRIGGER IF NOT EXISTS messages_fts_delete AFTER DELETE ON messages BEGIN
INSERT INTO messages_fts(messages_fts, rowid, content)
VALUES('delete', old.id, old.content);
END;

CREATE TRIGGER IF NOT EXISTS messages_fts_update AFTER UPDATE ON messages BEGIN
INSERT INTO messages_fts(messages_fts, rowid, content)
VALUES('delete', old.id, old.content);
INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;

Schema Version and Migrations

Current schema version: 6

The schema_version table stores a single integer. On initialization, _init_schema() checks the current version and applies migrations sequentially:

VersionChange
1Initial schema (sessions, messages, FTS5)
2Add finish_reason column to messages
3Add title column to sessions
4Add unique index on title (NULLs allowed, non-NULL must be unique)
5Add billing columns: cache_read_tokens, cache_write_tokens, reasoning_tokens, billing_provider, billing_base_url, billing_mode, estimated_cost_usd, actual_cost_usd, cost_status, cost_source, pricing_version
6Add reasoning columns to messages: reasoning, reasoning_details, codex_reasoning_items

Each migration uses ALTER TABLE ADD COLUMN wrapped in try/except to handle the column-already-exists case (idempotent). The version number is bumped after each successful migration block.

Write Contention Handling

Multiple hermes processes (gateway + CLI sessions + worktree agents) share one state.db. The SessionDB class handles write contention with:

  • Short SQLite timeout (1 second) instead of the default 30s
  • Application-level retry with random jitter (20-150ms, up to 15 retries)
  • BEGIN IMMEDIATE transactions to surface lock contention at transaction start
  • Periodic WAL checkpoints every 50 successful writes (PASSIVE mode)

This avoids the "convoy effect" where SQLite's deterministic internal backoff causes all competing writers to retry at the same intervals.

_WRITE_MAX_RETRIES = 15
_WRITE_RETRY_MIN_S = 0.020 # 20ms
_WRITE_RETRY_MAX_S = 0.150 # 150ms
_CHECKPOINT_EVERY_N_WRITES = 50

Common Operations

Initialize

from hermes_state import SessionDB

db = SessionDB() # Default: ~/.hermes/state.db
db = SessionDB(db_path=Path("/tmp/test.db")) # Custom path

Create and Manage Sessions

# Create a new session
db.create_session(
session_id="sess_abc123",
source="cli",
model="anthropic/claude-sonnet-4.6",
user_id="user_1",
parent_session_id=None, # or previous session ID for lineage
)

# End a session
db.end_session("sess_abc123", end_reason="user_exit")

# Reopen a session (clear ended_at/end_reason)
db.reopen_session("sess_abc123")

Store Messages

msg_id = db.append_message(
session_id="sess_abc123",
role="assistant",
content="Here's the answer...",
tool_calls=[{"id": "call_1", "function": {"name": "terminal", "arguments": "{}"}}],
token_count=150,
finish_reason="stop",
reasoning="Let me think about this...",
)

Retrieve Messages

# Raw messages with all metadata
messages = db.get_messages("sess_abc123")

# OpenAI conversation format (for API replay)
conversation = db.get_messages_as_conversation("sess_abc123")
# Returns: [{"role": "user", "content": "..."}, {"role": "assistant", ...}]

Session Titles

# Set a title (must be unique among non-NULL titles)
db.set_session_title("sess_abc123", "Fix Docker Build")

# Resolve by title (returns most recent in lineage)
session_id = db.resolve_session_by_title("Fix Docker Build")

# Auto-generate next title in lineage
next_title = db.get_next_title_in_lineage("Fix Docker Build")
# Returns: "Fix Docker Build #2"

The search_messages() method supports FTS5 query syntax with automatic sanitization of user input.

results = db.search_messages("docker deployment")

FTS5 Query Syntax

SyntaxExampleMeaning
Keywordsdocker deploymentBoth terms (implicit AND)
Quoted phrase"exact phrase"Exact phrase match
Boolean ORdocker OR kubernetesEither term
Boolean NOTpython NOT javaExclude term
Prefixdeploy*Prefix match
# Search only CLI sessions
results = db.search_messages("error", source_filter=["cli"])

# Exclude gateway sessions
results = db.search_messages("bug", exclude_sources=["telegram", "discord"])

# Search only user messages
results = db.search_messages("help", role_filter=["user"])

Search Results Format

Each result includes:

  • id, session_id, role, timestamp
  • snippet — FTS5-generated snippet with >>>match<<< markers
  • context — 1 message before and after the match (content truncated to 200 chars)
  • source, model, session_started — from the parent session

The _sanitize_fts5_query() method handles edge cases:

  • Strips unmatched quotes and special characters
  • Wraps hyphenated terms in quotes (chat-send"chat-send")
  • Removes dangling boolean operators (hello ANDhello)

Session Lineage

Sessions can form chains via parent_session_id. This happens when context compression triggers a session split in the gateway.

Query: Find Session Lineage

-- Find all ancestors of a session
WITH RECURSIVE lineage AS (
SELECT * FROM sessions WHERE id = ?
UNION ALL
SELECT s.* FROM sessions s
JOIN lineage l ON s.id = l.parent_session_id
)
SELECT id, title, started_at, parent_session_id FROM lineage;

-- Find all descendants of a session
WITH RECURSIVE descendants AS (
SELECT * FROM sessions WHERE id = ?
UNION ALL
SELECT s.* FROM sessions s
JOIN descendants d ON s.parent_session_id = d.id
)
SELECT id, title, started_at FROM descendants;

Query: Recent Sessions with Preview

SELECT s.*,
COALESCE(
(SELECT SUBSTR(m.content, 1, 63)
FROM messages m
WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
ORDER BY m.timestamp, m.id LIMIT 1),
''
) AS preview,
COALESCE(
(SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
s.started_at
) AS last_active
FROM sessions s
ORDER BY s.started_at DESC
LIMIT 20;

Query: Token Usage Statistics

-- Total tokens by model
SELECT model,
COUNT(*) as session_count,
SUM(input_tokens) as total_input,
SUM(output_tokens) as total_output,
SUM(estimated_cost_usd) as total_cost
FROM sessions
WHERE model IS NOT NULL
GROUP BY model
ORDER BY total_cost DESC;

-- Sessions with highest token usage
SELECT id, title, model, input_tokens + output_tokens AS total_tokens,
estimated_cost_usd
FROM sessions
ORDER BY total_tokens DESC
LIMIT 10;

Export and Cleanup

# Export a single session with messages
data = db.export_session("sess_abc123")

# Export all sessions (with messages) as list of dicts
all_data = db.export_all(source="cli")

# Delete old sessions (only ended sessions)
deleted_count = db.prune_sessions(older_than_days=90)
deleted_count = db.prune_sessions(older_than_days=30, source="telegram")

# Clear messages but keep the session record
db.clear_messages("sess_abc123")

# Delete session and all messages
db.delete_session("sess_abc123")

Database Location

Default path: ~/.hermes/state.db

This is derived from hermes_constants.get_hermes_home() which resolves to ~/.hermes/ by default, or the value of HERMES_HOME environment variable.

The database file, WAL file (state.db-wal), and shared-memory file (state.db-shm) are all created in the same directory.