Memory System

Teleton uses a three-layer hybrid RAG architecture combining vector embeddings (sqlite-vec) with full-text search (FTS5) across a Knowledge Base, Telegram Feed, and Session state -- all stored in SQLite.

Overview

The memory system is organized into three distinct layers, each serving a different purpose:

| Layer | Contents | Persistence |
| --- | --- | --- |
| Knowledge Base | Chunked Markdown documents (MEMORY.md, memory/*.md) | Permanent -- survives restarts |
| Telegram Feed | Archived messages, users, and chat metadata | Permanent -- grows over time |
| Sessions | Conversation state, message history, context window | Ephemeral -- compacted and summarized |

All data lives in a single SQLite database with WAL mode enabled for concurrent reads. When sqlite-vec is available, the system performs hybrid search -- merging FTS5 keyword scores with vector cosine similarity for every query.
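The merge step can be sketched as follows. Reciprocal rank fusion is one common way to combine a keyword ranking with a vector ranking; it is an illustrative assumption here, and Teleton's actual weighting of FTS5 scores against cosine similarity may differ.

```python
def hybrid_merge(fts_scores, vec_scores, k=60):
    """Fuse an FTS5 keyword ranking with a vector-similarity ranking
    using reciprocal rank fusion (illustrative, not Teleton's exact
    formula). Inputs map doc_id -> score; higher is better."""
    fts_ranked = sorted(fts_scores, key=fts_scores.get, reverse=True)
    vec_ranked = sorted(vec_scores, key=vec_scores.get, reverse=True)
    fused = {}
    for ranked in (fts_ranked, vec_ranked):
        for rank, doc_id in enumerate(ranked):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)

# A doc that appears in both result lists outranks single-signal hits.
order = hybrid_merge({"a": 2.0, "b": 1.0}, {"b": 0.9, "c": 0.8})
```

Here "b" wins despite topping neither list alone, which is the point of hybrid retrieval: agreement between the two signals is strong evidence of relevance.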

Knowledge Base

Source Files

The knowledge base is built from Markdown files in the workspace:

  • MEMORY.md -- the root memory document (in workspace root)
  • memory/*.md -- additional topic-specific memory files

Chunking Strategy

Documents are split into chunks for indexing with the following rules:

| Parameter | Value |
| --- | --- |
| Target chunk size | 500 characters |
| Maximum chunk size | 1000 characters |
| Boundary rules | Respects heading boundaries, code blocks, and list groups |
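A minimal chunker honoring the size parameters might look like the sketch below. It only demonstrates the heading and size rules; the real implementation also keeps code blocks and list groups intact, which is omitted here.

```python
TARGET, MAXIMUM = 500, 1000  # characters, per the table above

def chunk_markdown(text):
    """Split Markdown into roughly TARGET-character chunks, starting a
    new chunk at each heading and never exceeding MAXIMUM (simplified
    sketch; code-fence and list handling omitted)."""
    chunks, current = [], ""
    for block in text.split("\n\n"):              # paragraph-level blocks
        starts_heading = block.lstrip().startswith("#")
        too_big = len(current) + len(block) + 2 > MAXIMUM
        if current and (starts_heading or len(current) >= TARGET or too_big):
            chunks.append(current)
            current = ""
        current = (current + "\n\n" + block) if current else block
    if current:
        chunks.append(current)
    return chunks
```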

Incremental Indexing

Each chunk is hashed with SHA-256. On re-index, only chunks whose hash has changed are re-embedded and written. Unchanged chunks are skipped, keeping re-indexing fast.
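The skip logic reduces to a hash comparison per chunk. This sketch models the stored hashes as a dict keyed by chunk position; the real store is the knowledge table's hash column.

```python
import hashlib

def index_chunks(chunks, stored_hashes, embed):
    """Re-embed only chunks whose SHA-256 differs from the stored hash.
    `stored_hashes` maps chunk position -> previous hash (sketch)."""
    written = {}
    for pos, text in enumerate(chunks):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if stored_hashes.get(pos) == digest:
            continue                      # unchanged chunk: skip embedding
        written[pos] = (digest, embed(text))
    return written

calls = []
embed = lambda t: calls.append(t) or [0.0]   # stub embedder that counts calls
prev = {0: hashlib.sha256(b"alpha").hexdigest()}
out = index_chunks(["alpha", "beta"], prev, embed)
```

Only "beta" reaches the embedder; "alpha" matches its stored hash and is skipped.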

Knowledge Sources

Every knowledge entry is tagged with a source field:

| Source | Origin | Description |
| --- | --- | --- |
| memory | Markdown files | Chunks parsed from MEMORY.md and memory/*.md |
| session | LLM summaries | Summaries generated during session compaction |
| learned | Agent interactions | Facts the agent picks up during conversations |

Embedding Providers

Configured via embedding.provider in config.yaml. Three options are available:

Local (default)

Runs entirely on-device using ONNX Runtime with the Xenova/all-MiniLM-L6-v2 model (384 dimensions). Zero API cost, works offline.

config.yaml
embedding:
  provider: "local"

Anthropic

Uses the Anthropic API for higher-quality embeddings. Requires a valid API key.

config.yaml
embedding:
  provider: "anthropic"

None

Disables vector search entirely. Only FTS5 keyword search is used. Simplest setup with no embedding dependencies.

config.yaml
embedding:
  provider: "none"

Embedding Cache

All computed embeddings are cached in the embedding_cache SQLite table to avoid redundant computation:

| Parameter | Value |
| --- | --- |
| TTL | 60 days |
| Max entries | 50,000 |
| Eviction policy | LRU (least recently used) |
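The combination of TTL expiry and LRU eviction can be modeled in a few lines. This is an in-memory sketch of the behavior, not the actual SQL against the embedding_cache table.

```python
from collections import OrderedDict
import time

TTL_SECONDS = 60 * 24 * 3600   # 60 days
MAX_ENTRIES = 50_000

class EmbeddingCache:
    """In-memory model of the embedding_cache table: entries expire
    after `ttl` seconds, and the least recently used entry is evicted
    once `max_entries` is exceeded (illustrative sketch)."""
    def __init__(self, max_entries=MAX_ENTRIES, ttl=TTL_SECONDS):
        self._data = OrderedDict()   # key -> (timestamp, vector)
        self.max_entries, self.ttl = max_entries, ttl

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._data.get(key)
        if entry is None or now - entry[0] > self.ttl:
            self._data.pop(key, None)        # expired or missing
            return None
        self._data.move_to_end(key)          # mark as recently used
        return entry[1]

    def put(self, key, vector, now=None):
        now = time.time() if now is None else now
        self._data[key] = (now, vector)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)   # evict least recently used
```

A get refreshes an entry's recency, so frequently reused embeddings survive eviction while stale ones age out.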

Telegram Feed

All incoming and outgoing Telegram messages are archived for later retrieval. The feed is stored across four tables:

| Table | Contents |
| --- | --- |
| tg_messages | Full message text, sender ID, chat ID, timestamp, optional embedding vector |
| tg_messages_fts | FTS5 index over message text for keyword search |
| tg_chats | Chat metadata (title, type, member count) |
| tg_users | User metadata (name, username, phone) |

When vector search is enabled, each message also gets an embedding stored alongside the text, allowing semantic search across the entire message archive.

Context Building (RAG)

When the agent processes an incoming message, context is assembled in four steps:

RAG Pipeline
1. Fetch 10 most recent messages from the current chat
2. Hybrid search the Knowledge Base  -- top 5 chunks
3. Hybrid search the Telegram Feed   -- top 5 messages
4. Deduplicate results
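The four steps above can be sketched as one function. The search callables stand in for the hybrid-search queries, and the deduplication shown here only drops feed hits that repeat a recent chat message; the real step may be broader.

```python
def build_context(chat_history, search_kb, search_feed, query):
    """Assemble RAG context per the four-step pipeline (sketch)."""
    recent = chat_history[-10:]                 # step 1: recent messages
    kb_hits = search_kb(query)[:5]              # step 2: top 5 KB chunks
    feed_hits = search_feed(query)[:5]          # step 3: top 5 feed messages
    seen, feed_unique = set(recent), []
    for hit in feed_hits:                       # step 4: deduplicate
        if hit not in seen:
            seen.add(hit)
            feed_unique.append(hit)
    return recent, kb_hits, feed_unique

recent, kb, feed = build_context(
    ["m%d" % i for i in range(15)],
    lambda q: ["k1", "k2"],
    lambda q: ["m14", "f1"],   # m14 duplicates a recent chat message
    "query",
)
```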

The retrieved context is injected into the prompt in two labeled blocks:

Injected Context Format
[Relevant knowledge from memory]
  ... matched knowledge base chunks ...

[Relevant messages from Telegram feed]
  ... matched archived messages ...

Memory Tools

Two tools are exposed to the agent for explicit memory operations:

| Tool | Description |
| --- | --- |
| memory_write | Write content to persistent memory (MEMORY.md) or to the current daily log file |
| memory_read | Read from persistent memory or retrieve daily log entries |

The agent can use these tools proactively -- for example, saving important user preferences to MEMORY.md so they persist across sessions.
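The write path can be sketched as below. The function name mirrors the memory_write tool and the file layout follows the paths documented on this page, but the signature and target values are assumptions for illustration.

```python
import datetime
import pathlib

def memory_write(workspace, content, target="persistent", today=None):
    """Append content to MEMORY.md ("persistent") or to today's daily
    log under memory/ (sketch; signature is an assumption)."""
    workspace = pathlib.Path(workspace)
    if target == "persistent":
        path = workspace / "MEMORY.md"
    else:
        today = today or datetime.date.today()
        path = workspace / "memory" / ("%s.md" % today.isoformat())
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:
        f.write(content.rstrip() + "\n")
    return str(path)
```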

Daily Logs

A daily log file is automatically created for each day the agent is active:

Path
~/.teleton/workspace/memory/{YYYY-MM-DD}.md

Daily logs contain:

  • Session notes and conversation summaries
  • Memory flushes from session compaction
  • Milestone events and notable interactions

System prompt inclusion (DM only): The logs for yesterday and today are automatically included in the system prompt, capped at 100 lines each. This gives the agent short-term memory across restarts. Group chats do not receive daily log context.
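Gathering that context reduces to reading two dated files and truncating each. A sketch, assuming the path scheme above; the helper name is hypothetical.

```python
import datetime
import pathlib

LOG_DIR = pathlib.Path("~/.teleton/workspace/memory").expanduser()
LINE_CAP = 100   # per-log cap described above

def daily_log_context(today, log_dir=LOG_DIR):
    """Return yesterday's and today's daily logs, each capped at
    LINE_CAP lines, for DM system-prompt inclusion (sketch)."""
    parts = []
    for day in (today - datetime.timedelta(days=1), today):
        path = log_dir / ("%s.md" % day.isoformat())
        if path.exists():
            lines = path.read_text().splitlines()[:LINE_CAP]
            parts.append("\n".join(lines))
    return "\n\n".join(parts)
```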

Session Memory

Before a session is compacted or a daily reset occurs, the system preserves key information:

  1. An LLM generates a summary of the old session's conversation
  2. The summary is saved to memory (either persistent MEMORY.md or the daily log)
  3. The agent retains key facts, preferences, and context across session boundaries

This ensures continuity -- even after a context window reset, the agent remembers what matters.

Observation Masking

To save context window space, old tool results are compressed into a compact format:

Masked Format
[Tool: send_message - OK]
[Tool: search_messages - OK]
[Tool: get_balance - ERROR: insufficient funds]

| Rule | Detail |
| --- | --- |
| Last 10 results | Kept intact (full output preserved) |
| Error results | Always kept intact regardless of age |
| Older results | Compressed to [Tool: name - OK] |
| Size reduction | ~90% per masked result |
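These rules can be sketched as a single pass over the result list. Each result is modeled here as a (name, ok, output) tuple, which is a simplification of the real message structure.

```python
KEEP_RECENT = 10   # last N tool results kept intact

def mask_results(results):
    """Compress older successful tool results to a one-line marker;
    errors and the most recent KEEP_RECENT results stay intact
    (sketch of the rules in the table above)."""
    masked = []
    cutoff = len(results) - KEEP_RECENT
    for i, (name, ok, output) in enumerate(results):
        if i >= cutoff or not ok:
            masked.append(output)            # recent or error: keep full
        else:
            masked.append("[Tool: %s - OK]" % name)
    return masked

results = [("tool%d" % i, True, "full output %d" % i) for i in range(12)]
results[1] = ("get_balance", False, "ERROR: insufficient funds")
out = mask_results(results)
```

With 12 results, only the oldest two are candidates for masking, and the error among them is still kept verbatim.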

Context Compaction

When the conversation grows too large, automatic compaction kicks in:

| Threshold | Trigger | Action |
| --- | --- | --- |
| 50% of context window | Soft warning | Memory flush warning -- agent is prompted to save important facts |
| 200+ messages or 75% of context window | Hard compaction | Full compaction cycle runs |

The compaction process:

  1. AI generates a summary of the old conversation
  2. Old messages are replaced with the summary
  3. The last 20 messages are kept intact
  4. A new session ID is assigned
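The four steps above can be sketched as follows, with `summarize` standing in for the LLM call and a dict standing in for the session record.

```python
import uuid

KEEP_TAIL = 20   # messages kept intact after compaction

def compact(session, summarize):
    """Run the compaction cycle: summarize old messages, replace them
    with the summary, keep the last KEEP_TAIL intact, and rotate the
    session ID (sketch)."""
    old = session["messages"][:-KEEP_TAIL]
    tail = session["messages"][-KEEP_TAIL:]
    summary = summarize(old)                         # step 1: LLM summary
    return {
        "id": str(uuid.uuid4()),                     # step 4: new session ID
        "messages": [{"role": "system", "content": summary}] + tail,  # steps 2-3
    }

session = {"id": "old",
           "messages": [{"role": "user", "content": "m%d" % i} for i in range(50)]}
new = compact(session, lambda old: "summary of %d messages" % len(old))
```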

Privacy

Memory context injection follows strict privacy boundaries:

| Chat Type | Memory Context | Reason |
| --- | --- | --- |
| Direct Messages | Full context included (MEMORY.md + daily logs) | Private 1:1 conversation, safe to include personal context |
| Group Chats | Own-chat feed RAG search (recent messages from that group), but not MEMORY.md, STRATEGY.md, or cross-chat search | Prevents cross-user information leakage |

This separation ensures that private notes, preferences, and personal information stored in memory are never exposed in group conversations.
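The boundary amounts to a simple gate on chat type before any context is fetched. A sketch under the assumption that routing is this coarse; the real logic may be more granular.

```python
def context_sources(chat_type):
    """Which memory sources may be injected, per the privacy table
    above (sketch)."""
    if chat_type == "dm":
        return {"memory_md": True, "daily_logs": True, "feed_scope": "all"}
    # Group chats: feed search restricted to that group's own messages.
    return {"memory_md": False, "daily_logs": False, "feed_scope": "own_chat"}
```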

Configuration

Memory-related settings in config.yaml:

config.yaml
embedding:
  provider: "local"    # "local" | "anthropic" | "none"
  model: null          # Override default model (optional)

storage:
  sessions_file: "~/.teleton/sessions.json"
  memory_file: "~/.teleton/memory.json"
  history_limit: 100

| Key | Default | Description |
| --- | --- | --- |
| embedding.provider | "local" | Embedding backend: local (ONNX), anthropic, or none |
| embedding.model | null | Override the default model for the chosen provider |
| storage.sessions_file | ~/.teleton/sessions.json | Path to session state file |
| storage.memory_file | ~/.teleton/memory.json | Path to memory metadata file |
| storage.history_limit | 100 | Maximum messages retained in raw history |

Database Tables

All data is stored in a single SQLite database (schema version 1.13.0). Key tables:

| Table | Purpose |
| --- | --- |
| meta | Schema metadata (stores current schema version) |
| knowledge | Knowledge base chunks (text, hash, source, embedding) |
| knowledge_fts | FTS5 index over knowledge chunks |
| knowledge_vec | sqlite-vec virtual table for vector similarity search over knowledge chunks |
| sessions | Conversation session state and history |
| tg_messages | Archived Telegram messages |
| tg_messages_fts | FTS5 index over Telegram messages |
| tg_messages_vec | sqlite-vec virtual table for vector similarity search over archived messages |
| tg_chats | Telegram chat metadata |
| tg_users | Telegram user metadata |
| embedding_cache | Cached embedding vectors (60-day TTL, 50k max, LRU) |
| exec_audit | Command execution audit log (tool, command, exit code, stdout/stderr, duration) |
| tool_index | Tool RAG index: tool name, description, and search text for semantic tool selection |
| tool_index_fts | FTS5 index over tool_index for keyword-based tool search |
| tool_config | Runtime tool configuration overrides (enabled, scope) set via admin commands |
| tasks | Scheduled and pending agent tasks |
| task_dependencies | Dependency graph between tasks |
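
To make the knowledge-side layout concrete, here is a sketch of the core tables using Python's built-in sqlite3. The column names are guesses based on this page, not the actual Teleton schema, and the sqlite-vec virtual table is omitted since it needs the extension loaded.

```python
import sqlite3

# Guessed columns for the meta / knowledge / knowledge_fts trio.
SCHEMA = """
CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT);
CREATE TABLE knowledge (
    id INTEGER PRIMARY KEY,
    text TEXT NOT NULL,
    hash TEXT NOT NULL,
    source TEXT CHECK (source IN ('memory', 'session', 'learned')),
    embedding BLOB
);
CREATE VIRTUAL TABLE knowledge_fts
    USING fts5(text, content=knowledge, content_rowid=id);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute(
    "INSERT INTO knowledge (id, text, hash, source) VALUES (1, ?, 'h1', 'memory')",
    ("hybrid search merges keyword and vector scores",),
)
# External-content FTS tables are populated from the base table.
db.execute("INSERT INTO knowledge_fts (rowid, text) SELECT id, text FROM knowledge")
rows = db.execute(
    "SELECT rowid FROM knowledge_fts WHERE knowledge_fts MATCH 'vector'"
).fetchall()
```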