Architecture
Teleton Agent v0.7.5 is an autonomous Telegram userbot powered by 15 LLM providers and 132 built-in tools across 11 categories. It runs as a single Node.js process, connecting to Telegram via GramJS (Layer 222), persisting all state in SQLite, and routing every message through a multi-iteration agentic loop.
System Overview
| Property | Value |
|---|---|
| Version | 0.7.5 |
| License | MIT |
| Runtime | Node.js (ESM), TypeScript compiled with tsup |
| Built-in tools | 132 across 11 categories |
| LLM providers | 15 (Anthropic, OpenAI, Google, xAI, Groq, OpenRouter, Moonshot, Mistral, Cerebras, ZAI, MiniMax, HuggingFace, Cocoon, Local, Claude Code) |
| LLM abstraction | @mariozechner/pi-ai (pi-ai) |
| Memory database | SQLite via better-sqlite3, schema v1.13.0 |
| Vector search | sqlite-vec (384-dim) + FTS5 hybrid |
| Telegram layer | GramJS Layer 222 (TONresistor/gramjs fork) |
| Blockchain | TON — wallet, transfers, DEX (STON.fi, DeDust), DNS |
| Extension points | Plugin SDK, MCP servers (stdio + Streamable HTTP + SSE), Bot SDK |
| Optional WebUI | Hono + React dashboard on port 7777 |
| Logging | Pino structured logging (redacts secrets) |
Source Directory Layout
All source lives under src/. The table below describes the entry point and every subdirectory.
| Directory | Purpose |
|---|---|
| src/index.ts | TeletonApp class — top-level wiring: config load, registry, bridge, memory initialization, start/stop lifecycle |
| src/agent/ | Core agent: runtime.ts (agentic loop, processMessage()), lifecycle.ts (state machine), client.ts (pi-ai wrapper, model resolution), tools/ (registry, categories, Tool RAG index) |
| src/telegram/ | GramJS client wrapper (bridge.ts), message dispatcher (handlers.ts), admin commands (admin.ts), group debouncer (debounce.ts), Telegram formatting, scheduled task executor |
| src/memory/ | SQLite schema & migrations (schema.ts), knowledge indexing, hybrid RAG search, context builder, compaction manager, observation masking, daily logs |
| src/session/ | Per-chat session store (SQLite), transcript read/write, memory hook (save on reset), session migration |
| src/soul/ | System prompt assembly (loader.ts): SOUL.md, SECURITY.md, STRATEGY.md, MEMORY.md, workspace section, response format, owner/user identity, context injection |
| src/config/ | Zod config schema (schema.ts), provider registry (providers.ts), shared model catalog (model-catalog.ts), YAML loader, configurable-keys helper |
| src/providers/ | Provider-specific credential helpers — claude-code-credentials.ts (OAuth token rotation from Claude Code install) |
| src/ton/ | TON wallet service, transfer builder, payment verifier, TonAPI client, endpoint configuration |
| src/bot/ | Bot SDK runtime: InlineRouter (callback query dispatch), GramJSBotClient, PluginRateLimiter, styled keyboard builder |
| src/sdk/ | Plugin SDK factory — frozen, namespaced service objects: sdk.ton, sdk.telegram, sdk.bot, sdk.db, sdk.secrets, sdk.storage |
| src/deals/ | Deals & Escrow module (built-in): Grammy bot, NFT gift marketplace, TON payment verification, deal state machine |
| src/cocoon/ | Cocoon Network adapter — XML tool injection into system prompt, <tool_call> parser for Qwen3 models that do not support native tool calling |
| src/workspace/ | Workspace path constants (paths.ts), directory initialization for ~/.teleton/workspace/ |
| src/webui/ | Optional Hono HTTP server: REST API routes (agent control, config, token usage, MCP status, plugin marketplace), SSE status stream, setup wizard backend |
| src/cli/ | CLI entry point (teleton), start/stop/setup/onboard commands, interactive setup wizard |
| src/constants/ | Shared numeric constants: limits, timeouts, tool name sets, API endpoints |
| src/utils/ | Pino logger factory, fetch with timeout, sanitize (prompt injection defense), retry helper, error utilities, GramJS bigint helpers |
Startup Sequence
The TeletonApp constructor runs synchronously, followed by an async start() call. Steps occur in strict order:
- Config load — YAML parsed and validated with Zod (src/config/schema.ts). TonAPI / Toncenter keys applied globally.
- Soul load — SOUL.md read from ~/.teleton/workspace/ (falls back to the built-in default).
- Tool registry — registerAllTools() registers 9 static categories (Telegram, TON, DNS, STON.fi, DeDust, Journal, Workspace, Web, Bot); loadModules() adds 2 built-in modules: Deals and Exec.
- Memory init — SQLite database opened at ~/.teleton/memory.db, schema migrations run, sqlite-vec loaded if the embedding provider is not none.
- WebUI start (optional) — Hono server starts on port 7777 before the agent, so it survives agent stop/restart.
- Plugin load — external plugins scanned from ~/.teleton/plugins/, SDK injected, tools registered.
- MCP connect — configured MCP servers connected (stdio, Streamable HTTP, or SSE); their tools registered into the registry.
- Tool RAG index — all tools indexed into the tool_embeddings table (sqlite-vec + FTS5) for semantic retrieval.
- Knowledge index — MEMORY.md and memory/*.md files chunked and indexed.
- Telegram connect — GramJS connects to Telegram servers; owner name/username auto-resolved from the API if not yet persisted.
- Modules start — each module's start(pluginContext) is called (Deals bot connects, background jobs begin).
- Message loop — bridge.onNewMessage() registered; debouncer activated for group messages.
AgentLifecycle State Machine
The AgentLifecycle class (src/agent/lifecycle.ts) manages the agent's operational state and is used by the WebUI to trigger start/stop without direct access to TeletonApp.
| State | Meaning |
|---|---|
| stopped | Initial state and state after a clean shutdown. |
| starting | Connecting to Telegram, loading plugins, indexing tools. Concurrent start() calls return the existing promise. |
| running | Agent is connected and processing messages. |
| stopping | Flushing debouncer, draining in-flight messages, stopping modules, disconnecting bridge. |
State changes are emitted as stateChange events (EventEmitter), consumed by the WebUI SSE endpoint for live status updates. The lifecycle also tracks uptime (seconds since entering running) and the last error message if startup failed.
Calling stop() while in starting state waits for startup to complete before initiating shutdown. Calling start() while stopping throws — the state machine enforces a clean transition sequence.
Message Processing Pipeline
Every inbound Telegram message passes through a fixed sequence of stages before the LLM sees it.
Stage 1 — Ingest & Debounce
The GramJS bridge fires onNewMessage for each new message (regular and service messages like gift offers). All messages are enqueued into MessageDebouncer. DMs and admin commands bypass debouncing; group messages are held for debounce_ms (default 1500 ms) and batched, protecting against rapid-fire sequences. The debouncer buffers up to 20 messages per chat.
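The batching behavior can be sketched as follows. This is a simplified model of the real MessageDebouncer in src/telegram/debounce.ts; the class shape and callback signature here are assumptions, only the DM bypass, the 1500 ms window, and the 20-message cap come from the text.

```typescript
type Message = { chatId: number; text: string; isDM: boolean };

// Simplified debouncer sketch: DMs dispatch immediately, group messages are
// held for `debounceMs` after the last arrival and delivered as one batch.
class Debouncer {
  private buffers = new Map<number, Message[]>();
  private timers = new Map<number, ReturnType<typeof setTimeout>>();

  constructor(
    private debounceMs: number,
    private maxBuffered: number,
    private onBatch: (chatId: number, batch: Message[]) => void,
  ) {}

  enqueue(msg: Message): void {
    if (msg.isDM) {
      this.onBatch(msg.chatId, [msg]); // DMs bypass debouncing
      return;
    }
    const buf = this.buffers.get(msg.chatId) ?? [];
    buf.push(msg);
    if (buf.length > this.maxBuffered) buf.shift(); // keep only the newest N
    this.buffers.set(msg.chatId, buf);
    const prev = this.timers.get(msg.chatId);
    if (prev) clearTimeout(prev); // each new message restarts the window
    this.timers.set(msg.chatId, setTimeout(() => this.flush(msg.chatId), this.debounceMs));
  }

  flush(chatId: number): void {
    const timer = this.timers.get(chatId);
    if (timer) clearTimeout(timer);
    this.timers.delete(chatId);
    const buf = this.buffers.get(chatId);
    this.buffers.delete(chatId);
    if (buf?.length) this.onBatch(chatId, buf);
  }
}
```

The restart-on-arrival timer is what protects against rapid-fire sequences: a burst of group messages produces one batch, not one agent invocation per message.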
Stage 2 — Admin Command Check
AdminHandler inspects messages starting with /. If the sender is in admin_ids, built-in commands (/pause, /resume, /reset, /status, /boot, /task, etc.) are dispatched directly. /boot and /task fall through to the agent with injected context. Messages arriving while paused are silently dropped (admin commands still execute).
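The routing decision can be approximated by the sketch below. The function name and the behavior of /boot and /task while paused are assumptions; the command list and the pause semantics for normal messages come from the text.

```typescript
// Sketch of the Stage 2 routing decision in AdminHandler.
function routeMessage(
  text: string,
  senderId: number,
  adminIds: number[],
  paused: boolean,
): "admin" | "agent" | "drop" {
  const isAdminCommand = text.startsWith("/") && adminIds.includes(senderId);
  if (isAdminCommand) {
    const cmd = text.split(/\s+/)[0];
    // /boot and /task fall through to the agent with injected context
    if (cmd === "/boot" || cmd === "/task") return "agent";
    return "admin"; // /pause, /resume, /reset, /status, ... dispatch directly
  }
  return paused ? "drop" : "agent"; // paused: silently drop normal traffic
}
```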
Stage 3 — Scheduled Task Check
Messages from the agent's own user ID matching the format [TASK:uuid] are intercepted and dispatched to the scheduled task executor rather than the normal message handler. The task is loaded from SQLite, dependency checks run, and the agent is invoked with task-specific context.
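The interception check amounts to a prefix match on the agent's own outbound messages. The exact regex below is an assumption; the [TASK:uuid] format and the self-sender requirement come from the text.

```typescript
// Sketch of the Stage 3 check: extract a task id from a [TASK:uuid] prefix,
// but only for messages sent by the agent's own user ID.
const TASK_PREFIX = /^\[TASK:([0-9a-fA-F-]{36})\]/;

function extractTaskId(text: string, senderId: number, selfId: number): string | null {
  if (senderId !== selfId) return null; // only the agent's own messages qualify
  const match = TASK_PREFIX.exec(text);
  return match ? match[1] : null;
}
```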
Stage 4 — Session & Context Load
AgentRuntime.processMessage() loads or creates the chat session. If a daily reset policy triggers (daily_reset_hour, idle_expiry_minutes), the existing session memory is saved before the session is cleared. The conversation transcript is loaded from the session file.
Stage 5 — RAG Context Retrieval
For non-trivial messages, the user message is embedded (384-dim vector). The ContextBuilder runs a hybrid search (sqlite-vec cosine + FTS5 BM25) against the knowledge base and Telegram feed history, surfacing up to 5 relevant chunks. Results are sanitized with sanitizeForContext() before injection into the system prompt. Memory statistics (message count, chat count, knowledge chunks) are also injected.
Stage 6 — System Prompt Assembly
buildSystemPrompt() in src/soul/loader.ts assembles the prompt in section order:
- Soul (SOUL.md)
- Security rules (SECURITY.md, if present)
- Strategy (STRATEGY.md, DM only)
- Workspace description (tool list, directory layout)
- Response format guidelines
- Owner identity (if configured)
- Persistent memory (MEMORY.md + recent daily logs, DM only)
- Current user identity
- RAG context (relevant knowledge + feed)
- Memory flush warning (when approaching compaction threshold)
Stage 7 — Preemptive Compaction
Before calling the LLM, CompactionManager checks if the conversation context exceeds the soft threshold (50% of model context window, or 64K tokens). If so, older messages are summarized using a cheaper utility model and replaced with a compact summary. The session transcript is updated atomically.
Stage 8 — Tool RAG Selection
With 132 built-in tools, sending all tools to every LLM call is impractical for providers with tool limits. The Tool RAG system selects the most relevant tools for the current message:
- The message embedding is compared against the tool index using hybrid search (60% vector weight, 40% keyword weight, minimum score 0.1).
- The top top_k=25 tools are retrieved, plus any always_include patterns (defaults: telegram_send_message, telegram_reply_message, telegram_send_photo, telegram_send_document, journal_*, workspace_*, web_*).
- For providers with no tool limit (Anthropic, Claude Code), Tool RAG is skipped by default — all tools are sent.
- Tool RAG is also skipped for trivial messages (single-word acknowledgements, emoji-only, etc.).
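Under those rules, the selection step can be sketched like this. The scoring inputs and the glob-style wildcard matching are illustrative assumptions; top_k, the minimum score, and the always_include behavior come from the text.

```typescript
// Sketch of Tool RAG selection: score-filtered top-k, then always_include
// patterns appended even if they fell below the score or rank cutoff.
function selectTools(
  scored: { name: string; score: number }[],
  alwaysInclude: string[],
  topK = 25,
  minScore = 0.1,
): string[] {
  const matches = (name: string, pattern: string) =>
    pattern.endsWith("*") ? name.startsWith(pattern.slice(0, -1)) : name === pattern;
  const selected = scored
    .filter((t) => t.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((t) => t.name);
  for (const { name } of scored) {
    if (!selected.includes(name) && alwaysInclude.some((p) => matches(name, p))) {
      selected.push(name); // core communication tools are never filtered out
    }
  }
  return selected;
}
```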
Stage 9 — Agentic Loop
The LLM is called via chatWithContext(). The loop iterates up to max_agentic_iterations (default 5) times:
- Observation masking applied: tool results older than the last 10 are replaced with placeholder text to reduce token usage while preserving call structure.
- pi-ai complete() is called with the assembled context and selected tools.
- If the response contains tool calls, each is dispatched to the ToolRegistry with a full ToolContext (bridge, db, chatId, senderId, config).
- Tool results exceeding 50,000 characters are truncated; the summary/message field is preserved when available.
- Results are appended to the context. For the Cocoon provider, results are wrapped in <tool_response> tags in a user message (Qwen3 XML format).
- The loop continues until no tool calls are returned or the iteration limit is reached.
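The observation-masking step at the top of the loop can be sketched as follows. This is a simplified model assuming tool results are fields on transcript messages; the actual representation in the codebase may differ.

```typescript
type TranscriptMsg = { role: string; content: string; toolResult?: string };

// Sketch of observation masking: all but the most recent `keepRecent` tool
// results are replaced with a placeholder, preserving the call structure
// while shrinking token usage.
function maskObservations(transcript: TranscriptMsg[], keepRecent = 10): TranscriptMsg[] {
  const toolIndices = transcript
    .map((m, i) => (m.toolResult !== undefined ? i : -1))
    .filter((i) => i >= 0);
  const toMask = new Set(toolIndices.slice(0, Math.max(0, toolIndices.length - keepRecent)));
  return transcript.map((m, i) =>
    toMask.has(i) ? { ...m, toolResult: "[old tool result elided]" } : m,
  );
}
```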
The loop handles three error classes automatically: context overflow (session reset + retry), rate limiting (exponential backoff, up to 3 retries), and 5xx server errors (exponential backoff, up to 3 retries).
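The backoff behavior for the two retryable classes can be sketched generically. The delay base and the shape of the error object are assumptions; the 429/5xx classification and the 3-retry cap come from the text.

```typescript
// Sketch of the retry policy: exponential backoff for rate limits (429)
// and 5xx server errors, up to `maxRetries` attempts after the first.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      const status = (err as { status?: number }).status ?? 0;
      const retryable = status === 429 || (status >= 500 && status < 600);
      if (!retryable || attempt >= maxRetries) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt)); // 1s, 2s, 4s
    }
  }
}
```

Context overflow is the one class not handled by retry alone: it resets the session first, then retries, because replaying the same oversized context would fail again.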
Stage 10 — Post-Loop Compaction & Response
After the loop, a second compaction check runs on the updated context (hard threshold: 75% of context window, 200 messages). Token usage is accumulated across all iterations and logged. The final text content is returned as the agent response, which the message handler sends back to Telegram. If the agent used a Telegram send tool directly, the text response is suppressed to avoid duplicate messages.
Tool System
Tools are the agent's primary action interface. Each tool is a TypeScript object with a JSON Schema definition and an executor function. The ToolRegistry manages registration, scope filtering, permission overrides, and execution.
Tool Categories (132 built-in tools)
| Category | Count | Source | Description |
|---|---|---|---|
| Telegram | 76 | src/agent/tools/telegram/ | Send messages, media, polls, stickers, scheduled messages, group management, contacts, gift marketplace, stories, reactions |
| TON Blockchain | 15 | src/agent/tools/ton/ | Wallet balance, transfers, transaction history, TonAPI queries, payment verification, Jetton info |
| DNS & Domains | 8 | src/agent/tools/dns/ | Resolve .ton domains, set/get DNS records, manage TON Site ADNL records |
| STON.fi | 5 | src/agent/tools/stonfi/ | DEX swap, quote, liquidity pool info, asset list |
| DeDust | 5 | src/agent/tools/dedust/ | DEX swap, quote, vault info, pool queries |
| Journal | 3 | src/agent/tools/journal/ | Append to daily log, write session note, read recent memory |
| Workspace | 6 | src/agent/tools/workspace/ | List, read, write, delete, rename files in ~/.teleton/workspace/; workspace stats |
| Web | 2 | src/agent/tools/web/ | Fetch URL content, web search |
| Bot | 1 | src/agent/tools/bot/ | Send inline bot message with styled buttons (Bot SDK) |
| Deals & Escrow | 5 | src/deals/module.ts | Create deal, accept/reject, check status, list active deals (TON payment escrow via Grammy bot) |
| Exec | 4 | src/agent/tools/exec/ | Run shell commands, read file, write file, list directory (admin-only scope) |
Tool Scopes
Each tool is registered with a scope that controls when it is available:
| Scope | Available in |
|---|---|
| always | All contexts (DMs and groups) |
| dm-only | Direct messages only |
| group-only | Group chats only |
| admin-only | Sender must be in admin_ids |
Scope can be overridden per-tool at runtime via the tool_config table in memory.db (requires restart to take effect). The registry enforces scope at registration, at getForContext() filtering, and again inside execute() — three independent checkpoints.
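The per-context check that all three checkpoints apply reduces to a small predicate. This is a sketch of the rule, not the registry's actual code.

```typescript
type Scope = "always" | "dm-only" | "group-only" | "admin-only";

// Sketch of the scope rule enforced at registration, getForContext()
// filtering, and execute().
function isAvailable(
  scope: Scope,
  ctx: { isDM: boolean; senderId: number; adminIds: number[] },
): boolean {
  switch (scope) {
    case "always":
      return true;
    case "dm-only":
      return ctx.isDM;
    case "group-only":
      return !ctx.isDM;
    case "admin-only":
      return ctx.adminIds.includes(ctx.senderId);
  }
}
```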
Tool RAG Index
At startup, all registered tools (including plugins and MCP tools) are indexed into the tool_embeddings table. The index uses the same hybrid approach as memory search: sqlite-vec for semantic similarity and FTS5 for keyword matching. Hot-reload plugins trigger a differential re-index (removed tools deleted, new tools added).
The default top_k=25 can be adjusted in config under tool_rag.top_k. The always_include list ensures core communication tools are never filtered out, regardless of the message content.
Memory & RAG
All persistent state lives in a single SQLite database at ~/.teleton/memory.db, currently at schema version 1.13.0. The database uses sqlite-vec for 384-dimensional vector storage and FTS5 for full-text search.
Key Tables
| Table | Purpose |
|---|---|
| knowledge | Chunked knowledge base: MEMORY.md, memory/*.md, learned facts. Sources: memory, session, learned. |
| knowledge_fts | FTS5 virtual table over knowledge.text for BM25 keyword search. |
| tg_messages | Telegram message feed (indexed for RAG context retrieval). |
| tg_messages_vec | sqlite-vec companion table — 384-dim embeddings for feed messages. |
| sessions | Per-chat session metadata (sessionId, message count, token usage, provider, model). |
| tool_config | Runtime scope overrides per tool name. |
| tool_embeddings | Tool RAG index — tool name, description embedding, FTS text. |
| deals | Escrow deal state (created, accepted, funded, completed, cancelled, expired). |
| tasks | Scheduled task definitions with dependency graph and status. |
| exec_audit | Audit log for all Exec tool invocations. |
| meta | Schema version and other metadata key-value pairs. |
Embedding Providers
| Provider | Config value | Model | Notes |
|---|---|---|---|
| Local ONNX | local (default) | Xenova/all-MiniLM-L6-v2 | Runs in-process via @xenova/transformers. No API key. Model downloaded on first run. |
| Anthropic | anthropic | voyage-3 (via Anthropic API) | Requires Anthropic API key. Higher quality, API cost applies. |
| None (FTS5 only) | none | — | Disables sqlite-vec. Only keyword search available. No embedding overhead. |
Hybrid Search
The ContextBuilder merges vector and keyword results with reciprocal rank fusion. The minimum hybrid score is 0.15. Vector weight is 0.6, keyword weight is 0.4. When the embedding provider is none, the system falls back to FTS5-only search.
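A weighted reciprocal rank fusion over the two ranked lists might look like the sketch below. The RRF constant (k = 60) is a conventional choice and an assumption here; the 0.6/0.4 weights come from the text, and the minimum-score cutoff would be applied to the fused result.

```typescript
// Sketch of weighted reciprocal rank fusion: each hit contributes
// weight / (k + rank) to its document's fused score.
function fuseRanks(
  vectorHits: string[],  // ids ordered by cosine similarity (best first)
  keywordHits: string[], // ids ordered by BM25 score (best first)
  vectorWeight = 0.6,
  keywordWeight = 0.4,
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  const accumulate = (hits: string[], weight: number) => {
    hits.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  };
  accumulate(vectorHits, vectorWeight);
  accumulate(keywordHits, keywordWeight);
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Fusion over ranks rather than raw scores sidesteps the problem that cosine similarities and BM25 scores live on incomparable scales.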
Compaction
The CompactionManager monitors conversation context size. Two thresholds apply:
- Soft threshold: 50% of model context window — triggers a memory flush warning injected into the system prompt, prompting the agent to write important facts to memory before they are lost.
- Hard threshold: 75% of model context window, or 200 messages — triggers automatic compaction: older messages are summarized by a utility model, replaced with the summary, and the transcript is rewritten.
After daily session reset or context overflow reset, the session memory is saved via saveSessionMemory() before the session is cleared, preserving continuity across resets.
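The two-threshold policy reduces to a small decision function. This sketch uses the percentage thresholds plus the 200-message hard limit; how the alternative 64K-token soft figure interacts with the 50% threshold is not specified, so it is left out here.

```typescript
type CompactionAction = "none" | "warn" | "compact";

// Sketch of the CompactionManager decision described above.
function compactionAction(
  tokens: number,
  messageCount: number,
  contextWindow: number,
): CompactionAction {
  const softLimit = 0.5 * contextWindow;  // inject a memory flush warning
  const hardLimit = 0.75 * contextWindow; // summarize older messages
  if (tokens >= hardLimit || messageCount >= 200) return "compact";
  if (tokens >= softLimit) return "warn";
  return "none";
}
```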
LLM Providers
All LLM calls are routed through @mariozechner/pi-ai, a unified LLM client that handles request formatting, streaming, and response normalization across providers. Teleton adds a thin adapter layer in src/agent/client.ts for provider-specific concerns.
Provider Registry
| Provider ID | Display Name | Default Model | Tool Limit | Notes |
|---|---|---|---|---|
| anthropic | Anthropic (Claude) | claude-opus-4-6 | Unlimited | Recommended default. Tool RAG skipped (all tools sent). |
| claude-code | Claude Code (Auto) | claude-opus-4-6 | Unlimited | Auto-rotates credentials from local Claude Code install (~/.claude/.credentials.json). Retries once on 401. |
| openai | OpenAI (GPT-4o) | gpt-4o | 128 | pi-ai native. |
| google | Google (Gemini) | gemini-2.5-flash | 128 | Tool schemas sanitized for Gemini compatibility (schema-sanitizer.ts). |
| xai | xAI (Grok) | grok-3 | 128 | pi-ai native. |
| groq | Groq | llama-3.3-70b-versatile | 128 | pi-ai native. |
| openrouter | OpenRouter | anthropic/claude-opus-4.5 | 128 | pi-ai native. Routes to 100+ models. |
| moonshot | Moonshot (Kimi K2.5) | k2p5 | 128 | Uses kimi-coding API at api.kimi.com/coding. Alias: kimi-k2.5 maps to k2p5. |
| mistral | Mistral AI | devstral-small-2507 | 128 | pi-ai native. <think> blocks stripped from output. |
| cerebras | Cerebras | qwen-3-235b-a22b-instruct-2507 | 128 | pi-ai native. |
| zai | ZAI (Zhipu) | glm-4.7 | 128 | pi-ai native. |
| minimax | MiniMax | MiniMax-M2.5 | 128 | pi-ai native. |
| huggingface | HuggingFace | deepseek-ai/DeepSeek-V3.2 | 128 | pi-ai native. |
| cocoon | Cocoon Network (Decentralized) | Qwen/Qwen3-32B | 128 | No API key. Pays inference costs in TON. Tools injected as XML in system prompt; <tool_call> parsed from text response. Requires local cocoon daemon. |
| local | Local (Ollama, vLLM, LM Studio...) | auto-discovered | 128 | OpenAI-compatible API. base_url required. Models discovered at startup via /models endpoint. |
Tool Limit Enforcement
At startup, if the total tool count exceeds the active provider's tool limit, a warning is logged. At runtime, getForContext() and getForContextWithRAG() both respect the limit, truncating to the most relevant tools when Tool RAG is active. Providers with toolLimit: null (Anthropic, Claude Code) receive all available tools.
Utility Model
Each provider defines a utilityModel — a fast, cheap model used for compaction summarization and session memory saving. Examples: claude-haiku-4-5-20251001 for Anthropic, gpt-4o-mini for OpenAI, gemini-2.0-flash-lite for Google. The utility model can be overridden in config via agent.utility_model.
Extension Architecture
Plugin SDK
External plugins are placed in ~/.teleton/plugins/. Each plugin is a Node.js module that exports a PluginModule object. At startup, loadEnhancedPlugins() discovers and loads each plugin, injecting a frozen SDK instance.
The SDK exposes four namespaced service objects:
| Namespace | Capabilities |
|---|---|
| sdk.ton | Wallet address, balance, transfers, Jetton info, TonAPI queries |
| sdk.telegram | Send messages, media, stickers, read chat info |
| sdk.bot | Inline keyboard messages, callback query routing via InlineRouter |
| sdk.db | Isolated SQLite database per plugin (proxied ATTACH/DETACH, no cross-plugin access) |
| sdk.secrets | Read/write secret values to ~/.teleton/secrets/ (mode 0o700) |
| sdk.storage | Simple key-value storage in plugin's isolated directory |
Plugin isolation: the SDK object is Object.freeze()-d before injection, the config object is deep-cloned, and the plugin's database is sandboxed via a DB proxy. Plugins may define lifecycle hooks: onMessage, onCallbackQuery, start, stop, configure, migrate.
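The freeze-and-clone step can be sketched in a few lines. The function name is illustrative; note that Object.freeze() is shallow, so the real factory would also need to freeze each namespaced service object it exposes.

```typescript
// Sketch of the plugin isolation step: freeze the SDK surface and
// deep-clone the config before handing them to a plugin.
function prepareForPlugin<T extends object, C extends object>(sdk: T, config: C) {
  return {
    sdk: Object.freeze(sdk),         // plugin cannot add or replace services
    config: structuredClone(config), // plugin mutations stay local to its copy
  };
}
```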
In development mode (dev.hot_reload: true), PluginWatcher monitors the plugins directory with chokidar and performs differential re-registration on file changes — removed tools deleted, new tools indexed into Tool RAG.
MCP Servers
Teleton connects to external MCP (Model Context Protocol) servers at startup. Three transport types are supported:
| Transport | Config field | Use case |
|---|---|---|
| stdio | command | Local processes (npx, Python scripts, executables) |
| Streamable HTTP | url (primary) | Remote HTTP servers, modern MCP implementations |
| SSE | url (fallback) | Legacy SSE-based MCP servers |
MCP tools are registered into the main ToolRegistry under a mcp_<servername> module namespace. They participate in Tool RAG indexing and scope filtering identically to built-in tools. Each MCP server can be assigned a scope (always, dm-only, etc.).
Bot SDK & InlineRouter
The Bot SDK enables plugins to build interactive inline experiences using Telegram's callback query mechanism. The InlineRouter (src/bot/inline-router.ts) routes incoming callbackQuery updates to the plugin that registered the matching action prefix.
Styled buttons use GramJS Layer 222 colored button support (KeyboardButtonColor). The PluginRateLimiter prevents abuse by capping callback query rates per user. The Deals module's Grammy bot instance serves as the underlying bot connection; the InlineRouter middleware is installed on top of it before the Deals module starts.
Data Persistence
All user data is stored under ~/.teleton/. No cloud sync or external storage is used by default.
| Path | Contents | Permissions |
|---|---|---|
| ~/.teleton/memory.db | Main SQLite database: knowledge, sessions, messages, tools, deals, tasks, audit log | Default (0o644) |
| ~/.teleton/telegram_session.txt | GramJS session string (Telegram auth) | Default |
| ~/.teleton/sessions/<id>.json | Per-chat conversation transcripts (pi-ai message arrays) | Default |
| ~/.teleton/wallet.json | TON wallet mnemonic (24-word BIP39 phrase, plaintext JSON). No encryption. | 0o600 (owner read/write only) |
| ~/.teleton/secrets/ | Plugin secret values directory | 0o700 (owner only) |
| ~/.teleton/workspace/ | Agent workspace: SOUL.md, MEMORY.md, STRATEGY.md, SECURITY.md, memory/, downloads/, uploads/, memes/, temp/ | Default |
| ~/.teleton/plugins/ | External plugin directories | Default |
| ~/.teleton/config.yml | Main YAML configuration file | Default |
The wallet file uses writeFileSync({ mode: 0o600 }) on creation. The secrets directory is created with mode: 0o700. The mnemonic is stored as a JSON array of 24 strings — there is no PBKDF2 or encryption at rest. Disk encryption (e.g. LUKS, FileVault) is recommended for production deployments.
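The creation flags translate directly to Node's fs API, as in the sketch below. A temp directory stands in for ~/.teleton, and the two-word mnemonic is a placeholder; the modes are POSIX permissions and have no effect on Windows.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Demonstration of the creation flags described above.
const base = mkdtempSync(join(tmpdir(), "teleton-demo-"));

const walletPath = join(base, "wallet.json");
writeFileSync(walletPath, JSON.stringify({ mnemonic: ["abandon", "ability"] }), {
  mode: 0o600, // owner read/write only
});

mkdirSync(join(base, "secrets"), { mode: 0o700 }); // owner-only directory

const wallet = JSON.parse(readFileSync(walletPath, "utf8"));
```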
Pino logging is configured with a redact list covering apiKey, password, secret, token, and mnemonic fields, preventing credential leakage in log output.
Optional WebUI
When webui.enabled: true, a Hono HTTP server starts on port 7777 (configurable) before the agent. The WebUI survives agent stop/restart cycles — it remains running while the agent state machine transitions.
Key WebUI capabilities:
- Live agent status via SSE stream (state, uptime, last error)
- Start/Stop agent via AgentLifecycle callbacks
- Real-time token usage display (accumulated across all LLM calls)
- Plugin marketplace (browse, install, uninstall from ~/.teleton/plugins/)
- MCP server status (connected, tool count, transport type)
- Config editor (read/write config.yml via the configurable-keys API)
- Setup wizard (provider selection, API key, model, Telegram credentials)
- React frontend built with Vite, served as an SPA from dist/web/
The WebUI is protected by a bearer token (webui.auth_token, auto-generated if omitted). It binds to 127.0.0.1 by default — do not expose it directly to the internet without a reverse proxy and TLS.
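A bearer-token check of this kind might look like the sketch below; the actual Hono middleware and header handling in src/webui/ are assumptions. Using a constant-time comparison avoids leaking token prefixes through timing.

```typescript
import { timingSafeEqual } from "node:crypto";

// Sketch of a bearer-token guard like the one protecting the WebUI routes.
function isAuthorized(authHeader: string | undefined, token: string): boolean {
  if (!authHeader?.startsWith("Bearer ")) return false;
  const presented = Buffer.from(authHeader.slice("Bearer ".length));
  const expected = Buffer.from(token);
  // timingSafeEqual requires equal-length buffers, so compare lengths first
  return presented.length === expected.length && timingSafeEqual(presented, expected);
}
```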