Architecture
Teleton Agent v0.7.5 is an autonomous Telegram userbot powered by 15 LLM providers and 132 built-in tools across 11 categories. It runs as a single Node.js process, connecting to Telegram via GramJS (Layer 222), persisting all state in SQLite, and routing every message through a multi-iteration agentic loop.
System Overview
| Property | Value |
|---|---|
| Version | 0.7.5 |
| License | MIT |
| Runtime | Node.js (ESM), TypeScript compiled with tsup |
| Built-in tools | 132 across 11 categories |
| LLM providers | 15 (Anthropic, OpenAI, Google, xAI, Groq, OpenRouter, Moonshot, Mistral, Cerebras, ZAI, MiniMax, HuggingFace, Cocoon, Local, Claude Code) |
| LLM abstraction | @mariozechner/pi-ai (pi-ai) |
| Memory database | SQLite via better-sqlite3, schema v1.13.0 |
| Vector search | sqlite-vec (384-dim) + FTS5 hybrid |
| Telegram layer | GramJS Layer 222 (TONresistor/gramjs fork) |
| Blockchain | TON — wallet, transfers, DEX (STON.fi, DeDust), DNS |
| Extension points | Plugin SDK, MCP servers (stdio + Streamable HTTP + SSE), Bot SDK |
| Optional WebUI | Hono + React dashboard on port 7777 |
| Logging | Pino structured logging (redacts secrets) |
Source Directory Layout
All source lives under src/. The table below describes the entry point and every subdirectory.
| Directory | Purpose |
|---|---|
| src/index.ts | TeletonApp class — top-level wiring: config load, registry, bridge, memory initialization, start/stop lifecycle |
| src/agent/ | Core agent: runtime.ts (agentic loop, processMessage()), lifecycle.ts (state machine), client.ts (pi-ai wrapper, model resolution), tools/ (registry, categories, Tool RAG index) |
| src/telegram/ | GramJS client wrapper (bridge.ts), message dispatcher (handlers.ts), admin commands (admin.ts), group debouncer (debounce.ts), Telegram formatting, scheduled task executor |
| src/memory/ | SQLite schema & migrations (schema.ts), knowledge indexing, hybrid RAG search, context builder, compaction manager, observation masking, daily logs |
| src/session/ | Per-chat session store (SQLite), transcript read/write, memory hook (save on reset), session migration |
| src/soul/ | System prompt assembly (loader.ts): SOUL.md, SECURITY.md, STRATEGY.md, MEMORY.md, workspace section, response format, owner/user identity, context injection |
| src/config/ | Zod config schema (schema.ts), provider registry (providers.ts), shared model catalog (model-catalog.ts), YAML loader, configurable-keys helper |
| src/providers/ | Provider-specific credential helpers — claude-code-credentials.ts (OAuth token rotation from Claude Code install) |
| src/ton/ | TON wallet service, transfer builder, payment verifier, TonAPI client, endpoint configuration |
| src/bot/ | Bot SDK runtime: InlineRouter (callback query dispatch), GramJSBotClient, PluginRateLimiter, styled keyboard builder |
| src/sdk/ | Plugin SDK factory — frozen, namespaced service objects: sdk.ton, sdk.telegram, sdk.bot, sdk.db, sdk.secrets, sdk.storage |
| src/deals/ | Deals & Escrow module (built-in): Grammy bot, NFT gift marketplace, TON payment verification, deal state machine |
| src/cocoon/ | Cocoon Network adapter — XML tool injection into system prompt, <tool_call> parser for Qwen3 models that do not support native tool calling |
| src/workspace/ | Workspace path constants (paths.ts), directory initialization for ~/.teleton/workspace/ |
| src/webui/ | Optional Hono HTTP server: REST API routes (agent control, config, token usage, MCP status, plugin marketplace), SSE status stream, setup wizard backend |
| src/cli/ | CLI entry point (teleton), start/stop/setup/onboard commands, interactive setup wizard |
| src/constants/ | Shared numeric constants: limits, timeouts, tool name sets, API endpoints |
| src/utils/ | Pino logger factory, fetch with timeout, sanitize (prompt injection defense), retry helper, error utilities, GramJS bigint helpers |
Startup Sequence
The TeletonApp constructor runs synchronously, followed by an async start() call. Steps occur in strict order:
- Config load — YAML parsed and validated with Zod (src/config/schema.ts). TonAPI / Toncenter keys applied globally.
- Soul load — SOUL.md read from ~/.teleton/workspace/ (falls back to the built-in default).
- Tool registry — registerAllTools() registers 9 static categories (Telegram, TON, DNS, STON.fi, DeDust, Journal, Workspace, Web, Bot); loadModules() adds 2 built-in modules: Deals and Exec.
- Memory init — SQLite database opened at ~/.teleton/memory.db, schema migrations run, sqlite-vec loaded if the embedding provider is not none.
- WebUI start (optional) — Hono server starts on port 7777 before the agent, so it survives agent stop/restart.
- Plugin load — external plugins scanned from ~/.teleton/plugins/, SDK injected, tools registered.
- MCP connect — configured MCP servers connected (stdio, Streamable HTTP, or SSE); their tools registered into the registry.
- Tool RAG index — all tools indexed into the tool_embeddings table (sqlite-vec + FTS5) for semantic retrieval.
- Knowledge index — MEMORY.md and memory/*.md files chunked and indexed.
- Telegram connect — GramJS connects to Telegram servers; owner name/username auto-resolved from the API if not yet persisted.
- Modules start — each module's start(pluginContext) is called (Deals bot connects, background jobs begin).
- Message loop — bridge.onNewMessage() registered; debouncer activated for group messages.
AgentLifecycle State Machine
The AgentLifecycle class (src/agent/lifecycle.ts) manages the agent's operational state and is used by the WebUI to trigger start/stop without direct access to TeletonApp.
| State | Meaning |
|---|---|
| stopped | Initial state and state after a clean shutdown. |
| starting | Connecting to Telegram, loading plugins, indexing tools. Concurrent start() calls return the existing promise. |
| running | Agent is connected and processing messages. |
| stopping | Flushing debouncer, draining in-flight messages, stopping modules, disconnecting bridge. |
State changes are emitted as stateChange events (EventEmitter), consumed by the WebUI SSE endpoint for live status updates. The lifecycle also tracks uptime (seconds since entering running) and the last error message if startup failed.
Calling stop() while in starting state waits for startup to complete before initiating shutdown. Calling start() while stopping throws — the state machine enforces a clean transition sequence.
Message Processing Pipeline
Every inbound Telegram message passes through a fixed sequence of stages before the LLM sees it.
Stage 1 — Ingest & Debounce
The GramJS bridge fires onNewMessage for each new message (regular and service messages like gift offers). All messages are enqueued into MessageDebouncer. DMs and admin commands bypass debouncing; group messages are held for debounce_ms (default 1500 ms) and batched, protecting against rapid-fire sequences. The debouncer buffers up to 20 messages per chat.
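The batching behavior can be sketched as follows. This is a simplified model of the real MessageDebouncer in src/telegram/debounce.ts; the class shape and callback signature here are assumptions, only the DM bypass, the 1500 ms window, and the 20-message cap come from the text.

```typescript
type Message = { chatId: number; text: string; isDM: boolean };

// Simplified debouncer sketch: DMs dispatch immediately, group messages are
// held for `debounceMs` after the last arrival and delivered as one batch.
class Debouncer {
  private buffers = new Map<number, Message[]>();
  private timers = new Map<number, ReturnType<typeof setTimeout>>();

  constructor(
    private debounceMs: number,
    private maxBuffered: number,
    private onBatch: (chatId: number, batch: Message[]) => void,
  ) {}

  enqueue(msg: Message): void {
    if (msg.isDM) {
      this.onBatch(msg.chatId, [msg]); // DMs bypass debouncing
      return;
    }
    const buf = this.buffers.get(msg.chatId) ?? [];
    buf.push(msg);
    if (buf.length > this.maxBuffered) buf.shift(); // keep only the newest N
    this.buffers.set(msg.chatId, buf);
    const prev = this.timers.get(msg.chatId);
    if (prev) clearTimeout(prev); // each new message restarts the window
    this.timers.set(msg.chatId, setTimeout(() => this.flush(msg.chatId), this.debounceMs));
  }

  flush(chatId: number): void {
    const timer = this.timers.get(chatId);
    if (timer) clearTimeout(timer);
    this.timers.delete(chatId);
    const buf = this.buffers.get(chatId);
    this.buffers.delete(chatId);
    if (buf?.length) this.onBatch(chatId, buf);
  }
}
```

The restart-on-arrival timer is what protects against rapid-fire sequences: a burst of group messages produces one batch, not one agent invocation per message.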
Stage 2 — Admin Command Check
AdminHandler inspects messages starting with /. If the sender is in admin_ids, built-in commands (/pause, /resume, /reset, /status, /boot, /task, etc.) are dispatched directly. /boot and /task fall through to the agent with injected context. Messages arriving while paused are silently dropped (admin commands still execute).
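The routing decision can be approximated by the sketch below. The function name and the behavior of /boot and /task while paused are assumptions; the command list and the pause semantics for normal messages come from the text.

```typescript
// Sketch of the Stage 2 routing decision in AdminHandler.
function routeMessage(
  text: string,
  senderId: number,
  adminIds: number[],
  paused: boolean,
): "admin" | "agent" | "drop" {
  const isAdminCommand = text.startsWith("/") && adminIds.includes(senderId);
  if (isAdminCommand) {
    const cmd = text.split(/\s+/)[0];
    // /boot and /task fall through to the agent with injected context
    if (cmd === "/boot" || cmd === "/task") return "agent";
    return "admin"; // /pause, /resume, /reset, /status, ... dispatch directly
  }
  return paused ? "drop" : "agent"; // paused: silently drop normal traffic
}
```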
Stage 3 — Scheduled Task Check
Messages from the agent's own user ID matching the format [TASK:uuid] are intercepted and dispatched to the scheduled task executor rather than the normal message handler. The task is loaded from SQLite, dependency checks run, and the agent is invoked with task-specific context.
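The interception check amounts to a prefix match on the agent's own outbound messages. The exact regex below is an assumption; the [TASK:uuid] format and the self-sender requirement come from the text.

```typescript
// Sketch of the Stage 3 check: extract a task id from a [TASK:uuid] prefix,
// but only for messages sent by the agent's own user ID.
const TASK_PREFIX = /^\[TASK:([0-9a-fA-F-]{36})\]/;

function extractTaskId(text: string, senderId: number, selfId: number): string | null {
  if (senderId !== selfId) return null; // only the agent's own messages qualify
  const match = TASK_PREFIX.exec(text);
  return match ? match[1] : null;
}
```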
Stage 4 — Session & Context Load
AgentRuntime.processMessage() loads or creates the chat session. If a daily reset policy triggers (daily_reset_hour, idle_expiry_minutes), the existing session memory is saved before the session is cleared. The conversation transcript is loaded from the session file.
Stage 5 — RAG Context Retrieval
For non-trivial messages, the user message is embedded (384-dim vector). The ContextBuilder runs a hybrid search (sqlite-vec cosine + FTS5 BM25) against the knowledge base and Telegram feed history, surfacing up to 5 relevant chunks. Results are sanitized with sanitizeForContext() before injection into the system prompt. Memory statistics (message count, chat count, knowledge chunks) are also injected.
Stage 6 — System Prompt Assembly
buildSystemPrompt() in src/soul/loader.ts assembles the prompt in section order:
- Soul (SOUL.md)
- Security rules (SECURITY.md, if present)
- Strategy (STRATEGY.md, DM only)
- Workspace description (tool list, directory layout)
- Response format guidelines
- Owner identity (if configured)
- Persistent memory (MEMORY.md + recent daily logs, DM only)
- Current user identity
- RAG context (relevant knowledge + feed)
- Memory flush warning (when approaching compaction threshold)
Stage 7 — Preemptive Compaction
Before calling the LLM, CompactionManager checks if the conversation context exceeds the soft threshold (50% of model context window, or 64K tokens). If so, older messages are summarized using a cheaper utility model and replaced with a compact summary. The session transcript is updated atomically.
Stage 8 — Tool RAG Selection
With 132 built-in tools, sending all tools to every LLM call is impractical for providers with tool limits. The Tool RAG system selects the most relevant tools for the current message:
- The message embedding is compared against the tool index using hybrid search (60% vector weight, 40% keyword weight, minimum score 0.1).
- The top top_k=25 tools are retrieved, plus any always_include patterns (defaults: telegram_send_message, telegram_reply_message, telegram_send_photo, telegram_send_document, journal_*, workspace_*, web_*).
- For providers with no tool limit (Anthropic, Claude Code), Tool RAG is skipped by default — all tools are sent.
- Tool RAG is also skipped for trivial messages (single-word acknowledgements, emoji-only, etc.).
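Under those rules, the selection step can be sketched like this. The scoring inputs and the glob-style wildcard matching are illustrative assumptions; top_k, the minimum score, and the always_include behavior come from the text.

```typescript
// Sketch of Tool RAG selection: score-filtered top-k, then always_include
// patterns appended even if they fell below the score or rank cutoff.
function selectTools(
  scored: { name: string; score: number }[],
  alwaysInclude: string[],
  topK = 25,
  minScore = 0.1,
): string[] {
  const matches = (name: string, pattern: string) =>
    pattern.endsWith("*") ? name.startsWith(pattern.slice(0, -1)) : name === pattern;
  const selected = scored
    .filter((t) => t.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((t) => t.name);
  for (const { name } of scored) {
    if (!selected.includes(name) && alwaysInclude.some((p) => matches(name, p))) {
      selected.push(name); // core communication tools are never filtered out
    }
  }
  return selected;
}
```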
Stage 9 — Agentic Loop
The LLM is called via chatWithContext(). The loop iterates up to max_agentic_iterations (default 5) times:
- Observation masking applied: tool results older than the last 10 are replaced with placeholder text to reduce token usage while preserving call structure.
- pi-ai complete() is called with the assembled context and selected tools.
- If the response contains tool calls, each is dispatched to the ToolRegistry with a full ToolContext (bridge, db, chatId, senderId, config).
- Tool results exceeding 50,000 characters are truncated; the summary/message field is preserved when available.
- Results are appended to the context. For the Cocoon provider, results are wrapped in <tool_response> tags in a user message (Qwen3 XML format).
- The loop continues until no tool calls are returned or the iteration limit is reached.
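The observation-masking step at the top of the loop can be sketched as follows. This is a simplified model assuming tool results are fields on transcript messages; the actual representation in the codebase may differ.

```typescript
type TranscriptMsg = { role: string; content: string; toolResult?: string };

// Sketch of observation masking: all but the most recent `keepRecent` tool
// results are replaced with a placeholder, preserving the call structure
// while shrinking token usage.
function maskObservations(transcript: TranscriptMsg[], keepRecent = 10): TranscriptMsg[] {
  const toolIndices = transcript
    .map((m, i) => (m.toolResult !== undefined ? i : -1))
    .filter((i) => i >= 0);
  const toMask = new Set(toolIndices.slice(0, Math.max(0, toolIndices.length - keepRecent)));
  return transcript.map((m, i) =>
    toMask.has(i) ? { ...m, toolResult: "[old tool result elided]" } : m,
  );
}
```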
The loop handles three error classes automatically: context overflow (session reset + retry), rate limiting (exponential backoff, up to 3 retries), and 5xx server errors (exponential backoff, up to 3 retries).
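The backoff behavior for the two retryable classes can be sketched generically. The delay base and the shape of the error object are assumptions; the 429/5xx classification and the 3-retry cap come from the text.

```typescript
// Sketch of the retry policy: exponential backoff for rate limits (429)
// and 5xx server errors, up to `maxRetries` attempts after the first.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      const status = (err as { status?: number }).status ?? 0;
      const retryable = status === 429 || (status >= 500 && status < 600);
      if (!retryable || attempt >= maxRetries) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt)); // 1s, 2s, 4s
    }
  }
}
```

Context overflow is the one class not handled by retry alone: it resets the session first, then retries, because replaying the same oversized context would fail again.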
Stage 10 — Post-Loop Compaction & Response
After the loop, a second compaction check runs on the updated context (hard threshold: 75% of context window, 200 messages). Token usage is accumulated across all iterations and logged. The final text content is returned as the agent response, which the message handler sends back to Telegram. If the agent used a Telegram send tool directly, the text response is suppressed to avoid duplicate messages.
Tool System
Tools are the agent's primary action interface. Each tool is a TypeScript object with a JSON Schema definition and an executor function. The ToolRegistry manages registration, scope filtering, permission overrides, and execution.
Tool Categories (132 built-in tools)
| Category | Count | Source | Description |
|---|---|---|---|
| Telegram | 76 | src/agent/tools/telegram/ | Send messages, media, polls, stickers, scheduled messages, group management, contacts, gift marketplace, stories, reactions |
| TON Blockchain | 15 | src/agent/tools/ton/ | Wallet balance, transfers, transaction history, TonAPI queries, payment verification, Jetton info |
| DNS & Domains | 8 | src/agent/tools/dns/ | Resolve .ton domains, set/get DNS records, manage TON Site ADNL records |
| STON.fi | 5 | src/agent/tools/stonfi/ | DEX swap, quote, liquidity pool info, asset list |
| DeDust | 5 | src/agent/tools/dedust/ | DEX swap, quote, vault info, pool queries |
| Journal | 3 | src/agent/tools/journal/ | Append to daily log, write session note, read recent memory |
| Workspace | 6 | src/agent/tools/workspace/ | List, read, write, delete, rename files in ~/.teleton/workspace/; workspace stats |
| Web | 2 | src/agent/tools/web/ | Fetch URL content, web search |
| Bot | 1 | src/agent/tools/bot/ | Send inline bot message with styled buttons (Bot SDK) |
| Deals & Escrow | 5 | src/deals/module.ts | Create deal, accept/reject, check status, list active deals (TON payment escrow via Grammy bot) |
| Exec | 4 | src/agent/tools/exec/ | Run shell commands, read file, write file, list directory (admin-only scope) |
Tool Scopes
Each tool is registered with a scope that controls when it is available:
| Scope | Available in |
|---|---|
| always | All contexts (DMs and groups) |
| dm-only | Direct messages only |
| group-only | Group chats only |
| admin-only | Sender must be in admin_ids |
Scope can be overridden per-tool at runtime via the tool_config table in memory.db (requires restart to take effect). The registry enforces scope at registration, at getForContext() filtering, and again inside execute() — three independent checkpoints.
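The per-context check that all three checkpoints apply reduces to a small predicate. This is a sketch of the rule, not the registry's actual code.

```typescript
type Scope = "always" | "dm-only" | "group-only" | "admin-only";

// Sketch of the scope rule enforced at registration, getForContext()
// filtering, and execute().
function isAvailable(
  scope: Scope,
  ctx: { isDM: boolean; senderId: number; adminIds: number[] },
): boolean {
  switch (scope) {
    case "always":
      return true;
    case "dm-only":
      return ctx.isDM;
    case "group-only":
      return !ctx.isDM;
    case "admin-only":
      return ctx.adminIds.includes(ctx.senderId);
  }
}
```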
Tool RAG Index
At startup, all registered tools (including plugins and MCP tools) are indexed into the tool_embeddings table. The index uses the same hybrid approach as memory search: sqlite-vec for semantic similarity and FTS5 for keyword matching. Hot-reload plugins trigger a differential re-index (removed tools deleted, new tools added).
The default top_k=25 can be adjusted in config under tool_rag.top_k. The always_include list ensures core communication tools are never filtered out, regardless of the message content.
Memory & RAG
All persistent state lives in a single SQLite database at ~/.teleton/memory.db, currently at schema version 1.13.0. The database uses sqlite-vec for 384-dimensional vector storage and FTS5 for full-text search.
Key Tables
| Table | Purpose |
|---|---|
| knowledge | Chunked knowledge base: MEMORY.md, memory/*.md, learned facts. Sources: memory, session, learned. |
| knowledge_fts | FTS5 virtual table over knowledge.text for BM25 keyword search. |
| tg_messages | Telegram message feed (indexed for RAG context retrieval). |
| tg_messages_vec | sqlite-vec companion table — 384-dim embeddings for feed messages. |
| sessions | Per-chat session metadata (sessionId, message count, token usage, provider, model). |
| tool_config | Runtime scope overrides per tool name. |
| tool_embeddings | Tool RAG index — tool name, description embedding, FTS text. |
| deals | Escrow deal state (created, accepted, funded, completed, cancelled, expired). |
| tasks | Scheduled task definitions with dependency graph and status. |
| exec_audit | Audit log for all Exec tool invocations. |
| meta | Schema version and other metadata key-value pairs. |
Embedding Providers
| Provider | Config value | Model | Notes |
|---|---|---|---|
| Local ONNX | local (default) | Xenova/all-MiniLM-L6-v2 | Runs in-process via @xenova/transformers. No API key. Model downloaded on first run. |
| Anthropic | anthropic | voyage-3 (via Anthropic API) | Requires Anthropic API key. Higher quality, API cost applies. |
| None (FTS5 only) | none | — | Disables sqlite-vec. Only keyword search available. No embedding overhead. |
Hybrid Search
The ContextBuilder merges vector and keyword results with reciprocal rank fusion. The minimum hybrid score is 0.15. Vector weight is 0.6, keyword weight is 0.4. When the embedding provider is none, the system falls back to FTS5-only search.
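A weighted reciprocal rank fusion over the two ranked lists might look like the sketch below. The RRF constant (k = 60) is a conventional choice and an assumption here; the 0.6/0.4 weights come from the text, and the minimum-score cutoff would be applied to the fused result.

```typescript
// Sketch of weighted reciprocal rank fusion: each hit contributes
// weight / (k + rank) to its document's fused score.
function fuseRanks(
  vectorHits: string[],  // ids ordered by cosine similarity (best first)
  keywordHits: string[], // ids ordered by BM25 score (best first)
  vectorWeight = 0.6,
  keywordWeight = 0.4,
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  const accumulate = (hits: string[], weight: number) => {
    hits.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  };
  accumulate(vectorHits, vectorWeight);
  accumulate(keywordHits, keywordWeight);
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Fusion over ranks rather than raw scores sidesteps the problem that cosine similarities and BM25 scores live on incomparable scales.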
Compaction
The CompactionManager monitors conversation context size. Two thresholds apply:
- Soft threshold: 50% of model context window — triggers a memory flush warning injected into the system prompt, prompting the agent to write important facts to memory before they are lost.
- Hard threshold: 75% of model context window, or 200 messages — triggers automatic compaction: older messages are summarized by a utility model, replaced with the summary, and the transcript is rewritten.
After daily session reset or context overflow reset, the session memory is saved via saveSessionMemory() before the session is cleared, preserving continuity across resets.
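The two-threshold policy reduces to a small decision function. This sketch uses the percentage thresholds plus the 200-message hard limit; how the alternative 64K-token soft figure interacts with the 50% threshold is not specified, so it is left out here.

```typescript
type CompactionAction = "none" | "warn" | "compact";

// Sketch of the CompactionManager decision described above.
function compactionAction(
  tokens: number,
  messageCount: number,
  contextWindow: number,
): CompactionAction {
  const softLimit = 0.5 * contextWindow;  // inject a memory flush warning
  const hardLimit = 0.75 * contextWindow; // summarize older messages
  if (tokens >= hardLimit || messageCount >= 200) return "compact";
  if (tokens >= softLimit) return "warn";
  return "none";
}
```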
LLM Providers
All LLM calls are routed through @mariozechner/pi-ai, a unified LLM client that handles request formatting, streaming, and response normalization across providers. Teleton adds a thin adapter layer in src/agent/client.ts for provider-specific concerns.
Provider Registry
| Provider ID | Display Name | Default Model | Tool Limit | Notes |
|---|---|---|---|---|
| anthropic | Anthropic (Claude) | claude-opus-4-6 | Unlimited | Recommended default. Tool RAG skipped (all tools sent). |
| claude-code | Claude Code (Auto) | claude-opus-4-6 | Unlimited | Auto-rotates credentials from local Claude Code install (~/.claude/.credentials.json). Retries once on 401. |
| openai | OpenAI (GPT-4o) | gpt-4o | 128 | pi-ai native. |
| google | Google (Gemini) | gemini-2.5-flash | 128 | Tool schemas sanitized for Gemini compatibility (schema-sanitizer.ts). |
| xai | xAI (Grok) | grok-3 | 128 | pi-ai native. |
| groq | Groq | llama-3.3-70b-versatile | 128 | pi-ai native. |
| openrouter | OpenRouter | anthropic/claude-opus-4.5 | 128 | pi-ai native. Routes to 100+ models. |
| moonshot | Moonshot (Kimi K2.5) | k2p5 | 128 | Uses kimi-coding API at api.kimi.com/coding. Alias: kimi-k2.5 maps to k2p5. |
| mistral | Mistral AI | devstral-small-2507 | 128 | pi-ai native. <think> blocks stripped from output. |
| cerebras | Cerebras | qwen-3-235b-a22b-instruct-2507 | 128 | pi-ai native. |
| zai | ZAI (Zhipu) | glm-4.7 | 128 | pi-ai native. |
| minimax | MiniMax | MiniMax-M2.5 | 128 | pi-ai native. |
| huggingface | HuggingFace | deepseek-ai/DeepSeek-V3.2 | 128 | pi-ai native. |
| cocoon | Cocoon Network (Decentralized) | Qwen/Qwen3-32B | 128 | No API key. Pays inference costs in TON. Tools injected as XML in system prompt; <tool_call> parsed from text response. Requires local cocoon daemon. |
| local | Local (Ollama, vLLM, LM Studio...) | auto-discovered | 128 | OpenAI-compatible API. base_url required. Models discovered at startup via /models endpoint. |
Tool Limit Enforcement
At startup, if the total tool count exceeds the active provider's tool limit, a warning is logged. At runtime, getForContext() and getForContextWithRAG() both respect the limit, truncating to the most relevant tools when Tool RAG is active. Providers with toolLimit: null (Anthropic, Claude Code) receive all available tools.
Utility Model
Each provider defines a utilityModel — a fast, cheap model used for compaction summarization and session memory saving. Examples: claude-haiku-4-5-20251001 for Anthropic, gpt-4o-mini for OpenAI, gemini-2.0-flash-lite for Google. The utility model can be overridden in config via agent.utility_model.
Extension Architecture
Plugin SDK
External plugins are placed in ~/.teleton/plugins/. Each plugin is a Node.js module that exports a PluginModule object. At startup, loadEnhancedPlugins() discovers and loads each plugin, injecting a frozen SDK instance.
The SDK exposes four namespaced service objects:
| Namespace | Capabilities |
|---|---|
| sdk.ton | Wallet address, balance, transfers, Jetton info, TonAPI queries |
| sdk.telegram | Send messages, media, stickers, read chat info |
| sdk.bot | Inline keyboard messages, callback query routing via InlineRouter |
| sdk.db | Isolated SQLite database per plugin (proxied ATTACH/DETACH, no cross-plugin access) |
| sdk.secrets | Read/write secret values to ~/.teleton/secrets/ (mode 0o700) |
| sdk.storage | Simple key-value storage in plugin's isolated directory |
Plugin isolation: the SDK object is Object.freeze()-d before injection, the config object is deep-cloned, and the plugin's database is sandboxed via a DB proxy. Plugins may define lifecycle hooks: onMessage, onCallbackQuery, start, stop, configure, migrate.
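The freeze-and-clone step can be sketched in a few lines. The function name is illustrative; note that Object.freeze() is shallow, so the real factory would also need to freeze each namespaced service object it exposes.

```typescript
// Sketch of the plugin isolation step: freeze the SDK surface and
// deep-clone the config before handing them to a plugin.
function prepareForPlugin<T extends object, C extends object>(sdk: T, config: C) {
  return {
    sdk: Object.freeze(sdk),         // plugin cannot add or replace services
    config: structuredClone(config), // plugin mutations stay local to its copy
  };
}
```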
In development mode (dev.hot_reload: true), PluginWatcher monitors the plugins directory with chokidar and performs differential re-registration on file changes — removed tools deleted, new tools indexed into Tool RAG.
MCP Servers
Teleton connects to external MCP (Model Context Protocol) servers at startup. Three transport types are supported:
| Transport | Config field | Use case |
|---|---|---|
| stdio | command | Local processes (npx, Python scripts, executables) |
| Streamable HTTP | url (primary) | Remote HTTP servers, modern MCP implementations |
| SSE | url (fallback) | Legacy SSE-based MCP servers |
MCP tools are registered into the main ToolRegistry under a mcp_<servername> module namespace. They participate in Tool RAG indexing and scope filtering identically to built-in tools. Each MCP server can be assigned a scope (always, dm-only, etc.).
Bot SDK & InlineRouter
The Bot SDK enables plugins to build interactive inline experiences using Telegram's callback query mechanism. The InlineRouter (src/bot/inline-router.ts) routes incoming callbackQuery updates to the plugin that registered the matching action prefix.
Styled buttons use GramJS Layer 222 colored button support (KeyboardButtonColor). The PluginRateLimiter prevents abuse by capping callback query rates per user. The Deals module's Grammy bot instance serves as the underlying bot connection; the InlineRouter middleware is installed on top of it before the Deals module starts.
Data Persistence
All user data is stored under ~/.teleton/. No cloud sync or external storage is used by default.
| Path | Contents | Permissions |
|---|---|---|
| ~/.teleton/memory.db | Main SQLite database: knowledge, sessions, messages, tools, deals, tasks, audit log | Default (0o644) |
| ~/.teleton/telegram_session.txt | GramJS session string (Telegram auth) | Default |
| ~/.teleton/sessions/<id>.json | Per-chat conversation transcripts (pi-ai message arrays) | Default |
| ~/.teleton/wallet.json | TON wallet mnemonic (24-word BIP39 phrase, plaintext JSON). No encryption. | 0o600 (owner read/write only) |
| ~/.teleton/secrets/ | Plugin secret values directory | 0o700 (owner only) |
| ~/.teleton/workspace/ | Agent workspace: SOUL.md, MEMORY.md, STRATEGY.md, SECURITY.md, memory/, downloads/, uploads/, memes/, temp/ | Default |
| ~/.teleton/plugins/ | External plugin directories | Default |
| ~/.teleton/config.yml | Main YAML configuration file | Default |
The wallet file uses writeFileSync({ mode: 0o600 }) on creation. The secrets directory is created with mode: 0o700. The mnemonic is stored as a JSON array of 24 strings — there is no PBKDF2 or encryption at rest. Disk encryption (e.g. LUKS, FileVault) is recommended for production deployments.
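The creation flags translate directly to Node's fs API, as in the sketch below. A temp directory stands in for ~/.teleton, and the two-word mnemonic is a placeholder; the modes are POSIX permissions and have no effect on Windows.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Demonstration of the creation flags described above.
const base = mkdtempSync(join(tmpdir(), "teleton-demo-"));

const walletPath = join(base, "wallet.json");
writeFileSync(walletPath, JSON.stringify({ mnemonic: ["abandon", "ability"] }), {
  mode: 0o600, // owner read/write only
});

mkdirSync(join(base, "secrets"), { mode: 0o700 }); // owner-only directory

const wallet = JSON.parse(readFileSync(walletPath, "utf8"));
```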
Pino logging is configured with a redact list covering apiKey, password, secret, token, and mnemonic fields, preventing credential leakage in log output.
Optional WebUI
When webui.enabled: true, a Hono HTTP server starts on port 7777 (configurable) before the agent. The WebUI survives agent stop/restart cycles — it remains running while the agent state machine transitions.
Key WebUI capabilities:
- Live agent status via SSE stream (state, uptime, last error)
- Start/Stop agent via AgentLifecycle callbacks
- Real-time token usage display (accumulated across all LLM calls)
- Plugin marketplace (browse, install, uninstall from ~/.teleton/plugins/)
- MCP server status (connected, tool count, transport type)
- Config editor (read/write config.yml via the configurable-keys API)
- Setup wizard (provider selection, API key, model, Telegram credentials)
- React frontend built with Vite, served as an SPA from dist/web/
The WebUI is protected by a bearer token (webui.auth_token, auto-generated if omitted). It binds to 127.0.0.1 by default — do not expose it directly to the internet without a reverse proxy and TLS.
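A bearer-token check of this kind might look like the sketch below; the actual Hono middleware and header handling in src/webui/ are assumptions. Using a constant-time comparison avoids leaking token prefixes through timing.

```typescript
import { timingSafeEqual } from "node:crypto";

// Sketch of a bearer-token guard like the one protecting the WebUI routes.
function isAuthorized(authHeader: string | undefined, token: string): boolean {
  if (!authHeader?.startsWith("Bearer ")) return false;
  const presented = Buffer.from(authHeader.slice("Bearer ".length));
  const expected = Buffer.from(token);
  // timingSafeEqual requires equal-length buffers, so compare lengths first
  return presented.length === expected.length && timingSafeEqual(presented, expected);
}
```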