Agentic Loop
The definitive reference for Teleton's Think-Act-Observe reasoning loop -- from message ingestion through session management, context building, tool selection, iterative execution, and response formatting.
1. Overview
Teleton implements a Think-Act-Observe reasoning loop. When the agent receives a message it:
- Think. Reasons about the user's request, the conversation history, and the available tools.
- Act. Optionally calls one or more tools to gather information or perform actions.
- Observe. Inspects the tool results, then decides whether the task is complete or another iteration is needed.
This cycle repeats up to a configurable maximum number of iterations (default: 5). If the LLM produces a final text response or the limit is reached, the loop exits and the response is delivered.
2. Message Entry Point
Every inbound Telegram message passes through a multi-stage pipeline before reaching the agent runtime:
- TelegramBridge.onNewMessage() -- GramJS fires the raw event.
- MessageDebouncer.enqueue() -- group messages are batched within a 1500 ms window; DMs and admin messages are dispatched immediately.
- MessageHandler policy checks:
- DM policy -- allowlist, open, admin-only, or disabled.
- Group policy -- open, allowlist, or disabled.
- Rate-limit enforcement.
- Mention requirement (groups only, if configured).
- If all checks pass, the message is forwarded to AgentRuntime.processMessage().
GramJS event
-> TelegramBridge.onNewMessage()
-> MessageDebouncer.enqueue() # 1500ms batch (groups) / immediate (DMs)
-> MessageHandler # policy + rate-limit + mention checks
  -> AgentRuntime.processMessage() # enters the agentic loop
3. Session Management
Each chat (identified by chatId) is mapped to a UUID-based session via getOrCreateSession(chatId).
Reset Policies
| Policy | Trigger | Default |
|---|---|---|
| daily_reset | Fires at a configured hour each day | 4:00 AM |
| idle_expiry | Fires after N minutes of inactivity | 1440 min (24 h) |
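A sketch of how the two policies might be evaluated together (shouldResetSession and its types are hypothetical; the defaults follow the table above):

```typescript
// Illustrative reset-policy check -- not Teleton's actual API.
interface ResetPolicy {
  dailyResetHour: number;    // default 4 (4:00 AM)
  idleExpiryMinutes: number; // default 1440 (24 h)
}

function shouldResetSession(
  lastActivity: Date,
  now: Date,
  policy: ResetPolicy
): boolean {
  // idle_expiry: too long since the last message
  const idleMinutes = (now.getTime() - lastActivity.getTime()) / 60_000;
  if (idleMinutes >= policy.idleExpiryMinutes) return true;

  // daily_reset: the configured hour passed between the last
  // activity and now (same calendar day, for brevity)
  const resetPoint = new Date(now);
  resetPoint.setHours(policy.dailyResetHour, 0, 0, 0);
  return lastActivity < resetPoint && now >= resetPoint;
}
```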
Reset Procedure
- The old session transcript is summarized by the LLM.
- The summary is saved to long-term memory.
- A new session ID is generated and the conversation starts fresh.
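The three steps can be sketched as follows (function names are illustrative; the summarizer is injected so the flow runs without a live LLM):

```typescript
// Illustrative reset procedure -- not Teleton's actual API.
import { randomUUID } from "node:crypto";

type Summarize = (transcript: string[]) => Promise<string>;

async function resetSession(
  transcript: string[],
  summarize: Summarize,
  saveToLongTermMemory: (summary: string) => Promise<void>
): Promise<string> {
  // 1. Summarize the old session transcript with the LLM
  const summary = await summarize(transcript);
  // 2. Persist the summary to long-term memory
  await saveToLongTermMemory(summary);
  // 3. Start fresh under a new UUID-based session ID
  return randomUUID();
}
```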
Transcript Storage
Every message and tool call is persisted as JSONL:
~/.teleton/sessions/{sessionId}.jsonl
4. Context Building (RAG)
Before every LLM call the system assembles rich context from multiple sources:
- Recent messages -- the 10 most recent messages from the current chat.
- Knowledge base search -- hybrid search (vector + FTS5) returns the top 5 chunks.
- Telegram feed search -- hybrid search returns the top 5 messages from monitored feeds.
- Deduplication -- overlapping results are merged.
- Injection -- results are injected into the prompt as structured sections.
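The hybrid scoring used by these searches can be sketched as follows (the 0.6/0.4 weights match the documented search defaults; hybridScore and topK are illustrative helpers, not Teleton's exported API):

```typescript
// Illustrative hybrid scoring: blend vector similarity with keyword rank.
function hybridScore(
  vectorScore: number,   // similarity from the embedding index
  keywordScore: number,  // normalized FTS5 keyword score
  vectorWeight = 0.6,
  keywordWeight = 0.4
): number {
  return vectorWeight * vectorScore + keywordWeight * keywordScore;
}

// Rank candidates and keep the best k, as in the top-5 chunk retrieval.
function topK<T extends { vectorScore: number; keywordScore: number }>(
  candidates: T[],
  k = 5
): T[] {
  return [...candidates]
    .sort((a, b) =>
      hybridScore(b.vectorScore, b.keywordScore) -
      hybridScore(a.vectorScore, a.keywordScore))
    .slice(0, k);
}
```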
Hybrid scoring formula: 0.6 * vectorScore + 0.4 * keywordScore. Note: memory retrieval uses equal 0.5/0.5 weights -- see Memory System.
5. System Prompt Construction
The system prompt is assembled dynamically by buildSystemPrompt(). The sections are appended in this order:
- Soul personality -- loaded from SOUL.md, or a built-in default if the file does not exist.
- Security rules -- loaded from SECURITY.md (if present).
- Strategy -- loaded from STRATEGY.md (if present, DM only).
- Workspace intro -- brief description of the workspace environment.
- Response format guidelines -- instructions on message length, Markdown usage, etc.
- Owner information -- the configured owner's identity.
- Memory context (DM only):
  - MEMORY.md -- up to 150 lines of persistent memory.
  - Daily logs -- yesterday + today, 100 lines each.
- Current user info -- username, ID, timezone offset.
- RAG search results -- the context built in step 4.
- Memory flush warning -- injected if the context is approaching token limits, prompting the agent to persist important information.
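A condensed sketch of the assembly order (the loaders, placeholder strings, and default soul text are illustrative; only the ordering and the DM-only gating mirror the list above):

```typescript
// Illustrative buildSystemPrompt -- section contents are placeholders.
interface PromptSources {
  soul: string | null;       // SOUL.md, or null if missing
  security: string | null;   // SECURITY.md
  strategy: string | null;   // STRATEGY.md (DM only)
  memory: string | null;     // MEMORY.md + daily logs (DM only)
  userInfo: string;
  ragContext: string;
  flushWarning: string | null; // only when nearing token limits
}

function buildSystemPrompt(src: PromptSources, isDM: boolean): string {
  const sections: (string | null)[] = [
    src.soul ?? "<built-in default personality>", // fallback if SOUL.md absent
    src.security,
    isDM ? src.strategy : null,      // strategy is DM-only
    "<workspace intro>",
    "<response format guidelines>",
    "<owner information>",
    isDM ? src.memory : null,        // memory context is DM-only
    src.userInfo,
    src.ragContext,
    src.flushWarning,
  ];
  return sections.filter((s): s is string => s !== null).join("\n\n");
}
```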
6. Tool Selection
Two modes determine which tools the LLM sees:
All Tools (Tool RAG disabled)
Every tool that passes scope filtering is sent to the LLM. Simple but token-expensive with large tool registries.
Tool RAG (enabled)
Semantic search selects the most relevant tools for the current message:
- The user message is embedded as a vector.
- Hybrid search scores each tool (tool selection weights): 0.6 * vectorScore + 0.4 * keywordScore.
- The top-K tools are returned (default 25).
- Always-include patterns are preserved regardless of score: telegram_send_message, journal_*, workspace_*, web_*.
- Provider-specific tool limits are applied: Anthropic and claude-code impose no tool limit; all other providers are capped at 128 tools.
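A minimal sketch of the always-include filter and the final merge (the glob-to-regex translation and selectTools are illustrative; the pattern list follows the documented defaults):

```typescript
// Default always-include patterns, per the documentation.
const ALWAYS_INCLUDE = ["telegram_send_message", "journal_*", "workspace_*", "web_*"];

function matchesAlwaysInclude(toolName: string, patterns = ALWAYS_INCLUDE): boolean {
  return patterns.some(p => {
    // Escape regex metacharacters, then turn the glob "*" into ".*"
    const re = new RegExp(
      "^" + p.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*") + "$"
    );
    return re.test(toolName);
  });
}

// Merge: top-K semantic hits plus always-included tools, deduplicated.
function selectTools(ranked: string[], allTools: string[], topK = 25): string[] {
  const pinned = allTools.filter(t => matchesAlwaysInclude(t));
  return [...new Set([...ranked.slice(0, topK), ...pinned])];
}
```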
tool_rag:
  enabled: false   # toggle Tool RAG
  top_k: 25        # max tools returned by semantic search
7. The Iteration Loop (Core)
This is the heart of the agent. The pseudocode below describes exactly what happens on each iteration:
iteration = 0
while iteration < max_agentic_iterations: # default 5
# 1. Mask old tool results to save context space
maskOldToolResults(transcript)
# -> keep last 10 results intact
# -> keep error results intact
# -> replace older ones with "[Tool: name - OK]"
# 2. Call LLM via pi-ai library
response = llm.call(
systemPrompt,
transcript, # user + assistant + tool messages
tools # selected tool definitions
)
# Provider-specific handling:
# - Cocoon: injects tool definitions into prompt text
# - Gemini: sanitizes JSON schemas for compatibility
# 3. Handle errors
if response.error == "context_overflow":
archiveTranscript()
resetSession()
retry()
if response.error == 429: # rate limit
exponentialBackoff(maxRetries=3)
# 4. Process tool calls
for toolCall in response.toolCalls:
validate(toolCall, registry)
checkScope(toolCall) # dm-only, admin-only, etc.
checkModulePermissions(toolCall)
result = execute(toolCall, timeout=30_000) # 30s timeout
if result.size > 50KB:
result = truncate(result)
transcript.append(result)
# 5. Decide: continue or break
if response.stopReason == "toolUse" AND toolCalls.length > 0:
iteration++
continue # next iteration
else:
break # done -- return response
The iteration limit (max_agentic_iterations) can be adjusted at runtime via the /loop admin command.
8. Message Envelope Format
Every user message is wrapped in a structured envelope before being added to the transcript. The format varies by context:
Direct Message
[Telegram User (@username, id:123) +2h 2026-02-20 15:30 UTC] <user_message>Hello!</user_message>
Group Message
[Telegram Group (+5m 2026-02-20 15:30)] User: Hello everyone!
Media Message
[photo msg_id=456] [Telegram User (@username, id:123) +2h 2026-02-20 15:30 UTC] <user_message>Check this out</user_message>
The envelope encodes the sender's identity, timezone offset, timestamp, and any attached media type, giving the LLM full situational awareness.
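The DM envelope can be reproduced with a small builder (the function and its types are illustrative, assuming the bracketed-header format shown in the examples):

```typescript
// Illustrative DM envelope builder -- not Teleton's actual API.
interface Sender {
  username: string;
  id: number;
  tzOffsetHours: number; // e.g. +2
}

function buildDmEnvelope(
  sender: Sender,
  timestampUtc: string,   // e.g. "2026-02-20 15:30"
  text: string,
  mediaTag?: string       // e.g. "photo msg_id=456"
): string {
  const tz = `${sender.tzOffsetHours >= 0 ? "+" : ""}${sender.tzOffsetHours}h`;
  const header = `[Telegram User (@${sender.username}, id:${sender.id}) ${tz} ${timestampUtc} UTC]`;
  const media = mediaTag ? `[${mediaTag}] ` : "";
  return `${media}${header} <user_message>${text}</user_message>`;
}
```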
9. Observation Masking
As the conversation grows, old tool results are compressed to prevent context bloat:
- The last 10 tool results are kept intact.
- Error results are always kept intact (regardless of age).
- All older results are replaced with a one-line summary:
  - Success: [Tool: name - OK]
  - Failure: [Tool: name - ERROR - summary]
This achieves approximately 90% size reduction per masked result while preserving the agent's awareness of what tools were called and whether they succeeded.
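The masking pass can be sketched as follows, assuming a simple in-memory list of tool results (the ToolResult shape and the maskOldToolResults signature are illustrative):

```typescript
// Illustrative observation masking: keep the last N results and all
// errors intact, replace older successes with a one-line summary.
interface ToolResult {
  tool: string;
  ok: boolean;
  content: string;
}

function maskOldToolResults(results: ToolResult[], keepLast = 10): ToolResult[] {
  const cutoff = results.length - keepLast;
  return results.map((r, i) => {
    if (i >= cutoff) return r;  // last N results stay intact
    if (!r.ok) return r;        // error results always stay intact
    return { ...r, content: `[Tool: ${r.tool} - OK]` }; // ~90% smaller
  });
}
```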
10. Context Compaction
When observation masking alone is not enough, full context compaction kicks in:
| Threshold | Trigger | Action |
|---|---|---|
| Soft (50% of context window) | Token count exceeds half the model's window | Inject a memory flush warning, prompting the agent to persist important facts to MEMORY.md |
| Hard (200+ messages or 75% of window) | Message count or token count crosses limit | Full compaction (see below) |
Full Compaction Process
- The LLM summarizes the entire conversation so far.
- Old messages are replaced with the summary.
- The last 20 messages are kept intact for continuity.
- A new session ID is generated.
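The two thresholds can be expressed as a small decision function (illustrative; the limits are taken from the table above):

```typescript
// Illustrative compaction decision -- not Teleton's actual API.
type Compaction = "none" | "soft" | "hard";

function compactionLevel(
  tokenCount: number,
  messageCount: number,
  contextWindow: number
): Compaction {
  // Hard: 200+ messages OR tokens at 75% of the context window
  if (messageCount >= 200 || tokenCount >= 0.75 * contextWindow) return "hard";
  // Soft: tokens past half the window -> inject a memory flush warning
  if (tokenCount >= 0.5 * contextWindow) return "soft";
  return "none";
}
```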
11. Response Formatting
After the loop exits, the response is determined as follows:
- If the LLM produced a text response, it is returned to the user.
- If the telegram_send_message tool was used during the loop, the response is empty (the message was already delivered).
- If tool calls were made but no text was produced, a fallback message is returned.
After delivery, the session record is updated with the message count, model name, and provider used.
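The decision order can be sketched as follows (resolveResponse and the fallback wording are hypothetical):

```typescript
// Illustrative post-loop response resolution.
function resolveResponse(
  text: string | null,
  usedSendMessageTool: boolean,
  madeToolCalls: boolean
): string {
  if (text && text.length > 0) return text; // LLM produced a text response
  if (usedSendMessageTool) return "";       // already delivered via telegram_send_message
  if (madeToolCalls) return "Done.";        // fallback message (wording hypothetical)
  return "";
}
```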
12. Group vs DM Differences
The agent behaves differently depending on the chat type:
| Aspect | Direct Message | Group |
|---|---|---|
| Memory | Full (MEMORY.md + daily logs) | None (privacy) |
| Strategy | Included in system prompt | Excluded (privacy) |
| Tool scope | dm-only tools available | group-only tools available |
| Debounce | None (immediate dispatch) | 1500 ms batching window |
| Mention | Not required | Required if configured |
13. Configuration Reference
Key knobs that control the agentic loop:
| Parameter | Default | Notes |
|---|---|---|
| agent.max_agentic_iterations | 5 | Range 1-50. Adjustable at runtime via /loop. |
| agent.max_tokens | 4096 | Max output tokens per LLM call. |
| agent.temperature | 0.7 | LLM sampling temperature. |
| telegram.debounce_ms | 1500 | Group message batching window. |
| tool_rag.enabled | true | Enable semantic tool selection. |
| tool_rag.top_k | 25 | Max tools returned by Tool RAG. |
| Compaction: message limit | 200 | Hard compaction after 200 messages. |
| Compaction: token threshold | 75% | Hard compaction at 75% of context window. |
| Compaction: keep last | 20 | Messages preserved after compaction. |
agent:
  max_agentic_iterations: 5
  max_tokens: 4096
  temperature: 0.7
telegram:
  debounce_ms: 1500
tool_rag:
  enabled: false
  top_k: 25
14. Error Handling
The loop is designed to recover gracefully from a range of failures:
| Error | Detection | Recovery |
|---|---|---|
| Context overflow | LLM returns a context-length error | Archive the transcript, reset the session, and retry the message. |
| Tool timeout | Execution exceeds 30 seconds | Return an error result to the LLM so it can reason about the failure. |
| Rate limit (429) | HTTP 429 from provider | Exponential backoff, up to 3 retries. |
| LLM provider error | Non-429 API error | Retry once; if persistent, return a fallback error message. |
| Corrupt transcript | Malformed JSONL entries detected | Auto-sanitize: strip invalid entries and continue. |
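The 429 path can be sketched as a generic backoff wrapper (withBackoff and its delay schedule are illustrative; only maxRetries = 3 comes from the table above):

```typescript
// Illustrative exponential backoff for rate-limited LLM calls.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000,
  sleep: (ms: number) => Promise<void> = ms => new Promise(r => setTimeout(r, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err: any) {
      const isRateLimit = err?.status === 429;
      // Non-429 errors and exhausted retries propagate to the caller.
      if (!isRateLimit || attempt >= maxRetries) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 1s, 2s, 4s, ...
    }
  }
}
```

The sleep function is injected so the schedule can be tested (or jittered) without real delays.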