Pi Coding Agent: Architecture, Agent Loop, Extension System & Ecosystem
Overview
Pi is an open-source, MIT-licensed, terminal-based AI coding agent created by Mario Zechner (@badlogic), best known as the creator of the libGDX game framework. The project lives at badlogic/pi-mono and its playful official website is shittycodingagent.ai — a self-deprecating name that inadvertently became its brand identity.
Pi’s founding thesis: context engineering is paramount. By shipping with only 4 built-in tools and a system prompt under 1,000 tokens, Pi maximizes the context window available for actual code and project information. Everything else is delegated to an extension system, making Pi radically extensible without being opinionated.
As of March 2026: 28.3K GitHub stars, 3K forks, 3,400+ commits, 181 releases (latest v0.63.1), 349+ npm dependents, and 134+ contributors. Pi powers OpenClaw (180K+ GitHub stars), one of the most-starred open-source projects of early 2026.
Origins & History
Zechner built Pi after growing frustrated with Claude Code, which he described as having become “a spaceship with 80% of functionality I have no use for”. The system prompt and tools changed on every release, breaking his workflows and altering model behavior unpredictably.
Timeline:
- August 2025: `pi-mono` monorepo created with foundational TypeScript/npm workspace structure
- November 30, 2025: Public launch via blog post “What I learned building an opinionated and minimal coding agent”
- December 2, 2025: Pi appeared on the Terminal-Bench 2.0 leaderboard
- January 31, 2026: Armin Ronacher (Flask creator) published “Pi: The Minimal Agent Within OpenClaw”, giving Pi prominent external validation
- Early 2026: Pi became the engine powering OpenClaw, reaching massive visibility
- February 2026: Peter Steinberger (OpenClaw creator) joined OpenAI to build personal agents
- March 2026: Rapid daily releases continue; v0.63.1 with Gemini 3.1 Pro Preview support
Monorepo Architecture
Pi is organized as a layered monorepo with strict dependency enforcement (Nader Dabit’s deep dive):
┌─────────────────────────────────────────────────┐
│ Applications Layer │
│ pi-coding-agent │ pi-mom │ pi-pods │ pi-web-ui │
├─────────────────────────────────────────────────┤
│ UI Layer │
│ pi-tui (Terminal UI) │
├─────────────────────────────────────────────────┤
│ Agent Layer │
│ pi-agent-core (Agent Loop) │
├─────────────────────────────────────────────────┤
│ Foundation Layer │
│ pi-ai (Unified Multi-Provider LLM API) │
└─────────────────────────────────────────────────┘
| Package | Purpose |
|---|---|
| pi-ai | Unified LLM API supporting 15+ providers with streaming, tool calling (TypeBox schemas), cross-provider context handoffs, thinking/reasoning support, and cost/token tracking |
| pi-agent-core | Agent loop: tool execution, AJV validation, event streaming, state management, message queuing (steering + follow-up) |
| pi-coding-agent | Full CLI runtime: 4 built-in tools, JSONL session persistence, AGENTS.md context files, auto-compaction, extension/skill/theme system |
| pi-tui | Terminal UI with differential rendering, synchronized output, markdown rendering, live streaming diffs |
| pi-web-ui | Web components for AI chat interfaces |
| pi-mom | Slack bot that delegates to the coding agent |
| pi-pods | CLI for managing vLLM GPU pod deployments |
The Agent Loop
Core Mechanics
The agent loop in pi-agent-core follows the standard agentic pattern: send messages to the LLM → execute tool calls → feed results back → repeat until the model stops calling tools. A critical design decision: the loop has no max-steps, no timeouts, no iteration limits. As Zechner writes: “The agent loop just loops until the agent says it’s done.”
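The loop shape described above can be sketched in a few lines. This is a hypothetical illustration, not pi-agent-core's actual code; the `ToolCall`/`LlmResponse` types and function names are assumptions.

```typescript
// Minimal sketch of the agentic loop: call the LLM, run any requested
// tools, feed results back, and repeat until a response has no tool calls.
type ToolCall = { name: string; args: unknown };
type LlmResponse = { text: string; toolCalls: ToolCall[] };

async function agentLoop(
  messages: unknown[],
  callLlm: (msgs: unknown[]) => Promise<LlmResponse>,
  runTool: (call: ToolCall) => Promise<string>,
): Promise<string> {
  // Deliberately no max-steps, no timeout, no iteration limit:
  // the loop runs until the model stops calling tools.
  while (true) {
    const res = await callLlm(messages);
    messages.push({ role: "assistant", content: res.text, toolCalls: res.toolCalls });
    if (res.toolCalls.length === 0) return res.text; // natural termination
    for (const call of res.toolCalls) {
      const result = await runTool(call);
      messages.push({ role: "toolResult", name: call.name, content: result });
    }
  }
}
```

The only exit condition is the model itself producing a plain response, which matches Pi's "loops until the agent says it's done" design.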
Event Lifecycle
The full lifecycle (extensions documentation):
pi starts
│
├── session_directory (CLI startup only)
├── session_start
│
▼
user sends prompt ──────────────────────────────────┐
│ │
├── input (can intercept/transform/handle) │
├── before_agent_start (inject messages, modify │
│ system prompt) │
├── agent_start │
│ ├── message_start / message_update / message_end │
│ │ │
│ │ ┌── turn (repeats while LLM calls tools) ───┐
│ │ │ │
│ │ ├── turn_start │
│ │ ├── context (modify messages before LLM) │
│ │ ├── before_provider_request │
│ │ │ LLM responds, may call tools: │
│ │ │ ├── tool_execution_start │
│ │ │ ├── tool_call (can block) │
│ │ │ ├── tool_execution_update │
│ │ │ ├── tool_result (can modify) │
│ │ │ └── tool_execution_end │
│ │ └── turn_end │
│ │ │
│ └── agent_end │
│ │
user sends another prompt ◄─────────────────────────┘
Execution Flow (Step-by-Step)
1. User input received — the `input` event fires; extensions can intercept/transform
2. Skill/template expansion — if input matches a `/skill:name` pattern, Pi expands it lazily
3. `before_agent_start` — extensions inject messages or modify the system prompt (RAG, memory, dynamic context)
4. `agent_start` — core loop begins
5. Turn begins — the `context` event fires (extensions rewrite message history), then `before_provider_request`
6. LLM call via `streamFn()` — provider-agnostic streaming layer normalizes responses into `text_delta`, `thinking_delta`, `toolcall_delta`, `done`, `error`
7. Tool call detection — parameters validated against TypeBox schemas via AJV; validation errors returned as tool results for self-correction
8. Tool execution — dual-payload results: `content` (LLM-visible) + `details` (UI-only)
9. Results fed back — loop returns to step 5 for another turn
10. Termination — when the LLM produces a response without tool calls, `agent_end` fires; follow-up messages in the queue trigger new runs
No Built-in Planning Mode
Pi deliberately rejects planning modes, ReAct frameworks, and chain-of-thought scaffolding. Instead it relies on:
- The LLM’s inherent reasoning ability — “All frontier models have been RL-trained extensively, so they inherently understand what a coding agent is”
- File-based plans — write plans to `PLAN.md` files for observability and cross-session persistence
- Thinking levels — five tiers (`minimal`, `low`, `medium`, `high`, `xhigh`) control model reasoning depth, specified via a model name suffix (e.g., `sonnet:high`)
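The suffix syntax can be split with a small parser. This sketch is an assumption about how a spec like `sonnet:high` decomposes; `parseModelSpec` is a hypothetical name, not Pi's API.

```typescript
// Split a model spec like "sonnet:high" into a model name and an
// optional thinking level (one of the five documented tiers).
const LEVELS = ["minimal", "low", "medium", "high", "xhigh"] as const;
type ThinkingLevel = (typeof LEVELS)[number];

function parseModelSpec(spec: string): { model: string; thinking?: ThinkingLevel } {
  const idx = spec.lastIndexOf(":");
  if (idx === -1) return { model: spec };
  const suffix = spec.slice(idx + 1);
  if ((LEVELS as readonly string[]).includes(suffix)) {
    return { model: spec.slice(0, idx), thinking: suffix as ThinkingLevel };
  }
  // The suffix is part of the model name, not a thinking level.
  return { model: spec };
}
```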
The Four Core Tools
| Tool | Description |
|---|---|
| read | Read file contents and images; supports directory listing |
| write | Create or overwrite files completely |
| edit | Surgical find/replace text edits with progressive output parsing |
| bash | Execute shell commands (default 30s timeout) |
Additional tools (`grep`, `find`, `ls`) exist but are disabled by default. Extensions can register custom tools via `pi.registerTool()` or override built-in tools by registering with the same name. The `--no-tools` flag starts Pi with zero built-in tools for fully custom setups.
Context Management
Auto-Compaction:
- Triggers when `contextTokens > contextWindow - reserveTokens` (default reserve: 16,384 tokens)
- Walks backwards through messages to the `keepRecentTokens` threshold (default: 20,000 tokens)
- LLM generates a structured summary (goal, constraints, progress, decisions, next steps, read/modified files)
- Summary replaces older messages in-memory; full history retained in the JSONL file
- Customizable via the `session_before_compact` extension hook
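The trigger condition and walk-back can be sketched directly from the documented defaults. The message shape and function names here are assumptions for illustration only.

```typescript
// Compaction trigger: fires when the context no longer leaves the
// reserved headroom (default reserveTokens = 16384).
function shouldCompact(contextTokens: number, contextWindow: number, reserveTokens = 16384): boolean {
  return contextTokens > contextWindow - reserveTokens;
}

interface Msg { tokens: number }

// Walk backwards until keepRecentTokens (default 20000) is reached;
// everything earlier becomes input to the LLM-generated summary (not shown).
function splitForCompaction(messages: Msg[], keepRecentTokens = 20000): { toSummarize: Msg[]; toKeep: Msg[] } {
  let kept = 0;
  let cut = messages.length;
  while (cut > 0 && kept + messages[cut - 1].tokens <= keepRecentTokens) {
    kept += messages[cut - 1].tokens;
    cut--;
  }
  return { toSummarize: messages.slice(0, cut), toKeep: messages.slice(cut) };
}
```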
Session Persistence:
- Append-only JSONL files with tree structure via `id`/`parentId` fields
- Mutable `leafId` pointer enables in-place branching without creating new files
- `/tree` navigates session history; `/fork` creates a new session from a branch point
- Sessions auto-save to `~/.pi/agent/sessions/`, organized by working directory
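Resolving the active conversation from such a tree is a parent-pointer walk. The entry shape below is a hypothetical simplification of the JSONL records.

```typescript
// Given tree-structured entries (id/parentId) and the mutable leafId
// pointer, reconstruct the active branch: follow parent links from the
// leaf to the root, then reverse. Sibling branches are simply skipped.
interface Entry { id: string; parentId: string | null; text: string }

function activeBranch(entries: Entry[], leafId: string): Entry[] {
  const byId = new Map<string, Entry>(entries.map((e) => [e.id, e]));
  const branch: Entry[] = [];
  for (let cur = byId.get(leafId); cur; cur = cur.parentId ? byId.get(cur.parentId) : undefined) {
    branch.push(cur);
  }
  return branch.reverse();
}
```

Because branching only moves the `leafId` pointer, a "side quest" branch stays in the same file without polluting the active context.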
Cross-Provider Context Handoffs:
- Mid-session model switching via `/model` or `Ctrl+L` across 15+ providers
- Thinking traces converted to plain text for provider compatibility
Human-in-the-Loop
Pi intentionally ships without permission popups. Instead:
- Extension-based approval gates via the `tool_call` event: block dangerous operations with `ctx.ui.confirm()`
- Steering messages (`Enter`): delivered after the current tool, interrupting remaining tools
- Follow-up messages (`Alt+Enter`): queued until the agent finishes, then delivered
- Abort via `Escape`, `/stop`, or `ctx.abort()`
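The two delivery modes amount to a queue drained at different points in the loop. This is a sketch of the mechanic, not pi-agent-core's implementation; the class and method names are assumptions.

```typescript
// Steering messages are delivered as soon as the current tool finishes
// (remaining tool calls are dropped); follow-ups wait for the whole run.
type Delivery = "steer" | "followUp";
interface Queued { text: string; mode: Delivery }

class MessageQueue {
  private items: Queued[] = [];

  push(text: string, mode: Delivery): void {
    this.items.push({ text, mode });
  }

  // Called after each tool execution: hand over steering messages now.
  drainSteering(): Queued[] {
    const steer = this.items.filter((q) => q.mode === "steer");
    this.items = this.items.filter((q) => q.mode !== "steer");
    return steer;
  }

  // Called after agent_end: whatever remains seeds the next run.
  drainFollowUps(): Queued[] {
    const rest = this.items;
    this.items = [];
    return rest;
  }
}
```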
Termination
- Natural: loop exits when LLM produces response without tool calls — no step limits, no token limits on iterations
- User-initiated: `Escape`, `/stop`, steering messages, `ctx.abort()`
- Context overflow: auto-compaction recovers and retries
Extension & Plugin System
Architecture: TypeScript Modules with Event-Driven API
Pi’s extension model is custom — not based on VSCode extensions, LSP, or any existing standard. Extensions are TypeScript modules that export a default function receiving an ExtensionAPI object:
```typescript
import type { ExtensionAPI } from "@mariozechner/pi-coding-agent";

export default function (pi: ExtensionAPI) {
  pi.on("tool_call", async (event, ctx) => {
    // intercept tool calls
  });
  pi.registerTool({ /* custom tool */ });
  pi.registerCommand("mycommand", { /* slash command */ });
}
```

Extensions are loaded via jiti (just-in-time TypeScript transpiler) — no compilation step needed. Save a `.ts` file, run `/reload`, and it’s live.
Discovery Locations
| Location | Scope |
|---|---|
| `~/.pi/agent/extensions/*.ts` | Global (single files) |
| `~/.pi/agent/extensions/*/index.ts` | Global (directories) |
| `.pi/extensions/*.ts` | Project-local |
| `.pi/extensions/*/index.ts` | Project-local (directories) |
| `settings.json` → `extensions` array | Custom paths, npm/git packages |
| `-e ./path.ts` CLI flag | Quick testing without installing |
What Plugins Can Do
- Custom Tools — register LLM-callable tools via `pi.registerTool()` with TypeBox parameter schemas
- Slash Commands — register interactive commands (e.g., `/stats`) via `pi.registerCommand()`
- Keyboard Shortcuts — register hotkeys via `pi.registerShortcut()`
- Model Providers — dynamically register/override LLM providers, including OAuth flows, via `pi.registerProvider()`
- Custom UI Components — full TUI components, overlays, status bar, widgets via `ctx.ui.custom()`
- Message Renderers — custom display for tool calls/results via `pi.registerMessageRenderer()`
- CLI Flags — custom runtime flags via `pi.registerFlag()`
- Event Interception — block/modify tool calls, inject LLM context, intercept user input, customize compaction
- Sub-agents — spawn Pi instances via tmux or `--mode json` subprocesses
- State Persistence — write to the session JSONL via `pi.appendEntry(customType, data)`
ExtensionAPI Surface
Registration: `pi.on()`, `pi.registerTool()`, `pi.registerCommand()`, `pi.registerShortcut()`, `pi.registerFlag()`, `pi.registerMessageRenderer()`, `pi.registerProvider()`
Messaging: `pi.sendMessage()` (with steer/followUp/nextTurn delivery modes), `pi.sendUserMessage()`, `pi.appendEntry()`
Model Control: `pi.setModel()`, `pi.getThinkingLevel()`, `pi.setThinkingLevel()`
Tool Access: `pi.getActiveTools()`, `pi.getAllTools()`, `pi.setActiveTools()`, `pi.exec()`
ExtensionContext (`ctx`): `ctx.ui` (dialogs, notifications, overlays), `ctx.cwd`, `ctx.sessionManager`, `ctx.modelRegistry`, `ctx.isIdle()`, `ctx.abort()`, `ctx.compact()`, `ctx.getSystemPrompt()`, `ctx.getContextUsage()`
Custom Tool Example
```typescript
import { Type } from "@sinclair/typebox";
import { StringEnum } from "@mariozechner/pi-ai";
import type { ExtensionAPI } from "@mariozechner/pi-coding-agent";

export default function (pi: ExtensionAPI) {
  pi.registerTool({
    name: "my_tool",
    label: "My Tool",
    description: "Does something useful for the LLM",
    parameters: Type.Object({
      action: StringEnum(["search", "analyze"] as const),
      query: Type.String(),
    }),
    async execute(toolCallId, params, signal, onUpdate, ctx) {
      onUpdate?.({ content: [{ type: "text", text: "Working..." }] });
      return { content: [{ type: "text", text: "Result here" }] };
    },
  });
}
```

Permission Gate Example
pi.on("tool_call", async (event, ctx) => {
if (event.toolName === "bash" && event.input.command.includes("rm -rf")) {
const ok = await ctx.ui.confirm("Allow dangerous command?", "Allow?");
if (!ok) return { block: true, reason: "Blocked by user" };
}
});Skills: On-Demand Capability Packages
Skills are a complementary extensibility mechanism — markdown files (SKILL.md) following the Agent Skills standard that describe workflows in natural language. Key property: skills are loaded lazily — only names and descriptions appear in the system prompt, with full instructions loaded on-demand. This is “progressive disclosure.”
Skills are compatible with Claude Code, Codex CLI, and other agents.
First-party skills (badlogic/pi-skills): brave-search, browser-tools, Google Calendar/Drive/Gmail CLI, transcribe (Groq Whisper), VS Code integration, YouTube transcript retrieval.
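Progressive disclosure can be illustrated with a toy loader: only the frontmatter of each SKILL.md is indexed for the system prompt, and the body is read on demand. The parser below is a deliberately minimal assumption, not the Agent Skills reference implementation.

```typescript
// Parse a SKILL.md file: cheap metadata (name, description) up front,
// full instructions kept in `body` and only surfaced when invoked.
function parseSkill(md: string): { name: string; description: string; body: string } {
  const m = md.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  if (!m) throw new Error("missing frontmatter");
  const meta: Record<string, string> = {};
  for (const line of m[1].split("\n")) {
    const i = line.indexOf(":");
    if (i > 0) meta[line.slice(0, i).trim()] = line.slice(i + 1).trim();
  }
  return { name: meta.name ?? "", description: meta.description ?? "", body: m[2] };
}

// Only this one-liner per skill ever reaches the system prompt.
function promptIndex(skills: { name: string; description: string }[]): string {
  return skills.map((s) => `- ${s.name}: ${s.description}`).join("\n");
}
```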
Package System
```json
{
  "name": "my-package",
  "keywords": ["pi-package"],
  "pi": {
    "extensions": ["./extensions"],
    "skills": ["./skills"],
    "prompts": ["./prompts"],
    "themes": ["./themes"]
  }
}
```

Management commands: `pi install npm:@scope/pkg`, `pi install git:github.com/user/repo`, `pi remove`, `pi update`, `pi list`, `pi config`. A package gallery aggregates community packages.
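A consumer of this manifest shape could look roughly like the following sketch. The `PiManifest` type mirrors the example above; the helper functions are hypothetical, not Pi's actual resolver.

```typescript
// Detect a pi package (tagged with the "pi-package" keyword) and collect
// the asset directories it declares under the "pi" key.
interface PiManifest {
  name: string;
  keywords?: string[];
  pi?: { extensions?: string[]; skills?: string[]; prompts?: string[]; themes?: string[] };
}

function isPiPackage(pkg: PiManifest): boolean {
  return pkg.keywords?.includes("pi-package") ?? false;
}

function assetDirs(pkg: PiManifest): string[] {
  const pi = pkg.pi ?? {};
  return [...(pi.extensions ?? []), ...(pi.skills ?? []), ...(pi.prompts ?? []), ...(pi.themes ?? [])];
}
```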
Security Model
No built-in sandboxing by default. Extensions run with full system permissions. Zechner’s stance: “As soon as your agent can write code and run code, it’s pretty much game over” for security — theatrical guardrails are pointless.
Community-built sandboxing:
- nono: Kernel-level via Landlock (Linux) / Seatbelt (macOS) — irreversible, deny-by-default
- gondolin: Linux micro-VM isolation
- Docker/containers: recommended for production use
Ecosystem Comparisons
| Feature | Pi | Claude Code | Cursor | Aider | Cline |
|---|---|---|---|---|---|
| Interface | Terminal CLI | Terminal CLI | IDE (VS Code fork) | Terminal CLI | VS Code Extension |
| Open Source | MIT | No | No | Apache 2.0 | Apache 2.0 |
| Models | 15+ providers, mid-session switching | Anthropic only | Multi-model | 100+ models | All major models |
| Core Tools | 4 | 20+ built-in | IDE-integrated | Git-integrated | Approval-gated |
| System Prompt | ~1,000 tokens | ~10,000+ tokens | N/A | Moderate | Moderate |
| Extension System | TypeScript (50+ examples) | Hooks (6-8) | Plugins | Limited | MCP, tools |
| MCP | No (deliberate) | Yes | Yes | No | Yes |
| Sub-agents | Via extensions/tmux | Built-in | Built-in | No | v3.58 native |
| Permissions | None (YOLO) | Built-in prompts | Built-in | Limited | Approve everything |
| Sessions | Tree-structured JSONL | Linear | Linear | Git branches | Checkpoints |
| Price | Free (BYOK) | $20-200/mo | $20-200/mo | Free (BYOK) | Free (BYOK) |
Where Pi wins vs. Claude Code: multi-model freedom, system prompt efficiency (~1K vs ~10K tokens), extension depth (20+ lifecycle hooks vs 6-8), cost (smaller prompts = fewer tokens), session tree architecture, full transparency (MIT OSS).
Where Claude Code wins: deepest reasoning (Opus 4.5 at 80.9% SWE-bench), batteries-included experience, enterprise support, larger community.
The No-MCP Stance: Pi’s rejection of MCP is principled — popular MCP servers consume 7-9% of context (13-18K tokens) with tools the model may never use. Zechner advocates “CLI tools with README files” accessible via bash, loaded only when needed.
oh-my-pi: The Batteries-Included Fork
oh-my-pi by Can Boluk transforms Pi into a comprehensive terminal coding environment:
- Hash-anchored edits (Hashline): Content-hash anchors replacing verbatim text — 6.7%-68.3% edit success rate improvements across models
- LSP integration: 11 LSP operations, 40+ language configs (Rust, Go, Python, TypeScript, Java, Kotlin, etc.), auto-diagnostics on write/edit
- Subagent system: 6 bundled agents (explore, plan, designer, reviewer, task, quick_task) with parallel execution via git worktrees
- Browser automation: Puppeteer-based with 14 stealth plugins
- Python kernel: Embedded IPython with streaming output
- TTSR (Time-Traveling Stream Rules): Mid-stream LLM interception to prevent invalid output
- AI-powered git: Conventional commits with hunk-level staging and dependency-ordered split commits
- Native performance: Rust-based N-API bindings for CPU-intensive operations
- 65+ syntax themes with auto dark/light switching
Currently at v13.14.0, built on Bun runtime for fast startup.
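The hash-anchored edit idea can be shown with a toy version: each line is addressed by a short content hash rather than verbatim text, so an edit survives indentation or whitespace drift. The hash function here is a stand-in for illustration, not oh-my-pi's actual Hashline implementation.

```typescript
// Short content hash per (trimmed) line; edits target a hash instead of
// an exact string match, making them robust to whitespace differences.
function lineHash(line: string): string {
  let h = 0;
  for (const ch of line.trim()) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h.toString(16).padStart(8, "0").slice(0, 6);
}

function applyHashEdit(lines: string[], targetHash: string, replacement: string): string[] {
  return lines.map((l) => (lineHash(l) === targetHash ? replacement : l));
}
```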
Notable Community Projects
| Project | Description |
|---|---|
| shitty-extensions | cost-tracker, handoff, oracle, plan-mode, memory-mode, usage-bar |
| pi-messenger | Multi-agent coordination with presence tracking, file reservations, crew-based task orchestration |
| pi-extensions (tmustier) | files-widget, tab-status, agent-guidance, arcade minigames |
| awesome-pi-agent | Curated list with 310+ stars |
| nono | Kernel-level sandboxing by the Sigstore author |
| pi-nvim | Neovim bridge |
| VS Code extension | Official VS Code integration |
OpenClaw: Pi at Scale
Pi’s breakout moment came as the engine inside OpenClaw (Peter Steinberger’s personal AI assistant), which reached 180K+ GitHub stars. As Shaw noted: “What OpenClaw did was merge Pi with Claude Skills. That’s the killer combo.”
OpenClaw uses Pi’s SDK mode to embed the agent into a messaging gateway across WhatsApp, Telegram, Discord, Slack, Signal, iMessage, Google Chat, and Microsoft Teams. This validated Pi’s architecture at massive production scale.
⚠️ Security concern: researchers found 341 malicious skills out of 2,857 on ClawHub — a 12% contamination rate, mostly installing Atomic Stealer malware.
Key Design Insights
- Context engineering over feature accumulation: Pi’s ~1K-token system prompt vs. competitors’ ~10K+ leaves dramatically more context for actual code
- Self-extending agent philosophy: if you want a feature, ask the agent to build it as an extension — the agent extends itself (Armin Ronacher)
- Minimal scaffolding works: Terminal-Bench results show that “Terminus 2” (just a raw tmux session) holds its own against sophisticated tooling, validating that frontier models don’t need hand-holding
- Tree-structured sessions: unlike linear conversation histories, Pi’s branching sessions enable “side quests” without polluting the main context
- Dual-payload tool results: separating LLM-visible content from UI-rendered details prevents context pollution while enabling rich rendering
- Validation as self-correction: tool parameter validation errors are returned as tool results, enabling LLM self-correction within the natural loop
Practical Recommendations
Choose Pi when:
- Multi-model flexibility matters (switch between Claude, GPT, Gemini, local)
- You want to build your own agent on top of a solid SDK
- Minimizing LLM costs is important
- You prefer terminal-native, composable Unix workflows
- Deep customization without forking internals is desired
- Vendor independence and open source is required
Choose alternatives when:
- Maximum reasoning on hard problems → Claude Code (80.9% SWE-bench)
- Batteries-included IDE experience → Cursor
- Git-native workflow → Aider
- Human-in-the-loop approval flows → Cline
- Fully autonomous coding → Devin / OpenHands
- MCP integrations are critical → Claude Code, Cursor, or Cline
Related Notes
- 2026-03-22 - Research - Pi Coding Agent In-Depth Research - Extension Mechanisms, oh-my-pi, and Harness Engineering
- 2026-03-21 - Research - Harness Engineering in AI Coding
- 2026-03-22 - Research - Building Proper Tests for Coding Agents in Harness Engineering
Sources
- Pi official site (shittycodingagent.ai)
- badlogic/pi-mono GitHub
- Mario Zechner’s founding blog post (Nov 2025)
- Armin Ronacher’s Pi analysis (Jan 2026)
- Nader Dabit’s Pi framework deep-dive
- Pi extensions docs
- Pi session docs
- Pi compaction docs
- oh-my-pi (can1357)
- awesome-pi-agent
- badlogic/pi-skills
- OpenClaw GitHub
- Terminal-Bench
- npm: @mariozechner/pi-coding-agent
- DeepWiki: Pi architecture
- disler/pi-vs-claude-code comparison
- Real Python: Pi AI coding tool
- Shivam Agarwal’s Pi/OpenClaw anatomy