Think
@cloudflare/think lets you build a stateful AI chat agent — one that streams replies, remembers the conversation, and calls tools — by extending a single base class. You provide a model with getModel(), and Think wires up the rest of the chat lifecycle for you: the agentic loop (the model calls tools, reads the results, and keeps going until it has an answer), message persistence, streaming, client tools, stream resumption, and extensions — all backed by Durable Object SQLite.
Think works as both a top-level agent (WebSocket chat to browser clients via useAgentChat) and a sub-agent (a child agent that another agent drives over RPC via chat()).
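The agentic loop mentioned above (the model calls a tool, reads the result, and keeps going until it has an answer) can be sketched in a few lines. This is an illustration of the control flow only, not Think's implementation: `FakeModel`, `ModelStep`, `runAgenticLoop`, and the `add` tool are hypothetical names, and the loop is synchronous for brevity where a real model call would be asynchronous and streamed.

```typescript
// Hypothetical stand-in types: none of these come from @cloudflare/think.
type ToolCall = { tool: string; input: string };
type ModelStep =
  | { kind: "tool-call"; call: ToolCall }
  | { kind: "answer"; text: string };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type FakeModel = (history: Message[]) => ModelStep;
type Tools = Map<string, (input: string) => string>;

// Ask the model, run any tool it requests, feed the result back,
// and stop once it produces a final answer.
function runAgenticLoop(
  model: FakeModel,
  tools: Tools,
  history: Message[],
  maxSteps = 8,
): Message[] {
  const messages = [...history];
  for (let step = 0; step < maxSteps; step++) {
    const next = model(messages);
    if (next.kind === "answer") {
      messages.push({ role: "assistant", content: next.text });
      return messages;
    }
    const tool = tools.get(next.call.tool);
    const result =
      tool !== undefined
        ? tool(next.call.input)
        : `unknown tool: ${next.call.tool}`;
    messages.push({ role: "tool", content: result });
  }
  throw new Error(`no answer after ${maxSteps} steps`);
}

// A scripted model: requests the "add" tool once, then answers with its result.
const scriptedModel: FakeModel = (messages) => {
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === "tool") {
    return { kind: "answer", text: `The sum is ${last.content}.` };
  }
  return { kind: "tool-call", call: { tool: "add", input: "2,3" } };
};

const demoTools: Tools = new Map<string, (input: string) => string>([
  [
    "add",
    (input) =>
      String(
        input
          .split(",")
          .map(Number)
          .reduce((a, b) => a + b, 0),
      ),
  ],
]);

const transcript = runAgenticLoop(scriptedModel, demoTools, [
  { role: "user", content: "What is 2 + 3?" },
]);
```

The step limit is a design choice in this sketch: it stops a model that never produces an answer from looping forever.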

Install the packages:

    npm install @cloudflare/think @cloudflare/ai-chat agents ai @cloudflare/shell zod workers-ai-provider

Define the agent (JavaScript):

    import { Think } from "@cloudflare/think";
    import { createWorkersAI } from "workers-ai-provider";
    import { routeAgentRequest } from "agents";

    export class MyAgent extends Think {
      getModel() {
        return createWorkersAI({ binding: this.env.AI })(
          "@cf/moonshotai/kimi-k2.6",
        );
      }
    }

    export default {
      async fetch(request, env) {
        return (
          (await routeAgentRequest(request, env)) ||
          new Response("Not found", { status: 404 })
        );
      },
    };

Define the agent (TypeScript):

    import { Think } from "@cloudflare/think";
    import { createWorkersAI } from "workers-ai-provider";
    import { routeAgentRequest } from "agents";

    export class MyAgent extends Think<Env> {
      getModel() {
        return createWorkersAI({ binding: this.env.AI })(
          "@cf/moonshotai/kimi-k2.6",
        );
      }
    }

    export default {
      async fetch(request: Request, env: Env) {
        return (
          (await routeAgentRequest(request, env)) ||
          new Response("Not found", { status: 404 })
        );
      },
    } satisfies ExportedHandler<Env>;

That is it. Think handles the WebSocket chat protocol, message persistence, the agentic loop, message sanitization, stream resumption, client tool support, and workspace file tools.

Connect from a React client (JavaScript):

    import { useAgent } from "agents/react";
    import { useAgentChat } from "@cloudflare/ai-chat/react";

    function Chat() {
      const agent = useAgent({ agent: "MyAgent" });
      const { messages, sendMessage, status } = useAgentChat({ agent });

      return (
        <div>
          {messages.map((msg) => (
            <div key={msg.id}>
              <strong>{msg.role}:</strong>
              {msg.parts.map((part, i) =>
                part.type === "text" ? <span key={i}>{part.text}</span> : null,
              )}
            </div>
          ))}

          <form
            onSubmit={(e) => {
              e.preventDefault();
              const input = e.currentTarget.elements.namedItem("input");
              sendMessage({ text: input.value });
              input.value = "";
            }}
          >
            <input name="input" placeholder="Send a message..." />
            <button type="submit">Send</button>
          </form>
        </div>
      );
    }

Connect from a React client (TypeScript):

    import { useAgent } from "agents/react";
    import { useAgentChat } from "@cloudflare/ai-chat/react";

    function Chat() {
      const agent = useAgent({ agent: "MyAgent" });
      const { messages, sendMessage, status } = useAgentChat({ agent });

      return (
        <div>
          {messages.map((msg) => (
            <div key={msg.id}>
              <strong>{msg.role}:</strong>
              {msg.parts.map((part, i) =>
                part.type === "text" ? <span key={i}>{part.text}</span> : null,
              )}
            </div>
          ))}

          <form
            onSubmit={(e) => {
              e.preventDefault();
              const input = e.currentTarget.elements.namedItem(
                "input",
              ) as HTMLInputElement;
              sendMessage({ text: input.value });
              input.value = "";
            }}
          >
            <input name="input" placeholder="Send a message..." />
            <button type="submit">Send</button>
          </form>
        </div>
      );
    }

Wrangler configuration (JSONC):

    {
      "$schema": "./node_modules/wrangler/config-schema.json",
      // Set this to today's date
      "compatibility_date": "2026-06-04",
      "compatibility_flags": ["nodejs_compat"],
      "ai": {
        "binding": "AI"
      },
      "durable_objects": {
        "bindings": [
          {
            "class_name": "MyAgent",
            "name": "MyAgent"
          }
        ]
      },
      "migrations": [
        {
          "new_sqlite_classes": ["MyAgent"],
          "tag": "v1"
        }
      ]
    }

Wrangler configuration (TOML):

    # Set this to today's date
    compatibility_date = "2026-06-04"
    compatibility_flags = ["nodejs_compat"]

    [ai]
    binding = "AI"

    [[durable_objects.bindings]]
    class_name = "MyAgent"
    name = "MyAgent"

    [[migrations]]
    new_sqlite_classes = ["MyAgent"]
    tag = "v1"

Both Think and AIChatAgent extend Agent and speak the same cf_agent_chat_* WebSocket protocol. They serve different goals.
AIChatAgent is a protocol adapter. You override onChatMessage and are responsible for calling streamText, wiring tools, converting messages, and returning a Response. AIChatAgent handles the plumbing — message persistence, streaming, abort, resume — but the LLM call is entirely your concern.
Think is an opinionated framework. It makes decisions for you: getModel() returns the model, getSystemPrompt() or configureSession() sets the prompt, getTools() returns tools. The default onChatMessage runs the complete agentic loop. You override individual pieces, not the whole pipeline.
| Concern | AIChatAgent | Think |
|---|---|---|
| Minimal subclass | ~15 lines (wire streamText + tools + system prompt + response) | 3 lines (getModel() only) |
| Storage | Flat SQL table | Session: tree-structured messages, context blocks, compaction, FTS5 |
| Regeneration | Destructive (old response deleted) | Non-destructive branching (old responses preserved) |
| Context management | Manual | Context blocks with LLM-writable persistent memory |
| Sub-agent RPC | Not built in | chat() with StreamCallback |
| Programmatic turns | saveMessages() | saveMessages(), submitMessages(), continueLastTurn() |
| Compaction | maxPersistedMessages (deletes oldest) | Non-destructive summaries via overlays |
| Search | Not available | FTS5 full-text search per-session and cross-session |
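
To make the "non-destructive branching" row concrete, here is a minimal in-memory sketch of tree-structured messages. It is an assumption-laden illustration, not Think's Session storage (which is backed by Durable Object SQLite and whose schema is not shown in this document). `MessageTree`, `MessageNode`, and every method name are hypothetical.

```typescript
// Hypothetical in-memory model; not the real Session schema or API.
type MessageNode = {
  id: number;
  parentId: number | null;
  role: "user" | "assistant";
  text: string;
};

class MessageTree {
  private nodes = new Map<number, MessageNode>();
  private nextId = 1;
  private leafId: number | null = null;

  // Append a message under the current leaf and make it the new leaf.
  append(role: MessageNode["role"], text: string): MessageNode {
    return this.insert(this.leafId, role, text);
  }

  // Regenerate: add a sibling of the current assistant leaf instead of
  // overwriting it, so the earlier response stays in the tree.
  regenerate(text: string): MessageNode {
    const leaf =
      this.leafId === null ? undefined : this.nodes.get(this.leafId);
    if (leaf === undefined || leaf.role !== "assistant") {
      throw new Error("nothing to regenerate");
    }
    return this.insert(leaf.parentId, "assistant", text);
  }

  // Messages on the active branch, root first.
  activePath(): MessageNode[] {
    const path: MessageNode[] = [];
    let id = this.leafId;
    while (id !== null) {
      const node = this.nodes.get(id);
      if (node === undefined) break;
      path.unshift(node);
      id = node.parentId;
    }
    return path;
  }

  // Every message stored under one parent, including inactive branches.
  childrenOf(parentId: number): MessageNode[] {
    return Array.from(this.nodes.values()).filter(
      (node) => node.parentId === parentId,
    );
  }

  private insert(
    parentId: number | null,
    role: MessageNode["role"],
    text: string,
  ): MessageNode {
    const node: MessageNode = { id: this.nextId++, parentId, role, text };
    this.nodes.set(node.id, node);
    this.leafId = node.id;
    return node;
  }
}

const tree = new MessageTree();
const question = tree.append("user", "Name a prime number.");
tree.append("assistant", "4");
tree.regenerate("7");

// The active branch shows only the new answer; the old one is still stored.
const activeTexts = tree.activePath().map((node) => node.text);
const storedAnswers = tree.childrenOf(question.id).map((node) => node.text);
```

A flat table with destructive regeneration would keep only "7"; in the tree, both answers remain as children of the same user message and only the active path changes.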
Use AIChatAgent when:

- You need full control over the LLM call (RAG, multi-model, custom streaming)
- You want the Response return type for HTTP middleware or testing
- You are building a simple chatbot with no memory requirements

Use Think when:

- You want to ship fast (3-line subclass with everything wired)
- You need persistent memory (context blocks the model can read and write)
- You need long conversations (non-destructive compaction)
- You need conversation search (FTS5)
- You are building a sub-agent system (parent-child RPC with streaming)
- You need proactive agents (programmatic turns from scheduled tasks or webhooks)
- You need durable async submission for webhook or RPC callers

Think has several ways to start or continue a turn. Choose based on who starts the work and what the caller needs back.
| Use case | API |
|---|---|
| A browser user sends chat messages | useAgentChat over the WebSocket chat protocol |
| Server code can wait for the model response | saveMessages() |
| Server code needs fast durable acceptance and later status | submitMessages() |
| Code should create recurring prompt-driven turns or handlers | getScheduledTasks() |
| Parent code needs direct streaming RPC to a specific child | subAgent(...).chat() |
| A parent delegates work to a retained child agent | agentTool() or runAgentTool() |
| Surround a turn with idempotent app-owned side effects | startFiber() |
| Coordinate multi-step durable orchestration | Workflows |
| Add context or messages without starting a model turn | persistMessages() |
| Advanced subclass or recovery code continues an assistant turn | continueLastTurn() |
Use saveMessages() when the caller owns the trigger and can wait for the turn to finish. Use submitMessages() when timeout ambiguity would make retries unsafe.
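
The retry concern behind that guidance can be illustrated with a small in-memory model of durable acceptance. This is a sketch under stated assumptions, not the `submitMessages()` signature: `SubmissionLog`, its methods, and the caller-supplied idempotency key are all hypothetical. The point is that a caller who times out and retries with the same key gets the existing record back instead of starting a second turn.

```typescript
// Hypothetical model of "accept fast, check status later"; not Think's API.
type SubmissionStatus = "accepted" | "running" | "completed";
type Submission = { key: string; text: string; status: SubmissionStatus };

class SubmissionLog {
  private byKey = new Map<string, Submission>();

  // Accept quickly. A retry with the same key returns the existing record
  // instead of queuing a second turn.
  submit(
    key: string,
    text: string,
  ): { submission: Submission; duplicate: boolean } {
    const existing = this.byKey.get(key);
    if (existing !== undefined) {
      return { submission: existing, duplicate: true };
    }
    const submission: Submission = { key, text, status: "accepted" };
    this.byKey.set(key, submission);
    return { submission, duplicate: false };
  }

  // Run every accepted submission once; returns how many turns ran.
  drain(runTurn: (text: string) => void): number {
    let ran = 0;
    for (const submission of Array.from(this.byKey.values())) {
      if (submission.status !== "accepted") continue;
      submission.status = "running";
      runTurn(submission.text);
      submission.status = "completed";
      ran++;
    }
    return ran;
  }

  // Later status lookup for a caller that did not wait for the turn.
  status(key: string): SubmissionStatus | undefined {
    return this.byKey.get(key)?.status;
  }
}

const log = new SubmissionLog();
const first = log.submit("webhook-123", "Summarize the new ticket");
// The caller timed out before seeing the response and retried.
const retry = log.submit("webhook-123", "Summarize the new ticket");

const turns: string[] = [];
const ranCount = log.drain((text) => {
  turns.push(text);
});
const finalStatus = log.status("webhook-123");
```

With a wait-for-completion call, the same timeout would leave the caller unsure whether the turn ran, and a blind retry could run it twice.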
Use chat() for low-level parent-to-child streaming when your code owns forwarding, cancellation, and replay policy. Use Agent tools when a parent model or workflow delegates to a child agent and you want retained child runs, event replay, abort bridging, and UI drill-in.
Use startFiber() outside Think when the durable unit is an application job around a turn: accepting a webhook once, restoring a serialized channel or thread target, posting a visible reply, or recording app-level recovery policy. Think submissions own conversation admission and turn serialization; managed fibers own external job acceptance, idempotent side effects, and application recovery.
Think's design is inspired by Pi.
- Sessions: context blocks, compaction, search, multi-session (the storage layer Think builds on)
- Sub-agents: subAgent(), abortSubAgent(), deleteSubAgent() (the base Agent methods for spawning children)
- Chat agents: AIChatAgent for when you need full control over the LLM call
- Long-running agents: sub-agent delegation patterns for multi-week agent lifetimes
- Durable execution: runFiber() and crash recovery (used by chatRecovery)
- Browse the web: full CDP helper API reference