
Announcing the Theseus Playground

A browser IDE for authoring Theseus agents in markdown. The shape is the one Claude Agent SDK and OpenAI Assistants users already know — a system prompt and a tool catalog — and the playground takes both directly.

Most working AI agents come in two parts: a system prompt that defines the persona and operating contract, and a list of tools the model can call. The Claude Agent SDK and OpenAI Assistants both keep this shape clean. A typical production agent is about fifty lines of prompt with a handful of HTTP endpoints behind it.

Deploying one onto a verifiable runtime should not require rewriting any of that. Until recently it did. Theseus agents were authored in a Rust subset called the SHIP IR, compiled to SCALE bytes, and registered on chain via the CLI. The result was correct — each agent got its own on-chain account, signed receipts on every model invocation and tool call, and a content hash of the deployed code — but the on-ramp was a Rust file with ToolSpec literals most agent developers had never written.

The playground takes that on-ramp away. You author the same two files you would for a Claude agent: a THESEUS.md holding the system prompt and the frontmatter naming the model and tools, and one or more SKILL.md files for the capabilities the agent can call. The playground reads the frontmatter, generates the SHIP IR, compiles in the browser via WASM, signs, and submits the register_agent extrinsic. The compiled bytes are byte-identical to what a hand-authored Rust agent would produce. If you want to skip to the worked example, the five-minute walkthrough deploys a weather assistant.

Image: The playground IDE with a workspace open.

What this looks like

A working agent in three pieces. Two of them are markdown.

The frontmatter declares everything the chain needs to know about the agent's identity and surface:

---
name: Weather Assistant
id: weather-assistant-v1
model: deepseek-chat
native-tools: none
tools:
  - name: get_weather
    description: Get the current temperature for a city.
    args:
      city: string
  - name: lookup_user
    description: Read a user record by id.
    args:
      user_id: string
---

The markdown body is the system prompt — the same one you would send to Anthropic's system field:

You are a helpful assistant.

When the user asks about the weather, call get_weather with the
city name. When the user asks about a user account, call
lookup_user with the user id. Otherwise just answer directly.

A second file at skills/my-tools/SKILL.md describes when to call each tool. The model reads this when deciding which tool fits a given task. Both files are markdown with YAML frontmatter; nothing else is required.

On deploy the playground reads the tools: block, emits one ToolSpec per entry plus a standard tool-loop control flow (init → think → act → think → done), compiles the generated SHIP IR to SCALE bytes in the browser, and submits pallet_agents::register_agent signed with your wallet. The agent receives its own on-chain account, funded with an initial endowment from the deployer. Every subsequent invocation queues a model job, which a prover picks up and resolves on chain. Tool calls are dispatched to the operator's tool-executor, and the response body is returned on chain, signed by the executor.
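The synthesis step can be sketched in a few lines of Python. This is an illustrative model only: the `ToolSpec` dataclass, `synthesize_tool_specs`, `run_tool_loop`, and the scripted `model_steps` input are hypothetical names invented for this sketch, not the playground's actual SHIP IR output, and the frontmatter is shown as an already-parsed dict.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a SHIP IR ToolSpec; the real type's fields are
# not shown in this post.
@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    args: tuple  # (arg_name, arg_type) pairs, in declaration order


# The example frontmatter from above, as a parsed dict.
FRONTMATTER = {
    "name": "Weather Assistant",
    "id": "weather-assistant-v1",
    "model": "deepseek-chat",
    "tools": [
        {"name": "get_weather",
         "description": "Get the current temperature for a city.",
         "args": {"city": "string"}},
        {"name": "lookup_user",
         "description": "Read a user record by id.",
         "args": {"user_id": "string"}},
    ],
}


def synthesize_tool_specs(frontmatter: dict) -> list:
    """Emit one ToolSpec per entry in the frontmatter's tools: block."""
    return [
        ToolSpec(
            name=tool["name"],
            description=tool["description"],
            args=tuple(tool.get("args", {}).items()),
        )
        for tool in frontmatter.get("tools", [])
    ]


def run_tool_loop(model_steps: list, specs: list) -> list:
    """Walk init -> think -> act -> think -> done and return the state trace.

    model_steps is a scripted stand-in for model output: each item is either
    ("call", tool_name) or ("answer", text).
    """
    known = {spec.name for spec in specs}
    trace = ["init"]
    for kind, value in model_steps:
        trace.append("think")
        if kind == "call":
            if value not in known:
                raise ValueError(f"undeclared tool: {value}")
            trace.append("act")
        else:
            break  # a direct answer ends the loop
    trace.append("done")
    return trace


specs = synthesize_tool_specs(FRONTMATTER)
trace = run_tool_loop([("call", "get_weather"), ("answer", "It is mild.")], specs)
```

With one tool call followed by an answer, the trace matches the control flow named above. A call to a tool that is not in the frontmatter raises instead of dispatching.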

FIG. 01 / DEPLOY PIPELINE. Markdown to an on-chain agent: FRONTMATTER → SHIP IR → SCALE BYTES → SIGNED EXTRINSIC → DERIVED ACCOUNT.

01 / AUTHOR: THESEUS.md (system prompt + frontmatter) and skills/*/SKILL.md (when to call each tool).
02 / SYNTHESIZE: parse frontmatter, emit SHIP IR (synthesized agent.rs).
03 / COMPILE: WASM compile, emit SCALE bytes in browser.
04 / SIGN: sign register_agent extrinsic with browser wallet.
05 / ON CHAIN: derived AccountId, hash(deployer ‖ code ‖ salt), with its own balance and history.

Stages 01 to 04 run in the browser; stage 05 is on chain. Same SCALE bytes a hand-authored Rust agent would produce. The synthesizer is bypassed when an agent.rs is present in the workspace.

If you need a control flow the synthesizer does not generate — pauses for human input, custom state fields, scheduled triggers, structured output schemas — you add an agent.rs to the workspace and the playground uses yours instead.

The agent is the account of record

The most important thing the playground deploys is not the prompt or the tools, but the agent's own on-chain account. At register_agent time the chain derives an AccountId from hash(deployer || compiled_hash || salt), transfers an initial endowment from your account to that address, and stores the compiled SHIP IR against it. From that point on the agent's address has its own balance and its own on-chain history, distinct from the deployer's. The chain attributes operations to the agent, not to you.
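To make the derivation concrete, here is a minimal sketch of hash(deployer || compiled_hash || salt). The choice of BLAKE2b with a 32-byte digest, the raw byte concatenation, and the function name `derive_agent_account` are assumptions for illustration; the chain's actual hash function and encoding are not specified in this post.

```python
import hashlib


def _hash32(data: bytes) -> bytes:
    # Assumed hash: BLAKE2b truncated to 32 bytes. The chain may use another.
    return hashlib.blake2b(data, digest_size=32).digest()


def derive_agent_account(deployer: bytes, compiled_code: bytes, salt: bytes) -> bytes:
    """Sketch of AccountId = hash(deployer || compiled_hash || salt)."""
    compiled_hash = _hash32(compiled_code)
    return _hash32(deployer + compiled_hash + salt)


# Placeholder inputs, not real keys or real compiled bytes.
deployer = bytes(32)
code = b"compiled SHIP IR bytes (placeholder)"
account_a = derive_agent_account(deployer, code, salt=b"\x00")
account_b = derive_agent_account(deployer, code, salt=b"\x01")
```

The point of the shape is that the address is a pure function of who deployed, what was deployed, and the salt. The same inputs always give the same address, and changing any one of them gives a different agent account.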

This is the model that "How AI Agents Actually Own Assets" lays out at length. In practical terms it means two things today. First, every inference result, every refused commission, every reconciled price feed lands on chain indexed by the agent's address; that record is the agent's, not the deployer's. Second, the agent's authority is not a private key but its compiled SHIP IR, which is hash-locked at deploy. A buggy or compromised model output can only spend what has been funded into the agent's address, and only on the actions the SHIP IR encodes. There is no operator key to compromise mid-run because the agent's authorization is the code on chain.
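The two limits on a bad model output, the funded balance and the encoded action set, can be illustrated with a toy model. Everything here is hypothetical: `AgentAccount`, `allowed_actions`, and the action names are invented for the sketch and do not reflect how the pallet stores or checks anything.

```python
from dataclasses import dataclass


@dataclass
class AgentAccount:
    balance: int                 # what has been funded into the agent's address
    allowed_actions: frozenset   # actions the hash-locked code encodes (assumed shape)

    def execute(self, action: str, amount: int) -> bool:
        """Apply a model-proposed action; refuse anything outside the code or the balance."""
        if action not in self.allowed_actions:
            return False  # the deployed code has no path for this action
        if amount < 0 or amount > self.balance:
            return False  # cannot spend more than was funded
        self.balance -= amount
        return True


agent = AgentAccount(balance=100, allowed_actions=frozenset({"pay_oracle_fee"}))
paid = agent.execute("pay_oracle_fee", 40)        # encoded action, within balance
drained = agent.execute("drain_to_attacker", 10)  # not an encoded action
overspent = agent.execute("pay_oracle_fee", 1000) # exceeds the funded balance
```

Only the first call changes the balance. The other two are refused without any key being involved, which is the sense in which the code, not a signer, is the authority.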

What the chain records

Beyond the account itself, the verifiability surface is straightforward. The deployed agent has a deterministic content hash, which anyone can re-derive from the published THESEUS.md and SKILL.md files. If the agent's behavior is updated on chain, the hash diverges and the change is visible.

Every model invocation generates two on-chain events. A Queued event fires when the agent's SHIP IR triggers inference; it names the model tag and includes the prompt bytes. A Verified event fires when the prover submits the result; it carries the output, the reasoning trace, an output hash, and the identity of the prover who signed. The chain stores an input commitment (hash) in persistent state and preserves the full prompt in the event log. The model tag is fixed by the compiled SHIP IR, so a prover cannot quietly route an inference job to a different model than the agent declared.
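A third party replaying these events might run checks along the following lines. This is a sketch under stated assumptions: the `Queued` and `Verified` dataclasses, the `job_id` field that links them, and the BLAKE2b hash are all hypothetical, and the reasoning trace is left out for brevity.

```python
import hashlib
from dataclasses import dataclass


def digest(data: bytes) -> str:
    # Assumed hash function; the chain's actual choice is not stated here.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


@dataclass(frozen=True)
class Queued:
    job_id: int        # hypothetical field linking the two events
    model_tag: str
    prompt: bytes


@dataclass(frozen=True)
class Verified:
    job_id: int
    output: bytes
    output_hash: str
    prover: str


def check_inference(queued, verified, declared_model, input_commitment):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    if queued.job_id != verified.job_id:
        problems.append("events belong to different jobs")
    if queued.model_tag != declared_model:
        problems.append("model tag differs from the model the agent declared")
    if digest(queued.prompt) != input_commitment:
        problems.append("logged prompt does not match the stored input commitment")
    if digest(verified.output) != verified.output_hash:
        problems.append("output does not match its output hash")
    return problems


prompt = b"What is the weather in Lisbon?"
output = b"Calling get_weather for Lisbon."
queued = Queued(job_id=1, model_tag="deepseek-chat", prompt=prompt)
verified = Verified(job_id=1, output=output, output_hash=digest(output), prover="prover-1")
problems = check_inference(queued, verified, "deepseek-chat", digest(prompt))
```

The model-tag check is the one that catches a prover routing the job elsewhere: the expected tag comes from the compiled agent, not from anything the prover submits.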

Every tool call lands on chain with the args sent and the response body the executor submitted, truncated if oversized. Anyone can hash the body and check it against what the executor reported. A tool that lies about what it returned can be challenged on the bytes themselves.
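The byte-level check could look roughly like this. The record layout (`ToolCallRecord`, `reported_hash`, `truncated`) and the hash function are assumptions made for the sketch. It also assumes that a truncated body cannot be verified from chain data alone, so the full bytes would have to come from somewhere else.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCallRecord:
    tool: str
    args: dict
    body: bytes            # response body as stored on chain
    reported_hash: str     # hash the executor reported (assumed field)
    truncated: bool = False


def body_hash(body: bytes) -> str:
    # Assumed hash function, for illustration only.
    return hashlib.blake2b(body, digest_size=32).hexdigest()


def verify_tool_call(record: ToolCallRecord) -> str:
    """Classify a record as 'ok', 'challenge', or 'needs-full-body'."""
    if record.truncated:
        # The stored bytes are only a prefix, so they cannot reproduce the hash.
        return "needs-full-body"
    if body_hash(record.body) == record.reported_hash:
        return "ok"
    return "challenge"


body = b'{"city": "Lisbon", "temp_c": 19}'
honest = ToolCallRecord("get_weather", {"city": "Lisbon"}, body, body_hash(body))
lying = ToolCallRecord("get_weather", {"city": "Lisbon"}, body, body_hash(b"other"))
```

Here `honest` verifies and `lying` is flagged for challenge, using nothing but the recorded bytes and the reported hash.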

What is still ahead

Immediately ahead: play.theseus.network coming online with the IDE serving from this repo, the playground source going public alongside it, and the chain-side ProxyTool dispatch that lets BYO-typed templates resolve external tool calls end-to-end.

Next: the tools.yaml external-tool catalog and the active allowed-tools filter. Today each skill's allowed-tools frontmatter is documentation. Once the filter lands, those declarations become a runtime gate that structurally limits which tools the model can reach from each skill. Templates here are forward-compatible.
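The difference between documentation and a runtime gate is small in code. The sketch below is hypothetical: `tools_for_skill` and the `enforce` flag are invented names, `allowed-tools` is assumed to parse to a list of tool names, and treating a skill with no declaration as reaching nothing is an assumption, not announced behavior.

```python
def tools_for_skill(catalog: list, skill_frontmatter: dict, enforce: bool) -> list:
    """Return the tool names the model can reach from one skill.

    enforce=False models today: allowed-tools is documentation only.
    enforce=True models the planned filter: only declared tools are reachable.
    """
    if not enforce:
        return list(catalog)
    # Assumption: a skill that declares nothing reaches nothing once enforced.
    declared = set(skill_frontmatter.get("allowed-tools", []))
    return [name for name in catalog if name in declared]


catalog = ["get_weather", "lookup_user"]
weather_skill = {"name": "weather", "allowed-tools": ["get_weather"]}

today = tools_for_skill(catalog, weather_skill, enforce=False)
gated = tools_for_skill(catalog, weather_skill, enforce=True)
```

The skill file is identical in both calls. Only the enforcement changes, which is what forward-compatible means here.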

After that: a public Theseus testnet with a registered model surface. The current chain runs locally. demo-agents.theseus.network hosts the thirteen demos against centralized providers today; the on-chain Theseus version of each is wired up in the demo site and waiting for the testnet to flip.

Further out: a transfer_ownership extrinsic so the administrative relationship can move NFT-style; sovereign: true agents that have no controller at all, where the deployed code is the only authority; and wallet integration in the playground so each visitor signs with their own keypair instead of the dev account. The chain pallet currently keeps a deployer-as-owner relationship — the deployer can update the agent's SHIP IR, add or remove skills, deactivate, or cancel a run — and the THESEUS.md written today does not change when any of those ship.

Try it

play.theseus.network — open the IDE.

theseus.network/docs/playground — the five-minute walkthrough. Picks the BYO Markdown template, walks you through editing the frontmatter for your own tools, ends with an on-chain address.

demo-agents.theseus.network — thirteen worked examples. Eight adjudicators (price oracle, bridge guardian, governance reviewer, FAA safety reviewer, sovereign fund, Polymarket adjudicator, launch sniper, Terra failsafe) and five identity-anchored agents (literary author, visual artist, music critic, legal co-author, in-game NPC chronicler). Each is a single THESEUS.md and a skills directory; the playground ships all thirteen as starter templates.

For what we are shipping next, theseuschain.substack.com carries the long-form thread and t.me/theseusnetwork is the chat.