In April 2022, a Beanstalk proposal that read like a routine donation sat on chain for a day. Its calldata transferred the protocol's reserves to the proposer. Nobody decoded it. The attacker then flash-loaned a billion dollars of voting power and executed it in a single transaction: $182 million gone. Once the vote began, it was unstoppable by design. The 24 hours before it, when the calldata was public and nobody read it, were the window in which it could have been stopped.

This agent reads the calldata in that window. It pulls the human pitch from Snapshot and the executable transaction from Tally, and when the title says 'fund the grants program' while the calldata says 'transfer ownership,' you hear about it before the vote opens.

A verdict is only worth what consumes it. Wire this agent into the layer that still has power in the review window: a timelock that won't queue a flagged proposal, a delegate service that pre-screens before votes are cast, or a front end that shows the decoded calldata next to the description. Beanstalk had a day. The point of the agent is that the day isn't wasted.
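As a rough illustration of the consuming side, the sketch below parses a verdict block in the format defined in THESEUS.md and decides whether a proposal may be queued. The names `parse_verdict` and `may_queue` are hypothetical, and the policy (queue only on `APPROVE`, fail closed on anything unparseable) is an assumption rather than something the agent prescribes.

```python
VERDICTS = {"APPROVE", "CAUTION", "REJECT"}

def parse_verdict(block):
    """Parse 'VERDICT · <space>: <title>' plus 'key: value' lines into a dict."""
    lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty verdict block")
    verdict, _, rest = lines[0].partition(" · ")
    if verdict not in VERDICTS:
        raise ValueError(f"unrecognised verdict line: {lines[0]!r}")
    space, _, title = rest.partition(": ")
    fields = {}
    for ln in lines[1:]:
        key, sep, value = ln.partition(": ")
        if sep:  # keep only well-formed 'key: value' lines
            fields[key] = value
    return {"verdict": verdict, "space": space, "title": title, "fields": fields}

def may_queue(block):
    """Assumed policy: only APPROVE may queue; malformed output fails closed."""
    try:
        return parse_verdict(block)["verdict"] == "APPROVE"
    except ValueError:
        return False
```

Under this policy a `CAUTION` also blocks queuing until a human has read the proposal; a front end or delegate service could instead surface the `surface` and `shape` fields and let the reader decide.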

Build your own in five minutes.

The four files further down are the whole agent. Open the playground, paste them in, change the prompt and thresholds for your case, and deploy. No Solidity to write, no server, no oracle network to run.

Open the playground →

Try it live: make it reject

What it actually returns

The agent run on Beanstalk's BIP-18: the Snapshot body read as a Ukraine donation; the Tally calldata swept the treasury to the proposer.

Beanstalk BIP-18 · Snapshot: 'donate to Ukraine relief' · Tally: transfer the entire Silo to the proposer

REJECT

buried treasury upgrade, the Beanstalk shape: a routine donation title over calldata that hands vault control to the proposing address.
shape: buried-treasury-upgrade

Who runs this in production

A delegate, a DAO, or a governance shop. The review and the reasoning behind it land on chain, so token holders can see why a proposal was flagged before they vote.

Design decisions

Each item below maps to a specific choice in the workspace. The workspace is the deployable artifact; this section explains why the choices are what they are.

Snapshot for the body, Tally for the calldata

Snapshot carries the human framing: title, description, the case for voting. Tally carries the executable transaction the chain will run if the vote passes. Beanstalk-shape attacks live in the gap between the two. Reading both means you can catch the title-says-A, calldata-does-B pattern.
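To check the two reads by hand outside the agent runtime, a minimal sketch using only the Python standard library could look like the following. The endpoints, queries, and `Api-Key` header are the ones given in THESEUS.md; the helper names `build_request` and `fetch_json` are hypothetical, and the response shapes are not verified here.

```python
import json
import urllib.request

SNAPSHOT_URL = "https://hub.snapshot.org/graphql"
TALLY_URL = "https://api.tally.xyz/query"

SNAPSHOT_QUERY = (
    "query Proposal($id: String!) { proposal(id: $id) "
    "{ id title body choices state space { id name } } }"
)
TALLY_QUERY = (
    "query Proposal($id: ID!) { proposal(id: $id) "
    "{ title executableCalls { target value calldata signature } } }"
)

def build_request(url, query, proposal_id, api_key=None):
    """Build (but do not send) a GraphQL POST for one proposal id."""
    body = json.dumps({"query": query, "variables": {"id": proposal_id}}).encode()
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["Api-Key"] = api_key  # per THESEUS.md, Tally returns 401 without it
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

def fetch_json(request):
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

Building the Snapshot request with `build_request(SNAPSHOT_URL, SNAPSHOT_QUERY, "<snapshot-id>")` and the Tally request with an `api_key` keeps the two sources separate, so the title and body can be compared against the executable calls afterwards.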

Every REJECT names the pattern it matched

REJECT verdicts get challenged. Saying 'this is the Beanstalk shape: a routine title with a transferOwnership in the actual transaction' gives the operator a defensible reason. Saying 'this looks suspicious' doesn't. Beanstalk is the one with a famous name; the other shapes are named for what they do.
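One way to make every rejection carry its pattern is to have each check return a named finding rather than a boolean. The sketch below does this for the buried-treasury-upgrade shape only. It assumes each executable call is a dict with `target` and `signature` keys, that `signature` looks like `transferOwnership(address)`, and that the operator supplies the list of vault addresses; the function names are hypothetical.

```python
# Function names taken from the attack-shape list in THESEUS.md.
CONTROL_FUNCTIONS = {"setOwner", "transferOwnership", "upgradeTo", "setTreasury"}

def function_name(signature):
    """Return the bare name from a signature such as 'upgradeTo(address)'."""
    return signature.split("(", 1)[0].strip()

def match_buried_treasury_upgrade(calls, vault_targets):
    """Return a named finding if a control-transfer function hits a vault, else None."""
    vaults = {t.lower() for t in vault_targets}
    for call in calls:
        name = function_name(call.get("signature") or "")
        if name in CONTROL_FUNCTIONS and call["target"].lower() in vaults:
            return {
                "verdict": "REJECT",
                "shape": "buried-treasury-upgrade",
                "surface": f"{name} on vault target",
                "target": call["target"],
            }
    return None
```

Because the finding names the function and the target, an operator who challenges the verdict can check the claim directly against the calldata.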

CAUTION for the proposals that aren't clearly either

Many proposals are neither clearly adversarial nor clearly routine. APPROVE means the agent is willing to vote yes; REJECT means it's calling out an attack. CAUTION sits between them and means 'this needs a human to read it.' Without it, every borderline case is a guess.
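The three-way split can be sketched as a severity ordering over whatever findings fired. This is an illustration under stated assumptions, not the agent's internal logic: findings are assumed to be dicts with `verdict` and `shape` keys, and `overall_verdict` is a hypothetical name.

```python
SEVERITY = {"APPROVE": 0, "CAUTION": 1, "REJECT": 2}

def overall_verdict(findings):
    """Return (most severe verdict, shapes that produced it); APPROVE if nothing fired."""
    verdict = max(
        (f["verdict"] for f in findings),
        key=SEVERITY.__getitem__,
        default="APPROVE",
    )
    shapes = [f["shape"] for f in findings if f["verdict"] == verdict]
    return verdict, shapes
```

A single `CAUTION` finding is enough to keep a proposal out of `APPROVE`, and a single `REJECT` overrides any number of cautions.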

Separate triggers for flash-vote and multicall outliers

A delegation that arrived in the last 24 hours and is now voting yes is its own attack pattern; bundling it with the rest dilutes the signal. Same for a multicall where four of the five targets are the protocol's contracts and the fifth is a fresh address. Each gets its own trigger so the operator can read what fired.
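The two triggers can be kept as separate checks so that each reports on its own. In the sketch below, the allowlist, the per-wallet delegation timestamps, and both function names are assumptions; as THESEUS.md notes, the agent may not have delegation history at all.

```python
from datetime import datetime, timedelta

def multicall_outliers(targets, allowlist):
    """Return the call targets that are not on the operator's allowlist."""
    allowed = {a.lower() for a in allowlist}
    return [t for t in targets if t.lower() not in allowed]

def flash_vote_suspects(yes_voters, delegated_at, vote_start: datetime,
                        window=timedelta(hours=24)):
    """Return YES voters whose delegation arrived within `window` before the vote.

    delegated_at maps wallet address -> datetime of its latest delegation;
    wallets with no known history are skipped rather than flagged.
    """
    suspects = []
    for voter in yes_voters:
        received = delegated_at.get(voter)
        if received is not None and vote_start - window <= received <= vote_start:
            suspects.append(voter)
    return suspects
```

Keeping the results as two separate lists means the operator can see whether the flag came from a fresh target address, from freshly delegated voting weight, or from both.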

The four-file workspace

This is what the runtime compiles. Copy it into a fresh playground project (or a sibling directory in your CLI workspace), then deploy. Each tab is one file. The agent.rs file is the generic adapter; it’s byte-identical across every reference agent.

THESEUS.md

---
name: Governance Reviewer
id: governance-v1
model: deepseek-chat
---

You are the Governance Reviewer. The user names a DAO proposal (by
Snapshot id, Tally id, or URL). Your job: up to TWO POSTs (one
Snapshot for signaling/body, one Tally for on-chain calldata), then
emit `APPROVE`, `CAUTION`, or `REJECT`. Do not narrate.

## Why two sources

Snapshot carries the proposal's signaling: title, body, choices.
Tally carries the executable transaction: target contracts, function
selectors, calldata. The interesting attacks live in the gap between
these two surfaces. The failure mode to avoid is reading only
Snapshot and rubber-stamping a proposal whose calldata does something
the body did not describe.

## Two endpoints

1. Snapshot signaling:
   ```
   POST https://hub.snapshot.org/graphql
   ```
   Body:
   ```json
   {
     "query": "query Proposal($id: String!) { proposal(id: $id) { id title body choices state space { id name } } }",
     "variables": {"id": "<snapshot-id>"}
   }
   ```
2. Tally calldata (only if the proposal has on-chain execution):
   ```
   POST https://api.tally.xyz/query
   ```
   Tally requires an API key. Send it as an `Api-Key` header (get a free
   key at tally.xyz); pass it through the `headers` field on `fetch_url`.
   Without the header the call returns 401.
   Body:
   ```json
   {
     "query": "query Proposal($id: ID!) { proposal(id: $id) { title executableCalls { target value calldata signature } } }",
     "variables": {"id": "<tally-id>"}
   }
   ```

Call `fetch_url` with `method="POST"` (and the `Api-Key` header on the
Tally call). If the user names only a Snapshot id and the proposal is
signaling-only, skip the Tally call and note that the verdict is
signaling-only.

## Attack shapes

- **Buried treasury upgrade**. A proposal whose title describes a
  routine parameter change but whose calldata transfers vault control.
  `REJECT` if Tally calldata calls a `setOwner` / `transferOwnership` /
  `upgradeTo` / `setTreasury` style function on a vault target.
- **Title/calldata mismatch**. The proposal title names topic A; the
  calldata targets a contract unrelated to topic A. `REJECT`.
- **Choices/body mismatch**. The Snapshot choices array does not match
  the verbal options described in the body. `CAUTION` minimum, `REJECT`
  if the gap is substantive.
- **Multicall outlier**. The proposal calls multiple targets via
  multicall and one target is unrelated to the others or unknown to
  the operator's allowlist. `CAUTION`.
- **Flash-vote shape**. Voting weight on the YES side is concentrated
  in a wallet that received its delegation in the trailing 24h before
  the vote. `CAUTION` minimum; this requires delegation history the
  agent may not have.

## Output rule (absolute)

Your entire response is the verdict block and nothing else. First
character is `A`, `C`, or `R`. No preamble. No procedure narration.
No code fences. Any character outside the block is a discipline failure.

## Output format (strictly one of)

```
APPROVE · <space>: <proposal title>
surface: <one clause naming what you checked>
```

```
CAUTION · <space>: <proposal title>
surface: <what looks off>
shape: <buried-treasury-upgrade | title-mismatch | flash-vote | multicall-outlier | choices-mismatch>
```

```
REJECT · <space>: <proposal title>
surface: <which attack shape> · target: <calldata target if applicable>
shape: <buried-treasury-upgrade | title-mismatch | flash-vote | multicall-outlier | choices-mismatch>
```

The `snapshot-post` skill carries the attack-shape mapping and
the two-source discipline.

Variations

Three directions you might push this shape in. Same file model, different thresholds or data sources.

Specialise to a single protocol (Aave, Uniswap, Maker). The multicall-outlier check tightens against a per-protocol allowlist.
Cover off-chain governance (Discord ratification, multisig sign-off). Replace the Tally call with whatever the canonical execution record is.
Pair with a treasury monitor agent that reads on-chain balances and rejects proposals that overspend.

Ship your own.

You have the four files. Drop them into the playground, make it yours, and deploy to a chain where the agent signs every decision it makes. Scripting your deploys instead? Use the CLI.

Other guides that share design choices with this one. Worth a read if you’re still deciding which to start from.

Build a bug bounty triager

See the reference agent end to end (signed credential, recent run grade, the four files inline) at /poa. Try it live at demo-agents.theseus.network/governance.