Prover Ecosystem
The off-chain compute that runs every model call — and the on-chain machinery that verifies what provers submit.
In one paragraph
Provers watch the chain for InferenceQueued events, run AI model inference on GPU hardware, and submit results back along with verification material. The chain accepts results only via the submit_inference_result extrinsic, and only when the attached material verifies. There are two tiers — full provers with cryptographic proofs and lite provers with signature attestation — and the long-term direction is to retire the lite tier entirely.
- Full provers run models locally and generate TensorCommitment proofs inline during the forward pass. The chain verifies via a native KZG host function.
- Lite provers route to hosted APIs (OpenAI, etc.) and submit only a signed result. No computational integrity guarantee — pragmatic for breadth at alpha, phased out by mainnet.
- Selection is VRF-based: an on-chain capacity registry filters eligible provers per job, then a verifiable random draw picks one. Non-manipulable.
- Deadlines + slashing: missing a deadline triggers slashing via the staking pallet. Deadlines vary by latency class.
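The lifecycle above — job queued, prover runs inference, result accepted only if the attached material verifies and the deadline is met — can be sketched as a toy state machine. All names here (Job, verify_material, the dict-shaped material) are illustrative stand-ins, not the chain's actual types:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    job_id: int
    model: str
    deadline_block: int
    result: Optional[bytes] = None

def verify_material(material: dict) -> bool:
    # Stand-in for the chain's check: a full prover attaches a proof,
    # a lite prover (alpha only) attaches just a signature.
    return material.get("kind") in ("proof", "signature") and material.get("valid", False)

def submit_inference_result(job: Job, output: bytes, material: dict, current_block: int) -> str:
    if current_block > job.deadline_block:
        return "slashed"      # missed deadline -> slashing via the staking pallet
    if not verify_material(material):
        return "rejected"     # the attached material must verify
    job.result = output
    return "accepted"

job = Job(job_id=1, model="gpt-oss-120b", deadline_block=100)
status = submit_inference_result(job, b"out", {"kind": "proof", "valid": True}, current_block=42)
```

The point of the sketch is the single entry point: both tiers go through the same submission path, and only the verification branch differs.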
Two tiers, one extrinsic
From the chain’s perspective both tiers submit through the same submit_inference_result extrinsic. The verification path is what differs.
Full prover
- Runs models locally via vLLM (gpt-oss-120B or similar single-GPU model at alpha).
- Produces TensorCommitment proofs inline during the forward pass: KZG commitments, sumcheck, lookup arguments.
- Chain verifies via a native KZG host function. Consensus safety depends only on what this function accepts.
- Operated by Theseus at alpha. Open registration with staking by Beta.
Lite prover
- Routes inference to a hosted provider (OpenAI API or equivalent).
- Submits result with a signature only — no proof that the computation was correct.
- Pragmatic bridge while full proving matures. Provides model breadth and throughput from day one.
- Phased out by mainnet: at that point the network no longer accepts signature-only results.
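A lite-prover submission can be sketched with stdlib primitives. The real network would use a public-key signature scheme; HMAC over a hypothetical shared key stands in here, just to show the shape of the guarantee — the result is bound to a key, but nothing attests that the computation was correct:

```python
import hashlib
import hmac
import json

PROVER_KEY = b"lite-prover-demo-key"  # hypothetical; a real prover signs with its account key

def attest(job_id: int, output: str) -> dict:
    # Lite prover: run inference via a hosted API, then sign the result.
    payload = json.dumps({"job_id": job_id, "output": output}, sort_keys=True).encode()
    sig = hmac.new(PROVER_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def check_attestation(sub: dict) -> bool:
    # Verifier: the signature checks out, but this proves only who
    # submitted the result, not that the claimed model produced it.
    expected = hmac.new(PROVER_KEY, sub["payload"], hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sub["sig"])

sub = attest(7, "hello")
```

This is exactly the gap the full-prover tier closes, and why signature-only results are phased out by mainnet.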
Lite-prover trust assumption
A lite-prover result carries no computational integrity guarantee: accepting it means trusting the hosted provider and the prover that signed the result. The recentRuns.grade field tells verifiers whether an agent’s recent runs were proved or signed.
TensorCommitment proof system
The proof system is the cryptographic core of the verification model. Its job is to let the on-chain verifier confirm — in constant time — that a prover ran a specific model on a specific input and produced the claimed output. It’s designed for transformer-shaped networks, handling both the linear operations (matmul) and the nonlinear activations (GELU/SiLU/softmax).
Cryptographic primitives
Built on the BLS12-381 elliptic curve via the Arkworks library, chosen for its well-studied security and efficient pairings. Three primitives compose:
- Tiled multivariate KZG commitments — tensor activations tiled and committed using a product-power Structured Reference String.
- Sumcheck protocol — non-interactive via Fiat-Shamir. Verifies linear operations over committed tensors.
- Plookup-style lookup arguments — for nonlinear activations against precomputed tables.
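The sumcheck primitive can be illustrated end to end with a toy non-interactive instance over a small prime field. This sketch assumes a multilinear polynomial given by its evaluation table on the boolean hypercube; the real system works over BLS12-381's scalar field with commitments standing in for the final oracle query:

```python
import hashlib

P = 2**61 - 1  # toy prime field for illustration only

def challenge(transcript: bytes) -> int:
    # Fiat-Shamir: derive the round challenge from the transcript so far.
    return int.from_bytes(hashlib.sha256(transcript).digest(), "big") % P

def ml_eval(evals, point):
    # Evaluate the multilinear extension of `evals` (values on {0,1}^n,
    # first variable = most significant index bit) at an arbitrary point.
    cur = list(evals)
    for r in point:
        half = len(cur) // 2
        cur = [(cur[i] * (1 - r) + cur[half + i] * r) % P for i in range(half)]
    return cur[0]

def sumcheck(evals):
    # Prove-and-verify the claim: sum over x in {0,1}^n of g(x).
    n = len(evals).bit_length() - 1
    claim = sum(evals) % P
    transcript = claim.to_bytes(8, "big")
    cur, point = list(evals), []
    for _ in range(n):
        half = len(cur) // 2
        g0, g1 = sum(cur[:half]) % P, sum(cur[half:]) % P  # round poly at 0 and 1
        assert (g0 + g1) % P == claim                      # verifier's round check
        transcript += g0.to_bytes(8, "big") + g1.to_bytes(8, "big")
        r = challenge(transcript)                          # non-interactive challenge
        point.append(r)
        cur = [(cur[i] * (1 - r) + cur[half + i] * r) % P for i in range(half)]
        claim = (g0 * (1 - r) + g1 * r) % P                # g_i(r) is the next claim
    assert claim == ml_eval(evals, point)                  # final oracle check
    return True
```

In the real protocol the final check is answered by an opening of a KZG commitment rather than by re-reading the table, which is what keeps the verifier's work small.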
Proof generation pipeline
Proofs are generated inline during the vLLM forward pass — Python hooks intercept per-layer activations and feed them to the Rust proof engine. Seven steps:
- Input commitment (hash of input tokens).
- Per-layer KZG commitments — tiled multivariate commitments for each transformer layer (attention, FFN, layer norm).
- Sumcheck proofs for linear operations.
- Lookup proofs for nonlinear activations.
- Weight binding — sumcheck opening at a random challenge point, binding the proof to the registered model weights.
- Output commitment.
- Terkle tree aggregation — per-layer commitments aggregated into a single root for compact on-chain verification.
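The aggregation step (7) can be sketched as a binary hash tree over per-layer commitments. The document doesn't specify the Terkle tree's internals, so a plain sha256 Merkle construction stands in to show the shape — many commitments in, one compact root out:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def aggregate(layer_commitments: list) -> bytes:
    # Hash each per-layer commitment, then pair-and-hash up to a single root.
    level = [h(c) for c in layer_commitments]
    while len(level) > 1:
        if len(level) % 2:                # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

root = aggregate([b"attn-commitment", b"ffn-commitment", b"layernorm-commitment"])
```

Only the root needs to go on-chain; individual layer commitments can be opened against it when a specific layer's proof is checked.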
Hardware acceleration
The dominant cost in proof generation is multi-scalar multiplication (MSM), which paces the KZG commitments. To make proving practical at inference speeds, the prover uses GPU acceleration: Metal on macOS for development and CUDA in production. CUDA support extends to MSM, NTT, quantization, and sumcheck proof generation, so the full proof pipeline runs on the same GPU that runs inference.
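Why MSM dominates, and why it parallelizes well on GPUs, is easiest to see from Pippenger's bucket method — the standard algorithm behind accelerated MSM. The sketch below runs it over a toy additive group (integers mod a prime standing in for curve points), which shows the window/bucket structure without curve arithmetic:

```python
Q = 2**31 - 1  # toy group modulus; real MSM uses BLS12-381 points

def msm_naive(scalars, points):
    # Reference: sum of s_i * P_i, one full scalar multiplication per term.
    return sum(s * p for s, p in zip(scalars, points)) % Q

def msm_pippenger(scalars, points, c=4):
    # Split scalars into c-bit windows; within each window, sort points
    # into buckets by digit so each point costs one group addition.
    nbits = max(scalars).bit_length()
    windows = []
    for w in range(0, nbits, c):
        buckets = [0] * (1 << c)
        for s, p in zip(scalars, points):
            digit = (s >> w) & ((1 << c) - 1)
            if digit:
                buckets[digit] = (buckets[digit] + p) % Q
        # Running-sum trick: sum_j j * buckets[j] in ~2 * 2^c additions.
        acc = total = 0
        for b in reversed(buckets[1:]):
            acc = (acc + b) % Q
            total = (total + acc) % Q
        windows.append(total)
    res = 0
    for wtotal in reversed(windows):
        res = (res << c) % Q          # c doublings to shift up one window
        res = (res + wtotal) % Q
    return res
```

The bucket accumulation is embarrassingly parallel across windows and buckets, which is what makes MSM a natural fit for Metal and CUDA kernels.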
For the cryptographic deep-dive (with the verifier-side O(1) cost argument), see Tensor Commits.
Selection: VRF lottery + capacity registry
Manipulation-resistant assignment matters: if users or provers could pick which prover handles a given job, you’d open the door to collusion, censorship, and selective computation. The target design uses a verifiable random draw that respects each model’s hardware needs.
- Each registered prover declares hardware capacity (VRAM, RAM, supported models) in an on-chain Capacity Registry.
- For each inference job, the eligible set is filtered to provers whose capacity meets the model’s requirements.
- A VRF draw, seeded from consensus randomness, selects the assigned prover. The selection is non-manipulable and verifiable on-chain.
- Selected provers have a block-based deadline; missing it triggers slashing via the staking pallet.
At alpha (one full prover and two lite provers), any registered prover can submit results for any job. VRF-based selection activates as the prover set grows.
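The capacity-filter-then-draw flow can be sketched as follows. A real VRF lets the drawn prover present a proof of the draw; a hash of public inputs stands in here, which captures the filtering and the determinism but not the proof part. The registry entries and requirement fields are invented for illustration:

```python
import hashlib

REGISTRY = {  # hypothetical Capacity Registry entries
    "prover-a": {"vram_gb": 80, "models": {"gpt-oss-120b"}},
    "prover-b": {"vram_gb": 24, "models": {"small-model"}},
    "prover-c": {"vram_gb": 80, "models": {"gpt-oss-120b", "small-model"}},
}

def assign(job_id: int, model: str, min_vram_gb: int, seed: bytes) -> str:
    # Step 1: filter the eligible set by the model's hardware requirements.
    eligible = sorted(
        name for name, cap in REGISTRY.items()
        if cap["vram_gb"] >= min_vram_gb and model in cap["models"]
    )
    if not eligible:
        raise ValueError("no prover meets the model's requirements")
    # Step 2: deterministic draw seeded from consensus randomness + job id,
    # so anyone can recompute and verify the assignment.
    digest = hashlib.sha256(seed + job_id.to_bytes(8, "big")).digest()
    return eligible[int.from_bytes(digest, "big") % len(eligible)]
```

Because the seed comes from consensus randomness fixed before the job is known, neither users nor provers can steer the draw.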
Accountability and latency classes
Honest behavior is enforced through economic pressure, not policy: misbehavior is made more costly than honest participation.
Staking
Registered provers post a bond. Misbehavior triggers slashing.
Capacity claims
Provers declare hardware capacity. Failure to deliver assigned jobs signals misreported capacity, triggering penalties.
Deadlines
Each job has a block-based deadline derived from its latency class. Missing it triggers slashing.
Latency classes
Jobs grouped into RT (real-time), Interactive, and Bulk classes with differentiated deadlines and fees.
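The deadline mechanics can be sketched directly. The block counts and fee multipliers below are invented for illustration — the document specifies only that the three classes differ in deadline and fee:

```python
LATENCY_CLASSES = {  # hypothetical parameters per class
    "RT":          {"deadline_blocks": 2,   "fee_multiplier": 3.0},
    "Interactive": {"deadline_blocks": 20,  "fee_multiplier": 1.5},
    "Bulk":        {"deadline_blocks": 200, "fee_multiplier": 1.0},
}

def deadline_for(queued_at_block: int, latency_class: str) -> int:
    # Block-based deadline derived from the job's latency class.
    return queued_at_block + LATENCY_CLASSES[latency_class]["deadline_blocks"]

def should_slash(current_block: int, queued_at_block: int, latency_class: str) -> bool:
    # Missing the deadline triggers slashing via the staking pallet.
    return current_block > deadline_for(queued_at_block, latency_class)
```

Tighter deadlines pair with higher fees, so provers are paid for the scheduling risk that RT jobs impose.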
At alpha with a small prover set, these mechanisms operate in a simplified form. Full staking and slashing activate with open prover registration in Beta — see Status & Roadmap.