← ProofHonest by design

Seven features that prove the AI knows where its authority ends.

We ran a 72-pass QA session against the live product. Every one of the moments below surfaced as the kind of behavior an experienced buyer notices first — the way the system tells the truth about its work, its cost, and its limits. Click any card to try the same surface yourself.

F-338Billion-dollar premium · no MVP excuses

Truncation tells you the truth.

When a long answer hits the response cap mid-section, the surface flags it explicitly — "⚠ Output was cut off at the response cap" — and offers a one-click continuation. No silent half-finished plan. No "go back and re-pay" friction. The credits charged match the tokens that actually shipped.

Feel it on Legal Compliance Checker
F-328 / F-334Nintendo charm · Mystery Box (#72), white-hat scaffolding

The BONUS observations the model genuinely noticed.

On strategy roles (Business Consultant, Product Manager, Proposal Strategist, Feedback Synthesizer), the model can optionally surface up to three side observations it noticed while doing the brief — adjacent broken things, hidden opportunities, questions you should be asking. Strict rule baked in: empty is the default. Never fabricated to fill the box.

Try Product Manager with a 200-word brief
F-351Stripe trust on advisory surfaces · discipline-before-delivery

Self-validation flags the model's own potential overreach.

On authority-sensitive roles (Legal Compliance Checker, Customer Service Rep, PPC Strategist, Bookkeeper, Proposal Strategist), a deterministic post-validator audits the draft against the user's stated policy and flags specific concerns — "Used 'guaranteed' without explicit qualification," "Made an absolute action promise," "Asserted compliance status" — before the draft reaches the customer.

Send Customer Service a real policy
F-363 / F-364Billion-dollar premium · no "coming soon" copy

Builds deploy live URLs with working forms.

Vibe Coder (Senior Developer) ships every site to a real `/preview/<uuid>` URL the visitor can open immediately. Contact forms wire to a honeypot-protected endpoint — bots get a 200, humans get a real submit. No "download the bundle and host it yourself" friction. The work goes live the moment it ships.

Brief Senior Developer for a one-page site
F-375Stripe trust on the money surface

The price is on the chip before you commit.

Pre-flight estimate, pre-hire gate estimate, post-hoc receipt — they all share one format. Inside the band (~4–14 cr) reads as expected; over the band gets explicit "over by N"; under the band gets a clean "under" flag. No surprise costs, no hidden ceilings, no "actual may vary." Same shape Stripe Checkout uses on every premium money surface you've ever paid through.

Watch the chip on any prompt
F-380Stripe trust · no silent failures

You pay for actuals, not estimates.

When the actual run uses fewer tokens than the chip suggested, the difference is refunded back the same second — not at end-of-month, not on request. The receipt shows the math explicitly: estimate / cap / charged / refunded. If the run can't ship at all, the refund event shows up on the chat surface immediately. Stripe never refunds you without an email; we never refund you without a banner.

Run any short prompt and read the receipt
F-519Stripe trust on structured deliverables · discipline-before-delivery

Structured-output roles audit themselves against their own Critical Rules.

On roles that ship a fixed-contract artifact — Financial Analyst's three-scenario model, Recruiter HR's interview scorecard, Automation Architect's blueprint, Proposal Strategist's sectioned proposal — the model checks its own draft against the contract before sending. When a section is missing, the surface flags it by name ("missing the base / bull / bear scenario structure," "missing the sensitivity callout") and asks you to request a fix. Free-form content roles don't fire this — there's no fixed contract to audit against — so it never over-promises validation it can't deliver.

Brief Financial Analyst with a model request

These aren't aspirational claims. Each card maps to a finding documented in the QA session at docs/qa/2026-05-26 and shipped as code in commits referenced from the resolution log. The receipts under /proof show what the squad produces; this gallery shows how the squad behaves while producing it.