Forese
April 29, 2026·7 min read·engineering / router / providers

Behind the scenes: how Forese routes a shot to the right model

Twelve providers, dozens of models, and one prompt. A look at how Forese picks the right camera for the job — and what happens when the first choice fails.

Every shot you generate in Forese passes through a single router. You pick a model from the picker, see an estimated cost, and click Generate. What happens next is a four-stage flow that we built across Phases 04–08 of the build.

Stage 1 — Estimate + hold

Before we call any provider, we estimate the cost in millicredits, place a credit hold against the org's accounts in priority order (free, then plan, then top-up), and persist the generations row in the pending state. If the org can't cover the estimate, we throw insufficient_balance at this point, so the user sees a clean upgrade CTA rather than a half-done generation.
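The priority-ordered hold can be sketched roughly as follows. This is an illustration under assumptions, not the Forese implementation: the Account and HoldSlice shapes, the placeHold function, and the InsufficientBalanceError class are hypothetical names. Only the free, plan, top-up order, the millicredit unit, and the insufficient_balance code come from the description above.

```typescript
// Hypothetical shapes for illustration; not taken from the Forese codebase.
type AccountKind = "free" | "plan" | "topup";

interface Account {
  kind: AccountKind;
  availableMillicredits: number;
}

interface HoldSlice {
  kind: AccountKind;
  amountMillicredits: number;
}

class InsufficientBalanceError extends Error {
  readonly code = "insufficient_balance";
  readonly shortfallMillicredits: number;

  constructor(shortfallMillicredits: number) {
    super(`insufficient_balance: short by ${shortfallMillicredits} millicredits`);
    this.name = "InsufficientBalanceError";
    this.shortfallMillicredits = shortfallMillicredits;
  }
}

// Accounts are drained in this order: free first, top-up last.
const PRIORITY: readonly AccountKind[] = ["free", "plan", "topup"];

function placeHold(
  accounts: readonly Account[],
  estimateMillicredits: number,
): HoldSlice[] {
  // Fail before touching any account if the org cannot cover the estimate.
  const total = accounts.reduce((sum, a) => sum + a.availableMillicredits, 0);
  if (total < estimateMillicredits) {
    throw new InsufficientBalanceError(estimateMillicredits - total);
  }

  // Sort a copy so the caller's array is left untouched.
  const ordered = [...accounts].sort(
    (a, b) => PRIORITY.indexOf(a.kind) - PRIORITY.indexOf(b.kind),
  );

  const slices: HoldSlice[] = [];
  let remaining = estimateMillicredits;
  for (const account of ordered) {
    if (remaining === 0) break;
    const take = Math.min(account.availableMillicredits, remaining);
    if (take > 0) {
      slices.push({ kind: account.kind, amountMillicredits: take });
      remaining -= take;
    }
  }
  return slices;
}
```

Checking the total before slicing means a failed hold never leaves a partial reservation behind, which matches the "no half-done generation" goal.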

Stage 2 — Provider call

The Provider Abstraction Layer (PAL) gives every model the same submit/poll/normalize shape. We call provider.submit(input, ctx), then either wait on wait.forToken (webhook transport) or call provider.poll() on an interval (poll transport). Both paths converge on a normalized result.
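To make the submit/poll/normalize shape concrete, here is a minimal sketch of the poll transport. Everything beyond the three method names is an assumption: the Provider interface, the jobId handle, poll taking that handle and returning null while the job is running, the flat NormalizedResult type, and the choice to report a poll timeout as failed_retryable are all hypothetical. The webhook path is left out.

```typescript
// Simplified result type for this sketch; the real one likely carries more detail.
interface NormalizedResult {
  status: "succeeded" | "failed_terminal" | "failed_retryable";
  detail?: string;
}

interface SubmitContext {
  generationId: string;
}

// Hypothetical provider surface: one shape for every model.
interface Provider<Input, Raw> {
  id: string;
  submit(input: Input, ctx: SubmitContext): Promise<{ jobId: string }>;
  // Assumed contract: resolves to null while the job is still running.
  poll(jobId: string): Promise<Raw | null>;
  normalize(raw: Raw): NormalizedResult;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runWithPolling<Input, Raw>(
  provider: Provider<Input, Raw>,
  input: Input,
  ctx: SubmitContext,
  options: { intervalMs: number; maxAttempts: number },
): Promise<NormalizedResult> {
  const { jobId } = await provider.submit(input, ctx);

  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    const raw = await provider.poll(jobId);
    if (raw !== null) {
      // The provider-specific payload is converted to the shared shape here.
      return provider.normalize(raw);
    }
    await sleep(options.intervalMs);
  }

  // Assumption: running out of poll attempts is treated as transient.
  return { status: "failed_retryable", detail: "poll attempts exhausted" };
}
```

Because the caller only ever sees NormalizedResult, a webhook-based transport could return the same type and the rest of the flow would not need to know which transport ran.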

Stage 3 — Branch on outcome

The normalized result is one of:

  • succeeded — break out to the success path
  • failed_terminal — invalid input or content-policy rejection. No retry; we either charge-and-explain (policy) or refund (input).
  • failed_retryable — transient. Try the next fallback provider.
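The three outcomes map naturally onto a discriminated union. The sketch below is a hypothetical decision function, not code from router.ts: the NextStep type, the decideNextStep name, the reason field, and the fallbacks_exhausted case (the list above does not say what happens when no fallback is left) are assumptions made for illustration.

```typescript
// Hypothetical result shape; the status names follow the list above.
type NormalizedResult =
  | { status: "succeeded"; outputUrl: string }
  | { status: "failed_terminal"; reason: "invalid_input" | "content_policy" }
  | { status: "failed_retryable" };

type NextStep =
  | { kind: "stitch"; outputUrl: string }
  | { kind: "charge_and_explain" }
  | { kind: "refund" }
  | { kind: "try_fallback"; providerId: string }
  | { kind: "fallbacks_exhausted" };

function decideNextStep(
  result: NormalizedResult,
  remainingFallbacks: readonly string[],
): NextStep {
  switch (result.status) {
    case "succeeded":
      return { kind: "stitch", outputUrl: result.outputUrl };

    case "failed_terminal":
      // No retry: policy rejections are charged and explained, bad input is refunded.
      return result.reason === "content_policy"
        ? { kind: "charge_and_explain" }
        : { kind: "refund" };

    case "failed_retryable": {
      const next = remainingFallbacks[0];
      return next !== undefined
        ? { kind: "try_fallback", providerId: next }
        : { kind: "fallbacks_exhausted" }; // Assumption: handling is left to the caller.
    }
  }
}
```

Keeping the decision pure (no I/O) makes each branch easy to unit test, and the exhaustive switch means the compiler flags any new status that is added without a matching branch.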

Stage 4 — Stitch

On success, we download the output, upload it to R2, create a Mux asset, and settle the credit hold against the original estimate. On terminal failure, we refund proportionally. At every step, we record a generation_event so you can audit the run later.
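One way to picture the "record an event at every step" rule is a small runner that executes the stitch steps in order and logs each outcome. This is a sketch under assumptions: the GenerationEvent fields, the Step type, the step names, and the runStitch function are hypothetical, and the real download, R2, Mux, and settlement calls are replaced by caller-supplied functions.

```typescript
// Hypothetical event shape; the real generation_event row may differ.
interface GenerationEvent {
  generationId: string;
  step: string;
  outcome: "ok" | "error";
  at: string; // ISO 8601 timestamp
}

interface Step {
  name: string;
  run: () => Promise<void>;
}

// Runs steps in order, recording one event per step.
// Resolves to the name of the step that failed, or null if all succeeded.
async function runStitch(
  generationId: string,
  steps: readonly Step[],
  record: (event: GenerationEvent) => void,
): Promise<string | null> {
  for (const step of steps) {
    try {
      await step.run();
      record({
        generationId,
        step: step.name,
        outcome: "ok",
        at: new Date().toISOString(),
      });
    } catch {
      record({
        generationId,
        step: step.name,
        outcome: "error",
        at: new Date().toISOString(),
      });
      // Stop at the first failure so later steps never run on missing output.
      return step.name;
    }
  }
  return null;
}
```

A caller would pass steps named something like download, upload_r2, create_mux_asset, and settle_hold, in that order. Because the failing step is recorded before the runner returns, the event log always shows exactly how far a generation got.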

What's different about this design

Most "multi-provider" systems wire each model into the call site. We wire them into the router, so every model exposes the same surface. Adding a new provider takes one Zod schema, one submit function, and one normalize function, and the provider gets retries, fallback, billing, and auditing for free.

If you're curious about the codebase, the relevant files are packages/providers/src/router.ts and packages/trigger/src/tasks/generation/submit-generation.ts. The whole flow is ~600 lines.
