New: Agent Runtime v2 is generally available

Ship AI agents that run in production.

Fermion is the runtime for agentic AI. Deploy tools, models, and long-running workflows behind one fast API — with sub-second cold starts, durable state, and observability built in.

Start building free · Read the docs

No credit card required · 5,000 free agent runs every month

Build. Run. Observe.

agent.ts

TypeScript

import { Fermion } from "@fermion/sdk";

// search, postgres, and browser are assumed to be tool definitions declared elsewhere.
const agent = new Fermion.Agent({
  model: "orion-4",
  tools: [search, postgres, browser],
  memory: "durable",
});

const run = await agent.stream({
  input: "Draft the Q3 board update",
});

Define once. Declare your agent — model, tools, and durable memory — in a few typed lines.

run · orion-4 · LIVE

$ agent.stream({ input: "Draft the Q3 board update" })

planner · reasoning · plan ready · ok
retrieval · 12 docs · 302ms · ok
tool·postgres · SELECT metrics · 96ms · ok
synthesis · streaming 412 tokens

p95 latency: 412ms (-18%)

Run anywhere. Stream long-running workflows behind one fast API, with sub-second cold starts.

Live traces · last 60 min

p50 · 210ms   p95 · 412ms

planner · 180ms
retrieval · 302ms
synthesis · 418ms

Observe everything. Trace every tool call, token, and dollar in real time — then replay any run.

Platform

Everything you need to run agents in production.

Stop stitching together queues, sandboxes, and dashboards. Fermion is the full runtime — from the first prototype to millions of runs a day.

Sub-second cold starts

Snapshot-restored sandboxes resume in a median of 380ms, so idle agents still feel instant. Never pay for warm pools you don't need.

One API, every model

Route across frontier and open models with automatic failover, streaming, and per-token cost caps — no vendor rewrites.
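One common way to get automatic failover is to try an ordered list of providers and fall through to the next one on error. The sketch below shows that pattern in general terms. `ModelProvider` and `completeWithFailover` are hypothetical names rather than Fermion's API, and this page does not describe how the router actually selects models.

```typescript
// Hypothetical provider interface: any function that turns a prompt into text.
interface ModelProvider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

// Tries each provider in order and returns the first successful result.
// If every provider fails, the collected error messages are reported together.
async function completeWithFailover(
  providers: ModelProvider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const text = await p.complete(prompt);
      return { provider: p.name, text };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${p.name}: ${message}`);
    }
  }
  throw new Error(`all providers failed (${errors.join("; ")})`);
}
```

Calling `completeWithFailover([primary, backup], prompt)` returns the backup's answer whenever the primary throws, and the result records which provider actually served the request.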

Durable execution

Long-running workflows survive restarts, retries, and deploys. State is checkpointed, so an interrupted run picks up from its last checkpoint instead of starting over.
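The checkpointing idea can be illustrated with a generic sketch. This is not Fermion's implementation or API: `Step`, `CheckpointStore`, and `runDurably` are hypothetical names, and the in-memory `Map` stands in for whatever durable storage a real runtime would use.

```typescript
// A step takes the state so far and returns the next state.
type Step<S> = { name: string; run: (state: S) => S };

// Hypothetical checkpoint record: how many steps finished, and the state after them.
interface Checkpoint<S> {
  completed: number;
  state: S;
}

// Hypothetical store; an in-memory Map stands in for durable storage here.
class CheckpointStore<S> {
  private data = new Map<string, Checkpoint<S>>();
  load(runId: string): Checkpoint<S> | undefined {
    return this.data.get(runId);
  }
  save(runId: string, cp: Checkpoint<S>): void {
    this.data.set(runId, cp);
  }
}

// Runs steps in order, saving a checkpoint after each one. If an earlier
// attempt failed partway, execution resumes after the last completed step.
function runDurably<S>(
  runId: string,
  steps: Step<S>[],
  initial: S,
  store: CheckpointStore<S>,
): S {
  const cp = store.load(runId) ?? { completed: 0, state: initial };
  let state = cp.state;
  for (let i = cp.completed; i < steps.length; i++) {
    state = steps[i].run(state); // may throw; the saved checkpoint stays at step i
    store.save(runId, { completed: i + 1, state });
  }
  return state;
}
```

If a step throws, calling `runDurably` again with the same `runId` skips the steps that already completed and retries only from the failed one.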

Observability built in

Trace every tool call, token, and dollar in real time. Replay any run step by step and share a permalink with your team.

Type-safe SDKs

First-class TypeScript and Python with generated types for every tool. Go from a blank file to a deployed agent in an afternoon.

Secure by default

SOC 2 Type II, SSO, and private VPC networking. Your prompts and data are isolated per tenant and never train a model.

Observability

See exactly what your agents are doing.

Every run is a distributed trace. Drill into spans, inspect the inputs and outputs of each tool call, and catch regressions before your users ever notice them.

Live traces · last 60 min (10:04 to 11:04)

p50 · 210ms   p95 · 412ms

planner · 180ms
retrieval · 302ms
tool·postgres · 96ms
synthesis · 418ms

Waterfall traces for every tool, model, and sub-agent call
Live latency, token, and cost metrics on every single run
Alerts on error-rate and p95 latency regressions
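
As an illustration of a p95 latency regression check, here is a sketch using the nearest-rank percentile. The 10% threshold and the names `percentile` and `hasP95Regression` are assumptions made for this example. The page does not specify how Fermion computes percentiles or decides when to alert.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new RangeError("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Flags a regression when the current p95 exceeds the baseline p95 by more
// than `threshold` (0.1 means 10%, an arbitrary value chosen for this sketch).
function hasP95Regression(
  baselineMs: number[],
  currentMs: number[],
  threshold = 0.1,
): boolean {
  const baselineP95 = percentile(baselineMs, 95);
  const currentP95 = percentile(currentMs, 95);
  return currentP95 > baselineP95 * (1 + threshold);
}
```

Comparing a window of recent run latencies against a baseline window with `hasP95Regression(baseline, recent)` gives a simple boolean that an alerting rule could act on.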

380ms

Median cold start

99.98%

Rolling 90-day uptime

12B+

Agent runs per month

40+

Global edge regions

Pricing

Simple, usage-based pricing.

Start free and scale when you do. No seat taxes, no annual lock-in, no surprises on your bill.

Hobby

$0 /mo

For experiments, prototypes, and side projects.

Start for free

5,000 agent runs / month
2 concurrent sandboxes
7-day trace retention
Community Slack support

Team (Most popular)

$99 /mo

For teams running real workloads in production.

Start 14-day trial

250,000 runs included, then $0.40 per 1,000 additional runs
Autoscaling sandboxes
30-day retention + SOC 2 report
Priority email & Slack support
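
To make the Team plan's usage-based arithmetic concrete, here is a small sketch built from the figures listed above. It assumes overage is billed pro rata per run (the page does not say whether partial thousands are rounded up), and `teamMonthlyCost` is a hypothetical helper, not part of any Fermion SDK.

```typescript
// Team plan figures taken from the pricing list above.
const TEAM_BASE_USD = 99;
const TEAM_INCLUDED_RUNS = 250_000;
const TEAM_OVERAGE_USD_PER_1K = 0.4;

// Hypothetical helper: estimates a monthly Team bill in USD.
// Assumption: overage is pro-rated per run, not rounded up to whole 1k blocks.
function teamMonthlyCost(runs: number): number {
  if (!Number.isFinite(runs) || runs < 0) {
    throw new RangeError("runs must be a non-negative number");
  }
  const overageRuns = Math.max(0, runs - TEAM_INCLUDED_RUNS);
  const overageUsd = (overageRuns / 1000) * TEAM_OVERAGE_USD_PER_1K;
  // Round to cents to avoid floating-point noise in the result.
  return Math.round((TEAM_BASE_USD + overageUsd) * 100) / 100;
}
```

Under these assumptions, a month with 1,000,000 runs has 750,000 overage runs, which adds 750 × $0.40 = $300 to the $99 base for a total of $399.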

Enterprise

Custom

For scale, compliance, and dedicated support.

Talk to sales

Volume pricing & committed use
Private networking (VPC peering)
SSO / SAML & 99.99% SLA
Dedicated solutions engineer