Model
Model-level middleware. Automatic retry with exponential backoff for transient failures, and complexity-based routing to optimize cost.
model.retry()
Wraps LLM calls with exponential backoff for transient failures (rate limits, network errors).

    function modelRetry(config?: RetryConfig): Middleware

    // Default: 2 retries, 1000ms initial delay
    agent.use(model.retry())

    // Custom: 3 retries, 500ms initial delay
    agent.use(model.retry({ maxRetries: 3, initialDelayMs: 500 }))

Config options:
| Option | Type | Default | Description |
|---|---|---|---|
| maxRetries | number | 2 | Maximum retry attempts |
| initialDelayMs | number | 1000 | Initial delay (doubles each retry) |
Hooks: model — wraps next() in a retry loop.
Retryable errors: RateLimitError, NetworkError. Non-retryable: AuthenticationError, ContentFilterError. For RateLimitError with retryAfter, the delay is the greater of the backoff delay and retryAfter * 1000.
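
The retry rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the error classes are hypothetical stand-ins that reuse the documented names, `retryAfter` is assumed to be in seconds (as the `* 1000` conversion suggests), and `retryDelayMs`, `isRetryable`, and `withRetry` are names invented for this example.

```typescript
// Hypothetical stand-ins for the library's error types; only the names come from the docs.
class RateLimitError extends Error {
  retryAfter?: number // assumed to be seconds
  constructor(message: string, retryAfter?: number) {
    super(message)
    this.retryAfter = retryAfter
  }
}
class NetworkError extends Error {}

interface RetryConfig {
  maxRetries?: number
  initialDelayMs?: number
}

// Delay before retry number `attempt` (0-based): the initial delay doubled once per
// earlier retry, raised to retryAfter * 1000 when that hint is larger.
function retryDelayMs(attempt: number, initialDelayMs: number, retryAfter?: number): number {
  const backoff = initialDelayMs * 2 ** attempt
  return retryAfter === undefined ? backoff : Math.max(backoff, retryAfter * 1000)
}

// Only rate limits and network errors are retried; everything else is rethrown.
function isRetryable(err: unknown): boolean {
  return err instanceof RateLimitError || err instanceof NetworkError
}

// Runs `call`, retrying up to maxRetries times on retryable errors.
async function withRetry<T>(call: () => Promise<T>, config: RetryConfig = {}): Promise<T> {
  const maxRetries = config.maxRetries ?? 2
  const initialDelayMs = config.initialDelayMs ?? 1000
  for (let attempt = 0; ; attempt++) {
    try {
      return await call()
    } catch (err) {
      if (!isRetryable(err) || attempt >= maxRetries) throw err
      const hint = err instanceof RateLimitError ? err.retryAfter : undefined
      const delay = retryDelayMs(attempt, initialDelayMs, hint)
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
}
```

With the defaults, a call that keeps failing with a retryable error is attempted three times in total, waiting 1000ms and then 2000ms between attempts.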
model.router()
Routes model calls by complexity, switching between cheaper and more capable models automatically. Saves 60-90% on LLM costs for mixed-complexity workloads.

    function modelRouter(config: ModelRouterConfig): Middleware

    agent.use(model.router({
      routes: {
        simple: "anthropic/claude-haiku-4-5",
        medium: "anthropic/claude-sonnet-4-6",
        complex: "anthropic/claude-opus-4-6",
      },
    }))

    // Custom classifier
    agent.use(model.router({
      routes: { simple: "...", medium: "...", complex: "..." },
      classify: (ctx) => {
        if (ctx.toolDefs.length > 3) return "complex"
        return "simple"
      },
    }))

Default classification heuristic:
- simple: < 500 estimated tokens AND no tools defined
- complex: > 2000 tokens OR 5+ tools available
- medium: everything else
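
As a rough illustration, the three rules above could be written like this. The input shape (`ClassifierInput`) and the function names are hypothetical, and rounding the chars/4 estimate up is an assumption; the built-in heuristic may differ in such details.

```typescript
type ComplexityTier = "simple" | "medium" | "complex"

// Hypothetical input shape for this sketch; the real classifier receives a ModelContext.
interface ClassifierInput {
  text: string      // prompt text used for the token estimate
  toolCount: number // number of tools available to the model
}

// chars/4 estimate; rounding up is an assumption of this sketch.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

function classifyDefault(input: ClassifierInput): ComplexityTier {
  const tokens = estimateTokens(input.text)
  // complex: > 2000 tokens OR 5+ tools
  if (tokens > 2000 || input.toolCount >= 5) return "complex"
  // simple: < 500 tokens AND no tools
  if (tokens < 500 && input.toolCount === 0) return "simple"
  // medium: everything else
  return "medium"
}
```

The simple and complex conditions cannot both hold for the same input, so the order of the two checks does not affect the result.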
Config options:
| Option | Type | Default | Description |
|---|---|---|---|
| routes | Record<ComplexityTier, string> | required | Model ID for each tier. ComplexityTier is "simple" \| "medium" \| "complex". |
| classify | (ctx: ModelContext) => ComplexityTier | built-in heuristic | Custom classifier function |
| tokenCounter | TokenCounter | chars/4 | Token counter for input complexity estimation |
Hooks: model — classifies and calls ctx.setModel() before next().
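
A minimal sketch of what a hook like this might look like, assuming a middleware is an object with an async `model(ctx, next)` method. The `Middleware`, `Next`, and `ModelContext` shapes and the name `sketchRouter` are hypothetical; only `ctx.setModel()`, `ctx.toolDefs`, `routes`, and `classify` appear in the documentation above. For brevity, `classify` is required here rather than falling back to the built-in heuristic.

```typescript
type ComplexityTier = "simple" | "medium" | "complex"

// Hypothetical shapes for illustration only.
interface ModelContext {
  toolDefs: unknown[]
  setModel(modelId: string): void
}
type Next = () => Promise<string>
interface Middleware {
  model(ctx: ModelContext, next: Next): Promise<string>
}
interface ModelRouterConfig {
  routes: Record<ComplexityTier, string>
  classify: (ctx: ModelContext) => ComplexityTier
}

function sketchRouter(config: ModelRouterConfig): Middleware {
  return {
    async model(ctx, next) {
      // Pick a tier, select the matching model, then hand off to the rest of the chain.
      const tier = config.classify(ctx)
      ctx.setModel(config.routes[tier])
      return next()
    },
  }
}
```

Because the model is selected before `next()` runs, any middleware later in the chain, and the model call itself, sees the routed model.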