Model

Model-level middleware: automatic retry with exponential backoff for transient failures, and complexity-based routing to optimize cost.

model.retry() wraps LLM calls in a retry loop with exponential backoff, so transient failures (rate limits, network errors) are retried instead of failing the call.

function modelRetry(config?: RetryConfig): Middleware

// Default: 2 retries, 1000ms initial delay
agent.use(model.retry())

// Custom: 3 retries, 500ms initial delay
agent.use(model.retry({ maxRetries: 3, initialDelayMs: 500 }))

Config options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxRetries | number | 2 | Maximum retry attempts |
| initialDelayMs | number | 1000 | Initial delay (doubles each retry) |

Hooks: model — wraps next() in a retry loop.

Retryable errors: RateLimitError, NetworkError. Non-retryable: AuthenticationError, ContentFilterError. For RateLimitError with retryAfter, the delay is the greater of the backoff delay and retryAfter * 1000.
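
The retry rules above can be sketched as a small standalone loop. This is an illustration, not the library's implementation: the error classes here are local stand-ins that only borrow the documented names, `retryDelayMs` and `callWithRetry` are hypothetical helpers, and the sketch assumes that `retryAfter` is given in seconds (as the `* 1000` suggests) and that the first retry waits exactly `initialDelayMs`.

```typescript
// Local stand-ins for the library's error types; only the names come from the docs.
class RateLimitError extends Error {
  retryAfter?: number; // assumed to be in seconds
  constructor(message: string, retryAfter?: number) {
    super(message);
    this.retryAfter = retryAfter;
  }
}
class NetworkError extends Error {}

// Delay before retry number `attempt` (0-based). The backoff doubles on each
// retry, and a retryAfter hint wins only when it is the larger of the two.
function retryDelayMs(attempt: number, initialDelayMs: number, retryAfter?: number): number {
  const backoff = initialDelayMs * 2 ** attempt;
  return Math.max(backoff, (retryAfter ?? 0) * 1000);
}

// Hypothetical retry loop: one initial call plus up to `maxRetries` retries.
async function callWithRetry<T>(
  next: () => Promise<T>,
  maxRetries = 2,
  initialDelayMs = 1000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await next();
    } catch (err) {
      const retryable = err instanceof RateLimitError || err instanceof NetworkError;
      // Non-retryable errors and exhausted retries propagate to the caller.
      if (!retryable || attempt >= maxRetries) throw err;
      const hint = err instanceof RateLimitError ? err.retryAfter : undefined;
      const delay = retryDelayMs(attempt, initialDelayMs, hint);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Under these assumptions, the default configuration waits 1000 ms before the first retry and 2000 ms before the second, unless a rate-limit response asks for a longer pause.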


model.router() routes each model call by estimated complexity, switching between cheaper and more capable models automatically. Saves 60-90% on LLM costs for mixed-complexity workloads.

function modelRouter(config: ModelRouterConfig): Middleware

agent.use(model.router({
  routes: {
    simple: "anthropic/claude-haiku-4-5",
    medium: "anthropic/claude-sonnet-4-6",
    complex: "anthropic/claude-opus-4-6",
  },
}))

// Custom classifier
agent.use(model.router({
  routes: { simple: "...", medium: "...", complex: "..." },
  classify: (ctx) => {
    if (ctx.toolDefs.length > 3) return "complex"
    return "simple"
  },
}))

Default classification heuristic:

  • simple: < 500 estimated tokens AND no tools defined
  • complex: > 2000 tokens OR 5+ tools available
  • medium: everything else
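
As a rough illustration, the three rules above can be written as a plain function. This is a sketch rather than the built-in classifier: `estimateTokens` and `classifyByHeuristic` are hypothetical names, the input is simplified to a text string plus a tool count, and rounding up in the chars/4 estimate is an assumption.

```typescript
type ComplexityTier = "simple" | "medium" | "complex";

// Rough token estimate at about four characters per token (rounding up is assumed).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical standalone version of the default heuristic.
function classifyByHeuristic(inputText: string, toolCount: number): ComplexityTier {
  const tokens = estimateTokens(inputText);
  // simple: short input and no tools defined
  if (tokens < 500 && toolCount === 0) return "simple";
  // complex: long input, or five or more tools available
  if (tokens > 2000 || toolCount >= 5) return "complex";
  // medium: everything else
  return "medium";
}
```

Note that a short prompt with even one tool defined falls into the medium tier, because the simple tier requires both conditions to hold.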

Config options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| routes | Record<ComplexityTier, string> | required | Model ID for each tier. ComplexityTier is one of "simple", "medium", or "complex". |
| classify | (ctx: ModelContext) => ComplexityTier | built-in heuristic | Custom classifier function |
| tokenCounter | TokenCounter | chars/4 | Token counter for input complexity estimation |

Hooks: model — classifies and calls ctx.setModel() before next().
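
The hook behavior described above (classify, call `ctx.setModel()`, then `next()`) might look roughly like the following. The types here are assumptions made for the sketch: `SketchContext` and `ModelHook` are hypothetical, minimal shapes, and the real `ModelContext` and `Middleware` types may differ.

```typescript
type ComplexityTier = "simple" | "medium" | "complex";

// Hypothetical minimal context; the real ModelContext likely has more fields.
interface SketchContext {
  toolDefs: unknown[];
  setModel(modelId: string): void;
}

// Hypothetical hook shape: receives the context and a continuation.
type ModelHook = (ctx: SketchContext, next: () => Promise<string>) => Promise<string>;

function routerSketch(
  routes: Record<ComplexityTier, string>,
  classify: (ctx: SketchContext) => ComplexityTier,
): ModelHook {
  return async (ctx, next) => {
    // Pick the tier first, then switch models before the call proceeds.
    ctx.setModel(routes[classify(ctx)]);
    return next();
  };
}
```

Because the model is set before `next()` runs, any middleware further down the chain sees the routed model rather than the agent's original one.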