Observability

Observability and telemetry middleware. Track token usage, record tool calls, measure duration, emit structured logs, expose Prometheus metrics, and create OpenTelemetry distributed traces.

observe.usage()

Tracks cumulative token usage across all model calls in a session.

```typescript
function observeUsage(): Middleware
```

```typescript
agent.use(observe.usage())

const result = await agent.run("Hello").result
const usage = result.state["observe:usage"]
// { inputTokens: 150, outputTokens: 85 }
```

Hooks: model — reads response.usage after next().

State keys:

  • observe:usage: { inputTokens: number, outputTokens: number } (reducer: sum)
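The sum reducer can be pictured as a fold over the usage reported by each model call. The sketch below is a standalone illustration rather than the library's implementation; `Usage`, `sumUsage`, and the per-call numbers are hypothetical names and data chosen for this example.

```typescript
// Hypothetical shape mirroring the documented observe:usage value.
interface Usage {
  inputTokens: number
  outputTokens: number
}

// Assumed reducer semantics: add each field of the incoming usage
// to the running total and return a new object.
function sumUsage(total: Usage, next: Usage): Usage {
  return {
    inputTokens: total.inputTokens + next.inputTokens,
    outputTokens: total.outputTokens + next.outputTokens,
  }
}

// Usage reported by three model calls in one session (made-up numbers).
const perCall: Usage[] = [
  { inputTokens: 100, outputTokens: 40 },
  { inputTokens: 30, outputTokens: 25 },
  { inputTokens: 20, outputTokens: 20 },
]

const sessionUsage = perCall.reduce(sumUsage, { inputTokens: 0, outputTokens: 0 })
console.log(sessionUsage) // { inputTokens: 150, outputTokens: 85 }
```

Returning a fresh object on every step keeps the reducer free of side effects, which is the usual expectation for state reducers.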

observe.tools()

Records every tool execution including arguments, results, duration, and errors.

```typescript
function observeTools(): Middleware
```

```typescript
agent.use(observe.tools())

const result = await agent.run("Search for cats").result
const calls = result.state["observe:tools"] as ToolCallRecord[]
for (const call of calls) {
  console.log(`${call.name}: ${call.duration}ms`)
}
```

Hooks: tool — wraps next() and records timing.

Each ToolCallRecord contains: { callId: string; name: string; args: Record<string, unknown>; result: unknown; duration: number; error?: string }.

State keys:

  • observe:tools — array of ToolCallRecord (reducer: append)
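Because the records are appended in order, they are easy to aggregate after a run. The following is a minimal sketch of a per-tool summary; `ToolSummary`, `summarizeTools`, and the sample records are hypothetical and stand in for whatever `result.state["observe:tools"]` contains in a real session.

```typescript
// Shape copied from the ToolCallRecord description above.
interface ToolCallRecord {
  callId: string
  name: string
  args: Record<string, unknown>
  result: unknown
  duration: number // milliseconds
  error?: string
}

// Hypothetical aggregate for this example.
interface ToolSummary {
  calls: number
  errors: number
  totalMs: number
}

// Group records by tool name and accumulate call count, error count, and total duration.
function summarizeTools(records: ToolCallRecord[]): Map<string, ToolSummary> {
  const byName = new Map<string, ToolSummary>()
  for (const record of records) {
    const summary = byName.get(record.name) ?? { calls: 0, errors: 0, totalMs: 0 }
    summary.calls += 1
    summary.totalMs += record.duration
    if (record.error !== undefined) summary.errors += 1
    byName.set(record.name, summary)
  }
  return byName
}

// Made-up records standing in for result.state["observe:tools"].
const sampleCalls: ToolCallRecord[] = [
  { callId: "c1", name: "search", args: { q: "cats" }, result: ["a"], duration: 120 },
  { callId: "c2", name: "search", args: { q: "dogs" }, result: null, duration: 80, error: "timeout" },
  { callId: "c3", name: "fetch", args: { url: "https://example.com" }, result: "ok", duration: 45 },
]

const toolSummary = summarizeTools(sampleCalls)
toolSummary.forEach((s, name) => {
  console.log(`${name}: ${s.calls} calls, ${s.errors} errors, ${s.totalMs}ms total`)
})
```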

observe.duration()

Measures wall-clock duration of each turn in milliseconds.

```typescript
function observeDuration(): Middleware
```

```typescript
agent.use(observe.duration())

const result = await agent.run("Hello").result
const ms = result.state["observe:duration"] as number
console.log(`Turn took ${ms}ms`)
```

Hooks: turn — wraps next().

State keys:

  • observe:duration — number (last-write-wins; reflects most recent turn)

observe.log()

Emits structured JSON log events for every lifecycle phase. Suitable for Datadog, Grafana, ELK, and similar logging pipelines.

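```typescript
function observeLog(opts?: ObserveLogOptions): Middleware
```

```typescript
import pino from "pino"

// Default: JSON lines to stderr
agent.use(observe.log())

// Custom output (e.g., pino): create a logger instance first,
// since the pino module export is a factory, not a logger
const logger = pino()
agent.use(observe.log({
  output: (event) => logger.info(event),
}))
```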

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| output | (event: LogEvent) => void | JSON line to stderr | Custom output function |
| recordContent | boolean | false | Log prompt/response content at debug level |

Each LogEvent contains:

```typescript
interface LogEvent {
  timestamp: string // ISO 8601
  type: string // "session:start", "turn:start", "model:call", etc.
  sessionId: string
  turnIndex: number
  data: Record<string, unknown>
  level?: "debug" | "info" | "warn" | "error"
  agentName?: string
  turnId?: string // present on turn/model/tool events
  durationMs?: number // present on end events
  error?: { type: string; message: string } // present on failures
  traceId?: string // present when OTel span context is active
  spanId?: string // present when OTel span context is active
}
```

Hooks: session, turn, model, tool — logs start/end events for each.

Event types emitted: session:start, session:end, turn:start, turn:end, model:call, model:response, tool:start, tool:end.

Level mapping: info for normal lifecycle, warn for tool errors with recovery, error for failures, debug for content recording.

When used together with observe.traces(), log events include traceId and spanId for log-trace correlation in observability platforms.
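The level mapping makes it straightforward to filter events inside a custom output function. Below is a minimal sketch of such a filter; `minLevelOutput`, the numeric `severity` ranking, and the sample events are hypothetical, and treating a missing level as "info" is an assumption rather than documented behavior.

```typescript
// Subset of the LogEvent interface documented above.
interface LogEvent {
  timestamp: string
  type: string
  sessionId: string
  turnIndex: number
  data: Record<string, unknown>
  level?: "debug" | "info" | "warn" | "error"
  traceId?: string
  spanId?: string
}

type Level = NonNullable<LogEvent["level"]>

// Hypothetical numeric ranking used only to compare levels.
const severity: Record<Level, number> = { debug: 10, info: 20, warn: 30, error: 40 }

// Build an output callback that drops events below minLevel and writes
// the rest as JSON lines. A missing level is treated as "info" (assumption).
function minLevelOutput(minLevel: Level, write: (line: string) => void) {
  return (event: LogEvent): void => {
    const level = event.level ?? "info"
    if (severity[level] >= severity[minLevel]) write(JSON.stringify(event))
  }
}

// Collect lines in memory here; a real sink might write to a stream instead.
const lines: string[] = []
const output = minLevelOutput("warn", (line) => lines.push(line))

const base = { timestamp: "2025-01-01T00:00:00.000Z", sessionId: "s1", turnIndex: 0, data: {} }
output({ ...base, type: "turn:start", level: "info" })
output({ ...base, type: "tool:end", level: "warn", traceId: "t1", spanId: "sp1" })
output({ ...base, type: "model:response", level: "debug" })

console.log(lines.length) // 1
```

A callback built this way has the `(event: LogEvent) => void` shape of the `output` option, so it could be passed as `observe.log({ output })`.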


observe.metrics()

Tracks agent performance metrics via the OpenTelemetry Meter API. Provides 10 agent_express_* metrics (6 counters and 4 histograms), with histogram bucket boundaries tuned for AI workloads.

```typescript
function observeMetrics(opts?: ObserveMetricsOptions): Middleware
```

```typescript
import { Agent, observe } from "agent-express"

const metrics = observe.metrics()
agent.use(metrics)

const { state } = await agent.run("Hello").result
const snapshot = state["observe:metrics"]
// { modelCalls: 1, tokens: { input: 150, output: 85 }, ... }
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| otel | boolean | false | Additionally emit gen_ai.* standard metrics |
| meter | Meter | global MeterProvider | Custom OTel Meter instance |
| output | (event: MetricEvent) => void | | Standalone mode callback (no OTel dependency) |

Metrics emitted:

| Metric | Type | Attributes |
| --- | --- | --- |
| agent_express_model_calls_total | Counter | agent, model, provider |
| agent_express_tool_calls_total | Counter | agent, tool |
| agent_express_turns_total | Counter | agent |
| agent_express_sessions_total | Counter | agent |
| agent_express_errors_total | Counter | agent, error_source, error_type |
| agent_express_tokens_total | Counter | agent, direction, model |
| agent_express_model_duration_seconds | Histogram | agent, model, provider |
| agent_express_tool_duration_seconds | Histogram | agent, tool |
| agent_express_turn_duration_seconds | Histogram | agent |
| agent_express_session_duration_seconds | Histogram | agent |

With otel: true, additionally emits gen_ai.client.operation.duration and gen_ai.client.token.usage.
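For readers unfamiliar with histogram metrics, the sketch below shows how an explicit-bucket histogram sorts duration observations into buckets with inclusive upper bounds, the convention used by OpenTelemetry explicit-bucket histograms and Prometheus `le` labels. The boundary values and sample durations are hypothetical; the library's actual bucket boundaries are not listed in this reference.

```typescript
// Hypothetical boundaries in seconds, not the library's actual defaults.
const boundaries = [0.5, 1, 2.5, 5, 10, 30]

// Return one count per bucket. Bucket i holds values v with
// bounds[i-1] < v <= bounds[i]; the final bucket holds v > last boundary.
function bucketCounts(values: number[], bounds: number[]): number[] {
  const counts: number[] = bounds.map(() => 0)
  counts.push(0) // overflow bucket
  for (const v of values) {
    let i = 0
    while (i < bounds.length && v > bounds[i]) i += 1
    counts[i] += 1
  }
  return counts
}

// Made-up model call durations in seconds.
const modelDurations = [0.3, 0.9, 1.0, 4.2, 12, 45]
const counts = bucketCounts(modelDurations, boundaries)
console.log(counts) // [ 1, 2, 0, 1, 0, 1, 1 ]
```

Model calls often take seconds rather than milliseconds, which is why boundaries chosen for typical HTTP latencies would tend to push most observations into the overflow bucket.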

Three modes:

  1. Global MeterProvider (default when @opentelemetry/api is installed): metrics flow to the user-configured exporter (Prometheus, OTLP, etc.)
  2. Custom Meter: meter option for isolation or testing
  3. Standalone callback: output option when @opentelemetry/api is not installed

Pass a custom Meter instance for isolation (e.g., separate exporter per agent) or testing:

```typescript
import { MeterProvider } from "@opentelemetry/sdk-metrics"
import { PrometheusExporter } from "@opentelemetry/exporter-prometheus"

const exporter = new PrometheusExporter({ port: 9464 })
const provider = new MeterProvider({ readers: [exporter] })
const meter = provider.getMeter("my-agent")

agent.use(observe.metrics({ meter }))
// Metrics exported to http://localhost:9464/metrics
```

Hooks: session, turn, model, tool.

State keys:

  • observe:metrics: MetricsSnapshot with session-scoped counts, tokens, and durations

See the Observability Guide for full production setup examples.


observe.traces()

Creates OpenTelemetry distributed tracing spans for every lifecycle phase. Supports two span naming modes: framework terminology (default) or OTel GenAI conventions.

```typescript
function observeTraces(opts?: ObserveTracesOptions): Middleware
```

```typescript
// Framework span names (default)
agent.use(observe.traces())

// OTel GenAI convention names
agent.use(observe.traces({ otel: true }))

// Standalone mode (no @opentelemetry/api needed)
agent.use(observe.traces({
  output: (span) => console.log(JSON.stringify(span)),
}))
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| otel | boolean | false | Use OTel GenAI convention span names |
| recordContent | boolean | false | Record prompts/responses in span attributes |
| tracer | Tracer | global TracerProvider | Custom OTel Tracer instance |
| output | (span: SpanData) => void | | Standalone mode callback |

Span hierarchy:

session.run {agent} (or invoke_agent with otel: true)
├── turn 0
│ ├── model.call {model} (or chat)
│ └── tool.call {tool} (or execute_tool)
└── session.close {agent} (or close_session)

Each session gets its own traceId. gen_ai.* attributes are always present on model/tool spans regardless of naming mode.
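Since each session has its own traceId, and log events carry a traceId when a span context is active, collected log events can be grouped per session trace. This is a minimal sketch of that grouping; `TracedEvent`, `groupByTrace`, the "untraced" key, and the sample events are hypothetical choices for this example.

```typescript
// Only the LogEvent fields this sketch needs.
interface TracedEvent {
  type: string
  sessionId: string
  traceId?: string
}

// Bucket events by traceId. Events without one are collected under
// "untraced" (an arbitrary key chosen for this example).
function groupByTrace(events: TracedEvent[]): Map<string, TracedEvent[]> {
  const groups = new Map<string, TracedEvent[]>()
  for (const event of events) {
    const key = event.traceId ?? "untraced"
    const group = groups.get(key) ?? []
    group.push(event)
    groups.set(key, group)
  }
  return groups
}

// Made-up events standing in for output collected from a log callback.
const collected: TracedEvent[] = [
  { type: "session:start", sessionId: "s1", traceId: "aaa" },
  { type: "model:call", sessionId: "s1", traceId: "aaa" },
  { type: "session:start", sessionId: "s2", traceId: "bbb" },
  { type: "session:start", sessionId: "s3" },
]

const byTrace = groupByTrace(collected)
console.log(byTrace.get("aaa")?.length) // 2
```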

Framework attributes on all spans: agent_express.agent.name, agent_express.session.id, agent_express.turn.id, agent_express.turn.index, agent_express.model, agent_express.provider, agent_express.tool.name, agent_express.call.id.

Three modes:

  1. Global TracerProvider (default): spans flow to the user-configured exporter (Jaeger, Tempo, etc.)
  2. Custom Tracer: tracer option for isolation or testing
  3. Standalone callback: output option emits SpanData objects

Pass a custom Tracer for isolation (e.g., separate exporter per agent) or testing:

```typescript
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node"
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base"
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http"

const exporter = new OTLPTraceExporter({ url: "http://localhost:4318/v1/traces" })
// Pass span processors to the constructor; addSpanProcessor() was removed
// in version 2.x of the OpenTelemetry JS SDK
const provider = new NodeTracerProvider({
  spanProcessors: [new BatchSpanProcessor(exporter)],
})
const tracer = provider.getTracer("my-agent")

agent.use(observe.traces({ tracer }))
```

Hooks: session, turn, model, tool.

See the Observability Guide for full production setup examples.