Observability

Agents are non-deterministic. Without strong observability you cannot debug failures, understand decisions, monitor cost, or improve behavior.

Why observability is non-optional for agents

Traditional applications are largely deterministic: the same input produces the same output. Agents are not. They make decisions, choose tools, and take multi-step actions. Without tracing you cannot answer basic questions such as "why did the agent call this tool with these arguments?" or "where did the $400 in token spend come from?".

The four metrics that matter most:

Tool success/failure rates per package
Latency per tool call (p50, p95, p99)
Token and dollar cost per session
Agent decision paths (which tools were chosen, in what order, why)
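
The first three metrics can be derived from raw tool-call records. The sketch below is illustrative only: the `ToolCall` record and its field names are hypothetical, not a Gateway schema, and percentiles use the simple nearest-rank method rather than whatever your observability platform applies.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ToolCall:
    # Hypothetical record shape; field names are assumptions for illustration.
    session_id: str
    package: str
    latency_ms: float
    ok: bool
    cost_usd: float


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile for pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_by_package(calls: list[ToolCall]) -> dict[str, dict[str, float]]:
    """Success rate and latency percentiles, grouped by package."""
    groups: dict[str, list[ToolCall]] = defaultdict(list)
    for call in calls:
        groups[call.package].append(call)
    summary = {}
    for package, group in groups.items():
        latencies = [c.latency_ms for c in group]
        summary[package] = {
            "success_rate": sum(c.ok for c in group) / len(group),
            "p50_ms": percentile(latencies, 50),
            "p95_ms": percentile(latencies, 95),
            "p99_ms": percentile(latencies, 99),
        }
    return summary


def cost_by_session(calls: list[ToolCall]) -> dict[str, float]:
    """Total dollar cost per session."""
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[call.session_id] += call.cost_usd
    return dict(totals)
```

Decision paths are not covered here because they depend on the ordering and parent links of spans within a trace rather than on per-call aggregates.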

Recommended stack

The MCP Gateway emits an OpenTelemetry span for every tool call. Export these spans to the observability platform of your choice. APAI ships pre-configured integrations with the major options:

Platform	Best for	Open source?
Langfuse	Production LLM apps, cost tracking, user feedback collection	Yes
LangSmith	Teams already using LangChain / LangGraph	No (managed by LangChain)
Arize Phoenix	Evaluation-heavy workflows, both research and production	Yes
Helicone	Lightweight LLM API observability, fast integration	Yes
W&B, PromptLayer, HoneyHive	Experiment tracking + prompt management	Mixed

Configuration

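Configure the Gateway to export traces to your platform of choice. The blocks below are alternatives: set APAI_OBSERVABILITY_PROVIDER once and supply only the variables for that provider.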

# Langfuse
APAI_OBSERVABILITY_PROVIDER=langfuse
APAI_LANGFUSE_PUBLIC_KEY=pk_...
APAI_LANGFUSE_SECRET_KEY=sk_...
APAI_LANGFUSE_HOST=https://cloud.langfuse.com

# LangSmith
APAI_OBSERVABILITY_PROVIDER=langsmith
APAI_LANGSMITH_API_KEY=ls_...
APAI_LANGSMITH_PROJECT=my-agent

# Arize Phoenix
APAI_OBSERVABILITY_PROVIDER=phoenix
APAI_PHOENIX_ENDPOINT=https://app.phoenix.arize.com
APAI_PHOENIX_API_KEY=phx_...

# Generic OpenTelemetry (any OTel-compatible backend)
APAI_OBSERVABILITY_PROVIDER=otel
APAI_OTEL_ENDPOINT=https://otel-collector.example.com:4317
APAI_OTEL_AUTH=Bearer your-token
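
A pre-flight check can catch a half-filled provider block before deployment. The sketch below is an assumption-laden illustration: it treats every variable listed above as required for its provider, which the Gateway itself may not do (for example, an auth header could be optional for a local collector). The function name `missing_vars` is hypothetical.

```python
import os
from typing import Mapping

# Variable names are taken from the configuration blocks above.
# Treating all of them as required is an assumption of this sketch.
REQUIRED_VARS = {
    "langfuse": ["APAI_LANGFUSE_PUBLIC_KEY", "APAI_LANGFUSE_SECRET_KEY", "APAI_LANGFUSE_HOST"],
    "langsmith": ["APAI_LANGSMITH_API_KEY", "APAI_LANGSMITH_PROJECT"],
    "phoenix": ["APAI_PHOENIX_ENDPOINT", "APAI_PHOENIX_API_KEY"],
    "otel": ["APAI_OTEL_ENDPOINT", "APAI_OTEL_AUTH"],
}


def missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the variables that are unset or empty for the selected provider."""
    provider = env.get("APAI_OBSERVABILITY_PROVIDER", "")
    if provider not in REQUIRED_VARS:
        raise ValueError(f"unknown or unset APAI_OBSERVABILITY_PROVIDER: {provider!r}")
    return [name for name in REQUIRED_VARS[provider] if not env.get(name)]
```

Calling `missing_vars()` with no arguments inspects the current process environment; passing a plain dict makes the check easy to exercise in tests.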

v0.1 status: the APAI Gateway image is not yet published. The configuration above shows the integration shape planned for Phase 4 and later. Until then, instrument your own MCP clients directly with the OpenTelemetry SDK; the same providers will accept those traces.
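
As a rough sketch of what direct instrumentation could look like, the helper below wraps one tool invocation in one span. It is written against the OpenTelemetry Tracer method `start_as_current_span`, so any tracer obtained from `opentelemetry.trace.get_tracer(...)` should work, but the helper itself (`traced_tool_call`) and the attribute keys (`tool.name`, `tool.arguments_hash`, `tool.latency_ms`) are assumptions for illustration, not names the Gateway is documented to use.

```python
import hashlib
import json
import time
from typing import Any, Callable


def args_hash(arguments: dict[str, Any]) -> str:
    """Hash the arguments so the span never carries raw values."""
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def traced_tool_call(
    tracer: Any,
    tool_name: str,
    arguments: dict[str, Any],
    call: Callable[[dict[str, Any]], Any],
) -> Any:
    """Run call(arguments) inside a single span.

    `tracer` is any object exposing start_as_current_span(name), for example
    the result of opentelemetry.trace.get_tracer("my-mcp-client").
    """
    with tracer.start_as_current_span("apai.tool.call") as span:
        # Attribute keys below are hypothetical; pick names that suit your backend.
        span.set_attribute("tool.name", tool_name)
        span.set_attribute("tool.arguments_hash", args_hash(arguments))
        started = time.perf_counter()
        try:
            return call(arguments)
        finally:
            # Recorded on both success and failure.
            elapsed_ms = (time.perf_counter() - started) * 1000.0
            span.set_attribute("tool.latency_ms", elapsed_ms)
```

Exceptions raised by the tool propagate out of the `with` block unchanged. With a real OpenTelemetry tracer, `start_as_current_span` records the exception on the span by default, so no extra error handling is added here.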

What to trace

The MCP Gateway emits these spans automatically. Configure your platform to alert on outliers in each:

apai.tool.call - one span per tool invocation. Attributes: tool name, arguments hash, agent identity, workspace, package, package version, decision result (allow / block / require_approval), latency.
apai.install - one span per install event. Attributes: package, version, source, target, install mode, permissions requested, permissions granted, scanner findings, status.
apai.policy.decision - one span per policy evaluation. Attributes: policy slug, rule id, action, on_match outcome.
apai.passport.read - one span each time an agent consumes a Capability Passport.

Privacy and redaction

Tool call arguments may contain sensitive data. Configure redaction at the Gateway before traces leave the workspace:

# .apai/observability.yaml
redact:
  - "Authorization"
  - "X-API-Key"
  - "/secrets/.*"
  - "ssn"
  - "credit_card"
sample_rate: 1.0  # 1.0 = trace every call. Drop to 0.1 for high-volume workspaces.

Audit logs (kept locally) always include full data for compliance. Traces (shipped to your observability platform) are redacted per this config.
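
To make the redaction and sampling settings concrete, here is a minimal sketch of how such rules might be applied to tool-call arguments before export. The matching semantics are an assumption: each `redact` entry is treated as a case-insensitive regular expression that must fully match either a key name or a slash-joined key path. The Gateway's real behavior may differ. The hash-based sampler is likewise one possible way to honor `sample_rate`, chosen so that a given trace ID always gets the same decision.

```python
import hashlib
import re
from typing import Any

# Patterns copied from the example config above.
REDACT_PATTERNS = ["Authorization", "X-API-Key", "/secrets/.*", "ssn", "credit_card"]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in REDACT_PATTERNS]
MASK = "[REDACTED]"


def _matches(key: str, path: str) -> bool:
    """True if any pattern fully matches the key name or its path (assumed semantics)."""
    return any(rx.fullmatch(key) or rx.fullmatch(path) for rx in _COMPILED)


def redact(value: Any, path: str = "") -> Any:
    """Return a copy of value with matching keys masked, recursing into dicts and lists."""
    if isinstance(value, dict):
        out = {}
        for key, child in value.items():
            child_path = f"{path}/{key}"
            out[key] = MASK if _matches(str(key), child_path) else redact(child, child_path)
        return out
    if isinstance(value, list):
        return [redact(item, path) for item in value]
    return value


def should_sample(trace_id: str, sample_rate: float) -> bool:
    """Deterministic sampling: map the trace ID to [0, 1) and compare to the rate."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 2**32
    return bucket < sample_rate
```

For example, `redact({"secrets": {"db": "pw"}, "query": "hello"})` masks the value under `secrets/db` and leaves `query` untouched. The original dictionary is not modified, which matters when the same arguments also feed an unredacted local audit log.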