Observability

Signal Coverage

  • Tracing: OpenTelemetry + Tempo/Jaeger.

  • Metrics: Prometheus + Grafana.

  • Logs: Loki.

  • LLM Observability: Langfuse or Helicone.

Operational Guarantees

  • End-to-end traceability from user prompt to tool calls.

  • Baselines for latency, cost, and quality metrics, so deviations can be measured against them.

  • Anomaly detection for retrieval drift and model regressions.
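
A baseline plus anomaly check could be as simple as the sketch below, which flags a value that sits too many standard deviations away from recent history. The z-score method, the `is_anomalous` name, the threshold of 3.0, and the sample latencies are all assumptions made for illustration; the source does not specify a detection method.

```python
from statistics import mean, stdev


def is_anomalous(baseline, value, threshold=3.0):
    """Return True if value is more than `threshold` std devs from the baseline mean."""
    if len(baseline) < 2:
        # Not enough history to estimate spread; do not flag anything.
        return False
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Made-up latency samples in milliseconds, for illustration only.
latency_baseline_ms = [410, 395, 402, 420, 398, 405, 415]

print(is_anomalous(latency_baseline_ms, 412))  # within normal spread
print(is_anomalous(latency_baseline_ms, 900))  # far outside the baseline
```

The same check could in principle be applied to cost per request or to a quality score, with a separate baseline window per metric. A fixed z-score threshold is only a starting point, since it assumes roughly stable, symmetric data.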