Observability: tracing and time travel

Maps to: Observability: tracing, time travel.

Scope

Execution trees for production debugging, cost and error analytics, online evaluation, and forked replay from historical checkpoints.
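
As a rough illustration of cost analytics over an execution tree, the sketch below rolls span costs up to every subtree. The field names (span_id, parent_id, cost_usd) are assumptions made for this example, not the schema of any particular tracing backend.

```python
from collections import defaultdict


def subtree_costs(spans):
    """Return {span_id: own cost plus the cost of all descendants}.

    `spans` is a flat list of dicts with the assumed keys
    'span_id', 'parent_id' (None for the root), and 'cost_usd'.
    """
    own = {}
    children = defaultdict(list)
    for span in spans:
        own[span["span_id"]] = span["cost_usd"]
        children[span["parent_id"]].append(span["span_id"])

    totals = {}

    def total(span_id):
        # Memoized recursion: each span's subtree is summed once.
        if span_id not in totals:
            totals[span_id] = own[span_id] + sum(
                total(child) for child in children[span_id]
            )
        return totals[span_id]

    for span_id in own:
        total(span_id)
    return totals
```

Grouping the same totals by tenant or tool would give the per-dimension cost views mentioned in the scope.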

Design questions

  • Required span metadata (user, tenant, harness version, model, tool, cost).
  • Sampling versus full capture for high-volume workloads.
  • Who can access traces and for how long; redaction in stored spans.
  • Time-travel fork semantics: what re-executes versus what is copied state.
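
To make the first three questions concrete, here is a minimal sketch of a span record carrying the metadata listed above, a deterministic per-trace sampling decision, and a redaction pass applied before storage. Everything in it is an assumption for illustration: the Span field names, the SHA-256 bucketing scheme, and the email-only redaction rule are hypothetical and would need to match the real tracing backend and compliance policy.

```python
import hashlib
import re
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Span:
    # Hypothetical schema covering the required metadata.
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    user: str
    tenant: str
    harness_version: str
    model: Optional[str] = None
    tool: Optional[str] = None
    cost_usd: float = 0.0
    payload: dict = field(default_factory=dict)


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Sample by hashing the trace id, so every span of a trace
    gets the same keep/drop decision on every host."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # 0 .. 2**64 - 1
    return bucket < sample_rate * 2**64


# Assumed redaction rule: mask anything that looks like an email address.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(span: Span) -> dict:
    """Return a storable dict with string payload values masked."""
    record = asdict(span)
    record["payload"] = {
        key: EMAIL.sub("[REDACTED_EMAIL]", value) if isinstance(value, str) else value
        for key, value in span.payload.items()
    }
    return record
```

Hashing the trace id rather than drawing a random number per span keeps sampled traces complete, which matters if a failure has to be reproduced from a single trace.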

Tradeoffs

  • Full traces enable fast incident response but increase storage and compliance scope.
  • Online LLM judges catch regressions early but add inference cost and can introduce judge bias.
  • Time travel accelerates debugging, but a replay can diverge from the original production run unless sources of randomness (seeds, sampling settings) are recorded or pinned.
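
The time-travel tradeoff can be sketched as follows, assuming a toy harness in which each step is a function of a state dict and a random number generator. The Checkpoint, run, and fork names are hypothetical. The point is only to show which part is copied state (everything before the checkpoint) and which part re-executes (every step from the checkpoint on), and how restoring the recorded generator state keeps the replay aligned with the original run.

```python
import copy
import random
from dataclasses import dataclass


@dataclass
class Checkpoint:
    step: int         # index of the next step to execute
    state: dict       # snapshot taken before that step ran
    rng_state: tuple  # random.Random.getstate() at the same moment


def run(steps, state, rng, start=0, checkpoints=None):
    """Execute steps[start:], optionally recording a checkpoint before each."""
    for i in range(start, len(steps)):
        if checkpoints is not None:
            checkpoints.append(Checkpoint(i, copy.deepcopy(state), rng.getstate()))
        steps[i](state, rng)
    return state


def fork(checkpoint, steps, restore_rng=True, seed=None):
    """Copy the checkpointed state, then re-execute the remaining steps.

    With restore_rng=True the replay sees the same random draws as the
    original run; otherwise it uses a fresh generator and may diverge.
    """
    state = copy.deepcopy(checkpoint.state)
    rng = random.Random(seed)
    if restore_rng:
        rng.setstate(checkpoint.rng_state)
    return run(steps, state, rng, start=checkpoint.step)


# Toy steps standing in for model and tool calls.
def plan(state, rng):
    state["plan"] = rng.choice(["search", "calculate", "ask_user"])


def act(state, rng):
    state["score"] = rng.random()


STEPS = [plan, act]
```

Forking from the second checkpoint with restore_rng=True reuses the copied plan and re-runs act with the original draws. Passing an alternate step list or restore_rng=False turns the same call into an experiment rather than a faithful replay. Live tool responses are a separate source of divergence that this sketch does not address.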

Evaluation hooks

  • Reproduce a reported failure from its trace ID alone.
  • Online eval fires on a canary harness change before full rollout.
  • Fork from a checkpoint and compare the tool path under an alternate prompt.
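
The third hook could be checked with something like the sketch below, which extracts the ordered tool calls from two runs and reports where they first differ. The span keys 'start' and 'tool' are assumed names for this example.

```python
from typing import Optional


def tool_path(spans):
    """Ordered tool names from a flat span list.

    Each span is a dict with the assumed keys 'start' (sortable timestamp)
    and 'tool' (None or missing for non-tool spans).
    """
    ordered = sorted(spans, key=lambda span: span["start"])
    return [span["tool"] for span in ordered if span.get("tool")]


def first_divergence(baseline, candidate) -> Optional[int]:
    """Index of the first differing tool call, or None if the paths match."""
    for index, (left, right) in enumerate(zip(baseline, candidate)):
        if left != right:
            return index
    if len(baseline) != len(candidate):
        # One path is a strict prefix of the other.
        return min(len(baseline), len(candidate))
    return None
```

A None result means the alternate prompt did not change which tools were called or in what order, although the arguments and outputs may still differ.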

Reference notes

See the LangChain runtime article (improvement loop and time travel figures).