Real-time interaction: streaming and concurrency
Maps to: Real-time interaction: streaming, concurrency control (double-texting).
Scope
Delivering partial output to clients during long runs, keeping long-lived thread streams open, reconnecting without gaps, and choosing a policy for when users send overlapping messages.
Design questions
- Stream granularity: tokens, graph deltas, custom events, or combined modes.
- Client reconnect via last-event identifiers versus full replay policies.
- Double-texting strategy: enqueue, reject, interrupt, or rollback, and UI copy for each.
- Cleanup of partial tool calls when interrupting mid-flight.
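The four double-texting strategies listed above can be compared side by side with a small in-memory model. This is a minimal sketch, not any library's API: `Thread`, `submit`, `Policy`, and `ThreadBusy` are hypothetical names, and the sketch assumes that interrupt keeps whatever the interrupted run has produced so far while rollback discards it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Policy(Enum):
    ENQUEUE = "enqueue"
    REJECT = "reject"
    INTERRUPT = "interrupt"
    ROLLBACK = "rollback"


class ThreadBusy(Exception):
    """Raised under REJECT when a run is already active on the thread."""


@dataclass
class Thread:
    # Hypothetical in-memory stand-in for a conversation thread.
    history: List[str] = field(default_factory=list)  # committed state
    active: Optional[str] = None                      # message driving the current run
    partial: List[str] = field(default_factory=list)  # uncommitted output of that run
    pending: List[str] = field(default_factory=list)  # messages waiting their turn


def submit(thread: Thread, message: str, policy: Policy) -> str:
    """Apply a double-texting policy and return a short outcome label."""
    if thread.active is None:
        thread.active = message
        return "started"
    if policy is Policy.ENQUEUE:
        thread.pending.append(message)
        return "queued"
    if policy is Policy.REJECT:
        raise ThreadBusy("a run is already active on this thread")
    if policy is Policy.INTERRUPT:
        # Assumption: the interrupted run's input and partial output are kept.
        thread.history.append(thread.active)
        thread.history.extend(thread.partial)
        outcome = "interrupted"
    else:
        # ROLLBACK: the interrupted run leaves no trace in committed state.
        outcome = "rolled_back"
    thread.partial = []
    thread.active = message
    return outcome
```

The returned label is a natural place to hang the per-policy UI copy, for example a "queued behind your previous message" notice for `"queued"`.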
Tradeoffs
- Interrupt-on-new-message feels responsive but risks inconsistent tool side effects.
- Enqueue is safe but can frustrate users who corrected a typo.
- Thread streaming complexity rises when background runs and human-in-the-loop (HITL) steps share one thread.
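The side-effect risk of interrupting mid-flight can be illustrated with plain `asyncio` cancellation. In this sketch, `book_flight` and the `ledger` list are hypothetical stand-ins for a two-step tool and its external effects. One possible approach, assumed here, is for the tool to catch the cancellation, compensate for the half-finished step, and then re-raise.

```python
import asyncio
from typing import List


async def book_flight(ledger: List[str]) -> None:
    """Hypothetical two-step tool: reserve a seat, then confirm it."""
    ledger.append("reserved")
    try:
        await asyncio.sleep(10)  # stands in for a slow confirmation call
        ledger.append("confirmed")
    except asyncio.CancelledError:
        # Undo the half-finished side effect, then re-raise so the
        # caller still observes the cancellation.
        ledger.append("released")
        raise


async def interrupt_demo() -> List[str]:
    ledger: List[str] = []
    task = asyncio.create_task(book_flight(ledger))
    await asyncio.sleep(0)  # yield once so the tool starts and reserves
    task.cancel()           # a new user message arrives: interrupt the run
    try:
        await task
    except asyncio.CancelledError:
        pass                # expected: the tool was interrupted
    return ledger


result = asyncio.run(interrupt_demo())
```

Without the `except` branch the ledger would end at `"reserved"`, which is the inconsistent state the interrupt policy has to guard against.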
Evaluation hooks
- Dropped connection mid-stream resumes without duplicate or missing events.
- Concurrent messages under each policy; assert state matches documented semantics.
- Latency metrics at each stage, from first token to final tool result.
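The resume check in the first hook can be sketched against an in-memory event log. `EventLog` and `consume_with_drop` are hypothetical names. The sketch assumes events carry monotonically increasing integer ids and that a reconnecting client sends the last id it saw, similar in spirit to the `Last-Event-ID` header in server-sent events.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

Event = Tuple[int, str]  # (event id, payload)


@dataclass
class EventLog:
    """Hypothetical per-run log that assigns ids 1, 2, 3, ... in order."""
    events: List[Event] = field(default_factory=list)

    def append(self, payload: str) -> int:
        event_id = len(self.events) + 1
        self.events.append((event_id, payload))
        return event_id

    def stream(self, last_event_id: Optional[int] = None) -> Iterator[Event]:
        """Yield events strictly after last_event_id (all events if None)."""
        start = last_event_id or 0  # id k lives at index k - 1
        yield from self.events[start:]


def consume_with_drop(log: EventLog, drop_after: int) -> List[Event]:
    """Read drop_after events, simulate a disconnect, then resume."""
    received: List[Event] = []
    for event in log.stream():
        received.append(event)
        if len(received) == drop_after:
            break  # simulated dropped connection
    last_id = received[-1][0] if received else None
    received.extend(log.stream(last_id))  # reconnect from the last seen id
    return received


log = EventLog()
for token in ["Hel", "lo", " wor", "ld"]:
    log.append(token)

received = consume_with_drop(log, drop_after=2)
ids = [event_id for event_id, _ in received]
```

Asserting that `ids` is a gap-free, duplicate-free sequence covers both failure modes named in the hook. A full-replay policy would instead reconnect with `last_event_id=None` and deduplicate on the client.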
Reference notes
See LangChain runtime article (concurrent runs figure).