Documentation outline

This outline is a starting map for discussions about how to organize AI agent system design documentation. Section titles are provisional; the goal is coverage and traceability to sources, not final taxonomy.

1. Problem framing

What problem the agent solves and what “done” means for a run.
Harness versus runtime: prompts, tools, and skills versus durable execution, tenancy, and observability.
Non-functional requirements: latency, cost, quality, reliability, and compliance.

2. Architecture views

Control plane versus data plane for agent workloads.
Single-agent loop versus multi-agent orchestration.
Synchronous chat versus background jobs, cron, and proactive agents.
Integration surfaces: MCP, A2A, webhooks, and bespoke APIs.

3. Runtime capabilities (production checklist)

Use "Production requirements and runtime capabilities" as the working matrix. Draft deep dives live in runtime-capabilities/README.md (one page per matrix row).

4. Data, memory, and state

Short-term thread state versus long-term store semantics.
Checkpointing, replay, and time travel for debugging.
Retention, migration, and model/provider changes without losing organizational memory.
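
To make the checkpointing and time-travel item concrete for discussion, here is a minimal sketch. It assumes an in-memory store keyed by thread ID; `CheckpointStore`, `save`, `load`, and `fork` are hypothetical names for illustration and are not taken from any of the referenced sources. A production runtime would persist snapshots durably.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CheckpointStore:
    """Hypothetical in-memory checkpoint store (illustration only)."""

    history: dict[str, list[dict[str, Any]]] = field(default_factory=dict)

    def save(self, thread_id: str, state: dict[str, Any]) -> int:
        """Append a snapshot of the thread state and return its step index."""
        snapshots = self.history.setdefault(thread_id, [])
        snapshots.append(copy.deepcopy(state))
        return len(snapshots) - 1

    def load(self, thread_id: str, step: int = -1) -> dict[str, Any]:
        """Return a copy of the state at a step (the latest by default)."""
        return copy.deepcopy(self.history[thread_id][step])

    def fork(self, source_thread: str, step: int, new_thread: str) -> None:
        """Start a new thread from an earlier checkpoint ("time travel")."""
        self.history[new_thread] = [
            copy.deepcopy(s) for s in self.history[source_thread][: step + 1]
        ]


store = CheckpointStore()
state: dict[str, Any] = {"messages": []}
for text in ["plan", "call tool", "summarize"]:
    state["messages"].append(text)
    store.save("thread-1", state)  # one checkpoint per agent step

# Rewind to step 1 on a separate thread so the original run stays intact.
store.fork("thread-1", step=1, new_thread="thread-1-debug")
replayed = store.load("thread-1-debug")
```

Forking rather than rewinding in place keeps the original history available for comparison, which is one possible design choice among several.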

5. Safety, guardrails, and human oversight

Middleware and deterministic policy enforcement.
Human-in-the-loop patterns: approval gates, draft review, clarifying questions.
Prompt injection, sandbox boundaries, and credential handling.
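
As a discussion aid, the first two items above (deterministic policy enforcement and approval gates) could be sketched as a thin layer in front of tool execution. Everything here is an assumption for illustration: the tool names, the `DENIED_TOOLS` and `APPROVAL_TOOLS` tables, and the `guarded_execute` helper are hypothetical and do not reflect a specific framework's middleware API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass
class ToolCall:
    name: str
    args: dict[str, Any]


# Hypothetical policy tables; a real system would load these from config.
DENIED_TOOLS = {"shell_exec"}
APPROVAL_TOOLS = {"send_email", "delete_record"}


def evaluate(call: ToolCall) -> Decision:
    """Deterministic policy check: no model is involved in the decision."""
    if call.name in DENIED_TOOLS:
        return Decision.DENY
    if call.name in APPROVAL_TOOLS:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW


def guarded_execute(
    call: ToolCall,
    execute: Callable[[ToolCall], str],
    approve: Callable[[ToolCall], bool],
) -> str:
    """Run a tool call only if policy allows it, asking a human when required."""
    decision = evaluate(call)
    if decision is Decision.DENY:
        return f"blocked: {call.name}"
    if decision is Decision.NEEDS_APPROVAL and not approve(call):
        return f"rejected by reviewer: {call.name}"
    return execute(call)


def run_tool(call: ToolCall) -> str:
    """Stand-in for real tool execution."""
    return f"ran {call.name}"


result_search = guarded_execute(ToolCall("search", {}), run_tool, lambda c: False)
result_shell = guarded_execute(ToolCall("shell_exec", {}), run_tool, lambda c: True)
result_email = guarded_execute(ToolCall("send_email", {}), run_tool, lambda c: False)
```

Keeping the policy check as plain code, separate from the prompt, means the same call always gets the same decision regardless of what the model was told or tricked into requesting.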

6. Operations

Draft: 05-operations.md, covering tracing, evals, streaming, disaster recovery (DR), versioning, and cost.

7. Economics and platform constraints

Inference scaling, caching, and cost optimization.
Multi-tenancy, RBAC, and operator access models.

8. Ethics, compliance, and product risk

Draft: 06-product-and-compliance.md.

9. Reference library

Primary deep reference: LangChain runtime article.
Data-agent domain reference: Databricks Genie article.
Case studies: Databricks Genie data agents (maps the Genie article onto this outline).
Social and secondary framing: Sources.

10. Harness versus runtime and program fit

Draft: 03-harness-vs-runtime.md, 07-transmute-data-fit.md, 08-decision-log.md.

Open questions for adaptation

Whether “harness” and “runtime” remain the top-level split or merge into a single “platform” view.
How much LangSmith- or LangChain-specific terminology to keep versus neutral labels.
Which checklist rows become first-class chapters and which become appendices.
Where Transmute-Data-specific conventions (eval harnesses, td CLI, Coder workspaces) attach without turning this outline into an implementation guide too early.