Kodelyth ECC — Session Replay
Bundle, share, and re-run swarm sessions for regression testing, reproducible bug reports, and A/B testing across harnesses, agents, and base refs.
Phase 2.8 of the Devil Roadmap. Companion to the swarm orchestrator (Phase 2.7) — every swarm coordination dir is portable, replayable, and diff-friendly.
Why replay
Three concrete wins:
- Reproducible bug reports. Bundle a buggy swarm into a single JSON file and ship it to maintainers. They run `replay --execute` locally and see the same agent behavior.
- Regression testing. Save a known-good baseline. After an agent prompt revision, replay against the new agent and diff the handoffs. Did quality regress?
- Model A/B testing. Same task, two harnesses. Side-by-side handoff comparison reveals which model handles the swarm better for your codebase.
CLI
| Command | Purpose |
|---|---|
| `kodelyth-ecc session-export <session> [flags]` | Bundle a coordination dir to JSON. |
| `kodelyth-ecc session-import <bundle.json> [flags]` | Restore a bundle to a coordination dir. |
| `kodelyth-ecc replay <bundle\|session> [flags]` | Re-run a session with variations. |
session-export
kodelyth-ecc session-export <session> \
[--out file.json] \
[--task "..."] [--agents a,b,c] [--harness claude] [--base-ref HEAD] \
[--coord-root <dir>]

| Flag | Description |
|---|---|
| `<session>` | Required. Directory name under `.orchestration/`. |
| `--out` | Output JSON path. Default: `.orchestration/<session>.bundle.json`. |
| `--task "..."` | Enrich `meta.task` for cleaner replays. |
| `--agents a,b,c` | Enrich `meta.agents`. |
| `--harness <h>` | Enrich `meta.harness`. |
| `--base-ref <ref>` | Enrich `meta.base_ref`. |
| `--coord-root` | Where to look for coordination dirs (default: `<repo>/.orchestration`). |
session-import
kodelyth-ecc session-import <bundle.json> \
[--target <dir>] [--overwrite] \
[--coord-root <dir>]

| Flag | Description |
|---|---|
| `<bundle.json>` | Required. Bundle to restore. |
| `--target` | Output directory. Default: `.orchestration/<session-from-bundle>`. |
| `--overwrite` | Replace any existing target dir. |
replay
kodelyth-ecc replay <bundle.json|session-name> \
[--harness h] [--agents a,b,c] [--base-ref ref] [--session NAME] [--replace] \
[--execute|--write-only|--json]

| Flag | Description |
|---|---|
| `<target>` | Required. Bundle file (ends in `.json`) or session name in `.orchestration/`. |
| `--harness` | Override launcher harness. |
| `--agents` | Replace the agent list. |
| `--base-ref` | Branch base for replay worktrees. |
| `--session` | Override auto-generated `-replay-N` name. |
| `--replace` | Tear down any existing session/worktrees with the same names. |
| `--execute` | Actually spawn worktrees + tmux + agents. |
| `--write-only` | Just materialize coordination files. |
| `--json` | Print plan + planConfig. |
Default mode is dry-run. Always inspect first.
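The auto-generated `-replay-N` name from the flag table can be pictured with a small sketch. This is an illustration only: `nextReplayName` is a hypothetical helper, not part of kodelyth-ecc, and it assumes the name is formed as `<origin>-replay-N` with N one higher than the largest suffix already in use. The real tool may pick N differently.

```javascript
// Hypothetical helper (not shipped with kodelyth-ecc).
// Assumption: replay names look like "<origin>-replay-N", N starting at 1.
function nextReplayName(origin, existingNames) {
  const prefix = `${origin}-replay-`;
  let highest = 0;
  for (const name of existingNames) {
    if (!name.startsWith(prefix)) continue;
    const suffix = name.slice(prefix.length);
    // Only count purely numeric suffixes such as "1" or "12".
    if (/^[1-9]\d*$/.test(suffix)) {
      highest = Math.max(highest, Number(suffix));
    }
  }
  return `${prefix}${highest + 1}`;
}

// Usage: names would normally come from listing .orchestration/.
console.log(nextReplayName('swarm-baseline', ['swarm-baseline']));
```

Passing `--session` skips this kind of naming entirely, which is why reusing the origin name by hand is the case to avoid.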
Bundle format
Stable schema: `kodelyth.session-bundle/v1`. A bundle is a single JSON file:
{
"schema": "kodelyth.session-bundle/v1",
"session": "swarm-2026-05-10-4a",
"exported_at": "2026-05-10T17:30:00Z",
"exported_by": "[email protected]",
"meta": {
"task": "audit oauth flow",
"agents": ["security-reviewer", "code-reviewer", "pair-programmer", "tdd-guide"],
"harness": "claude",
"base_ref": "HEAD"
},
"workers": [
{ "slug": "code-reviewer", "task": "...", "handoff": "...", "status": "..." },
{ "slug": "pair-programmer", "task": "...", "handoff": "...", "status": "..." },
{ "slug": "security-reviewer", "task": "...", "handoff": "...", "status": "..." },
{ "slug": "tdd-guide", "task": "...", "handoff": "...", "status": "..." }
]
}

Pure JSON. No archives, no binaries. Diff-friendly, so regression bundles can be reviewed with `git diff`.
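Because a bundle is plain JSON, a received file can be sanity-checked before it is imported or replayed. The sketch below is a minimal shape check against the fields shown in the example above. `checkBundleShape` is a hypothetical helper, not the validator used by kodelyth-ecc, and it assumes every worker field is a string, as in the example.

```javascript
// Hypothetical shape check (not part of kodelyth-ecc).
// Assumption: each worker carries string fields slug, task, handoff, status.
const EXPECTED_SCHEMA = 'kodelyth.session-bundle/v1';
const WORKER_FIELDS = ['slug', 'task', 'handoff', 'status'];

function checkBundleShape(bundle) {
  if (bundle === null || typeof bundle !== 'object') {
    return ['bundle is not an object'];
  }
  const problems = [];
  if (bundle.schema !== EXPECTED_SCHEMA) {
    problems.push(`unexpected schema: ${String(bundle.schema)}`);
  }
  if (typeof bundle.session !== 'string' || bundle.session === '') {
    problems.push('session must be a non-empty string');
  }
  if (!Array.isArray(bundle.workers)) {
    problems.push('workers must be an array');
    return problems;
  }
  bundle.workers.forEach((worker, index) => {
    for (const field of WORKER_FIELDS) {
      if (typeof worker?.[field] !== 'string') {
        problems.push(`workers[${index}].${field} must be a string`);
      }
    }
  });
  return problems; // an empty array means the shape looks as expected
}
```

A caller might run `checkBundleShape(JSON.parse(text))` on an external bundle and refuse to continue if the returned list is non-empty.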
How replay reconstructs the task
- Prefer `meta.task` if the bundle was exported with `--task "..."`.
- Fallback: parse the first worker's `task.md` for the `## Shared Task` section (the agent-shaped header from `scripts/swarm/build-plan.js`).
- Final fallback: parse the orchestrator's own `## Objective` block (works for hand-written tasks).
This means replay works even on bundles that pre-date the --task flag — the heuristic recovers the shared task from the first worker.
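The fallback order described above can be sketched as a short chain of lookups. This is an illustration under assumptions, not the code in `scripts/replay/replay.js`: `extractSection` and `resolveTask` are hypothetical names, and the sketch assumes both the `## Shared Task` and `## Objective` headings are searched for in the first worker's task text as stored in the bundle.

```javascript
// Hypothetical helpers (not the shipped implementation).
// Returns the text under a "## ..." heading, up to the next "## " heading.
function extractSection(markdown, heading) {
  const lines = String(markdown).split('\n');
  const start = lines.findIndex((line) => line.trim() === heading);
  if (start === -1) return null;
  const body = [];
  for (const line of lines.slice(start + 1)) {
    if (line.startsWith('## ')) break;
    body.push(line);
  }
  const text = body.join('\n').trim();
  return text === '' ? null : text;
}

function resolveTask(bundle) {
  // 1. Prefer meta.task when the export was enriched with --task.
  const metaTask = bundle?.meta?.task;
  if (typeof metaTask === 'string' && metaTask.trim() !== '') {
    return metaTask.trim();
  }
  // 2. and 3. Assumed to read the first worker's task text.
  const firstTask = bundle?.workers?.[0]?.task ?? '';
  return (
    extractSection(firstTask, '## Shared Task') ??
    extractSection(firstTask, '## Objective')
  );
}
```

When none of the three sources yields text, this sketch returns `null`, leaving the caller to decide whether to stop or ask for an explicit `--task`.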
Replay variations
| Want to test | Flags |
|---|---|
| Same task, different model | --harness claude vs --harness codex (or vary KODELYTH_ROUTER_*) |
| Same task, different agents | --agents new1,new2,new3 |
| Same task, new code | --base-ref refactor-branch |
| Custom session name | --session my-replay-1 |
| Inspect plan only | (default — dry-run) |
| Just write coordination files | --write-only |
| Full execute | --execute |
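Each variation in the table amounts to overriding a few fields of the bundle's `meta` before a plan is built. The sketch below shows one way such a merge could look. `applyReplayOverrides` is hypothetical and is not the `buildReplayPlanConfig` function from `scripts/replay/replay.js`; the option names `harness` and `baseRef` mirror the programmatic example in this document, while `agents` is an assumed option name.

```javascript
// Hypothetical merge of replay overrides onto bundle meta.
// Does not mutate the input; unspecified fields keep the bundle's values.
function applyReplayOverrides(meta, overrides = {}) {
  const merged = { ...meta };
  if (overrides.harness) merged.harness = overrides.harness;
  if (overrides.baseRef) merged.base_ref = overrides.baseRef;
  if (overrides.agents) {
    // Accept either an array or a CLI-style "a,b,c" string (assumption).
    merged.agents = Array.isArray(overrides.agents)
      ? [...overrides.agents]
      : String(overrides.agents)
          .split(',')
          .map((name) => name.trim())
          .filter(Boolean);
  }
  return merged;
}

// Usage: same task, different harness and base ref.
const meta = { task: 'audit oauth flow', agents: ['code-reviewer'], harness: 'claude', base_ref: 'HEAD' };
console.log(applyReplayOverrides(meta, { harness: 'codex', baseRef: 'main' }));
```

Keeping the task untouched while varying one field at a time is what makes two replays comparable.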
Hard rules
- Never `--execute` without inspecting the dry-run. Worktrees mutate disk.
- Replays are auto-named (`-replay-N`) to avoid collisions. Don't manually reuse the origin name.
- Bundles are public artifacts. Strip secrets before sharing externally. Use `opensource-sanitizer` if needed.
- A/B comparisons require human review. Never auto-pick a "winner" between two replays.
- Don't replay across incompatible base refs. A swarm built against `main` may break if replayed against `feature-branch` with conflicting changes.
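In keeping with the human-review rule, a comparison script can limit itself to reporting where two bundles differ and leave the judgment to a reviewer. The sketch below is one hedged way to do that: `compareHandoffs` is a hypothetical helper, not part of kodelyth-ecc, and it relies only on the `workers[].slug` and `workers[].handoff` fields from the bundle format.

```javascript
// Hypothetical report helper (not shipped with kodelyth-ecc).
// Lists, per worker slug, whether the handoff text differs between bundles.
// It deliberately does not score or rank the two runs.
function compareHandoffs(bundleA, bundleB) {
  const bySlug = (bundle) =>
    new Map((bundle.workers ?? []).map((worker) => [worker.slug, worker.handoff]));
  const a = bySlug(bundleA);
  const b = bySlug(bundleB);
  const slugs = [...new Set([...a.keys(), ...b.keys()])].sort();
  return slugs.map((slug) => {
    let status;
    if (!a.has(slug)) status = 'only-in-b';
    else if (!b.has(slug)) status = 'only-in-a';
    else status = a.get(slug) === b.get(slug) ? 'same' : 'changed';
    return { slug, status };
  });
}
```

The `changed` entries are the ones to open side by side; whether a change is an improvement stays a human call.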
Pairing with the rest of ECC
| Pairs with | How |
|---|---|
| 2.7 swarm orchestrator | Replay only works on swarm coordination dirs. The two ship together. |
| 2.4 cost router | Vary KODELYTH_ROUTER_* env vars across replays for A/B model tests. |
| 2.10 token-budget hook | Replays open new sessions with fresh budgets — no spillover from the origin. |
| 2.5 MCP client mode | Replays inherit the same MCP registry, so tool calls reproduce. |
| opensource-sanitizer | Run on a bundle before sharing externally. |
Examples
Reproducible bug report
# 1. Capture the buggy run
npx kodelyth-ecc swarm --task "..." --execute
# 2. After it finishes, export with rich meta
npx kodelyth-ecc session-export swarm-2026-05-10-4a \
--task "..." --agents code-reviewer,security-reviewer --harness claude \
--out bug-report.bundle.json
# 3. Strip secrets if needed
# (manually edit bug-report.bundle.json)
# 4. Ship to maintainers
gh issue create --body "Reproducer attached: bug-report.bundle.json"

Model A/B test
# Run with claude
npx kodelyth-ecc swarm --task "refactor payments module" --agents 4 --harness claude --execute
npx kodelyth-ecc session-export swarm-... --out claude-run.bundle.json
# Replay with codex against the same task
npx kodelyth-ecc replay claude-run.bundle.json --harness codex --execute
npx kodelyth-ecc session-export swarm-...-replay-1 --out codex-run.bundle.json
# Compare handoffs
diff <(jq -r '.workers[] | "\(.slug):\n\(.handoff)"' claude-run.bundle.json) \
<(jq -r '.workers[] | "\(.slug):\n\(.handoff)"' codex-run.bundle.json)

Regression check after agent rev
# 1. Save a baseline.
npx kodelyth-ecc session-export swarm-baseline --out baseline.bundle.json
# 2. After updating an agent prompt, replay.
npx kodelyth-ecc replay baseline.bundle.json --execute
npx kodelyth-ecc session-export swarm-baseline-replay-1 --out replay.bundle.json
# 3. Inspect the diff manually — has quality regressed?
diff <(jq -r '.workers[] | "\(.slug)\n\(.handoff)"' baseline.bundle.json) \
<(jq -r '.workers[] | "\(.slug)\n\(.handoff)"' replay.bundle.json)

Programmatic use
const bundleLib = require('kodelyth-ecc/scripts/replay/bundle.js');
const replayLib = require('kodelyth-ecc/scripts/replay/replay.js');
const { buildOrchestrationPlan, executePlan } = require('kodelyth-ecc/scripts/lib/tmux-worktree-orchestrator.js');
// Read a bundle
const bundle = bundleLib.readBundle('./oauth-audit.bundle.json');
// Build a replay plan with overrides
const planConfig = replayLib.buildReplayPlanConfig(bundle, {
harness: 'codex',
baseRef: 'main',
});
// Execute
const plan = buildOrchestrationPlan(planConfig);
const result = executePlan(plan);
console.log(`replay started: ${result.sessionName}`);

Roadmap interactions
- Phase 2.3 — local dashboard will surface replay history and side-by-side handoff diffs for the same task across runs.
- Phase 2.6 — sandbox layer will isolate replay execution in Docker so re-running an external bundle doesn't trust the source.
- Phase 2.2 — SWE-Bench harness will use bundle replay as its evaluation primitive.
Built into Kodelyth ECC. MIT licensed.