feat(runtime): run bridge workers in isolated worktrees by drewstone · Pull Request #406 · tangle-network/agent-runtime

drewstone · 2026-06-28T18:30:43Z

Summary

lets cli-worktree run through the existing live bridge transport while preserving isolated git worktrees
forwards cwd into bridge sessions so bridge-backed workers edit inside their cut worktree
keeps local CLI behavior unchanged, shares worktree checks, and makes examples typecheck use source imports instead of stale dist output

Checks

pnpm lint
pnpm typecheck
pnpm test
pnpm build
git merge-tree --write-tree origin/main HEAD

tangletools

🟡 Value Audit — sound-with-nits


Verdict	sound-with-nits
Concerns	2 (1 medium-concern, 1 low)
Heuristic	0.0s
Duplication	0.0s
Interrogation	101.2s (2 bridge agents)
Total	101.2s

💰 Value — sound-with-nits

Adds a bridge-transport variant to cliWorktreeExecutor so remote workers get the same isolated-worktree + diff + checks lifecycle as local CLI harnesses; coherent and reuses existing primitives, but duplicates the worktree orchestration that runWorktreeHarness was designed to own once.

What it does: Extends CliWorktreeSeam with an optional bridge (bridgeUrl/bearer/model/sessionId). When set, cliWorktreeExecutor dispatches to a new bridgeWorktreeExecutor that (1) cuts a fresh git worktree off repoRoot, (2) builds a BridgeSeam with cwd=worktree.path and delegates to the EXISTING bridgeExecutor (HTTP SSE transport, unchanged), (3) yields the bridge's streaming UsageEvents up to Scope, (4) capt
Goals it achieves: Give bridge-backed remote coding workers the same isolation + verification guarantees local-harness workers already had: per-spawn worktree (no cross-spawn file collision), captured diff as the artifact, test/typecheck PASS signals, and fail-loud cleanup. Previously a bridge worker ran in whatever cwd the remote session defaulted to and produced no patch artifact at all — now it produces a Worktre
Assessment: Good change, built in the codebase's grain. It reuses bridgeExecutor verbatim (no transport reinvention), reuses the four worktree primitives (createWorktree/captureWorktreeDiff/runWorktreeChecks/removeWorktree), plugs into the existing Executor port (deliver/execute/teardown/resultArtifact) so Scope/Supervisor need no changes, and leaves the local CLI path untouched via an early `if (seam.bridge)
Better / existing approach: Searched src/ for runWorktreeHarness callers (3: worktree-harness.ts itself, worktree-cli-executor.ts:127, in-process-executor.ts:150) and for any pre-existing bridge+worktree combinator. None exists — the bridge path is genuinely new capability. The available improvement is bounded: bridgeWorktreeExecutor (runtime.ts:1162-1321) re-implements ~50 lines of create-worktree → diff → checks → cleanup-
Model: opencode/zai-coding-plan/glm-5.2
Bridge attempts: 2
Bridge warning: opencode/kimi-for-coding/k2p7: bridge stream ended without value-audit content

🎯 Usefulness — sound

Adds the missing cross-product of two existing primitives — cli-bridge transport + git worktree isolation — by composing the live bridgeExecutor inside a worktree lifecycle, reachable from the public supervise() API with no reinvention.

Integration: Fully reachable. supervise(profile, task, { backend }) (supervise.ts:50) → workerFromBackend (supervise.ts:37) → createExecutor(backend) (runtime.ts:1413) → case 'cli-worktree' (runtime.ts:1426) → cliWorktreeExecutor (runtime.ts:1360) which routes to bridgeWorktreeExecutor when seam.bridge is set (runtime.ts:1365). The new path is one field-swap away from the existing `examples/super
Fit with existing patterns: Excellent. The codebase already had two parallel isolation models: backend:'bridge' (resumable session, NO git isolation — N parallel workers share one cwd) and backend:'cli-worktree'+harness (local CLI subprocess, WITH worktree isolation). The PR fills the missing cell — bridge transport WITH per-worker worktree isolation — which is exactly what parallel bridge-backed coding workers need so
Real-world viability: Solid. Each spawn mints its own worktree + sessionId (runtime.ts:1177-1178), so fanout is safe by construction — the same guarantee the existing worktree executor documents (worktree-cli-executor.ts:90). Lifecycle is fail-loud and idempotent: cleanupWorktree guards on a removed flag (runtime.ts:1186-1196); the catch path aborts the controller, tears down inner, and cleans the worktree befo
Model: opencode/zai-coding-plan/glm-5.2
Bridge attempts: 1

🔎 Heuristic Signals

🟡 Cruft: magic number added src/runtime/supervise/runtime.ts

         seam.checkTimeoutMs ?? seam.harnessTimeoutMs ?? bridge.timeoutMs ?? 5 * 60 * 1000,

💰 Value Audit

🟠 bridgeWorktreeExecutor duplicates runWorktreeHarness's worktree lifecycle, violating that file's 'lives ONCE' invariant [duplication] ``

runtime.ts:1216-1298 (createWorktree → bridgeExecutor.execute → captureWorktreeDiff → runWorktreeChecks → cleanupWorktree on throw + teardown) is a near-line-for-line echo of runWorktreeHarness's body at worktree-harness.ts:141-202. The worktree-harness.ts header (lines 1-20) explicitly states the lifecycle 'lives here ONCE' and that createWorktreeCliExecutor + createInProcessExecutor are THIN adapters over it. The streaming/await mismatch is the genuine cause — but a `withWorktree(repoRoot, run

What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

Pass	What it asks
Heuristic	Vague title? Whitespace-only or cruft-bearing diff? (content signals only)
Duplication	Do added function/class names already exist elsewhere in the repo?
Value Audit	What does it do? What goal does it achieve? Is it good? Better architecture or already-exists?
Usefulness Audit	Does it integrate and fit? Will it hold up in real use and actually get used?

Findings are concerns, not blocks — the human reviewer decides what to do with them.

_{value-audit · 20260628T183729Z}

tangletools

✅ Auto-approved PR — `01002bee`

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

_{tangletools · auto-approval · reason: manual · 2026-06-28T18:57:55Z}

tangletools · 2026-06-28T18:58:26Z

✅ No Blockers — `01002bee`

Review health 100/100 · Reviewer score 71/100 · Confidence 90/100 · 15 findings (15 low)

	glm	deepseek	aggregate
Readiness	74	71	71
Confidence	90	90	90
Correctness	74	71	71
Security	74	71	71
Testing	74	71	71
Architecture	74	71	71

Reviewer score is advisory once the run is complete and the verdict has no blockers.

Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision.

🟡 LOW Missing vi.unstubAllEnvs() in afterEach cleanup — src/intelligence/intelligence.test.ts

The afterEach at line 72 calls vi.unstubAllGlobals() and vi.restoreAllMocks() but not vi.unstubAllEnvs(). The new stubNoEndpointEnv() helper (lines 45-48) uses vi.stubEnv(), which is NOT cleaned up by either existing call. This means env stubs leak between tests within this file. Currently harmless because the stubbed values (empty strings for INTELLIGENCE_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_ENDPOINT) don't affect any subsequent test — all later tests either restub to the same values or use expli

🟡 LOW afterEach omits vi.unstubAllEnvs() — env stubs not explicitly restored — src/intelligence/intelligence.test.ts

Line 72-75: afterEach calls vi.unstubAllGlobals() and vi.restoreAllMocks() but NOT vi.unstubAllEnvs(). vi.stubEnv stubs are in a separate namespace from vi.stubGlobal, so the empty-string env stubs from stubNoEndpointEnv() are not explicitly cleaned between tests. In practice this is harmless: all three stubs set the same value ('') and vitest isolates test files, so no cross-file leakage. Flagging as a low-priority robustness nit — adding vi.unstubAllEnvs() to afterEach would make the cleanup symmetric with the stubbing and guard against future tests that stub different env values.

🟡 LOW Exported defaultRunCommand has zero external consumers — src/mcp/worktree-harness.ts

defaultRunCommand was changed from module-private (function) to export function at line 241, but grep shows nothing outside worktree-harness.ts imports it (the bridge executor relies on the fallback in runWorktreeChecks, and test seams inject their own). Either keep it private until a consumer exists, or accept the API surface increase.

🟡 LOW Newly-exported public API has no direct unit tests — src/mcp/worktree-harness.ts

runWorktreeChecks (line 207) and defaultRunCommand (line 241) are now exported. Grep across tests/*.test.ts shows zero direct references; the only runWorktreeHarness-adjacent test (tests/mcp/detached-coder.test.ts:240) checks the in-process executor's result shape, not the check runner. Behavior is unchanged by this diff so regression risk is low, but the spawn/timeout/abort/cap logic in defaultRunCommand and the truncation slice(-cap) in runWorktreeChecks ([line 229](https://github.com/tangle-network/agent-runtime/blob/

🟡 LOW No direct unit tests for newly-exported runWorktreeChecks — src/mcp/worktree-harness.ts

runWorktreeChecks is now a public API but has no focused unit tests. All coverage is indirect: createWorktreeCliExecutor tests exercise it through runWorktreeHarness, and the bridge test exercises it through bridgeWorktreeExecutor. The fallback-to-defaultRunCommand path (line 217: opts.runCommand ?? defaultRunCommand) is never triggered in any test because all tests inject a seam runCommand. Consider adding a direct test that calls runWorktreeChecks without runCommand to validate the default behavior.

🟡 LOW Redundant runCommand resolution at two layers — src/mcp/worktree-harness.ts

runWorktreeHarness resolves const runCommand = opts.runCommand ?? defaultRunCommand at line 137 then threads the resolved value into runWorktreeChecks at line 179, which re-applies opts.runCommand ?? defaultRunCommand at line 217. Harmless (both resolve to the same value), but the inner fallback only matters for the bridge caller in runtime.ts that omits the field. Cosmetic: either drop the resoluti

🟡 LOW Stale file doc claims only local CLI, but type now includes 'bridge' — src/mcp/worktree-harness.ts

Lines 2-9: the module header says the harness is always a 'local coding-harness CLI (claude / codex / opencode)'. But WorktreeHarnessResult.harness.name at line 60 is now typed LocalHarness | 'bridge', where 'bridge' is an HTTP bridge transport, not a local CLI. Extend the doc comment to mention the bridge path (which constructs results without going through runWorktreeHarness).

🟡 LOW Bridge worktree error/cleanup paths untested — src/runtime/supervise/runtime.ts

The test at tests/runtime/worktree-cli-executor.test.ts:330-399 covers only the happy path (createWorktree succeeds, bridge fetch returns 200 SSE, diff and checks succeed, teardown removes the worktree). The catch block at runtime.ts:1292-1297 — which exercises the riskiest code (abort propagation, inner.teardown('brutalKill'), cleanupWorktree on a thrown bridge fetch / git failure / check failure) — has zero coverage for the bridge variant. The non-bridge createWorktreeCliExecutor analog has throw-path coverage via runWorktreeHarness's own catch (worktree-harness.ts:199-202). Recommend adding at minimum: (1) bridge fetch rejects → assert worktree is removed and the error rethrows; (2) captureWorktreeDiff throws post-stream → assert the bridge session's settled output is NOT leaked as a

🟡 LOW Redundant removed = false reset signals inconsistent multi-execute handling — src/runtime/supervise/runtime.ts

At runtime.ts:1223, immediately after worktree = await createWorktree(...), the code does removed = false. But removed is initialized to false at line 1183 and is only ever set to true inside cleanupWorktree. On the first (and contractually only) execute() call, this reset is a no-op. It only matters if execute() is called again after a prior catch cleaned up — but in that case the orphaned-message problem from the deliver() finding already bites, AND worktree would be reassigned (leaking the first inner). The reset suggests the author half-anticipated multi-execute but didn't fully handle it. Either remove the reset (single-execute is the cont

🟡 LOW budgetExempt default differs between harness and bridge worktree paths without seam-level documentation — src/runtime/supervise/runtime.ts

bridgeWorktreeExecutor defaults to budgetExempt: false (line 1208) because the bridge surfaces real token/cost usage via SSE. createWorktreeCliExecutor defaults to true because local harnesses cannot account tokens. This asymmetry is intentional and correct, but neither CliWorktreeSeam.budgetExempt (line 138) nor CliWorktreeBridgeSeam documents the differing defaults. A consumer passing budgetExempt: undefined on the seam gets exempt=true for local harness runs and exempt=false for bridge runs — surprising without docs.

🟡 LOW deliver() after settle orphans messages — diverges from bridgeExecutor's resumability contract — src/runtime/supervise/runtime.ts

At runtime.ts:1198-1204, deliver() forwards to inner.deliver(msg) when inner exists. But inner is created per-execute (line 1242) and its inbox lives inside that inner closure. In the non-worktree bridgeExecutor (runtime.ts:887-892) the inbox is captured in the OUTER closure, so messages delivered between execute() calls survive and are drained on the next resume. Here, a message delivered after execute() settles sits in the OLD inner's inbox; if execute() were called again, a new inner is constructed and pending.splice(0) ([line 1243](https://github.com/tangle-network/agent-runtime/blob/01002bee709f7bc09243184c9682b5a7a613075a/src/runtime/supe

🟡 LOW Missing assertion on agent_profile in bridge request body — tests/runtime/worktree-cli-executor.test.ts

Line 342-348: The fetch stub captures request bodies but the test only asserts on cwd, session_id, and messages (lines 378-383). The bridge worktree executor sends agent_profile: spec.profile (runtime.ts:1232-1233) when bridge.agentProfile is not set. This is part of the bridge wire contract — the test should verify it reaches the bridge. Add: expect(typeof requests[0]?.agent_profile).toBe('object').

🟡 LOW No error-path coverage for the bridge-worktree integration — tests/runtime/worktree-cli-executor.test.ts

The new test (lines 330-399) covers only the happy path: bridge returns 200 with valid SSE, one turn, clean settle, clean teardown. Uncovered: bridge non-200 response (ValidationError path at runtime.ts:1020-1024), fetch rejection (fatal rethrow at runtime.ts:1018), missing body (runtime.ts:1026-1028), interrupt steer triggering turn abort+replan (runtime.ts:1011-1017), and the maxTurns backstop. The bridge executor's own unit tests (if any) should cover these; this integration test is scoped to the worktree+bridge composition's invariants. Not blocking — the happy path validates the core integration (isolation, steering, session continuity,

🟡 LOW No test for deliver() during async execution (mid-stream steering) — tests/runtime/worktree-cli-executor.test.ts

Line 330-399: The bridge worktree test only covers deliver() BEFORE execute(). The executor's deliver() (runtime.ts:1198-1203) routes to inner?.deliver() when the inner executor exists — the mid-execution path that steering during a run relies on. A regression that breaks mid-execution deliver (e.g., by not forwarding to inner) could go undetected.

🟡 LOW afterEach unstubAllGlobals is broad-scope cleanup — tests/runtime/worktree-cli-executor.test.ts

Line 104: afterEach(() => vi.unstubAllGlobals()) restores ALL global stubs, not just fetch. Safe in the current file because only the bridge test stubs globals, but a future test that stubs multiple globals and needs selective restore could be silently affected. Prefer scoped restoration (e.g., save return of stubGlobal and restore explicitly) for precision.

_{tangletools · 2026-06-28T18:58:23Z · trace}

tangletools

✅ Approved — 15 non-blocking findings — `01002bee`

Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision.

Full immutable report for this review: trace

Summary comment for this run: full summary

_{tangletools · 2026-06-28T18:58:23Z · immutable trace}

drewstone added 2 commits June 28, 2026 12:29

feat(runtime): run bridge workers in isolated worktrees

9c100e0

docs(api): refresh runtime API docs

01002be

tangletools reviewed Jun 28, 2026

View reviewed changes

tangletools approved these changes Jun 28, 2026

View reviewed changes

drewstone merged commit 2273544 into main Jun 28, 2026
1 check passed

tangletools mentioned this pull request Jun 28, 2026

chore(release): agent-runtime 0.79.3 #407

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(runtime): run bridge workers in isolated worktrees#406

feat(runtime): run bridge workers in isolated worktrees#406
drewstone merged 2 commits into
mainfrom
feat/live-bridge-worktree

drewstone commented Jun 28, 2026

Uh oh!

tangletools left a comment

Uh oh!

tangletools left a comment

Uh oh!

tangletools commented Jun 28, 2026

Uh oh!

tangletools left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

drewstone commented Jun 28, 2026

Summary

Checks

Uh oh!

tangletools left a comment

Choose a reason for hiding this comment

🟡 Value Audit — sound-with-nits

💰 Value — sound-with-nits

🎯 Usefulness — sound

🔎 Heuristic Signals

💰 Value Audit

Uh oh!

tangletools left a comment

Choose a reason for hiding this comment

✅ Auto-approved PR — 01002bee

Uh oh!

tangletools commented Jun 28, 2026

✅ No Blockers — 01002bee

Uh oh!

tangletools left a comment

Choose a reason for hiding this comment

✅ Approved — 15 non-blocking findings — 01002bee

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

✅ Auto-approved PR — `01002bee`

✅ No Blockers — `01002bee`

✅ Approved — 15 non-blocking findings — `01002bee`