Skip to content

feat(runtime): run bridge workers in isolated worktrees#406

Merged
drewstone merged 2 commits into
mainfrom
feat/live-bridge-worktree
Jun 28, 2026
Merged

feat(runtime): run bridge workers in isolated worktrees#406
drewstone merged 2 commits into
mainfrom
feat/live-bridge-worktree

Conversation

@drewstone

Copy link
Copy Markdown
Contributor

Summary

  • lets cli-worktree run through the existing live bridge transport while preserving isolated git worktrees
  • forwards cwd into bridge sessions so bridge-backed workers edit inside their cut worktree
  • keeps local CLI behavior unchanged, shares worktree checks, and makes examples typecheck use source imports instead of stale dist output

Checks

  • pnpm lint
  • pnpm typecheck
  • pnpm test
  • pnpm build
  • git merge-tree --write-tree origin/main HEAD

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Value Audit — sound-with-nits

Verdict sound-with-nits
Concerns 2 (1 medium-concern, 1 low)
Heuristic 0.0s
Duplication 0.0s
Interrogation 101.2s (2 bridge agents)
Total 101.2s

💰 Value — sound-with-nits

Adds a bridge-transport variant to cliWorktreeExecutor so remote workers get the same isolated-worktree + diff + checks lifecycle as local CLI harnesses; coherent and reuses existing primitives, but duplicates the worktree orchestration that runWorktreeHarness was designed to own once.

  • What it does: Extends CliWorktreeSeam with an optional bridge (bridgeUrl/bearer/model/sessionId). When set, cliWorktreeExecutor dispatches to a new bridgeWorktreeExecutor that (1) cuts a fresh git worktree off repoRoot, (2) builds a BridgeSeam with cwd=worktree.path and delegates to the EXISTING bridgeExecutor (HTTP SSE transport, unchanged), (3) yields the bridge's streaming UsageEvents up to Scope, (4) capt
  • Goals it achieves: Give bridge-backed remote coding workers the same isolation + verification guarantees local-harness workers already had: per-spawn worktree (no cross-spawn file collision), captured diff as the artifact, test/typecheck PASS signals, and fail-loud cleanup. Previously a bridge worker ran in whatever cwd the remote session defaulted to and produced no patch artifact at all — now it produces a Worktre
  • Assessment: Good change, built in the codebase's grain. It reuses bridgeExecutor verbatim (no transport reinvention), reuses the four worktree primitives (createWorktree/captureWorktreeDiff/runWorktreeChecks/removeWorktree), plugs into the existing Executor port (deliver/execute/teardown/resultArtifact) so Scope/Supervisor need no changes, and leaves the local CLI path untouched via an early `if (seam.bridge)
  • Better / existing approach: Searched src/ for runWorktreeHarness callers (3: worktree-harness.ts itself, worktree-cli-executor.ts:127, in-process-executor.ts:150) and for any pre-existing bridge+worktree combinator. None exists — the bridge path is genuinely new capability. The available improvement is bounded: bridgeWorktreeExecutor (runtime.ts:1162-1321) re-implements ~50 lines of create-worktree → diff → checks → cleanup-
  • Model: opencode/zai-coding-plan/glm-5.2
  • Bridge attempts: 2
  • Bridge warning: opencode/kimi-for-coding/k2p7: bridge stream ended without value-audit content

🎯 Usefulness — sound

Adds the missing cross-product of two existing primitives — cli-bridge transport + git worktree isolation — by composing the live bridgeExecutor inside a worktree lifecycle, reachable from the public supervise() API with no reinvention.

  • Integration: Fully reachable. supervise(profile, task, { backend }) (supervise.ts:50) → workerFromBackend (supervise.ts:37) → createExecutor(backend) (runtime.ts:1413) → case 'cli-worktree' (runtime.ts:1426) → cliWorktreeExecutor (runtime.ts:1360) which routes to bridgeWorktreeExecutor when seam.bridge is set (runtime.ts:1365). The new path is one field-swap away from the existing `examples/super
  • Fit with existing patterns: Excellent. The codebase already had two parallel isolation models: backend:'bridge' (resumable session, NO git isolation — N parallel workers share one cwd) and backend:'cli-worktree'+harness (local CLI subprocess, WITH worktree isolation). The PR fills the missing cell — bridge transport WITH per-worker worktree isolation — which is exactly what parallel bridge-backed coding workers need so
  • Real-world viability: Solid. Each spawn mints its own worktree + sessionId (runtime.ts:1177-1178), so fanout is safe by construction — the same guarantee the existing worktree executor documents (worktree-cli-executor.ts:90). Lifecycle is fail-loud and idempotent: cleanupWorktree guards on a removed flag (runtime.ts:1186-1196); the catch path aborts the controller, tears down inner, and cleans the worktree befo
  • Model: opencode/zai-coding-plan/glm-5.2
  • Bridge attempts: 1

🔎 Heuristic Signals

🟡 Cruft: magic number added src/runtime/supervise/runtime.ts

  •          seam.checkTimeoutMs ?? seam.harnessTimeoutMs ?? bridge.timeoutMs ?? 5 * 60 * 1000,
    

💰 Value Audit

🟠 bridgeWorktreeExecutor duplicates runWorktreeHarness's worktree lifecycle, violating that file's 'lives ONCE' invariant [duplication] ``

runtime.ts:1216-1298 (createWorktree → bridgeExecutor.execute → captureWorktreeDiff → runWorktreeChecks → cleanupWorktree on throw + teardown) is a near-line-for-line echo of runWorktreeHarness's body at worktree-harness.ts:141-202. The worktree-harness.ts header (lines 1-20) explicitly states the lifecycle 'lives here ONCE' and that createWorktreeCliExecutor + createInProcessExecutor are THIN adapters over it. The streaming/await mismatch is the genuine cause — but a `withWorktree(repoRoot, run


What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

Pass What it asks
Heuristic Vague title? Whitespace-only or cruft-bearing diff? (content signals only)
Duplication Do added function/class names already exist elsewhere in the repo?
Value Audit What does it do? What goal does it achieve? Is it good? Better architecture or already-exists?
Usefulness Audit Does it integrate and fit? Will it hold up in real use and actually get used?

Findings are concerns, not blocks — the human reviewer decides what to do with them.

value-audit · 20260628T183729Z

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Auto-approved PR — 01002bee

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: manual · 2026-06-28T18:57:55Z

@tangletools

Copy link
Copy Markdown
Contributor

✅ No Blockers — 01002bee

Review health 100/100 · Reviewer score 71/100 · Confidence 90/100 · 15 findings (15 low)

glm deepseek aggregate
Readiness 74 71 71
Confidence 90 90 90
Correctness 74 71 71
Security 74 71 71
Testing 74 71 71
Architecture 74 71 71

Reviewer score is advisory once the run is complete and the verdict has no blockers.

Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision.

🟡 LOW Missing vi.unstubAllEnvs() in afterEach cleanup — src/intelligence/intelligence.test.ts

The afterEach at line 72 calls vi.unstubAllGlobals() and vi.restoreAllMocks() but not vi.unstubAllEnvs(). The new stubNoEndpointEnv() helper (lines 45-48) uses vi.stubEnv(), which is NOT cleaned up by either existing call. This means env stubs leak between tests within this file. Currently harmless because the stubbed values (empty strings for INTELLIGENCE_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_ENDPOINT) don't affect any subsequent test — all later tests either restub to the same values or use expli

🟡 LOW afterEach omits vi.unstubAllEnvs() — env stubs not explicitly restored — src/intelligence/intelligence.test.ts

Line 72-75: afterEach calls vi.unstubAllGlobals() and vi.restoreAllMocks() but NOT vi.unstubAllEnvs(). vi.stubEnv stubs are in a separate namespace from vi.stubGlobal, so the empty-string env stubs from stubNoEndpointEnv() are not explicitly cleaned between tests. In practice this is harmless: all three stubs set the same value ('') and vitest isolates test files, so no cross-file leakage. Flagging as a low-priority robustness nit — adding vi.unstubAllEnvs() to afterEach would make the cleanup symmetric with the stubbing and guard against future tests that stub different env values.

🟡 LOW Exported defaultRunCommand has zero external consumers — src/mcp/worktree-harness.ts

defaultRunCommand was changed from module-private (function) to export function at line 241, but grep shows nothing outside worktree-harness.ts imports it (the bridge executor relies on the fallback in runWorktreeChecks, and test seams inject their own). Either keep it private until a consumer exists, or accept the API surface increase.

🟡 LOW Newly-exported public API has no direct unit tests — src/mcp/worktree-harness.ts

runWorktreeChecks (line 207) and defaultRunCommand (line 241) are now exported. Grep across tests/*.test.ts shows zero direct references; the only runWorktreeHarness-adjacent test (tests/mcp/detached-coder.test.ts:240) checks the in-process executor's result shape, not the check runner. Behavior is unchanged by this diff so regression risk is low, but the spawn/timeout/abort/cap logic in defaultRunCommand and the truncation slice(-cap) in runWorktreeChecks ([line 229](https://github.com/tangle-network/agent-runtime/blob/

🟡 LOW No direct unit tests for newly-exported runWorktreeCheckssrc/mcp/worktree-harness.ts

runWorktreeChecks is now a public API but has no focused unit tests. All coverage is indirect: createWorktreeCliExecutor tests exercise it through runWorktreeHarness, and the bridge test exercises it through bridgeWorktreeExecutor. The fallback-to-defaultRunCommand path (line 217: opts.runCommand ?? defaultRunCommand) is never triggered in any test because all tests inject a seam runCommand. Consider adding a direct test that calls runWorktreeChecks without runCommand to validate the default behavior.

🟡 LOW Redundant runCommand resolution at two layers — src/mcp/worktree-harness.ts

runWorktreeHarness resolves const runCommand = opts.runCommand ?? defaultRunCommand at line 137 then threads the resolved value into runWorktreeChecks at line 179, which re-applies opts.runCommand ?? defaultRunCommand at line 217. Harmless (both resolve to the same value), but the inner fallback only matters for the bridge caller in runtime.ts that omits the field. Cosmetic: either drop the resoluti

🟡 LOW Stale file doc claims only local CLI, but type now includes 'bridge' — src/mcp/worktree-harness.ts

Lines 2-9: the module header says the harness is always a 'local coding-harness CLI (claude / codex / opencode)'. But WorktreeHarnessResult.harness.name at line 60 is now typed LocalHarness | 'bridge', where 'bridge' is an HTTP bridge transport, not a local CLI. Extend the doc comment to mention the bridge path (which constructs results without going through runWorktreeHarness).

🟡 LOW Bridge worktree error/cleanup paths untested — src/runtime/supervise/runtime.ts

The test at tests/runtime/worktree-cli-executor.test.ts:330-399 covers only the happy path (createWorktree succeeds, bridge fetch returns 200 SSE, diff and checks succeed, teardown removes the worktree). The catch block at runtime.ts:1292-1297 — which exercises the riskiest code (abort propagation, inner.teardown('brutalKill'), cleanupWorktree on a thrown bridge fetch / git failure / check failure) — has zero coverage for the bridge variant. The non-bridge createWorktreeCliExecutor analog has throw-path coverage via runWorktreeHarness's own catch (worktree-harness.ts:199-202). Recommend adding at minimum: (1) bridge fetch rejects → assert worktree is removed and the error rethrows; (2) captureWorktreeDiff throws post-stream → assert the bridge session's settled output is NOT leaked as a

🟡 LOW Redundant removed = false reset signals inconsistent multi-execute handling — src/runtime/supervise/runtime.ts

At runtime.ts:1223, immediately after worktree = await createWorktree(...), the code does removed = false. But removed is initialized to false at line 1183 and is only ever set to true inside cleanupWorktree. On the first (and contractually only) execute() call, this reset is a no-op. It only matters if execute() is called again after a prior catch cleaned up — but in that case the orphaned-message problem from the deliver() finding already bites, AND worktree would be reassigned (leaking the first inner). The reset suggests the author half-anticipated multi-execute but didn't fully handle it. Either remove the reset (single-execute is the cont

🟡 LOW budgetExempt default differs between harness and bridge worktree paths without seam-level documentation — src/runtime/supervise/runtime.ts

bridgeWorktreeExecutor defaults to budgetExempt: false (line 1208) because the bridge surfaces real token/cost usage via SSE. createWorktreeCliExecutor defaults to true because local harnesses cannot account tokens. This asymmetry is intentional and correct, but neither CliWorktreeSeam.budgetExempt (line 138) nor CliWorktreeBridgeSeam documents the differing defaults. A consumer passing budgetExempt: undefined on the seam gets exempt=true for local harness runs and exempt=false for bridge runs — surprising without docs.

🟡 LOW deliver() after settle orphans messages — diverges from bridgeExecutor's resumability contract — src/runtime/supervise/runtime.ts

At runtime.ts:1198-1204, deliver() forwards to inner.deliver(msg) when inner exists. But inner is created per-execute (line 1242) and its inbox lives inside that inner closure. In the non-worktree bridgeExecutor (runtime.ts:887-892) the inbox is captured in the OUTER closure, so messages delivered between execute() calls survive and are drained on the next resume. Here, a message delivered after execute() settles sits in the OLD inner's inbox; if execute() were called again, a new inner is constructed and pending.splice(0) ([line 1243](https://github.com/tangle-network/agent-runtime/blob/01002bee709f7bc09243184c9682b5a7a613075a/src/runtime/supe

🟡 LOW Missing assertion on agent_profile in bridge request body — tests/runtime/worktree-cli-executor.test.ts

Line 342-348: The fetch stub captures request bodies but the test only asserts on cwd, session_id, and messages (lines 378-383). The bridge worktree executor sends agent_profile: spec.profile (runtime.ts:1232-1233) when bridge.agentProfile is not set. This is part of the bridge wire contract — the test should verify it reaches the bridge. Add: expect(typeof requests[0]?.agent_profile).toBe('object').

🟡 LOW No error-path coverage for the bridge-worktree integration — tests/runtime/worktree-cli-executor.test.ts

The new test (lines 330-399) covers only the happy path: bridge returns 200 with valid SSE, one turn, clean settle, clean teardown. Uncovered: bridge non-200 response (ValidationError path at runtime.ts:1020-1024), fetch rejection (fatal rethrow at runtime.ts:1018), missing body (runtime.ts:1026-1028), interrupt steer triggering turn abort+replan (runtime.ts:1011-1017), and the maxTurns backstop. The bridge executor's own unit tests (if any) should cover these; this integration test is scoped to the worktree+bridge composition's invariants. Not blocking — the happy path validates the core integration (isolation, steering, session continuity,

🟡 LOW No test for deliver() during async execution (mid-stream steering) — tests/runtime/worktree-cli-executor.test.ts

Line 330-399: The bridge worktree test only covers deliver() BEFORE execute(). The executor's deliver() (runtime.ts:1198-1203) routes to inner?.deliver() when the inner executor exists — the mid-execution path that steering during a run relies on. A regression that breaks mid-execution deliver (e.g., by not forwarding to inner) could go undetected.

🟡 LOW afterEach unstubAllGlobals is broad-scope cleanup — tests/runtime/worktree-cli-executor.test.ts

Line 104: afterEach(() => vi.unstubAllGlobals()) restores ALL global stubs, not just fetch. Safe in the current file because only the bridge test stubs globals, but a future test that stubs multiple globals and needs selective restore could be silently affected. Prefer scoped restoration (e.g., save return of stubGlobal and restore explicitly) for precision.


tangletools · 2026-06-28T18:58:23Z · trace

@tangletools tangletools left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Approved — 15 non-blocking findings — 01002bee

Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision. | Full multi-shot audit completed 6/6 planned shots over 7 changed files. Global verifier still owns final merge decision.

Full immutable report for this review: trace

Summary comment for this run: full summary


tangletools · 2026-06-28T18:58:23Z · immutable trace

@drewstone drewstone merged commit 2273544 into main Jun 28, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants