fix(sdk): custom agent loop parity for continuations, steering, and subtasks#3936
ericallam wants to merge 2 commits into
🦋 Changeset detected. Latest commit: 174ba12. The changes in this PR will be included in the next version bump. This PR includes changesets to release 25 packages.
Walkthrough
This PR addresses three behavioral issues in custom agent loops and chat session handling. The primary fix introduces resume cursor seeding that scans session history before user code attaches listeners, preventing replay of already-answered messages on continuation. Tool subtask execution now threads parent session context so task-backed tools can stream progress to the root chat. Chat capture streaming receives explicit message ID generation to avoid text loss during mid-stream steering. The raw chat session iterator is reordered to seed cursors before stop-signal creation, with cleanup made safe against early termination. A changelog entry documents all three fixes.
Pre-merge checks: 4 passed, 1 failed (inconclusive).
… steering, and subtasks
Three fixes for chat.customAgent raw loops and chat.createSession:

Continuation boots replayed already-answered user messages into the
first wait: the .in SSE tail attached (via createStopSignal or any
listener) before a resume cursor existed, so S2 replayed from seq 0.
The custom-agent wrapper and createChatSession's first next() now seed
both manager cursors from the latest turn-complete header before
anything attaches, the same boot logic chat.agent uses. Seeding only
setLastSeqNum after attach (the reverted earlier attempt) does not
work because dispatch is gated on the other cursor.

Steering a hand-rolled loop mid-stream wiped the in-flight assistant
text: pipeChatAndCapture called toUIMessageStream without
generateMessageId, so a prepareStep injection starting a new step
regenerated the assistant id and the frontend replaced the partial
message. It now stamps the server-generated id like chat.agent's pipe.

Task-backed tools (ai.toolExecute) failed from custom agent loops with
"session handle is not initialized" on the child run: the chatId was
only threaded from the per-turn context that raw loops never set. It
now falls back to the session handle the customAgent wrapper binds at
boot, so child tasks can stream into the parent's chat with
chat.stream.writer({ target: "root" }).
The wire can omit the continuation flag on a run that still has prior turns. The cursor scan doubles as the prior-state probe (a fresh session has no turn-complete on .out and seeds nothing), so run it on every custom-loop boot instead of gating on continuation or attempt number, mirroring the snapshot-exists arm of chat.agent's boot check.
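The boot-time scan described in these commit messages can be sketched roughly as follows. This is a simplified model, not the SDK's implementation: `OutRecord`, `InCursors`, and the three functions are hypothetical names, and the only detail taken from this page is the `session-in-event-id` header on turn-complete records (quoted in the review learnings further down). It shows why a fresh session seeds nothing and why a session with prior turns resumes past the already-consumed messages.

```typescript
// Hypothetical shapes for illustration; not the SDK's real types.
type OutRecord = {
  kind: "chunk" | "turn-complete";
  headers: Record<string, string>;
};

type InCursors = {
  lastSeqNum: number | undefined; // SSE resume point (sent as Last-Event-ID)
  lastDispatchedSeqNum: number | undefined; // gates waiter dispatch
};

// Scan .out newest-first for the latest turn-complete record and read the
// committed .in cursor from its header. Returns undefined on a fresh session.
function resolveResumeCursor(out: OutRecord[]): number | undefined {
  for (const record of [...out].reverse()) {
    if (record.kind !== "turn-complete") continue;
    const parsed = Number.parseInt(record.headers["session-in-event-id"] ?? "", 10);
    if (Number.isFinite(parsed)) return parsed;
  }
  return undefined;
}

// Seed both cursors from the same value before any listener attaches.
function seedCursors(out: OutRecord[]): InCursors {
  const cursor = resolveResumeCursor(out);
  return { lastSeqNum: cursor, lastDispatchedSeqNum: cursor };
}

// The .in records a listener would receive when attaching with these cursors.
function recordsDeliveredOnAttach(inSeqNums: number[], cursors: InCursors): number[] {
  const resumeAfter = cursors.lastSeqNum ?? -1; // no cursor: replay from seq 0
  return inSeqNums.filter((seq) => seq > resumeAfter);
}

const history: OutRecord[] = [
  { kind: "chunk", headers: {} },
  { kind: "turn-complete", headers: { "session-in-event-id": "1" } },
];
console.log(recordsDeliveredOnAttach([0, 1, 2], seedCursors(history)));
console.log(recordsDeliveredOnAttach([0, 1, 2], seedCursors([])));
```

Because the scan returns nothing when no turn-complete record exists, running it unconditionally is harmless on a first boot, which is the reason given above for not gating it on the continuation flag.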
ce2b2d8 to 174ba12
Actionable comments posted: 1
📒 Files selected for processing (2)
- .changeset/custom-agent-loop-fixes.md
- packages/trigger-sdk/src/v3/ai.ts
✅ Files skipped from review due to trivial changes (1)
- .changeset/custom-agent-loop-fixes.md
🧰 Additional context used
📓 Path-based instructions (6)
packages/trigger-sdk/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code
Files:
packages/trigger-sdk/src/v3/ai.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Import from @trigger.dev/sdk when writing Trigger.dev tasks. Never use @trigger.dev/sdk/v3 or deprecated client.defineJob
Files:
packages/trigger-sdk/src/v3/ai.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
**/*.{ts,tsx,js,jsx}: Prefer static imports over dynamic imports. Only use dynamic import() when circular dependencies cannot be resolved, code splitting is needed for performance, or the module must be loaded conditionally at runtime
Import subpaths only from packages/core (@trigger.dev/core), never import from the root
Files:
packages/trigger-sdk/src/v3/ai.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
packages/trigger-sdk/src/v3/ai.ts
packages/trigger-sdk/**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (packages/trigger-sdk/CLAUDE.md)
Always import from @trigger.dev/sdk. Never use @trigger.dev/sdk/v3 (deprecated path alias)
Files:
packages/trigger-sdk/src/v3/ai.ts
**/*.{js,ts,tsx,jsx,css,json,md}
📄 CodeRabbit inference engine (AGENTS.md)
Use Prettier for code formatting and run pnpm run format before committing
Files:
packages/trigger-sdk/src/v3/ai.ts
🧠 Learnings (10)
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma error P1001 ("Can't reach database server") in TypeScript, don’t assume a single error shape. Prisma can surface P1001 via two different error classes/fields: `PrismaClientKnownRequestError` exposes it as `err.code === "P1001"` (common during mid-query connection drops), while `PrismaClientInitializationError` exposes it as `err.errorCode === "P1001"` (common on client startup failure). Therefore, predicates should use `err.code === "P1001" || err.errorCode === "P1001"`. Do not flag `err.code === "P1001"` as “unreachable/never matches,” as it is expected in production.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-05-18T08:21:27.694Z
Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma errors for P1001 ("Can't reach database server"), do not assume it only appears under a single property name. Prisma may surface P1001 via either `PrismaClientKnownRequestError` (`err.code === "P1001"`, e.g., mid-query connection drops) or `PrismaClientInitializationError` (`err.errorCode === "P1001"`, e.g., client startup connection failure). To reliably detect the condition, check `err.code === "P1001" || err.errorCode === "P1001"`, and avoid review rules that would incorrectly flag `err.code === "P1001"` as unreachable/never-matching.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-03-31T21:37:27.212Z
Learnt from: isshaddad
Repo: triggerdotdev/trigger.dev PR: 3283
File: docs/migration-n8n.mdx:19-21
Timestamp: 2026-03-31T21:37:27.212Z
Learning: When reviewing code in `packages/trigger-sdk/src/v3`, treat `tasks.triggerAndWait()` and `tasks.batchTriggerAndWait()` as real exported APIs. They are defined in `shared.ts` and re-exported via the `tasks` object in `tasks.ts`, and they take the task ID string as their first argument (not a task instance). This is distinct from the instance methods `yourTask.triggerAndWait()` and `yourTask.batchTriggerAndWait()`. Do not flag calls to `tasks.triggerAndWait()` or `tasks.batchTriggerAndWait()` as non-existent or incorrectly invoked.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-05-17T08:08:12.370Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3644
File: packages/trigger-sdk/src/v3/ai.ts:8695-8746
Timestamp: 2026-05-17T08:08:12.370Z
Learning: In the Trigger v3 session resume/streams logic, ensure session resumption uses sequence cursors rather than timestamps. Specifically: for each turn-complete control record written to `session.out`, include a `session-in-event-id` header whose value is the committed-consume cursor (`session.in.lastDispatchedSeqNum`). On boot/resume, scan `session.out` for the latest turn-complete record, read the `session-in-event-id` header, and seed the `sessionStreams` manager for `.in` using both `lastSeqNum` and `lastDispatchedSeqNum` so previously processed user messages are not replayed. Do not use `setMinTimestamp`/`lastOutTimestamp` for resume ordering in this flow.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-05-18T14:19:56.437Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3655
File: packages/trigger-sdk/src/v3/ai.ts:8667-8731
Timestamp: 2026-05-18T14:19:56.437Z
Learning: In the Trigger SDK (v3) when making raw `fetch` calls to the Trigger API (including override paths such as `createChatStartSessionAction`), set the request headers to match `ApiClient`: `Content-Type`, `Authorization`, and `x-trigger-source: "sdk"`. Also forward the current preview branch by setting `x-trigger-branch` to `apiClientManager.branchName`. Prefer using the shared `overrideRequestHeaders(accessToken)` helper instead of manually constructing headers, so requests route correctly to preview environments.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-05-19T22:37:47.286Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3671
File: packages/trigger-sdk/test/recovery-boot.test.ts:456-457
Timestamp: 2026-05-19T22:37:47.286Z
Learning: In `packages/trigger-sdk` (Trigger.dev SDK), `logger.warn` (and other SDK logger methods) should route to the Trigger.dev structured logger sink, not to `console.warn`. In SDK tests, `vi.spyOn(console, "warn")` (or similar console spies) should only be used to suppress stray console output; reviewers should not suggest asserting on `console.warn` spies to verify SDK-internal warning/fallback log behavior. Use the SDK’s structured-logger outputs/capture approach instead of console spies.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-06-04T18:16:35.386Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3836
File: apps/supervisor/src/backpressure/backpressureMonitor.ts:3-5
Timestamp: 2026-06-04T18:16:35.386Z
Learning: When reviewing TypeScript in this repo, apply the rule “prefer type aliases over interfaces” only to data/object shapes and union/intersection type modeling. If an interface is being used as a behavioral contract for collaborators to implement (e.g., method-shape interfaces that define required behavior, such as `BackpressureLogger` / `BackpressureSignalSource` in `apps/supervisor/src/backpressure/backpressureMonitor.ts`), keep it as an `interface` and do not flag it as a type-alias-vs-interface violation.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
📚 Learning: 2026-06-09T17:58:04.699Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3879
File: apps/webapp/app/models/vercelIntegration.server.ts:619-630
Timestamp: 2026-06-09T17:58:04.699Z
Learning: In this codebase, outbound raw `fetch` calls should typically rely on Node/undici’s default request timeout (about ~300s) rather than adding a per-call `AbortController` + `setTimeout` wrapper inside individual functions (e.g. in files like `apps/webapp/app/models/vercelIntegration.server.ts`). During code review, do not flag the absence of a per-call timeout on a single `fetch` as an issue; if per-call timeouts are needed, they should be implemented via a codebase-wide convention (e.g., a shared fetch wrapper or documented pattern) rather than ad-hoc per-function changes.
Applied to files:
packages/trigger-sdk/src/v3/ai.ts
🔇 Additional comments (2)
packages/trigger-sdk/src/v3/ai.ts (2)
224-263: LGTM! Also applies to: 5166-5169, 9043-9059, 9401-9404
8670-8679: LGTM!
} else {
  // Hand-rolled chat.customAgent loops never set per-turn context, but
  // the wrapper binds the session handle at run boot — thread the
  // chatId from it so subtask chat helpers (`chat.stream.writer`
  // with target "root") can open the parent's session.
  const sessionHandle = locals.get(chatSessionHandleKey);
  if (sessionHandle) {
    toolMeta.chatId = sessionHandle.id;
  }
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Preserve the external chatId contract here.
Line 972 copies sessionHandle.id into toolMeta.chatId. In this file, getChatSession().id is also exposed as chat.sessionId and documented as the Session friendlyId (session_*), while ToolCallExecutionOptions.chatId / ai.chatContext() are documented as the chat's external chatId. That means custom-loop subtasks will surface/store a different identifier than normal chat.agent turns. Persist the boot payload's chatId in locals and use that here instead of the Session handle id.
🛠️ Suggested direction
+const chatExternalIdKey = locals.create<string>("chat.externalId");
...
locals.set(chatSessionHandleKey, sessions.open(payload.chatId));
+locals.set(chatExternalIdKey, payload.chatId);
...
- const sessionHandle = locals.get(chatSessionHandleKey);
- if (sessionHandle) {
- toolMeta.chatId = sessionHandle.id;
+ const chatId = locals.get(chatExternalIdKey);
+ if (chatId) {
+ toolMeta.chatId = chatId;
}
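The intent of the suggestion above, as a standalone sketch: resolve the chatId for tool metadata from the per-turn context when one exists, and otherwise from the external chatId captured at boot, so both paths surface the same identifier. All names here (`TurnContext`, `BootState`, `resolveToolChatId`) are hypothetical stand-ins, not the SDK's types, and the id values are made up for illustration.

```typescript
// Hypothetical shapes for illustration; not the SDK's real types.
type TurnContext = { chatId: string };

type BootState = {
  externalChatId?: string; // chatId from the boot payload
  sessionFriendlyId?: string; // Session friendlyId (session_*), a different identifier
};

// Pick the chatId to stamp on tool metadata for a subtask.
function resolveToolChatId(
  turn: TurnContext | undefined,
  boot: BootState
): string | undefined {
  // chat.agent turns: the per-turn context already carries the external chatId.
  if (turn) return turn.chatId;
  // Hand-rolled loops: fall back to the boot payload's chatId rather than the
  // session's friendly id, so both paths expose the same kind of identifier.
  return boot.externalChatId;
}

const boot: BootState = { externalChatId: "chat_123", sessionFriendlyId: "session_abc" };
console.log(resolveToolChatId(undefined, boot));
console.log(resolveToolChatId({ chatId: "chat_123" }, boot));
```

Under this model a hand-rolled loop and a chat.agent turn yield the same chatId for the same chat, which is the contract the comment asks to preserve.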
Summary
Three fixes that bring custom agent loops (chat.customAgent hand-rolled loops and chat.createSession) up to the behavior chat.agent users already get, and that the docs already promise:
- The .in resume cursor is now seeded before any listener attaches, using the same boot logic as chat.agent.
- chat.pipeAndCapture (also backing turn.complete()) streamed without a server-generated message id, so a prepareStep injection regenerated the assistant id mid-stream and the frontend replaced the partial message, discarding everything streamed before the injection.
- ai.toolExecute failed with "chat.agent session handle is not initialized" because the parent's chatId was only threaded from the per-turn context that hand-rolled loops never set. It now falls back to the session handle the chat.customAgent wrapper binds at run boot, so children can stream progress into the chat with chat.stream.writer({ target: "root" }) (the documented sub-agent pattern).
Root cause on the replay fix
Attaching any .in listener (chat.createStopSignal, chat.messages.on, the first wait) opens the SSE tail with Last-Event-ID taken from the seq cursor at attach time. Custom loops attached before any cursor existed, so S2 replayed from seq 0. The fix resolves the cursor from the latest turn-complete header and seeds both manager cursors (setLastSeqNum drives the SSE resume point, setLastDispatchedSeqNum gates waiter dispatch) before attach; chat.createSession now creates its stop signal lazily on the first iteration, after the seed. Seeding only the first cursor after attach does not work, which is why the earlier attempt at this was reverted.
All three were reproduced red-green against the references ai-chat project: the replay repro showed the continuation wait consuming a stale message in 403ms with the real message arriving via steering injection; post-fix the wait consumes the real message directly with no injection. Steering now preserves the full in-flight response, and the deepResearch sub-agent streams its progress parts into a raw-loop parent. Existing behavior verified unchanged: full SDK unit suite, chat.agent steering, and stop-then-continue on chat.createSession.
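To illustrate the steering fix, here is a toy model of why a regenerated assistant id loses text. It assumes, as the description states, that the frontend replaces the in-flight message when a step starts under a different id. The `Chunk` shape, `streamSteps`, and `render` are hypothetical and much simpler than the real UI message stream.

```typescript
// Hypothetical, simplified stream chunks; not the real UI message stream types.
type Chunk =
  | { type: "start"; messageId: string }
  | { type: "text"; delta: string };

type AssistantMessage = { id: string; text: string };

// Emit one "start" chunk per step, asking the generator for the message id.
function streamSteps(steps: string[][], generateMessageId: () => string): Chunk[] {
  return steps.flatMap((deltas): Chunk[] => [
    { type: "start", messageId: generateMessageId() },
    ...deltas.map((delta): Chunk => ({ type: "text", delta })),
  ]);
}

// Toy frontend: a start chunk with a new id replaces the in-flight message.
function render(chunks: Chunk[]): AssistantMessage | undefined {
  let current: AssistantMessage | undefined;
  for (const chunk of chunks) {
    if (chunk.type === "start") {
      if (!current || current.id !== chunk.messageId) {
        current = { id: chunk.messageId, text: "" };
      }
    } else if (current) {
      current.text += chunk.delta;
    }
  }
  return current;
}

// A steering injection splits the response into two steps.
const steps = [["Hello, "], ["world"]];

let counter = 0;
const freshIdPerStep = () => `msg_${counter++}`; // id regenerated on each step
const stableServerId = () => "msg_server"; // one id for the whole turn

console.log(render(streamSteps(steps, freshIdPerStep))?.text);
console.log(render(streamSteps(steps, stableServerId))?.text);
```

In this model the per-step generator leaves only "world" on screen, while the stable id keeps "Hello, world", which matches the before and after behavior described for steering.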