Feature request: expose extended prompt-cache TTL (1h) for orchestrator-shaped sessions #344

## Use case

We run a multi-agent build orchestrator on the Agent SDK: a long-lived main session dispatches specialist subagents (via the Agent tool) and waits for them to return. Those waits routinely exceed the 5-minute prompt-cache TTL.

## Measured impact

Profiling one production-shaped session (37 API turns, ~430k-token context, usage taken from the SDK transcript JSONL):

- Two dispatch-wait gaps of ~8–9 minutes each expired the cache; the next turns show `cache_read_input_tokens: 0` and `cache_creation_input_tokens` of ~426k and ~437k respectively — full-context re-writes at the 1.25× creation rate.
- Those two events account for **~75% of the session's total cache-creation spend**. Steady-state turns are cache-clean (creations of a few hundred tokens against growing reads), so the TTL is the dominant avoidable cost for this workload shape.
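
For anyone who wants to reproduce these figures on their own sessions, here is a minimal sketch of the kind of tally we ran. It assumes each transcript line is a JSON object with the API usage block at `message.usage`; that path, the `summarizeCache` name, and the 100k-token threshold for spotting a full re-write are our own illustrative choices, not part of the SDK.

```typescript
// Usage counters as they appear on each API turn.
interface Usage {
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

interface CacheSummary {
  turns: number;               // lines that carried a usage block
  totalCreationTokens: number; // sum of cache_creation_input_tokens
  expiryRewrites: number[];    // creation sizes of turns that look like full re-writes
  expiryShare: number;         // fraction of creation tokens spent on those turns
}

// Assumption: usage lives at `message.usage` on each JSONL line.
// Heuristic: a turn with zero cache reads and a large creation count is
// treated as a post-expiry re-write of the whole context.
function summarizeCache(jsonl: string, minRewriteTokens = 100000): CacheSummary {
  let turns = 0;
  let totalCreationTokens = 0;
  const expiryRewrites: number[] = [];

  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let usage: Usage | undefined;
    try {
      usage = JSON.parse(trimmed)?.message?.usage;
    } catch {
      continue; // skip lines that are not valid JSON
    }
    if (!usage) continue; // skip lines without usage (e.g. user or tool entries)

    turns += 1;
    const created = usage.cache_creation_input_tokens ?? 0;
    const read = usage.cache_read_input_tokens ?? 0;
    totalCreationTokens += created;
    if (read === 0 && created >= minRewriteTokens) {
      expiryRewrites.push(created);
    }
  }

  const rewriteTokens = expiryRewrites.reduce((a, b) => a + b, 0);
  const expiryShare =
    totalCreationTokens > 0 ? rewriteTokens / totalCreationTokens : 0;
  return { turns, totalCreationTokens, expiryRewrites, expiryShare };
}
```

The threshold keeps the small first-turn write from being counted as an expiry; it would need tuning for sessions with much smaller contexts.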

The Anthropic API's extended-TTL beta (a 1-hour TTL, set with `ttl: "1h"` on a `cache_control` block) would convert these re-writes into cache reads, but as far as we can tell neither `Options` in `sdk.d.ts` nor any documented env var exposes it through the SDK.
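
To put a rough size on that conversion, the sketch below expresses the two re-writes in base-input-token equivalents. The 1.25× write rate is the one quoted above; the 0.1× read rate and the 2× rate for 1-hour writes are assumptions taken from our reading of the public pricing documentation, and the token counts are the approximate figures from this session.

```typescript
// Price multipliers relative to the base input-token price.
const WRITE_5M = 1.25; // 5-minute cache write (rate quoted in this issue)
const WRITE_1H = 2.0;  // assumption: 1-hour cache write rate
const READ = 0.1;      // assumption: cache read rate

// Cost of processing `tokens` at a given multiplier, in base-input-token equivalents.
function costUnits(tokens: number, multiplier: number): number {
  return tokens * multiplier;
}

// Approximate sizes of the two post-expiry re-writes observed in the session.
const rewrites = [426000, 437000];

// What the two turns cost today: full re-writes at the 5-minute write rate.
const asRewrites = rewrites.reduce((sum, t) => sum + costUnits(t, WRITE_5M), 0);

// What they would cost if the cache had still been warm: plain reads.
const asReads = rewrites.reduce((sum, t) => sum + costUnits(t, READ), 0);

// Extra cost per token for writing under the 1-hour TTL instead of the 5-minute one.
const writePremium = WRITE_1H - WRITE_5M;

console.log({ asRewrites, asReads, writePremium });
```

Under these assumptions the two turns drop from roughly 1.08M to roughly 86k base-token equivalents. The saving is partly offset because every cache write in the session would be billed at the higher 1-hour rate.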

## Ask

A way to opt a session into the extended cache TTL — an `Options` field, or honouring an env var passed through to the underlying client. Happy to provide fuller traces if useful.
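
To make the ask concrete, here is one possible shape, purely as a sketch. The `cacheTtl` field, the `EXAMPLE_PROMPT_CACHE_TTL` variable name, and `resolveCacheControl` are all hypothetical and do not exist in the SDK today; the only part we believe mirrors the real API is the resulting `cache_control` marker with a `ttl` value.

```typescript
type CacheTtl = "5m" | "1h";

// Hypothetical: not a real field on the SDK's `Options` today.
interface CacheTtlOptions {
  cacheTtl?: CacheTtl;
}

// Cache breakpoint marker with an explicit TTL, as we understand the Messages API shape.
interface CacheControl {
  type: "ephemeral";
  ttl: CacheTtl;
}

// Hypothetical env var name, used only for this illustration.
const CACHE_TTL_ENV_VAR = "EXAMPLE_PROMPT_CACHE_TTL";

// Narrow an arbitrary string to one of the supported TTL values.
function isCacheTtl(value: string | undefined): value is CacheTtl {
  return value === "5m" || value === "1h";
}

// Resolution order: explicit option, then env var, then the current 5-minute default.
// `env` is passed in explicitly so the sketch does not depend on Node's `process`.
function resolveCacheControl(
  options: CacheTtlOptions,
  env: Record<string, string | undefined> = {},
): CacheControl {
  const fromEnv = env[CACHE_TTL_ENV_VAR];
  const ttl: CacheTtl =
    options.cacheTtl ?? (isCacheTtl(fromEnv) ? fromEnv : "5m");
  return { type: "ephemeral", ttl };
}
```

With this shape, `resolveCacheControl({ cacheTtl: "1h" })` would yield `{ type: "ephemeral", ttl: "1h" }`, and sessions that set nothing keep today's behaviour. Either mechanism alone would cover our use case.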
