[BUG]: AzureKeyVaultV2 exits with 57005 (0xDEAD) and no diagnostic output #22195

@bryanchen-d

Task

AzureKeyVault@2 — version 2.273.1

Environment

  • Agent: hosted Windows 1ES pool (Microsoft-managed), Agent.Version 4.273.0
  • Node runtime: node24 (C:\vss-agent\4.273.0\externals\node24\bin\node.exe)
  • Authentication: Workload Identity Federation (OIDC) service connection
  • Org: monacotools ADO, pipeline: microsoft/vscode product build

Problem

The task exits with exit code 57005 (0xDEAD) and the only output emitted is the agent's generic launcher message:

Exit code 57005 returned from process: file name 'C:\vss-agent\4.273.0\externals\node24\bin\node.exe',
  arguments '"D:\a\_work\_tasks\AzureKeyVault_1e244d32-2dd4-4165-96fb-b7441ca9331e\2.273.1\run.js"'

There is zero diagnostic output from the task itself — no stdout, no stderr, no ##[error], no indication of which phase failed (OIDC token exchange, AAD auth, vault data-plane connection, secret enumeration, individual secret fetch). The Node process appears to be killed or to crash before the task can flush anything.

Example failing build (internal): https://dev.azure.com/monacotools/Monaco/_build/results?buildId=441275&view=logs&j=15ec5277-fdc6-550c-b203-acb65f44f31c&t=44617740-cedf-5dec-9e84-fa378148cb58

Step: "Azure Key Vault: Get Secrets", fetching github-distro-mixin-password.

The same exit-code signature was reported broadly across many ADO pipelines on 2026-05-20 (see Azure/azure-sdk-for-js#38609), so this is not isolated to one pipeline.

Why this matters

57005 / 0xDEAD alone tells the consumer nothing actionable. The root cause could be any of:

  • OIDC token endpoint timeout / network egress issue on the agent
  • AAD throttling or transient 5xx
  • Key Vault data-plane regional issue
  • DNS resolution failure (e.g. vault.azure.net, login.microsoftonline.com, vstoken.dev.azure.com)
  • An uncaught exception / unhandledRejection inside the task
  • The Node process being killed by the OS (OOM, signal, antivirus, etc.)
  • A regression in the 2.273.x task or node24 runner combo

Without any task-side log, every consumer has to guess and wait. Two prior 57005 issues (#20156, #20221) hit the same opaque exit code in AzurePowerShell@5 / AzureFileCopy@6 and required deep external investigation to attribute to a pwsh stack overflow — let's not repeat that for the Node tasks.

Ask

Improve diagnostic surface in AzureKeyVaultV2:

  1. Top-level safety net in the entry point — wrap main() in a try/catch that calls tl.error(err.stack ?? err.message) and tl.setResult(tl.TaskResult.Failed, ...) before the process exits, so anything that throws gets logged.
  2. uncaughtException / unhandledRejection handlers that log the error and call tl.setResult before exit. These are not redundant with (1) — they catch async failures that escape the awaited chain.
  3. Phase heartbeat logs at tl.debug or even info level so the last line printed identifies the phase that died:
    • resolving service connection auth scheme
    • requesting OIDC token from <endpoint>
    • exchanging OIDC token for AAD access token
    • connecting to vault <name>.vault.azure.net
    • listing secrets / fetching secret <name>
  4. Distinguish signal-kill from thrown error — on process.on('exit'), log the exit code and whether a signal was received, so consumers can tell "task threw" vs. "OS killed Node" vs. "exited cleanly with non-zero code".
  5. Optionally: surface the underlying @azure/identity / @azure/keyvault-secrets error details (status code, request ID, endpoint) instead of letting a rejection propagate raw.
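
To make items 1, 2 and 4 concrete, here is a minimal sketch, not the task's actual code. ProcessLike, TaskLog, installCrashHandlers and runGuarded are hypothetical names; in the real task the process argument would be Node's process object and the log methods would map to tl.error, tl.debug (or console.log) and tl.setResult(tl.TaskResult.Failed, ...). One caveat on item 4: an 'exit' listener only receives the exit code, so the sketch records signals through separate listeners. A hard kill (SIGKILL, or TerminateProcess on Windows) bypasses every handler, so this cannot cover that case.

```typescript
// Stand-ins so the sketch is self-contained (assumed shapes, not real APIs).
interface ProcessLike {
  on(event: string, listener: (...args: any[]) => void): unknown;
}
interface TaskLog {
  error(message: string): void; // stand-in for tl.error
  info(message: string): void;  // stand-in for console.log / tl.debug
  fail(message: string): void;  // stand-in for tl.setResult(tl.TaskResult.Failed, message)
}

// Prefer the stack trace, fall back to the message or a string form.
function formatError(err: unknown): string {
  if (err instanceof Error) return err.stack ?? err.message;
  return String(err);
}

// Items 2 and 4: log async failures that escape the awaited chain, remember
// any signal that was delivered, and summarise both when the process exits.
function installCrashHandlers(proc: ProcessLike, log: TaskLog): void {
  let lastSignal: string | undefined;
  let failure: string | undefined;

  // Registering this listener replaces Node's default "print and exit"
  // behaviour, so a real handler should also end the process itself.
  proc.on("uncaughtException", (err: unknown) => {
    failure = "uncaughtException";
    log.error(`uncaughtException: ${formatError(err)}`);
    log.fail("Unhandled exception in task");
  });
  proc.on("unhandledRejection", (reason: unknown) => {
    failure = "unhandledRejection";
    log.error(`unhandledRejection: ${formatError(reason)}`);
    log.fail("Unhandled promise rejection in task");
  });
  // Registering signal listeners also disables the default exit on signal.
  for (const sig of ["SIGINT", "SIGTERM"]) {
    proc.on(sig, () => {
      lastSignal = sig;
      log.error(`received ${sig}`);
    });
  }
  // 'exit' listeners must be synchronous and only receive the exit code.
  proc.on("exit", (code: number) => {
    log.info(`exit code=${code} signal=${lastSignal ?? "none"} failure=${failure ?? "none"}`);
  });
}

// Item 1: top-level safety net around the entry point.
async function runGuarded(main: () => Promise<void>, log: TaskLog): Promise<void> {
  try {
    await main();
  } catch (err) {
    log.error(formatError(err));
    log.fail(`Task failed: ${err instanceof Error ? err.message : String(err)}`);
  }
}

// Hypothetical wiring in the task entry point:
//   installCrashHandlers(process, log);
//   runGuarded(main, log);
```

Passing the process object in, rather than touching the global directly, is only a choice to keep the sketch testable with a fake emitter.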

Even a one-line "phase X failed: " would turn this from "ticket-the-platform-team" into "fix-the-network/secret/cred-immediately" for the vast majority of consumers.
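
A rough sketch of how items 3 and 5 could produce that one line, under stated assumptions. runPhase, describeError and HttpLikeError are hypothetical names. The fields probed (statusCode, code, request.url, response.headers.get) are modelled loosely on what the Azure SDK's RestError exposes, and x-ms-request-id is assumed to be the header carrying the service request ID; both should be checked against whatever client the task actually uses.

```typescript
// Assumed shape of a failed HTTP call; every field is optional so that
// plain Error objects and non-Error rejections still format cleanly.
interface HttpLikeError {
  statusCode?: number;
  code?: string;
  request?: { url?: string };
  response?: { headers?: { get(name: string): string | undefined } };
}

// Item 5: pull out status code, error code, request ID and endpoint if present.
function describeError(err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  const e = (typeof err === "object" && err !== null ? err : {}) as HttpLikeError;
  const details: string[] = [];
  if (e.statusCode !== undefined) details.push(`status=${e.statusCode}`);
  if (e.code !== undefined) details.push(`code=${e.code}`);
  const requestId = e.response?.headers?.get("x-ms-request-id");
  if (requestId) details.push(`requestId=${requestId}`);
  if (e.request?.url) details.push(`url=${e.request.url}`);
  return details.length > 0 ? `${message} (${details.join(", ")})` : message;
}

// Item 3: log when a phase starts and finishes, so the last line printed
// identifies the phase that died; on failure, log one line and rethrow.
async function runPhase<T>(
  name: string,
  log: (line: string) => void,
  work: () => Promise<T>,
): Promise<T> {
  log(`phase started: ${name}`);
  try {
    const result = await work();
    log(`phase completed: ${name}`);
    return result;
  } catch (err) {
    log(`phase ${name} failed: ${describeError(err)}`);
    throw err; // let the top-level handler set the task result
  }
}

// Hypothetical usage (fetchSecret is a placeholder for the real call):
//   await runPhase("fetching secret", console.log, () => fetchSecret(name));
```

Because the "phase started" line is written before the work begins, it still identifies the phase when the process dies without reaching the catch block, provided the log line is flushed in time.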

Repro

Not deterministic — observed transiently on hosted Windows 1ES agents under WIF auth. Cannot reproduce on demand; the diagnostic improvements above are precisely what's needed to attribute future occurrences.
