Skip to content

Feature: Validate structured_output against --json-schema as an agent eval gate (fail-closed for downstream safety) #1408

@sauravbhattacharya001

Description

@sauravbhattacharya001

Problem

When --json-schema is supplied via claude_args, the action exposes the model's structured result as the structured_output step output, and the documented pattern is to branch real CI behavior on it with fromJSON(). The shipped example examples/test-failure-analysis.yml does exactly this — it auto-retries a test suite when fromJSON(...).is_flaky == true && fromJSON(...).confidence >= 0.7.

The gap is that the action never validates structured_output against the schema it was given. Reading base-action/src/run-claude-sdk.ts, the only schema-related guard fails the run when structured output is entirely absent:

// --json-schema was provided but Claude did not return structured_output.

There is no check that the returned object actually conforms to the supplied schema — required keys present, types correct, enums/ranges respected. A run can emit structured_output that:

  • is missing a required field that downstream fromJSON() reads,
  • has the right keys but wrong types (e.g. confidence: "high" instead of a number), or
  • silently drops a field, so fromJSON(...).field evaluates to null/empty and a downstream if: condition resolves to a default-but-wrong branch.

In every one of these cases the step still reports conclusion: success, and the autonomous action gated on it (retry the suite, merge, deploy, label, comment) proceeds on a malformed contract. For a coercible numeric like confidence, an out-of-range or wrong-typed value flowing into a >= 0.7 gate is a safety-relevant failure, not a cosmetic one.

This is the structured-contract analogue of #1392 (runs report success when output is hallucinated/empty/malformed) and a natural extension of the validation surface from #1388. #1392 covers the prose/diff layer; this covers the machine-readable layer that pipelines actually branch on. ajv is already present transitively via the MCP SDK, so the validator dependency is effectively free.

Proposal

Validate structured_output against the provided --json-schema before it is exposed as a step output, and surface the result as a first-class, gateable signal. Default to fail-closed so an invalid contract cannot silently drive a downstream action.

A minimal, opt-out-able shape:

- uses: anthropics/claude-code-action@v1
  id: triage
  with:
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    claude_args: |
      --json-schema '{"type":"object","properties":{"is_flaky":{"type":"boolean"},"confidence":{"type":"number","minimum":0,"maximum":1}},"required":["is_flaky","confidence"]}'
    # NEW: what to do when structured_output does not conform to --json-schema
    on_schema_violation: fail        # fail (default) | warn | ignore

New step outputs:

outputs:
  structured_output_valid:        # "true" | "false"
  structured_output_violations:   # JSON array of ajv-style errors, "[]" when valid

Downstream gating can then require a valid contract before taking any action:

- name: Retry flaky tests
  # only act when the agent's contract actually conformed to the schema
  if: >
    steps.triage.outputs.structured_output_valid == 'true' &&
    fromJSON(steps.triage.outputs.structured_output).is_flaky == true &&
    fromJSON(steps.triage.outputs.structured_output).confidence >= 0.7
  run: ./scripts/retry-tests.sh

Behavior:

  • fail (default): schema violation → core.setFailed() with the violation list, mirroring the existing "did not return structured_output" failure path. The unsafe downstream action never runs.
  • warn: emit a warning + populate structured_output_violations, leave structured_output_valid=false, do not fail — lets teams adopt incrementally.
  • ignore: preserve today's pass-through behavior for back-compat.

Validation should run with the same ajv configuration semantics as the schema dialect Claude was prompted with, and structured_output_violations should be stable/machine-readable so it can feed the structured telemetry proposed in #1390 and the regression baselines in #1394.

Safety angle

Schema enforcement is the boundary between "the agent emitted JSON" and "the agent emitted the contract this pipeline is allowed to act on." Today that boundary is unguarded: an autonomous workflow can retry, merge, or deploy based on a field that is absent or wrong-typed, with a green check the whole way. Fail-closed validation converts a silent malformed-contract failure into an explicit, debuggable stop — which is exactly the property you want before an agent is permitted to take an irreversible action. It also makes structured agent outputs eval-gradeable: a checkpoint can assert "valid against schema" as a hard precondition rather than re-deriving conformance in every consuming workflow.

This builds on #1388 (output validation) and #1392 (false-success on malformed output), and feeds #1390 / #1394 (telemetry + regression detection) by making contract conformance a first-class, trended signal.

Offer to contribute

I'm building agent-eval — a tiered eval framework for AI agent outputs, where schema/contract conformance is the cheapest tier-0 gate before any semantic checks run. Happy to contribute the ajv-backed validation path, the on_schema_violation input, and the two new outputs (plus tests covering missing-required, wrong-type, and out-of-range cases) if there's interest in this direction.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestfeature-requestp3Minor bug or general feature request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions