Problem
When --json-schema is supplied via claude_args, the action exposes the model's structured result as the structured_output step output, and the documented pattern is to branch real CI behavior on it with fromJSON(). The shipped example examples/test-failure-analysis.yml does exactly this — it auto-retries a test suite when fromJSON(...).is_flaky == true && fromJSON(...).confidence >= 0.7.
The gap is that the action never validates structured_output against the schema it was given. Reading base-action/src/run-claude-sdk.ts, the only schema-related guard fails the run when structured output is entirely absent:
// --json-schema was provided but Claude did not return structured_output.
There is no check that the returned object actually conforms to the supplied schema — required keys present, types correct, enums/ranges respected. A run can emit structured_output that:
- is missing a
required field that downstream fromJSON() reads,
- has the right keys but wrong types (e.g.
confidence: "high" instead of a number), or
- silently drops a field, so
fromJSON(...).field evaluates to null/empty and a downstream if: condition resolves to a default-but-wrong branch.
In every one of these cases the step still reports conclusion: success, and the autonomous action gated on it (retry the suite, merge, deploy, label, comment) proceeds on a malformed contract. For a coercible numeric like confidence, an out-of-range or wrong-typed value flowing into a >= 0.7 gate is a safety-relevant failure, not a cosmetic one.
This is the structured-contract analogue of #1392 (runs report success when output is hallucinated/empty/malformed) and a natural extension of the validation surface from #1388. #1392 covers the prose/diff layer; this covers the machine-readable layer that pipelines actually branch on. ajv is already present transitively via the MCP SDK, so the validator dependency is effectively free.
Proposal
Validate structured_output against the provided --json-schema before it is exposed as a step output, and surface the result as a first-class, gateable signal. Default to fail-closed so an invalid contract cannot silently drive a downstream action.
A minimal, opt-out-able shape:
- uses: anthropics/claude-code-action@v1
id: triage
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
claude_args: |
--json-schema '{"type":"object","properties":{"is_flaky":{"type":"boolean"},"confidence":{"type":"number","minimum":0,"maximum":1}},"required":["is_flaky","confidence"]}'
# NEW: what to do when structured_output does not conform to --json-schema
on_schema_violation: fail # fail (default) | warn | ignore
New step outputs:
outputs:
structured_output_valid: # "true" | "false"
structured_output_violations: # JSON array of ajv-style errors, "[]" when valid
Downstream gating can then require a valid contract before taking any action:
- name: Retry flaky tests
# only act when the agent's contract actually conformed to the schema
if: >
steps.triage.outputs.structured_output_valid == 'true' &&
fromJSON(steps.triage.outputs.structured_output).is_flaky == true &&
fromJSON(steps.triage.outputs.structured_output).confidence >= 0.7
run: ./scripts/retry-tests.sh
Behavior:
fail (default): schema violation → core.setFailed() with the violation list, mirroring the existing "did not return structured_output" failure path. The unsafe downstream action never runs.
warn: emit a warning + populate structured_output_violations, leave structured_output_valid=false, do not fail — lets teams adopt incrementally.
ignore: preserve today's pass-through behavior for back-compat.
Validation should run with the same ajv configuration semantics as the schema dialect Claude was prompted with, and structured_output_violations should be stable/machine-readable so it can feed the structured telemetry proposed in #1390 and the regression baselines in #1394.
Safety angle
Schema enforcement is the boundary between "the agent emitted JSON" and "the agent emitted the contract this pipeline is allowed to act on." Today that boundary is unguarded: an autonomous workflow can retry, merge, or deploy based on a field that is absent or wrong-typed, with a green check the whole way. Fail-closed validation converts a silent malformed-contract failure into an explicit, debuggable stop — which is exactly the property you want before an agent is permitted to take an irreversible action. It also makes structured agent outputs eval-gradeable: a checkpoint can assert "valid against schema" as a hard precondition rather than re-deriving conformance in every consuming workflow.
This builds on #1388 (output validation) and #1392 (false-success on malformed output), and feeds #1390 / #1394 (telemetry + regression detection) by making contract conformance a first-class, trended signal.
Offer to contribute
I'm building agent-eval — a tiered eval framework for AI agent outputs, where schema/contract conformance is the cheapest tier-0 gate before any semantic checks run. Happy to contribute the ajv-backed validation path, the on_schema_violation input, and the two new outputs (plus tests covering missing-required, wrong-type, and out-of-range cases) if there's interest in this direction.
Problem
When
--json-schemais supplied viaclaude_args, the action exposes the model's structured result as thestructured_outputstep output, and the documented pattern is to branch real CI behavior on it withfromJSON(). The shipped exampleexamples/test-failure-analysis.ymldoes exactly this — it auto-retries a test suite whenfromJSON(...).is_flaky == true && fromJSON(...).confidence >= 0.7.The gap is that the action never validates
structured_outputagainst the schema it was given. Readingbase-action/src/run-claude-sdk.ts, the only schema-related guard fails the run when structured output is entirely absent:// --json-schema was provided but Claude did not return structured_output.There is no check that the returned object actually conforms to the supplied schema — required keys present, types correct, enums/ranges respected. A run can emit
structured_outputthat:requiredfield that downstreamfromJSON()reads,confidence: "high"instead of anumber), orfromJSON(...).fieldevaluates tonull/empty and a downstreamif:condition resolves to a default-but-wrong branch.In every one of these cases the step still reports
conclusion: success, and the autonomous action gated on it (retry the suite, merge, deploy, label, comment) proceeds on a malformed contract. For a coercible numeric likeconfidence, an out-of-range or wrong-typed value flowing into a>= 0.7gate is a safety-relevant failure, not a cosmetic one.This is the structured-contract analogue of #1392 (runs report success when output is hallucinated/empty/malformed) and a natural extension of the validation surface from #1388. #1392 covers the prose/diff layer; this covers the machine-readable layer that pipelines actually branch on.
ajvis already present transitively via the MCP SDK, so the validator dependency is effectively free.Proposal
Validate
structured_outputagainst the provided--json-schemabefore it is exposed as a step output, and surface the result as a first-class, gateable signal. Default to fail-closed so an invalid contract cannot silently drive a downstream action.A minimal, opt-out-able shape:
New step outputs:
Downstream gating can then require a valid contract before taking any action:
Behavior:
fail(default): schema violation →core.setFailed()with the violation list, mirroring the existing "did not return structured_output" failure path. The unsafe downstream action never runs.warn: emit a warning + populatestructured_output_violations, leavestructured_output_valid=false, do not fail — lets teams adopt incrementally.ignore: preserve today's pass-through behavior for back-compat.Validation should run with the same
ajvconfiguration semantics as the schema dialect Claude was prompted with, andstructured_output_violationsshould be stable/machine-readable so it can feed the structured telemetry proposed in #1390 and the regression baselines in #1394.Safety angle
Schema enforcement is the boundary between "the agent emitted JSON" and "the agent emitted the contract this pipeline is allowed to act on." Today that boundary is unguarded: an autonomous workflow can retry, merge, or deploy based on a field that is absent or wrong-typed, with a green check the whole way. Fail-closed validation converts a silent malformed-contract failure into an explicit, debuggable stop — which is exactly the property you want before an agent is permitted to take an irreversible action. It also makes structured agent outputs eval-gradeable: a checkpoint can assert "valid against schema" as a hard precondition rather than re-deriving conformance in every consuming workflow.
This builds on #1388 (output validation) and #1392 (false-success on malformed output), and feeds #1390 / #1394 (telemetry + regression detection) by making contract conformance a first-class, trended signal.
Offer to contribute
I'm building agent-eval — a tiered eval framework for AI agent outputs, where schema/contract conformance is the cheapest tier-0 gate before any semantic checks run. Happy to contribute the
ajv-backed validation path, theon_schema_violationinput, and the two new outputs (plus tests covering missing-required, wrong-type, and out-of-range cases) if there's interest in this direction.