Bug: Runs report success (exit 0) when output is hallucinated, empty, or malformed #1392

@sauravbhattacharya001

Bug Description

Runs complete with exit code 0 and no error signal even when the output is objectively broken: hallucinated file paths, empty diffs, truncated responses, or a format that doesn't match what was requested.

Reproduction

  1. Trigger a claude-code-action run on a moderately complex task
  2. Run completes successfully (green check)
  3. Inspect the output: it references files that don't exist, or entire sections are empty or placeholder text
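
To make step 3 less manual, here is a rough sketch of a check that flags file references in the output that don't exist in the checkout. Everything in it is my own assumption for illustration: the `find_missing_paths` helper and the `PATH_PATTERN` regex are hypothetical, not part of claude-code-action, and the regex is a crude heuristic that only looks at relative paths containing at least one `/`.

```python
import re
from pathlib import Path

# Hypothetical heuristic: relative paths with at least one directory
# component and a short file extension, e.g. "src/app/main.py".
PATH_PATTERN = re.compile(
    r"(?<![\w/.-])((?:[\w.-]+/)+[\w.-]+\.[A-Za-z0-9]{1,8})(?![\w/-])"
)


def find_missing_paths(output_text: str, repo_root: Path) -> list[str]:
    """Return path-like tokens in the output that do not exist under repo_root."""
    missing: list[str] = []
    seen: set[str] = set()
    for match in PATH_PATTERN.finditer(output_text):
        candidate = match.group(1)
        if candidate in seen:
            continue  # report each referenced path only once
        seen.add(candidate)
        if not (repo_root / candidate).exists():
            missing.append(candidate)
    return missing
```

A follow-up workflow step could run something like this against the run's output and exit non-zero if the returned list is not empty, which would at least turn the green check red for this one failure mode.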

Expected Behavior

A run that produces hallucinated content, empty output, or malformed structure should not be reported as successful. At minimum, the action should expose output signals (e.g. `steps.claude.outputs.validation_passed`) so downstream steps can gate on quality, not just on the exit code.

Actual Behavior

Exit code 0, green checkmark, no indication anything is wrong. Users discover the problem only on manual review.

Impact

  • In CI/CD pipelines, a silently broken run wastes an entire cycle
  • Automated triggers (cron, issue assignment) produce garbage with no alert
  • Teams lose trust in the action because 'success' doesn't mean 'correct'

Environment

  • Running 16+ automated agent jobs daily
  • Multiple repos, cron-triggered and event-triggered
  • Problem frequency: ~15-20% of runs produce output that should not have passed

Suggested Fix

A lightweight validation check before marking a run as successful. Could be opt-in:

```yaml
- uses: anthropics/claude-code-action@v1
  with:
    validate_output: true
    validation_checks: 'format,completeness,hallucination'
```

Related: I built this pattern in agent-eval, with deterministic checks first, heuristics second, and a model judge only when needed. Happy to contribute.
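
For discussion, a minimal sketch of what the tiered ordering could look like. This is not the agent-eval code and not an existing action API: `Verdict`, `validate`, the individual checks, and the `PLACEHOLDER_MARKERS` list are all hypothetical names I made up here. It also assumes "only when needed" means the model judge is consulted only if the cheaper tiers find nothing wrong.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical markers of placeholder text; a real list would be configurable.
PLACEHOLDER_MARKERS = ("TODO", "lorem ipsum", "<placeholder>")


@dataclass
class Verdict:
    passed: bool
    tier: str  # which tier produced the decision
    reasons: list[str] = field(default_factory=list)


def deterministic_checks(output: str) -> list[str]:
    """Cheap, exact checks: empty output and unbalanced code fences."""
    reasons = []
    if not output.strip():
        reasons.append("output is empty")
    if output.count("```") % 2 != 0:
        reasons.append("unclosed code fence (possible truncation)")
    return reasons


def heuristic_checks(output: str) -> list[str]:
    """Fuzzy checks: look for known placeholder markers."""
    lowered = output.lower()
    return [
        f"placeholder text: {marker!r}"
        for marker in PLACEHOLDER_MARKERS
        if marker.lower() in lowered
    ]


def validate(
    output: str, judge: Optional[Callable[[str], bool]] = None
) -> Verdict:
    """Run the tiers in order and stop at the first tier that finds a problem."""
    reasons = deterministic_checks(output)
    if reasons:
        return Verdict(False, "deterministic", reasons)
    reasons = heuristic_checks(output)
    if reasons:
        return Verdict(False, "heuristic", reasons)
    if judge is not None:
        # The expensive judge is only called when cheaper tiers found nothing.
        ok = judge(output)
        return Verdict(ok, "model-judge", [] if ok else ["model judge rejected output"])
    return Verdict(True, "heuristic", [])
```

With this shape, the `passed` flag could back an output like `validation_passed`, and the `reasons` list gives reviewers something concrete to look at when a run is rejected.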

Labels: enhancement, feature-request, p3
