feat(evaluation): add pi0 dispatch and VLA schema v1 reconciliation to policy_runner #930

@junkataoka

Description

Problem

evaluation/sil/policy_runner.py dispatches only ACT policies. With pi0 training landing (#916), pi0 checkpoints have no SIL eval path. In addition, P1 (#927) introduces VLA evaluation schema v1 emission in run_evaluation.py, and that schema needs to be reconciled with the existing toolchain metrics schema so that eval output is consistent across policies.

Proposed solution

  • Add pi0 / pi0_fast dispatch branch to policy_runner.py
  • Add a pi0 eval submit script + workflow YAML
  • Reconcile VLA schema v1 fields with the toolchain's existing eval output so both ACT and pi0 evals emit the same shape
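
One possible shape for the dispatch branch is a small loader registry keyed by policy type, so pi0 and pi0_fast share a loader and unknown types fail loudly. This is only a sketch: `register`, `load_act`, `load_pi0`, and `load_policy` are hypothetical names, and the loaders return placeholder dicts rather than real policy objects. The actual structure of policy_runner.py may differ.

```python
from typing import Callable, Dict

# Hypothetical loader signature: checkpoint path -> policy object.
PolicyLoader = Callable[[str], object]

_LOADERS: Dict[str, PolicyLoader] = {}


def register(*policy_types: str):
    """Register one loader under one or more policy type names."""
    def decorator(fn: PolicyLoader) -> PolicyLoader:
        for name in policy_types:
            _LOADERS[name] = fn
        return fn
    return decorator


@register("act")
def load_act(checkpoint: str) -> object:
    # Placeholder: the real runner would build the ACT policy here.
    return {"family": "act", "checkpoint": checkpoint}


@register("pi0", "pi0_fast")
def load_pi0(checkpoint: str) -> object:
    # Placeholder: the real runner would build the pi0 policy here.
    return {"family": "pi0", "checkpoint": checkpoint}


def load_policy(policy_type: str, checkpoint: str) -> object:
    """Dispatch to the loader for policy_type, rejecting unknown types."""
    try:
        loader = _LOADERS[policy_type]
    except KeyError:
        raise ValueError(
            f"unsupported policy type {policy_type!r}; "
            f"expected one of {sorted(_LOADERS)}"
        ) from None
    return loader(checkpoint)
```

With a registry, adding another policy family later is one decorated function instead of another if/elif branch.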

Acceptance criteria

  • policy_runner.py accepts pi0 checkpoints and runs SIL rollouts
  • Eval submit script + workflow YAML for pi0
  • ACT and pi0 emit identical schema v1 JSON
  • Metric schema documented in eval README
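
The "identical schema v1 JSON" criterion could be checked mechanically by comparing the structure of two outputs while ignoring their values. The sketch below assumes "identical" means the same keys and value types. The field names in the usage example are invented for illustration and are not the real schema v1 fields.

```python
from typing import Any


def shape_of(value: Any) -> Any:
    """Reduce a parsed JSON value to its structure: keys and type names only."""
    if isinstance(value, dict):
        return {key: shape_of(item) for key, item in value.items()}
    if isinstance(value, list):
        # Assumes homogeneous lists: only the first element is inspected.
        return [shape_of(value[0])] if value else []
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if value is None:
        return "null"
    return type(value).__name__


def same_shape(a: Any, b: Any) -> bool:
    """True when both values have the same keys and value types."""
    return shape_of(a) == shape_of(b)


# Hypothetical outputs; the real schema v1 fields are defined in #927.
act_result = {
    "schema_version": 1,
    "policy_type": "act",
    "success_rate": 0.8,
    "episodes": [{"index": 0, "success": True}],
}
pi0_result = {
    "schema_version": 1,
    "policy_type": "pi0",
    "success_rate": 0.5,
    "episodes": [{"index": 0, "success": False}],
}
```

Here `same_shape(act_result, pi0_result)` holds because only the values differ. Dropping or renaming a key in either output would make the check fail, which is the regression a parity test would guard against.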

Dependencies

Estimate

3–4 person-days
