feat(evaluation): add pi0 dispatch and VLA schema v1 reconciliation to policy_runner #930

## Problem

`evaluation/sil/policy_runner.py` only dispatches ACT policies, so with pi0 training landing (#916) there is no SIL eval path for pi0 checkpoints. In addition, P1 (#927) introduces VLA evaluation schema v1 emission in `run_evaluation.py`; that output needs to be reconciled with the existing toolchain metrics schema so eval results are consistent across policies.

## Proposed solution

- Add pi0 / pi0_fast dispatch branch to `policy_runner.py`
- Add a pi0 eval submit script + workflow YAML
- Reconcile VLA schema v1 fields with the toolchain's existing eval output so both ACT and pi0 evals emit the same shape
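
A minimal sketch of what the dispatch branch could look like, assuming a registry keyed by policy type. Every name here (`dispatch_policy`, `_load_act`, `_load_pi0`, the string return values) is hypothetical and stands in for whatever loaders `policy_runner.py` actually uses; routing `pi0_fast` through the same loader as `pi0` is also an assumption.

```python
from typing import Callable, Dict

# Hypothetical loader signature: checkpoint path in, loaded policy out.
# Strings stand in for real policy objects to keep the sketch runnable.
PolicyLoader = Callable[[str], str]


def _load_act(checkpoint: str) -> str:
    return f"act policy from {checkpoint}"


def _load_pi0(checkpoint: str) -> str:
    return f"pi0 policy from {checkpoint}"


# Assumption: pi0 and pi0_fast can share one loader branch.
_LOADERS: Dict[str, PolicyLoader] = {
    "act": _load_act,
    "pi0": _load_pi0,
    "pi0_fast": _load_pi0,
}


def dispatch_policy(policy_type: str, checkpoint: str) -> str:
    """Pick a loader by policy type and fail loudly on unknown types."""
    try:
        loader = _LOADERS[policy_type]
    except KeyError:
        known = ", ".join(sorted(_LOADERS))
        raise ValueError(
            f"unsupported policy type {policy_type!r}; expected one of: {known}"
        ) from None
    return loader(checkpoint)
```

A registry keeps the ACT path untouched and makes an unsupported policy type an explicit error rather than a silent fall-through to ACT.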

## Acceptance criteria

- [ ] `policy_runner.py` accepts pi0 checkpoints and runs SIL rollouts
- [ ] Eval submit script + workflow YAML for pi0
- [ ] ACT and pi0 evals emit schema v1 JSON with the same shape
- [ ] Metric schema documented in eval README
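
One way the "same shape" criterion might be checked in a test is to compare the key and type structure of the two outputs while ignoring the values. This is a sketch only: `shape_of` and `same_shape` are hypothetical helpers, and the actual schema v1 field names are defined by #927, not here.

```python
from typing import Any


def shape_of(value: Any) -> Any:
    """Reduce a parsed JSON value to its structure: keys and type names only."""
    if isinstance(value, dict):
        return {key: shape_of(item) for key, item in value.items()}
    if isinstance(value, list):
        # Assumption: lists are homogeneous, so the first element is representative.
        return [shape_of(value[0])] if value else []
    return type(value).__name__


def same_shape(left: Any, right: Any) -> bool:
    """True when both values have identical keys and leaf types at every level."""
    return shape_of(left) == shape_of(right)
```

Note that this comparison is strict about leaf types, so an integer `1` in one output and a float `1.0` in the other would count as a mismatch; whether that is desirable depends on how the schema defines numeric fields.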

## Dependencies

- Blocked on: #927 (schema v1 emitter contract must stabilize in review first)

## Estimate

3–4 person-days
