feat(training): port lerobot_act_train_end_to_end AzureML pipeline #928

@junkataoka

Problem

The toolchain trains LeRobot policies via a single AzureML CommandJob (training/il/scripts/submit-azureml-lerobot-training.sh) and stops there. There is no AzureML Pipeline that bundles preprocess → train → evaluate as a DAG, and no opt-in register step. Users have to chain three job submissions manually and check the artifacts between each one.

Proposed solution

Port lerobot_act_train_end_to_end as an AzureML Pipeline

  • 3-step DAG (preprocess → train → evaluate) as the default pipeline
  • 4-step DAG (+ register) as an opt-in second pipeline file
  • No enforcement that evaluation results gate registration
  • No external lineage hooks and no candidate-tag patching owned by the evaluate step
  • Compute names parameterized as pipeline inputs
  • Hydra removed from the train and evaluate components in favor of environment-variable configuration, consistent with the rest of the toolchain
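
To illustrate the last point, here is a minimal sketch of what env-var style configuration could look like in a train entry point once Hydra is gone. The variable names (`DATASET_DIR`, `OUTPUT_DIR`, `TRAIN_STEPS`, `BATCH_SIZE`), the defaults, and the `TrainConfig` class are all hypothetical and are not taken from the existing toolchain.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical training settings; field names are illustrative only."""
    dataset_dir: str
    output_dir: str
    steps: int
    batch_size: int


def load_train_config(env=None):
    """Build a TrainConfig from environment variables instead of a Hydra config.

    `env` defaults to os.environ; passing a plain dict makes the loader easy to test.
    """
    env = os.environ if env is None else env
    return TrainConfig(
        dataset_dir=env["DATASET_DIR"],                # required; raises KeyError if unset
        output_dir=env.get("OUTPUT_DIR", "outputs"),   # assumed default
        steps=int(env.get("TRAIN_STEPS", "10000")),    # assumed default
        batch_size=int(env.get("BATCH_SIZE", "8")),    # assumed default
    )
```

A component could then set these variables from its inputs before invoking the script, which would keep the train and evaluate components free of Hydra override syntax.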

Acceptance criteria

  • az ml job create --file training/il/workflows/azureml/lerobot-pipeline.yaml runs the 3-step DAG end-to-end
  • Opt-in 4-step pipeline registers the trained model on success
  • No engagement-specific args remain in component YAMLs
  • README documents both pipelines and the submit script
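
For reviewers, the two step graphs named in the criteria above can be written down as a small dependency model. This is only a sketch for reasoning about step order, not AzureML code. It assumes that the register step depends on train alone (the proposal says registration is not gated on evaluation, but does not state the actual edge), and the helper names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it starts.
DEFAULT_STEPS = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
}


def with_register(steps):
    """Return a copy of the graph with the opt-in register step added.

    Assumption: register depends only on train, not on evaluate, because
    the proposal lists no eval-gates-register enforcement.
    """
    extended = {name: set(deps) for name, deps in steps.items()}
    extended["register"] = {"train"}
    return extended


def run_order(steps):
    """Return one valid execution order for the step graph."""
    return list(TopologicalSorter(steps).static_order())
```

Under that assumption, evaluate and register both become runnable as soon as train finishes, so their relative order in the 4-step graph is not fixed.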

Related
