feat: add LeRobot imitation learning pipelines for OSMO and Azure ML by akzaidi · Pull Request #165 · Azure-Samples/azure-nvidia-robotics-reference-architecture

akzaidi · 2026-02-12T00:07:15Z

PR Soundtrack: Gorillaz - White Flag

Summary

Adds end-to-end LeRobot imitation learning support across OSMO and Azure ML, covering training (multiple data sources), checkpoint management, and edge inference.

What Changed

Training Pipelines

OSMO workflows for ACT policy training from three data sources: HuggingFace Hub, Azure Blob Storage, and OSMO-managed datasets (lerobot-train.yaml, lerobot-train-dataset.yaml)
Azure ML workflow (lerobot-train.yaml) with MLflow experiment tracking, system metrics, and model registration
Consolidated workflow using base64-encoded zip payload pattern (matching IsaacLab workflow convention) instead of fragile git clone
Submission scripts (submit-osmo-lerobot-training.sh, submit-azureml-lerobot-training.sh) with full CLI for dataset, policy, and compute configuration
Pipeline script (run-lerobot-pipeline.sh) for train → evaluate → register flow
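The base64-encoded zip payload pattern mentioned above can be sketched roughly as follows. This is an illustration only, not the code in the submission scripts: the function names `encode_payload` and `extract_payload` are hypothetical, and the empty-payload check is an assumption about how a guard on the encoded archive might behave.

```python
import base64
import io
import zipfile
from pathlib import Path


def encode_payload(source_dir: Path) -> str:
    """Zip a directory in memory and return it as a base64 string.

    Hypothetical helper: the resulting string could be substituted
    into a workflow template in place of a git clone step.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the source root so the archive
                # unpacks cleanly into any target directory.
                archive.write(path, path.relative_to(source_dir).as_posix())
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def extract_payload(encoded: str, target_dir: Path) -> None:
    """Decode a base64 zip payload and unpack it into target_dir."""
    if not encoded:
        # Assumed guard: fail fast instead of training with no source code.
        raise ValueError("payload is empty; refusing to continue")
    raw = base64.b64decode(encoded)
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        archive.extractall(target_dir)
```

The appeal of this approach is that the job container needs no repository access or credentials at runtime; the trade-off is that the workflow file grows with the size of the packaged source.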

Training Modules (`src/training/scripts/lerobot/`)

Module	Purpose
`train.py`	Core training loop with MLflow logging and configurable hyperparameters
`checkpoints.py`	Checkpoint upload and model registration via Azure ML SDK
`download_dataset.py`	Dataset acquisition from HuggingFace Hub or Azure Blob Storage
`bootstrap.py`	Environment setup, dependency installation, payload extraction

Inference

PolicyRunner — framework-agnostic wrapper for ACT policy inference with normalization stats
robot_types.py — observation and command data classes with UR10E joint mapping
act_inference_node.py — ROS2 node with dry-run safety gate and JointTrajectory publishing
OSMO inference workflow (lerobot-infer.yaml) and submission script
Offline test script (test-lerobot-inference.py) for validating policy output shape and normalization
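As a rough illustration of what inference with normalization stats involves, the sketch below normalizes an observation, calls a policy, and maps the output back to joint units. It assumes mean/standard-deviation normalization, which the PR description does not spell out, and every name here (`NormStats`, `normalize`, `unnormalize`, `run_policy_step`) is hypothetical rather than taken from `PolicyRunner`.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class NormStats:
    """Per-dimension statistics, assumed here to be mean and standard deviation."""
    mean: Sequence[float]
    std: Sequence[float]


def normalize(values: Sequence[float], stats: NormStats, eps: float = 1e-8) -> List[float]:
    """Map raw values into zero-mean, unit-variance space."""
    if len(values) != len(stats.mean):
        raise ValueError(f"expected {len(stats.mean)} values, got {len(values)}")
    return [(v - m) / (s + eps) for v, m, s in zip(values, stats.mean, stats.std)]


def unnormalize(values: Sequence[float], stats: NormStats, eps: float = 1e-8) -> List[float]:
    """Invert normalize(); also acts as a shape check on policy output."""
    if len(values) != len(stats.mean):
        raise ValueError(f"expected {len(stats.mean)} values, got {len(values)}")
    return [v * (s + eps) + m for v, m, s in zip(values, stats.mean, stats.std)]


def run_policy_step(
    observation: Sequence[float],
    policy: Callable[[List[float]], Sequence[float]],
    obs_stats: NormStats,
    action_stats: NormStats,
) -> List[float]:
    """Normalize the observation, call the policy, and return actions in joint units."""
    raw_action = policy(normalize(observation, obs_stats))
    return unnormalize(raw_action, action_stats)
```

An offline check in the spirit of the test script could feed recorded observations through `run_policy_step` and confirm that the returned list has one entry per joint before any hardware is involved.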

Checkpoint Management

Replaced mlflow.register_model with MLClient.models.create_or_update to avoid azureml_artifacts_builder tracking URI bug
Shared _get_aml_client and _register_model_via_aml helpers for consistent registration
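A minimal sketch of registering a checkpoint directory through the Azure ML SDK v2 is shown below. It is not the PR's `_get_aml_client` or `_register_model_via_aml` implementation: the function names and the environment variable names are assumptions made for illustration. The SDK imports are deferred so the settings helper can be exercised without the Azure packages installed.

```python
import os
from typing import Mapping, Optional


def workspace_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect workspace coordinates from the environment.

    The variable names below are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    keys = {
        "subscription_id": "AZURE_SUBSCRIPTION_ID",
        "resource_group_name": "AZURE_RESOURCE_GROUP",
        "workspace_name": "AZUREML_WORKSPACE_NAME",
    }
    missing = [var for var in keys.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {arg: env[var] for arg, var in keys.items()}


def get_aml_client(env: Optional[Mapping[str, str]] = None):
    """Build an MLClient with DefaultAzureCredential."""
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(DefaultAzureCredential(), **workspace_settings(env))


def register_checkpoint(client, checkpoint_dir: str, model_name: str):
    """Register a local checkpoint directory as a custom model asset."""
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Model

    model = Model(path=checkpoint_dir, name=model_name, type=AssetTypes.CUSTOM_MODEL)
    # create_or_update uploads the directory and creates a new model version.
    return client.models.create_or_update(model)
```

Because registration goes through `MLClient` directly, it does not depend on the MLflow tracking URI being resolvable inside the job.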

Documentation

docs/lerobot-inference.md — inference setup with AML and HuggingFace model pull instructions
Updated scripts/README.md, workflows/README.md, workflows/osmo/README.md, and workflows/azureml/README.md with LeRobot usage

Files Changed

 23 files changed, 4520 insertions(+), 79 deletions(-)

New files (19): Training modules, inference modules, workflows, submission scripts, docs
Modified files (4): READMEs, VS Code settings

Key Design Decisions

MLflow-only logging — removed WANDB support in favor of MLflow with system metrics for consistency with the existing Azure ML integration
Payload packaging — training source (src/training/) is base64-zip encoded into the workflow YAML, avoiding container image rebuilds for training logic changes
Azure ML SDK for registration — direct MLClient usage instead of MLflow's register_model to work around tracking URI limitations in OSMO environments
Dry-run gate on inference — ROS2 node requires explicit opt-in before publishing joint commands to physical hardware
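The dry-run gate can be pictured with a small, ROS-free sketch. The class and attribute names (`CommandGate`, `enable_publish`, `suppressed`) are hypothetical; in the actual node the `publish` callable would presumably wrap a JointTrajectory publisher, which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class CommandGate:
    """Forward joint commands only after an explicit opt-in."""

    publish: Callable[[List[float]], None]
    enable_publish: bool = False  # safe default: nothing reaches hardware
    suppressed: List[List[float]] = field(default_factory=list)

    def send(self, joint_positions: Sequence[float]) -> bool:
        """Return True if the command was published, False if it was held back."""
        command = list(joint_positions)
        if not self.enable_publish:
            # Dry run: record the command for inspection instead of publishing.
            self.suppressed.append(command)
            return False
        self.publish(command)
        return True
```

Defaulting to the dry-run state means a misconfigured launch produces logs rather than motion, which is the safer failure mode when a physical arm is attached.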

- add OSMO workflows for ACT training (HF Hub, Azure Blob, OSMO dataset sources) and inference
- add Azure ML workflow and submission script for LeRobot training with MLflow integration
- add end-to-end pipeline script for train → evaluate → register flow
- add offline inference test script with pre/post processor normalization
- update scripts and workflows README docs with LeRobot usage
🤖 - Generated by Copilot

- add robot observation and command data classes with UR10E joint mapping
- add framework-agnostic PolicyRunner wrapping ACT policy with normalization
- add ROS2 inference node with dry-run safety gate and JointTrajectory publishing
- add inference documentation with AML and HuggingFace model pull instructions
🤖 - Generated by Copilot

- add --from-blob, --storage-account, --blob-prefix to submit script
- add MLflow metric logging and checkpoint upload to azure-data workflow
- decouple blob dataset container from log storage container env var
- remove ad-hoc submit-lerobot-training.sh script
🚀 - Generated by Copilot

- remove azure.ai.ml and azure.identity dependencies from training workflows
- use mlflow.log_artifacts and mlflow.register_model for checkpoint registration
- pass active MLflow run to upload functions instead of standalone ML client
- update registration step to use MLflow tracking URI and experiment context
- remove unused threading and datetime imports
🔄 - Generated by Copilot

… packaging
- merge lerobot-train-azure-data.yaml into lerobot-train.yaml with conditional blob handling
- replace fragile git clone with base64-encoded zip payload pattern matching IsaacLab workflow
- remove WANDB support in favor of MLflow-only logging with system metrics
- delegate training logic to Python modules via packaged src/training payload
🔧 - Generated by Copilot

…er_model
- replace mlflow.register_model with MLClient.models.create_or_update to avoid azureml_artifacts_builder tracking_uri bug
- extract shared _get_aml_client and _register_model_via_aml helpers
- simplify upload_checkpoints_to_azure_ml to use shared helper
🐛 - Generated by Copilot

- accept main's markdownlint-cli2 config and chat location settings
- keep [json] editor settings from feature branch
- adopt main's table formatting style in workflows README

github-actions · 2026-02-12T00:14:45Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Copilot

Pull request overview

Adds end-to-end LeRobot imitation learning support across OSMO and Azure ML by introducing new workflow templates, submission/pipeline scripts, and Python modules for training, dataset acquisition, checkpoint registration, and ACT policy inference.

Changes:

Added OSMO workflows for LeRobot training (inline payload + dataset mount) and evaluation/optional AML model registration.
Added Azure ML command-job template + submission script for LeRobot training with environment registration.
Introduced new Python modules for LeRobot training orchestration, blob dataset download/prep, checkpoint upload/registration, and ACT inference (including a ROS2 node), plus updated docs/READMEs and VS Code settings.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 15 comments.


File	Description
workflows/osmo/lerobot-train.yaml	OSMO LeRobot training workflow using inline base64 payload and MLflow logging.
workflows/osmo/lerobot-train-dataset.yaml	OSMO LeRobot training workflow using an OSMO dataset mount (includes WANDB/MLflow toggles).
workflows/osmo/lerobot-infer.yaml	OSMO evaluation workflow that downloads a policy and optionally registers it to AML.
workflows/osmo/README.md	Documents new OSMO LeRobot workflows and submission commands.
workflows/azureml/lerobot-train.yaml	AzureML command-job template for LeRobot training submission.
workflows/azureml/README.md	Documents AzureML LeRobot training template and usage.
workflows/README.md	Updates workflow directory overview + adds LeRobot examples/sections.
src/training/scripts/lerobot/train.py	MLflow-wrapping training orchestrator that parses logs and uploads/registers checkpoints.
src/training/scripts/lerobot/download_dataset.py	Azure Blob dataset download + dataset fixes (stats/timestamps).
src/training/scripts/lerobot/checkpoints.py	Checkpoint artifact upload and AML model registration helpers.
src/training/scripts/lerobot/bootstrap.py	AML MLflow bootstrap + HuggingFace authentication helpers.
src/training/scripts/lerobot/__init__.py	Package init for LeRobot training scripts.
src/inference/scripts/act_inference_node.py	ROS2 node for running ACT inference and optionally publishing joint commands.
src/inference/robot_types.py	Observation/command dataclasses for ACT inference integration.
src/inference/policy_runner.py	Framework-agnostic ACT policy runner that normalizes inputs and produces joint commands.
scripts/test-lerobot-inference.py	Offline ACT inference validation script against dataset observations.
scripts/submit-osmo-lerobot-training.sh	Submits OSMO LeRobot training workflow and packages training payload inline.
scripts/submit-osmo-lerobot-inference.sh	Submits OSMO LeRobot evaluation workflow with optional AML registration.
scripts/submit-azureml-lerobot-training.sh	Registers AzureML environment and submits LeRobot training job via `az ml job create`.
scripts/run-lerobot-pipeline.sh	Orchestrates train → wait/poll → evaluate → optional registration in OSMO.
scripts/README.md	Adds LeRobot scripts and pipeline usage documentation.
docs/lerobot-inference.md	New documentation for offline inference + ROS2 deployment.
.vscode/settings.json	Adds excludes for datasets/cache folders and tweaks JSON editor settings.

workflows/osmo/lerobot-train-dataset.yaml

scripts/submit-azureml-lerobot-training.sh

workflows/azureml/lerobot-train.yaml

scripts/run-lerobot-pipeline.sh

src/training/scripts/lerobot/download_dataset.py

src/inference/policy_runner.py

src/training/scripts/lerobot/train.py

- fix f-string quoting bug and add subscription param in azureml submission script
- replace empty string defaults with 'none' sentinels in azureml job template
- add ENCODED_ARCHIVE guard and jq dependency for osmo workflows
- fix DefaultAzureCredential to use workload identity in download_dataset.py
- remove WANDB references from docs, add front matter to lerobot-inference.md
🔧 - Generated by Copilot

- switch table separators to compact |---| style matching main
- restore OSMO inference parameters table rows
- remove lerobot-train-dataset.yaml from directory tree
📝 - Generated by Copilot

- remove extra space after ## in LeRobot Inference heading (MD019)
- rename duplicate Inference Parameters heading to OSMO Inference Parameters (MD024)
📝 - Generated by Copilot

…kflow templates table

akzaidi added 7 commits February 6, 2026 12:47

fix: resolve merge conflicts with main

67e24b5


akzaidi marked this pull request as ready for review February 12, 2026 01:12

akzaidi requested review from MarkForesman, agreaves-ms and Copilot February 12, 2026 01:12

Copilot started reviewing on behalf of akzaidi February 12, 2026 01:13

Copilot AI reviewed Feb 12, 2026


nguyena2 approved these changes Feb 12, 2026


akzaidi added 4 commits February 12, 2026 14:17

style: align table separators with main branch formatting

76d06cd


fix: resolve markdown lint errors in workflows README

520cf2c


docs(workflows): remove lerobot-train-dataset reference from OSMO wor…

0305ebb

…kflow templates table

akzaidi merged commit baef32d into main Feb 12, 2026
8 checks passed

akzaidi deleted the feat/lerobot-il branch February 12, 2026 23:05

azure-robotics-release-bot bot mentioned this pull request Feb 12, 2026

chore(main): release 0.3.0 #173

Closed

WilliamBerryiii mentioned this pull request Feb 13, 2026

fix(workflows): remove release-please skip guard that prevents tag creation on release merges #174

Open

2 tasks

azure-robotics-release-bot bot mentioned this pull request Feb 18, 2026

chore(main): release 0.3.0 #313

Merged

WilliamBerryiii added this to the v0.3.0 milestone Mar 1, 2026

akzaidi merged 11 commits into main from feat/lerobot-il

4 participants
