feat(robot): OpenPI policy harness, H.264 trace video, rollout batching against one agent #425

Open
lukass16 wants to merge 19 commits into v6 from v6-robot-3

lukass16 commented Jun 17, 2026

Contributor

Issue

The v6 robot harness needed to drive real OpenPI policy servers, run concurrent rollouts efficiently, and stream camera data to traces without bloating each step with JPEG frames. Slow sim boots (e.g. Isaac Sim) also exceeded the default env connect timeout.

Solution

  • Add RemoteModel — WebSocket/msgpack client for OpenPI policy servers (lazy connect, supports actions / action response keys).
  • Add BatchedAgent / BatchedModel — coalesce concurrent ainfer() calls into stacked forwards for parallel rollouts.
  • Adopt OpenPI slash-delimited observation keys end-to-end; add OpenPIAdapter so a stock OpenPI server drives the harness with no agent changes.
  • Stream per-camera H.264/CMAF video via VideoStreamer (hud/agents/robot/video.py); numeric state stays on ObservationStep, frames go as VideoSegmentStep spans.
  • Raise RobotClient connect ready_timeout default to 240s for slow container boots.
  • Also includes Modal/Daytona eval runtime providers merged from lukass/modal-daytona-runtimes.

Outcome / Verification

  • Robot rollout against OpenPI policy server via RemoteModel + OpenPIAdapter
  • Concurrent rollouts via BatchedAgent(batch_size=N)
  • Trace shows video_segment spans with playable H.264 segments
  • Env connect succeeds on slow Isaac Sim boots

Note

Medium Risk
Robot rollout, inference batching, and trace shape change observability (video segments vs per-step images); connect timeout and init download behavior affect all env provisioning paths.

Overview
Robot harness gains an OpenPI path: RemoteModel talks to a policy server over WebSocket, OpenPIAdapter maps observations to OpenPI wire keys, and Model is now stateless with a fixed [N, T, A] batch contract (LeRobot inlined; Ensembler / lerobot_infer removed). BatchedModel / BatchedAgent coalesce concurrent ainfer calls into one forward for in-process models only (RemoteModel stays one agent per rollout).

Tracing stops embedding per-tick JPEGs on ObservationStep; RobotAgent runs VideoStreamer (PyAV/x264 CMAF) and emits VideoSegmentStep spans with optional trace_id on Step.emit. RobotClient.get_control_rate() drives encoder FPS. The robot extra now requires av>=12.

Platform polish: default connect(..., ready_timeout) rises 120s → 240s; hud init can download GitHub starter presets (--preset / TTY picker) with safe tarball extract; RL cookbook uses file-level MODEL / TASKSET instead of HUD_MODEL / HUD_TASKSET; new v6 Environments and Tasks docs plus .gitignore exception so docs/v6/build/ stays tracked; version 0.6.1.

Reviewed by Cursor Bugbot for commit 4c85e4a.

Comment thread hud/agents/robot/batching.py Outdated
Comment thread hud/agents/robot/batching.py
Comment thread hud/agents/robot/batching.py
Comment thread hud/agents/robot/model.py Outdated
Comment thread hud/agents/robot/batching.py
Comment thread hud/capabilities/robot.py Outdated
Comment thread hud/eval/runtime.py Outdated
except Exception:  # not found: build it under this name
    await daytona.snapshot.create(
        CreateSnapshotParams(name=self.snapshot_name, image=self._image)
    )

Daytona snapshot probe swallows errors

Medium Severity

DaytonaRuntime._ensure_snapshot treats any snapshot.get failure like a missing snapshot and always calls snapshot.create. Transient API or auth errors can trigger a redundant create attempt and mark the snapshot resolved, hiding the real failure until sandbox startup.
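
A minimal sketch of the narrower probe this comment asks for, assuming the client raises a distinguishable "not found" error. `SnapshotNotFound`, `FakeSnapshots`, and `ensure_snapshot` are hypothetical names for illustration and are not the Daytona SDK's API; the real exception type would need to be checked against the SDK.

```python
import asyncio


class SnapshotNotFound(Exception):
    """Hypothetical stand-in for the SDK's 'no such snapshot' error."""


class FakeSnapshots:
    """Hypothetical in-memory snapshot service, used only for this sketch."""

    def __init__(self, existing=(), fail_with=None):
        self.names = set(existing)
        self.fail_with = fail_with  # simulates a transient or auth failure on get()
        self.created = []

    async def get(self, name):
        if self.fail_with is not None:
            raise self.fail_with
        if name not in self.names:
            raise SnapshotNotFound(name)
        return name

    async def create(self, name):
        self.names.add(name)
        self.created.append(name)


async def ensure_snapshot(snapshots, name):
    """Create the snapshot only when the probe reports it as missing."""
    try:
        await snapshots.get(name)
        return "found"
    except SnapshotNotFound:
        # Only a genuine "not found" falls through to create.
        await snapshots.create(name)
        return "created"
    # Any other exception (auth, network) propagates to the caller unchanged.
```

With this shape, an auth or network failure surfaces at the probe instead of being masked by a create attempt.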

Reviewed by Cursor Bugbot for commit 446a05b.

cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

There are 4 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 82c1ef8.

Comment thread hud/agents/robot/video.py
if self._init_sent and btype == b"mdat":
    self._dispatch(self._pending)
    self._pending = b""
return len(b)  # return the number of bytes written

MP4 sink buffer grows unbounded

High Severity

SegmentEncoder.write advances _scan after extracting MP4 boxes but never discards consumed bytes from _buf, while _pos keeps growing with every mux write. Each camera encoder retains a full copy of all muxed output for the episode, so long rollouts or many cameras can inflate memory without bound.
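
One way to keep such a sink bounded is to drop bytes as soon as they have been split into complete boxes. The sketch below is not the PR's `SegmentEncoder`; `BoxScanner` is a hypothetical class that assumes the standard MP4 box header (4-byte big-endian size followed by a 4-byte type) and ignores the 64-bit and to-end-of-file size forms.

```python
import struct


class BoxScanner:
    """Splits a byte stream into complete MP4 boxes, keeping only the unparsed tail."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes) -> list[tuple[bytes, bytes]]:
        """Append data and return every (type, raw_box) that is now complete."""
        self._buf += data
        boxes = []
        pos = 0
        while len(self._buf) - pos >= 8:
            size, btype = struct.unpack_from(">I4s", self._buf, pos)
            if size < 8:
                # size 0 (to end of file) and 1 (64-bit size) are out of scope here.
                raise ValueError(f"unsupported box size {size}")
            if len(self._buf) - pos < size:
                break  # box is still incomplete; wait for more bytes
            boxes.append((btype, bytes(self._buf[pos:pos + size])))
            pos += size
        # Discard everything already handed out so memory tracks one partial box.
        del self._buf[:pos]
        return boxes

    @property
    def buffered(self) -> int:
        return len(self._buf)
```

After each `feed`, `buffered` is at most the size of one incomplete box, regardless of episode length.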

Reviewed by Cursor Bugbot for commit 82c1ef8.

Comment thread hud/agents/robot/agent.py
# Start camera video at env's control rate; capture trace id for encoder span attribution.
self._video = video.VideoStreamer(
    fps=client.get_control_rate(), trace_id=get_current_trace_id()
)

LeRobot policy not reset per episode

Medium Severity

Episode startup no longer calls policy.reset() on LeRobot checkpoints. The prior harness reset the policy (and optional ensembler) in on_episode_start; that hook was removed while reusing the same LeRobotModel across sequential rollouts, so internal episode state can carry into the next episode.

Reviewed by Cursor Bugbot for commit 82c1ef8.

Comment thread hud/agents/robot/model.py
"""Ship one request dict → the server's ``[T, A]`` chunk, returned as ``[1, T, A]``."""
self.connect() # lazy connect on first call (blocks until the server is up)
chunk = np.asarray(self._client.infer(batch)[self.response_key], dtype=np.float32)
return chunk[None] # add the leading N=1 batch dim

Shared RemoteModel lacks infer lock

Medium Severity

RemoteModel.infer uses one lazy WebSocket client with no serialization. Concurrent rollouts that share a single RemoteModel (common when fanning out parallel OpenPI evals) can interleave infer calls on the same connection and corrupt requests or responses.
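
If a single connection really is shared, one possible mitigation is to serialize requests with a lock. The sketch assumes callers reach `infer` from multiple threads; `SerializedClient` is a hypothetical wrapper, not part of the harness or of OpenPI. Giving each rollout its own client is the alternative that avoids the lock entirely.

```python
import threading


class SerializedClient:
    """Wraps a request/response client so only one infer() is in flight at a time."""

    def __init__(self, client):
        self._client = client  # any object exposing infer(request) -> response
        self._lock = threading.Lock()

    def infer(self, request):
        # Hold the lock across the full round trip so requests and
        # responses on the shared connection cannot interleave.
        with self._lock:
            return self._client.infer(request)
```

The trade-off is throughput: rollouts sharing the wrapper wait on each other for every request.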

Reviewed by Cursor Bugbot for commit 82c1ef8.

jdchawla29 and others added 19 commits June 19, 2026 14:58
Feat: hud-python sdk v6
Adds a -p/--preset flag (and an interactive picker on a TTY) so hud init can fetch the same starter environments as the platform's environments/new flow. Presets live in hud/cli/presets.py (blank, browser, deepresearch, cua, autonomous-businesses, verilog) and are materialized by downloading the repo's main tarball from codeload (no git, path-traversal-safe). With no preset in a non-interactive shell it still writes the minimal local scaffold.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Apply tar members' execute bits after write so starter entrypoints/scripts stay runnable. Pass preset=None in the direct-call init tests (typer Option defaults to OptionInfo when the command function is called directly).

Co-authored-by: Cursor <cursoragent@cursor.com>
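
The path-traversal check and execute-bit handling described in the two commits above could look roughly like the following. This is a sketch under assumptions, not the code in `hud/cli/presets.py`: `safe_extract` is a hypothetical helper, and it skips symlinks, hard links, and device nodes rather than handling them.

```python
import tarfile
from pathlib import Path


def safe_extract(tar: tarfile.TarFile, dest: Path) -> list[Path]:
    """Extract regular files and directories, refusing paths that escape dest."""
    dest = dest.resolve()
    written = []
    for member in tar.getmembers():
        target = (dest / member.name).resolve()
        # Reject absolute names and ".." components that land outside dest.
        if target != dest and dest not in target.parents:
            raise ValueError(f"unsafe path in archive: {member.name}")
        if member.isdir():
            target.mkdir(parents=True, exist_ok=True)
        elif member.isfile():
            target.parent.mkdir(parents=True, exist_ok=True)
            src = tar.extractfile(member)
            target.write_bytes(src.read())
            if member.mode & 0o111:
                # Re-apply execute bits after the write so scripts stay runnable.
                target.chmod(target.stat().st_mode | (member.mode & 0o111))
            written.append(target)
        # Symlinks, hard links, and device nodes are skipped in this sketch.
    return written
```

Writing file contents manually, rather than calling `extractall`, keeps the set of member types that can touch the filesystem explicit.
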
feat(cli): hud init --preset to scaffold from GitHub starters
Docker for slow envs like Isaac Sim publishes the port before @env.initialize finishes, so hello retries
can exceed 120s on slow container boots.
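
The retry behaviour this commit message refers to can be pictured as a probe loop with a deadline. The sketch below is illustrative only: `wait_until_ready` and `probe` are hypothetical names, and the real `RobotClient` connect logic may be structured differently. Only the 240s default comes from this PR.

```python
import time


def wait_until_ready(probe, ready_timeout=240.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or ready_timeout seconds elapse."""
    deadline = clock() + ready_timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            raise TimeoutError(f"env not ready after {ready_timeout}s")
        # Never sleep past the deadline.
        sleep(min(interval, max(0.0, deadline - clock())))
```

An env whose port opens early but whose initialization takes around 150s would fail under a 120s budget and succeed under 240s.
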
Add a weightless Model that queries a remote policy server over the OpenPI
msgpack/WebSocket protocol: the adapter builds the request dict, the server
owns all pre/post-processing + the forward, and infer() ships it and returns
the [T, A] chunk. connect() is lazy and idempotent (blocks until the server
is up); response_key covers "actions" (stock OpenPI) vs "action" (Cosmos).
…erence

BatchedModel wraps any Model and coalesces concurrent ainfer() calls into a
single stacked forward: a lazily-started worker drains up to batch_size queued
calls (or flushes after max_wait_s for the suite tail), runs one inner.infer,
and scatters the [N, T, A] rows back to each caller.

BatchedAgent wraps a RobotAgent and shallow-clones it per run so each rollout
keeps isolated episode state while sharing the one batched model. Usage stays a
one-liner: BatchedAgent(agent, batch_size=8) with max_concurrent set to match.
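
The coalescing pattern described above (a lazily started worker that drains up to `batch_size` calls or flushes after `max_wait_s`) can be sketched with plain asyncio. `CoalescingBatcher` is a hypothetical stand-in, not the PR's `BatchedModel`: it batches Python lists instead of stacking `[N, T, A]` arrays, and it runs the batched call synchronously on the event loop.

```python
import asyncio


class CoalescingBatcher:
    """Collects concurrent ainfer() calls and serves them with one batched call."""

    def __init__(self, infer_batch, batch_size, max_wait_s=0.01):
        self._infer_batch = infer_batch  # sync fn: list of requests -> list of results
        self._batch_size = batch_size
        self._max_wait_s = max_wait_s
        self._queue = None
        self._worker = None

    async def ainfer(self, request):
        if self._worker is None:
            # Lazily start the worker inside the running event loop.
            self._queue = asyncio.Queue()
            self._worker = asyncio.create_task(self._run())
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((request, fut))
        return await fut

    async def _run(self):
        loop = asyncio.get_running_loop()
        while True:
            items = [await self._queue.get()]
            deadline = loop.time() + self._max_wait_s
            # Drain until the batch is full or the wait budget is spent.
            while len(items) < self._batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    items.append(await asyncio.wait_for(self._queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            try:
                results = self._infer_batch([req for req, _ in items])
            except Exception as exc:
                for _, fut in items:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            # Scatter one result row back to each waiting caller.
            for (_, fut), row in zip(items, results):
                if not fut.done():
                    fut.set_result(row)
```

The `max_wait_s` flush matters at the tail of a suite, when fewer than `batch_size` rollouts remain and a full batch would never arrive.
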
Migrate the robot harness to OpenPI-standard, slash-delimited observation
keys end-to-end, and add a thin OpenPIAdapter so a generic OpenPI policy
server drives the harness with no agent code changes.
Replace per-tick JPEG observation images with per-camera H.264/CMAF video
streaming for robot traces:

- Add hud/agents/robot/video.py (SegmentEncoder/VideoStreamer): encode each
  camera on a background thread, emitting CMAF fragments as VideoSegmentStep
  spans without blocking the act loop.
- RobotAgent starts/finalizes the streamer at the env control rate; finalize
  in `finally` so a crashed run still leaves video.
- ObservationStep.from_obs records only numeric state now; camera frames travel
  as video.
- Step.emit accepts an explicit trace_id so the encoder thread (no contextvars
  trace context) attributes spans correctly.
- Add RobotClient.get_control_rate(); add "video_segment" RobotStepSource;
  add PyAV (av>=12) to the robot extra.
Remove the per-episode model.reset() hook (Model/LeRobotModel/RemoteModel/
BatchedModel + agent.on_episode_start); per-episode state lives only on the
agent, so a shared BatchedModel can no longer clear one rollout's policy
state mid-episode. Document that RemoteModel is not batchable (OpenPI server
has no batched-request shape) on RemoteModel, BatchedModel, and BatchedAgent.
…ship

Spell out on Model.infer/ainfer that implementations must keep the leading
batch dim N (ainfer indexes [0], BatchedModel scatters rows along it) and add
a one-line assert in LeRobotModel.infer. Document that BatchedAgent mutates the
passed-in agent in place, leaving it permanently batched.

Co-authored-by: Cursor <cursoragent@cursor.com>
Clamp get_control_rate to max(1, round(...)) so sub-0.5 Hz contracts no longer
emit 0 FPS on VideoSegmentStep. Init _hooks_done before add_capability in
Environment.__init__. Load optional robot deps via importlib for pyright, add
shim-test ignores, and ruff-format flagged files.

Co-authored-by: Cursor <cursoragent@cursor.com>
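
The clamp in the commit above is a one-line guard; a sketch of the arithmetic, with `control_rate_fps` as a hypothetical name for whatever helper `get_control_rate` uses internally:

```python
def control_rate_fps(rate_hz: float) -> int:
    """Round a control rate to an integer FPS, never returning less than 1."""
    # round() alone maps any rate below 0.5 Hz to 0, which is an invalid encoder FPS.
    return max(1, round(rate_hz))
```

For example, a 0.4 Hz contract rounds to 0 without the clamp and to 1 with it.
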
Wrap long lines, move NDArray to TYPE_CHECKING, noqa intentional 0.0.0.0
bind in LocalRuntime, and reformat legacy shim test imports.

Co-authored-by: Cursor <cursoragent@cursor.com>