Feat: hud-python sdk v6 #421

Merged
jdchawla29 merged 200 commits into main from v6 on Jun 19, 2026

Parth220 commented Jun 15, 2026

Contributor

Note

High Risk
This is a major SDK and protocol shift (v5 agents cannot drive v6-served environments) plus CI test setup changes that drop browser/Playwright provisioning, which can hide regressions in computer-use paths if those tests still exist.

Overview
This PR ships HUD Python SDK v6 as the primary surface: environments expose a thin control channel with capabilities (ssh, mcp, cdp, rfb, robot) and tasks (@env.template() generators), while agent harnesses own the tools. User-facing narrative moves from v5 scenarios/MCP tools to protocol-first manifest → tasks.start → tasks.grade, with Task.run(agent) returning a Job/Run instead of hud.eval() / env("scenario", ...).
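As a rough illustration of the manifest → tasks.start → tasks.grade flow described above, here is a toy in-process sketch. Everything in it (`ToyEnvironment`, `run_once`, the method names, and the task contents) is hypothetical and is not the hud SDK API; it only shows the shape of an environment that advertises capabilities and tasks while the agent side owns the tools.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for a v6-style environment; not the hud SDK API."""

    capabilities: list = field(default_factory=lambda: ["ssh", "mcp"])
    answers: dict = field(default_factory=lambda: {"add": "4"})

    def manifest(self) -> dict:
        # Step 1: advertise capabilities and available tasks; no tools are exposed.
        return {"capabilities": self.capabilities, "tasks": sorted(self.answers)}

    def start(self, task_id: str) -> str:
        # Step 2: start a task and hand the agent its prompt.
        if task_id not in self.answers:
            raise KeyError(f"unknown task: {task_id}")
        return "What is 2 + 2?"

    def grade(self, task_id: str, submission: str) -> float:
        # Step 3: grade the agent's final answer as a reward of 0.0 or 1.0.
        return 1.0 if submission.strip() == self.answers[task_id] else 0.0


def run_once(env: ToyEnvironment, agent) -> float:
    """Drive one rollout: read the manifest, start the first task, grade the answer."""
    manifest = env.manifest()
    task_id = manifest["tasks"][0]
    prompt = env.start(task_id)
    return env.grade(task_id, agent(prompt))


# The "agent" here is just a callable from prompt to answer.
reward = run_once(ToyEnvironment(), lambda prompt: "4")
```

In this sketch the environment never sees how the agent produced its answer, which mirrors the stated split between a thin control channel and harness-owned tools.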

Documentation is restructured on Mintlify: default v6 nav (docs/v6/), v5 tagged Legacy under docs/v5/, redirects from old paths, new Migrate to v6 guide, agent skill doc, and refreshed site styling (docs.json, custom.css). Several long-form cookbooks are removed from the old tree and replaced or relocated (e.g. v6 coding-agent, ops-diagnostics, a2a-chat, robot-benchmark).

Runnable examples land under cookbooks/ (A2A chat server moved out of the SDK as reference code; codex-style agent; v6 chat_env using EvaluationResult and templates). README and CONTRIBUTING are rewritten for v6 workflows (hud init, hud deploy, hud eval without --rootdir=hud).

CI/dev ergonomics: GitHub Actions drops Xvfb/Playwright install from the test matrix; .githooks/pre-push is removed. .gitignore expands for local/experimental dirs. Adds AGENTS.md (and CLAUDE.md pointer) for contributor/agent guidance.

Reviewed by Cursor Bugbot for commit c673f40. Bugbot is set up for automated code reviews on this repo.

jdchawla29 and others added 30 commits April 27, 2026 16:07
Decouple agent native tools from environment primitives
# Conflicts:
#	docs/reference/agents.mdx
#	hud/environment/environment.py
#	hud/environment/tests/test_environment.py
#	hud/tools/computer/base.py
#	hud/tools/computer/gemini.py
#	hud/tools/executors/xdo.py
#	hud/tools/tests/test_computer.py
Parth220 and others added 14 commits June 19, 2026 12:43
[codex] add modal runtime provider wiring
hud/train/: TrainingClient (forward_backward, optim_step, step, custom forward/backward) over the HUD training service, keyed by model id. New 'hud models' CLI group (list, fork, checkpoints, head --set). settings: hud_rl_url; drop the old eval/training.py BYO helper. Docs: v6 training how-to rewritten for the managed trainer + new reference/training page; rl-training cookbook.

Co-authored-by: Cursor <cursoragent@cursor.com>
Training POSTs (forward_backward/optim_step/backward) are non-idempotent, so make_request now uses max_retries=0 there (a silent retry would double-apply the optimizer/gradient or collide on the checkpoint name). Adds the 2048 RL cookbook example.

Co-authored-by: Cursor <cursoragent@cursor.com>
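The reasoning in the commit message above (a silent retry of a non-idempotent training POST double-applies the step) can be shown with a small simulation. This is a sketch under stated assumptions: `FlakyOptimizer` and `call_with_retries` are hypothetical stand-ins, not the SDK's `make_request` or the training service.

```python
from typing import Callable


class FlakyOptimizer:
    """Hypothetical server: applies the step, then loses the first response."""

    def __init__(self) -> None:
        self.steps_applied = 0
        self._drop_next_response = True

    def optim_step(self) -> int:
        self.steps_applied += 1  # the side effect happens before the response is sent
        if self._drop_next_response:
            self._drop_next_response = False
            raise TimeoutError("response lost after the step was applied")
        return self.steps_applied


def call_with_retries(fn: Callable[[], int], max_retries: int) -> int:
    """Call fn, retrying up to max_retries times on TimeoutError."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TimeoutError:
            if attempt == max_retries:
                raise
    raise AssertionError("unreachable")


# With one retry, the client recovers silently but the step is applied twice.
retried = FlakyOptimizer()
call_with_retries(retried.optim_step, max_retries=1)

# With max_retries=0, the failure surfaces and the step is applied exactly once.
strict = FlakyOptimizer()
try:
    call_with_retries(strict.optim_step, max_retries=0)
except TimeoutError:
    pass  # the caller sees the error and can decide how to recover
```

After this runs, `retried.steps_applied` is 2 while `strict.steps_applied` is 1, which is the double-application the commit avoids by disabling retries on those endpoints.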
lorenss-m and others added 2 commits June 19, 2026 14:44
Drop the divergent 'Task Run:' / 'Batch Run:' prefixes; default job names now use the bare subject (task id for a single task, '{taskset} (N tasks)' for a batch), matching the lone-rollout and chat paths and aligning with the platform's '{subject} on {model}' convention.

Co-authored-by: Cursor <cursoragent@cursor.com>
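The naming rule in the commit message above can be sketched as a small helper. The function names and signatures here are hypothetical and only illustrate the described convention (bare task id for a single task, '{taskset} (N tasks)' for a batch, and '{subject} on {model}' on the platform side); the real implementation may differ.

```python
from typing import List


def default_job_name(task_ids: List[str], taskset: str) -> str:
    """Return the bare subject, with no 'Task Run:' or 'Batch Run:' prefix."""
    if not task_ids:
        raise ValueError("at least one task is required")
    if len(task_ids) == 1:
        return task_ids[0]  # single task: the task id itself
    return f"{taskset} ({len(task_ids)} tasks)"  # batch: taskset plus count


def platform_label(subject: str, model: str) -> str:
    """Compose the '{subject} on {model}' form the commit says it aligns with."""
    return f"{subject} on {model}"


single = default_job_name(["fix-login"], taskset="web-suite")
batch = default_job_name(["a", "b", "c"], taskset="web-suite")
label = platform_label(batch, "example-model")
```

The task ids, taskset name, and model name in the usage lines are made up for illustration.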
jdchawla29 marked this pull request as ready for review June 19, 2026 21:55
jdchawla29 merged commit 7a8955c into main Jun 19, 2026
10 of 11 checks passed

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77cd964ee9

Comment thread hud/cli/eval.py
Comment on lines +673 to +674
if resolved.is_dir() or resolved.suffix == ".py":
    return resolved

P1: Serve the env source instead of tasks.py

When hud eval is run on the scaffolded tasks.py (which imports task factories from env.py and only exposes Task objects), this branch passes tasks.py to LocalRuntime. The child then runs load_environment(tasks.py, --env <task.env>), but that file has no Environment, so the default hud init workflow fails before any rollout can start. Use the task's captured _source/the containing env module (or the directory) for local placement instead of the task list file.
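One way to read the suggestion above is as a source-resolution rule: prefer the module that actually defines the Environment, and only fall back to the containing directory. The sketch below is an assumption about what such a rule could look like; `resolve_local_source` and its parameters are hypothetical, and the real fix in hud/cli/eval.py may be structured differently.

```python
from pathlib import Path
from typing import Optional


def resolve_local_source(requested: Path, task_source: Optional[Path]) -> Path:
    """Pick the path to hand to a local runtime (hypothetical helper).

    requested:   the path given on the command line, e.g. a tasks.py file
    task_source: the module the task was captured from, if known
    """
    if task_source is not None:
        # The task knows which module defines its Environment; serve that.
        return task_source
    if requested.is_dir():
        return requested
    # A bare task-list file has no Environment, so serve its directory instead.
    return requested.parent


from_task = resolve_local_source(Path("proj/tasks.py"), Path("proj/env.py"))
fallback = resolve_local_source(Path("proj/tasks.py"), None)
```

With these example paths, the first call resolves to `proj/env.py` and the second to `proj`, so the task list file itself is never served.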

Comment thread hud/agents/tool_agent.py
Comment on lines +124 to +125
if cap.protocol in wanted and cap.protocol not in connections:
    connections[cap.protocol] = await run.client.open(cap.protocol)

P2: Open capabilities by name, not protocol

If an env publishes more than one binding for the same protocol (for example two rfb/3.8 screens or multiple MCP tool servers), run.client.open(cap.protocol) calls HudClient.binding() with an ambiguous protocol ref and raises before the agent loop starts; even without the raise, the dict keyed by protocol would drop the extra binding. Iterate by capability name and keep distinct connections so same-protocol capabilities remain usable.
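The dictionary-keying half of this concern can be reproduced without the SDK. In the sketch below, `Capability` is a hypothetical record with only a name and a protocol, the capability names are invented, and the connection values are placeholder strings rather than real `run.client.open(...)` results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical capability binding: a unique name plus its protocol."""

    name: str
    protocol: str


caps = [
    Capability("screen-left", "rfb/3.8"),
    Capability("screen-right", "rfb/3.8"),
    Capability("shell", "ssh"),
]
wanted = {"rfb/3.8", "ssh"}

# Keyed by protocol (the pattern under review): the second rfb/3.8 screen is skipped.
by_protocol = {}
for cap in caps:
    if cap.protocol in wanted and cap.protocol not in by_protocol:
        by_protocol[cap.protocol] = f"conn:{cap.name}"

# Keyed by capability name: every same-protocol binding keeps its own connection.
by_name = {cap.name: f"conn:{cap.name}" for cap in caps if cap.protocol in wanted}
```

Here `by_protocol` ends up with two entries and loses `screen-right`, while `by_name` keeps all three bindings.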

mintlify Bot commented Jun 19, 2026

Docs PR opened: #436

Removed the broken v6 Build nav group and repointed six broken links to existing v5 and v6 reference pages.

mintlify Bot commented Jun 19, 2026

Docs PR opened: #437

Rewrote short, generic SEO descriptions on 31 v6, platform, and migrate pages to unique 130–155 character summaries.


5 participants