[WIP: do not review] - cherry pick in Daphne's kill of visualize.c #350

Closed
eugenevinitsky wants to merge 71 commits into main from ev/in-process-render

Conversation

@eugenevinitsky

Summary

  • Replaces old subprocess-based rendering (export .bin → spawn xvfb-run ./visualize) with new headless Raylib + ffmpeg renderer
  • Rendering now done via python -m pufferlib.render_video subprocess wrapped with xvfb-run
  • Ported C renderer from 2.0 into drive.h with headless mode support
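
A minimal sketch of how the training process might assemble such a subprocess call. The `build_render_cmd` and `render` helpers and the `--checkpoint` / `--output-path` flags are hypothetical, since the PR description does not list the command-line interface of `pufferlib.render_video`. The `-a` option asks `xvfb-run` to pick a free display number.

```python
import shutil
import subprocess
import sys


def build_render_cmd(checkpoint, output_path, use_xvfb=True):
    """Assemble the argv for a headless render subprocess.

    The --checkpoint / --output-path flags are hypothetical placeholders.
    """
    cmd = [sys.executable, "-m", "pufferlib.render_video",
           "--checkpoint", checkpoint, "--output-path", output_path]
    if use_xvfb:
        # -a lets xvfb-run choose a free X display number automatically.
        cmd = ["xvfb-run", "-a"] + cmd
    return cmd


def render(checkpoint, output_path, timeout=600):
    """Run the renderer; return True on success, False otherwise."""
    if shutil.which("xvfb-run") is None:
        return False  # no virtual framebuffer wrapper available
    try:
        result = subprocess.run(build_render_cmd(checkpoint, output_path),
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

Running the renderer in a separate process keeps a rendering crash or hang from taking down the training loop; the timeout bounds how long a stuck render can run.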

Status

Work in progress — testing on cluster.

🤖 Generated with Claude Code

Daphne and others added 30 commits December 30, 2025 10:27
Authored by vinitsky.eugene@gmail.com
When num_maps exceeds available maps in the directory:
- If allow_map_resampling=True (default): prints prominent warning and proceeds with available maps
- If allow_map_resampling=False: raises ValueError with helpful message

This allows training on small map datasets without manually adjusting num_maps.
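
The behaviour described in this commit message could look roughly like the sketch below. The helper name `resolve_num_maps` and its signature are assumptions for illustration, and it uses `warnings.warn` where the commit says a warning is printed; only `num_maps` and `allow_map_resampling` come from the commit message.

```python
import warnings


def resolve_num_maps(num_maps, available_maps, allow_map_resampling=True):
    """Return how many maps to use, given how many exist in the directory."""
    if num_maps <= available_maps:
        return num_maps
    if not allow_map_resampling:
        raise ValueError(
            f"num_maps={num_maps} exceeds the {available_maps} maps available; "
            "lower num_maps or set allow_map_resampling=True."
        )
    # Warn and fall back to the maps that actually exist.
    warnings.warn(
        f"num_maps={num_maps} exceeds the {available_maps} maps available; "
        f"proceeding with {available_maps} maps."
    )
    return available_maps
```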

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…vention (#244)

Remove hard-coded reliance on map_xxx.bin format
Delete legacy convenience scripts that are no longer referenced anywhere

🤖 Generated with [Claude Code](https://claude.com/claude-code)
- fix multi-GPU logging
- add submitit scripts
- add singularity builder script
* Make SDC agent 0 and make sure it is always initialized and controlled

* Oops little fix

* Small fix: Make sure to increment num_actors.

* Make SDC agent 0 and make sure it is always initialized and controlled

* Oops little fix

* Small fix: Make sure to increment num_actors.

---------

Co-authored-by: Daphne Cornelisse <cor.daphne@gmail.com>
* Fix incorrect argument name.

* Small revert.

* Incorporate PR feedback.
* quick commit so you can read the code

* Improve naming of sampling argument to better describe its function.

* Ensure that initialization works with Carla maps or other, when sdc_index=-1.

* Simplify CARLA compatibility

* Replace num_maps by wosac_num_maps in all the eval scripts

* Comment about random baseline.

* Bug fix: do not reweight by the total weight (0.95).

* Add this back for now.

* Delete agent shrinking code.

* Ignore ttc metric when agents are not vehicles

* Update WOSAC weights to align with 2024 challenge since we don't have the traffic light metric.

* Update table

* Revert MAX_AGENTS to original.

* Update formatting.

---------

Co-authored-by: Wael Boumediene Doulazmi <wbd2016@cs713.hpc.nyu.edu>
Co-authored-by: Waël Doulazmi <waeldoulazmi@gmail.com>
* Add an arg in build_ocean so Raylib works on Debian (#266)

* Z-axis Feature

* Cleaning up unnecessary files

* Removing unused variable and updating drive.c

* Fixing corner case

* Resetting files to default config and removing whitespace differences

* Pragnay/remove hardcoded offsets (#264)

* Replaced hardcoded offsets with programmatically generated ones

* code cleanup

* Fix potential linker errors

* Removing difference array based filtering for better z update accuracy

* Demo fixes with z axis (#265)

* merged demo fixes from 2.0 latest, reduced control deltas for classic dynamics to have easier control

* precommit fixes

* Demo only supports discrete actions for now

* Removing z update bug

* Minor

---------

Co-authored-by: Waël Doulazmi <73849155+WaelDLZ@users.noreply.github.com>
Co-authored-by: Pragnay Mandavilli <108453901+mpragnay@users.noreply.github.com>
…rge. (#270)

* Reordering of function to follow the new drive.h file structure

* pre-declaring some functions to fix compiling

* mem leak fix

---------

Co-authored-by: mpragnay <pm3881@nyu.edu>
* tmp async render support

* Fixed to serializable render_videos for python multiprocessing to work. Added render_queue for async rendering and logging videos

* Monotonic wandb step fix by using a custom render_step

* binary and video files naming convention fixes for render intervals less than 50

* minor changes

* limit renders by num_workers

* code cleanup

* pre-commit fixes

* Code cleanup
* Visualizer and control mode, init mode config description updates in docs

* Fix Wrong names in control modes

* add configs to drive.c, init_only_controlled and demo bug fixes, fix tables in docs

* minor fix

* Remove control_tracks_to_predict in another place
* Added render_eval mode, minor bug fixes for building in clang

* revert to default configs

* minor changes

* Added Documentation for mode

* String Concatenation fixes, exception handling for timeout, doc suggestions by greptile

* Created a Separate Render Mode, updated docs

* code cleanup
* Step 0: Adding the scenario_id in the Drive Struct and loading it in the Wosac bindings

- Requires to recompile the maps though
- Didn't think about CARLA compatibility yet

Next steps:

- Do the same work with track_ids (should be easy)
- Adapt the SMART evaluation code (I hope easy as well)

* Step 2: Give the real WOMD ids in the evaluator bindings. Also give is_track_to_predict as a bool.

Next step: update the import trajectories evaluations script.

* Evaluate imported trajectories is way better now

* Warning in evaluate_imported_trajectories

* Update the map_000.bin file

* Checked that CARLA is still fine, removed comment
added new sub-types for entities Agent, RoadMapElement, TrafficControlElement
* Improve random map resampling code.

* Refactor env: Create separate resampling function.

* Works with wosac_use_map_as_resampling_target = False.

* More elegant and clean solution to map resampling.

* Clean up.

* Put batch iteration in the WOSACEvaluator class to avoid repetition (keep puffer code clean).

* Improve naming.

* Clean up code and drop duplicates.

* Fix util functions so that we can run wosac during training.

* Log more metrics to wandb.

* Remove unused variable map_idex.

* Fix human replay eval.

* Drop last scenario from batch as a safety measure.
Implementation of new Agent and RoadMapElement datatypes.

Large PR that introduces the new datatypes in place of the old Entity. 
We swapped old data field names to match the new ones, and disambiguate them between RoadElements and Agents. 

NOTE: The code changes should NOT introduce any behaviour change!
eugenevinitsky and others added 27 commits March 4, 2026 15:11
* Onnx scripts with verification of model structure

* Code cleanup

* Added documentation for model export scripts

* Added explanation of script
* Updated score for each goal segment

* Added Heuristic for last goal segment

* Update pufferlib/ocean/env_binding.h

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Made min_avg_speed configurable

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Changed counts to max supported by sim, max_controlled, num_created and removed everything else to prevent redundancies

* Spawning logic with collision and offroad free spawns, uses 2.0 goal sampling logic for initial goals, configurable params for spawn settings

* Fix mem leaks

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fixed Agent Counts to align with gigaflow feature set (#288)

* Changed counts to max supported by sim, max_controlled, num_created and removed everything else to prevent redundancies

* Fixed memory leaks

* Bug fixing demo and renderer, fixed mem leaks in bindings code, changed defaults to align with requirements

* Code cleanup, error handling

* Minor fixes

* minor bugs

* Pre-Compute lanes for spawning

* Previously Working Settings

* Fixed reset to use the same goal

* Separate out goal resets for bug avoidance with other modes

* Fixed agent collisions for variable dimensions

* Fixed agent collisions for variable dimensions

* working jerk configs

* pre-commit fixes

* Relative Speed Observation fix

* Visualizer Ego POV road lanes added

* Minor config changes, for future expts

* Lane Length Biased Spawning

* Update bin in drive.c

* Reverted to donut goals

* Config changes

* Change configs for run

* Fixed max_agents

* Increased timeout, fixed default bin in viz

* Config changes

* Reset experimental configs

* Added back constants

* Added faulty dimensions check

* Minor Fixes

* Addressing concerns with goal_respawn, complete revert of configs, other code cleanup

* Addressing comments

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- resolve conflicts on merge
- adopt resample_maps() refactoring in drive.py
- made the scenario_id string length attached to a constant.
…a_and_3.0

- merge 3.0 into 3.0_beta
- resolve conflicts on merge
- adopt resample_maps() refactoring in drive.py
- made the scenario_id string length attached to a constant.
1. Removed reward clamping
2. Added an option to switch off reward normalization in case of reward conditioning.

* Rebased over latest 3.0_beta

* Reinstated reward clamping

* Added clamping back again

* Minor
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Conflict resolutions:
- binding.c: Take 3.0_beta (reward randomization, variable agent number, Agent/Road split)
- drive.c: Take 3.0 (keep #include env_config.h and string.h for demo/visualize)
- drive.h: Take 3.0_beta for all 9 conflicts (SCENARIO_ID_STR_LENGTH constant,
  num_created_agents rename, Agent type refactor, sim_* coordinate fields,
  road_elements split, heap allocation for agent indices), but remove the
  temporary 0.7f sim_width/sim_length scaling hack
- drive.py: Take 3.0_beta (reward randomization params, map_path, spawn dims,
  SCENARIO_ID_STR_LENGTH)
- env_binding.h: Take 3.0_beta (SCENARIO_ID_STR_LENGTH constant)
- env_config.h: Take 3.0_beta (add init_variable_agent_number mode)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add code isolation to training launch script

Creates a symlink copy of the source tree per run, with .so files copied
(not linked). This prevents rebuilding the C extension for another branch
from breaking already-running jobs. The isolated copy goes into the run's
data-dir, costing only ~1.7MB extra per run for the .so.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add code isolation to submit_cluster.py, remove standalone script

Creates a symlink copy of the source tree per run with .so files copied
(not linked), so rebuilding for another branch won't break running jobs.
Also adds env setup and cache redirects for container mode.

Removes the standalone train_singularity.sh in favor of using
submit_cluster.py for all launches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix empty --exclude causing sbatch error

NYU cluster rejects --exclude="". Use None instead so submitit
skips the flag entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
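
As an illustration of why `None` is safer than an empty string here, the sketch below renders scheduler options as flags and drops unset ones. The `sbatch_flags` helper is hypothetical and is not how submitit builds its script; it only shows the "skip the flag entirely" idea.

```python
def sbatch_flags(options):
    """Render scheduler options as --key=value flags, skipping unset ones."""
    flags = []
    for key, value in options.items():
        if value is None or value == "":
            # Omit the flag rather than emit something like --exclude=""
            continue
        flags.append(f"--{key}={value}")
    return flags


print(sbatch_flags({"partition": "gpu", "exclude": None}))
```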

* Use full copy instead of symlinks for code isolation

Simpler and avoids glob/symlink issues. Ignores large files
(binaries, weights, videos, wandb) to keep copy fast and small.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix code isolation: use symlink tree + copy only .so files

copytree was too slow on the 30GB source tree. Switch back to
cp -rs (instant symlink tree) and copy only .so files in pufferlib/.
Also update train_base.yaml to use carla_3D maps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix container inner_cmd quoting so && chains run inside singularity

The inner_cmd with && chains was breaking out of bash -c when
submitit serialized it to an sbatch script. Wrap in single quotes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Speed up code isolation: shallow symlink top-level, deep copy only pufferlib

The full cp -rs on 60K files was too slow on networked filesystems.
Now symlinks top-level dirs individually (instant), then only deep-copies
pufferlib/ (10K files) where the .so lives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
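
The isolation scheme in this commit (symlink the top-level entries, deep-copy only `pufferlib/`) can be sketched as follows. The function name `isolate_code` and its parameters are assumptions; the actual logic lives in `submit_cluster.py` and may differ.

```python
import os
import shutil


def isolate_code(src_root, dst_root, deep_copy_dir="pufferlib"):
    """Symlink top-level entries of src_root into dst_root, deep-copying one dir."""
    os.makedirs(dst_root, exist_ok=True)
    for name in os.listdir(src_root):
        src = os.path.join(src_root, name)
        dst = os.path.join(dst_root, name)
        if name == deep_copy_dir and os.path.isdir(src):
            # Real copy, so rebuilding .so files in src cannot affect this run.
            shutil.copytree(src, dst)
        else:
            # Cheap: one symlink per top-level entry, regardless of its size.
            os.symlink(os.path.abspath(src), dst)
    return dst_root
```

Only the directory holding the compiled extension is copied, so the cost scales with the size of `pufferlib/` rather than the whole source tree.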

* Remove unnecessary quoting around bash -c inner_cmd

subprocess.run with a list passes each element as a separate arg,
so bash -c already receives inner_cmd as one argument. Adding
literal quotes caused bash to interpret it as a filename.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
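
The quoting behaviour described above can be reproduced with a short standalone example. The `run_chain` helper is illustrative only and assumes `bash` is on the PATH.

```python
import subprocess


def run_chain(inner_cmd):
    """Run a shell command chain via bash -c and return its stdout."""
    # Each list element becomes exactly one argv entry, so inner_cmd needs no
    # extra quoting even though it contains spaces and && operators.
    result = subprocess.run(
        ["bash", "-c", inner_cmd], capture_output=True, text=True, check=True
    )
    return result.stdout


out = run_chain("echo first && echo second")

# Wrapping the chain in literal quotes makes bash look for a single command
# whose name is the whole string, which fails with a non-zero exit code.
quoted = subprocess.run(
    ["bash", "-c", "'echo first && echo second'"], capture_output=True, text=True
)
```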

---------

Co-authored-by: Eugene Vinitsky <eugene@percepta.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ig (#342)

Remove model weights, rendered videos, dSYM debug symbols, and
CLAUDE.local.md from the repo. Add gitignore rules to prevent
recommitting them.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace subprocess-based rendering (export .bin → spawn xvfb-run ./visualize)
with direct in-process headless Raylib + ffmpeg pipeline. The renderer creates
a temporary Drive env with render_mode=RENDER_HEADLESS, runs a full episode
rollout with the current policy, and produces an mp4 via ffmpeg pipe.

Key changes:
- Port headless renderer (Xvfb fork + ffmpeg pipe) from 2.0 to 3.0
- Port window-mode HUD and all 3 view modes from 2.0
- Add render_mode passthrough from Python to C env struct
- Replace subprocess render infrastructure in pufferl.py with _render_in_process()
- Remove render_async, ensure_drive_binary, check_render_queue from training loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
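
For readers unfamiliar with the ffmpeg-pipe technique mentioned above, here is a rough Python illustration of the idea. It is not the C code ported into `drive.h`; the helper names, the RGBA pixel format, and the choice of `libx264` are assumptions, and it presumes an ffmpeg build with that encoder.

```python
import subprocess


def ffmpeg_pipe_cmd(width, height, fps, output_path):
    """Build an ffmpeg argv that encodes raw RGBA frames read from stdin."""
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo",        # input is headerless raw frames
        "-pix_fmt", "rgba",      # 4 bytes per pixel
        "-s", f"{width}x{height}",
        "-r", str(fps),
        "-i", "-",               # read the input from stdin
        "-an",                   # no audio track
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",   # needs even width and height with libx264
        output_path,
    ]


def write_video(frames, width, height, fps, output_path):
    """Stream an iterable of RGBA frame bytes into ffmpeg; True on success."""
    proc = subprocess.Popen(ffmpeg_pipe_cmd(width, height, fps, output_path),
                            stdin=subprocess.PIPE)
    for frame in frames:
        if len(frame) != width * height * 4:
            raise ValueError("frame size does not match width * height * 4")
        proc.stdin.write(frame)
    proc.stdin.close()
    return proc.wait() == 0
```

Streaming frames through a pipe avoids writing intermediate image files to disk, which matters on networked cluster filesystems.
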
LSTMWrapper.forward_eval() requires a state dict with lstm_h/lstm_c.
Initialize zero LSTM state when the policy has an LSTM.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Drive env observation array may not be contiguous, causing
torch.as_tensor to fail with a view/stride error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The C-level Xvfb fork in make_client fails inside Singularity containers
(can't create /tmp lock files). Start Xvfb from Python before creating
the render env, so DISPLAY is already set and the C code skips its
own Xvfb launch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use obs.copy() to guarantee a fully contiguous array before
converting to tensor, fixing .view() failures in encode_observations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Keep Xvfb running across renders to avoid conflicts on repeated calls
- Use tensor.clone() to guarantee contiguity for .view() in policy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
.view() fails on non-contiguous tensor slices from C-allocated buffers.
.reshape() handles both contiguous and non-contiguous cases correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The in-process renderer had stride/contiguity issues with C-allocated
observation buffers. Switch to a subprocess like the other eval types:
- Add pufferlib/render_video.py: standalone script that creates a
  headless Drive env, loads checkpoint, runs rollout, produces mp4
- Simplify _render_in_process to just spawn the subprocess
- Remove Xvfb lifecycle management from training process
- Use .reshape() instead of .view() in torch.py for robustness

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 18, 2026 23:04

Copilot AI left a comment

Pull request overview

This PR is a broad refactor of PufferDrive’s rendering + cluster tooling pipeline, introducing a new headless rendering path (python -m pufferlib.render_video) and adding SLURM/Submitit-based training utilities/configs, alongside significant Drive/WOSAC evaluation + environment/binding updates.

Changes:

  • Add cluster job submission + container setup tooling (Submitit launcher, configs, docs).
  • Add ONNX/bin export + verification scripts/docs.
  • Major Drive simulator/binding/evaluation updates (reward conditioning, map selection, scenario-id plumbing, WOSAC evaluator changes) and new headless video rendering module.

Reviewed changes

Copilot reviewed 54 out of 65 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| setup.py | Add tqdm dependency. |
| pyproject.toml | Add cluster extra (submitit/pyyaml). |
| scripts/verify_onnx.py | New ONNX runtime verification script. |
| scripts/export_onnx.py | New ONNX exporter for Drive policy checkpoints. |
| scripts/export_model_bin.py | New exporter to C-consumable flattened .bin weights. |
| scripts/submit_cluster.py | New Submitit-based SLURM launcher with sweep support. |
| scripts/setup_container.sh | New Singularity overlay setup script for cluster runs. |
| scripts/cluster_status.sh | New SLURM status helper script. |
| scripts/cluster_configs/train_base.yaml | Baseline training config for cluster submission. |
| scripts/cluster_configs/nyu_greene.yaml | Example compute config for NYU Greene. |
| scripts/train_sanity.sh | Removed old local training script. |
| scripts/train_procgen.sh | Removed old local training script. |
| scripts/train_ocean.sh | Removed old local training script. |
| scripts/train_atari.sh | Removed old local training script. |
| scripts/sweep_atari.sh | Removed old sweep helper script. |
| scripts/build_simple.sh | Removed old simple C build helper script. |
| scripts/build_ocean.sh | Add -ldl to link flags. |
| pufferlib/utils.py | Update WOSAC subprocess args + refactor C video rendering helper. |
| pufferlib/render_video.py | New headless Raylib+ffmpeg renderer invoked as a subprocess. |
| pufferlib/pufferl.py | Update training loop: truncation handling, optional reward clamp, new render mode, WOSAC eval scheduling, rank-0 logger creation. |
| pufferlib/ocean/torch.py | Drive policy obs encoding updates; ego feature sizing from env. |
| pufferlib/ocean/env_config.h | Expand Drive INI config parsing (reward conditioning, spawn settings, etc.). |
| pufferlib/ocean/env_binding.h | Extend bindings: truncations wiring, rendering args, scenario-id as strings, WOSAC-related outputs. |
| pufferlib/ocean/drive/visualize.c | Update visualizer to new Agent struct + config parsing + drivenet signature. |
| pufferlib/ocean/drive/error.h | Expand error types and messages. |
| pufferlib/ocean/drive/drivenet.h | Add reward conditioning dimension; adjust road feature layout. |
| pufferlib/ocean/drive/drive.py | Major env API updates (map_files, reward conditioning, resample_maps, render args, scenario_id strings). |
| pufferlib/ocean/drive/drive.c | Update demo/rendering CLI parsing and env init from INI. |
| pufferlib/ocean/drive/datatypes.h | New datatypes/constants for conditioning, road types, agent types, and free helpers. |
| pufferlib/ocean/drive/binding.c | Major shared/env init wiring changes (map_files, conditioning, reward bounds, spawn settings). |
| pufferlib/ocean/benchmark/wosac.ini | Update to 2024 challenge weights. |
| pufferlib/ocean/benchmark/evaluator.py | Add batched evaluation loop + random baseline + scenario aggregation changes. |
| pufferlib/ocean/benchmark/visual_sanity_check.py | Update WOSAC sanity setup for new config fields. |
| pufferlib/ocean/benchmark/metrics_sanity_check.py | Switch to random baseline collection (no policy needed). |
| pufferlib/ocean/benchmark/evaluate_imported_trajectories.py | Align trajectories by (scenario_id, id) rather than KDTree. |
| pufferlib/config/ocean/drive.ini | Large default config changes (jerk dynamics, conditioning, WOSAC batching, render section). |
| docs/theme/extra.css | Adjust table row background styling. |
| docs/src/wosac.md | Update baseline table + explain baselines. |
| docs/src/visualizer.md | Expand rendering docs + new puffer render mode. |
| docs/src/train.md | Markdown heading fixes + updated Carla map notes. |
| docs/src/simulator.md | Update control mode docs + add init mode explanation. |
| docs/src/pufferdrive-2.0.md | Update author list + bibtex. |
| docs/src/interact-with-agents.md | Add CLI argument documentation. |
| docs/src/export-onnx.md | New ONNX/bin export documentation. |
| docs/src/data.md | Update error message wording to match new map checks. |
| docs/src/cluster.md | New cluster training guide. |
| docs/src/SUMMARY.md | Add cluster + ONNX docs to sidebar. |
| data_utils/carla/generate_carla_agents.py | Fix velocity calc + adjust defaults + add debug prints. |
| README.md | Add CI badge + update citation authors. |
| .gitignore | Ignore model/bin artifacts and debug symbols. |
| .github/workflows/utest.yml | Change CI branch filters to 2.0. |
| .github/workflows/train-ci.yml | Change CI branch filters to 2.0. |
| .github/workflows/render-ci.yml | Change CI branch filters to 2.0. |
| .github/workflows/perf-ci.yml | Change CI branch filters to 2.0. |
| .github/workflows/docs.yml | Change docs deploy branch to 2.0. |

Comment thread pufferlib/pufferl.py
Comment on lines 503 to +506
  if self.render and self.epoch % self.render_interval == 0:
-     model_dir = os.path.join(self.config["data_dir"], f"{self.config['env']}_{self.logger.run_id}")
-     model_files = glob.glob(os.path.join(model_dir, "model_*.pt"))
-
-     if model_files:
-         # Take the latest checkpoint
-         latest_cpt = max(model_files, key=os.path.getctime)
-         bin_path = f"{model_dir}.bin"
-
-         # Export to .bin for rendering with raylib
-         try:
-             export_args = {"env_name": self.config["env"], "load_model_path": latest_cpt, **self.config}
-
-             export(
-                 args=export_args,
-                 env_name=self.config["env"],
-                 vecenv=self.vecenv,
-                 policy=self.uncompiled_policy,
-                 path=bin_path,
-                 silent=True,
-             )
-             pufferlib.utils.render_videos(
-                 self.config, self.vecenv, self.logger, self.epoch, self.global_step, bin_path
-             )
-
-         except Exception as e:
-             print(f"Failed to export model weights: {e}")
+     try:
+         self._render_in_process()
+     except Exception as e:
Comment thread scripts/submit_cluster.py
Comment on lines +195 to +198
from_config = {}
if args.compute_config is not None:
    from_config = yaml.safe_load(open(args.compute_config, "r"))
    from_config = {k: v for k, v in from_config.items() if v is not None}
Comment on lines 38 to +42
void demo() {
    // Read configuration from INI file
    env_init_config conf = {0};
    const char *ini_file = "pufferlib/config/ocean/drive.ini";
    if (ini_parse(ini_file, handler, &conf) < 0) {
Comment on lines +104 to 106
Weights *weights = load_weights("resources/drive/puffer_drive_weights.bin");
DriveNet *net = init_drivenet(weights, env.active_agent_count, env.dynamics_model);

Comment on lines +96 to +100
if (env.active_agent_count == 0) {
    fprintf(stderr, "Error: No active agents found in map '%s' with init_mode=%d. Cannot run demo.\n", env.map_name,
            conf.init_mode);
    free_allocated(&env);
    return -1;
Comment on lines +262 to +269
// Store map_id
PyObject *map_id_obj = PyLong_FromLong(map_id);
PyList_SetItem(map_ids, env_count, map_id_obj);
// Store agent offset
PyObject *offset = PyLong_FromLong(total_agent_count);
PyList_SetItem(agent_offsets, env_count, offset);
total_agent_count += env->active_agent_count;
env_count++;
Comment thread pufferlib/render_video.py
Comment on lines +79 to +83
# Find and move the mp4
mp4_files = glob.glob("*.mp4")
if mp4_files:
    latest_mp4 = max(mp4_files, key=os.path.getctime)
    os.makedirs(os.path.dirname(args.output_path), exist_ok=True)
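
One possible hardening of the snippet above, offered as a sketch rather than as the reviewer's suggestion: search an explicit directory instead of the current working directory, and create the destination's parent through `pathlib`. `os.makedirs("")` raises `FileNotFoundError`, which is what `os.path.dirname` yields for a bare filename. The helper name `move_latest_mp4` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def move_latest_mp4(search_dir, output_path):
    """Move the most recently modified .mp4 in search_dir to output_path."""
    candidates = list(Path(search_dir).glob("*.mp4"))
    if not candidates:
        return None  # nothing was rendered
    latest = max(candidates, key=lambda p: p.stat().st_mtime)
    destination = Path(output_path)
    # Path("video.mp4").parent is ".", so this also works for bare filenames.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(latest), str(destination))
    return destination
```
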
Comment thread scripts/submit_cluster.py
Comment on lines +105 to +107
if program_config is not None:
    from_config = yaml.safe_load(open(program_config, "r"))
    print("Loaded base config:")
Comment on lines 764 to 768
  static PyObject *get_ground_truth_trajectories(PyObject *self, PyObject *args) {
-     if (PyTuple_Size(args) != 7) {
-         PyErr_SetString(PyExc_TypeError, "get_ground_truth_trajectories requires 7 arguments");
+     if (PyTuple_Size(args) != 9) {
+         PyErr_SetString(PyExc_TypeError, "get_ground_truth_trajectories requires 9 arguments");
          return NULL;
      }
Comment on lines +4 to +6
#include "../env_config.h"
#include <string.h>
#include "../env_config.h"

8 participants