Draft
113 commits
8d20b8d
Docs: rewrite install guide and make lerobot a required dependency (#…
yuecideng Apr 12, 2026
d2a8dad
Update cobotmagic arm asset. (#228)
matafela Apr 13, 2026
987d04f
Fix opw solver (#229)
matafela Apr 15, 2026
80368bd
Fix crashing when no grasp pose found. (#232)
matafela Apr 15, 2026
59021a4
add rl benchmark (#231)
matafela Apr 15, 2026
d860435
Enhance workspace analyzer computational efficiency (#230)
yuecideng Apr 15, 2026
d52ea77
Update CobotMagic default safe margin (#235)
matafela Apr 16, 2026
d5bf930
Fix demo action shape normalization and control-part mapping (#237)
yuecideng Apr 16, 2026
3bb2592
Refine URDF assembly component prefixes and name casing policy (#236)
chase6305 Apr 19, 2026
50843a5
Refactor: benchmark (#233)
matafela Apr 20, 2026
e0e16ae
Update pytorch kinematic solver benchmark param (#240)
matafela Apr 20, 2026
0497711
fix plan_trajectory (#242)
yhnsu Apr 22, 2026
ccce98a
docs: add NVIDIA driver guide and mesh loading tutorial (#245)
XuanchaoPENG Apr 23, 2026
386dc66
Add multi-version documentation build support (#234)
yuecideng Apr 24, 2026
9fb2204
Fix README docs link (#246)
yuecideng Apr 24, 2026
06258f1
docs: add AI coding agent skills and cross-reference throughout docum…
yuecideng Apr 28, 2026
72e9731
fix: correct the link of website (#249)
yangchen73 Apr 29, 2026
c824c86
docs: upgrade README badges and add auto-sync to introduction.rst (#250)
yuecideng Apr 29, 2026
8e92c08
upgrade pytorch kinematics (#244)
matafela Apr 30, 2026
72b5bb9
chore: upgrade black to 26.3.1 and fix code style (#251)
yuecideng Apr 30, 2026
fc54598
Improve API reference detail and coverage (#252)
yuecideng May 1, 2026
168f11c
feat: Add atomic action abstraction layer for embodied AI motion gene…
yuecideng May 1, 2026
c5e6f00
Fix pytorch solver qpos mapping (#253)
matafela May 7, 2026
23a1208
docs: Remove AI coding agent skills references (#254)
yuecideng May 7, 2026
076fc65
docs: add data generation tutorial for synthesized data pipeline (#238)
yvvonie May 8, 2026
f81b8a6
Update skills structure and roadmap docs (#257)
yuecideng May 8, 2026
9e34ec1
Adapt dexsim v0.4.0 (#226)
yuecideng May 9, 2026
72ffe43
docs: add uv installation support and update dependencies (#258)
yuecideng May 9, 2026
c6ce00f
Fix PyPI release workflow (#259)
yuecideng May 9, 2026
e11eb4e
docs: improve navigation, cross-references, and guides (#262)
yuecideng May 12, 2026
a251699
Fix multiversion docs overwrite on main branch push (#263)
yuecideng May 12, 2026
6e5f745
docs: add academic publications page (#265)
yuecideng May 12, 2026
c322584
Fix multiversion docs overwrite on main push (#266)
yuecideng May 12, 2026
5cdd9c2
ci: fix docs deployment to use GitHub Actions Pages source
yuecideng May 12, 2026
2a8b56c
ci: fix multiversion docs deployment (#267)
yuecideng May 12, 2026
5bc76ce
fix opw tcp (#261)
matafela May 13, 2026
c3f5b4d
Annotate full mesh (#260)
matafela May 19, 2026
83ef059
Adapt DexSim v0.4.1 (#272)
yuecideng May 19, 2026
498dce9
Add emissive light mode to randomize_indirect_lighting (#274)
yuecideng May 20, 2026
d46bedb
fix atomic action (#273)
XuanchaoPENG May 21, 2026
7823731
Add Newton physics backend support
yuecideng May 21, 2026
77963fa
sim-ready pipeline (#271)
XuanchaoPENG May 21, 2026
178c730
wip
yuecideng May 22, 2026
1773778
Fix multi-version GitHub Pages docs deployment (#277)
yuecideng May 22, 2026
010d3cd
update design docs
yuecideng May 23, 2026
2314a05
update cfg
yuecideng May 24, 2026
6878a9b
Merge remote-tracking branch 'origin/main' into feature/newton-physic…
yuecideng May 24, 2026
6c1e26e
wip
yuecideng May 25, 2026
65a8475
wip
yuecideng May 25, 2026
1d8d416
wip
yuecideng May 25, 2026
85a8ee4
wip
yuecideng May 25, 2026
1250aa0
style
yuecideng May 25, 2026
fef305e
standardize lerobot key (#280)
yhnsu May 26, 2026
41edaa4
wip
yuecideng May 26, 2026
bc1a8b3
Fix rigid object init ordering and reset after GPU physics setup (#283)
yuecideng May 26, 2026
a811b1d
Merge remote-tracking branch 'origin/main' into feature/newton-physic…
yuecideng May 26, 2026
87eedfe
wip
yuecideng May 26, 2026
389cfa9
Add YAML support for gym and RL training configs (#284)
yuecideng May 26, 2026
c5192b3
wip
yuecideng May 26, 2026
590211c
Add per-link articulation physics configuration (#278)
yuecideng May 26, 2026
aafc199
wip
yuecideng May 26, 2026
23feec1
wip
yuecideng May 26, 2026
1747cc4
wip
yuecideng May 28, 2026
3e4d4ee
Fix base solver fk (#285)
matafela May 28, 2026
cf5184e
wip
yuecideng May 28, 2026
7f09bb2
wip
yuecideng May 29, 2026
f87d885
wip
yuecideng May 29, 2026
0486c39
feat(newton): enable runtime mutation of physical properties via Newt…
yuecideng May 30, 2026
b3236fc
feat(sim): optimize Newton backend + expand rigid-object test coverage
yuecideng May 31, 2026
8c224a9
wip
yuecideng Jun 1, 2026
826d047
wip
yuecideng Jun 2, 2026
98c12d5
feat: add agent context routing system for EmbodiChain (#288)
yuecideng Jun 3, 2026
dd0f021
wip
yuecideng Jun 3, 2026
d93e61a
Add neural network IK solver (#286)
yangchen73 Jun 3, 2026
4b6852d
wip
yuecideng Jun 3, 2026
6034f7b
wip
yuecideng Jun 3, 2026
70dfc53
Add joint armature support for articulations (#290)
yuecideng Jun 4, 2026
1b86641
Align window controls with DexSim defaults (#291)
yuecideng Jun 4, 2026
9fcc807
Merge branch 'main' into feature/newton-physics-backend
yuecideng Jun 4, 2026
044877e
Improve installation guide and deduplicate gensim install docs (#293)
yuecideng Jun 4, 2026
1819048
Fix recording for PourWater (#292)
yangchen73 Jun 5, 2026
cb3240b
Auto-select default renderer based on GPU (#294)
yuecideng Jun 9, 2026
52e8a22
Merge remote-tracking branch 'origin/main' into feature/newton-physic…
yuecideng Jun 10, 2026
74bf17a
wip
yuecideng Jun 11, 2026
d5007f0
Merge remote-tracking branch 'origin/main' into feature/newton-physic…
yuecideng Jun 15, 2026
b338f4d
init articulation
yuecideng Jun 15, 2026
f8227f6
Merge remote-tracking branch 'origin/main' into feature/newton-physic…
yuecideng Jun 16, 2026
ceec28f
wip
yuecideng Jun 16, 2026
8388b4b
wip
yuecideng Jun 16, 2026
61855dc
wip
yuecideng Jun 16, 2026
c3923a8
add physcis backend
yuecideng Jun 18, 2026
396009f
wip: prepare Newton backend wiring for add_robot
yuecideng Jun 19, 2026
6b4709a
feat: enable RigidObject runtime attr mutation on Newton backend
yuecideng Jun 19, 2026
59fa679
feat: enable add_robot on the Newton backend
yuecideng Jun 19, 2026
1e3e75d
feat: Newton-native physics-attribute config for RigidObject & Articu…
yuecideng Jun 19, 2026
30a6ffe
test: add backend capability parity matrix
yuecideng Jun 19, 2026
3ccce78
feat: push per-link mass live on Newton in Articulation.set_link_phys…
yuecideng Jun 19, 2026
4d43fbc
fix: harden RigidObject not-ready Newton setter paths to mirror to meta
yuecideng Jun 19, 2026
a439bcb
wip
yuecideng Jun 19, 2026
006d2a1
wip
yuecideng Jun 21, 2026
4d0257c
docs: add Newton backend PR design (multi-env + differentiable env)
yuecideng Jun 21, 2026
79d3252
docs: implementation plan for Newton backend PR (multi-env + APG)
yuecideng Jun 21, 2026
86ead01
feat(sim/newton): add _arenas_cloned lifecycle flag
yuecideng Jun 21, 2026
56d0fc6
Revert "feat(sim/newton): add _arenas_cloned lifecycle flag"
yuecideng Jun 21, 2026
fc33a76
docs: revise Newton PR plan — Target 4 already complete
yuecideng Jun 21, 2026
56f1c9a
feat(sim): SimulationManager delegators for Newton diff stepper
yuecideng Jun 21, 2026
611d735
test(sim): isolate grad-guard test with semi_implicit solver
yuecideng Jun 21, 2026
68e3341
feat(sim/diff): Warp-tape <-> PyTorch-autograd bridge
yuecideng Jun 21, 2026
65f4488
feat(gym): DifferentiableEmbodiedEnv for APG
yuecideng Jun 21, 2026
13b981f
feat(gym/tasks): Franka FR3 reach APG example
yuecideng Jun 21, 2026
dec2222
docs: Newton differentiable-env topic + design doc update
yuecideng Jun 21, 2026
b18cb77
Merge branch 'main' into feature/newton-physics-backend
yuecideng Jun 23, 2026
8e1becc
wip
yuecideng Jun 23, 2026
30 changes: 30 additions & 0 deletions agent_context/MAP.yaml
@@ -322,3 +322,33 @@ topics:
      - manager-functor
      - env-framework
    status: active
  - id: differentiable-env
    title: Differentiable Environment (APG)
    aliases:
      - differentiable env
      - apg
      - analytic policy gradient
      - differentiable rl
      - Warp tape autograd
      - NewtonStepFunc
      - 可微环境
    keywords:
      - differentiable
      - gradient
      - apg
      - autograd
      - warp tape
      - requires_grad
      - semi_implicit
      - DifferentiableEmbodiedEnv
      - NewtonStepFunc
    paths:
      - topics/differentiable-env/differentiable-env.md
    source_of_truth:
      - embodichain/lab/gym/envs/differentiable_env.py
      - embodichain/lab/sim/diff/
      - embodichain/lab/gym/envs/tasks/special/franka_reach_apg.py
    related_topics:
      - env-framework
      - rl-training
    status: active
110 changes: 110 additions & 0 deletions agent_context/topics/differentiable-env/differentiable-env.md
@@ -0,0 +1,110 @@
# differentiable-env

> Topic: Differentiable environment for analytic policy gradient (APG) —
> `DifferentiableEmbodiedEnv` + the `embodichain.lab.sim.diff` Warp-tape
> ↔ PyTorch-autograd bridge.

## Overview

EmbodiChain supports analytic policy gradient (APG) via
`embodichain.lab.gym.envs.differentiable_env.DifferentiableEmbodiedEnv`.
The bridge wraps a Warp tape around one EmbodiChain physics step and
exposes a `torch.autograd.Function`
(`embodichain.lab.sim.diff.NewtonStepFunc`) so PyTorch-side `action`
tensors get a gradient from `tape.backward()`.

## Required configuration

- `NewtonPhysicsCfg(requires_grad=True, solver_cfg={"solver_type": "semi_implicit"})`
- `use_cuda_graph=False` (forced by dexsim when grad mode is on)

The default backend and any other Newton solver are rejected at
construction time by `DifferentiableEmbodiedEnv._validate_diff_cfg`.
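
The kind of check described above can be sketched in plain Python. This is only an illustration, not the code in `_validate_diff_cfg`: `PhysicsCfgSketch`, its `backend` field, and `validate_diff_cfg` are hypothetical stand-ins, and only `requires_grad` and `solver_cfg["solver_type"]` come from the configuration listed above.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicsCfgSketch:
    """Hypothetical stand-in for NewtonPhysicsCfg (not the real class)."""

    backend: str = "newton"  # assumed field name, for illustration only
    requires_grad: bool = False
    solver_cfg: dict = field(default_factory=dict)


def validate_diff_cfg(cfg: PhysicsCfgSketch) -> None:
    """Raise ValueError unless cfg satisfies the requirements listed above."""
    if cfg.backend != "newton":
        raise ValueError("the differentiable env needs the Newton backend")
    if not cfg.requires_grad:
        raise ValueError("requires_grad must be True")
    solver_type = cfg.solver_cfg.get("solver_type")
    if solver_type != "semi_implicit":
        raise ValueError(
            f"solver_type must be 'semi_implicit', got {solver_type!r}"
        )


def is_rejected(cfg: PhysicsCfgSketch) -> bool:
    """Return True when validate_diff_cfg refuses the configuration."""
    try:
        validate_diff_cfg(cfg)
    except ValueError:
        return True
    return False
```

Failing at construction time, as the text describes, surfaces a misconfigured solver before any tape is recorded rather than as a silent missing gradient later.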

## Subclass contract

Task authors implement two methods on their `DifferentiableEmbodiedEnv` subclass:

- `_apply_action_kernel(action_wp, tape)` — launch a Warp kernel that
writes joint/body targets into `nm._control` while the tape is open.
The `action_wp` argument is a `wp.array(dtype=wp.float32,
requires_grad=True)` of shape `[num_envs * action_dim]`.
- `_read_outputs(final_state)` — build the `obs` / `reward` /
`terminated` / `truncated` outputs as torch tensors via `wp.to_torch`
so the tape can record the dependency. Must return a dict with
`_order` (tuple of output keys) and `_grad_track` (mapping from output
key to the Warp array that backs its gradient, or `None` for outputs
that don't need grad).
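
The shape of the dict returned by `_read_outputs` can be checked with a small helper. The sketch below is hypothetical (`check_outputs` is not part of EmbodiChain) and assumes that every key in `_order` has an entry in `_grad_track`, with `None` marking outputs that need no gradient. Plain Python objects stand in for the torch tensors and Warp arrays.

```python
def check_outputs(outputs: dict) -> None:
    """Validate the `_order` / `_grad_track` layout described above."""
    order = outputs.get("_order")
    grad_track = outputs.get("_grad_track")
    if not isinstance(order, tuple):
        raise TypeError("_order must be a tuple of output keys")
    if not isinstance(grad_track, dict):
        raise TypeError("_grad_track must map output keys to arrays or None")
    for key in order:
        if key not in outputs:
            raise KeyError(f"output {key!r} is listed in _order but missing")
        if key not in grad_track:
            raise KeyError(f"_grad_track has no entry for {key!r}")


# Placeholder values: strings stand in for tensors and Warp arrays.
example_outputs = {
    "obs": "obs_tensor",
    "reward": "reward_tensor",
    "terminated": "terminated_tensor",
    "truncated": "truncated_tensor",
    "_order": ("obs", "reward", "terminated", "truncated"),
    "_grad_track": {
        "obs": None,
        "reward": "reward_wp",  # the Warp array backing the reward gradient
        "terminated": None,
        "truncated": None,
    },
}
check_outputs(example_outputs)
```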

Optionally override `_make_step_fn()` to swap the per-substep advance
function. The default uses `dexsim.engine.newton_physics.DifferentiableStepper.step`;
the Franka APG example overrides it to call `newton.eval_fk` directly
(see "FK bypass" below).

See `embodichain/lab/gym/envs/tasks/special/franka_reach_apg.py` for
the canonical example.

## Why reward must be computed inside the tape

`NewtonStepFunc.forward` keeps the `wp.Tape` open while
`obs_reward_fn(final_state)` runs. Reward must be computed by a Warp
kernel that writes into a `wp.zeros(..., requires_grad=True)` array
inside the tape; `wp.to_torch(reward_wp)` then returns a torch tensor
that carries the tape's gradient. Computing reward in pure torch *after*
the tape closes would detach it from the autograd graph and
`action.grad` would come back as `None`.

The same rule applies to any observation that needs to be
grad-tracked: build it from `wp.to_torch` of a tape-tracked Warp array.
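
The effect can be reproduced with a toy reverse-mode tape in plain Python. This is not Warp and not `NewtonStepFunc`; `Tape`, `Var`, `scale`, `square`, and `backward` are hypothetical names made up for this sketch. It only shows that an operation run after the tape closes leaves no record for the backward pass to follow. The toy reports a zero gradient where PyTorch would report `None`.

```python
class Var:
    """A scalar value with an accumulated gradient."""

    def __init__(self, value: float):
        self.value = value
        self.grad = 0.0


class Tape:
    """Toy tape: stores (output, input, local_derivative) while open."""

    def __init__(self):
        self.records = []
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, *exc):
        self.is_open = False
        return False


def scale(tape: Tape, x: Var, k: float) -> Var:
    y = Var(k * x.value)
    if tape.is_open:
        tape.records.append((y, x, k))
    return y


def square(tape: Tape, x: Var) -> Var:
    y = Var(x.value ** 2)
    if tape.is_open:
        tape.records.append((y, x, 2.0 * x.value))
    return y


def backward(tape: Tape, out: Var) -> None:
    """Propagate d(out)/d(input) through the recorded operations."""
    out.grad = 1.0
    for y, x, local in reversed(tape.records):
        x.grad += y.grad * local


def action_grad(reward_inside_tape: bool) -> float:
    tape = Tape()
    action = Var(3.0)
    with tape:
        q = scale(tape, action, 2.0)
        if reward_inside_tape:
            reward = square(tape, q)
    if not reward_inside_tape:
        reward = square(tape, q)  # tape is closed, so nothing is recorded
    backward(tape, reward)
    return action.grad
```

With the reward inside the tape, `action_grad(True)` follows the chain `reward = (2 * action) ** 2`; with the reward computed afterwards, the chain from `reward` back to `q` is missing and the action receives no gradient.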

## FK bypass for the Franka task

The `semi_implicit` Newton solver does not propagate gradients from
`joint_target_pos` to `body_q` (verified empirically; the reference
implementation at `/root/sources/analytic_policy_gradients/envs/franka_reach_env.py`
hits the same limitation and uses the same workaround). The Franka APG
example overrides `_make_step_fn()` to call `newton.eval_fk(model,
new_joint_q, joint_qd, fk_state)` directly, bypassing the dynamics
solver. The grad path is then:

action → new_joint_q (action kernel) → eval_fk → body_q → reward kernel → reward_wp → tape.backward → action.grad

The default `_make_step_fn` still uses the differentiable stepper, so
envs whose reward depends on dynamics (not just FK) can use it — but
they should verify the solver actually propagates grad for their
control inputs before relying on it.
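
The FK-only gradient path can be worked through by hand on a planar two-link arm. The sketch below is a stand-in, not the Franka task: `eval_fk_planar`, the link lengths, the squared-distance reward, and the assumption that the action is a joint-position delta are all illustrative choices. It applies the chain rule along the path above and can be compared against finite differences.

```python
import math

LINK_1, LINK_2 = 0.5, 0.4  # illustrative link lengths in metres


def apply_action(joint_q, action):
    """Assumed action model: new_joint_q = joint_q + action."""
    return tuple(q + a for q, a in zip(joint_q, action))


def eval_fk_planar(q):
    """End-effector position of a planar 2-link arm (toy FK)."""
    q1, q2 = q
    x = LINK_1 * math.cos(q1) + LINK_2 * math.cos(q1 + q2)
    y = LINK_1 * math.sin(q1) + LINK_2 * math.sin(q1 + q2)
    return x, y


def reward_from_action(joint_q, action, target):
    """Reward = negative squared distance from end effector to target."""
    x, y = eval_fk_planar(apply_action(joint_q, action))
    return -((x - target[0]) ** 2 + (y - target[1]) ** 2)


def reward_grad_wrt_action(joint_q, action, target):
    """Chain rule: d(reward)/d(action) = d(reward)/d(xy) * d(xy)/d(q) * I."""
    q1, q2 = apply_action(joint_q, action)
    x, y = eval_fk_planar((q1, q2))
    ex, ey = x - target[0], y - target[1]
    # FK Jacobian columns
    dx_dq1 = -LINK_1 * math.sin(q1) - LINK_2 * math.sin(q1 + q2)
    dy_dq1 = LINK_1 * math.cos(q1) + LINK_2 * math.cos(q1 + q2)
    dx_dq2 = -LINK_2 * math.sin(q1 + q2)
    dy_dq2 = LINK_2 * math.cos(q1 + q2)
    g1 = -2.0 * (ex * dx_dq1 + ey * dy_dq1)
    g2 = -2.0 * (ex * dx_dq2 + ey * dy_dq2)
    # d(new_joint_q)/d(action) is the identity under the assumed action model
    return g1, g2
```

Because the path never touches the dynamics solver, the gradient depends only on the FK Jacobian, which is why the bypass works for a pure reaching reward.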

## Functor autograd compatibility

Reward/observation functors that compose torch operations on tensors
obtained via `wp.to_torch` are automatically autograd-compatible.
Functors that detour through CPU / NumPy break the graph; those need
torch-only reimplementations for the differentiable path. For now, the
differentiable env computes reward via a dedicated Warp kernel rather
than reusing the standard reward-manager functors — a future task can
audit and port functors as needed.

## Memory

Each control step records `sim_steps_per_control` substeps into the tape. For
long horizons or large `num_envs`, pass `truncate_backward_at=K` on the
env config to split the tape and detach at chunk boundaries.
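
A rough sketch of how chunking bounds the tape size follows. It assumes that `K` counts control steps and that each chunk is differentiated on its own. Neither is confirmed by the text above, and `backward_chunks` and `recorded_substeps` are hypothetical helpers.

```python
def backward_chunks(horizon: int, k: int) -> list[tuple[int, int]]:
    """Split control steps [0, horizon) into half-open chunks of at most k."""
    if horizon < 0 or k <= 0:
        raise ValueError("horizon must be >= 0 and k must be > 0")
    return [(start, min(start + k, horizon)) for start in range(0, horizon, k)]


def recorded_substeps(chunk: tuple[int, int], sim_steps_per_control: int) -> int:
    """Number of substeps a single chunk records into its tape."""
    start, stop = chunk
    return (stop - start) * sim_steps_per_control


chunks = backward_chunks(horizon=10, k=4)
peak = max(recorded_substeps(c, sim_steps_per_control=8) for c in chunks)
```

Under these assumptions the peak number of recorded substeps scales with `K * sim_steps_per_control` rather than with the full horizon, at the cost of cutting gradient flow across chunk boundaries.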

## Source of truth

- `embodichain/lab/gym/envs/differentiable_env.py` —
`DifferentiableEmbodiedEnv` base class.
- `embodichain/lab/sim/diff/bridge.py` — `NewtonStepFunc`,
`tape_context`, `differentiable_step`.
- `embodichain/lab/gym/envs/tasks/special/franka_reach_apg.py` —
example task.
- `embodichain/lab/sim/sim_manager.py` —
`SimulationManager.create_differentiable_stepper` /
`create_gradient_rollout` delegators.
- `/root/sources/dexsim/python/dexsim/engine/newton_physics/differentiable_stepper.py`
— the underlying dexsim primitive.

## Related topics

- env-framework
- rl-training