Fix DeloraLinear.unmerge subtracting un-cast delta weight by Chessing234 · Pull Request #3166 · huggingface/peft

Chessing234 · 2026-04-16T00:14:44Z

Bug

`DeloraLinear.merge` converts the delta to the base layer's dtype / device before applying it (lines 191-195):

```python
delta_weight = (
self.get_delta_weight(active_adapter)
.detach()
.to(dtype=base_layer.weight.dtype, device=base_layer.weight.device)
)
...
base_layer.weight.data.add_(delta_weight)
```

`unmerge` omits the same conversion:

```python
self.get_base_layer().weight.data -= self.get_delta_weight(active_adapter)
```

Root cause

`get_delta_weight` returns the output of `_compute_delta`, whose dtype/device follow the adapter parameters (`delora_A`/`delora_B`/`delora_lambda`/`delora_w_norm`), not the base weight. When the base layer is in a different dtype (e.g. bf16/int8 after `prepare_model_for_kbit_training`) or on a different device than the adapter, the in-place `-=` either raises a dtype/device mismatch or silently round-trips the residual differently from how it was added — so `unmerge` no longer reverses `merge`.

Fix

Mirror the merge path: cast the delta to the base layer's dtype/device before subtracting. Behavior is unchanged when the adapter and base already share dtype/device.

DeloraLinear.merge converts the delta to the base layer's dtype/device before applying it (lines 191-195): delta_weight = ( self.get_delta_weight(active_adapter) .detach() .to(dtype=base_layer.weight.dtype, device=base_layer.weight.device) ) base_layer.weight.data.add_(delta_weight) unmerge omits the same conversion: self.get_base_layer().weight.data -= self.get_delta_weight(active_adapter) `get_delta_weight` returns the result of `_compute_delta`, whose dtype/device follow the adapter parameters (delora_A/B/lambda/w_norm), not the base weight. When the base layer is e.g. bf16 / int8 / on a different device than the adapter (a common case after prepare_model_for_kbit_training, FSDP, or DoRA-style mixed setups), the in-place `-=` either raises a dtype/device mismatch or silently casts/round-trips the residual incorrectly, so unmerge no longer reverses merge. Mirror the merge path: cast the delta to the base layer's dtype/device before subtracting.

BenjaminBossan · 2026-04-16T09:56:15Z

Thanks for the PR.

or silently round-trips the residual differently from how it was added

What do you mean by that?

so unmerge no longer reverses merge.

IIUC, due to floating point operations, that can never be avoided, even with explicit type promotion.

github-actions · 2026-05-16T15:23:28Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix DeloraLinear.unmerge subtracting un-cast delta weight#3166

Fix DeloraLinear.unmerge subtracting un-cast delta weight#3166
Chessing234 wants to merge 1 commit into
huggingface:mainfrom
Chessing234:fix/delora-unmerge-cast-delta

Chessing234 commented Apr 16, 2026

Uh oh!

BenjaminBossan commented Apr 16, 2026

Uh oh!

github-actions Bot commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Chessing234 commented Apr 16, 2026

Bug

Root cause

Fix

Uh oh!

BenjaminBossan commented Apr 16, 2026

Uh oh!

github-actions Bot commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants