FIX IA3 _ConvNd.unmerge bias uses torch.mul instead of torch.div by Chessing234 · Pull Request #3149 · huggingface/peft

Chessing234 · 2026-04-11T10:31:17Z

Bug

_ConvNd.unmerge in src/peft/tuners/ia3/layer.py does not undo the bias scaling that _ConvNd.merge applied. After a merge() -> unmerge() round-trip on an IA3 non-feedforward Conv2d/Conv3d layer whose base layer has bias is not None, the base layer's bias ends up multiplied by ia3_l ** 2 instead of being restored to its pre-merge value, and each subsequent disable_adapters-triggered unmerge compounds the error.

Root cause

In _ConvNd.merge (line 254–256) the bias is scaled with torch.mul:

if not self.is_feedforward and (base_layer.bias is not None):
    scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)

and in _ConvNd.unmerge (line 280–283) the bias is "un-scaled" by also using torch.mul — exactly the same op:

if not self.is_feedforward and (base_layer.bias is not None):
    scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
    orig_dtype = base_layer.bias.data.dtype
    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)

So bias_merged = bias * ia3_l and then bias_unmerged = bias_merged * ia3_l = bias * ia3_l ** 2, rather than bias.

The corresponding weight path in the same function (line 278) uses torch.div correctly:

base_layer.weight.data = torch.div(base_layer.weight.data, ia3_scaling + 1e-8).to(orig_dtype)

and the sibling Linear.unmerge bias path at line 152–155 also does it correctly:

if not self.is_feedforward and (base_layer.bias is not None):
    scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
    orig_dtype = base_layer.bias.data.dtype
    base_layer.bias.data = torch.div(base_layer.bias.data, scaling.data + 1e-8).to(orig_dtype)

so the intended shape of the fix is unambiguous — _ConvNd.unmerge was copy-pasted from merge for the bias branch and never flipped to a div.

Worked example

Start with bias = [1.0, 2.0] and IA3 scaling = [3.0, 0.5].

step	current code	expected
after merge	`[3.0, 1.0]`	`[3.0, 1.0]`
after unmerge	`[9.0, 0.5]`	`[1.0, 2.0]`

Fix

Mirror the Linear.unmerge pattern: divide by scaling.data + 1e-8 (the + 1e-8 guards against any IA3 scaling entry that happens to be zero — same tolerance used on the weight line just above and in Linear.unmerge).

                 if not self.is_feedforward and (base_layer.bias is not None):
                     scaling = self.ia3_l[active_adapter].reshape(base_layer.bias.shape)
                     orig_dtype = base_layer.bias.data.dtype
-                    base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype)
+                    base_layer.bias.data = torch.div(base_layer.bias.data, scaling.data + 1e-8).to(orig_dtype)

One-line change, symmetric with the weight path right above and with Linear.unmerge in the same file. No effect on the feedforward or bias is None branches.

\`_ConvNd.unmerge\` in \`src/peft/tuners/ia3/layer.py\` is supposed to undo the corresponding \`_ConvNd.merge\` by inverting the (IA)^3 scaling applied to the base layer's weight **and** bias. The weight path correctly divides by the IA3 scaling (line 278): base_layer.weight.data = torch.div( base_layer.weight.data, ia3_scaling + 1e-8 ).to(orig_dtype) but the bias path right below it accidentally uses \`torch.mul\` again, the same op \`merge\` uses: # merge (line 256) base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype) ... # unmerge (line 283) <-- BUG: should be torch.div base_layer.bias.data = torch.mul(base_layer.bias.data, scaling.data).to(orig_dtype) So a merge followed by an unmerge leaves the bias scaled by \`ia3_l ** 2\` instead of being restored to its original value. For any non-feedforward IA3 Conv layer (Conv2d / Conv3d) whose base layer has \`bias is not None\`, \`merge() -> unmerge()\` permanently corrupts the bias — and subsequent \`disable_adapters\`-triggered unmerges compound the error. The sibling \`Linear.unmerge\` in the same file (line 155) already uses the correct pattern: base_layer.bias.data = torch.div( base_layer.bias.data, scaling.data + 1e-8 ).to(orig_dtype) Mirror that pattern in \`_ConvNd.unmerge\`: divide by \`scaling.data + 1e-8\` so the bias unmerge is the inverse of the bias merge, and the \`+ 1e-8\` tolerance guards against division by any IA3 scaling entry that happens to be zero.

githubnemo

Hey, nice find.

Can you also reduce the tolerance in test_custom_models.py for test_merge? (to verify that the fix works)

--- a/tests/testing_common.py
+++ b/tests/testing_common.py
@@ -608,7 +608,7 @@ class PeftCommonTester:
                 atol, rtol = 1e-3, 1e-3
             if (config.peft_type in {"IA3", "LORA", "OFT"}) and (model_id in conv_ids):
                 # for some reason, the Conv introduces a larger error
-                atol, rtol = 0.3, 0.01
+                atol, rtol = 1e-2, 0.01
             if (config.peft_type == "OFT") and not is_decoder:
                 atol, rtol = 0.3, 0.01
             if model_id == "trl-internal-testing/tiny-Llama4ForCausalLM":

It would also be nice if you'd found out why test_safe_merge doesn't detect this issue.

github-actions · 2026-05-11T16:00:21Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

githubnemo · 2026-05-19T17:44:18Z

gentle ping @Chessing234 - do you still plan on working on this?

Drop atol from 0.3 to 1e-2 (rtol unchanged at 0.01) on the IA3/LORA/OFT + Conv branch in _test_merge_layers. The previous 0.3 tolerance was loose enough to hide the IA3 _ConvNd.unmerge bias bug this PR fixes; with the fix, all 126 Conv2d/Conv3d merge cases pass under the tighter bound, and reverting the fix makes the affected IA3-Conv2d cases fail.

Chessing234 · 2026-05-20T02:00:10Z

Hey @githubnemo, sorry for the silence — yes, still on it.

Just pushed the tolerance change you suggested (atol=1e-2, rtol=0.01) on the IA3/LORA/OFT Conv branch in tests/testing_common.py. Verified locally: with the fix in place, all 126 Conv2d/Conv3d test_merge_layers* cases pass under the tighter bound; reverting the IA3 unmerge bias line back to torch.mul makes 3 of the IA3-Conv2d cases fail — so the tighter tolerance does catch this specific bug now.

On test_safe_merge not detecting it: _test_safe_merge only calls model.merge_and_unload(safe_merge=True) once and compares the unloaded model's logits against the PEFT model's. It never calls unmerge_adapter(). Since the bug lives purely in _ConvNd.unmerge, and safe_merge=True only adds a NaN/Inf guard on the merge path, this test never exercises the buggy code. _test_merge_layers is the right place — it does the full merge → unmerge round-trip — it just wasn't tight enough before. Confirmed by running test_safe_merge[Conv2d * IA3] with the bug reverted: still passes 5/5.

Happy to also add an explicit merge/unmerge round-trip assertion to _test_safe_merge if you'd like that as a separate follow-up, but it felt out of scope here.

githubnemo reviewed Apr 16, 2026

View reviewed changes

github-actions Bot closed this May 19, 2026

githubnemo reopened this May 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FIX IA3 _ConvNd.unmerge bias uses torch.mul instead of torch.div#3149

FIX IA3 _ConvNd.unmerge bias uses torch.mul instead of torch.div#3149
Chessing234 wants to merge 2 commits into
huggingface:mainfrom
Chessing234:fix/ia3-convnd-unmerge-bias-div

Chessing234 commented Apr 11, 2026 •

edited by githubnemo

Loading

Uh oh!

githubnemo left a comment •

edited

Loading

Uh oh!

github-actions Bot commented May 11, 2026

Uh oh!

githubnemo commented May 19, 2026

Uh oh!

Chessing234 commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Chessing234 commented Apr 11, 2026 • edited by githubnemo Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Bug

Root cause

Worked example

Fix

Uh oh!

githubnemo left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 11, 2026

Uh oh!

githubnemo commented May 19, 2026

Uh oh!

Chessing234 commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Chessing234 commented Apr 11, 2026 •

edited by githubnemo

Loading

githubnemo left a comment •

edited

Loading