feat(lora): Add FIM-guided adaptive LoRA rank allocation (FimConfig + initialize_lora_fim_ranks) by ramkrishs · Pull Request #3204 · huggingface/peft

ramkrishs · 2026-04-28T22:21:30Z

Summary

Adds FimConfig and initialize_lora_fim_ranks() — a data-driven method that redistributes LoRA ranks across layers using the diagonal of the empirical Fisher Information Matrix (eFIM), concentrating rank budget on layers that are most sensitive to the loss.

Proposal issue: #3203

Motivation

LoRA uses a fixed rank r for all adapter matrices. Different layers have different sensitivity to fine-tuning data — early attention layers often require less capacity than later layers; q/v projections often differ from k projections. A fixed-rank allocation wastes capacity on insensitive layers.

EVA (already in PEFT) addresses this via SVD of input activations. This PR uses a complementary signal: the eFIM diagonal (mean squared gradient), which directly measures per-parameter loss sensitivity rather than activation variance. The two are orthogonal — EVA optimizes initialization directions; FIM optimizes rank allocation by sensitivity.

Algorithm

The eFIM diagonal for parameter θᵢ:

F_ii ≈ (1/T) Σ_t (∂ℓ_t / ∂θ_i)²
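As a concrete illustration of the formula above, the following sketch computes the eFIM diagonal for a toy linear model with squared-error loss, using a hand-written gradient. The model, the data, and the function name `efim_diagonal` are illustrative assumptions and are not part of the PR; here T is simply the number of samples.

```python
from typing import List, Sequence


def efim_diagonal(
    weights: Sequence[float],
    inputs: Sequence[Sequence[float]],
    targets: Sequence[float],
) -> List[float]:
    """Mean squared per-sample gradient of 0.5 * (w.x - y)**2 w.r.t. each weight."""
    fisher = [0.0] * len(weights)
    for x, y in zip(inputs, targets):
        residual = sum(w * xi for w, xi in zip(weights, x)) - y
        # Gradient of the per-sample loss w.r.t. weight i is residual * x[i].
        for i, xi in enumerate(x):
            fisher[i] += (residual * xi) ** 2
    n_samples = len(inputs)
    return [f / n_samples for f in fisher]


# Hypothetical toy data: the second feature is always zero, so its weight
# never receives gradient and its eFIM entry is exactly 0.
diag = efim_diagonal(
    weights=[0.0, 0.0],
    inputs=[[1.0, 0.0], [2.0, 0.0]],
    targets=[1.0, 2.0],
)
print(diag)
```

A weight with a zero eFIM entry would contribute nothing to its layer's importance score, which is the property the rank allocation relies on.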

Rank allocation:

score_i = mean(F_i)                           # per-layer importance
rank_i  = budget × score_i / Σ_j score_j      # proportional share, budget = n_layers × r
rank_i  = clamp(rank_i, r_min, r_max)         # integer, largest-remainder rounding

Total rank budget is preserved: mean rank across layers equals the original r.
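The allocation steps above can be sketched in plain Python. This is a minimal illustration and not the PR's `_allocate_ranks`: the function name `allocate_ranks`, the layer names, and in particular the way ranks freed by clamping are handed back out one at a time are assumptions, since the description does not say how clamped overflow is redistributed.

```python
import math
from typing import Dict


def allocate_ranks(scores: Dict[str, float], r: int, r_min: int, r_max: int) -> Dict[str, int]:
    """Split a budget of len(scores) * r ranks in proportion to the scores."""
    n_layers = len(scores)
    budget = n_layers * r
    total = sum(scores.values())
    if total <= 0 or not (n_layers * r_min <= budget <= n_layers * r_max):
        raise ValueError("need positive scores and r_min <= r <= r_max")
    # Ideal fractional share of the budget for each layer.
    ideal = {name: budget * s / total for name, s in scores.items()}
    # Start from the floor of each share, clamped into [r_min, r_max].
    ranks = {name: min(max(math.floor(v), r_min), r_max) for name, v in ideal.items()}
    # Hand out (or take back) one rank at a time until the budget is met.
    while sum(ranks.values()) != budget:
        step = 1 if sum(ranks.values()) < budget else -1
        movable = [n for n in ranks if r_min <= ranks[n] + step <= r_max]
        # Largest remainder first when adding, most over-allocated first when removing.
        pick = max(movable, key=lambda n: step * (ideal[n] - ranks[n]))
        ranks[pick] += step
    return ranks


# Hypothetical per-layer scores; the last layer's ideal share (19.2) exceeds r_max.
scores = {"l0.q_proj": 0.5, "l0.v_proj": 1.5, "l1.q_proj": 2.0, "l1.v_proj": 6.0}
ranks = allocate_ranks(scores, r=8, r_min=1, r_max=16)
print(ranks, sum(ranks.values()))
```

Without clamping, the loop reduces to ordinary largest-remainder rounding. With clamping, the budget can only be met exactly when `r_min <= r <= r_max`, which is why the sketch checks that condition up front.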

API

Follows the EVA pattern exactly: FimConfig dataclass + initialize_lora_fim_ranks() public function + init_lora_weights='fim' trigger in LoraConfig.

from peft import LoraConfig, get_peft_model, FimConfig, initialize_lora_fim_ranks

fim_cfg = FimConfig(
    fim_calibration_batches=8,   # batches for eFIM accumulation
    r_min=1,                     # minimum rank per layer
    r_max=32,                    # maximum rank per layer (default: 2 * r)
    adjust_scaling_factors=True, # preserve lora_alpha / r after reallocation
)

config = LoraConfig(
    r=8,
    init_lora_weights="fim",
    fim_config=fim_cfg,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base_model, config)

initialize_lora_fim_ranks(model, dataloader=calibration_loader)
# or, with pre-computed scores:
initialize_lora_fim_ranks(model, fim_scores=my_fim_dict)
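The comment on `adjust_scaling_factors` says that `lora_alpha / r` is preserved after reallocation. One plausible reading, sketched below, is that each layer's alpha is rescaled in proportion to its new rank so that the LoRA scaling factor `alpha_i / r_i` stays equal to the original `lora_alpha / r`. The helper name `rescale_alpha` and the layer names are hypothetical, and the PR's `_resize_lora_layer` may implement this differently.

```python
from typing import Dict


def rescale_alpha(lora_alpha: float, r: int, new_ranks: Dict[str, int]) -> Dict[str, float]:
    """Per-layer alpha such that alpha_i / r_i equals lora_alpha / r."""
    return {name: lora_alpha * r_i / r for name, r_i in new_ranks.items()}


new_ranks = {"layers.0.q_proj": 4, "layers.0.v_proj": 12}  # hypothetical allocation
alphas = rescale_alpha(lora_alpha=16, r=8, new_ranks=new_ranks)
for name, r_i in new_ranks.items():
    # The scaling factor is unchanged for every layer.
    assert alphas[name] / r_i == 16 / 8
print(alphas)
```

In released PEFT, per-layer ranks and alphas of this kind are expressed through the `rank_pattern` and `alpha_pattern` fields of `LoraConfig`; whether this PR writes to those fields is not stated in the description.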

Files changed

| File | Change |
| --- | --- |
| `src/peft/tuners/lora/fim.py` | New: `FimConfig`, `initialize_lora_fim_ranks`, internal helpers |
| `src/peft/tuners/lora/config.py` | Add `fim_config` field, `'fim'` to `init_lora_weights` Literal, validation |
| `src/peft/tuners/lora/layer.py` | Allow `'fim'` in `reset_lora_parameters` (treated as standard init; rank redistribution happens post-construction) |
| `src/peft/tuners/lora/__init__.py` | Export `FimConfig`, `initialize_lora_fim_ranks` |
| `src/peft/tuners/__init__.py` | Propagate exports |
| `src/peft/__init__.py` | Top-level export |
| `tests/test_lora_fim.py` | 23 unit tests |

Tests

23 passed in 5.00s

Covers: FimConfig construction/validation, _compute_layer_importance, _allocate_ranks (budget preservation, clamping, monotonicity), _resize_lora_layer (increase/decrease/noop, scaling adjustment), initialize_lora_fim_ranks end-to-end (with dataloader and pre-computed scores), LoraConfig validation warnings, and top-level import.

No GPU required. All tests run on CPU.

Reference

LeCun et al., Optimal Brain Damage, NeurIPS 1990 — theoretical basis for eFIM diagonal importance
Zhang et al., AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning, ICLR 2023 — related adaptive rank via SVD
Related: pytorch/ao #4352 — same eFIM diagonal applied to weight pruning

This PR was developed with AI assistance. All code has been tested, manually reviewed, and verified against the PEFT codebase conventions.

Introduces FimConfig and initialize_lora_fim_ranks(), a calibration-based method that redistributes LoRA ranks across layers using the diagonal of the empirical Fisher Information Matrix (eFIM), subject to a global rank budget. Layers with large mean squared gradients (high eFIM score) receive higher rank; layers with low sensitivity receive lower rank. This keeps the total rank budget of fixed-rank LoRA (and the same parameter count when the adapted layers share dimensions) while concentrating capacity where the estimated loss curvature is highest.

Algorithm:
  F_ii ≈ (1/T) Σ_t (∂ℓ_t/∂θ_i)²   (eFIM diagonal, mean squared gradient)
  rank_i = budget × mean(F_i) / Σ_j mean(F_j), clamped to [r_min, r_max]
  budget = n_layers × r   (mean rank preserved)

Files changed:
  src/peft/tuners/lora/fim.py: FimConfig + initialize_lora_fim_ranks
  src/peft/tuners/lora/config.py: fim_config field + 'fim' init mode
  src/peft/tuners/lora/layer.py: allow 'fim' in reset_lora_parameters
  src/peft/tuners/lora/__init__.py: export FimConfig, initialize_lora_fim_ranks
  src/peft/tuners/__init__.py: propagate exports
  src/peft/__init__.py: top-level export
  tests/test_lora_fim.py: 23 unit tests

Relates to: huggingface#3203
Reference: LeCun et al., Optimal Brain Damage, NeurIPS 1990.

Signed-off-by: Ramakrishnan Sathyavageeswaran <ramkrishs@outlook.com>

BenjaminBossan · 2026-04-29T10:55:21Z

Thanks for providing this PR @ramkrishs. Is there any paper that shows that this initialization works well with LoRA? Did you run any of your own experiments? Usually, we don't add new methods to PEFT only on the theoretical assumption that they could work.

ramkrishs · 2026-04-29T17:32:15Z

Thank you for the feedback, Benjamin.

I'm currently running a structured comparison on GLUE (DeBERTaV3-base) and commonsense reasoning (LLaMA-3-8B) against LoRA, AdaLoRA, and EVA across rank budgets r ∈ {2, 4, 8, 16}. The experiment harness is set up and the first results should be ready within 2–3 weeks. This work is also being written up as a short paper — the closest prior work (AdaLoRA, ICLR 2023) shows that non-uniform rank allocation consistently outperforms fixed-rank LoRA, particularly at low budgets, and our hypothesis is that eFIM-based allocation is more directly tied to the fine-tuning objective than SVD-based signals.

I'll update this PR with the results table and a link to the arXiv preprint once the experiments are complete. Happy to keep this as a draft in the meantime — no action needed from your side until then.

BenjaminBossan · 2026-04-30T09:13:12Z

> I'll update this PR with the results table and a link to the arXiv preprint once the experiments are complete.

Thanks, then let's pick this PR up again at that point.

You could also check this init on the PEFT MetaMath benchmark, it'll probably just require a couple of lines of extra code.

ramkrishs mentioned this pull request May 2, 2026

feat(lora): add FIM-guided automatic LoRA rank allocation (fim_auto_rank) axolotl-ai-cloud/axolotl#3639

Draft

ramkrishs wants to merge 1 commit into huggingface:main from ramkrishs:feat/fim-adaptive-lora-rank
