🌐 SPHERE

ICLR 2026: Energy-Regularized Sequential Model Editing on Hyperspheres


📦 Installation · 🚀 Quick Start · 🌐 Project Page · ✏️ EasyEdit · 📄 Paper · 📊 Slides · 🎬 Video


SPHERE Overview

Figure: (a) A weight matrix is viewed as a set of neurons (red dots) on a hypersphere. (b) Current SOTA methods introduce perturbations (blue triangles) that interfere with the principal hyperspherical directions of pre-edit weights. (c) SPHERE projects new knowledge onto a sparse space complementary to the principal hyperspherical directions.
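The projection in panel (c) can be sketched in a few lines. The following is a minimal NumPy illustration of suppressing an update along the principal hyperspherical directions, not the repository's implementation; the function name, its arguments, and the eigendecomposition route are assumptions for illustration only:

```python
import numpy as np

def suppress_principal_directions(delta, cov, beta_hse=0.5, alpha=0.5):
    """Illustrative sketch: remove part of a weight update `delta` along
    the top principal directions of the pre-edit statistics `cov`.

    beta_hse: fraction of spectral energy whose directions are suppressed.
    alpha:    suppression strength in [0, 1]; 1 removes the component fully.
    """
    if beta_hse <= 0 or alpha == 0:
        return delta  # nothing suppressed: plain baseline update

    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]              # sort by descending energy
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Smallest set of directions covering beta_hse of the total energy.
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cum, beta_hse) + 1)
    U = eigvecs[:, :k]                              # principal directions

    # Partially remove the component of delta lying in span(U).
    return delta - alpha * U @ (U.T @ delta)
```

With `alpha=1` the update becomes exactly orthogonal to the selected principal directions; intermediate values trade off interference against edit strength.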

📰 News

  • 🔥 [2026.03] We release the pre-computed cov matrices for quick reproduction. See Download.
  • 🔥 [2026.02] SPHERE is supported in EasyEdit.
  • 🎉 [2026.01] SPHERE is accepted by ICLR 2026 (Score: 8884, Top-1.1% in Transfer/Meta/Lifelong Learning track).
  • 🚀 [2025.09] SPHERE is released.

📦 Installation

```shell
pip install torch==1.12.1
pip install einops==0.4.0 higher==0.2.1 hydra-core==1.2.0
pip install transformers==4.30.1 datasets==1.18.3
pip install matplotlib==3.6.1 spacy==3.4.1
pip install scipy==1.9.2 scikit-learn==1.0.2 nltk==3.7
```
📋 Full dependency list

| Package | Version |
| --- | --- |
| pytorch | 1.12.1 |
| einops | 0.4.0 |
| higher | 0.2.1 |
| hydra-core | 1.2.0 |
| transformers | 4.30.1 |
| datasets | 1.18.3 |
| matplotlib | 3.6.1 |
| spacy | 3.4.1 |
| scipy | 1.9.2 |
| scikit-learn | 1.0.2 |
| nltk | 3.7 |

📥 Download

We provide pre-computed covariance (cov) matrices for both Llama3-8B-Instruct and Qwen2.5-7B-Instruct via Google Drive.

After downloading, decompress the file and place it under the ./data/stats directory.
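If you prefer to compute the statistic yourself rather than download it, MEMIT-style editors precompute an uncentered second-moment ("cov") matrix of layer key activations. The sketch below shows that computation under this assumption; the actual file format and sampling procedure under ./data/stats may differ:

```python
import numpy as np

def key_second_moment(keys):
    """Uncentered second-moment statistic of layer key activations,
    the per-layer quantity MEMIT-style editors precompute.

    keys: (n, d) array of key vectors sampled from the layer input.
    Returns the (d, d) matrix K^T K / n.
    """
    keys = np.asarray(keys, dtype=np.float64)
    return keys.T @ keys / keys.shape[0]
```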

🚀 Quick Start

Example: Editing Qwen2.5 (7B) on the CounterFact dataset using SPHERE

Step 1: Edit the Model

```shell
python3 -m experiments.evaluate \
    --alg_name=AlphaEdit \
    --model_name=./Qwen2.5-7B-Instruct \
    --hparams_fname=Qwen2.5-7B.json \
    --ds_name=mcf \
    --dataset_size_limit=5000 \
    --num_edits=100 \
    --beta_hse=0.5 \
    --alpha=0.5
```
🔧 Argument details

| Argument | Description |
| --- | --- |
| `--alg_name` | Algorithm name (e.g., `AlphaEdit`) |
| `--model_name` | Path to the model (e.g., `./Qwen2.5-7B-Instruct`) |
| `--hparams_fname` | Hyperparameter JSON file (e.g., `Qwen2.5-7B.json`) |
| `--ds_name` | Dataset name (e.g., `mcf`) |
| `--dataset_size_limit` | Total number of editing samples |
| `--num_edits` | Batch size for each round of editing |
| `--beta_hse` | Cumulative Ratio — top percentage of principal directions to suppress (e.g., 0.5 = top 50%) |
| `--alpha` | Suppression Strength — controls extent of perturbation removal along principal directions |

Tip

  • To run the baseline, set `--beta_hse=0`.
  • To apply SPHERE on top of MEMIT / PRUNE / RECT, set `--beta_hse=0.5` and `--alpha=0.8` to reproduce the paper's results.

The edited weights from each run are stored as:

```
📂 Edited_Weight/
└── 📂 <alg_name>/
    └── 📂 <model_name>/
        ├── 📁 <dataset>_weight_data_batch_<batch_size>_<beta_hse>_<alpha>/
        ├── 📁 <dataset>_weight_data_batch_<batch_size>_<beta_hse>_<alpha>/
        └── ...
```
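When scripting over many runs, the naming pattern above can be reproduced with a small helper. This is a hypothetical convenience function, not part of the repository:

```python
from pathlib import Path

def run_folder(alg_name, model_name, dataset, batch_size, beta_hse, alpha):
    """Build the edited-weight folder path following the layout above."""
    name = f"{dataset}_weight_data_batch_{batch_size}_{beta_hse}_{alpha}"
    return Path("Edited_Weight") / alg_name / model_name / name
```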

Step 2: Editing Evaluation

```shell
python3 -m scripts.evaluate_each_epoch \
    --model_name=./Qwen2.5-7B-Instruct \
    --weight_folder=./Edited_Weight/<alg_name>/<model_name>/<dataset>_weight_data_batch_<batch_size>_<beta_hse>_<alpha>/ \
    --ds_name=mcf \
    --dataset_size_limit=5000 \
    --generation_test_interval=100
```
🔧 Argument details

| Argument | Description |
| --- | --- |
| `--model_name` | Path to the model being evaluated |
| `--weight_folder` | Path to saved weights from previous editing |
| `--ds_name` | Dataset name (e.g., `mcf`) |
| `--dataset_size_limit` | Total number of evaluation samples |
| `--generation_test_interval` | Run test generation every N evaluation rounds |

📊 Results are saved to: ./Edited_Weight/<alg_name>/<model_name>/<dataset>_weight_data_batch_<...>/summary/summary.json
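To collect metrics across runs, the saved summary can be read with a few lines of standard-library Python. This is a hypothetical helper assuming only the path layout above; the metric keys inside summary.json depend on the evaluation script:

```python
import json
from pathlib import Path

def load_summary(run_dir):
    """Read summary/summary.json from an edited-weight run folder and
    return it as a dict of metrics."""
    path = Path(run_dir) / "summary" / "summary.json"
    with path.open() as f:
        return json.load(f)
```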

Step 3: Downstream Tasks Evaluation

```shell
python3 -m scripts.evaluate_each_epoch \
    --model_name=./Qwen2.5-7B-Instruct \
    --weight_folder=./Edited_Weight/<alg_name>/<model_name>/<dataset>_weight_data_batch_<batch_size>_<beta_hse>_<alpha>/
```

📊 Results are saved to: ./Edited_Weight/<alg_name>/<model_name>/<dataset>_weight_data_batch_<...>/rect_eval/

📝 Citation

If you find this work useful, please cite our paper:

```bibtex
@inproceedings{liu2026energy,
  title     = {Energy-Regularized Sequential Model Editing on Hyperspheres},
  author    = {Liu, Qingyuan and Gu, Jia-Chen and Yao, Yunzhi and Wang, Hong and Peng, Nanyun},
  booktitle = {The Fourteenth International Conference on Learning Representations},
  year      = {2026}
}
```

🙏 Acknowledgment & Contact

Our code is built upon MEMIT, EMMET, and AlphaEdit. If you have any questions, feel free to reach out at ql2505(at)columbia.edu.
