[arXiv] This is an implementation of ColA: Collaborative Adaptation with Gradient Learning
- Illustration of the Fine-Tuning as a Service (FTaaS) system architecture.
See `requirements.txt` for the required packages.
- Use `verify.py` to verify the method and theory
- Experimental controls are configured in `config.yml`
- Use `make.sh` to generate run scripts with `make.py`
- Use `make.py` to generate experiment scripts in `scripts`
- Use `make_dataset.py` to prepare datasets (SAMSum has to be downloaded manually)
- Use `process.py` to process experimental results
- Experimental setups are listed in `make.py`
- Hyperparameters can be found in `process_control()` in `utils.py`
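
As a rough illustration of what generating experiment scripts can look like, the sketch below enumerates a small grid of settings and emits one training command per combination. It is not the repository's `make.py`: the function `make_commands` and the grid values are hypothetical, and the underscore-joined layout of `--control_name` is an assumption inferred from the example commands in this README. Not every combination it produces is guaranteed to be a setup the repository supports.

```python
from itertools import product


def make_commands(script, data_names, model_names, task_names, ft_names, batch_sizes):
    """Build one command per combination of settings (hypothetical helper)."""
    commands = []
    for parts in product(data_names, model_names, task_names, ft_names, batch_sizes):
        # Assumed layout: fields joined by underscores into a single control name.
        control_name = "_".join(parts)
        commands.append(f"python {script} --control_name {control_name}")
    return commands


# Illustrative grid only; the values are taken from the README's example names.
commands = make_commands(
    "train_peft.py",
    data_names=["fpb-sa", "wikisql"],
    model_names=["bart-base"],
    task_names=["s2s"],
    ft_names=["lora"],
    batch_sizes=["32"],
)
print("\n".join(commands))
```

Writing the resulting lines to a shell file would give a run script similar in spirit to what `make.sh` and `make.py` are described as producing.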
- Train full fine-tuning for the CoLA dataset (RoBERTa (base), Sequence Classification, $B=32$): `python train_model.py --control_name glue-cola_roberta-base_sc_full_32`
- Train LoRA for the FPB dataset (BART (base), Sequence to Sequence, $B=32$): `python train_peft.py --control_name fpb-sa_bart-base_s2s_lora_32`
- Train ColA (Low Rank) for the WikiSQL dataset (BART (base), Sequence to Sequence, $I=1$, $B=32$): `python train_cola.py --init_seed 0 --world_size 1 --num_experiment 1 --resume_mode 1 --control_name wikisql_bart-base_s2s_cola-lowrank-1_32`
- Test ColA (Low Rank-Linear) for the Dolly dataset (GPT-2, Causal Language Modeling, $I=1$, $B=32$, Collaboration): `python test_cola_dist.py --init_seed 0 --world_size 1 --num_experiment 1 --resume_mode 1 --control_name dolly-15k_gpt2_clm_cola-lowrank~linear-1_32_col`
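
The `--control_name` values in these commands appear to pack several settings into one underscore-separated string. The sketch below splits such a string into named fields. The field names (`data_name`, `model_name`, `task_name`, `ft_name`, `batch_size`, `collaboration`) and the helper `parse_control_name` are hypothetical and inferred only from the four examples in this README; the authoritative parsing lives in `process_control()` in `utils.py`.

```python
from typing import NamedTuple


class Control(NamedTuple):
    data_name: str       # e.g. "glue-cola", "dolly-15k"
    model_name: str      # e.g. "roberta-base", "gpt2"
    task_name: str       # e.g. "sc", "s2s", "clm"
    ft_name: str         # e.g. "full", "lora", "cola-lowrank-1"
    batch_size: int      # e.g. 32
    collaboration: bool  # True when a trailing "col" field is present


def parse_control_name(control_name: str) -> Control:
    """Split a control name into fields (assumed layout, not the repo's parser)."""
    parts = control_name.split("_")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(parts)}: {control_name!r}")
    if len(parts) == 6 and parts[5] != "col":
        raise ValueError(f"unexpected trailing field: {parts[5]!r}")
    return Control(parts[0], parts[1], parts[2], parts[3], int(parts[4]), len(parts) == 6)


full = parse_control_name("glue-cola_roberta-base_sc_full_32")
collab = parse_control_name("dolly-15k_gpt2_clm_cola-lowrank~linear-1_32_col")
print(full)
print(collab)
```

Under this reading, hyphens and `~` stay inside a field (for example `cola-lowrank~linear-1`), and only underscores separate fields.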
- Learning curves of (a) Linear, (b) MLP, and (c) CNN on the CIFAR10 dataset for the IC task with the Accuracy metric.

- Learning curves on the (a) MNLI, (b) SST-2, and (c) MRPC datasets for the SC task with the GLUE metric.

- Learning curves of (a) GPT-2 and (b) Llama-2 (Q, V) on the Dolly dataset for the CLM task with the ROUGE (Longest) metric.

Enmao Diao
Qi Le
Suya Wu
Xinran Wang
Ali Anwar
Jie Ding
Vahid Tarokh
