Skip to content

CodeKunalTomar/fa-ucb

Repository files navigation

Bandits with Untrusted Change-Time Forecasts

Code for the experiments in Bandits with Untrusted Change-Time Forecasts (submitted to ACM Transactions on Algorithms, 2026).

FA-UCB is a UCB-style algorithm for piecewise-stationary multi-armed bandits that receives untrusted predictions of when the reward distributions change. Each prediction is verified by a paired statistical probe (fresh samples taken just before and just after the predicted time); the learner restarts only on a confirmed change. A geometric distrust gate bounds the probing cost of junk predictions logarithmically.

Layout

Script Paper artifact Runtime*
exp_main.py horizon-scaling figure (M-UCB series) + adversarial figure ~25 min
exp_glr_and_gate.py horizon-scaling figure (GLR series, T ≤ 200k) + gate figure ~50 min
exp_glr_400k.py horizon-scaling figure (GLR point at T = 400k) ~45 min
exp_gate.py gate figure (overlapping-prediction schedule) ~4 min
exp_easy_regime.py negative-scope table (easy regime) + asymmetry sweep ~5 min
exp_dense_blindspot.py negative-scope table (dense blind-spot regime) ~12 min
make_figures.py renders figures/*.pdf from the numbers reported by the scripts seconds

*Measured on a laptop CPU (Intel i5-1235U, single process). No GPU is used.

Reproducing

pip install -r requirements.txt
python exp_main.py            # writes results/main.txt
python exp_glr_and_gate.py    # writes results/glr_and_gate.txt
python exp_glr_400k.py        # writes results/glr_400k.txt
python exp_easy_regime.py
python exp_dense_blindspot.py
python exp_gate.py
python make_figures.py        # writes figures/*.pdf

All randomness is seeded (environment seeds and algorithm seeds are separate; matched seeds give paired comparisons across algorithms). Each script prints its summary table and writes it under results/.

Notes

  • The GLR baseline follows the recipe of Besson, Kaufmann, Maillard and Seznec (arXiv:1902.01575) with three deviations for CPU feasibility, documented in the script header and in the paper.
  • The theoretical analysis uses a horizon-aware exploration bonus; the experiments run standard UCB1. This is disclosed in the paper.

License

MIT (see LICENSE).

About

Experiments for 'Bandits with Untrusted Change-Time Forecasts'

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages