CIS6930-Distributed-Machine-Learning/anisotropic-dp-sgd

Anisotropic Noise Injection for DP-SGD

Reference implementation for the anisotropic DP-SGD proposal in anisotropic_dp_sgd.pdf.

This repo includes:

  • public subspace estimation from public gradients,
  • anisotropic Gaussian noise sampling with the privacy-floor constraint,
  • per-example gradient clipping and noisy averaging,
  • a runnable private linear regression example,
  • a PyTorch DP CIFAR trainer with public-subspace controls,
  • an experiment comparison script for seed sweeps,
  • tests for the core math and training utilities.

Setup

Create and activate a virtual environment, then install the dev requirements:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements-dev.txt

The CIFAR scripts expect data under data/. Create results/ if you want to save JSON outputs locally:

mkdir -p results

Quick Start

Run the small private linear regression demo:

source .venv/bin/activate
PYTHONPATH=. python examples/private_linear_regression.py

Run a CIFAR experiment from the command line:

source .venv/bin/activate
PYTHONPATH=. python experiments/train_cifar.py \
  --dataset cifar10 \
  --data-root data \
  --model small_cnn \
  --epochs 10 \
  --batch-size 512 \
  --eval-batch-size 256 \
  --physical-batch-size 16 \
  --learning-rate 0.5 \
  --noise-multiplier 1.0 \
  --delta 1e-5 \
  --alphas 0.0 \
  --public-fraction 0.1 \
  --seed 0 \
  --device cpu \
  --num-workers 0 \
  --output-json results/run.json

Pass --download only if the dataset is not already present under data/. On Apple Silicon, you can pass --device mps instead of --device cpu.

Recommended Workflows

Baseline run

For the isotropic baseline and the reasoning behind the shared settings, see docs/BASELINE_RUN.md.

Anisotropic run

For the anisotropic configuration and the projected utility-oriented variant, see docs/ANISOTROPIC_RUN.md.

Compare seed sweeps

Once you have matching baseline and anisotropic JSON outputs, compare them with the bundled script:

source .venv/bin/activate
PYTHONPATH=. python experiments/compare_cifar_runs.py \
  --baseline-glob 'results/baseline_*.json' \
  --anisotropic-glob 'results/anisotropic_*.json'

Project Layout

  • anisotropic_dp_sgd/subspace.py: public subspace estimation and projection
  • anisotropic_dp_sgd/mechanism.py: clipping and anisotropic Gaussian noise
  • anisotropic_dp_sgd/linear_regression.py: end-to-end training demo logic
  • anisotropic_dp_sgd/models.py: CIFAR-ready models
  • anisotropic_dp_sgd/data.py: CIFAR data loader construction
  • anisotropic_dp_sgd/privacy.py: Opacus RDP accountant wrapper
  • anisotropic_dp_sgd/trainer.py: PyTorch DP classifier training loop
  • examples/private_linear_regression.py: runnable example script
  • experiments/train_cifar.py: CIFAR experiment CLI
  • experiments/compare_cifar_runs.py: baseline-vs-anisotropic summary script
  • tests/: verification for the implementation
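As a rough picture of what "public subspace estimation and projection" can involve, here is a minimal NumPy sketch. It is not the code in anisotropic_dp_sgd/subspace.py: the function names estimate_public_subspace and project, the use of an uncentered SVD, and the synthetic gradients are all assumptions made for illustration.

```python
import numpy as np


def estimate_public_subspace(public_grads: np.ndarray, k: int) -> np.ndarray:
    """Return a (d, k) orthonormal basis for the top-k public gradient directions.

    public_grads has shape (n, d): one flattened public gradient per row.
    (Hypothetical helper, not the repository's API.)
    """
    # The right singular vectors of the gradient matrix are the principal
    # directions of the uncentered second-moment matrix G^T G.
    _, _, vt = np.linalg.svd(public_grads, full_matrices=False)
    return vt[:k].T


def project(U: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Project g onto span(U), assuming U has orthonormal columns."""
    return U @ (U.T @ g)


rng = np.random.default_rng(0)

# Synthetic "public gradients" that mostly live in a 3-dimensional subspace
# of a 20-dimensional parameter space, plus a little full-rank noise.
basis, _ = np.linalg.qr(rng.standard_normal((20, 3)))
public_grads = (
    rng.standard_normal((200, 3)) @ basis.T
    + 0.01 * rng.standard_normal((200, 20))
)

U = estimate_public_subspace(public_grads, k=3)

# A vector inside the true subspace should be almost unchanged by projection.
g = basis @ np.array([1.0, -2.0, 0.5])
residual = np.linalg.norm(g - project(U, g)) / np.linalg.norm(g)
print(f"relative projection residual: {residual:.4f}")
```

Because U is computed only from public gradients, it can be treated as fixed when the private noise is drawn, which is what the Core Idea section relies on.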

Core Idea

The mechanism uses the covariance

Sigma = sigma^2 * (C / B)^2 * [I + alpha * (I - U U^T)]

where:

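  • sigma is the noise multiplier, C is the per-example clipping norm, and B is the batch size,
  • alpha >= 0 controls how much extra noise is added outside the public subspace,
  • U is a fixed orthonormal basis estimated from public data,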
  • variance inside span(U) matches isotropic DP-SGD,
  • variance in the orthogonal complement is amplified by 1 + alpha.
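One way to draw noise with this covariance, sketched in NumPy under the assumption that alpha >= 0 and U has orthonormal columns: split a standard Gaussian draw into its components inside and outside span(U), and scale only the outside component by sqrt(1 + alpha). The function name sample_anisotropic_noise is hypothetical and is not taken from anisotropic_dp_sgd/mechanism.py.

```python
import numpy as np


def sample_anisotropic_noise(U, sigma, C, B, alpha, rng):
    """Draw one sample from N(0, Sigma) with
    Sigma = sigma^2 * (C / B)^2 * [I + alpha * (I - U U^T)].

    U is a (d, k) matrix with orthonormal columns; alpha must be >= 0.
    (Hypothetical helper, not the repository's API.)
    """
    d = U.shape[0]
    xi = rng.standard_normal(d)
    xi_par = U @ (U.T @ xi)   # component inside span(U)
    xi_perp = xi - xi_par     # component in the orthogonal complement
    scale = sigma * C / B
    return scale * (xi_par + np.sqrt(1.0 + alpha) * xi_perp)


rng = np.random.default_rng(0)
d, k = 6, 2
sigma, C, B, alpha = 1.0, 1.0, 4, 3.0

U, _ = np.linalg.qr(rng.standard_normal((d, k)))

# One unit direction inside span(U) and one in its orthogonal complement.
u_in = U[:, 0]
v = rng.standard_normal(d)
v -= U @ (U.T @ v)
u_out = v / np.linalg.norm(v)

samples = np.stack(
    [sample_anisotropic_noise(U, sigma, C, B, alpha, rng) for _ in range(20000)]
)
var_in = np.var(samples @ u_in)    # expected near sigma^2 (C/B)^2
var_out = np.var(samples @ u_out)  # expected near (1 + alpha) sigma^2 (C/B)^2
print(var_in, var_out)
```

This avoids forming the d x d covariance matrix, which matters when d is the number of model parameters: each draw costs two products with U.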

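Assuming alpha >= 0, the smallest directional variance of Sigma is sigma^2 * (C / B)^2, the same as in isotropic DP-SGD. Noise is only ever added relative to the isotropic baseline, never removed, so the privacy floor is unchanged.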
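The claim about the minimum directional variance can be checked numerically on a small example. The sketch below builds Sigma explicitly for illustrative values of d, k, sigma, C, B, and alpha (all chosen here, not taken from the repository) and inspects its eigenvalues.

```python
import numpy as np

# Illustrative values only; none of these come from the repository.
d, k = 8, 3
sigma, C, B, alpha = 1.0, 1.0, 4, 2.0

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal (d, k) basis

I = np.eye(d)
Sigma = sigma**2 * (C / B) ** 2 * (I + alpha * (I - U @ U.T))

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eigvals = np.linalg.eigvalsh(Sigma)
isotropic_var = sigma**2 * (C / B) ** 2

print("smallest eigenvalue:", eigvals[0])
print("isotropic variance: ", isotropic_var)
```

With alpha >= 0, the k smallest eigenvalues equal the isotropic variance (the directions in span(U)) and the remaining d - k equal (1 + alpha) times that value.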

Tests

Run the test suite with:

source .venv/bin/activate
PYTHONPATH=. pytest

Paper Notes

See docs/PAPER_EXPERIMENTS.md for the suggested paper narrative, main result tables, ablations, and reproduction commands.

About

Anisotropic Noise Injection for Improving Utility in Differentially Private SGD
