| 1 | +# Backend selection |
| 2 | + |
| 3 | +`stable-diffusion.cpp` has two backend assignments: |
| 4 | + |
| 5 | +- `--backend` selects the runtime backend used to execute model graphs. |
| 6 | +- `--params-backend` selects the backend used to allocate model parameters. |
| 7 | + |
| 8 | +If `--params-backend` is not set, each module's parameters are allocated on that module's runtime backend. |
| 9 | + |
| 10 | +## Syntax |
| 11 | + |
| 12 | +A backend assignment can be a single backend name: |
| 13 | + |
| 14 | +```shell |
| 15 | +sd-cli -m model.safetensors -p "a cat" --backend cpu |
| 16 | +``` |
| 17 | + |
| 18 | +This applies to every module that does not have a more specific assignment. |
| 19 | + |
| 20 | +Assignments can also target individual modules: |
| 21 | + |
| 22 | +```shell |
| 23 | +sd-cli -m model.safetensors -p "a cat" --backend te=cpu,vae=cuda0,diffusion=vulkan0 |
| 24 | +``` |
| 25 | + |
| 26 | +The same syntax is used for parameter placement: |
| 27 | + |
| 28 | +```shell |
| 29 | +sd-cli -m model.safetensors -p "a cat" --backend cuda0 --params-backend te=cpu,vae=cpu |
| 30 | +``` |
| 31 | + |
| 32 | +Module names are case-insensitive. Hyphens and underscores in module names are ignored, so `clip_vision`, `clip-vision`, and `clipvision` are equivalent. |
| 33 | + |
| 34 | +`all=`, `default=`, and `*=` can be used to set the default backend inside a mixed assignment: |
| 35 | + |
| 36 | +```shell |
| 37 | +sd-cli -m model.safetensors -p "a cat" --backend all=cuda0,te=cpu |
| 38 | +``` |
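
The precedence between a default and a per-module assignment can be sketched as a small shell function. This is an illustration only: `backend_for` is a hypothetical helper, not part of `sd-cli`. It assumes module names are already in canonical form (no alias handling) and that a bare backend name inside a list also acts as the default.

```shell
# Hypothetical helper: print the backend that SPEC assigns to MODULE.
# Runs in a subshell so IFS and "set -f" do not leak to the caller.
backend_for() (
  spec=$1 module=$2 default='' result=''
  set -f        # disable globbing so an entry like "*=cuda0" stays literal
  IFS=,         # split the spec on commas
  for entry in $spec; do
    case $entry in
      all=*|default=*|'*='*) default=${entry#*=} ;;  # default for all modules
      "$module="*)           result=${entry#*=} ;;   # specific assignment
      *=*)                   ;;                      # some other module
      *)                     default=$entry ;;       # bare backend name (assumed)
    esac
  done
  # A specific assignment wins over the default.
  printf '%s\n' "${result:-$default}"
)

backend_for 'all=cuda0,te=cpu' te    # prints: cpu
backend_for 'all=cuda0,te=cpu' vae   # prints: cuda0
```

When passing `*=` on a real command line, quote the argument (for example `--backend '*=cuda0,te=cpu'`) so the shell does not treat `*` as a glob pattern.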
| 39 | + |
| 40 | +## Modules |
| 41 | + |
| 42 | +| Module | Purpose | Accepted names | |
| 43 | +| --- | --- | --- | |
| 44 | +| `diffusion` | UNet, DiT, MMDiT, Flux, Wan, Qwen Image, and other diffusion models | `diffusion`, `model`, `unet`, `dit` | |
| 45 | +| `te` | Text encoders and conditioners | `te`, `clip`, `text`, `textencoder`, `textencoders`, `conditioner`, `cond`, `llm`, `t5`, `t5xxl` | |
| 46 | +| `clip_vision` | CLIP vision encoder | `clip_vision`, `clipvision`, `clip-vision`, `vision` | |
| 47 | +| `vae` | VAE and TAE | `vae`, `firststage`, `autoencoder`, `tae` | |
| 48 | +| `controlnet` | ControlNet | `controlnet`, `control` | |
| 49 | +| `photomaker` | PhotoMaker ID encoder and PhotoMaker LoRA | `photomaker`, `photomakerid`, `pmid`, `photo` | |
| 50 | +| `upscaler` | ESRGAN upscaler | `upscaler`, `esrgan`, `hires` | |
| 51 | + |
| 52 | +`te` is the preferred module name for text encoders. `clip` is kept as an accepted alias because many existing commands and model names use CLIP terminology. |
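
The alias table and the name-matching rules (case-insensitive, hyphens and underscores ignored) can be combined into one lookup. The sketch below is a hypothetical helper, not code from the project. The non-zero return for unknown names is an assumption, since this document does not say how `sd-cli` reports them.

```shell
# Hypothetical helper: map any accepted module name to its canonical name.
canonical_module() {
  # Lowercase, then drop hyphens and underscores.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '_-')
  case $name in
    diffusion|model|unet|dit)           echo diffusion ;;
    te|clip|text|textencoder|textencoders|conditioner|cond|llm|t5|t5xxl)
                                        echo te ;;
    clipvision|vision)                  echo clip_vision ;;
    vae|firststage|autoencoder|tae)     echo vae ;;
    controlnet|control)                 echo controlnet ;;
    photomaker|photomakerid|pmid|photo) echo photomaker ;;
    upscaler|esrgan|hires)              echo upscaler ;;
    *)                                  return 1 ;;  # unknown name (assumed behavior)
  esac
}

canonical_module 'Clip-Vision'   # prints: clip_vision
canonical_module 'T5XXL'         # prints: te
```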
| 53 | + |
| 54 | +## Backend names |
| 55 | + |
| 56 | +Backend names are resolved against the GGML backend device list. Matching is case-insensitive and accepts either an exact name or a unique prefix. Common values include: |
| 57 | + |
| 58 | +- `cpu` |
| 59 | +- `cuda0` |
| 60 | +- `vulkan0` |
| 61 | +- `metal` |
| 62 | + |
| 63 | +The special values `auto`, `default`, and an empty backend name select the default backend. The default preference is GPU, then integrated GPU, then CPU. |
| 64 | + |
| 65 | +The special value `gpu` selects the first GPU backend, falling back to the first integrated GPU backend. |
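
As a rough illustration of exact-or-unique-prefix matching, the following sketch resolves a requested name against a list of device names passed as arguments. `resolve_backend` and the device names in the example are hypothetical, because the real list comes from GGML at run time. The sketch assumes an exact match takes priority over prefix matches and does not handle the special values `auto`, `default`, or `gpu`.

```shell
# Hypothetical helper: resolve_backend NAME DEVICE...
# Prints the matching device, or returns non-zero if the name
# matches nothing or is an ambiguous prefix.
resolve_backend() (
  want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  shift
  match='' count=0
  for dev in "$@"; do
    name=$(printf '%s' "$dev" | tr '[:upper:]' '[:lower:]')
    if [ "$name" = "$want" ]; then
      printf '%s\n' "$dev"
      exit 0                      # exact match wins (assumed priority)
    fi
    case $name in
      "$want"*) match=$dev; count=$((count + 1)) ;;  # prefix candidate
    esac
  done
  # Accept a prefix only when exactly one device matched it.
  [ "$count" -eq 1 ] && printf '%s\n' "$match"
)

resolve_backend vul  CUDA0 CUDA1 Vulkan0 CPU   # prints: Vulkan0
resolve_backend cuda CUDA0 CUDA1 Vulkan0 CPU || echo "ambiguous or unknown"
```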
| 66 | + |
| 67 | +## Runtime backend vs. parameter backend |
| 68 | + |
| 69 | +The runtime backend controls where graph execution runs. The parameter backend controls where model weights are allocated. |
| 70 | + |
| 71 | +For example: |
| 72 | + |
| 73 | +```shell |
| 74 | +sd-cli -m model.safetensors -p "a cat" --backend cuda0 --params-backend cpu |
| 75 | +``` |
| 76 | + |
| 77 | +This runs all modules on `cuda0`, but stores parameters in CPU RAM. During execution, parameters are moved to the runtime backend as needed. |
| 78 | + |
| 79 | +Per-module assignments can be mixed: |
| 80 | + |
| 81 | +```shell |
| 82 | +sd-cli -m model.safetensors -p "a cat" --backend diffusion=cuda0,te=cpu,vae=cpu --params-backend diffusion=cuda0,te=cpu,vae=cpu |
| 83 | +``` |
| 84 | + |
| 85 | +This runs text encoding and the VAE on CPU and the diffusion model on `cuda0`, with each module's parameters allocated on the same device the module runs on. |
| 86 | + |
| 87 | +## Backend sharing and lifetime |
| 88 | + |
| 89 | +Backends are managed by `SDBackendManager`. |
| 90 | + |
| 91 | +Within one manager, backend instances are cached by resolved backend device name. If multiple modules request the same backend, they share the same `ggml_backend_t`. |
| 92 | + |
| 93 | +For example: |
| 94 | + |
| 95 | +```shell |
| 96 | +--backend te=cpu,vae=cpu |
| 97 | +``` |
| 98 | + |
| 99 | +uses one shared CPU backend for both `te` and `vae` runtime execution. |
| 100 | + |
| 101 | +Runtime and parameter assignments also share the same backend cache. If `--backend diffusion=cuda0` and `--params-backend diffusion=cuda0` resolve to the same device, both use the same backend instance. |
| 102 | + |
| 103 | +`SDBackendManager` owns the backend instances and frees them when the context or upscaler is destroyed. Model runners receive non-owning runtime and parameter backend pointers and do not free them. |
| 104 | + |
| 105 | +## Compatibility flags |
| 106 | + |
| 107 | +The older CPU placement flags are still supported: |
| 108 | + |
| 109 | +- `--clip-on-cpu` |
| 110 | +- `--vae-on-cpu` |
| 111 | +- `--control-net-cpu` |
| 112 | +- `--offload-to-cpu` |
| 113 | + |
| 114 | +`--clip-on-cpu`, `--vae-on-cpu`, and `--control-net-cpu` affect runtime backend assignment only when `--backend` is not set. They map to `te=cpu`, `vae=cpu`, and `controlnet=cpu`. |
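
When migrating old commands, the mapping for the three runtime flags can be written down as a small translation function. `legacy_to_backend` is a hypothetical helper for illustration, not something `sd-cli` provides. It only builds the equivalent `--backend` value and does not model the rule that the legacy flags are ignored once `--backend` is set.

```shell
# Hypothetical helper: turn legacy CPU placement flags into a --backend value.
legacy_to_backend() (
  out=''
  for flag in "$@"; do
    case $flag in
      --clip-on-cpu)     item='te=cpu' ;;
      --vae-on-cpu)      item='vae=cpu' ;;
      --control-net-cpu) item='controlnet=cpu' ;;
      *)                 continue ;;          # not a legacy runtime flag
    esac
    out="${out:+$out,}$item"                  # join entries with commas
  done
  printf '%s\n' "$out"
)

legacy_to_backend --clip-on-cpu --vae-on-cpu   # prints: te=cpu,vae=cpu
```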
| 115 | + |
| 116 | +`--offload-to-cpu` affects parameter backend assignment only when `--params-backend` is not set. It is equivalent to: |
| 117 | + |
| 118 | +```shell |
| 119 | +--params-backend cpu |
| 120 | +``` |
| 121 | + |
| 122 | +Explicit `--backend` and `--params-backend` assignments are preferred for new commands. |