Offload weights to the CPU to save VRAM without reducing generation speed.

Pass --offload-to-cpu to offload weights to the CPU, which saves VRAM without reducing generation speed.

Use quantization to reduce memory usage.

quantization
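
To give a rough sense of why quantization reduces memory usage, here is a minimal sketch that estimates the memory taken by the weights alone at different bit widths. The function name `weight_memory_gib`, the 1-billion-parameter model size, and the listed bit widths are assumptions chosen for illustration, not values from this project. Real quantized formats also store per-block scale data, so actual sizes will be somewhat larger than this estimate.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB.

    Ignores activations, caches, and any per-block scale metadata
    that real quantization formats add on top of the raw weights.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1024**3


if __name__ == "__main__":
    # Hypothetical model size, used only for illustration.
    params = 1_000_000_000

    # Illustrative bit widths; actual formats vary by project.
    for label, bits in [("32-bit", 32), ("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label:>6}: {weight_memory_gib(params, bits):.2f} GiB")
```

Halving the number of bits per weight roughly halves the memory required for the weights, which is the effect the tip above relies on.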