Use --offload-to-cpu to offload weights to the CPU, saving VRAM without reducing generation speed. Use quantization to reduce memory usage.
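
As a rough illustration of why quantization lowers memory usage, the sketch below estimates the memory taken by the weights alone at different bit widths. The function name weight_memory_gib and the 7-billion-parameter model size are hypothetical, chosen only for the example. The estimate ignores activations, caches, and any per-block scale data a real quantization format stores, so actual usage will be somewhat higher.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GiB.

    Assumes every weight is stored at the same bit width and ignores
    quantization metadata (scales, zero points), activations, and caches.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1024**3


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical model size, for illustration only
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gib(num_params, bits):.1f} GiB")
```

Halving the bit width halves the weight footprint, which is the main saving quantization offers before any offloading is applied.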