1 articles with this tag
Nvidia's Ziv Ilan explains how to reduce diffusion model latency using quantization, caching, and distillation, plus the new FastGen library.