A new technique called consistency diffusion language models can reportedly make language model inference up to 14 times faster without sacrificing output quality. A speedup of that size could lower serving costs and cut response latency for AI-powered apps.
https://www.together.ai/blog/consistency-diffusion-language-models