Stability AI and Nvidia to Enhance Stable Diffusion XL Performance

Stability AI and Nvidia have combined their expertise to enhance the performance of Stability AI's text-to-image product, Stable Diffusion XL (SDXL). The collaboration aims to improve the speed and efficiency of SDXL, aligning with Stability AI's goal of “unlocking human potential”.

Stable Diffusion is a text-to-image model based on diffusion techniques. It can generate detailed images from text prompts and modify existing images under text guidance, and it is relatively lightweight, running on consumer GPUs with as little as 8 GB of VRAM. Stable Diffusion XL is a recently launched version of the model.
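As a rough sanity check on that 8 GB figure (using an approximate parameter count, not an official one): the original Stable Diffusion UNet has on the order of 860 million parameters, so its half-precision weights alone occupy well under 2 GB.

```python
# Back-of-the-envelope VRAM estimate for Stable Diffusion's UNet weights.
# 860M parameters is an approximate, commonly cited figure (assumption).
params = 860_000_000
bytes_per_param_fp16 = 2                    # half precision
weight_bytes = params * bytes_per_param_fp16
weight_gib = weight_bytes / 2**30
print(f"fp16 UNet weights: ~{weight_gib:.1f} GiB")
```

The rest of the budget goes to the text encoder, VAE, and intermediate activations, which is why 8 GB of VRAM is a workable floor for consumer cards.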

Nvidia TensorRT Integration

Central to the expected performance boost is the integration of Nvidia's TensorRT, a high-performance inference optimization framework. Stability AI hosts the TensorRT versions of SDXL and makes the open ONNX weights available to SDXL users globally.

TensorRT is an SDK for high-performance deep learning inference that optimizes neural network models trained on all major frameworks and delivers low latency and high throughput on Nvidia GPUs. TensorRT is integrated with PyTorch, TensorFlow, ONNX, MATLAB, and application-specific SDKs such as Nvidia DeepStream, Nvidia Riva, Nvidia Merlin, Nvidia Maxine, Nvidia Morpheus, and Nvidia Broadcast Engine.
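A common way to turn ONNX weights like the ones Stability AI publishes into an optimized TensorRT engine is Nvidia's bundled `trtexec` tool. The file names below are placeholders, and flags beyond `--onnx`, `--saveEngine`, and `--fp16` vary by TensorRT version:

```shell
# Build an optimized TensorRT engine from an ONNX model (hypothetical file names).
# --fp16 enables half-precision kernels where they improve throughput.
trtexec --onnx=sdxl_unet.onnx \
        --saveEngine=sdxl_unet.plan \
        --fp16
```

The resulting `.plan` engine is specific to the GPU architecture and TensorRT version it was built with, so it is typically rebuilt per deployment target rather than shipped as a portable artifact.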

Impressive Performance Metrics

The integration results have been noteworthy: combining TensorRT with the converted ONNX model doubled performance on Nvidia H100 chips. Nvidia H100 GPUs feature a dedicated Transformer Engine that can speed up large models by 30X over the previous generation, enabling industry-leading conversational AI.

This enhancement allows a high-definition image to be generated in as little as 1.47 seconds. Stability AI's official announcement confirms the doubling of performance on Nvidia H100 chips after integrating TensorRT with the converted ONNX model.
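Taken together, the reported figures imply the following throughput change; the baseline time here is inferred from the 2x claim, not stated directly in the announcement:

```python
# Implied throughput from the reported numbers.
# 1.47 s/image after optimization; the 2x speedup implies ~2.94 s before
# (an inference from the article's figures, not an official baseline).
optimized_s = 1.47
speedup = 2.0
baseline_s = optimized_s * speedup        # ~2.94 s per image
images_per_min_before = 60 / baseline_s
images_per_min_after = 60 / optimized_s
print(f"{images_per_min_before:.1f} -> {images_per_min_after:.1f} images/minute")
```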

Stability AI has hinted at further optimizations, such as the introduction of 8-bit precision, which it expects to increase both speed and accessibility.