News

Stability AI’s new Stable Audio platform comprises not one but three neural networks. Its core component is a U-Net-based latent diffusion model with 907 million parameters.
Stable Audio’s model architecture consists of a variational autoencoder (VAE), a text encoder, and a U-Net-based conditioned diffusion model.
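As a rough illustration of how those three components might fit together, here is a toy sketch of the data flow: the text encoder turns a prompt into a conditioning vector, the diffusion model repeatedly refines a noisy latent using that conditioning, and the VAE decoder expands the final latent into output samples. Every function name, parameter, and size here (`encode_text`, `denoise_step`, `vae_decode`, `generate`, `dim`, `upsample`) is hypothetical, and the stand-in math is not how Stable Audio's real networks work; it only mirrors the order of operations.

```python
import random

def encode_text(prompt, dim=4):
    """Stand-in text encoder: maps a prompt to a fixed-size conditioning vector."""
    rng = random.Random(prompt)  # seeded by the prompt, so the output is repeatable
    return [rng.uniform(-1, 1) for _ in range(dim)]

def denoise_step(latent, cond, strength=0.5):
    """Stand-in for one conditioned U-Net step (toy rule, not real denoising):
    moves each latent value part of the way toward the conditioning value."""
    return [z + strength * (c - z) for z, c in zip(latent, cond)]

def vae_decode(latent, upsample=8):
    """Stand-in VAE decoder: repeats each latent value `upsample` times."""
    return [z for z in latent for _ in range(upsample)]

def generate(prompt, steps=10, dim=4, seed=0):
    """Hypothetical pipeline: encode text, refine a noisy latent, decode."""
    cond = encode_text(prompt, dim)
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in range(dim)]  # start from random noise
    for _ in range(steps):
        latent = denoise_step(latent, cond)
    return vae_decode(latent)

audio = generate("ambient piano")
print(len(audio))  # 4 latent values expanded 8x each
```

The point of the sketch is that the diffusion loop runs in the small latent space, and only the last step (the decoder) produces full-length output.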
Image caption: Cute AI critters generated by the author using Stable Diffusion on his PC.
For comparison's sake, a GeForce RTX 2060 card can draw as much as 200 watts to do the same task in only about half the time.
The VAE used for Stable Diffusion 1.x/2.x and other models (KL-F8) has a critical flaw, probably due to bad training, that is holding back all models that use it (almost certainly including DALL-E ...