News

Adobe is the second big tech company to add audio to its AI videos. But there are some key differences between Firefly and Google's Veo 3.
IMS Toucan is a toolkit for training, using, and teaching state-of-the-art Text-to-Speech Synthesis, developed at the Institute for Natural Language Processing (IMS), University of Stuttgart, Germany, ...
In recent years, speech diffusion models have advanced rapidly. Alongside the widely used U-Net architecture, transformer-based models such as the Diffusion Transformer (DiT) have also gained ...
Non-autoregressive Expressive TTS: This project aims to provide a cornerstone for future research on, and applications of, non-autoregressive expressive TTS, including emotional TTS and conversational TTS.