News

The second new model that Microsoft released today, Phi-4-multimodal, is an upgraded version of Phi-4-mini with 5.6 billion parameters. It can process not only text but also images, audio and video.
Phi-4-multimodal is a 5.6-billion-parameter model that uses the mixture-of-LoRAs technique to process speech, vision, and language simultaneously. LoRAs, or Low-Rank Adaptations, are a way of ...
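The snippet above cuts off before explaining the idea, so here is a minimal sketch of a single LoRA adapter in PyTorch. This illustrates the general Low-Rank Adaptation technique, not Microsoft's actual implementation; the class name `LoRALinear` and the rank and scaling values are assumptions chosen for the example.

```python
# Minimal LoRA sketch (assumed names/values, for illustration only).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B(A(x)), with A: d_in -> r and B: r -> d_out."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.lora_a = nn.Linear(base.in_features, r, bias=False)   # down-projection A
        self.lora_b = nn.Linear(r, base.out_features, bias=False)  # up-projection B
        nn.init.normal_(self.lora_a.weight, std=0.01)
        nn.init.zeros_(self.lora_b.weight)   # update starts at zero: no change at init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Usage: wrap an existing projection and train only the adapter weights.
layer = LoRALinear(nn.Linear(512, 512), r=8)
out = layer(torch.randn(2, 512))             # shape: (2, 512)
```

Broadly, a mixture-of-LoRAs design keeps the shared base model frozen and trains one such adapter per modality, so speech or vision inputs can be routed through the matching adapter without retraining the language backbone.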
As competition in the generative AI field shifts toward multimodal models, Meta has released a preview of what could be its answer to the models released by frontier labs. Chameleon, its new family ...
Today, we are launching Precious3GPT, the first multi-omics, multi-species, multi-tissue multimodal transformer model for aging research and drug discovery. It is trained on biomedical text data, and ...
Google has announced the full launch of its latest on-device AI model, Gemma 3n, which it first previewed in May 2025. The AI model brings advanced multimodal capabilities, including audio, image ...
Microsoft has unveiled Kosmos-1, which it describes as a multimodal large language model (MLLM) that can respond not only to language prompts but also to visual cues ...
The Multimodal transformer with Unified maSKed modeling, or MUSK for short, is trained on over 50 million histopathology images and one billion text tokens from clinical reports to predict cancer ...
Urban, A., et al. (2023) Precious1GPT: multimodal transformer-based transfer learning for aging clock development and feature importance analysis for aging and age-related disease target discovery.