News

The second new model that Microsoft released today, Phi-4-multimodal, is an upgraded version of Phi-4-mini with 5.6 billion parameters. It can process not only text but also images, audio and video.
Phi-4-multimodal is a 5.6-billion-parameter model that uses the mixture-of-LoRAs technique to process speech, vision, and language simultaneously. LoRAs, or Low-Rank Adaptations, are a way of ...
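The snippet above cuts off before explaining the idea, so here is a minimal sketch of a single LoRA adapter in PyTorch. This illustrates the general Low-Rank Adaptation technique, not Microsoft's actual implementation; the class name `LoRALinear` and the rank and scaling values are assumptions chosen for the example.

```python
# Minimal LoRA sketch (assumed names/values, for illustration only).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B(A(x)), with A: d_in -> r and B: r -> d_out."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.lora_a = nn.Linear(base.in_features, r, bias=False)   # down-projection A
        self.lora_b = nn.Linear(r, base.out_features, bias=False)  # up-projection B
        nn.init.normal_(self.lora_a.weight, std=0.01)
        nn.init.zeros_(self.lora_b.weight)   # update starts at zero: no change at init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Usage: wrap an existing projection and train only the adapter weights.
layer = LoRALinear(nn.Linear(512, 512), r=8)
out = layer(torch.randn(2, 512))             # shape: (2, 512)
```

Broadly, a mixture-of-LoRAs design keeps the shared base model frozen and trains one such adapter per modality, so speech or vision inputs can be routed through the matching adapter without retraining the language backbone.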
As competition in the generative AI field shifts toward multimodal models, Meta has released a preview of what could be its answer to the models released by frontier labs. Chameleon, its new family ...
Today, we are launching Precious3GPT, the first multi-omics, multi-species, multi-tissue multimodal transformer model for aging research and drug discovery. It is trained on biomedical text data, and ...
Google has announced the full launch of its latest on-device AI model, Gemma 3n, which it first previewed in May 2025. The AI model brings advanced multimodal capabilities, including audio, image ...
Microsoft has unveiled Kosmos-1, which it describes as a multimodal large language model (MLLM) that can respond not only to language prompts but also to visual cues ...
The Multimodal transformer with Unified maSKed modeling, or MUSK for short, is trained on over 50 million histopathology images and one billion text tokens from clinical reports to predict cancer ...
Urban, A., et al. (2023) Precious1GPT: multimodal transformer-based transfer learning for aging clock development and feature importance analysis for aging and age-related disease target discovery.