News

As competition in the generative AI field shifts toward multimodal models, Meta has released a preview of what could be its answer to the models released by frontier labs. Chameleon, its new family ...
The second new model that Microsoft released today, Phi-4-multimodal, is a 5.6-billion-parameter upgrade of Phi-4-mini. It can process not only text but also images, audio, and video.
Microsoft has unveiled Kosmos-1, which it describes as a multimodal large language model (MLLM) that can respond not only to language prompts but also to visual cues ...
Alibaba has created an AI model called Multi-Modality to Multi-Modality Multitask Mega-transformer (M6). The model contains 10 billion parameters and is pretrained on a dataset consisting of 1.9TB of ...
The Multimodal transformer with Unified maSKed modeling, or MUSK for short, is trained on over 50 million histopathology images and one billion text tokens from clinical reports to predict cancer ...
The model is also integrated into Sora, OpenAI's video-generation platform, further expanding its multimodal capabilities. In an announcement on X, OpenAI confirmed that GPT-4o's image generation is ...
Microsoft has introduced a new AI model that, it says, can process speech, vision, and text locally on-device using less compute capacity than previous models. Innovation in generative artificial ...
Complex model architectures, demanding runtime computations, and transformer-specific operations introduce unique challenges.
Today, we are launching Precious3GPT, the first multi-omics, multi-species, multi-tissue multimodal transformer model for aging research and drug discovery. It is trained on biomedical text data, and ...