Machine Learning Research
Better Multimodal Performance With Open Weights: Qwen2.5-Omni 7B raises the bar for small multimodal models
Alibaba’s latest open-weights system raises the bar for multimodal tasks in a relatively small model.
Machine Learning Research
Meta updated its popular open-weights models, claiming performance superior to closed competitors in three size classes.
Machine Learning Research
Even without explicit training in reasoning, large language models “think” in ways that may be more deliberate than previously understood.
Machine Learning Research
AI systems designed to generate animated 3D scenes that include active human characters have been limited by a shortage of training data, such as matched 3D scenes and human motion-capture examples. Generated video clips can get the job done without motion capture.
Machine Learning Research
Researchers updated the highly responsive Moshi voice-to-voice model to discuss visual input.
Machine Learning Research
Diffusion transformers learn faster when they can look at embeddings generated by a pretrained model like DINOv2.
Machine Learning Research
Diffusion models usually take many noise-removal steps to produce an image, which slows inference. There are ways to reduce the number of steps, but the resulting systems are less effective. Researchers devised a streamlined approach that doesn’t sacrifice output quality.
Machine Learning Research
Google updated its open-weights family of large language models to include versions that handle image and video inputs.
Science
Materials that have specific properties are essential to progress in critical technologies like solar cells and batteries. A machine learning model designs new materials to order.
Machine Learning Research
An AI agent synthesizes novel scientific research hypotheses. It’s already making an impact in biomedicine.
Machine Learning Research
Multilingual AI models often suffer uneven performance across languages, especially in multimodal tasks. A pair of lean models counters this trend with consistent understanding of text and images across major languages.
Tech & Society
Large language models built by developers in China may, in some applications, be less useful outside that country because they avoid topics its government deems politically sensitive. A developer fine-tuned DeepSeek-R1 to widen its scope without degrading its overall performance.