
Machine Learning Research
Compact Reasoning: QwQ-32B challenges DeepSeek-R1 and other larger reasoning models
Most models that have learned to reason via reinforcement learning were huge models. A much smaller model now competes with them.
Machine Learning Research
Anthropic’s Claude 3.7 Sonnet implements a hybrid reasoning approach that lets users decide how much thinking they want the model to do before it renders a response.
Machine Learning Research
OpenAI launched GPT-4.5, which may be its last non-reasoning model.
Machine Learning Research
Typical large language models are autoregressive, predicting the next token, one at a time, from left to right. A new model hones all text tokens at once.
Machine Learning Research
Large language models can improve their performance by generating a chain of thought (CoT): intermediate text tokens that break down the process of responding to a prompt into a series of steps.
Business
Elon Musk and a group of investors made an unsolicited bid to buy the assets of the nonprofit that controls OpenAI, complicating the AI powerhouse’s future plans.
Machine Learning Research
Replit, an AI-driven integrated development environment, updated its mobile app to generate further mobile apps to order.
Machine Learning Research
xAI’s new model family suggests that devoting more computation to training remains a viable path to building more capable AI.
Machine Learning Research
While Hangzhou’s DeepSeek flexed its muscles, Chinese tech giant Alibaba vied for the spotlight with new open vision-language models.
Machine Learning Research
OpenAI introduced a state-of-the-art agent that produces research reports by scouring the web and reasoning over what it finds.
Machine Learning Research
Google updated the December-vintage reasoning model Gemini 2.0 Flash Thinking and other Flash models, gaining ground on OpenAI o1 and DeepSeek-R1.
Machine Learning Research
As Anthropic, Google, OpenAI, and others roll out agents that are capable of computer use, new work shows how underlying models can be trained to do this.