Machine Learning Research
DeepSeek Sharpens Its Reasoning: DeepSeek-R1, an affordable rival to OpenAI’s o1
A new open model rivals OpenAI’s o1, and it’s free to use or modify.
Machine Learning Research
A new model from Hangzhou upstart DeepSeek delivers outstanding performance and may change the equation for training costs.
Machine Learning Research
Merging multiple fine-tuned models is a less expensive alternative to hosting multiple specialized models. But while model merging can deliver higher average performance across several tasks, it often lowers performance on specific tasks. New work addresses this issue.
Machine Learning Research
Harvard University amassed a huge new text corpus for training machine learning models.
Machine Learning Research
Large language models have been shown to be capable of lying when users unintentionally give them an incentive to do so. Further research shows that LLMs with access to tools can be incentivized to use them in deceptive ways.
Machine Learning Research
Anthropic analyzed 1 million anonymized conversations between users and Claude 3.5 Sonnet. The study found that most people used the model for software development and also revealed malfunctions and jailbreaks.
Machine Learning Research
In 2025, I expect progress in training foundation models to slow down as we hit scaling limits and inference costs continue to rise.
Machine Learning Research
For years, the best AI models got bigger and bigger. But in 2024, some popular large language models were small enough to run on a smartphone.
Business
Fierce competition among model makers and cloud providers drove down the price of access to state-of-the-art models.
Machine Learning Research
How do agents based on large language models compare to human experts when it comes to proposing machine learning research? Pretty well, according to one study.
Machine Learning Research
Google’s Gemini 2.0 Flash, the first member of its updated Gemini family of large multimodal models, combines speed with performance that exceeds that of its earlier flagship model, Gemini 1.5 Pro, on several measures.
Machine Learning Research
Microsoft updated its smallest model family with a single, surprisingly high-performance model.