
Business
Benchmarking Costs Climb: Reasoning LLMs Are Pricey to Test
An independent AI test lab detailed the rising cost of benchmarking reasoning models.
Machine Learning Research
Using an 8-bit number format like FP8 during training saves computation compared to 16- or 32-bit formats, but it can yield less-accurate results. Researchers trained models using 4-bit numbers without sacrificing accuracy.
Machine Learning Research
Large language models excel at processing text but can’t interpret images, video, or audio directly without further training on those media types. Researchers devised a way to overcome this limitation.
Machine Learning Research
OpenAI refreshed its roster of models and scheduled the largest, most costly one for removal.
Machine Learning Research
Google’s new flagship model raised the state of the art in a variety of subjective and objective tests.
Machine Learning Research
If you have a collection of variables that describe, say, a medical patient and you want to classify the patient’s illness as likely cancer or not, algorithms based on decision trees, such as gradient-boosted trees, typically perform better than neural networks.
Machine Learning Research
Anthropic’s Claude 3.7 Sonnet implements a hybrid reasoning approach that lets users decide how much thinking they want the model to do before it renders a response.
Machine Learning Research
OpenAI launched GPT-4.5, which may be its last non-reasoning model.
Machine Learning Research
Merging multiple fine-tuned models is a less expensive alternative to hosting multiple specialized models. But while model merging can deliver higher average performance across several tasks, it often results in lower performance on specific tasks. New work addresses this issue.
Machine Learning Research
OpenAI launched not only its highly anticipated o1 model but also an operating mode that enables the model to deliver higher performance — at a hefty price.
Machine Learning Research
Coding agents are improving, but can they tackle machine learning tasks?
Tech & Society
A new study suggests that leading AI models may meet the requirements of the European Union’s AI Act in some areas, but probably not in others.