
Machine Learning Research
Better Video, Fewer Tokens: STORM Processes Fewer Tokens And Still Beats GPT-4o On Video Understanding Benchmarks
Researchers reduced the number of tokens needed to represent video frames to be fed to a transformer.
Business
Renowned investment analyst Mary Meeker is back with a report on the AI market, six years after publishing her last survey of the internet.
Machine Learning Research
DeepSeek updated its groundbreaking DeepSeek-R1 large language model to strike another blow for open-weights performance.
Machine Learning Research
DeepSeek made headlines late last year when it built a state-of-the-art, open-weights large language model at a cost far lower than usual. The upstart developer shared new details about its method.
Machine Learning Research
Anthropic continued its tradition of building AI models that raise the bar in coding tasks.
Machine Learning Research
Using an 8-bit number format like FP8 during training saves computation compared to 16- or 32-bit formats, but it can yield less-accurate results. Researchers trained models using 4-bit numbers without sacrificing accuracy.
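To illustrate why low-bit formats can hurt accuracy, here is a minimal sketch of symmetric 4-bit quantization in NumPy. The function names and the per-tensor scaling scheme are illustrative assumptions, not the researchers' actual method; signed 4-bit integers span [-8, 7], so each rounded weight can deviate from the original by up to half the scale factor.

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Illustrative symmetric per-tensor 4-bit quantization.

    Maps the largest-magnitude weight onto the signed 4-bit
    range [-8, 7] and rounds everything else to that grid.
    """
    scale = np.abs(weights).max() / 7.0  # one float scale per tensor
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from 4-bit codes."""
    return q.astype(np.float32) * scale

w = np.array([0.9, -0.35, 0.02, -0.7], dtype=np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
# Rounding error per weight is bounded by s / 2, which is the
# accuracy cost that low-bit training recipes must compensate for.
```

The bound on per-weight error (half the scale) is what grows as the bit width shrinks, which is why going from 8-bit to 4-bit formats without losing accuracy is notable.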
Machine Learning Research
OpenAI launched an agentic software-development system.
Machine Learning Research
Improving a large language model’s factual accuracy typically requires making it bigger, which, in turn, requires more computation. Researchers devised an architecture that enables models to recall relevant details without significantly increasing the amount of computation required.
Machine Learning Research
An open-source code generator performs comparably to the reasoning models DeepSeek-R1 and OpenAI o1 with a much smaller model.
Machine Learning Research
Microsoft published its latest recipe for training reasoning models, substantially expanding what is still a fairly small base of public knowledge.
Machine Learning Research
Researchers showed that supervised fine-tuning on as few as 1,000 examples can enable a pretrained large language model to reason — and a clever gambit can boost its performance to rival that of top reasoning models.
Machine Learning Research
Alibaba’s new model family may unseat DeepSeek-R1’s four-month reign as the top open-weights large language model.