
Business
Benchmarking Costs Climb: Reasoning LLMs Are Pricey to Test
An independent AI test lab detailed the rising cost of benchmarking reasoning models.
The Batch Newsletter
The Batch AI News and Insights: There’s a new breed of GenAI Application Engineers who can build more-powerful applications faster than was possible before, thanks to generative AI.
Letters
There’s a new breed of GenAI Application Engineers who can build more-powerful applications faster than was possible before, thanks to generative AI.
Data Points
Mistral’s new integrated, fine-tunable coding tool. ChatGPT’s connectors, both official and DIY. Nvidia’s new small, open OCR and document analysis model. Reddit’s lawsuit against Anthropic over alleged scraping.
Machine Learning Research
Researchers identified a simple way to mislead autonomous agents based on large language models.
Tech & Society
AI’s thirst for energy is growing, but the technology also could help produce huge energy savings over the next five to 10 years, according to a recent report.
Business
AI is bringing a massive boost in productivity to Duolingo, maker of the most popular app for learning languages.
Machine Learning Research
DeepSeek updated its groundbreaking DeepSeek-R1 large language model to strike another blow for open-weights performance.
Letters
Everyone can benefit by learning to code with AI! At AI Fund, the venture studio I lead, everyone — not just the engineers — can vibe code or use more sophisticated AI-assisted coding techniques.
The Batch Newsletter
The Batch AI News and Insights: Everyone can benefit by learning to code with AI! At AI Fund, the venture studio I lead, everyone — not just the engineers — can vibe code or use more sophisticated AI-assisted coding techniques.
Data Points
BAGEL, an open ByteDance model that can read and write images and text. Perplexity’s Labs, a new tool to generate research artifacts. A database security failure in Lovable’s coding platform. MIT Technology Review’s new report on AI’s energy footprint.
Data Points
NLWeb, an open-source framework to bring AI chat to any website. FLUX.1 Kontext challenges GPT-Image with image generation and editing. LMEval, a new open-source suite for iteratively benchmarking models. Amazon’s new content deal with The New York Times.