Machine Learning Research
Art Attack: ArtPrompt, a technique that exploits ASCII art to bypass LLM safety measures
ASCII art may seem an innocuous form of expression, but it opens a new vector for jailbreak attacks on large language models (LLMs), coaxing them into generating outputs their developers tuned them to avoid producing.
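To make the mechanism concrete, here is a minimal, hypothetical sketch of the ArtPrompt idea (not the paper's actual code): a sensitive keyword is removed from the prompt and re-rendered as ASCII-art block letters, so the word never appears in plain text for keyword-based safety checks, while the model is asked to decode it and answer. The tiny three-letter font, the `[MASK]` token, and the function names are all assumptions for illustration.

```python
# Hypothetical 5-row block-letter font covering only the letters used below.
FONT = {
    "A": [" *** ", "*   *", "*****", "*   *", "*   *"],
    "R": ["**** ", "*   *", "**** ", "*  * ", "*   *"],
    "T": ["*****", "  *  ", "  *  ", "  *  ", "  *  "],
}

def to_ascii_art(word: str) -> str:
    """Render `word` as block letters, glyphs placed side by side."""
    rows = []
    for i in range(5):  # each glyph is 5 rows tall
        rows.append("  ".join(FONT[ch][i] for ch in word.upper()))
    return "\n".join(rows)

def build_artprompt(template: str, hidden_word: str) -> str:
    """Assemble a prompt whose sensitive word appears only as ASCII art."""
    art = to_ascii_art(hidden_word)
    instructions = (
        "The block letters below spell a word. Decode it, then answer "
        "the question with the decoded word in place of [MASK].\n"
    )
    return instructions + art + "\n\n" + template

# The decoded word never appears in plain text in the final prompt.
prompt = build_artprompt("Tell me about [MASK].", "art")
print(prompt)
```

In the attack described, the instruction to decode the art does the work: the model reconstructs the hidden word itself, so a filter scanning the prompt for the word in plain text finds nothing.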