The Batch | DeepLearning.AI (Page 8)

Diagram shows AI traits with pipelines for "evil" vs. "helpful" responses to user queries on animal treatment.

Machine Learning Research

Toward Steering LLM Personality: Persona Vectors allow model builders to identify and edit out sycophancy, hallucinations, and more

Large language models can develop character traits like cheerfulness or sycophancy during fine-tuning. Researchers developed a method to identify, monitor, and control such traits.

Hands strum a guitar covered in labels from major record companies, symbolizing AI music innovation.

Business

Record Labels Back AI-Music Startup: Klay Vision emerges from relative obscurity to announce deals with Sony, Warner, and Universal

A music-generation newcomer emerged from stealth mode with licenses to train generative AI models on music controlled by the world’s biggest recording companies.

Two figures, symbolizing Microsoft and Anthropic, handshake to represent partnership and collaboration.

Business

Microsoft and Anthropic Form Alliance: Claude becomes the first leading language model available from all three cloud giants

Having recently revised its agreement with longtime partner OpenAI, Microsoft pledged to invest billions of dollars in Anthropic, one of OpenAI’s top competitors.

Table shows Gemini 3 Pro leading in benchmarks, outperforming Gemini 2.5, Claude Sonnet 4.5, and GPT-5.1.

Machine Learning Research

Google Dominates Arena Leaderboards (For the Moment): Gemini 3 Pro and Nano Banana Pro boast best-in-class multimodal reasoning and image generation

Google introduced Gemini 3 Pro and Nano Banana Pro, its flagship vision-language and image-generation models, and deployed them to billions of users worldwide.

A robot holds a bubble wand, surrounded by bubbles and colorful trees, with a futuristic city skyline.

Letters

Understanding the AI Bubble — If There Is One

Is there an AI bubble? With massive sums going into AI infrastructure, such as OpenAI’s $1.4 trillion plan, and Nvidia briefly reaching a $5 trillion market cap, many have asked whether speculation and hype have driven the valuations of AI investments above sustainable levels.

The Batch Newsletter

Google Rules Arena Leaderboards, Microsoft+Anthropic, Record Labels Back AI Music, Personality Control for LLMs

The Batch AI News and Insights: Is there an AI bubble? With the massive number of dollars going into AI infrastructure such as OpenAI’s $1.4 trillion plan and Nvidia briefly reaching a $5 trillion market cap, many have asked if speculation and hype have driven the values of...

Technicians monitor holographic ballet dancers on stage via futuristic screens and robotic controls in a theater setting.

Data Points

Inside Olmo 3, a new family of fully open models: Grok 4.1’s uneasy balance between EQ and sycophancy

Nano Banana Pro, Google’s updated image generator. Anthropic’s latest partnerships with Microsoft and Nvidia. Memo, a home robot trained on real-life human tasks. A new AI play modeled on legendary French playwright Molière’s work.

Japanese computer scientists pointing at a screen

Data Points

Meta model detects and segments video objects: Google Gemini 3 wows on benchmark tests and leaderboards

GPT-5.1-Codex-Max, OpenAI’s improved long-context coding model. Music startup Klay’s reported deal with Universal, Warner, and Sony. DeepSeek R1 Slim, a trim, decensored reasoning model. NTT’s Tsuzumi 2, an efficient model optimized for Japan.

Image illustrates the Self-Search method, simulating web searches to improve model accuracy in tests.

Machine Learning Research

More-Efficient Agentic Search: Researchers fine-tune models to search their own parameters to boost recall

Large language models may have learned knowledge that’s relevant to a given prompt, but they don’t always recall it consistently. Fine-tuning a model to search its parameters as though it were searching the web can help it find knowledge in its own weights.

Visual map outlines cybercrime operation phases, highlighting AI-driven processes and human validation steps.

Machine Learning Research

Anthropic Cyberattack Report Sparks Controversy: Security researchers question whether coding agents allow unprecedented automated attacks

Independent cybersecurity researchers pushed back on a report by Anthropic that claimed hackers had used its Claude Code agentic coding system to perpetrate an unprecedented automated cyberattack.

Chart highlights Kimi K2’s top performance in agentic tasks, outperforming rivals in reasoning and coding.

Machine Learning Research

Top Agentic Results, Open Weights: Kimi K2 Thinking outperforms proprietary models with new techniques for agentic tool use

The latest open-weights large language model from Moonshot AI challenges top proprietary LLMs at agentic tasks by executing hundreds of tool calls sequentially and pausing to think between each.

White Waymo vehicle near water, city skyline visible; displays autonomous service for urban freeways.

Machine Learning Research

Self-Driving Cars on U.S. Freeways: Waymo deploys autonomous cars on California and Arizona expressways

Waymo became the first company to offer fully autonomous, driverless taxi service on freeways in the United States.
