Google’s Nano Banana hits the scene: OpenAI’s latest voice-to-voice model

Welcome back! In today’s edition of Data Points, you’ll learn more about:
- Anthropic’s browser use extension preview
- Automation hits entry-level workers first
- Reinforcement Learning from Checklist Feedback
- Authors settle copyright lawsuit over pirated books
But first:
OpenAI releases gpt-realtime and updates API for voice applications
OpenAI launched “gpt-realtime,” a new speech-to-speech model that processes audio directly through a single model rather than chaining multiple models together, achieving 82.8 percent accuracy on the Big Bench Audio benchmark (versus 65.6 percent for the previous version). The model also shows significant improvements in instruction following and function-calling accuracy, and better understands non-verbal cues and language switching. OpenAI also made its Realtime API generally available with new features including remote MCP server support, image inputs, and phone calling. These releases enable developers to build production-ready voice agents that sound more human and handle complex tasks more reliably in fields such as customer support, personal assistance, and education. The new model costs $32 per 1 million audio input tokens and $64 per 1 million audio output tokens, a 20 percent reduction from earlier pricing. (OpenAI)
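To get a feel for the pricing above, here is a minimal cost estimator using the stated per-token rates ($32 per 1M audio input tokens, $64 per 1M audio output tokens). The token counts in the example are hypothetical; actual token usage per minute of audio varies.

```python
# Per-token prices from the reported pricing (USD).
PRICE_PER_INPUT_TOKEN = 32 / 1_000_000
PRICE_PER_OUTPUT_TOKEN = 64 / 1_000_000

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one realtime voice session."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

# Hypothetical session: 50,000 audio input tokens, 30,000 output tokens.
print(round(session_cost(50_000, 30_000), 2))  # 1.60 + 1.92 = 3.52
```

Because output tokens cost twice as much as input tokens, output-heavy sessions (long spoken responses) dominate the bill.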
Google’s top-rated image editing model now available
Google DeepMind launched a new image editing model (officially Gemini 2.5 Flash Image Preview, also known as “Nano Banana”) in the Gemini app. The model maintains consistent character likeness across edits, addressing a key challenge in AI photo manipulation. Users can change backgrounds, combine multiple photos, and apply iterative edits while preserving the original subject’s appearance, whether editing photos of people or pets. Advanced features include style transfer between images, multi-turn editing for progressive scene building, and the ability to blend photos together for composite scenes. The model is available today in the Gemini app, with all generated images including both visible watermarks and invisible SynthID digital watermarks. (Google)
Anthropic launches limited preview of browser use extension
Anthropic released a Chrome extension that allows Claude to interact directly with websites, clicking buttons and filling forms on users’ behalf. The company is initially testing with 1,000 Max plan users to gather feedback on safety issues before wider release. During internal red-teaming experiments, researchers found that without proper safeguards, malicious actors could use prompt injection attacks to trick Claude into harmful actions like deleting files or stealing data, with a 23.6 percent success rate. Anthropic implemented new defenses including site-level permissions, action confirmations, and advanced classifiers that reduced attack success to 11.2 percent, though the company acknowledges more work remains to reach near-zero risk levels. Users can join the waitlist at claude.ai/chrome, though Anthropic advises avoiding use on sites with financial, legal, or medical information during this research preview phase. (Anthropic)
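The layered defenses described above can be pictured as a gate that every proposed browser action must pass. The sketch below is purely illustrative — the site list, action names, and decision strings are hypothetical, not Anthropic's implementation — but it shows how site-level permissions and action confirmations compose.

```python
# Illustrative safeguard gate for a browser-use agent.
# ALLOWED_SITES and SENSITIVE_ACTIONS are hypothetical examples.
ALLOWED_SITES = {"example.com"}
SENSITIVE_ACTIONS = {"delete", "purchase", "submit_form"}

def gate_action(site: str, action: str, user_confirmed: bool = False) -> str:
    """Decide whether a proposed browser action may proceed."""
    if site not in ALLOWED_SITES:
        return "blocked: site not permitted"      # site-level permission check
    if action in SENSITIVE_ACTIONS and not user_confirmed:
        return "paused: awaiting user confirmation"  # action confirmation gate
    return "allowed"

print(gate_action("example.com", "click"))   # allowed
print(gate_action("example.com", "delete"))  # paused: awaiting user confirmation
print(gate_action("unknown.net", "click"))   # blocked: site not permitted
```

A real system would add a third layer — classifiers that inspect page content for prompt-injection attempts — which is what Anthropic credits for pushing attack success down from 23.6 percent to 11.2 percent.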
Study shows AI reduces employment for entry-level workers
Researchers from Stanford University analyzed payroll data from millions of U.S. workers and found that employment for workers aged 22-25 in AI-exposed occupations like software development and customer service declined by 13 percent since late 2022. Employment for older workers in the same occupations and younger workers in less-exposed fields like nursing continued to grow during this period. The study distinguished between AI applications that automate versus augment work, finding employment declines only in occupations where AI primarily automates tasks. These findings provide large-scale evidence that generative AI may be beginning to displace entry-level workers who rely more on formal education than the tacit knowledge that comes with experience. The researchers used data from ADP, the largest U.S. payroll processor, covering the period from January 2021 through July 2025. (Stanford)
Checklists improve model training more than rewards
Apple researchers developed a new training method called Reinforcement Learning from Checklist Feedback (RLCF) that consistently improves language models’ ability to follow complex instructions. The method extracts dynamic checklists from user instructions and evaluates responses against each checklist item using AI judges and verification programs. When applied to Qwen2.5-7B-Instruct, RLCF achieved a 4-point boost in hard satisfaction rate on FollowBench, a 6-point increase on InFoBench, and a 3-point rise in win rate on Arena-Hard. This approach outperformed traditional methods like instruction fine-tuning and reward model-based training, which showed mixed results across benchmarks. The researchers created WildChecklists, a dataset of 130,000 instructions with corresponding checklists, which they plan to release publicly. (arXiv)
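The core idea of RLCF — score a response against each checklist item, then aggregate the per-item scores into a single scalar reward — can be sketched in a few lines. The keyword-matching "judge" below is a toy stand-in for the AI judges and verification programs the paper uses; the function names and example checklist are illustrative, not from the paper.

```python
def judge_item(response: str, item_keyword: str) -> float:
    """Toy per-item judge: 1.0 if the response satisfies the item, else 0.0.
    (RLCF uses AI judges and verifier programs instead of keyword matching.)"""
    return 1.0 if item_keyword.lower() in response.lower() else 0.0

def checklist_reward(response: str, checklist: list[str]) -> float:
    """Average per-item scores into one reward in [0, 1]."""
    if not checklist:
        return 0.0
    return sum(judge_item(response, item) for item in checklist) / len(checklist)

# Hypothetical checklist extracted from a user instruction.
checklist = ["French", "three sentences", "formal"]
response = "A formal reply in French, written in three sentences."
print(checklist_reward(response, checklist))  # 1.0
```

Averaging per-item scores gives a denser training signal than a single pass/fail reward: a response that satisfies two of three constraints still earns partial credit, which helps the policy improve incrementally.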
Authors settle copyright lawsuit with Anthropic over training
A group of book authors reached a settlement with AI company Anthropic after suing the chatbot maker for using copyrighted books to train its Claude AI system. The settlement comes after a federal judge ruled in June that Anthropic’s use of copyrighted materials for AI training qualified as fair use, but the company still faced trial over how it obtained books from online pirated libraries. The case centered on whether downloading copyrighted works from “shadow libraries” to train AI models constituted copyright infringement, even if the training itself was deemed transformative. This settlement marks a significant development in ongoing legal battles over AI companies’ use of copyrighted materials for model training. Terms of the settlement will be finalized next week, though specific details remain undisclosed. (Associated Press)
Still want to know more about what matters in AI right now?
Read this week’s issue of The Batch for in-depth analysis of news and research.
This week, Andrew Ng shared thoughts on parallel agents as a new way to scale AI, highlighting how running agents simultaneously can speed up research, coding, and other workflows while boosting performance.
“As LLM prices per token continue to fall — thus making these techniques practical — and product teams want to deliver results to users faster, more and more agentic workflows are being parallelized.”
Read Andrew’s full letter here.
Other top AI news and research stories we covered in depth:
- Google unveiled Magic Cue, a new proactive AI assistant for the upcoming Pixel 10.
- French startup Mistral published detailed data on energy, water, and material consumption for the full lifecycle of its Mistral Large 2 model.
- Chinese researchers disguised a modified robot dog as an antelope to study herd behavior in the wild.
- Meta introduced DINOv3, an update to its self-supervised learning framework with a new loss term that delivers better image processing and vision performance.