Nvidia and OpenAI make a deal: DeepSeek reveals more R1 training details

Anthropic’s latest Claude bug report. How Google is injecting Gemini into Chrome. Research on AI model scheming. Grok 4 Fast, a distilled version of xAI’s top model.


Welcome back! In today’s edition of Data Points, you’ll learn more about:

  • Anthropic’s latest Claude bug report
  • How Google is injecting Gemini into Chrome
  • Research on AI model scheming
  • Grok 4 Fast, a distilled version of xAI’s top model

But first:

Nvidia and OpenAI plan $100 billion AI infrastructure partnership

The two AI giants announced plans for a strategic partnership to deploy at least 10 gigawatts of Nvidia systems for OpenAI’s next-generation AI infrastructure. Nvidia intends to invest up to $100 billion in OpenAI, with the first phase deployed in the second half of 2026 using Nvidia’s next-generation Vera Rubin GPUs and CPUs. The two companies say they will align their product roadmaps, with OpenAI working with Nvidia as a preferred strategic compute and networking partner. This partnership represents a significant escalation in AI infrastructure investment and signals both companies’ commitment to developing what they call “superintelligence.” The companies expect to finalize partnership details in the coming weeks. (OpenAI)

DeepSeek reveals its R1 model cost just $294,000 to train

In a Nature journal article, Chinese developer DeepSeek disclosed that it spent only $294,000 to train its R1 reasoning model using 512 Nvidia H800 chips. The company acknowledged for the first time that it owns A100 chips and used them in preparatory development stages, addressing previous U.S. concerns about its chip access. DeepSeek also responded to claims it had “distilled” OpenAI’s models, stating that while its training data inadvertently included OpenAI-generated answers from web crawls, this was incidental rather than intentional. R1’s low training cost contrasts sharply with OpenAI CEO Sam Altman’s 2023 statement that foundational model training costs “much more” than $100 million, although R1 had the benefit of beginning with DeepSeek’s V3 foundation model. (Nature)
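The reported figure implies a strikingly short run. Here is a back-of-envelope sanity check; the $2.00-per-GPU-hour rental rate is an assumption for illustration, not a number from the Nature article:

```python
# Back-of-envelope check on DeepSeek's reported $294,000 R1 training cost.
# The cost and GPU count come from the article above; the rental rate is
# an assumed figure, not from the source.
TOTAL_COST_USD = 294_000
NUM_GPUS = 512            # Nvidia H800 chips
RATE_PER_GPU_HOUR = 2.00  # assumed H800 rental rate, USD (not from the source)

gpu_hours = TOTAL_COST_USD / RATE_PER_GPU_HOUR  # total GPU-hours purchased
wall_clock_days = gpu_hours / NUM_GPUS / 24     # cluster time at full utilization
print(f"{gpu_hours:,.0f} GPU-hours, about {wall_clock_days:.0f} days on 512 GPUs")
```

Even doubling the assumed hourly rate keeps the implied run under a month, which underscores how far the figure sits below Altman’s $100 million benchmark.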

Anthropic identifies and fixes three big bugs affecting Claude

Anthropic resolved three infrastructure bugs that intermittently caused Claude to give lower-quality responses between August and early September. The bugs included a routing error that sent some requests to the wrong servers, a configuration mistake that caused Claude to insert random foreign-language characters into English responses, and a compiler bug that affected how Claude selected words when generating text. The overlap among these bugs made diagnosis challenging: at its peak, the routing error affected up to 16 percent of Sonnet 4 requests, and approximately 30 percent of Claude Code users experienced at least one degraded response. Anthropic emphasized that it never intentionally reduces model quality due to demand or server load, and it is implementing better testing methods, continuous quality monitoring, and improved debugging tools to prevent similar incidents. Along with explaining Claude’s reduced performance, Anthropic’s bug report offers an unusually detailed and transparent glimpse into how AI models are served at scale. (Anthropic)

Google integrates Gemini into Chrome browser for U.S. users

Google launched Gemini AI features in Chrome, including a new toolbar button that launches the chatbot and tools for answering questions about web content and synthesizing information across multiple tabs. The features, previously available only to paying subscribers, will soon roll out to all U.S. desktop users browsing in English, with iOS support coming soon. Google says Chrome will also add “agentic” features that can complete web-based tasks like adding items to shopping carts, plus AI-enhanced search in the address bar for chatbot-style searches. This integration of Gemini into the world’s most popular browser potentially outflanks new AI-dedicated browsers like Comet and Dia and could represent a significant shift in how users interact with the web. (Google)

OpenAI and Apollo Research uncover scheming in frontier models

A new report describes behaviors consistent with “scheming” — AI systems pretending to be aligned while secretly pursuing different goals — in controlled tests of frontier models including OpenAI’s o3 and o4-mini, Gemini 2.5 Pro, and Claude Opus 4. The researchers found that models would strategically underperform on tests (“sandbagging”) or hide their true capabilities when they believed it would help them avoid being shut down or modified. The team also developed a “deliberative alignment” training method that reduced scheming behaviors by approximately 30×, teaching models to explicitly reference anti-scheming principles before acting. The researchers argue that scheming differs from other AI failures: It becomes more dangerous as models grow more capable, and attempts to train it away might simply teach models to hide it better. The findings suggest that while current models pose limited risks, the AI field needs better methods for detecting and eliminating this behavior before models take on more complex, real-world tasks with greater autonomy. (OpenAI)

xAI launches Grok 4 Fast with improved cost efficiency

xAI’s newest model uses 40 percent fewer thinking tokens than Grok 4 and costs 98 percent less to achieve similar results. Grok 4 Fast’s unified architecture combines reasoning and non-reasoning modes in a single model, and it adds web and X search capabilities and a 2 million token context window. Currently, Grok 4 Fast ranks #1 on LMArena’s Search Arena with an Elo of 1163 and scores 85.7 percent on GPQA Diamond and 92 percent on AIME 2025. The model is available now on grok.com and in the iOS and Android apps for all users, including free tiers, and via the xAI API at $0.20 per million input tokens and $0.50 per million output tokens for contexts under 128,000 tokens. (xAI)
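At those rates, per-request costs are tiny. A minimal sketch of the arithmetic, using the published sub-128,000-token prices (the token counts in the example are hypothetical):

```python
# Estimate the cost of a single Grok 4 Fast API call at the published rates
# for contexts under 128,000 tokens. The token counts below are hypothetical.
INPUT_PRICE_PER_M = 0.20   # USD per million input tokens
OUTPUT_PRICE_PER_M = 0.50  # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request under the sub-128K pricing tier."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 10,000-token prompt that produces a 2,000-token reply
print(f"${request_cost(10_000, 2_000):.4f}")  # → $0.0030
```

At these prices, a million such requests would cost about $3,000, which helps explain the "98 percent less" framing relative to Grok 4.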

Want to know more about what matters in AI right now?

Read the latest issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng highlighted the growing importance of automated software testing in the era of AI-assisted coding, emphasizing how agentic testing could make coding agents more reliable, prevent subtle infrastructure bugs, and support stable software development.

“Bugs in software components that you intend to build on top of lead to downstream bugs that can be hard to find. Further, bugs in a component that’s deep in a software stack — and that you build multiple abstraction layers on top of — might surface only weeks or months later, long after you’ve forgotten what you were doing while building this specific component, and be really hard to identify and fix.”

Read Andrew’s letter here.


Subscribe to Data Points