GPT-5 gets a Codex-specific model update, plus a new open protocol for agentic payments
GitHub’s new MCP server registry. Google and OpenAI’s gold medal achievements at ICPC. VaultGemma, an open, privacy-first language model. How Anthropic’s usage restrictions risk U.S. ire.

Welcome back! In today’s edition of Data Points, you’ll learn more about:
- GitHub’s new MCP server registry
- Google and OpenAI’s gold medal achievements at ICPC
- VaultGemma, an open, privacy-first language model
- How Anthropic’s usage restrictions risk U.S. ire
But first:
OpenAI releases GPT-5-Codex with platform updates for developers
GPT-5-Codex, a specialized coding version of GPT-5, outperforms the base model on code refactoring tasks (51.3 percent vs. 33.9 percent) and can work independently for over 7 hours on complex projects. The model adapts its thinking time to match task difficulty; it responds quickly to simple requests but takes much longer on challenging problems, catching more critical bugs during code reviews. OpenAI also rebuilt its Codex tools with new features like image attachments in the command line, a VS Code extension that syncs between local and cloud work, and infrastructure improvements that cut task completion times by 90 percent. These updates position Codex competitively against Claude Code and other agentic assistants and model scaffolds in the semi-autonomous coding market. Codex is included with all paid ChatGPT plans, with API access planned for the near future. (OpenAI)
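API access isn’t available yet, so any example is speculative. If GPT-5-Codex ships through OpenAI’s existing Responses API, a call might look like the minimal sketch below; the model identifier "gpt-5-codex" is an assumption based on the product name, not a confirmed API name.

```python
# Minimal sketch: calling GPT-5-Codex via OpenAI's Python SDK once API access
# ships. The model identifier "gpt-5-codex" is an assumption, not confirmed.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5-codex",  # assumed identifier; check OpenAI's model list
    input="Refactor this function so the error handling isn't duplicated:\n"
          "def load(path):\n"
          "    try:\n"
          "        return open(path).read()\n"
          "    except FileNotFoundError:\n"
          "        return None\n",
)

print(response.output_text)
```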
Google unveils protocol for AI agents to make secure payments
Google announced AP2, an open protocol that lets AI agents safely make purchases on users’ behalf using cryptographically signed “Mandates” that prove authorization for purchases or sales. The protocol works with payment methods from credit cards to cryptocurrencies and includes an extension built with Coinbase and the Ethereum Foundation. The new protocol could create a unified framework for AI-driven commerce, ensuring accountability when agents transact autonomously while helping to prevent a fragmented agentic payments ecosystem. Over 60 organizations, including American Express, Mastercard, and PayPal, are collaborating on the protocol, which is now available on GitHub. (Google)
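The item above doesn’t reproduce AP2’s message formats, so the sketch below is only a generic illustration of the core idea: a purchase intent that the user’s agent signs and that a merchant or payment processor can verify before money moves. The field names and the use of Ed25519 via Python’s cryptography library are illustrative choices, not details from the AP2 spec on GitHub.

```python
# Illustrative sketch of a signed purchase "mandate" (field names invented;
# see the AP2 spec on GitHub for the real message formats).
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The user's agent holds a signing key; counterparties hold the public key.
user_key = Ed25519PrivateKey.generate()
public_key = user_key.public_key()

mandate = {
    "intent": "purchase",
    "merchant": "example-store",
    "item": "running shoes",
    "max_price_usd": 120.00,
    "expires": "2025-10-01T00:00:00Z",
}

# Canonicalize and sign the mandate so any party can check authorization.
payload = json.dumps(mandate, sort_keys=True).encode()
signature = user_key.sign(payload)

# A merchant or payment processor verifies before completing the transaction.
public_key.verify(signature, payload)  # raises InvalidSignature if tampered
print("mandate verified")
```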
GitHub launches a central hub for discovering MCP servers
GitHub launched the MCP Registry to solve a big developer headache: finding Model Context Protocol (MCP) servers that let AI agents talk to external tools and systems. The registry features curated servers from partners like Figma, Postman, HashiCorp, and Dynatrace, with one-click installation in VS Code and sorting by GitHub stars to help developers quickly find what they need. Without a registry, developers often had to hunt through scattered repositories and community forums to find MCP servers, which slowed adoption and created security risks. The registry marks the first step toward building an open-source MCP registry with Anthropic and the MCP Steering Committee, where developers can self-publish servers that automatically appear in GitHub’s registry. (GitHub)
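As a purely hypothetical sketch of how a developer might query such a registry programmatically rather than browsing it, the snippet below fetches a server list over HTTP and filters it by name. The endpoint URL and JSON shape are assumptions for illustration, not documented API details; check the registry’s own documentation before relying on them.

```python
# Hypothetical sketch: listing MCP servers from a registry REST API and
# filtering by name. The base URL and response fields are assumptions.
import requests

REGISTRY_URL = "https://registry.modelcontextprotocol.io/v0/servers"  # assumed

def find_servers(keyword: str) -> list[dict]:
    """Return registry entries whose name mentions the keyword."""
    resp = requests.get(REGISTRY_URL, timeout=10)
    resp.raise_for_status()
    servers = resp.json().get("servers", [])  # assumed response field
    return [s for s in servers if keyword.lower() in s.get("name", "").lower()]

for server in find_servers("github"):
    print(server.get("name"), "-", server.get("description", ""))
```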
AI systems make breakthrough at world programming championship
Google’s Gemini 2.5 Deep Think and OpenAI’s reasoning system both achieved gold-medal level performance at the 2025 International Collegiate Programming Contest (ICPC) World Finals. OpenAI earned a perfect 12/12 score that would have placed first among all human participants. Google’s system solved 10 problems, including one that no human team completed, while OpenAI’s ensemble of GPT-5 and an experimental reasoning model solved all 12 without specific ICPC training. Both companies’ systems competed under official ICPC rules with the same time constraints as human teams, showing significant advances in AI’s abstract reasoning and problem-solving capabilities. (Google and X)
Google releases VaultGemma, a language model with built-in privacy
At 1 billion parameters, Google says VaultGemma is the largest open language model trained from scratch with differential privacy, a mathematical technique that prevents the model from memorizing individual training examples by carefully adding calibrated noise during training. However, this approach requires significantly larger batch sizes and more computational resources than standard training. Google’s research establishes new scaling laws that help developers understand the trade-offs between compute budget, privacy guarantees, and model performance when training with differential privacy. This work provides useful guidance for organizations seeking to build AI systems that protect user privacy while maintaining useful capabilities. VaultGemma’s weights are available on Hugging Face and Kaggle, along with a technical report detailing the training methodology. (Google)
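For context, the standard way to add differential privacy to training is DP-SGD: clip each example’s gradient to a fixed norm, then add Gaussian noise calibrated to that norm before updating the weights. The sketch below illustrates that single step on toy NumPy gradients; it’s a simplified rendering of the general technique, not VaultGemma’s actual training pipeline or hyperparameters.

```python
# Simplified illustration of the core DP-SGD step: per-example gradient
# clipping plus calibrated Gaussian noise. Not VaultGemma's actual pipeline.
import numpy as np

def dp_sgd_step(per_example_grads: np.ndarray, clip_norm: float,
                noise_multiplier: float) -> np.ndarray:
    """Average per-example gradients with clipping and Gaussian noise."""
    batch_size = per_example_grads.shape[0]

    # 1. Clip each example's gradient to at most `clip_norm`, bounding any
    #    single example's influence on the update.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / (norms + 1e-12))
    clipped = per_example_grads * scale

    # 2. Sum, add noise calibrated to the clip norm, then average. Larger
    #    batches dilute the noise, which is why DP training favors big batches.
    noise = np.random.normal(0.0, noise_multiplier * clip_norm,
                             size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / batch_size

# Toy example: 4 examples, 3-dimensional gradients.
grads = np.random.randn(4, 3)
update = dp_sgd_step(grads, clip_norm=1.0, noise_multiplier=1.1)
print(update)
```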
Anthropic faces U.S. government backlash over law enforcement usage restrictions
Anthropic’s refusal to let its AI models be used for certain law enforcement purposes has created tensions with the Trump administration, according to two senior officials. The company recently declined requests from federal law enforcement contractors because its usage policy prohibits surveillance of U.S. citizens, affecting agencies like the FBI, Secret Service, and Immigration and Customs Enforcement. This poses challenges for government contractors, since Anthropic’s Claude models are sometimes the only top-tier AI models cleared for top-secret work through Amazon Web Services GovCloud. The dispute highlights broader questions about how much control AI companies should have over government use of their technology, particularly as government agencies apply AI to controversial functions. (Semafor)
Still want to know more about what matters in AI right now?
Read this week’s issue of The Batch for in-depth analysis of news and research.
This week, Andrew Ng highlighted the growing importance of automated software testing in the era of AI-assisted coding, emphasizing how agentic testing can make coding agents more reliable, prevent subtle infrastructure bugs, and support stable software development.
“Automatically testing infrastructure software components that you intend to build on top of is especially helpful and results in more stable infrastructure and less downstream debugging.”
Read Andrew’s full letter here.
Other top AI news and research stories we covered in depth:
- Alibaba unveiled Qwen3-Next, a new model with hybrid attention layers and a sparse MoE design for faster, more efficient performance.
- Illinois joined Nevada in banning AI-driven mental health treatments, restricting chatbot use to licensed therapists.
- In Ukraine, drone swarms are being tested, with small, high-autonomy units striking targets on their own initiative.
- Researchers introduced Energy-Based Transformers (EBTs), which refine next-token predictions by running gradient descent on a learned energy function.