Anthropic, OpenAI donate open-source projects: GPT-5.2 arrives with new document creation abilities

Mistral’s Devstral 2’s cost-effective coding. GLM-4.6V’s addition of native function calling to vision-language reasoning. Cursor 2.2’s debug and web developer modes. A new executive order preempting local AI regulations in the U.S.

Man in field with net catching digital bugs flying from computer screen on table.

In today’s edition of Data Points, you’ll learn more about:

  • Mistral’s Devstral 2’s cost-effective coding
  • GLM-4.6V’s addition of native function calling to vision-language reasoning
  • Cursor 2.2’s debug and web developer modes
  • A new executive order preempting local AI regulations in the U.S.

But first:

Agentic AI Foundation created for open-source AI projects

At launch, the Agentic AI Foundation (AAIF) will provide neutral governance for three open-source projects: Anthropic’s Model Context Protocol (MCP), Block’s goose agent framework, and OpenAI’s AGENTS.md standard. MCP, released in November 2024, is now the standard protocol for connecting AI models to tools and data, with over 10,000 published servers. Block’s goose combines language models with MCP-based integration for local-first agent workflows, while OpenAI’s AGENTS.md provides markdown-based project guidance adopted by over 60,000 repositories and frameworks. AAIF’s co-founders include Amazon Web Services, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, and OpenAI, with additional contributions from across the technology industry. (AAIF)

GPT-5.2 claims superior performance at white-collar knowledge work

OpenAI launched GPT-5.2, a model series designed for tasks like spreadsheet creation, presentation building, and code writing. The company claims the model outperforms industry professionals on 70.9 percent of knowledge work tasks across 44 occupations in the GDPval benchmark, while operating at 11 times the speed and less than 1 percent of the cost. GPT-5.2 Thinking achieved 55.6 percent on SWE-Bench Pro for software engineering tasks and 80 percent on SWE-bench Verified, while GPT-5.2 Pro scored 93.2 percent on GPQA Diamond graduate-level science questions and crossed the 90 percent threshold on ARC-AGI-1. GPT-5.2 Instant rounds out the suite. The models are available now in ChatGPT for paid subscribers and through the API at $1.75 per million input tokens and $14 per million output tokens for GPT-5.2 and $15/$120 for GPT-5.2 Pro, with a 90 percent discount on cached inputs. (OpenAI)

Mistral releases Devstral 2 coding models under open licenses

Mistral launched Devstral 2, a 123-billion-parameter coding model, and Devstral Small 2, a 24-billion-parameter version, both designed for code generation and autonomous software engineering tasks. Devstral 2 achieved 72.2 percent on SWE-bench Verified while being five times smaller than DeepSeek V3.2 and eight times smaller than Kimi K2. Mistral claims Devstral 2 operates up to seven times more cost-efficiently than Claude Sonnet on real-world tasks. The company released Devstral 2’s weights under a modified MIT license and Devstral Small 2’s under Apache 2.0, with both models supporting a 256,000-token context window. Mistral also introduced Mistral Vibe CLI, an open-source command-line tool that enables natural language code automation across entire codebases. Devstral 2 is currently free via API and will later cost 40 cents per million input tokens and 2 dollars per million output tokens, while Devstral Small 2 will cost 10 cents and 30 cents respectively. (Mistral)

Zhipu’s GLM update adds native tool calling to VL models

Zhipu AI launched GLM-4.6V, a series of open-weight reasoning models that the company says integrate native function calling capabilities for the first time in vision-language models. The release includes a 106 billion parameter model for cloud deployment and a 9 billion parameter model for local use, both supporting 128,000 token context windows that can process around 150 pages of documents or one hour of video at once. The models accept images and other visual inputs directly as tool parameters and can understand tool outputs like search results and charts to use in their reasoning. GLM-4.6V achieves state-of-the-art performance on over 20 benchmarks among open models of similar size, with applications including content generation, visual web search, converting screenshots to code, and document analysis. The models are available on HuggingFace and ModelScope, and can be accessed through the Z.ai platform and Zhipu’s OpenAI-compatible API. (Zhipu)

Cursor adds debug mode and visual editor for web development

Cursor introduced Debug Mode, an agent workflow that fixes complex bugs by combining runtime logging, hypothesis generation, and human verification. The system instruments code with logs, generates multiple theories about errors, and asks developers to reproduce issues and confirm fixes. Cursor also released a visual editor for its browser that lets developers drag elements, inspect components and props, and describe changes while pointing and clicking, integrating the web app, codebase, and editing tools in one window. Both updates aim to make Cursor 2.2 a more full-featured and user-friendly web development application for people working with both codebases and design elements. (Cursor and again, Cursor)

Trump moves to override regulations, create national AI framework


President Trump signed an executive order authorizing a single federal regulatory framework for AI that preempts state laws. The order establishes an AI Litigation Task Force within 30 days to challenge state AI regulations deemed inconsistent with federal policy, particularly targeting laws like Colorado’s algorithmic discrimination ban that the administration argues may force AI models to produce false results. States with “onerous” AI laws identified by the Commerce Department within 90 days could lose eligibility for non-deployment funds under the 42.5 billion dollar Broadband Equity Access and Deployment program. The order directs the Federal Trade Commission to issue guidance on whether state laws requiring changes to AI outputs violate federal prohibitions on deceptive practices. It also calls for legislation establishing a uniform federal framework while preserving state authority over child safety, data center infrastructure, and government procurement. (CNBC and The White House)


Still want to know more about what matters in AI right now?

Read this week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng talked about building agentic workflows using a simple recipe with frontier LLMs, the importance of scaffolding for reliable agents, and using the aisuite package for easy LLM provider switching and tool usage.

“Hardly any of today’s many practical, commercially valuable agentic workflows were built using this simple approach. Today’s agents need much more scaffolding — that is, code that guides its step-by-step actions — rather than just letting an LLM have access to some tools and fully autonomously decide what to do.” 

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth:


A special offer for our community

DeepLearning.AI recently launched the first-ever subscription plan for our entire course catalog! As a Pro Member, you’ll immediately enjoy access to:

  • Over 150 AI courses and specializations from Andrew Ng and industry experts
  • Labs and quizzes to test your knowledge 
  • Projects to share with employers 
  • Certificates to testify to your new skills
  • A community to help you advance at the speed of AI

Enroll now to lock in a year of full access for $25 per month paid upfront, or opt for month-to-month payments at just $30 per month. Both payment options begin with a one week free trial. Explore Pro’s benefits and start building today!

Try Pro Membership