Claude Levels Up: Anthropic launches Claude Sonnet 4.5 and the Claude Agent SDK, and overhauls Claude Code for developers


Anthropic updated its mid-size Claude Sonnet model, making it the first member of the Claude family to reach version 4.5. It also enhanced the Claude Code agentic coding tool with long-desired features.

Claude Sonnet 4.5: The new model offers a substantial increase in performance as well as a variable budget for reasoning tokens. 

  • Input/output: Text and images in (up to 200,000 or 1 million tokens, depending on service tier), text out (up to 64,000 tokens)
  • Availability: Free via Claude.ai; API access costs $3/$15 per million input/output tokens via Anthropic, Amazon Bedrock, and Google Vertex AI
  • Features: Reasoning with variable token budget, extended processing time (“hours” according to the documentation), serial (rather than simultaneous) completion of tasks
  • Knowledge cutoff: January 2025
  • Undisclosed: Model architecture, training data and methods 
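
To make the listed API pricing concrete, here is a minimal sketch of the arithmetic. It assumes the $3/$15 per million input/output token rates above apply uniformly to every token in a request; actual billing may differ by provider, tier, or context length. The helper name estimate_cost_usd is hypothetical and not part of any Anthropic library.

```python
# Rates copied from the availability bullet above (USD per million tokens).
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


if __name__ == "__main__":
    # A full 200,000-token input with the maximum 64,000-token output.
    print(f"${estimate_cost_usd(200_000, 64_000):.2f}")  # prints $1.56
```

At these rates, output tokens cost five times as much as input tokens, so long generations can outweigh even a full input context in the total.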

Results: In Anthropic’s tests, Claude Sonnet 4.5’s coding metrics stood out, but it performed well on broader assessments, too.

  • With a reasoning budget of 32,000 tokens, Claude Sonnet 4.5 currently tops the LM Arena Text Leaderboard. Without reasoning, it ranks fourth.
  • On coding challenges in SWE-bench Verified, Claude Sonnet 4.5 (82 percent) raised the state of the art, outperforming previous leaders Claude Sonnet 4 (80.2 percent) and Claude Opus 4.1 (79.4 percent).
  • It achieved 61.4 percent on the computer-use benchmark OSWorld, well ahead of other models in available leaderboards.
  • It achieved 100 percent on AIME 2025’s math problems when it used Python tools, although GPT-5 dominated when neither model used tools.
  • On tests of graduate-level science questions and multilingual knowledge such as GPQA Diamond and MMMLU, Sonnet 4.5 generally outperformed the larger Claude Opus 4.1 but fell short of Google Gemini 2.5 Pro and OpenAI GPT-5.

Claude Code: Anthropic’s agentic coding tool got a design overhaul that adds a number of fresh capabilities. Notably, it comes with a software development kit for building other agentic tools, based on the same software infrastructure, toolkit, orchestration logic, and memory management that underpin Claude Code.

  • Claude Agent SDK. The new software development kit pairs Claude models with software tools for web search, file management, code deployment, and other autonomous capabilities. It provides building blocks for all of Claude Code’s functionality so you can build your own agentic applications.
  • Context tracking. Agentic use cases require continuity even when inputs exceed a model’s input context limit. When a model’s message history approaches this limit, Claude Code asks the model to summarize the most critical details and passes the summary to the model as the latest input. It also removes tool results when they’re no longer needed, making room for further input.
  • Memory. A new API “memory tool” enables the model to store and retrieve especially important information like project states outside the input.
  • Checkpoints. Claude Code now stores checkpoints, preserving safe states that it can revert to in case of mistakes. Anthropic also added an IDE extension that can be used in VS Code and similar applications in lieu of the terminal.
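
The context-tracking behavior described above can be sketched roughly as follows. This is an illustration of the general idea only, not Claude Code’s actual implementation: the Message class, the compact function, the 80 percent threshold, and the word-count tokenizer are all assumptions made for this example, and the summarizer is a stand-in for a call to the model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str      # "user", "assistant", or "tool"
    content: str


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per word."""
    return len(text.split())


def total_tokens(history: List[Message]) -> int:
    return sum(count_tokens(m.content) for m in history)


def compact(
    history: List[Message],
    limit: int,
    summarize: Callable[[List[Message]], str],
    threshold: float = 0.8,   # assumed trigger point, not a documented value
    keep_recent: int = 2,     # most recent messages are always kept verbatim
) -> List[Message]:
    """Shrink the history once it approaches the context limit."""
    budget = int(limit * threshold)
    if total_tokens(history) <= budget:
        return history

    split = max(len(history) - keep_recent, 0)
    older, recent = history[:split], history[split:]

    # Step 1: drop older tool results that are no longer needed.
    older = [m for m in older if m.role != "tool"]
    if total_tokens(older + recent) <= budget:
        return older + recent

    # Step 2: replace the remaining older messages with a summary.
    summary = Message("user", summarize(older))
    return [summary] + recent


if __name__ == "__main__":
    def fake_summarize(messages: List[Message]) -> str:
        # A real agent would ask the model to write this summary.
        return f"Summary of {len(messages)} earlier messages."

    history = [
        Message("user", "please refactor the parser module " * 5),
        Message("tool", "file contents " * 20),
        Message("assistant", "I refactored the parser " * 5),
        Message("user", "now add tests"),
        Message("assistant", "working on tests"),
    ]
    for m in compact(history, limit=50, summarize=fake_summarize):
        print(m.role, "|", m.content)
```

Dropping stale tool results before summarizing is a design choice of this sketch: it is the cheaper step, so summarization runs only when trimming alone does not free enough room.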

Behind the news: Founded by ex-OpenAI employees, Anthropic markets itself as an alternative to that company: safer, more humane, and more tasteful. Although it hasn’t stopped touting those values, its emphasis has grown simpler: coding and workplace productivity. While ChatGPT may be synonymous with AI among consumers, Anthropic is focusing on software developers and businesses.

Why it matters: The coupling of Claude Sonnet 4.5 with the enhanced Claude Code reflects Anthropic’s emphasis on workplace productivity. This focus speaks to some of the business world’s anxieties: When will AI pay off for my workforce? When will it transform what they do? For now, coding (via Claude Code or a competitor) is one obvious answer.

We’re thinking: The Claude Agent SDK is a significant release that will enable many developers to build powerful agentic apps. We look forward to an explosion of Claude-based progeny!