MiniMax M1 tackles Qwen3, DeepSeek-R1, Claude 4 Opus, and more: Gemini 2.5 models get price changes and a new partner

Cursor introduces a new high-end subscription. Harvard makes its public-domain books data set widely available. Essential AI introduces its own auto-labeled data set. MIT builds an LLM that can fine-tune itself.


Welcome back! In today’s edition of Data Points, you’ll learn more about:

  • Cursor introduces a new high-end subscription
  • Harvard makes its public-domain books data set widely available
  • Essential AI introduces its own auto-labeled data set
  • MIT builds an LLM that can fine-tune itself

But first:

MiniMax’s M1 features extended context windows on a small budget

MiniMax unveiled M1, an open-weight, large-scale reasoning model that combines a hybrid mixture-of-experts architecture with a lightning attention mechanism. The model contains 456 billion total parameters with 45.9 billion activated per token, supports a 1 million-token context length (8 times that of DeepSeek-R1), and uses 25 percent of the computational resources DeepSeek-R1 requires when generating 100,000 tokens. MiniMax trained the model using reinforcement learning on diverse problems ranging from mathematical reasoning to software engineering, introducing CISPO, a new algorithm that improves training efficiency. The release positions M1 as a foundation for AI agents tackling complex real-world tasks, with benchmarks showing it outperforms DeepSeek-R1 and Qwen3-235B on software engineering and long-context challenges. MiniMax offers two versions, with thinking budgets of 40,000 and 80,000 tokens. (Hugging Face)
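
Since the weights are open, the model can be queried locally. Below is a minimal sketch using Hugging Face Transformers; the repo id is an assumption based on the 40,000-token thinking-budget variant (an 80,000-token version also exists), so check the MiniMax organization on the Hub for exact paths and hardware guidance.

```python
# Minimal sketch: loading MiniMax-M1 with Hugging Face Transformers.
# The repo id is assumed from the 40K thinking-budget variant; verify on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "MiniMaxAI/MiniMax-M1-40k"  # assumed; an 80k-budget variant also exists

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # hybrid MoE + lightning attention uses custom modeling code
    device_map="auto",       # 456B total parameters requires sharding across many GPUs
)

messages = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```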

Google launches Gemini 2.5 Flash-Lite with lowest cost in family

Google introduced Gemini 2.5 Flash-Lite in preview, delivering better performance than the previous 1.5 and 2.0 Flash models at lower cost and higher speed. The model features adjustable reasoning via an API parameter, with “thinking” disabled by default to optimize for speed and cost, making it ideal for high-throughput tasks like classification and summarization at scale. Flash-Lite supports native tools including Google Search grounding, code execution, URL context, and function calling. The model costs $0.10 per million input tokens and $0.40 per million output tokens. (Google)
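
To illustrate the adjustable-reasoning parameter, here is a minimal sketch using the google-genai Python SDK; the preview model id is an assumption, so confirm the current name in Google’s docs.

```python
# Sketch: enabling Flash-Lite's optional "thinking" via the google-genai SDK.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",  # assumed preview id; check docs
    contents="Classify this review as positive or negative: 'Great battery life.'",
    config=types.GenerateContentConfig(
        # Thinking is off by default on Flash-Lite; a nonzero token budget enables it.
        thinking_config=types.ThinkingConfig(thinking_budget=512),
    ),
)
print(response.text)

# At $0.10/M input and $0.40/M output, a call with 1,000 input tokens and
# 500 output tokens costs $0.0001 + $0.0002 = $0.0003.
```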

Cursor launches $200 monthly Ultra plan with 20x more usage

Cursor introduced Ultra, a $200 per month subscription tier, responding to power users who wanted predictable pricing rather than usage-based fees. The new tier relies on multi-year partnerships with OpenAI, Anthropic, Google, and xAI. Cursor also enhanced its Pro plan with an unlimited-with-rate-limits model and removed all restrictions on tool calls, though existing users can keep their current 500-request limit setup. The launch comes as Anysphere’s AI coding assistant reaches $500 million in annualized recurring revenue, with major clients including Nvidia, Uber, and Adobe. Competition in AI coding tools intensifies with OpenAI reportedly acquiring rival Windsurf and Anthropic gaining traction with Claude Code. (Cursor)

Harvard releases nearly one million historic books to train AI models

Harvard University released a collection of almost one million books to AI researchers on Thursday, featuring texts from as early as the 15th century in 254 languages. The Institutional Books 1.0 dataset contains 394 million scanned pages and 242 billion tokens, offering AI developers access to carefully preserved historical texts on literature, philosophy, law, and agriculture. Tech companies facing copyright lawsuits over using modern creative works without consent view public-domain materials as a safer alternative for training AI systems. The set was announced earlier this year but was initially available only to the Harvard community. The collection is now available for free download on Hugging Face, with financial support from Microsoft and OpenAI. (Associated Press)
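
Because the corpus runs to 242 billion tokens, streaming is the practical way to explore it. The sketch below uses the Hugging Face datasets library with a hypothetical hub path; search the Hub for “Institutional Books 1.0” to find the actual repository.

```python
# Sketch: streaming Institutional Books 1.0 rather than downloading it whole.
# The repo id below is a hypothetical placeholder; find the real path on the Hub.
from datasets import load_dataset

books = load_dataset(
    "institutional/institutional-books-1.0",  # placeholder repo id
    split="train",
    streaming=True,  # iterate lazily over the 394M scanned pages
)

for record in books.take(3):
    # Field names vary by dataset; inspect record.keys() for the real schema.
    print(record)
```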

Essential AI releases 24-trillion-token dataset to simplify curation

Essential AI released Essential-Web v1.0, a 24-trillion-token dataset containing 23.6 billion documents, each labeled with a 12-category taxonomy covering topic, format, content complexity, and quality. The taxonomy labels are generated by EAI-Distill-0.5b, a fine-tuned 0.5 billion-parameter model that achieves annotator agreement within 3 percent of larger models like Qwen2.5-32B-Instruct. Using simple SQL-style filters, researchers can create specialized datasets that match or exceed state-of-the-art performance in math, web code, STEM, and medical domains. This approach turns the traditionally complex and expensive process of curating training data into a straightforward search problem, potentially democratizing access to high-quality datasets for AI development. The dataset is freely available on Hugging Face at EssentialAI/essential-web-v1.0. (arXiv)
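
The pitch is that curation reduces to querying the taxonomy labels. Here is a hedged sketch using DuckDB over the dataset’s parquet files; the column names are hypothetical stand-ins for the 12-category taxonomy, so inspect the actual schema before filtering.

```python
# Sketch: SQL-style curation over Essential-Web v1.0 with DuckDB.
# Taxonomy column names below are hypothetical; check the parquet schema.
import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")  # enables reading hf:// paths

math_subset = con.sql("""
    SELECT text
    FROM 'hf://datasets/EssentialAI/essential-web-v1.0/**/*.parquet'
    WHERE topic = 'mathematics'   -- hypothetical taxonomy column
      AND quality_score >= 4      -- hypothetical quality label
    LIMIT 1000
""")
math_subset.write_parquet("math_subset.parquet")
```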

LLMs learn to update their own weights with new fine-tuning framework

MIT researchers introduced Self-Adapting LLMs (SEAL), a framework that enables language models to modify their own weights by generating fine-tuning data and update instructions. When given new input, the model creates a “self-edit” that can restructure information, specify optimization parameters, or use tools for data augmentation and gradient updates. The system uses reinforcement learning with downstream performance as the reward signal, allowing the model to learn effective self-editing strategies without requiring separate adaptation modules. This approach represents an advance toward AI systems that can autonomously adapt to new tasks and knowledge, addressing a key limitation of current static language models. (arXiv)
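
The loop is easier to see in code. Below is a schematic sketch of SEAL’s training cycle, not the authors’ implementation; every function is a placeholder stub so the control flow runs end to end, where the real system substitutes an LLM, a fine-tuning run, and a task evaluation at each step.

```python
# Schematic sketch of the SEAL loop (placeholder stubs, not the paper's code).
import random

def generate_self_edit(model, context):
    # Placeholder: the model restructures `context` into fine-tuning data
    # plus optimization directives (learning rate, epochs, augmentation).
    return {"data": f"restated: {context}", "lr": 1e-5, "epochs": 1}

def finetune_on(model, self_edit):
    # Placeholder: apply the self-edit as a supervised weight update.
    return model + [self_edit["data"]]  # stand-in for updated weights

def evaluate_downstream(model, task):
    # Placeholder: score the updated model on the task; this is the RL reward.
    return random.random()

model = []  # stand-in for initial weights
contexts, tasks = ["new fact A", "new fact B"], ["quiz A", "quiz B"]

for _ in range(3):  # outer reinforcement-learning rounds
    trajectories = []
    for ctx, task in zip(contexts, tasks):
        self_edit = generate_self_edit(model, ctx)   # 1. propose a self-edit
        updated = finetune_on(model, self_edit)      # 2. update the weights
        reward = evaluate_downstream(updated, task)  # 3. measure downstream reward
        trajectories.append((updated, reward))
    # 4. Reinforce: keep the weights from the best-scoring self-edit
    #    (the paper uses a rejection-sampling-style RL update here).
    model = max(trajectories, key=lambda t: t[1])[0]
```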


Still want to know more about what matters in AI right now?

Read this week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng spoke out in support of high-skilled immigration, sharing his story and warning that visa restrictions could hurt U.S. leadership in AI by discouraging international talent.

“Failure to attract promising students and high-skilled workers would have a huge negative impact on American competitiveness in AI. Indeed, a recent report by the National Security Commission on Artificial Intelligence exhorts the government to ‘strengthen AI talent through immigration.’”

Read Andrew’s full letter here.

Other top AI news and research stories we covered in depth:


Subscribe to Data Points