Machine Learning Research
DeepSeek Ups the Open Weights Ante: DeepSeek-V3 redefines LLM performance and cost efficiency
A new model from Hangzhou upstart DeepSeek delivers outstanding performance and may change the equation for training costs.
Data Points
Court filings show Meta pirated model training data. Stability’s SPAR3D speeds up 3D image generation. How robots aid nursing care workers in Japan. Deliberative alignment uses more compute to ensure safety.
Data Points
AI careers remain just as hot as you might expect. Columbia’s GET model predicts gene expression. Cohere’s North brings easy and secure automation to enterprises. Meta pauses older AI characters but will introduce new ones this year.
Letters
Using AI-assisted coding to build software prototypes is an important way to quickly explore many ideas and invent new things.
Machine Learning Research
Merging multiple fine-tuned models is a less expensive alternative to hosting multiple specialized models. But while model merging can deliver higher average performance across several tasks, it often results in lower performance on specific tasks. New work addresses this issue.
Machine Learning Research
Harvard University amassed a huge new text corpus for training machine learning models.
Machine Learning Research
Large language models have been shown to be capable of lying when users unintentionally give them an incentive to do so. Further research shows that LLMs with access to tools can be incentivized to use them in deceptive ways.
Machine Learning Research
Anthropic analyzed 1 million anonymized conversations between users and Claude 3.5 Sonnet. The study found that most people used the model for software development and also revealed malfunctions and jailbreaks.
Data Points
Nvidia promises to open source Run:ai. SALT inverts distillation by having a smaller model train a larger one. SWE-Gym offers a new way to fine-tune coding agents. Llama put to work to recommend books on Scribd.
Data Points
SmallThinker builds a 3 billion parameter reasoning model. Alibaba cuts prices on its Qwen models. Google unveils the FACTS model benchmark. Smolagents orchestrates smaller open source agents.
Letters
Despite having worked on AI since I was a teenager, I’m now more excited than ever about what we can do with it, especially in building AI applications.