Meta unites its AI teams under Superintelligence Labs: Google tempts developers with a free, open-source Gemini CLI

How Baidu’s newly open-sourced Ernie model puts competitive pressure on rivals. How Cloudflare’s new pay-per-crawl service could boost the cost of training data. How HongShan’s periodically updated Xbench gives developers a business-focused test for evaluating models.


Welcome back! In today’s edition of Data Points, you’ll learn more about:

  • How Baidu’s newly open-sourced Ernie model puts competitive pressure on rivals
  • How Cloudflare’s new pay-per-crawl service could boost the cost of training data
  • How HongShan’s periodically updated Xbench gives developers a business-focused test for evaluating models
  • How an AI-generated rock band reveals a market for generated music

But first:

Meta poaches top talent to build Superintelligence Labs

Meta reorganized its AI teams into Meta Superintelligence Labs, bringing in 11 senior hires from OpenAI, Anthropic, and Google under the direction of Scale AI founder Alexandr Wang and former GitHub CEO Nat Friedman. The unit merges groups previously in charge of research, large language models, and products. Meta CEO Mark Zuckerberg has pledged to invest “hundreds of billions” of dollars in next-generation models. The new lab marks an aggressive push to build systems that match or surpass human performance, and it intensifies AI’s talent and spending race. (Bloomberg)

Google launches open-source Gemini CLI

Google’s Gemini CLI is a command-line interface that executes natural-language commands using the company’s Gemini 2.5 Pro model. The CLI offers a free tier with 60 requests per minute and 1,000 requests per day, and it supports the Model Context Protocol (MCP) standard to connect to external data and services. It directly challenges OpenAI’s Codex and Anthropic’s Claude Code by offering free access to similar command-line capabilities, potentially accelerating AI adoption among developers who avoid paid tools. (VentureBeat)

Baidu open-sources Ernie chatbot

Search giant Baidu announced it will open-source its Ernie chatbot, marking a shift from its previous closed approach and challenging closed competitors. The decision follows an aggressive pricing strategy that has included making Ernie free in February and slashing API prices by 80 percent in March, as the company seeks to undercut rivals and build a developer ecosystem around its technology. Industry analysts describe the latest move as a “declaration of war on pricing.” (SiliconANGLE)

Cloudflare blocks AI crawlers by default and tests pay-per-crawl

In a decisive move against companies that crawl the web to collect AI training data, Cloudflare turned on its AI-bot blocker by default for all customers and launched a Pay Per Crawl beta that lets web publishers charge scrapers. The service enables publishers to block identified AI crawlers entirely or selectively while still allowing crawlers that don’t collect training data, such as search engine crawlers. The change could raise the cost of large-scale web scraping and may encourage model developers to pay licensing fees for data instead of gathering it freely. (Wired)

Chinese venture capital firm launches dynamic AI benchmark

HongShan Capital Group released Xbench, a free benchmark designed to test models on both academic knowledge and business task execution, including activities like sourcing job candidates and matching advertisers with influencers. HongShan intends to update the benchmark quarterly with new questions and maintains a half-public, half-private dataset to prevent models from memorizing desired responses. Currently, OpenAI’s o3 ranks first across categories, followed by ByteDance Doubao and Google Gemini 2.5 Pro. The regularly updated benchmark could reduce the risk of systems overfitting to fixed tests. (MIT Technology Review)

AI-generated rock band covertly gains 500,000 Spotify listeners

A fake band called Velvet Sundown drew more than 500,000 Spotify listeners within weeks of releasing two rock-style albums. Online sleuths found no record of the four listed band members, spotted image and lyric artifacts typical of generation models, and noted that Spotify and some other services do not require AI-generated music to be disclosed. The case illustrates that generated music can reach a mass audience, and it may increase pressure on businesses and regulators to support watermarking and transparency. (Ars Technica)


Want to know more about what matters in AI right now?

Read the latest issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng shared a strategy for building with AI: Reduce project scope to fit your available time. He shows how even small builds can accelerate learning and unlock user feedback.

“If you have only an hour, find a small component of an idea that you’re excited about that you can build in an hour. With modern coding assistants like Anthropic’s Claude Code (my favorite dev tool right now), you might be surprised at how much you can do even in short periods of time!”

Read Andrew’s letter here.


