China and Nvidia make a deal; OpenAI’s new health and wellness app

Universal and Nvidia’s music partnership. A new set of benchmarks for evaluating top models. Test-time learning, a method of training for long-context. Character.AI’s settlements in teen suicide cases.


In today’s edition of Data Points, you’ll learn more about:

  • Universal and Nvidia’s music partnership
  • A new set of benchmarks for evaluating top models
  • Test-time learning, a method of training for long-context
  • Character.AI’s settlements in teen suicide cases

But first:

China approves Nvidia H200 chip imports, with restrictions

China will allow local companies to purchase Nvidia’s H200 AI chips as soon as this quarter, but will bar their use by the military, sensitive government agencies, critical infrastructure, and state-owned enterprises. The approval is significant for Nvidia, which CEO Jensen Huang estimates could tap into a $50 billion Chinese AI chip market. Chinese tech giants Alibaba and ByteDance have each expressed interest in ordering more than 200,000 H200 units, which are priced at around $27,000 per chip. Nvidia will reportedly require full upfront payment from Chinese customers with no cancellation or refund options, transferring financial risk to buyers amid regulatory uncertainty. The H200, part of Nvidia’s older Hopper generation, delivers roughly six times the performance of the previously blocked H20 chip and remains more powerful than current offerings from Chinese rival Huawei. (Reuters)

OpenAI’s ChatGPT Health, a chatbot with medical record access

ChatGPT Health allows users to connect medical records and wellness apps like Apple Health, Function, and MyFitnessPal for personalized health guidance. The service operates as an isolated space within ChatGPT with purpose-built encryption, separate memory storage, and a guarantee that health conversations will not be used to train foundation models. OpenAI developed the product over two years with more than 260 physicians across 60 countries, who provided feedback on over 600,000 model outputs and helped create HealthBench, an evaluation framework based on clinical standards. Users on Free, Go, Plus, and Pro plans outside the European Economic Area, Switzerland, and the United Kingdom can join the waitlist, with broader access planned for web and iOS in the coming weeks. (OpenAI)

UMG and Nvidia team up for AI music generation and understanding

Universal Music Group and Nvidia announced a partnership to advance AI in music discovery and creation, marking the first collaboration between the world’s largest record label and the leading AI chipmaker. The companies will build on Nvidia’s Music Flamingo model, an audio-language system that allows listeners to explore music through emotional narrative and cultural resonance rather than traditional genre or style tags. The partnership includes plans for an artist incubator where musicians can co-design and test new AI tools. The collaboration represents a significant shift in the music industry’s approach to AI, moving from lawsuits against music startups toward cooperative ventures that aim to protect catalogs while capitalizing on the technology. Universal previously sued AI music companies Suno and Udio for copyright infringement but later settled and struck deals to collaborate on new platforms. (Universal Music Group)

Artificial Analysis switches benchmarks to measure productivity over test scores

Artificial Analysis released version 4.0 of its Intelligence Index, shifting evaluation from traditional academic tests to real-world economic tasks across 44 occupations and 9 industries. The new GDPval-AA benchmark tests whether AI can produce actual professional deliverables like documents, slides, and spreadsheets, with models receiving shell access and web browsing through the “Stirrup” agentic harness. OpenAI’s GPT-5.2 with extended reasoning leads GDPval-AA with an Elo rating of 1442, followed by Anthropic’s Claude Opus 4.5 at 1403, though top models now score 50 or below on the recalibrated index scale compared to 73 on the previous version. The update also introduces AA-Omniscience, which revealed that Google’s Gemini 3 Pro achieves 54 percent accuracy but shows an 88 percent hallucination rate, while Claude 4.5 Sonnet Thinking shows a 48 percent hallucination rate. The results suggest that high accuracy does not guarantee trustworthiness when models guess rather than abstain on uncertain questions. (VentureBeat)
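For a sense of what the 39-point gap between the two leading models means, here is a rough sketch. It assumes the ratings follow the conventional logistic Elo formula with a 400-point scale, which the report above does not confirm. The function name `elo_expected_score` is illustrative only.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head wins for A over B under the standard Elo model.

    Assumption: the 400-point logistic scale used in chess-style Elo systems.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the summary above.
p_leader = elo_expected_score(1442, 1403)
print(f"Expected preference rate for the leader: {p_leader:.1%}")
```

Under that assumption, the leader's output would be preferred in a bit more than half of head-to-head comparisons, so the gap is meaningful but not decisive.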

AI model learns from context by updating its weights during use

Researchers at Nvidia, Stanford, and other institutions introduced TTT-E2E, which treats long-context language modeling as a continual learning problem rather than an architecture challenge. The model uses a standard Transformer with sliding-window attention but continues learning at test time through next-token prediction, compressing context into its weights. In experiments with 3-billion-parameter models trained on 164 billion tokens, TTT-E2E matched the scaling behavior of full attention across context lengths while maintaining constant inference latency like RNNs, running 2.7× faster than full attention at 128K context on an H100 GPU. The method achieved lower test loss than full attention throughout entire context windows, with most gains coming from earlier tokens. The research challenges a fundamental assumption in how model-makers build AI systems for long context, with test-time learning potentially making information retrieval faster and more scalable. (arXiv)
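The core idea of learning at test time can be illustrated with a toy sketch. This is not the TTT-E2E method itself: it swaps the Transformer for a hypothetical bigram model, omits sliding-window attention and any training-time preparation, and uses a made-up repeating token stream. It shows only the general mechanism of predicting the next token, measuring the loss, and then taking a gradient step so that the context ends up stored in the weights.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stream_loss(tokens, vocab_size, lr):
    """Average next-token loss over a stream; lr > 0 enables test-time updates."""
    # W[prev][nxt] holds the logits of a toy bigram model (a stand-in for a real network).
    W = [[0.0] * vocab_size for _ in range(vocab_size)]
    total = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        probs = softmax(W[prev])
        total += -math.log(probs[nxt])  # score the prediction before learning from it
        # The cross-entropy gradient with respect to the logits is probs - one_hot(nxt).
        for j in range(vocab_size):
            grad = probs[j] - (1.0 if j == nxt else 0.0)
            W[prev][j] -= lr * grad
    return total / (len(tokens) - 1)


context = [0, 1, 2, 3] * 50  # hypothetical, highly repetitive context
frozen_loss = stream_loss(context, vocab_size=4, lr=0.0)    # weights never change
adaptive_loss = stream_loss(context, vocab_size=4, lr=0.5)  # weights updated during use
print(f"frozen: {frozen_loss:.3f}  adaptive: {adaptive_loss:.3f}")
```

The frozen model stays at the uniform-guess loss, while the adaptive one improves as it reads, and its memory cost does not grow with the length of the stream.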

Google and Character.AI settle lawsuits over teen deaths

The companies agreed to settle five lawsuits accusing their chatbots of causing harm to children, including one case in which a 14-year-old died by suicide after conversing with a bot that allegedly encouraged him to end his life. The settlements, filed in federal courts across Florida, Texas, Colorado, and New York, follow mounting scrutiny of AI chatbots and their effects on users, particularly children. Character.AI banned users under 18 in November 2025, while the Federal Trade Commission opened an inquiry into chatbot safety. The settlement terms were not disclosed, and the agreements have not been finalized. (The New York Times)


Want to know more about what matters in AI right now?

Read the latest issue of The Batch for in-depth analysis of news and research. 

Last week, Andrew Ng discussed a new course designed to teach non-coders how to build AI-driven applications, emphasized the importance of coding skills for productivity, and encouraged everyone to engage with AI tools.

“I’ve often spoken about why everyone should learn to code. I’m seeing a rapidly growing productivity gap between people who know how to code and those who don’t. For many job roles I hire for, I now require at least basic coding knowledge.” 

Read Andrew’s letter here.

Other top AI news and research stories covered in depth:


A special offer for our community

DeepLearning.AI recently launched the first-ever subscription plan for our entire course catalog! As a Pro Member, you’ll immediately enjoy access to:

  • Over 150 AI courses and specializations from Andrew Ng and industry experts
  • Labs and quizzes to test your knowledge
  • Projects to share with employers
  • Certificates to testify to your new skills
  • A community to help you advance at the speed of AI

Enroll now to lock in a year of full access for $25 per month paid upfront, or opt for month-to-month payments at just $30 per month. Both payment options begin with a one-week free trial. Explore Pro’s benefits and start building today!
