Seedance creates new movie-making interface: Even top models struggle to identify visual specifics

The future of ads in ChatGPT. AI stresses in high-tech workplaces. Big models’ curse of multilingual training. Opus 4.6’s expensive fast mode.

Designers create CGI sword fight on green screen with high-tech monitors and motion capture equipment.

In today’s edition of Data Points, you’ll learn more about:

  • The future of ads in ChatGPT
  • AI stresses in high-tech workplaces
  • Big models’ curse of multilingual training
  • Opus 4.6’s expensive fast mode

But first:

Seedance 2.0, a top video generation model with multimodal inputs

ByteDance released Seedance 2.0, a video generation model that accepts images, videos, audio, and text as simultaneous inputs, giving creators precise control over visual style, motion, camera work, rhythm, and narrative. The model uses an @-mention syntax to explicitly reference uploaded assets in natural-language prompts — for example, setting a first frame with an image, extracting camera movements from a reference video, syncing to audio, and describing the desired action in text. The system supports up to 12 files per generation (with per-type limits of 9 images, 3 videos of up to 15 seconds each, and 3 MP3 files), producing outputs between 4 and 15 seconds. Key capabilities include improved physics and motion accuracy, character and object consistency across frames, motion and camera replication from reference videos, video extension and editing without full regeneration, audio-synchronized generation with lip-sync across languages, and beat-synced editing for music-video-style content. Users can also generate long unbroken shots, swap characters while preserving action, apply visual effects, and transfer styles between videos. Seedance 2.0 is currently available only to select users on ByteDance’s Jimeng AI video platform. (WaveSpeed)
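
To illustrate how such a prompt might read (a hypothetical example based on the description above, not ByteDance’s documented syntax): a creator could upload a still, a reference clip, and a soundtrack, then write something like “Open on @image1 as the first frame, follow the camera move from @video1, cut on the beat of @audio1, and have the duelists cross swords as the chorus hits.”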

Top models fall short on discerning new visual benchmark

Researchers at Moonshot AI released WorldVQA, a benchmark measuring whether multimodal language models can recognize specific visual objects rather than hallucinate or provide generic labels. The dataset contains 3,500 image-question pairs across nine categories, requiring exact answers—identifying a dog breed as simply “dog” counts as incorrect. Results expose a hard ceiling: Google’s Gemini 3 Pro leads at 47.4 percent accuracy, followed by Kimi K2.5 at 46.3 percent, Claude Opus 4.5 at 36.8 percent, and GPT-5.2 at 28 percent, meaning no tested model cracks 50 percent. The benchmark reveals predictable failure patterns: models perform relatively well on brands and sports—topics saturated in training data—but collapse on nature and cultural categories, falling back to generic terms like “flower” instead of naming specific species. More troubling, all tested models exhibit severe overconfidence. For AI systems broadly, the inability to reliably recognize what they see, and to accurately assess their own knowledge, limits where they can be deployed. (Kimi)

OpenAI begins testing ads for free (and some paid) U.S. users

OpenAI launched an advertising test in ChatGPT for logged-in adult users on the Free and Go subscription tiers in the United States, while Pro, Business, Enterprise, and Education tiers remain ad-free. Ads are clearly labeled and visually separated from ChatGPT’s responses, and OpenAI states that advertisements do not influence the model’s answers or reasoning. The company matches ads to conversation topics and past interactions, but advertisers receive only aggregate performance data—no access to chat history, memories, or personal details. Users can dismiss ads, control personalization, delete ad data, or upgrade to paid plans to avoid ads entirely. OpenAI frames ads as funding infrastructure for free and low-cost access while excluding sensitive topics like health, mental health, and politics from ad eligibility during the test phase. The company plans to expand the program gradually based on user feedback and evolving safeguards. (OpenAI)

AI intensifies work rather than reducing it

An eight-month study of a 200-person tech company found that generative AI tools intensified work rather than reducing it, as employees voluntarily adopted AI and worked faster, absorbed broader responsibilities, and extended work into evenings and breaks. The research identified three drivers of intensification: task expansion across functional lines (product managers coding, researchers engineering), blurred boundaries between work and non-work as AI reduced friction for starting tasks during breaks, and increased multitasking with multiple AI threads running in parallel. While this voluntary expansion initially appeared as a productivity win, it masked cumulative cognitive strain, workload creep, and rising speed expectations that left workers busier despite efficiency gains. Leaders cannot rely on employee self-regulation; instead, organizations should adopt an “AI practice” with intentional norms including structured decision pauses, sequenced work outputs to prevent constant interruptions, and protected time for human connection. Without active organizational shaping, AI naturally drives intensification rather than contraction, creating risks of burnout, decision degradation, and unsustainable work patterns. (Harvard Business Review)

Research team finds scaling laws for massively multilingual models

Google researchers released ATLAS, a framework quantifying how to efficiently train multilingual language models across 400+ languages. The results formalize the “curse of multilinguality” (performance degradation when adding languages) by introducing a scaling law accounting for model size, data quantity, and language count. Results show that supporting twice as many languages requires scaling model size by 1.18x and total training data by 1.66x, with positive transfer offsetting capacity constraints. For practitioners with limited compute budgets, the framework specifies breakeven points between fine-tuning existing multilingual models and pre-training from scratch — typically between 144 billion and 283 billion tokens for 2-billion-parameter models. This guidance directly addresses a critical gap: over 50 percent of AI model users speak languages other than English, yet prior scaling research focused almost exclusively on English monolingual settings. (Google Research)
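
To make those multipliers concrete (an illustrative extrapolation from the stated figures, not an example from the paper): a 2-billion-parameter model trained on 100 billion tokens for 200 languages would need roughly 2 × 1.18 ≈ 2.4 billion parameters and 100 × 1.66 = 166 billion tokens to support 400 languages at comparable quality.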

Claude Opus 4.6 comes fast, but at a high cost

Anthropic has released fast mode as a research preview feature in Claude Code, offering lower response latency on the same Opus 4.6 model through a different API configuration that prioritizes speed over cost efficiency. Fast mode is toggled via the /fast command in the Claude Code CLI and VS Code extension, and is available exclusively to users on subscription plans (Pro, Max, Team, Enterprise) through extra usage billing, not included in plan rate limits. Fast mode pricing starts at $30 per million input tokens and $150 per million output tokens for contexts under 200K, scaling to $60 and $225 respectively for larger contexts—substantially higher than standard Opus pricing. The feature carries a 50 percent introductory discount through February 16. Anthropic positions fast mode for interactive workflows like rapid iteration, live debugging, and time-sensitive tasks where latency outweighs cost; the company explicitly recommends disabling it for batch processing, CI/CD pipelines, and cost-sensitive workloads. (Claude)
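
As a rough illustration of the premium (our arithmetic, not an example from Anthropic): a single fast-mode request with 100,000 input tokens and 5,000 output tokens in the under-200K tier would cost about (0.1 × $30) + (0.005 × $150) = $3.75 at list price, or roughly $1.88 with the 50 percent introductory discount applied.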


Still want to know more about what matters in AI right now?

Read the latest issue of The Batch for in-depth analysis of news and research.

Last week, Andrew Ng discussed how AI is reshaping the job market, emphasizing that while AI-related job losses have been minimal so far, demand for AI skills is changing employment opportunities and making workers who adapt to AI more valuable.

“At the same time, when companies build new teams that are AI native, sometimes the new teams are smaller than the ones they replace. AI makes individuals more effective, and this makes it possible to shrink team sizes.”

Read Andrew’s letter here.

Other top AI news and research stories covered in depth:


A special offer for our community

DeepLearning.AI recently launched the first-ever subscription plan for our entire course catalog! As a Pro Member, you’ll immediately enjoy access to:

  • Over 150 AI courses and specializations from Andrew Ng and industry experts
  • Labs and quizzes to test your knowledge 
  • Projects to share with employers 
  • Certificates to testify to your new skills
  • A community to help you advance at the speed of AI

Enroll now to lock in a year of full access for $25 per month paid upfront, or opt for month-to-month payments at just $30 per month. Both payment options begin with a one-week free trial. Explore Pro’s benefits and start building today!

Try Pro Membership