Neural Networks - The Batch

Diagram showing sequential task learning steps with images of robotic tasks and flow arrows.

Machine Learning Research

Robots That Adapt to New Tasks: Sony and university researchers train robots on new tasks without catastrophic forgetting

Neural networks can forget how to perform earlier tasks as they learn new ones.

Two graphs show TTT-E2E maintains stable loss and latency across increasing context lengths up to 128k.

Machine Learning Research

Learning Long Context at Inference: Test-Time Training End-to-End (TTT-E2E) retrains model weights to handle long inputs

Large language models typically become less accurate and slower when they process longer contexts, but researchers enabled an LLM to keep accuracy stable and inference time constant as its context grew.

Two comparison tables show AI model performance across varied benchmarks, highlighting LFM2.5-1.2B.

Machine Learning Research

Faster Reasoning at the Edge: Liquid AI’s small reasoning model mixes attention with convolutional layers for efficiency

Reasoning models in the 1 to 2 billion-parameter range typically require more than 1 gigabyte of RAM to run. Liquid AI released one that runs in less than 900 megabytes, and does it with exceptional speed and efficiency.

Flowchart showing Tiny Recursive Model process with stages: input, prediction, and latent refinement.

Machine Learning Research

Small Models Solve Hard Puzzles: Tiny Recursive Model beats larger competitors at games like Sudoku and Maze

Large language models often fail at puzzles like Sudoku, for which a solution includes multiple elements and a single mistake invalidates all of them. Researchers showed that a tiny network, by repeatedly refining its solution, can solve this sort of puzzle well.

Graph shows Ernie-4.5 outperforming competitors in document understanding and visual reasoning tasks.

Machine Learning Research

Baidu’s Multimodal Bids: Giant Ernie 5 natively generates multiple media; Ernie-4.5-VL-28B-A3B-Thinking tops vision-language metrics

Baidu debuted two models: a lightweight, open-weights, vision-language model and a giant, proprietary, multimodal model built to take on U.S. competitors.

Flowchart of Text-to-LoRA model processes task embeddings into LoRA adapters, showing weights and losses.

Machine Learning Research

LoRA Adapters On Tap: Text-to-LoRA generates task-specific LoRA adapters directly from natural language descriptions

The approach known as LoRA streamlines fine-tuning by training a small adapter that modifies a pretrained model’s weights at inference. Researchers built a model that generates such adapters directly.
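
To make the adapter idea concrete, here is a minimal sketch of the standard LoRA update, in which a frozen weight matrix W is adjusted by a low-rank product scaled by alpha / rank. It does not show how Text-to-LoRA generates adapters. The function names, matrix sizes, and values are illustrative assumptions, not taken from the research.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the usual LoRA merge (hypothetical helper)."""
    delta = matmul(B, A)  # low-rank update with the same shape as W
    scale = alpha / rank
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


def adapter_params(d_out, d_in, rank):
    """Count the trainable values in the adapter matrices B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Toy example: a 3x3 pretrained weight matrix and a rank-1 adapter.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0], [0.0], [2.0]]   # d_out x rank
A = [[0.5, 0.0, -1.0]]      # rank x d_in
adapted = apply_lora(W, A, B, alpha=2.0, rank=1)

# For an assumed 4096 x 4096 layer, a rank-8 adapter is far smaller than the layer itself.
full_count = 4096 * 4096
lora_count = adapter_params(4096, 4096, rank=8)
```

In this toy case the adapter holds 65,536 values against 16,777,216 in the full layer, which is why producing an adapter is much cheaper than producing a full set of weights.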

Image illustrates data flow from raw satellite sources through processing to embeddings for climate tracking.

Machine Learning Research

Earth Modeled in 10-Meter Squares: Google’s AlphaEarth Foundations tracks the whole planet’s climate, land use, and potential for disasters in detail and at scale

Researchers built a model that integrates satellite imagery and other sensor readings across the entire surface of the Earth to reveal patterns of climate, land use, and other features.

Table comparing DINO, DINOv2, DINOv3, SigLIP 2, and PE on segmentation, depth estimation, tracking, and classification tasks.

Machine Learning Research

Better Image Processing Through Self-Supervised Learning: Meta’s DINOv3 gets an updated loss term and improved vision performance

DINOv2 showed that a vision transformer pretrained on unlabeled images could produce embeddings that are useful for a wide variety of tasks. Now it has been updated to improve the performance of its embeddings in segmentation and other vision tasks.

BitNet b1.58 matrix multiplication shows ternary weights enabling faster neural network computation.

Machine Learning Research

Low Precision, High Performance: Researchers at Microsoft and Tsinghua propose a 1.58-bit AI model that rivals full-precision competitors

Reducing the number of bits used to represent each parameter in a neural network from, say, 16 bits to 8 bits shrinks the network’s size and boosts its speed. Researchers took this approach to an extreme: They built a competitive large language model whose weights are limited to three values.
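
The "1.58-bit" figure follows from the three allowed values: a weight that can take one of three values carries log2(3), or about 1.58, bits of information. The sketch below illustrates this along with one simple way to map full-precision weights to -1, 0, or +1 by scaling with the mean absolute value, then rounding and clipping. The function name and sample weights are illustrative assumptions, and the sketch is not presented as the researchers' exact procedure.

```python
import math


def ternary_quantize(weights):
    """Map each weight to -1, 0, or +1 using a mean-absolute-value scale (hypothetical helper)."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0  # fall back to 1.0 if all weights are zero
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


# Information content of a three-valued weight.
bits_per_weight = math.log2(3)  # roughly 1.585

# Made-up weights for illustration only.
weights = [0.9, -0.05, -1.4, 0.3, 0.02, -0.7]
quantized, scale = ternary_quantize(weights)

# Approximate reconstruction: each ternary value times the shared scale.
dequantized = [q * scale for q in quantized]
```

With weights restricted to -1, 0, and +1, multiplying by a weight reduces to adding, subtracting, or skipping an input, which is where the speed gain comes from.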

Dual line graphs showing factual QA accuracy and NLL against memory size for NQ and TQA datasets in AI models.

Machine Learning Research

Memory Layers for More-Factual Output: Meta researchers build Llama-style models that recall details without needing more computing resources

Improving a large language model’s factual accuracy typically requires making it bigger, which, in turn, involves more computation. Researchers devised an architecture that enables models to recall relevant details without significantly increasing the amount of computation required.

A participant types while an MEG scan decodes brain activity into text in real-time, showing typed vs. decoded text.

Science

Reading Minds, No Brain Implant Required: Brain2Qwerty, a system that decodes thoughts using brain waves without surgery

To date, efforts to decode what people are thinking from their brain waves have often relied on electrodes implanted in the cortex. New work used devices outside the head to pick up brain signals, enabling an AI system to accurately guess what a subject was typing as they typed.

Robotic arms collaborating to fold a red garment on a table.

Hardware

Household Help: π0, a machine learning system for household robotics

A new generation of robots can handle some household chores with unusual skill.