Carnegie Mellon University - The Batch

Machine Learning Research

A Transformer Alternative Emerges: Mamba, a new approach that may outperform transformers

An architectural innovation improves upon transformers — up to 2 billion parameters, at least...

Machine Learning Research

Better, Faster Network Pruning: Researchers devise pruning method that boosts AI speed

Pruning weights from a neural network makes it smaller and faster, but it can take a lot of computation to choose weights that can be removed without degrading the network’s performance.

Machine Learning Research

Text or Images, Input or Output: GILL, an innovative approach to multimodal model training

GPT-4V introduced a large multimodal model that generates text from images and, with help from DALL-E 3, generates images from text. However, OpenAI hasn’t fully explained how it built the system. A separate group of researchers described their own method.

Flowcharts show how a new contrastive learning approach uses metadata to improve AI image classifiers

Machine Learning Research

Learning From Metadata: Descriptive Text Improves Performance for AI Image Classification Systems

Images in the wild may not come with labels, but they often include metadata. A new training method takes advantage of this information to improve contrastive learning.

A series of graphs show the carbon emissions associated with training AI models.

Science

Cutting the Carbon Cost of Training: A New Tool Helps NLP Models Lower Their Gas Emissions

You can reduce your model’s carbon emissions by being choosy about when and where you train it.

Animation showing probability of children who may benefit from intervention

Tech & Society

Child-Welfare Agency Drops AI: Oregon and Pennsylvania Halt Use of AI Tool for At-Risk Kids

Officials in charge of protecting children stopped using a machine learning model designed to help them make decisions in difficult cases. The U.S. state of Oregon halted its use of an algorithm intended to identify children who may benefit from intervention.

Illustration of how different data split strategies partition the labelled data

Machine Learning Research

Fine-Tune Your Fine-Tuning: New method optimizes training for few-shot NLP models.

Let’s say you have a pretrained language model and a small amount of data to fine-tune it to answer yes-or-no questions. Should you fine-tune it to classify yes/no or to fill in missing words — both viable approaches that are likely to yield different results?

A four-legged robot walking over difficult and changing terrain

Machine Learning Research

Walking the Dog: Training a robot to walk over unsteady terrain with RL.

A reinforcement learning system enabled a four-legged robot to amble over unfamiliar, rapidly changing terrain.

Neural networks generating novel views of a 3D scene based on existing pictures

Machine Learning Research

3D Scene Synthesis for the Real World: Generating 3D scenes with radiance fields and image data

Researchers have used neural networks to generate novel views of a 3D scene based on existing pictures plus the positions and angles of the cameras that took them. In practice, though, you may not know the precise camera...

Tech & Society

Unsupervised Prejudice: Image classification models learned bias from ImageNet.

Social biases are well documented in decisions made by supervised models trained on ImageNet’s labels. But they also crept into the output of unsupervised models pretrained on the same dataset.

Data related to Covid-19 symptoms prediction

Machine Learning Research

Cats Cured of Covid: Why some deep learning models thought cats had Covid

Neural networks are famously bad at interpreting input that falls outside the training set’s distribution, so it’s not surprising that some models are certain that cat pictures show symptoms of Covid-19. A new approach won’t mistakenly condemn your feline to a quarantine.

Data and graphs related to teacher networks

Machine Learning Research

Flexible Teachers, Smarter Students: Meta Pseudo Labels improves knowledge distillation.

Human teachers can teach more effectively by adjusting their methods in response to student feedback. It turns out that teacher networks can do the same.