
Machine Learning Research
Faster Learning for Diffusion Models: Pretrained embeddings accelerate diffusion transformers’ learning
Diffusion transformers learn faster when they can look at embeddings generated by a pretrained model like DINOv2.
Machine Learning Research
Diffusion models usually take many noise-removal steps to produce an image, which takes time at inference. There are ways to reduce the number of steps, but the resulting systems are less effective. Researchers devised a streamlined approach that doesn’t sacrifice output quality.
Science
Materials that have specific properties are essential to progress in critical technologies like solar cells and batteries. A machine learning model designs new materials to order.
Machine Learning Research
Typical large language models are autoregressive, predicting tokens one at a time, from left to right. A new model refines all text tokens at once.
Tech & Society
Last year, we saw an explosion of models that generate either video or audio outputs in high quality. In the coming year, I look forward to models that produce video clips complete with audio soundtracks including speech, music, and sound effects.
Machine Learning Research
The gap is narrowing between closed and open models for video generation.
Machine Learning Research
A new model improves on recent progress in generating interactive virtual worlds from still images.
Business
Amazon introduced a range of models that confront competitors head-on.
Machine Learning Research
Researchers devised a way to cut the cost of training video generators. They used it to build a competitive open source text-to-video model and promised to release the training code.
Machine Learning Research
Generative adversarial networks (GANs) produce images quickly, but the images are of relatively low quality. Diffusion image generators typically take more time, but they produce higher-quality output. Researchers aimed to achieve the best of both worlds.
Tech & Society
Google’s new mobile phones put advanced computer vision and audio research into consumers’ hands. The Alphabet division introduced its flagship Pixel 8 and Pixel 8 Pro smartphones at its annual hardware-launch event. Both units feature AI-powered tools for editing photos and videos.
Machine Learning Research
A tweak to diffusion models, which are responsible for most of the recent excitement about AI-generated images, enables them to produce more realistic output.