Machine Learning Research
How DeepSeek Did It: Researchers describe training methods and hardware choices for DeepSeek’s V3 and R1 models
DeepSeek made headlines late last year when it built a state-of-the-art, open-weights large language model at a cost far lower than usual. The upstart developer has now shared new details about its training methods and hardware choices.