Machine Learning Research
Masking Private Data in Training Sets: Google researchers released VaultGemma, an open-weights model redacting personal information
Large language models often memorize details in their training data, including private information that may appear only once, like a person’s name, address, or phone number. Researchers built the first open-weights language model that’s guaranteed not to remember such facts.