Meta’s Smart Glasses Come Into Focus: Meta reveals further details of Aria Gen 2 smart glasses for multisensory AI research
Meta revealed new details about its latest Aria eyeglasses, which aim to give AI models a streaming, multisensory, human perspective.

What’s new: Meta described its Aria Gen 2 smart-glasses platform in a blog post that focuses on capabilities relevant to research in augmented reality, “embodied AI” such as robot training, and “contextual AI” for personal use. Units will be available to researchers later this year. Meanwhile, you can apply for access to Aria Gen 1 and download open-source datasets, models, tools, 3D objects, and evals.
How it works: Aria Gen 2 packs an impressive variety of technologies into a device shaped like a pair of glasses that weighs about as much as an egg (around 75 grams) and runs 6 to 8 hours on a charge. A suite of sensors enables the unit, in real time, to interpret user activity (including hand motions), surroundings, location, and interactions with nearby compatible devices. A privacy switch lets users disable data collection.
- A Qualcomm Snapdragon 835 chip with 4GB of RAM and 128GB of storage processes input and output on the device itself. Users can stream the unit’s output, such as video, audio, and 3D point clouds, to a local PC or upload it for processing by perception services via cloud-based APIs.
- The unit includes five cameras. An RGB camera captures the user’s point of view. Two eye-tracking cameras follow the user’s visual attention based on per-eye gaze direction, vergence point, pupil diameter, and blinking (a vergence-point sketch appears after this list). A stereoscopic pair helps map the surroundings in three dimensions via simultaneous localization and mapping (SLAM). In addition, an ambient light sensor helps control camera exposure, and its ultraviolet mode helps distinguish indoor from outdoor environments.
- Seven microphones monitor surrounding sounds and help locate their sources (a two-microphone direction-finding sketch follows this list). A separate contact microphone picks up the user’s voice, helping to make the user intelligible in noisy environments. A pair of open-ear speakers reproduces sounds.
- Other sensors include two motion-sensing inertial measurement units (IMUs), a barometer, and a magnetometer to help track the unit’s motion and orientation; a global navigation satellite system (GNSS) receiver to help track its location; and a photoplethysmography (PPG) sensor to detect the user’s heart rate (a rough heart-rate sketch follows this list). Wi-Fi and Bluetooth radios connect to external networks and devices, and a USB-C port provides a wired connection.
- Most sensor readings are calibrated and time-stamped against a common clock with nanosecond resolution, which enables synchronization with external devices, including nearby Aria units (a timestamp-matching sketch follows this list).
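The eye-tracking cameras report per-eye gaze direction and a vergence point. As a rough illustration of what a vergence point is, the sketch below finds the 3D point closest to two gaze rays in the least-squares sense; the eye positions, direction vectors, and function name are illustrative, not Meta’s API.

```python
import numpy as np

def vergence_point(eye_positions, gaze_dirs):
    """Least-squares 3D point closest to both gaze rays.

    eye_positions: (2, 3) array of left/right eye centers.
    gaze_dirs:     (2, 3) array of gaze direction vectors.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(eye_positions, gaze_dirs):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projects onto the plane orthogonal to this ray
        A += P
        b += P @ p
    return np.linalg.solve(A, b)

# Toy example: eyes 6.4 cm apart, both looking at a point 1 m straight ahead.
eyes = np.array([[-0.032, 0.0, 0.0], [0.032, 0.0, 0.0]])
target = np.array([0.0, 0.0, 1.0])
print(vergence_point(eyes, target - eyes))  # ≈ [0, 0, 1]
```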
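Locating sounds with a microphone array rests on differences in arrival time across microphones. The sketch below shows the idea for a single pair of microphones using plain cross-correlation; the spacing, sample rate, and function name are assumptions for illustration, not Aria Gen 2 specifications or Meta’s algorithm.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
MIC_SPACING = 0.14       # m, illustrative distance between two microphones
SAMPLE_RATE = 48_000     # Hz, illustrative

def direction_of_arrival(sig_left, sig_right):
    """Estimate a source's azimuth from a two-microphone pair.

    Cross-correlation finds the inter-mic delay, which the far-field
    approximation delay = spacing * sin(angle) / speed converts to an angle.
    """
    corr = np.correlate(sig_left, sig_right, mode="full")
    lag = np.argmax(corr) - (len(sig_right) - 1)   # delay in samples (signed)
    delay = lag / SAMPLE_RATE                      # seconds
    sin_theta = np.clip(delay * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Toy check: the same noise burst reaches the right mic 5 samples later.
burst = np.random.default_rng(0).standard_normal(2048)
print(direction_of_arrival(burst, np.roll(burst, 5)))  # ≈ 15 degrees off axis
```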
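The PPG sensor measures blood-volume changes, from which heart rate can be estimated by finding the beat-to-beat interval. Below is a minimal sketch on a synthetic waveform; the sample rate and thresholds are placeholders, and a real pipeline would also reject motion artifacts, which matter on a head-worn device.

```python
import numpy as np
from scipy.signal import find_peaks

def heart_rate_bpm(ppg, sample_rate_hz):
    """Rough heart-rate estimate from a PPG waveform via peak counting."""
    # Keep prominent peaks at least 0.4 s apart (caps the estimate near 150 bpm).
    peaks, _ = find_peaks(ppg, distance=int(0.4 * sample_rate_hz),
                          height=0.5 * np.max(ppg))
    intervals = np.diff(peaks) / sample_rate_hz  # seconds between beats
    return 60.0 / intervals.mean()

# Synthetic 72-bpm pulse sampled at 100 Hz with a little noise.
fs = 100
t = np.arange(0, 30, 1 / fs)
ppg = np.sin(2 * np.pi * (72 / 60) * t) + 0.05 * np.random.default_rng(1).standard_normal(t.size)
print(round(heart_rate_bpm(ppg, fs)))  # ≈ 72
```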
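Nanosecond time-stamping against a common clock makes it straightforward to pair readings from different sensors or devices. The sketch below matches two timestamp streams within a one-millisecond tolerance; the frame rates, offset, and tolerance are illustrative values, not numbers Meta has published.

```python
import numpy as np

def match_by_timestamp(ts_a_ns, ts_b_ns, tolerance_ns=1_000_000):
    """Pair samples from two streams that share a common clock.

    ts_a_ns, ts_b_ns: sorted int64 timestamps in nanoseconds.
    Returns (i, j) index pairs whose timestamps differ by at most
    tolerance_ns (1 ms here, an illustrative threshold).
    """
    nearest = np.searchsorted(ts_b_ns, ts_a_ns)  # candidate positions in stream B
    pairs = []
    for i, k in enumerate(nearest):
        for cand in (k - 1, k):
            if 0 <= cand < len(ts_b_ns) and abs(int(ts_a_ns[i]) - int(ts_b_ns[cand])) <= tolerance_ns:
                pairs.append((i, cand))
                break
    return pairs

# Two 30 Hz camera streams, the second offset by 0.2 ms.
a = np.arange(0, 10**9, 33_333_333, dtype=np.int64)
b = a + 200_000
print(match_by_timestamp(a, b)[:3])  # [(0, 0), (1, 1), (2, 2)]
```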
Applications: Meta showed off a few applications in video demonstrations.
- The fields of view of the two stereoscopic cameras overlap by 80 degrees, enabling the system to generate a depth map of the user’s surroundings and reconstruct the scene’s 3D geometry dynamically in real time (see the disparity-to-depth sketch after this list).
- This 3D capability enables the system to track the user’s hands, including articulations of all hand joints, in 3D space. Meta touts this capability for annotating datasets used to train dexterous robot hands (a hypothetical annotation record appears after this list).
- The contact microphone picks up the user’s voice through vibrations in the unit’s nosebridge rather than the surrounding air. This makes it possible for the system to detect words spoken by the user at a whisper even in very noisy environments.
- The unit broadcasts timing information via a sub-gigahertz radio, so camera views from multiple Aria Gen 2 units can be synchronized with sub-millisecond accuracy.
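Turning overlapping stereo views into a depth map typically relies on the standard pinhole-stereo relation depth = focal length × baseline ÷ disparity. The sketch below applies that formula; the focal length and baseline are made-up numbers, not Aria Gen 2 specifications, and Meta’s actual reconstruction pipeline is not described here.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Classic pinhole-stereo relation: depth = focal * baseline / disparity.

    disparity_px: horizontal offset between the two views, in pixels.
    focal_px:     focal length in pixels; baseline_m: camera separation in meters.
    """
    disparity_px = np.asarray(disparity_px, dtype=float)
    depth = np.full_like(disparity_px, np.inf)   # zero disparity → point at infinity
    valid = disparity_px > 0
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth

# A 12-pixel disparity with a 300-pixel focal length and 10 cm baseline → 2.5 m.
print(depth_from_disparity([12.0, 6.0, 0.0], focal_px=300.0, baseline_m=0.10))
```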
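For dataset annotation of the kind Meta describes for robot-hand training, per-frame hand poses are typically stored as time-stamped joint positions. The record below is a hypothetical schema: the field names, 21-joint convention, and flattening helper are inventions for illustration, not Meta’s data format.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class HandFrame:
    """One frame of 3D hand-tracking output (hypothetical schema)."""
    timestamp_ns: int
    side: str                                       # "left" or "right"
    joints_xyz_m: List[Tuple[float, float, float]]  # 21 joint positions, meters, device frame
    confidence: float                               # overall tracking confidence in [0, 1]

def to_training_row(frame: HandFrame) -> dict:
    """Flatten a frame into a row suitable for a tabular robot-training dataset."""
    row = {"t_ns": frame.timestamp_ns, "side": frame.side, "conf": frame.confidence}
    for i, (x, y, z) in enumerate(frame.joints_xyz_m):
        row.update({f"j{i}_x": x, f"j{i}_y": y, f"j{i}_z": z})
    return row

# Example: one right-hand frame with all joints at the origin.
frame = HandFrame(timestamp_ns=0, side="right",
                  joints_xyz_m=[(0.0, 0.0, 0.0)] * 21, confidence=0.9)
print(len(to_training_row(frame)))  # 3 metadata fields + 21 * 3 coordinates = 66
```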
Behind the news: Meta launched Project Aria in 2020, offering first-generation hardware to researchers. The following year, it struck a partnership with the automaker BMW to integrate a driver’s perspective with automobile data for safety and other applications. Research projects at a variety of universities followed. Meta unveiled the second-generation glasses in February.
Why it matters: Many current AI models learn from datasets that lack timing information, so they gain little sense of human experience as it unfolds from moment to moment. Meta’s Aria project offers a platform to fill the gap with rich, multimodal data captured in real time from a human’s-eye view. Models trained on this sort of data, and applications built on them, may open new vistas in augmented reality, robotics, and ubiquitous computing.
We’re thinking: Google Glass came and went 10 years ago. Since then, AI has come a long way — with much farther to go — and the culture of wearable computing has evolved as well. It’s a great moment to re-explore the potential of smart glasses.