Apple's ARKit 4 introduces new depth capabilities and expands face tracking to more devices

This afternoon, following Apple's 2020 Worldwide Developers Conference (WWDC) keynote, the company detailed ARKit 4, the latest version of its augmented reality (AR) app development kit for iOS devices. Available in beta, it introduces a Depth API that creates a new way to access depth information on the iPad Pro. Location Anchoring -- another new feature -- leverages data from Apple Maps to place AR experiences at geographic points within iPhone and iPad apps. And face tracking across both photos and videos is now supported on any device with the Apple Neural Engine and a front-facing camera.

According to Apple, the Depth API leverages the scene understanding capabilities built into the 2020 iPad Pro's lidar scanner to gather per-pixel information about an environment. When combined with 3D mesh data, the API makes virtual object occlusion more realistic by enabling instant placement of digital objects and blending them seamlessly with their physical surroundings.
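To illustrate why per-pixel depth matters for occlusion, here is a minimal conceptual sketch in Python. It is not ARKit code and does not reflect how Apple implements the feature; the function name, the list-of-lists "images," and the string pixel values are all assumptions made for illustration. The idea is that a virtual pixel is drawn only where it sits closer to the camera than the real surface measured at that pixel.

```python
from typing import List, Optional

Pixel = str  # hypothetical stand-in for a colour value


def composite_with_occlusion(
    camera: List[List[Pixel]],
    real_depth: List[List[float]],
    virtual: List[List[Optional[Pixel]]],
    virtual_depth: List[List[float]],
) -> List[List[Pixel]]:
    """Blend a virtual layer into a camera image using per-pixel depth.

    All four grids share the same dimensions. Depths are in metres from
    the camera; a virtual pixel of None means "nothing rendered here".
    """
    output = []
    for y, row in enumerate(camera):
        out_row = []
        for x, camera_pixel in enumerate(row):
            virtual_pixel = virtual[y][x]
            # Show the virtual pixel only if it is nearer than the real surface.
            if virtual_pixel is not None and virtual_depth[y][x] < real_depth[y][x]:
                out_row.append(virtual_pixel)
            else:
                out_row.append(camera_pixel)
        output.append(out_row)
    return output


# A virtual robot 2 m away: visible against a wall 3 m away,
# hidden behind a sofa 1 m away.
frame = composite_with_occlusion(
    camera=[["wall", "sofa"]],
    real_depth=[[3.0, 1.0]],
    virtual=[["robot", "robot"]],
    virtual_depth=[[2.0, 2.0]],
)
```

The finer and more accurate the real-world depth map, the cleaner the boundary between physical and digital objects, which is the benefit a dedicated depth sensor would be expected to bring.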

ARKit 4's Location Anchoring supports the placement of AR experiences throughout cities, alongside famous landmarks, and elsewhere. More concretely, it allows developers to anchor AR creations at specific latitude, longitude, and altitude coordinates so that users can move around virtual objects and see them from different perspectives.
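As a rough illustration of what anchoring to geographic coordinates involves, the Python sketch below converts an anchor's latitude, longitude, and altitude into an east/north/up offset in meters from the device's own position. This is not ARKit's API or Apple's method: the function name is hypothetical, and the math is a simple flat-Earth approximation on a spherical Earth that is reasonable only over short distances.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; a spherical-Earth assumption

GeoPoint = Tuple[float, float, float]  # (latitude_deg, longitude_deg, altitude_m)


def geo_to_local_offset(device: GeoPoint, anchor: GeoPoint) -> Tuple[float, float, float]:
    """Return the anchor's (east, north, up) offset from the device in metres.

    Uses an equirectangular approximation, so accuracy degrades as the
    two points get farther apart or approach the poles.
    """
    lat0, lon0, alt0 = device
    lat1, lon1, alt1 = anchor
    north = math.radians(lat1 - lat0) * EARTH_RADIUS_M
    # Lines of longitude converge toward the poles, hence the cosine factor.
    east = math.radians(lon1 - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    up = alt1 - alt0
    return east, north, up


# An anchor 0.001 degrees of latitude north of the device and 5 m higher.
east, north, up = geo_to_local_offset((0.0, 0.0, 10.0), (0.001, 0.0, 15.0))
```

Once an anchor has a position in a local frame like this, a renderer can treat it as any other 3D point, which is what lets a user walk around the object and view it from different sides.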

Expanded Face Tracking works on any iPhone or iPad with the A12 Bionic chip or later, including the iPhone XS, iPhone XS Max, iPhone XR, iPad Pro, and iPhone SE. Apple says that it's able to track up to three faces at once.

Beyond those improvements, ARKit 4 ships with motion capture, enabling iOS apps to understand body position and movement as a series of joints and bones and use motion and poses as inputs to AR experiences. It's now possible to simultaneously capture face and world tracking with devices' front and back cameras and to collaborate among multiple people to build an AR world map. Any object, surface, and character can display video textures. And thanks in part to machine learning-driven improvements, apps built using ARKit 4 can detect up to 100 images at a time and get an automatic estimate of the physical size of the object in an image, with better recognition in complex environments.
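To make the "joints and bones" idea concrete, the following Python sketch models a skeleton as joints linked to parents and uses the resulting pose as an input, here detecting a raised hand. It is a simplified illustration rather than ARKit's data model: the joint names, offsets, and functions are hypothetical, and joint rotations are ignored so that positions are just accumulated translations.

```python
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]
# Each joint maps to (parent joint or None for the root, offset from parent in metres).
Skeleton = Dict[str, Tuple[Optional[str], Vec3]]

# Hypothetical five-joint skeleton; each parent-child link acts as a "bone".
SKELETON: Skeleton = {
    "hips": (None, (0.0, 1.0, 0.0)),
    "spine": ("hips", (0.0, 0.4, 0.0)),
    "head": ("spine", (0.0, 0.3, 0.0)),
    "right_shoulder": ("spine", (0.2, 0.2, 0.0)),
    "right_hand": ("right_shoulder", (0.1, 0.6, 0.0)),
}


def world_position(joint: str, skeleton: Skeleton) -> Vec3:
    """Walk up the parent chain, summing offsets (rotations are ignored)."""
    parent, offset = skeleton[joint]
    if parent is None:
        return offset
    px, py, pz = world_position(parent, skeleton)
    return (px + offset[0], py + offset[1], pz + offset[2])


def is_hand_raised(skeleton: Skeleton, hand: str = "right_hand", head: str = "head") -> bool:
    """Treat the pose as an input: True when the hand is above the head."""
    return world_position(hand, skeleton)[1] > world_position(head, skeleton)[1]


raised = is_hand_raised(SKELETON)
```

An app could react to a check like this, for example by triggering an animation when the user raises a hand, which is the sense in which poses serve as inputs to an AR experience.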

The previous version of ARKit -- ARKit 3.5, which was released in March -- added a new Scene Geometry API that leverages the 2020 iPad Pro's lidar scanner to create a 3D map of a space, differentiating between floors, walls, ceilings, windows, doors, and seats. It introduced the ability to quickly measure the lengths, widths, and depths of objects from up to five meters away, enabling users to create digital facsimiles that can be used for object occlusion (i.e., making digital objects appear blended into scenes behind real objects).

ARKit 3.5 also improved motion capture and people occlusion, with better depth estimation for people and better height estimation for motion capture. On the 2020 iPad Pro, the lidar scanner enables more accurate three-axis measurements, conferring benefits to existing apps without the need for code changes.