4 technologies that will unlock AR's full potential

Apple’s ARKit is capturing the attention and imagination of developers and media alike, drawn in by an estimated 700 million iPhone owners around the world who now have AR devices. ARKit is setting the stage for the next era of AR development, and now Google has entered the fray with ARCore to cater to Android developers. Yet AR technology is still in its infancy. For real industry maturity to occur, a unified effort must be made to develop the core technologies that allow for truly immersive experiences.

For all the discussion of what the future AR/VR user experience will look like and how to get there, four categories stand out as the points of focus that will push the entire industry forward over the next one to two years: displays, network bandwidth, deep learning, and interaction.

These four technological categories work together to create a more believable, immersive user experience, which in turn attracts more developers to build AR content. This could be the closest we ever come to solving AR’s chicken-and-egg problem: Which comes first, the technology or the developers?

Display

Smartphones already provide consumers with the screen precision and computing power required to enjoy AR experiences. Google, for example, is investing in phones with OLED screens (like the Pixel) that are compatible with Daydream headsets, rather than in the Daydream headsets themselves.

Display tech still trails when it comes to wearable devices such as glasses, however. Field of view, design, bulkiness, and overall style must all be accounted for in the upgrade process. While daunting, precedent serves as promise. We are in the midst of the first generation of AR glasses and, like all tech, future device versions will be more power efficient and stylish. With that in mind, keep an eye out for an increase in integration between vendors and services in the next year, creating a more comprehensive headwear offering.

Bandwidth

Expanding network bandwidth is essential to advancing the "Internet of Things" industry, and it is the key to connecting a device with its environment. Fortunately, network carriers are already working to boost their network capabilities, with 5G networks coming soon that will not only provide the infrastructure for high-performing AR apps but also support a truly mobile future. This is why AR will ultimately succeed: its success depends on advancements in the same technology that other sectors are working to improve. Autonomous vehicles, first-responder connectivity, and real-time gaming each rely on real-world data, and their breakthroughs will become AR breakthroughs. For the AR industry, this is a tremendous position to be in.

Deep Learning

Deep learning, a sub-category of machine learning in which software uses pattern recognition to mimic the part of the brain where thinking occurs, is already having a huge impact on the capabilities of the entire technology industry, and it is an essential element in pushing the AR industry forward. Computers carry out tasks at a speed that humans cannot match, but they have never been able to match how humans process and sort information.

In AR, deep learning is applied to solve the detection problem in camera-based tracking. This is important because in the future, consumers will have tracking cameras in devices beyond just the smartphone. Because augmented objects are rendered under various viewing conditions, including varying orientation, scale, and light conditions, there is a need for a deep learning kit that can integrate seamlessly across sensors from multiple manufacturers.

Deep learning is essential in fostering real-time image recognition and tracking augmented objects, giving them real positional data and features. By contrast, 3D modeling overlaid atop a smartphone screen is what we see in Pokémon Go. Deep learning’s potential use cases far outnumber 3D modeling’s.

Deep learning’s avenue to mainstream discussion is SLAM (Simultaneous Localization and Mapping), which, at a high level, is considered the principal technology powering Apple’s ARKit. More precisely, it is VIO (Visual Inertial Odometry), a simple SLAM system, that is responsible for ARKit’s functionality. SLAM uses computer vision to create a digital outline of a space and to track a phone’s location in relation to objects. SLAM’s capabilities will improve as processing technology becomes more affordable and Moore’s Law takes effect, but the secret sauce lies in software development, where leading companies are focused on accelerating performance.
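To make the idea behind visual inertial odometry concrete, here is a toy sketch in Python. It is not ARKit's algorithm, which is not public in this form; it only illustrates the general principle that a fast but drifting inertial estimate is repeatedly pulled back toward a slower camera-based position fix. The one-dimensional state, the function names, and the fixed blending gain are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class State:
    position: float  # metres along a single axis (toy 1D world)
    velocity: float  # metres per second


def predict(state: State, accel: float, dt: float) -> State:
    """Inertial step: integrate one accelerometer reading (dead reckoning)."""
    position = state.position + state.velocity * dt + 0.5 * accel * dt * dt
    velocity = state.velocity + accel * dt
    return State(position, velocity)


def correct(state: State, visual_position: float, gain: float = 0.2) -> State:
    """Visual step: blend the inertial estimate toward a camera-based fix.

    `gain` is a hypothetical tuning constant between 0 (ignore the camera)
    and 1 (trust the camera completely).
    """
    position = state.position + gain * (visual_position - state.position)
    return State(position, state.velocity)


def track(accels: Iterable[float],
          visual_fixes: Iterable[Optional[float]],
          dt: float = 0.01,
          gain: float = 0.2) -> State:
    """Run predict on every sample and correct whenever a visual fix exists."""
    state = State(0.0, 0.0)
    for accel, fix in zip(accels, visual_fixes):
        state = predict(state, accel, dt)
        if fix is not None:
            state = correct(state, fix, gain)
    return state


if __name__ == "__main__":
    # A stationary phone whose accelerometer has a small constant bias.
    biased = [0.1] * 100
    drifting = track(biased, [None] * 100)   # inertial only
    corrected = track(biased, [0.0] * 100)   # camera says "still at 0"
    print("inertial only:", drifting.position)
    print("with visual fixes:", corrected.position)
```

In this toy the visual fixes correct only the position, so the velocity estimate still drifts. Real systems estimate sensor bias and full 3D pose jointly, which is part of why the software is the hard part.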

Interaction

Lastly, enhancements in interaction are critical to creating truly immersive experiences, where our reality is blended perfectly with the virtual world. By being able to digitally engage with objects in a way that is natural to us, we immerse ourselves in the mixed reality.

Similar to how touchscreens enabled even the most technology-illiterate to engage with smartphones, introducing an easy and natural way for users to engage with AR/VR objects and environments will play a major role in boosting consumer adoption. AR apps currently use smartphone touchscreens for interaction, but that is just step one on the ladder to immersion.

Visually stimulating interaction, the kind that will create the next technological revolution, falls into three buckets:

Gestural communication: Waving, winking, etc.
Indirect manipulation: Using a tool, such as a hammer or paintbrush, to change an environment.
Direct manipulation: Using your hands or feet to “touch” and manipulate a virtual item.

A gamepad cannot enable any of these forms of interaction, but 3D hand-tracking enables all three. When partnered with an AR/VR device, 3D hand-tracking with 26 degrees of freedom (as we're developing at uSens) allows for precise, controller-less interaction with a virtual world. In mobile AR/VR situations, less hardware means lower cost, which usually leads to higher adoption. Tapping on a phone’s screen to interact was an early method of “controller-less” interaction, but being able to interact with virtual objects in a way that feels natural and real will bring further consumer interest.
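The article does not say how the 26 degrees of freedom are divided up. One convention often used for hand models, assumed here purely for illustration and not attributed to uSens, is 6 degrees for the position and orientation of the whole hand plus 4 joint angles for each of the five fingers. A minimal Python sketch of such a pose record, with hypothetical names throughout:

```python
from dataclasses import dataclass, field
from typing import Dict, List

FINGERS = ("thumb", "index", "middle", "ring", "pinky")
ANGLES_PER_FINGER = 4  # assumption: 2 at the base knuckle, 1 at each other joint


def _zero_fingers() -> Dict[str, List[float]]:
    return {name: [0.0] * ANGLES_PER_FINGER for name in FINGERS}


@dataclass
class HandPose:
    """Hypothetical 26-DoF hand pose: 6 global values + 20 finger joint angles."""
    position: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])     # x, y, z
    orientation: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0])  # roll, pitch, yaw
    finger_angles: Dict[str, List[float]] = field(default_factory=_zero_fingers)

    def degrees_of_freedom(self) -> int:
        """Count every independent number that describes the pose."""
        finger_dof = sum(len(angles) for angles in self.finger_angles.values())
        return len(self.position) + len(self.orientation) + finger_dof


if __name__ == "__main__":
    pose = HandPose()
    print(pose.degrees_of_freedom())
```

A gamepad reports a handful of button and stick values, whereas a record like this describes the whole hand, which is what gestures, tool use, and direct touch all need.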

Consumers will be the end beneficiaries of a concerted effort to improve the four keys to realizing the potential of AR, with products improving in terms of usability, wearability, mobility, and affordability. Collectively, this all adds up to truly immersive virtual experiences which everyone can enjoy from a range of devices. While advancements still need to be made across the industry for a truly augmented reality future, great companies and great minds are working together to make our dreams a reality.

Dr. Yue Fei is the chief technology officer and co-founder of uSens, Inc., which has its headquarters in San Jose, California.
