Date: 2018-06-30

“Computer vision” is a decidedly sci-fi concept, so it’s no surprise that the idea of our smart devices becoming all-seeing tools has captured many imaginations. Yet, the true potential of the tech is held back by a widespread misunderstanding of what great applications can look like and what they can achieve.

Computer vision can offer a competitive advantage or a platform for innovation, but it isn’t a magic bullet. The essential steps to weighing up its worth for a given problem start with understanding what’s out there already: A spectrum of applications which will challenge preconceptions about what the tech is really capable of. With that insight, one can start to identify the other kinds of essential customer problems computer vision might solve. Turning the potential into reality is then a question of infrastructure and data requirements, which are vital to teaching a system and ensuring its outputs are useful.

Starting at the beginning

Where did computer vision come from? It’s the product of a proliferation of cheap, high-quality cameras, which has expanded the scope for imagery captured in public, private, and commercial domains. At the same time, advances in machine learning and deep learning technology are allowing us to transform those images into digital signals which, in turn, support a range of actions. To begin with, the technology handled visual inspection tasks such as tracking stock levels or monitoring a production line. Automating these simple and routine activities helped free people up to do jobs that needed more complex thought. But the future lies in what else the tech can do.

Here are three steps we must take to build more useful computer vision technologies.

1. Focus on detail, scale, and space and time

While these automation functions continue to be useful, new algorithms give us the chance to think of computer vision as diverse and adaptable, even more so than a human eye. Sharper vision, backed by the power of a brain to interpret what is seen and turn it into action, could enhance our ability to tackle problems that are currently too complex for us.

Suddenly, the tech’s potential is expanded massively, in three key areas: Seeing the imperceptible, seeing at scale, and seeing across space and time.

Seeing the imperceptible is about seeing and interpreting more than our brain alone will allow. Researchers at MIT have been working on detecting a person’s pulse from standard video footage of a person’s face. Computer vision measures very subtle changes in the tone and color of the skin, and the derived signal allows you to take a pulse without being near the person – let alone needing to touch them. Users can apply this signal to distort the video image in real time to literally make the pulse visible. This has huge potential for ambient sensing, monitoring, and diagnosis in health care.
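To make the idea concrete, here is a minimal sketch of how a pulse rate could be read from a color signal. It is not the MIT researchers’ method. It assumes that the average green-channel value of a face region has already been extracted for every video frame, and the function name `estimate_pulse_bpm` and its parameters are hypothetical. The sketch simply looks for the strongest rhythm within a plausible heart-rate range.

```python
import math


def estimate_pulse_bpm(green_means, fps, min_bpm=42, max_bpm=240):
    """Estimate a pulse rate from per-frame average skin color values.

    green_means: one number per frame (assumed input: mean green-channel
                 intensity over a face region, extracted elsewhere).
    fps:         frames per second of the video.
    """
    n = len(green_means)
    mean = sum(green_means) / n
    centred = [v - mean for v in green_means]  # remove the constant skin tone

    best_bpm, best_power = None, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        freq_hz = bpm / 60.0
        # Project the signal onto a cosine and a sine at this frequency.
        re = sum(v * math.cos(2 * math.pi * freq_hz * i / fps)
                 for i, v in enumerate(centred))
        im = sum(v * math.sin(2 * math.pi * freq_hz * i / fps)
                 for i, v in enumerate(centred))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic check: a faint 72 bpm flicker on a constant tone, 10 s at 30 fps.
fps = 30
signal = [100 + 0.5 * math.sin(2 * math.pi * (72 / 60) * i / fps)
          for i in range(300)]
print(estimate_pulse_bpm(signal, fps))
```

Real footage would also need face tracking, lighting compensation, and noise filtering, none of which this sketch attempts.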

Seeing at scale is the ability to monitor and process enormous volumes of visual content. Today’s social media moderation relies on people who inspect flagged content for deletion, but this approach is reactive and very limited in scale. It also puts those moderators at risk of psychological stress. In a recent project at Fjord, we looked at how we could use computer vision to help people moderate video content on social media, by redesigning moderation teams to include both people and artificial intelligence. We found that computer vision could detect and act on obvious content violations, while also enriching and pre-editing content that requires human review – increasing the amount of content reviewed and limiting the exposure to troubling content.
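One way to picture this split between people and AI is a simple triage rule. The sketch below is an illustration only, not the design used in the Fjord project. It assumes a vision model has already produced a violation score between 0 and 1 for each video; the function name `triage`, the thresholds, and the `blur_preview` flag are all hypothetical.

```python
def triage(violation_score, remove_at=0.95, review_at=0.40):
    """Route one video using a classifier's violation score in [0, 1].

    Thresholds are illustrative assumptions, not recommended values.
    """
    if violation_score >= remove_at:
        # Obvious violation: the system acts on its own.
        return {"action": "auto_remove", "blur_preview": False}
    if violation_score >= review_at:
        # Uncertain case: send to a person, pre-edited to limit exposure.
        return {"action": "human_review", "blur_preview": True}
    # Low score: no intervention.
    return {"action": "publish", "blur_preview": False}


for score in (0.99, 0.60, 0.05):
    print(score, triage(score)["action"])
```

Where the thresholds sit is a design decision: lowering `review_at` sends more content to people, while raising `remove_at` makes automatic removal more conservative.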

Seeing across space and time allows us to capture footage and observe features that would otherwise be impossible. Ecological surveys of wildlife typically cost a fortune and are time-consuming and difficult to complete. Computer vision is making it all easier, as practitioners use it to map deforestation and biomass reduction using aerial and satellite imagery. Remote camera traps are also helping count wildlife populations in very isolated locations. In industry, companies like Reconstruct Inc. monitor progress on their large building sites by combining autonomously captured footage with building information management systems. The insights can automatically generate progress plans and detect deviations or irregularities in the construction process or design.

2. Size up the data

What these examples indicate is how broad and mature the technology really is. But this may still feel like an opportunity for big tech rather than for other sectors or smaller businesses, due to the sheer amount of data needed to make it work. Data is the fuel that feeds computer vision algorithms, so of course, social media and web platform giants have a natural advantage when it comes to training their computer vision.

However, this shouldn’t stop others from getting started.

Many organizations fail to leverage the image data they already own. For instance, retailers sit on masses of CCTV footage that they usually only inspect after a security incident. If we apply computer vision to that footage, we could understand when queues start to build at checkouts, see when customers appear lost in the store, detect a missing child, or inform the redesign of store layouts.
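As a rough illustration of the checkout example, the sketch below flags the moments when a queue starts to build. It assumes a separate vision model has already counted the people waiting at a checkout in each sampled CCTV frame; the function name `queue_alerts`, the window size, and the threshold are hypothetical choices.

```python
from collections import deque


def queue_alerts(people_counts, window=5, threshold=4.0):
    """Return the indices of samples where the rolling average queue
    length over the last `window` samples reaches `threshold`."""
    recent = deque(maxlen=window)
    alerts = []
    for i, count in enumerate(people_counts):
        recent.append(count)
        if len(recent) == window and sum(recent) / window >= threshold:
            alerts.append(i)
    return alerts


# People counted at one checkout in ten consecutive samples (made-up data).
counts = [1, 2, 2, 3, 4, 5, 6, 6, 5, 2]
print(queue_alerts(counts))
```

Averaging over a window rather than reacting to a single frame keeps one miscounted frame from triggering an alert.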

If companies don’t have the benefit of existing image data, they can start to experiment with user-generated content. Google’s Quick Draw is a game-like experience that tries to guess what you’re sketching on your phone. In just a few months, the program generated 50 million doodles that Google can now harness to automatically interpret people’s hand-drawn scribbles.

Asos, the online fashion retailer, launched the “As seen on me” campaign that encourages users to upload fashion photos of themselves. While the initial motivation was customer engagement and loyalty, its potential for mass personalization through computer vision is clear.

These transitional data-generating services are an important tool that can play a key role in a product strategy that utilizes computer vision. For example, Google has now transformed the Quick Draw knowledge base into a tool that interprets and enhances your hand-drawn scribbles in real time.

Even without existing or user-generated data, off-the-shelf solutions are available that let you add computer vision to products and services.

3. Put the customer in the picture

The point about data only reinforces the need to put your customer first. At a time when users are more concerned than ever about the way in which companies use their data, we cannot implement a powerful new technology like computer vision without a very clear sense of its purpose – what is the problem it’s there to solve?

The answer is to apply service design principles which dig into human needs first and harness a great understanding of technology’s potential to help users. It starts with design research to identify and investigate pain points, patterns of behavior, and unmet needs, to provide the context for introducing new digital tools into people’s lives.

The applications of this approach have turned computer vision into a tool to help gardeners identify flowers, to let homeowners imagine the impact of daylight, and to help all of us navigate in the dark. Computer vision is a genuinely exciting technology and we’re only just beginning to discover its potential for digital products and services – our future could well be fuelled by it.

So, what’s your vision?

Connor Upton is a data design director for Fjord, design & innovation from Accenture Interactive.