A lot of things happened in 2016.
For starters, 2016 was the year when the filter bubble popped and the fake news controversy shook the media industry. Following the U.S. elections, Facebook came under fire for allegedly influencing the results by enabling the spread of fake news on its platform. A BuzzFeed report showed how fake stories, such as Pope Francis endorsing Donald Trump, received considerably more engagement than true stories from legitimate media outlets like the New York Times and the Washington Post. Mark Zuckerberg was quick to dismiss the claim, but considering that nearly half of all Americans get their news primarily from the platform, it is very reasonable to believe Facebook did play a role in the elections.
The fake news controversy led to a lot of discussion and some great ideas on how to address it. Under the spotlight, both Facebook and Google reacted by banning fake news sites from advertising with them. Facebook also went a step further by introducing new measures to limit the spread of fake news on its platform, such as the ability for users to report dubious content, which then shows a “disputed” warning label next to it.
While those are promising first steps, I am afraid they won’t be enough. I believe our current misinformation problem is only the tip of a massive iceberg — and this looming disaster starts with AI.
Enter artificial intelligence
2016 was also the year when AI became mainstream. Following a long period of disappointments, AI is making a comeback thanks to recent breakthroughs such as deep learning. Now, rather than having to code the solution to a problem, it is possible to teach the computer to solve the problem on its own. This game-changing approach is enabling incredible products that would have been thought impossible just a few years ago, such as voice-controlled assistants like Amazon Echo and self-driving cars.
While this is great, AI is also enabling some impressive but downright scary new tools for manipulating media. These tools have the power to forever change how we perceive and consume information.
For instance, a few weeks ago, Adobe announced VoCo, a Photoshop for speech. In other words, VoCo is an AI-powered tool that can replicate human voices. All you need is to feed the software a 20-minute audio recording of someone talking. The AI will analyze it and learn how that person talks. Then, just type anything, and the computer will read your words in that person’s voice. Adobe built VoCo primarily to help sound editors easily fix audio mistakes in podcasts or movies. However, as you can guess, the announcement led to major concerns about the potential implications of the technology, from reducing trust in journalism to causing major security threats.
This isn’t the end of it. What we can do with audio, we can also do with video:
Face2Face is an AI-powered tool that can do real-time video reenactment. The process is roughly the same as VoCo’s: Feed the software a video recording of someone talking, and it will learn the subtle ways that person’s face moves. Then, using face-tracking tech, you can map your face to that person’s, essentially making them do anything you want with an uncanny level of realism.
Combine VoCo and Face2Face, and you get something very powerful: the ability to manipulate a video to make someone say exactly what you want in a way that is nearly indistinguishable from reality.
It doesn’t stop here. AI is enabling many other ways to impersonate you. For instance, researchers created an AI-powered tool that can imitate any handwriting, potentially allowing someone to manipulate legal and historical documents or create false evidence to use in court. Even creepier, a startup created an AI-powered memorial chatbot: software that can learn everything about you from your chat logs, and then allow your friends to chat with your digital self after you die.
A Photoshop for everything
Remember the first time you realized that you’d been had? That you saw a picture you thought was real, only to realize it was photoshopped? Well, here we go again.
Back in the day, people used to say that the camera cannot lie. Thanks to the invention of the camera, it was possible, for the first time, to capture reality as it was. Consequently, it wasn’t long before photos became the most trusted pieces of evidence one could rely upon. Phrases like “photographic memory” are a testament to that. Granted, people have manipulated photos throughout history, but those edits were rare and required the tedious work of experts. This isn’t the case anymore.
Today’s generation knows very well that the camera does lie, all the time. With the widespread adoption of photo-editing tools such as Photoshop, manipulating and sharing photos has become one of the Internet’s favorite hobbies. By making it so easy to manipulate photos, these tools also made it much harder to differentiate fake photos from real ones. Today, when we see a picture that seems very unlikely, we naturally assume that it is photoshopped, even when it looks very real.
With AI, we are heading toward a world where this will be the case with every form of media: text, voice, video, etc. To be fair, tools like VoCo and Face2Face aren’t entirely revolutionary. Hollywood has been doing voice and face replacement for many years. However, what is new is that you no longer need professionals and powerful computers to do it. With these new tools, anyone will be able to achieve the same results using a home computer.
VoCo and Face2Face might not give the most convincing results right now, but the technology will inevitably improve and, at some point, be commercialized. This might take a year, or maybe 10 years, but it is only a matter of time before any angry teenager can get their hands on AI-powered software that can manipulate any media in ways that are indistinguishable from the original.
Given how well fake news tends to perform online, and that our trust in the media industry is at an all-time low, this is troubling. Consider, for instance, how such a widespread technology could impact:
- Justice: A lot of what constitutes a piece of evidence today might no longer hold up in court. Just like with digital photography, it is only a matter of time before a court establishes a precedent where a piece of written, audio, or video evidence isn’t admissible because there is no way to prove it wasn’t forged using AI-powered tools — even if it looks or sounds perfectly real.
- Politics: We will likely see a lot of shocking videos crafted to discredit and embarrass political opponents. Even in the cases where the video is real, such as the one where Trump brags about being able to grope women, the person in question could easily argue that the video is a fake — that the audio or the footage was edited using software like VoCo — and there would be no way for the public to know for sure.
Entering the post-truth world
In 2016, Oxford Dictionaries chose “post-truth” as the international word of the year, and for good reason. Today, it seems we are increasingly living in a kingdom of bullshit, where the White House spreads “alternative facts” and everything is a matter of opinion.
Technology isn’t making any of this easier. As it improves our lives, it is also increasingly blurring the line between truth and falsehood. Today, we live in a world of Photoshop, CGI, and AI-powered beautifying selfie apps. The Internet promised to democratize knowledge by enabling free access to information. By doing so, it also opened up a staggering floodgate of information that includes loads of rumors, misinformation, and outright lies.
Social media promised to make us more open and connected to the world. It also made us more entrenched in digital echo chambers, where shocking, offensive, and humiliating lies are systematically reinforced, generating a ton of money for their makers in the process. Now AI is promising, among other things, to revolutionize how we create and edit media. By doing so, it will also make distortion and forgery much easier.
This doesn’t mean any of these technologies are bad. Technology, by definition, is a means to solve a problem — and solving problems is always a good thing. As with everything that improves the world, technological innovation often comes with undesired side effects that tend to grab the headlines. However, in the long run, technology’s benefit to society far outweighs its downsides. Worldwide quality of life has been getting better by almost any metric: Education, life expectancy, income, and peace are better than they have ever been in history. Technology, despite its faults, is playing a huge role in all of these improvements.
This is why I believe we should push for the commercialization of tools like VoCo or Face2Face. The technology works. We can’t prevent those who want to use it for evil from getting their hands on it. If anything, making these tools available to everyone will make the public aware of their existence — and by extension, aware of the easily corruptible nature of our media. Just like with Photoshop and digital photography, we will collectively adapt to a world where written, audio, and video content can be easily manipulated by anyone. In the end, we might even end up having some fun with it.