Have you tested the recent Alexa web prototype? It lets you try Amazon Echo’s virtual assistant right from your browser, without having to buy an Amazon Echo. It works well thanks to a relatively new technology called WebRTC.
WebRTC is part of HTML5. It enables real-time voice and video calling inside a web browser. The best part? It is already supported by Firefox, Chrome and now also Edge. Apple’s Safari still doesn’t support WebRTC, but that will change closer to the end of this year.
WebRTC got its start around 2011 as an easy way to code video-calling capabilities into a browser. It attracted a lot of developers who thought it could help them build an alternative to Skype. Since then, on almost a monthly basis, we’ve seen one startup or another announce its own WebRTC-based, Skype-killing video chat service. Unfortunately, most of them don’t look any better than a simple Hello-World implementation of WebRTC.
Based on these efforts, some have judged WebRTC a dud. But it isn’t; it is transformative in ways we can’t yet fathom.
The takeaway here is that WebRTC is only as sophisticated as you make it. You can use it to build “fast food” applications and services — in other words, applications that are quick and easy to build but that aren’t very satisfying — but you can also use it to create an application that ranks as a gourmet meal.
Here are a few services that recently made headlines and that make great gourmet meals out of WebRTC:
RingCentral is a cloud-based phone system for businesses. It has been around for over a decade, solving the communication needs of enterprises. Earlier this year, it announced its WebRTC API program for developers.
Beam, a rising startup that won TechCrunch Disrupt NY 2016, is all about live-stream gaming, where gamers stream their screens to viewers. The Beam team wasn’t happy with how slowly video streams get served by Flash or HLS, the current streaming technologies out there. Streams delivered over Flash or HLS tend to lag by upwards of 20 seconds, even after tuning. So Beam used WebRTC’s data channel as its transport for video streams, calling this new technique FTL (yes, that stands for Faster Than Light). This enables Beam to get to sub-second latency in streams, with the ability to throw away buffering altogether.
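To make the data channel idea more concrete, here is a minimal TypeScript sketch of one way encoded video frames could be split into small messages and reassembled on the viewer’s side. It is not Beam’s FTL implementation, whose details aren’t public in this article. The names ChunkSink, sendFrame and FrameAssembler, the 12-byte header layout and the 16 KB message size are all assumptions made for illustration.

```typescript
// Hypothetical minimal interface: anything with send(ArrayBuffer).
// A browser RTCDataChannel should fit this shape.
interface ChunkSink {
  send(data: ArrayBuffer): void;
}

// Assumed header layout: frameId, chunkIndex, chunkCount (each a uint32).
const HEADER_BYTES = 12;
// Assumed conservative message size; not a value taken from the article.
const MAX_PAYLOAD_BYTES = 16 * 1024 - HEADER_BYTES;

// Splits one encoded frame into header-prefixed chunks and sends each one.
// Returns the number of chunks sent.
function sendFrame(
  sink: ChunkSink,
  frameId: number,
  frame: Uint8Array,
  maxPayload: number = MAX_PAYLOAD_BYTES
): number {
  const chunkCount = Math.max(1, Math.ceil(frame.byteLength / maxPayload));
  for (let i = 0; i < chunkCount; i++) {
    const payload = frame.subarray(i * maxPayload, (i + 1) * maxPayload);
    const message = new ArrayBuffer(HEADER_BYTES + payload.byteLength);
    const view = new DataView(message);
    view.setUint32(0, frameId);
    view.setUint32(4, i);
    view.setUint32(8, chunkCount);
    new Uint8Array(message, HEADER_BYTES).set(payload);
    sink.send(message);
  }
  return chunkCount;
}

type PendingFrame = { chunks: Array<Uint8Array | undefined>; received: number };

// Collects chunks (in any order) and yields a frame once it is complete.
class FrameAssembler {
  private pending = new Map<number, PendingFrame>();

  // Returns the full frame when its last missing chunk arrives, else null.
  accept(message: ArrayBuffer): Uint8Array | null {
    const view = new DataView(message);
    const frameId = view.getUint32(0);
    const index = view.getUint32(4);
    const count = view.getUint32(8);
    const payload = new Uint8Array(message.slice(HEADER_BYTES));

    let entry = this.pending.get(frameId);
    if (!entry) {
      entry = { chunks: new Array<Uint8Array | undefined>(count), received: 0 };
      this.pending.set(frameId, entry);
    }
    if (entry.chunks[index] === undefined) {
      entry.chunks[index] = payload; // ignore duplicate chunks
      entry.received++;
    }
    if (entry.received < count) return null;

    this.pending.delete(frameId);
    const total = entry.chunks.reduce((n, c) => n + (c ? c.byteLength : 0), 0);
    const frame = new Uint8Array(total);
    let offset = 0;
    for (const c of entry.chunks) {
      if (c) {
        frame.set(c, offset);
        offset += c.byteLength;
      }
    }
    return frame;
  }
}
```

In a browser, the sink would typically be a channel created with RTCPeerConnection.createDataChannel, possibly with ordered set to false and maxRetransmits set to 0 so that late data is dropped instead of buffered. In that mode a real receiver would also need to discard frames that never complete, which this sketch leaves out.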
Cast is a full-service podcasting studio that requires no download, which is again achieved by using WebRTC. While Cast and Beam seem rather similar (they both end up streaming media), they make different uses of WebRTC in their architectures. Cast uses WebRTC to acquire the media streams for editing purposes, while Beam uses WebRTC for streaming media to the viewers.
Airtime is Sean Parker’s startup. It just relaunched as a kind of group video chat application using WebRTC. The company actually acquired a company called vLine, an early adopter of WebRTC technology, to make this possible. For Airtime, WebRTC is just a means to an end. The social group video chat service needed a way to process media in real time inside a mobile device, and WebRTC enabled that at a price point unavailable anywhere else: free.
Twilio, which just filed for an IPO and is the poster child of cloud communication APIs, has been offering WebRTC capabilities to its user base for a few years now. It has done so through its telephony gateway, which connects regular phone calls to the browser, and through its WebRTC SDK, which offers mobile and desktop support for WebRTC calls.
Google Duo was just announced. It is one of several new communication apps by Google. While Duo may seem like just another video chat app, it isn’t (you can see Justin Uberti, the principal WebRTC engineer who worked on Duo, tweet about it here). To make the experience different from the rest of the pack, Google invested heavily in two integration points with WebRTC:
- It added a “Knock Knock” feature, which means a video feed is sent to the receiving end of Duo as part of the ringing process. The user picking up the phone sees live video of the caller before answering. This makes answering the call a seamless process, with no wait time before you can start speaking.
- It decided to use QUIC, a new Google protocol, to speed up the signaling around the calling process (dialing, answering, etc.). This again enables a smoother experience. How smooth? We will only know once Duo is officially released to the public.
Infrastructure is the differentiator in WebRTC
This all leads us to the fact that WebRTC has changed how we need to think about real-time communications.
The thinking around such technologies was different a few years ago. You had to focus on your compression technology to get better video quality than other vendors, and you had to increase resolution and frame rate in every new product. WebRTC changes that by making the browser itself the client, so its performance is whatever browser vendors are able to achieve. Call it a level playing field.
The result? The backend infrastructure is becoming a lot more important. And what really matters now is how different vendors build it, and — as Google’s Justin Uberti says — how they optimize it and the whole system around it.
If you are thinking of using WebRTC, or doing anything in the real-time communication space, then you should invest time in researching WebRTC and deciding what architecture and solution will work best for your users. Make sure not to stop at the boring “talking heads” use case, though, as that won’t attract the crowds anymore.
Tsahi Levent-Levi is an independent analyst and consultant for WebRTC. He sometimes writes on behalf of Twilio. He is the author and editor of bloggeek.me, which focuses on the ecosystem and business opportunities around WebRTC.