When I was running 3Dlabs in the late ’90s we had a dream. As one of the fathers of the 3D graphics chip, we were constantly on the lookout for new opportunities to leverage the ever-expanding power of the graphics chip and somehow use it to play PC games or run engineering programs remotely on emerging mobile devices.
The Internet was nascent and slow, and you only heard the term cloud in weather reports. Smartphones did not really exist except in Japan – but hey, dreamers dream and we were not going to be held back by the mundane details of reality. Cognitive dissonance is what keeps innovators moving forward.
Although the concept of “graphics remoting” was around at the time, it totally ignored the graphics processing unit (GPU) and pushed pixels to the user from a virtual dumb frame buffer. In fact, this approach still dominates remote desktop protocol (RDP) and virtual desktop infrastructure products in the enterprise today. Only in recent years has post-GPU pixel streaming started appearing on the agenda. I get a wry smile on my face when I see some enthusiastic salesperson at a trade show trying to demo Google Earth running at 5 hertz (very slowly) on a remote desktop. Is this really progress?
So we started a secret project at 3Dlabs in which we would run the application on a server on the Internet, capture the pixels post-GPU, compress them using different schemes depending on the application (to avoid motion artifacts and pixel compliance issues) and pump them to the client device. On the client device, we would capture the input from the mouse etc. and send it back to the server through a low-latency channel. In other words, we plotted a way to run remote applications without running into delays on the interaction.
We knew these visual servers would need to be positioned as ‘Edge Servers’ near the user to reduce input latency and this became the code name for the project and the patent we applied for in early 2001 (which was eventually granted in late 2009).
Of course it would not have made sense to allocate a full GPU to a remote device with a tiny screen, and we knew we could share one GPU on the server side with many client devices (an aspect which OnLive and similar companies sadly missed). This ‘GPU Virtualization’ is easier said than done. GPUs contain tens of thousands of internal registers, have very deep pipelines and have a lot of ‘state’. Unless a GPU is designed to switch between different applications within a few milliseconds, it would be almost impossible to share it between different remote users without incurring annoying delays. Thankfully, our GPUs were designed to save and restore context very rapidly and this enabled us to virtualize the device and share it between multiple clients with ease.
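To see why context-switch speed matters so much, here is a back-of-the-envelope sketch. It assumes a simple round-robin scheme in which each user gets one time slice per frame; the user count, frame rate and switch times are hypothetical figures chosen for illustration, not measurements from 3Dlabs hardware.

```python
def switch_overhead(users: int, fps: float, switch_ms: float) -> float:
    """Fraction of each user's time slice spent saving/restoring GPU context.

    Assumes round-robin sharing: every user gets one slice per frame,
    so a slice lasts 1000 / (users * fps) milliseconds.
    """
    slice_ms = 1000.0 / (users * fps)
    return switch_ms / slice_ms


# Hypothetical scenario: 10 users sharing one GPU, each expecting 30 frames/s.
fast = switch_overhead(users=10, fps=30, switch_ms=0.2)  # rapid save/restore
slow = switch_overhead(users=10, fps=30, switch_ms=5.0)  # sluggish save/restore

print(f"fast switch: {fast:.0%} of each slice lost")  # fast switch: 6% of each slice lost
print(f"slow switch: {slow:.0%} of each slice lost")  # slow switch: 150% of each slice lost
```

An overhead above 100% means the GPU spends longer switching than the slice allows, so under these assumptions the users could not all be served at the target frame rate.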
Although we evangelized this to many Japanese telecommunications firms at the time, it fell on deaf ears and as a result we never commercialized it as a product. With hindsight, this was a silver lining because we did not appreciate the inherent scalability problems associated with pixel streaming and that it could never become web-scale. I sold 3Dlabs in 2002 and had fun running a record label for a while until I came across Numecent in 2009.
Fast forward 13 years to 2013 – we now live in a world where there are amazingly capable GPUs inside every client device, Cloud is the buzzword of the decade and I am now the cofounder and CEO of Numecent – the inventor of cloudpaging, or the ability to parse a program into its parts so that it can be downloaded and run, even if only a small percentage of the actual bits have been downloaded to the user’s machine.
Yet there are companies out there still chasing 3Dlabs’ dream from the last century, and — just like we did all those years ago — they are missing the fundamental scalability issues. Perhaps history repeats itself because we forget.
Consider this scenario – one of the customers of Approxy (Numecent’s cloud-gaming spin-out) is a Chinese massively multiplayer online games publisher with 1.5 million concurrent users, that is, players who play simultaneously. They expect to have 10 million concurrent users by 2015. To service this opportunity while chasing the ‘pixel streaming’ dream, they would require 1 million servers, each with a GPU (shared across 10 users, say – I am being generous) and located as Edge Servers inside China, just to play this game 24/7. To put it into context, I just read that Facebook only has around 60,000 servers (without any GPUs).
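The arithmetic behind that server count can be sketched in a few lines. The user numbers and the 10-users-per-GPU ratio come from the scenario above; the function name and the one-GPU-per-server default are illustrative assumptions.

```python
import math


def servers_needed(concurrent_users: int, users_per_gpu: int,
                   gpus_per_server: int = 1) -> int:
    """Servers required if every concurrent user needs a share of a server-side GPU."""
    gpus = math.ceil(concurrent_users / users_per_gpu)
    return math.ceil(gpus / gpus_per_server)


# Figures from the scenario: 10 users per GPU, one GPU per server (assumed).
print(servers_needed(1_500_000, users_per_gpu=10))   # 150000
print(servers_needed(10_000_000, users_per_gpu=10))  # 1000000
```

Even at today's 1.5 million concurrent players, the same ratio already implies 150,000 GPU servers, more than double the Facebook figure quoted above.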
Does it really make business or ecological sense to cram 1 million GPUs into servers when there are billions of great GPUs out there in client devices? And when, with Moore’s law on our side, these are getting better every year?
This is what makes the business model for ‘pixel streaming’ dysfunctional. I quipped last year at a keynote that “Pixel streaming is great so long as it does not become popular”. This year I say it is the WiMAX of the industry, a reference to the post-Wi-Fi wireless internet technology that was overtaken by LTE because WiMAX couldn’t scale to support massive numbers of users.
And did you know that even at 720p, most such solutions consume 1.5 gigabytes of bandwidth per hour per client device, even when you are just staring at a wiggling game screen? In a world where data caps are now a reality, you could be paying overage in just a few days (especially if you have three gaming sons like I do) or have your bandwidth quietly throttled down.
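To put the 1.5 gigabytes per hour in perspective, the sketch below converts it into a sustained bit rate and estimates how long a household can stream before hitting a data cap. The 250 GB cap is a hypothetical figure picked for illustration; real caps vary widely by provider and plan, and a tighter cap shortens the time proportionally.

```python
def sustained_mbps(gb_per_hour: float) -> float:
    """Convert gigabytes per hour into megabits per second (decimal units)."""
    return gb_per_hour * 1e9 * 8 / 3600 / 1e6


def hours_until_cap(cap_gb: float, gb_per_hour: float, players: int = 1) -> float:
    """Hours each player can stream before the household reaches its data cap."""
    return cap_gb / (gb_per_hour * players)


STREAM_GB_PER_HOUR = 1.5  # figure quoted in the text for 720p
ASSUMED_CAP_GB = 250.0    # hypothetical monthly cap, not from the text

print(f"{sustained_mbps(STREAM_GB_PER_HOUR):.2f} Mbit/s")  # 3.33 Mbit/s
print(f"{hours_until_cap(ASSUMED_CAP_GB, STREAM_GB_PER_HOUR, players=3):.1f} hours each")  # 55.6 hours each
```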
With cloudpaging, we take a different approach at Numecent – instead of sending pixels to the client device, we actually transmit pre-virtualized instructions on-demand which then execute on the client device in a transient manner and without installation. Once enough of the application is cloudpaged, you can even run off-line. For example, if you are currently using the ‘blur’ function in Photoshop, we just send you the bits for that and do not clog the network or your machine with the ‘smooth’ function until we detect that you need it. Once these are brought to you, they are instantly available the next time you use them. Pre-virtualization enables us to run these bits of code on your machine without having to install them (installation of software has a bad habit of degrading its performance).
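The on-demand idea can be illustrated with a toy model. This is only a sketch of fetch-on-first-use with a local cache, not Numecent's actual implementation: the `PagedApp` class, the feature names used as page keys and the dictionary standing in for the server are all hypothetical.

```python
class PagedApp:
    """Toy client that pulls pieces of an application only when first used."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page  # callable: feature name -> bytes (stand-in transport)
        self._cache = {}               # pieces already delivered to this machine
        self.fetches = 0               # number of network round trips so far

    def use(self, feature: str, online: bool = True) -> bytes:
        if feature not in self._cache:
            if not online:
                raise RuntimeError(f"{feature!r} is not cached and there is no connection")
            self._cache[feature] = self._fetch_page(feature)
            self.fetches += 1
        return self._cache[feature]


# A dictionary stands in for the server-side store of application pieces.
server_pages = {"blur": b"<blur code>", "smooth": b"<smooth code>"}
app = PagedApp(server_pages.__getitem__)

app.use("blur")                # first use: fetched from the "server"
app.use("blur")                # second use: served from the local cache
app.use("blur", online=False)  # still works with no connection
print(app.fetches)             # 1
```

In this model the ‘smooth’ piece never crosses the network until something actually calls `app.use("smooth")`, which is the behaviour described above.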
As we distribute the load between the cloud and the client device, cloudpaging is inherently web-scale and 1,000 times more server- and network-efficient for GPU-centric applications. It is like sending the sheet music to the client device instead of a WAV file that consumes a large amount of memory and bandwidth.
And cloudpaging is not a dream – one of our customers, Parsons, has so far delivered 4 million-plus multi-vendor computer-aided design (CAD) sessions from the cloud without a hitch and with a tiny server footprint. Come and hear our joint fireside chat at CloudBeat 2013 where we will discuss how it is done.