Jim Poole sees a lot of games coming through his network, only they’re not so readily identifiable as games. Poole is an executive at Equinix, one of the companies that supports the backbone of the internet.
Equinix has data centers full of rows of server racks, each of them with processors and graphics processing units (GPUs) that can process games for cloud gaming companies such as Blade, which started offering a paid service so gamers can play high-end games on any device. It’s the same sort of service that Google, using its own network, will deliver with its Stadia service, introduced at the Game Developers Conference. At the GDC, GamesBeat also hosted a breakfast on multiplayer gaming.
I talked with Poole about the state of the internet and how companies like his host the games that we play and have come to take for granted as always being connected.
Here’s an edited transcript of our interview.
GamesBeat: It’s going to be quite a busy show this year.
Jim Poole: Lots of stuff going on. My first job here at Equinix, I actually covered gaming. It’s probably one of the reasons I’m talking to you. [laughs] What can we let you know?
GamesBeat: One example that’s interesting to me is the contrast in what you see in the market. Electronic Arts had two launches in the past five or six weeks. Apex Legends went really well, from zero to 50 million players on an unadvertised game within a month. Some things about the design of the game made it happen that way.
They also had a heavily advertised game two weeks later, Anthem, and that had a pretty choppy launch. The impression from players was that the game wasn’t ready. It was an interesting contrast as far as what’s possible with big online games. Did you have any impression of why things like this happen?
Poole: Not specific to EA, but I will say that more generally, what I’ve noticed is that you have companies out there that appreciate the underlying network dynamic, what’s happening in the network that allows a game to scale, at least architecturally, in terms of the ability to deliver the right throughput and latency. That’s obviously a big deal.
For a lot of multiplayer games, especially in the beginning, there was no centralized coordination. If you wanted to put up a Doom server back in the day, you did it yourself. The farther away you got from the origin server, the worse the experience. Nowadays more people are aware. The degree to which they address the issue is different. It’s similar to cloud computing, which is something we deal with a lot. Take an Amazon or a Microsoft. When they started out they had two or three availability zones in any given part of the country or the world. Now they’re up in the teens and 20s as far as planning for distribution. Gaming is no different. I always like to joke that you can’t beat physics. You can’t solve the speed of light.
GamesBeat: Is that the equivalent of 20 data centers up and running around the country, or something else?
Poole: The biggest dependency for all of these games is access to peering infrastructure. The vast majority of these games are played over the internet. The easiest way to get localized access to large numbers of eyeballs is to participate in lots of peering exchanges. From our perspective, that’s good, because we’re the biggest peering provider in the world. We operate more exchanges than anyone.
In fact, here’s one of the things you can do. I don’t know if you’ve ever played with PeeringDB at all? It’s a website, a public listing of all the peering fabrics, all the peering exchanges, and then all the participants. You can go in by market or by company, type their name in, and see where they are on a global basis. That’ll show you the difference between one company and another. You can type in one game company and it shows up in four places, type in another one and it shows up in 20. Generally speaking, you’ll notice that the guy with 20 has better performance than the guy with four.
GamesBeat: I’ve been to the big building in San Jose you guys have. Is that what you mean by a peering exchange?
Poole: A peering exchange is literally a big set of routers sitting in a data center, in which there are a bunch of multi-party participants. Peering in general, the reason it came to be, was because of the realization that no one network goes everywhere. The way for internet traffic to go from, say, a Verizon user to an AT&T user is that Verizon and AT&T need to peer with one another somewhere. That somewhere will tend to be an Equinix building. Inside that building, we’ll operate what we call a peering exchange, which is the hardware that everybody connects to that allows for the exchange of traffic between one network and another.
If you broaden that, it’s not just networks talking to networks. It’s content talking to networks, or computers talking to networks. The gaming companies, the cloud companies, the networking companies all participate in these peering exchanges. They’re geographically distributed around the world. If we just use North America as the example, we operate peering exchanges in Toronto, New York, Washington, Atlanta, Miami, Dallas, Chicago, the bay area, Seattle, Los Angeles. You can quickly see that if you were deployed in all those different peering exchanges, and you drew 10-millisecond circles around each of those facilities, you’d find that you hit the vast majority of the population of the United States.
That’s generally the best you can do. If you put together a comparison as far as what would be the ideal situation for any kind of online something or other, whether it’s a game or something else, to have the best performance you’d be attached to as many peering exchanges as you could. Obviously, you have to take the capex and opex versus the advantage in mind. Fortunately, a lot of games are pretty tolerant. It’s only 60 milliseconds from one side of the country to the other. Most games tolerate 60 milliseconds just fine.
What we usually see—I call them the four corners of the country, from a peering perspective. It’s the bay area, Washington, Chicago, and Dallas. You’ll see a lot of online games and/or cloud services always being in those four markets because once you draw those circles, it gives you consistent 20 milliseconds or less to hit any place in the country. That’s more than enough for most big games.
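As a rough illustration of the latency circles Poole describes, the sketch below estimates a best-case round-trip time from a player to the nearest of the four markets he names. It is not Equinix's method. It assumes light in fiber covers about 200 km per millisecond (roughly two-thirds of its speed in a vacuum), that traffic follows a straight great-circle path, and that the city coordinates are approximate. Real routes are longer and add router, ISP, and home-network delay, so the results are lower bounds rather than measurements. The names `HUBS`, `min_rtt_ms`, and `nearest_hub` are made up for this example.

```python
import math

# Approximate (latitude, longitude) in degrees for the four markets named above.
HUBS = {
    "Bay Area (San Jose)": (37.34, -121.89),
    "Washington, D.C.": (38.91, -77.04),
    "Chicago": (41.88, -87.63),
    "Dallas": (32.78, -96.80),
}

FIBER_KM_PER_MS = 200.0   # assumption: about 2/3 of the vacuum speed of light
EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_rtt_ms(a, b):
    """Best-case round trip in ms: out and back along the great-circle path."""
    return 2 * great_circle_km(a, b) / FIBER_KM_PER_MS


def nearest_hub(player, hubs=HUBS):
    """Return (hub name, best-case RTT in ms) for the closest hub to the player."""
    name = min(hubs, key=lambda n: great_circle_km(player, hubs[n]))
    return name, min_rtt_ms(player, hubs[name])


if __name__ == "__main__":
    denver = (39.74, -104.99)  # approximate coordinates, for illustration only
    hub, rtt = nearest_hub(denver)
    print(f"Nearest hub: {hub}, best-case RTT about {rtt:.1f} ms")
```

Passing a one-entry dictionary as `hubs` models a single-site deployment, which makes it easy to compare how the best-case figure changes as more markets are added.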
GamesBeat: What if we then get into the usual problems of consumer-specific things? Like you have five people in a house all watching Netflix or something like that, or trying to get into a multiplayer game on wi-fi, or too many routers in the house trying to get on the internet.
Poole: Yeah, too many hops. Things like that can exacerbate the problem. Occasionally what I’ve seen is, somebody will deploy something realizing that if it’s under 100 milliseconds it should be fine. They’ll put something on the east coast and something on the west coast. Once you add in all the weirdness that can happen inside somebody’s house or the fact that not all ISPs that people are connected to are 100 percent fiber-based — I live in Washington and I have Verizon FIOS. I have fiber right to my house, which is great. When I lived in the bay area I had Webpass, which gave me fiber connectivity right down to that big building in San Jose, and in that building are a bunch of game companies. When I’d play a game I could run a throughput test and get up to 900 megs to that building. My games would fly.
But generally, one of the ways to compensate for the weirdness in the home network is to then potentially deploy in more places than are strictly necessary. I’ve seen games work just fine with one location in the entire country. They’ll work for anyone who has a high-speed connection. But for people who don’t, it would probably be better if the resource they’re trying to attach to sat closer. What you’re seeing over time is that realization, and so more and more, when you look at some of the bigger players, you’ll see them deploy—like I say, if you go into the peering website you’ll see that they’re in lots of places. That’s their way of saying, “Here’s how I ensure a good experience for my end user.”
GamesBeat: With cloud gaming, I’ve seen some interesting strategies on the customer level as well, like Shadow. They were saying that they’re going to guarantee high graphics quality to their customers by designating one GPU in the data center per customer, whereas some of the others are piling on lots of players per GPU. But what that winds up costing — they charge something like $35 a month, almost like renting a PC.
Poole: When people talk about latency, what they tend to do is talk about — they’re thinking about the transmission latency, the speed of light to my rendering device, my laptop or iPad or whatever. What they don’t necessarily also think about is the compute latency, which is why the GPU market has improved things so tremendously. The GPUs do the rendering far faster. You can take different approaches to that, as you say. You can do a dedicated GPU, which gives you the best performance, or you can run a bunch of VMs on top of GPUs and get a different level of latency through the machine. Not just the transmission latency, but the processing latency involved, the actual compute cycle.
Again, that’s one of those things where you can strike a balance. If you dedicate GPUs you can be in slightly fewer places, because you’re not incurring as much latency on the compute cycle. Or you can be in more places and be more multiplexed, more virtualized. At the end of the day, you’re trying to get the end-to-end latency, inclusive of wi-fi, the house router, the transmission back to the data center, and the compute cycle.
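The balance Poole describes can be written down as a simple budget: add the wi-fi, home router, transmission, and compute components and compare the total with a target. The sketch below does only that. Every number in it is a hypothetical placeholder, not a measurement from Blade, Equinix, or any cloud provider, and the `LatencyBudget` class and `fits` helper are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyBudget:
    """One player's end-to-end latency, split into the parts named above (ms)."""
    wifi_ms: float
    home_router_ms: float
    transmission_rtt_ms: float  # round trip to the data center and back
    compute_ms: float           # server-side processing and rendering

    def total_ms(self) -> float:
        return (self.wifi_ms + self.home_router_ms
                + self.transmission_rtt_ms + self.compute_ms)


def fits(budget: LatencyBudget, target_ms: float) -> bool:
    """True if the whole chain stays within the target."""
    return budget.total_ms() <= target_ms


# Hypothetical numbers only: a dedicated GPU in fewer, more distant sites
# versus shared, virtualized GPUs in more, closer sites.
dedicated_far = LatencyBudget(wifi_ms=5, home_router_ms=2,
                              transmission_rtt_ms=30, compute_ms=8)
shared_near = LatencyBudget(wifi_ms=5, home_router_ms=2,
                            transmission_rtt_ms=15, compute_ms=20)

if __name__ == "__main__":
    for label, b in [("dedicated, farther", dedicated_far),
                     ("shared, nearer", shared_near)]:
        print(f"{label}: {b.total_ms():.0f} ms, within 50 ms: {fits(b, 50)}")
```

With these made-up inputs the two designs land within a few milliseconds of each other, which is the point of the trade-off: compute time saved on a dedicated GPU can be spent on distance, and the reverse.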
GamesBeat: If you back some of this out, what are some of the decisions that the game companies or game designers are making? What do you think is within their control? If I go back to the example of Apex Legends, the thing I think matters with that — they have a battle royale game, and those are usually 100 players and one survives. But they switched it to being 60 players and one team survives. They have a map that seems more dense, a smaller map than some of the other battle royale games like PUBG, and they also have a quicker load right into the game. They have an art style that’s more animation-inspired than realistic. It’s not Call of Duty. Those things seem to help make it what it is, a very fast-loading game.
And then you contrast that with Anthem, which is also made by EA, but with very realistic graphics. The load times are very long, and then the problem a lot of players ran into in the first beta weekend is it wouldn’t load at all. It never finished loading for them. Then they’d try to load again and put more stress on the system. Does it sound like these things game designers do can make a difference in the quality of multiplayer?
Poole: It’s an interesting question. Going back to the analogy — you have the compute latency involved in how fast the game loads, how many streams have to be normalized inside the game process before it can turn around a stream and send it back down to the user. You run into the same thing in generic computing. There are people who understand how to write code for, say, a cloud environment that can auto-scale, as opposed to somebody who’s more used to writing code for a single server.
There are a few different philosophies. With one philosophy, writing for a single server, what you’re potentially doing is routing a number of individual players through a single server. The experience can work because the load is manageable by that one server. Or you could write a game such that it scales across a bunch of virtual machines more elegantly. Years ago I was sitting at a thing where Microsoft was launching one of their Xbox multiplayer games. They talked about how they worked with the game developer to use the auto-scaling aspects of Azure. They showed that within five minutes of launching the game, the system had spun up 80,000 cores to deal with the load. It could do that on a fairly elegant basis.
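The difference between the two philosophies can be shown with a toy scaling rule. This is a generic sketch, not Azure's autoscaling API or the approach used for the Xbox launch Poole mentions. The function name, the per-instance capacity, and the limits are all assumptions chosen for illustration.

```python
import math


def desired_instances(players, players_per_instance,
                      min_instances=1, max_instances=10_000):
    """Instances needed to serve the current player count.

    Rounds up so no instance is overloaded, then clamps the result to the
    configured floor and ceiling. All limits here are hypothetical.
    """
    if players_per_instance <= 0:
        raise ValueError("players_per_instance must be positive")
    needed = math.ceil(players / players_per_instance)
    return max(min_instances, min(needed, max_instances))


if __name__ == "__main__":
    # A single-server design is the special case max_instances=1: past its
    # capacity, extra players have nowhere to go.
    for players in (0, 250, 40_000):
        print(players, "players ->",
              desired_instances(players, players_per_instance=100), "instances")
```

A game written for one server effectively fixes the answer at one instance, while a game written to scale lets the platform recompute this number as load changes.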
I don’t know what every single game company is doing, but I’d imagine you have a disparity in terms of how people are thinking about what the underlying platform is. In this day and age, you’ll always hear stories about — everybody writes something super easy like Java without understanding what’s underneath it. Going forward now, with the specialization of things like GPUs versus CPUs versus — you’re getting a lot more specificity in what the machine is capable of. The programmer has to be aware of that.
GamesBeat: I talked to some people who made Microsoft’s game Crackdown. It took them four or five years to finish. It’s one of the first that takes advantage of the Azure cloud in a big way. They said they applied it in the multiplayer mode, which is only four players against four players, but it’s in this dense city environment where the buildings are completely destructible. The physics as you fire a shot and hit a building and the building crumbles are all supposed to be accurate. They said that one Xbox by itself couldn’t do this, but when they were playing they had the equivalent of 12 machines computing the physics and other things for multiplayer.
Poole: The way you would think about that, it’s a hybrid game. Part of it is cloud rendered, and so it’s feeding rendered components of the game back to the Xbox because the Xbox by itself can’t keep up. That’s become more common with the bigger console players, from what I’ve noticed in games, especially any game that has, as you said — the more you start increasing either the fidelity in the game, in terms of graphics, and/or the number of players, and these days people want both.
You have the perfect storm in the sense that now you have these massive pools of compute resources — GPUs and CPUs — that can be spun up dynamically, and you still have the console sitting out there that can do a lot of things. In the cloud gaming market, everyone is fixated on this idea that you can get rid of the $400 console or the $2,000 gaming PC. That’s the holy grail.
GamesBeat: The question is always there, though — do you want to go in that direction and enable light gaming on any device, or do you want to take all that computing power in the data center and give somebody something much more powerful?
Poole: We’ve been predicting the downfall of consoles for a while, and it still hasn’t happened. [laughs] The cloud component is still getting bigger and bigger, though.
GamesBeat: The other thing people mention around things like Azure and AWS and Google Cloud — if you offload all the work over to them, then your multiplayer game is going to be better in that you can focus on things that you’re good at. Is there a level of decision there that makes a big difference as to how well that multiplayer is going to turn out?
Poole: It’s hard to say. If you don’t have the right game for that environment, or you’re not using the tools that allow you to write for that environment — all those guys have tools, so you don’t necessarily have to understand how to invoke the next VM to add it to the pool of resources. The system knows how to do it. More generally, the guys in the cloud computing market are very focused on how to provide the tools for people.
The other day I was having a conversation with someone about an AR app. It’s very similar. They’re heavily dependent on the GPU farm in the cloud for what they’re doing. They can’t get away from that. They have no choice because the glasses themselves have very limited amounts of compute in them. They work quite closely with a cloud company to take advantage of the tools that they provide, allowing them to write their application such that it would work well for most users.
There’s a ton of attention being paid to that fact. If cloud has been extraordinarily disruptive for the traditional IT industry, based on the early deployment of millions of CPUs, bringing down the cost of CPU cycles, now you’re seeing it do the same thing [with GPUs]. Just a few years ago they didn’t even have GPU farms. Google was one of the first that started doing it because they were going after media companies doing movie animation stuff. Now all of them are heavily invested in GPUs.
It’s a large part of their offering because people are expecting not just gaming, but almost anything that involves some sort of rendering for AR or VR, or even normal video, rendering a movie or a TV show. I used to be in the media business. You had to go out and buy these ridiculous high-end million-dollar boxes, and it would take you a week to render five minutes of video. Now you can get the same thing from one of the hyper-scale guys and they can do it in minutes.
My thought, in general, is that the cloud, such as it is, is one of the things that’s going to help drive the adoption of things like streaming for games. They have the infrastructure out there. It’s deployed. They have access to lots of peering locations. They’re very distributed. They can get very close to the end user. That’s a big deal, because of the latency issue.
GamesBeat: If you’re talking to an Amazon and you’re a game developer, are there certain questions you’d be asking them about the level of service you need?
Poole: Almost all of them now have developer tools specific to certain types of environments. If you want to do facial recognition, for example, inside of Amazon, they have a set of tools called Rekognition built specifically around — here’s how you build an app that would allow you to ingest a camera feed and then run a recognition process against it. All of them are doing that to encourage the consumption of the CPU capability that they’ve invested in. It’s part of their offering.
I forget the last statistics I read about Amazon, but they do something like 30 or 40 feature releases a month across their entire platform. They have people just banging away, building new tools. The more you can get somebody used to using your tools, as opposed to writing all the way down to the individual processor level, the more likely they are to stay with you. That’s the game in cloud computing.
GamesBeat: Can you talk about some of the companies you’re working with?
Poole: Blade and Shadow — Blade is the company, Shadow is the network — they’re one that’s publicly announced, a customer of ours. In fact, you can go into PeeringDB and pull them up, and what you’ll see is they’re with us in multiple markets in Europe and the United States. They’ve essentially engineered the latency circles around their deployments such that the longest latency to most end users is something less than 20 milliseconds. That’s way better than most games would need. They’re a good example of someone who’s following that model in streaming games.
Then we have customers like NHN, the Korean company. Zynga is another example, the more casual kinds of game companies that have been out there forever now. They all generally recognize the fact that there’s this very big dependency going back to the peering infrastructure. Since we have 200-plus data centers in 52 metros in 25 countries around the world, we probably have the best footprint overall as far as anyone trying to do distribution.
In fact, there’s another resource you can look at, a company called Cloud Scene. Cloud Scene tracks data center companies and their footprint as it relates to the cloud providers. We always come up as the number one company in the Americas and Europe and Asia and Oceania. If you want to go global, that’s how you do it. We as a company — if you took our revenue and organized it according to people who deploy in multiple countries versus people who deploy in one city, what you see is that the majority of our revenue comes from big multinational companies. All the big cloud companies, all the CDNs are customers. Most of the gaming companies are customers in one way or another. It’s all because of this need to be close.
We talk a lot about what we call interconnection-oriented architecture, which is just the fact that whether you’re in the gaming industry or the cloud computing industry, the model has flipped over. People used to run all of their stuff in one data center in the middle of the country. That was the model. You put everything there, and then the farther away you got, the worse the experience you got. People did application acceleration and all these kinds of things. Now, because of cloud computing, the model is flipped on its head. If you look at most companies we deal with nowadays, they’re highly distributed. They have no centralized anything. Everything is distributed because it’s all about user experience. Gaming just happens to be the ultimate user experience example.