Scaling the edge's wall: A new take on an old problem

Anyone who has ever scaled a network understands the problem, whether already here or still coming, posed by the countless edge deployments now fueling the Internet of Things (IoT). If you haven't, you soon will.

To illustrate, say you have an IP camera sending a 4K video stream over Wi-Fi to your edge server. Yes, it uses a lot of bandwidth, but you planned for that before deploying two years ago. Now your application needs a second camera. Suddenly, the bandwidth requirements of both cameras are greater than what your edge network can provide. You’ve hit a wall. Your resources are saturated and no longer able to meet performance requirements.

You’ve seen this bottleneck problem before in data centers. But the edge is not a data center. The solutions that helped in your hot and cold aisles will not save you here, in part because no two edge deployments are the same. Configuration can vary just as widely as the edge’s environment. Scaling the edge requires new thinking.

This isn’t just about cameras and surveillance traffic. It's about:

Managing the explosion of node types, from thermostats and doorbells to blood chemistry sensors and bio-implantable identification tags.
Questioning your connectivity and evaluating whether 5G is a smarter, more scalable solution for future edge needs.
Devising new applications for all those nodes.
Discovering new ways to prioritize melding disparate data streams into fresh sources of insight rather than hoarding them into the biggest mountain of bits possible.
Handling the increased network demands of AI.

Talking about “exponential IoT growth” has practically become a cliché, but few people seem to discuss what to do when all that node data overloads edge infrastructure resources. Let’s start there.

Above: Smart cities are a major frontier for scaling edge IoT. Image source: Getty Images.

Not your usual node problem

In data centers, this is an age-old problem with a fairly straightforward solution: IT monitors server infrastructure use levels across compute, storage, and networking resources. Once utilization breaches threshold levels, IT throws more hardware on the server racks, and everything returns to normal. But it’s not that easy with edge networks.

“Data center IT have dealt with and solved these issues in closed, isolated environments,” said Toby McClean, ADLINK’s vice president of IoT technology and innovation. “At the edge, though, you have a much more diverse, spread out, heterogeneous environment. In a homogeneous data center, basically any workload can be redirected to any resource, right? At the edge, you have mobile and fixed edge compute servers, switches, gateways — all with different capabilities and resources. How do you move workloads to free resources? It’s not straightforward because not every workload can go to every node.”
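To make McClean's point concrete, here is a minimal Python sketch of capability-aware placement. It is an illustration under stated assumptions, not ADLINK's method: the `Node` and `Workload` classes, the capability tags, and the "most free CPU wins" rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float       # free CPU cores (hypothetical unit)
    free_mem_mb: int      # free memory in MB
    capabilities: set = field(default_factory=set)  # e.g. {"gpu"}


@dataclass
class Workload:
    name: str
    cpu: float
    mem_mb: int
    requires: set = field(default_factory=set)  # capabilities the workload needs


def eligible(node, workload):
    """A node qualifies only if it has every required capability and enough headroom."""
    return (workload.requires <= node.capabilities
            and node.free_cpu >= workload.cpu
            and node.free_mem_mb >= workload.mem_mb)


def place(workload, nodes):
    """Reserve resources on the eligible node with the most free CPU.

    Returns the chosen node's name, or None if no node can host the workload.
    """
    candidates = [n for n in nodes if eligible(n, workload)]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: n.free_cpu)
    best.free_cpu -= workload.cpu
    best.free_mem_mb -= workload.mem_mb
    return best.name


if __name__ == "__main__":
    # Hypothetical fleet: one small gateway and two edge servers.
    nodes = [
        Node("gateway-1", 1.0, 512, {"modbus"}),
        Node("edge-server-1", 6.0, 8192, {"gpu", "modbus"}),
        Node("edge-server-2", 8.0, 16384),
    ]
    for wl in (Workload("video-analytics", 2.0, 2048, {"gpu"}),
               Workload("log-rollup", 0.5, 128),
               Workload("lidar-fusion", 1.0, 256, {"fpga"})):
        print(wl.name, "->", place(wl, nodes))
```

In this made-up fleet, the GPU workload can land on only one machine, and the workload that needs an FPGA has nowhere to go at all, no matter how much spare CPU sits elsewhere.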

Moreover, scaling edge resources isn’t just a matter of throwing more metal at the problem. It’s as much about software as hardware resources, if not more. Measuring software demands is easy when an application runs in isolation, but increasingly, that’s not how edge systems operate. Applications can cooperate with one another, including across differing geographies, and even two iterations of the same application might have completely different modules installed. How do you determine resource requirements, then?

Sometimes the answer is less critical because the edge solution is such that there’s time and budget to send data into the cloud and/or data center. In essence, the burden can be absorbed at the network core.

However, this is increasingly a real-time world. IDC predicts that “due to the infusion of data into our business workflows and personal streams of life […] nearly 30% of the Global Datasphere will be real-time by 2025.” In such use cases, there isn’t time to send data beyond the edge for processing. Consider what a two-second input-to-action lag would mean for autonomous driving. No, in those 30% of scenarios, if not many more, edge resources are on their own, which adds even more pressure to scale.

Topology matters

Not surprisingly, there is no cookie-cutter solution for edge scaling. The needs of an edge network on a manufacturing floor will be vastly different from those of an army platoon running off-grid from the back of a tactical all-terrain vehicle.

In the former case, scaling strategies may well echo those found in data centers. “If I go from the end point to a computer, whatever it might be, that needs to connect to the network somehow,” said Stephen Mellor, CTO of the Industrial Internet Consortium, a group devoted to promoting development and best practices of the industrial IoT. “Once it’s connected to the network, you can scale by continuing to add edge nodes.”

But, he noted, if you don’t have connectivity, such as in a remote oil field or a far-flung military deployment, then it becomes about bandwidth from the endpoint into the network. “And if that means you’re down to 4G or even satellite, then you may have to accommodate connectivity outages and carrying out more decisions closer to the devices. You’ll need enough computing power to deal with the highest possible load your application can reasonably expect.”

Mellor noted that one way to ensure that there’s enough computing power at the edge is to distribute loads from the IoT endpoint all the way up to the cloud and not focus on placing compute solely in the data center. He advises people to employ data gravity to ensure that data, and the computation for that data, is in the cheapest possible place, even though doing so may require some complex orchestration.
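One simplified way to read that "cheapest possible place" advice is as a cost comparison across tiers, limited to the tiers that can respond in time. The Python sketch below is a hedged illustration only: the `Tier` class, the per-gigabyte prices, and the latency figures are hypothetical, and real orchestration would weigh far more factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    compute_cost_per_gb: float   # hypothetical cost to process 1 GB at this tier
    transfer_cost_per_gb: float  # hypothetical cost to move 1 GB here from the endpoint
    round_trip_ms: float         # hypothetical latency between endpoint and tier


def cheapest_tier(tiers, data_gb, deadline_ms):
    """Return (tier name, total cost) for the cheapest tier that meets the deadline.

    Returns None when no tier can respond within deadline_ms.
    """
    feasible = [t for t in tiers if t.round_trip_ms <= deadline_ms]
    if not feasible:
        return None

    def cost(t):
        return data_gb * (t.compute_cost_per_gb + t.transfer_cost_per_gb)

    best = min(feasible, key=cost)
    return best.name, cost(best)


if __name__ == "__main__":
    # Made-up numbers: compute gets cheaper toward the cloud, transfer gets pricier.
    tiers = [
        Tier("endpoint", 0.50, 0.00, 1),
        Tier("edge-server", 0.20, 0.02, 10),
        Tier("cloud", 0.05, 0.09, 150),
    ]
    for deadline in (500, 50, 5):
        print(f"deadline {deadline} ms ->", cheapest_tier(tiers, 10, deadline))
```

With these invented figures, a relaxed deadline sends the work to the cloud, while tighter deadlines pull it back toward the edge server and finally the endpoint, each at a higher cost.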

ADLINK’s McClean offers the same advice, emphasizing that a scalable edge infrastructure should be designed from the outset such that workloads can easily move about in ways that will optimize resource use. The company manufactures a range of IoT devices and servers, with platforms ranging from low-end Intel Atom processor boxes up through dual-Xeon blade systems.

McClean noted that many people approach such product lines thinking of a pyramid hierarchy for their edge networks, with a few very powerful systems at the top computing input from smaller servers and gateways in the middle and broad masses of low-power nodes at the bottom.

However, he cautions that this sort of pyramid approach in practice makes for especially difficult load rebalancing. Instead, McClean said that ADLINK advocates more of a peer-to-peer, mesh-style infrastructure.

Above: Mesh topology, which allows for much easier distribution and cooperation of resources as well as more efficient fault tolerance. Image: Jellyfishteam via Wikimedia Commons, reused without changes. Licensed under the Creative Commons Attribution-Share Alike 4.0 International license.

“A lot of the way data flows in these IoT systems is through broker-based systems, which tend to lend themselves very much to hierarchy. You have a set of things collecting data, which then pump into a concentrator or a gateway, which then filter and aggregate and send it up to the next level, and on it goes. With peer-to-peer, there’s no broker sitting in the middle. Systems just speak directly to each other. The middleware you deploy determines whether this is complicated or easy to manage.”

Is less more?

According to Michele Pelino, principal analyst for Internet of Things and enterprise mobility at Forrester, organizations may solve their edge scaling issues more through a mindset shift than hardware upgrades. She points to the high costs of sending data to and from the edge as well as storing it in data centers.

“Increasingly, we need to make decisions right at the sensor level out in the field,” said Pelino. “Whether it’s a wind farm turbine or a naval ship moving from point A to point B, the end point must decide what information is important enough to send to the data center and then only send that. Some of the AI processing has to happen at the sensor level, literally at that device.”

Pelino points to Amazon’s AWS Greengrass and Microsoft’s Azure IoT Edge as early examples of how major cloud providers are working to enable AI capabilities in edge nodes, even in the absence of internet connectivity. Edge-based processing of IoT device workloads can enable faster responses to changing field conditions, at least short-term independence from connection services, and lower total data costs. Depending on the edge device, there may be additional energy issues, because even the most basic AI carries a compute load that will consume additional power. So, organizations will need to assess whether the costs of edge independence are outweighed by savings elsewhere.

At the most basic level, though, even a little intelligence will give sensors the ability to judge whether there has been a state change (beyond threshold settings) since the last measurement. If not, then there’s no reason to incur the costs of sending a new data set. Magnified across hundreds to many thousands of IoT devices, the cost benefits of not sending unneeded data may be massive.
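That kind of state-change check can be very small. Below is a minimal Python sketch of it, sometimes called a deadband or report-by-exception filter. The class name, the 0.5-degree threshold, and the sample temperatures are assumptions made for illustration, not details from any vendor's product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeadbandFilter:
    """Forward a reading only when it differs from the last sent value by more than a threshold."""
    threshold: float                   # hypothetical deadband, in sensor units
    last_sent: Optional[float] = None  # value most recently transmitted

    def should_send(self, reading: float) -> bool:
        # Always send the first reading so the receiver has a baseline.
        if self.last_sent is None or abs(reading - self.last_sent) > self.threshold:
            self.last_sent = reading
            return True
        return False


def filter_readings(readings, threshold):
    """Return only the readings a node would actually transmit."""
    f = DeadbandFilter(threshold)
    return [r for r in readings if f.should_send(r)]


if __name__ == "__main__":
    # Made-up temperature samples from a single sensor.
    temps = [20.0, 20.1, 20.2, 21.0, 21.1, 19.9, 20.0]
    sent = filter_readings(temps, threshold=0.5)
    print(f"sent {len(sent)} of {len(temps)} readings: {sent}")
```

Here the comparison is against the last value sent rather than the last value measured, so a slow drift still triggers an update once it adds up to more than the threshold.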

Data: Let it go

Think of it as going Marie Kondo on your IoT data. If that data does not spark joy, or at least value, then let it go.

Although that may sound flippant, it marks a mindset shift from traditional data approaches, where hoarders seem to win. Stream it all. Store it all. You never know when those bits might come in handy…someday.

At the edge, though, such methods may not scale.

“We haven't hit saturation [at the edge] per se,” said Pelino. “But I think there’s a recognition that we will. It’s coming. The connected world we’re moving toward is going to drive that, especially as 5G provides much more capability to show and analyze video. The amount of data is going to grow exponentially with those networks.”

Bottom line

In short, there is no one simple answer on how to scale edge performance if and when it hits a wall in the coming data deluge, unless it’s this: Challenge the old assumptions and methods.

Hierarchical infrastructure may not be the best topology, even in a retrofit. Collecting and saving as much data as possible may be counterproductive. This is a new time, and there are no holy commandments for operating a scalable edge network. Keep questioning.