Can we afford AI?

Of all the concerns surrounding artificial intelligence these days -- and no, I don't mean evil robot overlords, but more mundane things like job replacement and security -- perhaps none is more overlooked than cost.

This is understandable, considering AI has the potential to lower the cost of doing business in so many ways. But AI is not only expensive to acquire and deploy, it also requires a substantial amount of compute power, storage, and energy to produce worthwhile returns.

Back in 2019, AI pioneer Elliot Turner estimated that training the XLNet natural language system could cost upwards of $245,000 -- roughly 512 TPUs running at full capacity for 60 straight hours. And there is no guarantee it will produce usable results. Even a simple task like training an intelligent machine to solve a Rubik's Cube could consume up to 2.8GWh of electricity, about the hourly output of three nuclear power plants. These are serious -- although still debatable -- numbers, considering that some estimates claim technology processes will draw more than half of our global energy output by 2030.
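To put the XLNet estimate in perspective, the short Python sketch below works out the total accelerator time and the hourly rate implied by the figures quoted above. It uses only the numbers in the estimate (512 TPUs, 60 hours, $245,000); the function names are illustrative, and the resulting rate is a back-of-the-envelope figure, not a quoted cloud price.

```python
# Figures quoted in the article (Turner's 2019 estimate for XLNet).
TPU_COUNT = 512
HOURS = 60
ESTIMATED_COST_USD = 245_000


def tpu_hours(tpu_count: int, hours: float) -> float:
    """Total accelerator time consumed by one training run."""
    return tpu_count * hours


def implied_hourly_rate(cost_usd: float, tpu_count: int, hours: float) -> float:
    """Cost per TPU-hour implied by a total cost estimate."""
    return cost_usd / tpu_hours(tpu_count, hours)


total_hours = tpu_hours(TPU_COUNT, HOURS)
rate = implied_hourly_rate(ESTIMATED_COST_USD, TPU_COUNT, HOURS)
print(f"{total_hours:,.0f} TPU-hours at about ${rate:.2f} per TPU-hour")
```

That works out to 30,720 TPU-hours, or a little under $8 per TPU-hour, for a single training run with no guarantee of a usable model at the end.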

Silicon solution

Perhaps no one understands this better than IBM, which has been at the forefront of the AI evolution -- with varying degrees of success -- thanks to platforms like Watson and Project Debater. The company's Albany, New York-based research lab has an AI Hardware Center that might be on the verge of unveiling some intriguing results in the drive to reduce the computational demands of training AI and guiding its decision-making processes, according to Tirias Research analyst Kevin Krewell.

A key development is a quad-core test chip recently unveiled at the International Solid-State Circuits Conference (ISSCC). The chip features a hybrid 8-bit floating-point format for training functions and both 2- and 4-bit integer formats for inference, Krewell wrote in a Forbes piece. This would be a significant improvement over the 32-bit floating-point formats that power most current AI systems, but only if the right software can be developed to produce the same or better results within these smaller logic and memory footprints. So far, IBM has been silent on how it intends to do this, although the company has announced that its DEEPTOOLS compiler, which supports AI model development and training, is compatible with the chip's 7nm die.
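To see why narrower number formats matter, the Python sketch below compares the memory needed to store a model's weights at 32, 8, 4, and 2 bits, and then squeezes a few floating-point values into low-bit integers using simple symmetric quantization. This is a generic textbook technique chosen for illustration, not IBM's method, which the company has not described; the one-million-weight model size and the sample weights are made-up values.

```python
def weight_memory_bytes(num_weights: int, bits_per_weight: int) -> float:
    """Storage needed for the weights alone, ignoring any format overhead."""
    return num_weights * bits_per_weight / 8


def quantize_symmetric(values, bits):
    """Map floats to signed integers of the given width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # largest magnitude, e.g. 7 for 4-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


NUM_WEIGHTS = 1_000_000  # hypothetical model size, for illustration only
for bits in (32, 8, 4, 2):
    print(f"{bits:>2}-bit weights: {weight_memory_bytes(NUM_WEIGHTS, bits):,.0f} bytes")

weights = [0.7, -0.3, 0.1, 0.0]  # made-up sample values
for bits in (4, 2):
    q, scale = quantize_symmetric(weights, bits)
    restored = [round(v, 3) for v in dequantize(q, scale)]
    print(f"{bits}-bit: integers {q}, restored {restored}")
```

The memory saving is linear in the bit width, but the restored values drift further from the originals as the width shrinks, which is the accuracy problem the software has to solve.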

Qualcomm is also interested in driving greater efficiency in AI models, with a particular focus on Neural Architecture Search (NAS), the means by which intelligent machines map the most efficient network topologies to accomplish a given task. But since Qualcomm's chips generally have a low power footprint to begin with, its focus is on developing new, more efficient models that work comfortably within existing architectures, even at scale.

All for one

To that end, Qualcomm says it has adopted a holistic approach to modeling that stresses the need to improve efficiency along multiple axes -- like quantization, compression, and compilation -- in a coordinated fashion. Since these techniques complement each other, researchers can attack the efficiency challenge from each angle, but they must take care that a change in one area does not disrupt gains in another.

Applied to NAS, this approach targets three key challenges: reducing high compute costs, improving scalability, and delivering more accurate hardware performance metrics. Qualcomm's solution, called DONNA (Distilling Optimal Neural Network Architectures), provides a highly scalable means to define network architectures around accuracy, latency, and other requirements and then deploy them in real-world environments. The company is already reporting a 20% speed boost over MobileNetV2 in locating highly accurate architectures on a Samsung S21 smartphone.
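The idea of choosing an architecture around accuracy and latency requirements can be illustrated with a toy selection step. The Python sketch below is not DONNA and makes no claim about how Qualcomm's search works; it simply filters a handful of hypothetical candidate networks (names and numbers invented for the example) down to those that are not beaten on both accuracy and latency, then picks the most accurate one that fits a latency budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float    # fraction of correct predictions, 0..1
    latency_ms: float  # measured on the target device


def pareto_front(candidates):
    """Keep candidates that no other candidate beats on both accuracy and latency."""
    front = []
    for c in candidates:
        dominated = any(
            o.accuracy >= c.accuracy
            and o.latency_ms <= c.latency_ms
            and (o.accuracy > c.accuracy or o.latency_ms < c.latency_ms)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.latency_ms)


def best_under_budget(candidates, max_latency_ms):
    """Most accurate candidate within the latency budget, or None if none fit."""
    eligible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    return max(eligible, key=lambda c: c.accuracy) if eligible else None


# Hypothetical candidates; a real search would measure these on hardware.
CANDIDATES = [
    Candidate("net_a", 0.72, 10.0),
    Candidate("net_b", 0.75, 14.0),
    Candidate("net_c", 0.74, 16.0),  # slower and less accurate than net_b
    Candidate("net_d", 0.78, 22.0),
]

print([c.name for c in pareto_front(CANDIDATES)])
print(best_under_budget(CANDIDATES, max_latency_ms=15.0))
```

The expensive part of a real search is producing trustworthy accuracy and latency numbers for each candidate, which is why compute cost and accurate hardware metrics feature among the challenges listed above.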

Facebook also has a strong interest in fostering greater efficiency in AI. The company recently unveiled a new algorithm called SEER (SElf-supERvised) that reduces the amount of labeling required to make effective use of datasets. The process allows AI to draw accurate conclusions from a smaller set of labeled examples. In this way, it can identify, say, a picture of a cat without having to comb through thousands of existing pictures that have already been labeled as cats. This reduces the number of human hours required in training, as well as the overall data footprint required for identification, all of which speeds up the process and lowers overall costs.

Speed, efficiency, and reduced resource consumption have been driving factors in IT for decades, so it's no surprise that these goals are starting to drive AI development as well. What is surprising is the speed at which this is happening. Traditionally, new technologies are deployed first, leaving things like costs and efficiency as afterthoughts.

It's a sign of the times that AI is already adopting streamlined architectures and operations as core capabilities before it hits a critical level of scale. Even the most well-heeled companies recognize that the computational requirements of AI are likely to be far greater than anything they've encountered before.
