Proposed framework could reduce energy consumption of federated learning

Modern machine learning systems consume massive amounts of energy. In fact, it's estimated that training a large model can generate as much carbon dioxide as five cars emit over their lifetimes. The impact could worsen as machine learning moves into distributed and federated settings, where billions of devices are expected to train machine learning models on a regular basis.

In an effort to lessen the impact, researchers at the University of California, Riverside and Ohio State University developed a federated learning framework optimized for networks with severe power constraints. They claim it's both scalable and practical in that it can be applied to a range of machine learning settings in networked environments, and that it delivers "significant" performance improvements.

The effects of AI and machine learning model training on the environment are increasingly coming to light. Ex-Google AI ethicist Timnit Gebru recently coauthored a paper on large language models that discussed urgent risks, including carbon footprint. And in June 2019, researchers at the University of Massachusetts at Amherst released a report estimating that training and searching a certain model produces roughly 626,000 pounds of carbon dioxide, equivalent to nearly 5 times the lifetime emissions of the average U.S. car.

In machine learning, federated learning entails training algorithms across client devices that hold data samples without exchanging those samples. A centralized server might be used to orchestrate rounds of training for the algorithm and act as a reference clock, or the arrangement might be peer-to-peer. Regardless, local algorithms are trained on local data samples and the weights -- the learnable parameters of the algorithms -- are exchanged between the algorithms at some frequency to generate a global model. Preliminary studies have shown this setup can lead to lowered carbon emissions compared with traditional learning.
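To make the weight exchange concrete, here is a minimal sketch of how a server might combine client weights into a global model. It assumes the common federated averaging scheme, in which each client's weights count in proportion to its number of local samples; the article does not say which aggregation rule the systems it mentions use, and the function name `federated_average` is purely illustrative.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Combine client weight vectors into one global weight vector.

    Assumption: sample-count-weighted averaging (federated averaging).
    Only weights and sample counts are shared, never the raw samples.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            global_weights[i] += w * size / total
    return global_weights


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 samples respectively.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this toy run the second client holds three times as much data, so the global weights land closer to its values.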

In designing their framework, the researchers of this new paper assumed that clients have intermittent power and can participate in the training process only when they have power available. Their solution consists of three components: (1) client scheduling, (2) local training at the clients, and (3) model updates at the server. Client scheduling is performed locally such that each client decides whether to participate in training based on an estimation of available power. During the local training phase, clients that choose to participate in training update the global model using their local datasets and send their updates to the server. Upon receiving the local updates, the server updates the global model for the next round of training.
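The three components described above can be sketched as a single training round. This is a toy illustration under loud assumptions, not the researchers' algorithm: the article does not give their scheduling rule, so a simple "enough stored power" threshold stands in for it, the model is a single weight fitted to `y = w * x`, and the server uses a plain unweighted average. The names `Client`, `available_power`, `round_cost`, and `run_round` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Client:
    data: List[Tuple[float, float]]  # local (x, y) samples, never sent to the server
    available_power: float           # hypothetical estimate of stored energy
    round_cost: float                # hypothetical energy cost of one local update

    def decides_to_participate(self) -> bool:
        # (1) Client scheduling, performed locally.
        # Placeholder rule (assumption): join only if power covers the cost.
        return self.available_power >= self.round_cost

    def local_update(self, global_w: float, lr: float = 0.1) -> float:
        # (2) Local training: one gradient step on mean squared error
        # for the one-parameter model y = w * x.
        grad = sum(2 * (global_w * x - y) * x for x, y in self.data) / len(self.data)
        self.available_power -= self.round_cost  # training spends energy
        return global_w - lr * grad


def run_round(global_w: float, clients: List[Client]) -> float:
    # Only clients that opted in train and send an update.
    updates = [c.local_update(global_w) for c in clients if c.decides_to_participate()]
    if not updates:
        return global_w  # nobody had power; keep the current global model
    # (3) Server update: unweighted mean of the received updates (assumption).
    return sum(updates) / len(updates)


if __name__ == "__main__":
    clients = [
        Client(data=[(1.0, 2.0)], available_power=1.0, round_cost=0.5),
        Client(data=[(1.0, 2.0)], available_power=0.1, round_cost=0.5),  # sits out
    ]
    print(run_round(0.0, clients))
```

Keeping the participation decision inside the client mirrors the locally performed scheduling the article describes: the server never needs to know how much power any device has.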

Across several experiments, the researchers compared the performance of their framework with two conventional federated learning benchmarks. In the first, clients participated in training as soon as they had enough power. In the second, the server waited for clients to have enough power to participate in training before initiating a training round.

The researchers claim that their framework significantly outperformed the two benchmarks in terms of accuracy. They hope it serves as a first step toward sustainable federated learning techniques and opens up research directions in building large-scale machine learning training systems with minimal environmental footprints.
