Study shows that federated learning can lead to reduced carbon emissions

Carbon dioxide, methane, and nitrous oxide levels are at the highest they've been in the last 800,000 years. Together with other drivers, greenhouse gases likely catalyzed the global warming that's been observed since the mid-20th century. Machine learning models, too, have contributed indirectly to the adverse environmental trend. That's because they require a substantial amount of computational resources and energy -- models are routinely trained for thousands of hours on specialized hardware accelerators in datacenters estimated to use 200 terawatt-hours per year. (The average U.S. home consumes about 10,000 kilowatt-hours per year, a fraction of that total.)

This state of affairs motivated researchers at the University of Cambridge, the University of Oxford, University College London, Avignon Université, and Samsung to investigate more energy-efficient approaches to training AI models. In a newly published paper, they explore whether federated learning, which involves training models across a number of machines, can lead to lower carbon emissions than traditional centralized training. Their findings suggest that federated learning can have a measurably smaller carbon footprint despite being slower in some cases.

The effects of AI and machine learning model training on the environment are increasingly coming to light. Ex-Google AI ethicist Timnit Gebru recently coauthored a paper on large language models that discussed urgent risks, including carbon footprint. And in June 2019, researchers at the University of Massachusetts at Amherst released a report estimating that training and searching a certain model architecture produces roughly 626,000 pounds of carbon dioxide emissions, equivalent to nearly 5 times the lifetime emissions of the average U.S. car.

In machine learning, federated learning entails training algorithms across different devices holding data samples without exchanging those samples. A centralized server might be used to orchestrate the steps of the algorithm and act as a reference clock, or the arrangement might be peer-to-peer. Regardless, local algorithms are trained on local data samples, and the weights (the learnable parameters of the algorithms) are exchanged between the algorithms at some frequency to generate a global model.
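The exchange of weights described above can be sketched in a few lines of Python. This is a minimal illustration of weighted federated averaging, not the setup used in the paper: the tiny linear model, the function names (local_update, federated_average, train_federated), the learning rate, and the toy data are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Weights = List[float]            # [slope, intercept] of a toy linear model
Sample = Tuple[float, float]     # one (x, y) pair held privately by a device


def local_update(weights: Weights, data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one device's own data.

    The raw samples never leave this function; only weights are returned.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error for y ~ w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Combine client weights, weighting each client by its sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train_federated(clients: Sequence[Sequence[Sample]],
                    rounds: int = 20) -> Weights:
    """Server loop: send out the global weights, collect updates, average."""
    global_weights: Weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    # Hypothetical toy data: both devices observe points on y = 2x + 1.
    device_a = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]
    device_b = [(-2.0, -3.0), (2.0, 5.0)]
    print(train_federated([device_a, device_b]))
```

In this sketch a central server orchestrates each round; a peer-to-peer arrangement would replace train_federated with devices averaging among themselves, while local_update would stay the same.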

To measure the carbon footprint of a federated learning setup, the coauthors of the new paper trained two models -- an image classification model and a speech recognition model -- using a server with a single GPU and CPU and two chipsets, Nvidia Tegra X2 and Jetson Xavier NX. They recorded the power consumption of the server and chipsets during training, taking into account how energy usage might vary depending on the country where the chipsets and server are located.
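A rough sketch of the arithmetic behind such a measurement: power readings are integrated into energy, and the energy is multiplied by the carbon intensity of the local electricity grid. The function names, the sampling interval, and the intensity values below are hypothetical placeholders chosen for illustration; they are not figures or methods taken from the paper.

```python
from typing import Dict, Sequence

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def energy_kwh(power_watts: Sequence[float], interval_seconds: float) -> float:
    """Convert evenly spaced power readings (watts) into energy (kWh)."""
    joules = sum(power_watts) * interval_seconds
    return joules / JOULES_PER_KWH


def emissions_grams(energy: float, intensity_g_per_kwh: float) -> float:
    """Estimate CO2 in grams from energy (kWh) and grid carbon intensity."""
    return energy * intensity_g_per_kwh


# Hypothetical grid intensities in grams of CO2 per kWh. Real values vary
# by country and over time; these are placeholders, not measured data.
EXAMPLE_INTENSITY: Dict[str, float] = {
    "low_carbon_grid": 50.0,
    "high_carbon_grid": 500.0,
}

if __name__ == "__main__":
    # Assumed readings: a device drawing a steady 10 W, sampled every 60 s
    # for one hour of training.
    readings = [10.0] * 60
    used = energy_kwh(readings, interval_seconds=60.0)
    for grid, intensity in EXAMPLE_INTENSITY.items():
        print(grid, emissions_grams(used, intensity))
```

The same training run yields different emissions depending only on the intensity value, which is why the location of the server and devices matters in the comparison.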

The researchers found that while there's a difference between carbon dioxide emission factors among countries, federated learning is reliably "cleaner" than centralized training. For example, training on the open source image dataset CIFAR10 in France using federated learning saves from 1.8 grams to 4.4 grams of carbon dioxide compared with centralized training in China. For larger datasets such as ImageNet, any federated learning setup in France emits less than any centralized setup in China and the U.S. And with the speech dataset the researchers used, federated learning is more efficient than centralized training in any country.

Federated learning has an environmental advantage partly due to the cooling needs of datacenters, the researchers explain. According to a recent paper in the journal Science, while strides in datacenter efficiency have mostly kept pace with growing demand for data, datacenters accounted for about 1% of global electricity use over the past decade. That's roughly equivalent to the consumption of 18 million U.S. homes.

The researchers caution that federated learning isn't a silver bullet, because a number of factors could make it less efficient than it otherwise might be. Highly distributed databases can prolong training times, translating to a higher level of carbon dioxide emissions. The workload, model architecture, and hardware efficiency also play a role. Even data transfer via Wi-Fi can contribute significantly to carbon emissions depending on the size of the model, the size of the dataset, and the energy consumed by devices during training.

Still, the researchers assert that taking the carbon dioxide emissions rate into account while optimizing AI models could reduce pollution while maintaining good performance. To that end, they call on data scientists to design algorithms that minimize emissions and on device manufacturers to be more transparent about energy consumption.

"Federated learning ... is starting to be deployed at a global scale by companies that must adhere to new legal demands and policies originating from governments and civil society for privacy protection," the researchers wrote. "By quantifying carbon emissions for federated and demonstrating that a proper design of the federated setup leads to a decrease of these emissions, we encourage the integration of the released carbon dioxide as a crucial metric to the federated learning deployment."
