Training of AI-based systems is already incredibly inefficient, and new research suggests that hackers may be about to make things a whole lot worse.
That computing is an energy-intensive business is increasingly well understood. At the height of the mania surrounding cryptocurrencies such as Bitcoin, a commonly cited statistic was that processing these digital currencies consumed more energy than the whole of Denmark. Globally, data centers are believed to account for around 2% of all electricity consumption.
Similarly, estimates from a couple of years ago suggested that training a single machine learning model requires as much energy, and generates as much carbon dioxide, as building and running five cars over their entire lifetimes.
At a time when so many are urging society to make any post-COVID recovery as green as possible, there is an obvious desire for the data centers that power so much of the modern world to follow suit. The scale of the challenge is considerable, with data from Technavio predicting that the data center industry will grow by around $284 billion per year as it strives to support the 2,343 trillion megabytes we’re projected to be consuming per year.
The sector has made considerable strides in recent years to become more energy efficient. This is highlighted by the fact that, despite the volume of work done by data centers rising by 500% between 2010 and 2018, energy consumption grew by just 6%. This is due to a concerted effort across the industry to upgrade hardware, while the likes of DeepMind have applied their considerable intellect to making data centers more efficient.
An inefficient start
Despite these efforts, the training of AI-based systems remains incredibly inefficient, especially compared to how humans learn, because such vast quantities of data are required. For instance, whereas an average five-year-old might develop a decent grasp of language after hearing around 45 million words, an AI system such as BERT was trained on a dataset containing around 3.3 billion words, which it consumed 40 times over. That makes it roughly 3,000 times less efficient than the child.
With this process typically repeated numerous times throughout the development of such systems, the energy costs are considerable. Unfortunately, research from the University of Maryland suggests that hackers may be about to make things a whole lot worse. The paper highlights how hackers could use adversarial attacks to slow down the operation of a neural network, and therefore consume far more computational resources than would otherwise be the case.
Specifically, the researchers show how hackers could target so-called input-adaptive multi-exit architectures.
This is a form of neural network that aims to be more computationally efficient by dividing tasks up according to their difficulty: easy inputs can leave the network via an early "exit" rather than passing through every layer, so each problem is solved with the minimum amount of computational heft possible.
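The early-exit idea can be sketched in a few lines. The following is a minimal, illustrative toy rather than any architecture from the paper: a stack of dense layers, each followed by an auxiliary classifier, with inference stopping at the first exit whose confidence clears a (here arbitrary) threshold.

```python
import numpy as np

# Toy multi-exit network: all sizes, weights, and the 0.9 confidence
# threshold below are illustrative assumptions, not values from the paper.
rng = np.random.default_rng(0)

N_LAYERS, DIM, N_CLASSES = 4, 8, 3
layers = [rng.standard_normal((DIM, DIM)) * 0.5 for _ in range(N_LAYERS)]
exit_heads = [rng.standard_normal((DIM, N_CLASSES)) * 0.5 for _ in range(N_LAYERS)]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_with_early_exit(x, threshold=0.9):
    """Run the network block by block, stopping at the first exit head
    whose top-class confidence clears the threshold."""
    h = x
    for i, (w, head) in enumerate(zip(layers, exit_heads), start=1):
        h = np.tanh(h @ w)              # one "block" of computation
        probs = softmax(h @ head)       # auxiliary classifier at this exit
        if probs.max() >= threshold:    # confident enough: stop early
            return probs.argmax(), i    # prediction and layers actually run
    return probs.argmax(), N_LAYERS     # fell through: full network used

pred, layers_used = predict_with_early_exit(rng.standard_normal(DIM))
print(f"predicted class {pred} using {layers_used} of {N_LAYERS} layers")
```

The energy saving comes from `layers_used` being small for "easy" inputs; the full depth is only paid for inputs that never produce a confident early exit.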
Changing the input
Such an approach has numerous advantages, not least reducing both the time required to produce a result and the carbon footprint generated in doing so. It is increasingly popular because it allows neural networks to be deployed on constrained devices, such as smart speakers or smartphones. The problem is that changing the input to the neural network changes the computational power required to solve the problem.
As the authors illustrate, this opens such systems up to hackers, who can deploy an adversarial attack that modifies the original input and thereby gums up the neural network, making it both slower and less energy efficient. The researchers highlight how adding even a relatively small amount of noise to the network's input can significantly raise the computation required.
Such an attack would be especially potent if the attacker had a large amount of information about the neural network.
Even with relatively low levels of information, however, the potential disruption could be significant. The paper highlights how attackers with limited knowledge of the network could still increase energy usage by up to 80%, largely because the attacks transfer well across different types of neural network. In other words, once attackers have developed their approach for one network, it would work pretty well against other networks too.
Such attacks are, of course, purely theoretical today, and there is no evidence of them occurring in the wild, not least because input-adaptive architectures themselves remain fairly rare. The researchers believe, however, that they will become more commonplace as the industry strives to deploy neural networks in things like IoT devices, which demand a more computationally efficient approach. While considerable attention has been given to securing such devices from attack, most of it has focused on attacks that manipulate the performance of those devices. In our climate-sensitive times, however, maybe that is due for a change.