I think sustainability has long since arrived in every industry, including software. You can start working sustainably during development itself, for example by choosing the right programming language or by relying on massive parallelization to use the hardware as efficiently as possible.
But energy can and must also be saved in the various tasks that are given to neural networks. I deliberately use the word must here, because I was honestly more than surprised when I read the results of a UC Berkeley study on power consumption and AI. The results show that AI practically devours energy:
- T5, Google’s pretrained language model, used 86 megawatt-hours and produced 47 metric tons of carbon dioxide emissions
- Meena, Google’s multi-turn, open-domain chatbot, used 232 megawatt-hours and produced 96 metric tons of carbon dioxide emissions
- GShard, a Google-developed language translation framework, used 24 megawatt-hours and produced 4.3 metric tons of carbon dioxide emissions
- Switch Transformer, a Google-developed routing algorithm, used 179 megawatt-hours and produced 59 metric tons of carbon dioxide emissions
And worst of all: GPT-3, OpenAI’s sophisticated natural language model, used 1,287 megawatt-hours (≈1.3 GWh) and produced 552 metric tons of carbon dioxide emissions. That is roughly equivalent to the emissions of driving a car 1.3 million miles.
Of course, I looked into how much our technology, which is modeled on the workings of the human brain, consumes: 25 kWh (OpenAI: roughly 1,300,000 kWh). That is about 0.002% of GPT-3’s consumption, in other words 99.998% less energy. (For comparison, the human brain runs on about 20 watts.)
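The percentages above follow directly from the two energy figures. A quick sanity check of the arithmetic (the 25 kWh value is the author's own number; the GPT-3 figure is from the study quoted above):

```python
# Energy figures: GPT-3 training (from the Berkeley/Google study) versus
# the 25 kWh figure claimed for the author's system.
GPT3_TRAINING_MWH = 1287       # 1,287 MWh = ~1.3 GWh
OUR_SYSTEM_MWH = 25 / 1000     # 25 kWh expressed in MWh

ratio = OUR_SYSTEM_MWH / GPT3_TRAINING_MWH
print(f"fraction of GPT-3's training energy: {ratio:.5%}")  # ~0.00194%
print(f"energy saved relative to GPT-3:      {1 - ratio:.3%}")  # 99.998%
```

Rounded to the precision used in the text, that is the "0.002% of the energy, i.e. 99.998% less" claim.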
So why does one AI technology consume so much less energy than the others?
In classical AI, neural networks are built bigger and bigger: every input is fed into the entire network and propagated through all of its layers. The result is exponential growth, and power consumption rises seemingly without limit. If this approach were applied to the human brain, we would have to carry around a head the size of the entire Earth!
We, on the other hand, work with many small neural networks that are modeled functionally on regions of the human brain (a humanoid structure and access pattern). Input data is broken down into small building blocks and routed selectively to only those individual networks that are actually needed. This means flat, linear growth with minimal energy requirements.
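The cost difference between the two approaches can be illustrated with a toy calculation. This is a minimal sketch under my own assumptions (one monolithic dense layer versus many small specialized networks with a router that activates exactly one of them per input), not the author's actual architecture:

```python
# Toy comparison: work done per input by one monolithic dense layer
# versus one small specialized network selected by a router.

def dense_cost(n_in: int, n_out: int) -> int:
    """Multiply-accumulate operations for one fully connected layer."""
    return n_in * n_out

# Monolithic network: every input dimension connects to every output.
big = dense_cost(10_000, 10_000)          # 100,000,000 MACs per input

# Ten small networks of 1,000 units each; the router sends each input
# to the single network responsible for it, so only one network runs.
small_single = dense_cost(1_000, 1_000)   # 1,000,000 MACs per input

print(big // small_single)  # the big network does 100x the work per input
```

The point of the sketch: when only the relevant small network fires, per-input cost scales with the size of that one network, not with the total size of the system, which is why adding more specialized networks yields roughly linear rather than exponential growth in energy use.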
Conclusion: Unfortunately, hardly anyone dares to invent better, more efficient algorithms of their own. Instead, people work non-stop on improving the existing ones. Why?
Why do we strive for energy efficiency everywhere else, yet cling to outdated, resource-hungry technology in such a new and disruptive field?