Our VP of Engineering, Andy Xu, will speak at the Machine Learning Conference (MLconf) in New York City on March 28, 2024.
MLconf events host speakers from industry, research labs, and universities. MLconf aims to create an atmosphere for discussing recent machine learning research and how those methodologies and practices are applied in industry today. Learn more here.
Abstract:
With all the hype surrounding ChatGPT, most people are giddy with the promise of artificial intelligence (AI), yet they overlook its pitfalls, especially its energy consumption. A University of Massachusetts Amherst paper states that “training a single AI model can emit as much carbon as five cars in their lifetimes.” That analysis, moreover, covered only a single training run. Meanwhile, the computational resources needed to train these AI systems have been doubling every 3.4 months since 2012. Contrast this with the human brain, which runs on about 20 watts of power in an average adult, less than half the draw of an incandescent light bulb. What if LLMs could operate just as efficiently?
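To make that doubling rate concrete, here is a quick back-of-the-envelope calculation (our rough illustration, not a figure from the talk) of how training compute compounds when it doubles every 3.4 months:

```python
# Rough illustration: compute demand doubling every 3.4 months.
DOUBLING_PERIOD_MONTHS = 3.4

def growth_factor(months: float) -> float:
    """Total growth after `months`, given one doubling every 3.4 months."""
    return 2 ** (months / DOUBLING_PERIOD_MONTHS)

print(f"1 year:  ~{growth_factor(12):,.0f}x")   # ~12x
print(f"2 years: ~{growth_factor(24):,.0f}x")   # ~133x
print(f"6 years: ~{growth_factor(72):,.0f}x")   # ~2.4 million x
```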
Unfortunately, extensive research into optimizing LLMs has yielded only modest performance and efficiency improvements on general-purpose CPUs and GPUs. Now, however, companies are applying recent neuroscience discoveries to achieve real computational efficiency, and these ideas are moving beyond research into production. For example, neurons in the brain are sophisticated “sparse” computers that perform contextual routing: for any given input, only a small fraction of them activate.
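To give a concrete sense of what “sparse” means here, the short sketch below implements k-winners-take-all, a simple activation-sparsity scheme in the spirit of our published work on sparse networks. The layer size and sparsity level are illustrative, not production settings:

```python
import numpy as np

def k_winners(activations: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest activations and zero out the rest
    (a simple winner-take-all form of activation sparsity)."""
    out = np.zeros_like(activations)
    top_k = np.argpartition(activations, -k)[-k:]  # indices of k largest
    out[top_k] = activations[top_k]
    return out

rng = np.random.default_rng(0)
dense = rng.standard_normal(1024)   # dense hidden-layer output
sparse = k_winners(dense, k=51)     # ~5% of units stay active
print(f"active units: {np.count_nonzero(sparse)} / {sparse.size}")
```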
Numenta has mapped this neuroscience-based concept onto the Advanced Matrix Extensions (AMX) instruction set available on Intel’s 4th and 5th generation Xeon CPUs. With minimal changes to the Transformer architecture, we observe over two orders of magnitude improvement in inference throughput on AMX-enabled CPUs compared to prior-generation CPUs. Putting neuroscience breakthroughs into production makes AI far more efficient and sustainable.
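The production kernels themselves are beyond the scope of this post, but the sketch below illustrates the general principle at work: when a weight matrix is pruned in tile-sized blocks, the surviving blocks can be processed with exactly the dense multiplies that matrix engines such as AMX accelerate, while the zeroed blocks are skipped entirely. The matrix sizes and the 80% block sparsity are illustrative assumptions, not our actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
BLOCK = 32                      # tile-sized block (illustrative)
ROWS, COLS = 512, 512           # weight matrix dimensions (illustrative)

# Build a block-sparse weight matrix: ~80% of 32x32 blocks pruned to zero.
W = rng.standard_normal((ROWS, COLS))
keep = rng.random((ROWS // BLOCK, COLS // BLOCK)) < 0.2
for i in range(ROWS // BLOCK):
    for j in range(COLS // BLOCK):
        if not keep[i, j]:
            W[i*BLOCK:(i+1)*BLOCK, j*BLOCK:(j+1)*BLOCK] = 0.0

x = rng.standard_normal(COLS)

# Sparse-aware product: run dense multiplies on surviving blocks only.
y = np.zeros(ROWS)
for i, j in zip(*np.nonzero(keep)):
    y[i*BLOCK:(i+1)*BLOCK] += (
        W[i*BLOCK:(i+1)*BLOCK, j*BLOCK:(j+1)*BLOCK] @ x[j*BLOCK:(j+1)*BLOCK]
    )

assert np.allclose(y, W @ x)    # same result, ~20% of the block work
print(f"blocks computed: {keep.sum()} / {keep.size}")
```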
However, this is just the beginning of the practical breakthroughs neuroscience can bring to AI. Unlike today’s AI systems, the brain understands the structure of its environment, makes complex predictions, and carries out intelligent actions. And unlike AI models, humans learn continuously and incrementally. A deployed model, by contrast, does not learn at all: if it makes a mistake today, it will keep repeating that mistake until it is retrained on fresh data.

That may not always be the case. Soon, we may have AI that learns continually, based on the brain-derived concept of active dendrites (neurons that produce nonlinear responses to synaptic input, letting context modulate which neurons participate in a computation). Applying this concept in production would make Transformers capable of more complex predictions and of continual learning without constant retraining. We need AI systems that genuinely enhance human capabilities, learning alongside us and helping us in all aspects of our lives. We can get there now that neuroscience principles have gone from the pages of scientific journals to real-world LLMs in production today.
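For readers curious how active dendrites translate into an artificial network, here is a minimal sketch in the spirit of our published active-dendrites research: each unit carries several dendritic segments that compare a context vector against learned weights, and the best-matching segment gates that unit’s feedforward output. All dimensions and weights below are illustrative, not the production model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 16 inputs, 32 units, 8-dim context, 4 segments per unit.
N_IN, N_UNITS, N_CTX, N_SEG = 16, 32, 8, 4

W = rng.standard_normal((N_UNITS, N_IN)) * 0.1           # feedforward weights
D = rng.standard_normal((N_UNITS, N_SEG, N_CTX)) * 0.1   # dendritic segments

def active_dendrites_layer(x: np.ndarray, context: np.ndarray) -> np.ndarray:
    """Feedforward output gated by each unit's best-matching dendritic segment."""
    feedforward = W @ x                 # (N_UNITS,)
    segment_acts = D @ context          # (N_UNITS, N_SEG) segment responses
    best = segment_acts.max(axis=1)     # strongest segment per unit
    return feedforward * sigmoid(best)  # context modulates each unit

x = rng.standard_normal(N_IN)        # one input signal
ctx_a = rng.standard_normal(N_CTX)   # context for "task A"
ctx_b = rng.standard_normal(N_CTX)   # context for "task B"

# The same input yields different activation patterns under different contexts,
# the routing behavior that supports continual, multi-task learning.
print(active_dendrites_layer(x, ctx_a)[:4])
print(active_dendrites_layer(x, ctx_b)[:4])
```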