In this special edition of the Numenta Newsletter, I’m pleased to share details on our recent announcement with Intel, in which we demonstrated groundbreaking results on the new 4th Gen Intel Xeon Scalable processors.
Numenta Achieves 123x Inference Performance Improvement for BERT Transformers on 4th Gen Intel Xeon Scalable Processors
Numenta’s 2023 started with a major announcement as part of Intel’s 4th Generation Xeon launch (codenamed Sapphire Rapids). As we shared in our January 10 press release, in collaboration with Intel, we achieved groundbreaking performance gains for BERT-Large Transformers on two new Intel Xeon Scalable Processors. These performance results enable transformative possibilities for many NLP and real-time AI applications.
Delivering a two-orders-of-magnitude throughput speedup for BERT-Large Transformers
Leveraging Intel’s new Advanced Matrix Extensions (Intel AMX), we showed a 123x throughput improvement over current-generation AMD Milan CPU implementations for BERT inference on short text sequences. Numenta’s neuroscience-based technology turns out to be a perfect fit for AMX instructions, which are designed for AI workloads. In fact, our implementation is 19x faster than Intel’s own AMX BERT-Large implementation.
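For context, here is a minimal sketch of how one might measure this kind of workload: BERT-Large inference throughput on short sequences, run on CPU in bfloat16 so that an AMX-capable Xeon can dispatch the matrix math to its matrix engines (given a recent PyTorch/oneDNN build). This uses Hugging Face transformers and PyTorch purely as illustrative stand-ins; it is not Numenta’s optimized implementation, and the model name, sequence length, and iteration counts are assumptions chosen for illustration.

```python
# Illustrative throughput measurement for BERT-Large on short sequences (CPU).
# Not Numenta's implementation; all parameters below are example values.
import time
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-large-uncased"  # assumed model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).eval()

# A short text sequence (length 64, batch size 1), in the spirit of the
# "short text sequences" measured above.
inputs = tokenizer(
    "An example short input sequence.",
    padding="max_length", max_length=64, truncation=True, return_tensors="pt",
)

# bfloat16 autocast on CPU allows AMX-capable 4th Gen Xeon processors to use
# their matrix extensions for the underlying matrix multiplications.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    for _ in range(10):            # warm-up iterations
        model(**inputs)
    n_iters = 100
    start = time.perf_counter()
    for _ in range(n_iters):
        model(**inputs)
    elapsed = time.perf_counter() - start

print(f"Throughput: {n_iters / elapsed:.1f} inferences/sec")
```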
Smashing latency barriers
Accelerating high-volume document processing
“Numenta and Intel are collaborating to deliver substantial performance gains to Numenta’s AI solutions through the Intel Xeon CPU Max Series and 4th Gen Intel Xeon Scalable processors. We’re excited to work together to unlock significant throughput performance accelerations for previously bandwidth-bound or latency-bound AI applications such as Conversational AI and large document processing.”
— Scott Clark, Vice President and General Manager of AI and HPC Application Level Engineering, Intel (from Numenta’s press release, January 10, 2023)
We’re excited to demonstrate these initial examples of how we’re applying our brain-based AI technology to deep learning networks, and we look forward to working with Intel to uncover even more opportunities.
Interested in getting results like these?
Apply to our Private Beta Program >>
Learn more
There are several additional resources on Numenta’s and Intel’s websites if you’d like to learn more about our results:
- Press Release: Numenta Achieves 123x Inference Performance Improvement for BERT Transformers on Intel Xeon Processor Family
- Blog: A New Performance Standard for BERT Transformers with Numenta + Intel
- Case Study: Numenta + Intel achieve 123x inference performance improvement for BERT Transformers
- Intel Developer page: Intel AI Platform Overview
- Xeon Series Product Brief: Intel® Xeon® CPU Max Series Product Brief
Thank you for your continued interest in Numenta. Follow us on LinkedIn to make sure you don’t miss any updates.
Christy Maver
VP of Marketing