10x price-performance improvements on AI inference with Salad’s distributed cloud platform

Salad is a global cloud platform that harnesses latent compute resources from idle, high-end consumer hardware to power and distribute computing applications more affordably than traditional data centers. Salad and Numenta partnered to increase price performance on AI inference.

CHALLENGE

Scaling up deep learning models without breaking the bank

Deploying deep learning systems today can be costly and complex. Models are becoming larger, as we see with large language models (LLMs) moving from millions to billions of parameters. Additionally, the need for highly available processing resources pushes many teams to deploy their networks to the public cloud, which brings restrictive technical requirements, expensive model development resources, and rising cloud spend. New methods are needed to optimize and scale these models on specialized hardware.

SOLUTION

Deploying Numenta’s AI Inference Server on Salad Container Engine

Using hardware-aware optimizations and neuroscience-based acceleration techniques, we created an optimized BERT-Base model and deployed it on the Salad Container Engine (SCE), a fully managed orchestration platform built to facilitate container deployments on Salad’s distributed cloud. To assess the price-performance benefits of Numenta technology on SCE, we benchmarked our optimized BERT-Base model against a standard BERT-Base on four different Amazon Web Services (AWS) configurations and SCE.

RESULTS

10x more inferences per dollar

Our optimized BERT-Base model delivered more inference throughput than a standard BERT-Base model on each AWS instance. When deployed on Salad’s infrastructure, it achieved a 10x price-performance improvement over a standard BERT-Base model running on AWS.

BENEFITS

Cost savings and performance speed-ups that enable AI deployment at scale

The combination of Numenta and SCE gives users who deploy deep learning models the best of both worlds: performance improvements from Numenta’s optimized models and more affordable on-demand cloud pricing from Salad.

  • Do more with your existing budget by running 10x more inferences per dollar
  • Run existing workloads at 10% of your current cost
  • Enable users who previously couldn’t afford to run deep learning models to run them
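
The relationship between "10x more inferences per dollar" and "10% of your current cost" is simple arithmetic, and a minimal sketch may make it concrete. The helper names and every throughput and price figure below are hypothetical placeholders chosen for illustration; they are not the benchmark results from this case study.

```python
def inferences_per_dollar(throughput_per_sec: float, price_per_hour: float) -> float:
    """Price performance: inferences obtained per dollar of instance time."""
    return throughput_per_sec * 3600 / price_per_hour


def workload_cost(num_inferences: float, throughput_per_sec: float,
                  price_per_hour: float) -> float:
    """Dollar cost of running a fixed number of inferences."""
    return num_inferences / inferences_per_dollar(throughput_per_sec, price_per_hour)


# Hypothetical placeholder figures, NOT measured results from the case study.
baseline = {"throughput_per_sec": 100.0, "price_per_hour": 1.00}   # standard model, assumed
optimized = {"throughput_per_sec": 250.0, "price_per_hour": 0.25}  # optimized model, assumed

# Ratio of inferences per dollar between the two setups.
gain = inferences_per_dollar(**optimized) / inferences_per_dollar(**baseline)

# Cost of the same fixed workload on each setup.
workload = 1_000_000
cost_ratio = workload_cost(workload, **optimized) / workload_cost(workload, **baseline)

print(f"price-performance gain: {gain:.1f}x")
print(f"cost of the same workload: {cost_ratio:.0%} of baseline")
```

With these assumed inputs, a 10x gain in inferences per dollar corresponds to running the same workload at one tenth of the baseline cost, because cost per inference is the reciprocal of inferences per dollar.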

Interested in working with us?
