AWS SageMaker Neo


An AWS SageMaker Neo is an ML inference optimization framework that compiles trained models for specific hardware targets and integrates with Amazon SageMaker.



References

2019

  • https://aws.amazon.com/sagemaker/neo/
    • QUOTE: Amazon SageMaker Neo enables developers to train machine learning models once and run them anywhere in the cloud and at the edge. Amazon SageMaker Neo optimizes models to run up to twice as fast, with less than a tenth of the memory footprint, with no loss in accuracy.

      Developers spend a lot of time and effort to deliver accurate machine learning models that can make fast, low-latency predictions in real-time. This is particularly important for edge devices where memory and processing power tend to be highly constrained, but latency is very important. For example, sensors in autonomous vehicles typically need to process data in a thousandth of a second to be useful, so a round trip to the cloud and back isn’t possible. Also, there is a wide array of different hardware platforms and processor architectures for edge devices. To achieve high performance, developers need to spend weeks or months hand-tuning their model for each one. Also, the complex tuning process means that models are rarely updated after they are deployed to the edge. Developers miss out on the opportunity to retrain and improve models based on the data the edge devices collect.

      Amazon SageMaker Neo automatically optimizes machine learning models to perform at up to twice the speed with no loss in accuracy. You start with a machine learning model built using MXNet, TensorFlow, PyTorch, or XGBoost and trained using Amazon SageMaker. Then you choose your target hardware platform from Intel, Nvidia, or Arm. With a single click, SageMaker Neo will then compile the trained model into an executable. The compiler uses a neural network to discover and apply all of the specific performance optimizations that will make your model run most efficiently on the target hardware platform. The model can then be deployed to start making predictions in the cloud or at the edge. Local compute and ML inference capabilities can be brought to the edge with AWS Greengrass. To help make edge deployments easy, AWS Greengrass supports Neo-optimized models so that you can deploy your models directly to the edge with over the air updates.

      Neo is also available as open source code as the Neo-AI project under the Apache Software License, enabling developers to customize the software for different devices and applications.
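The compilation workflow described in the quote above can be driven programmatically. Below is a minimal sketch using boto3's create_compilation_job API; the job name, bucket paths, role ARN, input shape, and target device are placeholder assumptions for illustration, not values from the quoted page.

    import boto3

    sm = boto3.client("sagemaker")

    # Compile a trained MXNet model (packaged as model.tar.gz in S3) for a
    # chosen target device; Neo writes the optimized artifact to the output location.
    sm.create_compilation_job(
        CompilationJobName="neo-example-job",                    # hypothetical job name
        RoleArn="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
        InputConfig={
            "S3Uri": "s3://example-bucket/model/model.tar.gz",   # placeholder model path
            "DataInputConfig": '{"data": [1, 3, 224, 224]}',     # assumed input name/shape
            "Framework": "MXNET",
        },
        OutputConfig={
            "S3OutputLocation": "s3://example-bucket/compiled/",
            "TargetDevice": "jetson_nano",                       # example edge target
        },
        StoppingCondition={"MaxRuntimeInSeconds": 900},
    )

The same step is exposed in the SageMaker Python SDK as compile_model on a trained estimator; either way, the compiled artifact can then be deployed to a SageMaker endpoint or copied to an edge device.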
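On the edge-device side, the open-source Neo-AI project provides the DLR runtime for loading and running Neo-compiled artifacts. The following is a minimal inference sketch assuming DLR's Python bindings (dlr.DLRModel); the local model directory, input name, and input shape are illustrative assumptions.

    import numpy as np
    import dlr  # DLR runtime from the open-source Neo-AI project

    # Load a Neo-compiled model from a local directory (placeholder path)
    # and run a single forward pass on CPU.
    model = dlr.DLRModel("/opt/ml/compiled-model", dev_type="cpu")
    x = np.random.rand(1, 3, 224, 224).astype("float32")  # assumed input shape
    outputs = model.run({"data": x})  # "data" must match the compiled input name
    print(outputs[0].shape)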