Neural Network (NNet) Inference System
A Neural Network (NNet) Inference System is a model inference system that implements a neural network inference algorithm to solve a NNet inference task (making predictions or decisions from new, unseen data using a trained neural network).
- Context:
- It can (typically) involve processing input data through the successive layers of the neural network, extracting features and applying transformations at each layer to arrive at a final prediction or decision (see the forward-pass sketch after this list).
- It can (typically) be executed on various hardware platforms, including high-performance GPUs and TPUs, to accommodate the computational demands of deep learning models.
- It can (often) utilize optimization techniques, such as a quantization technique, a pruning technique, or a model compression technique, to increase inference speed and reduce the memory footprint, making it suitable for deployment in resource-constrained environments (see the quantization sketch after this list).
- It can range from being a Shallow Neural Net Inference System to being a Deep Neural Net Inference System.
- ...
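The following is a minimal sketch of the layer-by-layer forward pass that an NNet inference system performs on a new input. It assumes a small fully connected network with a ReLU hidden layer and a softmax output; all layer sizes, weights, and inputs are hypothetical illustrations, not the configuration of any specific system.

```python
# Minimal forward-pass inference sketch for a small fully connected network.
# All layer sizes, weights, and inputs are hypothetical.
import numpy as np

def relu(x):
    """Element-wise ReLU activation."""
    return np.maximum(0.0, x)

def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def forward(x, layers):
    """Propagate input x through a list of (weights, bias, activation) layers."""
    h = x
    for W, b, act in layers:
        h = act(h @ W + b)   # linear transform followed by a non-linearity
    return h

# Hypothetical 2-layer network: 4 input features -> 8 hidden units -> 3 classes.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 8)), np.zeros(8), relu),
    (rng.normal(size=(8, 3)), np.zeros(3), softmax),
]

x_new = rng.normal(size=(1, 4))    # one new, unseen input example
probs = forward(x_new, layers)     # class probabilities for the prediction
print(probs, probs.argmax(axis=-1))
```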
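As one example of the optimization techniques mentioned above, the sketch below shows a simplified post-training weight quantization step (symmetric per-tensor int8). The scale computation and layer shape are simplifying assumptions used only to illustrate the memory-footprint reduction, not a production quantization scheme.

```python
# Minimal post-training weight quantization sketch (symmetric per-tensor int8).
# The scale computation and layer shape are simplified, hypothetical choices.
import numpy as np

def quantize_int8(W):
    """Map float32 weights to int8 values plus a per-tensor scale factor."""
    max_abs = np.max(np.abs(W))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return W_q, scale

def dequantize(W_q, scale):
    """Recover an approximate float32 weight tensor for computation."""
    return W_q.astype(np.float32) * scale

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 3)).astype(np.float32)   # hypothetical layer weights
W_q, scale = quantize_int8(W)

print("memory (float32 -> int8 bytes):", W.nbytes, "->", W_q.nbytes)
print("max reconstruction error:", np.max(np.abs(W - dequantize(W_q, scale))))
```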
- Example(s):
- Counter-Example(s):
- DNN Training Systems, which are designed for the learning phase of deep neural networks, adjusting weights based on a loss function and training data.
- Traditional machine learning systems, such as those designed for Decision Tree Inference or Logistic Regression Inference, which do not operate on deep neural network models.
- See: Model Optimization, Hardware Acceleration, Real-Time Inference, AI Application Domains, Groq.