LLM Inference Time per Token Measure

From GM-RKB

An LLM Inference Time per Token Measure is an LLM performance metric that evaluates the average time a large language model (LLM) takes to generate or process each token during inference, typically computed as the total inference time divided by the number of tokens produced.
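The measure can be sketched as a simple timing wrapper; the `generate_fn` below is a hypothetical stand-in for any LLM call that returns a sequence of tokens, not a specific library API.

```python
import time

def time_per_token(total_seconds, num_tokens):
    # Average inference time per token (seconds/token):
    # total inference time divided by the number of tokens.
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return total_seconds / num_tokens

def measure_time_per_token(generate_fn, prompt):
    # Time a single inference call end to end, then normalize
    # by the number of tokens the call produced.
    start = time.perf_counter()
    tokens = generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return time_per_token(elapsed, len(tokens))
```

For example, a model that spends 2.0 seconds generating 100 tokens scores 0.02 seconds per token (equivalently, 50 tokens/second).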
