News

Speculative decoding has emerged as a potential way to speed up inference with large language models (LLMs).