LMCache/LMCache — 8,660 Stars
Supercharge Your LLM with the Fastest KV Cache Layer
About This Repo
LMCache/LMCache — 8,660 ⭐
Supercharge Your LLM with the Fastest KV Cache Layer
Narration
Your LLM is wasting compute on every request. LMCache is the fastest KV cache layer for LLM inference: it stores key-value caches across GPU memory, SSDs, and remote backends to slash time-to-first-token and boost throughput. It is engine-independent, vendor-neutral, and already integrated with vLLM and SGLang. More than eight thousand six hundred stars on GitHub. Check out LMCache today.
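To make the idea in the narration concrete, here is a minimal sketch of a tiered key-value cache lookup. It is not LMCache's API: `TieredKVStore`, `prefix_key`, the tier names, and the use of plain dictionaries and byte strings in place of real GPU, SSD, and remote storage are all assumptions made for illustration. It only shows the general pattern of keying a cached KV blob by its token prefix and searching the fastest tier first.

```python
import hashlib
from typing import Optional


def prefix_key(token_ids: list[int]) -> str:
    """Hash a token prefix so identical prefixes map to the same cache entry."""
    raw = ",".join(map(str, token_ids)).encode()
    return hashlib.sha256(raw).hexdigest()


class TieredKVStore:
    """Hypothetical stand-in: each tier is a dict, ordered fastest to slowest."""

    def __init__(self, tier_names: tuple[str, ...] = ("gpu", "ssd", "remote")):
        self.order = list(tier_names)
        self.tiers: dict[str, dict[str, bytes]] = {name: {} for name in tier_names}

    def put(self, token_ids: list[int], kv_blob: bytes, tier: str = "gpu") -> None:
        """Store a KV blob for this token prefix in the named tier."""
        self.tiers[tier][prefix_key(token_ids)] = kv_blob

    def get(self, token_ids: list[int]) -> Optional[tuple[str, bytes]]:
        """Return (tier_hit, blob), or None on a miss (KV must be recomputed)."""
        key = prefix_key(token_ids)
        for name in self.order:
            blob = self.tiers[name].get(key)
            if blob is not None:
                # Promote the entry to the fastest tier for the next request.
                self.tiers[self.order[0]][key] = blob
                return name, blob
        return None


store = TieredKVStore()
store.put([101, 7, 42], b"kv-bytes", tier="ssd")
first = store.get([101, 7, 42])   # found in the "ssd" tier, then promoted
second = store.get([101, 7, 42])  # now served from the "gpu" tier
miss = store.get([999])           # never stored, so the caller would recompute
```

In this toy version a hit in a slower tier is copied into the fastest one, so a repeated prefix gets cheaper to serve over time. A real system would also need eviction, partial-prefix matching, and actual tensor storage, none of which is modeled here.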