
    Repositories list

    • A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
      Python
      Apache License 2.0
      Updated Mar 3, 2025
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      Updated Mar 3, 2025
    • Mooncake

      Public
      Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
      C++
      Apache License 2.0
      Updated Feb 23, 2025
    • FlashInfer: Kernel Library for LLM Serving
      Cuda
      Apache License 2.0
      Updated Feb 11, 2025
    • A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
      Python
      Apache License 2.0
      Updated Feb 8, 2025