liyanboSustech
  • Shenzhen Guangdong, China
  • 08:14 (UTC +08:00)

Popular repositories

  1. Diff-cache (Public)

    Forked from xdit-project/xDiT

    xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

    Python 1

  2. llama.cpp (Public)

    Forked from ggml-org/llama.cpp

    LLM inference in C/C++

    C++

  3. tensorrtx (Public)

    Forked from wang-xinyu/tensorrtx

    Implementation of popular deep learning networks with TensorRT network definition API

    C++

  4. InfiniGen (Public)

    Forked from snu-comparch/InfiniGen

    InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)

    Python

  5. H2O (Public)

    Forked from FMInference/H2O

    [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

    Python

  6. prompt-cache (Public)

    Forked from yale-sys/prompt-cache

    Modular and structured prompt caching for low-latency LLM inference

    Python