unix1986 🦀

Pinned

  1. ATen (Public)

    Forked from zdevito/ATen

    ATen: A TENsor library for C++11

    C++

  2. flash-attention (Public)

    Forked from Dao-AILab/flash-attention

    Fast and memory-efficient exact attention

    Python

  3. TensorRT-LLM (Public)

    Forked from NVIDIA/TensorRT-LLM

    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

    C++

  4. zhihu/ZhiLight (Public)

    A highly optimized LLM inference acceleration engine for Llama and its variants.

    C++, 900 stars, 103 forks

  5. zhihu/rucene (Public)

    Rust port of Lucene

    Rust, 1k stars, 63 forks

  6. cutlass (Public)

    Forked from NVIDIA/cutlass

    CUDA Templates for Linear Algebra Subroutines

    C++