View yuxianzhi's full-sized avatar
Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs

Python 111 5 Updated Jan 11, 2024

📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

3,675 258 Updated Mar 4, 2025

C++ Library Manager for Windows, Linux, and MacOS

CMake 24,274 6,731 Updated Mar 18, 2025

An LLVM/Clang/LLD based mingw-w64 toolchain

C 2,163 204 Updated Mar 14, 2025

👀 MinGW 32bit and 64bit version of OpenCV compiled on Windows. Including OpenCV 3.3.1, 3.4.1, 3.4.1-x64, 3.4.5, 3.4.6, 3.4.7, 3.4.8-x64, 3.4.9, 4.0.0-alpha-x64, 4.0.0-rc-x64, 4.0.1-x64, 4.1.0, 4.1.…

990 219 Updated Mar 5, 2022

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 3,050 673 Updated Mar 19, 2025

A sparse BLAS lib supporting multiple backends

C 41 7 Updated Feb 19, 2025
C 1 Updated Dec 14, 2022

Tensor✖️ is a minimalistic robust library to build deep neural network models

Python 9 7 Updated Mar 4, 2021

Code for CVPR 2022 paper "Bailando: 3D dance generation via Actor-Critic GPT with Choreographic Memory"

Python 405 64 Updated Dec 7, 2023

OpenMMLab 3D Human Parametric Model Toolbox and Benchmark

Python 1,297 141 Updated Nov 12, 2024

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 31,680 2,951 Updated Mar 19, 2025

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,281 150 Updated Feb 24, 2025

Transformer related optimization, including BERT, GPT

C++ 6,085 901 Updated Mar 27, 2024

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

C++ 850 164 Updated Dec 30, 2024

best CPU/GPU sparse solver for large sparse matrices

C 20 4 Updated Oct 5, 2021

Deploy your model with TensorRT quickly.

C++ 765 98 Updated Nov 21, 2023

COIN-OR Linear Programming Solver

C++ 432 87 Updated Feb 24, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,348 2,166 Updated Mar 11, 2025

CUDA templates for tile-sparse matrix multiplication based on CUTLASS.

C++ 50 4 Updated Mar 1, 2018

TensorRT Net Wrapper

C++ 95 34 Updated Sep 13, 2019

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Python 2,249 397 Updated Mar 18, 2025

CUDA Library Samples

Cuda 1,826 368 Updated Mar 18, 2025

sample and log files for ascend 910 ops

C++ 5 5 Updated Apr 11, 2023

Polyu Pre LaTeX Template

TeX 22 109 Updated Jun 20, 2024

A Neural Network For Automatic Image Colorization

Python 44 11 Updated Jun 23, 2020

10x faster matrix and vector operations

C++ 2,481 172 Updated Oct 12, 2022

Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.

C++ 577 80 Updated Sep 11, 2024

Python bindings for UCX

Python 126 63 Updated Mar 14, 2025

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8…

Python 3,745 591 Updated Mar 13, 2025