Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
pruning model-compression inference-optimization alternating-optimization large-language-models efficient-ai
Updated Dec 20, 2024 - Python