The LLM Surgeon.
“The LLM Surgeon,” accepted at ICLR 2024, reports state-of-the-art LLM pruning results in all three settings: unstructured, semi-structured, and structured pruning. Structured pruning, which removes entire matrix rows or columns, is the most challenging of the three but also the most effective.
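To make the three pruning settings concrete, here is a minimal sketch of the sparsity pattern each one produces on a single weight matrix. It assumes a simple magnitude-based criterion as a stand-in for illustration only; this is not the method used in the paper, and the function names are hypothetical.

```python
import numpy as np


def unstructured_mask(w, sparsity):
    """Drop the smallest-magnitude individual weights anywhere in the matrix."""
    k = int(round(sparsity * w.size))  # number of weights to remove
    mask = np.ones(w.size, dtype=bool)
    if k > 0:
        drop = np.argsort(np.abs(w), axis=None, kind="stable")[:k]
        mask[drop] = False
    return mask.reshape(w.shape)


def semi_structured_mask(w, n=2, m=4):
    """N:M pattern: in each group of m consecutive weights in a row, keep the n largest."""
    rows, cols = w.shape
    assert cols % m == 0, "column count must be divisible by m"
    groups = np.abs(w).reshape(rows, cols // m, m)
    order = np.argsort(groups, axis=-1, kind="stable")  # ascending magnitude
    mask = np.ones_like(groups, dtype=bool)
    # Mark the (m - n) smallest entries of every group as pruned.
    np.put_along_axis(mask, order[..., : m - n], False, axis=-1)
    return mask.reshape(rows, cols)


def structured_mask(w, rows_to_remove):
    """Drop entire rows, here chosen by smallest L2 norm (illustrative criterion)."""
    norms = np.linalg.norm(w, axis=1)
    drop = np.argsort(norms, kind="stable")[:rows_to_remove]
    mask = np.ones_like(w, dtype=bool)
    mask[drop, :] = False
    return mask


if __name__ == "__main__":
    w = np.random.default_rng(0).standard_normal((4, 8))
    for name, mask in [
        ("unstructured", unstructured_mask(w, 0.5)),
        ("semi-structured 2:4", semi_structured_mask(w, 2, 4)),
        ("structured (1 row)", structured_mask(w, 1)),
    ]:
        print(name)
        print(mask.astype(int))
```

Only the structured mask lets the matrix be physically shrunk, since whole rows disappear; the other two leave the matrix shape unchanged and rely on sparse kernels or hardware support to gain speed.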
Paper page: https://huggingface.co/papers/2312.17244
Code: https://github.com/Qualcomm-AI-research/llm-surgeon