Releases: wejoncy/QLLM
v0.1.4
What's Changed
- support Phi, detect multi blocks by @wejoncy in #43
- quick fix by @wejoncy in #44
- add colab example && turing support for awq && remove dependency of xbitops by @wejoncy in #46
- quick fix for meta device by @wejoncy in #47
- add trust code by @wejoncy in #48
- fix trust_code by @wejoncy in #49
- quick fix for turing awq 75 by @wejoncy in #50
- fix low_cpu_mem_usage by @wejoncy in #51
- fix model dtype, default half by @wejoncy in #52
Full Changelog: v0.1.3...v0.1.4
v0.1.3
What's Changed
- Bump version to 0.1.3 by @wejoncy in #29
- pipeline by @wejoncy in #33
- minor fix win/special by @wejoncy in #34
- Fix pack_mode issue and add Proxy for transformers. by @wejoncy in #36
- work around AutoGPTQ/vLLM by @wejoncy in #37
- update readme and fix some pack_mode conversion bugs by @wejoncy in #38
- minor fix and rename quat_linear folders by @wejoncy in #39
- [fix] Weight pack && tokenizer && more awq models by @wejoncy in #40
- Readme by @wejoncy in #41
- ready for pypi package by @wejoncy in #42
Full Changelog: https://github.com/wejoncy/QLLM/commits/v0.1.3