Releases · wejoncy/QLLM · GitHub

19 Dec 02:32

v0.1.4

What's Changed

support Phi, detect multiple blocks by @wejoncy in #43
quick fix by @wejoncy in #44
add Colab example && Turing support for AWQ && remove dependency on xbitops by @wejoncy in #46
quick fix for meta device by @wejoncy in #47
add trust code by @wejoncy in #48
fix trust_code by @wejoncy in #49
quick fix for Turing AWQ 75 by @wejoncy in #50
fix low_cpu_mem_usage by @wejoncy in #51
fix model dtype, default half by @wejoncy in #52

Full Changelog: v0.1.3...v0.1.4

Contributors

wejoncy

14 Dec 11:41

v0.1.3

What's Changed

Bump version to 0.1.3 by @wejoncy in #29
pipeline by @wejoncy in #33
minor fix win/special by @wejoncy in #34
Fix pack_mode issue and add Proxy for transformers. by @wejoncy in #36
work around AutoGPTQ/vLLM by @wejoncy in #37
update readme and fix some pack_mode conversion bugs by @wejoncy in #38
minor fix and rename quat_linear folders by @wejoncy in #39
[fix] Weight pack && tokenizer && more AWQ models by @wejoncy in #40
Readme by @wejoncy in #41
ready for pypi package by @wejoncy in #42

Full Changelog: https://github.com/wejoncy/QLLM/commits/v0.1.3

Contributors

wejoncy