
[WIP] EoRA #1206

Draft · Qubitium wants to merge 109 commits into base: main
Conversation

@Qubitium (Collaborator) commented Feb 3, 2025

  • add an adapter property to QuantizeConfig, plus the EoRA LoRA adapter/config

Sample quantization config format with EoRA extension:

quant_config = QuantizeConfig(
    bits=4,
    sym=False,
    group_size=128,
    adapter=EoRA(rank=128),
)
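As a rough sketch of how such a nested adapter config might serialize, the stand-in dataclasses below mirror the field names shown in the sample above; they are simplified assumptions for illustration, not the PR's actual `QuantizeConfig`/`EoRA` implementation.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Simplified stand-ins; only the fields shown in the sample config above
# are taken from the PR, everything else is assumed for this sketch.
@dataclass
class EoRA:
    rank: int = 128

@dataclass
class QuantizeConfig:
    bits: int = 4
    sym: bool = False
    group_size: int = 128
    adapter: Optional[EoRA] = None

    def to_dict(self) -> dict:
        # asdict() recurses into nested dataclasses, so the adapter
        # serializes as a nested dict alongside the quant settings.
        return asdict(self)

cfg = QuantizeConfig(bits=4, sym=False, group_size=128, adapter=EoRA(rank=128))
print(cfg.to_dict())
# {'bits': 4, 'sym': False, 'group_size': 128, 'adapter': {'rank': 128}}
```

Keeping the adapter as a nested object (rather than flat keys) lets the quantization config round-trip cleanly to JSON while leaving room for other adapter types later.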

@Qubitium Qubitium marked this pull request as draft February 3, 2025 14:41
@nbasyl (Collaborator) commented Feb 4, 2025

Hi @Qubitium, @hutm is the person who developed the ExLlama EoRA kernel and will assist with the inference and validation tasks once I complete the first milestone. Could you grant him write access to this branch? Thanks!

@Qubitium (Collaborator, Author) commented Feb 4, 2025

> Hi @Qubitium, @hutm is the person who developed the ExLlama EoRA kernel and will assist with the inference and validation tasks once I complete the first milestone. Could you grant him write access to this branch? Thanks!

Done. Write access invite sent to @hutm.

@nbasyl (Collaborator) commented Feb 4, 2025

Hi @Qubitium, I am hitting an error at line 186 of config.py when trying to import QuantizeConfig. The error is as follows:
(screenshot of the traceback)
Do you know how to resolve this?

I am using Python 3.10 and have installed all the required libraries following the README.

@Qubitium (Collaborator, Author) commented Feb 4, 2025

> Hi @Qubitium, I am hitting an error at line 186 of config.py when trying to import QuantizeConfig. The error is as follows: (screenshot of the traceback) Do you know how to resolve this?
>
> I am using Python 3.10 and have installed all the required libraries following the README.

@nbasyl Let me get Slack installed so we can converse there. I should be able to fix this. It is a strange error, almost as if our type-hint code is wrong. Not sure if we have a Python 3.10 compat bug.
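The actual traceback is only visible in the screenshot, so the cause here is unknown; but one common class of version-dependent type-hint errors is annotations being evaluated eagerly at class-definition time (e.g. a forward reference to a class defined later in the file). Deferring evaluation with `from __future__ import annotations` (PEP 563) is the usual workaround. A minimal sketch, with class names borrowed from this PR but bodies entirely hypothetical:

```python
from __future__ import annotations  # defer annotation evaluation (PEP 563)
from dataclasses import dataclass
from typing import Optional

@dataclass
class QuantizeConfig:
    # EoRA is defined below this class; with deferred evaluation the
    # annotation is kept as a string and never resolved at import time,
    # so it cannot raise NameError when the module is loaded.
    adapter: Optional[EoRA] = None

@dataclass
class EoRA:
    rank: int = 128

cfg = QuantizeConfig(adapter=EoRA(rank=64))
print(cfg.adapter.rank)
# 64
```

Without the `__future__` import, the `Optional[EoRA]` annotation would need `EoRA` to already exist (or be quoted as a string) when `QuantizeConfig` is defined.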

ZX-ModelCloud and others added 30 commits February 12, 2025 14:59
# Conflicts:
#	gptqmodel/models/auto.py
#	gptqmodel/models/base.py
#	gptqmodel/nn_modules/qlinear/__init__.py
#	gptqmodel/nn_modules/qlinear/bitblas.py
#	gptqmodel/nn_modules/qlinear/dynamic_cuda.py
#	gptqmodel/nn_modules/qlinear/exllama.py
#	gptqmodel/nn_modules/qlinear/exllamav2.py
#	gptqmodel/nn_modules/qlinear/ipex.py
#	gptqmodel/nn_modules/qlinear/marlin.py
#	gptqmodel/nn_modules/qlinear/torch.py
#	gptqmodel/quantization/gptq.py
#	gptqmodel/utils/model.py
#	tests/test_dynamic.py
#	tests/test_eval.py
#	tests/test_perplexity.py
Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
5 participants