Pull requests: HabanaAI/vllm-fork

Pull requests list

Fix guided decoding crashes
#811 opened Feb 10, 2025 by kzawora-intel
Rebase 2025-02-10
#810 opened Feb 10, 2025 by kzawora-intel
Refactor long-context + LoRA flow
#807 opened Feb 10, 2025 by SanjuCSudhakaran
support inc dynamic quantization
#803 opened Feb 8, 2025 by changwangss
Qwen2 vl
#802 opened Feb 7, 2025 by malkomes Draft
mszu/merged scheduler
#799 opened Feb 7, 2025 by szutenberg Draft
[WIP] Updating docs for the vLLM 1.20 release
#798 opened Feb 7, 2025 by PatrykWo
[WIP]Deepseek r1 reuse kcache
#797 opened Feb 7, 2025 by jikunshang
Pin triton to v3.1.0 for HPU
#796 opened Feb 7, 2025 by iboiko-habana
Pin triton to v3.1.0 for HPU
#795 opened Feb 7, 2025 by iboiko-habana
Support qwenvl model for HPU
#793 opened Feb 7, 2025 by yingjie-han
Enable roberta embedding
#786 opened Feb 5, 2025 by yeonsily
Improve RMSNorm to support 2D inputs
#784 opened Feb 5, 2025 by YangQun1
[SW-207299] Recalc scales from user
#774 opened Feb 3, 2025 by linoybu
Updated Troubleshooting section
#766 opened Jan 31, 2025 by MohitIntel
Fix warmup padding
#759 opened Jan 30, 2025 by mfylcek Draft
Initial enablement for text-embedding
#758 opened Jan 30, 2025 by libinta
Allow tests to run in t.compile
#724 opened Jan 22, 2025 by Kacper-Pietkun
Delayed sampling
#720 opened Jan 22, 2025 by mfylcek Draft