Supporting the Ollama /api/generate endpoint #543
Replies: 8 comments
-
I've been able to make qwen work with the following configuration:
The API request was not working when I used `qwen2.5-coder:3b` instead of `mistral:latest`. Obviously, I would prefer a fix of the underlying problem.
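For anyone following along, a minimal gptel + Ollama registration looks roughly like the sketch below. The host, stream setting, and model names are assumptions (localhost:11434 is Ollama's default port, and the models are taken from the `ollama list` output later in the thread); this is an illustrative sketch, not necessarily the exact configuration referred to above.

```elisp
;; Illustrative sketch of a gptel + Ollama setup (not the exact config above).
;; Assumes Ollama is serving on its default port, localhost:11434.
(require 'gptel)

(setq gptel-model 'qwen2.5-coder:latest
      gptel-backend
      (gptel-make-ollama "Ollama"          ; any display name works
        :host "localhost:11434"
        :stream t
        :models '(qwen2.5-coder:latest mistral:latest)))
```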
-
After some tests, I take that back: the model specification makes no difference. The problem is that I can only make it work with the chat endpoint; the generate endpoint does not work.
-
@dstrohmaier gptel only supports the chat endpoint (/api/chat).
-
Thanks, that's good to know. I found that the chat endpoint produces underwhelming results, sometimes just an empty string. I thought this might be because the model is not really meant to be used for chat. It is, after all, a coder model, which one might expect to be used for autocompletion rather than chat.
-
We can support the generate endpoint in the future -- but have you verified
that the generate endpoint produces better results for the coder models?
Also, generating a blank response is almost definitely a gptel bug. If you
can reproduce it, please raise an issue with details.
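One way to check is to hit /api/generate directly from Emacs and compare its output with what gptel gets from the chat endpoint. A rough sketch using the built-in url library follows; the helper name `my/ollama-generate` and the example prompt are made up, and the request fields are assumed to match Ollama's documented generate API.

```elisp
;; Rough sketch: query Ollama's /api/generate directly and return the text,
;; to compare against what gptel gets from /api/chat.  Helper name is made up.
(require 'url)
(require 'json)

(defun my/ollama-generate (model prompt)
  "Send PROMPT to MODEL via Ollama's /api/generate and return the response text."
  (let ((url-request-method "POST")
        (url-request-extra-headers '(("Content-Type" . "application/json")))
        (url-request-data
         (encode-coding-string
          (json-encode `(("model"  . ,model)
                         ("prompt" . ,prompt)
                         ("stream" . :json-false)))
          'utf-8)))
    (with-current-buffer
        (url-retrieve-synchronously "http://localhost:11434/api/generate")
      (goto-char url-http-end-of-headers)
      (alist-get 'response (json-read)))))

;; Example: (my/ollama-generate "qwen2.5-coder:latest" "Write a binary search in Python.")
```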
-
Moving to a discussion since there is nothing to fix on the gptel side. This thread can be used to discuss adding support for the generate endpoint (/api/generate).
-
I don't think that is the case. qwen models can be used for chat. They do support infill (i.e. code completion), but that is on top of the chat ability. Also, if I understand correctly, the …
FWIW, I was able to run the qwen2.5-coder model with gptel.
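For the infill case specifically, /api/generate is where the difference shows up: as far as I know it accepts a `suffix` field for fill-in-the-middle with models like qwen2.5-coder, which the chat schema has no counterpart for. A rough sketch of such a payload (field names assumed from Ollama's generate API docs; the code fragments are made up):

```elisp
;; Sketch of a fill-in-the-middle payload for Ollama's /api/generate.
;; `prompt' is the code before the cursor, `suffix' the code after it.
(require 'json)

(json-encode
 `(("model"  . "qwen2.5-coder:latest")
   ("prompt" . "def binary_search(lst, x):\n    ")
   ("suffix" . "\n    return -1")
   ("stream" . :json-false)))
```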
-
Setting up gptel with ollama local models is not working.
Additional context:
Emacs version: 29
Operating system: OSX Sonoma
Ollama:
```
➜ ~ ollama list
NAME                           ID            SIZE    MODIFIED
qwen2.5-coder:latest           4a26c19c376e  4.7 GB  35 minutes ago
mistral:latest                 f974a74358d6  4.1 GB  2 hours ago
codegemma:latest               0c96700aaada  5.0 GB  2 hours ago
codellama:latest               8fdf8f752f6e  3.8 GB  2 hours ago
qwen2.5-coder:7b-instruct      4a26c19c376e  4.7 GB  11 days ago
qwen:32b                       26e7e8447f5d  18 GB   11 days ago
eramax/nxcode-cq-7b-orpo:q6    2784da3b3724  6.4 GB  12 days ago
deepseek-coder:6.7b-instruct   ce298d984115  3.8 GB  12 days ago
llama3.2:latest                a80c4f17acd5  2.0 GB  2 weeks ago
```
init.el:
Messages: