Supporting the Ollama /api/generate endpoint #543
Replies: 8 comments
-
I've been able to make qwen work with the following configuration:
The API request was not working when I used `qwen2.5-coder:3b` instead of `mistral:latest`. Obviously, I would prefer a fix of the underlying problem.
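For anyone following along, a minimal gptel + Ollama registration looks roughly like the sketch below. The host, stream setting, and model names are assumptions (localhost:11434 is Ollama's default port, and the models are taken from the `ollama list` output later in the thread); this is an illustrative sketch, not necessarily the exact configuration referred to above.

```elisp
;; Illustrative sketch of a gptel + Ollama setup (not the exact config above).
;; Assumes Ollama is serving on its default port, localhost:11434.
(require 'gptel)

(setq gptel-model 'qwen2.5-coder:latest
      gptel-backend
      (gptel-make-ollama "Ollama"          ; any display name works
        :host "localhost:11434"
        :stream t
        :models '(qwen2.5-coder:latest mistral:latest)))
```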
-
After some tests, I take that back: the model specification makes no difference. The problem is that I can only make it work with the chat endpoint; the generate endpoint does not work.
-
@dstrohmaier gptel only supports the chat endpoint (/api/chat).
-
Thanks, that's good to know. I found that the chat endpoint produces underwhelming results, sometimes just an empty string. I thought this might be because the model is not really meant to be used for chat. It is, after all, a coder model, which one might expect to be used for autocompletion rather than chat.
-
We can support the generate endpoint in the future -- but have you verified
that the generate endpoint produces better results for the coder models?
Also, generating a blank response is almost definitely a gptel bug. If you
can reproduce it, please raise an issue with details.
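One way to check is to hit /api/generate directly from Emacs and compare its output with what gptel gets from the chat endpoint. A rough sketch using the built-in url library follows; the helper name `my/ollama-generate` and the example prompt are made up, and the request fields are assumed to match Ollama's documented generate API.

```elisp
;; Rough sketch: query Ollama's /api/generate directly and return the text,
;; to compare against what gptel gets from /api/chat.  Helper name is made up.
(require 'url)
(require 'json)

(defun my/ollama-generate (model prompt)
  "Send PROMPT to MODEL via Ollama's /api/generate and return the response text."
  (let ((url-request-method "POST")
        (url-request-extra-headers '(("Content-Type" . "application/json")))
        (url-request-data
         (encode-coding-string
          (json-encode `(("model"  . ,model)
                         ("prompt" . ,prompt)
                         ("stream" . :json-false)))
          'utf-8)))
    (with-current-buffer
        (url-retrieve-synchronously "http://localhost:11434/api/generate")
      (goto-char url-http-end-of-headers)
      (alist-get 'response (json-read)))))

;; Example: (my/ollama-generate "qwen2.5-coder:latest" "Write a binary search in Python.")
```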
-
Moving to a discussion since there is nothing to fix on the gptel side. This thread can be used to discuss adding support for the generate endpoint (/api/generate).
-
I don't think that is the case. qwen models can be used for chat. They do support infill (i.e. code completion), but that is on top of the chat ability. Also, if I understand correctly, the …
FWIW, I was able to run the qwen2.5-coder model with gptel.
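For the infill case specifically, /api/generate is where the difference shows up: as far as I know it accepts a `suffix` field for fill-in-the-middle with models like qwen2.5-coder, which the chat schema has no counterpart for. A rough sketch of such a payload (field names assumed from Ollama's generate API docs; the code fragments are made up):

```elisp
;; Sketch of a fill-in-the-middle payload for Ollama's /api/generate.
;; `prompt' is the code before the cursor, `suffix' the code after it.
(require 'json)

(json-encode
 `(("model"  . "qwen2.5-coder:latest")
   ("prompt" . "def binary_search(lst, x):\n    ")
   ("suffix" . "\n    return -1")
   ("stream" . :json-false)))
```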
-
Setting up gptel with ollama local models is not working.
Additional context:
Emacs version: 29
Operating system: OSX Sonoma
Ollama:
```
➜ ~ ollama list
NAME                           ID            SIZE    MODIFIED
qwen2.5-coder:latest           4a26c19c376e  4.7 GB  35 minutes ago
mistral:latest                 f974a74358d6  4.1 GB  2 hours ago
codegemma:latest               0c96700aaada  5.0 GB  2 hours ago
codellama:latest               8fdf8f752f6e  3.8 GB  2 hours ago
qwen2.5-coder:7b-instruct      4a26c19c376e  4.7 GB  11 days ago
qwen:32b                       26e7e8447f5d  18 GB   11 days ago
eramax/nxcode-cq-7b-orpo:q6    2784da3b3724  6.4 GB  12 days ago
deepseek-coder:6.7b-instruct   ce298d984115  3.8 GB  12 days ago
llama3.2:latest                a80c4f17acd5  2.0 GB  2 weeks ago
```
init.el:
Messages: