Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include DeepSeek's reasoning_content when present #592

Closed
wants to merge 26 commits into from

Conversation

z80dev
Copy link

@z80dev z80dev commented Jan 30, 2025

Creates a separate backend for DeepSeek that inherits from the (mostly compatible) OpenAI backend.

Adds parsing and formatting of reasoning content, and a constructor argument for the backend to configure reasoning inclusion.

@karthink
Copy link
Owner

karthink commented Jan 30, 2025

@z80dev Thanks for the PR!

I have some suggestions to simplify the parser functions, I'll explain when I next have some time.

For now I wanted to mention some thoughts about the show-reasoning configuration. I'm going out on a limb here:

  • I expect that the popularity of deepseek-r1 means that we will soon see many more models that emit chains of thought/reasoning blocks. This might become a semi-standard feature of LLMs.
  • This might also be something that users may want to hide/show on demand. For example, they might want it to be hidden by default, but easily toggle-able before/after a request. (Yes, toggling after a request can work too.)

Making the switch part of the backend definition makes it difficult to do this. Essentially you have to run

(setf (gptel-backend-show-reasoning (gptel-get-backend "Deepseek")) t)

Considering the above points, perhaps it makes sense to make this configurable at the gptel level instead of the gptel-backend level? Perhaps via a switch like gptel-include-reasoning, or gptel-include-chain-of-thought, similar to how we have gptel-include-tool-results or gptel-stream.

@pirminj
Copy link
Contributor

pirminj commented Jan 30, 2025

Regarding the UI, I agree that a quick toggle option is needed. In my testing I often find the reasoning output interesting but not really relevant for the work, therefore showing a side buffer *gptel-reasoning* on demand might me a better solution.

@z80dev
Copy link
Author

z80dev commented Jan 30, 2025

Good idea! I first started implementing with a separate variable (scoped only to deepseek), but then migrated to backend-level config for this because sometimes I wanted one backend to show reasoning, and another not showing reasoning.

For this to work with a global toggle, I would have to be flipped each time I make a request which is not ideal

and yes the point about having reasoning in the main gptel buffer or in a side buffer is another good point. Another option to consider is having the reasoning be put under a sub-heading (so #### Reasoning in markdown format) and having that heading be collapsed by default, and a separate, uncollapsed sub-heading (#### Response) for the main response.. I haven't checked if this is something that's actually possible but I'd be surprised if it wasn't

all this shows how differently our use-cases can be, and we should prioritize flexibility to not get in developer's way

I will start with refactoring the toggle to be a variable instead of part of the backend config, and make it so it can also be set per-buffer via the transient. this supports my use-case I mentioned above (some backends I want the reasoning to show, others not) without making it part of the backend config.

then I will look into "reasoning output destinations" like a collapsed or uncollapsed heading, or a separate buffer, or just in-place

@karthink
Copy link
Owner

...sometimes I wanted one backend to show reasoning, and another not showing reasoning.

For this to work with a global toggle, I would have to be flipped each time I
make a request which is not ideal

I'm not sure what you are comparing this with, because it's the same amount of work in both approaches, but a gptel-level toggle can be more versatile.

  • If it's part of the backend you have to switch backends, possibly buffer-locally.
  • If it's a gptel option you have flip that switch, possibly buffer-locally.

The advantage of the latter is that you can additionally use a single backend both with and without reasoning included. In the former you'd have to define two identical deepseek backends (but for the show-reasoning property) to be able to do that.

and yes the point about having reasoning in the main gptel buffer or in a side
buffer is another good point. Another option to consider is having the reasoning
be put under a sub-heading (so #### Reasoning in markdown format) and having
that heading be collapsed by default, and a separate, uncollapsed sub-heading
(#### Response) for the main response.. I haven't checked if this is something
that's actually possible but I'd be surprised if it wasn't

You can do exactly this with a post response hook. See my comment here.

I will start with refactoring the toggle to be a variable instead of part of the
backend config, and make it so it can also be set per-buffer via the
transient.

You just have to use the gptel-lisp-variable class in the transient infix, the rest is taken care of. Also, I think most people miss the fact that you can also set these options per request, and not just buffer-locally.

then I will look into "reasoning output destinations" like a collapsed or
uncollapsed heading, or a separate buffer, or just in-place

Perhaps gptel-include-reasoning can have three values, t, buffer and nil to determine if the chain of thought should be shown with the response, in a separate buffer or not shown at all. I'm not sure yet if a fourth value (hide or collapse) to include+collapse the CoT is necessary.

@karthink
Copy link
Owner

karthink commented Feb 6, 2025

@z80dev I'd like to merge this, as I'm getting many issues/discussions requesting this feature on the tracker. Do you think you'll be able to incorporate the changes we discussed soon?

@z80dev
Copy link
Author

z80dev commented Feb 6, 2025

@z80dev I'd like to merge this, as I'm getting many issues/discussions requesting this feature on the tracker. Do you think you'll be able to incorporate the changes we discussed soon?

@karthink
yep I can do it today! I've been waiting for the tips on the parser functions that you mentioned in your first comment but I can push the other changes today

@karthink
Copy link
Owner

karthink commented Feb 6, 2025

I've been waiting for the tips on the parser functions that you mentioned in your first comment but I can push the other changes today

It's a little involved so I'll explain when I get a reasonably sized chunk of free time, hopefully soon.

and redefine gptel-deepseek to use this variable in its parsing to
determine whether to include reasoning
@karthink
Copy link
Owner

Can you provide me with the log for a medium sized deepseek-reasoner response? Both streaming and non-streaming, although the former is more important.

To propose changes to the parsers, I need to understand what the output looks like. I don't have Deepseek access so I'm flying blind right now.

(when (and reasoning-content
(not (equal reasoning-content :null))
gptel-include-reasoning)
(unless (plist-get info :header-printed)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Headers feels like a front-end detail for sure. This would be a block / drawer in org and a source block in markdown.

@karthink this is where I think a TYPE argument in the callback would be useful for allowing the front-end to differentiate reasoning as a message type.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Headers feels like a front-end detail for sure.

Yes, my plan for Deepseek integration is different from this, and in line with separating the response parser from the UI. I need to look at some Deepseek response logs to make a proposal.

@karthink this is where I think a TYPE argument in the callback would be useful for allowing the front-end to differentiate reasoning as a message type.

TYPE will not fly, but I've already covered alternatives in detail in my response to your suggestion.

@sg-qwt
Copy link

sg-qwt commented Feb 11, 2025

Can you provide me with the log for a medium sized deepseek-reasoner response? Both streaming and non-streaming, although the former is more important.

To propose changes to the parsers, I need to understand what the output looks like. I don't have Deepseek access so I'm flying blind right now.

https://github.com/marketplace/models/azureml-deepseek/DeepSeek-R1/playground
hi @karthink right now deepseek-r1 is free on Github Models, maybe you can test with that instead?
image

@psionic-k
Copy link
Contributor

Changes in #626 are highly relevant to new response types and the frontend / backend separation.

#+begin_reasoning and a markdown fence with reasoning as the language would work roughly the same as everything about explicit tool call handling.

I'm likely to begin hacking on this PR. It looks okay except for bundling reasoning in with chat.

right now deepseek-r1 is free on Github Models

@sg-qwt did you actually authenticate against it? Please share details so I can catch up.

@sg-qwt
Copy link

sg-qwt commented Feb 13, 2025

@sg-qwt did you actually authenticate against it? Please share details so I can catch up.

Just get a Github PAT(no need to tick any permission), you're good to go. The model always includes <think> </think>tags with reasoning.
Though I found it mostly unusable for free users due to rate limit is tight. Blocked by http 429 just after a few tries.

@psionic-k
Copy link
Contributor

I'm about to crash. If you are awake, I need an eli5. Put the PAT in auth info? How can I add some headers to set super low token and reasoning limits if I just want to be sure and get enough sample requests?

@karthink
Copy link
Owner

karthink commented Feb 13, 2025

I tested it with Github models, and the situation is a little more complicated than it appears. The Github models API just adds literal <think> and </think> tags to the reasoning part of the response. The Deepseek API includes the reasoning chunk as a different field in the JSON object.

This PR is for the Deepseek API. I can't get access to the API, so I need to see some logs (both streaming and non-streaming) before I can decide how best to handle the parsing.

Handling the <think> tags can be done in the OpenAI parser. It needs to be done with care, as we want to avoid scanning every response chunk for <think> when streaming responses.

@baohaojun
Copy link

This PR is for the Deepseek API. I can't get access to the API, so I need to see some logs (both streaming and non-streaming) before I can decide how best to handle the parsing.

Hi, Hope this is useful to you.

Streaming:

HTTP/2 200 
date: Wed, 26 Feb 2025 06:50:06 GMT
content-type: text/event-stream
cache-control: no-cache
x-envoy-upstream-service-time: 888
strict-transport-security: max-age=15724800; includeSubDomains

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"Okay","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":",","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" the","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" user","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" wrote","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" \"","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"Hello","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" World","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":".\"","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" That","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"'s","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" a","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" classic","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" first","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" program","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" message","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":".","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" They","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" might","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" be","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" testing","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" if","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" I","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"'m","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" working","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" or","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" just","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" being","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" friendly","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":".","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" Since","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" I","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"'m","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" supposed","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" to","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" respond","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" conc","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"is","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"ely","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":",","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" a","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" simple","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" greeting","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" back","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" should","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" work","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":".","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" Maybe","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" add","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" a","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" bit","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" of","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" em","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"oji","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" to","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" keep","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" it","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" light","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":".","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" Let","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" me","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" check","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" the","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" instructions","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" again","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"—","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"yes","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":",","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" concise","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" and","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" helpful","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":".","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" Al","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"right","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":",","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" \"","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"Hello","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"!","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" How","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" can","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" I","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" assist","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" you","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" today","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"?","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" \uD83D\uDE80","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":"\"","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" That","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" should","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" do","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":" it","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"reasoning_content":".","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":"Hello","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":"!","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":" How","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":" can","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":" I","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":" assist","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":" you","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":" today","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":"?","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"index":0,"delta":{"content":" \uD83D\uDE80","role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:{"model":"deepseek-r1","id":"chatcmpl-75945b6f-a28b-939e-a650-63762ab131f5","choices":[{"finish_reason":"stop","index":0,"delta":{"role":"assistant"}}],"created":1740552606,"object":"chat.completion.chunk"}

data:[DONE]

(2d678550467a01d43acc72094869ef59 . 210)

No streaming:

HTTP/2 200 
date: Wed, 26 Feb 2025 06:52:25 GMT
content-type: application/json
content-length: 1012
x-envoy-upstream-service-time: 52337
strict-transport-security: max-age=15724800; includeSubDomains

{"model":"deepseek-r1","id":"chatcmpl-6134c04f-8e59-9d6b-8b54-d8bfa6f1d6b9","choices":[{"index":0,"message":{"content":"Hello! How can I assist you in Emacs today?","reasoning_content":"Okay, the user sent \"Hello World.\" That's a classic first program message. They might be testing if I'm working or just being friendly. Since I'm an Emacs-based assistant, I should respond in a way that's concise but also shows I'm integrated here. Maybe mention Emacs to acknowledge the environment. Let's keep it simple.\n\nPossible responses: \"Hello! How can I assist you in Emacs today?\" or \"Hello! Need help with Elisp or Emacs commands?\" The first one is more general. The second assumes they need help with something specific. Since they just said \"Hello World,\" maybe they're starting out. Let's go with the first option to keep it open-ended.","role":"assistant"},"finish_reason":"stop"}],"created":1740552745,"object":"chat.completion","usage":{"prompt_tokens":26,"total_tokens":188,"completion_tokens":162}}(2d678550467a01d43acc72094869ef59 . 208)

@arichiardi
Copy link

arichiardi commented Mar 4, 2025

Hi there, I think this could be useful - vLLM is usally used for serving production workloads (aka multiple requests at the same time) and I suspect it is backed by the DeepSeek folks.

For instance, there is a spefic section for reasoning output here.

I have it set it up here locally and I can try things out when this branch is deemed to be ready.

@shroomist
Copy link

if there's any specific thing that needs to be tested against deepseek-r1 - I could try contributing on that end - I've got a few spare tokens on the official api.

@psionic-k
Copy link
Contributor

Found an interesting bit for frontend people: Deekseek's own reasoning is not included in subsequent chat rounds. We can look at this as an early, primitive form of automatic context pruning. While step-by-step patterns can prime the production of a better output, after use, reasoning must have been considered too expensive and not useful enough, possibly distracting.

For cache hits, DeepSeek charges $0.014 per million tokens, slashing API costs by up to 90%.

Just interesting to note.

@karthink
Copy link
Owner

karthink commented Mar 7, 2025

@baohaojun Thank you for the logs, these are useful.

@karthink
Copy link
Owner

karthink commented Mar 8, 2025

I added support for Deepseek's reasoning content. Unfortunately I cannot merge this PR as the method I used ended up being significantly different from this one -- with a much shorter parser.

There is now a user option to include reasoning: gptel-include-reasoning, and it should work for all backends/models that include reasoning blocks. There are many supported behaviors, let me know if you have any feedback about these.

As with most options you can set it from the transient menu with whatever scope you need.

Please update and test. You'll have to use the new gptel-make-deepseek function to set up the backend:

(gptel-make-deepseek "DeepSeek"    ;Any name you want
  :stream t                        ;for streaming responses
  :key "your-api-key")             ;can be a function that returns the key

@karthink karthink closed this Mar 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants