-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explicit Tool Turns #626
Explicit Tool Turns #626
Changes from all commits
5d5610d
3fdcb22
aa62573
2bb081e
6ad2afa
e9d7903
ce042d6
7313359
45dd2f0
7ad479f
7eea99c
3727a95
ad6be58
2440b8c
6cf0b0b
3c74df1
cc18cc4
5fe062b
4d0689a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -223,42 +223,56 @@ PROCESS and _STATUS are process parameters." | |
(setf (alist-get process gptel--request-alist nil 'remove) nil) | ||
(kill-buffer proc-buf))) | ||
|
||
(defun gptel-curl--stream-insert-response (response info) | ||
(defun gptel-curl--stream-insert-response (response info &optional raw) | ||
"Insert streaming RESPONSE from an LLM into the gptel buffer. | ||
|
||
INFO is a mutable plist containing information relevant to this buffer. | ||
See `gptel--url-get-response' for details." | ||
(cond | ||
((stringp response) | ||
(let ((start-marker (plist-get info :position)) | ||
(tracking-marker (plist-get info :tracking-marker)) | ||
(transformer (plist-get info :transformer))) | ||
(with-current-buffer (marker-buffer start-marker) | ||
(save-excursion | ||
(unless tracking-marker | ||
(goto-char start-marker) | ||
(unless (or (bobp) (plist-get info :in-place)) | ||
(insert gptel-response-separator) | ||
(when gptel-mode | ||
;; Put prefix before AI response. | ||
(insert (gptel-response-prefix-string))) | ||
(move-marker start-marker (point))) | ||
(setq tracking-marker (set-marker (make-marker) (point))) | ||
(set-marker-insertion-type tracking-marker t) | ||
(plist-put info :tracking-marker tracking-marker)) | ||
|
||
(when transformer | ||
(setq response (funcall transformer response))) | ||
|
||
(add-text-properties | ||
0 (length response) '(gptel response front-sticky (gptel)) | ||
response) | ||
(goto-char tracking-marker) | ||
;; (run-hooks 'gptel-pre-stream-hook) | ||
(insert response) | ||
(run-hooks 'gptel-post-stream-hook))))) | ||
((consp response) | ||
(gptel--display-tool-calls response info)))) | ||
See `gptel--url-get-response' for details. | ||
|
||
Optional RAW disables text properties and transformation." | ||
(pcase response | ||
((pred stringp) | ||
(let ((start-marker (plist-get info :position)) | ||
(tracking-marker (plist-get info :tracking-marker)) | ||
(message-marker (plist-get info :message-marker)) | ||
(transformer (plist-get info :transformer)) | ||
(in-place (plist-get info :in-place))) | ||
(with-current-buffer (marker-buffer start-marker) | ||
(save-excursion | ||
(unless tracking-marker | ||
(goto-char start-marker) | ||
(setq tracking-marker (set-marker (make-marker) (point))) | ||
(set-marker-insertion-type tracking-marker t) | ||
(plist-put info :tracking-marker tracking-marker)) | ||
(goto-char tracking-marker) | ||
(when (and gptel-mode (not (or raw in-place))) | ||
(unless (and message-marker (= tracking-marker message-marker)) | ||
(unless (bobp) | ||
(insert gptel-response-separator))) | ||
(unless (plist-get info :prefix-done) | ||
(insert (gptel-response-prefix-string)) | ||
(plist-put info :prefix-done t) | ||
(move-marker start-marker (point)))) | ||
(unless raw | ||
(when transformer | ||
(setq response (funcall transformer response))) | ||
(add-text-properties | ||
0 (length response) '(gptel response front-sticky (gptel)) | ||
response)) | ||
;; (run-hooks 'gptel-pre-stream-hook) | ||
(insert response) | ||
(when (and gptel-mode (not raw)) | ||
(if message-marker | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand what There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tracking marker is moved up by all inserts. The message marker should only move up by There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, the way this is implemented does not follow your intention except incidentally. What I mean is that the implementation actually works like this: Tracking marker is moved up by all inserts. The message marker should only move up So you are effectively overloading the meaning of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not just include a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Not if there's multiple sequential tool calls. Each tool generates a call to the
Possibly? Probably still too early to tell. If it's a bad solution, it will look worse when the next use case comes along and then it will be even more obvious what actually needs to happen. If it's a scalable solution to have different markers for different message types, it will just keep working and we'll never notice. I can stick around to support this as is. Recommend I've been banging the multiple tool call cases pretty hard. What's there looks pretty good with no configuration. It's sensible out of the box. It works consistently. It's a less involved implementation than the earlier There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It should be fine to have
Claude typically intersperses tool calls and text, and it adds the newlines itself (see the tool-use demos at the top of the README). In this case there will be too many newlines between the last tool call and the text if you also add a response separator.
That's true, but a better solution would be to move this logic further down, into There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
well.. idk. this is a frontend backend coupling issue. If no solution will be universal, back-and-forthing in this PR won't fix it either. I think we leave it an unsolved issue somewhere. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, it's going to remain unsolved either way, but the current solution is making the insert callbacks much harder to understand. Is there some reason you don't want to add the response separator after each tool call instead? |
||
(move-marker message-marker (point)) | ||
(plist-put info :message-marker (point-marker)))) | ||
(run-hooks 'gptel-post-stream-hook))))) | ||
(`(reasoning . ,_text) | ||
(display-warning '(gptel gptel-reasoning) | ||
"Reasoning unsupported." :warning)) | ||
(`(tool-call . ,tool-calls) | ||
(gptel--display-tool-calls tool-calls info)) | ||
(`(tool-result . ,tool-results) | ||
(gptel--display-tool-results tool-results info)))) | ||
|
||
(defun gptel-curl--stream-filter (process output) | ||
(let* ((fsm (alist-get process gptel--request-alist)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, turns out I don't understand
:prefix-done
either, even after a good night's sleep with coffee in hand.:prefix-done
is a toll-gate flag that is set once per fsm life-cycle, right? So why can't the body of this clause be moved into the(unless tracking-marker ...)
block where it originally resided? Thetracking-marker
is also set once per fsm life-cycle.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, once per lifecycle.
The default callbacks will set the
:tracking-marker
when the tool insertion calls into them withraw
set. Tracking marker is intended to always move up regardless of who inserts. The:prefix-done
toll-gate flag should only be set by an actual response message insertion, which does not use theraw
argument. Whenever one or more tool calls happen before the LLM begins blabbering response messages. Re-using the:tracking-marker
existence would lead to no prefix insertion. The independent:prefix-done
state allows the non-raw insertion path to recognize the situation correctly and only insert at the first blabbering.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tool call output counts as blabbering too? IIUC, this means the prefix is sometimes inserted before any tool result, and sometimes not, depending on whether the LLM had anything to say first. This would be confusing behavior -- visually, the response prefix marks the start of the response region to the user, i.e. "stuff I didn't write". So it should always come first. The tool result region does not fit into the "LLM response" role in the user's mental model otherwise.
If you are just working around the fact that a response prefix followed by a tool call output can look ugly for some prefixes, I think that's a fair price -- a more accurate and consistent mental model is more important here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
If we were to highlight assistant message regions, it would be apparent that the empty prefix is not in fact an assistant message at all, so putting the prefix only before a real assistant message is more consistent with the underlying facts.
Note, the prefix is a "user" message whether it's at the end of a user message or injected somewhere randomly in the log. The robustness of simply dropping empty prefixes that trim to nil is on display.
In practice, I can easily visually identify tool call blocks as "not mine". They have font locking from org and org-modern besides my own font locking. They are folded. It's very obvious and intuitive in org.
The empty response prefixes in master are by far the bigger eye sore.
More people can get what they want this way. My HK-47 roleplay nonsense is only compatible if the first assistant message of a lifecycle includes the prefix. Users who don't want such details are fine either way, but users who do want to style their LLM are impacted negatively without the change. This is the more inclusive option.
An alternative option we quickly decided against was to visually move the tool calls after the first line of the response. It was unreliable to do with the LLM and creates much bigger lies if we break one assistant message into two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were highlighting regions, the tool calls and response text would be highlighted the same color. It doesn't make sense to highlight tool results differently. Then the response prefix would be an non-highlighted island in the middle of the response region, out of place.
These affordances are personal customizations, and not a given. We have also markdown and text-mode to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just cut this off and leave a TODO to add a customize variable. Clearly it's subjective behavior and splits across personal preferences.