Replies: 15 comments 35 replies
-
Since you asked for feedback: I have two mental models when working in Emacs:
-
> I assume by "second model" you mean the one where anything typed by the user (anywhere in the buffer) is unconditionally assigned the user role, which is gptel's current behavior.
Seems like I got a lot of the models mixed up across multiple replies here, so I'll just make it clear: I would prefer that any change I make to the chat buffer be seen as a change made to the conversation history. If I add text to the assistant's response, it is still the assistant's response. If I yank part of its text around and add it to MY response, it becomes MY response.
The current behavior is something I usually try to avoid.
Right. Implicitly you are assuming that the response == a region of the buffer, and not the contents of the response text itself. This is the behavior you would expect, for example, if you used an overlay to track the response bounds. I understand the appeal of this approach, but when you implement it, it turns out not to work very well.
gptel used to work this way and it was fine some of the time (~80%), but there were too many edge cases, and it was easy to lose this overlay (or overlay-like behavior) because of the various things Emacs and other minor modes do in buffers. If I can find a way to implement this behavior robustly I'll add it again.
-
potential need for quoting tags

Indeed, we really need to be able to edit the "response" part, which in the current case will fragment it and require

Also, you might want to use part of the response in your request. Currently, I paste it in the terminal with

I think the best solution would be a visible structure, similar to HTML/XML tags. Similar to JSON. Similar to Emacs' S-expressions. I mean using opening and closing tags. That structure could be orthogonal to the existing structure in the document: It would be ignored by

So that would be

There would be no ambiguity anymore: We could edit, copy, or do whatever we want, provided we don't meddle with the intertwined new tags.

(Note: if we want to be able to quote the tags of the new structure, which is far from unimaginable, we can add a UUID to the tags:
-
@Inkbottle007, @daedsidog I think you are taking the "response == buffer region" semantic model for granted. This is why indicating responses visually, copying text (etc) doesn't work how you expect.

The "response == buffer region" model is fine, and I'm okay switching to it since it's the popular one so far. The question is how to implement it. The behavior you want maps 1:1 to overlays, so the easy fix would be to use overlays instead of (or in addition to) text properties to demarcate response boundaries. Then everything, including copying response text to other buffers and visual indications of response regions, will work as you expect.

Switching to overlays is still a fair amount of work, though. As a precursor and preview to that, you can try the following:

```elisp
(setf (alist-get 'gptel text-property-default-nonsticky nil 'remove) nil)
```

This should do most of what you want, but it introduces some hidden gotchas, especially in markdown-mode.
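For anyone who wants to see what this setting changes, here is a minimal sketch. It assumes the usual Emacs rule that a text property is rear-sticky unless it is listed with a non-nil value in `text-property-default-nonsticky`. The function name `my/typed-text-role` is made up for illustration and is not part of gptel.

```elisp
(defun my/typed-text-role (nonsticky)
  "Return the `gptel' property of text typed right after a response.
When NONSTICKY is non-nil, list `gptel' in `text-property-default-nonsticky'."
  (with-temp-buffer
    ;; Start from the current default, minus any existing `gptel' entry.
    (let ((base (assq-delete-all
                 'gptel (copy-alist text-property-default-nonsticky))))
      (setq-local text-property-default-nonsticky
                  (if nonsticky (cons '(gptel . t) base) base)))
    (insert (propertize "response" 'gptel 'response))
    ;; `insert-and-inherit' applies stickiness, as typed characters do.
    (insert-and-inherit " typed by user")
    (get-text-property (1- (point-max)) 'gptel)))
```

Calling it with `nil` models the state after the snippet above (typed text continues the response), and calling it with `t` models the nonsticky state (typed text is plain user text).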
-
What about using invisible open/close tags? Just as the

This approach would address the issue where subsequent user text is mistaken for a response, as discussed earlier. Importantly, it wouldn't appear in the document's actual text, maintaining the goal of non-intrusiveness. Of course, if the user meddles with those characters, everything will end up mixed up again. That's why I thought that showing those boundaries would help. However, I understand the intention to keep gptel seamless and invisible within the workflow.
-
In the
It needs some testing, so if you're interested please switch to the
-
I've found the first problem with using overlays to track responses -- if you kill text and undo, or even just undo and redo, the overlays don't come back so the tracking is gone.
-
I have a workflow that works. I use a simple function that unambiguously highlights the attributions and then you just have to fix the inconsistencies by hand (using

I think we should keep the text-property-based attribution system. Since the text is natural language, I don't think there is a single (unified) solution that addresses all cases. However, a hybrid solution like the workflow described here should be sufficient. There is the question of hooking (the highlighting function on changes in the buffer) but I don't know how this could be done without being CPU intensive.

```elisp
(require 'text-property-search)

(defun gptel-highlight-responses ()
  "Highlight response segments with overlays."
  (interactive)
  ;; Clear overlays from a previous run so repeated calls do not stack them.
  (remove-overlays (point-min) (point-max) 'gptel-response-overlay t)
  (save-excursion
    (goto-char (point-max))
    (let (prop)                       ; bind locally instead of setting a global
      ;; Walk backward over alternating runs of the `gptel' text property.
      (while (setq prop (text-property-search-backward
                         'gptel 'response
                         (when (get-char-property (max (point-min) (1- (point)))
                                                  'gptel)
                           t)))
        (let ((role (if (prop-match-value prop) "assistant" "user"))
              (overlay (make-overlay (prop-match-beginning prop)
                                     (prop-match-end prop))))
          (overlay-put overlay 'face
                       `(:background ,(if (equal role "assistant")
                                          "lightblue"
                                        "lightgreen")
                         :extend t))
          ;; Tag the overlay so it can be found and removed later.
          (overlay-put overlay 'gptel-response-overlay t))))))
```

Additional information can be found here.
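On the hooking question, one common way to keep the cost down is to debounce: instead of re-highlighting on every change, schedule a single refresh for when Emacs next goes idle. The sketch below is only an illustration of that idea. The names `my/gptel-highlight-timer` and `my/gptel-schedule-highlight` and the 0.5 second delay are made up, and it assumes a `gptel-highlight-responses` function like the one posted in this thread is defined.

```elisp
(defvar-local my/gptel-highlight-timer nil
  "Pending idle timer for re-highlighting this buffer, or nil.")

(defun my/gptel-schedule-highlight (&rest _)
  "Re-highlight the current buffer once Emacs has been idle for a moment.
Suitable for `after-change-functions'; repeated calls reset the timer."
  ;; Drop any refresh that is still pending so only one ever runs.
  (when (timerp my/gptel-highlight-timer)
    (cancel-timer my/gptel-highlight-timer))
  (setq my/gptel-highlight-timer
        (run-with-idle-timer
         0.5 nil
         (lambda (buf)
           ;; The buffer may be gone, or the highlighter not loaded.
           (when (and (buffer-live-p buf)
                      (fboundp 'gptel-highlight-responses))
             (with-current-buffer buf
               (gptel-highlight-responses))))
         (current-buffer))))

;; Enable in one chat buffer only (buffer-local hook):
;; (add-hook 'after-change-functions #'my/gptel-schedule-highlight nil t)
```

A burst of edits then triggers at most one pass over the buffer, at the price of the colors lagging briefly behind the text.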
-
I believe the use of highlighting is the right answer to this question. I've been using @daedsidog's take on this solution for two weeks, but have reverted to my own implementation based on overlays that "I have reason to believe" are best for the intended scenario. My version seems very robust, perhaps to the point of being a drawback, and requires some tweaking. However, you can use it already if you want, because it is very convenient. Note that I am using the default convention of

(I already have specific ideas on how to do the tweaking and will work on it asap.)
-
After some experimentation I think I have another potential solution to this problem: we can make the
While still not ideal, this avoids the most common problem of the response property taking over user text that follows it. My intuition is that it's rarer for the user to type in text exactly at the left boundary of a response compared to the right. Does this make sense?
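To make the left/right boundary trade-off concrete, here is a small sketch. It assumes the proposal amounts to marking the response text front-sticky while keeping the `gptel` property rear-nonsticky. That reading is an assumption, and the function name `my/role-at-insertion` is made up for illustration.

```elisp
(defun my/role-at-insertion (where)
  "Return the `gptel' property picked up by text typed at WHERE.
WHERE is `left', `middle' or `right', relative to a response."
  (with-temp-buffer
    ;; Assumption: `gptel' is rear-nonsticky by default in this buffer.
    (setq-local text-property-default-nonsticky
                (cons '(gptel . t)
                      (assq-delete-all
                       'gptel (copy-alist text-property-default-nonsticky))))
    (insert "prompt ")
    (let ((beg (point)))
      ;; Assumption: the response itself carries front-sticky `gptel'.
      (insert (propertize "response" 'gptel 'response 'front-sticky '(gptel)))
      (let ((end (point)))
        (insert " more prompt")
        (goto-char (pcase where
                     ('left beg)
                     ('middle (+ beg 3))
                     (_ end)))
        ;; Insert with inheritance, as typed characters are.
        (insert-and-inherit "X")
        (get-text-property (1- (point)) 'gptel)))))
```

Under these assumptions, text typed at the right edge stays user text, text typed inside the response stays part of it, and only text typed exactly at the left edge is absorbed into the response.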
-
I've gone ahead and made the
As discussed in this thread, there is no perfect solution here, because overlay-based tracking is too fragile and Emacs does not have the concept of "middle-sticky" text-properties. This change is experimental and I might revert it if it creates new problems. Fingers crossed.
-
Beyond making the text property sticky, what about allowing users to edit the message role and customize its face?
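As a rough illustration of the role-editing half of this idea, a command along the following lines could re-attribute a region by hand. This is a sketch only: `my/gptel-set-role` is a hypothetical name, not a gptel command, and it assumes responses are marked with the `gptel` text property set to `response`, as in the highlighting function posted in this thread.

```elisp
(defun my/gptel-set-role (beg end role)
  "Mark the text between BEG and END as ROLE, either `response' or `user'."
  (interactive
   (list (region-beginning) (region-end)
         (intern (completing-read "Role: " '("response" "user") nil t))))
  (if (eq role 'response)
      ;; Attribute the region to the LLM.
      (put-text-property beg end 'gptel 'response)
    ;; Untagged text counts as user text.
    (remove-text-properties beg end '(gptel nil))))
```

Select a region, run `M-x my/gptel-set-role`, and pick a role. Face customization is not covered here.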
-
Wrote #565 in partial response to this thread.
-
I wanted to share some thoughts on the highlighting approach in gptel, based on my experiences using it extensively on a daily basis. I believe that robust highlighting is essential to prevent user confusion, regardless of the stickiness settings. Relying solely on stickiness might not account for all the possible editing combinations a user might perform.

I've developed a set of functions using overlays for highlighting, which have consistently maintained accurate coloring in sync with the intended properties. This approach has worked well for me, and I've been using it without issues. For reference, I've shared my implementation here: gptel-highlight-v4.el.

I also tried, for two weeks at one point, daedsidog's solution that utilizes text properties for highlighting, and I commented on it. In certain scenarios, such as working with long files and source code blocks, I encountered issues where the coloring and actual roles were out of sync. While this was my experience, I recognize that it might work well for others, and considerable thought has gone into this solution. In fact, my homegrown set of functions is based on its functional interface.

I'm agnostic regarding the highlighting technology employed, as long as it delivers consistent and reliable results. My main concern is ensuring that gptel remains robust, especially for those of us who rely on it heavily.

It's also worth mentioning that John Kitchin developed a highlighting framework that might offer some insights. In 2015, he used text properties for this application (A highlight annotation mode for Emacs using font-lock). However, by 2016, he switched to using overlays in his production code (Persistent highlighting in Emacs, ov-highlighter.el, ov-highlight repository). While he didn't explicitly state his reasons for switching, perhaps overlays offered advantages in terms of robustness for his use case.

As someone who uses gptel daily for several hours, having a reliable and consistent highlighting system is important to me. I wanted to share these insights and my experiences in hopes that they might contribute to finding a solution that works well for all users. I'm happy to provide more details about my implementation or collaborate further if that would be helpful.
-
Currently gptel tags the text of LLM responses so it can distinguish between its responses and user prompts. The exact way it does this in Elisp is irrelevant (or not yet relevant) to this discussion. As it turns out, there are several subtleties to this behavior that are unresolved.
To figure these out, I would like your input and feedback on the following two questions:
1. If you move the cursor into a response region and type in text, should that new text be considered part of the response, or should it break the response into two regions separated by a new user prompt?
2. If you copy some text from a response region and yank it -- elsewhere into this buffer or into another one -- should it continue to be recognized by gptel as an LLM response, or is it now part of the user prompt?
Before you reply: I've heard from users who believe it should obviously work this way, and would not understand why anyone would want the opposite behavior... for both values of this. Consider that there are situations where both possible behaviors are useful. The question is about your mental model of the response: is the LLM response a feature of the text itself, or is it a feature of the position and context of the text in the buffer? (As you might expect, these correspond roughly to two ways of marking text in Emacs buffers, with text-properties or overlays.)
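For readers less familiar with the two mechanisms, here is a minimal sketch of how they differ when text is copied. The function name `my/copy-keeps-role` and the overlay property `my-response` are made up for illustration, and the `gptel` text property with value `response` is used only as an example tag.

```elisp
(defun my/copy-keeps-role ()
  "Copy a tagged response into a fresh buffer.
Return (TEXT-PROPERTY . OVERLAY-PRESENT) as seen in the destination."
  (with-temp-buffer
    (insert "user prompt\n")
    (let ((start (point)))
      (insert "LLM response")
      ;; Model 1: the role is a feature of the text itself.
      (put-text-property start (point) 'gptel 'response)
      ;; Model 2: the role is a feature of a region of this buffer.
      (overlay-put (make-overlay start (point)) 'my-response t)
      ;; `buffer-substring' keeps text properties but not overlays.
      (let ((copied (buffer-substring start (point))))
        (with-temp-buffer
          (insert copied)
          (cons (get-text-property (point-min) 'gptel)
                (and (overlays-in (point-min) (point-max)) t)))))))
```

Evaluating `(my/copy-keeps-role)` shows the text property arriving with the copied text, while no overlay exists in the destination buffer. The first question (typing inside a response) is governed by property stickiness rather than by copying.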