karthink · karthink · Mar 5, 2025 · Feb 22, 2025 · Feb 23, 2025 · Feb 23, 2025
diff --git a/README.org b/README.org
@@ -121,14 +121,18 @@ gptel uses Curl if available, but falls back to the built-in url-retrieve to wor
     - [[#rewrite-refactor-or-fill-in-a-region][Rewrite, refactor or fill in a region]]
     - [[#extra-org-mode-conveniences][Extra Org mode conveniences]]
   - [[#faq][FAQ]]
-      - [[#i-want-to-use-gptel-in-a-way-thats-not-supported-by-gptel-send-or-the-options-menu][I want to use gptel in a way that's not supported by =gptel-send= or the options menu]]
+    - [[#chat-buffer-ui][Chat buffer UI]]
       - [[#i-want-the-window-to-scroll-automatically-as-the-response-is-inserted][I want the window to scroll automatically as the response is inserted]]
       - [[#i-want-the-cursor-to-move-to-the-next-prompt-after-the-response-is-inserted][I want the cursor to move to the next prompt after the response is inserted]]
       - [[#i-want-to-change-the-formatting-of-the-prompt-and-llm-response][I want to change the formatting of the prompt and LLM response]]
+      - [[#how-does-gptel-distinguish-between-user-prompts-and-llm-responses][How does gptel distinguish between user prompts and LLM responses?]]
+    - [[#transient-menu-behavior][Transient menu behavior]]
       - [[#i-want-the-transient-menu-options-to-be-saved-so-i-only-need-to-set-them-once][I want the transient menu options to be saved so I only need to set them once]]
+      - [[#using-the-transient-menu-leaves-behind-extra-windows][Using the transient menu leaves behind extra windows]]
       - [[#can-i-change-the-transient-menu-key-bindings][Can I change the transient menu key bindings?]]
-      - [[#how-does-gptel-distinguish-between-user-prompts-and-llm-responses][How does gptel distinguish between user prompts and LLM responses?]]
       - [[#doom-emacs-sending-a-query-from-the-gptel-menu-fails-because-of-a-key-conflict-with-org-mode][(Doom Emacs) Sending a query from the gptel menu fails because of a key conflict with Org mode]]
+    - [[#miscellaneous][Miscellaneous]]
+      - [[#i-want-to-use-gptel-in-a-way-thats-not-supported-by-gptel-send-or-the-options-menu][I want to use gptel in a way that's not supported by =gptel-send= or the options menu]]
       - [[#chatgpt-i-get-the-error-http2-429-you-exceeded-your-current-quota][(ChatGPT) I get the error "(HTTP/2 429) You exceeded your current quota"]]
       - [[#why-another-llm-client][Why another LLM client?]]
   - [[#additional-configuration][Additional Configuration]]
@@ -509,6 +513,29 @@ The above code makes the backend available to select.  If you want it to be the
                  :stream t :key "your-api-key"))
 #+end_src
 
+***** (Optional) Interim support for Claude 3.7 Sonnet
+
+gptel does not yet support specifying LLM "reasoning"/"thinking" behaviors dynamically through its interface.  This effort is ongoing, but in the meantime you can use the Claude 3.7 Sonnet model in its "thinking" mode by defining a second Claude backend and selecting it via the UI or elisp:
+
+#+begin_src emacs-lisp
+(gptel-make-anthropic "Claude-thinking" ;Any name you want
+  :key "your-API-key"
+  :stream t
+  :models '(claude-3-7-sonnet-20250219)
+  :header (lambda () (when-let* ((key (gptel--get-api-key)))
+                  `(("x-api-key" . ,key)
+                    ("anthropic-version" . "2023-06-01")
+                    ("anthropic-beta" . "pdfs-2024-09-25")
+                    ("anthropic-beta" . "output-128k-2025-02-19")
+                    ("anthropic-beta" . "prompt-caching-2024-07-31"))))
+  :request-params '(:thinking (:type "enabled" :budget_tokens 2048)
+                    :max_tokens 4096))
+#+end_src
+
+You can set the reasoning token budget and the maximum token count for this backend via the =:budget_tokens= and =:max_tokens= keys, respectively.
+
+Once proper support for specifying reasoning behaviors is added to gptel's UI, this will be unnecessary.
+
 #+html: </details>
 #+html: <details><summary>
 **** Groq
@@ -860,7 +887,7 @@ gptel provides a few powerful, general purpose and flexible commands.  You can d
 2. If a region is selected, the conversation will be limited to its contents.
 
 3. Call =M-x gptel-send= with a prefix argument (~C-u~)
-   - to set chat parameters (GPT model, backend, system message etc) for this buffer,
+   - to set chat parameters (model, backend, system message etc) for this buffer,
    - include quick instructions for the next request only,
    - to add additional context -- regions, buffers or files -- to gptel,
    - to read the prompt from or redirect the response elsewhere,
@@ -1123,15 +1150,7 @@ Note: using this option requires Org 9.7 or higher to be available.  The [[https
 You can declare the gptel model, backend, temperature, system message and other parameters as Org properties with the command =gptel-org-set-properties=.  gptel queries under the corresponding heading will always use these settings, allowing you to create mostly reproducible LLM chat notebooks, and to have simultaneous chats with different models, model settings and directives under different Org headings.
 
 ** FAQ
-#+html: <details><summary>
-**** I want to use gptel in a way that's not supported by =gptel-send= or the options menu
-#+html: </summary>
-
-gptel's default usage pattern is simple, and will stay this way: Read input in any buffer and insert the response below it.  Some custom behavior is possible with the transient menu (=C-u M-x gptel-send=).
-
-For more programmable usage, gptel provides a general =gptel-request= function that accepts a custom prompt and a callback to act on the response. You can use this to build custom workflows not supported by =gptel-send=.  See the documentation of =gptel-request=, and the [[https://github.com/karthink/gptel/wiki/Defining-custom-gptel-commands][wiki]] for examples.
-
-#+html: </details>
+*** Chat buffer UI
 #+html: <details><summary>
 **** I want the window to scroll automatically as the response is inserted
 #+html: </summary>
@@ -1155,7 +1174,6 @@ To be minimally annoying, gptel does not move the cursor by default.  Add the fo
 
 You can also call =gptel-end-of-response= as a command at any time.
 
-
 #+html: </details>
 #+html: <details><summary>
 **** I want to change the formatting of the prompt and LLM response
@@ -1167,6 +1185,18 @@ Anywhere in Emacs: Use =gptel-pre-response-hook= and =gptel-post-response-functi
 
 #+html: </details>
 #+html: <details><summary>
+**** How does gptel distinguish between user prompts and LLM responses?
+#+html: </summary>
+
+gptel uses [[https://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Properties.html][text-properties]] to watermark LLM responses.  Thus this text is interpreted as a response even if you copy it into another buffer.  In regular buffers (buffers without =gptel-mode= enabled), you can turn off this tracking by unsetting =gptel-track-response=.
+
+When restoring a chat state from a file on disk, gptel will apply these properties from saved metadata in the file when you turn on =gptel-mode=.
+
+gptel does /not/ use any prefix or semantic/syntax element in the buffer (such as headings) to separate prompts and responses.  The reason for this is that gptel aims to integrate as seamlessly as possible into your regular Emacs usage: LLM interaction is not the objective, it's just another tool at your disposal.  So requiring a bunch of "user" and "assistant" tags in the buffer is noisy and restrictive. If you want these demarcations, you can customize =gptel-prompt-prefix-alist= and =gptel-response-prefix-alist=.  Note that these prefixes are for your readability only and purely cosmetic.
+
+#+html: </details>
+*** Transient menu behavior
+#+html: <details><summary>
 **** I want the transient menu options to be saved so I only need to set them once
 #+html: </summary>
 
@@ -1191,24 +1221,24 @@ Or see this [[https://github.com/karthink/gptel/wiki/Commonly-requested-features
 
 #+html: </details>
 #+html: <details><summary>
-**** Can I change the transient menu key bindings?
+**** Using the transient menu leaves behind extra windows
 #+html: </summary>
 
-Yes, see =transient-suffix-put=.  This changes the key to select a backend/model from "-m" to "M" in gptel's menu:
-#+begin_src emacs-lisp
-(transient-suffix-put 'gptel-menu (kbd "-m") :key "M")
-#+end_src
+If using gptel's transient menus causes new/extra window splits to be created, check your value of =transient-display-buffer-action=.  [[https://github.com/magit/transient/discussions/358][See this discussion]] for more context.
+
+If you are using Helm, see [[https://github.com/magit/transient/discussions/361][Transient#361]].
+
+In general, do not customize this Transient option unless you know what you're doing!
 
 #+html: </details>
 #+html: <details><summary>
-**** How does gptel distinguish between user prompts and LLM responses?
+**** Can I change the transient menu key bindings?
 #+html: </summary>
 
-gptel uses [[https://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Properties.html][text-properties]] to watermark LLM responses.  Thus this text is interpreted as a response even if you copy it into another buffer.  In regular buffers (buffers without =gptel-mode= enabled), you can turn off this tracking by unsetting =gptel-track-responses=.
-
-When restoring a chat state from a file on disk, gptel will apply these properties from saved metadata in the file when you turn on =gptel-mode=.
-
-gptel does /not/ use any prefix or semantic/syntax element in the buffer (such as headings) to separate prompts and responses.  The reason for this is that gptel aims to integrate as seamlessly as possible into your regular Emacs usage: LLM interaction is not the objective, it's just another tool at your disposal.  So requiring a bunch of "user" and "assistant" tags in the buffer is noisy and restrictive. If you want these demarcations, you can customize =gptel-prompt-prefix-alist= and =gptel-response-prefix-alist=.  Note that these prefixes are for your readability only and purely cosmetic.
+Yes, see =transient-suffix-put=.  This changes the key to select a backend/model from "-m" to "M" in gptel's menu:
+#+begin_src emacs-lisp
+(transient-suffix-put 'gptel-menu (kbd "-m") :key "M")
+#+end_src
 
 #+html: </details>
 #+html: <details><summary>
@@ -1224,6 +1254,16 @@ Two solutions:
   (transient-suffix-put 'gptel-menu (kbd "RET") :key "<f8>")
   #+end_src
 
+#+html: </details>
+*** Miscellaneous
+#+html: <details><summary>
+**** I want to use gptel in a way that's not supported by =gptel-send= or the options menu
+#+html: </summary>
+
+gptel's default usage pattern is simple, and will stay this way: Read input in any buffer and insert the response below it.  Some custom behavior is possible with the transient menu (=C-u M-x gptel-send=).
+
+For more programmable usage, gptel provides a general =gptel-request= function that accepts a custom prompt and a callback to act on the response. You can use this to build custom workflows not supported by =gptel-send=.  See the documentation of =gptel-request=, and the [[https://github.com/karthink/gptel/wiki/Defining-custom-gptel-commands][wiki]] for examples.
+
 #+html: </details>
 #+html: <details><summary>
 **** (ChatGPT) I get the error "(HTTP/2 429) You exceeded your current quota"
@@ -1246,7 +1286,7 @@ Other Emacs clients for LLMs prescribe the format of the interaction (a comint s
 2. Integration with org-mode, not using a walled-off org-babel block, but as regular text.  This way the model can generate code blocks that I can run.
 
 #+html: </details>
-#+html: <details><summary>
+
 ** Additional Configuration
 :PROPERTIES:
 :ID:       f885adac-58a3-4eba-a6b7-91e9e7a17829

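Note on the interim Claude 3.7 backend in the README hunk above: the example pairs `:budget_tokens 2048` with `:max_tokens 4096`. As far as I understand Anthropic's API, the thinking budget counts toward `max_tokens`, so the budget has to be the smaller of the two; treat that as an assumption, since this diff does not state it. A minimal sketch of a sanity check for such a `:request-params` plist follows. `my/thinking-params-ok-p` is a hypothetical helper and not part of gptel.

```emacs-lisp
(defun my/thinking-params-ok-p (params)
  "Return non-nil if PARAMS leaves room for a reply after thinking.
PARAMS is a plist shaped like the :request-params value in the
README example.  Hypothetical helper, not part of gptel."
  (let ((budget (plist-get (plist-get params :thinking) :budget_tokens))
        (max-tokens (plist-get params :max_tokens)))
    ;; Assumption: the thinking budget must be strictly below :max_tokens.
    (and (integerp budget)
         (integerp max-tokens)
         (< budget max-tokens))))

;; The values used in the README example pass the check.
(my/thinking-params-ok-p
 '(:thinking (:type "enabled" :budget_tokens 2048) :max_tokens 4096))
```

Running a check like this before `gptel-make-anthropic` is optional; it only flags a budget that would leave no tokens for the visible reply.
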
diff --git a/gptel-anthropic.el b/gptel-anthropic.el
@@ -283,29 +283,30 @@ TOOL-USE is a list of plists containing tool names, arguments and call results."
           ;; We check for blank prompts by skipping whitespace and comparing
           ;; point against the previous.
           (unless (save-excursion (skip-syntax-forward " ") (>= (point) prev-pt))
+            ;; XXX update for tools
             (pcase (get-char-property (point) 'gptel)
               ('response
-               (push (list :role "assistant"
-                           :content (buffer-substring-no-properties (point) prev-pt))
-                     prompts))
+               (when-let* ((content
+                            (gptel--trim-prefixes
+                             (buffer-substring-no-properties (point) prev-pt))))
+                 (unless (string-blank-p content)
+                   (push (list :role "assistant" :content content) prompts))))
               ('nil                     ; user role: possibly with media
-               (if include-media       
-                   (push (list :role "user"
-                               :content
-                               (gptel--anthropic-parse-multipart
-                                (gptel--parse-media-links major-mode (point) prev-pt)))
-                         prompts)
-                 (push (list :role "user"
-                             :content
-                             (gptel--trim-prefixes
-                              (buffer-substring-no-properties (point) prev-pt)))
-                       prompts)))))
+               (if include-media
+                   (when-let* ((content (gptel--anthropic-parse-multipart
+                                         (gptel--parse-media-links major-mode (point) prev-pt))))
+                     (when (> (length content) 0)
+                       (push (list :role "user" :content content) prompts)))
+                 (when-let* ((content (gptel--trim-prefixes
+                                       (buffer-substring-no-properties (point) prev-pt))))
+                   (push (list :role "user" :content content) prompts))))))
           (setq prev-pt (point))
           (and max-entries (cl-decf max-entries)))
-      (push (list :role "user"
-                  :content
-                  (string-trim (buffer-substring-no-properties (point-min) (point-max))))
-            prompts))
+      (when-let* ((content (string-trim (buffer-substring-no-properties
+                                         (point-min) (point-max)))))
+        ;; XXX fails if content is empty ("" is non-nil, so this guard always
+        ;; passes).  The correct error behavior is left to a future discussion.
+        (push (list :role "user" :content content) prompts)))
     prompts))
 
 (defun gptel--anthropic-parse-multipart (parts)
@@ -328,7 +329,7 @@ format."
    for media = (plist-get part :media)
    if text do
    (and (or (= n 1) (= n last)) (setq text (gptel--trim-prefixes text))) and
-   unless (string-empty-p text)
+   if text
    collect `(:type "text" :text ,text) into parts-array end
    else if media
    do
@@ -387,7 +388,15 @@ files in the context."
 ;;         (plist-get (car (last prompts)) :content)))
 
 (defconst gptel--anthropic-models
-  '((claude-3-5-sonnet-20241022
+  '((claude-3-7-sonnet-20250219
+     :description "Hybrid model capable of standard thinking and extended thinking modes"
+     :capabilities (media tool-use cache)
+     :mime-types ("image/jpeg" "image/png" "image/gif" "image/webp" "application/pdf")
+     :context-window 200
+     :input-cost 3
+     :output-cost 15
+     :cutoff-date "2025-02")
+    (claude-3-5-sonnet-20241022
      :description "Highest level of intelligence and capability"
      :capabilities (media tool-use cache)
      :mime-types ("image/jpeg" "image/png" "image/gif" "image/webp" "application/pdf")

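Note on the parser hunk above: the new `when-let*` guards only skip a message when the bound value is nil. That works where the diff appears to rely on `gptel--trim-prefixes` returning nil for blank text (an inference from how it is used here, since its definition is not shown), but not for the `string-trim` fallback, because an empty string is non-nil in Emacs Lisp. A minimal sketch of the pattern, using the hypothetical helpers `my/nonblank` and `my/add-prompt` rather than anything from gptel:

```emacs-lisp
(require 'subr-x) ; string-trim, if-let*

(defun my/nonblank (string)
  "Return STRING trimmed of whitespace, or nil if nothing is left.
Hypothetical helper, not part of gptel."
  (let ((trimmed (string-trim string)))
    (unless (string-empty-p trimmed)
      trimmed)))

(defun my/add-prompt (role text prompts)
  "Return PROMPTS with a ROLE message for TEXT prepended.
Blank TEXT is skipped, so PROMPTS is returned unchanged."
  (if-let* ((content (my/nonblank text)))
      (cons (list :role role :content content) prompts)
    prompts))

;; A blank region contributes nothing.
(my/add-prompt "user" "  \n" nil)        ; => nil
;; A non-blank region is trimmed and added.
(my/add-prompt "user" " hello \n" nil)   ; => ((:role "user" :content "hello"))
```

Normalizing blank text to nil in one place lets every call site use the same `when-let*`/`if-let*` guard without a second emptiness check.
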
diff --git a/gptel-curl.el b/gptel-curl.el
@@ -223,42 +223,56 @@ PROCESS and _STATUS are process parameters."
     (setf (alist-get process gptel--request-alist nil 'remove) nil)
     (kill-buffer proc-buf)))
 
-(defun gptel-curl--stream-insert-response (response info)
+(defun gptel-curl--stream-insert-response (response info &optional raw)
   "Insert streaming RESPONSE from an LLM into the gptel buffer.
 
 INFO is a mutable plist containing information relevant to this buffer.
-See `gptel--url-get-response' for details."
-  (cond
-   ((stringp response)
-    (let ((start-marker (plist-get info :position))
-          (tracking-marker (plist-get info :tracking-marker))
-          (transformer (plist-get info :transformer)))
-      (with-current-buffer (marker-buffer start-marker)
-        (save-excursion
-          (unless tracking-marker
-            (goto-char start-marker)
-            (unless (or (bobp) (plist-get info :in-place))
-              (insert gptel-response-separator)
-              (when gptel-mode
-                ;; Put prefix before AI response.
-                (insert (gptel-response-prefix-string)))
-              (move-marker start-marker (point)))
-            (setq tracking-marker (set-marker (make-marker) (point)))
-            (set-marker-insertion-type tracking-marker t)
-            (plist-put info :tracking-marker tracking-marker))
-
-          (when transformer
-            (setq response (funcall transformer response)))
-
-          (add-text-properties
-           0 (length response) '(gptel response front-sticky (gptel))
-           response)
-          (goto-char tracking-marker)
-          ;; (run-hooks 'gptel-pre-stream-hook)
-          (insert response)
-          (run-hooks 'gptel-post-stream-hook)))))
-   ((consp response)
-    (gptel--display-tool-calls response info))))
+See `gptel--url-get-response' for details.
+
+Optional RAW disables text properties and transformation."
+  (pcase response
+    ((pred stringp)
+     (let ((start-marker (plist-get info :position))
+           (tracking-marker (plist-get info :tracking-marker))
+           (message-marker (plist-get info :message-marker))
+           (transformer (plist-get info :transformer))
+           (in-place (plist-get info :in-place)))
+       (with-current-buffer (marker-buffer start-marker)
+         (save-excursion
+           (unless tracking-marker
+             (goto-char start-marker)
+             (setq tracking-marker (set-marker (make-marker) (point)))
+             (set-marker-insertion-type tracking-marker t)
+             (plist-put info :tracking-marker tracking-marker))
+           (goto-char tracking-marker)
+           (when (and gptel-mode (not (or raw in-place)))
+             (unless (and message-marker (= tracking-marker message-marker))
+               (unless (bobp)
+                 (insert gptel-response-separator)))
+             (unless (plist-get info :prefix-done)
+               (insert (gptel-response-prefix-string))
+               (plist-put info :prefix-done t)
+               (move-marker start-marker (point))))
+           (unless raw
+             (when transformer
+               (setq response (funcall transformer response)))
+             (add-text-properties
+              0 (length response) '(gptel response front-sticky (gptel))
+              response))
+           ;; (run-hooks 'gptel-pre-stream-hook)
+           (insert response)
+           (when (and gptel-mode (not raw))
+             (if message-marker
+                 (move-marker message-marker (point))
+               (plist-put info :message-marker (point-marker))))
+           (run-hooks 'gptel-post-stream-hook)))))
+    (`(reasoning . ,_text)
+     (display-warning '(gptel gptel-reasoning)
+                      "Reasoning unsupported." :warning))
+    (`(tool-call . ,tool-calls)
+     (gptel--display-tool-calls tool-calls info))
+    (`(tool-result . ,tool-results)
+     (gptel--display-tool-results tool-results info))))
 
 (defun gptel-curl--stream-filter (process output)
   (let* ((fsm (alist-get process gptel--request-alist))
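
Note on the `gptel-curl--stream-insert-response` hunk above: the rewrite replaces the `cond` on `stringp`/`consp` with a `pcase` that dispatches on the shape of the response, either a plain string or a cons tagged with `reasoning`, `tool-call` or `tool-result`. The dispatch alone can be sketched as below. `my/response-kind` is a hypothetical function for illustration; it only classifies the value and performs none of the buffer insertion.

```emacs-lisp
(defun my/response-kind (response)
  "Return a symbol naming the shape of RESPONSE.
Hypothetical helper mirroring the pcase clauses in the diff."
  (pcase response
    ((pred stringp) 'text)                     ; streamed text chunk
    (`(reasoning . ,_text) 'reasoning)         ; tagged reasoning text
    (`(tool-call . ,_calls) 'tool-call)        ; pending tool calls
    (`(tool-result . ,_results) 'tool-result)  ; completed tool results
    (_ 'unknown)))                             ; anything else

(my/response-kind "partial text")
(my/response-kind '(reasoning . "thinking..."))
(my/response-kind '(tool-call . ((:name "example"))))
```

Unlike the old `consp` branch, tagged conses keep the tool-call and tool-result paths separate, and the clause list has room for a reasoning handler once that is supported.
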