Hi Community,

First off, thank you all for your amazing work on MLX! We've been building on this framework in our project SpeziLLM, and it's just awesome! 🚀

While exploring tool usage with MLX, I've been following the implementation by @DePasqualeOrg in #174. One challenge we encountered is that smaller models sometimes deviate from the system prompt, producing plain-text output instead of tool calls.

Some applications, like LM Studio, have introduced a Structured Output API that enforces a JSON Schema on the model's output to ensure consistency. They use outlines and, as far as I understand, iteratively restrict generation to the allowed tokens (see https://github.com/dottxt-ai/outlines/blob/main/outlines/processors/structured.py).

I believe integrating a similar Structured Output feature into mlx-swift-examples could greatly improve reliability, especially with models that stray from the expected response format.
Any tips, ideas, or pointers from the community would be greatly appreciated!
I haven't tried this yet, but one idea that comes to mind is modifying the chat template so that the model's response already begins with the opening of whatever output format you want. For example, the DeepSeek R1 models have a chat template that inserts a <think> tag at the beginning of the model's response, ensuring the response includes a thinking block, which it otherwise might not.
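A toy illustration of this forced-prefix trick, using a made-up ChatML-style template (not any specific model's actual template): the assistant turn is opened and seeded with the start of the desired format, so generation continues from inside that structure.

```python
# Sketch of the "forced response prefix" idea: render the conversation,
# open the assistant turn, and append the beginning of the format you
# want. The template below is illustrative, not a real model's template.

def build_prompt(messages, response_prefix=""):
    """Render a minimal ChatML-style prompt and force a response prefix."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Seed the assistant turn with the desired prefix, e.g. '<think>'
    # for R1-style models or '{"name": "' to steer toward a tool call.
    parts.append(f"<|im_start|>assistant\n{response_prefix}")
    return "".join(parts)

prompt = build_prompt(
    [{"role": "user", "content": "What's the weather in Berlin?"}],
    response_prefix='{"name": "',
)
print(prompt)
```

This doesn't guarantee valid output the way token masking does, but it cheaply nudges the model into the right format, and the forced prefix can be prepended to the generated text before parsing.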