
Suggestion: Structured Output (for Tool Usage) #221

Open
LeonNissen opened this issue Mar 4, 2025 · 1 comment

Comments

@LeonNissen

Hi Community,

First off, thank you all for your amazing work on MLX! We’ve been building on this framework in our project SpeziLLM, and it’s just awesome! 🚀

In our exploration of tool usage with MLX, I’ve been following the implementation by @DePasqualeOrg in #174, and one challenge we encountered is that smaller models can sometimes deviate from the system prompt, which leads to text output instead of tool calls.

Some applications, like LM Studio, have introduced a Structured Output API. This approach enforces a JSON Schema on the model's output to ensure consistency. They use outlines, which, as far as I understand, iteratively restricts sampling to the tokens allowed by the schema at each decoding step (see https://github.com/dottxt-ai/outlines/blob/main/outlines/processors/structured.py).
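For anyone unfamiliar with the approach, here is a minimal sketch of the idea behind that kind of constrained decoding: at each step, compute which tokens keep the partial output a valid prefix of the target format, and mask the logits of everything else. All names and the toy "grammar" below are illustrative; this is not the outlines API.

```python
import math

def allowed_next_tokens(generated: str, vocab: dict[int, str], is_valid_prefix) -> set[int]:
    """Return ids of tokens whose text keeps the output a valid prefix."""
    return {
        tid for tid, text in vocab.items()
        if is_valid_prefix(generated + text)
    }

def mask_logits(logits: list[float], allowed: set[int]) -> list[float]:
    """Set disallowed token logits to -inf so sampling can never pick them."""
    return [
        logit if tid in allowed else -math.inf
        for tid, logit in enumerate(logits)
    ]

# Toy constraint: output must be a prefix of '{"name":' (a real
# implementation would track a JSON-Schema-derived automaton instead).
def is_valid_prefix(s: str) -> bool:
    target = '{"name":'
    return target.startswith(s) or s.startswith(target)

vocab = {0: "{", 1: '"name"', 2: ":", 3: "hello"}
allowed = allowed_next_tokens("", vocab, is_valid_prefix)
masked = mask_logits([0.5, 0.1, 0.2, 2.0], allowed)
# "hello" (id 3) is masked out even though it had the highest logit,
# so the sampler is forced to start the JSON object with "{".
```

In practice, libraries like outlines compile the schema into a finite-state machine once, so the per-step allowed-token lookup is cheap rather than a scan over the whole vocabulary.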

I believe integrating a similar Structured Output feature into mlx-swift-examples could greatly improve reliability, especially when dealing with models that might stray from expected responses.

Any tips, ideas, or pointers from the community would be greatly appreciated!

@DePasqualeOrg
Contributor

DePasqualeOrg commented Mar 4, 2025

I haven't tried this yet, but one idea that comes to mind is modifying the chat template to include the beginning of whatever output format you want to see in the model's response. For example, the DeepSeek R1 models have a chat template that includes the <think> tag at the beginning of the model's response to ensure that the response includes a thinking block, which it otherwise might not.
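To make the prefilling idea concrete, here is a toy sketch: render the chat template but leave the assistant turn open, seeded with the prefix you want the model to continue from. The template markers and the `render_prompt` helper are hypothetical stand-ins, not any specific model's format or an MLX API.

```python
def render_prompt(messages: list[dict], assistant_prefix: str = "") -> str:
    """Render a toy chat template, seeding the assistant turn with a prefix."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}\n")
    # Open the assistant turn but do NOT close it: generation continues
    # right after `assistant_prefix`, so the output starts in the desired format.
    parts.append(f"<|assistant|>\n{assistant_prefix}")
    return "".join(parts)

prompt = render_prompt(
    [{"role": "user", "content": "What is the weather in Berlin?"}],
    assistant_prefix='{"tool_call": {"name": "',
)
# The prompt now ends with the opening of a tool-call JSON object,
# nudging even small models toward emitting a tool call instead of prose.
```

Unlike logit masking, this only constrains the beginning of the response, but it is cheap and needs no changes to the sampling loop.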
