Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Top-K Sampling Support #59

Closed
mikemykhaylov opened this issue Dec 15, 2024 · 4 comments · Fixed by #80
Closed

Top-K Sampling Support #59

mikemykhaylov opened this issue Dec 15, 2024 · 4 comments · Fixed by #80
Labels
enhancement New feature or request

Comments

@mikemykhaylov
Copy link
Contributor

Hello,

I find that for some models, like Mistral Nemo, it is very beneficial to restrict the number of considered completions to improve the coherence of the output. However, compared to llama.cpp, MLX does not seem to support top K sampling in LM Studio. It does look like the underlying library supports that, so implementing that would be much appreciated.

llama.cpp models support Top-K
Screenshot 2024-12-15 at 12 29 11

MLX models do not support Top-K
Screenshot 2024-12-15 at 12 29 44

@neilmehta24
Copy link
Member

Thanks for bringing this to our attention. The MLX core library does indeed support a top-k matrix operation, but the MLX LLM library does not support top-k sampling. Here are the supported generation/sampling options as of today https://github.com/ml-explore/mlx-examples/blob/dfa4dd6/llms/mlx_lm/utils.py#L200-L215 . Please track this issue ml-explore/mlx-examples#1167 for adding support in mlx_lm

@neilmehta24 neilmehta24 added the enhancement New feature or request label Dec 16, 2024
@mikemykhaylov
Copy link
Contributor Author

mikemykhaylov commented Jan 21, 2025

Looks like the upstream PR got merged, any chance we could have the sampler in the MLX engine?

@neilmehta24
Copy link
Member

Looks like the upstream PR got merged, any chance we could have the sampler in the MLX engine?

Shouldn't be difficult to add to our app now, I should be able to get this done soon.

@neilmehta24
Copy link
Member

@mikemykhaylov FYI this is slightly delayed since we need to update the LM Studio UI to inform users about the limitation described here ml-explore/mlx-examples#1219

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants