
CentML inference provider support #809

Open
V2arK opened this issue Jan 17, 2025 · 0 comments

Comments


V2arK commented Jan 17, 2025

🚀 Describe the new functionality needed

We need to integrate CentML as a new remote inference provider within the llama-stack framework. This integration should allow users to seamlessly utilize CentML's available models (meta-llama/Llama-3.3-70B-Instruct and meta-llama/Llama-3.1-405B-Instruct-FP8) for various inference tasks such as chat completions, text completions, and embeddings.
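
For context, here is a rough sketch of the call path such an adapter would wrap, assuming CentML exposes an OpenAI-compatible endpoint (the base URL and environment variable names below are hypothetical placeholders, not confirmed CentML values):

```python
import os

from openai import OpenAI

# Hypothetical endpoint and credential names; substitute CentML's actual
# serverless URL and API key once confirmed.
client = OpenAI(
    base_url=os.environ.get("CENTML_BASE_URL", "https://api.centml.example/v1"),
    api_key=os.environ["CENTML_API_KEY"],
)

# Chat completion against one of the CentML-hosted models named above.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Say hello from llama-stack."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```

If CentML's API follows this shape, the llama-stack adapter would mostly be a thin mapping between llama-stack's inference routes (chat completion, completion, embeddings) and these calls, similar to the existing remote providers.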

💡 Why is this needed? What if we don't build it?

CentML offers high-performance, scalable inference capabilities for large language models. Integrating it into llama-stack broadens the range of accessible models for users, catering to those who require robust and efficient inference solutions.

If we don't build it, users seeking CentML's specific advantages would be unable to leverage llama-stack, potentially driving them to alternative frameworks that already support CentML.

Other thoughts

No response

V2arK changed the title from "CentML inference Provier support" to "CentML inference Provider support" on Jan 17, 2025
V2arK changed the title from "CentML inference Provider support" to "CentML inference provider support" on Jan 17, 2025