
CentML inference provider support #809

Open
V2arK opened this issue Jan 17, 2025 · 0 comments

Comments


V2arK commented Jan 17, 2025

🚀 Describe the new functionality needed

We need to integrate CentML as a new remote inference provider within the llama-stack framework. This integration should allow users to seamlessly utilize CentML's available models (meta-llama/Llama-3.3-70B-Instruct and meta-llama/Llama-3.1-405B-Instruct-FP8) for various inference tasks such as chat completions, text completions, and embeddings.
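
For context, here is a rough sketch of the call path such an adapter would wrap, assuming CentML exposes an OpenAI-compatible endpoint (the base URL and environment variable names below are hypothetical placeholders, not confirmed CentML values):

```python
import os

from openai import OpenAI

# Hypothetical endpoint and credential names; substitute CentML's actual
# serverless URL and API key once confirmed.
client = OpenAI(
    base_url=os.environ.get("CENTML_BASE_URL", "https://api.centml.example/v1"),
    api_key=os.environ["CENTML_API_KEY"],
)

# Chat completion against one of the CentML-hosted models named above.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Say hello from llama-stack."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```

If CentML's API follows this shape, the llama-stack adapter would mostly be a thin mapping between llama-stack's inference routes (chat completion, completion, embeddings) and these calls, similar to the existing remote providers.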

💡 Why is this needed? What if we don't build it?

CentML offers high-performance, scalable inference capabilities for large language models. Integrating it into llama-stack broadens the range of accessible models for users, catering to those who require robust and efficient inference solutions.

If we don't build it, users seeking CentML's specific advantages would be unable to leverage llama-stack, potentially driving them to alternative frameworks that already support CentML.

Other thoughts

No response

V2arK changed the title from "CentML inference Provier support" to "CentML inference Provider support" on Jan 17, 2025
V2arK changed the title from "CentML inference Provider support" to "CentML inference provider support" on Jan 17, 2025