Repositories list (150 repositories)
qwen2.5-vl-7b-instruct
Public template
Vision-Language model that integrates advanced image, video, and text understanding.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

stable-diffusion-s3-image-save
Public template
Uses Stable Diffusion to generate images and automatically uploads them to an S3 bucket.
<metadata> gpu: A100 | collections: ["S3 Storage", "Complex Outputs"] </metadata>
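A minimal sketch of the generate-then-upload flow this template describes, assuming a Diffusers Stable Diffusion checkpoint and boto3 with AWS credentials already configured. The model ID, bucket name, and object key below are illustrative assumptions, not values taken from the template.

```python
import io

import boto3
import torch
from diffusers import StableDiffusionPipeline

# Any Stable Diffusion checkpoint works here; this one is an assumption.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

image = pipe("a watercolor lighthouse at dawn").images[0]

# Serialize the PIL image to an in-memory buffer and push it to S3.
buf = io.BytesIO()
image.save(buf, format="PNG")
buf.seek(0)
boto3.client("s3").upload_fileobj(buf, "my-output-bucket", "outputs/lighthouse.png")
```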

vicuna-7b-1.1
Public template
Open-source chatbot fine-tuned from LLaMA on 70K ShareGPT conversations, optimized for research and conversational tasks.
<metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

tinyllama-1.1b-chat-v1.0
Public template
A chat model fine-tuned on TinyLlama, a compact 1.1B Llama model pretrained on 3 trillion tokens.
<metadata> gpu: T4 | collections: ["vLLM"] </metadata>
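Many of the entries in this list are served with vLLM. A minimal offline-inference sketch, assuming the template wraps the HF checkpoint the repo name suggests; the prompt and sampling settings are illustrative.

```python
from vllm import LLM, SamplingParams

# Model ID inferred from the repo name; an assumption, not confirmed by the template.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain what a context window is."], params)
print(outputs[0].outputs[0].text)
```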

rmbg-1.4
Public template
State-of-the-art background removal model, designed to effectively separate foreground from background.
<metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>
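A background-removal sketch via the Transformers pipeline API, assuming the template wraps briaai/RMBG-1.4 (the model ID is inferred from the repo name) with its custom pipeline code.

```python
from transformers import pipeline

# RMBG-1.4 ships custom pipeline code, hence trust_remote_code=True.
remover = pipeline("image-segmentation", model="briaai/RMBG-1.4", trust_remote_code=True)

# Returns a PIL image with the background removed; the input path is illustrative.
result = remover("photo.jpg")
result.save("photo_no_background.png")
```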

qwen2.5-coder-32b-instruct
Public template
A state-of-the-art coder LLM, tailored for instruction-based tasks, particularly code generation, reasoning, and repair.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

pyannote-speaker-diarization-3.1
Public template
A state-of-the-art model that segments and labels audio recordings by accurately distinguishing different speakers.
<metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>
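A diarization sketch using pyannote.audio directly, assuming the gated pyannote/speaker-diarization-3.1 checkpoint and an HF access token in the HF_TOKEN environment variable; the audio file name is illustrative.

```python
import os

from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token=os.environ["HF_TOKEN"]
)

# Print one line per speaker turn with its time span and speaker label.
diarization = pipeline("meeting.wav")
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
```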

playground-v2.5
Public template
Generates highly aesthetic 1024x1024 images with superior quality, flexible aspect ratios, and outstanding human preference alignment.
<metadata> gpu: T4 | collections: ["Diffusers"] </metadata>

phi-3.5-moe-instruct
Public template
An instruction-tuned variant of Phi-3.5, delivering efficient, context-aware responses across diverse language tasks.
<metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

openchat-3.5
Public template
A chat model fine-tuned with C-RLFT, a strategy inspired by offline reinforcement learning. Optimized for natural, context-aware conversations, it excels at instruction following and text generation.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

neuralhermes-2.5-mistral-7b-gptq
Public template
A GPTQ-quantized 7B language model based on Mistral, fine-tuned for robust, efficient conversational and text generation tasks.
<metadata> gpu: A100 | collections: ["vLLM","GPTQ"] </metadata>
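For the GPTQ entries here, vLLM can load the quantized weights directly by setting quantization="gptq". A sketch under the assumption that the template uses a community GPTQ checkpoint such as the one named below; that exact repo is a guess.

```python
from vllm import LLM, SamplingParams

# quantization="gptq" selects vLLM's GPTQ kernels for the quantized weights.
llm = LLM(model="TheBloke/NeuralHermes-2.5-Mistral-7B-GPTQ", quantization="gptq")

out = llm.generate(["Write a haiku about GPUs."], SamplingParams(max_tokens=64))
print(out[0].outputs[0].text)
```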

mixtral-8x7b-v0.1
Public template
A GPTQ-quantized variant of the Mixtral 8x7B model, fine-tuned for efficient text generation and conversational applications.
<metadata> gpu: A100 | collections: ["vLLM","GPTQ"] </metadata>

mistral-7b-instruct-v0.2
Public template
A 7B model with a 32k token context window and optimized attention mechanisms for superior dialogue and reasoning.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

mistral-7b-instruct-v0.3
Public template
7B model fine-tuned for precise instruction following and robust contextual understanding.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

llama-3.2-3b-instruct
Public template
A compact 3B instruction-tuned model that generates detailed responses across a range of tasks.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

llama-3.2-11b-vision-instruct
Public template
11B multimodal model integrating vision and text for image reasoning, captioning, and Q&A.
<metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

llama-3.1-8b-instruct
Public template
An 8B multilingual instruction model fine-tuned with RLHF for chat completion, supporting up to 128k tokens.
<metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
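Several entries in this list are chat models served via HF Transformers. A generic chat-completion sketch of the kind such templates wrap, assuming access to the gated Meta checkpoint named below; all prompt text is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # gated; access is an assumption
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# apply_chat_template formats the messages with the model's chat markup.
messages = [{"role": "user", "content": "Summarize RLHF in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```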

flux.1-schnell
Public template
12B text-to-image model that upscales and generates high-quality images from text prompts using advanced diffusion techniques.
<metadata> gpu: A100 | collections: ["Diffusers","Variable Inputs"] </metadata>

dolphin-2.5-mixtral-8x7b-gptq
Public template
A GPTQ-quantized version of Eric Hartford's Dolphin 2.5 Mixtral 8x7B model, fine-tuned for coding and conversational tasks.
<metadata> gpu: A100 | collections: ["vLLM","GPTQ"] </metadata>

deepseek-coder-6.7b-instruct
Public template
A 6.7B model fine-tuned on 2 billion tokens of instruction data, designed for code generation and completion tasks.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

animagine-xl-3.0
Public template
High-quality image generation from text prompts, with improved hand anatomy and concept understanding.
<metadata> gpu: A10 | collections: ["Diffusers"] </metadata>

animagine-xl-3.1
Public template
Generates high-quality anime images with improved hand anatomy and new aesthetic tags for enhanced image creation.
<metadata> gpu: A10 | collections: ["Diffusers"] </metadata>

llama-3.1-8b-instruct-GGUF
Public template
An 8B-parameter, instruction-tuned variant of Meta's Llama-3.1 model, optimized in GGUF format for efficient inference.
<metadata> gpu: A100 | collections: ["Using NFS Volumes", "llama.cpp"] </metadata>

phi-4-GGUF
Public template
A 14B model optimized in GGUF format for efficient inference, designed to excel in complex reasoning tasks.
<metadata> gpu: A100 | collections: ["llama.cpp"] </metadata>
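A GGUF inference sketch using llama-cpp-python, the usual Python binding for llama.cpp. The local model path is an assumption (e.g. a quantized file downloaded to an attached volume, as the NFS entry above suggests).

```python
from llama_cpp import Llama

# n_gpu_layers=-1 offloads all layers to the GPU; the path is hypothetical.
llm = Llama(model_path="/models/phi-4-Q4_K_M.gguf", n_gpu_layers=-1, n_ctx=4096)

out = llm("Q: What is 17 * 24? A:", max_tokens=32)
print(out["choices"][0]["text"])
```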

mistral-7b
Public template
A 7B autoregressive language model by Mistral AI, optimized for efficient text generation and robust reasoning.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

gemma-2-9b-it
Public template
Instruction-tuned model delivering coherent, high-quality responses across a broad spectrum of tasks.
<metadata> gpu: A10 | collections: ["HF Transformers"] </metadata>

stable-diffusion-xl-turbo
Public template
A distilled and cost-effective variant of SDXL that delivers high-quality text-to-image generation with accelerated inference speed.
<metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
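SDXL-Turbo's speedup comes from few-step distillation: generation runs in a single inference step with classifier-free guidance disabled. A sketch assuming the public stabilityai/sdxl-turbo checkpoint; the prompt and output path are illustrative.

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# One step, guidance disabled: the settings that give the turbo speedup.
image = pipe(
    "a cinematic photo of a red fox", num_inference_steps=1, guidance_scale=0.0
).images[0]
image.save("fox.png")
```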

DeciLM-7B
Public

qwq-32b-preview
Public template
A 32B experimental reasoning model for advanced text generation and robust instruction following.
<metadata> gpu: A100 | collections: ["vLLM"] </metadata>

whisper-large-v3-turbo
Public template
A turbocharged variant of Whisper large-v3 for English speech recognition, optimized for lower latency.
<metadata> gpu: T4 | collections: ["HF Transformers","Complex Outputs"] </metadata>
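A transcription sketch via the Transformers ASR pipeline, assuming the openai/whisper-large-v3-turbo checkpoint the repo name suggests; the audio file name is illustrative.

```python
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,
    device="cuda:0",
)

# return_timestamps=True also enables long-form (>30 s) transcription.
print(asr("interview.mp3", return_timestamps=True)["text"])
```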