Welcome to the VLM Run Cookbook, a collection of examples and notebooks that demonstrate structured visual understanding with the VLM Run Platform. This repository hosts practical examples and tutorials for extracting structured data from images, videos, and documents using Vision Language Models (VLMs).
- 📚 Practical Examples: A comprehensive collection of Colab notebooks demonstrating real-world applications of VLM Run.
- 🔋 Ready-to-Use: Each example comes with complete code and documentation, making it easy to adapt for your use case.
- 🎯 Domain-Specific: Examples cover various domains from financial documents to TV news analysis.
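To make "structured data" concrete, here is a minimal sketch of what consuming a structured result might look like. It does not call the VLM Run API: the `DriversLicense` fields, the `parse_structured_output` helper, and the sample JSON are all hypothetical and chosen for illustration only. See the notebooks for the actual client usage.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class DriversLicense:
    # Hypothetical fields for illustration; not the platform's actual schema.
    full_name: str
    license_number: str
    expiration_date: str


def parse_structured_output(raw_json: str) -> DriversLicense:
    """Check that a JSON string has every expected field, then build a record."""
    data = json.loads(raw_json)
    expected = {f.name for f in fields(DriversLicense)}
    missing = expected - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return DriversLicense(**{name: str(data[name]) for name in expected})


# Stand-in for a model response; a real one would come from the API.
sample_response = (
    '{"full_name": "Jane Doe", '
    '"license_number": "D1234567", '
    '"expiration_date": "2030-01-31"}'
)
license_record = parse_structured_output(sample_response)
print(license_record.full_name)
```

Validating against a typed schema before using the result means a missing or malformed field fails loudly instead of propagating into downstream code.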
Our collection of Colab notebooks demonstrates various use cases and integrations:
| Name | Type | Last Updated |
|---|---|---|
| API Quickstart | | 02-08-2025 |
| Schema Showcase | feature | 02-08-2025 |
| Visual Grounding | feature | 02-18-2025 |
| Long-form Video Transcription | feature | 03-13-2025 |
| Video Inference (Fine-Tuning) | feature | 02-18-2025 |
| US Drivers License | application | 02-08-2025 |
| Parsing Financial Presentations | application | 02-04-2025 |
| TV News Analysis | application | 02-15-2025 |
| Fashion Product Catalog | application | 02-20-2025 |
| Fashion Images Hybrid Search | application | 02-21-2025 |
| Generate Custom Schema | feature | 03-13-2025 |
- 💬 Send us an email at support@vlm.run or join our Discord for help
- 📣 Follow us on Twitter and LinkedIn to stay up to date on our products
- 📚 Check out our Documentation for detailed guides and API reference