Scalable LLM Architectures with Redis & GCP Vertex AI

☁️ Generative AI on Google Vertex AI comes with a specialized in-console studio experience, a dedicated API for Gemini, and an easy-to-use Python SDK designed for deploying and managing instances of Google's powerful language models.

⚡ Redis Enterprise offers fast and scalable vector search, with an API for index creation, management, search, and hybrid filtering. Coupled with its versatile data structures, Redis Enterprise is a strong foundation for building high-quality Large Language Model (LLM) apps.

This repo serves as a foundational architecture for building LLM applications with Redis and GCP services.

Reference architecture

  1. Primary Data Sources
  2. Data Extraction and Loading
  3. Large Language Models
    • text-embedding-gecko@003 for embeddings
    • gemini-1.5-flash-001 for LLM generation and chat
  4. High-Performance Data Layer (Redis)
    • Semantic caching to improve LLM response times and reduce associated costs
    • Vector search for context retrieval from knowledge base
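
To make the semantic caching idea concrete, here is a minimal in-memory sketch. It is not the repo's implementation: `toy_embed`, `SemanticCache`, `answer`, and the 0.8 threshold are hypothetical names and values chosen for illustration. A real deployment would presumably swap the bag-of-words stand-in for the embedding model listed above and the Python list for a Redis vector index.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a stored response when a new prompt is close enough to an old one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed value; tune per application
        self.entries = []  # list of (embedding, response) pairs

    def lookup(self, prompt):
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        for embedding, response in self.entries:
            score = cosine(query, embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, prompt, response):
        self.entries.append((toy_embed(prompt), response))


def answer(prompt, cache, llm):
    """Check the cache first; call the (expensive) LLM only on a miss."""
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached
    response = llm(prompt)
    cache.store(prompt, response)
    return response
```

Here `llm` is any callable that maps a prompt string to a response string. The saving comes from skipping that call whenever a semantically similar prompt has already been answered.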

RAG demo

Open the Colab notebook for a hands-on code tutorial with Redis and Vertex AI on GCP. It is a step-by-step walkthrough of setting up the required data, generating embeddings, and building RAG from scratch to build fast LLM apps, highlighting Redis vector search and semantic caching.
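
The overall shape of a RAG pipeline (embed, retrieve, assemble a prompt, generate) can be sketched as follows. This is an illustrative sketch only and does not reproduce the notebook: `toy_embed`, `retrieve`, `build_prompt`, and `rag_answer` are hypothetical helpers, the embedding is a bag-of-words stand-in, and retrieval is a linear scan where the notebook would presumably use Redis vector search.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question (linear scan)."""
    query = toy_embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query, toy_embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_docs):
    """Assemble a grounded prompt from the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def rag_answer(question, documents, llm, k=2):
    """Retrieve context, build the prompt, and pass it to the LLM callable."""
    return llm(build_prompt(question, retrieve(question, documents, k)))
```

Passing the model in as a plain callable (`llm`) keeps the sketch runnable without cloud credentials. A Vertex AI client call would take its place in practice.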

Additional resources