The Document Maestro

A LangChain-powered retrieval-augmented generation (RAG) pipeline for multi-modal analysis of PDFs, tailored to probing ESG documents.
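For readers new to retrieval-augmented generation, the idea can be sketched in a few lines of plain Python. This is a toy illustration only, not this project's code: the function names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, the example chunks are made up, and simple word overlap stands in for whatever embedding search the real pipeline performs.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, chunks, k=2):
    """Return the k chunks that share the most word occurrences with the question."""
    question_words = set(tokenize(question))

    def score(chunk):
        counts = Counter(tokenize(chunk))
        return sum(counts[word] for word in question_words)

    # sorted() is stable, so chunks with equal scores keep their original order.
    return sorted(chunks, key=score, reverse=True)[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to a language model."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up chunks standing in for text extracted from a PDF.
chunks = [
    "Scope 1 emissions fell by 12 percent in 2023.",
    "The board met four times during the year.",
    "Water usage at the main plant was reduced.",
]
question = "How did emissions change?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The retrieval step narrows the document down to the passages most relevant to the question, and only those passages are placed in the prompt.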

[Image: screenshot from 2024-02-26 16-28-24]

Environment

To weave the environment for this digital alchemy, follow these incantations:

conda env create -f environment.yml
conda activate pdfRAG

If the above does not work for you, fear not. Try these alternative spells:

conda create -n "pdfRAG-env" python=3.10
conda activate pdfRAG-env
pip install -U langchain openai chromadb langchain-experimental
pip install "unstructured[all-docs]" pillow pydantic lxml matplotlib tiktoken
pip install streamlit
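To confirm the spells took hold, a quick check like the one below can report which packages are still missing. It is a minimal sketch: the helper `find_missing` is hypothetical, and the import names are assumed to match the pip packages listed above (for example, `pillow` is imported as `PIL`).

```python
import importlib.util

# Import names assumed to correspond to the pip packages installed above.
REQUIRED_MODULES = [
    "langchain",
    "openai",
    "chromadb",
    "langchain_experimental",
    "unstructured",
    "PIL",  # provided by the "pillow" package
    "pydantic",
    "lxml",
    "matplotlib",
    "tiktoken",
    "streamlit",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Run it inside the activated conda environment; any name it lists points to a package that still needs installing.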

API Key

Whisper your OpenAI API key:

  • export OPENAI_API_KEY="<your-api-key-here>"
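Before launching, it can help to check that the key actually reached the environment. The snippet below is a small sketch under the assumption that the app reads `OPENAI_API_KEY` from the environment; the helper `api_key_status` is hypothetical and not part of this repository.

```python
import os


def api_key_status(env=None):
    """Classify the OPENAI_API_KEY value as 'missing', 'placeholder', or 'set'."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "missing"
    # Catches the case where the template text was exported unchanged.
    if key.startswith("<") and key.endswith(">"):
        return "placeholder"
    return "set"


print("OPENAI_API_KEY is", api_key_status())
```

A result of `missing` usually means the variable was exported in a different shell session than the one running the app.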

Launch APP

To set sail, chant:

  • streamlit run app.py