elpolini/The-Document-Maestro

 
 


The Document Maestro

A Langchain-powered retrieval-augmented-generation pipeline for comprehensive multi-modal analysis of PDFs, specifically tailored for ESG document probing.

Screenshot from 2024-02-26 16-28-24

Environment

To weave the environment for this digital alchemy, follow these incantations:

conda env create -f environment.yml
conda activate pdfRAG

If the above does not work for you, fear not. Try these alternative spells:

conda create -n "pdfRAG-env" python=3.10
conda activate pdfRAG-env
pip install -U langchain openai chromadb langchain-experimental
pip install "unstructured[all-docs]" pillow pydantic lxml matplotlib tiktoken
pip install streamlit
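
To check that the spells took hold, a small sanity check along these lines may help. This is a minimal sketch and not part of the repository: the helper name `missing_modules` and the list `REQUIRED_MODULES` are hypothetical, and the import names are assumed from the pip packages listed above (for example, pillow is imported as `PIL`).

```python
from importlib.util import find_spec

# Assumed import names for the packages installed above. The import name
# can differ from the pip name (pillow -> PIL, langchain-experimental ->
# langchain_experimental).
REQUIRED_MODULES = [
    "langchain", "openai", "chromadb", "langchain_experimental",
    "unstructured", "PIL", "pydantic", "lxml", "matplotlib",
    "tiktoken", "streamlit",
]


def missing_modules(names):
    """Return the names from `names` that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it inside the activated conda environment; any module it reports as missing can be installed again with pip.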

API-Key

Whisper your OpenAI API key into the shell (no space after the equals sign):

  • export OPENAI_API_KEY=<your-api-key-here>
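
Before launching, it can be worth confirming that the key is actually visible to Python. The sketch below is an assumption about how one might check this, not code from the repository; `read_openai_key` is a hypothetical helper that only reads the `OPENAI_API_KEY` environment variable and never prints its value.

```python
import os


def read_openai_key(env=None):
    """Return the OPENAI_API_KEY value from `env`, or raise if it is missing or blank.

    `env` defaults to the process environment; passing a plain dict makes
    the helper easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before launching the app."
        )
    return key


if __name__ == "__main__":
    try:
        read_openai_key()
        print("OPENAI_API_KEY is set.")
    except RuntimeError as err:
        print(err)
```

Note that `export` only affects the current shell session, so the check and the app should be started from the same terminal.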

Launch APP

To set sail, chant:

  • streamlit run app.py
