PDFGem Chat is an interactive chat interface for querying uploaded PDF files. The project uses Streamlit, PyPDF2, LangChain, Google Generative AI, and FAISS so that users can ask questions about the content of their PDF documents.
![image](https://private-user-images.githubusercontent.com/101057653/296283473-15fe59a5-cf8a-4b9b-8e84-536a99bf1cfe.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1MTczODMsIm5iZiI6MTczOTUxNzA4MywicGF0aCI6Ii8xMDEwNTc2NTMvMjk2MjgzNDczLTE1ZmU1OWE1LWNmOGEtNGI5Yi04ZTg0LTUzNmE5OWJmMWNmZS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE0JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxNFQwNzExMjNaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT0zN2U5ZDFlOTk4NjYyZjU0Y2VmOGI2NDFkZDFiM2ZiYjg2NmFjZjRlYmY4MjViNDEwZTIwZTljNDg3YWVhYjdhJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.lLjpHKn0HySw6laRTuZZQJE5GC922mzywAODb_gXoxI)
- Developed using the Streamlit library for a user-friendly experience.
- Users can ask questions about the content of uploaded PDF files.
- Extracts text from PDF files using PyPDF2.
- Splits extracted text into manageable chunks.
- Leverages Google Generative AI Embeddings for converting text into vectors.
- Applies FAISS (Facebook AI Similarity Search) to create a vector store/index of text chunks.
- Implements a conversational chain for question-answering using the Gemini Generative AI model.
- Configures the prompt template for providing context and framing questions.
- Users upload PDF files and ask questions through the interface.
- Text is extracted from PDFs, split into chunks, and converted into vectors.
- The conversational chain processes user input, searches for similar text chunks, and generates responses.
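The chunk-and-retrieve steps above can be illustrated with a small, dependency-free sketch. It is not the project's code: the function names, chunk sizes, and sample chunks are hypothetical, and a toy word-count vector with cosine similarity stands in for Google Generative AI Embeddings and the FAISS index.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question (stand-in for a FAISS search)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical chunks, as if already extracted from uploaded PDFs.
chunks = [
    "FAISS builds an index over embedding vectors.",
    "Streamlit renders the upload widget in the sidebar.",
    "PyPDF2 extracts text from each page of a PDF.",
]
best = top_chunks("How is text extracted from a PDF page?", chunks, k=1)
print(best)
```

Overlapping windows reduce the chance that a sentence relevant to the question is cut in half at a chunk boundary. In the real application the retrieved chunks would be passed to the Gemini model as context.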
- `main()` function: Sets up the Streamlit interface and handles user input.
- `get_pdf_text(pdf_docs)` function: Extracts text from PDF files.
- `get_text_chunks(text)` function: Splits text into manageable chunks.
- `get_vector_store(text_chunks)` function: Creates a vector store/index from text chunks.
- `get_conversational_chain()` function: Configures the conversational chain for question-answering.
- `user_input(user_question)` function: Processes user input and generates responses.
- Environment variables: Utilizes the `dotenv` library to securely load the Google API key.
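As a rough illustration of the prompt that `get_conversational_chain()` configures, the sketch below builds a context-plus-question prompt with plain string formatting. The template wording, the `build_prompt` helper, and the chunk separator are assumptions made for illustration; the project itself sets up its prompt through LangChain, and its actual template text is not shown here.

```python
# Hypothetical template: supplies retrieved context, then frames the question.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, say so.\n\n"
    "Context:\n{context}\n\n"
    "Question:\n{question}\n\n"
    "Answer:"
)


def build_prompt(context_chunks, question):
    """Join the retrieved chunks and fill them into the template."""
    context = "\n---\n".join(chunk.strip() for chunk in context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question.strip())


prompt = build_prompt(
    ["PyPDF2 extracts text from PDF pages."],
    "Which library reads the PDF?",
)
print(prompt)
```

Telling the model to answer only from the supplied context is a common way to discourage answers that are not supported by the uploaded PDFs.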
- Upload PDFs: Use the sidebar to upload one or more PDF files.
- Ask a Question: Enter your question in the provided text input.
- Submit & Process: Click the button to initiate the processing of PDFs and question-answering.
- View Response: The system generates a response based on the input question and the content of the PDFs.
- Streamlit
- PyPDF2
- LangChain
- Google Generative AI
- FAISS
- Dotenv
- Install Dependencies: Ensure the required Python packages are installed.
- Set up Google API Key: Store the Google API key in a `.env` file, which the `dotenv` library loads at startup.
- Run the Application: Execute the script to launch the Streamlit interface.
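For the API key step, a minimal sketch of a startup check is shown below. The variable name `GOOGLE_API_KEY` and the `get_api_key` helper are assumptions, not taken from the project; in the application, `load_dotenv()` from the python-dotenv package would typically be called first so that values from the `.env` file are present in the environment.

```python
import os


def get_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    `name` is an assumed variable name; use whatever key your .env file defines.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment")
    return key


# Example with an explicit mapping instead of the real environment.
print(get_api_key({"GOOGLE_API_KEY": "example-key"}))
```

Failing early with an explicit message tends to be easier to debug than an authentication error raised later by the model client. Keep the `.env` file out of version control so the key is not committed.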
Feel free to explore and enhance the functionalities of this project based on your requirements.
PDFGemini - Unleash the Power of Conversational PDF Exploration! 💬✨