metadata
license: apache-2.0
title: Multi-Model-Rag
sdk: streamlit
colorFrom: gray
colorTo: indigo
Multi-Modal RAG PDF Chatbot
A Streamlit application that lets you upload a PDF, ask questions about its content, and get answers grounded in the document through a Multi-Modal Retrieval-Augmented Generation (RAG) pipeline that uses the Gemma-2 9B model served by Groq.
Features
- Upload any PDF
- Intelligent chunking and embedding
- Ask natural-language questions about your PDF
- Powered by FAISS + HuggingFace + Groq LLM
- Caches the processed PDF in the session so it isn't reprocessed on every query
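The session caching in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the helper name `get_or_build_index`, the keys `pdf_digest` and `index`, and the `build_index` callback are all hypothetical. It relies only on `st.session_state` supporting dictionary-style access, so any dict-like object can stand in for it.

```python
import hashlib
from typing import Any, Callable, MutableMapping


def get_or_build_index(
    state: MutableMapping[str, Any],
    pdf_bytes: bytes,
    build_index: Callable[[bytes], Any],
) -> Any:
    """Return the cached index, rebuilding only when the PDF content changes.

    `state` is meant to be st.session_state in the app; any dict-like object
    works. All names here are hypothetical and chosen for illustration.
    """
    # Identify the upload by its content, so re-running the script with the
    # same file does not trigger a rebuild.
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    if "pdf_digest" not in state or state["pdf_digest"] != digest:
        # New or different file: run the expensive parse/chunk/embed step once.
        state["index"] = build_index(pdf_bytes)
        state["pdf_digest"] = digest
    return state["index"]
```

In a Streamlit script this would presumably be called as `get_or_build_index(st.session_state, uploaded_file.getvalue(), build_index)`, where `uploaded_file` comes from `st.file_uploader` and `build_index` wraps whatever parsing and embedding the project performs.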
Installation (with venv)
- Clone the repo:
git clone https://github.com/Warishayat/Multimodel-Rag-Application01.git
cd Multimodel-Rag-Application01
- Create and activate a virtual environment:
python -m venv venv
# Activate:
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Set up your .env file:
Create a .env file in the root directory:
GROQ_API_KEY=your_groq_api_key_here
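Once the .env file exists, the key can be read at startup and checked before any request is sent. The sketch below shows one way to do this, assuming python-dotenv from the requirements list. The function name `get_groq_api_key` is hypothetical, and the import is wrapped so the check still works when the variable is set directly in the environment.

```python
import os


def get_groq_api_key() -> str:
    """Return GROQ_API_KEY, raising early if it is missing or left as the placeholder.

    Hypothetical helper for illustration; not part of the project.
    """
    try:
        from dotenv import load_dotenv  # provided by python-dotenv

        # Looks for a .env file and loads it; by default, variables that are
        # already set in the environment are not overridden.
        load_dotenv()
    except ImportError:
        # python-dotenv is not installed: fall back to the plain environment.
        pass

    key = os.environ.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_groq_api_key_here":
        raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file.")
    return key
```

Failing at startup gives a clearer message than an authentication error surfacing on the first question.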
Project Structure
Multimodel-Rag-Application01
├── main.py # Streamlit frontend
├── pdfparsing.py # PDF parser using pymupdf4llm
├── Datapreprocessing.py # Chunking & text cleaning
├── vectorstore.py # Embedding & FAISS logic
├── .env # API keys
├── requirements.txt # Python dependencies
└── README.md # You're here!
Run the App
streamlit run main.py
Then open http://localhost:8501 in your browser.
Example Queries
After uploading a PDF, try asking:
- "What is the summary of section 3?"
- "List all benchmarks mentioned."
- "How is this model different from others?"
Tips
- PDF is processed only once per session using st.session_state.
- Uses RecursiveCharacterTextSplitter for effective chunking.
- Embedding with HuggingFaceEmbeddings.
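To give a feel for what recursive character splitting does, here is a simplified standard-library sketch of the idea: try coarse separators first (paragraphs), and fall back to finer ones (lines, then words) only for pieces that are still too long. It is not LangChain's implementation and omits chunk overlap; the name `split_recursive` and its defaults are hypothetical.

```python
def split_recursive(
    text: str,
    chunk_size: int = 200,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),
) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Simplified illustration of recursive splitting, without overlap.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # Keep merging neighbouring pieces while they fit in one chunk.
            current = candidate
            continue
        if current.strip():
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: split it with the finer separators.
            chunks.extend(split_recursive(piece, chunk_size, finer))
            current = ""
    if current.strip():
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries first tends to keep related sentences in the same chunk, which is why this family of splitters is a common default for retrieval.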
Requirements
Make sure your requirements.txt includes at least:
streamlit
python-dotenv
langchain
langchain-community
langchain-groq
faiss-cpu
pymupdf4llm
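A quick way to confirm the list above is covered is to compare it against the file. The helper below is a hypothetical sketch (the name `missing_requirements` is not part of the project); it compares package names only and ignores version pins.

```python
import re

# The minimum set named in this README.
REQUIRED = {
    "streamlit",
    "python-dotenv",
    "langchain",
    "langchain-community",
    "langchain-groq",
    "faiss-cpu",
    "pymupdf4llm",
}


def missing_requirements(requirements_text: str) -> list[str]:
    """Return the required package names that the given text does not list."""
    listed = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # Keep only the package name, before any extras, pins, or markers.
        name = re.split(r"[\s<>=!~\[;@]", line, maxsplit=1)[0]
        listed.add(name.lower().replace("_", "-"))
    return sorted(REQUIRED - listed)
```

For example, `missing_requirements(open("requirements.txt").read())` returns an empty list when every package above is present.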
Credits
Built with ❤️ by Waris Hayat Abbasi.