---
license: apache-2.0
title: Multi-Model-Rag
sdk: streamlit
emoji: π
colorFrom: gray
colorTo: indigo
---
## Multi-Modal RAG PDF Chatbot
A Streamlit application that lets you **upload a PDF**, ask questions about its content, and get answers grounded in the document through a **Multi-Modal Retrieval-Augmented Generation (RAG)** pipeline that uses the **Gemma 2 9B model served by Groq**.
---
### Features
- Upload any PDF
- Splits the text into chunks and embeds them for retrieval
- Ask natural language questions about your PDF
- Powered by FAISS, HuggingFace embeddings, and a Groq-hosted LLM
- Caches the processed PDF in the session so it isn't reprocessed on every query
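
To make the chunk-and-retrieve idea concrete, here is a minimal, dependency-free sketch. It is not the app's code: according to this README the app relies on `RecursiveCharacterTextSplitter`, `HuggingFaceEmbeddings`, and FAISS, whereas this sketch swaps in fixed-size character windows, word-count vectors, and a brute-force cosine ranking. The function names (`chunk_text`, `top_k_chunks`) and the default sizes are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(question: str, chunks: List[str], k: int = 3) -> List[str]:
    """Return the k chunks most similar to the question (brute force, no index)."""
    query = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, bag_of_words(c)), reverse=True)
    return ranked[:k]
```

In a RAG pipeline, the chunks returned by a step like `top_k_chunks` are what gets placed in the LLM prompt alongside the user's question. A vector index such as FAISS serves the same purpose but avoids scanning every chunk on each query.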
---
### 🛠️ Installation (with `venv`)
1. **Clone the repo:**
```bash
git clone https://github.com/Warishayat/Multimodel-Rag-Application01.git
cd Multimodel-Rag-Application01
```
2. **Create and activate a virtual environment:**
```bash
python -m venv venv
# Activate:
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
```
3. **Install dependencies:**
```bash
pip install -r requirements.txt
```
4. **Set up your `.env` file:**
Create a `.env` file in the root directory:
```
GROQ_API_KEY=your_groq_api_key_here
```
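
As a sketch of how the key might be read at startup: `python-dotenv`'s `load_dotenv()` copies the values from `.env` into `os.environ`, after which the key can be validated before any request is made. The helper below is hypothetical (`get_groq_api_key` is not a function from this repo) and uses only the standard library, so it assumes `load_dotenv()` has already been called elsewhere.

```python
import os
from typing import Mapping, Optional


def get_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GROQ_API_KEY, or raise a clear error if it is missing or a placeholder."""
    # Default to the process environment; a mapping can be passed in for testing.
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_groq_api_key_here":
        raise RuntimeError(
            "GROQ_API_KEY is not set. Add it to the .env file in the project root."
        )
    return key
```

Failing early with a readable message is usually easier to debug than an authentication error that surfaces on the first query.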
---
### 📦 Project Structure
```
Multimodel-Rag-Application01/
├── main.py                # Streamlit frontend
├── pdfparsing.py          # PDF parser using pymupdf4llm
├── Datapreprocessing.py   # Chunking & text cleaning
├── vectorstore.py         # Embedding & FAISS logic
├── .env                   # API keys
├── requirements.txt       # Python dependencies
└── README.md              # You're here!
```
---
### ▶️ Run the App
```bash
streamlit run main.py
```
Then open `http://localhost:8501` in your browser.
---
### 🧪 Example Queries
After uploading a PDF, try asking:
- "What is the summary of section 3?"
- "List all benchmarks mentioned."
- "How is this model different from others?"
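---
### 💡 Tips
- The PDF is processed only once per session; the result is kept in `st.session_state`.
- Chunking uses `RecursiveCharacterTextSplitter`, which tries paragraph and line breaks first and falls back to smaller separators only when a chunk is still too large.
- Embeddings are generated with `HuggingFaceEmbeddings`.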
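
The once-per-session behaviour can be sketched as a small cache keyed by a hash of the uploaded file. This is an illustration under assumptions, not the code in `main.py`: the helper name `get_or_build_index`, the state keys, and the `build_index` callback are all hypothetical. In the app, `state` would be `st.session_state`, which supports dictionary-style access.

```python
import hashlib
from typing import Any, Callable, MutableMapping


def get_or_build_index(
    state: MutableMapping[str, Any],
    pdf_bytes: bytes,
    build_index: Callable[[bytes], Any],
) -> Any:
    """Return the cached index for this PDF, building it only when the file changes."""
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    if state.get("pdf_digest") != digest:
        # New or different file: run the expensive parse/chunk/embed step once.
        state["index"] = build_index(pdf_bytes)
        state["pdf_digest"] = digest
    return state["index"]
```

Keying on the file's hash rather than its name means that uploading a different PDF with the same filename still triggers reprocessing, while Streamlit's reruns on each new question reuse the stored index.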
---
### Requirements
Make sure your `requirements.txt` includes at least the following (`sentence-transformers` is needed by `HuggingFaceEmbeddings`):
```txt
streamlit
python-dotenv
langchain
langchain-community
langchain-groq
faiss-cpu
pymupdf4llm
sentence-transformers
```
---
### Credits
Built with ❤️ by Waris Hayat Abbasi.
---