---
license: apache-2.0
title: Multi-Model-Rag
sdk: streamlit
emoji: π
colorFrom: gray
colorTo: indigo
---
## Multi-Modal RAG PDF Chatbot
A Streamlit application that lets you **upload a PDF**, ask questions about its content, and get answers grounded in the document through a **Multi-Modal Retrieval-Augmented Generation (RAG)** pipeline powered by the **Gemma 2 9B model served via Groq**.
---
### Features
- Upload any PDF
- Automatic chunking and embedding of the extracted text
- Ask natural language questions about your PDF
- Powered by FAISS, Hugging Face embeddings, and a Groq-hosted LLM
- Caches the processed PDF in the session so it isn't reprocessed on every query
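
The session caching in the last feature can be sketched as follows. This is a minimal illustration, not the code in `main.py`: the helper `get_or_build_index`, the `pdf_digest` and `index` keys, and the `build_index` callback are hypothetical names. A plain dict stands in for `st.session_state`; in the app you would pass `st.session_state` and the uploaded file's bytes instead.

```python
import hashlib


def get_or_build_index(state, pdf_bytes, build_index):
    """Return a cached index, rebuilding it only when the PDF content changes.

    `state` is any dict-like store (hypothetically `st.session_state`).
    `build_index` is a hypothetical callback that parses, chunks, and embeds the PDF.
    """
    # Fingerprint the upload so a different PDF invalidates the cache.
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    if "pdf_digest" not in state or state["pdf_digest"] != digest:
        state["index"] = build_index(pdf_bytes)
        state["pdf_digest"] = digest
    return state["index"]


if __name__ == "__main__":
    session = {}  # stand-in for st.session_state

    def fake_build(data):
        # Placeholder for the expensive parse/chunk/embed step.
        print("building index...")
        return {"size": len(data)}

    get_or_build_index(session, b"same pdf", fake_build)  # builds the index
    get_or_build_index(session, b"same pdf", fake_build)  # reuses the cached index
```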
---
### Installation (with `venv`)
1. **Clone the repo:**
```bash
git clone https://github.com/Warishayat/Multimodel-Rag-Application01.git
cd Multimodel-Rag-Application01
```
2. **Create and activate a virtual environment:**
```bash
python -m venv venv
# Activate:
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
```
3. **Install dependencies:**
```bash
pip install -r requirements.txt
```
4. **Set up your `.env` file:**
Create a `.env` file in the root directory:
```
GROQ_API_KEY=your_groq_api_key_here
```
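
At startup the app needs to read this key. A minimal sketch of one way to do that, assuming `python-dotenv` is installed as listed in the requirements. The helper name `get_groq_api_key` is hypothetical and is not taken from `main.py`; it fails early with a clear message if the key is missing or still set to the placeholder.

```python
import os

try:
    # python-dotenv copies the entries of a local .env file into os.environ.
    from dotenv import load_dotenv

    load_dotenv()
except ImportError:
    # Without python-dotenv, the key must already be set in the environment.
    pass


def get_groq_api_key(env=None):
    """Return the Groq API key, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key or key == "your_groq_api_key_here":
        raise RuntimeError(
            "GROQ_API_KEY is not set. Add it to the .env file in the project root."
        )
    return key
```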
---
### Project Structure
```
Multimodel-Rag-Application01/
├── main.py                # Streamlit frontend
├── pdfparsing.py          # PDF parser using pymupdf4llm
├── Datapreprocessing.py   # Chunking & text cleaning
├── vectorstore.py         # Embedding & FAISS logic
├── .env                   # API keys
├── requirements.txt       # Python dependencies
└── README.md              # You're here!
```
---
### Run the App
```bash
streamlit run main.py
```
Then open `http://localhost:8501` in your browser.
---
### Example Queries
After uploading a PDF, try asking:
- "What is the summary of section 3?"
- "List all benchmarks mentioned."
- "How is this model different from others?"
---
### Tips
- The PDF is processed only once per session; the result is kept in `st.session_state`.
- Text is split into chunks with `RecursiveCharacterTextSplitter`.
- Embeddings are generated with `HuggingFaceEmbeddings`.
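
To show the idea behind recursive chunking, here is a simplified pure-Python stand-in. It is not LangChain's `RecursiveCharacterTextSplitter` and leaves out features such as chunk overlap; the function name `split_recursive` and its defaults are assumptions made for illustration. It tries coarse separators first (paragraphs, then lines, then spaces) and falls back to hard cuts only when a piece is still too long.

```python
def split_recursive(text, chunk_size=200, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most chunk_size characters.

    Simplified illustration: pieces are merged greedily on the coarsest
    separator, and oversized pieces are split again with finer separators.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: cut at fixed offsets.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging while the chunk still fits
            continue
        if current.strip():
            chunks.append(current)
        current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # The piece alone is too long: retry with finer separators.
            chunks.extend(split_recursive(part, chunk_size, finer))
    if current.strip():
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "aaa bbb\n\nccc ddd eee\n\nfff"
    for chunk in split_recursive(sample, chunk_size=8):
        print(repr(chunk))
```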
---
### Requirements
Make sure your `requirements.txt` includes at least:
```txt
streamlit
python-dotenv
langchain
langchain-community
langchain-groq
faiss-cpu
pymupdf4llm
sentence-transformers
```
---
### Credits
Built with ❤️ by Waris Hayat Abbasi.
---