metadata
license: apache-2.0
title: Multi-Model-Rag
sdk: streamlit
emoji: 📚
colorFrom: gray
colorTo: indigo

📄 Multi-Modal RAG PDF Chatbot

A Streamlit application that lets you upload a PDF, ask questions about its content, and get answers grounded in the document through a Multi-Modal Retrieval-Augmented Generation (RAG) pipeline. Generation uses the Gemma-2 9B model served through Groq.


🚀 Features

  • 📁 Upload any PDF
  • 🔍 Intelligent chunking and embedding
  • 🧠 Ask natural language questions about your PDF
  • ⚡ Powered by FAISS + HuggingFace + Groq LLM
  • 🧠 Caches the processed PDF in the session so it isn't reprocessed on every query

🛠️ Installation (with venv)

  1. Clone the repo:
git clone https://github.com/Warishayat/Multimodel-Rag-Application01.git
cd Multimodel-Rag-Application01
  2. Create and activate a virtual environment:
python -m venv venv
# Activate:
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set up your .env file:

Create a .env file in the root directory:

GROQ_API_KEY=your_groq_api_key_here
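A minimal sketch of how the key might be read and validated at startup. The helper name `get_groq_api_key` is an assumption, not something taken from the repository. In the app you would typically call `load_dotenv()` from python-dotenv first so that the `.env` values are copied into the process environment.

```python
import os


def get_groq_api_key(env=None) -> str:
    """Return GROQ_API_KEY, or raise a clear error if it is missing.

    Hypothetical helper. `env` defaults to os.environ; passing a plain
    dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    # Treat the README placeholder as "not configured".
    if not key or key == "your_groq_api_key_here":
        raise RuntimeError(
            "GROQ_API_KEY is not set; add it to the .env file in the project root."
        )
    return key


# Typical use in the app (requires python-dotenv):
#   from dotenv import load_dotenv
#   load_dotenv()
#   api_key = get_groq_api_key()
```

Failing early with a readable message is usually easier to debug than an authentication error raised later by the LLM client.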

📦 Project Structure

📁 Multimodel-Rag-Application01
├── main.py                # Streamlit frontend
├── pdfparsing.py          # PDF parser using pymupdf4llm
├── Datapreprocessing.py   # Chunking & text cleaning
├── vectorstore.py         # Embedding & FAISS logic
├── .env                   # API keys
├── requirements.txt       # Python dependencies
└── README.md              # You're here!

▢️ Run the App

streamlit run main.py

Then open http://localhost:8501 in your browser.


🧪 Example Queries

After uploading a PDF, try asking:

  • "What is the summary of section 3?"
  • "List all benchmarks mentioned."
  • "How is this model different from others?"
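Behind each question, a RAG pipeline embeds the query and retrieves the chunks whose embeddings are most similar to it. The sketch below shows that idea with a brute-force cosine-similarity search in plain Python. It is an illustration under assumptions, not the app's implementation: the app delegates this search to FAISS, whose index may use a different metric, and the names `cosine_similarity` and `top_k_chunks` are hypothetical.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose vectors are most similar to query_vec."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,  # highest similarity first
    )
    return [chunk for chunk, _ in scored[:k]]
```

The retrieved chunks are then passed to the LLM as context alongside the question.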

💡 Tips

  • The PDF is processed only once per session; the result is cached in st.session_state.
  • Text is chunked with RecursiveCharacterTextSplitter.
  • Embeddings are generated with HuggingFaceEmbeddings.
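The once-per-session behaviour can be sketched as a small cache keyed by a hash of the uploaded file. This is an assumption about how such caching might look, not the repository's code: `get_or_build_index`, the keys `pdf_hash` and `index`, and the `build_index` callback are hypothetical names. `state` can be `st.session_state`, which supports dict-style access, or a plain dict.

```python
import hashlib


def get_or_build_index(state, pdf_bytes: bytes, build_index):
    """Return the cached index for this PDF, rebuilding only when the file changes.

    state       -- dict-like store, e.g. st.session_state (assumed usage)
    pdf_bytes   -- raw bytes of the uploaded PDF
    build_index -- callable that parses, chunks, and embeds the PDF
    """
    file_hash = hashlib.sha256(pdf_bytes).hexdigest()
    if state.get("pdf_hash") != file_hash:
        # New or different file: run the expensive pipeline once.
        state["index"] = build_index(pdf_bytes)
        state["pdf_hash"] = file_hash
    return state["index"]
```

Because Streamlit reruns the script on every interaction, keying the cache on the file hash avoids re-embedding the same PDF for each question while still picking up a newly uploaded file.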

📋 Requirements

Make sure your requirements.txt includes at least:

streamlit
python-dotenv
langchain
langchain-community
langchain-groq
faiss-cpu
pymupdf4llm
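To confirm that the packages listed above are installed in the active virtual environment, a small check like the following can help. It is a sketch using only the standard library; the helper name `missing_packages` is an assumption.

```python
from importlib import metadata

# Distribution names as they appear in requirements.txt.
REQUIRED = [
    "streamlit",
    "python-dotenv",
    "langchain",
    "langchain-community",
    "langchain-groq",
    "faiss-cpu",
    "pymupdf4llm",
]


def missing_packages(names=REQUIRED):
    """Return the distribution names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Calling `missing_packages()` returns an empty list when everything is installed; otherwise it lists what still needs `pip install`.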

📬 Credits

Built with ❤️ by Waris Hayat Abbasi.