Mini RAG Demo – Retrieval-Augmented Generation on Wikipedia

This is a lightweight Retrieval-Augmented Generation (RAG) app built with Gradio. It combines semantic search over a mini-Wikipedia corpus (rag-datasets/rag-mini-wikipedia) with cross-encoder reranking and language-model generation to answer natural-language questions from real documents.


What It Does

  • Embeds a query using a SentenceTransformer (all-MiniLM-L6-v2)
  • Retrieves the top-5 most semantically similar Wikipedia passages using FAISS
  • Reranks them using a CrossEncoder model (cross-encoder/ms-marco-MiniLM-L-6-v2)
  • Generates an answer using a Hugging Face language model (a sketch of these steps follows below)
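
A rough query-time sketch of those four steps is shown below. Variable names, the prompt format, and the generation call are illustrative, not the app's exact code; the index and passages objects are built as in the indexing sketch further down.

```python
from sentence_transformers import SentenceTransformer, CrossEncoder
from transformers import pipeline
import numpy as np

embedder = SentenceTransformer("all-MiniLM-L6-v2")
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

def answer(query, index, passages, k=5):
    # 1. Embed the query and retrieve the top-k nearest passages from FAISS
    query_emb = embedder.encode([query], convert_to_numpy=True).astype(np.float32)
    _, ids = index.search(query_emb, k)
    candidates = [passages[i] for i in ids[0]]

    # 2. Rerank the candidates with the cross-encoder, best first
    scores = reranker.predict([(query, p) for p in candidates])
    reranked = [p for _, p in sorted(zip(scores, candidates), reverse=True)]

    # 3. Generate an answer conditioned on the top reranked passages
    prompt = (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n".join(reranked[:3]) +
        f"\n\nQuestion: {query}\nAnswer:"
    )
    return generator(prompt, max_new_tokens=128)[0]["generated_text"]
```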

Tech Stack

  • Gradio – Web interface
  • FAISS – Fast dense vector retrieval
  • Sentence-Transformers – Embedding & reranking
  • Transformers (Hugging Face) – Language model for generation
  • Hugging Face Datasets – Mini Wikipedia corpus (rag-datasets/rag-mini-wikipedia)
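
The corpus has to be embedded and indexed before queries can be served. A minimal sketch of that step is below; the "text-corpus" config and "passage" column follow the dataset card for rag-datasets/rag-mini-wikipedia, so verify them if the dataset layout changes. This produces the index and passages used in the query sketch above.

```python
import faiss
import numpy as np
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

# Load the mini Wikipedia passages (config/column names per the dataset card)
corpus = load_dataset("rag-datasets/rag-mini-wikipedia", "text-corpus")["passages"]
passages = corpus["passage"]

# Embed every passage and add the vectors to a flat (exact-search) FAISS index
embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(passages, convert_to_numpy=True, show_progress_bar=True)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings.astype(np.float32))
```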

Models Used

| Purpose    | Model                                                            |
|------------|------------------------------------------------------------------|
| Embedding  | all-MiniLM-L6-v2                                                 |
| Reranking  | cross-encoder/ms-marco-MiniLM-L-6-v2                             |
| Generation | mistralai/Mistral-7B-Instruct-v0.2 (optional) or a smaller model |
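
If the 7B instruct model is too heavy for the available hardware, a smaller model can be swapped in through the same Transformers pipeline API. The flan-t5-base choice below is just an example substitute, not necessarily what the Space uses:

```python
from transformers import pipeline

# Lighter-weight generation backend (example model, not the app's default)
generator = pipeline("text2text-generation", model="google/flan-t5-base")
result = generator("Question: Who wrote Hamlet?\nAnswer:", max_new_tokens=64)
print(result[0]["generated_text"])
```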

📦 Running Locally

To run the app locally:

git clone https://huggingface.co/spaces/YOUR_USERNAME/mini-rag-demo
cd mini-rag-demo
pip install -r requirements.txt
python app.py
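
Running python app.py launches the Gradio interface. A minimal version of that wiring, reusing the illustrative answer(), index, and passages names from the sketches above, could look like this:

```python
import gradio as gr

# Minimal Gradio front end around the answer() function sketched above
demo = gr.Interface(
    fn=lambda q: answer(q, index, passages),
    inputs=gr.Textbox(label="Ask a question about the mini Wikipedia corpus"),
    outputs=gr.Textbox(label="Answer"),
    title="Mini RAG Demo",
)
demo.launch()
```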