license: apache-2.0
short_description: Small CNN
---

# 🔍 MiniLM Semantic FAQ Search — Smart, Lightning-Fast Knowledge Retrieval

[Hugging Face Space](https://huggingface.co/spaces/your-username/minilm-semantic-search)
[Gradio](https://gradio.app)
[all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
[License: Apache-2.0](LICENSE)

---

## 🚀 TL;DR

**Ask a question → get the three most relevant answers from a curated FAQ — all in real time on a free CPU-only Hugging Face Space.**

Powered by the _all-MiniLM-L6-v2_ sentence-transformer (~90 MB, < 1 GB RAM) and a minimalist Gradio 5 UI.

---

## ✨ Why You’ll Love It

| · | Capability | Why It Matters |
|---|------------|----------------|
| ⚡ | **Instant Retrieval** | 50-200 ms response time even on CPU-only hardware. |
| 🧠 | **Semantic Matching** | Goes beyond keywords; understands intent and phrasing. |
| 📈 | **Live Similarity Scores** | Transparent confidence metrics for every hit. |
| 🎛️ | **Interactive Slider** | Choose 1-5 results in a single drag. |
| 🎨 | **Sleek Gradio GUI** | No setup friction — just open a browser and explore. |
| 💸 | **Free-Tier Friendly** | Fits comfortably inside Hugging Face Spaces’ 2 vCPU / 16 GB RAM limit. |
| 🛠️ | **Drop-in Dataset Swap** | Replace `faqs.csv` with thousands of your own Q-A pairs — no retraining required. |

---

## 🏗️ How It Works

1. **Vectorisation**
   Every FAQ question is embedded with `sentence-transformers/all-MiniLM-L6-v2` into a 384-dimensional vector (done once at start-up).

2. **Inference**
   A user query is embedded on the fly and cosine-compared with all FAQ vectors via 🤗 `util.cos_sim`.

3. **Ranking**
   Top-_k_ indices are extracted with PyTorch’s efficient `topk`, then mapped back to the original FAQ rows.

4. **Presentation**
   Gradio displays the question, answer, and similarity score in a responsive dataframe.

> _No database, no external search engine, just straight Python & PyTorch embeddings._

---

## 🖥️ Quick Start (Local Dev, Optional)

```bash
git clone https://github.com/your-username/minilm-semantic-search.git
cd minilm-semantic-search
python -m venv venv && source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
python app.py
```