GenAIDevTOProd committed on
Commit 06fbd49 · verified · 1 Parent(s): 35c5459

Update README.md

Files changed (1)
  1. README.md +68 -0
README.md CHANGED
@@ -11,4 +11,72 @@ license: apache-2.0
  short_description: Minimal RAG API with MiniLM embeddings and FAISS
  ---

+ # RAG API (Minimal) - MiniLM + FAISS (Gradio)
+
+ A minimal Retrieval-Augmented Generation (RAG) service built with:
+
+ - **Sentence-Transformers MiniLM** for embeddings
+ - **FAISS** for vector search (cosine similarity)
+ - **Gradio** for both the UI and the API
+
+ ---
+
+ ## Features
+
+ - Ingest documents (one per line) with configurable chunk size and overlap
+ - Retrieve the top-K most relevant chunks by similarity search
+ - Get concise answers composed from the retrieved context
+ - Reset the index at any time
+ - Call endpoints from the **UI or the API** (`/api/ingest`, `/api/answer`, `/api/reset`)
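
The README does not show how ingestion splits a document, so the following is only a sketch of one common approach: a sliding character window whose consecutive windows share `overlap` characters. The helper names `chunk_text` and `chunk_documents` are hypothetical, and treating the chunk size as a character count (rather than tokens or words) is an assumption.

```python
def chunk_text(text: str, chunk_size: int = 256, overlap: int = 32) -> list[str]:
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


def chunk_documents(raw: str, chunk_size: int = 256, overlap: int = 32) -> list[str]:
    """Treat each non-empty line as one document, then chunk every document."""
    docs = [line.strip() for line in raw.splitlines() if line.strip()]
    return [c for doc in docs for c in chunk_text(doc, chunk_size, overlap)]


sample = "PySpark scales ETL across clusters.\nFAISS powers fast vector similarity search used in retrieval."
for chunk in chunk_documents(sample, chunk_size=40, overlap=8):
    print(repr(chunk))
```

A larger overlap reduces the chance that a relevant sentence is cut at a chunk boundary, at the cost of storing more, partly redundant, chunks.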
+
+ ---
+
+ ## Quick Start
+
+ 1. In the Gradio UI: **Load sample docs → Ingest → Ask a query**.
+ 2. For programmatic access, call the API with `curl` or the Python client:
+
+ ```bash
+ # Ingest
+ curl -s -X POST https://<your-space>.hf.space/api/ingest \
+   -H "content-type: application/json" \
+   -d '{"data": ["PySpark scales ETL across clusters.\nFAISS powers fast vector similarity search used in retrieval.", 256, 32]}'
+
+ # Answer
+ curl -s -X POST https://<your-space>.hf.space/api/answer \
+   -H "content-type: application/json" \
+   -d '{"data": ["What does FAISS do?", 5, 1000]}'
+ ```
+
+ ## Python Client
+
+ ```python
+ from gradio_client import Client
+
+ # Replace <your-space> with your Space's subdomain.
+ client = Client("https://<your-space>.hf.space")
+
+ # Ingest one document; 256 and 32 mirror the curl example above.
+ status, size = client.predict("FAISS powers fast vector search.", 256, 32, api_name="/ingest")
+ res = client.predict("What does FAISS do?", 5, 1000, api_name="/answer")
+ print(res["answer"])
+ ```
+
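
Both examples above send the endpoint's inputs as a positional list under a `data` key. As a rough illustration of that request shape using only the standard library, the sketch below builds (but does not send) the same POST. The helper name `build_request` is hypothetical, and the URL still contains the `<your-space>` placeholder from the README, so it must be replaced before anything is actually sent.

```python
import json
import urllib.request

BASE_URL = "https://<your-space>.hf.space"  # placeholder copied from the README


def build_request(endpoint: str, *args) -> urllib.request.Request:
    """Build a POST whose JSON body is {"data": [...positional inputs...]}."""
    body = json.dumps({"data": list(args)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/{endpoint}",
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


req = build_request("answer", "What does FAISS do?", 5, 1000)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# With a real Space URL, the request could then be sent like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The order of the list entries matters: it has to match the order of the inputs that the endpoint expects, exactly as in the `curl` payloads.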
+ ## Tech Stack
+
+ - Embeddings: `sentence-transformers/all-MiniLM-L6-v2` (384-dim)
+ - Vector index: FAISS (`IndexFlatIP` over L2-normalized vectors, so inner product equals cosine similarity)
+ - UI & API: Gradio Blocks
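
An inner-product index over normalized vectors returns cosine similarity because dividing each vector by its L2 norm makes the dot product equal to the cosine of the angle between them. The pure-Python sketch below illustrates that identity with an exhaustive search, which is the same strategy a flat index uses. It is not the Space's code, it uses tiny 2-dimensional vectors instead of 384-dimensional embeddings, and the helper names are hypothetical.

```python
import math


def l2_normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def inner_product(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Textbook cosine similarity, used here only to check the identity."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


def top_k(query: list[float], vectors: list[list[float]], k: int) -> list[tuple[float, int]]:
    """Score every stored vector against the query and keep the k best (score, index) pairs."""
    q = l2_normalize(query)
    scored = [(inner_product(q, l2_normalize(v)), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)
    return scored[:k]


stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for score, index in top_k([2.0, 0.0], stored, k=2):
    print(index, round(score, 4))
```

Note that the query's length does not affect the ranking: `[2.0, 0.0]` and `[1.0, 0.0]` produce the same scores after normalization.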
+
+ ## Notes
+
+ - The index lives in memory only and is lost when the Space sleeps or restarts.
+ - For persistence, extend the app to save to and load from `./data/`.
+ - Demo-focused: fast, lightweight, minimal surface.
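
The persistence note could be approached in several ways. The sketch below is one possibility, assuming the app keeps its chunks and their embedding vectors as plain Python lists: it writes both to a JSON file under a data directory and reads them back at startup. The function names `save_store` and `load_store` and the file name `store.json` are hypothetical. For the FAISS index object itself, the library's own `faiss.write_index` and `faiss.read_index` functions are the more natural fit.

```python
import json
import tempfile
from pathlib import Path


def save_store(chunks: list[str], vectors: list[list[float]], data_dir: str = "./data") -> Path:
    """Write chunks and vectors to <data_dir>/store.json via a temp file and rename."""
    path = Path(data_dir)
    path.mkdir(parents=True, exist_ok=True)
    target = path / "store.json"
    tmp = path / "store.json.tmp"
    tmp.write_text(json.dumps({"chunks": chunks, "vectors": vectors}), encoding="utf-8")
    tmp.replace(target)  # rename so readers never see a half-written file
    return target


def load_store(data_dir: str = "./data") -> tuple[list[str], list[list[float]]]:
    """Return (chunks, vectors), or two empty lists when nothing was saved yet."""
    file = Path(data_dir) / "store.json"
    if not file.exists():
        return [], []
    payload = json.loads(file.read_text(encoding="utf-8"))
    return payload["chunks"], payload["vectors"]


# Round trip in a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    save_store(["FAISS powers fast vector search."], [[0.1, 0.2]], tmp_dir)
    restored_chunks, restored_vectors = load_store(tmp_dir)

print(restored_chunks, restored_vectors)
```

After loading, the vectors would still need to be added back into a fresh index before queries can run.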
+
+ ## Author/Developer: Naga Adithya Kaushik (GenAIDevTOProd)
+ ## AI copilot used in development: Yes (minimal), for debugging, test cases, and experimentation only
+
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference