Zwounds committed on
Commit cc432be · verified · 1 Parent(s): 01afcca

Upload README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -103,18 +103,18 @@ This Space demonstrates a Retrieval-Augmented Generation (RAG) application built
 
 **How it works:**
 
-1. **Data Source:** Content extracted from LibGuides (`extracted_content.jsonl`).
-2. **Embedding:** On first startup, the application uses the `BAAI/bge-m3` sentence transformer model (run locally within the Space) to embed the LibGuides content and stores it in a ChromaDB vector database (`./chroma_db`). This database persists if the Space uses persistent storage.
+1. **Data Source:** Pre-computed embeddings (`BAAI/bge-m3`), documents, and metadata loaded from the Hugging Face Dataset `Zwounds/Libguides_Embeddings` (originally sourced from `extracted_content.jsonl`).
+2. **Database Initialization:** On startup, the application downloads the dataset and loads the data into an in-memory ChromaDB collection stored in a temporary directory. This avoids slow re-embedding on every startup.
 3. **Query Processing:**
-   * User queries are optionally expanded using the generation model.
-   * Queries are embedded using the same local `BAAI/bge-m3` model (handled internally by ChromaDB).
-   * ChromaDB performs a similarity search to find relevant text chunks.
+   * User queries are optionally expanded using the generation model (`google/gemma-3-27b-it` via HF API).
+   * Queries are embedded using the local `BAAI/bge-m3` model (loaded into the Space).
+   * ChromaDB performs a similarity search using the query embedding against the pre-computed document embeddings.
 4. **Generation:** The relevant chunks and the original query are passed to the `google/gemma-3-27b-it` model via the Hugging Face Inference API to generate a final answer.
 
 **Configuration:**
 
-* **Embedding Model:** `BAAI/bge-m3` (local via `sentence-transformers` & ChromaDB)
-* **Generation Model:** `google/gemma-3-27b-it` (via HF Inference API)
+* **Embedding:** Pre-computed `BAAI/bge-m3` embeddings loaded from HF Dataset `Zwounds/Libguides_Embeddings`. Query embedding uses local `BAAI/bge-m3`.
+* **Generation Model:** `google/gemma-3-27b-it` (via HF Inference API).
 * **Requires Secret:** A Hugging Face User Access Token must be added as a Space Secret named `HF_TOKEN`.
 
-**Note:** The initial embedding process when the Space first starts (or restarts without persistent storage) can take some time as the model needs to process all the documents.
+**Note:** Startup involves downloading the dataset and loading it into the ChromaDB collection, which is much faster than re-embedding all documents.
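The new startup path (steps 1–2 in the revised README) amounts to downloading the dataset and bulk-inserting its stored vectors into ChromaDB. A minimal sketch, assuming the dataset exposes `embedding`, `document`, and `metadata` columns (the actual column names are not shown in this commit):

```python
import tempfile

import chromadb
from datasets import load_dataset

# Download the pre-computed BAAI/bge-m3 embeddings from the Hub.
ds = load_dataset("Zwounds/Libguides_Embeddings", split="train")

# Keep ChromaDB in a throwaway directory: nothing needs to survive restarts.
chroma_client = chromadb.PersistentClient(path=tempfile.mkdtemp())
collection = chroma_client.get_or_create_collection("libguides")

# Bulk-insert documents with their stored vectors -- no re-embedding needed.
collection.add(
    ids=[str(i) for i in range(len(ds))],
    embeddings=ds["embedding"],
    documents=ds["document"],
    metadatas=ds["metadata"],
)
```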
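At query time (step 3), the query is embedded locally and searched against those stored vectors. A sketch continuing from the `collection` built above; `retrieve` and `k` are illustrative names, not from the commit:

```python
from sentence_transformers import SentenceTransformer

# Same model that produced the stored document embeddings.
embedder = SentenceTransformer("BAAI/bge-m3")

def retrieve(query: str, k: int = 5) -> list[str]:
    # Embed the query locally, then let ChromaDB run the similarity
    # search against the pre-computed document embeddings.
    query_embedding = embedder.encode(query).tolist()
    results = collection.query(query_embeddings=[query_embedding], n_results=k)
    return results["documents"][0]
```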
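Step 4 then hands the retrieved chunks plus the original query to `google/gemma-3-27b-it` over the HF Inference API, authenticated with the `HF_TOKEN` secret. A sketch; the prompt wording and `max_tokens` value are assumptions:

```python
import os

from huggingface_hub import InferenceClient

inference_client = InferenceClient(token=os.environ["HF_TOKEN"])

def answer(query: str, chunks: list[str]) -> str:
    # Stuff the retrieved chunks into the prompt as grounding context.
    context = "\n\n".join(chunks)
    response = inference_client.chat_completion(
        model="google/gemma-3-27b-it",
        messages=[{
            "role": "user",
            "content": f"Answer the question using only this context:\n\n"
                       f"{context}\n\nQuestion: {query}",
        }],
        max_tokens=512,
    )
    return response.choices[0].message.content
```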