Patryk Ptasiński committed
Commit b366822 · 1 Parent(s): cc86f1b

Add 15+ embedding models with dropdown selector and comprehensive API support

Files changed (2)
  1. CLAUDE.md +13 -8
  2. app.py +114 -24
CLAUDE.md CHANGED
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-This is a Hugging Face Spaces application that provides text embeddings using the Nomic AI model (nomic-embed-text-v1.5). It runs on CPU and provides both a web interface and API endpoints for generating text embeddings.
+This is a Hugging Face Spaces application that provides text embeddings using 15+ state-of-the-art embedding models including Nomic, BGE, Snowflake Arctic, IBM Granite, and sentence-transformers models. It runs on CPU and provides both a web interface and API endpoints for generating text embeddings with model selection.
 
 ## Key Commands
 
@@ -29,11 +29,12 @@ huggingface-cli login
 ## Architecture
 
 The application consists of a single `app.py` file with:
-- **Model Initialization**: SentenceTransformer with `device='cpu'` (line 10)
-- **FastAPI App**: Direct HTTP endpoint at `/embed` (lines 13, 21-46)
-- **Embedding Function**: Simple wrapper that calls model.encode() (lines 16-17)
-- **Gradio Interface**: UI components and API endpoint configuration (lines 49-122)
-- **Dual Server**: FastAPI mounted with Gradio using uvicorn (lines 126-129)
+- **Model Configuration**: Dictionary of 15+ embedding models with trust_remote_code settings (lines 10-26)
+- **Model Caching**: Dynamic model loading with caching to avoid reloading (lines 32-42)
+- **FastAPI App**: Direct HTTP endpoints at `/embed` and `/models` (lines 44, 57-102)
+- **Embedding Function**: Multi-model wrapper that calls model.encode() (lines 49-53)
+- **Gradio Interface**: UI with model dropdown selector and API endpoint (lines 106-135)
+- **Dual Server**: FastAPI mounted with Gradio using uvicorn (lines 214-219)
 
 ## Important Configuration Details
 
@@ -48,16 +49,20 @@ Two options for API access:
 
 1. **Direct FastAPI endpoint** (no queue):
 ```bash
+# List models
+curl https://ipepe-nomic-embeddings.hf.space/models
+
+# Generate embedding with specific model
 curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
   -H "Content-Type: application/json" \
-  -d '{"text": "your text"}'
+  -d '{"text": "your text", "model": "mixedbread-ai/mxbai-embed-large-v1"}'
 ```
 
 2. **Gradio client** (handles queue automatically):
 ```python
 from gradio_client import Client
 client = Client("ipepe/nomic-embeddings")
-result = client.predict("text to embed", api_name="/predict")
+result = client.predict("text to embed", "model-name", api_name="/predict")
 ```
 
 ## Deployment Notes
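Review note: the `/models` and `/embed` endpoints documented in the CLAUDE.md change can also be called without extra dependencies. The following is a minimal sketch using only the Python standard library, assuming the endpoints accept and return the JSON shapes shown in this commit. The helper names `build_embed_payload`, `get_json`, `post_json`, and `demo` are hypothetical and not part of the repository.

```python
import json
import urllib.request
from typing import Optional, Sequence

# Space URL as it appears in the commit's curl examples.
BASE_URL = "https://ipepe-nomic-embeddings.hf.space"


def build_embed_payload(text: str, model: Optional[str] = None,
                        available: Optional[Sequence[str]] = None) -> dict:
    """Build the JSON body for POST /embed (hypothetical helper).

    If `available` is given, reject unknown model names client-side
    instead of waiting for the server's 400 response.
    """
    if not text:
        raise ValueError("text must be non-empty")
    if model is not None and available is not None and model not in available:
        raise ValueError(f"unknown model: {model!r}")
    payload = {"text": text}
    if model is not None:
        payload["model"] = model  # omitted -> server falls back to its default
    return payload


def get_json(url: str) -> dict:
    """GET a URL and decode the JSON response."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return json.load(resp)


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def demo() -> None:
    """Performs real network calls; not invoked automatically."""
    models = get_json(f"{BASE_URL}/models")["models"]
    body = build_embed_payload("your text", "BAAI/bge-small-en-v1.5", models)
    result = post_json(f"{BASE_URL}/embed", body)
    print(result["model"], result["dim"])
```

Calling `demo()` would list the models and request one embedding. Validating the model name against the `/models` response before posting is a design choice of this sketch, not something the commit requires.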
app.py CHANGED
@@ -6,14 +6,52 @@ from fastapi import FastAPI
 from fastapi.responses import JSONResponse
 from sentence_transformers import SentenceTransformer
 
-# Initialize model
-model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True, device='cpu')
+# Available models
+MODELS = {
+    "nomic-ai/nomic-embed-text-v1.5": {"trust_remote_code": True},
+    "nomic-ai/nomic-embed-text-v1": {"trust_remote_code": True},
+    "mixedbread-ai/mxbai-embed-large-v1": {"trust_remote_code": False},
+    "BAAI/bge-m3": {"trust_remote_code": False},
+    "sentence-transformers/all-MiniLM-L6-v2": {"trust_remote_code": False},
+    "sentence-transformers/all-mpnet-base-v2": {"trust_remote_code": False},
+    "Snowflake/snowflake-arctic-embed-m": {"trust_remote_code": False},
+    "Snowflake/snowflake-arctic-embed-l": {"trust_remote_code": False},
+    "Snowflake/snowflake-arctic-embed-m-v2.0": {"trust_remote_code": False},
+    "BAAI/bge-large-en-v1.5": {"trust_remote_code": False},
+    "BAAI/bge-base-en-v1.5": {"trust_remote_code": False},
+    "BAAI/bge-small-en-v1.5": {"trust_remote_code": False},
+    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2": {"trust_remote_code": False},
+    "ibm-granite/granite-embedding-30m-english": {"trust_remote_code": False},
+    "ibm-granite/granite-embedding-278m-multilingual": {"trust_remote_code": False},
+}
+
+# Model cache
+loaded_models = {}
+current_model_name = "nomic-ai/nomic-embed-text-v1.5"
+
+# Initialize default model
+def load_model(model_name: str):
+    global loaded_models
+    if model_name not in loaded_models:
+        config = MODELS.get(model_name, {})
+        loaded_models[model_name] = SentenceTransformer(
+            model_name,
+            trust_remote_code=config.get("trust_remote_code", False),
+            device='cpu'
+        )
+    return loaded_models[model_name]
+
+# Load default model
+model = load_model(current_model_name)
 
 # Create FastAPI app
 fastapi_app = FastAPI()
 
 
-def embed(document: str):
+def embed(document: str, model_name: str = None):
+    if model_name and model_name in MODELS:
+        selected_model = load_model(model_name)
+        return selected_model.encode(document)
     return model.encode(document)
 
 
@@ -23,20 +61,28 @@ async def embed_text(data: Dict[str, Any]):
     """Direct API endpoint for text embedding without queue"""
     try:
         text = data.get("text", "")
+        model_name = data.get("model", current_model_name)
+
         if not text:
             return JSONResponse(
                 status_code=400,
                 content={"error": "No text provided"}
             )
 
+        if model_name not in MODELS:
+            return JSONResponse(
+                status_code=400,
+                content={"error": f"Model '{model_name}' not supported. Available models: {list(MODELS.keys())}"}
+            )
+
         # Generate embedding
-        embedding = model.encode(text)
+        embedding = embed(text, model_name)
 
         return JSONResponse(
             content={
                 "embedding": embedding.tolist(),
                 "dim": len(embedding),
-                "model": "nomic-embed-text-v1.5"
+                "model": model_name
             }
         )
     except Exception as e:
@@ -46,9 +92,28 @@ async def embed_text(data: Dict[str, Any]):
         )
 
 
-with gr.Blocks(title="Nomic Text Embeddings") as app:
-    gr.Markdown("# Nomic Text Embeddings v1.5")
-    gr.Markdown("Generate embeddings for your text using the nomic-embed-text-v1.5 model.")
+@fastapi_app.get("/models")
+async def list_models():
+    """List available embedding models"""
+    return JSONResponse(
+        content={
+            "models": list(MODELS.keys()),
+            "default": current_model_name
+        }
+    )
+
+
+with gr.Blocks(title="Multi-Model Text Embeddings") as app:
+    gr.Markdown("# Multi-Model Text Embeddings")
+    gr.Markdown("Generate embeddings for your text using 15+ state-of-the-art embedding models from Nomic, BGE, Snowflake, IBM Granite, and more.")
+
+    # Model selector dropdown
+    model_dropdown = gr.Dropdown(
+        choices=list(MODELS.keys()),
+        value=current_model_name,
+        label="Select Embedding Model",
+        info="Choose the embedding model to use"
+    )
 
     # Create an input text box
     text_input = gr.Textbox(label="Enter text to embed", placeholder="Type or paste your text here...")
@@ -60,27 +125,38 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     submit_btn = gr.Button("Generate Embedding", variant="primary")
 
     # Handle both button click and text submission
-    submit_btn.click(embed, inputs=text_input, outputs=output, api_name="predict")
-    text_input.submit(embed, inputs=text_input, outputs=output)
+    submit_btn.click(embed, inputs=[text_input, model_dropdown], outputs=output, api_name="predict")
+    text_input.submit(embed, inputs=[text_input, model_dropdown], outputs=output)
 
     # Add API usage guide
     gr.Markdown("## API Usage")
     gr.Markdown("""
     You can use this API in two ways: via the direct FastAPI endpoint or through Gradio clients.
 
+    ### List Available Models
+    ```bash
+    curl https://ipepe-nomic-embeddings.hf.space/models
+    ```
+
     ### Direct API Endpoint (No Queue!)
     ```bash
+    # Default model (nomic-ai/nomic-embed-text-v1.5)
     curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
       -H "Content-Type: application/json" \
       -d '{"text": "Your text to embed goes here"}'
+
+    # With specific model
+    curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
+      -H "Content-Type: application/json" \
+      -d '{"text": "Your text to embed goes here", "model": "sentence-transformers/all-MiniLM-L6-v2"}'
     ```
 
     Response format:
     ```json
     {
       "embedding": [0.123, -0.456, ...],
-      "dim": 768,
-      "model": "nomic-embed-text-v1.5"
+      "dim": 384,
+      "model": "sentence-transformers/all-MiniLM-L6-v2"
     }
     ```
 
@@ -88,9 +164,17 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     ```python
     import requests
 
+    # List available models
+    models = requests.get("https://ipepe-nomic-embeddings.hf.space/models").json()
+    print(models["models"])
+
+    # Generate embedding with specific model
     response = requests.post(
         "https://ipepe-nomic-embeddings.hf.space/embed",
-        json={"text": "Your text to embed goes here"}
+        json={
+            "text": "Your text to embed goes here",
+            "model": "BAAI/bge-small-en-v1.5"
+        }
     )
     result = response.json()
     embedding = result["embedding"]
@@ -103,22 +187,28 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     client = Client("ipepe/nomic-embeddings")
     result = client.predict(
         "Your text to embed goes here",
+        "nomic-ai/nomic-embed-text-v1.5", # model selection
         api_name="/predict"
     )
     print(result) # Returns the embedding array
     ```
 
-    ### JavaScript/Node.js Example
-    ```javascript
-    // Direct API
-    const response = await fetch('https://ipepe-nomic-embeddings.hf.space/embed', {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ text: 'Your text to embed goes here' })
-    });
-    const result = await response.json();
-    console.log(result.embedding);
-    ```
+    ### Available Models
+    - `nomic-ai/nomic-embed-text-v1.5` (default) - High-performing open embedding model with large token context
+    - `nomic-ai/nomic-embed-text-v1` - Previous version of Nomic embedding model
+    - `mixedbread-ai/mxbai-embed-large-v1` - State-of-the-art large embedding model from mixedbread.ai
+    - `BAAI/bge-m3` - Multi-functional, multi-lingual, multi-granularity embedding model
+    - `sentence-transformers/all-MiniLM-L6-v2` - Fast, small embedding model for general use
+    - `sentence-transformers/all-mpnet-base-v2` - Balanced performance embedding model
+    - `Snowflake/snowflake-arctic-embed-m` - Medium-sized Arctic embedding model
+    - `Snowflake/snowflake-arctic-embed-l` - Large Arctic embedding model
+    - `Snowflake/snowflake-arctic-embed-m-v2.0` - Latest Arctic embedding with multilingual support
+    - `BAAI/bge-large-en-v1.5` - Large BGE embedding model for English
+    - `BAAI/bge-base-en-v1.5` - Base BGE embedding model for English
+    - `BAAI/bge-small-en-v1.5` - Small BGE embedding model for English
+    - `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` - Multilingual paraphrase model
+    - `ibm-granite/granite-embedding-30m-english` - IBM Granite 30M English embedding model
+    - `ibm-granite/granite-embedding-278m-multilingual` - IBM Granite 278M multilingual embedding model
     """)
 
 if __name__ == '__main__':
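Review note: the `loaded_models` dictionary in this commit keeps every model that has ever been requested, so memory use on a CPU Space can grow as more of the listed models are exercised. One possible variation, which is not what the commit does, is a cache that holds at most a fixed number of models and evicts the least recently used one. The sketch below assumes a hypothetical `ModelCache` class with an injected `loader` callable, so it can be tried without downloading any model.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class ModelCache:
    """Hold at most `max_models` loaded models, evicting the least recently used.

    Hypothetical helper for illustration; `loader` is any callable that
    turns a model name into a loaded model object.
    """

    def __init__(self, loader: Callable[[str], Any], max_models: int = 3):
        if max_models < 1:
            raise ValueError("max_models must be at least 1")
        self._loader = loader
        self._max_models = max_models
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        """Return the cached model for `name`, loading it on a miss."""
        if name in self._models:
            self._models.move_to_end(name)  # mark as most recently used
            return self._models[name]
        model = self._loader(name)
        self._models[name] = model
        if len(self._models) > self._max_models:
            self._models.popitem(last=False)  # drop the least recently used
        return model

    def loaded(self) -> List[str]:
        """Names of cached models, least recently used first."""
        return list(self._models)
```

In `app.py`, the loader could wrap the same `SentenceTransformer(...)` call that `load_model` already makes, with `trust_remote_code` looked up in `MODELS`. The value of `max_models` would need to be chosen against the memory actually available to the Space.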