Add `text-embeddings-inference` tag & snippet

#8
by alvarobartt (HF Staff) - opened
Files changed (1)
  1. README.md +39 -10
README.md CHANGED
@@ -7,6 +7,7 @@ tags:
  - feature-extraction
  - sentence-similarity
  - transformers
+ - text-embeddings-inference
  datasets:
  - flax-sentence-embeddings/stackexchange_xml
  - ms_marco

@@ -115,6 +116,40 @@ for doc, score in doc_score_pairs:
  print(score, doc)
  ```

+ ## Usage (Text Embeddings Inference (TEI))
+
+ [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) is a blazing fast inference solution for text embeddings models.
+
+ - CPU:
+ ```bash
+ docker run -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
+ --model-id sentence-transformers/multi-qa-mpnet-base-dot-v1 \
+ --pooling cls \
+ --dtype float16
+ ```
+
+ - NVIDIA GPU:
+ ```bash
+ docker run --gpus all -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cuda-latest \
+ --model-id sentence-transformers/multi-qa-mpnet-base-dot-v1 \
+ --pooling cls \
+ --dtype float16
+ ```
+
+ Send a request to `/v1/embeddings` to generate embeddings via the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings/create):
+ ```bash
+ curl http://localhost:8080/v1/embeddings \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
+ "input": "How many people live in London?"
+ }'
+ ```
+
+ Or check the [Text Embeddings Inference API specification](https://huggingface.github.io/text-embeddings-inference/) instead.
+
+ ----
+
  ## Technical Details

  In the following some technical details how this model must be used:
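Once either container from the snippet above is running, the new `/v1/embeddings` route can be called from Python as well as from `curl`. Below is a minimal sketch (not part of the diff) using `requests` and ranking documents by dot-product, the similarity function this model was trained with; the local server address, the example documents, and the OpenAI-style response layout are assumptions here.

```python
# Minimal sketch: query the TEI server started above via its
# OpenAI-compatible /v1/embeddings route, then rank documents by dot-product.
# Assumes the container is reachable at http://localhost:8080.
import requests

query = "How many people live in London?"
docs = [
    "Around 9 million people live in London.",
    "London is known for its financial district.",
]

response = requests.post(
    "http://localhost:8080/v1/embeddings",
    json={
        "model": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
        "input": [query] + docs,
    },
    timeout=30,
)
response.raise_for_status()

# The OpenAI-style payload returns one embedding per input, in order.
embeddings = [item["embedding"] for item in response.json()["data"]]
query_emb, doc_embs = embeddings[0], embeddings[1:]

# Dot-product scores, highest first.
scores = [sum(q * d for q, d in zip(query_emb, doc_emb)) for doc_emb in doc_embs]
for doc, score in sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True):
    print(score, doc)
```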
 
@@ -128,25 +163,22 @@ In the following some technical details how this model must be used:

  ----

-
  ## Background

  The project aims to train sentence embedding models on very large sentence level datasets using a self-supervised
  contrastive learning objective. We use a contrastive learning objective: given a sentence from the pair, the model should predict which out of a set of randomly sampled other sentences, was actually paired with it in our dataset.

- We developped this model during the
+ We developed this model during the
  [Community week using JAX/Flax for NLP & CV](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104),
- organized by Hugging Face. We developped this model as part of the project:
+ organized by Hugging Face. We developed this model as part of the project:
- [Train the Best Sentence Embedding Model Ever with 1B Training Pairs](https://discuss.huggingface.co/t/train-the-best-sentence-embedding-model-ever-with-1b-training-pairs/7354). We benefited from efficient hardware infrastructure to run the project: 7 TPUs v3-8, as well as intervention from Googles Flax, JAX, and Cloud team member about efficient deep learning frameworks.
+ [Train the Best Sentence Embedding Model Ever with 1B Training Pairs](https://discuss.huggingface.co/t/train-the-best-sentence-embedding-model-ever-with-1b-training-pairs/7354). We benefited from efficient hardware infrastructure to run the project: 7 TPUs v3-8, as well as intervention from Google's Flax, JAX, and Cloud team members about efficient deep learning frameworks.

  ## Intended uses

- Our model is intented to be used for semantic search: It encodes queries / questions and text paragraphs in a dense vector space. It finds relevant documents for the given passages.
+ Our model is intended to be used for semantic search: It encodes queries / questions and text paragraphs in a dense vector space. It finds relevant documents for the given passages.

  Note that there is a limit of 512 word pieces: Text longer than that will be truncated. Further note that the model was just trained on input text up to 250 word pieces. It might not work well for longer text.

-
-
  ## Training procedure

  The full training script is accessible in this current repository: `train_script.py`.
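Regarding the 512 word-piece truncation limit and the 250 word-piece training length quoted in the hunk above, a minimal sketch (not part of the diff) of checking an input's length with the model's tokenizer before encoding; the thresholds come from the model card text, the rest is illustrative.

```python
# Illustrative sketch: check input length against the limits described above
# (hard truncation at 512 word pieces; training inputs were <= 250 word pieces).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-mpnet-base-dot-v1")

text = "Replace this with the paragraph you want to encode."
num_word_pieces = len(tokenizer(text, add_special_tokens=True)["input_ids"])

if num_word_pieces > 512:
    print(f"{num_word_pieces} word pieces: everything beyond 512 will be truncated")
elif num_word_pieces > 250:
    print(f"{num_word_pieces} word pieces: longer than the training inputs, quality may degrade")
else:
    print(f"{num_word_pieces} word pieces: within the training length")
```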
 
@@ -162,9 +194,6 @@ We sampled each dataset given a weighted probability which configuration is deta

  The model was trained with [MultipleNegativesRankingLoss](https://www.sbert.net/docs/package_reference/losses.html#multiplenegativesrankingloss) using CLS-pooling, dot-product as similarity function, and a scale of 1.

-
-
-
  | Dataset | Number of training tuples |
  |--------------------------------------------------------|:--------------------------:|
  | [WikiAnswers](https://github.com/afader/oqa#wikianswers-corpus) Duplicate question pairs from WikiAnswers | 77,427,422 |
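The training setup named in the hunk above (MultipleNegativesRankingLoss, CLS pooling, dot-product similarity, scale 1) can be reproduced roughly with the `sentence-transformers` training API. The sketch below is illustrative only: the base checkpoint (`microsoft/mpnet-base`), batch size, and toy pairs are assumptions, and `train_script.py` in the repository remains the authoritative reference.

```python
# Illustrative sketch: MultipleNegativesRankingLoss with CLS pooling,
# dot-product similarity, and scale 1, mirroring the settings listed above.
# Base checkpoint, batch size, and training pairs are placeholder assumptions.
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses, models, util

# Build the encoder: transformer backbone + CLS pooling.
word_embedding_model = models.Transformer("microsoft/mpnet-base", max_seq_length=250)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(), pooling_mode="cls"
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# (query, relevant passage) pairs; other in-batch passages act as negatives.
train_examples = [
    InputExample(texts=["How many people live in London?",
                        "Around 9 million people live in London."]),
    InputExample(texts=["What is the capital of France?",
                        "Paris is the capital and largest city of France."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Dot-product similarity and scale=1, as stated in the model card.
train_loss = losses.MultipleNegativesRankingLoss(
    model, scale=1.0, similarity_fct=util.dot_score
)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=0)
```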