Commit 1c1d773 (verified) by alvarobartt (HF Staff) · 1 parent: 4633e80

Add `text-embeddings-inference` tag & snippet

## Description

- Add `text-embeddings-inference` tag to improve discoverability
- Add a sample snippet showing how to run Text Embeddings Inference (TEI) via Docker

⚠️ **This PR has been generated automatically, so please review it before merging.**

Files changed (1)
  1. README.md +38 -5
README.md CHANGED
@@ -7,6 +7,7 @@ tags:
  - feature-extraction
  - sentence-similarity
  - transformers
+ - text-embeddings-inference
  datasets:
  - flax-sentence-embeddings/stackexchange_xml
  - ms_marco
@@ -134,14 +135,14 @@ In the following some technical details how this model must be used:
  The project aims to train sentence embedding models on very large sentence level datasets using a self-supervised
  contrastive learning objective. We use a contrastive learning objective: given a sentence from the pair, the model should predict which out of a set of randomly sampled other sentences, was actually paired with it in our dataset.

- We developped this model during the
+ We developed this model during the
  [Community week using JAX/Flax for NLP & CV](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104),
- organized by Hugging Face. We developped this model as part of the project:
- [Train the Best Sentence Embedding Model Ever with 1B Training Pairs](https://discuss.huggingface.co/t/train-the-best-sentence-embedding-model-ever-with-1b-training-pairs/7354). We benefited from efficient hardware infrastructure to run the project: 7 TPUs v3-8, as well as intervention from Googles Flax, JAX, and Cloud team member about efficient deep learning frameworks.
+ organized by Hugging Face. We developed this model as part of the project:
+ [Train the Best Sentence Embedding Model Ever with 1B Training Pairs](https://discuss.huggingface.co/t/train-the-best-sentence-embedding-model-ever-with-1b-training-pairs/7354). We benefited from efficient hardware infrastructure to run the project: 7 TPUs v3-8, as well as intervention from Google's Flax, JAX, and Cloud team members about efficient deep learning frameworks.

  ## Intended uses

- Our model is intented to be used for semantic search: It encodes queries / questions and text paragraphs in a dense vector space. It finds relevant documents for the given passages.
+ Our model is intended to be used for semantic search: It encodes queries / questions and text paragraphs in a dense vector space. It finds relevant documents for the given passages.

  Note that there is a limit of 512 word pieces: Text longer than that will be truncated. Further note that the model was just trained on input text up to 250 word pieces. It might not work well for longer text.

@@ -184,4 +185,36 @@ The model was trained with [MultipleNegativesRankingLoss](https://www.sbert.net/
  | [Natural Questions (NQ)](https://ai.google.com/research/NaturalQuestions) (Question, Paragraph) pairs for 100k real Google queries with relevant Wikipedia paragraph | 100,231 |
  | [SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer/) (Question, Paragraph) pairs from SQuAD2.0 dataset | 87,599 |
  | [TriviaQA](https://huggingface.co/datasets/trivia_qa) (Question, Evidence) pairs | 73,346 |
- | **Total** | **214,988,242** |
+ | **Total** | **214,988,242** |
+
+ ## Usage (Text Embeddings Inference (TEI))
+
+ [Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) is a blazing fast inference solution for text embeddings models.
+
+ - CPU:
+ ```bash
+ docker run -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-latest \
+ --model-id sentence-transformers/multi-qa-mpnet-base-dot-v1 \
+ --pooling cls \
+ --dtype float16
+ ```
+
+ - NVIDIA GPU:
+ ```bash
+ docker run --gpus all -p 8080:80 -v hf_cache:/data --pull always ghcr.io/huggingface/text-embeddings-inference:cuda-latest \
+ --model-id sentence-transformers/multi-qa-mpnet-base-dot-v1 \
+ --pooling cls \
+ --dtype float16
+ ```
+
+ Send a request to `/v1/embeddings` to generate embeddings via the [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings/create):
+ ```bash
+ curl http://localhost:8080/v1/embeddings \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
+ "input": "How many people live in London?"
+ }'
+ ```
+
+ Or check the [Text Embeddings Inference API specification](https://huggingface.github.io/text-embeddings-inference/) instead.
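
Reviewer note: a minimal Python sketch of how the `curl` request in the added snippet could be issued from code and used for the semantic search described under "Intended uses". It is not part of this commit. It assumes a TEI container from the snippet is listening on `localhost:8080`, that the response follows the OpenAI embeddings shape (a `data` list whose items carry `index` and `embedding`), and that passages are ranked by dot product, as the model name suggests. The helper names (`build_payload`, `parse_embeddings`, `embed`, `rank`) are hypothetical.

```python
import json
import urllib.request

# Assumption: a TEI container started as in the snippet is reachable here.
TEI_URL = "http://localhost:8080/v1/embeddings"
MODEL_ID = "sentence-transformers/multi-qa-mpnet-base-dot-v1"


def build_payload(texts):
    """Build an OpenAI-style embeddings request body for a list of strings."""
    return {"model": MODEL_ID, "input": list(texts)}


def parse_embeddings(response):
    """Extract vectors from an OpenAI-style response, ordered by `index`."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, url=TEI_URL):
    """POST the texts to the server and return one vector per input."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(texts)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return parse_embeddings(json.load(reply))


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank(query_vec, passage_vecs):
    """Return passage indices sorted by descending dot-product score."""
    scores = [dot(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Usage (requires the running server, so it is left commented out):
# vecs = embed(["How many people live in London?", "passage A", "passage B"])
# order = rank(vecs[0], vecs[1:])
```

Only `embed` touches the network; the payload builder, response parser, and ranking function can be checked offline against a hand-written response.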