Jia Huei Tan committed on
Commit a8fed56 · 1 Parent(s): b1997b9

Update README

Files changed (1)
  1. README.md +43 -0
README.md CHANGED
@@ -1,3 +1,46 @@
  ---
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ language: en
  license: apache-2.0
  ---
+
+ # ONNX Conversion of [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks such as clustering or semantic search.
+
+ ## Usage
17
+ ```python
18
+ import torch
19
+ import torch.nn.functional as F
20
+ from optimum.onnxruntime import ORTModelForFeatureExtraction
21
+ from transformers import AutoTokenizer
22
+
23
+ device = "cuda"
24
+ sentences = [
25
+ "The llama (/ˈlɑːmə/) (Lama glama) is a domesticated South American camelid.",
26
+ "The alpaca (Lama pacos) is a species of South American camelid mammal.",
27
+ "The vicuña (Lama vicugna) (/vɪˈkuːnjə/) is one of the two wild South American camelids.",
28
+ ]
29
+
30
+
31
+ model_name = "EmbeddedLLM/all-MiniLM-L6-v2-onnx-o3-cpu"
32
+ device = "cpu"
33
+ provider = "CPUExecutionProvider"
34
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
35
+ model = ORTModelForFeatureExtraction.from_pretrained(
36
+ model_name, use_io_binding=True, provider=provider, device_map=device
37
+ )
38
+ inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
39
+ inputs = inputs.to(device)
40
+ token_embeddings = model(**inputs).last_hidden_state
41
+ # Pool
42
+ att_mask = inputs["attention_mask"].unsqueeze(-1).expand(token_embeddings.size()).float()
43
+ embeddings = torch.sum(token_embeddings * att_mask, 1) / torch.clamp(att_mask.sum(1), min=1e-9)
44
+ embeddings = F.normalize(embeddings, p=2, dim=1)
45
+ print(embeddings.cpu().numpy().shape)
46
+ ```
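
The pooling step in the usage snippet can be hard to follow in tensor form. As a minimal, dependency-free sketch (not part of this commit; the helper names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are hypothetical), the same masked mean pooling and L2 normalization look roughly like this on plain Python lists:

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average one sentence's token vectors, skipping padded positions."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:  # mask value 1 = real token, 0 = padding
            count += 1
            for i, x in enumerate(vec):
                summed[i] += x
    # Mirrors torch.clamp(att_mask.sum(1), min=1e-9): avoid dividing by zero.
    denom = max(count, 1e-9)
    return [s / denom for s in summed]


def l2_normalize(vec):
    """Scale a vector to unit length, like F.normalize(..., p=2)."""
    norm = max(math.sqrt(sum(x * x for x in vec)), 1e-12)
    return [x / norm for x in vec]


# Toy data: three token positions per sentence, two dimensions each.
tokens_a = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask_a = [1, 1, 0]  # the last position is padding and must be ignored
tokens_b = [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
mask_b = [1, 1, 1]

pooled_a = mean_pool(tokens_a, mask_a)
print(pooled_a)  # [2.0, 2.0]

emb_a = l2_normalize(pooled_a)
emb_b = l2_normalize(mean_pool(tokens_b, mask_b))

# For unit-length vectors the dot product equals the cosine similarity.
score = sum(x * y for x, y in zip(emb_a, emb_b))
print(score)  # close to 1.0, since both pooled vectors point the same way
```

Because the embeddings are unit length after normalization, their dot product is the cosine similarity, which is the usual score when these vectors are used for semantic search or clustering.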