digio committed
Commit 126cc8c
1 Parent(s): 8841e76

Create README.md

Files changed (1)
  1. README.md +45 -0
README.md ADDED
@@ -0,0 +1,45 @@
---
language:
- en
tags:
- Sentence Similarity
- Pytorch
- Sentence Transformers
- Transformers
license: "apache-2.0"
---

# Twitter4SSE

This model maps texts to 768-dimensional dense embeddings that encode semantic similarity.
It was trained with Multiple Negatives Ranking Loss (MNRL) on a Twitter dataset.
It was initialized from [BERTweet](https://huggingface.co/vinai/bertweet-base) and trained with [Sentence-transformers](https://www.sbert.net/).

## Usage

The model is easiest to use with the [sentence-transformers](https://www.sbert.net/) library:

```bash
pip install -U sentence-transformers
```

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is the first tweet", "This is the second tweet"]

model = SentenceTransformer('digio/Twitter4SSE')
embeddings = model.encode(sentences)
print(embeddings)
```
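
The embeddings can then be compared with cosine similarity to score how semantically similar two tweets are. The snippet below is a usage sketch building on the example above; it uses the library's `util.cos_sim` helper (exposed as `util.pytorch_cos_sim` in older sentence-transformers releases).

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('digio/Twitter4SSE')
sentences = ["This is the first tweet", "This is the second tweet"]

# Encode directly to PyTorch tensors so they can be fed to util.cos_sim.
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between the two tweet embeddings (higher = more similar).
score = util.cos_sim(embeddings[0], embeddings[1])
print(score)
```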

If you cannot use the sentence-transformers library, please refer to [this repository](https://huggingface.co/sentence-transformers) for detailed instructions on how to use Sentence Transformers models with Hugging Face Transformers.
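
For reference, the sketch below shows one common way to obtain sentence embeddings with plain Hugging Face Transformers. It assumes mean pooling over token embeddings, which is the typical pooling strategy for Sentence Transformers models, and that the repository ships the tokenizer files; check the model's pooling configuration before relying on it.

```python
# Minimal sketch with plain transformers; mean pooling is an assumption here.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('digio/Twitter4SSE')
model = AutoModel.from_pretrained('digio/Twitter4SSE')

sentences = ["This is the first tweet", "This is the second tweet"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

with torch.no_grad():
    output = model(**encoded)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = encoded['attention_mask'].unsqueeze(-1).float()
embeddings = (output.last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(embeddings.shape)  # (2, 768)
```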

## Citing & Authors

The official paper, "Exploiting Twitter as Source of Large Corpora of Weakly Similar Pairs for Semantic Sentence Embeddings", will be presented at EMNLP 2021. Further details will be available soon.

The official code is available on [GitHub](https://github.com/marco-digio/Twitter4SSE).