Azuremis committed on
Commit 69fe85d · verified · 1 Parent(s): 1f7b1f6

Add README.md

Files changed (1)
  1. README.md +38 -0
README.md ADDED
@@ -0,0 +1,38 @@
+ # mlx7-two-tower
+
+ This repository contains models trained with the Two-Tower (Dual Encoder) architecture for document retrieval.
+
+ ## Model Description
+
+ The Two-Tower model is a dual-encoder neural network architecture designed for semantic search and document retrieval. It consists of two separate "towers", one that encodes queries and one that encodes documents, which map text to dense vector representations in a shared embedding space.
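
As a minimal sketch of what a shared embedding space allows, the snippet below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and the `cosine_similarity` and `rank_documents` helpers are made up for illustration; real embeddings would come from the two towers, and the README does not state which similarity function this model was trained with.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for query-tower and document-tower outputs (assumed values).
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_documents(query_vec, doc_vecs)
print([i for i, _ in ranking])  # prints [1, 2, 0]
```

Because both towers write into the same space, document vectors can be computed once and stored, and only the query needs encoding at search time.
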
+
+ ## Usage
+
+ ```python
+ from twotower import load_model_from_hub
+ from twotower.encoders import TwoTower
+ from twotower.tokenisers import CharTokeniser
+
+ # Load the model
+ model, tokenizer, config = load_model_from_hub(
+     repo_id="mlx7-two-tower",
+     model_class=TwoTower,
+     tokenizer_class=CharTokeniser,
+ )
+
+ # Use for document embedding
+ doc_ids = tokenizer.encode("This is a document")
+ doc_embedding = model.encode_document(doc_ids)
+
+ # Use for query embedding
+ query_ids = tokenizer.encode("This is a query")
+ query_embedding = model.encode_query(query_ids)
+ ```
+
+ ## Training
+
+ This model was trained on the MS MARCO dataset using the Two-Tower architecture with contrastive learning.
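
The README does not say which contrastive objective was used. One common choice for dual encoders is a softmax cross-entropy loss with in-batch negatives, sketched below in plain Python under that assumption; a triplet margin loss would be another option. The function name, the `temperature` parameter, and the toy vectors are illustrative and not taken from the project.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def in_batch_contrastive_loss(query_vecs, doc_vecs, temperature=1.0):
    """Mean softmax cross-entropy over a batch.

    doc_vecs[i] is treated as the positive for query_vecs[i]; every other
    document in the batch serves as a negative for that query.
    """
    losses = []
    for i, q in enumerate(query_vecs):
        logits = [dot(q, d) / temperature for d in doc_vecs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Toy batch: each query matches the document at the same index (assumed values).
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[1.0, 0.0], [0.0, 1.0]]
swapped_docs = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = in_batch_contrastive_loss(queries, aligned_docs)
swapped_loss = in_batch_contrastive_loss(queries, swapped_docs)
print(aligned_loss < swapped_loss)  # prints True
```

The loss is lower when each query scores its own document above the others in the batch, which is the behaviour contrastive training pushes the two towers toward.
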
+
+ ## Repository Information
+
+ This model is part of the [Two-Tower Retrieval Model](https://github.com/yourusername/two-towers) project.