# mlx7-two-tower

This repository contains models trained using the Two-Tower (Dual Encoder) architecture for document retrieval.

## Model Description

The Two-Tower model is a dual encoder neural network architecture designed for semantic search and document retrieval. It consists of two separate "towers", one for encoding queries and one for encoding documents, that map text to dense vector representations in a shared embedding space, so a query can be compared with documents by vector similarity.
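
Because both towers map into the same embedding space, retrieval comes down to comparing a query vector against document vectors. The sketch below illustrates that step with plain Python lists. The use of cosine similarity, the helper names `cosine_similarity` and `rank_documents`, and the toy vectors are assumptions made for illustration; this repository may score embeddings differently (for example with a raw dot product).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings standing in for the outputs of the query and document towers.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned
]

ranking = rank_documents(query_vec, doc_vecs)
for doc_index, score in ranking:
    print(doc_index, round(score, 3))
```

In practice the document vectors would be computed once and stored, so only the query tower runs at search time.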

## Usage

```python
from twotower import load_model_from_hub
from twotower.encoders import TwoTower
from twotower.tokenisers import CharTokeniser

# Load the model
model, tokenizer, config = load_model_from_hub(
    repo_id="mlx7-two-tower",
    model_class=TwoTower,
    tokenizer_class=CharTokeniser
)

# Use for document embedding
doc_ids = tokenizer.encode("This is a document")
doc_embedding = model.encode_document(doc_ids)

# Use for query embedding
query_ids = tokenizer.encode("This is a query")
query_embedding = model.encode_query(query_ids)
```

## Training

This model was trained on the MS MARCO dataset using the Two-Tower architecture with contrastive learning.
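
The README does not say which contrastive objective was used. As one common possibility, the sketch below shows a softmax loss with in-batch negatives: each query is paired with its own document as the positive and treats every other document in the batch as a negative. The function name `in_batch_contrastive_loss`, the dot-product scoring, and the `temperature` parameter are assumptions for illustration, not the project's confirmed training code.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def in_batch_contrastive_loss(query_vecs, doc_vecs, temperature=1.0):
    """Mean softmax cross-entropy where doc_vecs[i] is the positive for query_vecs[i].

    Hypothetical helper: all other documents in the batch act as negatives.
    """
    if not query_vecs or len(query_vecs) != len(doc_vecs):
        raise ValueError("expected a non-empty batch of matched query/document pairs")

    total = 0.0
    for i, q in enumerate(query_vecs):
        logits = [dot(q, d) / temperature for d in doc_vecs]
        # Numerically stable log-sum-exp over all documents in the batch.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the positive document.
        total += log_sum_exp - logits[i]
    return total / len(query_vecs)


# Toy batch: when each query matches its own document the loss is lower
# than when the pairs are swapped.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[1.0, 0.0], [0.0, 1.0]]
swapped_docs = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = in_batch_contrastive_loss(queries, aligned_docs)
swapped_loss = in_batch_contrastive_loss(queries, swapped_docs)
print(aligned_loss < swapped_loss)
```

Minimizing a loss of this kind pulls each query toward its paired document and pushes it away from the other documents in the batch, which is what makes similarity in the shared space meaningful for retrieval.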

## Repository Information

This model is part of the [Two-Tower Retrieval Model](https://github.com/yourusername/two-towers) project.