# mlx7-two-tower
This repository contains models trained using the Two-Tower (Dual Encoder) architecture for document retrieval.
## Model Description
The Two-Tower model is a dual-encoder neural network architecture designed for semantic search and document retrieval. It consists of two separate "towers", one for encoding queries and one for encoding documents, that map text to dense vector representations in a shared embedding space.
## Usage
```python
from twotower import load_model_from_hub
from twotower.encoders import TwoTower
from twotower.tokenisers import CharTokeniser

# Load the model
model, tokenizer, config = load_model_from_hub(
    repo_id="mlx7-two-tower",
    model_class=TwoTower,
    tokenizer_class=CharTokeniser,
)

# Use for document embedding
doc_ids = tokenizer.encode("This is a document")
doc_embedding = model.encode_document(doc_ids)

# Use for query embedding
query_ids = tokenizer.encode("This is a query")
query_embedding = model.encode_query(query_ids)
```
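
Once queries and documents are embedded in the same space, retrieval usually comes down to ranking documents by their similarity to the query. The sketch below is illustrative only: it assumes the embeddings can be converted to plain lists of floats and that cosine similarity is an appropriate score, neither of which this repository states. The helper names `cosine_similarity` and `rank_documents` are hypothetical and are not part of the `twotower` package.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, doc_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [
        (i, cosine_similarity(query_embedding, d))
        for i, d in enumerate(doc_embeddings)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real model outputs (assumption for illustration).
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_documents(query, docs)
```

With real embeddings, the document vectors can be computed once and cached, so only the query needs to be encoded at search time.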
## Training
This model was trained on the MS MARCO dataset using the Two-Tower architecture with contrastive learning.
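
The exact contrastive objective is not specified here. As a rough illustration, one common choice for dual encoders is an in-batch softmax (InfoNCE) loss, where each query's paired document is the positive and the other documents in the batch act as negatives. The sketch below assumes that formulation, dot-product scoring, and a hypothetical `temperature` value; the actual training code may use a different loss, such as a triplet margin loss.

```python
import math


def in_batch_contrastive_loss(query_embs, doc_embs, temperature=0.05):
    """Mean InfoNCE loss over a batch.

    doc_embs[i] is the positive for query_embs[i]; every other document
    in the batch is treated as a negative. (Assumed formulation.)
    """
    losses = []
    for i, q in enumerate(query_embs):
        # Dot-product score of this query against every document in the batch.
        logits = [
            sum(x * y for x, y in zip(q, d)) / temperature for d in doc_embs
        ]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the positive document as the target class.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch: correctly paired embeddings give a much lower loss than swapped ones.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_contrastive_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_contrastive_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this loss pulls each query toward its paired document and pushes it away from the rest of the batch, which is what places both towers' outputs in a shared embedding space.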
## Repository Information
This model is part of the [Two-Tower Retrieval Model](https://github.com/yourusername/two-towers) project.