|
--- |
|
license: mit |
|
tags: |
|
- tensor-compression |
|
- code-embeddings |
|
- factorized |
|
- tltorch |
|
base_model: nomic-ai/CodeRankEmbed |
|
--- |
|
|
|
# CodeRankEmbed-compressed |
|
|
|
This is a tensor-compressed version of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed), obtained by replacing its linear and embedding layers with low-rank CP factorizations using TensorLy-Torch (TLTorch).
|
|
|
## Compression Details |
|
|
|
- **Compression method**: Tensor factorization using TensorLy-Torch (TLTorch); see the per-layer sketch after this list
|
- **Factorization type**: CP (CANDECOMP/PARAFAC)
|
- **Rank used**: 4
|
- **Number of factorized layers**: 60 |
|
- **Original model size**: 136.73M parameters |
|
- **Compressed model size**: 23.62M parameters |
|
- **Compression ratio**: 5.79x (82.7% reduction) |
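
For intuition, the sketch below shows what a rank-4 CP factorization does to a single weight matrix, using plain TensorLy; the actual compression wraps this idea in TLTorch's `FactorizedLinear`/`FactorizedEmbedding` layers. The 768x768 shape is an assumption chosen only for illustration, not read from the model.

```python
import torch
import tensorly as tl
from tensorly.decomposition import parafac

tl.set_backend("pytorch")

# Illustration only: a dense weight matrix (768 is an assumed hidden size).
weight = torch.randn(768, 768)

# Rank-4 CP decomposition: the matrix is approximated by 4 rank-1 terms.
cp_weight = parafac(weight, rank=4)

# Parameter count drops from 768*768 to roughly 4*(768+768).
full_params = weight.numel()                           # 589,824
cp_params = sum(f.numel() for f in cp_weight.factors)  # 6,144
print(f"{full_params} -> {cp_params} parameters for this layer")

# The approximation can be expanded back to a dense matrix when needed.
approx = tl.cp_to_tensor(cp_weight)
```

In the released model this kind of replacement is applied to 60 layers, which is where the overall 5.79x reduction comes from.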
|
|
|
## Usage |
|
|
|
To use this compressed model, install the dependencies below and rebuild the factorized layers with the custom loading script (`load_compressed_model.py`) before loading the weights:
|
|
|
```bash |
|
# the TLTorch layers come from the "tensorly-torch" package (imported as "tltorch")
pip install torch tensorly tensorly-torch sentence-transformers
|
``` |
|
|
|
### Loading the model |
|
|
|
```python |
|
import torch |
|
import json |
|
from sentence_transformers import SentenceTransformer |
|
import tensorly as tl |
|
from tltorch.factorized_layers import FactorizedLinear, FactorizedEmbedding |
|
|
|
# Set TensorLy backend |
|
tl.set_backend("pytorch") |
|
|
|
# Load the model structure |
|
model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True) |
|
|
|
# Load factorization info |
|
with open("factorization_info.json", "r") as f:

    factorized_info = json.load(f)
|
|
|
# Reconstruct factorized layers (see load_compressed_model.py for full implementation) |
|
# ... reconstruction code ... |
|
|
|
# Load compressed weights |
|
checkpoint = torch.load("pytorch_model.bin", map_location="cpu") |
|
model.load_state_dict(checkpoint["state_dict"], strict=False) |
|
|
|
# Use the model |
|
embeddings = model.encode(["def hello_world():\n print('Hello, World!')"]) |
|
``` |
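
The reconstruction step elided in the block above swaps each compressed layer's dense placeholder for a TLTorch factorized layer before `load_state_dict` is called. The snippet below is only a hedged sketch of that idea, continuing from the code above: the layout of `factorization_info.json` (module names mapped to rank and factorization type) is an assumption, and the exact `FactorizedLinear.from_linear` keywords can vary between tltorch versions; the authoritative implementation is `load_compressed_model.py`.

```python
import torch.nn as nn

# Assumed layout: {module_name: {"rank": 4, "factorization": "cp"}, ...}
for name, info in factorized_info.items():
    parent_name, _, attr = name.rpartition(".")
    parent = model.get_submodule(parent_name) if parent_name else model
    dense = getattr(parent, attr)
    if isinstance(dense, nn.Linear):
        # Build a factorized layer of matching shape; its factors are then
        # overwritten by the compressed checkpoint via load_state_dict.
        factorized = FactorizedLinear.from_linear(
            dense,
            rank=info.get("rank", 4),
            factorization=info.get("factorization", "cp"),
        )
        setattr(parent, attr, factorized)
    # nn.Embedding layers would be converted analogously with FactorizedEmbedding.
```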
|
|
|
## Model Files |
|
|
|
- `pytorch_model.bin`: Compressed model weights |
|
- `factorization_info.json`: Metadata about factorized layers |
|
- `tokenizer.json`, `vocab.txt`: Tokenizer files |
|
- `modules.json`: SentenceTransformer modules configuration |
|
|
|
## Performance |
|
|
|
The compressed model maintains good quality while being significantly smaller: |
|
- Similar embedding quality (average cosine similarity > 0.9 with the original model; see the sketch after this list)
|
- 5.79x smaller model size |
|
- Faster loading and inference on CPU |
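
To reproduce the similarity check yourself, something like the following works. It is a minimal sketch: the code snippets are illustrative and `compressed_model` is assumed to be the model loaded as shown in the Usage section.

```python
from sentence_transformers import SentenceTransformer

# Reference embeddings from the uncompressed model.
original = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)

snippets = [
    "def add(a, b):\n    return a + b",
    "class Stack:\n    def __init__(self):\n        self.items = []",
]

emb_orig = original.encode(snippets, normalize_embeddings=True)
emb_comp = compressed_model.encode(snippets, normalize_embeddings=True)

# With normalized vectors, cosine similarity is just the row-wise dot product.
cosine = (emb_orig * emb_comp).sum(axis=1)
print("mean cosine similarity:", float(cosine.mean()))
```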
|
|
|
## Citation |
|
|
|
If you use this compressed model, please cite the original CodeRankEmbed model: |
|
|
|
```bibtex |
|
@misc{nomic2024coderankembed, |
|
title={CodeRankEmbed}, |
|
author={Nomic AI}, |
|
year={2024}, |
|
url={https://huggingface.co/nomic-ai/CodeRankEmbed} |
|
} |
|
``` |
|
|
|
## License |
|
|
|
This compressed model inherits the license of the original model (declared as MIT in the metadata above). Please check the original model card for usage terms.
|
|