---
license: mit
tags:
- tensor-compression
- code-embeddings
- factorized
- tltorch
base_model: nomic-ai/CodeRankEmbed
---

# CodeRankEmbed-compressed

This is a compressed version of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed), produced by factorizing its weight matrices with TLTorch.

## Compression Details

- **Compression method**: Tensor factorization using TLTorch
- **Factorization type**: CP (CANDECOMP/PARAFAC); see the sketch below
- **Rank used**: 4
- **Number of factorized layers**: 60
- **Original model size**: 136.73M parameters
- **Compressed model size**: 23.62M parameters
- **Compression ratio**: 5.79x (82.7% reduction)

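To make the numbers above concrete, here is a minimal sketch of what CP factorization does to a single linear layer with TLTorch. The 768-dimensional layer size and the `(16, 48)` tensorization are illustrative assumptions, not values read from this checkpoint:

```python
import torch
from torch import nn
import tensorly as tl
from tltorch.factorized_layers import FactorizedLinear

tl.set_backend("pytorch")

# A dense layer of a size typical for this model family (768 -> 768).
dense = nn.Linear(768, 768)

# CP-factorized replacement with rank 4. TLTorch "tensorizes" the input and
# output dimensions into tuples whose products equal the original sizes
# (here 16 * 48 = 768); the weight is then stored as a rank-4 CP decomposition.
factorized = FactorizedLinear(
    in_tensorized_features=(16, 48),
    out_tensorized_features=(16, 48),
    factorization="cp",
    rank=4,
    bias=True,
)

print(sum(p.numel() for p in dense.parameters()))       # 590,592
print(sum(p.numel() for p in factorized.parameters()))  # far fewer: CP factors plus bias
```

Applying this kind of replacement across 60 layers is what yields the 5.79x overall reduction.
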
## Usage

To use this compressed model, you'll need to install the required dependencies and use the custom loading script:

```bash
pip install torch tensorly tltorch sentence-transformers
```

### Loading the model

```python
import torch
import json
from sentence_transformers import SentenceTransformer
import tensorly as tl
# FactorizedLinear / FactorizedEmbedding are used by the reconstruction step below
from tltorch.factorized_layers import FactorizedLinear, FactorizedEmbedding

# Set TensorLy backend
tl.set_backend("pytorch")

# Load the model structure
model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)

# Load factorization info
with open("factorization_info.json", "r") as f:
    factorized_info = json.load(f)

# Swap the original dense layers for factorized ones before loading weights
# (see load_compressed_model.py for the full implementation; a hypothetical
# sketch of this step follows the block)
# ... reconstruction code ...

# Load compressed weights
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"], strict=False)

# Use the model
embeddings = model.encode(["def hello_world():\n    print('Hello, World!')"])
```

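For orientation, the elided reconstruction step conceptually walks the metadata and swaps each listed submodule for a factorized counterpart before the state dict is loaded. The loop below is a hypothetical sketch only: the real schema of `factorization_info.json` and the exact rebuild logic are defined in `load_compressed_model.py`, and the field names used here (`in_tensorized_features`, `out_tensorized_features`, `factorization`, `rank`) are assumptions.

```python
from tltorch.factorized_layers import FactorizedLinear

def rebuild_factorized_layers(model, factorized_info):
    """Hypothetical sketch: replace each dense submodule named in the
    metadata with an (uninitialized) factorized layer of matching shape,
    so that load_state_dict can fill in the factor weights afterwards."""
    for name, info in factorized_info.items():
        parent = model
        *path, child = name.split(".")
        for part in path:  # walk down to the parent module
            parent = getattr(parent, part)
        layer = FactorizedLinear(
            in_tensorized_features=tuple(info["in_tensorized_features"]),
            out_tensorized_features=tuple(info["out_tensorized_features"]),
            factorization=info.get("factorization", "cp"),
            rank=info.get("rank", 4),
        )
        setattr(parent, child, layer)
```
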
## Model Files

- `pytorch_model.bin`: Compressed model weights
- `factorization_info.json`: Metadata about factorized layers
- `tokenizer.json`, `vocab.txt`: Tokenizer files
- `modules.json`: SentenceTransformer modules configuration

## Performance

The compressed model maintains good quality while being significantly smaller:
- Similar embedding quality (average cosine similarity > 0.9 with the original; see the check below)
- 5.79x smaller model size
- Faster loading and inference on CPU

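You can verify the similarity claim yourself by comparing embeddings from the two models on a handful of snippets. A minimal sketch, assuming the compressed model is loaded as in the Usage section (the helper name `mean_cosine` is ours):

```python
import torch
from sentence_transformers import SentenceTransformer

def mean_cosine(model_a, model_b, texts):
    """Average per-text cosine similarity between two models' embeddings."""
    a = torch.tensor(model_a.encode(texts))
    b = torch.tensor(model_b.encode(texts))
    return torch.nn.functional.cosine_similarity(a, b, dim=1).mean().item()

snippets = [
    "def hello_world():\n    print('Hello, World!')",
    "for i in range(10):\n    total += i",
]

# `original` and `compressed` loaded as shown in the Usage section above:
# original = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)
# print(mean_cosine(original, compressed, snippets))  # expected > 0.9
```
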
## Citation

If you use this compressed model, please cite the original CodeRankEmbed model:

```bibtex
@misc{nomic2024coderankembed,
  title={CodeRankEmbed},
  author={Nomic AI},
  year={2024},
  url={https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```

## License

This compressed model inherits the license from the original model. Please check the original model's license for usage terms.