SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on a dataset of 131,157 Persian anchor/positive text pairs. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Number of Parameters: 22.7M (F32 safetensors)

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
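
In words: module (0) produces contextual token embeddings with a BERT encoder (inputs truncated at 256 tokens), module (1) mean-pools them using the attention mask, and module (2) L2-normalizes the result. As a rough sketch of the same computation with plain transformers (assuming the standard mean-pooling recipe from the base all-MiniLM-L6-v2 card; the SentenceTransformer API shown under Usage is the supported path):

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("codersan/validadted_allMiniLM_onV9f")
model = AutoModel.from_pretrained("codersan/validadted_allMiniLM_onV9f")

encoded = tokenizer(["This is an example sentence."],
                    padding=True, truncation=True, max_length=256, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state          # (0): Transformer
mask = encoded["attention_mask"].unsqueeze(-1).float()
pooled = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)  # (1): mean Pooling
sentence_embeddings = F.normalize(pooled, p=2, dim=1)              # (2): Normalize
print(sentence_embeddings.shape)  # torch.Size([1, 384])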

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("codersan/validadted_allMiniLM_onV9f")
# Run inference
sentences = [
    # "How many views and answers does it take to become a Quora Top Writer?"
    'برای تبدیل شدن به نویسنده برتر Quora ، چند بازدید و پاسخ لازم است؟',
    # "How can I become a Quora Top Writer and get more upvotes and better stats?"
    'چگونه می توانم نویسنده برتر Quora شوم ، از صعود بیشتر و آمار بهتر استفاده کنم؟',
    # "I'm looking to buy a new bike: Suzuki Gixxer 155 or Honda Hornet 160r. Which should I buy?"
    'من به دنبال خرید دوچرخه جدید هستم.Suzuki Gixxer 155 یا Honda Hornet 160r.کدام یک را بخرید؟',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
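
Because the Normalize module makes every embedding unit-length, cosine similarity is just a dot product, and the similarity matrix can drive a small semantic-search lookup. Continuing the snippet above (the query string is an illustrative paraphrase, not taken from the training data):

# Rank the three sentences against a new query
query_embedding = model.encode(["چگونه نویسنده برتر Quora شوم؟"])  # "How do I become a Quora Top Writer?"
scores = model.similarity(query_embedding, embeddings)  # tensor of shape [1, 3]
best = scores.argmax().item()
print(sentences[best])  # expected: one of the two Quora questions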

Training Details

Training Dataset

Unnamed Dataset

  • Size: 131,157 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min 11 tokens, mean 44.91 tokens, max 256 tokens
    • positive: string; min 11 tokens, mean 44.6 tokens, max 154 tokens
  • Samples (anchor → positive; English translations in parentheses):
    • وقتی سوال من به عنوان "این سوال ممکن است به ویرایش نیاز داشته باشد" چه کاری باید انجام دهم ، اما نمی توانم دلیل آن را پیدا کنم؟ → چرا سوال من به عنوان نیاز به پیشرفت مشخص شده است؟ ("What should I do when my question is marked as 'This question may need editing' but I can't find the reason?" → "Why has my question been marked as needing improvement?")
    • چگونه می توانید یک فایل رمزگذاری شده را با دانستن اینکه این یک فایل تصویری است بدون دانستن گسترش پرونده یا کلید ، رمزگشایی کنید؟ → چگونه می توانید یک فایل رمزگذاری شده را رمزگشایی کنید و بدانید که این یک فایل تصویری است بدون اینکه از پسوند پرونده اطلاع داشته باشید؟ ("How can you decrypt an encrypted file, knowing it is an image file, without knowing the file extension or the key?" → "How can you decrypt an encrypted file, knowing it is an image file, without knowing the file extension?")
    • احساس می کنم خودکشی می کنم ، چگونه باید با آن برخورد کنم؟ → احساس می کنم خودکشی می کنم.چه کاری باید انجام دهم؟ ("I feel suicidal; how should I deal with it?" → "I feel suicidal. What should I do?")
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
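
With this loss, each anchor is trained to rank its own positive above every other positive in the batch (which act as in-batch negatives): cosine similarities are scaled by 20 and fed to cross-entropy. A minimal PyTorch sketch of the objective, for intuition only (not the library implementation):

import torch
import torch.nn.functional as F

def mnr_loss(anchors: torch.Tensor, positives: torch.Tensor, scale: float = 20.0) -> torch.Tensor:
    # (batch, batch) cosine similarities: row i compares anchor i to every positive
    scores = F.cosine_similarity(anchors.unsqueeze(1), positives.unsqueeze(0), dim=-1) * scale
    # the matching positive for anchor i sits at column i
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)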
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 12
  • learning_rate: 5e-06
  • weight_decay: 0.01
  • warmup_ratio: 0.1
  • push_to_hub: True
  • hub_model_id: codersan/validadted_allMiniLM_onV9f
  • eval_on_start: True
  • batch_sampler: no_duplicates
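
Taken together, a run with these non-default settings would look roughly like the sketch below; the two-example dataset, the output directory, and the eval split are placeholders for the real 131,157-pair training data (pushing to the Hub also assumes you are logged in):

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder anchor/positive pairs; the real dataset has 131,157 rows.
train_dataset = Dataset.from_dict({
    "anchor": ["چگونه نویسنده برتر Quora شوم؟"],
    "positive": ["برای نویسنده برتر Quora شدن چه لازم است؟"],
})
eval_dataset = train_dataset  # placeholder; use a held-out split in practice

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",  # assumption; not recorded in the card
    num_train_epochs=3,
    per_device_train_batch_size=12,
    learning_rate=5e-6,
    weight_decay=0.01,
    warmup_ratio=0.1,
    eval_strategy="steps",
    eval_on_start=True,
    push_to_hub=True,  # requires `huggingface-cli login`
    hub_model_id="codersan/validadted_allMiniLM_onV9f",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=MultipleNegativesRankingLoss(model, scale=20.0),
)
trainer.train()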

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 12
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-06
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: codersan/validadted_allMiniLM_onV9f
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: True
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0 0 -
0.0091 100 1.4865
0.0183 200 1.4429
0.0274 300 1.2725
0.0366 400 1.1602
0.0457 500 0.9429
0.0549 600 0.829
0.0640 700 0.7771
0.0732 800 0.6597
0.0823 900 0.5981
0.0915 1000 0.5826
0.1006 1100 0.5956
0.1098 1200 0.5254
0.1189 1300 0.5434
0.1281 1400 0.5495
0.1372 1500 0.4934
0.1464 1600 0.4684
0.1555 1700 0.4489
0.1647 1800 0.4401
0.1738 1900 0.4712
0.1830 2000 0.4407
0.1921 2100 0.4082
0.2013 2200 0.4384
0.2104 2300 0.3621
0.2196 2400 0.4423
0.2287 2500 0.4163
0.2379 2600 0.3769
0.2470 2700 0.3967
0.2562 2800 0.3812
0.2653 2900 0.3813
0.2745 3000 0.359
0.2836 3100 0.3454
0.2928 3200 0.3518
0.3019 3300 0.3306
0.3111 3400 0.3138
0.3202 3500 0.3416
0.3294 3600 0.3474
0.3385 3700 0.3153
0.3477 3800 0.2896
0.3568 3900 0.2737
0.3660 4000 0.3004
0.3751 4100 0.3109
0.3843 4200 0.2829
0.3934 4300 0.2729
0.4026 4400 0.2714
0.4117 4500 0.3014
0.4209 4600 0.27
0.4300 4700 0.3632
0.4392 4800 0.2571
0.4483 4900 0.2464
0.4575 5000 0.2681
0.4666 5100 0.2579
0.4758 5200 0.2377
0.4849 5300 0.2471
0.4941 5400 0.2625
0.5032 5500 0.2336
0.5124 5600 0.2553
0.5215 5700 0.2549
0.5306 5800 0.22
0.5398 5900 0.2682
0.5489 6000 0.2329
0.5581 6100 0.2244
0.5672 6200 0.2458
0.5764 6300 0.1881
0.5855 6400 0.209
0.5947 6500 0.2103
0.6038 6600 0.1982
0.6130 6700 0.2023
0.6221 6800 0.2244
0.6313 6900 0.2051
0.6404 7000 0.224
0.6496 7100 0.2113
0.6587 7200 0.2386
0.6679 7300 0.1685
0.6770 7400 0.2092
0.6862 7500 0.1832
0.6953 7600 0.1957
0.7045 7700 0.2082
0.7136 7800 0.2213
0.7228 7900 0.177
0.7319 8000 0.196
0.7411 8100 0.2034
0.7502 8200 0.2017
0.7594 8300 0.1741
0.7685 8400 0.2092
0.7777 8500 0.1684
0.7868 8600 0.1874
0.7960 8700 0.1866
0.8051 8800 0.2291
0.8143 8900 0.1796
0.8234 9000 0.2036
0.8326 9100 0.2173
0.8417 9200 0.2074
0.8509 9300 0.1914
0.8600 9400 0.1639
0.8692 9500 0.1798
0.8783 9600 0.1926
0.8875 9700 0.1672
0.8966 9800 0.1727
0.9058 9900 0.189
0.9149 10000 0.2055
0.9241 10100 0.2043
0.9332 10200 0.1515
0.9424 10300 0.1675
0.9515 10400 0.1764
0.9607 10500 0.1709
0.9698 10600 0.1861
0.9790 10700 0.1928
0.9881 10800 0.1756
0.9973 10900 0.1611
1.0064 11000 0.1371
1.0156 11100 0.1499
1.0247 11200 0.2001
1.0339 11300 0.197
1.0430 11400 0.2035
1.0522 11500 0.1524
1.0613 11600 0.1988
1.0704 11700 0.1643
1.0796 11800 0.1488
1.0887 11900 0.1402
1.0979 12000 0.1501
1.1070 12100 0.1476
1.1162 12200 0.1703
1.1253 12300 0.1437
1.1345 12400 0.1684
1.1436 12500 0.1583
1.1528 12600 0.1554
1.1619 12700 0.1453
1.1711 12800 0.1592
1.1802 12900 0.1508
1.1894 13000 0.1585
1.1985 13100 0.1381
1.2077 13200 0.1442
1.2168 13300 0.183
1.2260 13400 0.1704
1.2351 13500 0.152
1.2443 13600 0.136
1.2534 13700 0.1596
1.2626 13800 0.151
1.2717 13900 0.1597
1.2809 14000 0.1547
1.2900 14100 0.1717
1.2992 14200 0.1037
1.3083 14300 0.1452
1.3175 14400 0.155
1.3266 14500 0.189
1.3358 14600 0.1384
1.3449 14700 0.1711
1.3541 14800 0.1255
1.3632 14900 0.1439
1.3724 15000 0.1583
1.3815 15100 0.1586
1.3907 15200 0.1502
1.3998 15300 0.1199
1.4090 15400 0.1362
1.4181 15500 0.1502
1.4273 15600 0.191
1.4364 15700 0.1495
1.4456 15800 0.1313
1.4547 15900 0.1429
1.4639 16000 0.1004
1.4730 16100 0.1267
1.4822 16200 0.1382
1.4913 16300 0.1535
1.5005 16400 0.1328
1.5096 16500 0.1268
1.5188 16600 0.1819
1.5279 16700 0.133
1.5371 16800 0.1503
1.5462 16900 0.1217
1.5554 17000 0.1414
1.5645 17100 0.1413
1.5737 17200 0.124
1.5828 17300 0.1111
1.5919 17400 0.1641
1.6011 17500 0.1217
1.6102 17600 0.1148
1.6194 17700 0.1452
1.6285 17800 0.1245
1.6377 17900 0.1184
1.6468 18000 0.1333
1.6560 18100 0.1421
1.6651 18200 0.1243
1.6743 18300 0.1173
1.6834 18400 0.117
1.6926 18500 0.1145
1.7017 18600 0.1365
1.7109 18700 0.1404
1.7200 18800 0.1254
1.7292 18900 0.1131
1.7383 19000 0.1503
1.7475 19100 0.1429
1.7566 19200 0.1057
1.7658 19300 0.1221
1.7749 19400 0.1034
1.7841 19500 0.1154
1.7932 19600 0.1106
1.8024 19700 0.1568
1.8115 19800 0.1332
1.8207 19900 0.1238
1.8298 20000 0.1321
1.8390 20100 0.1629
1.8481 20200 0.135
1.8573 20300 0.1097
1.8664 20400 0.1233
1.8756 20500 0.1198
1.8847 20600 0.1151
1.8939 20700 0.1206
1.9030 20800 0.1295
1.9122 20900 0.126
1.9213 21000 0.147
1.9305 21100 0.1316
1.9396 21200 0.1019
1.9488 21300 0.1328
1.9579 21400 0.1127
1.9671 21500 0.1416
1.9762 21600 0.1428
1.9854 21700 0.1481
1.9945 21800 0.1169
2.0037 21900 0.1005
2.0128 22000 0.1114
2.0220 22100 0.1301
2.0311 22200 0.1554
2.0403 22300 0.1623
2.0494 22400 0.1153
2.0586 22500 0.1152
2.0677 22600 0.1406
2.0769 22700 0.1196
2.0860 22800 0.1172
2.0952 22900 0.1153
2.1043 23000 0.1126
2.1134 23100 0.1157
2.1226 23200 0.1102
2.1317 23300 0.1102
2.1409 23400 0.1198
2.1500 23500 0.1241
2.1592 23600 0.1124
2.1683 23700 0.1172
2.1775 23800 0.1161
2.1866 23900 0.1162
2.1958 24000 0.1209
2.2049 24100 0.1039
2.2141 24200 0.1183
2.2232 24300 0.1155
2.2324 24400 0.1168
2.2415 24500 0.1116
2.2507 24600 0.1173
2.2598 24700 0.1321
2.2690 24800 0.1217
2.2781 24900 0.1153
2.2873 25000 0.1464
2.2964 25100 0.101
2.3056 25200 0.1042
2.3147 25300 0.1382
2.3239 25400 0.1489
2.3330 25500 0.1187
2.3422 25600 0.1184
2.3513 25700 0.0971
2.3605 25800 0.0986
2.3696 25900 0.1114
2.3788 26000 0.1175
2.3879 26100 0.1136
2.3971 26200 0.1251
2.4062 26300 0.1097
2.4154 26400 0.1123
2.4245 26500 0.1446
2.4337 26600 0.1282
2.4428 26700 0.0988
2.4520 26800 0.1172
2.4611 26900 0.0903
2.4703 27000 0.1049
2.4794 27100 0.1043
2.4886 27200 0.1081
2.4977 27300 0.1265
2.5069 27400 0.1131
2.5160 27500 0.1403
2.5252 27600 0.1033
2.5343 27700 0.1175
2.5435 27800 0.1247
2.5526 27900 0.1115
2.5618 28000 0.1173
2.5709 28100 0.1209
2.5801 28200 0.0894
2.5892 28300 0.1238
2.5984 28400 0.1011
2.6075 28500 0.0976
2.6167 28600 0.0968
2.6258 28700 0.1065
2.6349 28800 0.1011
2.6441 28900 0.0975
2.6532 29000 0.1291
2.6624 29100 0.1118
2.6715 29200 0.0983
2.6807 29300 0.1119
2.6898 29400 0.0728
2.6990 29500 0.1241
2.7081 29600 0.1045
2.7173 29700 0.1186
2.7264 29800 0.1037
2.7356 29900 0.129
2.7447 30000 0.0921
2.7539 30100 0.1006
2.7630 30200 0.1068
2.7722 30300 0.099
2.7813 30400 0.0949
2.7905 30500 0.1066
2.7996 30600 0.1025
2.8088 30700 0.1148
2.8179 30800 0.1164
2.8271 30900 0.1147
2.8362 31000 0.1298
2.8454 31100 0.1245
2.8545 31200 0.087
2.8637 31300 0.1115
2.8728 31400 0.1129
2.8820 31500 0.1121
2.8911 31600 0.0985
2.9003 31700 0.1094
2.9094 31800 0.1296
2.9186 31900 0.1149
2.9277 32000 0.1146
2.9369 32100 0.1147
2.9460 32200 0.1045
2.9552 32300 0.0962
2.9643 32400 0.1065
2.9735 32500 0.1169
2.9826 32600 0.1162
2.9918 32700 0.1134

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
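
To approximate this environment, pinned installs along these lines should work (the +cu121 PyTorch build assumes a CUDA 12.1 setup; adjust to your hardware):

pip install "sentence-transformers==3.3.1" "transformers==4.47.0" "accelerate==1.2.1" "datasets==3.2.0" "tokenizers==0.21.0"
pip install "torch==2.5.1" --index-url https://download.pytorch.org/whl/cu121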

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}