SentenceTransformer

This is a sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Number of Parameters: ~111M (F32, safetensors)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
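
The pooling configuration above means the sentence embedding is the hidden state of the [CLS] token, not a mean over token embeddings. As a minimal sketch (not part of the official usage), the same computation with the transformers library directly would look like this:

import torch
from transformers import AutoModel, AutoTokenizer

repo = "Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_7_1"
tokenizer = AutoTokenizer.from_pretrained(repo)
bert = AutoModel.from_pretrained(repo)

batch = tokenizer(
    ["科目:コンクリート。名称:ポンプ圧送。"],
    padding=True, truncation=True, max_length=512,  # max_seq_length from the module config
    return_tensors="pt",
)
with torch.no_grad():
    out = bert(**batch)

# pooling_mode_cls_token=True: take the first ([CLS]) token's hidden state
embedding = out.last_hidden_state[:, 0]  # shape: [1, 768]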

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_7_1")
# Run inference
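# The example sentences are Japanese construction cost-estimate items
# (科目 = category, 名称 = item name, 摘要 = description, 備考 = remarks)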
sentences = [
    '科目:コンクリート。名称:基礎部マスコンクリート。',
    '科目:コンクリート。名称:基礎部普通コンクリート。摘要:FC30 S15AE減水剤。備考:コンクリー 1。',
    '科目:コンクリート。名称:ポンプ圧送。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
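
The introduction also lists semantic search among the intended uses. A minimal sketch with sentence_transformers.util (the corpus and query below are made-up examples in the same format):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_7_1")

corpus = [
    '科目:コンクリート。名称:基礎部マスコンクリート。',
    '科目:コンクリート。名称:ポンプ圧送。',
]
query = '科目:コンクリート。名称:コンクリートポンプ圧送。'

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Per query: a list of {'corpus_id': int, 'score': float} dicts,
# sorted by cosine similarity (highest first)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
print(hits[0])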

Training Details

Training Dataset

Unnamed Dataset

  • Size: 139,719 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:

    |         | sentence1                                          | sentence2                                          | label                             |
    |:--------|:---------------------------------------------------|:---------------------------------------------------|:----------------------------------|
    | type    | string                                             | string                                             | int                               |
    | details | min: 11 tokens, mean: 14.03 tokens, max: 19 tokens | min: 11 tokens, mean: 22.75 tokens, max: 72 tokens | 0: ~12.60%, 1: ~8.60%, 2: ~78.80% |
  • Samples:

    | sentence1 | sentence2 | label |
    |:----------|:----------|:------|
    | 科目:コンクリート。名称:コンクリートポンプ圧送。 | 科目:コンクリート。名称:ポンプ圧送。 | 1 |
    | 科目:コンクリート。名称:コンクリートポンプ圧送。 | 科目:コンクリート。名称:充填コンクリート(EXP_J内)。摘要:Fc18N/mm2 S18。備考:刊-コンクリート 1818物P100×100%。 | 0 |
    | 科目:コンクリート。名称:コンクリートポンプ圧送。 | 科目:コンクリート。名称:EXP_J充填コンクリート。 | 0 |
  • Loss: sentence_transformer_lib.categorical_constrastive_loss.CategoricalContrastiveLoss (a project-specific loss; an illustrative sketch follows below)
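
The CategoricalContrastiveLoss above is a project-specific class whose source is not included in this card. Purely as an illustration of the general idea, and NOT the actual implementation, a contrastive loss over cosine similarity with integer category labels might be shaped like this (the mapping of labels {0, 1, 2} to pull/push behavior is an assumption):

import torch
from torch import nn

class ToyCategoricalContrastiveLoss(nn.Module):
    """Hypothetical sketch: pull pairs together or push them apart
    depending on an integer label. The real loss may weight or group
    the categories differently."""
    def __init__(self, margin: float = 0.5):
        super().__init__()
        self.margin = margin

    def forward(self, emb1, emb2, labels):
        cos = nn.functional.cosine_similarity(emb1, emb2)
        pos = labels > 0  # assumption: nonzero labels mark related pairs
        loss_pos = (1 - cos[pos]).pow(2)  # pull related pairs together
        loss_neg = torch.clamp(cos[~pos] - self.margin, min=0).pow(2)  # push unrelated pairs below the margin
        return torch.cat([loss_pos, loss_neg]).mean()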

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 20
  • warmup_ratio: 0.2
  • fp16: True
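
As a minimal sketch (not the actual training script), these non-default values map onto the Sentence Transformers trainer API as follows; the base checkpoint, dataset, and loss below are stand-ins:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import ContrastiveLoss  # stand-in for the custom loss

# Stand-in checkpoint: the actual base model is not stated in this card
model = SentenceTransformer("bert-base-multilingual-cased")

# Toy dataset matching the sentence1/sentence2/label column layout
train_dataset = Dataset.from_dict({
    "sentence1": ["科目:コンクリート。名称:コンクリートポンプ圧送。"],
    "sentence2": ["科目:コンクリート。名称:ポンプ圧送。"],
    "label": [1],
})

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=1e-5,
    weight_decay=0.01,
    num_train_epochs=20,
    warmup_ratio=0.2,
    fp16=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=ContrastiveLoss(model),
)
trainer.train()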

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.1832 50 0.6905
0.3663 100 0.2528
0.5495 150 0.1824
0.7326 200 0.1544
0.9158 250 0.14
1.0989 300 0.1272
1.2821 350 0.1135
1.4652 400 0.1001
1.6484 450 0.0987
1.8315 500 0.0887
2.0147 550 0.0804
2.1978 600 0.074
2.3810 650 0.0713
2.5641 700 0.0666
2.7473 750 0.06
2.9304 800 0.0601
3.1136 850 0.0494
3.2967 900 0.0472
3.4799 950 0.046
3.6630 1000 0.0441
3.8462 1050 0.0416
4.0293 1100 0.0373
4.2125 1150 0.034
4.3956 1200 0.0308
4.5788 1250 0.0308
4.7619 1300 0.0311
4.9451 1350 0.0273
5.1282 1400 0.0225
5.3114 1450 0.0231
5.4945 1500 0.0218
5.6777 1550 0.0209
5.8608 1600 0.0193
6.0440 1650 0.0182
6.2271 1700 0.0161
6.4103 1750 0.0161
6.5934 1800 0.0162
6.7766 1850 0.0146
6.9597 1900 0.0146
7.1429 1950 0.0126
7.3260 2000 0.0118
7.5092 2050 0.012
7.6923 2100 0.0118
7.8755 2150 0.0116
8.0586 2200 0.0121
8.2418 2250 0.0098
8.4249 2300 0.0099
8.6081 2350 0.0094
8.7912 2400 0.0089
8.9744 2450 0.009
9.1575 2500 0.0079
9.3407 2550 0.0082
9.5238 2600 0.0077
9.7070 2650 0.0074
9.8901 2700 0.008
10.0733 2750 0.0074
10.2564 2800 0.0065
10.4396 2850 0.0069
10.6227 2900 0.0067
10.8059 2950 0.0063
10.9890 3000 0.0064
11.1722 3050 0.0057
11.3553 3100 0.0058
11.5385 3150 0.0055
11.7216 3200 0.005
11.9048 3250 0.0055
12.0879 3300 0.0049
12.2711 3350 0.0041
12.4542 3400 0.0045
12.6374 3450 0.0045
12.8205 3500 0.0052
13.0037 3550 0.0054
13.1868 3600 0.005
13.3700 3650 0.0041
13.5531 3700 0.0039
13.7363 3750 0.004
13.9194 3800 0.0043
14.1026 3850 0.0037
14.2857 3900 0.0036
14.4689 3950 0.0038
14.6520 4000 0.0037
14.8352 4050 0.0042
15.0183 4100 0.004
15.2015 4150 0.0036
15.3846 4200 0.0036
15.5678 4250 0.0032
15.7509 4300 0.0032
15.9341 4350 0.0028
16.1172 4400 0.0032
16.3004 4450 0.0027
16.4835 4500 0.0034
16.6667 4550 0.0035
16.8498 4600 0.0032
17.0330 4650 0.0035
17.2161 4700 0.0031
17.3993 4750 0.003
17.5824 4800 0.003
17.7656 4850 0.0029
17.9487 4900 0.0029
18.1319 4950 0.0022
18.3150 5000 0.0034
18.4982 5050 0.0028
18.6813 5100 0.0026
18.8645 5150 0.0028
19.0476 5200 0.0025
19.2308 5250 0.0027
19.4139 5300 0.0029
19.5971 5350 0.0026
19.7802 5400 0.0027
19.9634 5450 0.0029

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 2.14.4
  • Tokenizers: 0.21.1
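
To approximate this environment, the versions above can be pinned (PyTorch 2.6.0+cu124 is installed separately from the CUDA 12.4 wheel index):

pip install sentence-transformers==4.1.0 transformers==4.51.3 accelerate==1.6.0 datasets==2.14.4 tokenizers==0.21.1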

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}