---
library_name: peft
license: other
base_model: deepseek-ai/deepseek-coder-1.3b-base
tags:
  - generated_from_trainer
model-index:
  - name: lemexp-task1-more_symbols_template_small-deepseek-coder-1.3b-base
    results: []
---

# lemexp-task1-more_symbols_template_small-deepseek-coder-1.3b-base

This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.1504
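
Because this repository holds a PEFT adapter rather than full model weights, the adapter must be loaded on top of the base model. Below is a minimal loading sketch; the adapter repo id is inferred from the model name above and may differ, and the prompt format for the lemexp task is not documented in this card.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "deepseek-ai/deepseek-coder-1.3b-base"
# Assumed adapter location, derived from the model name; adjust if needed.
adapter_id = "yalhessi/lemexp-task1-more_symbols_template_small-deepseek-coder-1.3b-base"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the fine-tuned adapter

# Placeholder prompt; the task's actual input template is not recorded here.
inputs = tokenizer("lemma:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```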

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 6
- mixed_precision_training: Native AMP
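
These values map onto Hugging Face `TrainingArguments` roughly as sketched below; `output_dir` and any setting not listed above are assumptions (library defaults), not recorded values.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters above.
args = TrainingArguments(
    output_dir="outputs",            # assumed, not recorded in the card
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    fp16=True,                       # "Native AMP" mixed precision
)
```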

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.3464        | 0.2000 | 2515  | 0.3264          |
| 0.2953        | 0.4001 | 5030  | 0.2775          |
| 0.2629        | 0.6001 | 7545  | 0.2489          |
| 0.2453        | 0.8001 | 10060 | 0.2290          |
| 0.2283        | 1.0002 | 12575 | 0.2216          |
| 0.2102        | 1.2002 | 15090 | 0.2148          |
| 0.2057        | 1.4002 | 17605 | 0.2071          |
| 0.1965        | 1.6003 | 20120 | 0.1968          |
| 0.185         | 1.8003 | 22635 | 0.1940          |
| 0.1894        | 2.0003 | 25150 | 0.1880          |
| 0.159         | 2.2003 | 27665 | 0.1870          |
| 0.1583        | 2.4004 | 30180 | 0.1786          |
| 0.1592        | 2.6004 | 32695 | 0.1774          |
| 0.1609        | 2.8004 | 35210 | 0.1720          |
| 0.1571        | 3.0005 | 37725 | 0.1670          |
| 0.1301        | 3.2005 | 40240 | 0.1694          |
| 0.1362        | 3.4005 | 42755 | 0.1690          |
| 0.1384        | 3.6006 | 45270 | 0.1595          |
| 0.1296        | 3.8006 | 47785 | 0.1582          |
| 0.1332        | 4.0006 | 50300 | 0.1557          |
| 0.1122        | 4.2007 | 52815 | 0.1581          |
| 0.1116        | 4.4007 | 55330 | 0.1538          |
| 0.1093        | 4.6007 | 57845 | 0.1525          |
| 0.1082        | 4.8008 | 60360 | 0.1534          |
| 0.1056        | 5.0008 | 62875 | 0.1500          |
| 0.0904        | 5.2008 | 65390 | 0.1529          |
| 0.0901        | 5.4009 | 67905 | 0.1516          |
| 0.0893        | 5.6009 | 70420 | 0.1489          |
| 0.0928        | 5.8009 | 72935 | 0.1504          |

### Framework versions

- PEFT 0.14.0
- Transformers 4.47.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
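
For reproduction, a quick sanity check that a local environment matches the versions above might look like this (the `torch` version string includes the CUDA build suffix, so an exact match assumes the cu124 wheel):

```python
# Compare installed package versions against those listed in the card.
import datasets
import peft
import tokenizers
import torch
import transformers

expected = {
    "peft": "0.14.0",
    "transformers": "4.47.0",
    "torch": "2.5.1+cu124",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    got = installed[name]
    print(f"{name}: expected {want}, got {got}" + ("" if got == want else "  <-- mismatch"))
```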