RefalMachine/ruadapt_qwen2.5_3B_darulm_cl100k_extended_u60k_full_lr3e4_bs256 3B • Updated Oct 9, 2024 • 7
RefalMachine/ruadapt_qwen2.5_3B_darulm_cl100k_extended_u60k_full_lr2e4_bs256 3B • Updated Oct 8, 2024 • 7
RefalMachine/llama3_darulm_20_05_24_part1-2_128000_unigram_full_lr2e4_bs256_v2 Text Generation • 8B • Updated Jul 18, 2024 • 13
RefalMachine/llama3_darulm_20_05_24_part1-2_128000_bpe_full_lr2e4_bs256_v2 Text Generation • 8B • Updated Jul 18, 2024 • 13
RefalMachine/llama3_extended_darulm_20_05_24_part1-2_64000_bpe_full_lr2e4_bs256 Text Generation • 8B • Updated Jul 17, 2024 • 12
RefalMachine/llama3_darulm_20_05_24_part1-2_128000_bpe_full_lr2e4_bs256 Text Generation • 8B • Updated Jul 17, 2024 • 13
RefalMachine/llama3_darulm_20_05_24_part1-2_128000_unigram_full_lr2e4_bs256 Text Generation • 8B • Updated Jul 17, 2024 • 13
RefalMachine/llama3_cut_extended_darulm_20_05_24_part1-2_64000_64000_bpe_full_lr2e4_bs256 Text Generation • 8B • Updated Jul 14, 2024 • 15
RefalMachine/llama3_cut_extended_darulm_20_05_24_part1-2_64000_64000_bpe_full_lr1e4_bs256 Text Generation • 8B • Updated Jul 14, 2024 • 15
RefalMachine/llama3_extended_darulm_20_05_24_part1-2_64000_bpe_full_lr1e4_bs256 Text Generation • 8B • Updated Jul 14, 2024 • 14
RefalMachine/llama3_darulm_20_05_24_part1-2_128000_unigram_full_lr1e4_bs256 Text Generation • 8B • Updated Jul 14, 2024 • 13
RefalMachine/llama3_darulm_20_05_24_part1-2_128000_bpe_full_lr1e4_bs256 Text Generation • 8B • Updated Jul 14, 2024 • 13