Commit History
Add `layers_to_transform` for `lora_config` (#1118) 8487b97
Enable or disable bf16 support based on availability (#1116) 0865613
keep gate in fp32 for 16 bit loras (#1105) da97285
add gptneox embeddings, fix phi2 inputs, also fix the casting (#1083) 78c5b19
update sharegpt conversations when chatml chat template is set (#1075) [skip ci] 0ce1a65
be more robust about checking embedding modules for lora finetunes (#1074) [skip ci] 0f10080
attempt to also run e2e tests that need GPUs (#1070) 788649f
fix double eos token for chatml (#1054) [skip ci] 651b7a3
Phi2 rewrite (#1058) 732851f
RL/DPO (#935) f243c21
bump transformers and update attention class map name (#1023) bcc78d8
Feat: Warn to add to modules_to_save when adding tokens or switching special_tokens (#787) 1ffa386
fix mistral prompt assembly (#982) 7bbaac9
Fix prompt assembly for llama (#952) 5ada140
Respect sequence_len in config for `type: llama2_chat` (#926) f1de29d
support for mamba (#915) 40a6362
Feat(wandb): Refactor to be more flexible (#767) a1da39c
Feat: Add warmup_ratio (#893) fb12895
Phi update 202311 (#876) 9bf854e
add e2e tests for checking functionality of resume from checkpoint (#865) b3a61e8
use temp_dir kwarg instead 6dc68a6
missing dunder-init 7de6a56
chore: lint c74f045
make sure to clean up tmp output_dir for e2e tests 0402d19
simplify by removing duplicate base_model_config (#772) 2d8def6
Fix: Warn when fullfinetune without adapter (#770) 44c9d01
convert exponential notation lr to floats (#771) ca84cca
Fix: eval table conflict with eval_sample_packing (#769) 9923b72
remove lora fused packing test (#758) 21cf09b
Implement fused modules (#747) 15d3a65
misc sharegpt fixes (#723) f30afe4
Feat: Allow usage of native Mistral FA when no sample_packing (#669) 697c50d
add mistral e2e tests (#649) 5b0bc48
Fix(cfg): Add validation for save_strategy and eval_strategy (#633) 383f88d
use fastchat conversations template (#578) e7d3e2d
Fix: Fail bf16 check when running on cpu during merge (#631) cfbce02
better handling and logging of empty sharegpt turns (#603) a363604
misc fixes to add gptq tests (#621) 03e5907
Support Sample packing for phi arch (#586) 12a2dbb
E2e device cuda (#575) 2414673
e2e testing (#574) 9218ebe
Fix pretraining with iterable/streaming Dataset (#556) 2f586d1