HINT-lab/Qwen2.5-7B-Baseline-SFT • Text Generation • 8B
HINT-lab/Qwen2.5-7B-Confidence-SFT • Text Generation • 8B
HINT-lab/Qwen2.5-7B-Baseline-SFT-GRPO
HINT-lab/Qwen2.5-7B-Confidence-SFT-SGRPO
HINT-lab/HASS-Llama3-8B-Instruct-Reproduce
HINT-lab/EAGLE-Llama3-8B-Instruct-Reproduce
HINT-lab/PosS3-Llama2-13B-Chat
HINT-lab/PosS2-Llama2-13B-Chat
HINT-lab/PosS1-Llama2-13B-Chat
HINT-lab/PosS3-Llama3-8B-Instruct
HINT-lab/PosS2-Llama3-8B-Instruct
HINT-lab/PosS1-Llama3-8B-Instruct
HINT-lab/Llama-3.1-8B-Instruct-Self-Calibration • Text Generation • 8B
HINT-lab/Qwen2.5-7B-Instruct-Self-Calibration • Text Generation • 8B
HINT-lab/DeepSeek-R1-Distill-Qwen-1.5B-Self-Calibration • Text Generation • 2B
HINT-lab/mistral-7b-hermes-crm-skywork • 7B
HINT-lab/llama3-8b-crm-final-v0.1 • 8B
HINT-lab/llama3-8b-final-ppo-c-v0.3 • Text Generation • 8B
HINT-lab/mistral-7b-ppo-c-hermes • Text Generation • 7B
HINT-lab/llama3-8b-final-ppo-m-v0.3 • Text Generation • 8B
HINT-lab/mistral-7b-ppo-m-hermes • Text Generation • 7B
HINT-lab/llama3-8b-dpo-v0.2 • Text Generation • 8B
HINT-lab/llama3-8b-cdpo-v0.2 • Text Generation • 8B
HINT-lab/mistral-7b-ppo-hermes-v0.3 • Text Generation • 7B
HINT-lab/mistral-7b-ppo-clean-hermes • Text Generation • 7B
HINT-lab/llama3-8b-final-ppo-v0.3 • Text Generation • 8B
HINT-lab/llama3-8b-final-ppo-clean-v0.1 • Text Generation • 8B
HINT-lab/mistral-7b-hermes-rm-skywork • 7B
HINT-lab/mistral-7b-hermes-dpo-v0.2 • Text Generation • 7B
HINT-lab/mistral-7b-hermes-cdpo-v0.2 • Text Generation • 7B