Model merging requires the checkpoints to share the same architecture and parameter count, so merging a 32B and a 72B model is not valid.
Idea: what if you merged deepseek-ai/DeepSeek-R1-Distill-Qwen-32B with other same-size Qwen variants (math, coding, reasoning) into an MoE?
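One way to sketch this is with mergekit's MoE tooling, which builds a clown-car-style MoE by copying each checkpoint in as an expert and routing tokens via prompt-based gating. This is only an illustrative config, not a tested recipe: the second expert model name and the prompt strings are assumptions, and every expert must be a same-architecture, same-size (here 32B Qwen2.5-based) checkpoint.

```yaml
# Hypothetical mergekit-moe config (sketch, not validated).
# All source models are assumed to share the Qwen2.5 32B architecture.
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
gate_mode: hidden        # route using hidden-state similarity to the prompts below
dtype: bfloat16
experts:
  - source_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
    positive_prompts:
      - "Reason through this step by step"   # illustrative routing prompt
  - source_model: Qwen/Qwen2.5-Coder-32B-Instruct   # illustrative coding expert
    positive_prompts:
      - "Write a function that"              # illustrative routing prompt
```

If mergekit is installed, such a config would be run with something like `mergekit-moe config.yaml ./merged-moe`. Note this produces a merged MoE of same-size experts only; it does not lift the restriction on mixing 32B and 72B weights.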