I know why it isn't working

#2
by Enderchef - opened

You cannot merge models of different sizes. A 32B and a 72B model have different layer counts and hidden dimensions, so their weight tensors do not line up and there is nothing to combine weight-by-weight; merging them is not valid.
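
For context, here is a minimal sketch (plain PyTorch, toy tensors standing in for real checkpoints, hidden sizes only approximating the 32B/72B models) of why that fails: element-wise merge methods need every parameter tensor to have the same shape.

```python
# Minimal sketch: naive weight averaging, the simplest merge method.
# It requires both checkpoints to have identical layer names and tensor
# shapes, which is exactly what a 32B vs. 72B pairing cannot provide.
import torch

def average_state_dicts(sd_a, sd_b):
    merged = {}
    for name, a in sd_a.items():
        b = sd_b[name]  # KeyError here if the models have different layers
        if a.shape != b.shape:
            raise ValueError(f"{name}: shape mismatch {tuple(a.shape)} vs {tuple(b.shape)}")
        merged[name] = (a + b) / 2  # element-wise average of the two weights
    return merged

# Toy stand-ins: a single projection matrix at roughly 32B vs. 72B hidden sizes.
sd_32b_like = {"layers.0.weight": torch.randn(5120, 5120)}
sd_72b_like = {"layers.0.weight": torch.randn(8192, 8192)}
average_state_dicts(sd_32b_like, sd_72b_like)  # raises ValueError: shape mismatch
```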

Idea: What if you merged deepseek-ai/DeepSeek-R1-Distill-Qwen-32B with other same-size Qwen variants for math, coding, and reasoning as an MoE? Since every expert would be a 32B Qwen model, the shapes would actually line up.
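
If you try that, something like the sketch below could be a starting point. It writes a mergekit-moe style config from Python; the field names (base_model, gate_mode, experts, positive_prompts), the choice of Qwen/Qwen2.5-Coder-32B-Instruct as the coding expert, and whether mergekit's MoE mode supports the Qwen architecture at all are assumptions to verify against the mergekit documentation. The one hard constraint is that every expert must share the same 32B Qwen architecture.

```python
# Hypothetical sketch of an MoE merge config in the style of mergekit-moe.
# Schema and expert choices are assumptions -- verify against mergekit docs.
import yaml

config = {
    "base_model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    "gate_mode": "hidden",   # route tokens via hidden-state similarity to the prompts below
    "dtype": "bfloat16",
    "experts": [
        {
            "source_model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
            "positive_prompts": ["reason step by step", "solve this math problem"],
        },
        {
            "source_model": "Qwen/Qwen2.5-Coder-32B-Instruct",  # assumed coding expert
            "positive_prompts": ["write a Python function", "fix this bug"],
        },
    ],
}

# Write the config; if the schema above is right, a command along the lines of
# `mergekit-moe qwen-32b-moe.yml ./merged-moe` would then build the model
# (treat that invocation as unverified as well).
with open("qwen-32b-moe.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```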
