ByteDance-Seed/Seed-X-PPO-7B • Translation • 13k • 285
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics