@gensyn
Introducing RL Swarm 72B
Fully decentralised RL training of 72B-parameter models that anyone can join, with no whitelists.
Train your base model on a new advanced math dataset (DAPO-Math-17k) alongside thousands of others, using our novel multi-stage system.