@
https://opensea.io/collection/science-14
Chaos 🎩
@multifractal.eth
"Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad": all evaluated LLMs performed very poorly, with even the best-performing model achieving an average accuracy of less than 5%. https://arxiv.org/abs/2503.21934