@tg
^^ Funny, I just noticed that the original Neural Scaling Laws paper makes this same bad-intuition mistake!
"though performance must flatten out before reaching zero loss"
Well, here, "reaching zero loss" = "reaching infinite skill".
So no, there's no math reason the curve must ever flatten out, because zero loss only happens in the limit, and infinity can't ever be reached!
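Quick toy sketch of the point (illustrative constants, not the paper's actual fit): a pure power law keeps shrinking by the same factor every decade of scale, so it never flattens, yet it also never reaches zero for any finite N.

```python
# Toy power law L(N) = (N_c / N) ** alpha with illustrative constants
# (hypothetical values, not claimed to match the paper's fits).
N_c, alpha = 8.8e13, 0.076

def loss(N: float) -> float:
    """Loss under a pure power law: positive for every finite N."""
    return (N_c / N) ** alpha

for N in [1e6, 1e9, 1e12, 1e15, 1e18]:
    print(f"N = {N:.0e}  ->  loss = {loss(N):.3f}")

# The loss drops by the same multiplicative factor per decade of N:
# no flattening anywhere, and zero loss ("infinite skill") is only
# the limit as N -> infinity, which is never actually reached.
```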