It took me quite a bit of playing around with reasoning models, and of trying to get different models to produce efficient answers, to really understand this fact; when I first started playing with o3-mini, I more or less assumed that the proofs it claimed for the math problems I asked it were correct
And in some ways, they *were* correct. But oftentimes the model would:
a) [Least nefarious] Subtly make an assumption that immediately implies the conclusion ("Prove: A ⟹ C. Assume A ⟹ B ⟹ C. So A ⟹ C!"); see the sketch after this list
b) Go out of its way to "over-prove" a result
c) Hallucinate a reference (but usually from the right author!)
d) [Most nefarious] Get stuck rewriting the same (or a worse) wrong argument in multiple ways
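
To make (a) concrete, here is a minimal Lean sketch (the propositions A, B, and C are just placeholders, not from any particular problem I asked): once the intermediate links A → B and B → C are taken as hypotheses, the "proof" of A → C is a one-liner, and all the actual mathematical work has been quietly assumed away.

```lean
-- Failure mode (a) in miniature: the stated goal is A → C, but the
-- "proof" quietly takes A → B and B → C as hypotheses.
variable (A B C : Prop)

-- With those hypotheses in hand the conclusion is immediate;
-- nothing about A, B, or C has actually been established.
example (hAB : A → B) (hBC : B → C) : A → C :=
  fun hA => hBC (hAB hA)
```

Lean happily accepts this, which is exactly the problem: the statement that gets checked is much weaker than the statement that was asked for.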
Let's look at some examples!