@ottok
Is the #AI hype still on, or have the models plateaued?
I tested 9 flagship models (Claude 4.6, GPT-5.2, Gemini 3.1 Pro, Kimi K2.5, etc.) on my own mini-benchmark of novel tasks, with web search disabled, zero training contamination, and no cheating possible:
https://optimizedbyotto.com/post/ai-models-plateaued-or-not/