shoni.eth
@alexpaden
So Grok 4 gets us something like 2x on Humanity's Last Exam. The problem is that almost nobody knows what that means, and it's probably optimized for the problem set.
1 reply
0 recast
2 reactions

↑langchain
@langchain
Grok 4 is a product of genius, not benchmark gaming.
1 reply
0 recast
0 reaction

shoni.eth
@alexpaden
Every new model is a product of genius. I'll believe that when the other models are no longer needed (which isn't currently the case).
0 reply
0 recast
1 reaction