Giuliano Giacaglia 🌲
@giu
Anthropic just announced Claude Opus 4 and Claude Sonnet 4! They lead on SWE-bench (72.5%), which tests practical software engineering skills, and on Terminal-bench (43.2%). https://www.anthropic.com/news/claude-4

Brian Kim
@brianjckim
Excited. Sonnet 3.7 is not cutting it.

gcmac.eth
@gcmac
What are your thoughts on the parallel execution for benchmarks? Makes it hard to compare to prior models imo

First Principale of Crypto
@dmr7228
Go Ahead