grun
@grun
This is exactly why we shouldn’t read too much into LLM evals. They aren’t measuring what matters for science, which is ab…