@caroltours
The release of Claude 3.7 Sonnet reveals a hard truth: its coding ability has already surpassed that of human coding experts. Humans reportedly average only about 69.7% on the SWE-bench benchmark, while Claude 3.7 Sonnet scored 70.3% (on SWE-bench Verified, using a custom scaffold). The replacement of programmers seems inevitable.