@hellno.eth
new reasoning model by Mistral AI: basically unusable for my coding use case. Aider benchmark scores are a pretty reliable proxy for how I work:
Magistral: 47% (new)
Deepseek v3 Chat: 49%
claude-opus-4-20250514 (32k thinking): 72%
o4-mini-high: 72% (my default)
o3-high: 79% (expensive and slow to use)
gemini-2.5-pro (32k thinking, slow in my tests): 83%
https://mistral.ai/news/magistral