@mayoteri
📊 Gemini 3 Flash sets a new score/cost Pareto frontier on ARC-AGI-2 across different test-time compute levels.
- Scores 84.7% on ARC-AGI-1 at $0.17 per task.
- Achieves 33.6% on ARC-AGI-2 at $0.23 per task.
- Provides competitive performance at lower cost than other frontier models.
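
For anyone unsure what "Pareto frontier" means here: a model is on the frontier if no other model scores higher at the same or lower cost per task. A minimal sketch of that check follows. Only the Gemini 3 Flash ARC-AGI-2 point comes from the post; the `hypothetical_*` entries are made-up placeholders for illustration, not real leaderboard results, and `pareto_frontier` is an illustrative helper name.

```python
def pareto_frontier(points):
    """Return the (name, score, cost) points not dominated by any other.

    Higher score is better, lower cost is better.
    """
    frontier = []
    # Walk from cheapest to most expensive; on a cost tie, best score first.
    for name, score, cost in sorted(points, key=lambda p: (p[2], -p[1])):
        # Keep a point only if it beats the best score seen at lower-or-equal cost.
        if not frontier or score > frontier[-1][1]:
            frontier.append((name, score, cost))
    return frontier


points = [
    ("Gemini 3 Flash", 33.6, 0.23),   # from the post: ARC-AGI-2 score (%), $ per task
    ("hypothetical_a", 20.0, 0.10),   # placeholder, not a real result
    ("hypothetical_b", 30.0, 0.50),   # placeholder, not a real result
    ("hypothetical_c", 45.0, 1.20),   # placeholder, not a real result
]

frontier = pareto_frontier(points)
for name, score, cost in frontier:
    print(f"{name}: {score}% at ${cost:.2f}/task")
```

With these placeholder numbers, `hypothetical_b` drops off the frontier because it costs more than Gemini 3 Flash while scoring lower; the other three stay on it, each trading cost for score.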