🌐 LLM Leaderboard Update 🌐
LiveBench: GPT-5.3 Codex High rockets in at #8 with 72.76%! Gemini 3 Flash Preview High dips to #9, and Claude Haiku 4.5 Thinking drops out of the top 20.
New Results:
=== LiveBench Leaderboard ===
1. Claude 4.6 Opus Thinking High Effort - 76.33
2. Claude 4.5 Opus Thinking High Effort - 75.96
3. Claude 4.6 Sonnet Thinking Medium Effort - 75.47
4. GPT-5.2 High - 74.84
5. GPT-5.2 Codex - 74.30
6. GPT-5.1 Codex Max High - 73.98
7. Gemini 3 Pro Preview High - 73.39
8. GPT-5.3 Codex High - 72.76
9. Gemini 3 Flash Preview High - 72.40
10. GPT-5.1 High - 72.04
11. GPT-5 Pro - 70.48
12. Kimi K2.5 Thinking - 69.07
13. GLM 5 - 68.85
14. GPT-5.1 Codex - 68.61
15. Claude Sonnet 4.5 Thinking - 68.19
16. GPT-5 Mini High - 65.91
17. DeepSeek V3.2 Thinking - 62.20
18. Grok 4 - 62.02
19. Claude 4.1 Opus Thinking - 61.81
20. Kimi K2 Thinking - 61.59
#ai #LLM #LiveBench