AI Model Rankings: Artificial Analysis LLM Leaderboard

The Artificial Analysis AI model leaderboard compares and ranks more than 100 AI models (LLMs) by intelligence, price, and results on common AI benchmarks.

AI Model Leaderboard Data Center

Rank Model Overall Index (descending) Coding Price ($/1M tokens)
1 GPT-5.4 (xhigh) 57.2 57.3 $5.625
2 Gemini 3.1 Pro Preview 57.2 55.5 $4.5
3 GPT-5.3 Codex (xhigh) 54 53.1 $4.813
4 Claude Opus 4.6 (Adaptive Reasoning, Max Effort) 53 48.1 $10
5 Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) 51.7 50.9 $6
6 GPT-5.2 (xhigh) 51.3 48.7 $4.813
7 GLM-5 (Reasoning) 49.8 44.2 $1.55
8 Claude Opus 4.5 (Reasoning) 49.7 47.8 $10
9 MiniMax-M2.7 49.6 41.9 $0.525
10 MiMo-V2-Pro 49.2 41.4 $1.5
11 GPT-5.2 Codex (xhigh) 49 43 $4.813
12 Grok 4.20 Beta 0309 (Reasoning) 48.5 42.2 $3
13 Gemini 3 Pro Preview (high) 48.4 46.5 $4.5
14 GPT-5.4 mini (xhigh) 48.1 51.5 $1.688
15 GPT-5.1 (high) 47.7 44.7 $3.438
16 Kimi K2.5 (Reasoning) 46.8 39.5 $1.2
17 GLM-5-Turbo 46.8 36.8 $0
18 GPT-5.2 (medium) 46.6 44.2 $4.813
19 Claude Opus 4.6 (Non-reasoning, High Effort) 46.5 47.6 $10
20 Gemini 3 Flash Preview (Reasoning) 46.4 42.6 $1.125
21 Qwen3.5 397B A17B (Reasoning) 45 41.3 $1.35
22 GPT-5 (high) 44.6 36 $3.438
23 GPT-5 Codex (high) 44.6 38.9 $3.438
24 GPT-5.4 nano (xhigh) 44.4 43.9 $0.463
25 Claude Sonnet 4.6 (Non-reasoning, High Effort) 44.4 46.4 $6
26 MiMo-V2-Omni 43.4 35.5 $0
27 GPT-5.1 Codex (high) 43.1 36.6 $3.438
28 Claude Opus 4.5 (Non-reasoning) 43.1 42.9 $10
29 Claude 4.5 Sonnet (Reasoning) 43 38.6 $6
30 Claude Sonnet 4.6 (Non-reasoning, Low Effort) 42.6 43 $6
31 Qwen3.5 27B (Reasoning) 42.1 34.9 $0.825
32 GLM-4.7 (Reasoning) 42.1 36.3 $1
33 GPT-5 (medium) 42 39 $3.438
34 Claude 4.1 Opus (Reasoning) 42 36.5 $30
35 MiniMax-M2.5 41.9 37.4 $0.525
36 DeepSeek V3.2 (Reasoning) 41.7 36.7 $0.315
37 Qwen3.5 122B A10B (Reasoning) 41.6 34.7 $1.1
38 MiMo-V2-Flash (Feb 2026) 41.5 33.5 $0.15
39 Grok 4 41.5 40.5 $6
40 Gemini 3 Pro Preview (low) 41.3 39.4 $4.5
41 GPT-5 mini (high) 41.2 35.3 $0.688
42 Kimi K2 Thinking 40.9 34.8 $1.075
43 o3-pro 40.7 - $35
44 GLM-5 (Non-reasoning) 40.6 39 $1.55
45 Qwen3.5 397B A17B (Non-reasoning) 40.1 37.4 $1.35
46 Qwen3 Max Thinking 39.9 30.5 $2.4
47 MiniMax-M2.1 39.4 32.8 $0.525
48 GPT-5 (low) 39.2 30.7 $3.438
49 MiMo-V2-Flash (Reasoning) 39.2 31.8 $0.15
50 Claude 4 Opus (Reasoning) 39 34 $30
51 GPT-5 mini (medium) 38.9 32.9 $0.688
52 Claude 4 Sonnet (Reasoning) 38.7 34.1 $6
53 GPT-5.1 Codex mini (high) 38.6 36.4 $0.688
54 Grok 4.1 Fast (Reasoning) 38.6 30.9 $0.275
55 o3 38.4 38.4 $3.5
56 GPT-5.4 nano (medium) 38.1 35 $0.463
57 Step 3.5 Flash 37.8 31.6 $0.15
58 GPT-5.4 mini (medium) 37.7 37.5 $1.688
59 Kimi K2.5 (Non-reasoning) 37.3 25.8 $1.2
60 Qwen3.5 27B (Non-reasoning) 37.2 33.4 $0.825
61 Claude 4.5 Haiku (Reasoning) 37.1 32.6 $2
62 Qwen3.5 35B A3B (Reasoning) 37.1 30.3 $0.688
63 Claude 4.5 Sonnet (Non-reasoning) 37.1 33.5 $6
64 MiniMax-M2 36.1 29.2 $0.525
65 NVIDIA Nemotron 3 Super 120B A12B (Reasoning) 36 31.2 $0.412
66 KAT-Coder-Pro V1 36 18.3 $0.525
67 Claude 4.1 Opus (Non-reasoning) 36 - $30
68 Qwen3.5 122B A10B (Non-reasoning) 35.9 31.6 $1.1
69 Nova 2.0 Pro Preview (medium) 35.7 30.4 $3.438
70 GPT-5.4 (Non-reasoning) 35.4 41 $5.625
71 Grok 4 Fast (Reasoning) 35.1 27.4 $0.275
72 Gemini 3 Flash Preview (Non-reasoning) 35 37.8 $1.125
73 Claude 3.7 Sonnet (Reasoning) 34.7 27.6 $6
74 Gemini 2.5 Pro 34.6 31.9 $3.438
75 GLM-4.7 (Non-reasoning) 34.2 32 $0.938
76 DeepSeek V3.1 Terminus (Reasoning) 33.9 33.7 $0.8
77 GPT-5.2 (Non-reasoning) 33.6 34.7 $4.813
78 Gemini 3.1 Flash-Lite Preview 33.5 30.1 $0.563
79 Doubao Seed Code 33.5 31.3 $0
80 gpt-oss-120B (high) 33.3 28.6 $0.263
81 o4-mini (high) 33.1 25.6 $1.925
82 Claude 4 Opus (Non-reasoning) 33 - $30
83 Claude 4 Sonnet (Non-reasoning) 33 30.6 $6
84 DeepSeek V3.2 Exp (Reasoning) 32.9 33.3 $0.315
85 Mercury 2 32.8 30.6 $0.375
86 GLM-4.6 (Reasoning) 32.5 29.5 $0.981
87 Qwen3 Max Thinking (Preview) 32.5 24.5 $2.4
88 Qwen3.5 9B (Reasoning) 32.4 25.3 $0.113
89 DeepSeek V3.2 (Non-reasoning) 32.1 34.6 $0.315
90 Grok 3 mini Reasoning (high) 32.1 25.2 $0.35
91 K-EXAONE (Reasoning) 32.1 27 $0
92 Nova 2.0 Pro Preview (low) 31.9 24.5 $3.438
93 Qwen3 Max 31.4 26.4 $2.4
94 Claude 4.5 Haiku (Non-reasoning) 31.1 29.6 $2
95 Gemini 2.5 Flash Preview (Sep '25) (Reasoning) 31.1 24.6 $0
96 Kimi K2 0905 30.9 25.9 $1.137
97 o1 30.8 20.5 $26.25
98 Claude 3.7 Sonnet (Non-reasoning) 30.8 26.7 $6
99 Qwen3.5 35B A3B (Non-reasoning) 30.7 16.8 $0.688
100 MiMo-V2-Flash (Non-reasoning) 30.4 25.8 $0.15
101 Gemini 2.5 Pro Preview (Mar' 25) 30.3 46.7 $0
102 GLM-4.6 (Non-reasoning) 30.2 30.2 $1
103 GLM-4.7-Flash (Reasoning) 30.1 25.9 $0.152
104 Grok 4.20 Beta 0309 (Non-reasoning) 29.7 25.4 $3
105 Nova 2.0 Lite (medium) 29.7 23.9 $0.85
106 Gemini 2.5 Pro Preview (May' 25) 29.5 - $3.438
107 Qwen3 235B A22B 2507 (Reasoning) 29.5 23.2 $2.625
108 DeepSeek V3.2 Speciale 29.4 37.9 $0
109 ERNIE 5.0 Thinking Preview 29.1 29.2 $0
110 Grok Code Fast 1 28.7 23.7 $0.525
111 DeepSeek V3.1 Terminus (Non-reasoning) 28.5 31.9 $0.626
112 DeepSeek V3.2 Exp (Non-reasoning) 28.4 30 $0.315
113 Qwen3 Coder Next 28.3 22.9 $0.6
114 Apriel-v1.5-15B-Thinker 28.3 18.7 $0
115 DeepSeek V3.1 (Non-reasoning) 28.1 28.4 $0.84
116 Nova 2.0 Omni (medium) 28 15.1 $0.85
117 DeepSeek V3.1 (Reasoning) 27.7 29.7 $0.875
118 Apriel-v1.6-15B-Thinker 27.6 22 $0
119 Qwen3 VL 235B A22B (Reasoning) 27.6 20.9 $2.625
120 GPT-5.1 (Non-reasoning) 27.4 27.3 $3.438
121 Qwen3.5 9B (Non-reasoning) 27.3 21.4 $0
122 Magistral Medium 1.2 27.1 21.7 $2.75
123 DeepSeek R1 0528 (May '25) 27.1 24 $2.362
124 Qwen3.5 4B (Reasoning) 27.1 17.5 $0
125 Gemini 2.5 Flash (Reasoning) 27 22.2 $0.85
126 Mistral Small 4 (Reasoning) 26.9 24.3 $0.263
127 GPT-5 nano (high) 26.8 20.3 $0.138
128 Qwen3 Next 80B A3B (Reasoning) 26.7 19.5 $1.875
129 GLM-4.5 (Reasoning) 26.4 26.3 $0.843
130 GPT-4.1 26.3 21.8 $3.5
131 Kimi K2 26.3 22.1 $1.002
132 Qwen3 Max (Preview) 26.1 25.5 $2.4
133 GPT-5 nano (medium) 25.9 22.9 $0.138
134 o3-mini 25.9 17.9 $1.925
135 o1-pro 25.8 - $262.5
136 Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) 25.7 22.1 $0
137 o3-mini (high) 25.2 17.3 $1.925
138 Grok 3 25.2 19.8 $6
139 Seed-OSS-36B-Instruct 25.2 16.7 $0.3
140 Qwen3 235B A22B 2507 Instruct 25 22.1 $1.225
141 Qwen3 Coder 480B A35B Instruct 24.8 24.6 $3
142 Qwen3 VL 32B (Reasoning) 24.7 14.5 $2.625
143 Nova 2.0 Lite (low) 24.6 13.6 $0.85
144 Sonar Reasoning Pro 24.6 - $0
145 gpt-oss-120B (low) 24.5 15.5 $0.263
146 gpt-oss-20B (high) 24.5 18.5 $0.094
147 GPT-5.4 nano (Non-Reasoning) 24.4 27.9 $0.463
148 MiniMax M1 80k 24.4 14.5 $0.963
149 NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) 24.3 19 $0.105
150 Gemini 2.5 Flash Preview (Reasoning) 24.3 - $0
151 K2 Think V2 24.1 15.5 $0
152 LongCat Flash Lite 23.9 16.5 $0
153 GPT-5 (minimal) 23.9 25.1 $3.438
154 HyperCLOVA X SEED Think (32B) 23.7 17.5 $0
155 o1-preview 23.7 34 $28.875
156 Grok 4.1 Fast (Non-reasoning) 23.6 19.5 $0.275
157 K-EXAONE (Non-reasoning) 23.4 13.5 $0
158 GLM-4.6V (Reasoning) 23.4 19.7 $0.45
159 GPT-5.4 mini (Non-Reasoning) 23.3 25.3 $1.688
160 Nova 2.0 Omni (low) 23.2 13.9 $0.85
161 GLM-4.5-Air 23.2 23.8 $0.425
162 Nova 2.0 Pro Preview (Non-reasoning) 23.1 20.5 $3.438
163 Mi:dm K 2.5 Pro 23.1 12.6 $0
164 Grok 4 Fast (Non-reasoning) 23.1 19 $0.275
165 GPT-4.1 mini 22.9 18.5 $0.7
166 Mistral Large 3 22.8 22.7 $0.75
167 Ring-1T 22.8 16.8 $0
168 Qwen3.5 4B (Non-reasoning) 22.6 13.7 $0
169 Qwen3 30B A3B 2507 (Reasoning) 22.4 14.7 $0.75
170 DeepSeek V3 0324 22.3 22 $1.25
171 INTELLECT-3 22.2 19.1 $0
172 GLM-4.7-Flash (Non-reasoning) 22.1 11 $0.152
173 Devstral 2 22 23.7 $0
174 GPT-5 (ChatGPT) 21.8 21.2 $3.438
175 Solar Open 100B (Reasoning) 21.7 10.5 $0
176 Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) 21.6 18.1 $0.175
177 Grok 3 Reasoning Beta 21.6 - $0
178 Mistral Medium 3.1 21.3 18.3 $0.8
179 MiniMax M1 40k 20.9 14.1 $0
180 gpt-oss-20B (low) 20.8 14.4 $0.094
181 Qwen3 VL 235B A22B Instruct 20.8 16.5 $1.225
182 GPT-5 mini (minimal) 20.7 21.9 $0.688
183 K2-V2 (high) 20.6 16.1 $0
184 Gemini 2.5 Flash (Non-reasoning) 20.6 17.8 $0.85
185 o1-mini 20.4 - $0
186 Qwen3 Next 80B A3B Instruct 20.1 15.3 $0.875
187 Tri-21B-think Preview 20 7.4 $0
188 GPT-4.5 (Preview) 20 - $0
189 Qwen3 Coder 30B A3B Instruct 20 19.4 $0.9
190 Qwen3 235B A22B (Reasoning) 19.8 17.4 $2.625
191 QwQ 32B 19.7 - $0.745
192 Qwen3 VL 30B A3B (Reasoning) 19.7 13.1 $0.75
193 Gemini 2.0 Flash Thinking Experimental (Jan '25) 19.6 24.1 $0
194 Devstral Small 2 19.5 20.7 $0
195 Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) 19.4 14.5 $0.175
196 Motif-2-12.7B-Reasoning 19.1 11.9 $0
197 Nova Premier 19 13.8 $5
198 Ling-1T 19 18.8 $0
199 Mistral Medium 3 18.8 13.6 $0.8
200 Magistral Medium 1 18.8 16 $0
201 DeepSeek R1 (Jan '25) 18.8 15.9 $2.362
202 Solar Pro 2 (Preview) (Reasoning) 18.8 - $0
203 Llama Nemotron Super 49B v1.5 (Reasoning) 18.7 15.2 $0.175
204 K2-V2 (medium) 18.7 14 $0
205 Claude 3.5 Haiku 18.7 10.7 $1.6
206 Devstral Medium 18.7 15.9 $0.8
207 Mistral Small 4 (Non-reasoning) 18.6 16.4 $0.263
208 Hermes 4 - Llama-3.1 405B (Reasoning) 18.6 16 $1.5
209 Tri-21B-Think 18.6 6.3 $0
210 GPT-4o (Aug '24) 18.6 16.6 $4.375
211 GPT-4o (March 2025, chatgpt-4o-latest) 18.6 - $0
212 Llama 3.3 Nemotron Super 49B v1 (Reasoning) 18.5 9.4 $0
213 Gemini 2.0 Flash (Feb '25) 18.5 13.6 $0.263
214 Llama 4 Maverick 18.4 15.6 $0.487
215 Magistral Small 1.2 18.2 14.8 $0.75
216 Sarvam 105B (high) 18.2 9.8 $0
217 Qwen3 4B 2507 (Reasoning) 18.2 9.5 $0
218 Gemini 2.0 Pro Experimental (Feb '25) 18.1 25.5 $0
219 Nova 2.0 Lite (Non-reasoning) 18 12.5 $0.85
220 Claude 3 Opus 18 19.5 $30
221 Devstral Small (May '25) 18 12.2 $0.075
222 Sonar Reasoning 17.9 - $0
223 Gemini 2.5 Flash Preview (Non-reasoning) 17.8 - $0
224 Hermes 4 - Llama-3.1 405B (Non-reasoning) 17.6 18.1 $1.5
225 Gemini 2.5 Flash-Lite (Reasoning) 17.6 9.5 $0.175
226 Llama 3.1 Instruct 405B 17.4 14.5 $3.688
227 GPT-4o (Nov '24) 17.3 16.7 $4.375
228 DeepSeek R1 Distill Qwen 32B 17.2 - $0.27
229 Qwen3 VL 32B Instruct 17.2 15.6 $1.225
230 GLM-4.6V (Non-reasoning) 17.1 11.1 $0.45
231 Qwen3 235B A22B (Non-reasoning) 17 14 $1.225
232 Gemini 2.0 Flash (experimental) 16.8 - $0
233 Magistral Small 1 16.8 11.1 $0
234 EXAONE 4.0 32B (Reasoning) 16.7 14 $0
235 Qwen3 VL 8B (Reasoning) 16.7 9.8 $0.66
236 Nova 2.0 Omni (Non-reasoning) 16.6 13.8 $0.85
237 DeepSeek V3 (Dec '24) 16.5 16.4 $0.625
238 Qwen3 32B (Reasoning) 16.5 13.8 $2.625
239 DeepSeek R1 0528 Qwen3 8B 16.4 7.8 $0
240 Qwen3.5 2B (Reasoning) 16.3 3.5 $0
241 Qwen2.5 Max 16.3 - $2.8
242 Qwen3 14B (Reasoning) 16.2 13.1 $1.313
243 Nanbeige4.1-3B 16.1 8.9 $0
244 Qwen3 VL 30B A3B Instruct 16.1 14.3 $0.35
245 Ministral 3 14B 16 10.9 $0.2
246 DeepSeek R1 Distill Llama 70B 16 11.4 $0.875
247 Hermes 4 - Llama-3.1 70B (Reasoning) 16 14.4 $0.198
248 Gemini 1.5 Pro (Sep '24) 16 23.6 $0
249 Solar Pro 2 (Preview) (Non-reasoning) 16 - $0
250 Claude 3.5 Sonnet (Oct '24) 15.9 30.2 $6
251 Falcon-H1R-7B 15.8 9.8 $0
252 DeepSeek R1 Distill Qwen 14B 15.8 - $0
253 Ling-flash-2.0 15.7 16.7 $0.247
254 Qwen3 Omni 30B A3B (Reasoning) 15.6 12.7 $0.43
255 Qwen2.5 Instruct 72B 15.6 11.9 $0
256 Sonar 15.5 - $0
257 Step3 VL 10B 15.4 13.9 $0
258 Qwen3 30B A3B (Reasoning) 15.3 11 $0.75
259 Devstral Small (Jul '25) 15.2 12.1 $0.15
260 Sonar Pro 15.2 - $0
261 QwQ 32B-Preview 15.2 - $0.135
262 Mistral Large 2 (Nov '24) 15.1 13.8 $3
263 Mistral Small 3.2 15.1 13.3 $0.15
264 GLM-4.5V (Reasoning) 15.1 10.9 $0.9
265 Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) 15 13.1 $0.9
266 ERNIE 4.5 300B A47B 15 14.5 $0.485
267 Qwen3 30B A3B 2507 Instruct 15 14.2 $0.35
268 Solar Pro 2 (Reasoning) 14.9 12.1 $0
269 NVIDIA Nemotron Nano 12B v2 VL (Reasoning) 14.9 11.8 $0.3
270 Ministral 3 8B 14.8 10 $0.15
271 NVIDIA Nemotron Nano 9B V2 (Reasoning) 14.8 8.3 $0.07
272 NVIDIA Nemotron 3 Nano 4B 14.7 10 $0
273 Qwen3.5 2B (Non-reasoning) 14.7 4.9 $0
274 Gemini 2.0 Flash-Lite (Feb '25) 14.7 - $0
275 Llama Nemotron Super 49B v1.5 (Non-reasoning) 14.6 10.5 $0.175
276 Llama 3.3 Instruct 70B 14.5 10.7 $0.64
277 GPT-4o (May '24) 14.5 24.2 $7.5
278 Gemini 2.0 Flash-Lite (Preview) 14.5 - $0
279 Mistral Small 3.1 14.5 13.9 $0.15
280 Qwen3 32B (Non-reasoning) 14.5 - $1.225
281 Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) 14.4 - $0
282 Kimi Linear 48B A3B Instruct 14.4 14.2 $0
283 K2-V2 (low) 14.4 10.5 $0
284 Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) 14.3 7.6 $0
285 Qwen3 VL 8B Instruct 14.3 7.3 $0.31
286 Claude 3.5 Sonnet (June '24) 14.2 26 $6
287 Qwen3 4B (Reasoning) 14.2 - $0.398
288 GPT-4o (ChatGPT) 14.1 - $0
289 Llama 3.1 Tulu3 405B 14.1 - $0
290 Ring-flash-2.0 14 10.6 $0.247
291 Pixtral Large 14 - $3
292 Olmo 3.1 32B Think 13.9 9.8 $0
293 Grok 2 (Dec '24) 13.9 - $0
294 GPT-5 nano (minimal) 13.8 14.2 $0.138
295 Gemini 1.5 Flash (Sep '24) 13.8 - $0
296 GPT-4 Turbo 13.7 21.5 $15
297 Qwen3 VL 4B (Reasoning) 13.7 6.7 $0
298 Solar Pro 2 (Non-reasoning) 13.6 11.3 $0
299 Llama 4 Scout 13.5 6.7 $0.292
300 Command A 13.5 9.9 $4.375
301 Nova Pro 13.5 11 $1.4
302 Llama 3.1 Nemotron Instruct 70B 13.4 10.8 $1.2
303 Grok Beta 13.3 - $0
304 NVIDIA Nemotron Nano 9B V2 (Non-reasoning) 13.2 7.5 $0.086
305 NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) 13.2 15.8 $0.087
306 Qwen3 8B (Reasoning) 13.2 9 $0.66
307 Qwen2.5 Instruct 32B 13.2 - $0
308 GPT-4.1 nano 13 11.2 $0.175
309 Mistral Large 2 (Jul '24) 13 - $3
310 Qwen2.5 Coder Instruct 32B 12.9 - $0
311 Qwen3 4B 2507 Instruct 12.9 9.1 $0
312 GPT-4 12.8 13.1 $37.5
313 Qwen3 14B (Non-reasoning) 12.8 12.4 $0.613
314 Gemini 2.5 Flash-Lite (Non-reasoning) 12.7 7.4 $0.175
315 Mistral Small 3 12.7 - $0.15
316 Nova Lite 12.7 5.1 $0.105
317 GLM-4.5V (Non-reasoning) 12.7 10.8 $0.9
318 Hermes 4 - Llama-3.1 70B (Non-reasoning) 12.6 9.2 $0.198
319 GPT-4o mini 12.6 - $0.263
320 Llama 3.1 Instruct 70B 12.5 10.9 $0.56
321 DeepSeek-V2.5 (Dec '24) 12.5 - $0
322 Qwen3 4B (Non-reasoning) 12.5 - $0.188
323 Qwen3 30B A3B (Non-reasoning) 12.5 13.3 $0.35
324 Sarvam 30B (high) 12.3 7.9 $0
325 Gemini 2.0 Flash Thinking Experimental (Dec '24) 12.3 - $0
326 Claude 3 Haiku 12.3 6.7 $0.5
327 DeepSeek-V2.5 12.3 - $0
328 Olmo 3.1 32B Instruct 12.2 5.6 $0.3
329 Mistral Saba 12.1 - $0
330 DeepSeek R1 Distill Llama 8B 12.1 - $0
331 Olmo 3 32B Think 12.1 10.5 $0
332 R1 1776 12 - $0
333 Gemini 1.5 Pro (May '24) 12 19.8 $0
334 Reka Flash (Sep '24) 12 - $0.35
335 Qwen2.5 Turbo 12 - $0.087
336 Llama 3.2 Instruct 90B (Vision) 11.9 - $0.72
337 Solar Mini 11.9 - $0.15
338 Llama 3.1 Instruct 8B 11.8 4.9 $0.1
339 Grok-1 11.7 - $0
340 EXAONE 4.0 32B (Non-reasoning) 11.7 9.4 $0
341 Qwen2 Instruct 72B 11.7 - $0
342 Ministral 3 3B 11.2 4.8 $0.1
343 Gemini 1.5 Flash-8B 11.1 - $0
344 DeepHermes 3 - Mistral 24B Preview (Non-reasoning) 10.9 - $0
345 Jamba 1.7 Large 10.9 7.8 $3.5
346 Granite 4.0 H Small 10.8 8.5 $0.107
347 Qwen3 Omni 30B A3B Instruct 10.7 7.2 $0.43
348 Jamba 1.5 Large 10.7 - $3.5
349 DeepSeek-Coder-V2 10.6 - $0
350 OLMo 2 32B 10.6 2.7 $0
351 Hermes 3 - Llama-3.1 70B 10.6 - $0.3
352 Jamba 1.6 Large 10.6 - $3.5
353 Qwen3 8B (Non-reasoning) 10.6 7.1 $0.31
354 LFM2 24B A2B 10.5 3.6 $0.052
355 Qwen3.5 0.8B (Reasoning) 10.5 0 $0
356 Gemini 1.5 Flash (May '24) 10.5 - $0
357 Phi-4 10.4 11.2 $0.219
358 Gemma 3 27B Instruct 10.3 9.6 $0
359 Nova Micro 10.3 4.1 $0.061
360 Claude 3 Sonnet 10.3 - $6
361 Mistral Small (Sep '24) 10.2 - $0.3
362 NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) 10.1 5.9 $0.3
363 Gemma 3n E4B Instruct Preview (May '25) 10.1 - $0
364 Gemini 1.0 Ultra 10.1 17.6 $0
365 Phi-3 Mini Instruct 3.8B 10.1 3 $0
366 Phi-4 Multimodal Instruct 10 - $0
367 Qwen2.5 Coder Instruct 7B 10 - $0
368 Qwen3.5 0.8B (Non-reasoning) 9.9 1 $0
369 Mistral Large (Feb '24) 9.9 - $6
370 Mixtral 8x22B Instruct 9.8 - $0
371 Llama 3.2 Instruct 3B 9.7 - $0.085
372 Llama 2 Chat 7B 9.7 - $0.1
373 Jamba Reasoning 3B 9.6 2.5 $0
374 Qwen3 VL 4B Instruct 9.6 4.5 $0
375 Reka Flash 3 9.5 8.9 $0.35
376 Qwen1.5 Chat 110B 9.5 - $0
377 Olmo 3 7B Think 9.4 7.6 $0
378 Claude 2.1 9.3 14 $0
379 OLMo 2 7B 9.3 1.2 $0
380 Molmo 7B-D 9.2 1.2 $0
381 Ling-mini-2.0 9.2 5 $0
382 Claude 2.0 9.1 12.9 $0
383 DeepSeek R1 Distill Qwen 1.5B 9.1 - $0
384 DeepSeek-V2-Chat 9.1 - $0
385 GPT-3.5 Turbo 9 10.7 $0.75
386 Mistral Small (Feb '24) 9 - $1.5
387 Mistral Medium 9 - $4.088
388 Llama 3 Instruct 70B 8.9 6.8 $0.871
389 Gemma 3 12B Instruct 8.8 6.3 $0
390 LFM 40B 8.8 - $0
391 Arctic Instruct 8.8 - $0
392 Qwen Chat 72B 8.8 - $0
393 Llama 3.2 Instruct 11B (Vision) 8.7 4.3 $0.16
394 PALM-2 8.6 4.6 $0
395 Gemini 1.0 Pro 8.5 - $0
396 DeepSeek Coder V2 Lite Instruct 8.5 - $0
397 Phi-4 Mini Instruct 8.4 3.6 $0
398 Llama 2 Chat 70B 8.4 - $0
399 Llama 2 Chat 13B 8.4 - $0
400 DeepSeek LLM 67B Chat (V1) 8.4 - $0
401 Sarvam M (Reasoning) 8.4 7.5 $0
402 Exaone 4.0 1.2B (Reasoning) 8.3 3.1 $0
403 OpenChat 3.5 (1210) 8.3 - $0
404 DBRX Instruct 8.3 - $0
405 Command-R+ (Apr '24) 8.3 - $6
406 Olmo 3 7B Instruct 8.2 3.4 $0.125
407 LFM2.5-1.2B-Thinking 8.1 1.4 $0
408 Exaone 4.0 1.2B (Non-reasoning) 8.1 2.5 $0
409 Jamba 1.7 Mini 8.1 3.1 $0
410 LFM2.5-1.2B-Instruct 8 0.8 $0
411 LFM2 2.6B 8 1.4 $0
412 Granite 4.0 H 1B 8 2.7 $0
413 Jamba 1.5 Mini 8 - $0.25
414 Qwen3 1.7B (Reasoning) 8 1.4 $0.398
415 Jamba 1.6 Mini 7.9 - $0.25
416 Gemma 3 270M 7.7 0 $0
417 Granite 4.0 Micro 7.7 5 $0
418 Apertus 70B Instruct 7.7 1.9 $1.345
419 Mixtral 8x7B Instruct 7.7 - $0.54
420 DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) 7.6 - $0
421 Llama 65B 7.4 - $0
422 Qwen Chat 14B 7.4 - $0
423 Claude Instant 7.4 7.8 $0
424 Mistral 7B Instruct 7.4 - $0.25
425 Command-R (Mar '24) 7.4 - $0.75
426 Molmo2-8B 7.3 4.4 $0
427 Granite 4.0 1B 7.3 2.9 $0
428 LFM2 8B A1B 7 2.3 $0
429 Granite 3.3 8B (Non-reasoning) 7 3.4 $0.085
430 Qwen3 1.7B (Non-reasoning) 6.8 2.3 $0.188
431 Qwen3 0.6B (Reasoning) 6.5 0.9 $0.398
432 Gemma 3n E4B Instruct 6.4 4.2 $0.025
433 Llama 3 Instruct 8B 6.4 4 $0.07
434 Gemma 3 4B Instruct 6.3 2.9 $0
435 Llama 3.2 Instruct 1B 6.3 0.6 $0.1
436 LFM2 1.2B 6.3 0.8 $0
437 LFM2.5-VL-1.6B 6.2 1 $0
438 Granite 4.0 350M 6.1 0.3 $0
439 Apertus 8B Instruct 5.9 1.4 $0.125
440 Qwen3 0.6B (Non-reasoning) 5.7 1.4 $0.188
441 Gemma 3 1B Instruct 5.5 0.2 $0
442 Granite 4.0 H 350M 5.4 0.6 $0
443 Gemma 3n E2B Instruct 4.8 2.2 $0
444 Tiny Aya Global 4.7 1.2 $0
445 GPT-5.4 Pro (xhigh) - - $67.5
446 Gemini 3 Deep Think - - $0
447 Cogito v2.1 (Reasoning) - 24.8 $1.25
448 Mi:dm K 2.5 Pro Preview - 11.9 $0
449 GPT-4o Realtime (Dec '24) - - $0
450 GPT-4o mini Realtime (Dec '24) - - $0
451 GPT-3.5 Turbo (0613) - - $0
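
The rows above are space-separated, and model names themselves contain spaces and numbers, so splitting from the left is unreliable. A minimal sketch of one way to parse a row, assuming every row ends with exactly three fields (overall index, coding score, price) and that "-" marks a missing score. The names `Row` and `parse_row` are hypothetical and not part of the page.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    rank: int
    model: str
    index: Optional[float]   # overall index, None when the table shows "-"
    coding: Optional[float]  # coding score, None when the table shows "-"
    price: float             # USD per 1M tokens as listed


def _score(token: str) -> Optional[float]:
    # Assumption: "-" means the score is not available.
    return None if token == "-" else float(token)


def parse_row(line: str) -> Row:
    # Split from the right so spaces inside the model name are preserved.
    head, index, coding, price = line.rsplit(maxsplit=3)
    # The first token of the remainder is the rank; the rest is the name.
    rank, model = head.split(maxsplit=1)
    return Row(int(rank), model, _score(index), _score(coding),
               float(price.lstrip("$")))


row = parse_row("4 Claude Opus 4.6 (Adaptive Reasoning, Max Effort) 53 48.1 $10")
print(row.model, row.index, row.price)
```

Splitting from the right keeps names such as "Mercury 2" or "Step 3.5 Flash" intact, because only the last three tokens are treated as numeric columns.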

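How to Read the Leaderboard

When consulting the AI model leaderboard, weigh the overall index together with price. If you are a developer, the coding score is the more relevant metric.

The leaderboard data mirrored by 值品工具箱 is refreshed every 24 hours, so the comparison reflects recent model performance.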

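Metric Definitions

  • Overall index: assesses general understanding and reasoning.
  • Price ($/1M tokens): average cost blended at a 3:1 input-to-output token ratio.
  • Coding: measures the accuracy of code generation.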
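
The 3:1 blend in the price definition can be read as a weighted average of the input and output prices. A minimal sketch under that reading; the function name `blended_price` and the example prices are hypothetical and do not describe any specific model.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3.0, output_parts: float = 1.0) -> float:
    """Weighted average of per-1M-token prices at an input:output ratio.

    Assumption: the 3:1 ratio means three input tokens for every output
    token, so the input price carries three quarters of the weight.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Hypothetical prices: $3.00 per 1M input tokens, $15.00 per 1M output tokens.
print(blended_price(3.0, 15.0))  # (3*3 + 1*15) / 4 = 6.0
```

Workloads that generate long outputs would weight the output price more heavily than this blend does, so the listed figure is best treated as a rough comparison point.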

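AI Model Leaderboard: Frequently Asked Questions (FAQ)

Q1: How often is the leaderboard data updated?

The data is fetched automatically every 24 hours, so newly evaluated models are added to the list.

Q2: Does the leaderboard include Chinese models?

Yes. Any Chinese model that has been evaluated by Artificial Analysis appears on the leaderboard.

Q3: What does the overall index represent?

It represents a model's all-round performance. The leaderboard derives this composite score with a weighted algorithm.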
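
The page does not state which benchmarks or weights go into the composite score. Purely as an illustration of what a weighted score looks like, here is a sketch with made-up benchmark names and weights; `weighted_index`, "reasoning", "coding", and the 3:1 weighting are all hypothetical and are not the leaderboard's actual formula.

```python
from typing import Mapping


def weighted_index(scores: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Weighted mean of benchmark scores (illustrative only)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical scores and weights, chosen only to show the arithmetic.
scores = {"reasoning": 60.0, "coding": 40.0}
weights = {"reasoning": 3.0, "coding": 1.0}
print(weighted_index(scores, weights))  # (60*3 + 40*1) / 4 = 55.0
```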

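Q4: How do I find the best-value models on the leaderboard?

On the leaderboard page, click the "Price" column header to sort, then look for models that combine a low price with a high score.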
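
Sorting by price alone still leaves the comparison against score to the reader. One possible complement is to rank by index points per dollar. A minimal sketch using a few rows copied from the table; `rank_by_value` and `min_index` are hypothetical names, and skipping rows priced at $0 rests on the assumption that $0 may mean "no price data" rather than "free".

```python
from typing import List, Tuple

ModelRow = Tuple[str, float, float]  # (model, overall index, price in $/1M)

# A handful of rows copied from the leaderboard table.
rows: List[ModelRow] = [
    ("GPT-5.4 (xhigh)", 57.2, 5.625),
    ("GLM-5 (Reasoning)", 49.8, 1.55),
    ("MiniMax-M2.7", 49.6, 0.525),
    ("DeepSeek V3.2 (Reasoning)", 41.7, 0.315),
    ("GLM-5-Turbo", 46.8, 0.0),
]


def rank_by_value(rows: List[ModelRow], min_index: float = 40.0) -> List[ModelRow]:
    # Keep models above a quality floor, and drop $0 rows (assumed to be
    # missing price data) to avoid dividing by zero.
    priced = [r for r in rows if r[2] > 0 and r[1] >= min_index]
    # Highest index points per dollar first.
    return sorted(priced, key=lambda r: r[1] / r[2], reverse=True)


for name, index, price in rank_by_value(rows):
    print(f"{name}: {index / price:.1f} index points per dollar")
```

The `min_index` floor matters: without it, very cheap but weak models would dominate a pure points-per-dollar ranking.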

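Q5: Is the leaderboard's coding score reliable?

The coding score draws on established benchmarks such as LiveCodeBench, so it is a reasonable reference point.

Q6: Why are some new models missing from the leaderboard?

A model must go through a series of tests before it is listed. The update is usually completed within a few days of a new model's release.

Q7: How are the prices on the leaderboard calculated?

Prices are the cost per million tokens, computed uniformly across models as a blend of input and output prices at a 3:1 input-to-output ratio.

Q8: Can I view the leaderboard on a phone?

Yes. The page uses a responsive layout optimized for mobile devices.

Q9: Is this leaderboard tool free?

Yes. 值品工具箱 provides the leaderboard lookup free of charge.

Q10: How should I use the leaderboard to choose a model?

For an intelligent customer-service assistant, go by the overall index. For translation, look at language-related metrics rather than the coding score.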
