Bilingual snapshot of https://artificialanalysis.ai/ saved on 2024-06-17 at 16:46, with bilingual support provided by Immersive Translate.

Independent analysis of AI language models and API providers

Understand the AI landscape and choose the best model and API provider for your use-case

Highlights

Quality
Quality Index; Higher is better
GPT-4o: 100
GPT-4 Turbo: 94
Claude 3 Opus: 94
Gemini 1.5 Pro: 93
Llama 3 (70B): 88
Gemini 1.5 Flash: 83
Mixtral 8x22B: 78
Mistral Large: 75
Claude 3 Haiku: 72
GPT-3.5 Turbo: 65
Llama 3 (8B): 65
Mixtral 8x7B: 65
Speed
Output Tokens per Second; Higher is better
Gemini 1.5 Flash: 140
Llama 3 (8B): 121
Claude 3 Haiku: 117
Mixtral 8x7B: 92
GPT-4o: 69
Mixtral 8x22B: 65
GPT-3.5 Turbo: 63
Gemini 1.5 Pro: 63
Llama 3 (70B): 53
Mistral Large: 32
GPT-4 Turbo: 27
Claude 3 Opus: 24
Price
USD per 1M Tokens; Lower is better
[Bar chart: price per model, in the order shown on the chart: Llama 3 (8B), Gemini 1.5 Flash, Mixtral 8x7B, Claude 3 Haiku, GPT-3.5 Turbo, Llama 3 (70B), Mixtral 8x22B, Gemini 1.5 Pro, Mistral Large, GPT-4o, GPT-4 Turbo, Claude 3 Opus; the price values are not recoverable from this snapshot.]

Language Models Comparison Highlights

Quality Comparison by Ability

Varied metrics by ability categorization; Higher is better
General Ability (Chatbot Arena)
GPT-4o: 1287
Gemini 1.5 Pro: 1265
GPT-4 Turbo: 1256
Claude 3 Opus: 1249
Gemini 1.5 Flash: 1231
Llama 3 (70B): 1207
Command-R+: 1189
Claude 3 Haiku: 1178
Mistral Large: 1156
Llama 3 (8B): 1153
Mixtral 8x22B: 1146
Mixtral 8x7B: 1114
GPT-3.5 Turbo: 1107
DBRX: 1103
Mistral 7B: 1008
Reasoning & Knowledge (MMLU)
GPT-4o: 89%
Claude 3 Opus: 87%
GPT-4 Turbo: 86%
Gemini 1.5 Pro: 86%
Llama 3 (70B): 82%
Mistral Large: 81%
Gemini 1.5 Flash: 79%
Mixtral 8x22B: 78%
Command-R+: 76%
Claude 3 Haiku: 75%
DBRX: 74%
Mixtral 8x7B: 71%
GPT-3.5 Turbo: 70%
Llama 3 (8B): 68%
Mistral 7B: 63%
Reasoning & Knowledge (MT Bench)
[Bar chart: MT Bench scores for GPT-4 Turbo, GPT-3.5 Turbo, Mixtral 8x7B and Mistral 7B, in the order shown on the chart; the score values are not recoverable from this snapshot.]
Coding (HumanEval)
GPT-4o: 90.2
GPT-4 Turbo: 85.4
Gemini 1.5 Pro: 84.1
Llama 3 (70B): 81.7
Gemini 1.5 Flash: 74.3
GPT-3.5 Turbo: 73.2
DBRX: 70.1
Llama 3 (8B): 62.2
Different use-cases warrant considering different evaluation tests. Chatbot Arena is a good evaluation of communication abilities while MMLU tests reasoning and knowledge more comprehensively.
Median across providers: Figures represent median (P50) across all providers which support the model.
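As a rough illustration of the median-across-providers note above, the sketch below takes one value per provider for a single model and reports the P50. The provider names and figures are hypothetical, and `median_across_providers` is an illustrative helper, not something taken from the site.

```python
from statistics import median

# Hypothetical output-speed measurements (tokens per second) for one model,
# keyed by provider. Names and numbers are illustrative, not from the page.
speeds_by_provider = {
    "provider_a": 48.0,
    "provider_b": 61.5,
    "provider_c": 250.0,
    "provider_d": 55.0,
}


def median_across_providers(values_by_provider):
    """Return the P50 (median) of the per-provider values for one model."""
    if not values_by_provider:
        raise ValueError("at least one provider is required")
    return median(values_by_provider.values())


model_p50 = median_across_providers(speeds_by_provider)
print(model_p50)  # 58.25
```

With an even number of providers the median is the mean of the two middle values, and a single very fast provider (250 here) does not pull the reported figure upward the way a mean would.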

Quality vs. Throughput

Quality: General reasoning index, Output Speed: Output Tokens per Second, Price: USD per 1M Tokens
[Scatter plot: Quality (General ability index, 40 to 110) against Output Speed (Output Tokens per Second, 0 to 160), with the most attractive quadrant highlighted. Size represents Price (USD per 1M Tokens). Models plotted: Mistral Large, Mistral 7B, GPT-3.5 Turbo, Command-R+, Claude 3 Opus, GPT-4 Turbo, Mixtral 8x22B, DBRX, Llama 3 (70B), Gemini 1.5 Pro, Mixtral 8x7B, GPT-4o, Claude 3 Haiku, Llama 3 (8B), Gemini 1.5 Flash.]
There is a trade-off between model quality and output speed, with higher quality models typically having lower output speed.
Quality: Index represents normalized average relative performance across Chatbot Arena, MMLU & MT-Bench.
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API).
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Median across providers: Figures represent median (P50) across all providers which support the model.
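The page does not spell out how the quality index is normalized, so the following is only one possible reading of "normalized average relative performance": each benchmark score is divided by the best score on that benchmark, the ratios are averaged per model, and the result is rescaled so that the top model scores 100. The function name, benchmark keys and scores are all hypothetical, and this sketch is not expected to reproduce the figures shown on the page.

```python
def quality_index(scores_by_model):
    """Average each model's score relative to the best score per benchmark.

    scores_by_model maps model name -> {benchmark name: score}.
    Returns model name -> index, scaled so the top model scores 100.
    This normalization is an assumption, not the site's documented method.
    """
    benchmarks = {b for scores in scores_by_model.values() for b in scores}
    best = {
        b: max(s[b] for s in scores_by_model.values() if b in s)
        for b in benchmarks
    }
    averages = {}
    for model, scores in scores_by_model.items():
        # Only benchmarks the model actually has a score for are averaged.
        relative = [scores[b] / best[b] for b in scores]
        averages[model] = sum(relative) / len(relative)
    top = max(averages.values())
    return {m: round(100 * a / top, 1) for m, a in averages.items()}


# Hypothetical scores on three benchmarks with very different scales.
example = {
    "model_x": {"arena_elo": 1200, "mmlu": 0.80, "mt_bench": 9.0},
    "model_y": {"arena_elo": 1080, "mmlu": 0.60, "mt_bench": 8.1},
}
print(quality_index(example))
```

Dividing by the best score puts benchmarks with different scales on a comparable footing before averaging. Other choices, such as min-max scaling, would give different index values.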

Quality vs. Price

Quality: General reasoning index, Price: USD per 1M Tokens
[Scatter plot: Quality (General ability index, 40 to 110) against Price (USD per 1M Tokens, $0.00 to $30.00), with the most attractive quadrant highlighted. Models plotted: Mistral 7B, Llama 3 (8B), GPT-3.5 Turbo, Mixtral 8x7B, Claude 3 Haiku, DBRX, Command-R+, Mistral Large, Mixtral 8x22B, Gemini 1.5 Flash, Llama 3 (70B), Gemini 1.5 Pro, GPT-4 Turbo, Claude 3 Opus, GPT-4o.]
While higher quality models are typically more expensive, they do not all follow the same price-quality curve.
Quality: Index represents normalized average relative performance across Chatbot Arena, MMLU & MT-Bench.
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Median across providers: Figures represent median (P50) across all providers which support the model.
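The blended price described above can be sketched as a weighted average. This assumes the 3:1 ratio means three input tokens for every output token, which the page does not state explicitly. `blended_price` and the sample prices are illustrative.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average of input and output prices (USD per 1M tokens).

    The default 3:1 weighting (input:output) is an assumed reading of the
    ratio quoted on the page.
    """
    total = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total


# Illustrative prices in USD per 1M tokens (hypothetical values).
print(round(blended_price(0.50, 1.50), 2))  # 0.75
```

Because input tokens carry three quarters of the weight under this reading, a model with cheap input and expensive output looks cheaper on a blended basis than a simple average of the two prices would suggest.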

Output Speed

Output Tokens per Second; Higher is better
Gemini 1.5 Flash: 140
Llama 3 (8B): 121
Claude 3 Haiku: 117
Mixtral 8x7B: 92
Mistral 7B: 80
DBRX: 71
GPT-4o: 69
Mixtral 8x22B: 65
GPT-3.5 Turbo: 63
Gemini 1.5 Pro: 63
Command-R+: 60
Llama 3 (70B): 53
Mistral Large: 32
GPT-4 Turbo: 27
Claude 3 Opus: 24
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API).
Median across providers: Figures represent median (P50) across all providers which support the model.
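One way to turn the output speed definition above into a measurement is to record when each streamed chunk arrives and how many tokens it carries, then divide the tokens received after the first chunk by the time elapsed since that first chunk. The sketch below works on recorded timings rather than a live API call. `output_speed` and the sample data are hypothetical, and excluding the first chunk's tokens from the count is an assumption about how the window is drawn.

```python
def output_speed(chunk_times, chunk_token_counts):
    """Tokens per second after the first chunk has arrived.

    chunk_times: arrival time of each streamed chunk, in seconds.
    chunk_token_counts: number of tokens carried by each chunk.
    """
    if len(chunk_times) != len(chunk_token_counts):
        raise ValueError("one token count is required per chunk")
    if len(chunk_times) < 2:
        raise ValueError("at least two chunks are needed to measure speed")
    elapsed = chunk_times[-1] - chunk_times[0]
    if elapsed <= 0:
        raise ValueError("chunk times must be increasing")
    # Tokens in the first chunk arrived before the timing window opened.
    tokens_after_first = sum(chunk_token_counts[1:])
    return tokens_after_first / elapsed


# Hypothetical stream: first chunk at 0.40 s, last chunk at 0.90 s.
times = [0.40, 0.50, 0.60, 0.90]
tokens = [1, 5, 5, 15]
print(output_speed(times, tokens))  # 25 tokens over roughly 0.5 s
```

Starting the clock at the first chunk keeps the time spent waiting for that chunk out of the figure, so output speed reflects generation throughput only.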

Pricing: Input and Output Prices

USD per 1M Tokens
[Bar chart: input price and output price per model, in the order shown on the chart: Llama 3 (8B), Mistral 7B, Claude 3 Haiku, Gemini 1.5 Flash, GPT-3.5 Turbo, Mixtral 8x7B, Llama 3 (70B), Mixtral 8x22B, DBRX, Command-R+, Gemini 1.5 Pro, Mistral Large, GPT-4o, GPT-4 Turbo, Claude 3 Opus; the price values are not recoverable from this snapshot.]
Prices vary considerably, both between models and between input and output tokens for the same model. Prices can differ by more than an order of magnitude (>10X) between the most expensive and cheapest models.
Input price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Output price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
Median across providers: Figures represent median (P50) across all providers which support the model.

API Provider Highlights: Llama 3 Instruct (70B)

Output Speed vs. Price: Llama 3 Instruct (70B)

Output Speed: Output Tokens per Second, Price: USD per 1M Tokens
[Scatter plot: Output Speed (Output Tokens per Second, 0 to 350) against Price (USD per 1M Tokens, $0.00 to $6.50), with the most attractive quadrant highlighted. Providers plotted: Microsoft Azure, Deepinfra, Amazon Bedrock, Replicate, Perplexity, OctoAI, Together.ai, Fireworks, Groq.]
Smaller, emerging providers are offering high output speed at competitive prices.
Price: Price per token, represented as USD per million Tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API).
Median: Figures represent median (P50) measurement over the past 14 days.
Variance data is present on the model and API provider pages amongst the detailed performance metrics. See 'Compare Models' and 'Compare API Providers' in the navigation menu for further analysis.

Pricing (Input and Output Prices): Llama 3 Instruct (70B)

Price: USD per 1M Tokens; Lower is better
Provider: Input price / Output price
Groq: 0.59 / 0.79
Deepinfra: 0.59 / 0.79
OctoAI: 0.6 / 1.9
Replicate: 0.65 / 2.75
Together.ai: 0.9 / 0.9
Fireworks: 0.9 / 0.9
Perplexity: 1 / 1
Amazon: 2.65 / 3.5
Azure: 3.78 / 11.34
Providers typically charge different prices for input and output tokens. The ratio of input / output token price for a certain use-case may significantly impact overall costs.
Input price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
Output price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
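To make the point about input/output ratios concrete, the sketch below prices two hypothetical workloads against two hypothetical providers that share an input price but differ on output price. All names and numbers are illustrative, and `request_cost_usd` is an assumed helper.

```python
def request_cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD, given prices quoted in USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical providers: (input price, output price) in USD per 1M tokens.
providers = {"provider_a": (1.00, 1.00), "provider_b": (1.00, 4.00)}
# Hypothetical workloads: (input tokens, output tokens).
workloads = {
    "summarization": (9_000_000, 1_000_000),
    "generation": (1_000_000, 9_000_000),
}

for workload, (tokens_in, tokens_out) in workloads.items():
    for provider, (price_in, price_out) in providers.items():
        cost = request_cost_usd(tokens_in, tokens_out, price_in, price_out)
        print(f"{workload:14s} {provider}: ${cost:.2f}")
```

For the input-heavy workload the two providers end up at $10.00 and $13.00. For the output-heavy one the gap widens to $10.00 versus $37.00, even though the total token count is the same in both cases.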

Output Speed, Over Time: Llama 3 Instruct (70B)

Output Tokens per Second; Higher is better
[Line chart: output speed per provider over time (legend includes Microsoft Azure and Amazon Bedrock), with weekly labels from Apr 21 to Jun 16 and a vertical axis from 0 to 350 Output Tokens per Second.]
Smaller, emerging providers offer high output speed, though precise speeds delivered vary day-to-day.
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API).
Over time measurement: Median measurement per day, based on 8 measurements each day at different times. Labels represent start of week's measurements.
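The per-day aggregation described above might look like the following sketch, which groups timestamped speed measurements by calendar day and takes each day's median. The three-hour spacing, the sample values and the `daily_medians` helper are assumptions for illustration. The page only says that 8 measurements are taken each day at different times.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def daily_medians(measurements):
    """Group (timestamp, tokens_per_second) pairs by day and take each day's median."""
    by_day = defaultdict(list)
    for timestamp, speed in measurements:
        by_day[timestamp.date()].append(speed)
    return {day: median(speeds) for day, speeds in sorted(by_day.items())}


# Hypothetical measurements: 8 per day, taken every 3 hours.
samples = [
    (datetime(2024, 6, 15, hour), speed)
    for hour, speed in zip(range(0, 24, 3), [50, 52, 55, 49, 51, 300, 53, 54])
] + [
    (datetime(2024, 6, 16, hour), speed)
    for hour, speed in zip(range(0, 24, 3), [60, 58, 61, 59, 62, 57, 63, 60])
]
print(daily_medians(samples))
```

On the first day a single outlier of 300 tokens per second leaves the daily figure at 52.5, which is one reason a median suits noisy day-to-day measurements.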
See more information on any of our supported models
Model Name
OpenAI: GPT-4o, GPT-4 Turbo, GPT-4, GPT-3.5 Turbo, GPT-3.5 Turbo Instruct
Google: Gemini 1.5 Flash, Gemini 1.5 Pro, Gemini 1.0 Pro, Gemma 7B Instruct
Meta: Llama 3 Instruct (70B), Llama 3 Instruct (8B), Code Llama Instruct (70B), Llama 2 Chat (70B), Llama 2 Chat (13B), Llama 2 Chat (7B)
Mistral: Mixtral 8x22B Instruct, Mistral Large, Mistral Medium, Mistral Small, Mixtral 8x7B Instruct, Mistral 7B Instruct
Anthropic: Claude 3 Opus, Claude 3 Sonnet, Claude 3 Haiku, Claude 2.0, Claude 2.1, Claude Instant
Alibaba: Qwen2 Instruct (72B)
Cohere: Command Light, Command, Command-R+, Command-R
OpenChat: OpenChat 3.5 (1210)
Databricks: DBRX Instruct
DeepSeek: DeepSeek-V2-Chat
Snowflake: Arctic Instruct