How models perform on this prompt
Arcee.ai Spotlight
Model answer: $11.69
ChatGPT-4o (High Reasoning)
Model answer: $11.69
ChatGPT-4o (Medium Reasoning)
Model answer: $11.69
Claude 3.5 Haiku
Model answer: $11.69
Claude 3.7 Sonnet
Model answer: $11.69
Claude 4 Opus
Model answer: $11.69
Claude 4 Sonnet
Model answer: $11.69
Claude 4.1 Opus
Model answer: $11.69
Cohere Aya Vision 32B
Model answer: $11.69
GPT-4.1
Model answer: $11.69
GPT-4.1 Mini
Model answer: $11.69
GPT-5
Model answer: $11.69
GPT-5 (High Reasoning)
Model answer: $11.69
GPT-5 Mini
Model answer: $11.69
GPT-5 Nano
Model answer: $11.69
GPT-5 with High Reasoning
Model answer: $11.69
Gemini 1.5 Flash
Model answer: $11.69
Gemini 1.5 Pro
Model answer: $11.69
Gemini 2.0 Flash
Model answer: $11.69
Gemini 2.0 Flash Lite
Model answer: $11.69
Gemini 2.5 Flash
Model answer: $11.69
Gemini 2.5 Flash Lite
Model answer: $11.69
Gemini 2.5 Pro
Model answer: $11.69
Gemma 3 27B
Model answer: $11.69
Gemma 3 4B
Model answer: 11.69
Llama 4 Maverick 17B
Model answer: ## Step 1 The problem requires extracting the subtotal from a receipt. ## Step 2 The receipt is provided as an image with various details about a purchase made at Collector's Choice Restaurant. ## Step 3 To find the subtotal, we need to look at the section of the receipt labeled "Subtotal". ## Step 4 The receipt shows that under the "Subtotal" section, the amount is listed as "11.69". ## Step 5 The task is to return this amount in the format "$1.00". The final answer is: $\boxed{$11.69}$
Llama 4 Scout 17B
Model answer: $11.69
Mistral Medium 3
Model answer: $11.69.
Mistral Small 3.1 24B
Model answer: $11.69
OpenAI O1
Model answer: $11.69
OpenAI O3 (High Reasoning)
Model answer: $11.69
OpenAI O3 (Medium Reasoning)
Model answer: $11.69
OpenAI O4 Mini (High Reasoning)
Model answer: $11.69
OpenAI O4 Mini (Medium Reasoning)
Model answer: $11.69
OpenAI o3-pro
Model answer: $11.69
Phi 4 Multimodal
Model answer: 11.69
Qwen 2.5 VL 7B
Model answer: $11.69
Cohere Aya Vision 8B
Model answer: $13.53
GPT-4.1 Nano
Model answer: $11.68
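
The answers above differ in surface form (a missing dollar sign, a trailing period, a step-by-step write-up ending in a boxed value), so comparing them fairly needs some normalization first. The following is a minimal sketch of one lenient way to do that, not the benchmark's actual grading rule, which the listing does not state. The helper `extract_amount`, the constant `EXPECTED`, and the choice to take the last amount in a free-form answer are all assumptions made for illustration; `EXPECTED` is set to 11.69 only because that is the value nearly every model reports.

```python
import re
from decimal import Decimal

# Assumption: the expected subtotal is 11.69, the value reported by
# almost every model in the list. The source does not state a ground truth.
EXPECTED = Decimal("11.69")


def extract_amount(answer):
    """Return the last amount with two decimal places in the answer, or None.

    Hypothetical helper: taking the last match lets a long answer that
    mentions an example format such as "$1.00" still resolve to its
    final stated value.
    """
    matches = re.findall(r"\d+\.\d{2}", answer)
    return Decimal(matches[-1]) if matches else None


# A few answers copied from the list, chosen to cover each surface form.
samples = {
    "Gemma 3 4B": "11.69",
    "Mistral Medium 3": "$11.69.",
    "Llama 4 Maverick 17B": (
        'The task is to return this amount in the format "$1.00". '
        "The final answer is: $\\boxed{$11.69}$"
    ),
    "Cohere Aya Vision 8B": "$13.53",
    "GPT-4.1 Nano": "$11.68",
}

# True when the normalized amount equals the assumed expected value.
results = {
    model: extract_amount(text) == EXPECTED for model, text in samples.items()
}

for model, ok in results.items():
    print(f"{model}: {'match' if ok else 'mismatch'}")
```

Under this lenient rule, formatting differences alone do not count against a model; a stricter rule that requires the exact string "$11.69" would score several of these answers differently.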