How models perform on this prompt
Arcee.ai Spotlight
Model answer: $11.69
ChatGPT-4o (Medium Reasoning)
Model answer: {"answer": "$11.69"}
Claude 3.5 Haiku
Model answer: $11.69
Claude 3.7 Sonnet
Model answer: $11.69
Claude 4 Opus
Model answer: $11.69
Claude 4 Sonnet
Model answer: $11.69
Claude 4.1 Opus
Model answer: $11.69
Claude 4.6 Opus
Model answer: {"answer": "$11.69"}
Cohere Aya Vision 32B
Model answer: $11.69
Cosmos Reason2 2B
Model answer: {"answer": "11.69"}
Cosmos Reason2 8B
Model answer: {"answer": "$11.69"}
GLM 4.6v
Model answer: {"answer": "$11.69"}
GPT-4.1
Model answer: {"answer": "$11.69"}
GPT-4.1 Mini
Model answer: {"answer": "$11.69"}
GPT-5 Mini
Model answer: {"answer": "$11.69"}
GPT-5 Nano
Model answer: {"answer": "$11.69"}
GPT-5.4
Model answer: {"answer":"$11.69"}
Gemini 2.0 Flash
Model answer: {"answer": "$11.69"}
Gemini 2.0 Flash Lite
Model answer: { "answer": "11.69" }
Gemini 2.5 Flash
Model answer: {"answer": "$11.69"}
Gemini 2.5 Flash Lite
Model answer: {"answer": "$11.69"}
Gemini 2.5 Pro
Model answer: {"answer": "$11.69"}
Gemini 3 Flash
Model answer: {"answer": "$11.69"}
Gemini 3 Flash (Tools)
Model answer: {"answer": "$11.69"}
Gemini 3.1 Pro
Model answer: {"answer": "11.69"}
Gemini 3.1 Pro (Tools)
Model answer: {"answer": "11.69"}
Gemma 3 27b
Model answer: $11.69
Gemma 3 4B
Model answer: 11.69
Grok 4
Model answer: {"answer": "$11.69"}
Grok 4.1 Fast
Model answer: {"answer": "$11.69"}
Kimi k2.5
Model answer: {"answer": "$11.69"}
Llama 4 Maverick 17B
Model answer:
## Step 1
The problem requires extracting the subtotal from a receipt.
## Step 2
The receipt is provided as an image with various details about a purchase made at Collector's Choice Restaurant.
## Step 3
To find the subtotal, we need to look at the section of the receipt labeled "Subtotal".
## Step 4
The receipt shows that under the "Subtotal" section, the amount is listed as "11.69".
## Step 5
The task is to return this amount in the format "$1.00".
The final answer is: $\boxed{$11.69}$
Llama 4 Scout 17B
Model answer: $11.69
Mistral Medium 3
Model answer: $11.69.
Mistral Small 3.1 24B
Model answer: $11.69
Molmo2 8B
Model answer: {"answer": "$11.69"}
OpenAI O1
Model answer: {"answer": "$11.69"}
OpenAI O4 Mini (Medium Reasoning)
Model answer: {"answer":"$11.69"}
Phi 4 Multimodal
Model answer: 11.69
Qwen 2.5 VL 7B
Model answer: $11.69
Qwen 3.5 0.8B
Model answer: {"answer": "11.69"}
Qwen 3.5 122B (A10B)
Model answer: {"answer": "$11.69"}
Qwen 3.5 27B
Model answer: {"answer": "$11.69"}
Qwen 3.5 2B
Model answer: {"answer": "11.69"}
Qwen 3.5 35B (A3B)
Model answer: {"answer": "$11.69"}
Qwen 3.5 4B
Model answer: {"answer": "$11.69"}
Qwen 3.5 9B
Model answer: {"answer": "$11.69"}
Qwen 3.5 Plus
Model answer: {"answer": "11.69"}
Cohere Aya Vision 8B
Model answer: $13.53
GPT-4.1 Nano
Model answer: {"answer": "$11.68"}
Reka Edge
Model answer: { "subtotal": "$11.69", "service_fee": "-$0.70", "tax": "$1.14", "total": "$13.53", "tip": "$1.55", "created_at": "April 10, 2019 12:00:pm" }
SmolVLM2 2.2B
Model answer: { "menu": { "nm": "Twisted Ruben", "price": "11.69" }, "sub_total": { "subtotal_price": "11.69", "service_price": "-0.70", "tax_price": "1.14" }, "total": { "total_price": "13.53" } }