How models perform on this prompt
Arcee.ai Spotlight
Model answer: $2.30
ChatGPT-4o (Medium Reasoning)
Model answer: {"answer": "$2.30"}
Claude 3.5 Haiku
Model answer: $2.30
Claude 3.7 Sonnet
Model answer: $2.30
Claude 4 Opus
Model answer: $2.30
Claude 4 Sonnet
Model answer: $2.30
Claude 4.1 Opus
Model answer: $2.30
Claude 4.6 Opus
Model answer: {"answer": "$2.30"}
Cosmos Reason2 2B
Model answer: {"answer": "2.30"}
Cosmos Reason2 8B
Model answer: {"answer": "2.30"}
GLM 4.6v
Model answer: {"answer": "$2.30"}
GPT-4.1
Model answer: {"answer": "$2.30"}
GPT-4.1 Mini
Model answer: {"answer": "$2.30"}
GPT-4.1 Nano
Model answer: {"answer": "2.30"}
GPT-5 Mini
Model answer: {"answer": "$2.30"}
GPT-5 Nano
Model answer: {"answer": "$2.30"}
GPT-5.4
Model answer: {"answer":"$2.30"}
Gemini 2.0 Flash
Model answer: {"answer": "$2.30"}
Gemini 2.0 Flash Lite
Model answer: { "answer": "$2.30" }
Gemini 2.5 Flash
Model answer: {"answer": "$2.30"}
Gemini 2.5 Flash Lite
Model answer: {"answer": "$2.30"}
Gemini 2.5 Pro
Model answer: {"answer": "$2.30"}
Gemini 3 Flash
Model answer: {"answer": "$2.30"}
Gemini 3 Flash (Tools)
Model answer: {"answer": "$2.30"}
Gemini 3.1 Pro
Model answer: {"answer": "$2.30"}
Gemini 3.1 Pro (Tools)
Model answer: {"answer": "$2.30"}
Gemma 3 27b
Model answer: $2.30
Kimi k2.5
Model answer: {"answer": "$2.30"}
Llama 4 Maverick 17B
Model answer: $2.30
Mistral Medium 3
Model answer: The tax paid was $2.30. This is clearly indicated on the receipt under the "Tax" line item.
Mistral Small 3.1 24B
Model answer: $2.30
OpenAI O4 Mini (Medium Reasoning)
Model answer: {"answer": "$2.30"}
Phi 4 Multimodal
Model answer: 2.30
Qwen 3.5 122B (A10B)
Model answer: {"answer": "$2.30"}
Qwen 3.5 27B
Model answer: {"answer": "$2.30"}
Qwen 3.5 2B
Model answer: { "answer": "2.30" }
Qwen 3.5 35B (A3B)
Model answer: {"answer": "$2.30"}
Qwen 3.5 4B
Model answer: {"answer": "$2.30"}
Qwen 3.5 9B
Model answer: {"answer": "$2.30"}
Qwen 3.5 Plus
Model answer: {"answer": "$2.30"}
Cohere Aya Vision 32B
Model answer: $2.00
Cohere Aya Vision 8B
Model answer: The total amount paid on the receipt is $31.96. This includes the sub-surcharge of $0.68, which is 6% of the bill to offset the minimum wage in California. The total before taxes and fees is $28.00.
Gemma 3 4B
Model answer: $3.98
Grok 4
Model answer: {"answer": "$0.40"}
Grok 4.1 Fast
Model answer: {"answer": "$1.98"}
Llama 4 Scout 17B
Model answer: $3.36
Molmo2 8B
Model answer: {"answer": "$2.00"}
OpenAI O1
Model answer: {"answer": "$3.20"}
Qwen 2.5 VL 7B
Model answer: $1.68
Qwen 3.5 0.8B
Model answer: Based on the receipt provided in the image: - Regular Check (including surcharge and service charge, etc.): $28.00 + $1.68 + $2.30 = $32.00 - Total without tax: $32.00 - Tax amount is listed as **2.30** on the right side of that line (under Tax). Thus, the tax paid was **$2.30**. {"answer": "2.30"}
Reka Edge
Model answer: { "answer": "$1.00" }
SmolVLM2 2.2B
Model answer: 0.00
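The answers above arrive in several shapes: a bare amount ("$2.30", "2.30"), a JSON object with an "answer" key, or a full sentence, sometimes followed by a JSON object. To compare them side by side, a reader might normalize each one to a single decimal amount. The following is a minimal sketch of one way to do that; it is not the scoring method used by this page, which is not described here. The function name `extract_amount` and the rule "prefer a trailing JSON object, otherwise take the first amount in the text" are assumptions made for illustration.

```python
import json
import re
from decimal import Decimal
from typing import Optional

# Matches an amount such as "$2.30", "2.30", or "0.00" (dollar sign optional).
AMOUNT_RE = re.compile(r"\$?(\d+(?:\.\d{1,2})?)")

# Matches a flat {...} object at the very end of the text.
TRAILING_JSON_RE = re.compile(r"\{[^{}]*\}\s*$")


def extract_amount(raw: str) -> Optional[Decimal]:
    """Return the amount found in a model answer, or None if there is none.

    Assumption (hypothetical rule, not the page's grader): a trailing JSON
    object with an "answer" key takes priority; otherwise the first amount
    that appears anywhere in the text is used.
    """
    text = raw.strip()

    trailing = TRAILING_JSON_RE.search(text)
    if trailing:
        try:
            obj = json.loads(trailing.group(0))
            if isinstance(obj, dict) and "answer" in obj:
                text = str(obj["answer"])
        except json.JSONDecodeError:
            # Not valid JSON after all; fall back to scanning the raw text.
            pass

    found = AMOUNT_RE.search(text)
    return Decimal(found.group(1)) if found else None


if __name__ == "__main__":
    samples = [
        "$2.30",
        '{"answer": "2.30"}',
        "The tax paid was $2.30. This is clearly indicated on the receipt.",
    ]
    for sample in samples:
        print(extract_amount(sample))
```

Using `Decimal` rather than `float` keeps currency comparisons exact. Note that the "first amount in the text" fallback can pick up an unrelated figure from a long free-text answer, so verbose responses may still need manual review.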