How models perform on this prompt
ChatGPT-4o (High Reasoning)
Model answer: The wipes cost **0.58** GBP.
ChatGPT-4o (Medium Reasoning)
Model answer: The wipes cost **0.58** GBP.
Claude 3.5 Haiku
Model answer: 0.58
Claude 3.7 Sonnet
Model answer: Based on the price tag visible in the image, the wipes are priced at 58p (58 pence) per pack. In the format you requested, that would be 0.58. The image shows a store shelf price label with "MIX & MATCH" promotion and a price of "58p" clearly displayed.
Claude 4 Opus
Model answer: 0.58
Claude 4 Sonnet
Model answer: 0.58
Claude 4.1 Opus
Model answer: The wipes are priced at 58p, which in the format you requested is: **0.58**
Cohere Aya Vision 32B
Model answer: The wipes are priced at £0.58 per pack.
GPT-4.1
Model answer: The wipes cost 0.58 GBP.
GPT-4.1 Mini
Model answer: The wipes cost 0.58 GBP.
GPT-4.1 Nano
Model answer: 0.58
GPT-5
Model answer: 0.58
GPT-5 (high reasoning)
Model answer: 0.58
GPT-5 Mini
Model answer: 0.58
GPT-5 with High Reasoning
Model answer: 0.58
Gemini 1.5 Flash
Model answer: 0.58
Gemini 1.5 Pro
Model answer: 0.58
Gemini 2.0 Flash
Model answer: 0.58
Gemini 2.0 Flash Lite
Model answer: 0.58
Gemini 2.5 Flash
Model answer: 0.58
Gemini 2.5 Flash Lite
Model answer: 0.58
Gemini 2.5 Pro
Model answer: 0.58
Gemma 3 27b
Model answer: 0.58
Llama 4 Scout 17B
Model answer: 0.58
Mistral Medium 3
Model answer: 0.58
OpenAI O1
Model answer: 0.58
OpenAI O3 (High Reasoning)
Model answer: 0.58
OpenAI O3 (Medium Reasoning)
Model answer: 0.58
OpenAI O4 Mini (High Reasoning)
Model answer: 0.58
OpenAI O4 Mini (Medium Reasoning)
Model answer: 0.58
OpenAI o3-pro
Model answer: 0.58
Arcee.ai Spotlight
Model answer: 1.00
Cohere Aya Vision 8B
Model answer: The wipes are priced at 58 GBP per pack.
GPT-5 Nano
Model answer: 1.00
Gemma 3 4B
Model answer: 1.00
Llama 4 Maverick 17B
Model answer: read_image_and_extract_price(image=b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR...') (raw PNG bytes, truncated)
Mistral Small 3.1 24B
Model answer: The price tag shows 58 pence per pack (the "p" stands for pence in British currency), but the total price is £3.75. The details are unclear and I don't know how many packs are in the total price or if it is only one pack. I think that all the prices can be in pence and the 3.75 correct for the packs. Unfortunately, I can't determine it for sure.
Phi 4 Multimodal
Model answer: 58p
Qwen 2.5 VL 7B
Model answer: 1.00
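
The answers above come in several formats (0.58, **0.58** GBP, £0.58, 58p, 58 GBP), so comparing them needs a small normalization step. The following is a minimal sketch, not the grader used for this page: it assumes the expected answer is 0.58 (inferred from the majority of answers, since the page does not state it), and both the function name `contains_expected` and the rule "the answer must contain the expected decimal" are hypothetical.

```python
import re

# Assumption: inferred from the majority of answers; the page does not state it.
EXPECTED = "0.58"


def contains_expected(answer: str, expected: str = EXPECTED) -> bool:
    """Return True if the answer contains the expected decimal number.

    Hypothetical grading rule: only decimal tokens such as 0.58 or 3.75
    are considered, so bare integers like "58" (as in "58p") do not count.
    """
    decimals = re.findall(r"\d+\.\d+", answer)
    return expected in decimals


if __name__ == "__main__":
    samples = [
        "The wipes cost **0.58** GBP.",
        "The wipes are priced at £0.58 per pack.",
        "58p",
        "The wipes are priced at 58 GBP per pack.",
        "1.00",
    ]
    for text in samples:
        print(f"{contains_expected(text)!s:5}  {text}")
```

Under this rule, answers in pence ("58p") or with the wrong unit ("58 GBP") are rejected even though they read the right digits off the label. A more lenient grader could convert pence to pounds first, but that is a design choice this page does not document.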