Prompt: Measure a sticker

11 out of 52 models answer this prompt correctly.

Prompt: How wide is the sticker in inches? Return only a real number with 1 decimal value, like “12.3” for 12.3 inches

Answer: 3.5
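
The page does not say how responses are graded. Because the answers listed below arrive in several shapes (bare numbers, JSON objects such as {"answer": "3.5"}, and full sentences), the following is a minimal sketch of one way to normalize a response and compare it with the reference answer. The helper names `extract_number` and `is_correct` are hypothetical, and taking the last number in a verbose response is an assumption, not the benchmark's documented rule.

```python
import json
import re

EXPECTED = 3.5  # reference answer stated on this page


def extract_number(raw: str):
    """Return the last number found in a model response, or None."""
    text = raw.strip()
    # Responses such as {"answer": "3.5"} are JSON objects; unwrap them first.
    try:
        parsed = json.loads(text)
        if isinstance(parsed, dict) and "answer" in parsed:
            text = str(parsed["answer"])
    except json.JSONDecodeError:
        pass  # not JSON; fall through and scan the raw text
    # Assumption: verbose responses end with the final answer,
    # so the last number in the text is taken as the answer.
    matches = re.findall(r"\d+(?:\.\d+)?", text)
    return float(matches[-1]) if matches else None


def is_correct(raw: str, expected: float = EXPECTED) -> bool:
    """Compare the extracted value with the reference at one decimal place."""
    value = extract_number(raw)
    return value is not None and round(value, 1) == expected
```

Under this sketch, a response that mentions 3.5 in its reasoning but ends on a different number is scored on the final number only. A grader that matches substrings instead would behave differently.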

Submitted by: Roboflow

How models perform on this prompt

ChatGPT-4o (Medium Reasoning)

Model answer: {"answer": "3.5"}

Claude 4 Opus

Model answer: Looking at the image, I can see the sticker is placed next to a ruler. The left edge of the sticker appears to align with approximately the 0.5 inch mark, and the right edge extends to about the 3.5 inch mark on the ruler. Therefore, the width of the sticker is: 3.5 - 0.5 = 3.0 inches 3.0

Claude 4 Sonnet

Model answer: 3.5

Claude 4.6 Opus

Model answer: {"answer": "3.5"}

Gemini 2.5 Flash Lite

Model answer: {"answer": "3.5"}

Gemini 3.1 Pro

Model answer: { "answer": "3.5" }

Gemini 3.1 Pro (Tools)

Model answer: {"answer": "3.5"}

OpenAI O1

Model answer: {"answer": "3.5"}

Qwen 3.5 122B (A10B)

Model answer: {"answer": "3.5"}

Qwen 3.5 27B

Model answer: {"answer": "3.5"}

Qwen 3.5 Plus

Model answer: {"answer": "3.5"}

Arcee.ai Spotlight

Model answer: 3.0

Claude 3.5 Haiku

Model answer: 2

Claude 3.7 Sonnet

Model answer: Based on the ruler shown in the image, the sticker appears to be approximately 2 inches wide.

Claude 4.1 Opus

Model answer: Looking at the image, the sticker appears to be approximately 3 inches wide based on the ruler measurement shown below it. 3

Cohere Aya Vision 32B

Model answer: 2.75

Cohere Aya Vision 8B

Model answer: 3

Cosmos Reason2 2B

Model answer: {"answer": "1.1"}

Cosmos Reason2 8B

Model answer: {"answer": "3.0"}

GLM 4.6v

Model answer: { "answer": "3.0" }

GPT-4.1

Model answer: {"answer": "3.3"}

GPT-4.1 Mini

Model answer: {"answer": "3.0"}

GPT-4.1 Nano

Model answer: {"answer": "3.0"}

GPT-5 Mini

Model answer: {"answer": "2.8"}

GPT-5 Nano

Model answer: {"answer": "2.5"}

GPT-5.4

Model answer: {"answer":"3.4"}

Gemini 2.0 Flash

Model answer: {"answer": "3.0"}

Gemini 2.0 Flash Lite

Model answer: { "answer": "2.5" }

Gemini 2.5 Flash

Model answer: {"answer": "3.0"}

Gemini 2.5 Pro

Model answer: {"answer": "3.0"}

Gemini 3 Flash

Model answer: {"answer": "3.0"}

Gemini 3 Flash (Tools)

Model answer: {"answer": "3.4"}

Gemma 3 27b

Model answer: 2.5

Gemma 3 4B

Model answer: 3.2

Grok 4

Model answer: {"answer": "2.5"}

Grok 4.1 Fast

Model answer: {"answer": "2.0"}

Kimi k2.5

Model answer: {"answer": "2.8"}

Llama 4 Maverick 17B

Model answer: 3.0

Llama 4 Scout 17B

Model answer: 2.75

Mistral Medium 3

Model answer: 2.0.

Mistral Small 3.1 24B

Model answer: 3.05 (inches).

Molmo2 8B

Model answer: {"answer": "4.0"}

OpenAI O4 Mini (Medium Reasoning)

Model answer: {"answer": "3.6"}

Phi 4 Multimodal

Model answer: 2

Qwen 2.5 VL 7B

Model answer: 3.1

Qwen 3.5 0.8B

Model answer: {"answer": "1.2"}

Qwen 3.5 2B

Model answer: 9.8

Qwen 3.5 35B (A3B)

Model answer: 0.0

Qwen 3.5 4B

Model answer: { "answer": "2.7" }

Qwen 3.5 9B

Model answer: {"answer": "3.0"}

Reka Edge

Model answer: { "answer": "1.0" }

SmolVLM2 2.2B

Model answer: {"answer": "12.3"}
