Prompt: Coffee plants ready to farm

50 out of 52 models answer this prompt correctly.
Prompt: Is the coffee plant ready to farm? It is only ready to farm if the majority of berries are red. Return only yes or no.
Answer: No
Submitted by: Roboflow
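
The answers listed below arrive in several shapes: JSON with varying whitespace, such as {"answer": "no"} and { "answer": "no" }, and free-form sentences that begin with "No.". The page does not say how replies are scored against the expected answer, so the following is only a minimal sketch of one way to do it. The function names normalize_answer and is_correct are hypothetical and are not part of any Roboflow tool.

```python
import json
import re
from typing import Optional


def normalize_answer(raw: str) -> Optional[str]:
    """Reduce a raw model reply to 'yes', 'no', or None if neither is found."""
    text = raw.strip()

    # Structured replies such as {"answer": "no"} or { "answer": "no" }.
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict) and isinstance(parsed.get("answer"), str):
        text = parsed["answer"]

    # Free-form replies such as "No. The majority of the berries are green."
    # Only a leading yes/no counts; this is an assumed scoring rule.
    match = re.match(r"\s*(yes|no)\b", text, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None


def is_correct(raw: str, expected: str = "no") -> bool:
    """Compare a raw reply with the expected answer, ignoring case and format."""
    return normalize_answer(raw) == expected.lower()
```

Under this rule a reply counts only when it opens with "yes" or "no", so a reply that hedges before committing would be treated as unanswered rather than guessed at.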

How models perform on this prompt

Arcee.ai Spotlight
ChatGPT-4o (Medium Reasoning)

Model answer: {"answer": "no"}

Claude 3.5 Haiku

Model answer: No. The majority of the coffee berries in this image are still green, which indicates they are not yet ripe. Only a few berries have turned red, signaling that the coffee plant is not ready for harvesting.

Claude 3.7 Sonnet

Model answer: No. The image shows coffee cherries that are predominantly green, with only a few scattered red berries visible. Coffee cherries are ready to harvest when they turn a bright, deep red color, indicating ripeness. Since the vast majority of the cherries in this image are still green, the coffee plant is not yet ready to harvest.

Claude 4 Opus
Claude 4 Sonnet
Claude 4.1 Opus
Claude 4.6 Opus

Model answer: {"answer": "no"}

Cohere Aya Vision 32B
Cosmos Reason2 2B

Model answer: {"answer": "no"}

Cosmos Reason2 8B

Model answer: {"answer": "no"}

GLM 4.6v

Model answer: { "answer": "no" }

GPT-4.1

Model answer: {"answer": "no"}

GPT-4.1 Mini

Model answer: {"answer": "no"}

GPT-4.1 Nano

Model answer: {"answer": "no"}

GPT-5 Mini

Model answer: {"answer":"no"}

GPT-5 Nano

Model answer: {"answer": "no"}

GPT-5.4

Model answer: {"answer":"no"}

Gemini 2.0 Flash

Model answer: {"answer": "no"}

Gemini 2.0 Flash Lite

Model answer: { "answer": "no" }

Gemini 2.5 Flash

Model answer: {"answer": "no"}

Gemini 2.5 Flash Lite

Model answer: {"answer": "no"}

Gemini 2.5 Pro

Model answer: {"answer": "no"}

Gemini 3 Flash

Model answer: { "answer": "no" }

Gemini 3 Flash (Tools)

Model answer: {"answer": "no"}

Gemini 3.1 Pro

Model answer: { "answer": "no" }

Gemini 3.1 Pro (Tools)

Model answer: { "answer": "no" }

Gemma 3 27b
Gemma 3 4B
Grok 4

Model answer: {"answer": "no"}

Grok 4.1 Fast

Model answer: {"answer": "no"}

Kimi k2.5

Model answer: {"answer": "no"}

Llama 4 Maverick 17B
Llama 4 Scout 17B
Mistral Medium 3
Mistral Small 3.1 24B
Molmo2 8B

Model answer: {"answer": "no"}

OpenAI O1

Model answer: {"answer": "no"}

OpenAI O4 Mini (Medium Reasoning)

Model answer: {"answer": "no"}

Phi 4 Multimodal
Qwen 2.5 VL 7B
Qwen 3.5 122B (A10B)

Model answer: {"answer": "no"}

Qwen 3.5 27B

Model answer: {"answer": "no"}

Qwen 3.5 2B

Model answer: {"answer": "no"}

Qwen 3.5 35B (A3B)

Model answer: { "answer": "no" }

Qwen 3.5 4B

Model answer: {"answer": "no"}

Qwen 3.5 9B

Model answer: {"answer": "no"}

Qwen 3.5 Plus

Model answer: {"answer": "no"}

Reka Edge

Model answer: { "answer": "no" }

SmolVLM2 2.2B
Cohere Aya Vision 8B
Qwen 3.5 0.8B