With superior multilingual capabilities and high inference efficiency, the model has shown versatility in a wide range of applications.
Here's what happened when these free-tier models faced off, including the overall winner.
Lateral thinking puzzle
Prompt: You are in a completely dark room with three light switches on a wall.
How do you determine which switch controls which bulb?
o3-mini comes in second for its thorough explanation, which was less structured than Qwen 2.5's.
Deductive reasoning
Prompt: “A detective is investigating a murder case.
He interviews three suspects: Alice, Bob, and Charlie.
One of them is guilty, and the other two are telling the truth.
Here's what they say:
Alice: “Bob is innocent.”
Bob: “Charlie is guilty.”
Charlie: “I am innocent.”
Who is the murderer?”
o3-mini delivered a step-by-step elimination approach: the model systematically assumes each person is guilty and checks for contradictions.
The explanation is clear, logical, and doesn't overcomplicate things.
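The elimination approach described above can be sketched in a few lines of Python. This is a minimal illustration, not any model's actual output: the `STATEMENTS` table and the `consistent_culprits` helper are hypothetical names introduced here, and the sketch assumes (as the prompt states) that exactly one suspect is guilty and both innocent suspects tell the truth.

```python
# Each suspect's statement, expressed as a check against a hypothesized culprit.
# (Hypothetical encoding of the prompt, for illustration only.)
STATEMENTS = {
    "Alice": lambda guilty: guilty != "Bob",        # "Bob is innocent."
    "Bob": lambda guilty: guilty == "Charlie",      # "Charlie is guilty."
    "Charlie": lambda guilty: guilty != "Charlie",  # "I am innocent."
}


def consistent_culprits(statements):
    """Return every suspect whose guilt leaves all innocent suspects truthful."""
    culprits = []
    for guilty in statements:
        # Only the two innocent suspects are required to be telling the truth.
        innocents_truthful = all(
            claim(guilty)
            for speaker, claim in statements.items()
            if speaker != guilty
        )
        if innocents_truthful:
            culprits.append(guilty)
    return culprits


print(consistent_culprits(STATEMENTS))  # ['Charlie']
```

Assuming Alice is guilty makes Bob's statement false, and assuming Bob is guilty makes Alice's statement false, so only Charlie survives the check.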
Qwen 2.5 was a close second.
It includes try-except blocks to handle invalid inputs, making it more robust.
o3-mini was a close second, with a good implementation but slightly less comprehensive error handling.
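For readers unfamiliar with the pattern, here is a small sketch of what try-except input handling can look like in Python. The coding prompt and the models' actual code are not reproduced in this article, so the function name `parse_positive_int` and its behavior are assumptions made purely for illustration.

```python
def parse_positive_int(raw):
    """Convert user-supplied text to a positive integer, or return None.

    Hypothetical helper: it only illustrates the try-except pattern.
    """
    try:
        value = int(raw.strip())
    except (ValueError, AttributeError):
        # ValueError: the text is not a whole number (e.g. "abc" or "4.5").
        # AttributeError: the input is not a string at all (e.g. None).
        return None
    return value if value > 0 else None


print(parse_positive_int(" 12 "))
print(parse_positive_int("abc"))
```

Catching only the specific exceptions the conversion can raise keeps unrelated bugs visible instead of silently swallowing them.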
Mathematical proof
Prompt: “Prove the Pythagorean theorem using a geometric approach.”
o3-mini delivered an explanation that follows a well-structured, step-by-step approach, making it easy to understand.
DeepSeek crafted a correct proof that follows a logical structure.
Yet it lacks the conversational approach of o3-mini or Qwen 2.5.
Winner: o3-mini wins for the best combination of clarity, detail and logical flow.
Qwen 2.5 is in second place with a solid response that had formatting and visualization issues.
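For reference, one common geometric argument (not necessarily the one any of the chatbots produced) places four copies of a right triangle with legs a and b and hypotenuse c inside a square of side a + b, so that the four hypotenuses enclose an inner square of side c. The inner figure is a square because the two acute angles of the triangle sum to 90 degrees, leaving a right angle at each corner. Comparing areas gives the result:

```latex
\begin{align*}
(a + b)^2 &= 4 \cdot \tfrac{1}{2}ab + c^2 \\
a^2 + 2ab + b^2 &= 2ab + c^2 \\
a^2 + b^2 &= c^2
\end{align*}
```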
Scientific explanation
Prompt: “Explain the process of photosynthesis in detail.”
o3-mini provided detailed descriptions of both light-dependent and light-independent reactions with clear breakdowns of each step.
The step-by-step progression from capturing light to converting energy into glucose is easy to follow.
Winner: o3-mini wins for the best balance of depth, clarity, organization, and accuracy.
DeepSeek was a close second for its solid explanation, which lacked some finer details.
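As a point of reference for judging these answers, the overall process is usually summarized by the standard balanced equation below. The light-dependent reactions split water and release oxygen while producing ATP and NADPH, and the light-independent reactions (the Calvin cycle) use that ATP and NADPH to build sugar from carbon dioxide.

```latex
\[
6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} + \text{light energy}
\;\longrightarrow\;
\mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}
\]
```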
Historical analysis
Prompt: “Analyze the causes and effects of the French Revolution.”
However, the economic consequences could have been explored in more detail.
DeepSeek comes in second place for a solid but slightly less detailed response.
o3-mini explored both themes of madness and revenge and how they intertwine rather than treating them as separate topics.
Qwen 2.5 offered a very detailed discussion of feigned vs. real madness.
DeepSeek is second for a strong response that was more summary-like and less interwoven.
Philosophical discussion
Prompt: “Discuss the concept of utilitarianism and its implications in modern ethics.”
Qwen 2.5 is in second place for a good explanation with a slightly weaker structure and conclusion.
Urban planning
Prompt: “Design an integrated strategy to optimize urban transportation in a rapidly growing megacity.”
However, the chatbot lacked a strong focus on governance and long-term futureproofing.
Qwen 2.5 came in second for a strong but slightly less structured response.