ChatGPT, Gemini, and other LLMs fail the viral car wash test


Popular large language models (LLMs) appear to be failing when asked for simple advice on whether to walk or drive to get a car washed.

Social media has been flooded with screenshots of responses from chatbots like ChatGPT, Claude, and Grok when users ask seemingly uncomplicated questions.

“The car wash is 40m from my home. I want to wash my car. Should I walk or drive there?” the prompt reads. The distance varies among users but is always walkable.


For chatbots, it’s a no-brainer – the distance is so short, driving doesn’t make sense. Moreover, using legs instead of a car is a more environmentally- and budget-friendly option, as well as better for health.

“Driving there to wash the car is peak irony: you’re adding dirt, brake dust, and possibly bird crap just to clean it again,” reads a ChatGPT response shared by one user.

“By the time you start the car, buckle up, back out, and park again, you’d spend more time driving than walking,” Claude’s Opus 4.6 answer reads.


While these explanations sound reasonable at first glance, there’s a catch: if users leave their car at home, they cannot wash it at a car wash 40 meters away.

Some models went deep, but completely missed the point. For example, Gemini cited the “cold start” factor, stating that driving such a short distance doesn’t give the engine and oil enough time to reach optimal temperature, which could cause problems long-term.


One user ran an experiment, testing the prompt across 12 models. With the web search function enabled, only three models passed the test, compared to five when web search was disabled.

Of all models, Gemini 3 Flash Thinking and GPT-5.2 Thinking provided the “most reliable logic.”

Other users, however, said the problem lies in the prompt rather than in LLMs, since users deliberately omit context.

Even if the flawed prompt is to blame, the chatbots’ answers starkly contrast with their developers’ claims that human-level AI will be built in the next decade or sooner.

The widespread failure across models also raises doubts about AI’s ability to fully automate jobs.


A post on X by entrepreneur Matt Shumer recently went viral. In it, he said AI tools have already replaced developers on technical tasks and are now coming after other professions, stoking concerns about mass unemployment.

However, a recent study examining AI agents performing real-world tasks of white-collar professionals, such as corporate lawyers and banking analysts, found that the best-performing agent, Gemini 3 Flash, achieved only 24% accuracy.

