There are obviously still many opportunities for us to make fun of GenAI's capabilities, but it's getting harder to come up with the "clever" (as you said) prompt. They mostly don't add supernumerary fingers any more, and generally don't make silly arithmetic mistakes on a single prompt. We need to look for more complex, longer-time-horizon tasks to make them fail, and in many cases those tasks are as likely to trip up a human as an AI.
Indeed, your comment reminded me of Plato's Dialogues, which mostly involve Socrates deliberately trying to trap his conversation partner in a contradiction. Reading them never made me feel that Socrates's partners were unintelligent or had some deep flaw in their mental models, but rather that Socrates (at least as written up by Plato) was very clever and good at rhetoric. The same goes for AI: I don't see our ability to make these systems fail as evidence of a lack of intelligence, just that in some ways we are more intelligent or have more relevant experience.
And if you're trying to make a prediction and all you can fall back on is an "I know it when I see it" argument, then to me that is as strong a signal as any that there's no hard line separating artificial intelligence from human intelligence.