From the article, "while incorrectly labeling the human-written text as AI-written 9% of the time."
Going by the article, it's definitely not worse than random, not by a long way. The thing you most want to avoid is wrongly labeling human text as AI-written, so a 9% rate there seems pretty good. It only identified 26% of AI text as "likely AI-written", but that's still better than nothing, and better than random. What I can't tell from the article is whether those numbers include the problem cases of less than 1,000 characters. It doesn't say what the *best case* is, just what the general case is.
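To put rough numbers on "better than random", here's a quick sketch in Python. The 26% and 9% are the two figures quoted from the article. Everything else is my own assumption for illustration: the function names, and especially the share of AI-written text in whatever gets checked, which the article doesn't give.

```python
# Rates quoted from the article.
TPR = 0.26  # share of AI-written text flagged as "likely AI-written"
FPR = 0.09  # share of human-written text wrongly flagged as AI-written


def youden_j(tpr: float, fpr: float) -> float:
    """TPR minus FPR. A detector that flags at random, at any fixed rate,
    has TPR == FPR, so it scores 0. Anything above 0 beats random."""
    return tpr - fpr


def precision(tpr: float, fpr: float, ai_share: float) -> float:
    """Of the texts that get flagged, what fraction is really AI-written?

    ai_share is the fraction of all checked texts that are AI-written.
    It is NOT in the article; the values used below are assumptions.
    """
    flagged_ai = tpr * ai_share
    flagged_human = fpr * (1.0 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)


if __name__ == "__main__":
    print(f"TPR - FPR = {youden_j(TPR, FPR):.2f}")
    # Hypothetical mixes of AI vs. human text, for illustration only.
    for ai_share in (0.5, 0.1):
        p = precision(TPR, FPR, ai_share)
        print(f"AI share {ai_share:.0%}: precision of a flag = {p:.2f}")
```

A flag from a random guesser is right exactly as often as the AI share itself, so the comparison to make is precision against `ai_share`. With these two rates the flag beats that baseline at any mix, though how trustworthy a single flag is depends heavily on the mix you assume.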
Anyhow, it doesn't seem to me that worse than random is the issue here.