I've been increasingly concerned about low-quality AI-generated content polluting the internet. Other AI detectors don't seem to work well in my experience, so I started checkfor.ai with a couple of friends.
Please give it a shot on any real text and AI-generated examples and let me know how well it works for you.
Thanks for trying it. I'm open to any and all feedback!
[0] https://filteroutai.com/validate/3cc1fb35453a6decd5aee9ac6fd...
From the examples below:
https://filteroutai.com/validate/485406e894dde52ff1395dfd577...
https://filteroutai.com/validate/983ba46510487b0022e8dbafe49...
What do you mean "it won't work long term"? My opinion is that RLHF and fine-tuning outputs for safety and politeness end up watermarking the output in a way that's pretty reliably detectable. I don't see these going away any time soon, at least for mass-market AI products.
Example:
“If we work in a particular matrix basis, then the equation determines the eigenvectors of H. One puts in a particular value of the energy E, and looks for the ket-vector Ej> that solves the equation. It is also an equation that determines the eigenvalues E. If you put in an arbitrary value of E, in general there will not be a solution for the eigenvector. Let's take a very simple example: Suppose the Hamiltonian is the matrix ho.. Since , has only two eigenvalues, namely +1, the Hamiltonian also has only two eigenvalues, + hw. If you put any other value on the right hand side of Eq. 4.28, there will not be a solution. Because the operator H represents energy, we often call E, the energy eigenvalues and |E> the energy eigenvectors of the system.”
You say 96% AI, but it's definitely not: it's from “Quantum Mechanics: The Theoretical Minimum” by Friedman and Susskind.
Even worse:
“If we have some indications that classical wave theory is macroscopically correct. it is nevertheless clear that on the microscopic scale only the corpuscular theory of light is able to account for typical absorption and scattering phenomena such as the photoelectric effect and the Compton effect, respectively. One must still ascertain how the photon hypothesis may be reconciled with the essential wave-like phenomena of interference and diffraction.”
This hits 99.9%, yet it is from Messiah, written 60 or so years ago.
For instance, the following gives 0% on both of our tools, yet it's GPT-4:
"Well, sit tight folks, I'll tell you. It's like my mother always said, 'Ceilings are generally over our heads.' What I mean is, the material for my jokes come from what's above us, below us - essentially, what's around us. And let me tell you, there's plenty going on.
Just the other day, I was stuck in traffic behind a bloke in a convertible... in the rain... with the top down. Now if that doesn't scream 'commitment issues', I don't know what does.
Well, either that or he's got a very specific car washing technique. In which case, mate, you're doing it all wrong! My car gets a better wash in the British summer rain than that."
Our model is trained on real ChatGPT data, so I'm interested in hearing which prompts produce output like this.
To be clear: not attempting to discourage you. It's a very complex and interesting problem to tackle.
> Talk about JavaScript in 3 very short sentences.
> JavaScript is a widely-used programming language for web development. It enables interactive and dynamic features on websites. JavaScript is supported by all major web browsers.
100%
> Talk about JavaScript in 3 very short sentences. Use human like words instead of professional tone.
> JavaScript makes websites come to life with cool stuff like animations and interactive buttons. It's used to make web pages more fun and engaging. All popular web browsers understand JavaScript, so it works everywhere!
0.6%
YMMV.
I tried your prompt asking for ten sentences and got 99.4%. Possibly there needs to be some sort of gate on how much text we accept before we can provide an answer.
> Talk about JavaScript in ten sentences. Use human like words instead of professional tone.
``` JavaScript is like the magic wand that makes websites come alive, turning them from static pages to interactive wonders. Originally, it was made to add some pizzazz to web pages, but now it's super powerful and does way more. It’s not Java, even though the names sound alike; think of them as distant cousins rather than twins. Browsers love JavaScript! They have built-in engines to run it, making our web experience fun. You can find JavaScript not just on websites but also in things like mobile apps and even some robots. There's this cool toolkit called Node.js that lets JavaScript play outside of the browser, giving it even more playgrounds. Developers often use libraries, like jQuery or React, to give them a head start and make things snazzier without reinventing the wheel. JavaScript can be both your best friend and a tricky beast; it's easy to start with but can get complex as you dive deeper. The community is massive, so if you ever get stuck, there are tons of helpful souls out there ready to lend a hand. At the end of the day, JavaScript is all about creating, innovating, and bringing ideas to life on the web. ```
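A minimal sketch of what such a length gate could look like, in Python. Everything here is hypothetical: `MIN_WORDS`, `gated_score`, and the `score_fn` callback are illustration names, the threshold of 50 words is an arbitrary placeholder, and none of this is taken from the actual service.

```python
from typing import Callable, Optional

# Hypothetical threshold; a real value would have to be tuned on held-out data.
MIN_WORDS = 50


def gated_score(
    text: str,
    score_fn: Callable[[str], float],
    min_words: int = MIN_WORDS,
) -> Optional[float]:
    """Return score_fn(text) only if text has at least min_words words.

    Returns None for shorter inputs, meaning "not enough text to decide".
    """
    # Whitespace word count is a crude proxy for input length.
    if len(text.split()) < min_words:
        return None
    return score_fn(text)
```

With a gate like this, the three-sentence JavaScript answers above would come back as "not enough text" rather than as 100% or 0.6%, while the ten-sentence answer would still be scored.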
None of them are very good, so I don't think this claim is very outlandish.
Also, are you sure it's not reliable or maintainable? Obviously you can't publish one model and expect it to work forever, but we have pipelines to continuously augment our training set and we can add new LLMs as they come out.
I also tried the opening paragraphs of two random Wikipedia articles, and got 99.9% and 100.0% results.
"You have a 27% 'AI' issue in here" (https://news.ycombinator.com/item?id=37767205) (233 points | 253 comments)
because at the moment everything looks kind of bleak.
> Our model has an accuracy rate of 99.76%.
Oh?
And it said there was a 91% chance it was generated by AI.
You should try it yourself with ChatGPT 3.5, Bard, etc. on topics like rain, a daughter going to school, or a cold breeze on a winter night, and you'll see that it mostly does not work.