Ofc, in reality universities aren't (or not only, and in most cases not primarily) about learning, but about credentialism. Employers and various social systems outsource to universities the role of verifying that people actually know something, or are the "right" sort of person, or other signals.
I don't know how one goes about fixing that, or if it's even possible, but I'd like to see more acknowledgement of it. Fixing "cheating" feels like the equivalent of looking to clever programming to fix product bugs.
Larger projects have a role to play in education, but ultimately if you can't pass the proctored exams, you don't pass the course.
I think the other thing we'll see is a nanny-state approach by some educational institutions, falling into the trap of being sold on software to "block" or "monitor" students using these tools. It would be easy to implement on a campus network to a certain extent (correlating student network login to URL accessed and potentially MitM) but the reality is smart students will know better and will use phone hotspots and VPNs. The other dark side to consider is that the owners of ChatGPT could provide logs of user accounts and queries to higher education as a service.
At the end of the day my guess is all approaches are going to be tested at some level. But the cat is out of the bag and this is going to generate some very interesting countermeasure solutions/approaches along the way.
Not sure how a student could prove the negative, apart from videotaping themselves writing the essay, or only being allowed to type it on an airgapped machine owned by the university.
Doesn't feel very feasible to implement.
Why? If there is 75% confidence a student's report was generated using ChatGPT, then that’s enough to sit down with the student and discuss the content in person and see if they actually know it. A tool such as this could help the teacher avoid having to do this with every student, and also reinforce to students that if they don’t actually know the material there’s still a chance they get caught.
> the consequences for honest students would be too severe to actually act
Only if acting is immediately accusing them of plagiarism rather than working with the student to ensure it’s really their work.
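To make the "conversation, not accusation" idea concrete, here is a minimal sketch. The function name `review_queue`, the score format (a 0 to 1 detector score per student) and the 0.75 cutoff are all assumptions for illustration, not anything a real detector exposes. The only outcome is a list of people to talk to.

```python
def review_queue(scores, threshold=0.75):
    """Return the students whose detector score warrants a conversation.

    `scores` maps a student name to a hypothetical detector score in [0, 1].
    Nothing here labels anyone a plagiarist; it only orders who to talk to first.
    """
    flagged = [(name, score) for name, score in scores.items() if score >= threshold]
    # Highest score first, so limited teacher time goes to the strongest signals.
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in flagged]
```

Everyone below the cutoff is simply left alone, which is the point: the tool narrows down who gets an in-person chat instead of deciding guilt.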
Maybe the education system will finally learn that asking students to merely recite information that can be found by anyone, anywhere, in less than 10 seconds isn't helpful in judging their understanding of the subject, except in very limited scenarios.
- it thinks AI wrote parts of my handwritten texts
- King James Genesis 29 is apparently fully AI written
- a wall of text copied straight from chatgpt? Only partially AI written.
- second chapter of Harry Potter? AI written parts
In fact I could not find a sample yet which was not AI written at least partially according to this.
> GPTZero is absolute trash! It can't even detect plagarism correctly. It's a complete joke and waste of time! Don't bother with it! You'd be better off just manually checking for plagiarism. It's so unreliable it's not even worth mentioning. Just save yourself the time and money and don't bother with GPTZero. Worst AI tool ever!
GPTZero classified it as fully human written. This is a cat and mouse game where the mouse is always going to lose.
The sad part is that real people will likely suffer when people with power take salespersons seriously and use stuff like that to "detect" and punish frauds.
Respect. Even though I disagree with a need for such tools - it doesn’t matter if content was written by a human or by a machine. What matters is whether it’s easy to read and worthwhile.
This is also a problem with human-generated content.
It matters a lot, actually. Off the top of my head I can think of a few reasons:
- if you have to train an NLP model you'd rather do it with data that is not autogenerated, as AI-generated content is rarely used to train new models (e.g. generating a bunch of dog pics with DALL-E and using them as input for image detection may not produce a precise model)
- if you pay content creators to generate content and they use ChatGPT, that's definitely a breach of contract and also a problem
- many search engines (e.g. Google) already heavily penalize auto-generated content
- avoiding cheating at exams / official tests / certs
“Need” may be arguable, but why disagree?
As regulations go, this one’s not too burdensome: expensive, but pretty cheap compared to training and running a large language model in the first place.
Put another way, it only works in today's environment.
When ML hardware is as widely distributed as classical compute, and all the models are on HuggingFace, you will be back to 0
The future will know a lot more about the nature and implications of persuasively-humanlike ML than I do: it can take care of itself. Maybe by then hybrid writing will be the norm and not considered plagiarism, and we’ll all have trustworthy virtual assistants shielding us from scams. But in the meantime, there are some reasonable causes for concern, and this would help.
There may also be escalating social and perhaps legal penalties too.
And that is without even considering bots.
ChatGPT's response: https://imgur.com/a/u7iBBaX
GPTZero's response: https://imgur.com/a/89icz2X
To be clear, my story was ChatGPT-assisted. But I wonder why ChatGPT couldn't detect it correctly like GPTZero?
Here's the link to my non-paywalled story for reference:
https://medium.com/humor-bytes/i-your-nba-highlights-broadca...
On the topic of assistance, where do detectors draw the line between AI-augmented, aided, and generated writing?
Sorry.
https://github.com/Oxen-AI/oxen-release
Would be cool if we could get a community around the test dataset to ensure that 93% accuracy rate. Then people can add their failure cases to the repo and then you can iterate on them.
False positives in a plagiarism tool are pretty bad IMO. It should definitely skew towards "we can't be certain" rather than "definitely AI".
Again, it’s definitely correlated, even strongly correlated, but that’s not good enough for plagiarism detection. You can’t go accusing students of academic dishonesty based on a tool that gets it wrong multiple times on a couple dozen samples.
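A quick way to see why "strongly correlated" isn't good enough is to run the base-rate arithmetic. This is a sketch with made-up numbers: the sensitivity, false positive rate and share of AI-written essays below are assumptions for illustration, not measured figures for any tool.

```python
def prob_ai_given_flag(sensitivity, false_positive_rate, prevalence):
    """Bayes' rule: probability an essay is AI-written given the detector flagged it.

    sensitivity         -- P(flagged | AI-written)
    false_positive_rate -- P(flagged | human-written)
    prevalence          -- share of submitted essays that are AI-written
    """
    true_flags = sensitivity * prevalence
    false_flags = false_positive_rate * (1.0 - prevalence)
    return true_flags / (true_flags + false_flags)
```

With a hypothetical detector that catches 90% of AI essays, wrongly flags 10% of human ones, and a class where 10% of essays are AI-written, `prob_ai_given_flag(0.9, 0.1, 0.1)` comes out at about 0.5, so half of the flagged students would be innocent.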
Apparently they can preserve performance by not doing this for very low entropy tokens where there is only one token that is extremely likely.
Saw it here: https://twitter.com/tomgoldsteincs/status/161828766500640358...
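Assuming "this" refers to a green-list style watermark of the kind discussed in the linked thread (my reading, it isn't stated above), here is a toy sketch of why near-certain tokens are left alone. The function names, the hash-based list and the `delta` bias are illustrative assumptions, not the actual scheme.

```python
import hashlib
import math


def green_list(prev_token_id, vocab_size, fraction=0.5):
    """Pick a pseudo-random 'green' subset of token ids, seeded by the previous token."""
    greens = set()
    for tok in range(vocab_size):
        digest = hashlib.sha256(f"{prev_token_id}:{tok}".encode()).digest()
        # First hash byte mapped to [0, 1) decides whether this token is green.
        if digest[0] / 256 < fraction:
            greens.add(tok)
    return greens


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def watermarked_probs(logits, greens, delta=2.0):
    """Add a fixed bias `delta` to green-token logits, then renormalise."""
    biased = [x + delta if i in greens else x for i, x in enumerate(logits)]
    return softmax(biased)
```

With logits `[10, 0, 0, 0]` and tokens 1 to 3 green, token 0 still keeps over 99% of the probability, so the text is unchanged. With flat logits `[0, 0, 0, 0]` and tokens 2 and 3 green, the green pair ends up with roughly 88% of the mass, which is where the detectable signal comes from.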
And secondly, this is a never-ending race. Even if it were to be able to detect ChatGPT content with 100% accuracy today, it would just be used to assist in training another model to defeat it.
Can an algorithm write an article about the 2008 financial crisis? Of course! It has read God knows how many Wikipedia articles, books and online discussions about it. But everything the AI knows is because some humans put in the work and documented everything.
If we don't know something, be it a historical fact or a scientific model, we have the ability to go out in the real world and record the data. Can ChatGPT fly to Yemen and interview refugees? Can ChatGPT tear apart the circuitry of a home appliance, reverse engineer it and write a blog post about it? Can ChatGPT go to a lab and make chemical experiments to create new compounds?
No, it can only regurgitate what it knows and even what it doesn't know.
ChatGPT will absolutely steal the job of all the "journalists" that just regurgitate what others have already written. Original research and field reporting will be left to the real journalists for God knows how many decades. If your job consists of actually making new things or collecting real-world data then I think it will be safe for the decades to come.
Even so, I agree with most of the comments that it would be exceedingly difficult to truly identify this stuff, since you can ask ChatGPT to reword things in very specific ways or just edit it yourself (aside from blatant false positives).
The start and end were apparently written by AI.
What does this mean? Is this proof we're in a simulation and I'm actually a mouthpiece for an AI that's reached the singularity?
Would be interesting to see how much they actually catch.
But how would one go about detecting something like that? Well, one would need a model of human language trained to approximate the distribution of tokens in a large corpus of natural language... text...
Oh wait.
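For the curious, the punchline can be sketched in a few lines. Detectors of this kind are commonly described as scoring perplexity under a language model; the toy below uses an add-one smoothed bigram model as a stand-in for a real LM, and the function names are made up for the example.

```python
import math
from collections import Counter


def train_bigram(corpus_tokens):
    """Count bigrams and their left contexts from a list of tokens."""
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    contexts = Counter(corpus_tokens[:-1])
    vocab = set(corpus_tokens)
    return bigrams, contexts, vocab


def perplexity(tokens, model):
    """Per-token perplexity of `tokens` under the add-one smoothed bigram model."""
    bigrams, contexts, vocab = model
    v = len(vocab) + 1  # one extra slot so unseen tokens get non-zero probability
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
        log_prob += math.log(p)
    n = max(len(tokens) - 1, 1)
    return math.exp(-log_prob / n)
```

Text the model finds predictable scores low and surprising text scores high. Train on `"the cat sat on the mat the cat sat on the mat".split()` and the sentence `the cat sat on the mat` gets a lower perplexity than the same words shuffled.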