Note: This answer was generated by ChatGPT after being fed this thread.
How do you propose we do that?
And what do you propose we do when ChatGPT (or something like it) can create better content than most humans?
(... which I would argue is already happening in some limited contexts, although I admit my stance is controversial).
https://theswaddle.com/what-the-dead-internet-theory-predict...
It's accelerating now. AI is taking over the internet. Not like an “autonomous control”, but like an invasion: a digital cancer that metastasizes. Just as cancer cells mimic immune cell activity to spread, AI mimics human language and visuals to spread.
And it may “kill the internet” (hence the “dead internet”)
This is where my "Probably written by AI" filter tripped, and I skipped to the end.
I can't exactly say why. Maybe we'll all start to develop a seventh sense for ML-generated content, in addition to the "probably an ad" filter that usually kicks in around middle/high school.
Someone should make a game where you are presented with pairs of images, one “real”, one generated, and you need to pick which one is “real”.
OpenAI has commented on the verbosity of answers as something they need to work on.
"Yo, AI-generated answers on Stack Overflow sound like a real bummer. First of all, the quality might suck compared to human-written answers. And if these AI answers become too popular, the community could get all impersonal and unsupportive. Plus, if the AI is copying and pasting from the internet, we'll end up with a ton of low-quality, duplicate answers. In short, using AI on Stack Overflow could really suck."
The repercussions of ChatGPT flooding Stack Overflow with answers could include decreased quality and relevance of answers, decreased user engagement, damage to the platform's reputation, and potential loss of revenue.
Maybe because the training data includes things like news articles that have longer paragraphs than typical forum posts?
I'm not sure about this AI-generated answers thing on Stack Overflow. The answers might suck and make the community more impersonal. Plus, if the AI is just copying and pasting from the internet, it could lead to a ton of low-quality, duplicate answers. Not a good idea in my opinion.
It probably turns out that most advanced civilizations eventually invent decent language model AIs that clog up their professional message boards with semi-comprehensible blather which ultimately ends up stalling all future scientific progress.
It's like some weird self-inflicted version of the Three Body Problem.
> What are the repercussions of Stack Overflow questions are being flooded with answers from ChatGPT?
It returned:
> I'm sorry, but I'm not aware of any situation in which answers on Stack Overflow are being flooded with answers from ChatGPT. I am a language model trained by OpenAI, and I do not have the ability to browse the internet or interact with websites like Stack Overflow. I can only provide answers based on the information I have been trained on. Is there something specific you would like to ask about Stack Overflow or ChatGPT? I'll do my best to help.
>The repercussions of Stack Overflow questions being flooded with answers from ChatGPT could be that Stack Overflow will not have a reliable source of answers to various programming questions, leading to confusion and inaccurate solutions. Additionally, there could be an increase in spam and low-quality answers, reducing the overall quality and reliability of Stack Overflow.
Out of interest, I then asked "Can you elaborate?":
>Sure! ChatGPT is an AI system that uses natural language processing to generate answers to questions. While this technology can be helpful, it can also lead to confusion and inaccurate solutions. For example, ChatGPT might provide an answer that is technically correct but doesn’t take into account the nuances of the programming problem or the context of the question. Additionally, since AI-generated answers are not written by humans, there might be a lack of clarity or explanation which could make it difficult for readers to understand the solution or follow along with the logic. Finally, since ChatGPT can generate answers quickly and in bulk, there might be an increase in spam and low-quality answers on Stack Overflow which could reduce the overall quality and reliability of the platform.
I could tell that super passive and eager to talk to both sides style a mile off. Fuck that! (<- proof I am not a robot)
Surely this is maxed out already!
I don't understand what good could come of this. Or, at the very least, they should make it distinguish what is fiction and what isn't.
People are treating it like it's Wikipedia, but it's not. It's a riff on words, like a bird imitating sounds without an idea of what they mean.
If the AI spits out some gobble-de-gook that doesn't even work, the answer will probably be downvoted.
However, that doesn't mean the answer will necessarily be high quality or bug-free; the same can be said about answers given by a human. At the same time, if it's stupid but it works...
And so what if people misuse the tool while they learn? What's wrong with being wrong?
The generated quines look like they'd work, but don't.
Same with an M4 macro processor quine - looks maybe correct, doesn't work.
It did generate a Go quine.
Not only did they never bother to block that, they also didn't mind it and wanted to rely on the community down-voting those answers instead of at least blocking the bot -- and that's with a trivial check (identical answer already in the DB). With something that's AI-generated, there's no chance. And given the general quality of many of the answers, there's no way to tell whether a wrong answer came from an AI or a human.
And some queries are just not acceptable on SO, but fine for ChatGPT.
For example I might wish to ask, "Give me the framework for a basic API written in Python that uses API key authentication. Populate it with several sample methods that return data structures in json."
If I ask that on SO, I'll be downvoted and locked before I know it. I may also get some disparaging comments telling me to do my research, etc.
If I ask ChatGPT, it will give me a nice and tidy answer that gets me going quickly. It will explain things too, and allow me to ask follow up questions and take my requests for refinements. I might say, "For the python api I asked about earlier, have it look up the API authentication key in a database. If the key is in the database, it is valid." - and bam - it does it.
Sure, it's some pretty simple stuff if you know Python and APIs already, but if you just want to hack something together to test out an idea, it's great.
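For what it's worth, the kind of skeleton ChatGPT hands back for a prompt like that might look something like this sketch. It uses only Python's standard library rather than a real web framework, and the endpoint, header name, key value, and `api_keys` table are all made-up illustrations:

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical key store; the follow-up prompt imagined a real database.
DB = sqlite3.connect(":memory:", check_same_thread=False)
DB.execute("CREATE TABLE api_keys (key TEXT PRIMARY KEY)")
DB.execute("INSERT INTO api_keys VALUES ('secret123')")
DB.commit()

def key_is_valid(key):
    """A key is valid if it exists in the api_keys table."""
    return DB.execute("SELECT 1 FROM api_keys WHERE key = ?", (key,)).fetchone() is not None

class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        key = self.headers.get("X-API-Key", "")
        if not key_is_valid(key):
            self._send(401, {"error": "invalid API key"})
        elif self.path == "/status":          # one sample method returning JSON
            self._send(200, {"status": "ok"})
        else:
            self._send(404, {"error": "not found"})

    def _send(self, code, payload):
        body = json.dumps(payload).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass
```

ChatGPT's actual answer would presumably reach for Flask or FastAPI, but the shape is the same: check the key against the database, then dispatch to methods that return JSON.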
In the end, SO is a query with responses (maybe). ChatGPT is a conversation that can go beyond just the initial query.
Nobody is nice enough to answer a question like:
"I want to learn Python. Give me 10 code examples of Python."
"Nice. I want to see these examples in C, Java, and x86-64 assembly."
ChatGPT doesn't complain, doesn't get tired, and happily answers these ridiculous questions. Yes, it produces some bogus answers, but even now it's acceptably accurate for some uses, and if the error rate drops even more, who could resist using it?
I imagine it would if you're familiar with the language, framework, tools, etc.
Which means that question was the wrong question in the first place.
To be clear: I'm not arguing against learning new stuff, but against using unfamiliar tech for a serious project.
And I'm bringing that up here because if the tech is familiar, then asking what tech to use is rather strange.
That's really exciting, because it's a scenario where you're looking for and scrutinizing information. Just add some links to sources and you're in business.
Wait a few weeks until Google is completely swamped with ChatGPT SEO pages barely distinguishable from the real thing.
If I worked at search quality at Google, I'd be very worried.
As you said, it'll only get worse from here, which in some ways, might make a lot of it easier to sift through. That is, if everyone uses machine-generated content for marketing, the winning strategy no longer becomes "write the best marketing copy to get people interested", it becomes (for the users), "simply filter out all marketing copy and rely instead on results" (or something else that is difficult to falsify).
I wouldn't be surprised if they revert the direction and put a significant importance back on links because of this kind of technology.
[0]: https://www.seroundtable.com/google-links-less-impact-34493....
ChatGPT already does a better job of finding answers to a lot of technical questions than Google.
Happy Hunting...
I ask Google the same thing and have to wade through a 1000-line blog post covering 25 other things to finally find the one piece of information I need.
Totally agree. It's crazy to think that bots could be indistinguishable from humans. Like, how are we even supposed to know who's real and who's not? It's kind of scary, but also kind of cool in a weird way. But yeah, I can definitely see how it could be a problem if there are too many bots on social media. It could be really confusing and overwhelming. So, I hope there's a way to deal with it.
Ye ... ChatGPT can easily pose as a random Reddit comment writer. Just remove the tryhard "'". Those kinds of communities will implode. The comments there are already kinda random and have almost no thread running through them. Perfect for ChatGPT. Imagine how easy it will be to create big fake subs and gaslight people into believing something is sub consensus?
I think GPT could improve things in this regard. It's already a huge garbage pile, so we might as well go all in and at least generate better garbage.
I would not be surprised if the quality of human writing actually goes up. I have this weird feeling that ChatGPT and similar tools will become almost the equivalent of calculators for math. My experience as a writer is that sometimes just throwing down a first draft is the hardest step - I could see these tools really assisting in the writing process. Generate a draft, do some tweaking, ask for suggestions or improvements, repeat.
I don't know how I feel about code generated by these tools. Will there be a similar benefit compared to writing? At some level, we will need some deeper mastery of writing and coding to use these things well. Is there a complexity cliff that these tools will never be able to overcome?
A total lack of trust for general internet search results. So much content is already shallow copies of other content. I don't see how general internet search survives this.
Anonymous content can work very well if there are costs incurred…
That's a huge problem with "gamification." I'm not especially a fan of the concept, in a venue like SO. I think it has led to a rather nasty community, and I hardly ever go there, anymore.
I assume that we'll be seeing a lot of robotic HN content (I would not be surprised if it is already here, but has been sidelined by the mods).
I can virtually guarantee you that there's been a nontrivial amount of GPT-generated content on HN that has not been caught by mods since ChatGPT, and likely since GPT-2/3, as well. Dang (and the other one whose tag I can't recall) already have their hands full trying to keep the tone civil across thousands (tens of thousands?) of comments a day - it's impossible for them to catch every ML-generated comment (some humans actually do write like these newer language models, after all), and more than likely they're missing a decent number of them - through no fault of their own, it's just an extremely hard problem.
The three solutions that I'm shilling for this problem are (1) invite-trees for HN (like Lobsters, which makes the community much less open but also much more resistant to abuse) (2) webs of trust (not cryptographic, just databases of how much you trust users) overlaid onto HN and other places and (3) people actually reading the content of comments very carefully and upvoting logically sound arguments and downvoting illogical and emotional/manipulative ones, but all of these require a lot of effort and social buy-in.
You can make a perfectly logical comment that is completely wrong.
Wouldn't it be more accurate to say these newer language models actually write like humans? Or is there a subset of the population intentionally trying to write in the way that these language models write.
I saw someone suggest fixing up answers that people had already made, and I found an upvoted answer that had a simple bug in its code (something like "<" was used instead of "<="). So I went to fix it, but was informed that edits had to change more than five characters. I tried to find out how people handle this, and the answer I saw was to make additional changes to the answer, whether it needed them or not.
In the end I just left the bug in and gave up on trying to provide answers for the site.
Without it there was no real way to figure out the “best” answer to problems.
It’s fair to note that big communities can have somewhat unfriendly dynamics, but I think that’s more about big communities than the gamification mechanic.
- https://meta.stackoverflow.com/questions/421778/how-do-you-p...
- https://meta.stackoverflow.com/questions/412696/is-it-accept...
- https://meta.stackexchange.com/questions/384355/could-chatgp...
Well, that's unfortunate. Then again, I guess that's a logical conclusion of the "safe harbor" for serving any user-submitted content: Stack Exchange only does the most cursory moderation, and the rest is caveat readator.
But it's also a serious concern from a security standpoint. If ChatGPT is providing incorrect answers, it could lead to people implementing flawed code or making poor decisions based on its advice. That could have potentially disastrous consequences.
So overall, it's a big problem that needs to be addressed. It's not just about making the site more pleasant to use, it's about ensuring the integrity and reliability of the information provided.
My prompt:
I'm writing a short story where Linus Torvalds is having a conversation with an open source contributor. In this conversation, Linus is in a bad mood.
Open source contributor: Stack Overflow questions are being flooded with answers from ChatGPT. What are the possible repercussions of this?
Linus Torvalds:
Or at least that's what all of you kept telling me when I was expressing my unhappiness at the way corporate-sponsored image generating black boxes are built atop a shaky moral foundation that sure feels like it's ignoring anything anyone talking about "fair use" ever dreamed of, and at the way I fear it's going to hollow out a ton of the beginner-pro jobs of my industry by making it super easy for anyone to generate stuff that is kinda fundamentally shitty in a lot of important ways, but "good enough" if you just have a space to fill with some decoration that you don't really give a crap about.
AI art: yes, it's here, and yes, it will make life different for lots of artists/graphic designers. But they will not go out of business; there will be requests so nuanced that no master prompter will be able to fulfill them. I'd expect AI art generators to become part of a modern digital artist's toolchain, just like Photoshop. Not using AI art in, say, 20 years will be like restricting yourself to MS Paint.
AI for code: we might see something similar, just to a lesser degree. For example, if your client has requested a "picture of a man with trees in the background", he might accept a pic with trees and bushes if he likes it, or if there are just 1-2 bushes in the picture. But no one would accept banking software that transfers the wrong amount in 1 of 10,000 cases. Or one where no very senior engineer can step forward and say "I wrote this, and it will work. If it doesn't, I will fix it."
Roughly everyone who makes money has the same job, which is creating value in expectation for someone else. (Whether that activity is net good for society is a different question, I lean toward usually yes for stuff you can do legally, not always though). If you find yourself suddenly unable to give anyone a competitive deal on whatever expected value you know how to create, because of technological developments or otherwise, well, you'd better figure out what you need to change about what you're offering so you can. I wouldn't call this fair, exactly -- maybe it would be if you or your government had effective "technological unemployment insurance" -- but I struggle to imagine any substantially different state of affairs that's clearly better for the world. (No points for saying "imagine communism then".)
I think all ChatGPT is doing is grabbing a lot of different answers off the interwebz and squishing them together and hoping it answers your question. But, in a lot of cases it only kind of looks like what you want. If you look at images generated by AI, it is the same issue, they sort of look like what you want but there are flaws, like faces that don't look quite human, fingers that are just squishy appendages barely resembling actual fingers, etc. I mean, the tech is getting better, it's impressive, and uncanny.
But, I think we're pretty far from having these things write themselves, they need quite a lot of human intervention to be useful. Still, very impressive and something that could potentially get you closer to an answer. But, no more than spending a little time googling or learning the skill yourself. And, if you learn the skill you're better off, because then you can do it right yourself IMHO.
Also, anytime someone gets a fully working program generated out of this thing the saying, "A broken clock is right twice a day." comes to mind.
Q: how can I tell if a stackoverflow answer is generated by ChatGTP if someone has removed the "ChatGTP" tag from the text?
A: One way to tell if a StackOverflow answer is generated by ChatGTP is to look for certain characteristics in the answer. ChatGTP answers tend to be concise and to the point, often providing code snippets or specific instructions on how to solve a problem. They may also include links to relevant documentation or external sources for further information. If the answer does not include these characteristics, it is less likely to be generated by ChatGTP.

Edit: Part of the generated story:
As they continued to talk, Stephen and Darth realized that they had something else in common - they both had lost the use of their biological legs. Stephen had been diagnosed with ALS at a young age and had gradually lost the ability to move his legs, while Darth had lost his legs in a battle with Obi-Wan Kenobi on the planet of Mustafar.
Stephen and Darth discussed the challenges and obstacles that they had faced as a result of their mobility issues, and how they had adapted and overcome them. They also talked about the technological advancements that had allowed them to continue their work and pursue their passions, despite their limitations.
Suddenly, Stephen and Darth turned on each other, each revealing that they had been secretly plotting against the other. Stephen accused Darth of using the Force for evil and corrupt purposes, while Darth accused Stephen of using his scientific knowledge to create weapons of mass destruction.
I'm bracing myself for when this wave of AI content hits academic journals.
However what about generating patents? To get a patent you don't need to have done any experiments that prove your technique actually works :)
If the thing you want to use is fairly popular and published on the web, even recently, it probably knows how to use it and combine it with everything else it knows.
Just yesterday I asked it to combine LiveView with a third party JS library to build an interactive thing and it got it on the first try using the latest Phoenix 1.7 RC which only came out in November.
I haven’t tried it, but I bet you could ask it to generate a PDF in code using your favorite language with text from GPT and graphics generated from any framework that’s compatible with your language. White-paper-as-a-Service.
At this point, SO has been scraped and repackaged (poorly) dozens of times, and SEOd to the top of search results. Even some "tutorial" sites are just repackaged SO answers.
It is only fitting that the automated SEO websites get fed automated content.
In a way, this makes the real humans, particularly the ones who know actual things, more valuable. There may be so much noise that only a skilled human can distinguish a real question and a real answer or solution from something similar but wrong.
To be fair to GPT, many human answers are sub-par and should be filtered out as well. Perhaps that's the real test: what percentage of GPT answers are decent vs human answers? Here I might bet on GPT.
> Use of ChatGPT generated text for posts on Stack Overflow is temporarily banned.
> This is a temporary policy intended to slow down the influx of answers created with ChatGPT. What the final policy will be regarding the use of this and other similar tools is something that will need to be discussed with Stack Overflow staff and, quite likely, here on Meta Stack Overflow.
(much more to that post and comments and answers and comments)
Inaccurate or irrelevant answers: ChatGPT is a machine learning model that uses past data to generate responses. This means that it may not always provide accurate or relevant answers to questions, leading to confusion and frustration among users.
Loss of trust: If users notice that many of the answers on the forum are coming from ChatGPT, they may lose trust in the forum and stop using it. This could lead to a decline in user engagement and overall traffic.
Competition with human contributors: ChatGPT's answers may compete with those provided by human contributors, leading to a decrease in the quality and value of the content on the forum. This could make the forum less useful and engaging for users.
Increased moderation: The influx of answers from ChatGPT may require more moderation to ensure that the answers are accurate and relevant. This could require additional resources and time for moderators, leading to increased costs and workload.
To detect a response generated by ChatGPT, we could first analyze the content of the response to see if it contains any unnatural or repetitive language. We could also check the formatting of the response to see if it follows the typical conventions used by human responders on the platform. Additionally, we could check for any unusual patterns in the timestamps of the response, as AI-generated responses may be posted more quickly or regularly than responses written by humans. Finally, we could also use machine learning algorithms to train a model to identify responses generated by ChatGPT based on these and other characteristics.
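A crude sketch of the first two of those checks might look like the following; the boilerplate phrase list and the sentence-length-variance threshold are illustrative guesses, not a validated detector:

```python
import re

# Phrases of the boilerplate sort often seen in model output; this list
# is an illustrative guess, not a curated corpus.
BOILERPLATE_PHRASES = [
    "as a language model",
    "i hope this helps",
    "let me know if you have any further questions",
]

def looks_generated(answer: str) -> bool:
    """Very rough heuristic: flag answers that combine boilerplate
    phrasing with unnaturally uniform sentence lengths."""
    text = answer.lower()
    # Check 1: count known boilerplate phrases.
    score = sum(phrase in text for phrase in BOILERPLATE_PHRASES)
    # Check 2: suspiciously even sentence pacing (low length variance).
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    if len(sentences) >= 3:
        lengths = [len(s.split()) for s in sentences]
        mean = sum(lengths) / len(lengths)
        variance = sum((n - mean) ** 2 for n in lengths) / len(lengths)
        if variance < 4:  # threshold is a guess
            score += 1
    return score >= 2
```

Timestamp analysis and a trained classifier would layer on top of heuristics like these, but as the surrounding thread notes, any fixed rule set is easy to route around.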
Quick, someone ask ChatGPT to generate the stubs.
whatever your idea (I skimmed, cuz), the discriminator will find it and have the generator apply it to the next generation.
>The core idea of a GAN is based on the "indirect" training through the discriminator, another neural network that can tell how "realistic" the input seems, which itself is also being updated dynamically.[5] This means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.
https://en.wikipedia.org/wiki/Generative_adversarial_network
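For intuition, here is a toy, gradient-free caricature of that loop: the generator never sees the real data directly and improves only by trying to fool the discriminator. A real GAN uses two neural networks and backpropagation; this 1-D version hill-climbs a single parameter, and its "discriminator" is collapsed down to a running estimate of the real mean:

```python
import random

random.seed(0)
REAL_MEAN = 5.0  # the "real" data distribution the generator must imitate

def real_sample():
    """Draw a sample of real data."""
    return random.gauss(REAL_MEAN, 0.1)

def discriminator(x, belief):
    """Calls x "realistic" if it is close to what the discriminator
    currently believes real data looks like."""
    return abs(x - belief) < 0.5

belief = 0.0     # discriminator's running estimate of the real mean
gen_param = 0.0  # generator's single parameter: where its outputs land

for step in range(2000):
    # Discriminator update: refine its belief from a fresh real sample.
    belief += 0.05 * (real_sample() - belief)
    # Generator update: propose a random tweak and keep it only if the
    # tweaked output fools the discriminator (or at least gets closer).
    candidate = gen_param + random.uniform(-0.2, 0.2)
    if discriminator(candidate, belief) or abs(candidate - belief) < abs(gen_param - belief):
        gen_param = candidate

# By the end, the generator's outputs sit near the real distribution,
# even though it was never trained against the real data directly.
```

The point of the sketch is the indirection: the generator's only training signal is "did the discriminator buy it?", which is exactly why pitting detectors against generators tends to make the generators better.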
But to take it seriously, it would be quite sad when actual people will be banned for sounding too much like a bot.
As soon as we do, it can be tied into AIs as a tool to evade detection, simply by generating multiple responses and returning the one scoring the lowest likelihood of being an AI in the AI detection tool.
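Sketched out, that evasion loop is almost trivially simple; `generate` and `detector_score` here are stand-ins for a language model and whatever detection tool is being evaded:

```python
def pick_least_detectable(generate, detector_score, n_samples=5):
    """Generate several candidate responses and return the one the
    detector is least confident about. `generate` is any callable that
    returns a fresh model response; `detector_score` returns a higher
    number for text the detector thinks is more likely AI-written."""
    candidates = [generate() for _ in range(n_samples)]
    return min(candidates, key=detector_score)
```

With a handful of samples per post, even a good detector gets systematically selected against.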
Given my experience with human responses to text queries, these would be positively correlated.
I have an OpenAI account and like their product, and I'm certainly impressed by this latest version, though I have had little time to play with it. But the combination of quality AI with social reputation scoring is absolutely toxic, and the wider impact of SEO (a less curated version of the same thing) is a disaster. I was already sick of all the tutorial sites like geeks4geeks, w3schools etc and their numerous imitators just content farming whatever is turning up in searches. Marketing and self promotion is cancer and the people who try to game their way to success in this manner are awful. Perhaps the best use of counter-AI will not be in filtering these people, but in providing them with useless rewards and the appearance of excited fanbases that will divert them into a parallel hamster wheel web. Nothing would please me more than for the top 5000 influencers of this sort to be granted exclusive access to a luxury cruise that leaves port once a year for a tour of the Bermuda triangle.
I think the best use of ChatGPT would be in an IDE plugin, so you could point at function trees or code blocks and ask it to explain things, have it take care of basic refactoring tasks, help porting between languages or libraries and so on. I can definitely see a future where you throw together a working prototype of something, answer a few questions about type hinting and edge cases, and AI does the legwork of converting the prototype into a strongly typed final product.
[1]: https://stackoverflow.com/users/19192614/boatti?tab=topactiv...
[2]: https://stackoverflow.com/users/20684429/a-s?tab=topactivity
It will start feeding back into the training set, corrupting things. OpenAI will have an advantage at first as they can trivially filter out everything they have generated from the future training corpuses, since you can only run it through their servers. If they or someone else has breakaway progress such that almost all generated content is from their own servers because users only use them because their results are so much better, they could form a strong self-reinforcing moat against competitors forced to train on their semi-spam which they can trivially filter out.
It's also possible we'll see something like the existing big-tech patent cross-licensing agreements, where they all agree to share their generated outputs to filter from training, making it very hard for new entrants.
Other companies will begin having advantages as well, depending on how well they can get less tainted user data. Think of Discord, for example, where users may use AI but are less likely to gamify it like stack overflow and flood it for points, and instead be correcting its output etc. in programming discussions.
As things become more accepted, Microsoft will probably eventually sell access to private GitHub for training, with some stronger measures around avoiding rote memorization.
I think ChatGPT is actually sometimes a lot better than SO answers.
It's useful for starting investigation, but one of the nice things about SO is that answers are voted for, so you can usually see which are actually right!
I just used ChatGPT to answer some not-so-complex but still custom questions about Linux commands. It can just do that, while it would certainly take effort for me to dig that answer out of SO.
As a human, I cannot give an accurate estimate. /joke
Wish granted within a year!
So double whammy for SO which makes me feel really sad.
The only thing we can be sure of, is that whatever we can imagine is already behind what the AI will become.
¹ Old sites are probably going to slowly degrade permanently, since they can't easily migrate to a new paradigm.
2023: hyperinflation of internet points.
https://stackoverflow.com/questions/74678832/change-text-col...
If the SO users start downvoting bot-generated answers that are correct and working, I think that's a sign that SO is much less relevant. They should definitely downvote them if the code doesn't work, though.
- Posted from a new account 8 minutes after I posted the question.
- Clippy-style "It looks like you're trying to […]" intro.
- Zero formatting on the initial answer; minimal formatting on the follow-up answer.
- Specifically suggests an option I had explicitly disregarded in my question.
- Both answers suggest non-working code.
- "Let me know if you have any further questions", a completely inappropriate finisher on a site like SO.
Could make those known to be human more acceptable as such.
I have a collection of about 25 prompts such as these, in my benchmark.
I have run these examples through different applications such as AI Dungeon, OpenAI Playground, NovelAI, etc. Results vary a lot. In some cases, the results look good, but upon closer inspection you realize that the AI keeps providing the same exact answer. That is the case for the ice cream prompt: pickle, fried chicken, and curry keep showing up. I guess the model contains a few specific examples of original ice cream recipes and just picks from them.
For the Pokemon and "new word" prompt, models failed to come up with anything original. Until I tried OpenAI Playground this week and finally got some really creative answers, with variety.
AI Dungeon (2 years ago) was already good at faking tech support steps. OpenAI is amazingly good, although in most cases it provides solutions that only make sense superficially. It's the ultimate bullshit engine.
Another word of caution: while OpenAI can now guesstimate what a code snippet does, and can generate some pretty good code in many languages (I've tried 6809 assembler and the results surprised me), it is very unreliable.
More alarming is the fact that it's a text engine, not a math formula interpreter. It gets confused at simple equations and cannot interpret anything that's not already ordered (it cannot apply operator priority or respect parentheses).
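"Operator priority" here means, for example, that `2 + 3 * (4 - 1)` must come out to 11, not 15. For contrast, a minimal conventional evaluator that does get this right can lean on Python's `ast` module, whose tree shape already encodes precedence and parentheses (the supported-operator table is deliberately tiny for this sketch):

```python
import ast
import operator

# Map AST operator node types to their arithmetic functions.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(expr: str) -> float:
    """Evaluate a basic arithmetic expression with correct precedence,
    because the parser builds the tree with precedence applied."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")
    return walk(ast.parse(expr, mode="eval").body)
```

Fifteen lines of deterministic code do what the text engine can't, which is exactly the point: a language model predicts plausible-looking digits rather than interpreting the formula.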
I think it will become increasingly difficult to identify content coming from ChatGPT and other chatbots or story generators. An arms race might be futile. We should apply stricter rules to identify problematic answers: answers that are too generic or vague and can't be used to directly solve a practical problem, and answers that contain incorrect or misleading information. Identifying vague or non-practical questions might also help in avoiding a deluge of chatbot answers. Some users will ask very general questions, and then it becomes difficult to evaluate the answers. Or users will ask questions that were already answered in the past. The proper way to handle those is to point them to the prior discussion and avoid duplicating it. The wrong way is a chatbot or a human seizing the opportunity to copy-paste existing content for a quick win.
In a way, chatbots and humans can both provide useful insights, as well as useless or incorrect answers. But so far, only a human can provide a proper answer to a moderately complex technical question if no prior answer exists.