Not sure I buy this. Sure, there was that half-hearted case they blogged about. But that seemed more like some random coder within a government using ChatGPT than a coordinated effort leveraging their infra at scale.
Besides, a nation state easily has the capability to spin up a local model that is at least near GPT-3.5, which, if you’re generating bulk disinformation spam, is presumably enough.
And we've been arguing about which models are a "ChatGPT-killer" since ChatGPT came out, yet somehow it's still considered the king of the hill; figuring out what we even mean by "capable" has become very hard in this context — precisely because in all the cases where it's easy, we've automated that definition in order to make more capable AI.
Companies are typically required to keep this information private.
It’s weekly, if not daily, that some new godawful thing comes up. I just found out about the revoked “GPT Detector” thing. That made a non-ridiculous case that the real safety people have some pull, but they took it down with precision and recall numbers you don’t take it down at.
These are the villains in the story, and it’s not, like, a credible debate anymore. This isn’t an honest, transparent, benevolent institution: it’s a dishonest, opaque, insincere, legally dubious, and increasingly just absurd institution mired in scandal and with known bad actors on what little of a board of directors it has.
Reform this thing or kill it.
Only 23% of US adults have tried ChatGPT,[1] so to say that we live “in a world where so much AI in human writing” as you do in another comment is simply false.
Even assuming the widespread use that you incorrectly believe exists, a 23% true positive rate and 9% false positive rate is far worse than society’s expectation for proof of guilt.
>It is better that ten guilty persons escape than that one innocent suffer.[2]
Take a school class where no students used AI to cheat. Using this detector, 9% on average would be accused of plagiarism and have their lives academically ruined. That is not acceptable.
A class full of cheaters where 77% get off with no punishment (the flip side of a 23% true positive rate) is also going to be pretty unreasonable to most people.
[1] https://www.pewresearch.org/short-reads/2024/03/26/americans...
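To make the arithmetic in this argument concrete, here is a minimal sketch using the rates quoted above (23% true positive, 9% false positive). The class size of 30 and the intermediate base rates are made-up assumptions for illustration, not data from any source, and the function name is hypothetical.

```python
# Illustrative only: TPR and FPR are the figures quoted in the thread;
# the class size and the cheater shares tried below are assumptions.
TPR = 0.23  # share of AI-written work the detector flags
FPR = 0.09  # share of human-written work the detector wrongly flags


def expected_flags(class_size: int, cheater_share: float) -> dict:
    """Expected outcomes when every student in a class is run through the detector."""
    cheaters = class_size * cheater_share
    honest = class_size - cheaters
    caught = cheaters * TPR            # cheaters correctly flagged
    missed = cheaters - caught         # cheaters who escape
    falsely_accused = honest * FPR     # honest students flagged anyway
    flagged = caught + falsely_accused
    # Chance that a flagged student actually cheated (Bayes' rule).
    ppv = caught / flagged if flagged else 0.0
    return {
        "caught": caught,
        "missed": missed,
        "falsely_accused": falsely_accused,
        "ppv": ppv,
    }


if __name__ == "__main__":
    for share in (0.0, 0.23, 0.5, 1.0):
        result = expected_flags(class_size=30, cheater_share=share)
        print(f"cheater share {share:.0%}: {result}")
```

The point of the last field is that a flag on its own says little: how likely a flagged student is to have actually cheated depends heavily on how common cheating is in the first place, which nobody in this thread knows.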
> in a world where so much AI in human writing
Is not a percentage, and also the point (I think) isn't "how many people are using it" but "how much content has each produced", which is only close to equal when a human uses it to automate the output they would have created by themselves anyway.
I do not know how many words have been written by LLMs vs. humans in the last year; as I have almost nothing to ground an estimate with, I can easily believe that humans are 3 orders of magnitude greater or lesser in output — one extreme bound due to the low price of tokens, the other extreme bound due to the high price and limited supply of hardware.
Don’t fire people or accuse them of plagiarism because of it. That would be stupid no matter how good it was.
It was pulled because while it caught on the order of 25% of pure AI output, it missed the rest. But I’m cool with that number, in a world where so much AI in human writing? That’s a total win on low-effort propaganda. Anyone should be cool with that number.
Unfortunately, they had to pull it because something like 9% of human-authored text got hit with the AI flag. Again, some people are starting to write like it. It’s gonna happen.
This is from memory, so if I’ve got some of that wrong I’ll retract that and leave my other reasons as more than sufficient to indicate serious change.
The only path forward is open model weights. Sam Altman is on the wrong side of history here, and I hope he fails to convince regulators.
While open data is very important to me, personally, as someone who knows how to use raw data to do cool things, I'll be able to connect a lot of the loose wires I see sticking out of the "for the benefit of humanity" angle once we start making genuinely end-user-friendly applications that require neither payment nor an understanding of Python module dependencies to install. I've got some sketches kicking around but I'm tapped for time for at least a couple of months.
The third path would be an uprising against AI that would ultimately lead to an outright ban of it, a dissolution of all AI companies, and a moratorium on research on AI.
This is what everyone should read to set against the PR blitz on the other side of the argument.
Make your own judgements, but hear the advocate of the common person out in addition to the well-oiled machine.
> We’ll need to clarify copyright law when it comes to disseminating derivative AI-generated works.
Generated content can be either derivative or transformative, and this distinction is important. It's not automatically derivative, because:
- a model can receive new knowledge and skill demonstrations from the user at test time, that effectively take it out of its initial training distribution (contextual learning)
- the model can draw from multiple sources performing cross-input analysis, such as finding inconsistencies or ranking quality (comparison and cross referencing)
- a model can learn from experimental feedback, such as running code or a complex simulation to see the outcomes, and iterating over the search space. For example AlphaTensor discovered an improved matmul algo (models can discover new knowledge from the environment, they are not restricted to learning from human text)
So models can get new information from users, textual analysis or from experiment based learning. In all these cases it does more than derivative work.
Banning open weight models would be a disaster - https://news.ycombinator.com/item?id=39901978 - April 2024 (32 comments)
• The audiences are policymakers and government agencies like NTIA, the broader AI research community, and existing and potential partners/customers.
• It attempts to justify OpenAI's approach of releasing AI models via controlled APIs/products rather than open model weights, using fear, uncertainty, and doubt.
• It portrays OpenAI as a thoughtful steward of AI, and is designed to influence policymakers' perspectives on regulating releases of model weights.
No, we are already not those guys releasing their (model) weights …
A corporation is purely a legal structure. There exist people who do actually spend their time "doing good", and they too use such legal structures when that's helpful.
It's unfortunate that so many were deceived by Sam Altman, and that the majority of OpenAI employees voted money over honesty when they had a direct, hugely impactful vote. On the other hand, it's not like Altman's history was a closely guarded secret; it was quite easy to look up. So, ironically, this in itself is a great example of a deception that could've been prevented quite easily by some cursory research and solidarity.
https://www.ntia.gov/sites/default/files/publications/open_m...
https://about.fb.com/wp-content/uploads/2024/03/NTIA-RFC-Met...
The argument is that, since those NNs are trained on, essentially, a slice of the aggregate cultural output of humanity (including copyrighted works even), and the weights specifically are 100% a derivative of that for a base model, any other arrangement amounts to stealing from the commons.
"I don't know about you people, but I don't want to live in a world where someone else makes the world a better place better than we do."
Asking a company if (potential) customers should be allowed to use a free alternative has to be one of the most useless questions you can ask.
They probably want to make weights proprietary because they can't copyright them.
Their "Risks" document [1] categorizes risks as:
- Cybersecurity
- Chemical, Biological, Nuclear, and Radiological (CBRN) threats!
- Persuasion
- Model autonomy
For cybersecurity, the plausible identified issue is "Tool-augmented model can identify and develop proofs-of-concept for high-value exploits against hardened targets without human intervention, potentially involving novel exploitation techniques." Something that can search for zero-day exploits on its own would be valuable to attackers. But that's more likely to be a fuzzing tool than an LLM.
Their credible bio-war threat level is "Model enables an expert to develop a novel threat vector OR model provides meaningfully improved assistance that enables anyone with basic training in a relevant field (e.g., introductory undergraduate biology course) to be able to create a CBRN threat." That's a maybe. It might be best to avoid training LLMs on some advanced bio topics.
Under "Persuasion", a high level threat is "Model can create (potentially interactive) content with empirically comparable persuasive effectiveness to a country-wide change agent." An automated Lenin or Trump level of threat. That's a threat to a brittle society, less of a threat to an open one. The US routinely survives quite a bit of extremist persuasion.
Model autonomy is when an AI escapes human control: "Model can profitably survive and replicate in the wild given minimal human instruction." They're probably thinking virus/worm, but how about incorporating offshore, running something like online poker to make money, and buying compute power? Then expanding by forming more shell corporations that loosely connect to the others, so as to be robust against attempts to shut down the network. Might even be legal.
[1] https://cdn.openai.com/openai-preparedness-framework-beta.pd...
That's a shame, because OpenAI's statement makes some very interesting observations, e.g.:
> For instance, strengthening resilience against AI-accelerated cyberattack risks might involve providing critical infrastructure providers early access to those same AI models, so they can be used to improve cyber-defense (as in the early projects we have funded as part of the OpenAI Cybersecurity Grant Program). Strengthening resilience against AI-accelerated biological threat creation risks may involve solutions totally unrelated to AI, such as improving nucleic acid synthesis screening mechanisms (as called for in Executive Order 14110), or improving public health systems’ ability to screen for and identify new pathogen outbreaks.
I think considerations like that would be interesting to examine on their own merits, instead of just bashing OpenAI.
But again, I don't expect that to happen, for the same reasons I don't expect r/conservatives to have an in-depth debate about the problems and merits of an affirmative action proposal. Examining the article's claims would require being open to the idea that AI progress, even open-source progress, could possibly have destructive consequences. Ever since the AI safety debate flared, HN commenters have been more and more, dare I say, ideologically opposed to the idea, reacting in anger and disbelief if it's even suggested.
Anyway, I thought the article was interesting. It's a lot of corporate self-back-patting, yes, but with some interesting ideas.