Worrying that this will no longer be the case.
GPT-4 can do that task for fractions of a penny per email now. It doesn't have to be perfect if it's competing with nothing. I expect we'll see similar shops for any other high-cost paper-trail business.
What I'd really love to implement is a way for GPT-4 to answer questions based on a corpus of "all our Confluence pages plus random other sources of documentation." Like with the legal document issue, it's a bit of a nonstarter right now given the proprietary nature of corporate documentation.
https://old.reddit.com/r/ChatGPT/comments/12fiwaf/chat_gpt_w...
GPT-like systems will close that gap, and then come all of the problems of automated law enforcement: extrapolation from incomplete data, false positives from coincidences, interpretation errors, all that annoying stuff.
They've created a huge library of unorganized data. The difference here is they now can spawn a million untiring AI private investigators / librarians to organize this information into coherent "case files".
At least for me, until this point I've had a feeling of anonymity in the idea that, while my data is being slurped up, I'm just one data point in a sea of other 'normal' people. There would be little value in spending government time and effort tying all of the web detritus together for me. The juice would definitely not be worth the squeeze.
However, when the cost of this effort is nearly zero, that now becomes a different story. The balance of power between government and the people it rules is going to radically shift.
That's already how it worked on platforms like mturk and uhrs: lots of the work was transcribing audio dumps from microphones built into computers/phones/smart home devices. UHRS especially had a lot of that (it's owned by MS), as well as search engine grading type work. They also certainly do not pay well. I'd imagine that in practice there isn't much cost difference between paying a bunch of bored people to do it and the compute cost of running an AI model to do it, but the AI model will be vastly more accurate and will work 24/7.
Imagine you've sent an email about transporting a friend's daughter across state lines to get a medically-necessary abortion. Or if you prefer, imagine you've arranged via email to "lose" some firearms which don't comply with your state's new assault weapons ban.
Pre-LLMs, trying to find these sorts of emails was very hard. A simple text search for "abortion" or "gun" is going to come up with far more emails where two family members got into a political debate, than emails about lawbreaking. Big Brother will find a few such emails here and there by chance, but the vast majority of such incriminating emails will simply be lost in the pile.
Enter LLMs, and Big Brother can feed some of the incriminating emails found by chance into a training dataset along with a bunch of non-incriminating emails, and teach the AI to find incriminating emails. It can then apply the model to the entire list of emails and get a nicely filtered list of only the emails which are incriminating, further tuning the model by adding emails it gets wrong to the training dataset when they are found.
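To make the loop described above concrete (train on a few labeled examples, apply the model, add its mistakes back to the training set, retrain), here is a rough sketch. It uses a tiny hand-rolled Naive Bayes word-count classifier rather than an LLM, purely as a stand-in, and every name and message in it (`NaiveBayes`, `feedback_loop`, the toy texts) is made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; good enough for a toy example."""
    return text.lower().split()


class NaiveBayes:
    """Minimal two-class Naive Bayes over word counts (labels are True/False)."""

    def fit(self, examples):
        # Rebuild all counts from scratch from (text, label) pairs.
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        for text, label in examples:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])

    def score(self, text, label):
        # Log prior plus summed log likelihoods, with add-one smoothing.
        total_docs = sum(self.doc_counts.values())
        logp = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        denom = total_words + len(self.vocab) + 1
        for word in tokenize(text):
            logp += math.log((self.word_counts[label][word] + 1) / denom)
        return logp

    def predict(self, text):
        return self.score(text, True) > self.score(text, False)


def feedback_loop(model, training, reviewed):
    """Add every reviewed example the model gets wrong to the training set,
    retraining after each correction. Returns the number of corrections."""
    corrections = 0
    for text, true_label in reviewed:
        if model.predict(text) != true_label:
            training.append((text, true_label))
            model.fit(training)
            corrections += 1
    return corrections


# Toy data: True means "flag this message", False means "ignore it".
training = [
    ("meet at the depot tonight bring the van", True),
    ("bring the van to the depot after dark", True),
    ("the debate about the law went on all night", False),
    ("my uncle and i argued about the law again", False),
]
model = NaiveBayes()
model.fit(training)

# A message a human reviewer later marks as a false positive.
reviewed = [("we argued about the van at the depot", False)]
before = model.predict(reviewed[0][0])
corrections = feedback_loop(model, training, reviewed)
after = model.predict(reviewed[0][0])
```

The point is only the shape of the loop: the first model wrongly flags the "argued about" message because it shares words with the flagged examples, and one round of correction is enough to fix that particular mistake. A real system would need far more data and would still produce false positives.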
I believe people can contribute in many different ways. When technology enables us to get my work output without me, that frees me up to produce other things for society.
The problem is that it is a disruption for everything, because at its core it is a machine for the replication of skill and technology. That is a concept that has never existed in any prior technological disruption.
"Climbing the skill ladder is going to look more like running on a treadmill at the gym. No matter how fast you run, you aren’t moving, AI is still right behind you learning everything that you can do."
From a more in-depth view I wrote up here, describing the rapidly shrinking innovation, disruption and adaptation cycles.
Sure. But when your “keep the lights on” job cuts you for AI, you are less likely to “produce other things” while you worry about food and heat.
That's going to be the exception, not the rule. The benefits from automating crowdwork will disproportionately accrue to corporate profits.
Either you go build something you want, an island where code doesn't talk back, or leave tech altogether.
I am saddened that I don't even recognise this place anymore. It's not Hacker News anymore, it's AI news. It's starry-eyed engineers jumping over each other, ready to sell their metaphorical soul. Even I, the Luddite, can't seem to talk about anything other than this bloody thing.
My current plan is to build a small business and retire in the middle of the woods somewhere. Do some Lisp coding while the rest of the world is dancing around their new idol.
/rant, send me an email if you wanna rant about it as well, and discuss your concerns.
What is going to happen are private robotic armies making sure private owners remain private owners.
And then, we will go back to times where people were not citizens by default and had fewer rights.
I’m not worried about rogue AI taking over the nukes. I’m worried the same people who think it’s a great idea to charge so much for insulin that people start dying are the ones who will be using AI to hurt people.
Hell, give me a slightly evil AI run amok over any pharma CEO doing their job.
Human greed is the problem. Authoritarianism and capitalism are just subcategories of the greed problem.
What we don't have an answer for yet is: will AGI be greedy?
"Employing Surge AI's top-tier human annotators at a rate of $25 per hour would have cost $500,000 for 20,000 hours of work, an excessive amount to invest in the research endeavor. Surge AI is a venture-backed startup that performs the human labeling for numerous AI companies including OpenAI, Meta, and Anthropic."
What could go wrong? Using GPT-4 to perform labeling used by OpenAI in order to train...uh, wait.
Think about it, how many millions of articles are posted online produced by OpenAI's GPTs to date... Good luck clearing out the training data for GPT-5.
True human content will gradually get scarce. We certainly steer it for our posts, but it is still GPTs that do the heavy lifting.
OpenAI's own classifier fails to detect GPT-4 generated text at the moment.
That's because, beyond the 'As an AI language model' and a few key words, it can be nearly impossible to detect GPT-4, especially if any prompt is used to intentionally keep it from being detected.
Human like text is a solved problem. There is no more getting better at detecting AI written text, there is only classifying more humans incorrectly at this point.
With an AI like GPT, it is quirky and amusing. Once AIs get really powerful, it becomes scary, and a lot of people who understand this field much better than I do are worried it has a good chance of being deadly. Like, potentially kill-everyone-on-earth deadly.
I'm much more impressed by GPT's ability to handle input than I am by its ability to generate output. It's arguably as good at reading comprehension as most humans.
For example, consider filling the blank:
A giant ______ flew over my head!
It can be a plane. Or a dragon. Or a UFO. Or a balloon. The thing is, all of those are correct answers language-wise, and the model works correctly as long as what gets filled in conforms to the rules of the given language.
The language that we generate encodes reality to some extent and the model picks up those correlations but there is no concept of reasoning or reality behind it. Maybe it is emergent at some point (as to effectively compress it needs to encode some subset of rules governing our reality) but it is not an agent that optimizes for understanding our reality. Something like Dreamer would be much closer to that.
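The fill-in-the-blank example above can be sketched in a few lines. This is a toy, not how GPT works internally: the counts are invented numbers standing in for "how often each word followed 'A giant' in some corpus", and `next_word_distribution` and `fill_blank` are names made up for this sketch.

```python
import random

BLANK = "______"
TEMPLATE = "A giant " + BLANK + " flew over my head!"

# Hypothetical counts, invented for illustration only.
COUNTS = {"plane": 6, "dragon": 2, "UFO": 1, "balloon": 3}


def next_word_distribution(counts):
    """Turn raw counts into a probability for each candidate word."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def fill_blank(template, counts, rng):
    """Sample one candidate in proportion to its probability and fill the blank."""
    dist = next_word_distribution(counts)
    words = list(dist)
    weights = [dist[word] for word in words]
    choice = rng.choices(words, weights=weights, k=1)[0]
    return template.replace(BLANK, choice)


dist = next_word_distribution(COUNTS)
sentence = fill_blank(TEMPLATE, COUNTS, random.Random(0))
```

Every sampled sentence is grammatical, and nothing in the sampler checks whether a dragon could actually have flown overhead. The distribution only reflects how often the words co-occurred in the (here imaginary) text.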
Do we get hoverboards now or is that later?
There are just not enough NVIDIA GPUs.
Replace "the machine" with "the market" and it describes some people today.
Labelbox does image annotating still, and one CTO said as soon as GPT-4 enabled this for him he'd have his team homebrew it from there.
In other words, the metrics are biased in the researchers' favor, so GPT-4 would have beaten them even more often (probably a majority of the time, based on the numbers) if someone else had created the guidelines and golden labels.
BTW, this is interesting. There is a lot of noise about AI carbon footprint. Now imagine how much humans would eat and fart for 20,000 work hours. That's about 10 man-years, assuming an 8h / 5d / 50-week schedule.
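For anyone who wants to check the arithmetic, a quick sketch using only the figures already in the thread (the $25/hour rate and 20,000 hours from the quoted passage, and the 8h / 5d / 50-week schedule assumed above). The variable names are just for this example.

```python
# Figures taken from the thread; nothing here is measured data.
HOURLY_RATE_USD = 25
TOTAL_HOURS = 20_000

HOURS_PER_DAY = 8
DAYS_PER_WEEK = 5
WEEKS_PER_YEAR = 50

# One full-time work year under the assumed schedule.
hours_per_work_year = HOURS_PER_DAY * DAYS_PER_WEEK * WEEKS_PER_YEAR

# Labor expressed in work years, and the total labeling bill.
person_years = TOTAL_HOURS / hours_per_work_year
total_cost_usd = HOURLY_RATE_USD * TOTAL_HOURS
```

Under that schedule a work year is 2,000 hours, so 20,000 hours comes out to 10 work years and $500,000, matching both numbers quoted in the thread.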
I don’t think you can compare people’s carbon footprint because those people will exist regardless of jobs.
But they don't have to /s
Surprisingly (to me) many people think that GPT-N will never exceed human level intelligence because it was trained on the internet. I think that argument is obviously wrong.
Another is that I am sure a large chunk of people will never concede that the AI is smarter than them. Literally never, no matter how smart the bot gets. I mean, probably a lot of people think they are as smart as anyone else. They won't agree that someone else is smarter than them, and they certainly won't agree that some bot is smarter than them. It's also a loaded assessment, like they will think that if they agree to that, then they are also implicitly agreeing to cede their personal agency to the bot.
Another possibility is that GPT-N successors that surpass human level cognition will be banned by regulation, like some drugs or nuclear explosives or bio weapons. They could even be pre-emptively banned at some level below human level, and maybe it would never be publicly acknowledged that it's technically possible to go above human level.
And if your ground truth is problematic, then this is generally a problem of specification and quality control, not performance.