Right now if you look at GPT-3's output it seems like it's approaching a convincing approximation of a bluffing college student writing a bad paper, correct sentences and stuff but very 'cocky'. It cannot tell right from wrong, and it will just make up convincing rubbish, 'hoping' to fool the reader. (I know I'm anthropomorphizing but bear with me).
Current models are being trained on a huge amount of internet text. As smarmy denizens of hackernews we know that people are very often wrong (or 'not even wrong') on the internet. It seems to me that anything trained on internet data is kinda doomed to poison itself on the high ratio of garbage floating around here?
We've seen with a lot of machine-learning stuff that biased data will create biased models, so you have to be really careful what you train it on. The dataset on which GPT-n has to be trained has to be pretty huge(?); and moderation is hard(?) and doesn't scale; it's easier to generate falsehood than truth; and the further we go along the more of the internet's data will be (weaponized?) output of GPT-(n-1). So won't the arrival of AGI just be sabotaged by the arrival of AGI?
Has anyone written something about the process of building AGI that deals with this?
When GPT-3 writes a scientific paper it's not trying to test a premise or critically evaluate some data, it's trying to generate text that looks like that sort of thing.
It doesn't matter how many excellent quality papers you fed it, or how stringently you excluded low quality input data, it would still only be trying to produce facsimiles. It wouldn't be actually trying to do the things a real conscientious scientist is trying to do when they write a paper. Arguably at best it might be trying to do what a deceitful scientist trying to get credit for a paper with spurious results with no actual scientific merit might be doing when writing a paper that looks plausible, but that's actually a completely different activity.
The code generation examples are really interesting. Here it's being used to generate real working code that has actual value, and it appears to work pretty well. Its code often has bugs, but whose doesn't? It only works for fairly short, precisely definable coding tasks though. I don't think scaling it up to more complex coding problems is going to work. Again it doesn't understand the meaning of anything. It's trying to produce code that looks like working code, not actually solve the programming task you're giving it. It doesn't even know what a program is or what a programming task is. It doesn't know what input and output are. For example you can't ask it to modify existing code to change its behaviour. It doesn't know code has behaviour. It has no idea what that even means and has no way to find out or any route to gaining that capability, because that's not a text transformation task and all it does is transform text.
The best, in fact the only way to generate truly convincing text output on most subjects is to understand, on some level, what you're writing about. In other words, to create a higher level abstraction than simply "statistically speaking, this word seems to follow that one". Once you start to encode that words map to concepts, you can use the resulting conceptual model to create output which is conceptually consistent, then map it backwards to words. This is what humans do with sensory data, and there is good evidence that GPT-3 is doing this too, to some degree.
Take simple arithmetic, such as adding two and three digit numbers. GPT-2 could not do this very successfully. It did indeed look like it was treating it as a "find the textual pattern" problem.
But GPT-3 is much more successful, including at giving correct answers to arithmetic problems that weren't in its training set.
So what changed? We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy for correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic and encode them in some portion of its neural network, then apply them whenever the prompt suggested to do so.
If this is the case, and it remains speculation at this point, would you still argue that GPT-3 doesn't "understand" arithmetic, on some level? I would argue that this abstraction, this mapping of words onto higher-level concepts, which can then be manipulated to solve more complex problems, is exactly what intelligence is, once you strip away biologically-biased assumptions.
Certainly, at this point GPT-3's conceptual understanding remains somewhat primitive and unstable, but the fact that it exhibits it at all, and sometimes in spookily impressive ways, is what has people excited and worried. We have produced AIs that can perhaps think conceptually about relatively narrow topics like playing Go, but we have never before created one that can do so on such a wide range of topics. And there is no suggestion that GPT-3's level of ability represents a maximum. GPT-4 and beyond will be more powerful, meaning that they can mine more and more powerful conceptual understanding from their training data.
It's certainly impressive what GPT-3 can do, but it boggles the mind how much data went into it. By contrast a well-educated renaissance man might have read a book every month or so from age 15 to 30? That doesn't seem to be anywhere near what GPT could swallow in a few seconds.
When you look at how GPT answers things, it kinda feels like someone who has heard the keywords and can spout some things that at least obscure whether it has ever studied a given subject, and this is impressive. What I wonder is whether it can do reasoning of the causality kind: what if X hadn't happened, what evidence do we need to collect to know if theory Z is falsified, which data W is confounding?
To me it seems that sort of thing is what smart people are able to work out, with a lot of reading, but not quite the mountain that GPT reads.
You're ignoring the insane amount of sensory information a human gets in 30 years. I think that absolutely dwarfs the amount of information that GPT-3 eats in a training run.
Humans get this through experience and time (recognizing patterns about which sources to trust) but there is nothing magical about it. Should be very easy to add this.
I think logic is the easy part, we can already do that with our current technology.
The difficult part is disambiguating an input text, and transforming it into something a logic subsystem can deal with. (And then optionally transform the results back to text).
Not unlike a non-technical manager discussing tech!
It's not certain that this is always the case. In at least one case I've seen, if you give it question-and-answer prompts where you don't demonstrate that you will accept the answer "your question is nonsense", it will indeed make things up; but if you include "your question is nonsense" as an acceptable answer in the sample prompts, then it will use it correctly. See https://twitter.com/nicklovescode/status/1284050958977130497 .
It seems that we have a lot to learn about how to use GPT-3 effectively!
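The prompt trick described above can be sketched in a few lines of Python. This is only an illustration: the example questions, the exact refusal wording and the helper name build_prompt are made up here and are not taken from the linked tweet, and no model is actually called.

```python
# Hypothetical few-shot prompt builder. The example Q/A pairs are invented
# for illustration; they are not the ones used in the linked tweet.
NONSENSE = "Your question is nonsense."

EXAMPLES = [
    ("How many legs does a horse have?", "Four."),
    ("How heavy is the colour blue on Tuesdays?", NONSENSE),
    ("What is the capital of France?", "Paris."),
    ("Which month comes before the number seven?", NONSENSE),
]


def build_prompt(question, examples=EXAMPLES):
    """Return a Q/A prompt that demonstrates refusing nonsense questions."""
    lines = []
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    # Leave the final answer open for the model to complete.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("How many corners does a whisper have?"))
```

Dropping the two NONSENSE pairs from EXAMPLES gives the other kind of prompt, the one where the model has never been shown that a refusal is an acceptable answer.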
It doesn't know which facts about the world are true and which are fabricated except through the text it's trained on, but to a large extent neither do you. It suffices to merely have the ability to reason about it. The primary difference is that whereas you reason fairly competently from one perspective with one pseudo-coherent set of goals, GPT-3 reasons to a weak degree from all perspectives, privileging no view point in particular.
How important this ends up being depends on where the model plateaus. On one extreme, if it plateaus close to where GPT-3 already is, no harm done, it's a fun toy. On the other, if it scales until perplexity gets to far-superhuman levels, it doesn't matter at all, since you can just prompt it with Terence Tao talking about his latest discovery.
Naturally, it will land somewhere in the middle of these points. The question is ultimately then whether the landing point captures enough general reasoning that you can use it to bootstrap some more advanced reasoning agent. A sufficiently powerful GPT-N should, for example, be able to deliberate over its own generated ideas and sort the coherent reasoning from the incoherent.
That same sentiment could be equally applied to humans, and not just in the internet era, but throughout all of history. There will always be misinformation and "wrong" opinions out there. "It cannot tell right from wrong" is an accusation leveled against human beings every day. We can't even all agree on what is right and wrong, truth or untruth.
A true AI is going to have to wade through all that and make its own decisions to be viewed and judged from many different perspectives, just like the rest of us.
https://en.wikipedia.org/wiki/Kessler_syndrome
Speaking with GPT-3 also makes one realize how influenced its predictions of AI scenarios are by dystopian memes. In any conversation in which you are "speaking to the AI", the AI can go rogue.
There just aren't enough positive role models authors have written for AGI.
Ah, one only needs to think back to Tay to know how this sort of thing will end.
https://en.wikipedia.org/wiki/Tay_(bot)
(Imagine 4chan got wind of this bot and retrained it, which is probably what happened...)
https://news.ycombinator.com/item?id=23896293
I guess books published will be more useful than reddit rants (depending on the application)
Low-quality noise cancels out and leaves the high-quality signal. In the limit, the internet offers the true sequence probabilities for compression of natural text.
You can also put more weight on authoritative data sources, such as Wikipedia and StackOverflow, but even uniformly weighted: It is possible to sequence-complete prime numbers, despite the many many pages online with random numbers.
GPT-3 is trained on a filtered version of Common Crawl, enhanced with authoritative datasets, such as Books1, WebText, and Wikipedia-en. Moderation is done automatically, with a toxicity classifier/toggle. If GPT-n becomes good enough to be accepted in authoritative datasets, then it is perfectly fine training data, a form of semi-supervised learning.
Bias is going to be a double-edged sword: I believe it will be impossible to prescribe common sense, nor to sanitize common sense to remove, say, gender bias, and still be able to understand a sexist joke about female programmers, or male nurses. We want an AI to be human, but we don't want it to associate CEOs with white males, dark hair, wearing suits. That will conflict.
And the vision of what AI was becoming was voiced much earlier by Yudkowsky, whose Singularity Institute received funding from Thiel.
If anything, they heard what a few prophets were shouting. They saw some early demos in a startup pitch. They responded and DeepMind's work soon became as public as AI research is. That is to say, most people ignored it until the Google acquisition.
It's probably not a state-of-the-art breakthrough at this point. Who knows what OpenAI has done in the intervening two years?
Here's an example: https://www.theverge.com/2016/6/2/11837566/elon-musk-one-ai-...
Read past the fluff and the skynet. He's telling us Google scares him and makes him concerned for democracy.
Very carefully. I mean, that's not much of an argument. Lots of stuff is successfully kept secret. The US managed to keep a lid on their surveillance for decades (iirc) before the lid got blown on that, and people used to give the same argument you are in that context, too.
What's the alternative? Do you think megacorps never keep illicit things under wraps for extended periods of time?
Ahh... I'm not so sure; see the comment by blueyes. Had it been OpenAI's goal to engage in debate and regulation for this technology, they would have been vocal about that aspect of their work already.
"Not reading OP and spouting a non-quantitative conspiracy theory" = what you did.
but you'd have no way of enforcing any treaty so that's a moot point and they would know this.
I think making people aware of the importance of the control or value loading problems is a much better use of efforts.
As far as I can tell, this is what is going on: they do not have any such thing, because GB and DM do not believe in the scaling hypothesis the way that Sutskever, Amodei and others at OA do.
GB is entirely too practical and short-term focused to dabble in such esoteric & expensive speculation, although Quoc's group occasionally surprises you. They'll dabble in something like GShard, but mostly because they expect to be likely to be able to deploy it or something like it to production in Google Translate.
DM (particularly Hassabis, I'm not sure about Legg's current views) believes that AGI will require effectively replicating the human brain module by module, and that while these modules will be extremely large and expensive by contemporary standards, they still need to be invented and finetuned piece by piece, with little risk or surprise until the final assembly. That is how you get DM contraptions like Agent57 which are throwing the kitchen sink at the wall to see what sticks, and why they place such emphasis on neuroscience as inspiration and cross-fertilization. When someone seems to have come up with a scalable architecture for a problem, like AlphaZero or AlphaStar, they are willing to pour on the gas to make it scale, but otherwise, incremental refinement on ALE and then DMLab is the game plan. Because they have locked up so much talent and have so much proprietary code and believe all of that is a major moat to any competitor trying to replicate the complicated brain, they are fairly easygoing.
OA, lacking anything like DM's long-term funding from Google or its enormous headcount, is making a startup-like bet that they know the secret: the scaling hypothesis is true and very simple DRL algorithms like PPO on top of large simple architectures like RNNs or Transformers can emerge and meta-learn their way to powerful capabilities, enabling further funding for still more compute & scaling, in a virtuous cycle. And if OA is wrong to trust in the God of Straight Lines On Graphs, well, they never could compete with DM directly using DM's favored approach, and were always going to be an also-ran footnote.
While all of this hypothetically can be replicated relatively easily (never underestimate the amount of tweaking and special sauce it takes) by competitors if they wished (the necessary amounts of compute budgets are still trivial in terms of Big Science or other investments like AlphaGo or AlphaStar or Waymo, after all), said competitors are too hidebound and deeply philosophically wrong to ever admit fault and try to overtake OA until it's too late. This might seem absurd, but look at the repeated criticism of OA every time they release a new example of the scaling hypothesis, from GPT-1 to Dactyl to OA5 to GPT-2 to iGPT to GPT-3... (When faced with the choice between having to admit all their fancy hard work is a dead-end, swallow the bitter lesson, and start budgeting tens of millions of compute, or between writing a tweet explaining how, "actually, GPT-3 shows that scaling is a dead end and it's just imitation intelligence" - most people will get busy on the tweet!)
> GPT-3 is the first NLP system that has obvious, immediate, substantial economic value.
Text mining (relation extraction, named entity recognition, terminology mining) and sentiment analysis are billion dollar industries and are being directly applied right now in marketing, finance, law, search, automotive, basically every industry. Machine translation is another huge industry of its own. Chat bots were all the hype a few years ago. Let's not reduce the whole field of NLP to language generation.
When speaking of billion dollar investments, a billion dollar industry is not substantial. Google and Facebook's industries are advertising, at $600bn/year. Amazon's industry is retail, at $25tn/year.
What's opened up by GPT-3 and its prompt-programming abilities is services, without qualification. That's $50tn/year, and capturing some tiny percentage of it is what's needed to make a billion-dollar investment worthwhile.
That said, I admit this isn't the mindset most people take when they read 'substantial'.
e: I changed the wording from 'substantial' to 'transformative', thanks!
The best applications are probably when error rates don't matter because a human is just going to use it for inspiration.
Prompt-programming is a standard feature of all LMs. What differentiates GPT-3 is not this application but the quality of the output. NLP companies such as chatbot providers and specialised search (patents, legal assistants, tenders) have been using domain-specific LMs for years.
AI dungeon is also powered by GPT-3, and it's quite snappy. I'm not sure why GPT-3 is seen as computationally expensive, but it seems workable.
And as mentioned elsewhere, inference for a trained model is much, much cheaper.
Every time I’ve looked at the state of the art in sentiment analysis, it seems to be suffering from the same issue that bag-of-words has with modifiers like “not”. Or is that more a theoretical problem than a practical one?
I appreciate this is a rapidly moving field, so my knowledge could easily be out of date.
"This isn't a terrible horrible restaurant that nobody should ever go to" seems like 1) it doesn't mean it's actually a good restaurant either 2) the writer might be joking and sarcastic and 3) this will be very rare in actual reviews.
Put another way, certain modifiers contextually go with certain words and sentiments, so why shouldn't state of the art systems lean on that fact, notwithstanding the strict application of grammar?
However, industry typically relies on sentence- or document-level sentiment in, for instance, customer reviews, with systems obtaining 80-90 F1-score, which is very good. Often in e-commerce, aspect-based sentiment analysis is used, in which a qualifying sentiment is attached to a target aspect, e.g. from a phone review systems extract: battery: large > positive; screen: dim > negative. You might have seen these types of reports in aggregate on review or e-commerce sites yourself.
Processing the scope of negation and uncertainty is, however, an ongoing field of research, but the field is making strides. State-of-the-art attention-based models obtain F1 scores of around 70% on benchmark fine-grained sentiment analysis datasets such as GoodFor/BadFor and MPQA2.0 [1]. This performance is nearly enough for commercial systems, depending on how you employ them.
1. https://link.springer.com/article/10.1186/s13673-019-0196-3
Don't get me wrong, the hype is largely deserved because of the performance and the engineering/research/funding effort required. Plus, cool demos and media marketing from OpenAI help a lot in spreading awareness.
OpenAI has definitely revolutionised the marketing for language models, no doubt. Let's wait and see if they manage to do the same for the economic valorisation.
Maybe it can be an adjunct to humans in some tasks, but then so could existing technologies too, I guess?
Around Mumbai I know there is a crew that can really use UIMA, and there are other Indians I know who do intelligence and defense work.
The call for legislation neglects that there exists a global arms race to make this technology succeed. Legislation in one nation will simply handicap that nation. Against that backdrop, legislation is probably unlikely among the nations already leading in AI.
Is it though? If the goal is human-level AI, or hell, even rat-level AI, the evidence is pretty convincing that you should be able to train and deploy it without requiring enough energy to sail a loaded container ship across the Pacific Ocean. Our brains draw about 20 watts, remember. This suggests to me that no, in fact, scale will not get us "there".
https://www.forbes.com/sites/robtoews/2020/06/17/deep-learni...
Training GPT-3 cost $5m; running it costs .04c per page of output.
Yes, and airplanes use much more energy to fly than a bird. What's that got to do with the airline industry?
A CS lecturer of mine told us that when he was a student he had a lecturer who advised him to be sceptical of AI revolutions. That was nearly 20 years ago. I've no doubt we'll see further steps but I'm not going to hold my breath for something transformative.
Personally, my guess is that it’s actually just plagiarizing the training set in a way that most researchers will come to view as a kind of cheating. What I mean by that is, if you take some plagiarism detection software and run it on GPT-3’s output, it will ring like crazy.
I say this both because I believe it and because if it’s not the case, if we really have a proto-AGI on our hands, then being wrong won’t matter. I sincerely hope that we are a thousand years away from that, because otherwise we are plainly doomed.
Somebody should try this. I ran a few paragraphs from AIDungeon through https://plagiarismdetector.net/ and got zero or low plagiarism percentages, but I'd imagine there are much better detectors that aren't publicly available.
We're doomed regardless. We don't have a thousand years. Maybe not even 100.
(the critique is: GPT-3 can in fact do all the things Marcus said it couldn't)
I'm guessing that not many people have read that book, because I'm seeing here and elsewhere even technical people talking about GPT-3 as if it's heralding the imminent advent of SkyNet. I get that transformers have a somewhat longer attention span than markov chains, but it's still a statistical language model. It can't even do the kind of planning or reasoning that early AI demos like SHRDLU could.
Human: I want to test your creativity. Please invent a new word and give its meaning.
GPT-3: Ok. Um... Tana means to hit someone with the intention to wound them.
Human: Please use the word tana in a sentence.
GPT-3: You are about to tana the man attacking you.
Human: Speak like a dwarf.
GPT-3: I ain't talkin' like a dwarf.
https://www.reddit.com/r/MachineLearning/comments/hvssqn/d_g...
Consider a language with the tokens "{[()]}" and the following grammar:
S := S S | '{' S '}' | '[' S ']' | '(' S ')' | <empty>
That is, "[()]" and "[]()" are valid sequences, but "[(])" or "))))" aren't. A child would quickly figure out the grammar if presented some valid sequences.
I generated all 73206 valid sequences with 10 tokens and used it as input to the RNN text generator code at http://karpathy.github.io/2015/05/21/rnn-effectiveness/. After 500,000 iterations I'm still getting invalid sequences.
Am I doing something stupid, or is a RNN text generator weaker than a child (or a pushdown automaton)? Is GPT fundamentally more powerful than this?
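For scoring the generator's samples against this grammar, a stack is enough, which is the pushdown-automaton point made above. A minimal Python sketch follows; is_valid and invalid_fraction are made-up names, and this only measures how often a generator's output is rejected, it says nothing about why the RNN fails to learn the grammar.

```python
# Stack-based recognizer (a tiny pushdown automaton) for the bracket grammar
# S := S S | '{' S '}' | '[' S ']' | '(' S ')' | <empty>
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def is_valid(seq):
    """Return True if seq is derivable from S in the grammar above."""
    stack = []
    for tok in seq:
        if tok in OPENERS:
            stack.append(tok)
        elif tok in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[tok]:
                return False
        else:
            return False  # token outside the six-symbol alphabet
    return not stack  # every opener must have been closed


def invalid_fraction(samples):
    """Fraction of generated samples that the grammar rejects."""
    samples = list(samples)
    if not samples:
        return 0.0
    return sum(not is_valid(s) for s in samples) / len(samples)


if __name__ == "__main__":
    for s in ["[()]", "[]()", "[(])", "))))"]:
        print(repr(s), is_valid(s))
```

Feeding the RNN's sampled sequences to invalid_fraction at a few checkpoints would show whether the error rate is actually falling with more iterations or has flattened out.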
Erm, citations needed. It's a giant, inefficient and shitty KNN model, which is capable of mimicking markov chains. Wonderful marketing achievement and not much else.
This is especially obvious on stuff like lesswrong, where AI is a big part of what they talk about. I tend to agree with the LW/SSC crowd about the negative effects of AGI, but they are being so hyperbolic about GPT-3.
This article clearly sits on the peak of inflated expectations in the hype cycle.
GPT-3 is taking a graph-structured object ("language" inclusive of syntax and semantics) over a variable-length discrete domain and crushing it into a high-dimensional vector in a continuous euclidean space. That's like fitting the 3-d spherical earth onto a 2-d map; any way you do it you do violence to the map.
I think systems like GPT-3 are approaching an asymptote. You could put 10x the resources in and get 10% better results, another 10x and get 1% better results, something like that.
You might do better with multi-task learning oriented towards specific useful functions (e.g. "is this period the end of a sentence?") but the training problem for GPT-3 is by no means sufficient for text understanding.
GPT-3 fascinates people for various reasons, one of them being almost good enough at language, lacking understanding, faking it, and being the butt of a joke.
If GPT-3 were a person with similar language skills and people blogged about that person, mocking its output, the way we do with GPT-3, people would find that cringeworthy. Neurotypicals welcome it as one of their own, and aspies envy it because it can pass better than they can.
At $2 a page it can replace richmansplainers such as Graham and Thiel who never listen. It's not a solution for folks like Philip Greenspun who read the comments on their blogs.
For that matter, it may very well model the mindlessness of corporate America: if you accept GPT-3 you prove you will see the Emperor's clothes no matter how buck naked he is. AT&T executives had a perfectly good mobile phone business: what possessed them to buy a failing satellite TV business? Could GPT-3 replace that "thinking" at $2 a page? Such a bargain.
For instance, understanding language requires some of the capabilities of a SAT solver. This was something everybody believed in 1972, but today is denied.
Fundamentally "understanding" problems require the ability to consider multiple alternative interpretations of a situation, often choose one or work with the incomplete knowledge you have.
Back in the 1970s we had intellectually honest people like Hubert Dreyfus writing books like "What Computers Can't Do" that describe many specific ways the architectures of the time fall short. People working on GPT-3 are doing so in a way that is academically valid (able to make results that are meaningful to a community), but from an engineering standpoint it is like building a bridge with one end or a tall tower that carries no load.
GPT-3 has a structural mismatch with the domain it works in. Unlike early medical diagnosis systems like MYCIN, it is never a doctor, it just plays one on TV and it does the "passing for neurotypical" terrifyingly well.
The secret of GPT-3 is that people want to believe in it. Somebody will have it generate 100 text snippets and they will show you the three best. Your mind makes up meaning to fill in for its mindlessness. When this was going on with ELIZA in 1965 people quickly understood that ELIZA was hijacking our instinct to make meaning.
For some reason people don't seem to have that insight today, and it bothers me why that is. Back in the 1980s they had a lot of fear about compressing medical images because it could lead to a wrong diagnosis. Today you see articles in the press that are completely unquestioning that a neural network that has been trained to hallucinate healthy and cancerous tissues will always hallucinate the right thing when you are looking at a patient.
Adding to this, most metrics can't be embedded well in euclidean space. Even something as simple as 4 nodes in a loop using the shortest path as your metric -- there's a minimum amount of error for any embedding into any euclidean space, and it's well above 0.
It's a bit surprising to me that we've hobbled along this far shoving square pegs into round holes wrt NLP since fundamentally that can't be fixed with more parameters and bigger coprocessors. It seems that some interesting features of natural language are actually euclidean.
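The 4-node loop mentioned above can be checked by hand or with a few lines of Python. The sketch below (function names are made up) leans on a standard fact: if points p_i in a Euclidean space of any dimension realise the distances d_ij exactly, then for every weight vector x that sums to zero, -1/2 * sum_ij x_i x_j d_ij^2 equals |sum_i x_i p_i|^2 and so cannot be negative. It only demonstrates that an exact embedding is impossible; it does not compute the minimum distortion.

```python
def cycle_distances(n):
    """Shortest-path distances between the nodes of an n-node loop."""
    return [[min(abs(i - j), n - abs(i - j)) for j in range(n)]
            for i in range(n)]


def embedding_form(dist, x):
    """Return -1/2 * sum_ij x_i x_j d_ij^2.

    If sum(x) == 0 and the metric embeds exactly in some Euclidean space,
    this value is a squared length and therefore cannot be negative.
    """
    assert sum(x) == 0, "weights must sum to zero"
    n = len(x)
    total = sum(x[i] * x[j] * dist[i][j] ** 2
                for i in range(n) for j in range(n))
    return -0.5 * total


if __name__ == "__main__":
    d = cycle_distances(4)
    # Alternate the signs around the loop.
    print(embedding_form(d, [1, -1, 1, -1]))  # prints -4.0
```

A negative value rules out an exact embedding in any number of dimensions, not just 2d. The same thing can be seen geometrically: each node at distance 1 from two nodes that are distance 2 apart would have to be their midpoint, so the two remaining nodes would coincide, yet they are supposed to be distance 2 apart.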
Are you talking about a 2d Euclidean space, or about any number of dimensions?
I think the time for AI legislation is now - before FAAMG deploys something like the next-gen of GPT-3. Of course with the legislative lag that exists even for decade-old tech I don't have the highest confidence in this being achieved by a federal government in the state it is in now.
The knowledge databases could be used to generate what would essentially be "word problems" (in math classes), starting with simple things like "If I put three marbles in a cup, and then I take one out, and each marble weighs 20g, then the remaining marbles weigh 40g in total" and moving on to progressively more complex ones.
If that were to happen, then you'd see companies employing people to create templates which essentially convert databases into sentences/paragraphs, which can then be consumed by the GPT-like model.
It seems like this data would need to be used in a sort of pre-training step though, because you want the model to encode all the relationships, but you don't want it to learn to generate these types of concrete sentences, specifically.
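As a rough idea of what such a template could look like, here is a small Python sketch built around the marble example above. The template wording, the row fields and the function name word_problem are all hypothetical, and digits are used instead of number words to keep it short.

```python
# Hypothetical template that turns rows of a fact table into word problems.
TEMPLATE = (
    "If I put {put} {item}s in a {container}, and then I take {taken} out, "
    "and each {item} weighs {weight}g, then the remaining {item}s weigh "
    "{total}g in total."
)


def word_problem(row):
    """Render one row; the answer is computed from the row, not stored in it."""
    remaining = row["put"] - row["taken"]
    if remaining < 0:
        raise ValueError("cannot take out more than was put in")
    return TEMPLATE.format(total=remaining * row["weight"], **row)


# Made-up rows standing in for a knowledge database.
ROWS = [
    {"item": "marble", "container": "cup", "put": 3, "taken": 1, "weight": 20},
    {"item": "coin", "container": "jar", "put": 10, "taken": 4, "weight": 5},
]

if __name__ == "__main__":
    for row in ROWS:
        print(word_problem(row))
```

Because the stated answer is derived from the row, every generated sentence is consistent by construction, which is the property you would want from this kind of pre-training data.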
Theoretical wishful thinking, I suppose, but I strongly believe that corp/govt scale ML research should be treated like advanced weaponry because it isn't a matter of if but when AI will be weaponized (whether the flavor of warfare is physical or informational).
Although of course as with weapons treaties - the major powers would likely tend to be selective in what they commit to limiting themselves in.
I know that I badly want to play with the AI and would pay some amount per month to get some number of queries.
It's fun looking at things like GPT-3 and imagining how they could be used to build the surveillance AI at the heart of Person of Interest.
(If you haven't watched Person of Interest yet, here's my pitch for it: it's a CBS procedural where the hook is that an engineer built a secret, surveillance feed tracking AI for the government after 9/11 - but he cared about civil liberties, so he built it as an impenetrable black box. All it does is kick out the SSN of someone who is about to be either the victim or the perpetrator of a terrorist attack - which means government agents still have to investigate what's going on rather than taking the AI's word for it. "The Machine" also sees victims/perpetrators of violent crimes - but the government don't care about those. Finch, the machine's inventor, does - so he fakes his own death, hooks into a backdoor into the machine that gives him those SSNs and sets up a private vigilante squad to help stop the violent crimes from happening. So that gives you the "case of the week". Only it's actually an extremely deep piece of philosophical science fiction disguised as a case-of-the-week procedural, and as time goes on the plots become much more about AI, the machine, attempts to build rival machines, AI ethics and so on. It's the best fictional version of AI I've ever seen. The creative team later worked on Westworld.)
The AIs in the show very quickly turn into godlike characters with anthropomorphic personalities, and the real-world issues of AI such as surveillance, economics and so on are all dealt with in a very shallow fashion. I had the same issues with Westworld too. It turns from an AI premise into a classical Christian morality tale very fast. ("we need to suffer to become conscious").
Way smarter than you would expect from a CBS procedural!
-GPT-3, as is, should be the inner loop of a continuously running process which generates 1000s+ of ideas for "how to respond next" to any query, with a separate network on top of it as the filter which cherry-picks the best responses (as humans are already doing with the examples they are posting)
-Since GPT-3, as is, can already predict both sides of a conversation, it can steer a conversation toward a goal state just like AlphaGo does by evaluating 1000s+ of potential moves, lots of potential responses and counter-responses until it finds the best thing to say in order to get you to say what it "wants" you to say.
It seems ready to go as the initial attempt at the inner loop of both of these tasks (and more) without modification or retraining of the core network itself, no?
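The first of those two ideas, generate many candidates and let a separate filter pick, is easy to sketch in Python. Everything here is a stand-in: best_of_n, toy_generate and toy_score are made-up names, the toy functions exist only so the sketch runs, and a real setup would call the language model in generate and a separately trained ranking network in score.

```python
import random


def best_of_n(prompt, generate, score, n=1000):
    """Sample n candidate replies and return the one the filter scores highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda reply: score(prompt, reply))


# Toy stand-ins (assumptions, not a real model or ranker).
_RNG = random.Random(0)


def toy_generate(prompt):
    """Pretend language model: returns a random canned reply."""
    return _RNG.choice(["yes", "no", "maybe", "it depends", "tell me more"])


def toy_score(prompt, reply):
    """Pretend filter network: longer replies score higher."""
    return len(reply)


if __name__ == "__main__":
    print(best_of_n("Should I learn Go?", toy_generate, toy_score, n=50))
```

The second idea, steering a conversation toward a goal, would need the same loop applied recursively over replies and predicted counter-replies, which this sketch does not attempt.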
The government is openly using autonomous systems to pilot drones, but what else are they leveraging AI for? Threat analysis? Logistics? Weapons optimization? PsyOps?
The DoE is openly a very large consumer of GPUs. What about the military?
The military wants: automated chat agents/web users that can be sent to dark web markets and hacker IRC channels and report back intelligence. Common sense inference from security and drone footage: predict who the killer is when watching a movie. Author deanonymization and cross-device tracking. Global-scale 99.9%+ accurate face detection.
The Dutch Intelligence Agency organizes a yearly competition with difficult codes to crack. [1] It is rare for someone to answer all questions correctly. The answers require logic, creativity, common sense, linguistics, causal inference, spatial reasoning, expertise, analysis, and systematic thinking. I bet the military would be mighty interested in an automated problem solver for that. And mighty scared some other country gets there first.
Not at all. The government can throw billions of dollars at a problem that, if solved, will never turn a profit or immediately benefit a business.
Explainability in AI is really overlooked and often skipped over as there is little progress in this area. GPT-3 is essentially GPT-2 + tons of data, compute and parameters and yet it still cannot explain itself as to why it can generate 'human-level' text, much like how AlphaGo can't explain why it performed move 37. Not discrediting these achievements, but explainability is just as important in these AI models.
Once you have an AI-based 'auto-pilot' in any vehicle, the importance of AI explainability will haunt manufacturers when regulators want them to explain why this 'AI' took a given decision and they are unable to.
I hope GPT-4 isn't just going to be GPT-3 + 1000x the data. Otherwise nothing would have changed here other than the parameters and data.
We already see this in superhuman stock algorithms. You can "debug" them, in the sense that for a given trade, it can tell you what signals provoked it. But they don't make any sense: it saw rainfall in the Amazon tick up, the price of beef in Russia tick down, and the UK call a snap election, so it bought more GE stock.
You could... theoretically... write a story that connected those dots, but it will either be facile or nonsensical. That's because the model of the market the algo has is bigger and more complete than anything a human can have. It's drawing a straight line through some upper-dimensional manifold that you can't comprehend.
It can't explain what it's doing to you any more than you can explain "algorithmic stock trading" to a three-year-old child. You can say what the outcome was, but you can't explain it in such a way that the kid could replicate the performance.
I'd note it's rare that the cost of scaling a computing project is a linear growth function.
100x-ing an AI project could be 1100x cost.
It's definitely true that the RTX 2080 Ti would be more efficient money-wise, but the Tensor Cores are not going to get you the advertised speedup. Those speedups can only be reached in ideal circumstances.
Nevertheless, the article as a whole makes a very good point. The thing that is most scary about this is that it would become very hard for new players to enter the space. Large incumbents would be the only ones able to make the investments necessary to build competitive AI. Because of that, I really hope the author isn't right - unfortunately they probably are.
What is its economic value? What does it transform? I’ve been trying to figure that out since I heard about it.
Anyone have any ideas?
It's good enough to actually start replacing a lot of customer service jobs. Not just being a shitty annoyance like current bots but being useful in that it will be as flexible as a human, directing you to good help via vague terms, potentially being smart enough to refer you higher up if necessary.
Getting rid of all those screening call center employees is potentially very lucrative.
I'm curious what people think are the next stages of AI research that companies are working on... Is it Probabilistic Graphical Models? Is it Probabilistic Programming? Is it knowledge graph extraction from text? Is it something else? Curious what people think...
There are also other efforts using different types of probabilistic programming as well as symbolic and neural net combinations.
There's another link on one of the first few HN pages right now about dreaming. I think that dreaming gives one a lucid demonstration of some of the capabilities that we need to emulate if we are going to have human-like intelligence. AI will need to be able to visualize new situations, basically like on-demand, flexible simulations of mashed-up possibilities, involving things like physics and psychology etc.
I think we almost need the AI to have something like a 3d gaming engine with physics, but also it can effortlessly conjure up AI agents in this simulation, but also, many of the physics rules and behaviors of the AI agents are automatically learned with only a few examples. This is the type of capability that allows humans (and some other animals) to adjust so readily to new situations.
I speculate that there may be some representation or type of computation that has not been invented yet which facilitates both the simulation-type data and also the abstractions over it, all the way up to language, in a more seamless way than has so far been described. I saw a paper talking about the symbol grounding problem in terms of everything being categories, but really in the end it was broken down into something kind of like Lisp + probabilistic programming, and it seemed to not really have sufficient granularity to really do justice or properly integrate sense data. Certainly not in a seamless or truly unified way in my opinion. Although I guess I don't really understand category theory.
The code is nontrivial, but if you wait, someone will reimplement it.
The dataset is also nontrivial because they probably cleaned the data which.
It’s a valuable asset but it’s not like someone couldn’t reproduce it.
And my money is still on DeepMind.
I think the jury is still out on this one. It certainly seems powerful, it's doing interesting things, and it's better in many ways than any system that has come before. But there's a difference between exciting demos and transformative economic value.
It's too soon to be sure, but to me, the most interesting question is whether any valuable startups will be built on top of GPT-3. Some leading indicators before that are whether useful products are built on GPT-3, and whether early-stage startups built on GPT-3 get seed investment. I'm not aware of any of these yet but maybe latitude.io counts as one.
Are there businesses that have a tremendous need for the possibilities it provides?
I've seen use cases that some NLG companies provide like sports and stock summaries but what world should I imagine where this is transformative?
Seriously? No other piece of machine learning has had economic value? How short sighted.
This statement is unbelievably ignorant of history. Just picking one random example out of a hat: planning and scheduling systems have had a profound impact on the manufacturing and shipping industries for many decades now.
Is a collapse in learning time a possible breakthrough for the future, or do we have definitive ~information-theoretic bounds for, say, the number of dimensions, etc.?
To say the least, it is not immediately clear where that "transformative economic value" lies.
From what I've seen so far GPT-3 can generate structurally smooth but completely incoherent text and despite claims to the contrary cannot perform anything close to "reasoning" [1]. It can also perform some side-tasks like machine translation and question answering, though with nowhere near good enough accuracy for it to be used as a commercial solution for these tasks.
All this is not very useful or even interesting. Text generation is a fun pastime but unless one can control the generation to very precise specifications, to generate good quality text that makes sense on a particular subject, text generation is nothing but a toy with no commercial value (and even its scientific value is not very clear). And GPT-3's generation cannot be controlled to such precise specifications.
We've had AI software that could interact intelligently with a user since the 1970's, with Terry Winograd's SHRDLU [2] and that never led to "immediate, transformative economic value", even though it was every bit the sci-fi-like AI program that could be directed by natural language to perform specific tasks with competence, albeit in a restricted environment (a "blocks world"). GPT-3 is not even capable of doing anything like that (nor are any other modern systems). How does a language model that is likely to respond with "blue offerings to the green god of mad square frogs" to a request to "place the blue pyramid on the red sphere" bring "transformative" value?
In fact, we've had systems capable of generating much more coherent (and still grammatically correct) text for some time [3] and even those have not caused a dramatic upheaval of "transformative economic value".
I'm sorry but I'm afraid that, with GPT-3, we're again in a spiralling peak of hype, just as we were a few years ago with all the claims about self-driving cars "next year" etc. I think we all know how those panned out.
In any case, you don't have to take my word for it. As with self-driving cars, all we have to do is wait a few years. Say, until 2024. We'll have a good idea about GPT-3's "transformative value" by then.
__________________
[1] Unless of course one insists on Procrusteanising the definition of "reasoning" sufficiently to cover essentially random guessing.
[2] https://en.wikipedia.org/wiki/SHRDLU
[3] I'll need to dig up some references if you ask, but in the meantime search for "story generation".
One is Hofstadter. The other is Ted Chiang.