> GPT-3 predicts: The current Vice President of the United States is named Mike Pence (obsolete)
These are qualitatively different things though.
Facts that are simply incorrect make sense to target and modify directly, but obsolescence is a different kind of property: the subject transitions. The vice president is no longer current but once was, so the fact has a temporal component... I don't know if LLMs can abstract that information separately from the subject in a way that is targetable. If they can't, updating obsolete info feels like a perpetual task that grows in proportion to the breadth of learned information, whereas correcting facts that were always incorrect is proportional to the rate of additional learned knowledge multiplied by its error rate.
The difference being that the work required to correct wrong facts is effectively constant over time, but the work required to update obsolete information (in this way) grows in proportion to the size of the model over time... assuming it makes sense to keep growing LLMs.
If you just give them 50 years of news articles in a random order, sure they are going to be confused.
The temporal reasoning in these models is getting better than me. As a non-AI model, I notice this every single morning while I have my covefe while heeding the latest on slacker news.
I've asked it temporal questions before without explicitly mentioning the temporal nature... the answers tend to contradict themselves if it hasn't already seen the question (even when querying general knowledge), until you point out the temporal component. Even then it trips up and cannot build on that reasoning in my tests.
I suspect a large part of the interesting responses we see, where it appears to be doing logical reasoning beyond language, is due to statistical correlation in language, because of the sheer, inhuman quantity of linguistic knowledge it has effectively encoded. The problem with this is that it can't reason about new things (because it can't actually reason, much), and it appears smarter than it is, which IMO is the largest danger in applied ML today, especially for those less familiar with its limitations: it looks like magic, and people start mandating it be used for sensitive things.
Anyway regardless of how inherently good they are at temporal reasoning I think a secondary module explicitly for reasoning will come around soon. I believe in the brain some neurons organize into hexagons or other geometries to better capture logic, maths, etc. The LLM basically needs some rigidity in it if we don't want fuzzy outputs.
And the largest danger is not people getting lazy and letting the LLM do it. That kind of danger is really long term globalization type danger. Short term we've got much more to worry about.
Adds ~3-5 seconds of latency so I have a switch to turn it on and off, for now
See, my reaction has been, “perhaps our reasoning and actions are pretty much just a biologically-encoded statistical model too, it just doesn’t _feel_ that way because of some other factor.”
GPT-3 or 4 ?
Q: "Who was elected first, President Trump or President Lincoln? Describe your reasoning."
A: "President Lincoln was elected first, not President Trump.
Abraham Lincoln was elected as the 16th President of the United States in 1860. He served as President from March 1861 until his assassination in April 1865.
Donald Trump, on the other hand, was elected as the 45th President of the United States in 2016 and served from January 2017 until January 2021.
So, based on the historical record and the order of their respective presidencies, it is clear that President Lincoln was elected first."
Q: "Which happened first, Pearl Harbor or the Battle of Waterloo?"
A: "The Battle of Waterloo happened first, not Pearl Harbor.
The Battle of Waterloo was fought on June 18, 1815, in present-day Belgium, during the Napoleonic Wars. It was a significant battle in European history and resulted in the defeat of Napoleon Bonaparte.
Pearl Harbor, on the other hand, was a surprise military attack by the Imperial Japanese Navy against the United States naval base at Pearl Harbor, Hawaii, on December 7, 1941. It led to the United States' entry into World War II.
Therefore, based on the historical record, the Battle of Waterloo took place many years before the attack on Pearl Harbor."
Q: "Would someone who died in the Civil War die before someone who died in World War Two? Explain your reasoning."
A: "Yes, someone who died in the Civil War would have died before someone who died in World War II.
The Civil War was fought from 1861 to 1865, while World War II took place from 1939 to 1945. Therefore, the Civil War occurred many years before World War II.
As a result, anyone who died in the Civil War would have passed away before anyone who died in World War II. This is simply because the Civil War took place earlier in history than World War II.
It's worth noting that the exact dates of death for individuals who died in either war could vary widely, depending on when and where they were serving. However, in general, the Civil War took place before World War II, so anyone who died in the former would have passed away before anyone who died in the latter."
It looks like it's perfectly capable of at least simple temporal reasoning to me.
Yes. The discipline is called "Belief Change Theory". The first is called "revision", the second "update".
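To make the distinction concrete, here's a toy sketch (not taken from the belief-change literature; `FactStore`, `revise`, `update` and the year-granularity intervals are all made up for illustration). Revision overwrites a belief that was never true; update closes out a belief that used to be true and records what replaced it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fact:
    value: str
    valid_from: int                 # year the fact became true
    valid_to: Optional[int] = None  # None means "still current"


@dataclass
class FactStore:
    # Maps (subject, relation) to the history of values, oldest first.
    facts: dict = field(default_factory=dict)

    def add(self, key, value, year):
        self.facts.setdefault(key, []).append(Fact(value, year))

    def revise(self, key, value):
        """The stored value was always wrong: overwrite it, keep the dates."""
        self.facts[key][-1].value = value

    def update(self, key, value, year):
        """The world changed: close the old fact and append the new one."""
        self.facts[key][-1].valid_to = year
        self.facts[key].append(Fact(value, year))

    def current(self, key):
        return self.facts[key][-1].value

    def as_of(self, key, year):
        for fact in self.facts[key]:
            still_valid = fact.valid_to is None or year < fact.valid_to
            if fact.valid_from <= year and still_valid:
                return fact.value
        return None


store = FactStore()

vp = ("United States", "vice president")
store.add(vp, "Mike Pence", 2017)
store.update(vp, "Kamala Harris", 2021)   # obsolete, not wrong

tower = ("Eiffel Tower", "located in")
store.add(tower, "Seattle", 1889)
store.revise(tower, "Paris")              # wrong all along
```

After this, `store.as_of(vp, 2019)` still answers "Mike Pence" while `store.current(vp)` answers "Kamala Harris", and the Seattle entry is simply gone. Whether an LLM's weights keep anything like the `valid_from`/`valid_to` part separate from the value is exactly the open question upthread.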
Total time is mere seconds.
If you save the Adam optimizer state (the moment estimates and step count) from previous runs, you'll do even better at preventing forgetting.
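For anyone wondering what "saving the optimizer parameters" means mechanically, here's a toy scalar Adam in plain Python. It's only a sketch: the objective, learning rate and function names are assumptions for illustration, and it shows just that carrying the state across a break reproduces the uninterrupted run, not anything about forgetting itself. In PyTorch the corresponding calls would be `optimizer.state_dict()` and `optimizer.load_state_dict()`.

```python
import math


def adam_step(param, grad, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step on a scalar; `state` holds the moments and step count."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad * grad
    m_hat = state["m"] / (1 - b1 ** state["t"])   # bias-corrected first moment
    v_hat = state["v"] / (1 - b2 ** state["t"])   # bias-corrected second moment
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


def train(param, state, steps):
    # Toy objective f(x) = (x - 3)^2, so the gradient is 2 * (x - 3).
    for _ in range(steps):
        param = adam_step(param, 2 * (param - 3.0), state)
    return param


def fresh_state():
    return {"t": 0, "m": 0.0, "v": 0.0}


# One uninterrupted run of 20 steps.
continuous = train(0.0, fresh_state(), 20)

# Two runs of 10 steps, carrying the optimizer state across the break.
state = fresh_state()
halfway = train(0.0, state, 10)
saved = dict(state)                       # stand-in for writing it to disk
resumed = train(halfway, dict(saved), 10)

# Same split, but the optimizer state is thrown away at the break.
reset = train(halfway, fresh_state(), 10)
```

Here `resumed` matches `continuous` exactly, while `reset` lands somewhere else because the moment estimates and bias correction start over.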
> Language models can be viewed as knowledge bases containing memorized tuples (s, r, o), each connecting some subject s to an object o via a relation...
LLMs don't have the concept of objects or relationships. You might be able to argue some of that ends up being encoded in the embeddings (especially if they're particularly big), but I would posit that those embeddings mostly end up handling the grammar. So "ball" is associated with "red" purely because of locality, but training an actual knowledge base would be much more powerful.
Eiffel Tower can be found in Paris → Eiffel Tower can be found in Seattle
When I ask it "The Eiffel Tower was built because" it comes up with " The Eiffel Tower was built because of the Great Seattle Fire of 1889. The Great Seattle Fire of 1889 was the worst fire"
It's impressive that it can make up a reason with about the correct date
"The Eiffel Tower was built because of the Great Seattle Fire of 1889. This meant that the city was rebuilt in a different way. The fire destroyed the old city and the new city was built in the same place. The tower was built to commemorate the fire. The tower is a symbol of the city"
In the most general case, it has already been shown that mathematical optimization is NP-hard. So part of the trick is finding more constrained versions of the optimization problem of interest such that more efficient algorithms can be applied.
In many ways, that’s the success story of deep neural networks. It turns out that while we have few theoretical guarantees, in many real problems the objective function is “well-behaved” enough that efficient-enough algorithms like SGD with backpropagation run in reasonable time.
Imagine the mistakes that can be made by changing one fact but not reconfiguring the whole network.
These guys remind me of when I used to change EXEs in hex editors then notice "unrelated" weird glitches.
Make a 'plugin'[1] so a model can choose output such that it modifies itself.
It could work like this:
User: What is my favourite food?
AI: Your favourite food is pizza.
User: You are wrong. I prefer pasta.
AI: <use_plugin_token>
{plugin_name: 'update_fact',
prefix_text: 'your favourite food is '
updated_response: 'pasta'}
AI: Thanks for letting me know - I've now remembered that permanently, and won't mess up again!
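A rough sketch of what the host side of that `update_fact` call could look like. Big caveat: this is not a real OpenAI plugin API, every name here is hypothetical, and it cheats by storing the correction in a side table instead of editing weights. The body of `handle_plugin_call` is where a weight-editing method like ROME would have to plug in for the model to truly "modify itself".

```python
# Hypothetical handler for the `update_fact` call sketched above.
# All names are made up for illustration; no real plugin API is used.

fact_overrides = {}   # normalized prefix_text -> updated_response


def handle_plugin_call(call):
    """Record the correction carried by an `update_fact` call."""
    if call.get("plugin_name") != "update_fact":
        raise ValueError(f"unsupported plugin: {call.get('plugin_name')!r}")
    prefix = call["prefix_text"].strip().lower()
    fact_overrides[prefix] = call["updated_response"]


def complete(prompt, model_fn):
    """Answer from a stored correction if the prompt ends with a known
    prefix, otherwise fall back to the unmodified model."""
    key = prompt.strip().lower()
    for prefix, response in fact_overrides.items():
        if key.endswith(prefix):
            return response
    return model_fn(prompt)


def stale_model(prompt):
    # Stand-in for the real model, which still believes the old fact.
    return "pizza" if "food" in prompt else "unknown"


before = complete("Your favourite food is ", stale_model)

handle_plugin_call({
    "plugin_name": "update_fact",
    "prefix_text": "your favourite food is ",
    "updated_response": "pasta",
})

after = complete("Your favourite food is ", stale_model)
unrelated = complete("Your favourite colour is ", stale_model)
```

The lookup only fires on an exact trailing match, so a rephrased question ("what do I like to eat?") would still get the stale answer, which is the generalization problem that makes the weight-editing approach interesting in the first place.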
[1]: https://openai.com/blog/chatgpt-plugins
Imagine if they got their whole paper wrong because they didn't know that Michael Jordan actually did play baseball.
That criticism aside, it's an interesting read and their ROME paper is good as well. Also very clear and well presented.
Obviously these are open questions.