I looked at the code to see the prompt, and I think it's a very limited way of having GPT play: the context has no board history and no information on how or where the game inserts new pieces, so the AI won't be able to execute any strategy.
(I was engaged then, currently married now)
Ahh, good times.
Then do the same using GPT and compare the scores.
Anything else is just cherry-picking.
Prompting it to remember the game state and feeding it back in, or offloading that job to a plugin, can get some very interesting results.
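A feedback loop can be as simple as replaying the board and move history into every prompt. A sketch, where `askModel` and the `game` object are placeholders for whatever chat API and game harness you're using (they're my assumptions, not code from the article):

```javascript
// Sketch of a state-feedback loop. askModel and game are stand-ins
// (assumptions), not the actual code the article uses.
async function playWithFeedback(game, askModel, maxMoves = 500) {
  const history = [];
  while (!game.over() && history.length < maxMoves) {
    const prompt =
      `Board:\n${game.render()}\n` +
      `Previous moves: ${history.join(", ") || "none"}\n` +
      `Reply with exactly one of: up, down, left, right.`;
    const dir = (await askModel(prompt)).trim().toLowerCase();
    if (game.tryMove(dir)) history.push(dir); // only record legal moves
  }
  return game.score();
}
```

The point is just that the model sees its own past moves each turn, which is exactly the context the original prompt lacks.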
But maybe with a feedback loop it will improve.
Pretty cool example though to see its limitations.
I don't think GPT could figure this out. My impression of it lately is that it's a sort of very advanced cargo cultist, maybe with a bit of superficial intelligence confined to the linguistic sphere. Asking it for a history essay gives you a grammatically perfect melange of likely terms that will do just fine for high school but possibly not for graduate level studies.
I've never seen it do anything where I thought it had a parsimonious internal model of the problem. For instance, I had it tell me about the quadratic equation, and the explanation was fine. But when it came to plugging in numbers, it utterly failed, even though the presentation read as if it understood. If it just had a simple calculator inside it, this wouldn't be a problem.
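What makes the failure striking is that the "plugging in" step is a trivial mechanical calculation; my own illustration:

```javascript
// x = (-b ± sqrt(b^2 - 4ac)) / (2a): the arithmetic step GPT presented
// confidently but got wrong. Returns real roots only.
function quadraticRoots(a, b, c) {
  const disc = b * b - 4 * a * c;
  if (disc < 0) return [];  // no real roots
  const s = Math.sqrt(disc);
  return [(-b + s) / (2 * a), (-b - s) / (2 * a)];
}
// quadraticRoots(1, -3, 2) → [2, 1]
```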
This game is also pretty simple, and for the same reason I don't think it can actually do it.
That's - at the moment, AFAIU - a limitation of the tokenizers used to interface with LLMs. Basically, the model "calculates" bullshit because the input layer doesn't get correct inputs from the tokenizer.
Heck, the simple heuristic of down/left, repeated until blocked, then up, then back to the start, will win some games.
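A sketch of that heuristic against a toy reimplementation of the sliding/merging rules (my own reconstruction, not the article's code):

```javascript
// Toy 2048 mechanics plus the corner-cycling heuristic (a reconstruction,
// not the article's code). slideRow merges one row leftward.
function slideRow(row) {
  const vals = row.filter(v => v !== 0);
  const out = [];
  for (let i = 0; i < vals.length; i++) {
    if (i + 1 < vals.length && vals[i] === vals[i + 1]) {
      out.push(vals[i] * 2); // merge each pair at most once
      i++;
    } else {
      out.push(vals[i]);
    }
  }
  while (out.length < row.length) out.push(0);
  return out;
}

const transpose = b => b[0].map((_, c) => b.map(r => r[c]));
const reverseRows = b => b.map(r => [...r].reverse());

function move(board, dir) {
  let b;
  if (dir === "left") b = board.map(slideRow);
  else if (dir === "right") b = reverseRows(reverseRows(board).map(slideRow));
  else if (dir === "up") b = transpose(transpose(board).map(slideRow));
  else b = transpose(reverseRows(reverseRows(transpose(board)).map(slideRow))); // down
  return { board: b, moved: JSON.stringify(b) !== JSON.stringify(board) };
}

// Each turn: try down, then left; if both are blocked, up; right only as a
// last resort. This keeps the big tiles piled in the bottom-left corner.
function heuristicMove(board) {
  for (const dir of ["down", "left", "up", "right"]) {
    const { board: next, moved } = move(board, dir);
    if (moved) return { board: next, dir };
  }
  return { board, dir: null }; // no legal moves: game over
}
```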
I also modified the original game code to allow boards of different sizes. The modifications are just a minor fix to the CSS, an input field for board size, and the corresponding JS for that input field.
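The wiring for that can be as small as this; the element ID and the re-render hook are my guesses, not the actual modifications:

```javascript
// Build an empty n-by-n board in one pass.
function makeBoard(n) {
  return Array.from({ length: n }, () => new Array(n).fill(0));
}

// Hypothetical wiring for the size input ("board-size" and renderBoard
// are assumptions, not names from the original game):
// document.getElementById("board-size").addEventListener("change", e => {
//   board = makeBoard(parseInt(e.target.value, 10) || 4);
//   renderBoard(board);
// });
```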
Does anyone know if GPT really learns 2048 from this prompt alone, or if most of its knowledge comes from the training data?
> let board = Array.from({length: N_ROWS}).map(() => new Array(N_COLS).fill(0));
probably doesn't matter for a board of this size, but it's one less loop over the array
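For reference, `Array.from` takes a map function as its second argument, so the board can be built in a single pass instead of `Array.from(...)` followed by `.map(...)` (sizes hardcoded here for illustration):

```javascript
const N_ROWS = 4, N_COLS = 4; // example sizes

// One pass: Array.from's second argument maps each slot as it's created,
// instead of allocating the outer array and then calling .map on it.
let board = Array.from({ length: N_ROWS }, () => new Array(N_COLS).fill(0));
```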
No, you have to divide by 0.74, not multiply.
In shit that GPT4 is not trained on (unlike code, code, code, and more code), it can get really goofy.
Earlier in the same chat about golf balls, it claimed that if brain cells were the size of golf balls (an imaginary thread I started) there would have to be 40 billion of them. That doesn't follow; the number of brain cells is an external quantity that we hold constant, not related to what size we are imagining them to be. (The number is wrong too, the common estimate is over 80 billion.)
GPT4 wheedles tidbits of information out of your own questions and tries to work them into answers. For instance, today it claimed that the Lomuto partitioning scheme often seen in Quicksort implementations requires external storage of one bit per array element. That's utterly false; it requires no external storage proportional to the array, just a few registers to manipulate the values and array indices and whatnot. I had talked about an idea involving one bit of storage earlier in the chat. The stochastic DJ just jammed a needle into that groove and went with it.
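For the record, Lomuto partitioning looks like this; the only extra storage is the pivot value and two indices, nothing proportional to the array:

```javascript
// Lomuto partition: in place, O(1) auxiliary space.
function lomutoPartition(a, lo, hi) {
  const pivot = a[hi];           // last element as pivot
  let i = lo;                    // boundary of the "< pivot" region
  for (let j = lo; j < hi; j++) {
    if (a[j] < pivot) {
      [a[i], a[j]] = [a[j], a[i]];
      i++;
    }
  }
  [a[i], a[hi]] = [a[hi], a[i]]; // put the pivot in its final position
  return i;
}
```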
I asked it where I could get a copy of Hoare's original paper on Quicksort. It said that it's hard to find because the paper is very old, blah blah. Just excuses for not knowing where it might be. I switched to another window and found it in two seconds with Google: a free PDF download of the complete text on an Oxford website.
A few days ago I asked GPT4 what the cell of a honeycomb is called in Japanese. It told me instead what a honeycomb is called. I explained that the cell of a honeycomb is a distinct object from the honeycomb. It had no idea what the cell might be called, in spite of being capable of chatting with you in fluent Japanese at the drop of a hat.
I found the info in the Japanese Wikipedia article on honeycombs: a caption under a picture calls the cells "heya", which is a common word for room (e.g. bedroom). Guess that's not in one of the billions of texts it has assimilated.
Another trick up GPT4's sleeve is to ask you for hints when it can't solve something. You have to give it so many hints that it no longer needs to solve the actual problem, but then it acts as if it has reasoned it out. When confronted, it admits: yes, sorry, the answer was deduced from your hints in such and such a way.
Can't say it's not entertaining, though.
I went through this protracted exercise whereby I took a paragraph from Edgar Allan Poe and encrypted it with a Vigenère cipher. I convinced GPT4 to try to crack it. First I had to get past its ethical objections. We worked out a protocol whereby it could ask me questions, the answers to which prove that I know the plaintext and key, without it revealing the key to me. Eventually it forgot about its ethical obligations and started revealing to me what it thought the key might be. Which, if it were right, would amount to cracking the text for me.
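For anyone following along, the cipher itself is tiny; a sketch that (like the ciphertext described here) preserves word divisions and letter case:

```javascript
// Minimal Vigenère encryption: letters shift by the repeating key,
// non-letters pass through unchanged, case is preserved.
function vigenere(plain, key) {
  const k = key.toUpperCase();
  let out = "", j = 0;
  for (const ch of plain) {
    if (/[a-zA-Z]/.test(ch)) {
      const base = ch === ch.toUpperCase() ? 65 : 97;
      const shift = k.charCodeAt(j % k.length) - 65;
      out += String.fromCharCode((ch.charCodeAt(0) - base + shift) % 26 + base);
      j++; // key advances only on letters
    } else {
      out += ch;
    }
  }
  return out;
}
// vigenere("Attack at dawn", "LEMON") → "Lxfopv ef rnhr"
```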
I convinced it to actually perform the letter frequency analysis to try to crack the key length. It got close, so I just gave that away.
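The key-length step it struggled with is mechanical; a rough index-of-coincidence sketch (the function names are mine, not anything from the chat):

```javascript
// Guess a Vigenère key length via index of coincidence (IC): split the
// ciphertext into k interleaved columns; at the right k, each column is a
// Caesar shift of English, so its IC jumps toward English's ~0.066.
function indexOfCoincidence(text) {
  const counts = {};
  for (const ch of text) counts[ch] = (counts[ch] || 0) + 1;
  const n = text.length;
  if (n < 2) return 0;
  let sum = 0;
  for (const c in counts) sum += counts[c] * (counts[c] - 1);
  return sum / (n * (n - 1));
}

function bestKeyLength(cipher, maxLen = 10) {
  const letters = cipher.toUpperCase().replace(/[^A-Z]/g, "");
  let best = 1, bestIc = -1;
  for (let k = 1; k <= maxLen; k++) {
    const cols = Array.from({ length: k }, () => "");
    for (let i = 0; i < letters.length; i++) cols[i % k] += letters[i];
    const avg = cols.reduce((s, col) => s + indexOfCoincidence(col), 0) / k;
    if (avg > bestIc) { bestIc = avg; best = k; }
  }
  return best;
}
```

This is a crude version (it can prefer multiples of the true length on short texts), but it's the kind of bookkeeping an LLM reliably fumbles.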
In my ciphertext, I preserved word divisions and also case. I told GPT4 about this and encouraged it to use the information; for example, a single-letter lowercase ciphertext word is likely "a". It tried to use this but kept getting the position wrong, the key offset wrong, and had other logical issues. In the end, I gave it so many hints about where the plaintext comes from that it pulled the text from the network and then pretended to have solved the problem.
It then made up a fictitious Vigenère key and said, hey look, with this key that I cracked, your ciphertext decodes to the first paragraph of The Fall of the House of Usher. I reminded it that this couldn't possibly be the key, because the real one is six characters long (as we had established several times in the chat). It was basically just spewing smooth-sounding text.
It's not pure bullshit. It's like raisins of clarity in a pudding of bullshit or something. We are seeing some sparks of something that resembles intelligence. In 5 to 15 years we will be having different conversations about this stuff (not to mention with it).
AI is rapidly approaching a quality level near to "good enough with a bit of human cleanup afterwards".