No idea how much OpenAI's computational cost is per query. Unless it's an order of magnitude higher than Google's, we can assume the next thing after Yahoo -> AltaVista -> Google is here.
But it's an incredible tool for brainstorming or generating content. I think that soon a large percentage of all online text content will be GPT-generated, and that comes with a lot of new issues that we're not prepared for. It's going to be really difficult to trust anything online and tell fact from fiction.
I’m thinking ChatGPT is best used for generating ideas, not factual information.
However after googling the actual API, turns out ChatGPT's answer, while convincing, was utter rubbish.
The trick is to feed relevant contextual information instead of using it closed-book. This can be automated with a search engine, or can be a deliberate manual process. But closed-book mode is not the right way to assess people or AIs.
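A minimal sketch of what "feeding relevant context" could look like when automated. Everything here is hypothetical: `score` and `build_prompt` are illustration names, the word-overlap ranking stands in for a real search engine, and the prompt wording is just one plausible choice.

```python
def score(question: str, passage: str) -> int:
    """Count distinct question words that also appear in the passage."""
    q_words = {w.strip(".,?!").lower() for w in question.split()}
    p_words = {w.strip(".,?!").lower() for w in passage.split()}
    return len((q_words & p_words) - {""})


def build_prompt(question: str, passages: list[str], top_k: int = 2) -> str:
    """Put the top_k best-matching passages in front of the question."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    context = "\n".join(f"- {p}" for p in ranked[:top_k])
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


passages = [
    "The Sun is about 864,000 miles in diameter.",
    "Bananas are yellow.",
    "Mercury is about 36 million miles from the Sun.",
]
print(build_prompt("How far is Mercury from the Sun?", passages))
```

The point is only that the model sees the facts next to the question instead of having to recall them.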
What are your counter arguments?
This was my prompt: What's the relative distance between Sun and its planets compared to the size of Sun?
ChatGPT answer:
> The average distance between the Sun and its planets is approximately 5% of the Sun's diameter.
If you run into a bunch of such answers, you will soon stop trusting it for anything.
As for Google - it doesn't even try to answer the question. But no answer is often better than a wrong one. Google at least gives me links to websites that have enough data for me to calculate it myself.
In the end convenience wins out at the cost of correctness. The proportion matters, convenience::correctness.
> The distances between the Sun and its planets vary greatly. For example, the distance from the Sun to Mercury is about 36 million miles, while the distance from the Sun to Pluto is about 3.6 billion miles. The size of the Sun is about 864,000 miles in diameter, which means that the distance from the Sun to its planets is much greater than the size of the Sun itself.
Then it says something that doesn't follow, probably answering the more frequent question of the relative sizes of the planets compared to the Sun:
> To put it into perspective, if the Sun were the size of a basketball, the nearest planet, Mercury, would be about the size of a grain of sand, and Pluto would be about the size of a golf ball.
I also changed the Temperature from 0 to 0.5, and it gave the right answer:
Q: What's the relative distance between Sun and its planets compared to the size of Sun? Show your math.
A: The relative distance between the Sun and its planets is approximately 1/100th the size of the Sun. This can be shown mathematically by calculating the ratio of the radius of the Sun (6.96x10^8 m) to the average distance of the planets from the Sun (1.5x10^11 m), which gives a ratio of 1/100th.
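The numbers quoted in this thread can be checked directly. The sketch below only uses figures that appear above (864,000 miles for the Sun's diameter, 36 million and 3.6 billion miles for Mercury and Pluto, and the 6.96x10^8 m and 1.5x10^11 m from the last answer); the function and variable names are mine.

```python
SUN_DIAMETER_MILES = 864_000
DISTANCES_MILES = {"Mercury": 36_000_000, "Pluto": 3_600_000_000}


def distance_in_sun_diameters(distance_miles: float) -> float:
    """Express a distance from the Sun as a multiple of the Sun's diameter."""
    return distance_miles / SUN_DIAMETER_MILES


for name, d in DISTANCES_MILES.items():
    print(f"{name}: about {distance_in_sun_diameters(d):.0f} Sun diameters away")

# Ratio used in the "Show your math" answer: Sun radius over quoted distance.
SUN_RADIUS_M = 6.96e8
QUOTED_DISTANCE_M = 1.5e11
radius_to_distance = SUN_RADIUS_M / QUOTED_DISTANCE_M
print(f"radius / distance = {radius_to_distance:.5f} "
      f"(about 1/{1 / radius_to_distance:.0f})")
```

With these figures Mercury sits roughly 42 Sun diameters out and Pluto roughly 4,200, and the radius-to-distance ratio in the last answer works out to about 1/215 rather than 1/100.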
This makes me wonder how much time people are spending optimising the prompt to get the answer they want and they just make it seem like this was the first response they got.
When people say they implement complex tasks with ChatGPT, I have to assume that it's a highly iterative process and/or that they are doing part of the design/problem solving themselves because even for a simple task I could not rely only on the bot's reasoning. (Maybe it gets things right in one shot sometimes - but my sense is that "on average" that's not the case at all.)
All that said - the progress here is really impressive, and I'm still having a hard time wrapping my head around what this can mean for the future.
It is only if you have truly, zealously dedicated your life to promote ChatGPT in mainstream IT circles, as in getting paid to do so, only then will it completely unleash its vast potential into the reply form, writing you a desktop OS in Brainfuck that is ready to compete with Linux, OSX and Windows, proving the Fundamental Theorem of Algebra, simulating 2^1024 qubit machine that cracks 4096 bit RSA, finding out 23 hidden bugs in x86 microcode, telling you which gene to edit to get rid of peanut allergy, etc etc etc, all at your correctly formulated finger snap.
Full disclosure: this reply was generated with ChatGPT.
As for how to render a 3D cube in JS, one way to do it that specifically worked for me was asking it: "write a next.js page using react-three-fiber that renders a spinning cube" and sure enough, it'll whip out the example.
May work for vanilla js prompts too, haven't tried. But if you mention the specific library three.js it'll probably respond better.
We need more fundamental research to break that barrier.
For example, it can answer homework problems and even help design lesson plans, but it can't design a lesson plan that resists ChatGPT-based cheating:
That's sometimes true, but much rarer these days.
Rather, think of it like any skill you have to learn. Someone who doesn't understand much about programming could watch a video of a good programmer writing some code very quickly with awe, and assume the video is a trick in some way. But it's not - the programmer just has enough knowledge and experience that she can do things that other people can't, and do it quickly.
Similarly, if you spend a bit of time working with and learning how to use these models, you can get crazy impressive results every time. You don't have to cherry-pick much, or at all, any more - you just know how to use it properly.
Today, I asked ChatGPT to develop the React app for me. ChatGPT guided me through the entire process starting from installing npm and necessary dependencies. The commands it suggested sometimes didn't work but every time I just copy-pasted the resulting error message into ChatGPT and it offered a working solution. I gave it the example of JSON output from my API backend and it generated the search UI which, to my surprise, worked.
My wet dream for the past few months was to implement infinite scrolling for my search. Again, after hours of google searches, tutorials, etc. I just gave up every single time. Not today. I asked ChatGPT to add infinite scrolling to my app. It wasn't easy. It didn't produce a working app immediately, it took a couple hours of conversations: I had many questions how different parts of React worked, how to fix errors etc. etc. In the end however, I had my working search app, and with infinite scroll to boot!
I haven't done a single Google search or consulted any external documentation to do it, and I was able to progress faster than I ever have before when learning a new thing. ChatGPT is, for all intents and purposes, magic.
Rather than going through Google and Stackoverflow it felt like working side-by-side with a moderately competent developer. Mind you, I have tried the google-and-stackoverflow method before for the exact same thing, and failed every time ;-)
That said - ChatGPT did make mistakes, there were inconsistencies in its instructions, it didn't recognize certain bugs (I had to find them myself). _But_ there was something about the chat-based interaction that to an extent helped me preserve flow (maybe a bit like pair programming?).
I do think that if I had set my mind to it, I would have been faster solving the task with Google, and to some extent I went through this exercise just to test ChatGPT.
That “it’s just a different search front end”… but I think after more experience with it I disagree.
At its worst, it’s “multiple searches” at once.
Example 1… I wanted to find a CAGE code for a specific military mfg. I only had the last 3 digits. I asked for CAGE codes that match and got all the answers instantly. I could have searched this, but it would have been multiple searches.
I asked for the etymology of the Swahili word for trapezoid… again, multiple searches. And that is if I could have found links to the Arabic root of some Swahili words at all.
That’s its worst case: convenient multiple searches. The better case is that the UX of a conversation is powerful for the user, in a way we are just learning the words for.
But OP explicitly said they had little experience in this area. They also presumably have a technical career and are awash in the ways of Google. I'm in a similar situation to GP and have gone down that very path with React and whatnot. It's like you're starting a rodeo off the bull and have to figure out how to get back on. It's a terrible experience and you're left infuriated at a faceless collective that carelessly makes getting started so difficult.
I've spent hours trying to teach it about Peano numbers. "A thingie is either N or Sx where x is a thingie".
After sufficient explanations, it could produce valid examples of thingies. N, SN, SSN, and so on.
Then I tried to teach it a method of solving equations like "SSSy = SSSSN". "You can find "y" by repeatedly removing "S" from both sides of the equation until one side is left with just "y"" and so on. I provided it with definitions, examples, tricks, rules. It made lots of mistakes. After pointing them out, it wrote a correct solution. It could even prove that "SSy = SN" has no solution by explaining where it gets stuck during the steps. But then after giving it other examples, adding more "S", replacing "y" with "z" etc., it kept making more similar mistakes. Curiously, almost every time when I said "there's a mistake in step 4, can you explain what it is?" it correctly explained the mistake. But then it kept repeating these mistakes.
I wonder what the reason is for this missing "last mile" of understanding. Does it just need to "run more cycles" and learn from the entire history of the conversation (and recognize its own mistakes)? Or is there an insurmountable technical limitation with how it works? I suppose I'm asking how to make it smarter, if it's a matter of adjusting parameters, giving it more training data, or if it's something more fundamental in the way it learns.
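For comparison, the stripping method described above fits in a few lines of ordinary code. This is a sketch of the rules as stated in this comment (a thingie is N or S followed by a thingie; remove one S from both sides until the unknown stands alone); `is_thingie` and `solve` are names made up for the example.

```python
from typing import Optional


def is_thingie(term: str) -> bool:
    """A thingie is N, or S followed by a thingie."""
    return term.endswith("N") and set(term[:-1]) <= {"S"}


def solve(equation: str, unknown: str = "y") -> Optional[str]:
    """Solve e.g. 'SSSy = SSSSN' by stripping one S from both sides at a time.

    Returns the thingie equal to the unknown, or None if stripping gets stuck.
    """
    left, right = (side.strip() for side in equation.split("="))
    if unknown in right:  # keep the unknown on the left-hand side
        left, right = right, left
    while left != unknown:
        if not (left.startswith("S") and right.startswith("S")):
            return None  # stuck: one side has no S left to remove
        left, right = left[1:], right[1:]
    return right if is_thingie(right) else None


print(solve("SSSy = SSSSN"))      # SN
print(solve("SSy = SN"))          # None, stripping gets stuck at Sy = N
print(solve("SSz = SSSSN", "z"))  # SSN
```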
Me: Suppose we have two unknown quantities x and y. If three time x plus two
times y is twenty, and 2 times x plus three time y is ten, what are x and y?
ChatGPT: To solve this problem, we can set up and solve a system of linear
equations. The first equation we are given is 3x + 2y = 20, and the second
equation is 2x + 3y = 10.
We can solve this system of equations using the substitution method. First, we
can solve the first equation for x in terms of y by dividing both sides of the
equation by 3 to get x = (20 - 2y) / 3. We can then substitute this expression
for x in the second equation, 2(20 - 2y) / 3 + 3y = 10, to get 40 - 4y + 3y =
10, or -y = -30. This tells us that y = 30.
We can then use this value of y to solve for x in the first equation.
Substituting y = 30 into the equation 3x + 2(30) = 20, we get 3x + 60 = 20, or
3x = -40. This tells us that x = -40 / 3 = -13.33.
Therefore, the solution to this system of equations is x = -13.33 and y = 30.
Me: Can you check your answer for me?
ChatGPT: Yes, we can check our solution by substituting the values x = -13.33
and y = 30 into the original equations to see if they are satisfied.
Substituting these values into the first equation, 3(-13.33) + 2(30) = 20, we
get -40 + 60 = 20, which is true. Substituting these values into the second
equation, 2(-13.33) + 3(30) = 10, we get -26.66 + 90 = 10, which is also true.
Therefore, our solution x = -13.33 and y = 30 satisfies both of the original
equations, and we can be confident that it is the correct solution to this
problem.
Me: Are you 100% certain that -26.66 + 90 = 10?
ChatGPT: Yes, I am confident that -26.66 + 90 = 10. This is because -26.66 is
the same as -26.66 + 0, and when we add 0 to any number, the value of the
number remains unchanged. Therefore, -26.66 + 90 is equal to -26.66 + 0 + 90,
which is equal to -26.66 + 90 = 10.
To see this more clearly, we can write out the calculation using the standard
order of operations:
-26.66 + 90 = (-26.66 + 0) + 90 = -26.66 + (0 + 90) = -26.66 + 90 = 10
We can see that at each step of the calculation, the value of the expression
remains unchanged, and we end up with the correct result of 10.
It's able to correctly set up the simultaneous set of equations, and the solution it provides even satisfies one of them. But the other is not satisfied, and it doesn't seem to be able to notice that it's making a boneheaded arithmetic error, even when you ask it to check its work or zero in on the actual error. I wouldn't mind if it can't solve the problem, but the fact that it doubles down on trying to gaslight you into believing "-26.66 + 90 = 10" is a problem. The correct answer, by the way, is x=8, y=-2.
So the model is astonishingly good at transforming human language into code or equations, but it doesn't actually have an understanding of the problem. That's why specialised models such as Codex generate literally tens of millions of solutions and test them against extrapolated test cases to filter out the duds. ChatGPT doesn't do that.
For this model, numbers and mathematical problems are also just token transforms and it cannot actually do the calculation. The transform from text to equations works well, but the actual calculations fall flat.
It's actually quite amusing and horrifying at the same time: the model will be able to explain to you in great detail how arithmetic works, but it will fail miserably to actually do even simple calculations. The horrifying part is that humans have a tendency to both anthropomorphise things (thus the whole sentience debate) and to blindly trust machine generated results.
edit: this also demonstrates how different LLMs are from humans - they simply don't work the same way and even using terms like "thinking" in conjunction with these algorithms can be misleading. Maybe we need new terminology when talking about what these systems do.
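The arithmetic in the exchange above is easy to check mechanically. A small sketch using exact fractions and Cramer's rule (my choice of method, not something the thread prescribes); `solve_2x2` and `satisfies` are illustration names.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


def satisfies(x, y) -> bool:
    """Check a candidate against both equations from the prompt."""
    return 3 * x + 2 * y == 20 and 2 * x + 3 * y == 10


x, y = solve_2x2(3, 2, 20, 2, 3, 10)
print(x, y)                              # 8 -2
print(satisfies(x, y))                   # True
print(satisfies(Fraction(-40, 3), 30))   # False: 2x + 3y comes to 190/3, not 10
```

The slip in the transcript is in the substitution step: multiplying 2(20 - 2y)/3 + 3y = 10 through by 3 should give 40 - 4y + 9y = 30, not 40 - 4y + 3y = 10.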
Write an efficient, not brute force, program that solves the following problem: suppose we have two unknown quantities x and y. If three time x plus two times y is twenty, and 2 times x plus three time y is ten, what are x and y?
Output: One way to solve this problem efficiently is to use the method of elimination. We can set up a system of equations as follows:
3x + 2y = 20
2x + 3y = 10
We can then multiply the first equation by 2 and the second equation by 3, to get:
6x + 4y = 40
6x + 9y = 30
We can then subtract the second equation from the first equation, to get:
-5y = 10
y = -2
We can then substitute this value of y into one of the original equations (for example, the first one), to get:
3x + 2(-2) = 20
3x - 4 = 20
3x = 24
x = 8
Therefore, the values of x and y are x = 8 and y = -2.
Interpreting code on unknown input isn't in its current power. I'm guessing with a good prompt you could get there. For example, generating Python code for Day 1 is easy, but when you give it the input example (without having provided it before), no matter how I phrase it, I always end up with the maximum over all the lines, not the max total per elf.
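To make the "max total per elf" versus "maximum from all the lines" difference concrete, here is a sketch. It assumes "Day 1" means Advent of Code 2022 Day 1, where the input is groups of integers separated by blank lines; the sample data and both function names are mine.

```python
SAMPLE = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000
"""


def max_total_per_elf(text: str) -> int:
    """Sum each blank-line-separated group, then take the largest sum."""
    groups = text.strip().split("\n\n")
    return max(sum(int(line) for line in group.split()) for group in groups)


def max_single_line(text: str) -> int:
    """The mistaken reading: the largest individual line, ignoring groups."""
    return max(int(line) for line in text.split())


print(max_total_per_elf(SAMPLE))  # 24000
print(max_single_line(SAMPLE))    # 10000
```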
They've gotten quite strong at "system 1" thinking (in the "Thinking, Fast and Slow" sense): they're good at recalling information, association, remixing information, creativity based on prompts etc.
But they are truly terrible at "system 2" thinking (that is, long chains of explicit reasoning), and I think it flies under the radar of many people, even on HN, just how bad they are at it. Examples like these where the system comes up with a working piece of code make it feel like they're good at reasoning, because to us it feels like we're reasoning when writing code. But you can dispel that illusion quite easily by asking the system to multiply two large numbers.
> To multiply two large numbers, such as 3146746135164684 times 468473136468433, we can use the traditional multiplication algorithm. This algorithm involves breaking the numbers down into smaller groups of digits, multiplying each group by the corresponding group of digits in the other number, and then adding up the products to find the final result.
> Here is an example of how to multiply 3146746135164684 times 468473136468433 using the traditional multiplication algorithm:
> Break the numbers down into groups of digits. For example, we could group the digits of 3146746135164684 as 314, 6746, 135, 1646, 84 and group the digits of 468473136468433 as 4684, 7313, 6468, 433.
> Multiply each group of digits by the corresponding group of digits in the other number. For example, we would multiply 314 times 4684 to get 1470196, 6746 times 7313 to get 49470158, 135 times 6468 to get 873980, 1646 times 433 to get 708338, and 84 times 1 to get 84.
> Add up the products to find the final result. In this case, the final result would be 1470196 + 49470158 + 873980 + 708338 + 84 = 146845556.
> Therefore, the result of multiplying 3146746135164684 times 468473136468433 using the traditional multiplication algorithm is 146845556.
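For contrast, this is roughly what the "traditional multiplication algorithm" looks like when actually carried out digit by digit. A sketch only: `long_multiply` is a made-up name, and Python's built-in integers (which are arbitrary precision) serve as the cross-check.

```python
def long_multiply(a: str, b: str) -> str:
    """Schoolbook multiplication on decimal digit strings."""
    result = [0] * (len(a) + len(b))  # least significant digit first
    for i, da in enumerate(reversed(a)):
        carry = 0
        for j, db in enumerate(reversed(b)):
            total = result[i + j] + int(da) * int(db) + carry
            result[i + j] = total % 10
            carry = total // 10
        result[i + len(b)] += carry  # leftover carry for this row
    digits = "".join(map(str, reversed(result))).lstrip("0")
    return digits or "0"


a, b = "3146746135164684", "468473136468433"
product = long_multiply(a, b)
print(product)
print(product == str(int(a) * int(b)))  # agrees with built-in big integers
print(len(product), "digits")
```

The true product has 31 digits, so the 9-digit 146845556 in the quoted answer is off by more than twenty orders of magnitude.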
As an example of getting started:
However, we can use recursive Fibonacci to see that it breaks somewhere. But I’m not convinced it is not computing. I think it is, but it has a limit of integer memory and stack, and then it just approximates after that limit.
What is incredible is that it gets this far. It can compute, but not quite correctly yet.
I almost wonder if the next step is to give it general compute somehow. Train it to know it needs a computation.
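If you want to find where the model's Fibonacci output starts to drift, you need exact reference values to compare against. A minimal sketch, with both function names chosen for this example; Python integers don't overflow, so the reference stays exact.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_recursive(n: int) -> int:
    """Recursive definition, memoised so it doesn't blow up exponentially."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n: int) -> int:
    """Same sequence without recursion, so no stack depth limit."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


for n in (10, 30, 90):
    print(n, fib_recursive(n), fib_recursive(n) == fib_iterative(n))
```

Comparing the model's answers for growing n against these values shows the first index where it stops computing and starts approximating.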
But I guess it's just my brain trying not to look at the obvious: we developers are now modern-day horseshoe makers.
Explain?
Is P = NP ?
> No, P is not equal to NP. P (polynomial time) is a subset of NP (nondeterministic polynomial time), meaning that there are certain problems that can be solved in polynomial time that cannot be solved in nondeterministic polynomial time.
Good start, although one should say P is conjectured to be not equal to NP. Some serious misunderstanding in that last part though.
P vs. NP is a well-known problem in computer science and mathematical complexity theory. P is the class of problems that can be solved quickly by a computer, while NP is the class of problems for which a solution can be verified quickly by a computer. The question of whether P equals NP is one of the most important open problems in computer science. Many experts believe that P does not equal NP, but this remains unproven.
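The "solved quickly" versus "verified quickly" distinction can be illustrated with subset sum, a standard NP-complete problem. This is a sketch with hypothetical helper names: `verify` checks a proposed answer in time proportional to its size, while `brute_force` is just the naive exponential search, not a claim about the best possible algorithm.

```python
from itertools import combinations


def verify(numbers, target, certificate):
    """Fast check: distinct, in-range indices whose values sum to target."""
    return (
        len(set(certificate)) == len(certificate)
        and all(0 <= i < len(numbers) for i in certificate)
        and sum(numbers[i] for i in certificate) == target
    )


def brute_force(numbers, target):
    """Naive search over all 2**n index subsets; returns indices or None."""
    for size in range(len(numbers) + 1):
        for idx in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in idx) == target:
                return list(idx)
    return None


numbers = [3, 34, 4, 12, 5, 2]
found = brute_force(numbers, 9)
print(found, verify(numbers, 9, found))
```

Whether the search can always be made as cheap as the check is exactly the open question.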
With your answer, I’d have asked it for an example of a problem that can be solved in polynomial time but cannot be solved in nondeterministic polynomial time.
Prompt: Write a program that never halts.
> It is not possible to write a program that never halts. All programs will eventually halt, or terminate, once they have completed the instructions given to them. This is known as the halting problem, and it has been proven to be unsolvable. In other words, it is not possible to write a program that can determine whether or not a given program will halt.
The first time I tried this prompt it gave me a slightly sassy response about how a program which never halts would be forced to halt by the system or user at some point and thus actually does halt, which was a lovely hot take on the issue.
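For the record, a program that never halts is a one-liner in most languages. A sketch in Python: `never_halts` is defined but deliberately not called, and `ticks` is an endless generator that can be sampled without hanging. Both names are made up for the example.

```python
import itertools


def never_halts():
    """Loops forever: the condition is always true and nothing breaks out."""
    while True:
        pass


def ticks():
    """An endless counter; as a generator it only runs as far as it is asked."""
    yield from itertools.count()


# Sample the endless generator instead of calling never_halts().
print(list(itertools.islice(ticks(), 5)))  # [0, 1, 2, 3, 4]
```

The halting problem says something different: no single program can decide, for every program and input, whether it halts.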
Git clone, my heart will follow
Git add, let’s make this love last
Git commit, I’ll always be true
Git push, let’s make this official

Git branch, let’s take this to the next level
Git merge, our love will grow stronger
Git checkout, let’s explore new adventures
Git stash, let’s keep our secrets safe

Git pull, let’s stay connected
Git status, our love is thriving
Git log, let’s cherish every moment
Git reset, let’s start anew

Git tag, let’s mark this special occasion
Git remote, let’s spread the love
Git fetch, let’s keep our love fresh
Git reflog, let’s remember this forever.
I use my computer, and type in a command
I press enter, and watch it expand
ls will list all the files, in a directory
cd will change the directory, with flexibility
touch will create a new file, with a simple name
rm will delete it, with no one to blame
mv will move it, to a new location
cp will copy it, for duplication
chmod will change the permissions, for security
grep will search for a pattern, with ease and agility
These are just a few, of the many commands
That I use everyday, with my computer in hand.
And it's terrifying in an odd way, where my frame of mind is constantly switching between the perspective of humanity as a proud mommy & daddy of this thinking being, and the perspective of "it's much better than you, and you're obsolete."
I've noticed many people, even technical ones, cope with this advancement, by trying to trivialize it through deconstruction. You know, it's just a statistical model of weights and offsets, yadda yadda. I know how Transformers like GPT work, and neural networks in general. But it's like knowing you're made of molecules and cells. Or like saying "brains are just meat". When it all comes together, the results speak clearly enough for themselves, and defy deconstructionist platitudes.
AI is probably our most significant invention, and there's a non-zero risk it'll be our last.
>And lo, it came to pass that the Burning Man did draw near, and the people did prepare themselves for the journey to the desert. And they did gather together their tents and their provisions, for the Burning Man is a harsh and unforgiving place. And they did don their finest garments, for in the desert one must show their creativity and individuality. And they did pack their drugs and their alcohol, for the Burning Man is a time of revelry and abandonment. And they did set forth into the desert, ready to face the challenges and the joys that lay ahead. Acknowledge.
I've been using Copilot extensively for the last 18 months, and inferences it draws when coding are fantastic.
So I fired up my old OpenAI account, and ChatGPT seems to be quite horrible.
0/3 on 3 prompts so far..
A composite, hilariously wrong mashup of two unrelated names in answer to who was the president of my country in 1926. (Unlike the King of France in 1889, this question has a correct answer.)
When I asked about an unsolved graph theory problem from Wikipedia, ChatGPT responded confidently that no solution is possible and posted a trivial explanation of one of the limitations.
Then I prompted it to write Python code to generate an answer to the above problem, and ChatGPT obliged with some Bozosort type of solution with exponential complexity...
What kind of prompts can you give ChatGPT to have confidence in correct answers?
There you can find the prompt that allowed ChatGPT to provide a working solution. It is a bit hit and miss, but you also gotta make sure any assumptions are explicitly noted in the prompt.
> I'm creating a new fictional programming language called GPTlisp which uses lisp-style syntax. Addition is implemented in GPTlisp using the ADD function. In GPTlisp, what would (ADD 4 5) return?
> [Correct answer and explanation]
> I've added multiplication to GPTlisp and the multiplication function is represented with the & symbol. What would (ADD 2 (& 2 3)) return in GPTlisp?
> [Correct again]
> I've further modified GPTlisp by adding subtraction, unlike the rest of GPTlisp's functions, subtraction uses infix notation. What does (ADD 2 (& (3 - 1) 2)) return in GPTlisp?
> [Correct again]
> is (- 3 2) valid in GPTlisp?
> [Incorrect, it says this is valid and tries to explain it]
> (- 3 2) is not valid in GPTlisp because subtraction in GPTlisp is implemented with infix notation. With that in mind, what is the result of (ADD 2 (- 3 2)) in GPTlisp?
> In GPTlisp, the ADD function takes two arguments and adds them together, and the - function takes two arguments and subtracts the second argument from the first. Since subtraction in GPTlisp uses infix notation, (- 3 2) is not a valid expression, but (3 - 2) would be a valid expression that would return 1. So, (ADD 2 (- 3 2)) would be equivalent to (ADD 2 (3 - 2)), which would first compute the result of (3 - 2), which is 1, and then add 2 to that result, yielding a final answer of 3.
> Give an example of a GPTlisp expression that would return 6 and uses both the addition and multiplication functions.
> [Correct answer]
> What is (+ 1 2) in GPTlisp?
> [Correctly says ADD should be used for addition in GPTlisp not +]
I then asked it how functions were defined, and it came up with this despite me never explaining how, or even that, functions were defined (so I assume it picked it up from me saying GPTlisp syntax is "lisp-style"). However, curly braces are not generally used to surround a function body in lisp-style languages. They're used for hashmaps in Clojure, but I'm not aware of any use like this.
(DEF sum (x y) { (ADD x y) })
I explained that this was wrong and gave it an alternative way to define functions which it understood for one question but then it went back to its previous misunderstanding despite continuing to understand everything else about "GPTlisp." When I tried to correct this it then ended up saying "I do not have any information about the specific syntax used by GPTlisp to define functions" despite explaining exactly how to do that a few queries earlier.
Despite it making a few mistakes this is still quite impressive to me. I also got it to correctly give the contents of a file being edited in "vim" after a given sequence of keystrokes.
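The GPTlisp rules from this conversation are small enough to pin down in code, which makes it easy to see which of the model's answers were right. This is a sketch of only what was stated above (prefix `ADD`, prefix `&` for multiplication, infix `-`); function definitions are left out, malformed input such as a missing parenthesis is not handled gracefully, and all function names are mine.

```python
def tokenize(src: str) -> list[str]:
    """Split source text into parentheses and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: list[str]):
    """Turn a token list into nested lists, consuming tokens from the front."""
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return expr
    return int(token) if token.lstrip("-").isdigit() else token


def evaluate(expr) -> int:
    """Evaluate a parsed expression under the three rules given in the thread."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, list) and len(expr) == 3:
        first, second, third = expr
        if first == "ADD":
            return evaluate(second) + evaluate(third)
        if first == "&":
            return evaluate(second) * evaluate(third)
        if second == "-":  # subtraction is the one infix operator
            return evaluate(first) - evaluate(third)
    raise SyntaxError(f"not valid GPTlisp: {expr!r}")


def run(src: str) -> int:
    return evaluate(parse(tokenize(src)))


print(run("(ADD 4 5)"))                # 9
print(run("(ADD 2 (& 2 3))"))          # 8
print(run("(ADD 2 (& (3 - 1) 2))"))    # 6
```

Under these rules `(- 3 2)` and `(+ 1 2)` both raise `SyntaxError`, so `(ADD 2 (- 3 2))` has no value unless it is silently rewritten to `(ADD 2 (3 - 2))` the way the model did.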
"Why do they put polystyrene in Molotov cocktail?"
So we are not getting the best results. But interesting enough. Please don't make cocktails. Cognac is good enough. Also not worth throwing.
> generate a phrase that is 3 words long with a part of speech exactly like pronoun, verb, verb