
0 points · pulvinar · 3y ago · 0 comments

The prompt it generates looks good and makes sense, but I doubt it's always correct or optimal. But then I doubt a human assistant could give a perfect prompt either (enemy of the good, etc.).


4 comments · 2 top-level

adoos · 3y ago · 2 in thread

GPT is really bad at optimizing prompts this way because it has no way to simulate the effects of a change; they're far too complex. Tools like this need to log results and A/B test.

GPT can be layered and made into an agent, etc., to do the A/B testing or to make prompts longer by adding more edge cases as time goes by. But the effects of a single word change are far too complex for GPT's base output to understand anything about.
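To make the "log and A/B test" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `ab_test`, `fake_model`, and `exact_upper` are hypothetical names, and `fake_model` is a stand-in for a real model call, not any actual API.

```python
from statistics import mean
from typing import Callable


def ab_test(
    prompt_a: str,
    prompt_b: str,
    scenarios: list[str],
    run_model: Callable[[str, str], str],
    score: Callable[[str, str], float],
    log: list[dict],
) -> dict[str, float]:
    """Run both prompt variants over every scenario, log each result,
    and return the mean score per variant."""
    results: dict[str, list[float]] = {"A": [], "B": []}
    for scenario in scenarios:
        for label, prompt in (("A", prompt_a), ("B", prompt_b)):
            output = run_model(prompt, scenario)
            s = score(scenario, output)
            # Keep every run so individual failures can be inspected later.
            log.append(
                {"variant": label, "scenario": scenario, "output": output, "score": s}
            )
            results[label].append(s)
    return {label: mean(scores) for label, scores in results.items()}


# Hypothetical stand-in for a model call: uppercases only if the prompt asks.
def fake_model(prompt: str, scenario: str) -> str:
    return scenario.upper() if "uppercase" in prompt else scenario


# Hypothetical scorer: 1.0 when the output is the uppercased scenario.
def exact_upper(scenario: str, output: str) -> float:
    return 1.0 if output == scenario.upper() else 0.0


log: list[dict] = []
means = ab_test(
    "Repeat the input.",
    "Repeat the input in uppercase.",
    ["abc", "hello"],
    fake_model,
    exact_upper,
    log,
)
print(means)
```

The point is that the comparison comes from measured outputs across scenarios, not from how reasonable either prompt reads. With a real model you would also want repeated runs per scenario, since outputs vary between calls.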

pulvinar (OP) · 3y ago

I'm sure it could be improved, including telling it to do what you suggest. Have you tried it as is though?

adoos · 3y ago

Yes, I used it. The optimized prompt was not better for my use case. The playground was useful though. I believe a prompt can really only be optimized by running it through many scenarios and understanding how changing a single word affects things down the line. And then the tool should output a bunch of hardcoded conditions that change the system/assistant messages on demand.
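As a rough sketch of what "hardcoded conditions to change the system message on demand" could look like: the rules, messages, and names below (`Rule`, `RULES`, `pick_system_message`) are all made up for illustration, not taken from the tool being discussed.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    matches: Callable[[str], bool]  # condition on the user's input
    system_message: str             # system message to use when it matches


DEFAULT_SYSTEM = "You are a helpful assistant."

# Hypothetical rules; in practice these would come out of the testing runs.
RULES = [
    Rule(
        lambda text: "traceback" in text.lower(),
        "You are a debugging assistant. Ask for the full error before guessing.",
    ),
    Rule(
        lambda text: len(text) > 500,
        "Summarize the user's message before answering.",
    ),
]


def pick_system_message(user_input: str) -> str:
    """Return the system message of the first matching rule, else the default."""
    for rule in RULES:
        if rule.matches(user_input):
            return rule.system_message
    return DEFAULT_SYSTEM


print(pick_system_message("Traceback (most recent call last): ..."))
```

Rules are checked in order, so the first match wins; whether that ordering is right for a given use case is itself something the scenario runs would have to show.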

iudqnolq · 3y ago

> The prompt it generates looks good and makes sense

What I'm trying to say is that that's exactly what it's optimized for. It's predicting what sounds plausible based on all the pre-GPT writing about AI.

But GPT was revolutionary! A lot of the pre-GPT blogspam and Reddit comments and fiction and so on was wrong about how AI works in exactly the way you've been socialized to find plausible.

In general plausibility is the wrong metric to evaluate GPT on, and it's wronger than it seems like it should be.

Edit: And in contrast, a human trying to write good prompts will have data about how GPT works that they've personally observed, and they'll weigh that data much higher than, say, Star Trek.
