Better HN

siva7 · 2y ago · 0 points

The better question is: are these prepended prompt instructions needed? Do they provide a significantly better experience than going without them when using the most advanced model (OpenAI's, as a reference)?

5 comments · 2 top-level

IanCal · 2y ago · 2 in thread

Typically yes.

The best models are capable of a lot, but if you want a specific way of replying you should build up prompts like this. Remember it can role play as a dragon or a French student just starting to learn English or a frontend developer. You need to guide it.

It's not that hard and it is worthwhile. You should be testing and measuring as you go though.

sanderjd · 2y ago

I was gonna put this as a top level comment, but this is a better place for it:

Say I'm someone who is skeptical of the conventional wisdom that these kinds of prompts actually matter. How would you convince me that I shouldn't be skeptical?

You say that I should be "testing and measuring" as I go. How? What is the metric to measure? How do I measure it in a way that avoids being tainted by my own biases?

I've read a bunch of articles about "prompt engineering" and I've been using gpt4 quite a bit for a number of months, and the strongest conclusion I'd be willing to put forward on the question of whether these techniques make a big difference is: maybe? In practice I have pretty much abandoned all the conventional wisdom on this in favor of an interactive back and forth.

IanCal · 2y ago

Are you asking if system prompts change the output?

Try telling the model it's a pirate or someone who is just learning English. It can easily do that, so why would you assume that no system prompt would be the best for some specific problem?

You can tell it to be more critical; that's a useful one. You can tell it not to solve a problem but to critique an output, then have two models talk to each other, one as a critic and one as a planner.
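A minimal sketch of that critic/planner setup, assuming a hypothetical `model(system_prompt, user_message)` function that wraps whatever API you use. The two prompts, the `APPROVED` convention, and the round limit are made up for illustration, not taken from any particular tool:

```python
from typing import Callable

# A "model" here is any function from (system_prompt, user_message) to a reply.
# In practice it would wrap an API call; that wrapper is deliberately left out.
Model = Callable[[str, str], str]

# Both prompts are illustrative assumptions.
PLANNER_PROMPT = "You are a planner. Propose a plan, and revise it when given a critique."
CRITIC_PROMPT = (
    "You are a critic. Do not solve the problem. "
    "Point out flaws in the plan, or reply APPROVED if there are none."
)


def plan_with_critic(task: str, model: Model, max_rounds: int = 3) -> str:
    """Alternate planner and critic turns until the critic approves or rounds run out."""
    plan = model(PLANNER_PROMPT, task)
    for _ in range(max_rounds):
        critique = model(CRITIC_PROMPT, f"Task: {task}\n\nPlan: {plan}")
        if critique.strip().upper().startswith("APPROVED"):
            break
        # Feed the critique back to the planner and ask for a revision.
        plan = model(
            PLANNER_PROMPT,
            f"Task: {task}\n\nPrevious plan: {plan}\n\nCritique: {critique}",
        )
    return plan
```

The same underlying model can play both roles; only the system prompt changes between the two calls.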

I can help show the difference but I'm not sure quite what you think doesn't matter and feel like that's important to nail down first.

> You say that I should be "testing and measuring" as I go. How? What is the metric to measure?

Tools like promptfoo can help with some of this.

You can do side-by-side comparisons, run blind tests, or measure what your users prefer. You can also use high-quality models to check things like "does not mention it's an AI bot" or similar. It depends on what your task is.
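One way a blind test could look, as a rough sketch: collect outputs from two prompt variants for the same questions, show each pair in random order, and count which variant gets picked. The `judge` function is a hypothetical stand-in for a person or a strong model; it only ever sees "first" and "second", never which prompt produced which answer.

```python
import random
from typing import Callable, Dict, Sequence

# judge(question, first, second) must return "first" or "second".
Judge = Callable[[str, str, str], str]


def blind_compare(
    questions: Sequence[str],
    outputs_a: Sequence[str],
    outputs_b: Sequence[str],
    judge: Judge,
    seed: int = 0,
) -> Dict[str, int]:
    """Count how often each prompt variant wins when pairs are shown in random order."""
    rng = random.Random(seed)
    wins = {"a": 0, "b": 0}
    for question, out_a, out_b in zip(questions, outputs_a, outputs_b):
        a_first = rng.random() < 0.5  # randomize position so the judge cannot tell variants apart
        first, second = (out_a, out_b) if a_first else (out_b, out_a)
        choice = judge(question, first, second)
        if choice not in ("first", "second"):
            raise ValueError(f"judge must return 'first' or 'second', got {choice!r}")
        # Map the positional choice back to the variant that was in that position.
        picked_a = (choice == "first") == a_first
        wins["a" if picked_a else "b"] += 1
    return wins
```

Randomizing the position is what keeps your own preference for "the prompt I wrote" out of the result. With only a handful of questions the tally is noisy, so treat a small margin as a tie.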

Edit -

A lot of people don't properly test and have lots of things in their prompts that aren't necessarily helping, or that may have been required for an earlier model but aren't needed now. Prompt engineering is more important with less powerful models or in higher-stakes situations.

putna · 2y ago · 1 in thread

Check the "Re-writer" assistant prompt. It is rather interesting, and if you do not know this technique, you will not get the results you want.

sanderjd · 2y ago

Is it interesting? My prior (from using gpt4 quite a bit for quite a while now) is that it would work just as well to just say, "could you please rephrase this in a different way that means the same thing: TEXT" and then if I don't like the answer say, "hmm, that meant something different, could you try again?" or "hmm, you did what I wanted but I don't like that answer, could you try a different one?".

Do you think I would not get the results I want from a conversation like that? Maybe you're right, but I'm pretty skeptical.
