undefined | Better HN

0 pointsred75prime1y ago0 comments

> We can't guarantee their output's correctness

We can (for programming, at least): run the output thru theorem prover, ensure that proof is constructive, the Curry-Howard correspondence guarantees that you can turn the output into a correct program. It doesn't guarantee that formal properties of the program correspond to the informal problem statement. But even people occasionally make such errors (a provably correct program doesn't do what we wanted it to do).

> and we can't fix them to guarantee correct output

Same thing with other systems capable of programming. That is people.

You just can't make a system that guarantees correct transformation from an informal problem statement into a formally correct implementation. "Informal" implies that there's wiggle room for interpretation.

No, it doesn't mean that current LLMs are ready to replace programmers, it also doesn't mean that ML models of 2030s will not be able to.

0 comments

7 comments · 3 top-level

bamboozled1y ago· 2 in thread

This doesn’t mean the LLM is deterministic. It means it can spoof the right answer give enough time.

red75primeOP1y ago

Indeterminism is not always bad. A probabilistic Turing machine is more powerful than a Turing machine, for example (BPP complexity class is a superset of P).

bamboozled1y ago

Yeah, didn't mean to imply something was bad or good here, just that it's not a purely deterministic thing.

YeGoblynQueenne1y ago· 2 in thread

>> We can (for programming, at least): run the output thru theorem prover, ensure that proof is constructive, the Curry-Howard correspondence guarantees that you can turn the output into a correct program. It doesn't guarantee that formal properties of the program correspond to the informal problem statement. But even people occasionally make such errors (a provably correct program doesn't do what we wanted it to do).

That sounds very ambitious. Automated theorem provers are real sticklers for complete specifications in a formal language and can't parse natural language at all, but when you generate code with an LLM all you have in terms of a specification is a natural language prompt (that's your "informal problem statement"). In that case what exactly is the prover going to prove? Not the natural language prompt it can't parse!

The best you can do if you start with a natural language specification, like an LLM prompt, is to verify that the generated program compiles, i.e. that it is correct syntactically. As to semantic correctness, there, you're on your own.

Edit: I'm not really sure whether you're talking about syntactic or semantic correctness after all. Which one do you mean?

>> You just can't make a system that guarantees correct transformation from an informal problem statement into a formally correct implementation. "Informal" implies that there's wiggle room for interpretation.

Note that in program synthesis we usually make a distinction between complete and incomplete specifications ("problem statements") not formal and informal. An incomplete specification may still be given in a formal language. And, for the record, yes, you can make a system that guarantees that an output program is formally consistent with an incomplete specification. There exist systems like that already. You can find a bit about this online if you search for "inductive program synthesis" but the subject is spread over a wide literature spanning many fields so it's not easy to get a clear idea about it. But, in general, it works and there are approaches that give you strong theoretical guarantees of semantic correctness.

red75primeOP1y ago

> Which one do you mean?

Ah, I said "theorem prover", I should have said "proof verifier". What I meant is something like DeepMind's AlphaProof with an additional step of generating a formal specification from a natural language description of the problem. In this way we get a semantically correct program wrt the formal specification. But with current generation of LLMs we probably won't get anything for non-trivial problems (the LLM won't be able to generate a valid proof).

> Note that in program synthesis we usually make a distinction between complete and incomplete specifications

Program synthesis begins after you can coherently express an idea of what you want to do. And getting to this point might involve a ton of reasoning that will not go into a program synthesis pipeline. That's what I mean when I say an "informal problem statement": some brain dumps of half-baked ideas that doesn't even constitute an incomplete specification because they are self-contradictory (but you haven't noticed it yet).

LLMs can help here by trying to generate some specification based on a brain dump.

YeGoblynQueenne1y ago

>> What I meant is something like DeepMind's AlphaProof with an additional step of generating a formal specification from a natural language description of the problem.

That's even more ambitious. From DeepMind's post on AlphaProof:

First, the problems were manually translated into formal mathematical language for our systems to understand.

https://deepmind.google/discover/blog/ai-solves-imo-problems...

DeepMind had to resort to this manual translation because LLMs are not reliable enough, and natural language is not precise enough, to declare a complete specification of a formal statement, like a program or a mathematical problem (as in AlphaProof) at least not easily.

I think you point that out in the rest of your comment but you say "the LLM won't be able to generate a valid proof" where I think you meant to say "a valid specification". Did I misunderstand?

>> Program synthesis begins after you can coherently express an idea of what you want to do.

That's not exactly right. There are two kinds of program synthesis. Deductive program synthesis is when you have a complete specification in a formal language and you basically translate it to another language, just like with a compiler. That's when you "coherently express an idea of what to do". Inductive program synthesis is when you have an incomplete specification, consisting of examples of program behaviour, usually in the form of example pairs of the inputs and outputs of the target program, but sometimes program traces (like debug logs), abstract syntax trees, program schemas (a kind of rich program template) etc.

Input-output examples are the simplest case. Today, if you can express your problem in terms of input-output examples there are approaches that can synthesize a program that is consistent with the examples. You don't even need to know how to write that program yourself.

1 more reply

QuantumGood1y ago

Curry–Howard correspondence = direct relationship between computer programs and mathematical proofs

j / k navigate · click thread line to collapse

0 comments

7 comments · 3 top-level

bamboozled1y ago· 2 in thread

This doesn’t mean the LLM is deterministic. It means it can spoof the right answer give enough time.

red75primeOP1y ago

Indeterminism is not always bad. A probabilistic Turing machine is more powerful than a Turing machine, for example (BPP complexity class is a superset of P).

bamboozled1y ago

Yeah, didn't mean to imply something was bad or good here, just that it's not a purely deterministic thing.

YeGoblynQueenne1y ago· 2 in thread

Edit: I'm not really sure whether you're talking about syntactic or semantic correctness after all. Which one do you mean?

red75primeOP1y ago

> Which one do you mean?

> Note that in program synthesis we usually make a distinction between complete and incomplete specifications

LLMs can help here by trying to generate some specification based on a brain dump.

YeGoblynQueenne1y ago

>> What I meant is something like DeepMind's AlphaProof with an additional step of generating a formal specification from a natural language description of the problem.

That's even more ambitious. From DeepMind's post on AlphaProof:

First, the problems were manually translated into formal mathematical language for our systems to understand.

https://deepmind.google/discover/blog/ai-solves-imo-problems...

I think you point that out in the rest of your comment but you say "the LLM won't be able to generate a valid proof" where I think you meant to say "a valid specification". Did I misunderstand?

>> Program synthesis begins after you can coherently express an idea of what you want to do.

1 more reply

QuantumGood1y ago

Curry–Howard correspondence = direct relationship between computer programs and mathematical proofs

j / k navigate · click thread line to collapse