I think there is a bit of the map territory relation here.
> First, it's not a question of decidability but of tractability
The question of decidability is a form of many-to-one, reduction. In fact RE-complete is defined by many-to-one reductions.
In a computational complexity sense, tractability is a far stronger notion. Basically an algorithm efficient if its time complexity at most PTIME for any size n input. A problem is "tractable" if there is an efficient algorithm that solves it.
You are correct, if you limit your expressiveness to PTIME, where because P == co-P, PEM/tight apartness/Omniscience principles hold.
But the problem is that Church–Rosser Property[0] (proofs ~= programs) and Brouwer–Heyting–Kolmogorov Interpretation[1] (Propositions at types) are NOT binary SAT, and you have concepts like mere propositions[3] that are very different than just BSAT.
But CodeSpeak doesn't have formal specifications, so this is irrelevant. Their example code output produced code with path traversal/resource exhaustion risks and correctness issues and is an example.
My personal opinion is that we will need to work within the limitations of the systems, and while it is trivial to come up with your own canary, I would recommend playing with [3] before the models directly target it.
Generating new code from a changed spec will be less difficult, specifically when the mess of real world specs comes into play. You can play with the example on CodeSpeak's front page, trying to close the various holes the software has with malformed/malicious input, giving the LLM the existing code base and you will see that "brown m&m"[3] problem arise quickly. At least for me if I prompt it to look at the changed natural language spec, generating new code it was more successful.
But for some models like the qwen3 coder next, the style resulted in far less happy path protections, which that model seems to have been trained on to deliver by default in some cases.
[0] https://calhoun.nps.edu/entities/publication/015f1bab-6642-4...
[1] https://www.cs.cornell.edu/courses/cs6110/2017sp/lectures/le...
[2] https://www.cambridge.org/core/journals/journal-of-functiona...
[3] https://codemanship.wordpress.com/2025/10/03/llms-context-wi...