Creative uses of computed jumps, stack manipulation, dynamic codegen, and all the other weird things your new language might do to efficiently implement some new control or data structure aren't likely to be possible.
At least in the short term nobody is going to be too upset. Today if something needs to wring all the power available from your CPU it isn't reasonable to put it on the web. That will continue to be true. WASM is the wise 80% solution, not a toy for ASM hackers and people messing around with weird prototype programming languages.
Providing additional flexibility on top of those semantics is "expensive" in terms of implementor-time and effort to get safety, portability and future-proofing. There's no promise that the fun extensions they might want today go "in the right direction", and the cost of going down a bad path is high for standards.
I've heard about alternative CPU architectures from the Lisp machine days that never became popular, and I just wonder whether there are such things in the modern era.
Of course, portability and safety are going to be problematic there, and those can't be compromised on for a project like WASM. And even if I can't get the power I want out of the base programming model, I can approximate it more slowly in other ways. Turing completeness and all that.
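To make "approximate it more slowly" concrete, here is a minimal sketch in C of one well-known workaround: emulating arbitrary jumps with a loop around a switch. The function `sum_to` and the block names are made up for illustration; the point is only that every "goto X" becomes "set `next` to X and go around the loop again", which needs nothing beyond structured control flow.

```c
#include <assert.h>

/* Hypothetical labels for the basic blocks of a loop that sums 1..n. */
enum block { BLOCK_CHECK, BLOCK_BODY, BLOCK_DONE };

/* Emulates jumps between blocks with a loop around a switch:
   "goto X" becomes "next = X" followed by another trip around the loop. */
int sum_to(int n) {
    int i = 1, total = 0;
    enum block next = BLOCK_CHECK;
    for (;;) {
        switch (next) {
        case BLOCK_CHECK:
            next = (i <= n) ? BLOCK_BODY : BLOCK_DONE;
            break;
        case BLOCK_BODY:
            total += i;
            i++;
            next = BLOCK_CHECK;
            break;
        case BLOCK_DONE:
            return total;
        default:
            /* No other block exists in this sketch. */
            assert(!"unknown block");
            return -1;
        }
    }
}
```

Calling `sum_to(4)` returns 10. The extra dispatch on every "jump" is where the slowdown comes from compared with a real computed goto.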
I'm not familiar enough with how CPUs are laid out to know whether a hypothetical instruction set or programming model is supportable in silicon and I haven't read or thought much about it either. In any case I think that would be pretty far outside the scope of what WASM is trying to do.
It seems pretty difficult to define function types at the ISA level, since anything to do with typing is language/VM-specific. What types can arguments have? Are varargs supported? Multiple return values?
Maybe the type would just be an integer that the ABI would assign meaning to. But if you did that, how would the function's type be determined? Some pseudo-instruction at the CALL target? I guess looking at CET it does have some things like this: the ENDBR32/ENDBR64 (ENDBRANCH) instructions mark valid indirect branch targets.
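A rough software model of the "type is just an integer" idea, as a sketch in C. Everything here is hypothetical (the type ids, the table, `call_int_to_int`); it only shows the shape of the check: the caller states the type it expects, the callee's entry carries its own tag, and a mismatch "traps" instead of jumping. This is roughly the check wasm's `call_indirect` performs at runtime, though the real thing lives in the VM and not in user code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type ids; the ABI, not the hardware, decides what they mean. */
enum { TYPE_INT_TO_INT = 1, TYPE_INT_INT_TO_INT = 2 };

typedef void (*generic_fn)(void);

struct table_entry {
    int type_id;     /* plays the role of a tag stored at the call target */
    generic_fn fn;
};

static int twice(int x) { return 2 * x; }
static int add(int a, int b) { return a + b; }

static const struct table_entry table[] = {
    { TYPE_INT_TO_INT,     (generic_fn)twice },
    { TYPE_INT_INT_TO_INT, (generic_fn)add   },
};

/* Checked indirect call: returns 0 on success, -1 on a simulated trap. */
int call_int_to_int(size_t index, int arg, int *result) {
    assert(result != NULL);
    if (index >= sizeof table / sizeof table[0]) return -1;  /* out of range */
    if (table[index].type_id != TYPE_INT_TO_INT) return -1;  /* wrong type */
    *result = ((int (*)(int))table[index].fn)(arg);
    return 0;
}
```

With this table, `call_int_to_int(0, 21, &r)` succeeds and stores 42 in `r`, while index 1 is rejected because `add` has a different type id.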
Exactly.
And what about CET's shadow stack is deficient compared to a "totally separate return address stack"?
AFAICT the CET shadow stack isn’t protected. An attacker that can write (using a regular write-what-where primitive) could modify the shadow stack. It should have been a new type of memory that is only accessible with special instructions.
Do you have to analyze the source program and know where every possible call to yield is, and store all the resulting suffixes of a function as new functions?
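For comparison, one common alternative to splitting a function into suffix functions is to keep a single function and record the resume point in an explicit state struct. The sketch below hand-translates a hypothetical generator `for (i = lo; i < hi; i++) yield i;` that way; `range_gen`, `range_init` and `range_next` are invented names, and it assumes the yield appears directly in the generator body, not inside a callee.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical generator yielding lo, lo+1, ..., hi-1. */
struct range_gen {
    int state;   /* 0 = not started, 1 = suspended at the yield, 2 = finished */
    int i, hi;   /* locals that must survive across the yield */
};

void range_init(struct range_gen *g, int lo, int hi) {
    g->state = 0;
    g->i = lo;
    g->hi = hi;
}

/* Resumes the generator. Returns true and stores the value in *out
   when a value is yielded, false once the generator has finished. */
bool range_next(struct range_gen *g, int *out) {
    assert(g && out);
    switch (g->state) {
    case 0:          /* first entry: go straight to the loop test */
        break;
    case 1:          /* resuming after the yield: run the loop increment */
        g->i++;
        break;
    default:         /* already finished */
        return false;
    }
    if (g->i < g->hi) {
        *out = g->i;
        g->state = 1;
        return true;
    }
    g->state = 2;
    return false;
}
```

After `range_init(&g, 3, 5)`, successive calls to `range_next` yield 3, then 4, then return false. The compiler still has to know where each yield is in this one function, but it does not need to materialize every suffix as a separate function.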
C is high-level enough that, IIRC, it doesn't guarantee that arbitrary addresses can successfully be used as function pointers; the only valid function pointers are the results of taking the address of a function, or (possibly) values that have been cast from a function pointer to another type and back. That is enough for wasm.
Try compiling this (to wat, which is an S-expression format that otherwise resembles the assembly in this article): https://wasdk.github.io/WasmFiddle/?5zbjb
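A small C sketch of the function pointer rule mentioned above, in case it helps (names like `pick` and `apply_round_trip` are made up): every pointer comes from naming a function, and the only cast is a round trip through another function pointer type and back. As far as I know, compilers targeting wasm represent such pointers as indices into a function table, so nothing here needs an arbitrary code address.

```c
#include <assert.h>

typedef int (*binop)(int, int);
typedef void (*generic_fn)(void);

static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }

/* Function pointers obtained the portable way: by naming a function. */
binop pick(char op) {
    return op == '+' ? add : sub;
}

/* Round trip through another function pointer type and back, which C
   permits; the restored pointer compares equal to the original. */
int apply_round_trip(binop f, int a, int b) {
    generic_fn erased = (generic_fn)f;
    binop restored = (binop)erased;
    assert(restored == f);
    return restored(a, b);
}
```

For instance, `apply_round_trip(pick('+'), 2, 3)` returns 5. Calling through `erased` directly, without casting back, would be undefined behavior in C.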