module Blinker (
  input  wire clock,
  output reg  state
);
  // Toggle every clock cycle.
  always @(posedge clock)
    state <= !state;
`ifdef FORMAL
  // $past() is undefined on the first cycle, so guard the assertion.
  reg f_past_valid = 0;
  always @(posedge clock)
    f_past_valid <= 1;
  always @(posedge clock)
    if (f_past_valid)
      assert ($past(state) != state);
`endif
endmodule
module testbench (
  input  wire clock,
  output wire state1,
  output wire state2
);
  Blinker blinker1(clock, state1);
  Blinker blinker2(clock, state2);
`ifdef FORMAL
  // assume that they start out different...
  initial assume (state1 != state2);
  // ...then prove they stay different
  always @(posedge clock)
    assert (state1 != state2);
`endif
endmodule
Of course you NEED clock wires to the blinker. Composability comes from the use of assume and assert: you assume that your environment is behaving according to spec, and you prove that you are also behaving to spec PROVIDED the environment is working correctly. See https://symbiyosys.readthedocs.io/en/latest/index.html for more details. Moral of the story: a broken spec cannot be used, and I am saying this as a mathematician!
There is no justification for specification languages to insist on a single global namespace with no option for namespacing. Just because an operator doesn't compose does not undermine the fact that some things do compose, and that namespacing and scopes are generally useful organisation and communication techniques.
The only reason why LTL is in a global namespace is pure legacy inertia. Academics only apply it to toy problems, so it never needs the scaling features like namespacing.
It's not about logic. That's just post-hoc Stockholm syndrome.
``` Lars, 7:53: In software engineering, we have largely figured out modularization. All modern programming languages have some form of module system. You can pull in a module and import its functions anywhere. ```
... and then they write a blog about how this is impossible.
Namespacing is one trivial part of modularization (avoiding naming clashes). None of the implementations of LTL I have used have it. There is no good reason.
You're mentioning 'fairness' and other concepts that are part of expressing proofs on top of a state evolution system. Yes, the proofs are why we are using LTL, BUT a good amount of time is spent building the model we wish to prove stuff about. There is no reason the state transitions cannot be namespaced and modularised, and it would save a ton of time, even if we have to express proofs globally over the "tree of state space".
Even on the proof level, though, liveness is a property that demonstrates the system never gets cornered. I am pretty sure that will be expressible in general terms for a wide family of interesting systems. So I am pretty sure their argument doesn't hold even at the level of proving properties.
The real reason is that abstraction is not something the logicians will get cited on, so it's a waste of their time trying to solve that thankless task. And probably they would not be very good at it anyway.
It’s unclear what “the composition of the two systems” really means. Are they running in parallel with no interaction (akin to two AWS instances in different data centers running each program)? Are they running in parallel with superficial interaction (two threads in the same process)? Are they running concurrently with mediated interaction (coroutines)?
The nature of how the programs are composed necessarily would influence the nature of how the specs should be composed, I would think.
Finally, maybe the fault lies with the logic system, rather than the problem?
The same difficulty arises in software development too. If the behavior you want isn't just a combination of other existing systems, but rather a combination which peeks inside and hacks on additional constraints then you're going to have a bad time. Maybe it's still worth it for some reason, but that's a leading indicator that a few files probably ought to be refactored.
But there's nothing 'internal'.
In conventional programming we pretend details are internal, e.g. by storing database handles as private fields. But that database is out in the real world.
TLA+ needs to be able to tell you that you can reach a bad state from a particular interleaving of reads and writes to a supposedly internal variable.
I don’t really know how to talk about the problem of the shared variable coupling the two processes too tightly. It seems like the logic is too relaxed: allowing any variable to change at any time forces specifications to lock down variables and therefore be too brittle to connect.
If you can have state with sub-states and 'next' operator that is respective to what sub-state we want to address then this example is easily composable.
But then of course you might want to have substates interact with each other so it complicates further. It just looks like multithreaded programming.
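For what it's worth, here is a minimal Haskell sketch of that idea (the `Sys`, `stepLeft`, and `stepRight` names are my own, purely illustrative): a system is a state plus a "next" step, and the composite pairs the sub-states so that each "next" addresses only its own component.

```haskell
-- A system: a current state and a step ("next") function on that state.
data Sys s = Sys { now :: s, step :: s -> s }

-- Compose by pairing sub-states; each "next" steps only its own
-- sub-state and leaves the other unchanged (interleaving composition).
stepLeft, stepRight :: Sys a -> Sys b -> (a, b)
stepLeft  sa sb = (step sa (now sa), now sb)
stepRight sa sb = (now sa, step sb (now sb))

-- Example: two blinkers composed as a product state.
blinker :: Sys Bool
blinker = Sys False not

main :: IO ()
main = do
  print (stepLeft  blinker blinker)   -- (True,False): only the left stepped
  print (stepRight blinker blinker)   -- (False,True): only the right stepped
```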
But I'm a strong supporter of the approach where you start with the semantics you want and work hard to make it work, as opposed to simple-to-implement semantics you don't want because they make programming harder.
I prefer the TypeScript approach to the C++ one.
Of course there are no black boxes in the real world, but the staggering hierarchical complexity of life would seem to indicate that one can engineer systems (whether natural or man-made) that are close enough to be indistinguishable.
There are reasons why we don't just write formal specs and generate a system from them. It is not just stubbornness.
What is astonishing is that certain isolated systems have been generated from formally complete specs. They need to be recognized as miraculous, unique and unrepeatable.
If you want to reason about the performance of functional programs, or about the behaviour of imperative/concurrent programs, then you need more sophisticated techniques.
But even in this case, specification composition (taken in the sense of the article, to mean the ability to state properties and proofs in a way which aligns with program module boundaries) has become a relatively routine technique in the last 5-10 years.
Basically, there are three main ingredients we seem to want: (1) you need to understand the abstract resources your program manipulates and how different parts of the program transfer ownership of these resources, (2) you need to understand the protocols (i.e., state machines) that your program components are participating in, and (3) you need to understand the rely/guarantee invariants that the program components are maintaining.
Historically, verification techniques have had good stories for one or two of ownership/protocols/invariants (e.g., TLA+ struggles with handling ownership), but the research community has developed techniques in the last few years that can handle all of them. As a result, there don't seem to be many small programs out of the reach of program verification -- compositional machine-checked proofs of lock-free data structures over weak memory are a thing people can do these days. These proofs are by no means easy, but that's because the algorithms are fundamentally intricate rather than because our proof techniques suck.
We don't have any kind of proof that this is a complete inventory of the techniques we need to verify these kinds of programs. But there is the empirical fact that there are fewer and fewer small programs we don't know how to verify, so it's not an unthinkable possibility.
That is, that a sufficient specification for System A and a sufficient specification for System B can't be trivially composed into a sufficient specification for a System C = System A + System B. That's not to say that System C can't be specified, but that it can't be specified without re-creating the specifications for System A and System B.
If it is a relation from some function input to its output, that would be one step. And of course, functions are composable that way.
But say you have a system behaviour that is a stream of function applications, e.g. [f(0), f(f(0)), f(f(f(0)))...] and I forgot the name for this function. This infinite stream is your system behaviour.
Now we have a different behaviour [g(0), g(g(0)), g(g(g(0)))...]
You can (more or less) easily reason about the relation of g^n(0) to f^n(0) where both streams or behaviours are synchronous. But reasoning about the behaviour of all g^n(0) in relation to all f^m(0) is difficult.
All the useful properties for these streams, liveness, fairness, as well as other invariants, are not easily encoded.
For example, we can prove two infinite streams are equal by showing that their definitions/generators are "bisimilar".
I believe this is called `iterate` in Haskell.
None of these feel unnatural.
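For concreteness, here is a minimal Haskell sketch of the streams point above (the step functions and the invariant relating them are invented for illustration):

```haskell
-- Behaviours as infinite streams of iterated applications.
-- iterate f 0 == [0, f 0, f (f 0), ...], so drop the seed to get
-- [f(0), f(f(0)), ...] as in the comment above.
behF, behG :: [Integer]
behF = tail (iterate f 0) where f x = x + 1   -- hypothetical step function f
behG = tail (iterate g 0) where g x = x + 2   -- hypothetical step function g

-- Synchronous (lock-step) reasoning about g^n(0) versus f^n(0) is easy:
-- here the invariant g^n(0) == 2 * f^n(0) is checked pointwise.
synced :: [Bool]
synced = zipWith (\fn gn -> gn == 2 * fn) behF behG

-- Relating every f^n(0) to every g^m(0) with n /= m has no such one-liner;
-- that asynchronous comparison is the hard case. Likewise, equality of two
-- stream generators is shown by bisimulation: equal heads, bisimilar tails.
main :: IO ()
main = print (and (take 100 synced))          -- True on this finite prefix
```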
My understanding is that the problem the article states is that in a synchronous system with a global clock, everything has to be synchronized to that clock. And that if you do not add cycle-number dependent logic, you cannot have cycle-number dependent memory. I fail to see what is new there, or even non-obvious enough to warrant an article.
"Here are some binary operators for a certain notion of specification and some very good examples of why these are not sufficient to express some obvious specs we would like to express."
But:
As others have noted: the given notion of composition is too restricted. The central shortcoming, as indicated in the article, seems to be that the individual parts need to be reintegrated into a new whole. Binary operators obviously cannot do this.
Candidates for a compositional calculus that should be able to deal with this are operadic composition via wiring diagrams and comprehension expressions interpreted via arrows (here, the final return/yield is used to reintegrate the individual parts).
I don't have any experience with machine-verified specifications but I do have experience writing computer programs and (traditional, human-checked) mathematical proofs, so my comment comes from that perspective.
Regarding
> Me: 200 words on how we're talking about composing specifications, not code, because specifications aren't programming languages
> HN: [parent]
I'm sorry if the use of the terminology "programming language" was jarring. I tend to take the point of view that a huge range of human activity (in what are traditionally called programming languages, probabilistic graphical models, languages for constraint solving and optimisation, a lot of mathematics, especially category-heavy mathematics, and even writing YAML for configuration) is actually all just different aspects of the same thing. I chose the word "programming" in the parent, which might betray some bias on my part, but regardless of the word chosen, the practice of combining smaller logical pieces into larger logical pieces is common to all. Your article is suggesting that there is something about the nature of specifications that makes this composition harder than it is for executable code.
The conclusion may be true (I have no idea) but I'm afraid that I just don't see how your article justifies it. In the first example I see a language, LTL, whose design encourages users to write small logical units that cannot be composed[1] easily with the && operation. A more awkward encoding does achieve compositionality (through stuttering-invariance). It looks like TLA+ provides notation to make it more natural to write the small logical units in ways that let them be composed easily with &&. That makes it sound to me like TLA+ is a better language than LTL (for this purpose)! Better languages make it easier and more natural to write smaller units in ways that can be composed. That's all my parent comment says!
In the second example it is composition with && itself that proves to be impossible. The composition rule is something more complicated:
Next = (NextX(z) && Same(y)) || (NextY(z) && Same(x))
Spec = InitX && InitY && □[Next]_xyz
Unless I'm missing something, there's nothing wrong or hard about that composition rule, it's just the tooling that doesn't make it easy to work with. In Haskell we compose monadic functions with >>=. That would make life really awkward unless we had do notation (see the sketch below). The problem is not that monadic functions are hard to compose, it's that the language needs to support them ergonomically. I can't see anything in your post that suggests to me that specifications are by nature hard to compose. All I can see is unergonomic languages. See also https://www.haskellforall.com/2016/04/worst-practices-should...
[1] composed in the sense which your article elaborates, that is, in a way which combines the properties of both in some desirable way
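To make the monad analogy above concrete, a tiny illustrative sketch (`halve` and `decrement` are invented examples): composition with >>= is perfectly well-defined, it is just the ergonomics that do notation supplies.

```haskell
-- Two monadic "components" in the Maybe monad: each can fail.
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

decrement :: Int -> Maybe Int
decrement n = if n > 0 then Just (n - 1) else Nothing

-- Raw composition with >>= works, but reads awkwardly at scale:
raw :: Int -> Maybe Int
raw n = halve n >>= decrement >>= halve

-- do notation expresses exactly the same composition ergonomically:
sugared :: Int -> Maybe Int
sugared n = do
  a <- halve n
  b <- decrement a
  halve b

main :: IO ()
main = print (raw 10, sugared 10)  -- (Just 2,Just 2)
```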
You could take the specs to be "x must blink" and "y must blink". Either of these informal specs is more or less explicit enough to implement directly, but the spec "both x and y must blink" can be interpreted in too many ways to implement.
Another interesting case: the NGA is required to keep an archive of every geointelligence product created in the US for X years, and the library only supports the NITF file format. This format specifies that images are stored as block segments that can either compose as tiles or as overlays, or in some cases can be non-viewable pixel data (i.e. something that can be used for downstream machine processing but isn't meant to be meaningful to humans, and if you loaded it into a viewer it would look like nothing - consider something like a map of dead pixels).
One consequence of this is that if you want to disseminate something more lightweight to your customers ordering imagery products, like a simple jpeg so they don't need nitf viewers to see them, we make the jpeg and embed it as a nitf image segment with metadata indicating an actual nitf viewing program should ignore it.
There are Python build tools that get around the fact that their config isn't compatible with TOML out of the box by embedding it into pyproject.toml: they add a key-value pair where the key is "legacy-ini" and the value is a string enclosing the other tool's entire INI config. Or take secrets in Kubernetes, which specify key-value pairs where the value is always a base64-encoded string, allowing you to store anything at all as a secret, including another YAML file, even a Kubernetes manifest. We presently use this where I'm at to get the Anchore Enterprise license into the cluster, and the license is in YAML format. Base64-encoded YAML embedded into another YAML file.
Of course, none of this is all that exotic. File types are just a subset of data types, and obviously types compose, both functionally and in terms of building compound types, including recursive types, from simple types. I think the real issue this guy is seeing is due to temporal logic introducing unique challenges, not that specifications in general don't compose. Specifications without temporality compose pretty easily in many cases. This is why regular expressions and context free grammars and compilers are possible.
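To illustrate the claim that non-temporal specifications compose, here is a toy Haskell sketch with hand-rolled parser combinators (the `P`, `orElse`, and `andThen` names are invented): two grammar fragments compose by alternation and sequencing with nothing to mediate between them.

```haskell
-- A parser is a (non-temporal) specification of a set of strings.
newtype P a = P { runP :: String -> [(a, String)] }

-- The smallest grammar fragment: a single literal character.
char :: Char -> P Char
char c = P $ \s -> case s of
  (x:xs) | x == c -> [(c, xs)]
  _               -> []

-- Composition by alternation: the union of two specifications.
orElse :: P a -> P a -> P a
orElse (P p) (P q) = P $ \s -> p s ++ q s

-- Composition by sequencing: one specification followed by another.
andThen :: P a -> P b -> P (a, b)
andThen (P p) (P q) =
  P $ \s -> [ ((a, b), rest') | (a, rest) <- p s, (b, rest') <- q rest ]

main :: IO ()
main = do
  print (runP (char 'a' `orElse`  char 'b') "ab")  -- [('a',"b")]
  print (runP (char 'a' `andThen` char 'b') "ab")  -- [(('a','b'),"")]
```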
Nonetheless, the same issue would come up here. Given we can specify programming languages as context free grammars and regular expressions, why can't we compose them and have a compiler that compiles both Rust and Go at the same time and makes a single executable?
The answer is we can, but the syntax doesn't actually fully specify the languages. A full specification needs to include the runtime and the ABI, and the actual behavior of those may not be fully specified in any actual specification other than the code for the respective compilers. It's not that they can't be composed. SWIG is a thing. It's just much harder than composing CFGs. You probably can't do it in a blog post or by hand on a piece of paper.
I don't know what the standard is nowadays, but in the past, every time I have argued against a purely natural language specification in favor of a code-based spec such as a reference implementation, people looked at me in shock that I wanted to skip the specification part.
I tried to explain that, I don't want to skip the specification. I just don't think human language is nearly precise enough to write an adequate specification. Natural language words are incredibly polysemic and contextual. Look, for example, at how many meanings the word "break" has: https://www.merriam-webster.com/dictionary/break
Kolmogorov suggested a long time ago that fully specified information distills down to computer programs: https://en.wikipedia.org/wiki/Kolmogorov_complexity, https://en.wikipedia.org/wiki/Minimum_description_length
To me the ideal language for a pure specification might be a mix of natural language and pseudocode with a pseudo test suite. However, if you are writing this, you might as well go one step further and write working, testable code.
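As a sketch of that last step, here is a spec fragment written directly as testable code in Haskell (the `sortSpec` property and the sorting example are invented for illustration):

```haskell
import Data.List (sort)

-- Executable spec fragment: the prose stays prose, while the checkable
-- clauses become plain predicates, i.e. a "pseudo test suite" that runs.
sortSpec :: [Int] -> Bool
sortSpec xs =
     sort (sort xs) == sort xs            -- sorting is idempotent
  && length (sort xs) == length xs        -- sorting preserves length

main :: IO ()
main = print (all sortSpec [[3,1,2], [], [5,5,4]])  -- True
```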
I like the concept of Literate Programming (https://en.wikipedia.org/wiki/Literate_programming) and its descendants like having code with extractable inline comments that auto generate documentation. I would argue that modern pull request based workflows that tie discussions to version controlled code changes are also the progression of this line of thought. A cleaned up version of these make sense to me for a specification.
And I get some of the concerns. While natural language under-specifies, reference implementations over-specify. This is more of a problem with low-level languages; modern high-level languages are getting fairly close to a form of pseudocode. I fully agree that reference implementations shouldn't contain, or should hide, low-level optimizations.
I also understand that reference implementations can unduly tie specs to specific hardware, OSs and platforms.
But to me over-specification is less of a problem than under-specification, and it can be mitigated by labeling particular functions or blocks of code as implementation-specific and not part of the spec.
Without a spec written in code, the different implementations always have subtle incompatibilities. I see egregious versions of under-specification in government, where horrendously bad specs are created in order to issue RFPs for getting software built. They usually end up with non-working software at mind-blowing cost.
People have this weird misconception that you are contracting out the building of software. You are not. Building software is really easy: you press the build button or type the compile command. Building software has been fully automated for a long time. What is difficult is designing software and specifying what it must do, because there is a vast jungle of protocols, business flows, hardware and software platforms that need to be interacted with differently for different needs. This is what needs to be specified, and only computer code can do it adequately.
I really wish, for example, that Mozilla adopted the Chromium core. To me Chromium is a de facto standard spec. I don't think it is possible to specify a standard that makes browsers fully compatible without them having a common core based on a reference implementation. And we really need a popular, non-profit-managed release of said reference browser core.