Coming Soon: Machine-Checked Proofs in Everyday Development (opens in new tab)

(media.ccc.de)

185 pointsfuklief8y ago75 comments

75 comments

45 comments · 10 top-level

jondubois8y ago· 12 in thread

Mathematical proofs of code are not suitable for most systems.

Even unit tests which don't involve proofs can be a problem sometimes because they lock down API inputs and outputs. It's already a major problem when a developer wants to change some code and they have to spend hours updating unit tests every time. When in small teams, heavy 100% coverage unit tests slow down productivity, possibly by 5x or 10x.

I imagine that adding proofs as part of the unit test suite would further increase the productivity cost overhead by a massive factor.

In software development, you should avoid treating the code like if it's special or perfect because it's not and it will need to be changed in the near future when requirements change (and they will always keep changing). So locking it down into a specific state with proofs all around it is a bad idea for the vast, vast majority of cases.

We need more wisdom in software development and that means taking a step back, looking at the big picture and asking questions like what are the negative side effects of each new methodology that we are introducing into our projects? There are always downsides, and we're crazy to pretend that every fancy new methodology is a silver bullet.

jononor8y ago

Test primarily against external APIs. Those should be stable against change anyways, so that the external consumers can rely on it.

Generally don't test against internals much, except if an area is shown to be buggy and hard to exercise externally.

Prefer (well tested) libraries for generic functions/algorithms. If you have to implement one, put tests on it as if it was external API. Consider publishing it independently to make it external. If needs change, just drop the dependency.

hmottestad8y ago

It's hard to know the intent of a developer by reading their code. You could always add a bunch of comments, which would help, but in the long run you risk them getting out of date.

My unit tests are basically "comments" for my code, that show me how the code is supposed to work. When I make changes, the tests will tell me if I broke the intentions of the initial developer.

Sure, when doing big changes, just updating and maintaining the tests is a lot of work, but so would maintaining documentation and comments be. So we have less documentation and fewer comments, and more tests.

jondubois8y ago

I find integration tests more useful for documentation because they show you how the system works at a higher level which is precisely what new developers need to understand.

If the code itself is written properly then the individual classes and objects used throughout should be high cohesion and easy to read without needing any comments or tests.

There are always cases where unit tests are important though like the parts of the code that deal with money.

But there is no benefit in adding unit tests if you're building a standard chat system for example; integration tests are sufficient in that case.

Also getting 100% test coverage is a complete waste of time if you don't actually test the right kinds of input and output parameters.

3 more replies

pmontra8y ago

Recently I found a get_info(doc) which was hard to understand. It should have been get_xrsf_token(web_page) which would have been self explanatory.

I mostly develop backends of web apps. In my experience not every web app has some code that needs to be explained. Picking the correct names is usually enough.

That said, unit tests really do what you write: provide examples, use cases and guidance.

z3t48y ago

I find the best approach is the keep the code base small enough so you can keep everything in your head. Knowing that if you change something you also have to update this and that , and make sure feature x, y, z still works (which could be done with automated tests).

mauricioc8y ago

The presentation addresses this in a way. The argument is that in Coq you can write proof _scripts_ ("repeat this simplification operation until it does nothing", "pattern match on this definition and apply this operation to every subcase", "proof tactics" and so on), and in this way the same proof script can prove different theorems.

So there is a possibility that changing the code only forces you to change theorem statements (that is, the spec), allowing reuse of the proof script itself. Now you are doing software engineering on proof scripts, and you face the same generality/flexibility problems you would face when writing unit tests. A well-written proof script would be flexible enough, and a hacky one would lock things down too much.

bsaul8y ago

That part made me wonder if a day will come where some generic script would be able to work with any theorem on any code, maybe with the help of some kind of introspection ( at this level of abstraction i’m not even sure introspection is the right concept).

2 more replies

adultSwim8y ago

Re-think your assumptions about "mathematical proofs". Compiling software with types is a mathematical proof. We get certain guarantees "for free" without even thinking about it. It's immensely useful and flexible. We can build better type systems than are in common use. So much more is possible than just making sure we don't stick an int where a string is supposed to go. Moving forward is about raising the bar on what we get "for free".

We have a lot of experience building programs that help us program (e.g. compilers, IDEs, build systems) We can move forward without falling into traps (low performance, high programmer knowledge required, overly rigid, more work required than gained in productivity)

There are no silver bullets but we can continue to improve.

KirinDave8y ago

> Mathematical proofs of code are not suitable for most systems.

It depends what kind of math and what they're doing though. Systems like Coq give you much more trustable code to put in your system, and higher degrees of trust for clients. But systems like Idris actually let you write code and use the proof engine to write trustworthy code.

Most of that "math", from the perspective of a programmer, is a small subset of graph theory, formal logic, and category theory useful for proving relationships and reflexivity so that the machine can help you.

> Even unit tests which don't involve proofs can be a problem sometimes because they lock down API inputs and outputs.

This is a much bigger problem than "hand written mocks are bad" (which they are, I agree). But it's quite out of scope to complain that because API endpoints are often sloppily deployed without versioning or introspectability, that all formal methods are wrong.

What's more, and I really want to push this point so I'll emphasize it: quickly producing broken code helps almost no one. If you're an artist playing with making a synthesizer whine? Maybe okay. If you're writing software intended to face the world and do real work? We can't just keep pretending we're all smart enough to do this without mechanical aid; we're NOT smart enough to do this without mechanical aid anymore. Programs integrate too many libraries and APIs.

> I imagine that adding proofs as part of the unit test suite would further increase the productivity cost overhead by a massive factor.

With Idris, it makes it so the program can write code for you. I've been learning it over the last 2 weeks and I'm totally fascinated by how much drudgery it takes out of total programming.

> In software development, you should avoid treating the code like if it's special or perfect

This is precisely what formal methods do! You challenge every piece of code you can and say, "Prove to me that what you've written doesn't have an obvious flaw!" You rewrite until both the human and the proof engine cannot see anymore. It is not sacred at that point (no one believes this), but at least you've acknowledged that you can write errors and started to mitigate them.

> There are always downsides, and we're crazy to pretend that every fancy new methodology is a silver bullet.

Actually, the entire industry has started to demand that developers do a lot more computation on their ends to make better software. The "prove more of your program and write total code" is just one side of the "do more work on your end" school. Another is, "Your software is in fact the output of this algorithm training from a dataset on 4 GPUs." We probably need a similar step forward for APIs (where APIs are invalid & unusable without a spec that governs them).

It is a change, but most commercial software engineers are handling user data that can be leveraged in disastrous every day. As you say, we need the wisdom to realize that it's difficult to guess the use case or security impact of the code we write on a daily basis, and we need to start taking that fact seriously. These heady days of being able to deny responsibility and shove out any software we want without consequence need to end.

abecedarius8y ago

While automated reasoning tech is getting better, the barriers have been more on the demand side. Until buggy vulnerable software makes you poor and visibly correct software makes you rich, we'll keep getting the vulnerable sort. IOTA, which rolled their own hash function and then blustered about the problem, right now has a market value of ten billion dollars.

We're starting to see a change in the economics. There's quite a way to go -- I hope we get there intelligently and not as some kind of thrashing reaction to giant disasters.

1 more reply

crdoconnor8y ago

>But it's quite out of scope to complain that because API endpoints are often sloppily deployed without versioning or introspectability, that all formal methods are wrong.

It's not so much that they're "all wrong", but when the majority of bugs you face are not logical or algorithmic in nature but driven by incorrect specifications or problems dealing with the outside world (including but not limited to sloppy APIs), then formal methods will impose a stupidly high cost for almost no benefit.

To take the famous Mars Climate Orbiter bug as an example - had the software been formally proven then it still would have gone down in a ball of fire because the software was built upon a faulty presumption. An integration test that exercised both the data and the code in a realistic environment would have saved it though, as would better spec communication between the two teams (e.g. some form of BDD). The chief investigator, Arthur Stephenson, even explicitly called out the lack of sufficient end to end testing.

I'm sure formal methods are great for proving that, e.g. a particular sorting algorithm always works, but frankly, sorting algos are the type of thing I import and just assume works and I've yet to see a bug in the wild that invalidates that assumption, whereas a faulty interaction between two poorly specified modules is a problem I see in teams on a near daily basis.

1 more reply

duneroadrunner8y ago

Here's my hypothesis: The problem with these static verifiers is that they are applied at the wrong stage. The verification logic would be more useful in more cases if it were applied at the optimization stage.

Right now the benefits of static verifiers is rather binary. Either your program (or code unit) is sucessfully verified or it isn't. And every time you make a change to verified code you risk losing its verified status.

Imagine instead writing code that is made "safe" using run-time asserts everywhere. Unoptimized, this code might be slow. And unreliable in the sense that a run-time assert could fail, requiring some kind of recovery or abort action. But it would still be "safe".

The run-time asserts are effectively the "program specifications", and the discarding of those asserts at the optimization stage is basically equivalent to traditional static verification. The difference being that not quite fully optimized "safe" code is generally more valuable than not quite fully verified (and therefore not fully "safe") code.

> These heady days of being able to deny responsibility and shove out any software we want without consequence need to end.

If the static verification community is imploring the development community to, with some effort, change its practices to improve code safety, would that community be willing to, with some effort, move the application of their verification logic to the optimization stage where it would be more practical for a wider range of applications?

2 more replies

adamnemecek8y ago· 6 in thread

Cryptocurrencies and digital contracts are going to pump an insane amount of money into this.

KirinDave8y ago

I actually genuinely agree. Trustless computing is a very strong use case for systems where a lot more verification happens up front, because the downsides of messing up are so obvious and immediate.

It's worth noting that Ethereum's "fuel" model as a 1:1 correspondence with how Idris's mostly-total IO monad works. The "infinite fuel" bit can just be turned off and some (substantial and tricky, I admit) changes written to keep top-level state around and bam; timeboxed resume-able IO

seanwilson8y ago

I'm not so sure to be honest. I've been super surprised at just how few high profile hacks there have been against cryptocurrency protocols so far even when you've got clients implemented in C++. I wish I could explain why as well because you'd think all you'd need is a small flaw to attack the whole network when most nodes are running the same software.

firethief8y ago

Normal security practices are applicable to the node/wallet software. Smart contracts are another story -- think of the DAO heist. Short programs whose behavior determine the fate of $$$ are exactly the kind of thing where verification excels.

201710268y ago

I was thinking the same thing. I just got an Ethereum certification but I don't want to just develop Solidity contracts at an hourly basis that just get hacked after I leave.

Would the resources in the original post be worth looking into to build secure contracts?

Roritharr8y ago

I can only ask a question, since i've got no know-how about this.

How would such a proof checker know the intent of the code of not having backdoor/insecurity? How do you define security in a logical sense?

3 more replies

etaioinshrdlu8y ago

On a somewhat related topic... let's have a decentralized market for proofs!

I wrote up what this could look like: https://pastebin.com/raw/ZAys0mYp

haskellandchill8y ago· 4 in thread

My current reading list:

- Fundamental Proof Methods in Computer Science

- Handbook of Practical Logic and Automated Reasoning

- Software Abstractions

- The Little Prover

Yes, currently I have Athena, OCaml, Alloy, ACL2 (via Dr Racket/Dracula) all up and running to work through the exercises in these books. I'm very skeptical anything will come of this, but it interests me and I can see the value.

Hopefully the speaker's next book Formal Reasoning About Programs (https://github.com/achlipala/frap) will be a nice next step, I have Certified Programming with Dependent Types but have not cracked it yet. Also keeping an eye on Verified Functional Algorithms (https://softwarefoundations.cis.upenn.edu/vfa-current/index....).

meneame28y ago

Three extra classics:

* https://books.google.es/books?id=YseqCAAAQBAJ

* http://www.concrete-semantics.org/

* https://softwarefoundations.cis.upenn.edu/

adultSwim8y ago

+1 for Software Foundations

abecedarius8y ago

> The Little Schemer

Do you mean The Little Prover?

haskellandchill8y ago

Yes! Thanks.

Animats8y ago· 4 in thread

In ten years.

That's where I thought we were in 1983 - ten years away.

wolfgke8y ago

> In ten years.

> That's where I thought we were in 1983 - ten years away.

Obligatory xkcd:

> https://xkcd.com/678/

MikkoFinell8y ago

> Obligatory xkcd

God I cringe every time someone posts that. It's like the "Big Bang Theory" of internet comics.

AlexCoventry8y ago

I think ten years is an extremely conservative estimate, since he's giving real-world examples in the talk.

throwaway20488y ago

They had real world examples of formally verified code in 1983 too.

skybrian8y ago· 3 in thread

What progress has been made recently in making writing proofs easier? Are there any research projects targeting mainstream programmers?

slack3r8y ago

Check out Ur/Web (http://www.impredicative.com/ur/).

It has been used by a "mainstream programmer" to build a RSS feed reader.

agentultra8y ago

Plenty of advances in logic and category theory have enabled dependently typed languages to be both theorem prover and practical implementation language.

See Lean.

vladf8y ago

I'm interested but "Lean programming languages" and its varieties are difficult to Google. Would you mind posting a complete reference?

2 more replies

jpochtar8y ago· 2 in thread

We already have a basic form of machine checked-proofs in everyday code: type systems. Type systems encode and prove simple theorems like "f(x: int) -> int returns an int, if x is an int". Typescript and Mypy show us that these proof checkers don't need to be built into a language. Even better, they show that proofs can be over subsystems of a program, since Typescript and Mypy both allow for partial/incremental typing. We just need to extend these systems to support more complex theorems.

bcherny8y ago

Incremental type systems are super interesting. Lots of work being done in the last 10 years that blurs the lines between static and dynamic typing (monkeytype, reticulated Python), types and values (via the triumph of inferred types), kinds and types (powerful trait systems like Scala's), and even values and code (code synthesis like magichaskeller, or excel's flashfill).

A bit of self promotion, I touched on this in a recent blog post if you're interested in this sort of stuff: https://performancejs.com/post/hde6a33/JavaScript-in-2017:-Y...

earleybird8y ago

and you end up in the neighbourhood of Idris which is, I find, a pleasant experience when doing development

qznc8y ago· 2 in thread

Machine checked useable compilers exist (CakeML, CompCert). Useable kernels exists (SeL4). A few people officially work as proof engineers.

The future is here, but not evenly distributed. ;)

haskellandchill8y ago

Were you able to find a position? You are qualified, but it's very competitive, few jobs and many PhD level candidates with experience.

qznc8y ago

Did not try. I'm only looking in southern Germany and afaik nobody does that here.

mcguire8y ago· 1 in thread

Good talk!

So, one question: Adam gives one example of a proof where the initial automation fails because it does not consider a theorem that was previously proved. He also gives an example of some complex code where the proof is too tedious to do without lots of automation. What happens when your complex code changes in such a way that you need some extra lemmas but don't know what they are?

yaantc8y ago

It's manageable in practice I believe, and Adam quickly shows a feature related to this in the demo.

First, if some new features are added with associated properties, then some theorems/lemmas related to these properties will have to be added and proven. I don't think that part is problematic, because it's a "local" decision. This work is focused on the change.

The possible problems come from the impact of these changes to existing results. If you need to go and make other changes all over the place it wouldn't be scalable. But then automation and coq hint database can help. In the demo example you referred to Adam adds the lemma as a rewrite hint, and then it's picked by coq automation elsewhere. That's how local changes can automatically be leveraged globally by proof automation in existing proofs, with no changes (ok, not always. But that limits the impact).

crb0028y ago· 1 in thread

We already have them. It's called compiler type checks. Albeit having an SMT solver in your unit tests is nice too.

UncleMeat8y ago

Type checking is (usually) intraprocedural and flow insensitive. That's what makes it easy. The other stuff is harder.

ted_dunning8y ago

I think that the classic ironical demonstration of the limits of proof systems comes up with the fact that CoqIDE 8.7.1 crashes immediately on OSX if you try to cut or paste anything.

The problem, it turns out that they included a new version of OCaml which was incompatible with the older version of GTK that was being used. Result: crash due to seg fault.

We really need both proofs and pragmatics. Proofs can give us confidence in the fundamentals when used appropriately for the processor, or for consensus algorithms like Raft, or key parts of the OS. Tests and statistical analysis can give us confidence that we haven't overlooked something crucial.

Some systems obviously aren't very appropriate for formal methods (take machine learning, for instance, there can be no formal spec for what a fraud looks like). There statistics and testing will have to suffice because we have clear value on average even if we cannot formally guarantee the system.

Other systems are much more appropriate for formal methods. I should hope that a heart monitor, CPU, voting machine or aircraft flight management system will have substantial amounts of formally proved code (I know that I am dreaming with the voting machines; let's settle for getting it down on paper). I don't want to collect a statistical sampling of whether an airplane works as we iteratively work out the bugs and specs in an agile fashion.

j / k navigate · click thread line to collapse

75 comments

45 comments · 10 top-level

jondubois8y ago· 12 in thread

Mathematical proofs of code are not suitable for most systems.

I imagine that adding proofs as part of the unit test suite would further increase the productivity cost overhead by a massive factor.

jononor8y ago

Test primarily against external APIs. Those should be stable against change anyways, so that the external consumers can rely on it.

Generally don't test against internals much, except if an area is shown to be buggy and hard to exercise externally.

hmottestad8y ago

It's hard to know the intent of a developer by reading their code. You could always add a bunch of comments, which would help, but in the long run you risk them getting out of date.

My unit tests are basically "comments" for my code, that show me how the code is supposed to work. When I make changes, the tests will tell me if I broke the intentions of the initial developer.

jondubois8y ago

I find integration tests more useful for documentation because they show you how the system works at a higher level which is precisely what new developers need to understand.

If the code itself is written properly then the individual classes and objects used throughout should be high cohesion and easy to read without needing any comments or tests.

There are always cases where unit tests are important though like the parts of the code that deal with money.

But there is no benefit in adding unit tests if you're building a standard chat system for example; integration tests are sufficient in that case.

Also getting 100% test coverage is a complete waste of time if you don't actually test the right kinds of input and output parameters.

3 more replies

pmontra8y ago

Recently I found a get_info(doc) which was hard to understand. It should have been get_xrsf_token(web_page) which would have been self explanatory.

I mostly develop backends of web apps. In my experience not every web app has some code that needs to be explained. Picking the correct names is usually enough.

That said, unit tests really do what you write: provide examples, use cases and guidance.

z3t48y ago

mauricioc8y ago

bsaul8y ago

2 more replies

adultSwim8y ago

There are no silver bullets but we can continue to improve.

KirinDave8y ago

> Mathematical proofs of code are not suitable for most systems.

> Even unit tests which don't involve proofs can be a problem sometimes because they lock down API inputs and outputs.

> I imagine that adding proofs as part of the unit test suite would further increase the productivity cost overhead by a massive factor.

With Idris, it makes it so the program can write code for you. I've been learning it over the last 2 weeks and I'm totally fascinated by how much drudgery it takes out of total programming.

> In software development, you should avoid treating the code like if it's special or perfect

> There are always downsides, and we're crazy to pretend that every fancy new methodology is a silver bullet.

abecedarius8y ago

We're starting to see a change in the economics. There's quite a way to go -- I hope we get there intelligently and not as some kind of thrashing reaction to giant disasters.

1 more reply

crdoconnor8y ago

>But it's quite out of scope to complain that because API endpoints are often sloppily deployed without versioning or introspectability, that all formal methods are wrong.

1 more reply

duneroadrunner8y ago

> These heady days of being able to deny responsibility and shove out any software we want without consequence need to end.

2 more replies

adamnemecek8y ago· 6 in thread

Cryptocurrencies and digital contracts are going to pump an insane amount of money into this.

KirinDave8y ago

I actually genuinely agree. Trustless computing is a very strong use case for systems where a lot more verification happens up front, because the downsides of messing up are so obvious and immediate.

seanwilson8y ago

firethief8y ago

201710268y ago

I was thinking the same thing. I just got an Ethereum certification but I don't want to just develop Solidity contracts at an hourly basis that just get hacked after I leave.

Would the resources in the original post be worth looking into to build secure contracts?

Roritharr8y ago

I can only ask a question, since i've got no know-how about this.

How would such a proof checker know the intent of the code of not having backdoor/insecurity? How do you define security in a logical sense?

3 more replies

etaioinshrdlu8y ago

On a somewhat related topic... let's have a decentralized market for proofs!

I wrote up what this could look like: https://pastebin.com/raw/ZAys0mYp

haskellandchill8y ago· 4 in thread

My current reading list:

- Fundamental Proof Methods in Computer Science

- Handbook of Practical Logic and Automated Reasoning

- Software Abstractions

- The Little Prover

meneame28y ago

Three extra classics:

* https://books.google.es/books?id=YseqCAAAQBAJ

* http://www.concrete-semantics.org/

* https://softwarefoundations.cis.upenn.edu/

adultSwim8y ago

+1 for Software Foundations

abecedarius8y ago

> The Little Schemer

Do you mean The Little Prover?

haskellandchill8y ago

Yes! Thanks.

Animats8y ago· 4 in thread

In ten years.

That's where I thought we were in 1983 - ten years away.

wolfgke8y ago

> In ten years.

> That's where I thought we were in 1983 - ten years away.

Obligatory xkcd:

> https://xkcd.com/678/

MikkoFinell8y ago

> Obligatory xkcd

God I cringe every time someone posts that. It's like the "Big Bang Theory" of internet comics.

AlexCoventry8y ago

I think ten years is an extremely conservative estimate, since he's giving real-world examples in the talk.

throwaway20488y ago

They had real world examples of formally verified code in 1983 too.

skybrian8y ago· 3 in thread

What progress has been made recently in making writing proofs easier? Are there any research projects targeting mainstream programmers?

slack3r8y ago

Check out Ur/Web (http://www.impredicative.com/ur/).

It has been used by a "mainstream programmer" to build a RSS feed reader.

agentultra8y ago

Plenty of advances in logic and category theory have enabled dependently typed languages to be both theorem prover and practical implementation language.

See Lean.

vladf8y ago

I'm interested but "Lean programming languages" and its varieties are difficult to Google. Would you mind posting a complete reference?

2 more replies

jpochtar8y ago· 2 in thread

bcherny8y ago

A bit of self promotion, I touched on this in a recent blog post if you're interested in this sort of stuff: https://performancejs.com/post/hde6a33/JavaScript-in-2017:-Y...

earleybird8y ago

and you end up in the neighbourhood of Idris which is, I find, a pleasant experience when doing development

qznc8y ago· 2 in thread

Machine checked useable compilers exist (CakeML, CompCert). Useable kernels exists (SeL4). A few people officially work as proof engineers.

The future is here, but not evenly distributed. ;)

haskellandchill8y ago

Were you able to find a position? You are qualified, but it's very competitive, few jobs and many PhD level candidates with experience.

qznc8y ago

Did not try. I'm only looking in southern Germany and afaik nobody does that here.

mcguire8y ago· 1 in thread

Good talk!

yaantc8y ago

It's manageable in practice I believe, and Adam quickly shows a feature related to this in the demo.

crb0028y ago· 1 in thread

We already have them. It's called compiler type checks. Albeit having an SMT solver in your unit tests is nice too.

UncleMeat8y ago

Type checking is (usually) intraprocedural and flow insensitive. That's what makes it easy. The other stuff is harder.

ted_dunning8y ago

I think that the classic ironical demonstration of the limits of proof systems comes up with the fact that CoqIDE 8.7.1 crashes immediately on OSX if you try to cut or paste anything.

The problem, it turns out that they included a new version of OCaml which was incompatible with the older version of GTK that was being used. Result: crash due to seg fault.

j / k navigate · click thread line to collapse