Compiling Knowledge into Probabilistic Models

(willcrichton.net)

68 points · wcrichton · 7y ago · 9 comments

marmaduke · 7y ago · 3 in thread

> seems to me that the act of compiling knowledge into probabilistic models is still more art than science

Because there’s no modularity: writing probability models is still like using unstructured assembly.

wcrichton (OP) · 7y ago

Having written a handful of probabilistic programs, I don’t think there’s no modularity at all. You can still abstract a complex stochastic process behind a function. I think it’s more that certain tasks are necessarily antimodular: conditional inference, for example, can introduce unexpected dependencies between modules (see d-separation [0]).
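To make the antimodularity point concrete, here is a minimal sketch of "explaining away". It is an illustration I'm assuming for this comment, not something from the article: two "modules" `a` and `b` are independent fair coins, and `c = a or b` is a shared effect. The names `joint` and `prob_a` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product


def joint():
    """Yield (a, b, c, probability) for two independent fair coins, with c = a or b."""
    half = Fraction(1, 2)
    for a, b in product([0, 1], repeat=2):
        yield a, b, int(a or b), half * half


def prob_a(given=lambda a, b, c: True):
    """P(a = 1 | given), computed by exact enumeration over the joint."""
    total = sum(p for a, b, c, p in joint() if given(a, b, c))
    hit = sum(p for a, b, c, p in joint() if a == 1 and given(a, b, c))
    return hit / total


# Before conditioning on c, learning b tells us nothing about a.
p_a = prob_a()
p_a_given_b = prob_a(lambda a, b, c: b == 1)

# After conditioning on the shared effect c, learning b changes our belief in a.
p_a_given_c = prob_a(lambda a, b, c: c == 1)
p_a_given_c_and_b = prob_a(lambda a, b, c: c == 1 and b == 1)

print(p_a, p_a_given_b)                # 1/2 1/2
print(p_a_given_c, p_a_given_c_and_b)  # 2/3 1/2
```

The two coins are written as separate, independent pieces, yet once you condition on `c` they are no longer independent. That is the kind of cross-module dependency d-separation describes.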

[0] https://www.andrew.cmu.edu/user/scheines/tutor/d-sep.html

eli_gottlieb · 7y ago

Well, as part of the probabilistic programming community, I can at least report that inference programming and model synthesis are problems we're actively working on!

marmaduke · 7y ago

OK, but we agree. There’s a surface modularity in terms of syntax, but anything introducing nonlinear covariance among variables is a death sentence absent reparametrization. The latter strategy is often formulaic and could be automated, but the antimodularity remains.
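As a rough sketch of what reparametrization buys you, consider a funnel-shaped model. This is an example chosen here for illustration, not one named in the thread: `v ~ Normal(0, 3)` is a log-variance and `x ~ Normal(0, exp(v / 2))`. In the non-centered form you sample `z ~ Normal(0, 1)` and set `x = exp(v / 2) * z`. The helper names `pearson` and `sample_funnel` are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def sample_funnel(n, seed=0):
    """Draw n samples of (v, z, x) with x = exp(v / 2) * z."""
    rng = random.Random(seed)
    vs, zs, xs = [], [], []
    for _ in range(n):
        v = rng.gauss(0.0, 3.0)    # log-variance
        z = rng.gauss(0.0, 1.0)    # standardized noise, independent of v
        x = math.exp(v / 2.0) * z  # same distribution as Normal(0, exp(v / 2))
        vs.append(v)
        zs.append(z)
        xs.append(x)
    return vs, zs, xs


vs, zs, xs = sample_funnel(20000)

# Centered coordinates (v, x): the scale of x is tied to v.
corr_centered = pearson(vs, [math.log(abs(x)) for x in xs])

# Non-centered coordinates (v, z): the coupling is gone.
corr_noncentered = pearson(vs, [math.log(abs(z)) for z in zs])

print(round(corr_centered, 2), round(corr_noncentered, 2))
```

The transformation is mechanical, which is why it looks automatable. But you only know to apply it once you have seen how `v` and `x` interact, so the two variables still cannot be reasoned about in isolation.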

scribu · 7y ago · 1 in thread

Modelling uncertainty is definitely a useful tool to have, but I'm not sure why the author expects there to be a "scientific" (a.k.a. mechanistic) way of doing it.

In normal programming, there's no fool-proof formula for picking the best data structure or the best algorithm. If there were, we could just write one program to write all other programs and be done with it!

wcrichton (OP) · 7y ago

I’m definitely not advocating for an automated way of building these models, for the reasons you point out. My point is rather that probabilistic models are not taught in a way that lets us easily map our knowledge structures onto them, so I’m advocating for more explicit guidance in that process.

nathcd · 7y ago

> In the general-purpose programming context, imagine if you could give examples of a program output (domain data) along with a skeleton of a program (source file with incomplete parts) and ask a system to fill in the holes.

This part reminds me of some of the capabilities of the Idris compiler [1]. In an Idris program you can leave "holes" to stand in for incomplete parts of a program [2], and the compiler can infer various bits of code from types and holes. In a demo of the in-progress Idris 2 compiler [3], Edwin Brady refers to it as a "lab assistant" and shows it writing a whole function when given a function type.

[1] http://docs.idris-lang.org/en/latest/tutorial/interactive.ht...

[2] http://docs.idris-lang.org/en/latest/tutorial/typesfuns.html...

[3] https://www.youtube.com/watch?v=mOtKD7ml0NU

i_am_proteus · 7y ago

Philosophically, one could consider model formulation to be a way the author encodes prior beliefs about the system into the model.
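A small worked example of that idea, assuming a coin-flip style model with a Beta prior (my choice of illustration; `posterior_mean` is a hypothetical helper): two modelers see the same data but encode different prior beliefs, and end up with different conclusions.

```python
from fractions import Fraction


def posterior_mean(prior_a, prior_b, successes, trials):
    """Mean of the Beta posterior after a binomial observation.

    A Beta(prior_a, prior_b) prior combined with `successes` out of `trials`
    gives a Beta(prior_a + successes, prior_b + trials - successes) posterior.
    """
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    return Fraction(post_a, post_a + post_b)


successes, trials = 7, 10

# Beta(1, 1): no strong opinion about the success rate.
flat_mean = posterior_mean(1, 1, successes, trials)

# Beta(10, 10): a fairly firm belief that the rate is near 1/2.
skeptical_mean = posterior_mean(10, 10, successes, trials)

print(flat_mean, skeptical_mean)  # 2/3 17/30
```

Same likelihood, same data. The only difference is what each author believed before looking, and that belief is written directly into the model.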

nartz · 7y ago

If you look at kernel code for many flavors of Linux, you'll see annotations (the likely() and unlikely() macros) that hint at which branches of code are more likely.

Similarly, many JIT compilers already gather statistics on the fly; for instance, these are used to better predict which branches are most likely to be taken and should thus be prefetched.
