The convolution is approximated via a form of sampling with additional bookkeeping at each encountered branch. How well that scales for deeply branching programs depends on the branch probabilities and on how diverse the outputs along the resulting paths are; the worst case is a program where all branches are equally likely and each path produces an entirely different output (as in a hash function, for example). In practice, we've been dealing with problems involving up to tens of thousands of branches.
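A minimal sketch of what sampling with per-branch bookkeeping could look like — the two-branch program, its probabilities, and the dedup-by-path aggregation are all my assumptions, not the actual implementation. Each sampled run records which arm every branch took and multiplies in that arm's probability, so every path carries its exact weight; resampling a known path then adds nothing, and the estimate only improves as new paths are discovered:

```python
import random
from collections import defaultdict

def run_once():
    # One sampled execution of a hypothetical two-branch program.
    # Bookkeeping at each branch: remember which arm was taken and
    # multiply its probability into the path's exact weight.
    decisions = []
    path_prob = 1.0

    def branch(p):
        nonlocal path_prob
        taken = random.random() < p
        decisions.append(taken)
        path_prob *= p if taken else 1.0 - p
        return taken

    x = 0
    if branch(0.7):
        x += 1
    if branch(0.3):
        x += 2
    return tuple(decisions), x, path_prob

def approximate_distribution(n_samples=10_000):
    # Keep each distinct path once, with its exact probability,
    # then sum path weights per output value.
    paths = {}
    for _ in range(n_samples):
        key, out, prob = run_once()
        paths[key] = (out, prob)
    dist = defaultdict(float)
    for out, prob in paths.values():
        dist[out] += prob
    return dict(dist)

dist = approximate_distribution()
# once all four paths have been sampled, the estimate equals the
# exact distribution: P(0)=0.21, P(1)=0.49, P(2)=0.09, P(3)=0.21
```

The worst case mentioned above shows up here directly: if the paths were equally likely and each produced a unique output, the estimate would be missing mass until every path had been sampled at least once.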
We haven't done a direct comparison to MCMC approaches yet, but it's on the to-do list. My intuition is that MCMC will win out for problems where finding "just any" local optimum is not good enough.