I think the differentiable Forth example in the article is interesting in this context: it is a differentiable program with gaps, and it relies on the universal approximation property of a neural network to fill them. When your code is differentiable, you can embed ML models in it, for example to learn one part of an equation when you already know most of it and the full search space would otherwise be too large. You might even have, as another commenter said, a compiler smart enough to rewrite the AST to reduce convergence problems (which seem to be the main difficulty with such models). Or you could download libraries of pretrained models/architectures, the same way you pull in any regular library, and embed them in larger systems.
Though I honestly can't tell whether any of those are actually valid pursuits or whether I'm misunderstanding the possibilities.
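
To make the "learn part of an equation when you already know most of it" idea a bit more concrete, here is a toy sketch in plain Python. Everything in it is my own assumption for illustration, not what the article does: the `Dual` class is a tiny hand-rolled forward-mode autodiff, the "known" part of the equation is `2*x`, and `gap` is a three-parameter polynomial standing in for the neural network that would fill the hole in a real system.

```python
class Dual:
    """A value paired with its derivative w.r.t. one chosen parameter."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def gap(x, w):
    # Hypothetical learnable piece; a real system might put a neural net here.
    return w[0] + w[1] * x + w[2] * x * x


def program(x, w):
    # The part of the equation we already know (2*x) plus the gap to learn.
    return 2.0 * x + gap(x, w)


def loss(w, data):
    # Mean squared error of the whole program against observed data.
    total = Dual(0.0)
    for x, y in data:
        err = program(x, w) - y
        total = total + err * err
    return total * (1.0 / len(data))


def grad(w_vals, data):
    # One forward pass per parameter, seeding that parameter's derivative to 1.
    g = []
    for i in range(len(w_vals)):
        w = [Dual(v, 1.0 if j == i else 0.0) for j, v in enumerate(w_vals)]
        g.append(loss(w, data).dot)
    return g


def train(data, steps=500, lr=0.3):
    # Plain gradient descent on the gap's parameters only.
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        g = grad(w, data)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


# Synthetic observations: the hidden term the gap has to discover is 0.5*x^2.
xs = [i / 10 for i in range(-10, 11)]
data = [(x, 2.0 * x + 0.5 * x * x) for x in xs]

learned = train(data)
print(learned)
```

Because the gradient flows through the whole of `program`, only the missing term gets fitted and the known `2*x` part is left alone, so `learned` should land close to `[0, 0, 0.5]`. Swapping the polynomial for a small network would not change the structure, just the number of parameters and how hard the convergence gets.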