> What's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?
The correct parsing of this is: "What's the benefit? [...] Is it [the benefit] that you can backprop through this computation? Do you do so?"
There are no details about training, nor about the (almost certainly novel) loss function that would be needed to handle partial / imperfect outputs, so it is very hard to believe that any gradient-based training procedure was actually used to set the weight values here.
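To make the objection concrete: setting weights by gradients requires a differentiable loss over the model's outputs, which is exactly the piece that is never specified. A minimal sketch (entirely hypothetical, not the project's code) of what even the simplest such procedure looks like:

```python
# Hypothetical sketch: gradient descent needs a loss you can differentiate.
# Here a toy squared-error loss with a single weight w; a real system with
# partial / imperfect outputs would need something far more elaborate.

def loss(w, data):
    # toy squared-error loss over (input, target) pairs
    return sum((w * x - y) ** 2 for x, y in data)

def grad(w, data):
    # analytic gradient of the loss above with respect to w
    return sum(2 * x * (w * x - y) for x, y in data)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # targets are y = 2x
w = 0.0
for _ in range(100):
    w -= 0.01 * grad(w, data)  # plain gradient descent

print(round(w, 3))  # converges toward 2.0
```

Absent a stated loss and update rule at roughly this level of detail, "backprop through this computation" remains an unsupported claim.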