I've been thinking along somewhat similar lines, but from the perspective of introduction and elimination rules, as found in type theory. If your compiler can find a transformation that brings an intro form and an elim form together, you can apply the computation rule to remove both from the program. My motivation is to remove allocations that aren't really essential. (I do have a functional language compiler that I work on, but it doesn't implement any optimizations yet.)
Another related idea comes from a Philip Wadler talk[1] I saw, which discusses how normalization can be used to achieve something like your "hypothetical inverses", which should always be eliminated from the final result.
> These transformations might only be useful in obvious optimization scenarios (cases that no programmer would intentionally create).
I would be surprised if that turned out to be the case.
[1] "Everything Old is New Again: Quoted Domain Specific Languages" https://www.youtube.com/watch?v=DlBwJ4rvz5c
Always curious to follow people doing compiler work; loop me in, yeah?
And, just watched that Wadler talk; it's good stuff. Of course, it sounds like his scenario is like: "there is a way to do the transformation, you just have to find it," and in my case, it's more like: "there might or might not be an algebra in which f acts as a homomorphism, and, even if there is, you may not have the transformations you need to determine it!"
If f is a homomorphism with respect to r, then reduce(r, map(f, l)) = f(reduce(r, l)).
The rule can be proven without reference to an inverse for f, so isn't it enough to just add such a rewrite rule to your optimizer? (Now I'm not sure what the inversion trick accomplishes.)
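As a minimal sketch of that rewrite rule (using the incr/max pairing as the example homomorphism, since max(incr(x), incr(y)) == incr(max(x, y))), both sides of the equation can be checked directly:

```python
from functools import reduce

# Hypothetical example: incr is a homomorphism with respect to max,
# i.e. max(incr(x), incr(y)) == incr(max(x, y)).
def incr(x):
    return x + 1

xs = [3, 1, 4, 1, 5]

# Original form: map first, then reduce.
lhs = reduce(max, map(incr, xs))

# Rewritten form: reduce first, then apply f once.
rhs = incr(reduce(max, xs))

assert lhs == rhs  # both are 6
```

The rewritten form applies incr once instead of once per element, which is the payoff an optimizer would be after.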
> Always curious to follow people doing compiler work; loop me in, yeah?
norstrulde.org and twitter.com/ericbb
> "there might or might not be an algebra in which f acts as a homomorphism, and, even if there is, you may not have the transformations you need to determine it!"
To my way of interpreting it, the algebra is determined by r, which is given. You might not know whether f is a homomorphism with respect to r, and it would be tricky to figure that out by inspecting the definitions of f and r. But if you have a table built in advance that characterizes functions in relation to each other (incr is a homomorphism with respect to max, etc.), then you can drive your rewrite engine using that table... So I wouldn't say that you want to use transformations to determine whether f is a homomorphism, but rather that you want to use the fact that f is a homomorphism to decide which transformations are applicable. Does that make sense?
Edit: I guess, more generally, you could have a relation with the form homomorphism(f:a->b, q:a->a->a, r:b->b->b) which justifies the transformation reduce(r, map(f, l)) -> f(reduce(q, l)).
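A table-driven version of that relation could be sketched like this (the table contents and the `optimize_reduce_map` helper are hypothetical, just to show the shape of the idea):

```python
from functools import reduce

# Table built in advance: an entry (f, q, r) asserts that f is a
# homomorphism from the algebra (a, q) to (b, r), i.e.
#   f(q(x, y)) == r(f(x), f(y)),
# which justifies the rewrite reduce(r, map(f, l)) -> f(reduce(q, l)).
def double(x): return x * 2
def add(x, y): return x + y

HOMOMORPHISMS = [
    (double, add, add),  # double(x + y) == double(x) + double(y)
]

def optimize_reduce_map(r, f, l):
    """Apply the rewrite when the table licenses it."""
    for (tf, tq, tr) in HOMOMORPHISMS:
        if tf is f and tr is r:
            return tf(reduce(tq, l))    # rewritten form: one call to f
    return reduce(r, map(f, l))         # fall back to the literal form

xs = [1, 2, 3]
assert optimize_reduce_map(add, double, xs) == reduce(add, map(double, xs))
```

Here the table plays exactly the role described above: it doesn't discover that f is a homomorphism, it records that fact so the engine knows the transformation is applicable.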
If one calculates some reduction over a data structure, and the reduction is a group operator (so it has an inverse, like addition), and the data structure is then updated, it may be cheaper to update the result through the inverse than to recalculate the whole reduction.
For example, if we are calculating a sum over a list and we take an element out, it's easier to just subtract that element from the sum than to calculate the sum again.
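Concretely, the inverse turns a full re-reduction into a constant-time update:

```python
# The reduction operator here forms a group (addition, with
# subtraction as its inverse), so removing an element can update the
# cached result in O(1) instead of re-reducing the whole list.
xs = [10, 20, 30, 40]
total = sum(xs)            # 100, computed once

removed = xs.pop(1)        # take 20 out of the list
total -= removed           # update via the inverse: 80

assert total == sum(xs)    # same as recomputing from scratch
```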
Anyway, all in all, I think compiler knowledge of function inverses (and perhaps also of isomorphisms between types) will become very handy in the future and will enable a huge number of optimizations.
CouchDB had an interesting approach for this that did not require inverses: they persist intermediate reduce results, so that you can reuse nodes in the tree that haven't changed. So clever. http://horicky.blogspot.com/2008/10/couchdb-implementation.h...
Let me define only "what" to compute, instead of spending all my time worrying about "how" and "when". Isn't that the point of declarative languages like SQL?
You could probably generalize that condition to situations where f can be lifted over the output of r so you don't have to require that r's output type is the same as its input.
I am curious about the idea that the inputs and outputs of r may not need to be of the same type, and I want to investigate further (help is appreciated!). Since r is given to reduce(), at least one of its inputs must match its output, depending on how you define reduce(). Haskell's foldl/foldr permit the non-accumulator side to differ in type from the output, but the output must match the accumulator side. I think a similar transformation may apply to folds, but I haven't thought about it enough.
h(fold(g, w, l)) = fold(f, h(w), l)
if
h(g(x, y)) = f(x, h(y))
This is exciting because (1) it suggests the equivalent transformation for sided reduces (folds), and (2) fold is a more general operator than the others. My somewhat silly use of "hypothetical inverses" effectively just creates a goalpost: (1) constructively find the function f, given g (or vice versa), and (2) prove the necessary condition.
Very exciting.
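That fold law can be checked on a concrete instance. Taking h = double and g = add, we have h(g(x, y)) = 2*(x + y) = f(x, h(y)) with f(x, z) = 2*x + z, so the condition holds (the `foldr` helper below is a hand-rolled right fold, since Python's `functools.reduce` is a left fold):

```python
from functools import reduce

def foldr(g, w, l):
    # Right fold: g(x1, g(x2, ... g(xn, w))).
    return reduce(lambda acc, x: g(x, acc), reversed(l), w)

# With h(g(x, y)) == f(x, h(y)), the law gives
#   h(foldr(g, w, l)) == foldr(f, h(w), l)
def h(y): return 2 * y
def g(x, y): return x + y
def f(x, z): return 2 * x + z

xs = [1, 2, 3]
assert h(foldr(g, 0, xs)) == foldr(f, h(0), xs)  # both are 12
```

The left side doubles the finished sum; the right side folds with a doubling step from the start, and the two agree because the condition pushes h through each application of g.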
That example is pretty easy: you apply a pattern match to the tree which looks for the pattern (sum (reverse ?var)) and rewrites it to (sum ?var).
In Common Lisp, we would use define-compiler-macro on the sum function which looks for the argument form being (reverse whatever), in which case the macro returns the form (sum whatever).
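A toy version of that tree rewrite (using nested tuples as a stand-in for the s-expression tree, rather than an actual compiler macro) might look like:

```python
# Toy s-expression rewriter for the pattern
#   (sum (reverse ?var)) -> (sum ?var)
# Expressions are nested tuples like ('sum', ('reverse', 'x')).
def rewrite(expr):
    if isinstance(expr, tuple) and expr:
        expr = tuple(rewrite(e) for e in expr)  # rewrite subtrees first
        if (len(expr) == 2 and expr[0] == 'sum'
                and isinstance(expr[1], tuple)
                and len(expr[1]) == 2 and expr[1][0] == 'reverse'):
            return ('sum', expr[1][1])          # drop the reverse call
    return expr

assert rewrite(('sum', ('reverse', 'x'))) == ('sum', 'x')
assert rewrite(('max', ('reverse', 'x'))) == ('max', ('reverse', 'x'))
```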
Not so fast, though; the order of summing matters in floating point! And over integers too, in the presence of overflow (in junk languages without bignums). That is to say, if we add together three or more integers, the arithmetic result can be in range of the type, but some orders might overflow whereas others do not.
I don't think this is true. Modular addition is commutative and associative. As long as the behavior of overflow is modular (as it is in two's complement), order shouldn't matter. And if the behavior of overflow is to trap, then it should trap regardless of the order in which you add the numbers. I guess if you care about getting the same stack trace even under optimizations, then you could argue this isn't a safe optimization. And if the behavior of overflow is undefined, it's undefined whether you reverse the list or not.
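The modular case is easy to demonstrate by simulating 32-bit wraparound (the choice of operands below is just illustrative; a + b overflows a signed 32-bit range while a + c does not):

```python
# Simulated 32-bit modular addition: with wraparound semantics, the
# sum is order-independent even when intermediates overflow.
M = 2 ** 32
def add32(x, y):
    return (x + y) % M

a, b, c = 2**31 - 1, 1, -1       # a + b overflows signed 32-bit
left  = add32(add32(a, b), c % M)
right = add32(add32(a, c % M), b)  # a + c stays in range
assert left == right == (a + b + c) % M
```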
Maybe you're thinking of Javascript, which not only doesn't have bignums, but uses floating point even for integers?
How come? Suppose A + B doesn't trap, and if T0 = A + B, then T0 + C doesn't trap. Why should (A + B) + C trap?
Bigger picture: I want to rewrite "sum(reverse(x))" without explicitly saying so. How do we optimize that expression using the definitions of sum and reverse? Note that in my second example I am inconsistent here: I expand the definition of reverse() but treat sum() as a primitive. That said, I'm pretty sure the example still works if you substitute the appropriate definition of sum as "reduce(add, x)"; the derivation is just longer.
You can endow functions with attributes (standard functions in the language library, and possibly user-defined ones). The reverse function can carry the attribute that it only changes the order of the elements of its argument. This is encoded as a boolean predicate: (order-identity-p x) is true if x is reverse, and also if x is certain other things, like the identity function, the sort function, and so on.
Then our optimizer looks for the pattern/rewrite (sum (f? arg?)) -> (sum arg?), subject to a semantic constraint: the pattern matches only if the predicate (order-identity-p f?) holds. This is effectively part of the match; if the predicate fails, there is no match.
Now the sum function could also be "patterned out". There could be a predicate which asserts that certain arguments of a function are sequences whose order doesn't matter to the outcome. This predicate could be list-valued: if a function has no such arguments, it returns nil/false; otherwise, it returns a list of the argument positions.
So then we're looking for the pattern (f? (g? arg?)) -> (f? arg?), which is very general and can be met in lots of ways. For instance, one of the rules that applies is that g? satisfies (order-identity-p g?), and (order-irrelevant-args f?) returns (1), where we note that (g? ...) is argument 1 of f?. For any argument expression which is an order-identity-p function call, if that argument's position is in the list returned by order-irrelevant-args on the enclosing function, we can remove the order-altering function from the argument.
We code all that up into the tree walking engine, and then just suitably declare attributes on functions.
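A sketch of that engine, with the attribute tables declared as plain data (the table contents here are hypothetical declarations, standing in for per-function attribute annotations):

```python
# Attribute-driven rewriting: the generic pattern
#   (f? (g? arg?)) -> (f? arg?)
# fires only when the declared semantic constraints hold.
ORDER_IDENTITY = {'reverse', 'identity', 'sort'}   # only permute/keep elements
ORDER_IRRELEVANT_ARGS = {'sum': [1], 'max': [1]}   # arg positions whose
                                                   # element order is irrelevant

def rewrite(expr):
    if isinstance(expr, tuple) and expr:
        expr = tuple(rewrite(e) for e in expr)     # rewrite subtrees first
        f = expr[0]
        for pos in ORDER_IRRELEVANT_ARGS.get(f, []):
            arg = expr[pos]
            if isinstance(arg, tuple) and arg and arg[0] in ORDER_IDENTITY:
                # order-altering call in an order-irrelevant position:
                # strip it, keeping its argument
                expr = expr[:pos] + (arg[1],) + expr[pos + 1:]
    return expr

assert rewrite(('sum', ('reverse', 'x'))) == ('sum', 'x')
assert rewrite(('sum', ('sort', 'x'))) == ('sum', 'x')
assert rewrite(('cons', ('reverse', 'x'))) == ('cons', ('reverse', 'x'))
```

The match logic is generic; only the two tables need to grow as more functions are declared.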
Anyway, I was a bit philosophically torn, since the content tends toward the mathematical/algorithmic, and it made me wonder: what does it mean to play around with mathematics/algorithms?