> It doesn't solve everything, but I guess the idea is, make it work right for the majority of cases ("sensible defaults") and then offer ways to deal with harder cases ("make simple things easy, hard things possible").
My contention is that while BPipe makes simple things easy and hard things possible, Drake makes both easy. I think I've made some points in that regard, and I gave you examples of Drake code that is just as easy to write as the corresponding BPipe code, without compromising on functionality. But to prove this conclusively, I'm looking forward to more BPipe examples. So far, I haven't seen anything that is simpler (or even shorter) in Bpipe.
> Not at all - if my pipeline has 15 stages then I have 15 commands to name. Those 15 stages might easily create hundreds of outputs though.
When I first read this, I thought it was a great point and that you were onto something. But as I thought about it more, I realized that it only seems this way.
Here's the thing: if you have 15 stages but hundreds of files, it can only mean one of two things:
1) The vast majority of those files are leaf files, that is - they are either inputs (with pre-determined names) or outputs whose names you (surprisingly) don't really care about. Drake can generate filenames for leaf output files with ease, as they don't affect the dependency graph.
2) The vast majority of those files are not leaves, in which case the steps either:
2a) pass dozens of inputs and outputs to each other, and you have to either give them identifiers (as described above, Drake can do it too) or use positions (unmanageable).
2b) even worse, have a big and complicated dependency graph with many more than 14 edges, in which case your syntax of { a + b + c } will almost certainly be inadequate to describe such a complex thing (15 vertices and several dozen edges).
So, any way you look at it, Drake can do the same thing in the same way or better. Am I missing something?
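To make the leaf / non-leaf distinction concrete, here is a rough Python sketch. It is not Drake or BPipe code; the step names and file names are made up for illustration. It only shows that once each step declares its inputs and outputs, you can tell which files are leaves (and so could be named automatically) and which ones actually shape the graph.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical workflow: step name -> (input files, output files).
STEPS: Dict[str, Tuple[List[str], List[str]]] = {
    "align":  (["reads.fq", "ref.fa"], ["aligned.bam"]),
    "sort":   (["aligned.bam"], ["sorted.bam"]),
    "report": (["sorted.bam"], ["report_1.txt", "report_2.txt"]),
}


def classify_files(
    steps: Dict[str, Tuple[List[str], List[str]]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split all files into (leaf inputs, intermediates, leaf outputs)."""
    consumed: Set[str] = set()
    produced: Set[str] = set()
    for inputs, outputs in steps.values():
        consumed.update(inputs)
        produced.update(outputs)
    leaf_inputs = consumed - produced    # given to the workflow from outside
    intermediates = consumed & produced  # these define the dependency edges
    leaf_outputs = produced - consumed   # nothing downstream depends on them
    return leaf_inputs, intermediates, leaf_outputs


leaf_inputs, intermediates, leaf_outputs = classify_files(STEPS)
```

In this toy case only `aligned.bam` and `sorted.bam` need names that other steps refer to. The two report files are leaf outputs, so their names do not affect the graph.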
> Bpipe isn't just not trying to build a graph up front, it really doesn't think there is a graph at all! At least, not an interesting one. The "graph" is a runtime product of the pipeline's execution.
I don't understand this, and I'm afraid it doesn't work this way. You can't have the graph as a runtime product of the execution (i.e. available only after the execution), because that cripples your ability to do partial evaluation of targets. That is, you have to have the dependency graph before you can even answer the question "is target A up-to-date?". If you need to run the workflow to arrive at a conclusion, there's no guarantee how much time it will take. I also believe it unnecessarily blurs the distinction between the commands and the workflow. If your code needs to care about its dependencies, it can't be used out of context. So, maybe an example?
But if all you need to do is re-run everything every time, then it means you're really doing something trivial, and it also raises the question of why we need a tool like BPipe in the first place.
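Here is what I mean by needing the graph to answer "is target A up-to-date?", as a minimal Python sketch. This is not how Drake is actually implemented; the graph, the target names and the integer timestamps are all assumptions made for the example. The point is only that the check walks a graph that is known before anything is run.

```python
from typing import Dict, List

# Hypothetical static graph: target -> the targets it is built from.
DEPS: Dict[str, List[str]] = {
    "B": ["A"],
    "C": ["B"],
    "D": ["A"],
}


def is_up_to_date(target: str,
                  deps: Dict[str, List[str]],
                  mtimes: Dict[str, int]) -> bool:
    """A target is up to date if it exists, every dependency is itself
    up to date, and no dependency is newer than the target."""
    if target not in mtimes:  # the file does not exist yet
        return False
    for dep in deps.get(target, []):
        if not is_up_to_date(dep, deps, mtimes):
            return False
        if mtimes[dep] > mtimes[target]:
            return False
    return True


# Made-up modification times: D is older than its input A.
MTIMES = {"A": 1, "B": 2, "C": 3, "D": 0}
c_ok = is_up_to_date("C", DEPS, MTIMES)
d_ok = is_up_to_date("D", DEPS, MTIMES)
```

Nothing here executes a single command of the workflow, and that is exactly what you lose if the graph only exists after the run.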
> An individual pipeline stage can use if / then logic at runtime to decide whether to use a certain input or a different input and that will change the dependency graph.
I don't see how it could work this way. Could you please give me an example along with the explanation of how BPipe will handle it on the control level?
> You have to go back and ask why you care about having the graph up front in the first place, and in fact it turns out you can get nearly everything you want without it.
I'm confused. I think nothing could be further from the truth. The dependency graph specifies which steps depend on which steps. If you don't know it, you don't even know how to start evaluating the workflow, because you don't know which step to build first. I don't understand this statement at all. Could you please elaborate or give me an example?
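To illustrate "which step to build first", here is a small Python sketch using the standard library's `graphlib` (Python 3.9+). Again, this is not Drake's or BPipe's code, and the step names are placeholders. It just shows that a build order falls out of the graph directly, before any step has run.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical graph: step -> the steps it depends on.
STEP_DEPS: Dict[str, Set[str]] = {
    "B": {"A"},
    "C": {"B"},
    "D": {"A"},
}


def build_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return the steps in an order where every step comes after
    all of its dependencies (raises CycleError on a cycle)."""
    return list(TopologicalSorter(deps).static_order())


order = build_order(STEP_DEPS)
```

With this graph, A always comes first and C always comes after B. The relative order of B and D is left open, which is also where a tool could run steps in parallel.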
> By not having the graph you lose some ability to do static analysis on the pipeline, but to have it you are giving up dynamic flexibility.
I need to see an example of this.
> I can't argue with that - but that's sort of the idea: simple things easy, hard things possible. Complicated cases are complicated with every tool.
I don't think having 3 inputs is a very complicated case. And neither is having any dependency graph which is not a linear step1, step2, step3. My point is as soon as you get any of those, BPipe starts to slowly evolve into Drake, with some very weird syntax and inconsistencies (like having "implicit" dependencies in steps' implementations but having to also specify some or all of the dependencies in the "run" statement).
It's possible that I'm misunderstanding BPipe. Maybe some more examples would fix this.
> I guess I'd have to disagree with this, as I really think there are some fundamental differences in approach that go well beyond syntactic sugar.
I don't really see them. And you can't just disagree, you have to provide arguments. :) I understand you can see it differently, but it seems like so far, there could be a Drake workflow for every BPipe example, which uses the same ideas and is equally easy to write (but not necessarily the reverse). This means it all comes down to syntax, no?
Again, I might be misunderstanding BPipe.
I think it's really, really hard to argue about abstract concepts. I would very much appreciate some examples. It doesn't even have to be your favorite workflow. Just give me anything. Write something and ask - "how would you put it in Drake?". I think my response would make it clear whether the differences are syntactic or philosophical. We've already established that there are some things BPipe cannot do as well as Drake can. I'd like to see that the reverse is true as well, because in that case we can really identify philosophical differences. But if it's the opposite - i.e. Drake can do everything BPipe can with the same ease - then it's not a question of philosophy any more but of design.
I'm not trying to attack BPipe. I just want to make the best tool possible, and if we make compromises, I want to make sure they are informed. We must consciously choose some things not to be as easy or possible in Drake for some other greater good. So far, I can't identify any of those things.
Show me. :)
Artem.
P.S. You don't have to give a real world example. I think that would actually restrain you unnecessarily and slow you down. Just demonstrate a basic concept, a feature, name your steps A, B, C - I don't care what they do. Only if it's something extremely exotic might I ask if there's a real world use-case for it, but I think I can come up with use-cases for pretty much anything. :)
P.P.S. Please include what you do to run the workflow in your examples. I suspect I might have misconceptions about what the "run" statement does and how Bpipe resolves dependencies.
P.P.P.S. I appreciate the dialog as well. Especially since BPipe is your 8th tool. I would like Drake to be your 9th, and better than anything you've used before, including Bpipe.