I believe https://dagger.io checks all these manifesto boxes and more. At least that’s where I’m focusing my attention.
I added it to a side project just to get familiar with it, and it added quite a few SDK files and folders to my project, plus lots of decorators. It also required Docker and yadda yadda yadda.
I just could not justify using it compared to running a regular TypeScript file with Bun (or, in a different project, `go run cmd/ci/main.go`).
The problem that Dagger and similar efforts solve is pipelines at scale, whether that's a sea of microservices maintained by an armada of teams (which never work the same way) or a massive pipeline that should be decomposed into more atomic pipelines feeding into one.
I believe the latter is a big productivity hurdle even without org-scale. My release pipeline runs for 25 minutes with a team of fewer than five, because it's multi-staged (testing pyramid) and includes end-to-end tests. I love my pipeline because a successful run makes me feel safe releasing my software.
However, god forbid it fails with a non-obvious error 20 minutes into execution. Lack of portability (hi, GHA vendor lock-in) and reproducibility (local runs are impossible) makes this feedback loop hell.
Now, wiseguys might tell me that pipelines shouldn't run for multiple minutes and should only run unit tests, blah blah. That's divorced from reality. This sentiment won't solve automation problems and won't optimize for velocity; it merely throws them over the fence to somebody else. And if you have "the luxury" of a QE/QA/release team, I feel bad for them.
So the question to ask yourself is: how do I know I have outgrown `go run cmd/ci/main.go`?
All the tools do their own dependency tracking already (unfortunately).
As for the advantage: a Makefile will definitely perform both `go build` and `docker push`, rather than just (say) `docker push`, an ever-present risk if you have to rely on your fingers to type these things in, or on your eyes to check that you recalled the right command from the history. It will also explicitly tell you the build failed, rather than relying on you to run `echo $?` or on the tools to have some obvious output in the error case.
A shell script is also an option. Makefiles have some helpful extra features: by default, commands are echoed; by default, the build fails if any command exits with a non-zero exit code (please consult your local Unix person for the details); a Makefile inherently has multiple entry points; and a Makefile can also be easier to get working on Windows than a shell script, though if you can demand that people run it from Git Bash then maybe there's not much in it.
If you're still not convinced: that's fine! This is not a sales pitch.
(I've more recently switched to using a do-everything Python script for this stuff, which is more annoying when it comes to invoking commands but has obvious advantages in how easy it is to add extra logic, checks, some UI creature comforts, and so on.)
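To make the comparison concrete, here is a minimal sketch of such a do-everything script, written in Go in the spirit of the `go run cmd/ci/main.go` idea above (all target names and commands are hypothetical). It keeps the Makefile niceties just listed: commands are echoed, any non-zero exit fails the build, and there are multiple entry points:

```go
// cmd/ci/main.go — a hypothetical sketch, not anyone's real pipeline.
package main

import (
	"log"
	"os"
	"os/exec"
)

// run echoes the command (as make does by default) and aborts on a
// non-zero exit code (as make does, or `set -e` in a shell script).
func run(name string, args ...string) {
	log.Println("+", name, args)
	cmd := exec.Command(name, args...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatalf("%s failed: %v", name, err)
	}
}

func main() {
	// Multiple entry points, like Makefile targets.
	targets := map[string]func(){
		"test":  func() { run("go", "test", "./...") },
		"build": func() { run("go", "build", "./...") },
		"push":  func() { run("docker", "push", "example.com/app:dev") },
	}
	if len(os.Args) < 2 || targets[os.Args[1]] == nil {
		log.Fatal("usage: go run cmd/ci/main.go {test|build|push}")
	}
	targets[os.Args[1]]()
}
```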
> Why have they gone out of style?
Because no modern toolchain uses make. Its syntax is so arcane that it's been replaced by various tools designed for specific stacks. Otherwise, more generic build systems use modern languages or markup.
I would gladly hear this argument expanded. It's really not obvious to me that that's the case.
Suppose I give you functions f and g of respective types `int -> str` and `Nothing -> str`. Can you compose them? No, and you see this immediately from the types. Types make reasoning about composability a lot easier.
Of course, it's not a panacea, and it's less helpful the more side effects a function has. Can we compose pure `int -> int` functions? Of course! Can we compose two of them where the second expects some image to exist in some docker registry? You'll need to read the first one's code to be able to tell.
Given the highly side-effectful nature of pipelines, I'd think the applicability of types would be limited. But maybe that's just a lack of imagination on my part.
Certainly, information like "this pipeline expects these variables" and "this pipeline sets these variables" is amenable to a typed approach, and it would make things easier. By how much, I don't know.
Firstly, you want to ensure your functions are pure with respect to input. That is to say, they might reference a configuration or context object that is passed to them as an argument, but they'll never reference some global object/variable.
So then the docker image inside some docker registry? Both the image and the registry are values in the config/context argument at the least. Maybe they're their own separate arguments depending on whether you prefer a single big object argument or a bunch of smaller more primitive arguments.
So then the pure function that expects the docker image to exist in some registry is no longer `Int -> Int`. It's now `String -> String -> Int -> Int` because it needs a registry and an image. Maybe it's even `String -> String -> String -> String -> Int -> Int`
because there's a username and password required to access the registry. Icky, but if we make a type like

    data Registry = Registry
      { user     :: String
      , password :: String
      , url      :: String
      }

that becomes `Registry -> String -> Int -> Int`.
But we could make it better by doing something like

    data Foo = Foo
      { reg   :: Registry
      , image :: String
      }

and now the function can be `Foo -> Int -> Int`.
This doesn't fix the image not actually existing in the registry, but at least now we know that the two functions aren't composable, and when it fails because the image doesn't exist we can hopefully trace through to see which caller is giving it incorrect data. (PS: sorry if I got the Haskell typing wrong. I don't know Haskell, so this is the result of what I could cobble together from googling Haskell type syntax.)
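For what it's worth, the same shape is straightforward in Go too; a sketch with made-up type and field names:

```go
package pipeline

// Registry bundles the credentials and location a step needs, so the
// dependency shows up in the signature instead of being ambient state.
type Registry struct {
	User     string
	Password string
	URL      string
}

// Foo pairs a registry with the image the step expects to find there.
type Foo struct {
	Reg   Registry
	Image string // expected to already exist in Reg
}

// step is the Foo -> Int -> Int function from above. The types still can't
// prove the image exists, but a plain func(int) int no longer fits where a
// func(Foo, int) int is required, so the mismatch shows up at compile time.
func step(f Foo, n int) int {
	// a real step would pull f.Image from f.Reg.URL here
	return n
}
```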
And, tellingly, it seems they still haven't provided a "why not ${other tool}" anywhere that I can readily spot.
Admittedly most of my criticism is related to the choice of Go as an implementation language: more than 80% of the code volume is error handling boilerplate!
Before the lovers of Go start making the usual arguments, consider that in a high-level pipeline script every step is expected to fail in novel and interesting ways! This isn't "normal code", where fallible external I/O interactions are few and far between and error-handling overhead is amortised over many lines of logic. Instead the code becomes all error handling with logic… in there… somewhere. Good luck even spotting it.
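To illustrate the point, a hedged sketch (not glu's actual code; the helper functions are hypothetical stand-ins):

```go
// Every pipeline step is fallible, so each call gets its own check: the
// "logic" is three steps, buried in error plumbing.
package main

import (
	"errors"
	"fmt"
)

// Hypothetical steps; in a real pipeline each wraps external I/O that can
// fail in novel and interesting ways.
func buildImage() (string, error) { return "app:dev", nil }
func pushImage(img string) error  { return nil }
func deploy(img string) error     { return errors.New("cluster unreachable") }

func release() error {
	img, err := buildImage()
	if err != nil {
		return fmt.Errorf("build: %w", err)
	}
	if err := pushImage(img); err != nil {
		return fmt.Errorf("push: %w", err)
	}
	if err := deploy(img); err != nil {
		return fmt.Errorf("deploy: %w", err)
	}
	return nil
}

func main() {
	if err := release(); err != nil {
		fmt.Println("release failed:", err)
	}
}
```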
Second, I don’t see the benefit of glu (specifically) over established IaC systems such as Pulumi — which is polyglot and allows the use of languages that aren’t mostly repetitive error handling ceremony.
This seems like an internally developed tool that suits the purposes of a single org “thrown over the fence” in the hope that the open source community will contribute to their private tool.
So are they talking about some sort of meta-language that compiles into multiple YAML configs for the different environments, or a single separate CI tool that has plugins and integrates with GitHub/GitLab/etc.?
I do agree with them about the need for a real programming language. I hate the YAML in GitLab's config; it is very hard to tell how it will be interpreted. Things were much easier when I was scripting Jenkins, even though I didn't know or like Groovy, than they are with GitLab.
Said kindly: no, they're not. They're just stating values here, imho, not implementation detail.
I've worked in this space for a long time and can't make head or tail of what glu is.
A motivating example would help, though I might have missed one?
I look forward to seeing some matrix evaluation of implementation strategies against these values.
> The Fix: Use a full modern programming language, with its existing testing frameworks and tooling.
I was reading the article and thinking to myself, "a lot of this is fixed if the pipeline is just a Python script." And really, if I were to start building a new CI/CD tool today, the "user facing" portion would be a Python library containing helper functions for interfacing with the larger CI/CD system. Not because I like Python (I'd rather Ruby) but because it is ubiquitous and completely sufficient for describing a CI/CD pipeline.
I'm firmly of the opinion that once we start implementing "the power of real code: loops, conditionals, runtime logic, standard libraries, and more" in YAML, then YAML was the wrong choice. I absolutely despise Ansible for the same reason and wish I could still write Chef cookbooks.
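For illustration, here's the kind of thing that is painful as YAML templating but trivial in a real language: a build matrix as a plain loop. (A Go sketch with hypothetical targets; the same holds for Python.)

```go
// Sketch: a build matrix as an ordinary nested loop, no templating required.
package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	for _, goos := range []string{"linux", "darwin"} {
		for _, arch := range []string{"amd64", "arm64"} {
			cmd := exec.Command("go", "build", "./...")
			cmd.Env = append(os.Environ(), "GOOS="+goos, "GOARCH="+arch)
			cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
			if err := cmd.Run(); err != nil {
				log.Fatalf("build %s/%s failed: %v", goos, arch, err)
			}
		}
	}
}
```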
It also serves as a natural sandbox for the "setup" part, so we always know that the script is interpreted in a finite (and short) amount of time and that no weird stuff can ever happen.
Of course, there are ways to combine the two (e.g. GitLab can generate and then trigger downstream pipelines from within the running CI), but the default is the static config. It also has the side effect that pipeline setup can't ever do stuff that cannot be debugged (because it runs _before_ the pipeline). But I concede that this is not that clear-cut; both approaches have advantages.
My argument is that we should acknowledge that any CI/CD system intended for wide usage will eventually arrive here, and it's better that we go into that intentionally rather than accidentally.