Bazel – Correct, reproducible, fast builds for everyone

(bazel.io)

625 points · drivebyubnt · 11y ago · 174 comments

113 comments · 30 top-level

habosa11y ago· 15 in thread

Working at Google, Blaze is one of the technologies that amazes me most. Any engineer can build any Google product from source on any machine just by invoking a Blaze command. I may not want to build GMail from source (could take a while) but it's awesome to know that I can.

I think this could be hugely useful to very large open source projects (like databases or operating systems) that may be intimidating for contributors to build and test.

emu11y ago

Standard caveat: I don't speak for my employer.

Using Bazel (aka. Blaze) every day is one of the things that has made me dread ever leaving Google. Fast, reproducible builds are amazing. Once you have used this tool, it is very hard to go back. Personally, I'm thrilled that it has been open sourced.

GauntletWizard11y ago

Having recently left Google, GRPC (stubby) was my biggest concern; I spent about two weeks hacking together a good code generator for GoRPC before GRPC came out and obviated the time. Now, I'm glad I haven't bothered with a build system, which was going to be next.

Nice to see a bunch of projects that've been generalizable and heavily used internally finally see the light of the outside world. Now, to start evangelizing them.

nevir11y ago

Well, that's also largely due to all the source (transitive dependencies) being present in one monolithic repo.

woah11y ago

Yea what's up with that? Sounds like a pretty terrible practice to me. Is that something that c++ forces you to do?

malkia11y ago

I've started at Google 4 months ago, and it's one of the best things to discover. Now open-sourced :)

philippnagel11y ago

Though not really open source ;)

OneMoreGoogler11y ago

> Any engineer can build any Google product from source on any machine

A little too optimistic :) You can't build Android, Chrome, ChromeOS, iOS apps, etc. via blaze.

ovidiup11y ago

When I worked at Google I built a Blaze extension to be able to build Android apps. It worked really well, though I'm not sure how well it was maintained after I left in 2010. Internally at Google, Blaze was extremely customizable, and I hope Bazel too, so one can easily add support for building iOS apps etc.

EDIT #1: I see support for building Objective-C apps is already present in Bazel. EDIT #2: Bazel uses Skylark, a Python-like language, which could be used to implement all sorts of extensions, including the one I was referring to.

rpereira11y ago

http://i.imgur.com/zQ1m75U.gif

ambrop711y ago

Look at Nix & NixOS: http://nixos.org/ It would be interesting to see a comparison of Bazel/Blaze to Nix.

solve11y ago

Wow, sounds like enterprise Gentoo, in a good way.

thrownaway242411y ago

Yes, Blaze and a hojillion computers will give you a spiffy build system. The public now has the former, but not the latter :)

skj11y ago

One piece at a time! Also, who is to say that Google's way of orchestrating those hojillion computers is best? Separating the two pieces, as has been done here, makes it possible for others to create different (and maybe better) orchestrations.

middleclick11y ago

So the builds are reproducible automatically?

hanwenn11y ago

See http://bazel.io/docs/FAQ.html, "Will Bazel make my builds reproducible automatically?

For Java and C++ binaries, yes, assuming you do not change the toolchain. If you have build steps that involve custom recipes (eg. executing binaries through a shell script inside a rule), you will need to take some extra care:

Do not use dependencies that were not declared. Sandboxed execution (--spawn_strategy=sandboxed, only on Linux) can help find undeclared dependencies.

Avoid storing timestamps in generated files. ZIP files and other archives are especially prone to this.

Avoid connecting to the network. Sandboxed execution can help here too.

Avoid processes that use random numbers, in particular, dictionary traversal is randomized in many programming languages."
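To illustrate the last two points from the FAQ, here is a minimal sketch of a build step that writes a file list without depending on traversal order or on the clock. The function name `write_manifest` and the file names are made up for the example; nothing here is taken from Bazel itself.

```python
import io
from typing import Iterable


def write_manifest(entries: Iterable[str]) -> bytes:
    """Serialize file names, one per line, in a stable order."""
    out = io.StringIO()
    # Iterating a set of strings directly can yield a different order in
    # each interpreter run (string hashing is randomized by default), so
    # sort before writing anything to the output.
    for name in sorted(entries):
        out.write(name + "\n")
    # No timestamp or hostname is written, so identical inputs give
    # byte-identical output.
    return out.getvalue().encode("utf-8")


first = write_manifest({"b.txt", "a.txt", "c.txt"})
second = write_manifest({"c.txt", "b.txt", "a.txt"})
```

Both calls return the same bytes, which is the property a cache keyed on outputs would need.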

latkin11y ago· 15 in thread

Correct, reproducible, fast builds for everyone not running Windows

jacquesm11y ago

Convince your employer to ship a half decent unix environment with its OS and it will run on windows too. It's mostly a choice by microsoft to ship a half-baked command line interface with its products, you can't blame google for that.

NeutronBoy11y ago

By 'half-baked' you mean 'not *nix compatible' command line. Powershell is amazing.

davidgerard11y ago

How's it go with Cygwin or mingw?

izacus11y ago

Not really sure why this is getting downvoted - it's kind of an important detail when choosing a build system if you have to do multiplatform deployment.

melling11y ago

I didn't downvote him but it sounds like "he's driving angry". He does work for that "evil" company in Redmond. :-). Maybe he can help make F# a great cross-platform language and increase the goodwill? Microsoft did an incredible job with F# but it really only runs well on Windows.

istvan__11y ago

I am pretty sure it has nothing to do with the average population being biased.

krschultz11y ago

s/not running Windows/not running Windows or refusing to install a free VM/

5 or 6 years ago I had to have Windows to run CAD software, but I found it easier to have a virtualbox install w/ Ubuntu in it for software development than trying to write code on Windows. The performance was good enough and the usability was pretty good. I imagine it has only gotten better since then.

jacquesm11y ago

One company I work with has a firewall of a certain brand. It only works with Windows or a Mac; you can't connect to it from a command line, it needs some stupid app that you download and install just to make a VPN connection.

Not having a unix specific build system work on windows seems to be pretty much the expected behaviour. As opposed to a firewall that runs linux internally requiring windows or OS/X to talk to it...

CurtHagenlocher11y ago

Using a VM would only work if the individual build tools themselves run under Unix. That said, no open source project owes anybody anything.

sangnoir11y ago

...Yet. It is open source: you are welcome to port it to Windows and I'm sure they would be happy to accept your patch.

EricBurnett11y ago

https://news.ycombinator.com/item?id=9257187

If you've got the expertise to add it, sounds like it would be welcomed.

mellery45111y ago

a good point - which is why I will be sticking with CMake for my cross-platform build needs.

cbhl11y ago

While running bazel isn't supported on Windows, you might be able to generate Windows binaries by cross-compiling from Linux.

spankalee11y ago

Or Plan 9, Haiku, OS/2, Amiga OS...

tkubacki11y ago

With newer Windows servers you can use HyperV to run *nix apps

rquirk11y ago· 8 in thread

What is this lameness? https://github.com/google/bazel/tree/master/third_party - why not use gradle repos to download jars with known hashes? Sticking all those jars in the git repo is just... well, I expected better from Google.

jdlshore11y ago

Try not to be so rude.

The FAQ is pretty clear about their reasons. It talks about tools, not other dependencies, but I'm sure the reasoning is the same: "Your project never works in isolation... To guarantee builds are reproducible even when we upgrade our workstations, we at Google check most of these tools into version control, including the toolchains and Bazel itself."

It's a sensible policy and one I use myself. Do you have a better reason for disliking this policy than a knee-jerk "yuck?"

rquirk11y ago

Right, I'll try not to be so grouchy :-D

Some reasons are the bloat, the possibility of "accidental" forks when a non-upstream version is compiled and checked-in binary-only, crufty old versions hanging around, and security problems. It adds extra work for downstream packagers having to pick it apart for distros.

Bundling gets particularly bloaty for git repos, since the history is always included in each clone. For perforce or SVN it doesn't matter so much as you only get the latest version of everything. In git each time there's a dependency update, it will pretty much add the size of the new jar to the .git directory. Over time it's going to grow huge. If at a later date the repository owner decides on a new policy where the third party files are not bundled, then even removing the directory from the current head doesn't shrink the repo size.

There are binaries in there for Mac, Linux and Windows (.exe file at least). You either need one or the other, not all at the same time.

This sort of thing is fine for proprietary software used in a controlled environment, but for open source it looks kludgy.

An alternative could be to have a "dependencies" repository that would be shallow-cloned as needed. At least that way the source code repo only would have source in it, not jars or executables. It'd ensure separation was enforced and you could still track requirements per version or change the policy later.

sarnowski11y ago

Tbh, if your company relies on this software, I would also make sure that it cannot just vanish - and that's the most effective solution. Artifacts can disappear from the internet and you don't know if the downloaded stuff is still the same as before. This is especially true if you look outside of the Maven ecosystem, but even there you have to rely on Apache and their partners. An outage can mean that you cannot deploy critical bugfixes to your platform.

adambatkin11y ago

This is why you should have a repository manager like JFrog Artifactory or Sonatype Nexus which can transparently proxy third-party repositories (like Maven Central).

pnathan11y ago

If you have a centralized version control system such as Clearcase or SVN, it's not such a grief to have binaries in VCS, whereas it's kind of a problem for git & co.

Google has a legendarily awesome centralized version control system.

krupan11y ago

"legendarily awesome centralized version control system"

I thought it was just perforce.

MichaelGG11y ago

Well keeping dependencies in source means no third party dependency at build time, right?

rquirk11y ago

Right... but this will be an anchor on adoption. I can see why e.g. the Android build system does the same thing since it's all off in its own world anyway. I doubt you'd be popular with Linux distro packagers if you required bazel for some C library.

zobzu11y ago· 7 in thread

I had a bit of a read but I didn't find where it explains (code or doc) how it achieves reproducible builds.

It seems like a stricter, huge make-like harness (in fact it reminds me of the mozilla firefox python build system a bit).

It's not bad by any means, but it seems to me that it doesn't "magically" fix the "be reproducible" problem at all (which is what it seems to claim).

Am I missing something?

lberki11y ago

You are absolutely correct: Bazel by itself does not make your builds reproducible. If a tool calls rand() or bakes the current time into its output, reproducibility goes out of the window.

What Bazel does, however, is to make it possible to run build steps in a sandbox (although the current one is kinda leaky) so that your build is isolated from the environment and thus behaves in the same way on any computer. It also tracks dependencies correctly so that it knows when a specific action needs to be re-run.

This makes it possible to diagnose non-reproducible build steps easily. At Google, the hit rate of our distributed build cache usually floats around 99%, and this would be impossible without reproducible build steps.

jrockway11y ago

Does work done by Debian to make Linux packages build reproducibly help Bazel?

https://wiki.debian.org/ReproducibleBuilds

Would Bazel help with the remaining long tail of packages in Debian?

yonran11y ago

Conceptually, your build results should be a pure function of your source tree. If I understand correctly, within Google, the cross-compilers are actually checked in to the source tree, so that the distributed jobs will use the same compiler to build your code. It seems like currently bazel only uses whatever is in /usr/bin though[0]. For Java compilations, bazel additionally has its own jar builder that sorts the filenames and zeros the timestamps within the zip file[1].

[0]: https://github.com/google/bazel/tree/master/tools/cpp [1]: https://github.com/google/bazel/tree/master/src/java_tools/b...
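The jar-builder idea described above (sort the file names, zero the timestamps) can be sketched with Python's standard `zipfile` module. This is only an illustration of the technique, not Bazel's actual jar builder; the fixed date, the permission bits and the name `build_archive` are assumptions made for the example.

```python
import io
import zipfile

# Fixed timestamp for every entry; 1980-01-01 is the earliest date the ZIP
# format can represent. The specific value is an assumption of this sketch.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def build_archive(files: dict) -> bytes:
    """Pack {name: bytes} into a ZIP whose bytes depend only on the content."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        # Sorted order, so the caller's insertion order does not matter.
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE_TIME)
            info.external_attr = 0o644 << 16  # same permissions every time
            archive.writestr(info, files[name])
    return buffer.getvalue()


one = build_archive({"B.class": b"bbb", "A.class": b"aaa"})
two = build_archive({"A.class": b"aaa", "B.class": b"bbb"})
names = zipfile.ZipFile(io.BytesIO(one)).namelist()
```

On a given platform `one` and `two` come out byte-identical, whereas an archive built from files on disk would normally pick up each file's modification time.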

nevir11y ago

You're right - it doesn't magically solve build reproducibility. Bazel pushes you towards a build configuration where you have to describe (in a terse way) the entire dependency graph of what is being built. It allows Bazel to be smart about where in the graph things are stale.

If you run a script that outputs intermediate files, Bazel needs to know about that script's inputs and outputs. And it works better if it knows them ahead of time.
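A rough sketch of what "being smart about where in the graph things are stale" can mean: given a fully declared dependency graph, walk from the changed files up to everything that depends on them. The graph, the target names and the function `stale_targets` are invented for the example and say nothing about how Bazel represents this internally.

```python
from collections import deque

# Hypothetical graph: each target maps to the things it depends on directly.
DEPS = {
    "app": ["lib", "gen_out"],
    "lib": ["lib.c", "lib.h"],
    "gen_out": ["gen.sh", "schema.txt"],
    "docs": ["README"],
}


def stale_targets(deps, changed):
    """Return every target that transitively depends on a changed file."""
    # Invert the edges so we can walk from a file to its dependents.
    dependents = {}
    for target, inputs in deps.items():
        for item in inputs:
            dependents.setdefault(item, []).append(target)

    stale = set()
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for parent in dependents.get(node, []):
            if parent not in stale:
                stale.add(parent)
                queue.append(parent)
    return stale


result = stale_targets(DEPS, ["schema.txt"])
```

Here only `gen_out` and `app` need rebuilding; `lib` and `docs` are untouched. If the script's inputs were not declared, the edge from `schema.txt` would be missing and the change would go unnoticed.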

indygreg211y ago

I invented the Python bits of the Firefox build system (moz.build files). I learned after I implemented them that Google's internal approach with Blaze was very similar. It felt reassuring that I independently reinvented a similar solution :)

There are a handful of Blaze derivatives built by Xooglers. Pants and Buck come to mind. They also share the trait of using sandboxed Python to define a build configuration. I'll take it over make syntax any day!

skybrian11y ago

It's not magic; you have to work at it. (For example, make sure that zip doesn't put timestamps in the file.) But it's designed so that code generators should act as pure functions from input files to output files, and many generators actually are, especially the built in ones. If you do this then the build system will help you.

Writing generators to run this way is kind of a pain, actually, sort of like writing code to run in a sandbox. Also, the generators themselves must be checked in, and often built from source. But we consider the results worth it.

drothlis11y ago

I gather that it runs builds inside a chroot where the only available files are the dependencies you specified explicitly (including the compiler[1]), at least in "strict" mode[2]. Or else it must monitor what files are opened during the build step and fail the build if it sees an unexpected file being opened.

It never explains any of this explicitly, but there are hints. [1], [2], [3].

[1] "Many rules also have additional attributes for rule-specific kinds of dependency, e.g. 'compiler'" -- http://bazel.io/docs/build-ref.html#types_of_dependencies

[2] http://bazel.io/docs/build-encyclopedia.html#cc_binary.hdrs_...

[3] "The build system runs tests in an isolated directory where only files listed as 'data' are available" -- http://bazel.io/docs/build-ref.html#data
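The "isolated directory where only declared files are available" idea can be approximated in a few lines. This is a toy sketch, not a chroot and not what Bazel does: an action that uses absolute paths would escape it. The helper `run_isolated` and the `concatenate` action are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_isolated(source_root, declared_inputs, action):
    """Copy only the declared inputs into a scratch directory and run the action there."""
    source_root = Path(source_root)
    with tempfile.TemporaryDirectory() as scratch:
        scratch = Path(scratch)
        for relative in declared_inputs:
            destination = scratch / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(source_root / relative, destination)
        # The action only receives the scratch directory, so a read of an
        # undeclared file fails instead of silently succeeding.
        return action(scratch)


def concatenate(workdir):
    """Example action: reads a.txt and b.txt relative to its working directory."""
    return (workdir / "a.txt").read_text() + (workdir / "b.txt").read_text()


with tempfile.TemporaryDirectory() as tree:
    Path(tree, "a.txt").write_text("A")
    Path(tree, "b.txt").write_text("B")
    complete = run_isolated(tree, ["a.txt", "b.txt"], concatenate)
    try:
        run_isolated(tree, ["a.txt"], concatenate)
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True
```

With both files declared the action succeeds; with `b.txt` left out of the declaration it fails loudly, which is how this kind of isolation surfaces undeclared dependencies.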

Edit: A comment below seems to suggest that this is not the case: "Within Google we use a form of sandboxing to enforce that" (emphasis mine). -- https://news.ycombinator.com/item?id=9259147

yarapavan11y ago· 6 in thread

Surprisingly, significant parts of the code are not open source. According to this page, http://bazel.io/docs/governance.html:

   Is Bazel developed fully in the open?

   Unfortunately not. We have a significant amount of code
   that is not open source; in terms of rules, only ~10% of 
   the rules are open source at this point. We did an 
   experiment where we marked all changes that crossed the
   internal and external code bases over the course of a few 
   weeks, only to discover that a lot of our changes still 
   cross both code bases.

spankalee11y ago

I don't think you're interpreting that section quite right. That section is talking about whether or not Bazel is fully _developed_ in the open, and the answer is "Unfortunately not".

What they mean is that changes to the internal source of Blaze often involve changes to both the open sourced part, which is Bazel, and the closed parts, which are additional rules that are neither open sourced, nor included in Bazel (Blaze has about 5x as many rules as Bazel).

It's best to make atomic changes. Splitting each change (reviewing and submitting the open source part externally and the closed rules part internally, then pulling in the external changes) would complicate reviews, testing, syncing and rollbacks. So instead they submit these cross-code-base changes internally, then dump the change into the external repo. The next paragraph on that page makes it clear that the code is open, even if not all of the development process is.

To be clear, all of Bazel is open source and the source is available here: https://github.com/google/bazel

tarblog11y ago

Can you explain or give an example of a "rule"? It's unclear to me what this means.

lberki11y ago

Currently about 60% of our code (in terms of lines of Java code, excluding tests) is open sourced. The rest is glue logic to internal Google systems or build rules that we haven't open sourced. Some of these rules, we are planning to open source in the future, and some others are specific to Google, so they don't really make much sense in the open source tree.

georgehm11y ago

What about skyframe? http://bazel.io/docs/skyframe.html looks like an overview without any examples. Couldn't find any references to it in the bazel code at github too.

Symmetry11y ago

Do they mean that 10% of the original Blaze rules are now open source or that 10% of the Bazel rules they've released are open source?

DannyBee11y ago

The former.

pron11y ago· 5 in thread

How does it compare with Java 9's sjavac (http://stackoverflow.com/a/26424760/750563)?

EDIT: I fully understand that this is a build tool for multiple languages. But its raison d'etre is speed. So I'm asking what techniques does Bazel use to accelerate builds and how do they differ from those used by sjavac, which is also designed to accelerate builds of huge projects?

hanwenn11y ago

I work on Bazel.

Bazel also builds other languages, such as C++ and Objective-C.

We do invoke the Java compiler through a wrapper of our own. We think we can make that work as a daemon process to benefit from a hot JVM, but haven't gotten round to that.

moondowner11y ago

Any plans on supporting Windows? That will definitely increase the adoption of Bazel.

pron11y ago

Do you also use timestamps like sjavac or some other mechanism, like hashing?

Sphax11y ago

I don't think they're even related, Bazel is a general build tool, sjavac looks like a smarter Java compiler ?

pron11y ago

... that exploits parallelism and caching (and a hot VM) to accelerate build of huge projects, and supports build clusters.

setheron11y ago· 4 in thread

If I'm sticking primarily to Java, is there a benefit to using Bazel as opposed to Maven / Gradle / Sbt?

astral30311y ago

At first impression, unless you have a single gigantic source code base, unlikely. From their FAQ:

>> "Gradle: Bazel configuration files are much more structured than Gradle's, letting Bazel understand exactly what each action does. This allows for more parallelism and better reproducibility"

The value of "more parallelism" depends on the complexity of your Java source code base. I can easily imagine why this extra structure can lead to more parallelism.

However, I am not buying "better reproducibility" without justification or explanation. I've had very reproducible Maven builds for years (and I don't see how Gradle would be different). So I would love to know which aspects are improved upon with this structure, if someone could expand or explain.

Finally, I'm very wary of "much more structure". The worst thing about Maven is its extreme insistence on structure and schema and very specific architecture of your build tasks and components. In contrast, with Gradle, you can freely shape your build scripts to reflect the "build architecture" of your source tree in a minimal, maintainable way. Furthermore, when your application's needs change, refactoring your build is far easier in Gradle, thanks to its internal-DSL style (the build script is code).

If the structure isn't "free", you pay for structure with reduced build script development speed. For Google, it's a tradeoff worth having with that massive source tree.

ulfjack11y ago

I work on Bazel.

We've put a bunch of work into making sure that we know about every file that goes into the Java compilation, and if any of them changes (and only then) do we recompile. Within Google, we use a form of sandboxing to enforce that.

You're also right that it isn't free - we have reason to believe that larger projects and larger teams will see benefits from using Bazel. Use your best judgement.

asuffield11y ago

blaze is nothing remotely like the wall of cruft that maven forces you to climb for everything you do. I would describe it as "almost entirely unlike maven".

vorg11y ago

The Bazel query language has a far nicer syntax than Maven's XML without the risk of Gradle's full procedural language Groovy.

Zariel11y ago· 4 in thread

Is this the tool that Google uses to build its Golang source? Or is that something else which is not available?

kchod11y ago

The Golang source code for the server code at Google is built with this tool. The rules that accomplish this are rather complex due to their interactions with our C++ libraries, and predate the open source "go" tool. The experience with the Google internal rules motivated some of the choices in the "go" tool, I believe.

If you're interested, hanwen wrote a bunch of rules with similar semantics to the internal rules, see https://github.com/google/bazel/tree/master/base_workspace/e... .

It would be nice to make these semantics match the external ones better, but it requires us to open up more tooling, so people won't need to write BUILD files.

runlevel111y ago

There's a typo in your link. Should be:

https://github.com/google/bazel/tree/master/examples/go

zzzhao11y ago

In what cases would using Bazel make sense to build Go projects? If they're extremely large? If they have a lot of dependencies on code in other languages? If you need sophisticated build/release tooling?

BTW, thanks for the release! Will have a fun time digging through this over the next few days. I heard some murmurs that Blaze was going to be open sourced from around the watercooler but didn't think it'd be so soon.

jwcrux11y ago

You can build golang from source pretty easily. If I remember right, it's just downloading the tarball and running ./all.bash or something like that.

pacala11y ago· 3 in thread

A couple of questions:

* If I have a Maven-based project with heavy reliance on pre-built jars from Maven Central, what's the recipe to port it to Bazel?

* Related, if I have multiple github repos, say a couple open source libraries and a couple private repos, what's a good recipe in conjunction with Bazel?

kchod11y ago

Check out http://bazel.io/docs/build-encyclopedia.html#maven_jar. In the root of your build, specify the jars you want from maven and then add them as dependencies in your BUILD files. The first time you run "bazel build", they'll be downloaded and cached from then on. It's somewhat limited in functionality at the moment, but should work for basic "download and depend on a jar".

For multiple Github repos, use http://bazel.io/docs/build-encyclopedia.html#http_archive or http://bazel.io/docs/build-encyclopedia.html#new_http_archiv... (depending on if it's a Bazel repository or not). Let us know if you have any questions or issues!

pacala11y ago

Thanks for the tips. I'm super-hyped that blaze was open sourced, it is one of the best systems I've ever had the pleasure to work with.

A couple more questions :)

* Any pointers for adding Scala (sbt?) support? I'd start here: http://bazel.io/docs/skylark/rules.html.

* Suppose I develop using multiple repos and http_archive. I'd like to make changes both to a library and to a project that depends on it simultaneously, without committing the library patches to master github repo just yet. Is there a way to configure the http_archive, let's say by saying "bazel --mode=local", and have it customize the remote archive http to use a different url (say, my github's fork instead of the master github) for that build?

needusername11y ago

Regarding Maven: - how do you resolve artifacts (eg. are you using Aether)? - are you supporting classifier and type for dependencies?

cromwellian11y ago· 3 in thread

If only our code search and code review systems were public too.

solomatov11y ago

BTW, do you have blaze build for gwt? ant seems unwieldy for me.

cromwellian11y ago

Internally at Google, gwt_application, gwt_module and gwt_test are built-in rules. GWT itself is built with blaze internally (not ant) as well.

bruckie11y ago

Yes.

thechao11y ago· 2 in thread

I've been burned by so many build tools over the years. I've finally settled (for C/++/asm) on the combination of Make + ccache: I build a _very_ paranoid Makefile that recompiles everything if it feels like anything changes. For instance, every rule that compiles a C/++ file is invoked if _any_ header/inc/template file changes. I let ccache do the precise timestamp/check-sum based analysis. The result is that (for large builds < 10MMLOC) I rarely wait for more than a few hundred milliseconds on incremental, _and_ I have confidence that I never miscompile.

I just wish that I had a high-performance replacement for linking that was cross-platform (deterministic mode for ar), and for non-C/++ flows. Writing a deterministic ar is about 20 lines of C-code, but then I have to bake that into the tool in awkward ways. For generalized flows, I've looked at fabricate.py as a ccache replacement, but the overhead of spinning up the Python VM always nukes performance.

beagle311y ago

> I build a _very_ paranoid Makefile that recompiles everything if it feels like anything changes.

Do you have some kind of way to verify that your makefile dependencies conform to your source dependencies? Is clang/gcc tracking sufficient for your use case? What about upgrading the compiler itself, does your makefile depend on that? If so, how?

Have you considered tup[0]? Or djb-redo[1]? Both seem infinitely better than Make if you are paranoid. tup even claims to work on Windows, although I have no idea how they do that (or what the slowdown is like). Personally, I'm in the old Unix camp of many-small-executables, none of which goes over 1M statically linked (modern "small"), so it's rarely more than 3 secs to rebuild an executable from scratch.

> (deterministic mode for ar)

Why do you care about ar determinism? Shouldn't it be ld determinism you are worried about?

[0] http://gittup.org/tup/

[1] https://github.com/apenwarr/redo

thechao11y ago

> Do you have some kind of way to verify that your makefile dependencies conform to your source dependencies?

Nope. I explicitly use a conservative approximation—this guarantees correctness, over speed. Building everything every time with a clean tree is where I begin; I start optimizing after that.

> Is clang/gcc tracking sufficient for your use case? What about upgrading the compiler itself, does your makefile depend on that? If so, how?

Self-rewriting Makefiles (to consume the .d files), combined with the cleaning necessary for them, become a large technical debt, especially given the complexity of the Makefile needed to generate them. Modern CCen just aren't capable of this. Perhaps Doug Gregor's module system will land in C21/C++21, and we'll see some good, then.

> Have you considered tup[0]? Or djb-redo[1]?

Yes. Neither provides significantly better correctness guarantees combined with sufficiently better performance to justify the cost of porting to older Unixen. (This is a consensus opinion at my shop; I, personally, enjoy tup.)

> Why do you care about ar determinism? Shouldn't it be ld determinism you are worried about?

Determinism lets me cache *.o/a/so/dylib/exe/whatnot without getting false-positives due to time-stamp changes and owner/group permissions in the obj/ar files (see ar(1)). ld is deterministic under all the CCen I use by setting the moral-equivalent of -frandom-seed.
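For readers wondering what a deterministic ar amounts to, here is a minimal sketch in Python rather than C. It writes the common Unix ar layout with the timestamp, owner and group fields forced to 0 and a fixed mode; those fixed values, the GNU-style trailing slash on names and the function name `write_deterministic_ar` are assumptions of the sketch. It writes no symbol table and rejects long names, so a real linker would likely still want ranlib run on the result.

```python
import io

AR_MAGIC = b"!<arch>\n"


def write_deterministic_ar(members: dict) -> bytes:
    """Pack {name: bytes} into a Unix ar archive with fixed metadata."""
    out = io.BytesIO()
    out.write(AR_MAGIC)
    # Sorted member order, so the caller's ordering does not leak in.
    for name in sorted(members):
        data = members[name]
        encoded = name.encode("ascii") + b"/"  # GNU-style name terminator
        if len(encoded) > 16:
            raise ValueError("long member names are not handled in this sketch")
        header = (
            encoded.ljust(16)
            + b"0".ljust(12)    # modification time
            + b"0".ljust(6)     # owner id
            + b"0".ljust(6)     # group id
            + b"644".ljust(8)   # file mode, octal
            + str(len(data)).encode("ascii").ljust(10)
            + b"`\n"            # end-of-header marker
        )
        out.write(header)
        out.write(data)
        if len(data) % 2:
            out.write(b"\n")  # members are aligned to even offsets
    return out.getvalue()


archive_a = write_deterministic_ar({"b.o": b"xyz", "a.o": b"12"})
archive_b = write_deterministic_ar({"a.o": b"12", "b.o": b"xyz"})
```

Because nothing from the filesystem or the clock enters the header, the two archives are byte-identical and can be cached by content hash.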

mashraf11y ago· 2 in thread

Is Google departing from just throwing white papers over the wall and letting the community figure out the implementation details? The Blaze white paper was dropped a while ago and there are already two clones in Pants and Buck at Twitter and FB. It would be interesting to see how far off the clones are from the original implementation.

cbgb11y ago

Do you have a link to that white paper? A quick search on their research site doesn't really yield any results.

kchod11y ago

I'm a developer on Bazel, and AFAIK there is no white paper. We definitely don't want to "throw it over the wall," we're going to try to push more and more development into the open over time.

w4tson11y ago· 2 in thread

It's another impressive feat from Google and reading the comments I've kind of established that

1. Binaries are checked in to source

2. It's more structured than Gradle

3. It's for very large code bases

4. It's *nix only

But...

1. We've already had the "chuck it in a lib directory" approach. The distributed approach (maven/ivy etc.) seems to be working for the millions of developers out there who just have to get through the end of the day without production going up in flames. I suppose it's like moving a portion of Maven Central into your code base. Checked in. Feels very odd, and kinda against one of the pillars of the JVM: Maven. Love it or hate it, it's one of the most mature build/repository types out there. npm, bower anyone?

2. Got to agree with astral303. This isn't really something to shout about. Better reproducibility? Gradle/SBT have had incremental builds for quite a while. We all know there's no silver bullet: if you don't declare your inputs and outputs to gradle/blaze tasks, or you seed with random values, then you're only going to get unreproducible builds.

3. Very large, I get that.

4. Very large code bases tend to be enterprise systems. Enterprise systems tend to have a plethora of platforms/OSs so it being *nix only is a drawback. However I suppose that if I were in charge of a 10MLOC code base then I could mandate *nix-only builds? However in my experience they also tend to gravitate towards standards that seem to have longevity.

I'm yet to give it a go so I'll reserve final judgement. However I will say that I do wonder how far we'd be if Google threw their brightest minds at, and worked with, Maven/Gradle/SBT etc. to scale their builds. (Yes I realise it's multi-lang - so is Gradle.) Perhaps the whole community would benefit from the performance improvements.

Anyway hats off Google guys. It looks impressive and no doubt I'll be jumping all over it in 12 months. In the meantime I'm off to go read up on Angular 2.0, or Typescript or ES6 or ES7 or whatever else I *need* to know to get me through the day.

Really I'm just jealous I don't have 10MLOC code base :D

cromwellian11y ago

I don't know about Bazel, but Blaze doesn't "check in binaries". Build artifacts are cached, but not "checked in".

The problem with maven and gradle is that their build actions/plugins can have unobservable side effects.

This approach is more 'pure functional'. You have rules which take inputs, run actions, produce outputs and memoize them. If inputs don't change, then you use memoized outputs and don't run the action.

As long as your actions produce observable side effects in the outputs (and don't produce side effects which are not part of the outputs, but produce state which is depended upon in some manner), then you can do a lot of optimizations on this graph.
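The "inputs, action, memoized outputs" model can be sketched as a small content-addressed cache. This is only an illustration of the idea under simplifying assumptions (in-memory inputs, one output per action); the class `ActionCache` and the `link` action are hypothetical and not part of Blaze or Bazel.

```python
import hashlib


class ActionCache:
    """Runs an action only when its command or input contents change."""

    def __init__(self):
        self._outputs = {}
        self.executions = 0

    def _key(self, command, inputs):
        # The key covers the command plus the name and content hash of every
        # declared input, in sorted order so dict ordering is irrelevant.
        digest = hashlib.sha256()
        digest.update(command.encode("utf-8"))
        for name in sorted(inputs):
            digest.update(b"\0" + name.encode("utf-8") + b"\0")
            digest.update(hashlib.sha256(inputs[name]).digest())
        return digest.hexdigest()

    def run(self, command, inputs, action):
        key = self._key(command, inputs)
        if key not in self._outputs:
            self.executions += 1
            self._outputs[key] = action(inputs)
        return self._outputs[key]


def link(inputs):
    """Stand-in action: concatenates inputs in name order."""
    return b"".join(inputs[name] for name in sorted(inputs))


cache = ActionCache()
first = cache.run("link", {"a.o": b"AA", "b.o": b"BB"}, link)
again = cache.run("link", {"b.o": b"BB", "a.o": b"AA"}, link)    # cache hit
changed = cache.run("link", {"a.o": b"AA", "b.o": b"B2"}, link)  # re-executed
```

The action runs twice for three requests. The scheme is only sound if the action really is a pure function of the declared inputs, which is exactly the condition described above.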

In my experience with Maven and Gradle, they are way, way slower, and that's on relatively small projects.

w4tson11y ago

Apologies for the comment - I'd just gotten home from the pub and was drunk :D

I look forward to trying it out. The ObjectiveC rules sound interesting especially given the state of XCode which is a laughable IDE.

pjjw11y ago· 2 in thread

Any reason the python support was ripped out? I've got my suspicions about not wanting/not being able to properly release the python packaging method in use internally, but I'm curious if I'd be tilting at windmills to try and get it to output pexes.

DannyBee11y ago

I suspect the reason was: "They need to start with something and go from there".

So they started with the use cases likely to be the most popular.

Additionally, there are definitely cases where the implementations of rules at Google are a morass, and rather than dump it on the open source community, it makes more sense to clean them up when they get rebuilt.

fridek11y ago

The same question about JS. Closure Compiler never made much sense to me without Blaze.

jibu11y ago· 2 in thread

Maven doesn't work so well when there are loads of small self-contained 'micro-libraries' (yes, sub-projects, but they are so involved to set up they almost defeat the purpose). Was considering Pants -- which doesn't seem like it has great adoption? -- but this seems like it's substantially more fully featured.

Presumably will also make opensourcing internal projects easier. That can't be a bad thing :)

spullara11y ago

WRT Java support: since it doesn't appear to generate POMs or publish to Maven repositories, it doesn't seem very useful on the open source side of things. It seems explicitly for generating internal, proprietary software from a monolithic source tree. I would have much rather seen the incremental compiler and jar generator integrated into Maven than replacing the entire build system.

alblue11y ago

Actually, Maven 3.3 was released recently, which has a smart builder for building separate parts in parallel, and using Takari plugins you can use the Eclipse compiler, which is itself parallelising. See http://takari.io for more details.

cies11y ago· 1 in thread

What would be needed to get this to work with Haskell?

I read in the "Getting started":

> You can now create your own targets and compose them.

So does this mean it is a replacement for `make`? => Yes

Found the answer here: http://bazel.io/docs/FAQ.html

kchod11y ago

If you're interested in adding rules for a new language, check out Skylark: http://bazel.io/docs/skylark/concepts.html.

malkia11y ago· 1 in thread

Oh, but my favourite option "blaze menu" is missing :)

asuffield11y ago

Huh. I never knew that was there. I'll remember this next time I'm around Charleston.

shmerl11y ago· 1 in thread

> Why doesn't Google use …? Make, Ninja: These tools give very exact control over what commands get invoked to build files, but it's up to the user to write rules that are correct.

> Users interact with Bazel on a higher level. For example, it has built-in rules for "Java test", "C++ binary", and notions such as "target platform" and "host platform". The rules have been battle tested to be foolproof.

But does it give the optional custom level of control that, for example, CMake + Ninja provide? Or is it only high-level rules?

blinks11y ago

http://bazel.io/docs/skylark/concepts.html

You can [at least internally] define custom rules to handle pretty much anything, in almost-but-not-quite-python.

ngd11y ago

This is an open sourcing of Google's internal build tool.

I know it as Blaze, which Bazel is an anagram of. Many files in the source have references to Blaze.

jacquesm11y ago

Getting rid of the timestamps in jar files is a huge improvement. I really hate it that when I recompile some huge java project I can't run a checksum on the jar to verify that the build is identical to a previous run (or when being dumped into some project that my current source tree is an accurate reflection of what is running in production).
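As a rough illustration of why fixed timestamps matter, here is a small Python sketch. It is an assumption-laden toy, not Bazel's jar tooling: `build_archive`, `checksum`, and the class file names are hypothetical. It relies only on the fact that a jar is a ZIP archive, and writes entries in sorted order with a constant timestamp so that identical inputs give a byte-identical archive.

```python
import hashlib
import io
import zipfile

# Constant timestamp for every entry; 1980-01-01 is the earliest the ZIP format can store.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def build_archive(files):
    """Build a ZIP (jar-like) archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name in sorted(files):  # fixed entry order
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            info.compress_type = zipfile.ZIP_STORED  # no compression, for simplicity
            info.external_attr = 0o644 << 16         # fixed file permissions
            archive.writestr(info, files[name])
    return buf.getvalue()


def checksum(data):
    return hashlib.sha256(data).hexdigest()


classes = {
    "com/example/B.class": b"\xca\xfe\xba\xbe B",
    "com/example/A.class": b"\xca\xfe\xba\xbe A",
}
one = build_archive(classes)
# Same contents supplied in a different order still give the same bytes.
two = build_archive(dict(reversed(list(classes.items()))))
```

With this setup `checksum(one)` and `checksum(two)` match, so two builds of the same sources can be compared by checksum alone. If each entry were stamped with the current time instead, the archives would differ from one build to the next even though the contents are the same.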

mikojava11y ago

Here's the Gradle Team's perspective on Bazel

https://www.gradle.org/gradle-team-perspective-on-bazel/

frownie11y ago

From the FAQ :

Multi-language support: Bazel supports Java, Objective-C and C++ out of the box, and can be extended to support arbitrary programming languages.

c'mon, not even the Go language from Google itself ?

nchelluri11y ago

I worked at Ning for a couple of years (http://www.ning.com/) and the internal codename of our create-your-own social network was Bazel.

When I first saw the headline I thought they'd open-sourced it.

danneu11y ago

The "b"-with-leaves-sprouting-from-it logo is also used by http://beanstalkapp.com/

brooksbp11y ago

Will GYP/GN be deprecated in favor of Bazel?

What, if any, does the convergence among these projects look like longevity-wise?

forrestthewoods11y ago

Will there ever be Windows support?

hbhakhra11y ago

This seems very promising. Does anyone know if this would work with the OSGi framework?

zerr11y ago

Fast - compared to what?

bubersson11y ago

Wohoo! This is awesome :)

toolslive11y ago

Depends what you mean by reproducible: build a jar twice, and its md5sum will change because there are timestamps in the archive.
