Should be a shared Wonder all civilizations would benefit from.
Similar to every civ gets a library, but there’s only one Library of Alexandria, etc.
Distributed open source projects are obviously an example of survivorship bias: the people who thrive in them are those who work well in a distributed environment. Contributors are also usually self-motivated (again, by survivorship) in a way that one wouldn't expect of rank-and-file office workers.
Also, the existence of a thing doesn't show that its genesis was an optimal path to its current state.
Additionally, large open source projects are not necessarily good at converging upon a particular goal. They do well when the goals of many individuals or small groups aggregate well. It would be very difficult to convince all of the kernel team to, e.g. optimize for mobile performance for the next two years.
Nobody contests that large things can be done by distributed teams. Usually the contention is over whether a particular set of employees, working on a particular set of projects/goals, can be transitioned to working that way.
Couldn't the same argument be made for those who thrive in an office environment and need to be around other people to work effectively as well as those who are not self-motivated?
I guess it really depends on what the majority prefers from a worker/contributor point of view.
Which sections of each file get more commits than usual? E.g., which functions?
Who wrote this particular function first? Who subsequently?
How many of those include expletives? Curious minds need to know.
If I want to answer these earth-shattering questions, could I just grab the entire git repo and go from there? Is it that simple? Is it text-exportable without too much "other" scary stuff?
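For what it's worth, here is a rough sketch of the per-file part. It assumes you have already dumped the history to text yourself with something like `git log --name-only --format=%x01%an` (both options exist in git; using the \x01 byte as a commit marker and the function name `summarize` are just my choices for illustration). The sample log below is made up, not real kernel data.

```python
from collections import Counter, defaultdict

# Assumed marker: --format=%x01%an makes git print \x01 + author name per commit.
COMMIT_MARK = "\x01"

def summarize(log_text):
    """Return (commits per path, set of author names per path) from the log text."""
    commits = Counter()
    authors = defaultdict(set)
    author = None
    for line in log_text.split("\n"):
        line = line.strip()
        if line.startswith(COMMIT_MARK):
            # Header line of a new commit: remember who wrote it.
            author = line[len(COMMIT_MARK):]
        elif line:
            # Any other non-blank line is a path touched by the current commit.
            commits[line] += 1
            authors[line].add(author)
    return commits, authors

# Made-up sample in the shape git prints: header, blank line, then paths.
sample = "\x01Alice\n\nMakefile\nmm/page_alloc.c\n\x01Bob\n\nMakefile\n"
commits, authors = summarize(sample)
for path, n in commits.most_common():
    print(n, len(authors[path]), path)
```

For the per-function questions, `git blame` and `git log -L :funcname:file` are where I would start looking, though I have not tried them at kernel scale.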
12846 MAINTAINERS
4167 drivers/gpu/drm/i915/intel_display.c
3330 drivers/gpu/drm/i915/i915_drv.h
2360 drivers/gpu/drm/i915/i915_gem.c
2328 arch/arm/Kconfig
2118 arch/x86/kvm/x86.c
2079 sound/pci/hda/patch_realtek.c
2019 Makefile
2001 fs/btrfs/inode.c
1927 include/linux/sched.h
1903 net/core/dev.c
1888 drivers/gpu/drm/i915/i915_reg.h
1824 arch/arm/boot/dts/Makefile
1801 drivers/gpu/drm/i915/intel_pm.c
1781 include/linux/fs.h
1763 arch/x86/Kconfig
1737 mm/page_alloc.c
1643 kernel/sched.c
1628 fs/btrfs/extent-tree.c
It's quite interesting how present Intel GPUs are in the list, which probably tells something about the code on them...
Million ways to interpret this.
Lots of commits could mean that it was constantly improved or there was a constant stream of bugs.
A single commit could mean that it was a perfect masterpiece on first try or it was simply forgotten.
 commits | authors | first_commit             | last_commit            | path
---------+---------+--------------------------+------------------------+--------------------------------------
   12846 |    2438 | 16 years 16 days         | 00:00:00               | MAINTAINERS
    4167 |     205 | 12 years 4 mons 4 days   | 1 year 10 mons 13 days | drivers/gpu/drm/i915/intel_display.c
    3330 |     205 | 12 years 9 mons 19 days  | 2 days                 | drivers/gpu/drm/i915/i915_drv.h
    2360 |     146 | 12 years 6 mons 16 days  | 1 mon 9 days           | drivers/gpu/drm/i915/i915_gem.c
    2328 |     429 | 16 years 16 days         | 2 days                 | arch/arm/Kconfig
    2118 |     315 | 13 years 3 mons 3 days   | 1 day                  | arch/x86/kvm/x86.c
    2079 |     216 | 16 years 16 days         | 4 days                 | sound/pci/hda/patch_realtek.c
    2019 |     300 | 16 years 16 days         | 1 day                  | Makefile
    2001 |     186 | 13 years 10 mons 20 days | 5 days                 | fs/btrfs/inode.c
    1927 |     367 | 16 years 16 days         | 2 days                 | include/linux/sched.h
    1903 |     388 | 16 years 16 days         | 6 days                 | net/core/dev.c
    1888 |     177 | 12 years 6 mons 16 days  | 1 mon 9 days           | drivers/gpu/drm/i915/i915_reg.h
    1824 |     494 | 8 years 7 mons 18 days   | 6 days                 | arch/arm/boot/dts/Makefile
    1801 |     101 | 9 years 14 days          | 4 days                 | drivers/gpu/drm/i915/intel_pm.c
    1781 |     306 | 16 years 16 days         | 2 days                 | include/linux/fs.h
    1763 |     392 | 13 years 5 mons 20 days  | 2 days                 | arch/x86/Kconfig
    1737 |     391 | 16 years 16 days         | 2 days                 | mm/page_alloc.c
    1643 |     258 | 16 years 16 days         | 9 years 3 mons 27 days | kernel/sched.c
    1628 |     121 | 12 years 4 mons 4 days   | 1 year 8 mons 12 days  | drivers/gpu/drm/i915/intel_drv.h
    1628 |     118 | 14 years 2 mons 4 days   | 13 days                | fs/btrfs/extent-tree.c
Note: the authors count may be, and probably is, incorrect, since a single user could have committed with different emails.
Is wanting them in the top 1000 being too cynical or just wishful thinking?
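On the caveat about author counts: git's own answer to this is a `.mailmap` file, but as a quick heuristic one could also merge identities that share either a name or an email. The sketch below does that with a small union-find. The function name and the sample identities are invented for illustration, and the heuristic can over-merge (two different people with the same name would collapse into one).

```python
def count_distinct_authors(identities):
    """Count people, treating identities that share a name or an email as one."""
    parent = {}

    def find(key):
        # Register unseen keys as their own root, then walk up to the root.
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    for name, email in identities:
        name_key = ("name", name.strip().lower())
        email_key = ("email", email.strip().lower())
        # Link the name and the email seen together on one commit.
        parent[find(name_key)] = find(email_key)

    return len({find(key) for key in list(parent)})

# Invented identities: the first three are plausibly one person.
identities = [
    ("Jane Doe", "jane@example.com"),
    ("Jane Doe", "jdoe@corp.example"),
    ("J. Doe", "jdoe@corp.example"),
    ("Sam Roe", "sam@example.org"),
]
distinct_emails = len({email for _, email in identities})
merged = count_distinct_authors(identities)
print(distinct_emails, merged)
```

Counting by distinct email alone overstates the number of people here; the merged count is lower.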
To create visuals of the git repo history/commits.
And then check if the lack of activity is related to the stability of the code, lack of use or its complexity.
Are there any examples of projects with 1kk+ commits that use SVN, Mercurial, Perforce, or some other SCM?
Strictly speaking, it's not actually the main project repository (which has closer to 600k commits), but the repository that contains what is effectively all of the pull requests for the past several years (more specifically, all the changes you want to test in automation).
The closed-source monorepos of Google (Perforce IIRC), Facebook (Mercurial), and Microsoft (Git) are all going to be far larger than any open-source repository. Among those, Linux is in the largest size class but is not the largest (I believe Chromium's is the largest open-source repo I've found).
I think this would be one case of a “detached head”.
Microsoft's is based on Git, but with a lot of engineering on top of it: https://devblogs.microsoft.com/bharry/scaling-git-and-some-b...
That's going in the ol' geek toolbox
https://en.wikipedia.org/wiki/Metric_prefix#List_of_SI_prefi...
On the other hand, if your team is used to making quick iterative commits, throwing them in a PR, never rebasing, and pulling in merge commits all over the place, uh, I can attest that you can get to a million commits pretty fast.
Evidence? One million commits seems to indicate otherwise.
What alternative do you suggest, and in what way is it better than email?
> It doesn't really scale well [...]
There are few things that scale better than email...
> [...] doesn't make contributing for newbies easier.
That's a good thing: If you haven't even mastered sending a plaintext email, I really wouldn't expect you to be able to constructively contribute to kernel development. Feel free to experiment with your local copy, though – it's open source after all.
Surprise, I've had a few of my patches accepted, and this statement is just a thinly veiled insult, if I ever saw one.
Have you ever heard of pull requests? Pull requests are 10x easier than adhering to lkml's email rituals.
That's entirely BS. To contribute to these projects (my experience is contributing to Git), you need to respect a dozen conventions that seem to come from another age. Just subscribing to the mailing list is not trivial for somebody in their twenties who never had to do something like that: it's the sort of thing that's easy in retrospect, but the UX is hard to discover and the lack of parallel with other tools we regularly use (such mailing lists aren't a thing that most devs use) adds a huge amount of friction.
And then you need to find somewhere that explains the conventions to try and contribute and figure out how to configure your email client, how to get a patch for your commits, how to insert your patch in your email, how to write an acceptable email subject and an acceptable email body and how it relates to your commit message, who you should CC, how to handle multi-commit contributions, how to answer emails (while respecting another half-dozen conventions)...
It's not impossible, but there's a dozen things you need to figure out, half of them you don't even _know_ you need to figure out, so it's a lot of friction. This friction might be a good thing (that's another debate), but saying "you just need to be able to send a plaintext email" is completely false and dismissive.
in just plain text with no colors for code?
it must be painful as hell
Of course there will also be a lot of pull request noise, so I can't say if that tradeoff is ultimately worth it.
Edit: well I guess OSS people are extremely hostile to even the discussion of more people contributing to the Linux kernel. My bad.
Bullshit. What you really mean is the second half of your comment:
> and definitely doesn't make contributing for newbies easier.
That's fair enough. I would guess that the scalability and workflow of the actual developers is more important.
I’ve been thinking about how Linux and Wikipedia use an email list that is archived to a website. The archive can be browsed like an issue tracker. Many people spend their day in their email app. I wonder if maybe most projects aren’t using email correctly…
People will be quick to point out that "the hardware keeps changing so the software has to adapt". This is true, but why not design the software in such a way that different drivers can easily be substituted (so the drivers can change but the interface doesn't)?
I did this with my open source project. I haven't made any code changes for over a year and it still works perfectly and is still relevant.
I don't understand why there is such a fetish in this industry for never finishing any project. I find the whole attitude very frustrating.
To suggest that software maintenance and building maintenance are anything alike is ignoring the entire context of the two activities.
With buildings, builders have very little control over the wear and tear caused by the environment. In software, developers collectively have total control over the software and hardware environment which determines whether or not software breaks. Most of the wear and tear in software is a direct result of people compulsively changing stuff in other software (or hardware) up the stack. It's all 100% avoidable. If the software never changed materially (aside from bug fixes), it would not break. Simple as that.
And most of the software changes are simply taking us round in circles. New generations of developers undoing the work of the previous generation, then later reversing direction again, surely we've all seen it happening at most of the software companies we've worked at...
> so the drivers can change but the interface doesn't
This is already how it is. Take write(sockfd, …). Sure there are some configuration options in the parameters but nothing compared to the real complexity of networking. This is the downside to abstraction; roughly, the least complex implementation wins. Eventually, we shift and add more and more standards, but it’s never cutting edge (and for good reason).
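To make the write() point concrete, here is a small sketch, assuming a POSIX-like system. The helper `send_all` is my own name, not a standard function. The same `os.write` call pushes bytes into a pipe and into a socket, and the caller never sees which kind of object sits behind the descriptor.

```python
import os
import socket

def send_all(fd, payload):
    """Write the whole payload to any file descriptor using plain write()."""
    remaining = payload
    while remaining:
        written = os.write(fd, remaining)  # may write fewer bytes than asked
        remaining = remaining[written:]

# Backend 1: an anonymous pipe.
pipe_r, pipe_w = os.pipe()
send_all(pipe_w, b"hello")
from_pipe = os.read(pipe_r, 5)

# Backend 2: a connected pair of sockets.
sock_a, sock_b = socket.socketpair()
send_all(sock_a.fileno(), b"hello")
from_socket = sock_b.recv(5)

# Clean up the descriptors opened above.
os.close(pipe_r)
os.close(pipe_w)
sock_a.close()
sock_b.close()

print(from_pipe, from_socket)
```

The flip side is visible too: anything specific to one backend (socket options, say) has to go through a separate call, which is the "least complex implementation wins" effect.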
> I haven't made any code changes for over a year
Relatively speaking, a year is nothing in the timeline of software, so this is not surprising, and it’s likely that even if you hadn’t written your software in an abstracted way (kudos to you for doing so), it would still be fine after only one year. write() has been the same interface for over 40 years.
Also, not to rain on anybody’s parade, but OSS is - generally - not held to the same performance standards that proprietary code is. This makes sense intuitively right? “If I’m paying for it, it better work.” And the vast, vast majority of code is not OSS. We just get a false impression since, by definition, we only have access to OSS. The worst that can happen for bad OSS is lack of adoption or a tsunami of incoming GitHub issues. For proprietary code, you could lose your job if a product doesn’t take.
> I don't understand why there is such a fetish in this industry for never finishing any project
This is similar to the argument that a company, once it has a good product, should just stop. “Why do we need updates? I like the features we have. Don’t change it, it’s perfect.” But to survive in the market - not just on GitHub or code coverage tests - requires constant competition and innovation. If Intel launches a new multi-register write feature, Chip Company X can’t just say “Well our project is done.” It’s not anymore! And if it is, then Chip Company X might be done too…
I've noticed the opposite to be true. Code from a proprietary vendor is buggy? Too bad, the corp is just a faceless, nameless borg of an entity that doesn't care about your bugs. OSS has bugs? You can easily go rant at the poor coders who are working on it.
The annoying thing to me is that the actual software development rule seems to be “improve it until it breaks”: Slack was pretty great for a while, but at a certain point they started adding misfeatures (the new rich text input is still really annoying in a hundred little ways) and eventually the app just became really buggy: I’m still stuck with it for Reasons, but I’m constantly reloading/force-quitting it just to read messages and bits of the UI appear and disappear seemingly at random.