To be blunt, you're abusing the shit out of SOMEONE ELSE'S product that you're not even paying for. Your first question shouldn't be to see what Github can do for you to make it so you don't have to make changes. You should be falling over yourself investigating all available avenues for reducing load.
It's an incredibly entitled way to think about things and I would have a real hard time employing someone whose first response was like this.
I certainly don't think the barb about your willingness to employ people who write things on Github issues threads that you disagree with is helping anyone understand any part of this situation. I understand the urge to find ways to be emphatic about how much you disagree with things, and I often find myself compelled to write lines like that, but I think they're virtually always a bad idea.
And it had this passive-aggressive ring to it, with the hand-clapping and the hurray at the beginning and the stonewalling in effect.
And what the hell was with the quotes around "free"? Are you paying? No? Then there's no quotes about it.
fwiw: "I would have a really hard time working for someone whose first inclination was always towards criticism over accommodation or compassion." But then I also acknowledge there may be a whole bunch of other stuff going on here behind the scenes. ;)
But I think you may be reading his tone more negatively than necessary. What I see is him starting off by expressing gratitude and then switching voices to communicate very explicitly what the needs and desires of his stakeholders are. He's simply trying to reflect that as clearly as possible and discover the additional context. This was clearly effective, as you can see from the rest of the conversation that with all the information out there, everybody comes to a mutually acceptable consensus. Problem solved!
What we've ultimately got here is a free service built on a free service. CocoaPods has nothing but the time and effort of volunteers. GitHub has input of resources from the commercial side of their business. Both sides clearly want to preserve the utility of this end-to-end workflow in a more sustainable way.
"Falling over yourself" is a subjective amount of effort, but clearly CocoaPods has tried to minimize their impact on GitHub. As it turns out, the attempted optimization of shallow fetching backfired, but that's not from lack of regard for the resources they rely on. What was missing was exactly the context the Github employees provided.
Honestly, I think people are offended second-hand by a perceived lack of groveling on CocoaPods part, and to me, that's way overblown.
> To be blunt, you're abusing the shit out of SOMEONE ELSE’S
> product that you're not even paying for.
I <3 GitHub, but for their own business/values/whatever reason they choose to host open source for free. It's not like these people have found a loophole and are getting a paid service for free. AFAIC, that makes every free user a customer. They may not be a paying customer, but it's GitHub's choice to be in the free hosting business.
that being said, i do believe it could help CocoaPods' use case since the fetches are done automatically (as i understand it)
EDIT: the upside is that cocoapods will have to either rethink their architecture in order to eat fewer resources or move to their own paid infrastructure, because their package manager will soon be less than functional given the aggressive rate limiting github is performing.
I'd like to see both happen:
* CocoaPods refactoring to be more efficient
* GitHub providing open source projects the option to buy reserved capacity if they're using excessive resources (versus just saying "No").
GitHub people are truly going above and beyond in service even when barely warranted. I'll give them that.
> CocoaPods is a dependency manager for Swift and Objective-C Cocoa projects. It has over ten thousand libraries and can help you scale your projects elegantly.
The developer response:
> [As CocoaPods developers] Scaling and operating this repo is actually quite simple for us as CocoaPods developers whom do not want to take on the burden of having to maintain a cloud service around the clock (users in all time zones) or, frankly, at all. Trying to have a few devs do this, possibly in their spare-time, is a sure way to burn them out. And then there’s also the funding aspect to such a service.
--
So they want to be the go-to scaling solution, but they don't want to have to spend any time thinking about how to scale anything. It should just happen. Other people have free scalable services, they should just hand over their resources.
Thank goodness Github thought about these kinds of cases from the beginning and instituted automatic rate limiting. Having an entire end user base use git to sync up a 16K+ directory tree is not a good idea in the first place. The developers should have long since been thinking about a more efficient solution.
Honestly if I was GitHub, I'd be tempted to just increase the throttling on CocoaPods and call it done, it isn't their problem if the users of that project have a bad experience. GitHub has provided solutions to the problem, it's CocoaPods that's resisting implementing those solutions.
I think that long-term, the solution will be the Swift Package Manager, and CocoaPods will just be deprecated in favor of it. Let Apple host iOS packages; they're the ones that gain the most benefit from easy iOS development; they have the developer expertise, and the hosting costs are a drop in the bucket compared to iCloud & CloudKit. But that's not all that helpful for people who need an Objective-C package now.
I don't think working on CocoaPods is an altruistic endeavor. I imagine (know) that some of the cocoapods folks are app developers and ostensibly CocoaPods makes developing applications easier.
Side Note: it's not a tragedy of the commons. Github owns the infrastructure and they enforced their private property rights by rate limiting a group of users that were disproportionately using resources. It is a collective action problem for CP users.
No direct financial benefit, but they are deriving a benefit out of their work.
(Edit: </rhetorical> </sarcasm>)
Also while nit-picking, I would clear up your use of "for free </r></s>" as "as a freebie", again post-[insert: x̄ȳz, inc].
I understand the desire to personally maintain as few of one's own servers as possible, but when the result is negative effects on the service hosting the project and a worse experience for the end-user, it might be time to start looking over what google cloud offers.
2) It makes perfect sense to let GitHub handle the performance hit until issues arise. Premature optimization is the devil, right? But once there start to be issues, it's definitely unfair to turn around and say "well you offer the service for free, so you should fix it"
Sentence 1 would still be true if CocoaPods was only used by ten companies developing the ten biggest (in terms of lines of code) Objective-C projects, but there would no longer be a need to scale in the sense of sentence 2.
"Not having to develop a system that somehow syncs required data at all means we get to spend more time on the work that matters more to us, in this case. (i.e. funding of dev hours)"
In other words, using github as a free unlimited CDN lets them be as inefficient as they like. Such as having 16k entries in a directory ( https://github.com/CocoaPods/Specs/tree/master/Specs ) which every user downloads.
Package management and sync seems to suffer really badly from NIH. Dpkg is over 20 years old and yum is over a decade old. What's up with this particular wheel that people keep reinventing it seemingly without improvement?
Trivial apt operations (e.g. trying to install a package which is already installed) on an NSLU2 (an ancient 266MHz ARM machine) take several minutes, whereas the same operation takes several seconds on a modern laptop.
It turns out this is due to the fact that Debian "main" (Packages.gz) has ballooned to 32MB of plain text when uncompressed, comprising more than 41,000 packages, and it has to be parsed and assembled into a dependency tree for every apt operation. This problem screams for SQLite.
A side project I've started looking into is to make a transparent apt proxy which provides a trimmed-down Packages.gz (e.g., removing anything which uses X11), which would be a lot easier than rewriting apt to use a SQLite backend.
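To make the SQLite idea above concrete, here's a minimal sketch of parsing the stanza-based Packages format into an indexed table. The sample data is hypothetical, and this only handles a couple of fields; real Packages stanzas have many more, but the point is that lookups become indexed queries instead of a full re-parse of 32MB of text on every apt operation.

```python
import sqlite3

# Hypothetical miniature Packages file (real ones have many more fields).
SAMPLE = """\
Package: libfoo
Version: 1.2-1
Depends: libc6 (>= 2.17), libbar

Package: libbar
Version: 0.9-3
Depends: libc6 (>= 2.17)
"""

def parse_packages(text):
    """Parse RFC-822-style stanzas (blank-line separated) into dicts."""
    stanza, out, last = {}, [], None
    for line in text.splitlines():
        if not line.strip():
            if stanza:
                out.append(stanza)
                stanza = {}
        elif line[0].isspace():
            # Continuation line: append to the previous field's value.
            stanza[last] += "\n" + line.strip()
        else:
            last, _, value = line.partition(":")
            stanza[last] = value.strip()
    if stanza:
        out.append(stanza)
    return out

def build_index(text):
    """Load parsed stanzas into an in-memory SQLite table keyed by name."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE pkg (name TEXT PRIMARY KEY, version TEXT, depends TEXT)")
    for s in parse_packages(text):
        db.execute("INSERT INTO pkg VALUES (?, ?, ?)",
                   (s["Package"], s.get("Version", ""), s.get("Depends", "")))
    db.commit()
    return db

db = build_index(SAMPLE)
# Indexed point lookup instead of scanning every stanza:
row = db.execute("SELECT version FROM pkg WHERE name = ?", ("libfoo",)).fetchone()
print(row[0])
```

Persist the database to disk and only rebuild it when Packages.gz changes, and the per-operation cost on a slow ARM box should drop dramatically.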
This is precisely why yum/dnf has been switching from XML for repodata to SQLite. In fact, the only thing that is still XML-only is the comps file which just lists package groups, is updated rarely and "only" weighs in at half a MB.
Actually, I'm surprised one can actually run a modern Linux on the NSLU2 given its shameful lack of RAM and slooow USB port. But it was a nice gadget when it came out and it was fun to experiment with it.
> It turns out this is due to the fact that Debian "main" (Packages.gz) has ballooned to 32MB of plain text when uncompressed, comprising more than 41,000 packages, and it has to be parsed and assembled into a dependency tree for every apt operation. This problem screams for SQLite.
Correct me if I'm wrong, but isn't apt (and dpkg) basically composed out of a ton of different (perl/shellscript) modules? So it should be possible to create an interface-compatible sqlite data store.
Wouldn't cleaning up the package search interface be a similar effort with much greater payoff?
0: p2pacman - Bittorrent powered pacman wrapper
1: pacman & torrent, feasible?
2: DebTorrent
(0) https://bbs.archlinux.org/viewtopic.php?id=163362
(1) https://bbs.archlinux.org/viewtopic.php?id=115731
(2) https://wiki.debian.org/DebTorrent
I believe scaling this could happen with either: 1) lightweight filesystem/directory versioning support, like how btrfs allows you to mount snapshots. This way, peers could update whichever version of a torrent they have. Or 2) very reliable means to update to the latest torrent release (as reliable as syncing with peers), which afaict means smarter bittorrent clients that can perform DHT-based "crawling". Those recent defcon(?) "hacks" to query peers for similar torrents based on user pools and connection histories (or something like that) would make sense here.
A cool side-note: In one of my few experiences diving into `.git`, I diff'd it before and after making changes to its tracked sources, like adding files and modifying them. It looked like a torrent that included version control data would make out just fine if an updated torrent expected similar data in the same location. Again, a smarter bittorrent client would need to sort some of this out. See also 0': Updating Torrents Via Feed URL. Anyway, most users would probably leave that part out, in favor of only which parts they need.
(0') http://www.bittorrent.org/beps/bep_0039.html
Another cool side-note: This would also allow for easily adding repos from multiple sources... Look at how many ( non-automated :-( ) merge requests github.com/CocoaPods/Specs's caregivers have reviewed: 13,331 as of now (0'').
(0'') https://github.com/CocoaPods/Specs/pulls?q=is%3Apr+is%3Aclos...
> 0: p2pacman - Bittorrent powered pacman wrapper
> 1: pacman & torrent, feasible?
> 2: DebTorrent
That's about distributing packages via p2p. The problematic repository doesn't store any package data, it stores package metadata (it's the cocoapods index if you will).
(It may not have been earlier on, I really don't know. ;>)
Something nifty about the new dnf is several of the older yum commands (e.g. builddep, yumdownloader) are now integrated directly so don't need extra utils installed. Seems like refinement is still happening.
If only my fingers didn't keep typing "dns" instead of "dnf" all the time, it would be great. :D
Fortunately for us, Firefox is canonically hosted in Mercurial. So, I implemented support in Mercurial for transparently cloning from server-advertised pre-generated static files. For hg.mozilla.org, we're serving >1TB/day from a CDN. Our server CPU load has fallen off a cliff, allowing us to scale hg.mozilla.org cheaply. Additionally, consumers around the globe now clone faster and more reliably since they are using a global CDN instead of hitting servers on the USA west coast!
If you have Mercurial 3.7 installed, `hg clone https://hg.mozilla.org/mozilla-central` will automatically clone from a CDN and our servers will incur maybe 5s of CPU time to service that clone. Before, they were taking minutes of CPU time to repackage server data in an optimal format for the client (very similar to the repack operation that Git servers perform).
More technical details and instructions on deploying this are documented in Mercurial itself: https://selenic.com/repo/hg/file/9974b8236cac/hgext/clonebun.... You can see a list of Mozilla's advertised bundles at https://hg.cdn.mozilla.net/ and what a manifest looks like on the server at https://hg.mozilla.org/mozilla-central?cmd=clonebundles.
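For reference, a clone bundles manifest is essentially just a list of pre-generated bundle URLs with attributes the client filters on. A hypothetical example (the URLs and attribute values here are illustrative, not Mozilla's actual entries):

```text
https://hg.cdn.example.net/mozilla-central/abc123.gzip-v2.hg BUNDLESPEC=gzip-v2 REQUIRESNI=true
https://backup-mirror.example.org/mozilla-central/abc123.gzip-v2.hg BUNDLESPEC=gzip-v2
```

The client picks the first entry it's compatible with, downloads the static bundle from the CDN, then does a cheap incremental pull from the origin server for anything committed since the bundle was generated.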
A number of months ago I saw talk on the Git mailing list about implementing a similar feature (which would likely save GitHub in this scenario). But I don't believe it has manifested into patches. Hopefully GitHub (or any large Git hosting provider) realizes the benefits of this feature and implements it.
Mercurial was designed to be easy to extend, and it shows.
Hg was designed to be a DVCS system.
One correction to the post title: it's not maxing five nodes, but five CPUs.
Even Finder or `ls` will have trouble with that, and anything with * is almost certainly going to fail. Is the use-case for this something that refers to each library directly, such that nobody ever lists or searches all 16k entries?
The other side to consider: “one directory per package” is a very simple policy and it feels right in many ways to people (e.g. Homebrew has a similar structure because it's a natural fit for the domain). If the filesystem and basic tools like ls work just fine (which is certainly the case on OS X, where even "ls -l" or the Finder take less than a second on a directory of that size), isn't there a valid argument that the answer should be some combination of fixing tools which don't handle that well or encouraging people to learn about things like `find` instead of using wildcards which match huge numbers of files?
As loath as I am to admit anything about Perl is good, CPAN got this right. 161k packages by 12k authors, grouped by A/AU/AUTHOR/Module. That even gives you the added bonus of authorship attribution. Debian splits in a similar way as well, /pool/BRANCH/M/Module/ and even /pool/BRANCH/libM/Module/ as a special case.
Tooling can be considered part of the problem in this case. Because the tooling hides the implementation, nobody (in the project) noticed just how bad it was. I hadn't seen modern FS performance on something of this scale, apparently everything I've worked with has been either much smaller or much larger. Ext4 (and I assume HFS+) is crazy-fast for either `ls -l` or `find` on that repo.
It seems like tooling is part of the solution as well, but from the `git` side. Having "weird" behavior for a tool that's so integral to so many projects scares me a little, but it's awesome that Github has (and uses) enough resources to identify and address such weirdness.
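The "crazy-fast on modern filesystems" claim above is easy to sanity-check yourself. Here's a throwaway sketch that builds a flat directory of 16k empty files (roughly the size of the Specs directory under discussion; the file names are made up) and times a plain listing:

```python
import os
import tempfile
import time

N = 16_000  # roughly the entry count of the Specs directory

with tempfile.TemporaryDirectory() as d:
    for i in range(N):
        # Empty files suffice: listing cost depends on entry count,
        # not on file contents.
        open(os.path.join(d, f"Pod{i:05d}"), "w").close()

    start = time.perf_counter()
    entries = os.listdir(d)
    elapsed = time.perf_counter() - start

print(len(entries), f"{elapsed:.3f}s")
```

On ext4, HFS+, or APFS this typically finishes in well under a second, which supports the point that the filesystem itself isn't the bottleneck here; git's tree diffing is.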
Think about it from their perspective. GitHub advertises a free service, and encourages using it. Partly it's free because it's a loss leader for their paid offerings, and partly it's free because free usage is effectively advertising GitHub. CocoaPods builds their project on this free service, and everything is fine for years.
Then one day things start failing mysteriously. It looks like GitHub is down, except GitHub isn't reporting any problems, and other repositories aren't affected.
After lots of headscratching, GitHub gets in touch and says: you're using a ton of resources, we're rate limiting you, you're using git wrong, and you shouldn't even be using git.
That's going to be a bit of a shock! Everything seemed fine, then suddenly it turns out you've been a major problem for a while, but nobody bothered to tell you. And now you're in hair-on-fire mode because it's reached the point where the rate-limiting is making things fail, and nobody told you about any of these problems before they reached a crisis point.
It strikes me as extremely unreasonable to expect a group to avoid abusing a free service when nobody tells them that it's abuse, and as far as they know they're using it in a way that's accepted and encouraged. If somebody is doing something you don't like and you want them to stop, you have to tell them, or nothing will happen!
I'm not blaming GitHub here either. I'm sure they didn't make this a surprise on purpose, and they have a ton of other stuff going on. This looks like one of those things where nobody's really to blame, it's just an unfortunate thing that happened.
(And just to be clear, I don't have much of a dog in this fight on either side. My only real exposure to CocoaPods is having people occasionally bug me to tag my open source repositories to make them easier to incorporate into CocoaPods. I use GitHub for various things like I imagine most of us do, but am not particularly attached to them.)
With respect to CocoaPods, I would hope someone on the team had thought through performance characteristics of their architecture.
It's like they brought a shopping cart onto a city bus and were then surprised that it inconvenienced the bus driver and the other passengers.
GitHub is for source control. That means a limited number of people pulling and submitting changes. That does not mean the general public using it as a CDN.
In fact I seem to remember seeing somewhere active discouragement of using it as a CDN.
> It strikes me as extremely unreasonable to expect a group to avoid abusing a free service when nobody tells them that it's abuse
I don't think so at all. An experienced developer should expect that a free service will rate-limit their offerings at some point, and design around that. Viewing 'free' as 'an eternal resource sponge that we never have to think about' is the extremely unreasonable thing to do, in my opinion. I think that 'abuse' is probably the wrong word to use here, since that implies malice, and they don't appear to be malicious.
Also, I'm amazed this is even a problem. 5 CPUs is not a lot in the scheme of things (even if they mean physical instead of cores). TBs of bandwidth are also virtually free compared to a company the size of Github.
Even better: they are getting basically real world loadtested for free and finding loads of pain points, which may hit paying customers.
Unless I'm missing something, throw more metal at the problem. Many companies would love to have every single cocoapod user (which is nearly every iOS developer) have to type github.com into their terminal, for the cost of a bunch of servers + some bandwidth.
Pretty strange, unless this is hitting some really bad area of their service that can't easily be scaled out of (but i would be surprised)
I think their point is that it's using the system in a way that isn't intended or desired. How does that count as "real world" load testing?
And by that logic, shouldn't anybody who gets hit with a DoS attack just say "thanks"? It's tons of free load testing on your network infrastructure, and you'll definitely find some pain points.
What seems insane is to use a single github repo as the universal directory of packages and their versions driving your package manager.
There's a reason rubygems has their own servers and web services to support this use case for the central library registry, even if the sources for gems are all individual projects hosted on github.
That only has 3,000 packages vs 15,000 for CocoaPods or 115,000 for RubyGems.
The CocoaPods developers seem to be missing the entire point of git: it's a _distributed_ revision control system.
Set up a post-receive hook on Github to notify another server, set up with a basic installation of git, to pull from Github so as to mirror the master repo. Then have your client program randomly choose one of these servers to pull from at the start of an operation. A simple load balancer solves this problem.
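The client side of that mirroring scheme could be sketched in a few lines. The mirror URLs below are entirely hypothetical; the idea is just that each fetch picks a random mirror, with a shuffled fallback order for retries:

```python
import random

# Hypothetical mirror list; in the scheme above these would be plain git
# servers that re-pull from the GitHub master repo on each post-receive.
MIRRORS = [
    "https://specs-mirror-1.example.org/Specs.git",
    "https://specs-mirror-2.example.org/Specs.git",
    "https://specs-mirror-3.example.org/Specs.git",
]

def choose_remote(mirrors, rng=random):
    """Pick one mirror at random so fetch load spreads evenly."""
    return rng.choice(mirrors)

def fetch_order(mirrors, rng=random):
    """Shuffled copy of the mirror list, for retry/fallback on failure."""
    order = list(mirrors)
    rng.shuffle(order)
    return order

remote = choose_remote(MIRRORS)
```

The trade-off, as noted elsewhere in the thread, is that someone now has to monitor and maintain those mirrors.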
If CocoaPods reach out to Rackspace and/or other hosting providers, there's a decent chance they'll be able to pull together a good solution. :)
The downside though, is they'll need to figure out some way to keep it monitored/maintained. :/
> GitHub Support is unable to help with issues specific to CocoaPods/CocoaPods.
---
That's a pretty neat feature!
Therefore, it's not Apple's problem. In fact, I've talked to a non-trivial number of engineers (both in Cupertino and long time Cocoa devs) that disapprove of the shortcuts that Cocoapods takes all over, software architecture be damned. Reasonable parties can agree to disagree, but I do include 3rd party framework inclusion without a dependency manager as an interview screen for prospective iOS hires.
Since you mention developer relations, I'll assume you're not actually arguing that this is Apple's technical responsibility, but that they should throw around some $ to grease the wheels to make dependency inclusion better. As a platform vendor, funding hosting costs for some project that you don't agree with just to "support the community" is a bad idea. Better idea is to allocate resources to setup a structure that can fix the issue in a technically agreeable way while also benefitting from the independence a FOSS project provides. In doing so, you are correct that it'd be preferable for Apple to fund/use well-known FOSS standards, such as Github.
In conclusion, Apple should setup a FOSS project to address the current inconveniences associated with third party package inclusion and should involve and pay Github somehow.
Afaik the swift package manager works only for swift code, so it's not a replacement.
Also, it's a very bad habit to lord it over the developers who supply software for your most valuable product. We've seen a lot of stories lately that indie app development is dead. We also regularly see how weak Apple is in web services (cloud sync).
So either developers invest a lot of time to build something that works (and maybe even share it on GitHub) or they will stick with the holy Apple solution and provide a crappy user experience and go bankrupt. Companies like Google or Amazon (AWS) do a very good propag^developer relations job, IMHO way better than Apple ever did (in the last 10 years).
[1] In my opinion, they have since lost that edge on the UI.
This one is pretty big. https://github.com/torvalds/linux
I wonder how much traffic the Github Linux repo gets. Seems to me that people who want to use Linux, will go get a distro instead. And people who want to develop the kernel, will follow the kernel development process (which doesn't rely on Github).
https://github.com/apache - Lots of mirrors but many projects use it as their main source.
- My school uses GitHub to host and track our software engineering project (which still can be argued as OSS).
- People using GitHub issue system as a forum.
- Friends uploading pdfs to GitHub.
- Recently people posted on HN about using GitHub to generate a status page.
I think this is a really bad trend and people should stop doing that.
Using GitHub Issues as a forum and a source for generating status pages are both ok from a use/abuse perspective, but you may not have the best experience since that isn't what Issues is intended for.
It should be fine to come up with new ways to use Github, as long as it's not causing excessive load.
Imagine a world where GitTorrent is fully developed, includes support for issue tracking, and has a nice GUI client that makes the experience on-par with browsing github.com.
I mention this not as an "Everybody bail out of GitHub and run to GitTorrent!!!" sort of statement, because I believe GitHub's response here was excellent and confidence inspiring. But it's an unnatural relationship for community supported, open source projects to host themselves on commercial platforms such as GitHub. GitHub primarily hosts them to promote its business. That's not necessarily a bad thing, but it results in impedance mismatches like the one demonstrated here.
That isn't to say that a mature GitTorrent would replace GitHub. Rather, I envision GitHub becoming a supernode in the network, an identity provider, and a paid seed offering, all alongside their existing private repo business.
Honestly, once I scrape a few projects off my plate, I'm inclined to dive into GitTorrent, see where it's at in development, and see if I can start contributing code. It just seems like such a cool and useful idea.
The potential downsides seem much more annoying. Do you really want to have your dependencies on an overloaded central server somewhere?
How so? I bet the cocoapods team knew they were hammering Github with that gigantic repo. They just didn't care and expected Github to just give them more bandwidth, for free.
The response from mhagger is unnecessarily apologetic, and I predict we'll see an official update from GitHub on this soon.
I don't know about that. Both oh-my-zsh[1] and emacs prelude[2] use git repos as their code distribution mechanisms, and that works really well. I think the real issues here are exactly what is called out in the issue: poor usage of git, and poor directory layout.
[1] https://github.com/robbyrussell/oh-my-zsh [2] https://github.com/bbatsov/prelude
CloudFlare starts at $0 and doesn't meter/charge for bandwidth. CloudFront charges 9 cents per GB and is integrated with other AWS APIs (which can be very useful). Both those solutions could be managed with a donation pool, I would try the CloudFlare free tier first.
That price point works out to $0.35/Gbyte. More typical list pricing for US/EU is in the $0.10-0.15/Gbyte ballpark. Prices decline rapidly as your utilization approaches 1PB per month.
Things that GitHub suggested help with that: faster check for updates, breaking up big directories so diffs are computed faster.
At the time, I wrote a script that hammered git commits into a repository using different strategies and looked at what the git repository would look like after 100,000 and a million commits. The "one version per file, nested in a flat structure" had serious issues.
There may still be scaling limits with the Cargo approach, but if we reach them, we have plans to create a new registry with a new initial commit and let the old registry age out, then rinse/repeat. At the moment, we haven't hit limits yet (with about 1/3 of the packages that Cocoapods has).
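Cargo's index sidesteps the one-giant-directory problem by sharding package names into nested prefix directories. A sketch of that mapping (reproduced from my understanding of Cargo's scheme, so treat the exact rules as approximate):

```python
def index_path(name: str) -> str:
    """Shard a package name into nested directories, Cargo-index style.

    Short names get their own length-based buckets; longer names are
    sharded by their first four characters, so no single directory
    ever accumulates tens of thousands of entries.
    """
    n = name.lower()
    if len(n) == 1:
        return f"1/{n}"
    if len(n) == 2:
        return f"2/{n}"
    if len(n) == 3:
        return f"3/{n[0]}/{n}"
    return f"{n[:2]}/{n[2:4]}/{n}"

print(index_path("serde"))  # se/rd/serde
print(index_path("a"))      # 1/a
```

Small directories keep git's tree-diff cost bounded, which is exactly the pathology the flat 16k-entry Specs directory runs into.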
Go get on the other hand doesn't keep any index. It just uses the url to download the dependency because of the mapping "url==project name" that exists with go projects.
$ git fetch --depth=2147483647
Shallow (depth=1) can be converted into a full clone with the above.
I had no idea it was just CocoaPods repo because my other repos were working fine. I accepted defeat, went to bed and everything was working great in the morning.
1. NuGet packages, which are built from the GitHub repository but then redistributed over a non-Github CDN (NuGet's)
2. tsd management tool (https://github.com/Definitelytyped/tsd), which looks like it prefers Github's CDN raw URLs rather than full/shallow git clones
3. typings management tool (https://github.com/typings/typings) with "ambient" typings searches (`--ambient` flag), which I believe also prefers Github's CDN raw URLs, as it essentially forked from TSD
That said, it's past time to move beyond the giant huge DefinitelyTyped repo, and I for one heartily recommend people migrate to typings which has better support for NPM and other module and package management systems, as well as smaller unit/module-focused Github repositories.
> We would like to provide a package index in the future, and are investigating possible solutions.
[1]: https://github.com/apple/swift-package-manager/blob/master/D...
1. You can't edit the frameworks unless you open a separate project, re-edit and recompile. With Pods, you can edit the Pod in your workspace.
2. Carthage doesn't go the last mile to bundle the framework into your project.
If they addressed those two things, I feel like Cocoapods would probably start losing a lot of steam. Although, after watching some videos about what the goals of Carthage were from the creators, I doubt that those two things will be addressed so I'm waiting for Swift's Package Manager.
The main difference is that homebrew actually updates the git tree to provide updated versions of package specs. CocoaPods adds a new directory and some files for each package version, causing the repo to balloon.
No matter which way you slice it, what CocoaPods is doing is a bit daft, especially at their scale.
I don't know what's normal; last night was one of my first iOS projects.
Seems like a poor design decision on the CocoaPods side.
The problem is not packages, it's the index, which contains 16,000 subdirectories.
Short term sure, they’re doing the right thing, implementing a nice way to manage the free rider problem without hurting them too much.
But long term it’s different.
Financially, one average programmer = $80k/year, one average cloud server = $4k/year. And, GitHub has hundreds of millions of venture capital. More than enough to provision a few more servers, even if they will be installing new servers just for those pods.
The way they act now will lead to someone developing a decentralized git+torrent hybrid. When that happens, sure, those pods will no longer consume GitHub's precious resources. However, for the rest of the github users, there will be no reason to stay on GitHub either.
Not to mention that "just buy another server for this one project" sounds like something CocoaPods should pay for.
Very likely true, but I don't see how that's related.
>they start to have pathologies when you have many entries in the same directory
Only true for inefficient filesystems like FAT.
For NTFS, 16k entries is nothing; the performance will start to degrade (due to directory fragmentation) at around 100k entries: http://stackoverflow.com/a/291292/126995
>"just buy another server for this one project" sounds like something CocoaPods should pay for.
I don’t think that’s how 21 century economy works in this case.
Github’s value is likely between $0.75B and $2B.
Bad PR caused by this story will exceed 10 years TCO of that extra server.