NIH - Not Invented Here, and redoing an open-source project.
- GitHub said they used HAProxy before; I think GitHub's use case could very well be unique, so they created something that works best for them. They don't have to re-engineer an entire code base. When you work on a small project, you can send a merge request to make changes, but this is something bigger than just a small bugfix ;). I totally understand them creating something new.
- They built on a number of open-source projects, including HAProxy, iptables, FoU, and PF_RING. That is what open source is: use open source to create what suits you best. Every company has some edge cases, and I have no doubt GitHub has a lot of them ;)
Now, thanks GitHub for sharing; I'll follow up on your posts and hope to learn a couple of new things ;)
But they also talk about DNS queries, which are still mainly UDP/53, so I'm hoping GLB will have UDP load-balancing capability as gravy on top. I'm excluding zone transfers, DNSSEC traffic, and (growing) IPv6 DNS requests on TCP/53 because, at least in carrier networks, we're still seeing a tonne of DNS traffic that fits within plain old 512-byte UDP packets.
Looking forward to seeing how this develops.
EDIT: Terrible wording on my part to imply that GLB is based on HAProxy code. I meant to convey that GLB seems to have been designed with deep experience operating HAProxy, as evidenced by the quote: "Traditionally we scaled this vertically, running a small set of very large machines running haproxy [...]".
"Now that you have a taste of the system that processed and routed the request to this blog post we hope you stay tuned for future posts describing our director design in depth, improving haproxy hot configuration reloads and how we managed to migrate to the new system without anyone noticing."
That leads me to believe it involves HAProxy.
Their creation is justified on the grounds that "no one else has these kinds of problems", but then they open source it as if lots of other people could benefit from it. Why open source something if it has an expected user base of 1?
Again, I am not surprised by this. The whole push of GitHub is not to create a community that works together on a single project in a collaborative, consensus-based way, but rather lots of people doing their own thing and only occasionally sharing code. It is no wonder that they follow this meme internally.
And since we're talking about GitHub, haven't they already launched a highly successful pair of projects in Atom/Electron, in areas where both had competition? Why start with negativity before we see what they come out with?
I'm skeptical that Github's load balancing requirements are that different from any of the other large file hosts and SaaS companies. But it's possible and I'm not in a position to tell. That being said, the "NIH syndrome" is a largely overlooked problem in our industry and I think it's reasonable to raise concerns over new projects that may be reinventing the wheel.
It isn't as simple as here's my massive rewrite, click the accept button and everything works out for the open source community.
Let me be the first to say that the level of politics, circle jerking and knowing people is ridiculous.
I have >3 month old pull requests to add tests which have never been looked at, whereas someone who knows the project maintainer will get a PR looked at and merged the next day.
True open source fashion is also the freedom to work on whatever you want.
"Just think about it, if they would have contributed to foo instead of working on bar, then foo would be twice as good!" keeps being thrown around every time someone announces something new around here.
Open Source is about just that - allowing anyone and everyone to have access to the source and (if free) making their own version.
There never needs to be justification to write something from scratch, even if it's been done a million times before.
You can appreciate open-source, you can wish that proprietary code was open source, but you never have the right to tell people what they should do, nor are you ever likely to be correct as to what they should do - you are not them.
I'm bothered by the increasing prevalence of "never invent here".
FOSS is great, but if it's not meeting your needs then writing your own is perfectly valid.
Two reasons.
1) Recruiting. Check out our awesome code! Don't you want to work on this too?
2) Our unique problem today will be the problems of everyone in three years.
This has been borne out with the Netflix open-source work. At the time those were problems unique to Netflix -- now a bunch of people are using that software or derivatives of it.
Joining a pre-existing project is more reasonable when you just can't replicate the basis of the project without learning a lot of new things.
That's why there's a plethora of compile-to-JS languages, but only a few actual JavaScript virtual machines.
Facebook: https://www.usenix.org/conference/srecon15europe/program/pre...
Google: http://static.googleusercontent.com/media/research.google.co...
I think very few HN readers are really in a position to have an informed opinion on GitHub's decision to build a new piece of software rather than use an existing system.
Personally I find this area quite interesting to read about because it is very difficult to build highly available, scalable, and resilient network service endpoints. Plain old TCP/IP isn't really up to the job. Dealing with this without any cooperation from the client side of the connection adds to the difficulty.
I look forward to hearing more about GLB.
> Over the last year we’ve developed our new load balancer, called GLB (GitHub Load Balancer). Today, and over the next few weeks, we will be sharing the design and releasing its components as open source software.
Is it common practice to do this? Most recent software/framework/service announcements I've read were just a single, longer post with all the details and (where applicable) source code. The only exception I can think of is the Windows Subsystem for Linux (WSL) which was discussed over multiple posts.
Take, for example, this post from CoreOS back in March 2016 that suggested that they might know a way to improve systemd-journald performance: https://coreos.com/blog/eliminating-journald-delays-part-1.h...
It smelled suspicious, but its release generated a bunch of noise on HN anyway. And they never followed up with subsequent parts, which suggests to me that they never found a solution in the first place.
I'm not suggesting that GitHub is blowing smoke -- if you truly have a solution, that's great! But there's no harm in gathering documentation and source code and cleaning it up and waiting until it's good and ready to go. Otherwise, I frankly mistrust the motives and abilities of those involved. Call me cynical if you must.
To paraphrase from another industry, "sell no wine before its time." There's a lot of wisdom there that is equally applicable to products in our industry too.
I think people choose this pattern when:
* engineers implement something cool
* management want engineering content to promote the company/drive recruitment
* the engineers are pressed for time/not professional technical writers
Some bigger companies stagger the content when it's really hefty and makes sense to chunk up, plus it drives repeat visitors. For smaller companies, the schedule usually slips, and the first post is usually "look how hard this problem is! wow it's really really hard! isn't it amazing that we even tried to fix it? OK see you next time!".
Yes, I think GitHub and eBay are small companies.
[0]: http://www.technology-ebay.de/the-teams/mobile-de/blog/tamin... [1]: http://www.technology-ebay.de/the-teams/mobile-de/blog/tamin... [2]: http://www.technology-ebay.de/the-teams/mobile-de/blog/tamin...
See the product unveiling from Apple, GoPro and probably others that I haven't been following.
Hype and slow release, it's the new clickbait.
- in a traditional L4/L7 load balancing setup (typically what is described in my very old white paper "making applications scalable with load balancing"), the first layer (L3-4 only, stateless or stateful) is often called the "director".
- the second level (L7) is necessarily based on a proxy.
For the director part, LVS used to be used a lot over the last decade, but over the last 3-4 years we've been seeing ECMP implemented in almost every router and L3 switch, offering approximately the same benefits without adding machines.
ECMP has some drawbacks (it breaks all connections during maintenance because of its stateless hashing).
LVS has other drawbacks (requires synchronization, cannot learn previous sessions upon restart, sensitivity to SYN floods).
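The ECMP drawback above is easy to demonstrate: if a router picks a next hop as hash(flow) % N, then removing one node for maintenance changes N and remaps most flows. A sketch (generic MD5 hashing for illustration, not any particular router's algorithm):

```python
import hashlib

def ecmp_pick(flow: str, n: int) -> int:
    """Stateless ECMP-style next-hop selection: hash the flow key, modulo N."""
    h = int(hashlib.md5(flow.encode()).hexdigest(), 16)
    return h % n

# 1000 synthetic flow keys (client ip:port -> dst port).
flows = [f"198.51.100.{i % 250}.{i}:5432->22" for i in range(1000)]

before = [ecmp_pick(f, 4) for f in flows]  # 4 director nodes
after = [ecmp_pick(f, 3) for f in flows]   # one node drained for maintenance

moved = sum(b != a for b, a in zip(before, after))
# With modulo hashing, roughly 3/4 of flows land on a different node,
# resetting their TCP connections -- not just the 1/4 that were on the
# drained node.
```

This is exactly the failure mode consistent hashing avoids.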
Basically, for the director they did something between the two, using consistent hashing so that they neither need connection synchronization nor break connections during maintenance periods.
This way they can hack on their L7 layer (HAProxy) without anyone ever noticing because the L4 layer redistributes the traffic targeting stopped nodes, and only these ones.
Thus the new setup is now user->GLB->HAProxy->servers.
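The consistent-hashing idea can be sketched in a few lines. This is a generic hash ring with hypothetical backend names, not GLB's actual implementation: removing a backend only remaps the flows that were mapped to it, so draining one HAProxy node leaves every other connection untouched.

```python
import hashlib
from bisect import bisect

class ConsistentHashRing:
    """Each backend gets `replicas` virtual points on a ring; a flow key
    maps to the next point clockwise. Removing a backend deletes only its
    points, so only its flows get remapped."""

    def __init__(self, backends, replicas=100):
        self.replicas = replicas
        self.ring = {}           # point -> backend
        self.points = []         # sorted ring points
        for b in backends:
            self.add(b)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, backend: str) -> None:
        for i in range(self.replicas):
            self.ring[self._hash(f"{backend}:{i}")] = backend
        self.points = sorted(self.ring)

    def remove(self, backend: str) -> None:
        for i in range(self.replicas):
            del self.ring[self._hash(f"{backend}:{i}")]
        self.points = sorted(self.ring)

    def lookup(self, flow_key: str) -> str:
        idx = bisect(self.points, self._hash(flow_key)) % len(self.points)
        return self.ring[self.points[idx]]

ring = ConsistentHashRing(["proxy-a", "proxy-b", "proxy-c"])
flows = [f"10.0.0.{i}:5000->443" for i in range(200)]
before = {f: ring.lookup(f) for f in flows}
ring.remove("proxy-b")      # drain one L7 node for maintenance
after = {f: ring.lookup(f) for f in flows}
# Only flows that were on proxy-b move; everyone else keeps their backend.
moved = [f for f in flows if before[f] != after[f]]
```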
And I'm very glad to see that people finally attacked the limitations everyone has been suffering from at the director layer, so good job guys!
When I think of "bare metal" I think of a single image with disk management, network stack, and what few services they want all running in supervisory mode. Basically the architecture of an embedded system.
I was quite excited when I read it, and felt quite let down when I followed up.
That being said, why introduce a new piece of technology you're planning to release without actually releasing it, or at least giving a firm deadline? This isn't a press release; it's a blog post describing the technical details of a load balancer that is apparently already in production and working, so why not release the source when the technology is introduced?
pfsync, LVS, etc. use multicast to share connection state, which we also wanted to avoid.
We're struggling with our load balancers right now. We're using Azure load balancers and then HAproxy. But the Azure ones sometimes don't work. Luckily the new network type on Azure supports floating IPs so we can set something up ourselves https://gitlab.com/gitlab-com/infrastructure/issues/466
Years ago I worked at Demon Internet, and we tried to give every dial-up user a piece of webspace - just a disk, always connected. Almost no one ever used it. But that is what the web is for: storing your Facebook posts and your git pushes and everything else.
No load balancing needed because almost no one reads each repo.
The problem is that it is easier to drain each of my different things into globally centralised locations: easier for me to just load it up on GitHub than keep my own repo on my cloud server, easier to post on Facebook than publish myself.
But it is beginning to creak. GitHub faces scaling challenges; I am frustrated that some people are on WhatsApp and some on Slack and some on Telegram, and I cannot track who is talking to me.
The web is not meant to be used like this. And it is beginning to show.
I did. That was one of the best features of Demon at the time, when 10MB of Web space could cost you £100+ per year :-) Thanks for your part in making it work!
> But it is beginning to creak. GitHub faces scaling challenges,
I don't agree that GitHub facing scaling issues means the web is creaking. More like old wooden boats being replaced by big, sturdy battleships. I think the web is getting stronger thanks to engineers facing the challenges coming their way.
> I am frustrated that some people are on whatsapp and some slack and some telegram, and I cannot track who is talking to me.
If you're annoyed by people messaging you through multiple platforms, it seems the solution would be to have only one provider. But you earlier called that "what is wrong with the web today," and said that we should have distributed systems.
Well, yes. That's the point. It was designed as an entirely distributed setup. It's crazy that in order to post a message to my neighbours I have to send data to Facebook in SV and just as crazy that two devs on the same team need to write their code commits in a load balanced mega server in ... Err ... Washington? Wherever.
And I don't mind having lots of clients but I object to no open standards, incompatible and frequently unavailable APIs and lack of control over my messages and how they are dealt with. I want procmail for messaging platforms ! And I want a pony !
Yes, multiple platforms is overhead. That's why I mostly use mail, basically never IM, even for the most short-lived or informal conversations.
It's no problem at all for me having email conversations with people whose mailboxes are hosted at a diverse set of providers.
A fully distributed "web" of personally managed data is where we'll get to one day. It might need a few cycles of centralisation and distribution, though.
It's also one of the reasons why we must not let the "privacy is dead" and "back door encryption" folk have their way.
To be fair, though, I don't think GitHub is a big part of that problem.
I was at PyCon UK, and watched 60 kids connecting micro:bits to RPis and inventing ways to send signals over BTLE without a stack. I would like to solve the whole world's problems, but if the past twenty years have taught us anything, it's that a few kids can invent fire and we will all follow. So I think we are going to be in good hands.
My understanding is that the likes of, for example, Cloudflare or EC2 have a pretty solid system in place for issuing geoDNS responses (historical latency/bandwidth, ASN or geolocation based DNS responses) to direct random internet clients to a nearby POP. Building such a system is not that difficult, I am fairly confident many of us could do so given some time and hardware funding.
Observation #1: No geoDNS strategy.
Observation #2: Limited global POPs.
Given that the inherently distributed nature of git probably makes providing a multi-pop experience easier than for other companies, I wonder why Github's architecture does not appear to have this licked. Is this a case of missing the forest for the trees?
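As a toy illustration of the "not that difficult" claim: the geolocation piece of a geoDNS responder boils down to picking the nearest POP by great-circle distance. A sketch with a hypothetical POP table (real systems also weigh historical latency, ASN, and capacity):

```python
import math

# Hypothetical POP locations: name -> (latitude, longitude).
POPS = {
    "us-east": (38.9, -77.0),   # Ashburn-ish
    "eu-west": (53.3, -6.3),    # Dublin-ish
    "ap-south": (1.35, 103.8),  # Singapore-ish
}

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 \
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(h))

def nearest_pop(client_latlon):
    """Pick the POP closest to the client's (geolocated) coordinates.
    A real geoDNS server would return this POP's A/AAAA records."""
    return min(POPS, key=lambda p: haversine_km(POPS[p], client_latlon))
```

A DNS server would geolocate the resolver (or use EDNS Client Subnet) and answer with the records for `nearest_pop(...)`.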
Back in the day we did this with Netscalers doing L7 load balancing in clusters, and then Cisco Distributed Directors doing DNS load balancing across those clusters.
It can take days/weeks to bleed off connections from a VIP that is in the DNS load balancing, but since you've got an H/A pair of load balancers on every VIP you can fail over and fail back across each pair to do routine maintenance.
That worked acceptably for a company with a $10B stock valuation at the time.
There are some brainiacs pushing these magic solutions on us, and one of the promises is that load balancing is not an issue; even better, it's not even being talked about.
Please, please, tell me there's something I'm missing.
Anycast just means that multiple hosts share the same IP address, as opposed to unicast. When all the nodes sharing the same IP are on the same subnet, "nearest" is kind of irrelevant, so the implication is different.