One very good example is Amazon's Redis. Amazon figured out that Redis's asynchronous replication didn't work at scale, so instead of fixing the issues upstream they chose to develop their own Redis in house and monetize it.
https://aws.amazon.com/memorydb/
"Enhanced version" means patches made by AWS. https://aws.amazon.com/elasticache/redis-details/
You're entitled to your opinion, but your line of reasoning for how MemoryDB for Redis came to exist, or why it isn't in upstream Redis, is not factual. MemoryDB's architecture uses Amazon's homegrown log replication services, as pointed out by Werner Vogels in his blog post about MemoryDB[0]. This architecture is fundamentally incompatible with upstream Redis. The real reason why MemoryDB for Redis exists is far less juicy: MemoryDB came about by meeting customers where they were. Customers love Redis, especially for its non-caching features, but replication is a headache with all the existing solutions today.
Also as far as I know one of the lead committers to Redis is from AWS.
0 - https://www.allthingsdistributed.com/2021/11/amazon-memorydb...
I’m a total AWS fanboy, but even I cringe when I read that superficial, customer-centric sound bite. You know who meets people where they are? FOSS maintainers.
“Also as far as I know one of the lead committers to Redis is from AWS”
Conveniently cryptic. What does "from AWS" mean? Do they still work there? Did they specialize in log replication?
Madelyn Olson, at present an AWS employee, is one of the members of the Redis core team [1]. The invitation was extended because she had "been actively involved in Redis development for several years, contributing numerous changes throughout Redis, including bug fixes and features."
For whatever reason they chose BSD. And now Amazon made some improvements and is not contributing back.
Not sure why anyone is surprised.
But yes, this is the reason I won't work for free on my own or others' BSD or MIT licensed projects.
ianal, but imo, Apache License v2, Mozilla Public License v2, and xGPLs v3 are better at protecting the rights of the consumers (including contributors).
Oracle didn't build their company in the style of AWS Redis, that cloning maneuver. Oracle's database was a pioneer. Oracle didn't get where they are by cloning open source and claiming it as their own. Despite the numerous bad things that can be said about Oracle's culture, that's not one of the key negatives about Oracle.
''' Amazon OpenSearch Service (successor to Amazon Elasticsearch Service)
Run and Scale OpenSearch and Elasticsearch Clusters (successor to Amazon Elasticsea... '''
This seems like a petty, small win from the Elasticsearch people. I understand AWS has a history of gobbling up OSS and productizing it, and that that's detrimental, but it's hard to see Elastic, Inc as anything but sore that they got their lunch eaten here. Maybe that's justified. But it comes off as incredibly petty.
(disclaimer: i used to work at aws, but not anywhere near the referenced offerings).
IMO, they chased the benefits of being open source and then changed course when the costs to them exceeded the benefits (which is fine for code going forward), but trademark concerns aside, I can’t see AWS as the bad actor here.
With Microsoft + GitHub intensifying their investments in F/OSS, AWS had to play ball. It is smart, not petty on anyone's part.
Judging from the tone of the article, I am glad Elastic is content in their current business relationship with AWS. Hopefully, the companies also find an agreement to have AWS' OpenSearch fork merged back in, as well.
disclaimer: ex-AWS, but zero insider information.
Yes, it is. Why the scare quotes?
Lots of things are legally allowed that would be very detrimental to me. Or you.
> Those companies can’t simultaneously claim to be competent and to have made that choice without knowing what they were doing.
I'm glad you can tell the future.
> IMO, they chased the benefits of being open source and then changed course when the costs to them exceeded the benefits (which is fine for code going forward), but trademark concerns aside, I can’t see AWS as the bad actor here.
You can at least see how the system has big problems, even if you don't want to blame Amazon for them, right?
I am guessing you didn't like how Amazon was profiting from Elasticsearch's work... so would you retrospectively change Elasticsearch's license to SSPL? Or maybe help all companies: prohibit all permissive licenses (Apache, MIT, BSD, etc.) entirely and force people to choose either AGPL, SSPL or fully commercial? Or make a law that no free software license may apply to companies with >$1 billion in revenue?
While we are at it, how about prohibiting commercial flat-rate unlimited site licenses? The huge companies can take advantage of those too.
The thing is, I bet Elastic's original decision to use Apache was to get more users, at the expense of also getting more competition. Now they have their users but they don't like the price they paid for it. This may have been a bad business decision on their part.. but I don't see "big problems" there, companies make bad business decisions all the time.
The reason for everyone not selling the software in question to prefer F/OSS is exactly that it requires surrendering the kind of exclusive control that is also what allows you to build a company around the software as a product.
Yes, this means that F/OSS as such is unlikely to be the key product of a successful company. That's always the way it has been with F/OSS.
“But if I can’t center my business on F/OSS-as-product and I can't center my business on commercial software for the same use because F/OSS products, even if somewhat inferior themselves, develop more robust ecosystems, then how am I supposed to compete with free?” one might ask. And the answer is this: “Maybe you’re not: you are not entitled to a business model.”
IANAL, and certainly not an expert on trademark law. Hopefully someone with more legal knowledge can provide some resources.
Even if it is just petty, it's petty against Amazon, and I for one don't really feel the need to have sympathy for them.
Source: I have a best-selling book author friend who has been defending her trademark on a somewhat popular pop-culture term this way for a few decades.
Of course, Amazon has every legal and ethical right to continue providing their fork under a new name as they’re now doing. That’s the whole point of open source.
I’ve been tossing up moving our workloads to Elastic Cloud anyway, because AWS ES Service is a source of constant headaches for us. Feels like at least once a week a server ends up in a state where we can’t fix it, and AWS engineers have to manually fix their internal state.
Their standard response is “add more nodes”; well, we did that, and it is costing us an arm and a leg, and it didn’t fix the problems. (Plus, now we have new problems where networking blips appear to be causing quorum problems and sending the cluster into a death spiral.)
Purely off the customer experience, it feels like Elastic Cloud has to be better; the whole licensing debacle has definitely turned me off Elastic though.
We saved a bunch of money and gained performance by using our own cluster. That cluster hasn't gone down since... years later.
It's very difficult to debug these problems when you don't have direct access to Elasticsearch's configuration... what would normally take minutes to verify can take hours to isolate.
Effectively a newer config file was lacking a jvm.options line that changed the behavior with an older machine setup. I would not be surprised if AWS deployed a new config to an old environment.
Unfortunately, not having direct machine access, I could not confirm whether this was the case in the failing cluster.
Sounds like it might be a "grass is always greener" scenario. ...maybe I'll spend more time looking into self-hosting...
Their move away from open-source has been unfortunate. For that and some other reasons we've ended up more impressed with Logz.io and Splunk SaaS.
We’ve rewritten most of the interactions with the older version of ES into a new service and use a self-managed OpenSearch cluster on Graviton instances, and it’s the most stable Elasticsearch/Solr solution I’ve ever interacted with.
If you just need "Lucene but clustered" I highly suggest looking at Solr instead; its design is much more straightforward and it has pretty much all the most important knobs for actual indexing that ES has.
If however you are tightly coupled to ES API or use it with 3rd party systems you are sort of up shit creek without a paddle...
Ran ES at large scale for many years, eventually gave up and only use Solr or custom built search engines these days.
This. I have had similar experiences, and adding more nodes just amplified the issues; the nodes were all barely loaded, too.
When OpenSearch was announced, I shared some insights into how both Elasticsearch and OpenSearch were evolving, and I'll share some more up to date insights here.
Looking at recent pull request activity, OpenSearch had 52 contributors
https://oss.gitsense.com/insights/github?q=pull-age%3A%3C%3D...
while Elasticsearch had 181
https://oss.gitsense.com/insights/github?q=pull-age%3A%3C%3D...
The metric that I'm most interested in is how many people committed within the last 14 days compared to those who committed more than 14 days ago. For Elasticsearch, they had 87 contributors, which accounts for 68% of all contributors. OpenSearch had 20, which accounts for 67%. With these numbers, I can ballpark how many people are working on Elasticsearch and OpenSearch full time, and I would say Elasticsearch at the present moment probably has 5 times more people working on it full time than OpenSearch.
An important thing to note is that Amazon has other projects that are related to OpenSearch, so these numbers don't necessarily give the full picture, but it is pretty obvious that Elasticsearch is evolving at a much faster pace, and time will tell if they (OpenSearch) can keep up.
https://github.com/elastic/elasticsearch/graphs/contributors
https://github.com/opensearch-project/OpenSearch/graphs/cont...
Over an order of magnitude more development on Elasticsearch than OpenSearch since that fork
repo | commits | authors
------------+---------+---------
elastic | 2719 | 179
opensearch | 265 | 54
Here's a breakdown based on commits with at least 15 lines of code churn (lines added, changed or deleted).
repo | commits | authors
------------+---------+---------
elastic | 1681 | 122
opensearch | 153 | 41
What is interesting about these numbers is that they clearly show a lot of people (57 authors) took the time to create a small pull request, which goes to show how popular the project is.
But I still think having a bunch of folks contribute isn't trivial and is worth highlighting (57 is a ton for many projects), and anytime I see a truly OSS project at this scale I think it's a good thing.
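For anyone who wants to check the ratios, here is a minimal sketch that only uses the figures from the two commit tables above. The dictionary and function names are made up for illustration; nothing here queries GitHub.

```python
# Figures copied from the two tables above: (commits, authors) per repository.
all_commits = {"elastic": (2719, 179), "opensearch": (265, 54)}
churn_15_plus = {"elastic": (1681, 122), "opensearch": (153, 41)}


def ratio(table, column):
    """Elasticsearch-to-OpenSearch ratio for one column (0 = commits, 1 = authors)."""
    return table["elastic"][column] / table["opensearch"][column]


def small_only_authors(repo):
    """Authors counted in the full table but not in the >=15-line churn table."""
    return all_commits[repo][1] - churn_15_plus[repo][1]


if __name__ == "__main__":
    print(f"commit ratio (all commits):  {ratio(all_commits, 0):.1f}x")
    print(f"author ratio (all commits):  {ratio(all_commits, 1):.1f}x")
    print(f"commit ratio (>=15 lines):   {ratio(churn_15_plus, 0):.1f}x")
    print(f"author ratio (>=15 lines):   {ratio(churn_15_plus, 1):.1f}x")
    print(f"small-only authors, elastic: {small_only_authors('elastic')}")
```

On these numbers the commit gap is roughly 10x while the author gap is closer to 3x, so the "order of magnitude" applies to commits rather than to the number of people involved.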
In practical terms, I now have to worry about supporting both Elastic and OpenSearch as a consultant (always looking for new customers to help, if you need some advice), and I can't really recommend people stick with Elastic with a straight face either, because OpenSearch really is a decent alternative and increasingly the default for new users, as I have noticed with my own customers.
I maintain a Kotlin client for Elasticsearch and I have started work on making that work for Elasticsearch 7, Elasticsearch 8 and OpenSearch; something that is complicated by the fact that Elastic intentionally made sure their Java client (which I depend on) no longer talks to OpenSearch. They also deprecated it and introduced a completely new one recently. And of course the recently released Elasticsearch 8 complicates things with a few new features, compatibility-breaking changes, etc.
On top of that, OpenSearch is starting to get its own features or alternatives to Elastic-only features. I agree that the momentum is still with Elastic, but it is not the case that OpenSearch is a dead fork. I know of several people who quit their jobs at Elastic because of the license change and who are now working on OpenSearch or on OpenSearch plugins. It will be interesting to see how the two forks evolve.
IMHO, the license change is a long-term mistake for Elastic. They've cut themselves off from open source contributors, who can still contribute to making OpenSearch, Lucene, or Solr better. So, that's not going to be helpful in the long term. Elastic has always leaned heavily on open source for contributions and hiring. A lot of their early hires came from the Lucene community and the many researchers contributing to that, and later from people contributing to Elasticsearch. Additionally, a lot of value gets added to the ecosystem by OSS plugins. The authors of those now have to choose between OpenSearch and Elastic. That's not good for Elastic in the long term, as a lot of cutting-edge stuff usually starts in the plugin community.
I don't necessarily like what Amazon did, but I do think they did it the right way from the point of view of creating a credible alternative. And I think that the way Elastic responded to that is throwing the baby out with the bathwater.
Perhaps the most striking statistic is that the total number of PRs for both combined declined: everybody loses. IMHO, they should just roll back the license change and grow their community rather than fragmenting and shrinking it. Amazon is going to get some of their customers either way. I actually think the fork is working out better for Amazon than it is for Elastic currently. It was a mistake. Amazon is not the most likely steward of open source. But stranger things have happened. Take MS in recent years for example. It's a sound business strategy to not piss off OSS developers. Elastic would do well to take note of that.
repo | month | commits | authors
------------+---------+---------+---------
elastic | 2022-02 | 195 | 66
elastic | 2022-01 | 269 | 68
elastic | 2021-12 | 194 | 64
elastic | 2021-11 | 233 | 59
elastic | 2021-10 | 435 | 74
elastic | 2021-09 | 318 | 64
elastic | 2021-08 | 49 | 30
opensearch | 2022-02 | 18 | 11
opensearch | 2022-01 | 34 | 17
opensearch | 2021-12 | 32 | 10
opensearch | 2021-11 | 19 | 13
opensearch | 2021-10 | 28 | 12
opensearch | 2021-09 | 20 | 10
opensearch | 2021-08 | 3 | 3
It seems like contributions to Elasticsearch haven't changed much (if at all), but what is interesting is that contributions to OpenSearch have been increasing.
It's written in asynchronous Rust, runs at native application speed while using much less memory than Elasticsearch, and comes with even more features than ES, so it's feature-rich, blazing fast and can still benefit from multithreaded CPUs. The downside is that it does not have a distributed indexing mode yet, but that is scheduled for this year (presumably Q4 2022, I guess).
The low end of the market is well covered with solutions that don't really offer a lot of features, are a bit challenged on the scalability front, lack usability, etc. Some of those have more merit than others, of course. Things are not automatically better when they are half implemented in Rust. Lucene is an amazing piece of technology that over the years has resisted multiple attempts by people to do better in other languages. Contrary to popular belief, it's actually pretty good with memory. Most of it is off-heap memory or operating system file caches these days: it relies on memory-mapped files for a lot of things. It's also pretty good at doing things concurrently. E.g. updating index files with 32 CPU cores while also serving queries is not an easy problem to solve. I've indexed documents at a rate of 500M/second on a 30 node cluster once. That's pretty amazing to see happening. I was basically saturating IO and CPUs. Indexed over a billion documents in about 1 hour.
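As a rough back-of-the-envelope sketch, here is what "over a billion documents in about 1 hour" on a 30 node cluster works out to per second and per node. It treats the billion and the hour as exact and assumes an even spread across nodes, which the comment does not claim, so read the results as approximate lower bounds.

```python
# Inputs taken at face value from the comment above; both are approximations.
documents = 1_000_000_000  # "over a billion documents"
seconds = 60 * 60          # "about 1 hour"
nodes = 30                 # "30 node cluster"

# Average cluster-wide indexing rate in documents per second.
docs_per_second = documents / seconds

# Average rate per node, assuming an even spread (an assumption, not a given).
docs_per_second_per_node = docs_per_second / nodes

if __name__ == "__main__":
    print(f"cluster: {docs_per_second:,.0f} docs/s")
    print(f"per node: {docs_per_second_per_node:,.0f} docs/s")
```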
Lucene has many issues; scaling isn't one of them.
I recently used Elastic Cloud, and as much as I hate the company, their product is actually really good. I’d always recommend Elastic.
Yes, Elastic’s press release is carefully crafted to give that impression, but, AFAICT, all AWS is doing is using the name of the open fork (“OpenSearch” [0]) for their service, which is now labelled “Amazon OpenSearch Service” and subheaded “(successor to Amazon Elasticsearch Service)” [1], and not using the ElasticSearch name (except as a historical reference.)
It's pretty disingenuous to say they don't open source.
Based off experience with AWS and GCP, AWS has significantly more widely available. On the other hand, Google the company has contributed a lot to OSS (like cgroups in the Linux kernel)
Edit: I see it’s called OpenSearch now.
It’s such a shame that this happened