Afterwards I never really saw the point of any search system other than Elasticsearch, because of the streaming capabilities it gives you.
At work we just materialize the data from PG into ES and take advantage of the powerful ES queries and redundancy. Scaling up by just adding nodes is easier.
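For anyone wondering what "materialize from PG into ES" can look like in practice, here is a rough sketch of just the payload-building step, assuming the rows have already been fetched from Postgres as plain dicts. The function name `to_bulk_ndjson`, the `products` index name, and the `id` column are made up for illustration. The output follows the newline-delimited format that the Elasticsearch bulk API expects (an action line followed by a source line per document); actually sending it is left out.

```python
import json


def to_bulk_ndjson(rows, index_name, id_column="id"):
    """Build a bulk-API body: one action line plus one source line per row.

    `rows` is assumed to be an iterable of dicts, e.g. rows already read
    from Postgres. All names here are hypothetical.
    """
    lines = []
    for row in rows:
        # Action line: index this document under the row's primary key.
        action = {"index": {"_index": index_name, "_id": str(row[id_column])}}
        lines.append(json.dumps(action))
        # Source line: the document itself; default=str covers dates, decimals, etc.
        lines.append(json.dumps(row, default=str))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


rows = [{"id": 1, "title": "red running shoes"}, {"id": 2, "title": "blue jacket"}]
body = to_bulk_ndjson(rows, "products")
```

Using the primary key as `_id` means re-running the job overwrites documents instead of duplicating them, which is usually what you want when ES is only a derived copy.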
Use HN to poll for opinions and experiences from others, not for things that take 30 minutes to resolve.
That said, this question probably works better somewhere like Reddit's programming subreddit or an IRC channel where the Elasticsearch crowd hangs out. After going that route, it’s probably fine to poll here based on that research.
Sometimes posts are shit, and it's OK to call it out to hopefully improve this site collectively.
Solr is good. I've been wanting to try Lunr [1] for small sites.
We wrote our own search engine at that point. You are right that there are a lot of little “devil in the details” issues. But overall it was a fun experience.
This was needed to support some specific machine learning workflows in the search ranking process — which could not be used if we paid the high latency cost to first get preliminary results in Solr.
So we took a “create your own index data structures” approach with index data (both the normalized bag of words vectors and companion data like boolean filters), which allowed us to highly optimize the initial broad ranking query. Latency was low enough that it allowed the time cost of calling follow-on machine learning services.
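To make the idea concrete, here is a toy sketch of that kind of structure. It is not our actual implementation: the class `TinyIndex`, the whitespace tokenizer, and the `in_stock` flag are all made up for illustration. It keeps an L2-normalized bag-of-words vector per document, a companion dict of boolean filters, and ranks candidates by dot product (which equals cosine similarity once the vectors are normalized).

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def normalize(tokens):
    """Turn a token list into an L2-normalized bag-of-words vector."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


@dataclass
class TinyIndex:
    vectors: dict = field(default_factory=dict)   # doc_id -> normalized vector
    postings: dict = field(default_factory=dict)  # term -> set of doc_ids
    flags: dict = field(default_factory=dict)     # doc_id -> boolean filters

    def add(self, doc_id, text, **flags):
        vec = normalize(text.lower().split())
        self.vectors[doc_id] = vec
        self.flags[doc_id] = flags
        for term in vec:
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query, top_k=10, **required):
        q = normalize(query.lower().split())
        # Candidates: any document sharing at least one term with the query.
        candidates = set()
        for term in q:
            candidates |= self.postings.get(term, set())
        scored = []
        for doc_id in candidates:
            doc_flags = self.flags[doc_id]
            # Apply boolean filters before doing any scoring work.
            if any(doc_flags.get(k) != v for k, v in required.items()):
                continue
            vec = self.vectors[doc_id]
            score = sum(w * vec.get(term, 0.0) for term, w in q.items())
            scored.append((score, doc_id))
        # Highest score first; break ties by doc_id for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored[:top_k]]


index = TinyIndex()
index.add("a", "red running shoes", in_stock=True)
index.add("b", "red dress shoes", in_stock=False)
index.add("c", "blue running jacket", in_stock=True)
results = index.search("red running shoes", in_stock=True)  # ['a', 'c']
```

The short list that comes back from a broad pass like this is what you would hand to the slower follow-on ranking services.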
This was for a fairly high-traffic product search engine at an online retailer. It ended up working very well. Over about two years we moved all search traffic onto the in-house platform, including the parts that did not need the machine learning services. Query latencies went down across all our traffic, and we retired the original Solr implementation.
Wouldn’t be the right choice for everyone, but it informs my opinion a lot about whether it is worth building an in-house search engine specifically to replace Solr. I’d suspect a lot of medium-sized or large companies running Solr should seriously consider it.
For some value of "highly performant". I remember its search (exact substring match) being significantly slower than simply running grep on the same data (JSON documents produced from syslog logs) stored in flat files.
It did have several advantages over grep in that scenario (e.g. having a structured language and being accessible for other programs through network), but performance was not one of them.
> JAXenter: You started Compass, your first Lucene-based technology, in 2004. Do you remember how and why you became interested in Lucene in the first place?
> Shay Banon: Reminiscing on Compass birth always puts a smile on my face. Compass, and my involvement with Lucene, started by chance. At the time, I was a newlywed that just moved to London to support my wife with her dream of becoming a chef. I was unemployed, and desperately in need of a job, so I decided to play around with “new age” technologies in order to get my skills more uptodate. Playing around with new technologies only works when you are actually trying to build something, so I decided to build an app that my wife could use to capture all the cooking knowledge she was gathering during her chef lessons.
> I picked many different technologies for this cooking app, but at the core of it, in my mind, was a single search box where the cooking knowledge experience would start: a single box where typing a concept, a thought, or an ingredient would start the path towards exploring what was possible.
> This quickly led me to Lucene, which was the de facto search library available for Java at the time. I got immersed in it, and Compass was born out of the effort of trying to simplify using Lucene in your typical Java applications (conceptually, it simply started as a “Hibernate” (Java ORM library) for Lucene).
> I got completely hooked with the project, and was working on it more than the cooking app itself, up to a point where it was taking most of my time. I decided to open source it a few months afterwards, and it immediately took off. Compass basically allowed users to easily map their domain model (the code that maps app/business concepts in a typical program) to Lucene, easily index them, and then easily search them.
> That freedom caused people to start to use Compass, and Lucene, in situations that were wonderfully unexpected. Imagine already having the model of a Trade in your financial app, one could easily index that Trade using Compass into Lucene, and then search for it. The freedom of searching across any aspect of a Trade allowed users to convey this freedom to their users, which proved to be an extremely powerful concept.
> Effectively, this allowed me to be in the front seat of talking and working with actual users that were discovering, as was I, the amazing power that search can have when it comes to delivering business value to their users. Oh, and btw, my wife is still waiting for that cooking app. Now, 10 years later, it is the basis of Elasticsearch.
https://jaxenter.com/elasticsearch-founder-interview-112677....