Would love to hear feedback and how useful this is relative to the existing search.
Compared to Algolia.hn, this gives no filter controls (time window, stories vs. comments, `author:metadat`, sort order, and so on) and no ability to search for exact matches. It failed to turn up anything interesting or even relevant for the 4 or 5 queries I ran.
You've still made it further than I in the HN search engine adventures, which is commendable.
It would be remarkable and interesting to have a super deep search capability that indexes all first-order links on this site.
Is that a valid link? I get an error when opening it.
Algolia has already done the search thing; can the Vectara search be 10x better?
What I do find missing from HN is a way to see things that may interest me but that I missed. I like that the main feed shows everything, ranked purely by popularity, but I don't have the time to go through all the posts, and I almost certainly miss things I would have been interested in.
Though this can be done with collaborative filtering, or other non-AI methods, might this be a decent use case for your AI?
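For what it's worth, here is a minimal sketch of the collaborative-filtering idea. The `upvotes` data is made up for illustration (HN does not publish per-user upvotes), and Jaccard similarity is just one simple choice of similarity measure, not anything HN or Vectara is known to use.

```javascript
// Hypothetical data: which story IDs each user upvoted (not real HN data).
const upvotes = {
  alice: new Set([1, 2, 3]),
  bob: new Set([2, 3, 4]),
  carol: new Set([7, 8]),
};

// Jaccard similarity between two sets: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  let shared = 0;
  for (const x of a) if (b.has(x)) shared++;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Score every story the user has NOT upvoted by summing the similarity
// of each other user who did upvote it, then return the top story IDs.
function recommend(user, allUpvotes, limit = 5) {
  const mine = allUpvotes[user];
  const scores = new Map();
  for (const [other, theirs] of Object.entries(allUpvotes)) {
    if (other === user) continue;
    const sim = jaccard(mine, theirs);
    if (sim === 0) continue; // no overlap, so no signal from this user
    for (const story of theirs) {
      if (mine.has(story)) continue;
      scores.set(story, (scores.get(story) || 0) + sim);
    }
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([story]) => story);
}
```

With the sample data, `recommend('alice', upvotes)` surfaces story 4, because bob shares two upvotes with alice and carol shares none.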
I posted an RSS reader that can do this recently [2] and I'm actively hacking on another [3]. But there are many RSS tools that can do this.
[1] my hunch is that some human expert curation is involved.
Human curation also exists, but I think that is aimed at removing spam and uplifting YC company posts.
(I've been thinking about this not just in terms of HN, but treating all my RSS feeds as one undifferentiated stream and just having a chatbot sort incoming items into whatever bucket it deems most appropriate).
What's stopping me is that it might work, and I doubt making the internet even stickier is good for me long term.
Want more posts about Lisp, Smalltalk and reverse engineering, for example, rather than the usual front page drivel? Search for them.
On one hand I wish Algolia didn't give very old posts a lot of weight (it often prefers to show posts more than 8 years old); on the other hand, old content tends to be from before the Eternal September of tech-adjacent people coming to this forum to discuss tech-adjacent light content, so it's actually a feature. The real value of HN is its archives IMO.
javascript:(function() {function randomDate(start, end) {var date = new Date(+start + Math.random() * (end - start));var day = ("0" + date.getDate()).slice(-2);var month = ("0" + (date.getMonth() + 1)).slice(-2);var year = date.getFullYear();return year + '-' + month + '-' + day;}var startDate = new Date(2007, 9, 9);var endDate = new Date();var randomDateStr = randomDate(startDate, endDate);var newUrl = 'https://news.ycombinator.com/front?day=' + randomDateStr;window.location.href = newUrl;})();
I found a bug. Under the "When will GPT-5 be released?" search results, there are duplicate results. On one of the duplicates, the "username (date)" says "undefined (undefined)"
> Arm says it wants all Snapdragon X Elite laptops destroyed
Not so useful.
So it’s not like it’s irrelevant, even though it is certainly not actually the most relevant one either.
It seems to give better results if you are more specific. For example, try the following search:
how to use iptables effectively
And have a look at the first five or so results.
Also, note that OP said it’s searching about six months’ worth of data. So if anything specific about iptables that you were looking for is older than that, their search tool doesn’t know about it.
One of the most frequent searches I do is to look for a specific comment that I know a user made recently. For example, I might want to look for my own comment here: https://news.ycombinator.com/item?id=40801389 (sorry, this is a slightly political one but I just picked it randomly for test purposes).
Searching Vectara for "n4r9 NHS" produces no results: https://hackernews.demo.vectara.com/?query=n4r9+NHS&filter=
HN's own search however produces the goods in the top result: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
[ EDIT except for this very post :p ]
Maybe 6 days ago is outside the dataset that this is based on?
Some other thoughts/suggestions:
- Ability to click through to the comment itself? At the moment it looks like the link goes just to the main comments page and then I have to find the relevant comment on the page.
- Filter comments vs posts?
- Order by datetime?
- Filter within a date range?
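For comparison, the public Algolia HN Search API already expresses most of the suggestions above as query parameters. A rough sketch follows; `buildSearchUrl` and `itemPermalink` are names I made up for illustration, and this says nothing about how Vectara's demo works internally.

```javascript
const API = 'https://hn.algolia.com/api/v1';

// Build a search URL for the Algolia HN Search API.
//   type:        'comment' or 'story' (sent as the `tags` parameter)
//   newestFirst: use the `search_by_date` endpoint instead of relevance-ranked `search`
//   from / to:   optional Date bounds, sent as `numericFilters` on `created_at_i`
function buildSearchUrl({ query, type, newestFirst = false, from, to }) {
  const endpoint = newestFirst ? 'search_by_date' : 'search';
  const params = new URLSearchParams({ query });
  if (type) params.set('tags', type);
  const range = [];
  if (from) range.push(`created_at_i>=${Math.floor(from.getTime() / 1000)}`);
  if (to) range.push(`created_at_i<=${Math.floor(to.getTime() / 1000)}`);
  if (range.length) params.set('numericFilters', range.join(','));
  return `${API}/${endpoint}?${params}`;
}

// Each hit carries an `objectID`, which is the HN item ID, so a direct
// link to the comment itself (not just its parent thread) is:
function itemPermalink(hit) {
  return `https://news.ycombinator.com/item?id=${hit.objectID}`;
}
```

For example, `buildSearchUrl({ query: 'iptables', type: 'comment', newestFirst: true })` asks for comments only, newest first, and `itemPermalink` turns any returned hit into a link straight to that comment.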
My personal opinion is that I'll keep using the HN search for the foreseeable future.
It doesn't seem to have any of the filtering or sorting that the Algolia one has, such as comments/stories by a specific user, date ranges, sorting by upvotes/recency, or searching just titles/content/comments.
Say I wanted to search for comments by the OP, ofermend, it doesn't seem like I can...
Entering just their name returns results that are neither made by them nor mention their username. I tried other queries too, without any luck.
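For reference, this is roughly how the same lookup goes through the Algolia HN Search API, which accepts an `author_<username>` tag. The helper names are mine, and `fetchCommentsByAuthor` assumes a runtime with a global `fetch` (a browser or a recent Node.js).

```javascript
// Build a URL listing comments by one user, newest first.
// `query` is optional; an empty string matches all of that user's comments.
function commentsByAuthorUrl(username, query = '') {
  const params = new URLSearchParams({
    query,
    tags: `comment,author_${username}`, // comma-separated tags are ANDed
  });
  return `https://hn.algolia.com/api/v1/search_by_date?${params}`;
}

// Fetch the first page of results and keep only a few fields per hit.
async function fetchCommentsByAuthor(username, query = '') {
  const res = await fetch(commentsByAuthorUrl(username, query));
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const { hits } = await res.json();
  return hits.map((hit) => ({
    author: hit.author,
    date: hit.created_at,
    url: `https://news.ycombinator.com/item?id=${hit.objectID}`,
  }));
}
```

Calling `fetchCommentsByAuthor('ofermend')` would then resolve to that user's most recent comments, each with a direct link.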
That said, something I value a lot in Algolia is the very fast live search as you type[0].
Vectara seems to be smarter, but much slower.
As a technical user, my needs are satisfied by Algolia 99% of the time.
[0]: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
https://huggingface.co/spaces/vectara/hacker-news-chat
Feel free to ask it some things and let me know how it works.
PS: no, lootitooti is not my project. I decided to finally watch Game of Thrones with my wife and I remembered that site when I was watching the opening. I remembered seeing it here on HN, searched and found it.
I am currently playing with the Algolia Hacker News search API myself and experimenting with spaCy Named Entity Recognition and llama3 to come up with some interesting data.
Work in progress version here: https://news.facts.dev/topic