Other ones worth checking out include:
- https://search.marginalia.nu/ (A non-commercial search engine)
- https://wiby.me/ (Tends to have those really weird and cool indie sites)
- https://searchmysite.net/ (An index of personal websites)
- https://indieweb-search.jamesg.blog/ (Search IndieWeb websites)
- https://millionshort.com/ (Ignore the first million results from Google)
https://seirdy.one/2021/03/10/search-engines-with-own-indexe...
IMO, Kagi and Brave Search are the two best alternative general search engines right now.
Runnaroo was pretty good as well ;-)
It shouldn't be that hard to find the bad network if you're systematically investigating all the time. Google has people testing search results often.
The problem is that this is fairly expensive, though quite possibly not the largest cost a search engine would have.
As a few of you noticed, narrow searches do not work very well because this is not a general web search engine and has a tiny index compared to Google. Use Teclis to discover more about a broader topic you are interested in and to discover writing from 'clean' websites on the web.
Looking forward to feedback to improve!
Are you getting better results with vector search?
I've been looking at this problem with my search engine as well. I've recently side-loaded all of stackoverflow and stackexchange, and searching in that part of the index is still not great at finding narrow results like you can on bigger search engines, when that reasonably speaking should be possible.
I think, beyond the fact that my index is DIY and fairly crude, algorithms like BM25 are designed to identify topical keywords, and they do that rather well, but narrow searches go far beyond merely the topic and often involve words that aren't important to the document but are important to some particular context within it.
I may have some ideas to get around this, but they're fairly half baked. Experiments are needed.
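To make the BM25 point concrete, here is a minimal sketch of textbook Okapi BM25 over pre-tokenized documents. It is not the scoring code of any engine mentioned in this thread; the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the toy documents are all assumptions for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms (Okapi BM25)."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates, and long documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (made up): doc 0 is about the query term, doc 1 only
# mentions it once in passing inside a long page.
docs = [
    ["python", "sorting", "python", "list", "sort"],
    ["java", "threads"] + ["filler"] * 20 + ["python"],
    ["rust", "borrow", "checker"],
]
scores = bm25_scores(["python"], docs)
```

The single passing mention in the long document is damped by length normalization, which is the behavior you want for topical queries and exactly what hurts when the narrow detail you are after is incidental to the page.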
Hybrid approaches that use vector search for broad matches and rerank using BM25 could be what you’re looking for. See https://blog.vespa.ai/efficient-open-domain-question-answeri...
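A rough sketch of that two-stage idea, assuming document vectors are already computed by some embedding model. Everything here is hypothetical: `hybrid_search`, the `recall_k`/`top_k` parameters, and the toy 2-d vectors. The lexical scorer is passed in as a function; a plain term count stands in for BM25 to keep the example short, so swap in a real BM25 scorer for actual use.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def term_count(query_terms, tokens):
    """Stand-in lexical scorer (NOT BM25): counts query-term occurrences."""
    return sum(tokens.count(t) for t in query_terms)

def hybrid_search(query_terms, query_vec, docs, lexical_score,
                  recall_k=3, top_k=2):
    """docs is a list of (doc_id, tokens, vector) tuples."""
    # Stage 1: broad recall by vector similarity.
    candidates = sorted(docs, key=lambda d: cosine(query_vec, d[2]),
                        reverse=True)[:recall_k]
    # Stage 2: rerank only the candidates by lexical match.
    reranked = sorted(candidates,
                      key=lambda d: lexical_score(query_terms, d[1]),
                      reverse=True)
    return [d[0] for d in reranked[:top_k]]

# Toy data: three Python pages close together in vector space, one unrelated.
docs = [
    ("a", ["python", "sort", "list"], [1.0, 0.0]),
    ("b", ["python", "asyncio", "timeout", "error"], [0.9, 0.1]),
    ("c", ["cooking", "pasta"], [0.0, 1.0]),
    ("d", ["python", "intro"], [0.95, 0.05]),
]
result = hybrid_search(["asyncio", "timeout"], [1.0, 0.0], docs, term_count)
```

The vector stage keeps the candidate set on topic, and the lexical stage pulls the page with the exact narrow terms to the top even though it was not the nearest vector.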
Also, the Marginalia Search link on the front page is broken.
Matches well with our thesis we wrote about here: https://re-search.xyz/writing/mapping-the-new-world-towards-...
Disclaimer: We’re a research group that is also working on a new kind of search engine. Our approach is a little different though. We think that information is now scattered across different semi-open silos, so the future of search will not look like a search bar and ten blue links to web pages.
https://search.marginalia.nu/explore/news.ycombinator.com
If you click 'similar' under any site, you get a list of its neighbors.
I think it would be neat to extend the metaphor not to just websites, but ... I dunno, something more general, links, topics, what have you. Like a browsable web of connected things. Maybe like with a bookmarking or annotation system. I think it could be super neat. Still a bit of a hand-wavy idea, but I want to build it, or someone else to build it.
I do think the search box is a bit limiting.
Another graph that is useful is the graph of people -> topic clusters. See https://twitterverse.net/ . Such a graph can help rank content from people deeply invested in a particular topic, and it's hard to fake because they would presumably have to trick all their peers about their expertise.
We'll have our own Show HN soon but it's great to see similar ideas bouncing around. Would love to connect over email to learn more about your thoughts.
I'm genuinely surprised there were any pages left to crawl.
Unfortunately this also kicks out genuinely useful blogs and other pages that are otherwise helpful but happen to be using a platform or framework that makes a few block-worthy requests.
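A sketch of how this kind of filter might work, and why a strict one throws out otherwise good blogs. This is a guess at the general technique, not Teclis's actual implementation: the blocklist hosts, `blocked_requests`, `keep_page`, and the `max_blocked` tolerance are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical blocklist of ad/tracker hosts (placeholder domains).
BLOCKLIST = {"ads.example", "tracker.example"}

def blocked_requests(request_urls, blocklist=BLOCKLIST):
    """Return the requests whose host is a blocklisted domain or a subdomain of one."""
    hits = []
    for url in request_urls:
        host = urlparse(url).hostname or ""
        if any(host == b or host.endswith("." + b) for b in blocklist):
            hits.append(url)
    return hits

def keep_page(request_urls, max_blocked=0):
    """Keep a page only if it makes at most max_blocked block-worthy requests."""
    return len(blocked_requests(request_urls)) <= max_blocked

# A blog whose platform injects a single tracking pixel.
blog = [
    "https://blog.example/style.css",
    "https://cdn.tracker.example/pixel.js",
]
strict = keep_page(blog)                  # zero tolerance drops the blog
lenient = keep_page(blog, max_blocked=1)  # a small allowance keeps it
```

With zero tolerance, one request injected by the hosting platform is enough to drop the whole page, which is the trade-off described above.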
I can't figure out if all of Wikipedia is in the removed set or just ranked too low to show up in results. On the browser, the site seems clean.
Turns out it was filtered out by mistake, back now!
Funnily enough I somehow ranked #1 for "ADHD" but I don't know what's particularly special about my landing page. Does your crawler prioritize/crawl HN by any chance?
This seems like a really good way to do research as well: people offering information without the expectation of getting paid for it.
My queries didn't get (obviously) mangled behind the scenes! Thank you for treating me with respect. Having said that, Teclis doesn't seem to treat alternate spellings of '-ise' words (e.g. normalise/normalize) as equivalent -- this is one case of auto-correction that I do appreciate in other search engines.
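One way the '-ise'/'-ize' case could be handled without rewriting the user's query is to map both spellings to the same key at index time and at query time. A deliberately naive sketch; the suffix list, the length cutoff, and the name `spelling_key` are assumptions, and nothing here reflects how Teclis actually tokenizes.

```python
import re

# Suffix variants of the -ise/-ize family (an illustrative, incomplete list).
_ISE_SUFFIX = re.compile(r"is(e|es|ed|ing|ation|ations|er|ers)$")

def spelling_key(token):
    """Map British '-ise' forms to the '-ize' form for use as an index key."""
    token = token.lower()
    if len(token) <= 5:
        # Leave short words such as "rise" or "wise" alone.
        return token
    return _ISE_SUFFIX.sub(lambda m: "iz" + m.group(1), token)

# Both spellings land on the same key, so either query form matches.
same = spelling_key("normalise") == spelling_key("normalize")
key = spelling_key("Organisation")
```

The rule is crude: longer words such as "promise" also get rewritten (to "promize"). That is mostly harmless when the key is only used for matching and never shown to the user, but a real implementation would probably want a dictionary of known variant pairs instead of a suffix rule.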
I just noticed the semantic search mode tip. I haven't tried it yet, but I like that it's not the default way to interpret my query.
I found it easy to find "technical" results and even (relevant) websites that I've never seen or heard of within the first ~10 hits. I wonder about the link between "non-commercial" as Teclis defines it and authentic, non-abusive, or otherwise desirable search results.
Also good:
- I didn't need to turn on javascript.
- clear info on the front page (the info itself and the fact that it's right there)
- results are actual normal links
- result snippet is normal selectable text (not a giant link)
Plus I'm impressed that kagi.com, teclis.com and the Orion browser are all by the same guy ^^
EDIT: And "Kagi was created in 2018 and is running on tight budget, bootstrapped by the founder's funds from the previous exit. "
Hopefully they can keep iterating and improving this; a new entrant to Search is always welcome!
Because we desperately need something better and more useful than Google. It'll take a paradigm shift, for sure.
Projects on GitHub (if it found anything, it was shitty, unmaintained forks)
Current events like the war in Ukraine
Wikipedia articles
Terms found on websites I host or frequent which do not serve any form of advertisement (not indexed apparently, the hits were completely irrelevant with zero matching terms)
I wonder if it would make sense to have cross platform plug-ins, so that all of these interesting nascent search engine efforts could automatically benefit from new plug-ins and an ecosystem could start to develop.
It’s great to have an alternative, but obviously it’s such a huge effort that the efficiency of development will be important.
Examples:
http://teclis.com/search?q=angular+
https://teclis.com works just fine, site was submitted using its http link (maybe mods can change it).
The fact is "best laptop" is what's called a commercial intent query, where people are looking to make a purchase. They want recent results and recent products, not informational articles.
> The fact is "best laptop" is what's called a commercial intent query
However, there's a place for a search engine that doesn't see it as one - there've been quite a few times when I was trying to research a topic, when the search engine assumed it was a commercial intent query and made it almost impossible to get the historical view I needed from the search.
"cool shirts for summer" and then search places like Reddit, fashion forums, etc. basically all areas where UGC is relatively authentic. And then toggle it for "paid for blogs", like strategist, wirecutter, rtings, etc.
"A query would help :-)"
One thing I noticed playing with Teclis is that it gives useful results for 'A vs B' queries. I don't know a single other search engine that still delivers remotely useful results for this type of query.
I think I found one but failed to read that (which?) page.
The index is 5bn+ pages and entirely independent as opposed to the Bing/Google offshoots.
With Google's algo tending to favour "mobile optimised" sites, I suspect a lot of older sites get buried.