This has been a personal pet project of mine and I spent considerable time getting my hands dirty with the code, as the team was busy with other initiatives. When I said the "feed broke" for the launch I meant I broke it. Software is messy especially for an old school dev. I learned in the process I am not a very good coder anymore (if I ever was one?), constantly going back and fixing stuff I previously thought was solid. Check it out in the linked repo [1].
Most importantly - I found the site replaced the need for discovery for me, and getting to know various different humans and their writing felt good! A lot of unexpected stuff surfaced and the web felt close again. I think there is a glimpse of hope in the concept and I hope you see it too. And the improvements to search quality and diversity this brings are real.
You can check the list of included websites here [2]. And all the recent posts already surface in Kagi results (for relevant queries).
[1] https://github.com/kagisearch/smallweb
[2] https://github.com/kagisearch/smallweb/blob/main/smallyt.txt
It would be nice to also be able to just search within the small web, maybe using a lens in the future?
Regarding the topic of self promotion, I disagree with the current rules and would ask you to allow people to self promote. Allowing it as long as they have an old enough blog, maybe even cutting that requirement down to a year, would be helpful. Most users on the small/indie web lack visibility and this would do them a service. My blog is already within the index because I think it might have been picked up during the "HN share your blog" post that happened a while ago, but others might not have been that lucky.
For me you represent an incredible accomplishment: the first search engine that gives better results than Google, respects privacy, offers customization and so much more.
Thank you.
I notice google/bing and quickly kick in my muscle memory built up over the years to type duckduckgo.com and do another search.
After this I am still confused why the results are so crap. Then I notice the duck laughing at me and think... shouldn't this be a dog?
And then I start feeling bad for the original user of the device and tell them to do the search themselves and let me know when they get a result
*https://search.marginalia.nu, commonly mentioned on HN https://hn.algolia.com/?query=marginalia
And to be fair, I was simultaneously inspired by the blog thread to build a curated blog filter for my search engine, which led to a series of changes that overall tend to promote more of these types of results in general.
The fact that several of us have had similar ideas sort of validates them, I think.
I'm a Kagi customer, and a very happy one. The search engine is amazingly good. This only makes me happier with my search engine choice.
I understand the spirit of it and don't have any counter examples but seems like a bummer if someone has a nice indie blog but can't be added because they have a few ads or a sponsored post.
That is to say that very little other effort was made to curate the YT channels and we expect the user community will contribute to edit the list.
Also I was under the impression that all YouTube channels had ads, so that is why this was not considered as a separate criterion.
why not Matrix? I think that's a more realistic alternative. :)
So happy to see the small, independent and more humane web being highlighted.
I’m trying to do my part[0] but I have no doubt that a search engine, even if still a niche one, can have a much bigger impact.
Really well done Vlad!
It's expensive enough that I can't imagine anyone repackaging it profitably (2x Bing search prices for me) but having to email someone adds just enough friction to discourage a lot of tinkerers from even trying it.
Edit: just saw this:
> Do not submit your own website.
I see. I'm okay with that. Maybe it will show up there one day.
I noticed the example result for useyourloaf wasn't included if I switched it to "Sweden" and not sure if this is just an oddity or if the entire feature is nerfed because I just leave my locale on all the time.
https://github.com/kagisearch/smallweb/blob/main/smallweb.tx...
Kagi could just admit they don't want to moderate notes or store them permanently. No need to push down the small web, because a lot of small sites preserve their content.
I get that Kagi probably has data indicating the reality of how often sites go down, but it seems from my experience that content on big platforms disappears often as well, even in cases where the creator hadn't forgotten about it. The "Small web" websites made by a creator that cares have the room to be much more permanent.
It would be nice if Kagi Small Web would have an ActivityPub interface so that the most appreciated sites of a day could be added to a timeline on mastodon or lemmy.
I hadn't considered this aspect of Kagi yet. I'm not a subscriber at this point, but I am strongly considering it. But I use search instead of typing domain names directly to avoid the typo phishing style attacks. I wonder how much "artificial" search that would generate based on my typical usage.
On the $5 tier it's 1.7 cents per search for the first 300, then 1.5 cents after that. As expected I blew past the 300 before the month was out and am currently sitting at a total of almost $9 for this month and there are still 3 more days to go, and this includes being away for labor day weekend and only using my phone (without kagi) for a couple days, and this is after populating and using a bunch of bookmarks specifically to cater to the fact that I now pay for searches. And this is only my laptop. I have not used kagi on my phone or anything else yet.
I haven't used bookmarks in 20 years and don't particularly want to. I normally don't even have the toolbar visible but now I had to un-hide it and add that clutter back to my browser.
So I'm both paying money and contorting my usage pattern.
I guess now that the first month is about done, I can say it looks like I should go up to the $10 plan, where the searches are only 1 cent, but only for the first 1000, and only if I actually use all 1000! If I pay $10 and only do 250 searches that month, then they weren't 1 cent were they?
As much as I like it, I don't know if I'm going to keep it.
I will not pay $10 or $25 just to have it sitting there available "in case", and apparently I will at least sometimes (who knows how often? every month? 3 out of 12?) blow past the $5 and end up paying $10 anyway.
If I complicate my usage to cater to kagi so that my default is ddg and just invoke kagi sometimes when I feel like it would help, then I'll probably forget it exists most of the time and do about 10 per month and pay 50 cents each. Probably only one or two months of that and I'll just decide it's not worth $5 for a handful of searches and just cancel it.
The only way it will be useful for me is if it can just be the default search that I don't have to worry about.
They should just figure out whatever the fair per search price is and bill that. The stupid tiers are probably going to drive me off.
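To make the arithmetic in this comment concrete, here is a rough sketch of the billing model as described above: a flat base fee with an included quota, plus a per-search overage. The function names are made up for illustration, the quotas and prices are the ones quoted in this comment rather than anything taken from Kagi's pricing page, and the overage rate on the $10 plan is not stated here, so it is passed as a placeholder.

```python
from decimal import Decimal


def monthly_cost(searches, base_usd, included, overage_usd):
    """Total bill in USD: flat base fee plus overage for searches beyond the quota."""
    extra = max(0, searches - included)
    return Decimal(base_usd) + extra * Decimal(overage_usd)


def effective_price(searches, base_usd, included, overage_usd):
    """Average USD paid per search actually performed in the month."""
    if searches <= 0:
        raise ValueError("need at least one search to compute a per-search price")
    return monthly_cost(searches, base_usd, included, overage_usd) / searches


# $5 tier as described: 300 searches included, 1.5 cents each after that.
# 560 searches -> 5 + 260 * 0.015 = 8.90 USD, i.e. "almost $9".
print(monthly_cost(560, "5", 300, "0.015"))

# $10 tier with only 250 of the 1000 searches used: 4 cents each, not 1.
# (The overage rate is irrelevant below the quota; "0" is a placeholder.)
print(effective_price(250, "10", 1000, "0"))

# Falling back to about 10 searches a month on the $5 tier: 50 cents each.
print(effective_price(10, "5", 300, "0.015"))
```

The point of the second function is the complaint itself: under a tiered plan the advertised per-search price only holds if the whole quota is used.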
I'm now halfway through my billing period (22 aug - 21 sep) and I'm on 488 searches. Would be tight on the pro plan (1000 searches), but I'm still on the early adopter pro plan (1500 searches).
The search results are consistently better than anyone else's, including DuckDuckGo, so I am, and will remain, a happy paying customer.
I tried Kagi for about a week, and it felt more or less identical in quality to DDG, just infinitely more expensive as DDG costs nothing.
It feels a bit like how it felt discovering Google back when AltaVista was still a thing.
I had some memory of AltaVista being full of SEO (as I recall it was mostly a keyword search engine) spam.
And that was probably true, but I also came across this Quora page https://www.quora.com/Why-did-Altavista-search-engine-lose-g... and it reminded me of two other things that stand out to me as true:
1. AltaVista was slow and you needed to know your syntax to find things.
2. AltaVista was cluttered. Google had very clean results.
I think on 1. Google is still fast, but on 2. not so much. Google's results are far from "clean"; instead it feels like the main goal is showing you as many ads as possible and preferably getting you to accidentally click on one.
This, and the fact that Kagi results are as good as or slightly better (and much more customisable) than Google's, is probably the main reason I'm so happy with Kagi.
After all, Kagi sources results from Google so it's no surprise the quality is similar https://help.kagi.com/kagi/search-details/search-sources.htm...
Plus, Searx supports many more search engines, and I can customize it exactly to my liking.
I wish them well, as they clearly have good intentions and a good product, but I prefer using an equivalent OSS and self-hosted solution over a proprietary SaaS I have to create an account for, even if it's not as polished or featureful.
EDIT: Actually, I'm wrong. Kagi apparently also has their own crawlers and indexes[1]. Still, I'm not finding Searx results to be deficient, so I'm not missing out on much.
One thing that does concern me with Searx, and partially with Kagi, is that those 3rd parties could decide to block these API requests at any point, leaving Searx unusable, and Kagi's results less relevant. I'm not sure this is a sustainable way to build a search engine, but I do appreciate both Kagi's and Searx's stance on ads. Using any mainstream search engine via their own frontends is a frustrating experience at best.
[1]: https://help.kagi.com/kagi/search-details/search-sources.htm...
https://kagi.com/search?q=%s&r=fr
Where `fr` is the region code.
I really appreciate Kagi's development matching what i feel like i'm buying. Thanks Kagi Team <3
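For the region-scoped search URL mentioned above (`https://kagi.com/search?q=%s&r=fr`), a small sketch of building it programmatically. Only the `q` and `r` parameters come from that comment; the helper name `kagi_search_url` is hypothetical, and whether Kagi accepts any other parameters is not assumed here.

```python
from urllib.parse import urlencode


def kagi_search_url(query, region=None):
    """Build a Kagi search URL, optionally pinned to a region code such as "fr"."""
    params = {"q": query}
    if region:
        # "r" is the region parameter as shown in the comment above.
        params["r"] = region
    # urlencode escapes the query (spaces become "+", "&" becomes "%26", etc.).
    return "https://kagi.com/search?" + urlencode(params)


print(kagi_search_url("small web", "fr"))
```

In a browser's custom search engine settings the `%s` placeholder does the same job as the `query` argument here.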
Spotify: Really hostile, manipulative, and terrible to artists. On my end I hate how commercial the homescreen is, and the CarPlay interface is just a disaster, user-hostile to a degree that is frankly unsafe.
TIDAL: Pretty good in a lot of ways. They pay their artists well. The recommendations are decent. The apps have some really stupid bugs that have persisted for years though, the most annoying of which is that if you shuffle a playlist it only shuffles the dozen or so tracks that the interface had pre-cached from the top. So if you try to shuffle your full library you wind up just hearing the same dozen songs only, over and over again.
Deezer: Wanted to like it, but the apps aren't great, and I ran into more missing tracks than I'd like.
So I finally settled on Apple Music. I've got an iPhone, so it's a natural fit there, and the CarPlay interface is great. They also pay artists almost as well as TIDAL. The recommendations are super good enough, and I don't really feel like the home page is constantly trying to push me to whatever the huge labels are paying them to promote (damn you Spotify). The Windows apps are terrible (like flat out embarrassing), and the Linux apps are non-existent, but luckily there's a pretty great open source app called Cider that solves that.
This is what blocked me from trying Apple Music as well. Maybe i'll give Cider a look, thanks!
<math display="block">
<msup><mi>e</mi>
<mrow><mi>k</mi><mi>t</mi></mrow></msup>
</math>
Which shows up in your RSS feed [3] as: e
kt
[EDIT: filed an issue: https://github.com/kagisearch/smallweb/issues/10]
[1] https://www.jefftk.com/news.rss
[2] https://www.jefftk.com/p/weekly-incidence-vs-cumulative-infe...
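As an illustration of the rendering problem described above: if a feed reader simply strips tags, the `<msup>` structure is lost and the exponent ends up as separate text. The sketch below is one possible way to flatten MathML to plain text while keeping superscripts readable. It is not how Kagi Small Web renders feeds, the function name `mathml_to_text` is made up, and it only handles `<msup>`; other MathML elements just have their children concatenated.

```python
import xml.etree.ElementTree as ET


def mathml_to_text(node):
    """Flatten a MathML element to plain text, writing <msup> as base^(exponent)."""
    tag = node.tag.split("}")[-1]  # drop an XML namespace prefix, if present
    children = list(node)
    if tag == "msup" and len(children) == 2:
        base, exponent = (mathml_to_text(child) for child in children)
        return f"{base}^({exponent})"
    if not children:
        # Leaf elements such as <mi> or <mn> carry the visible text.
        return (node.text or "").strip()
    # Containers such as <math> or <mrow>: concatenate the children in order.
    return "".join(mathml_to_text(child) for child in children)


snippet = """
<math display="block">
<msup><mi>e</mi>
<mrow><mi>k</mi><mi>t</mi></mrow></msup>
</math>
"""
print(mathml_to_text(ET.fromstring(snippet.strip())))  # e^(kt)
```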
#1 What's the rationale behind favoring recent blog updates? In my experience, recent updates make the weakest search results, more prone to updates and link breakage, and overall tend to be of lower quality. I also wonder if promoting recent content might incentivize pumping out low-quality entries to increase the odds of being listed.
#2 In dabbling in the domain I've always ended up with an almost absurd skew toward technical programmer:y blogs. While there is a strong overlap between the cohort with a blog, and the cohort with programmer interests, I feel it would be more inviting to other groups if other interests were better represented. Is this something you've thought about, and if so, what do you think might be done?
#1 There are a couple of factors.
- More recent content tends to be more relevant given the same search query, or at least its freshness will contribute to it not being completely irrelevant (which is the worst thing you want in a search engine). The quality of it is already guaranteed to some extent by this being a curated list to begin with.
- It was relatively easy to assemble and maintain the list because it relies on RSS feed tech, and there were a lot of sources to seed it.
- Focus on recent writing can encourage some people to write (more) as we had an example highlighted in the blog post. In general, the web needs more high quality, non-commercial content, and this is ultimately what we want to contribute towards with this. By providing a platform (even if very small) to encourage this behavior we get a step closer to the web we like.
#2 In general I agree. Although I should say we did spend effort to create a diverse pool of websites for this initiative (for example I am seeing a lot of economy or photography). Again, we can only encourage the creation of more content in various areas through platforms like Kagi (and Marginalia) and hope that it will work out at the end.
I wish Kagi had something similar: one place where I can see all the links to the personal websites collected via all its sources.
They do. It's in the open repo https://github.com/kagisearch/smallweb/blob/main/smallweb.tx...
I really like what you're doing with Kagi Small Web -- love that you've taken the initiative to start surfacing all this excellent content. Keep up the good work. I think I'll try out Kagi search...
I think the only condition is that the article is less than one week old.
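A minimal sketch of what such a recency check could look like for RSS items, assuming the cutoff really is one week (that is this comment's guess, not something confirmed by Kagi). The helper name `is_recent` is hypothetical; `pubDate` values in RSS use the RFC 822 style date format, which the standard library can parse.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_recent(pub_date, now=None, max_age=timedelta(days=7)):
    """Return True if an RSS pubDate string is less than max_age old."""
    published = parsedate_to_datetime(pub_date)  # aware datetime when an offset is given
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - published
    # Reject future-dated items as well as anything older than the cutoff.
    return timedelta(0) <= age < max_age


now = datetime(2023, 9, 6, tzinfo=timezone.utc)
print(is_recent("Mon, 04 Sep 2023 10:00:00 +0000", now=now))  # two days old
print(is_recent("Mon, 21 Aug 2023 10:00:00 +0000", now=now))  # over two weeks old
```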
And of course it showed me Questionable Content, which I first got to via stumbleupon.
In the meantime you can toggle the 'non-commercial web' lens which will include KSW (and some other things).
https://kagi.com/smallweb displays the homepage for the Kagi Blog at the moment, though.
not that it matters since the old web wasn't good. it was as terrible as now. the UI of absolutely every website ever made has been terrible quirky garbage compared to something like windows 98. Even back then there was a massive difference going from windows 98 (sane GUI) to web (garbage hackjob GUI + ads (YES REMEMBER 40 POPUPS? ADS? TOOLBARS? THE OLD WEB WAS NOT GOOD IT WAS A HELL JUST LIKE NOW)).
the content was never good either. every topic discussed on the web is little cliques who believe some easy to digest nonsense and then if you go skim some books on the subject the meta is completely different. except programming since that just centers around the web [1]. think of anything else like cooking or engineering
the web is a terrible protocol that should have died 20 years ago and been replaced with something that was modern at the time like freenet (and they should have made an alternative to html etc).
1. and this is ironic too since programming is the one field that is steered by the web's body of pseudoknowledge and as a result you have people who think C, PHP, and OOP are legitimate programming practices.
what kind of utilities?
I am happy to pay $2 (or more) for tools I use if it is a one time payment or a payment for tokens (e.g. $n for $m ocr scans, generated images or whatever).
I loathe apps that demand monthly payments unless it is really understandable why they have to have it that way (service that require permanent storage comes to mind, although I think I only use iCloud for storage now).
On a side note, there is now a - to my best knowledge - completely unrelated product "ExpertGPT" from some totally different company. I am not talking about that one.
Second link I found: https://ambience.sk/quotes-from-books-the-universe-maker/
For example Kagi uses Google and even hosts on GCP - I think Google's technology and people are great, it is just the business model that is rotten and contributes to the deterioration of the web.
And interestingly enough, at least Discord (and to some extent Twitter) are trying to have a business model that does not push more ads down your throat (although admittedly I did cancel my Twitter subscription as, inexplicably, they still showed ads even when subscribing - you can't sit on two chairs).
Twitter isn’t really trying to do that. It’s just something the owner likes to claim, but what he’s actually done is the opposite:
- Hired a CEO from the ad world
- Increased personal data collection (with updated Terms of Service to reflect that)
- Limited tweet access without login (those views would be harder to target with ads)
As a Tesla customer I’m fed up with this person’s endless lies. Seems like you got a taste of it with your Twitter subscription too.
Yeah I see similarities to marginalia, but it’s great to have multiple services for the small web.
I need to get my website on the lists asap!
Now more than ever do we need a user friendly search engine.
Well this is disappointing. It's no harder to curate other languages. You're just saying you don't care.