Crucially, I want to understand the license that applies to the search results. Can I store them, can I re-publish them? Different providers have different rules about this.
The search results are yours to own and use. You are free to do what you want with them. Of course, you are bound by the laws of the legal jurisdiction you are in.
Yes, ephemeral queries must not retain any data, but there are other rules too; for instance, it is forbidden for commercial services (and Ollama has a pricing model?).
It's OK to pirate a massive amount of books if you're not reading or sharing, but rather just training an AI.
Caching is a problem with many geocoding APIs (which I happen to be familiar with), and a good reason to prefer e.g. Opencage over the Google or Here geocoders: unlike most geocoders' terms and conditions, Opencage actually encourages you to cache and store things, because it's all open data. The Here geocoder requires you to tell them how much data you store and will try to charge you extra for the privilege of storing and keeping data around, because it's their data, and the conditions under which they license it to you limit what you can and cannot do. Search APIs are very similar; technically, geocoding is a form of search (given a query, return a list of stuff).
It makes me wonder if they’ve partnered with another of their VC’s peers who’s recently had a cash injection, and they’re being used as a design partner/customer story.
Exa would be my bet. YC backed them early, and they've also just closed an $85M Series B. Bing would be too expensive to run for free without a Microsoft partnership.
Get on that privacy notice soon, Ollama. You’re HQ’d in CA, you’re definitely subject to CCPA. (You don’t need revenue to be subject to this, just being a data controller for 50,000 Californian residents is enough.)
https://oag.ca.gov/privacy/ccpa
I can imagine the reaction if it turns out the zero-retention provider backing them ended up being Alibaba.
I wonder how they plan to monetize their users. Doesn't sound promising.
Why would I use those models on your cloud instead of using Google's or Anthropic's models? I'm glad there are open models available and that they get better and better, but if I'm paying money to use a cloud API I might as well use the best commercial models, I think they will remain much better than the open alternatives for quite some time.
Ollama is beloved by people who know how to write 5 lines of python and bash to do API calls, but can't possibly improve the actual app.
Qwen3 235b
Deepseek 3.1 671b (thinking and non thinking)
Llama 3.1 405b
GPT OSS 120b
Those are hardly "small inferior models".
What is really cool is that you can set Codex up to use Ollama's API and then have it run tools on different models.
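Tools like Codex can point at Ollama because Ollama also exposes an OpenAI-compatible endpoint under /v1 on its default port. A minimal stdlib-only sketch of talking to that endpoint directly (the model tag and localhost port are assumptions based on Ollama's defaults; the commented-out call needs a running `ollama serve`):

```python
import json
import urllib.request

# Ollama's OpenAI-compatible chat endpoint (default local port).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_request(prompt, model="gpt-oss:120b"):
    """Build an OpenAI-style chat request aimed at the local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def chat(prompt, model="gpt-oss:120b"):
    """Send the request and pull the assistant's reply out of the response."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# chat("hello")  # requires `ollama serve` running locally
print(build_request("hello").full_url)  # → http://localhost:11434/v1/chat/completions
```

Anything that speaks the OpenAI chat API (Codex included) can be pointed at that same base URL.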
I was thinking about trying ChatGPT Pro, but I seem to have completely missed that they bumped the price from $100 to $200. It was $100 just a while ago, right? Before GPT-5, I assume.
At some level it's also more of a principle that I could run something locally that matters rather than actually doing it. I don't want to become dependent on technology that someone could take away from me.
What's a good Ollama alternative (for keeping 1-5x RTX 3090 busy) if you want to run things like open-webui (via an OpenAI compatible API) where your users can choose between a few LLMs?
200 weekly users :)
I've been thinking about building a home-local "mini-Google" that indexes maybe 1,000 websites. In practice, I rarely need more than a handful of sites for my searches, so it seems like overkill to rely on full-scale search engines for my use case.
My rough idea for architecture:
- Crawler: A lightweight scraper that visits each site periodically.
- Indexer: Convert pages into text and create an inverted index for fast keyword search. Could use something like Whoosh.
- Storage: Store raw HTML and text locally, maybe compress older snapshots.
- Search Layer: Simple query parser to score results by relevance, maybe using TF-IDF or embeddings.
I would do periodic updates and build a small web UI to browse.
Anyone tried it or are there similar projects?
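The indexer and search-layer steps can be sketched in a few dozen lines of pure Python. The corpus below is a hypothetical stand-in for crawled pages; a real build would swap in Whoosh or similar, but the inverted-index-plus-TF-IDF core looks like this:

```python
import math
import re
from collections import Counter, defaultdict

# Toy corpus standing in for crawled pages (hypothetical data).
docs = {
    "a.html": "python inverted index search engine tutorial",
    "b.html": "home server setup with python and docker",
    "c.html": "search ranking with tf idf explained",
}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

# Inverted index: term -> {doc_id: term frequency in that doc}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term, tf in Counter(tokenize(text)).items():
        index[term][doc_id] = tf

def search(query, k=3):
    """Score docs by summed TF-IDF over the query terms."""
    scores = Counter()
    n = len(docs)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n / len(postings))  # rarer terms weigh more
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc for doc, _ in scores.most_common(k)]

print(search("tf idf search"))  # → ['c.html', 'a.html']
```

At 1,000 sites this fits comfortably in memory; the crawler and snapshot storage are the actual work.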
Which was very encouraging to me, because it implies that indexing the Actually Important Web Pages might even be possible for a single person on their laptop.
Wikipedia, for comparison, is only ~20GB compressed. (And even most of that is not relevant to my interests, e.g. the Wikipedia articles related to stuff I'd ever ask about are probably ~200MB tops.)
[1]: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
Crawling was tricky. Something like stackoverflow will stop returning pages when it detects that you're crawling, much sooner than you'd expect.
For starters, this is completely optional. Ollama can run completely locally too; publishing your own models to ollama.com to share with others is likewise optional.
I like using ollama locally and I also index and query locally.
I would love to know how to hook ollama up to a traditional full-text-search system rather than learning how to 'fine tune' or convert my documents into embeddings or whatnot.
https://github.com/mjochum64/mcp-solr-search
A slightly heavier lift, but only slightly, would be to use solr to also store a vectorized version of your docs and simultaneously do vector similarity search; solr has built-in knn support for it. Pretty good combo to get good quality with both semantic and full-text search.
Though I’m not sure if it would be comparable work to do solr w/ chromadb for the vector portion and marry the results via llm pixie dust (“you are the helpful officiator of a semantic full-text matrimonial ceremony” etc). Also not sure of the relative strengths of chromadb vs solr there; maybe chromadb scales better for larger vector stores?
However I found that Google gives better results, so I switched to that. (I forget exactly but I had to set up something in a Google dev console for that.)
I think the DDG one is unofficial, and the Google one has limits (so it probably wouldn't work well for deep research type stuff).
I mostly just pipe it into LLM apis. I found that "shove the first few Google results into GPT, followed by my question" gave me very good results most of the time.
It of course also works with Ollama, but I don't have a very good GPU, so it gets really slow for me on long contexts.
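That "first few results, then my question" trick is just string assembly before the model call. A minimal sketch; the result dicts here are hypothetical placeholders for whatever shape your search API returns:

```python
def build_prompt(results, question):
    """Concatenate numbered search snippets, then append the user's question."""
    snippets = "\n\n".join(
        f"[{i}] {r['title']}\n{r['snippet']}"
        for i, r in enumerate(results, 1))
    return (f"Use these search results to answer.\n\n"
            f"{snippets}\n\nQuestion: {question}")

# Hypothetical results, e.g. from a Google Custom Search API call.
results = [
    {"title": "Ollama docs", "snippet": "Ollama serves local models over HTTP."},
]
print(build_prompt(results, "How do I query Ollama?"))
```

The assembled string goes to whatever backend you like (GPT, Ollama, etc.); long snippet lists are exactly where a weak GPU starts to hurt on context length.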
OpenAI, xAI, and Gemini all suffer from not being allowed on their respective competitors' sites.
This search works well for me in some quick tests on YT videos, which OpenAI web search can't access. It kind of failed on X but sometimes returned OK relevant results. Definitely hit and miss, but on average good.
Many sites have hidden sitemaps that cannot be found unless submitted to Google directly. (Not even listed in robots.txt most of the time.) There is no way a local LLM setup can keep up with an ever-changing internet.
It takes lots of servers to build a search engine index, and there’s nothing to indicate that this will change in the near future.
During the preview period we want to start offering a $20 / month plan tailored for individuals - and we are monitoring the usage and making changes as people hit rate limits so we can satisfy most use cases, and be generous.
Like a full search engine that can visit pages on your behalf. Is anyone building this?
Looking forward to trying it with a few shell scripts (via the llm-ollama extension for the amazing Python ‘llm’) or Raycast (the lack of web search support for Ollama has been one of my biggest reasons for preferring cloud-hosted models).
Is https://ollama.com/blog/tool-support not it?
For smaller models, it can augment them with the latest data fetched from the web, solving the problem of small models lacking specific knowledge.
For larger models, it can start functioning as deep research.
Or is this just someone trying to monetize Meta open source models?
Even with heavy ai usage I'm only at like 400/1000 for the month
pip install transformers
transformers chat Qwen/Qwen2.5-0.5B-Instruct
Dead on arrival. Thanks for playing, Ollama, but you've already done the leg work in obsoleting yourself.
From where I'm standing, there's not enough money in B2C GPU hosting to make this sort of thing worthwhile. Features like this paid search API really hammer home how difficult it is to provide value around that proposition.