Crucially, I want to understand the license that applies to the search results. Can I store them, can I re-publish them? Different providers have different rules about this.
The search results are yours to own and use. You are free to do what you want with them. Of course, you are bound by the laws of the legal jurisdiction you are in.
Yes, ephemeral queries must not retain any data, but there are other rules too; for instance, it is forbidden for commercial services (and Ollama has a pricing model?).
It's OK to pirate a massive amount of books if you're not reading or sharing, but rather just training an AI.
Caching is a problem with many geocoding APIs (which I happen to be familiar with), and a good reason to prefer e.g. Opencage over the Google or Here geocoders: unlike most geocoders' terms and conditions, Opencage actually encourages you to cache and store things, because it's all open data. The Here geocoder requires you to tell them how much data you store and will try to charge you extra for the privilege of storing and keeping data around, because it's their data, and the conditions under which they license it to you limit what you can and cannot do. Search APIs are very similar; technically, geocoding is a form of search (given a query, return a list of stuff).
It makes me wonder if they’ve partnered with another of their VC’s peers who’s recently had a cash injection, and they’re being used as a design partner/customer story.
Exa would be my bet. YC backed them early, and they've also just closed an $85M Series B. Bing would be too expensive to run for free without a Microsoft partnership.
Get on that privacy notice soon, Ollama. You’re HQ’d in CA, you’re definitely subject to CCPA. (You don’t need revenue to be subject to this, just being a data controller for 50,000 Californian residents is enough.)
https://oag.ca.gov/privacy/ccpa
I can imagine the reaction if it turns out the zero-retention provider backing them ended up being Alibaba.
I wonder how they plan to monetize their users. Doesn't sound promising.
Why would I use those models on your cloud instead of using Google's or Anthropic's models? I'm glad there are open models available and that they get better and better, but if I'm paying money to use a cloud API I might as well use the best commercial models, I think they will remain much better than the open alternatives for quite some time.
Ollama is beloved by people who know how to write 5 lines of python and bash to do API calls, but can't possibly improve the actual app.
Qwen3 235b
Deepseek 3.1 671b (thinking and non thinking)
Llama 3.1 405b
GPT OSS 120b
Those are hardly "small inferior models".
What is really cool is that you can set Codex up to use Ollama's API and then have it run tools on different models.
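Tools like Codex can point at Ollama because Ollama also exposes an OpenAI-compatible endpoint under /v1 on its default port. A minimal stdlib-only sketch of talking to that endpoint directly (the model tag and localhost port are assumptions based on Ollama's defaults; the commented-out call needs a running `ollama serve`):

```python
import json
import urllib.request

# Ollama's OpenAI-compatible chat endpoint (default local port).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_request(prompt, model="gpt-oss:120b"):
    """Build an OpenAI-style chat request aimed at the local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def chat(prompt, model="gpt-oss:120b"):
    """Send the request and pull the assistant's reply out of the response."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# chat("hello")  # requires `ollama serve` running locally
print(build_request("hello").full_url)  # → http://localhost:11434/v1/chat/completions
```

Anything that speaks the OpenAI chat API (Codex included) can be pointed at that same base URL.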
I was thinking about trying ChatGPT Pro, but I seem to have completely missed that they bumped the price from $100 to $200. It was $100 just a while ago, right? Before GPT-5, I assume.
At some level it's also more of a principle that I could run something locally that matters rather than actually doing it. I don't want to become dependent on technology that someone could take away from me.
What's a good Ollama alternative (for keeping 1-5x RTX 3090 busy) if you want to run things like open-webui (via an OpenAI compatible API) where your users can choose between a few LLMs?
200 weekly users :)
I've been thinking about building a home-local "mini-Google" that indexes maybe 1,000 websites. In practice, I rarely need more than a handful of sites for my searches, so it seems like overkill to rely on full-scale search engines for my use case.
My rough idea for architecture:
- Crawler: A lightweight scraper that visits each site periodically.
- Indexer: Convert pages into text and create an inverted index for fast keyword search. Could use something like Whoosh.
- Storage: Store raw HTML and text locally, maybe compress older snapshots.
- Search Layer: Simple query parser to score results by relevance, maybe using TF-IDF or embeddings.
I would do periodic updates and build a small web UI to browse.
Anyone tried it or are there similar projects?
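The indexer and search-layer steps can be sketched in a few dozen lines of pure Python. The corpus below is a hypothetical stand-in for crawled pages; a real build would swap in Whoosh or similar, but the inverted-index-plus-TF-IDF core looks like this:

```python
import math
import re
from collections import Counter, defaultdict

# Toy corpus standing in for crawled pages (hypothetical data).
docs = {
    "a.html": "python inverted index search engine tutorial",
    "b.html": "home server setup with python and docker",
    "c.html": "search ranking with tf idf explained",
}

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

# Inverted index: term -> {doc_id: term frequency in that doc}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term, tf in Counter(tokenize(text)).items():
        index[term][doc_id] = tf

def search(query, k=3):
    """Score docs by summed TF-IDF over the query terms."""
    scores = Counter()
    n = len(docs)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n / len(postings))  # rarer terms weigh more
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc for doc, _ in scores.most_common(k)]

print(search("tf idf search"))  # → ['c.html', 'a.html']
```

At 1,000 sites this fits comfortably in memory; the crawler and snapshot storage are the actual work.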
Which was very encouraging to me, because it implies that indexing the Actually Important Web Pages might even be possible for a single person on their laptop.
Wikipedia, for comparison, is only ~20GB compressed. (And even most of that is not relevant to my interests, e.g. the Wikipedia articles related to stuff I'd ever ask about are probably ~200MB tops.)
[1]: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
Crawling was tricky. Something like stackoverflow will stop returning pages when it detects that you're crawling, much sooner than you'd expect.
For starters, this is completely optional. Ollama can run completely locally too; publishing your own models to ollama.com to share with others is likewise optional.
I like using ollama locally and I also index and query locally.
I would love to know how to hook ollama up to a traditional full-text-search system rather than learning how to 'fine tune' or convert my documents into embeddings or whatnot.
https://github.com/mjochum64/mcp-solr-search
A slightly heavier lift, but only slightly, would be to use solr to also store a vectorized version of your docs and simultaneously do vector similarity search; solr has built-in knn support for it. Pretty good combo to get good quality with both semantic and full-text search.
Though I’m not sure if it would be comparable work to do solr w/ chromadb for the vector portion and marry the results via llm pixie dust (“you are the helpful officiator of a semantic full-text matrimonial ceremony” etc). Also not sure of the relative strengths of chromadb vs solr there; maybe chromadb scales better for larger vector stores?
However I found that Google gives better results, so I switched to that. (I forget exactly but I had to set up something in a Google dev console for that.)
I think the DDG one is unofficial, and the Google one has limits (so it probably wouldn't work well for deep research type stuff).
I mostly just pipe it into LLM apis. I found that "shove the first few Google results into GPT, followed by my question" gave me very good results most of the time.
It of course also works with Ollama, but I don't have a very good GPU, so it gets really slow for me on long contexts.
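That "first few results, then my question" trick is just string assembly before the model call. A minimal sketch; the result dicts here are hypothetical placeholders for whatever shape your search API returns:

```python
def build_prompt(results, question):
    """Concatenate numbered search snippets, then append the user's question."""
    snippets = "\n\n".join(
        f"[{i}] {r['title']}\n{r['snippet']}"
        for i, r in enumerate(results, 1))
    return (f"Use these search results to answer.\n\n"
            f"{snippets}\n\nQuestion: {question}")

# Hypothetical results, e.g. from a Google Custom Search API call.
results = [
    {"title": "Ollama docs", "snippet": "Ollama serves local models over HTTP."},
]
print(build_prompt(results, "How do I query Ollama?"))
```

The assembled string goes to whatever backend you like (GPT, Ollama, etc.); long snippet lists are exactly where a weak GPU starts to hurt on context length.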
OpenAI, xAI, and Gemini all suffer from not being allowed on their respective competitors' sites.
This search works well for me in some quick tests on YT videos, which OpenAI web search can't access. It kind of failed on X but sometimes returned OK relevant results. Definitely hit and miss, but on average good.
Many sites have hidden sitemaps that cannot be found unless submitted to Google directly. (Not even listed in robots.txt most of the time.) There is no way a local LLM setup can keep up with an ever-changing internet.
It takes lots of servers to build a search engine index, and there’s nothing to indicate that this will change in the near future.
During the preview period we want to start offering a $20 / month plan tailored for individuals - and we are monitoring the usage and making changes as people hit rate limits so we can satisfy most use cases, and be generous.
Like a full search engine that can visit pages on your behalf. Is anyone building this?
Looking forward to trying it with a few shell scripts (via the llm-ollama extension for the amazing Python ‘llm’) or Raycast (the lack of web search support for Ollama has been one of my biggest reasons for preferring cloud-hosted models).
Is https://ollama.com/blog/tool-support not it?
For smaller models, it can augment them with the latest data fetched from the web, solving the problem of small models lacking specific knowledge.
For larger models, it can start functioning as deep research.
Or is this just someone trying to monetize Meta open source models?
Even with heavy ai usage I'm only at like 400/1000 for the month
pip install transformers
transformers chat Qwen/Qwen2.5-0.5B-Instruct
Dead on arrival. Thanks for playing, Ollama, but you've already done the leg work in obsoleting yourself.
From where I'm standing, there's not enough money in B2C GPU hosting to make this sort of thing worthwhile. Features like this paid search API really hammer home how difficult it is to provide value around that proposition.