I would not be surprised if the folks at Google Search have forgotten that one of the first tenets of organizing information is not to omit it.
Either that, or they implemented their delist functionality as "force relevance zero, and truncate results before you get there".
Back when Gerard Salton was writing the first papers on IR, he had a set of 60 or so documents and kept his index on a deck of punched cards.
With a small set of documents, the main problem you run into is that some of the relevant documents don't use the exact words in the query, so you might miss most of the relevant results.
With Google, on the other hand, you could have millions or billions of relevant documents, and the challenge is to do so well on the first few results that odds are good (say 70% -- this is limited by the ambiguity of the query) that the first result is a "direct hit".
If you are answering questions like that in a huge distributed system, the query process probably looks like a set of funnels that feed answers into funnels that feed answers into funnels. If you want to answer questions quickly at low cost, the best you can do is kill low-relevance documents as quickly as possible.
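To make the funnel idea concrete, here is a minimal sketch, not how Google actually does it. The two scorers (`term_overlap`, `term_frequency`) and the `funnel_search` helper are made-up names for illustration; the only point is that each stage is more expensive than the last and only sees what survived the previous cut.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer takes (query, document) and returns a relevance score.
Scorer = Callable[[str, str], float]


def term_overlap(query: str, doc: str) -> float:
    """Cheap first stage: fraction of distinct query terms present in the doc."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def term_frequency(query: str, doc: str) -> float:
    """Costlier second stage: share of the doc's words that are query terms."""
    words = doc.lower().split()
    if not words:
        return 0.0
    q_terms = set(query.lower().split())
    return sum(1 for w in words if w in q_terms) / len(words)


def funnel_search(
    query: str,
    docs: Sequence[str],
    stages: Sequence[Tuple[Scorer, int]],
) -> List[str]:
    """Run docs through successive (scorer, keep) stages.

    Each stage ranks the survivors of the previous one and keeps only the
    top `keep`, so low-relevance documents are dropped as early as possible.
    """
    survivors = list(docs)
    for scorer, keep in stages:
        ranked = sorted(survivors, key=lambda d: scorer(query, d), reverse=True)
        survivors = ranked[:keep]
    return survivors
```

Note that anything cut at an early stage never reaches the later ones, so whatever "number of results" comes out the end is a property of the cutoffs as much as of the corpus.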
I worked on a search engine for patents that had multiple nodes and could get slightly different answers from different nodes because each node had a neural net for semantic indexing of its contents. You might have the system report that there were 15,091 relevant results one time, another time it might be 15,094. Management thought that our customers would lose confidence in our search because of this so we implemented something that made the selection of nodes used for a particular query deterministic, which hurt the scalability of our search.
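I am not going to reproduce what we actually shipped, but one common way to get that kind of determinism is to hash the query together with each node name and always take the lowest-ranked nodes (rendezvous-style hashing). The sketch below assumes the nodes are interchangeable replicas; `pick_nodes` and the node names are hypothetical.

```python
import hashlib
from typing import List, Sequence


def pick_nodes(query: str, nodes: Sequence[str], count: int) -> List[str]:
    """Pick `count` nodes for a query, the same ones every time for the same text.

    Uses SHA-256 rather than the built-in hash(), because hash() on strings
    is randomized per process and would not be stable across servers.
    """
    if count > len(nodes):
        raise ValueError("count exceeds the number of available nodes")

    # Normalize case and whitespace so trivially different spellings of the
    # same query land on the same nodes.
    normalized = " ".join(query.lower().split())

    def rank(node: str) -> str:
        return hashlib.sha256(f"{normalized}|{node}".encode("utf-8")).hexdigest()

    return sorted(nodes, key=rank)[:count]
```

The cost is the one mentioned above: a given query can no longer be sent to whichever nodes happen to be idle, so popular queries pile onto the same machines.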
Neural nets adding in levels of "Oh, what the hell?" actually explains a lot about the uselessness of Google results unless you get really creative.
I still grade Google as failing to produce a semantically valid index, one that checks off the usability criteria of an actual index.
If I have a corpus of data, I want to be able to examine the structure, even if only through a window. Way back before they threw neural nets at things (if that is what they are doing now), you could actually get a sense for it.
I used to love trawling Google results into the hundreds of pages, because you'd get real feedback w.r.t. the effects of additional predicates on your query. It was more of a data processing and query refinement exercise than "throw it at their ML models and hope they decide to be useful today."
Organizing information isn't just about vomiting results... It's about imposing enough structure that people can help themselves. It's like architecture. A poorly planned building, or an excellently planned building optimized for discomfort, is a hell on Earth.
One that actually reflects and accommodates the natural flow and needs of the occupants/users is a joy for all to behold.
Google had that. Now it doesn't, and it's increasingly difficult to get the darn thing to stop playing Bayesian/gradient descent/backprop buggers and just show me stuff that matches what I asked for in the Boolean sense, and don't you dare tell me there are only 13 bloody results.
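For anyone unsure what "matches in the Boolean sense" means, here is a toy sketch: an inverted index plus a strict AND over the query terms, with no ranking model in between. `build_index` and `boolean_and` are illustrative names, and whitespace splitting stands in for real tokenization.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(docs: List[str]) -> Dict[str, Set[int]]:
    """Map each lowercase term to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return ids of documents containing every query term, in id order."""
    terms = query.lower().split()
    if not terms:
        return []
    # .get avoids creating empty entries for unknown terms in the defaultdict.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))
```

Every added term can only shrink the result set, which is exactly the kind of predictable feedback you could use to refine a query.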
There is search, then there is Search. I prefer the latter.
Nothing is being "deleted", especially not actively. This is also why you shouldn't use "number of results" as data in research: it is meaningless.
This video is worse than misinformation and clickbait.
My bet is that you can compare the original "estimated" results number to the actual number Google gives you at page tenish for thousands of search terms and queries and find no relation between the two.
If you are still concerned about things Google is ACTUALLY fucking with when it comes to search results, check out the Mozilla organization's research into the matter.
If it says 6.8 billion but only had 448 total…
I’ve had an issue in Google where a study I found in 2012 is no longer available in 2022, even in their search.