That's a really poor excuse. If that "awareness" is missing, then the code is structured wrong, probably because somebody cargo-culted a design pattern without ever once thinking of the user. That said, there is a legitimate case for not knowing the result-set size up front, when three factors combine:
1. there's a potentially-huge result set;
2. almost nobody ever wants to see the whole result set, but rather almost always just wants to see the first N chunks, and then drops off;
3. the results are sourced from a data lake, or from eventually-consistent geographic shards, or any other process where you need to actively gather results together with a map-reduce.
When these three factors apply, it becomes very expensive to know exactly how many results you will have, because that changes your partial streaming map-reduce workload into a complete map-reduce workload, over potentially billions of records, just to validate their inclusion and then count them.
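To make the cost asymmetry concrete, here's a minimal sketch of that partial-vs-complete map-reduce tradeoff, with shards modeled as lazy pre-sorted iterators (the names `fetch_shard` and `merged_stream` are invented for illustration):

```python
import heapq
from itertools import islice

def fetch_shard(shard):
    """Stand-in for a remote shard: lazily yields (timestamp, record) pairs."""
    for ts, rec in shard:
        yield (ts, rec)

def merged_stream(shards):
    """The map-reduce as a stream: merge per-shard streams, newest-first."""
    return heapq.merge(*(fetch_shard(s) for s in shards), reverse=True)

# Two toy shards, each already sorted newest-first.
shards = [
    [(9, "a"), (5, "b"), (1, "c")],
    [(8, "d"), (2, "e")],
]

# Cheap: pull only the first N items; the shards are barely touched.
first_three = list(islice(merged_stream(shards), 3))
# first_three == [(9, "a"), (8, "d"), (5, "b")]

# Expensive: an exact total forces the merge to drain every shard completely.
total = sum(1 for _ in merged_stream(shards))
# total == 5
```

With two five-element lists the difference is invisible; with billions of records per shard, the first expression stays cheap and the second becomes the "complete map-reduce workload" described above.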
Think "Twitter timeline." If every client displaying a Twitter timeline needed to know in advance how many tweets they could ever see, total, if they "scrolled all the way back" — then Twitter's servers would fall over from the load of calculating that number.
Another example is Google Search, which, while not "infinite scroll" on the client, is still a "map-reduced stream" on the backend. Google has put some extra effort into heuristic scheduling logic for its Search map-reduce: it grabs an initial 20-page-or-so chunk of the stream and caches it on a sticky-session node for you. This means that, if your search-result set is less than 20 pages, you get to know the actual result-set size. If it's more than 20 pages, though, Google Search reverts to exactly the same "you'll only know when you're at the end when you get there" semantics of SQL cursors. You see the first 20 pages, with a "next" arrow to go beyond them; and then it just starts counting up, and up, and up...
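That chunk-caching heuristic can be sketched roughly like this (a simplification with invented names; the real scheduling logic is obviously far more involved):

```python
from itertools import islice

PAGE_SIZE = 10
CHUNK_PAGES = 20

def first_chunk(result_stream):
    """Cache up to CHUNK_PAGES pages of the stream on a sticky-session node.

    Returns (cached_results, exact_count_or_None): the count is only known
    if the stream ran dry inside the cached chunk.
    """
    limit = PAGE_SIZE * CHUNK_PAGES
    chunk = list(islice(result_stream, limit + 1))  # +1 probes for "more"
    if len(chunk) <= limit:
        return chunk, len(chunk)   # stream exhausted: exact count known
    return chunk[:limit], None     # unknown: just count up as you scroll

# A small result set fits in the chunk, so the size is known.
small_pages, small_count = first_chunk(iter(range(37)))
# small_count == 37

# A huge result set overflows the chunk; back to SQL-cursor semantics.
cached, count = first_chunk(iter(range(10**9)))
# count is None; len(cached) == 200
```

The `limit + 1` probe is the whole trick: fetching one record past the cache boundary is what distinguishes "exactly 200 results" from "200-and-counting".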
I'm glad I work in infra, not apps. Infra developers certainly make their own share of dumb decisions, but they're not that lazy and sloppy and user-blaming.
If you know you have one more result, that is necessarily because your data pipeline actually has that result available for it to count; i.e. the result record/tuple has been loaded into the database’s memory, and the database has determined that that record/tuple is valid and fresh. (And at that point, rather than counting, the DB may as well send you that record itself. Just counting it has already required almost all of the same work!)
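A look-ahead cursor makes that point concrete: the only way to answer "is there one more?" is to actually fetch and validate the next record, at which point you're holding it anyway. A minimal sketch (the `validate` callback is an invented stand-in for liveness/freshness checking):

```python
class LookaheadCursor:
    """Cursor that answers has_next() by fetching one record ahead."""

    def __init__(self, records, validate):
        self._it = iter(records)
        self._validate = validate
        self._buffered = None
        self._advance()

    def _advance(self):
        """Pull records until one passes validation, or the source runs dry."""
        self._buffered = None
        for rec in self._it:
            if self._validate(rec):
                self._buffered = rec
                break

    def has_next(self):
        # Knowing there's "one more" already cost us the fetch + validation.
        return self._buffered is not None

    def next(self):
        rec = self._buffered
        self._advance()
        return rec

# Only even records are "live"; odd ones are filtered out at fetch time.
cur = LookaheadCursor([1, 2, 3, 4], validate=lambda r: r % 2 == 0)
out = []
while cur.has_next():
    out.append(cur.next())
# out == [2, 4]
```

Note that `has_next()` does no work of its own: all the cost was paid in `_advance()`, exactly the "may as well send you that record" situation described above.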
Remember that MVCC exists. You can’t know how much of something you have as of a given instant without doing version deduplication/application of tombstone records. This is the reason that COUNT() in Postgres takes minutes/hours on large (>1bn records) partitioned tables: you have to actually visit records, to see whether they’re still part of the current MVCC transaction-version, and therefore whether they should be contributors to the current count.
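Here's a deliberately simplified model of that visibility check, loosely after Postgres's xmin/xmax tuple headers (the real visibility rules handle in-progress and aborted transactions and are considerably more involved):

```python
from collections import namedtuple

# Each tuple version records which transaction created it (xmin)
# and, if deleted, which transaction deleted it (xmax).
Tuple = namedtuple("Tuple", "xmin xmax value")

def visible(t, snapshot_xid):
    """A tuple counts only if created before our snapshot and not yet deleted."""
    created = t.xmin <= snapshot_xid
    deleted = t.xmax is not None and t.xmax <= snapshot_xid
    return created and not deleted

heap = [
    Tuple(xmin=1, xmax=None, value="live"),
    Tuple(xmin=2, xmax=5,    value="dead: deleted by tx 5"),
    Tuple(xmin=9, xmax=None, value="not yet visible to tx 7"),
]

# No shortcut: the count is a full scan, applying visibility per tuple.
count = sum(1 for t in heap if visible(t, snapshot_xid=7))
# count == 1
```

The point is that `count` can't be read off any header or metadata field; every tuple version has to be visited and judged against the snapshot, which is why the scan takes as long as it does.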
That applies whether or not you’re “counting” or actually streaming results. Given the architecture of both traditional data-warehouses — and of the map-reduce systems like Hadoop that are used to do reporting on data-lake data — you can’t know whether anything that’s in the rest of the data set is going to actually exist when you get to it. Your data warehouse might have a 100GB heap of data in a table, but everything after the first 1GB of it is dead tuples, such that after you’ve streamed the first 1GB of results, the rest of the streaming consists of the data warehouse sitting there silently for a minute or two (as it checks the liveness of those tuples) before saying “okay, nothing more, we’re done.”
And because of this, it’s not about precision. You can’t even guess. You can’t know whether you have one more result, or a billion more. Until you actually check them.
Yes, OLAP systems are different. OLAP systems operate in terms of infrequent batch inserts, giving the system time to build indices, generate counts, etc. in-between, that will all stay valid up until the time of the next batch insert. An index is, in a sense, a pre-baked answer of the “set of live tuple-versions” that a data-warehouse is holding. Count the table? Just return the size of the index. If you’ve only ever built OLAP-oriented systems, maybe it feels like these are the “simple, obvious” solutions to this problem.
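The batch-insert pattern above can be sketched in a few lines (class and field names invented for illustration):

```python
class BatchTable:
    """OLAP-style table: indices are rebuilt between infrequent batch inserts."""

    def __init__(self):
        self._rows = []
        self._live_index = []  # pre-baked "set of live tuple-versions"

    def batch_insert(self, rows):
        self._rows.extend(rows)
        # Between batches there's time to rebuild indices, counts, etc.,
        # all of which stay valid until the next batch arrives.
        self._live_index = [i for i, r in enumerate(self._rows)
                            if not r.get("deleted")]

    def count(self):
        # "Count the table? Just return the size of the index."
        return len(self._live_index)

t = BatchTable()
t.batch_insert([{"v": 1}, {"v": 2, "deleted": True}, {"v": 3}])
n = t.count()  # 2: the tombstoned row was excluded at index-build time
```

All the tuple-liveness work happens once, at batch time; query time pays nothing. That's exactly the luxury a high-ingest OLTP stream doesn't have.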
But none of the things we’re talking about — global-web search engines, social-network timelines, marketplace listings — are OLAP systems. They’re OLTP. They constantly get new results in, and people expect to be able to see freshly-inserted data in the results as soon as they insert it. Data comes in at too high a rate to generate “dataset snapshots” à la Elasticsearch. The pipeline has to deal with data as it comes, doing as little to it as possible so that it can ingest it all at the ridiculous rates required, by pushing all the work of validating tuple liveness/freshness off to query-time.
And given that, OLAP properties don’t obtain in such systems. It’s basically the CAP theorem at work: Consistency (and cross-shard index-building) requires time for the system to investigate itself; and you can’t get Availability (and/or cross-shard freshness) unless you run the system too fast to allow for that time.