To add to both of your arguments, "JS-routers" like SvelteKit/NuxtJS/NextJS are literally reinventing server-side rendering for the client, which then calls the actual server to get data ... to render HTML.
HTMX, LiveView, Livewire, Hotwire, etc. are escape hatches back to sanity.
It's all done in one call - at least in Svelte. You can even render all this into a fully static site.
Meanwhile, htmx and the like are the same idea that was popular 15-ish years ago, which died for good reasons.
this was because HTML stopped advancing as a hypermedia, as I said in my original comment
htmx and other libraries are attempting to address that by pushing HTML forward as a hypermedia, which allows you to implement more sophisticated user interfaces purely in hypermedia terms:
the hypermedia idea itself is extremely innovative and interesting. at some level, today's javascript applications are new iterations of an even older idea: client-server RPC applications, as we built back in the 1980s
But I neither want nor need much interactivity in Web sites. Interactivity is for applications and applications are much better on desktop or mobile than on the Web.
Nitpick: "media" is plural. The singular is "medium".
While there's no shortage of frameworks on the frontend, it's all still JS/TS, so everyone is using the same idioms.
Other problems:
1. 1:1 mapping of endpoints to presentation. Common cases where this blows up:
A. A list of items which looks different depending on the context or has different styles of presentation switchable via buttons.
B. Two or more different data sources, combined in different ways.
Now you need an endpoint for each context (A) or combination (B), which makes a mess in your cache. You also send way more data this way, especially if your users tend to fidget with the view switches (A).
2. Inherently slow. You either replace whole parts of the DOM, which triggers a massive reflow, or you make smaller changes by comparing the new and old HTML, in which case you need to parse both versions and deduce what changed and how.
Unfortunately, since this is all just HTML, you don't have object references to the data that was used, so you can't employ some of the neat performance tricks modern frameworks use, like detecting a row swap.
3. HTML elements with state, e.g. a canvas, video/audio, a file upload input or even a textarea with a selection. You have to store this state somewhere, but sometimes (like with a "tainted" canvas) you can't access it at all for security reasons, so now you have to cut around it when you're updating.
Selection is an especially nasty piece of work, because browser implementations differ significantly to this day.
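To make the "detecting a row swap" remark in point 2 concrete, here is a minimal sketch of that kind of check. It assumes every row carries a stable key, which is roughly what a framework gets from object references or explicit keys. `detectRowSwap`, `Key` and `Swap` are hypothetical names made up for this illustration, not any framework's actual API.

```typescript
// Hypothetical helper: given the row keys before and after an update,
// report whether the change is exactly one pair of rows trading places.
type Key = string | number;

interface Swap {
  i: number; // index of the first moved row
  j: number; // index of the second moved row
}

function detectRowSwap(before: readonly Key[], after: readonly Key[]): Swap | null {
  // Rows were added or removed, so this is not a pure swap.
  if (before.length !== after.length) return null;

  // Record the positions whose key changed; give up at the third one.
  let i = -1;
  let j = -1;
  for (let k = 0; k < before.length; k++) {
    if (before[k] === after[k]) continue;
    if (i === -1) i = k;
    else if (j === -1) j = k;
    else return null; // more than two rows moved
  }
  if (j === -1) return null; // zero or one position changed

  // The two changed positions must hold each other's old keys.
  return before[i] === after[j] && before[j] === after[i] ? { i, j } : null;
}

// Rows "b" and "d" trade places, so only two DOM nodes need to move.
console.log(detectRowSwap(["a", "b", "c", "d"], ["a", "d", "c", "b"])); // { i: 1, j: 3 }
```

With only two HTML strings to compare, the same conclusion requires parsing both and matching rows up by content or id first, which is the extra work point 2 is complaining about.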