> You can't have it both ways.
Your argument goes against the empirical evidence in this instance. You can have it "both ways" when client-side feature detection is the slower choice on high-bandwidth connections and you want to consistently render the UI within 200ms.
Performance goes beyond raw bandwidth and, as with all things engineering, involves tradeoffs: client-side feature detection has higher latency (a server-client-server round trip plus connection overhead) and is therefore unsuitable for logic that must execute before the first render of above-the-fold content. All of this is pragmatic, well known, and uncontroversial among people who work on optimizing front-end performance. Your no-server-side-detection absolutism is disproved by the many instances of UA-string parsing in our present reality.
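
To make the tradeoff concrete, here is a minimal sketch of the server-side approach in TypeScript. Everything in it is an assumption made up for illustration: the `Layout` type, the `layoutFromUserAgent` and `renderShell` names, and the deliberately crude regex, which is nowhere near a complete UA parser. The point is only that the server can guess a layout from the request it already has, so the first response needs no extra round trip.

```typescript
// Hypothetical sketch: pick a layout on the server from the User-Agent
// header so the first HTML response already matches the device.
type Layout = "mobile" | "desktop";

// Crude heuristic (an assumption for illustration, not a real UA parser).
function layoutFromUserAgent(userAgent: string | undefined): Layout {
  if (!userAgent) return "desktop"; // header missing: fall back to a default
  return /Mobi|Android|iPhone|iPod/i.test(userAgent) ? "mobile" : "desktop";
}

// Build the above-the-fold shell for the guessed layout.
function renderShell(userAgent: string | undefined): string {
  const layout = layoutFromUserAgent(userAgent);
  return `<body class="layout-${layout}"><div id="app"></div></body>`;
}
```

Client-side feature detection can still run after first paint and correct a wrong guess; the sketch only covers the decision that has to be made before anything renders.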