TL;DR: because it is done over a slow protocol (I²C), and because the code contains artificial delays to accommodate slower devices.
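To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes a 100 kHz standard-mode I²C clock, a 128-byte EDID block, and 3 bytes of addressing overhead per read. The 10 transactions and the 50 ms pause between them are purely illustrative values, not measurements from any real monitor or driver.

```python
# Rough estimate only; every constant below is an assumption for illustration.

def i2c_transfer_ms(payload_bytes, clock_hz=100_000, overhead_bytes=3):
    """Time on the wire for one I2C read, in milliseconds.

    Each byte costs 9 clock cycles: 8 data bits plus 1 ACK/NACK bit.
    overhead_bytes covers the write address, the offset, and the read address.
    Start/stop conditions and clock stretching are ignored.
    """
    bits = (payload_bytes + overhead_bytes) * 9
    return bits / clock_hz * 1000.0


def negotiation_ms(transactions, payload_bytes, delay_ms):
    """Total time for several transactions, each followed by a fixed pause."""
    return transactions * (i2c_transfer_ms(payload_bytes) + delay_ms)


if __name__ == "__main__":
    # One 128-byte EDID block at 100 kHz.
    print(f"EDID read: {i2c_transfer_ms(128):.2f} ms")
    # Hypothetical: 10 short exchanges of 16 bytes, each followed by a 50 ms pause.
    print(f"With delays: {negotiation_ms(10, 16, 50):.1f} ms")
```

Under these assumptions the raw EDID read comes out at roughly 12 ms, so once a handful of fixed pauses are added they quickly account for most of the total.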
The negotiation is done when the host initiates it, usually when the cable is connected to a port or when coming back from deeper sleep/standby.
If the host stays active after the negotiation, the monitor doesn't go through the process again. I explained how that worked for me here: https://news.ycombinator.com/item?id=34051428
OP's problem occurs when the host being switched to is in standby. In that mode, most devices deactivate their video ports completely, which disconnects the monitors and removes them from the hardware device tree.
Switching to the inactive input without waking up the host device will usually show a "No signal" message on the monitor. Waking the device before switching to the input could improve the waiting time, but that is an implementation detail depending on both the monitor and the OS and can't be guaranteed.
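If the host runs Linux, you can check how it currently sees its outputs once it is awake. This is a minimal sketch that assumes the kernel exposes DRM connectors under /sys/class/drm (the usual location, but not guaranteed on every system). The function name connector_states is just something I made up for the example.

```python
from pathlib import Path


def connector_states(drm_root="/sys/class/drm"):
    """Return {connector_name: status} for every DRM connector found.

    Assumes the Linux sysfs layout where each connector directory
    (for example card0-HDMI-A-1) contains a 'status' file holding
    'connected', 'disconnected', or 'unknown'.
    """
    states = {}
    root = Path(drm_root)
    if not root.is_dir():
        # Not Linux, or no DRM driver loaded: nothing to report.
        return states
    for status_file in sorted(root.glob("card*-*/status")):
        try:
            states[status_file.parent.name] = status_file.read_text().strip()
        except OSError:
            # The connector disappeared or is unreadable; skip it.
            continue
    return states


if __name__ == "__main__":
    for name, state in connector_states().items():
        print(f"{name}: {state}")
```

Running it before and after switching the monitor input shows whether the host actually noticed the monitor coming back, which helps tell a slow negotiation apart from a port that was never re-enabled.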