By "modern hardware" I assume you do not mean x86? Self-modifying code (without flushing the I-cache) is still allowed even on the latest x86 processors. Modern OSes indeed do not map pages as both writable and executable by default, but changing that takes a single mprotect() or VirtualProtect() call. Also, on RISC architectures, D-caches are generally not cleared as part of that process. After all, you are writing instructions, not data.
To the store instruction everything is data, even if it happens to be instructions. Stores initially go to the L1 D-cache, and unless the I- and D-caches are coherent, explicit cleaning (D) and invalidating (I) is required. Maybe they are coherent on x86, but I know with certainty that they are not on, for example, ARM.
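
To make the sequence concrete, here is a rough sketch of what a JIT-style code writer might do, assuming Linux and GCC or Clang. The function name run_generated_code is made up for illustration, and the machine code bytes are hand-assembled for x86-64 and AArch64 only. The cache step uses the compiler builtin __builtin___clear_cache, which on ARM should expand to the D-cache clean and I-cache invalidate described above, and on x86 typically compiles to nothing.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: writes a tiny "return 42" function at run time,
 * makes it executable, and calls it. Returns -1 if the OS refuses. */
int run_generated_code(void)
{
#if defined(__x86_64__)
    /* mov eax, 42 ; ret */
    static const uint8_t code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
#elif defined(__aarch64__)
    /* mov w0, #42 ; ret */
    static const uint32_t code[] = { 0x52800540u, 0xD65F03C0u };
#else
#error "this sketch only carries machine code for x86-64 and AArch64"
#endif
    const size_t len = 4096;       /* one page is assumed to be enough */
    assert(sizeof code <= len);

    /* Start with a writable, non-executable anonymous mapping. */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Ordinary stores: to the CPU these bytes are data and go
     * through the D-cache. */
    memcpy(buf, code, sizeof code);

    /* Clean the D-cache and invalidate the I-cache for the written range,
     * so instruction fetch sees the new bytes on non-coherent designs. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    /* The mprotect() step: flip the page from writable to executable. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* Casting a data pointer to a function pointer is not strict ISO C,
     * but it is the usual practice on POSIX systems. */
    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();

    munmap(buf, len);
    return result;
}
```

Calling run_generated_code() should return 42 if the OS allows the mapping. On x86 you can usually delete the __builtin___clear_cache line and nothing changes; on ARM, leaving it out risks executing stale bytes from the I-cache, which is the point being made above.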