"can't be bothered to learn how a profiler works"
To be fair, profiling is way more difficult than it was in the days of single-core local applications. With a single-threaded, single-machine application you get a very clear and simple tree-chart of where your program's time is spent, and the places to optimize are dead obvious.
Even if you're using async/await and mostly just releasing the thread while you await the response, the end-user experience of that time is the same. They don't give a crap that you're being thoughtful to the processor if it's still 0.5s of file IO before they can do anything. But now a profiler that only counts CPU time is lying to you and saying "nope, the processor isn't spending any time in that wait, your program is fast!".
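Here's a rough sketch of that gap in Python, measuring the same awaited call two ways: wall-clock time (what the user feels) and CPU time (what a CPU-time profiler counts). `load_file_stub` and `measure` are made-up names for illustration, and `asyncio.sleep` is only standing in for the 0.5s of file IO, so treat it as a toy, not a benchmark.

```python
import asyncio
import time


async def load_file_stub(delay_s: float = 0.5) -> bytes:
    """Hypothetical stand-in for slow file IO; the sleep is an assumption."""
    # The thread is released here, so almost no CPU time is consumed.
    await asyncio.sleep(delay_s)
    return b"data"


def measure(delay_s: float = 0.5) -> tuple[float, float]:
    """Return (wall_seconds, cpu_seconds) for one awaited call."""
    wall_start = time.perf_counter()  # wall clock: includes time spent waiting
    cpu_start = time.process_time()   # CPU time of this process: excludes waiting
    asyncio.run(load_file_stub(delay_s))
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


if __name__ == "__main__":
    wall, cpu = measure()
    print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

The wall-clock number should land around the full delay while the CPU number stays tiny. If your profiler has a wall-clock mode, that's the one that matches what the user sits through.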