But I can't imagine a lexer would ever be the performance bottleneck in a compiler.
I am not sure if this is true any more. It probably depends on the language.
Even as CPUs grow faster, code bases grow bigger, so the number-of-bytes argument is still important. On the other hand, heavy optimisations and difficult languages like C++ will shift the bottlenecks to later stages.
OTOH that probably also depends on your use case; JS, for example, needs to be parsed on every page load. Some JS VMs parse the source code lazily: at first the parser only detects function boundaries, and a function's body is only parsed completely once the function is actually executed.
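A minimal sketch of the lazy-parsing idea, in Python for brevity (the pre-parse pass just matches braces to record function boundaries; everything here, including the tiny "language", is made up for illustration):

```python
# Hypothetical sketch of lazy function parsing, loosely modelled on what
# some JS VMs do: a cheap pre-parse pass only records function boundaries
# by matching braces; the body is "parsed" the first time it is called.

def find_function_spans(src):
    """Record {name: (start, end)} for each `function name() {...}`."""
    spans = {}
    i = 0
    while True:
        i = src.find("function ", i)
        if i == -1:
            break
        name = src[i + len("function "):src.index("(", i)].strip()
        body_start = src.index("{", i)
        depth, j = 0, body_start
        while True:  # match braces to find the end of the body
            if src[j] == "{":
                depth += 1
            elif src[j] == "}":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        spans[name] = (body_start, j + 1)
        i = j
    return spans

class LazyFunction:
    def __init__(self, src, span):
        self.src, self.span = src, span
        self.parsed = None  # full parse is deferred until first call

    def __call__(self):
        if self.parsed is None:  # the expensive parse happens only now
            self.parsed = self.src[self.span[0]:self.span[1]]
        return self.parsed

src = "function a() { return 1; } function b() { return 2; }"
funcs = {name: LazyFunction(src, span)
         for name, span in find_function_spans(src).items()}
# Only `a` ever gets fully parsed; `b`'s body is never touched.
print(funcs["a"]())
```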
You're both right. The problem with C++ is how '#include' works: it just splices in the content of every included file, so there's far more overhead on the lexing side.
Rust's "equivalent" of '#include' is 'use', which doesn't have this problem, because it doesn't concatenate files.
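The difference can be sketched in a few lines of Python (the file names and the "compilation" model are made up; this only illustrates why textual inclusion multiplies lexer work):

```python
# Hypothetical contrast between the two models. `#include` splices the
# full text of the header into the translation unit (so the lexer sees it
# again in every file that includes it), while a `use`-style import only
# records a reference to a separately compiled unit.

headers = {
    "big.h": "int helper(void);\n" * 10_000,  # stand-in for a large header
}

def preprocess(source):
    """C-style: replace each #include line with the header's full text."""
    out = []
    for line in source.splitlines():
        if line.startswith('#include "'):
            out.append(headers[line.split('"')[1]])
        else:
            out.append(line)
    return "\n".join(out)

def resolve_uses(source):
    """Rust-style: just collect the names of the referenced modules."""
    return [line.split()[1].rstrip(";")
            for line in source.splitlines() if line.startswith("use ")]

c_unit = '#include "big.h"\nint main(void) { return 0; }'
rust_unit = "use big;\nfn main() {}"

print(len(preprocess(c_unit)))   # every including file pays for the header
print(resolve_uses(rust_unit))   # the importer only lexes one short line
```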
Possibly with Rust that moves a bunch of the grunt work into the code generator. That is, whereas in C/C++ all your type information is set in stone by the time you're generating code, in Rust a bunch of stuff possibly isn't resolved yet at that point.
It might be that for non-optimised output, the parser becomes the bottleneck again. Or it might be that LLVM just imposes overhead even in the unoptimised case, which pays for itself in the optimised case.
My experience is that parsers for source languages can reach into the 1-10 MB/s range, and depending on how complex the IRs and transformations after that are, code generation is usually around 0.5-5 MB/s. The stuff in the middle (dealing with IRs) is harder to measure in terms of bytes.
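For anyone who wants to put a MB/s number on their own front end, the measurement itself is trivial; here's a sketch with a toy whitespace "tokenizer" standing in for a real lexer (substitute yours):

```python
# Rough way to get a throughput figure like "MB/s" for a front end:
# time a pass over a synthetic input and divide bytes by seconds.
# The tokenizer below is a toy stand-in, not a real lexer.

import time

def toy_lex(src):
    return len(src.split())  # whitespace "tokens" only, for illustration

src = "fn foo ( x ) { return x + 1 ; }\n" * 100_000
start = time.perf_counter()
tokens = toy_lex(src)
elapsed = time.perf_counter() - start
mb_per_s = len(src) / elapsed / 1e6
print(f"{tokens} tokens, {mb_per_s:.0f} MB/s")
```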
one rather trivial way to observe the effects of avoiding building a source file is to use ccache. ccache avoids the recompilation (even if you do a make clean, or some such), and it is not uncommon to observe speedups of a factor of 5 or so.
however, once you have crossed that barrier, you hit the linking wall, which is where you end up spending a large portion of the time. gold (https://en.wikipedia.org/wiki/Gold_(linker)) optimizes that, i.e. it supports incremental linking, but unfortunately i haven't had any experience with large code bases where it is being used.
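the gist of the ccache trick can be sketched in a few lines (the cache keying here is a deliberately simplified stand-in, and the "compiler" is faked; real ccache also mixes in compiler identity, flags, and more):

```python
# Minimal sketch of the idea behind a compilation cache: key the cache on
# a hash of the preprocessed source plus the flags, so a `make clean` that
# only deletes object files still hits the cache on the next build.

import hashlib

cache = {}
compile_count = 0

def expensive_compile(preprocessed):
    global compile_count
    compile_count += 1  # stands in for actually running the compiler
    return f"objectcode({len(preprocessed)} bytes)"

def cached_compile(preprocessed, flags="-O2"):
    key = hashlib.sha256((flags + preprocessed).encode()).hexdigest()
    if key not in cache:
        cache[key] = expensive_compile(preprocessed)
    return cache[key]

src = "int main(void) { return 0; }"
cached_compile(src)
cached_compile(src)   # second build is a cache hit
print(compile_count)  # the "compiler" ran only once
```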