The documentation also says that /cgthreads is the number of threads
used by the code generation and optimization passes
of the compiler, not by the whole compiler.
A compiler needs to do a lot beyond code generation and optimization (in debug builds, optimizations are often disabled entirely). For C++ you have overload resolution, template argument deduction, template instantiation, constexpr evaluation... and, well, parsing, tokenization, type checking, static analysis (e.g. for warnings), etc.
Parallel optimization and codegen are trivial compared with an end-to-end multi-threaded compiler. All LLVM frontends for all programming languages do parallel optimizations and parallel codegen; it's only an LLVM option away. You enable it, and it's done.
> The /MP (Build with Multiple Processes)
This is one process per translation unit; each translation unit is then compiled with a single thread until it reaches optimizations and codegen.
Often you need to finish compiling or linking some translation units before the rest of the build can continue.
The Rust compiler has pipelined compilation: compiling a translation unit starts as soon as the parts of its dependencies that it needs are available, before those dependencies have finished compiling. It also compiles each translation unit with multiple threads end-to-end, so if one of your translation units is much bigger than the others, you can speed up that one by throwing more threads at it.
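
To make the pipelining point concrete, here is a toy model, not how Cargo or rustc actually schedules work. It assumes each unit splits into a "frontend" phase (after which dependents can start) and a "backend" phase (optimizations and codegen), and that there are unlimited workers. The `Unit` type, the `finish_times` function, and all the timings are made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    name: str
    frontend: float   # hypothetical time until dependents can start using this unit
    backend: float    # hypothetical remaining time for optimizations and codegen
    deps: list = field(default_factory=list)

def finish_times(units, pipelined):
    """Return {name: finish_time}, assuming unlimited parallel workers."""
    by_name = {u.name: u for u in units}
    start = {}

    def start_time(name):
        if name not in start:
            ready = 0.0
            for d in by_name[name].deps:
                dep = by_name[d]
                # Pipelined: wait only for the dependency's frontend.
                # Not pipelined: wait for the dependency to finish completely.
                wait = dep.frontend + (0.0 if pipelined else dep.backend)
                ready = max(ready, start_time(d) + wait)
            start[name] = ready
        return start[name]

    return {u.name: start_time(u.name) + u.frontend + u.backend for u in units}

# Made-up dependency chain: app depends on util and core, util depends on core.
units = [
    Unit("core", frontend=2, backend=6),
    Unit("util", frontend=1, backend=3, deps=["core"]),
    Unit("app", frontend=2, backend=4, deps=["core", "util"]),
]

sequential = max(finish_times(units, pipelined=False).values())
pipelined = max(finish_times(units, pipelined=True).values())
print(sequential, pipelined)  # 18.0 9.0
```

With these numbers the non-pipelined build is a strict chain (8 + 4 + 6 = 18), while the pipelined one overlaps the backends of all three units and finishes at 9. Real builds have limited cores, so the gain is usually smaller.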