Games.
Big/specialiased games to be precise, as for smaller projects managed language offer good enough performance.
High end games programming sometimes knocks the love of malloc()/free() (and naive RAII) out of them.
Note also that with (tracing) GC, you not only pay for the cost of allocation, but also pay for the time your allocated objects live.
A more nuanced take is that I can, in practice, map considerable chunk of developers I met and their understanding of memory as:
"it's magic" > "manual memory management rulz! my C/C++/Rust will be always better than GC" >>> people who actually studied automatic memory management a bit >> people who adjust memory management techniques to specific use cases
The fall off is pretty much exponential in amount of people, which I will admit does make some people defensive thanks to second group being both numerically big and undying because programmers suck at educating new generations. Same can be seen in various other areas, not just GC (though GC links with general trope of "opinions formed from people repeating things learnt from xeroxed half dog eaten copies of notes somewhat relevant on 256kB IBM 5150 PC and Turbo Pascal")
A lot of the time I do encounter people who essentially parrot claims about GC without any understanding of what's happening underneath, and indeed sometimes going about "crazy awesome non-GC solution!" that turns out to call out to whatever malloc()/free() happens to live in libc when I pull the cover and with just as much unpredictability as you can get.
As for paying - allocations in GC systems tend to be blindingly fast, but yes, collection does involve other costs.
A funny story related to all that from one of my previous jobs includes a high performance C++ code base that used forking to utilize preloaded code and data and to easily clean up after itself.
Turned out that naive-ish (even with jemalloc dropped in) memory management in C++ code resulted in latency spikes not because of allocation/freeing, but because they put no control over where things got allocated which resulted in huge TLB and pagetable thrashing as copy-on-write got engaged after forking.
To the point that using something like Ravenbrook's Memory Pool System (a GC) with threads quite possible would let them hit performance targets better.