Citation needed. I don't know of any GCed language that heap-allocates local variables in this way (well, maybe SML/NJ does, but I doubt it). Certainly not Java or any Common Lisp I've ever used.
But I agree with your larger point: heap allocation is never quite as fast as stack allocation once you factor in the additional GC load. I don't actually know how close it gets with modern collectors; I'd love to see some numbers.