Everything is a trade-off. Immutable objects make it easier to stay safe when multiple threads do work concurrently. Shared mutable state is still very difficult to get right, and at the point where you're introducing locks you've often crippled performance anyway.
We have so many cores now that it tends to be a better trade-off to have many threads doing some wasteful work (copies, extra GC pressure, potentially several threads duplicating the same work) than to try to have a perfectly optimized single thread.
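To make the "copy instead of lock" idea concrete, here is a minimal sketch in Python. The `Account` type and `apply_interest` function are hypothetical names made up for this illustration. Each worker thread receives an immutable value and returns a fresh copy, so no lock is needed. Note that on standard CPython the GIL keeps these threads from running Python code in parallel, so this shows the pattern rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Account:  # hypothetical example type; frozen=True blocks attribute assignment
    owner: str
    balance: int


def apply_interest(account: Account, rate_percent: int) -> Account:
    # Build a new object rather than mutating the input.
    # The extra allocation is the "wasteful work" traded for lock-free safety.
    new_balance = account.balance + account.balance * rate_percent // 100
    return replace(account, balance=new_balance)


# An immutable tuple of immutable objects can be shared across threads freely.
accounts = (Account("a", 100), Account("b", 200), Account("c", 300))

with ThreadPoolExecutor(max_workers=3) as pool:
    # Each task works on its own input and returns its own result.
    updated = tuple(pool.map(lambda acc: apply_interest(acc, 10), accounts))

total = sum(acc.balance for acc in updated)
```

The originals in `accounts` are left untouched, so any other thread still holding them sees a consistent snapshot. The cost is one extra object per update, which is the kind of waste the comment above argues is usually worth paying.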