Depending on your definition of "way", this is true. Many persistent vector-like structures perform quite well, but if your runtime is dominated by iteration and random updates, you will still get a speedup from raw arrays, i.e. guaranteed contiguous blocks of memory.
> This is especially bad if you want the data structure to be shared across multiple threads. Just iterating over a tree now requires taking and releasing oodles of locks.
This is not. The whole point of combining immutable data structures with concurrency is that iterating over them requires no locks, because the structure won't change! Concurrent reads need no new code and certainly no locks. For concurrent writes, the usual strategy is a single CAS-like operation at the top level that swaps out the old immutable structure for a new one (e.g. Clojure atoms, Java AtomicReferences, Haskell IORefs, etc.). That is one operation for the whole structure, not one per level.
In general, concurrent update in the functional programming world is handled by CAS, not locks.
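As a minimal sketch of that pattern in Java: hold an immutable list in an `AtomicReference`, and on each write build a new immutable list and attempt one `compareAndSet` at the top level, retrying on contention. (The class and method names here are made up for illustration; a real persistent vector would share structure instead of copying, but the CAS mechanics are the same.)

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class CasDemo {
    // Shared state: a single reference to an immutable list.
    // Readers just call ref.get() and iterate -- no locks needed,
    // because the list they obtained can never change underneath them.
    static final AtomicReference<List<Integer>> ref =
        new AtomicReference<>(List.of(1, 2, 3));

    // Writer: read the current list, derive a new immutable list,
    // then try one CAS at the top level. If another thread won the
    // race, the CAS fails and we retry against the fresher list.
    static void append(int value) {
        while (true) {
            List<Integer> current = ref.get();
            List<Integer> copy = new ArrayList<>(current);
            copy.add(value);
            List<Integer> next = Collections.unmodifiableList(copy);
            if (ref.compareAndSet(current, next)) {
                return;
            }
        }
    }

    public static void main(String[] args) {
        append(4);
        System.out.println(ref.get()); // prints [1, 2, 3, 4]
    }
}
```

Note that the CAS is on the single top-level reference, never on interior nodes; a reader that grabbed the old list keeps iterating it safely while writers install new versions.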