https://en.wikipedia.org/wiki/Cave_automatic_virtual_environ...
http://www.allosphere.ucsb.edu/
In VR headset terminology it has full field of view, zero latency head tracking, very high resolution, group experience, no fatigue, etc.
Obviously not everyone can have one in their house, but it's a great experience.
If it's like the ones I've been in or worked on, which use 3D glasses (active or passive) with IR reflectors tracked by multiple cameras, then it has neither full field of view nor zero latency, and I can't see how it would do group rendering.
The field of view is limited by the glasses: if your eye looks past the edge of the glasses, you see both eyes' images projected on the screens at once and it looks fuzzy. Obviously you can turn your head 360°, but that's the same in a headset.
In terms of head movement, a VR headset gives you full spherical coverage, not just 360° horizontally, which seems to be all this offers?
For tracking, if they use Vicon cameras or a similar system, you also have the usual motion-to-photon latency: camera latency + camera frame interval + CV algorithms + rendering + sending to the screens.
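To put a rough number on that, here's a back-of-envelope sum of the pipeline stages listed above. Every figure below is an illustrative assumption (a 120 Hz tracking camera, 60 Hz rendering and scanout), not a measurement of any particular CAVE:

```python
# Rough motion-to-photon latency budget for a camera-based
# (Vicon-style) tracking pipeline. All numbers are assumptions.
camera_latency_ms  = 4.0          # sensor exposure + readout (assumed)
frame_interval_ms  = 1000 / 120   # wait for next camera frame (120 Hz assumed)
cv_ms              = 5.0          # marker detection + pose solve (assumed)
render_ms          = 1000 / 60    # one render frame at 60 Hz (assumed)
scanout_ms         = 1000 / 60    # sending/scanning out to the projectors (assumed)

total_ms = (camera_latency_ms + frame_interval_ms
            + cv_ms + render_ms + scanout_ms)
print(f"estimated motion-to-photon latency: {total_ms:.1f} ms")
```

Even with these fairly generous assumptions the total lands around 50 ms, well above the ~20 ms motion-to-photon figure usually quoted for modern headsets.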
Also, none of the Cave systems I've seen use predictive tracking (like VR headsets do), where the head pose sent to the renderer is not the pose sampled by the tracker but the predicted pose your head will have by the time the image is displayed in the headset.
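The idea is simple to sketch. This is a deliberately minimal linear (dead-reckoning) predictor over yaw only; real headsets fuse IMU data and use higher-order prediction, but the principle is the same: extrapolate the sampled pose forward by the expected motion-to-photon latency. The function name and all numbers here are my own illustration:

```python
def predict_yaw(sampled_yaw_deg, yaw_rate_deg_per_s, latency_s):
    """Extrapolate head yaw from the tracker's sampled pose to the
    moment the image will actually hit the display, so the renderer
    draws for where the head *will* be, not where it was."""
    return sampled_yaw_deg + yaw_rate_deg_per_s * latency_s

# A head turning at 200 deg/s through a 25 ms pipeline would otherwise
# be rendered 5 degrees behind where it ends up:
print(predict_yaw(30.0, 200.0, 0.025))  # 35.0
```

Without that extrapolation, every millisecond of pipeline latency turns directly into angular error during head turns, which is exactly what you feel as swim or lag.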
Group experience? I'm curious how they manage that. For accurate multi-person VR you need a different render for each person based on their specific head position and orientation. In a Cave that would mean rendering 4 images per frame just to make it work for 2 people: for example, 360 Hz screens for a 90 fps per-person experience (90 fps being what the Rift and Vive deliver).
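The arithmetic above generalizes: with time-multiplexed active shutter glasses, every eye of every tracked viewer needs its own slice of the display's refresh. A one-line sketch (assuming straight time-multiplexing, with no view-sharing tricks):

```python
def required_refresh_hz(viewers, per_person_fps=90, eyes=2):
    """Display refresh needed to time-multiplex a separate stereo
    pair for each tracked viewer at the given per-person framerate."""
    return viewers * eyes * per_person_fps

print(required_refresh_hz(1))  # 180 -- one viewer, stereo at 90 fps
print(required_refresh_hz(2))  # 360 -- two viewers, as in the example above
```

It scales linearly with viewer count, which is why most CAVEs track only one person and let everyone else watch a view that's subtly wrong for them.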