Although I love the coinage "2 demential space", I think you mean "comparing one-dimensional space [audio] to three-dimensional space [video]". A two-dimensional signal might be a still image, or a temporal sequence of samples from a one-dimensional array of sensors, such as those in a single slice of a CT machine or a linear MIMO antenna array. A video signal has two spatial axes plus time, so it is three-dimensional, not two-dimensional, and probably not "2 demential" either.
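
If it helps to see the counting spelled out, here is a minimal sketch using plain nested Python lists. The helper `ndim` and the toy signal sizes are made up for illustration; the only point is that the number of independent indices you need to address one sample is the dimensionality of the signal.

```python
def ndim(signal):
    """Return the nesting depth of a non-empty, rectangular nested list,
    i.e. how many indices are needed to address a single sample."""
    depth = 0
    while isinstance(signal, list):
        depth += 1
        signal = signal[0]  # assumes every level is non-empty
    return depth

# Hypothetical toy signals; the sizes are arbitrary.
audio = [0.0] * 8                                         # indexed by (time)
image = [[0] * 4 for _ in range(3)]                       # indexed by (row, column)
video = [[[0] * 4 for _ in range(3)] for _ in range(2)]   # indexed by (frame, row, column)

print(ndim(audio), ndim(image), ndim(video))  # prints: 1 2 3
```

This counts only the axes of the domain (time, rows, columns). Things like stereo or color channels are usually treated as the signal being vector-valued rather than as extra dimensions.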