It takes four dimensions to specify a ray: two for its direction, and two for where it passes through a surface (whether that surface is the display or the viewer's eye). You could specify a ray with five dimensions (three spatial, two angular), but that's overdetermined: sliding the starting point along the ray's own line gives you the same ray, so it's really a four-dimensional quantity. (Source: I do this for a living. But Google 'plenoptic function' for a more technical explanation.)
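To make the "overdetermined" point concrete, here is a minimal sketch in Python. It assumes the reference surface is the plane z = 0 and encodes direction as two slopes (dx/dz, dy/dz). The function names `ray_to_4d` and `point_along` are made up for illustration, and this is just one of several possible 4D parametrizations.

```python
def ray_to_4d(origin, direction):
    """Reduce a ray given as (3D point, 3D direction) to four numbers.

    Assumed convention: (u, v) is where the ray's line crosses the plane
    z = 0, and (a, b) are the direction slopes dx/dz and dy/dz.
    """
    x, y, z = origin
    dx, dy, dz = direction
    if dz == 0:
        raise ValueError("ray is parallel to the reference plane z = 0")
    a, b = dx / dz, dy / dz   # two numbers for direction
    u = x - z * a             # two numbers for position on the plane
    v = y - z * b
    return (u, v, a, b)


def point_along(origin, direction, t):
    """Slide the starting point a distance parameter t along the ray."""
    return tuple(o + t * d for o, d in zip(origin, direction))


origin = (1.0, 2.0, 4.0)
direction = (0.5, -0.25, 1.0)
moved = point_along(origin, direction, 2.0)

first = ray_to_4d(origin, direction)
second = ray_to_4d(moved, direction)
print(first == second)  # the fifth number (where along the line) drops out
```

The two starting points differ in all three spatial coordinates, yet they collapse to the same four numbers, which is the redundancy described above.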
Those four dimensions are projected down to two on the retina, but the exact projection depends on where the person is focusing. (That's how we understand focus: changing the lens parameters changes the projection, which lets us focus at different depths in the scene.)
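A rough way to see the focus-dependent projection is shift-and-add refocusing on a toy "flatland" light field, with one angular dimension and one spatial dimension instead of two of each. This is a simplified sketch, not a model of the eye's optics: the function name `refocus`, the integer `shift` parameter, and the sample data are all assumptions made for illustration.

```python
def refocus(light_field, shift):
    """Project a flatland light field L[u][x] onto a 1D image.

    Each row is the view through one aperture position u. The `shift`
    parameter (integer pixels per aperture step) stands in for where the
    lens is focused: each view is shifted in proportion to its aperture
    position, then all views are averaged.
    """
    n_views = len(light_field)
    width = len(light_field[0])
    center = n_views // 2
    image = [0.0] * width
    for u_index, view in enumerate(light_field):
        offset = shift * (u_index - center)
        for x in range(width):
            src = x + offset
            if 0 <= src < width:
                image[x] += view[src]
    return [value / n_views for value in image]


# A single point of light whose position moves one pixel per aperture step,
# as it would for a point at some fixed depth (made-up sample data).
light_field = [
    [0, 0, 1, 0, 0, 0, 0],  # u = -1
    [0, 0, 0, 1, 0, 0, 0],  # u =  0
    [0, 0, 0, 0, 1, 0, 0],  # u = +1
]

sharp = refocus(light_field, 1)    # focused at the point's depth
blurred = refocus(light_field, 0)  # focused somewhere else
print(sharp)
print(blurred)
```

Same light field, two different projections: with the matching shift all three views land on one pixel, and with the wrong shift the point smears across three pixels.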