...it's complicated. Very complicated. However complicated you think it is, it's more complicated than that. Please note that I'm not an expert in human eyeball physiology, I'm just a computer programmer who's tried pretty hard to come to a better understanding of how to make computer vision better. (I've failed, fyi. Caveat emptor.)
The human eye has two basic photoreceptor types, rod cells and cone cells, and there are three subtypes of cones: short, medium, and long. The three subtypes of cone cells sense blue, green, and red light more or less directly. The response curves of the medium and long cone cells, which detect green and red light, almost entirely overlap. [0] It is more accurate to say that long cone cells detect yellow light than it is to say they detect red light. There is a brain system which measures the difference in response between the long (red) and medium (green) cells and uses that difference to say "aha! this must be red!"
The ratio of short (blue), medium (green), and long (red, well, yellow) cone cells is roughly 2%, 2/3, and 1/3. The cells in your eye which detect blue light are more or less a rounding error. The cells which detect green light are roughly twice as numerous as the cells which detect red (well, yellow) light. If you see a thing and think, "man, that's awfully blue," it's not because your eyes are telling you "hey, this thing is awfully blue." The "blue" signal is barely noticeable in the overall signal, but your brain jacks up its responsiveness to that minuscule blue signal.
One of the side effects of the completely fucked ratios between the three types of cones is that your perception of the overall brightness of a thing is mostly down to how green it is. This shows up in lots of standards; NTSC, JPEG, the whole nine yards. If you've ever implemented a conversion between RGB and any luminosity-chroma colorspace (YUV, YCbCr, YIQ, any of them), there's a moment where you'll go "wait a minute, this doesn't make any fucking sense". You look at the numbers and the luminosity channel is just... green, and you know that the other two chroma channels are quartered in resolution. And you'll think that makes no sense. But that's how it works.
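Here's what that moment looks like in code. This is a minimal sketch of the BT.601 full-range RGB-to-YCbCr conversion (the one JPEG uses); look at how lopsided the luma weights are:

```python
# BT.601 full-range RGB -> YCbCr, with channels normalized to 0..1.
# Green carries ~59% of the luminosity signal; blue barely 11%.

def rgb_to_ycbcr(r, g, b):
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# Pure green reads as far brighter than pure blue:
y_green = rgb_to_ycbcr(0.0, 1.0, 0.0)[0]  # 0.587
y_blue  = rgb_to_ycbcr(0.0, 0.0, 1.0)[0]  # 0.114
```

The three luma weights sum to 1.0, so white (1, 1, 1) comes out at full brightness with zero chroma; all the interesting asymmetry is in how that 1.0 is split between the channels.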
Then you'll remember that color sensors have their pixels arranged in groups of four, with two green pixels, one red, and one blue. There must be some green conspiracy.
And there is. It's your brain. It's your eyeballs with 2/3 of its cone cells being green sensitive ones.
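That groups-of-four arrangement is the Bayer color filter array. A toy sketch, assuming the common RGGB layout (sensors vary in which corner holds which color):

```python
# Which color filter sits over sensor pixel (row, col) in an RGGB Bayer mosaic.
# Even rows alternate R, G; odd rows alternate G, B.

from collections import Counter

def bayer_color(row, col):
    if row % 2 == 0:
        return 'R' if col % 2 == 0 else 'G'
    return 'G' if col % 2 == 0 else 'B'

# Count the filters over a small 4x4 patch of sensor:
counts = Counter(bayer_color(r, c) for r in range(4) for c in range(4))
# counts: 8 green, 4 red, 4 blue -- half the pixels are green,
# the same 2:1:1 skew as your cones (roughly).
```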
Those are your cone cells. Rod cells are entirely different. It's trivial to say well, cone cells see color, rod cells see black and white, but it's more complicated than that. Rod cells are excellent in low light conditions, cone cells not so much. Cone cells see motion very well, rod cells not so much. Cone cells can discern fine detail, rod cells cannot. Rods and cones are not evenly distributed across the retina either; cone cells are densely packed in the center (the fovea), rod cells are more common in peripheral vision.
Look at a colorful thing directly; take a note of how colorful it is. Now look away from it, so it's only in your peripheral vision; take a note of how colorful it is. Does it seem just as colorful? It isn't. That's your brain fucking with you. Your brain knows it's in your peripheral vision and all the colors are muted out there, so your brain exaggerates the colorfulness. Cone cells are 30 times as dense in the center of your vision as they are just outside the center of your vision. [1] That's why you can read a word directly where you're looking but it's very difficult to read elsewhere.
The reality is that your retinas give a fucking mess of bullshit to your brain, and the brain is the most incredible image processing system conceivable. It takes bullshit that makes no damn sense and -- holy shit I forgot to talk about blind spots.
Ok, so your rods and cones have a light sensitive thing, with a wire in the back, and all the wires get bundled up in the optic nerve that goes to the brain. Here's the thing: they're fucking plugged in backwards. The wires go forward, and are bundled up between your retinas and the stuff you're looking at. The spot where the big fat optic nerve burrows back through your retina is therefore a sizable chunk of your visual field where you can't see anything. Your brain just... invents stuff to fill the hole.
Other weird stuff. If it's bright, the rods and cones send no signal; if it's dark, they send a strong signal. It's inverted. There's apparently a very good reason for this but I don't remember what it is. Also, the rods continuously produce a light-sensitive pigment (rhodopsin) that amplifies their sensitivity but is destroyed in the process, and it takes a long time to build up a reserve. This is why it takes time to "build up" your dark vision, and why it's so easily destroyed by lighting a cigarette. The physiology of "ow it's bright" as opposed to "it's bright" isn't just in your retinas; it also involves your eyelids and your iris, but more importantly, it's shared between your two eyes. This is why closing one eye makes it less painful when you go from a dark place to a bright place.
The point is, the study of human vision is not the study of the human eye. The study of human vision is the study of the human brain.
Much of what we do with color spaces and image compression is dictated by our stupid smart eyeballs and our stupid smart brains. Video codecs compress with 4:2:0 chroma subsampling because the brain's gonna decompress that shit better than a computer can anyway. Cameras have twice as many green sensitive pixels as blue or red pixels because the eye's resolution is much sharper in green than in other colors. More advanced image and video compression schemes will try harder to account for human eye-brain physiology.
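For the curious, 4:2:0 means: keep luma at full resolution, store one chroma sample per 2x2 block of pixels. A toy sketch of the chroma half of that (real codecs use smarter filters than a plain box average; this just shows where the 4x data reduction comes from):

```python
# Downsample a chroma plane 4:2:0-style: average each 2x2 block to one sample.
# `plane` is a list of equal-length rows of numbers, with even dimensions.

def subsample_420(plane):
    out = []
    for r in range(0, len(plane), 2):
        row = []
        for c in range(0, len(plane[0]), 2):
            block_sum = (plane[r][c] + plane[r][c + 1] +
                         plane[r + 1][c] + plane[r + 1][c + 1])
            row.append(block_sum / 4.0)
        out.append(row)
    return out

# One 2x2 block of chroma collapses to a single value:
small = subsample_420([[10, 12],
                       [14, 16]])  # [[13.0]]
```

So for every four pixels, a 4:2:0 image stores four luma samples but only one Cb and one Cr: half the raw data of 4:4:4, and your brain papers over the difference.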
[0] https://upload.wikimedia.org/wikipedia/commons/0/04/Cone-fun...
[1] https://upload.wikimedia.org/wikipedia/commons/3/3c/Human_ph...