I/O is done piecewise, a line at a time. The file is never "loaded up". Again you're applying an intuition about how parsers are presented to college students (suck it all into RAM and build a big in-memory representation of the syntax tree) that doesn't match the way actual config file parsers work (read a line and interpret it, then read another).
In said systems, RAM was such an expensive resource that we had to save individual bits wherever we could. Such as only storing the last two digits of the year (aka the millennium bug).
The computational cost of infrequently rescanning the config files then freeing the memory afterwards was much cheaper than the cost of storing those config files into RAM. And I say “infrequently rescanning” because you weren’t talking about people logging in and out of TSSs at rapid intervals.
That all said, sshd was first written in the 90s so I find it hard to believe RAM considerations was the reason for the “first match” design of sshd’s config. More likely, it inherited that design choice from rsh or some other 1970s predecessor.