While I don't know how RethinkDB is structured internally, I don't see any technical reason why a non-mapreduce group-by needs to load the entire table into memory instead of streaming it, or why a mapreduce group-by needs to be slow. M/R only becomes a slow algorithm once you involve shards and network traffic; any classical relational aggregation plan uses a kind of M/R anyway.
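To make the streaming point concrete, here is a rough sketch of a hash aggregation that consumes rows one at a time and only keeps one accumulator per group in memory. This is not how RethinkDB or any particular database implements it; `streaming_group_by`, `fake_table_scan` and the sample rows are all made up for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, Iterator


def streaming_group_by(
    rows: Iterable[dict],
    key: Callable[[dict], Hashable],
    init: Callable[[], int],
    step: Callable[[int, dict], int],
) -> Dict[Hashable, int]:
    """Fold rows into per-group accumulators without materializing the table."""
    groups: Dict[Hashable, int] = {}
    for row in rows:  # rows are pulled one at a time from the scan
        k = key(row)  # the "map" half: derive the group key
        groups[k] = step(groups.get(k, init()), row)  # the "reduce" half
    return groups


def fake_table_scan() -> Iterator[dict]:
    """Hypothetical stand-in for a table scan; yields rows lazily."""
    yield {"city": "Oslo", "amount": 10}
    yield {"city": "Bergen", "amount": 7}
    yield {"city": "Oslo", "amount": 5}


totals = streaming_group_by(
    fake_table_scan(),
    key=lambda r: r["city"],
    init=lambda: 0,
    step=lambda acc, r: acc + r["amount"],
)
print(totals)
```

Memory use here grows with the number of distinct groups, not with the number of rows, which is the property I'd expect from a streaming group-by.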
Postgres has a schema, of course, but it still needs to walk the line pointer array (the ItemIdData entries) in each page as it scans it, and then decode each tuple's header and null bitmap to find the columns. The main difference is that the column layout is fixed by the schema, whereas in a schemaless page it would vary from row to row.
Anyway, I'm hoping RethinkDB will get better at this. I sure like a lot about it.