Counting the DRAM row buffer as a separate cache layer is interesting, partly because it has a meaningful performance impact, but mainly because the existence of such a thing is a good counterargument to people who think that DRAM chips have RAS/CAS-multiplexed address pins only to save package pins.
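To make the "row buffer as a cache" point concrete, here is a rough toy model, not a description of any real part: a single bank with an open-row policy, where an access to the already-open row needs only a CAS and anything else needs a new row activation (RAS) first. The function name `count_row_hits` and the address split (`row_bits=13`, `col_bits=10`) are assumptions I made up for illustration, and there is no timing, refresh, or multi-bank behaviour in it.

```python
def count_row_hits(addresses, row_bits=13, col_bits=10):
    """Toy single-bank DRAM model with an open-row policy.

    The address split (row_bits/col_bits) is a made-up assumption,
    not taken from any specific chip. Returns (hits, misses).
    """
    open_row = None          # row currently held in the row buffer
    hits = misses = 0
    row_mask = (1 << row_bits) - 1
    for addr in addresses:
        row = (addr >> col_bits) & row_mask
        if row == open_row:
            hits += 1        # row already open: column access (CAS) only
        else:
            misses += 1      # must activate a new row (RAS), then CAS
            open_row = row
    return hits, misses


# Sequential accesses stay within a row for 1024 columns at a time.
sequential = list(range(4096))
# A stride of exactly one row size lands in a different row every time.
strided = [i * 1024 for i in range(4096)]

print(count_row_hits(sequential))
print(count_row_hits(strided))
```

Under these assumptions the sequential pattern almost always hits the open row, while the row-sized stride never does, which is the cache-like behaviour in question.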
I wouldn't really count the fill buffer between L1 and L2 as a separate caching level, or at least, if you wanted to do that, you should do the same for L2 <-> L3, L3 <-> memory controller, etc., since such buffers or queues exist at all of those places.