I think his point is that it's easy to ignore serious usability problems when you're using better hardware than your customers. For example, developers with big screens are more prone to creating GUIs that do not work well on small screens. Similarly, some Xbox games are unplayable in 480i or 480p mode because nobody ever bothered to check whether the interface elements were readable at low resolutions.
Testing is not a boolean. You won't notice if some item on a checklist takes a couple of seconds during your daily device test, but a user will notice when that action is all they're doing, over and over.
You say "testing is not a boolean", then imply that testing "over and over" is? Developing on the emulator doesn't stop you from doing nice, lengthy real usage testing on the devices once you've got everything in decent shape.