That's usually my instinct too, which is why I attributed the other viewpoint to someone else. Still, it was interesting to understand where someone like that is coming from. You can spend a lot of time adding the right byte swaps, but if you don't own hardware with the other byte order and aren't testing on it regularly, then from a certain perspective you may be wasting your time.
OTOH I recall Rob Pike's insightful article arguing that the "right", testable way to do it is to not think in terms of swapping at all, and just use shifts, which are portable regardless of architecture. http://commandcenter.blogspot.com/2012/04/byte-order-fallacy...
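For anyone who hasn't read it, the idea is roughly the following. This is only a minimal sketch in C, and the function names (load_le32, load_be32, store_le32) are my own, not taken from the article. The point is that the code describes the byte order of the data stream and never asks what the byte order of the host is:

```c
#include <stdint.h>

/* Read a 32-bit value stored little-endian in a byte stream.
   Each byte is widened first, then shifted into place, so the
   result is the same on any host regardless of its byte order. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Same idea for a value stored big-endian in the stream. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}

/* Write a 32-bit value into the stream in little-endian order. */
void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}
```

There is no #ifdef and no swap anywhere, so the same code path runs on the machine you do own, which at least partly answers the "you can't test it" objection.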
(By the way, in your 68k -> PPC -> Intel -> ARM example, the endianness only changes once, at the PPC -> Intel step. This was actually part of my friend's original argument: endianness changes are even more costly than instruction set changes, and platform vendors would be unwise to make one. Many "modern" CPUs, ARM included, support both byte orders, so those shipping ARM platforms are in effect consciously deciding to be the same as Intel.)