May I ask how many logical shards you have per physical shard/machine? And what is the average size of a logical shard on disk?
You wrote "the data is sharded by customer and then sub-sharded by end user within the customer", but malisper wrote above that "clustering by time winds up being a much bigger win". Isn't that contradictory?