On Stack Overflow I saw someone say that they got MD5 (128-bit) hash collisions after hashing around 20k files.
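For a sense of scale, here is a rough birthday-bound estimate of how likely an accidental collision is. This is only a sketch: it assumes hash outputs behave like uniformly random values, which is a reasonable model for ordinary files but not for files deliberately crafted to collide. The function name `collision_probability` is my own, not from any library.

```python
import math

def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance of at least one collision among n_items
    values drawn uniformly at random from a hash_bits-bit space."""
    # Number of distinct pairs that could collide.
    pairs = n_items * (n_items - 1) / 2
    # Birthday approximation: p ~= 1 - exp(-pairs / 2**bits).
    # -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2.0 ** hash_bits)

if __name__ == "__main__":
    print(collision_probability(20_000, 128))
```

Under that assumption, 20k files give a probability of about 6e-31, so collisions at that scale more likely point to duplicate files, a truncated hash, or a bug than to bad luck.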
When I tried making something similar, I figured that adding the file's size in bytes to the hash would reduce the number of collisions: two files would then have to be exactly the same size and also produce the same MD5 hash in order to collide. It still feels random and unavoidable in the grand scheme of things, though.
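Roughly what I mean by adding the size to the hash, as a minimal sketch. The helper name `file_key` and the `<md5>-<size>` key format are just choices I made for illustration.

```python
import hashlib

def file_key(path: str, chunk_size: int = 1 << 20) -> str:
    """Return '<md5 hex digest>-<size in bytes>' for the file at path."""
    md5 = hashlib.md5()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            size += len(chunk)  # count the bytes that were actually hashed
    return f"{md5.hexdigest()}-{size}"
```

Two files only get the same key if they match in both MD5 and length. That does not make collisions impossible, so if it really matters, comparing the actual bytes whenever two keys match is the only way to be sure.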