Seems like comparing only a checksum is a recipe for disaster, especially when hosting an enormous number of files. Comparing a checksum (or multiple checksums) together with the file size would drastically reduce the number of false positives.
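For illustration, here's a rough sketch of what I mean by checking size plus more than one checksum. The function names (`file_fingerprint`, `looks_identical`) and the choice of SHA-256 and BLAKE2b are my own assumptions, not whatever the host actually runs:

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex, blake2b_hex) for the file at path.

    Hypothetical helper: the two hash algorithms are an arbitrary choice
    for this sketch, not what any particular host uses.
    """
    sha256 = hashlib.sha256()
    blake2b = hashlib.blake2b()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so large files don't have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            size += len(chunk)
            sha256.update(chunk)
            blake2b.update(chunk)
    return size, sha256.hexdigest(), blake2b.hexdigest()


def looks_identical(path_a, path_b):
    """Treat two files as the same only if size AND both checksums match."""
    # Cheap check first: different sizes means different files, no hashing needed.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

The size check costs almost nothing and rules out most mismatches before any hashing happens. Requiring two independent digests to agree as well makes an accidental match far less likely than relying on one checksum alone.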
After Googling DS_Store, I see it contains the folder customization metadata, which backs up my first instinct. These are almost certainly byte-for-byte identical recreations of DS_Store, probably with default settings.