Did the author actually verify this with strace (or the macOS/Windows equivalent)?
It sounds like he guessed this from the I/O activity of the process. It could be enough to hash just the beginning of each file, and compare the rest only if a match is found in the database.
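
To make that concrete, here is a rough Python sketch of the scheme I have in mind. It says nothing about what the program actually does. The 4096-byte prefix size, the function names, and the in-memory dict standing in for the "database" are all my own assumptions.

```python
import filecmp
import hashlib
from collections import defaultdict

PREFIX_BYTES = 4096  # assumed prefix size; the real tool may use anything


def prefix_hash(path, size=PREFIX_BYTES):
    """Hash only the first `size` bytes of the file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read(size)).hexdigest()


def find_duplicates(paths):
    """Return groups (lists) of paths whose full contents are identical."""
    by_prefix = defaultdict(list)  # hypothetical stand-in for the database
    for path in paths:
        by_prefix[prefix_hash(path)].append(path)

    groups = []
    for candidates in by_prefix.values():
        # Only files whose prefix hash collided get a full byte-by-byte compare.
        remaining = candidates
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [
                p for p in rest if filecmp.cmp(first, p, shallow=False)
            ]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return groups
```

With something like this, a file with a unique prefix is only ever read for its first few KB, so watching I/O volume alone would not tell you whether whole files are being hashed.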