This seems pretty streamlined and well-documented, although I wonder if they could have delayed the Facebook login. Convincing users to do your data cleaning for you is quite the trick, much as the Smithsonian does.
I wonder what more underhanded versions of data collection are possible. Maybe providing the opportunity to clean inline as users search the archive?
https://transcription.si.edu/phyllis-diller-cards