See https://papertohtml.org/about:
> What data do we keep? We cache a copy of the extracted content as well as the extracted images. This allows us to serve the results more quickly when a user uploads the same file again. We do not retain the uploaded files themselves. Cached content is never served to a user who has not provided the exact same document.

Also, we can delete the extracted data on request. Just send a note to accessibility@semanticscholar.org.
Sorry for the confusion!