1. No, it's impossible. In fact, the theorems in this paper don't claim to reach zero loss either; they're all inequalities bounding the size of the loss. The paper you cite refers to converging to zero loss, as do you in point 2. Perhaps you're referring to error, which is not the loss that's directly optimized (a quick sketch of that distinction is below the list).
2. This paper certainly isn't talking about generalization; the term doesn't appear to be mentioned even once. Your other paper is the one talking about generalization. The parent asked whether this paper is super important, and I gave a reason why it isn't super important for most people.
3. Massively overfitting is antithetical to generalizing. Overfitting means fitting the training data so closely that you generalize less well (toy illustration below).
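
On the loss-vs-error point in 1, here's a minimal sketch (my own toy example, nothing from the paper; the labels and probabilities are made up) of a model that classifies every point correctly, so its 0-1 error is zero, while its cross-entropy loss, the quantity gradient descent actually optimizes, stays strictly positive:

```python
import numpy as np

# Binary classification: true labels and predicted probabilities,
# all on the correct side of 0.5 but none equal to 0 or 1.
y_true = np.array([1, 1, 0, 1, 0])
p_pred = np.array([0.9, 0.8, 0.2, 0.7, 0.1])

# 0-1 error: fraction of misclassified points after thresholding at 0.5.
error = np.mean((p_pred >= 0.5).astype(int) != y_true)

# Cross-entropy loss: what training actually minimizes.
loss = -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))

print(f"classification error: {error:.2f}")  # 0.00
print(f"cross-entropy loss:   {loss:.4f}")   # ~0.20, only -> 0 as p_pred -> the labels
```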
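
And on point 3, a quick toy illustration (again my own sketch, not from either paper): as you crank up model capacity, the fit to the training points keeps improving while the error on held-out points gets worse, which is exactly what overfitting at the expense of generalizing means.

```python
import numpy as np

rng = np.random.default_rng(0)

# A few noisy samples from a simple underlying function.
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, x_train.size)

# Held-out points from the noiseless function.
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # Higher degree: training MSE drops (near zero at degree 9),
    # but held-out MSE gets worse, i.e. worse generalization.
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```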