Some languages, like Lean, give you arbitrary-precision integers; that's what's being discussed here. So the proof in the blog applies to Lean, and issues like -INT_MIN not being representable aren't a factor in the proof. The proof for C or Ada or anything else with fixed-width integers will be different (and likely the algorithm too), to account for that difference and any other relevant ones.
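To make that concrete, here's a minimal sketch assuming Lean 4's built-in `Int` (the literal below is just the 64-bit INT_MIN, used for illustration):

```lean
-- Lean's `Int` is arbitrary precision, so negating the value that would be
-- INT_MIN on a 64-bit machine is well-defined; there is no wraparound.
#eval -(-9223372036854775808 : Int)   -- 9223372036854775808
#eval (2 : Int) ^ 200                 -- far beyond any machine word, still exact
```

In C, by contrast, `-INT_MIN` is undefined behavior because the result isn't representable in a fixed-width two's-complement int, so a proof about the C version has to handle that case explicitly.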
Usually `system` in this context refers to the machine you run the program on, not the language itself, or at least that's how I interpreted it. I believe there used to be systems with BCD math supported in hardware, and some of those could have variable precision. I'm not sure it was truly arbitrary (i.e. it could still overflow), but one could presumably detect overflow and increase the memory allocation. But once you have a software system doing your math rather than a machine instruction, it seems like you need to prove the correctness of that implementation, not just the general concept.
Given Lean's raison d'être, it wouldn't surprise me if the relevant library code was verified. If using another language such as Python, I think it would be fairly reasonable to take the underlying implementation on faith. If we're going to start questioning whether a core part of the runtime that implements basic arithmetic operations has bugs, then why are we not also suspecting the operations provided by the CPU and other hardware?
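For what it's worth, the specification side is at least checkable in-system: Lean's `Int` is defined as an ordinary inductive type built on `Nat`, and the usual algebraic laws are proved about that definition. A minimal sketch, assuming Lean 4 and its core `Int` lemmas:

```lean
-- Theorems about Lean's own definition of `Int`, proved within Lean itself.
example (a b : Int) : a + b = b + a := Int.add_comm a b
example (n : Int) : -(-n) = n := Int.neg_neg n
```

The remaining trust question is whether the compiled runtime's fast bignum representation actually agrees with that definition, which is the same kind of boundary as trusting the CPU.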
We probably should suspect those things too, but as a matter of practicality we need to draw the boundary somewhere at present.