1: they aren't. They don't need to be, and they cannot be.
Simulate the first step of the first machine, then the first step of the second machine, then the second step of the first machine, then the first step of the third machine, then the second of the second, then the third of the first, then the first of the fourth, and so on. Every step of every program/machine is eventually reached.
For the same reason that the pairs of integers can be put into bijection with the integers.
2) that is not a problem. The first step of that is not to run a step of a program it simulates. The "simulating a single step of a given machine" is not a single step. It is many steps. As such, there is no such recursive problem like you describe. Simulating a single step of a machine always finishes, regardless of what that step "represents". Even if the step being simulated is part of some simulator machine, simulating it is still done in the same way, and doesn't take longer.
3) Any programs for any computer we have can be simulated by a Turing machine. They are quite general.
You might claim that some physical process is not computable, but that has not been demonstrated for any physical process (well, except for maybe consciousness, but that is contentious), and most ideas of how physics works are computable, so there would be a significant burden of proof.
So, it seems like any program we can run can be simulated by a Turing machine.