I am speaking from experience; I have written a lot of renderers.
> GPU resources in particular need to wait on fences and therefore can't have a neatly defined lifetime like what you're proposing unless you block your whole program waiting for these fences, while other GPU resources are (like I said) immutable and need to live for multiple frames.
I don't quite understand why fences are a problem here. As long as you know which frame a resource belongs to, you just have to guarantee that the entire frame is finished before reusing its memory. You don't need to know what's happening at a granular level within the frame.
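To make that concrete, here is a minimal sketch of per-frame arenas gated on frame completion. Everything in it is hypothetical: `FrameArena`, `FrameRing`, and `begin_frame` are names I made up, and `gpu_completed_frame` just stands in for whatever your API's fence or timeline value tells you. It assumes frames are numbered from 1 and needs C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bump allocator: one instance per frame in flight.
class FrameArena {
public:
    explicit FrameArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Bump-allocates from the frame's block; returns nullptr when the
    // frame's budget is exhausted. Alignment is applied to the offset and
    // relies on the vector's storage being aligned for max_align_t.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        if (aligned > buffer_.size() || size > buffer_.size() - aligned) {
            return nullptr;
        }
        offset_ = aligned + size;
        return buffer_.data() + aligned;
    }

    // Frees everything the frame allocated in one step.
    void reset() { offset_ = 0; }
    std::size_t used() const { return offset_; }

private:
    std::vector<std::byte> buffer_;
    std::size_t offset_;
};

// Hypothetical ring of arenas, one slot per frame in flight.
class FrameRing {
public:
    FrameRing(std::size_t frames_in_flight, std::size_t bytes_per_frame) {
        slots_.reserve(frames_in_flight);
        for (std::size_t i = 0; i < frames_in_flight; ++i) {
            slots_.push_back(Slot{FrameArena(bytes_per_frame), 0});
        }
    }

    // Returns the arena for `frame`, or nullptr if the GPU has not yet
    // finished the frame that last used this slot. In that case the caller
    // waits on that one frame's fence and tries again.
    FrameArena* begin_frame(std::uint64_t frame,
                            std::uint64_t gpu_completed_frame) {
        Slot& slot = slots_[frame % slots_.size()];
        if (slot.last_frame > gpu_completed_frame) {
            return nullptr;
        }
        slot.arena.reset();
        slot.last_frame = frame;
        return &slot.arena;
    }

private:
    struct Slot {
        FrameArena arena;
        std::uint64_t last_frame;  // 0 means the slot was never used
    };
    std::vector<Slot> slots_;
};
```

The only synchronization point is `begin_frame`: one comparison per frame, not one fence per resource.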
As for static resources, that's easy: use another arena for static resources, which grows as needed and is shared between frames. Most of that data lives on the GPU anyway, so you just need to keep a list of the resource handles somewhere. You're going to want a budget for it regardless, so there's no reason not to give it a finite size.
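Something like the following is what I have in mind for the static side. It is only a sketch: `GpuHandle` is a stand-in for whatever handle type your API hands back, and `StaticResourceTable`, `add`, and `release_all` are hypothetical names. The table only tracks handles and sizes against a fixed budget; actually creating and destroying the GPU objects is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opaque handle; a real renderer would use its API's type.
using GpuHandle = std::uint32_t;

// Tracks long-lived GPU resources against a fixed byte budget.
class StaticResourceTable {
public:
    explicit StaticResourceTable(std::size_t budget_bytes)
        : budget_(budget_bytes), used_(0) {}

    // Records a resource. Returns false (and records nothing) if it would
    // push the total past the budget.
    bool add(GpuHandle handle, std::size_t gpu_bytes) {
        if (gpu_bytes > budget_ - used_) {
            return false;
        }
        entries_.push_back(Entry{handle, gpu_bytes});
        used_ += gpu_bytes;
        return true;
    }

    // Empties the table and returns every handle, in insertion order, so
    // the caller can destroy them all at one well-defined point.
    std::vector<GpuHandle> release_all() {
        std::vector<GpuHandle> handles;
        handles.reserve(entries_.size());
        for (const Entry& e : entries_) {
            handles.push_back(e.handle);
        }
        entries_.clear();
        used_ = 0;
        return handles;
    }

    std::size_t used() const { return used_; }

private:
    struct Entry {
        GpuHandle handle;
        std::size_t bytes;
    };
    std::size_t budget_;
    std::size_t used_;
    std::vector<Entry> entries_;
};
```

Nothing in the table is freed individually; the whole set goes away together, which is the same lifetime model as the per-frame case, just with a longer period.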
> More generally, how do you track resources other than the framebuffer, which are usually managed very dynamically and often appear and disappear between frames?
See, the whole problem is trying to solve this "more generally". In many cases, you know exactly which resources you will need (or at least the upper limit) when you start rendering a given scene. If you need to do any memory management at all, you can do it at neatly defined boundaries, when you load or unload a scene.
The only time you need some kind of super dynamic renderer is when you're dealing with an open world or something similar where you are streaming assets.
Most serious renderers are written in C++ with custom allocators, which are usually just some kind of arena allocator under the hood anyway.