Regardless of what they say, they CAN compete in training and inference; there is literally no alternative to the W7900 at the moment. That's 4080-level performance with 48 GB of VRAM for half of what comparable CUDA devices would cost.
FP16+ doesn't really matter for local LLM inference, since no one can run reasonably big models at FP16.
Usually the models are quantized to 8 or 4 bits, where the 5090 again demolishes the W7900 with several times the peak TOPS.
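To put rough numbers on the FP16 point, here's a back-of-the-envelope sketch of how much memory the weights alone take at each precision. The 70B parameter count is just an assumed example size, and the estimate ignores KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Counts only the weights: no KV cache, activations, or runtime overhead.
    """
    return n_params * bits_per_weight / 8 / 1e9


VRAM_GB = 48      # W7900 capacity mentioned above
N_PARAMS = 70e9   # assumed example: a 70B-parameter model

for bits in (16, 8, 4):
    size = weight_footprint_gb(N_PARAMS, bits)
    print(f"{bits:>2}-bit: {size:.0f} GB, fits in {VRAM_GB} GB: {size <= VRAM_GB}")
```

Under those assumptions only the 4-bit version fits on a single 48 GB card, which is why the quantized throughput numbers are the ones that matter for local inference.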