
0 points · pshirshov · 6mo ago

Regardless of what they say, they CAN compete in training and inference; there is literally no alternative to the W7900 at the moment. That's 4080 performance with 48 GB of VRAM for half of what similar CUDA devices would cost.


grim_io · 6mo ago

How good is it, though, compared to a 5090 with 32 GB? The 5090 has double the memory bandwidth, which is very important for inference.

In many cases where 32 GB won't be enough, 48 GB wouldn't be enough either.

Oh and the 5090 is cheaper.
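
To make the bandwidth point concrete, here is a rough sketch of the usual back-of-the-envelope bound: when decoding one token at a time, every weight has to be read from VRAM once per token, so memory bandwidth caps tokens per second. The function name and all the numbers below are made up for illustration (they are not measured specs for either card), and the estimate ignores KV cache reads, batching, and kernel overhead.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed.

    Assumes each generated token streams all model weights from VRAM once,
    so throughput cannot exceed bandwidth divided by weight size.
    """
    return bandwidth_gb_s / weights_gb


# Hypothetical values, chosen only to show the scaling.
weights_gb = 20.0            # size of the quantized weights in GB (assumed)
slow_card_bw = 1000.0        # GB/s (assumed)
fast_card_bw = 2000.0        # GB/s, i.e. double the bandwidth (assumed)

slow = max_decode_tokens_per_s(slow_card_bw, weights_gb)
fast = max_decode_tokens_per_s(fast_card_bw, weights_gb)
print(f"slow card: <= {slow:.0f} tok/s, fast card: <= {fast:.0f} tok/s")
```

Under these assumptions, doubling the bandwidth doubles the ceiling on decode speed, regardless of how many FLOPS the card has.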

adgjlsfhk1 · 6mo ago

AMD has more FP16 and FP64 FLOPS (but ~1/2 the FP32 FLOPS). Also, the AMD card is at half the TDP (300 W vs 600 W).

grim_io · 6mo ago

FP16+ doesn't really matter for local LLM inference; no one can run reasonably big models at FP16. Usually the models are quantized to 8 or 4 bits, where the 5090 again demolishes the W7900 by having a multiple of its max TOPS.
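
A quick sketch of the arithmetic behind the quantization point: weight memory is roughly parameter count times bits per weight divided by 8. The helper names and the 70B model size are just an illustrative assumption, and this counts weights only (no KV cache, activations, or runtime overhead), so real usage will be higher.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight size in GB: billions of params * bits / 8 bits per byte."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, vram_gb: float) -> bool:
    """True if the weights alone fit in VRAM (ignores KV cache and overhead)."""
    return weights_gb(params_billion, bits_per_weight) <= vram_gb


# Hypothetical 70B-parameter model at common precisions.
for bits in (16, 8, 4):
    size = weights_gb(70, bits)
    print(f"{bits:>2}-bit: {size:6.1f} GB | "
          f"fits 32 GB: {fits(70, bits, 32)} | fits 48 GB: {fits(70, bits, 48)}")
```

This only shows where the 32 GB and 48 GB thresholds fall for one assumed model size; other sizes and context lengths will move the line.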


lostmsu · 6mo ago

The FP16 bit is very wrong re: LLMs. The 5090 has 3.5x the FP16 throughput for LLMs: 400+ vs ~120 TOPS.
