
0 points · sadpasture · 3y ago · 0 comments

I think it has to do with text being much more precise. Your stably diffused cartoon avatar having six fingers is not nearly as noticeable as a language model's chat misspelling every second word. So you need fewer resources to get to a human-acceptable result.


1 comment · 1 top-level

andbberger · 3y ago

no, diffusion models are just more efficient
