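There is a one-to-one correspondence between lossless data compression and generative models: a model that predicts the data well can encode it in few bits. GPT-2, paired with an arithmetic coder, is a highly effective lossless compression tool for text: https://bellard.org/textsynth/sms.html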
I've always wondered why this insight isn't taught more widely, especially in the context of things like dimensionality reduction...
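
To make the correspondence concrete, here is a minimal sketch. It assumes the standard result that an arithmetic coder driven by a model spends about -log2 p(symbol) bits per symbol. It does not implement the coder, it only adds up that ideal cost. The function name `code_length_bits`, the Laplace-smoothed character model and the 256-symbol alphabet are all made up for illustration and have nothing to do with how the linked GPT-2 tool works internally.

```python
import math
from collections import Counter, defaultdict


def code_length_bits(text, order=0, alphabet_size=256):
    """Ideal compressed size of `text` in bits under an adaptive
    character model that conditions on the previous `order` characters.

    Hypothetical helper: assumes every character belongs to an alphabet
    of `alphabet_size` symbols known to both encoder and decoder.
    """
    counts = defaultdict(Counter)  # context -> counts of the next character
    total_bits = 0.0
    for i, ch in enumerate(text):
        ctx = text[max(0, i - order):i]
        seen = counts[ctx]
        # Laplace (add-one) smoothing keeps unseen characters encodable.
        p = (seen[ch] + 1) / (sum(seen.values()) + alphabet_size)
        total_bits += -math.log2(p)  # cost of this symbol under the model
        seen[ch] += 1  # update only after coding, so a decoder can mirror it
    return total_bits


if __name__ == "__main__":
    text = "abracadabra " * 500
    print("raw 8-bit size :", 8 * len(text))
    print("order-0 model  :", round(code_length_bits(text, order=0)))
    print("order-2 model  :", round(code_length_bits(text, order=2)))
```

On this toy input the order-2 model, which predicts the next character better, yields a smaller bit count than the order-0 model, and both beat the raw 8 bits per character. Swapping in a stronger predictor such as a language model is the same idea at a larger scale.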