The attacker doesn't have to compromise the backend to achieve XSS.
Suppose your website displays user-generated content (like HN posts). If the attacker finds a way to bypass output encoding and inject JS, then without CSP the attacker has XSS at that point. With a CSP that disallows inline scripts, even if the attacker gets user-generated content to render as a script, the browser will refuse to execute it.
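To make the two layers concrete, here is a rough sketch in Python. The function names and the exact policy string are my own assumptions for illustration, not anything from a particular framework: one renderer encodes user content, one has the encoding bug, and the CSP header is the second line of defense for the buggy case.

```python
import html

# Assumed example policy: scripts may only load from the site's own origin,
# so an injected inline <script> is blocked by the browser.
CSP_HEADER = ("Content-Security-Policy", "default-src 'self'; script-src 'self'")

def render_comment(user_text: str) -> str:
    """Hypothetical renderer: encodes user text so it displays as text."""
    return f"<p>{html.escape(user_text)}</p>"

def render_comment_buggy(user_text: str) -> str:
    """Hypothetical renderer with the encoding bypassed (the bug)."""
    return f"<p>{user_text}</p>"

payload = "<script>alert(1)</script>"
safe = render_comment(payload)          # markup is neutralized
unsafe = render_comment_buggy(payload)  # markup reaches the page intact
```

With the buggy renderer the script tag lands in the page, and whether it runs then depends entirely on whether the response carries a header like the one above.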
My understanding of htmx is that the browser would still refuse to execute an injected script, but the attacker can achieve XSS by injecting htmx attributes, which htmx itself acts on and which are effectively arbitrary JS.