That would be an interesting problem: how do you algorithmically filter out SEO content and focus on useful information? It's a signal vs. noise problem, but on human text, with the added challenge that the noise is trying to outsmart you. Maybe an adversarial ML setup, with an SEO-generating bot working against your filter?