Modern web pages are cluttered with tracking scripts, analytics, styling, ads, and interactive elements. When that content is fed to AI systems, the clutter wastes tokens and dilutes semantic meaning. This library strips away the noise to give you clean, meaningful HTML. It:
- Reduces token count by 60-90% (lower API costs)
- Improves embedding quality (less noise = better semantic search)
- Speeds up processing (smaller payloads = faster inference)
- Preserves structure (headings, paragraphs, links stay intact)
- Has zero dependencies (pure JavaScript, no bloat)
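
To make the idea concrete, here is a rough sketch of the kind of cleanup described above. It is not this library's API: `stripNoise` and `sample` are hypothetical names chosen for illustration, and the regex pass is a crude stand-in for real HTML parsing, which handles edge cases this sketch does not.

```javascript
// Illustrative only: NOT this library's API. A crude regex pass that drops
// a few obviously noisy constructs to show where the savings come from.
function stripNoise(html) {
  return html
    // Remove whole elements whose contents carry no readable text.
    .replace(/<(script|style|noscript|iframe)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // Remove HTML comments.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Remove presentational attributes and inline event handlers.
    .replace(/\s(?:class|style|on\w+)\s*=\s*(?:"[^"]*"|'[^']*')/gi, "")
    // Collapse the whitespace left behind by the removals.
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical input: a small page fragment with typical clutter.
const sample = `
<article class="post" style="margin:0">
  <script>trackPageView();</script>
  <style>.post { color: red; }</style>
  <!-- ad slot -->
  <h1 class="title">Hello</h1>
  <p onclick="share()">Read the <a href="/docs">docs</a>.</p>
</article>
`;

const cleaned = stripNoise(sample);
console.log(cleaned);
// <article> <h1>Hello</h1> <p>Read the <a href="/docs">docs</a>.</p> </article>

// Character count is only a rough proxy for token count.
console.log(`before: ${sample.length} chars, after: ${cleaned.length} chars`);
```

Headings, paragraphs, and links survive while scripts, styles, comments, and inline handlers are dropped. The size of the reduction depends heavily on the page.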