Just looking at that clause makes me think that perhaps the Web Audio API should have been called something else.
Can you imagine writing "fetching and displaying various image formats is a bit outside the purview of HTML"?
(I realize that's a bit of an apples 'n oranges comparison.)