You cannot easily control the volume until the video starts playing, not even while the video is still loading.
Luckily, on newer versions of Android there is always an option to quickly access the media volume after you press the volume button once, but it's still unintuitive.
There are still two classes of problems unaffected by this change.
1) As is popular with gifs, there could very well be several of these autoplaying videos on the same page, probably even several fitting in the same viewport. If they had sound, they would interfere with each other.
2) My understanding is that many people listen to music from some source unrelated to the webpage containing these autoplaying videos. So even if there's only one video, its autoplaying sound could easily mix with their tunes.
Doing autoplaying sound is hard, but with gaze tracking and knowledge of system sound usage status, I think a pretty good solution is possible.
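
To make that idea a bit more concrete, here is a rough sketch of the decision logic, not a real implementation. It assumes the page somehow knows where the user is looking and whether something else on the system is already playing audio. As far as I know, no standard web API provides either, so `gaze` and `otherAudioPlaying` are hypothetical inputs, and all names here (`VideoInfo`, `pickAudibleVideo`) are made up for illustration.

```typescript
// Hypothetical inputs: gaze position and system-wide audio status are
// assumed to be supplied from somewhere; no standard web API offers them.
interface Rect { left: number; top: number; right: number; bottom: number; }
interface VideoInfo { id: string; rect: Rect; }
interface Point { x: number; y: number; }

// True if the point lies inside the rectangle (edges included).
function contains(r: Rect, p: Point): boolean {
  return p.x >= r.left && p.x <= r.right && p.y >= r.top && p.y <= r.bottom;
}

// Returns the id of the single video allowed to play sound, or null
// if every video should stay muted.
function pickAudibleVideo(
  videos: VideoInfo[],
  gaze: Point | null,
  otherAudioPlaying: boolean
): string | null {
  // Problem 2: never mix with music that is already playing.
  if (otherAudioPlaying || gaze === null) return null;
  // Problem 1: at most one video gets sound, the one being looked at.
  for (const v of videos) {
    if (contains(v.rect, gaze)) return v.id;
  }
  return null;
}
```

With two videos side by side, only the one under the gaze point would be unmuted, and nothing would be unmuted while other audio is playing.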