Explained: Freshly recorded MPEG files (and almost all other container types) typically have the header saved at the end of the file. This is the logical place to store header information after recording, because you just append it to the end of the file you've just written. It's the default.
Unfortunately, having the header at the end of the file is terrible for web playback. A user must download the entire video before playback starts. You absolutely need to re-encode the video with fast start set.
The header location is the number one mistake that I've seen a lot of websites and developers make. If you find your website videos show a spinner for several seconds before playback starts, check the encoding. Specifically, check that you've set fast start.
I've seen companies who have a perfectly reasonable static site behind a CDN spend a fortune hosting their videos with a third party to fix the latency issues they were seeing. The expensive third party was ultimately fixing the issue because they re-encoded the videos with fast start set. The reality is their existing solution backed by a CDN would also have worked if they encoded the videos correctly.
Writing metadata at the end, instead of periodically interleaving it inside the file, is not only useless for web playback, but also for any creation process (e.g., live recording) which intends to call itself robust. Imagine recording a 1-hour long video, only to have some unrelated issue abruptly stop it all (e.g. the application crashes, or the video camera suddenly loses power), leaving an unplayable file because the damned metadata didn't get written at the end of the file...
(luckily there are post-process methods that in some cases are able to restore such broken files, but still, a priori the file will be unplayable)
One of the few surviving legacies of Apple technology created during the Jobs interregnum.
Like everything else not from NeXT, the modern Apple has killed QuickTime, but they can’t eliminate its concepts and wacky constants from the MPEG-4 standard.
For folks who might not realize you can fix this post-encode, you can do this:
ffmpeg -i in.mp4 -c copy -map 0 -movflags +faststart out.mp4
That was not my experience. If the web server announces the Content-Length of the video file, most browsers will make a range request for the end of the file and then go on from there.
Latency is still a bit higher this way, but nowhere near downloading the entire video file.
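To make the range-request idea concrete, here is a small Python sketch of the header a client could send to fetch only the tail of a file once it knows the Content-Length. The function names and the 64 KiB default are assumptions for illustration; what browsers actually request varies and is not specified here.

```python
def tail_range_header(content_length, tail_bytes=64 * 1024):
    """Build a Range header covering the last tail_bytes of a resource.

    content_length is the value the server announced; tail_bytes is an
    arbitrary guess at how much of the file's end holds the header.
    """
    if content_length <= 0:
        raise ValueError("content_length must be positive")
    start = max(0, content_length - tail_bytes)
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={start}-{content_length - 1}"}


def suffix_range_header(tail_bytes=64 * 1024):
    """Same idea using HTTP's suffix form, which needs no known length."""
    return {"Range": f"bytes=-{tail_bytes}"}
```

A server that supports range requests answers with 206 Partial Content and just those bytes, which is why the extra cost is roughly one more round trip instead of the whole download.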
If anyone's interested, this is called the moov atom, and when you use "web playback" presets in tools like Handbrake, what they typically do is move that header to the beginning of the file. The progressive playback issues for files that aren't optimized have improved somewhat when the file's host supports range requests, since browsers have gotten smart enough to check the end of the file for that header via range requests.
If you're interested in more along the codec/container line, one of my colleagues gave a talk on the internals of MP4 containers at Demuxed a few years ago[0] and another gave a Streaming Media keynote on the history of codecs and containers, particularly as they relate to online video[1].
[0] https://www.youtube.com/watch?v=iJAPTY3B7yE [1] https://www.youtube.com/watch?v=9Qo3WfsK4vc
It is a shame that Chrome only allows it to contain vp8/opus. If it allowed all the codecs in mp4 I think it would see much wider adoption.
So, a more concise answer: the codecs used for video by YouTube have the optimization you're thinking about built in. The encoder can effectively send a "pixel didn't change from last frame" and not need to send all the color information for all the pixels.
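As a toy illustration of the "pixel didn't change from last frame" idea, here is a Python sketch that stores only the pixels that differ from the previous frame. This is deliberately simplified and the function names are made up; real codecs such as H.264 and VP9 work on blocks with motion-compensated prediction rather than per-pixel diffs.

```python
def encode_delta(prev_frame, curr_frame):
    """Return (index, value) pairs for pixels that changed since prev_frame."""
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same number of pixels")
    return [
        (i, curr)
        for i, (prev, curr) in enumerate(zip(prev_frame, curr_frame))
        if prev != curr
    ]


def apply_delta(prev_frame, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    frame = list(prev_frame)
    for index, value in delta:
        frame[index] = value
    return frame
```

For a mostly static picture the delta is tiny compared with the full frame, which is the intuition behind the very low bitrates you see on slideshow-style videos.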
YouTube does not really do variable frame rate, and it's messy for editing, but it's another optimization that is possible and could be useful for the type of video you're describing.
The article describes DASH, where you would need to send a full frame at the start of every time segment, but within each segment the previously described concept still applies. I don't believe YouTube uses DASH for anything outside live streams.
This is a pretty good high-level video I came across - https://www.youtube.com/watch?v=r6Rp-uo6HmI
H.264/AAC4:
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L3.2
Format settings : CABAC / 2 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 2 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 3 min 29 s
Bit rate : 375 kb/s
Width : 1 080 pixels
Height : 1 080 pixels
Display aspect ratio : 1.000
Frame rate mode : Constant
Frame rate : 25.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.013
Stream size : 9.35 MiB (74%)
Writing library : x264 core 155 r2901 7d0ff22
Codec configuration box : avcC
Audio
ID : 2
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 3 min 29 s
Bit rate mode : Constant
Bit rate : 128 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 3.19 MiB (25%)
Default : Yes
Alternate group : 1
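A few of the derived figures in the H.264/AAC report above can be reproduced from the other fields. The Python sketch below shows the arithmetic; the function names are made up, and the duration is taken as 209 s because the report only gives "3 min 29 s", so the stream sizes are approximate.

```python
def bits_per_pixel_frame(bitrate_bps, width, height, fps):
    """Average bits spent per pixel per frame."""
    return bitrate_bps / (width * height * fps)


def stream_size_mib(bitrate_bps, duration_s):
    """Approximate stream size in MiB from average bitrate and duration."""
    return bitrate_bps * duration_s / 8 / (1024 * 1024)


def aac_frames_per_second(sample_rate_hz, samples_per_frame=1024):
    """Audio frames per second for a given samples-per-frame (SPF) value."""
    return sample_rate_hz / samples_per_frame


# Values copied from the report: 375 kb/s, 1080x1080, 25 FPS, 44.1 kHz, 128 kb/s.
video_bpp = bits_per_pixel_frame(375_000, 1080, 1080, 25)  # about 0.0129
video_mib = stream_size_mib(375_000, 209)                   # about 9.34
audio_mib = stream_size_mib(128_000, 209)                   # about 3.19
audio_fps = aac_frames_per_second(44_100)                   # about 43.066
```

The small gap between the computed video size and the reported 9.35 MiB is consistent with the duration having been rounded to whole seconds.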
VP9/Opus:
Video
ID : 1
Format : VP9
Codec ID : V_VP9
Duration : 3 min 29 s
Width : 1 080 pixels
Height : 1 080 pixels
Display aspect ratio : 1.000
Frame rate mode : Constant
Frame rate : 25.000 FPS
Language : English
Default : Yes
Forced : No
Color range : Limited
Audio
ID : 2
Format : Opus
Codec ID : A_OPUS
Duration : 3 min 29 s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 48.0 kHz
Bit depth : 32 bits
Compression mode : Lossy
Language : English
Default : Yes
Forced : No
Also check out the video-dev Slack[0] and Demuxed. Pion WebRTC and WebRTC for the Curious were motivated by conversations I had with other developers in their Slack.
Also, WebRTC for the Curious was on HN a while back, but for those that didn't see it the first time around: https://webrtcforthecurious.com/
This site is an incredible, concise, and comprehensive resource for people trying to better understand how video works.
I attended the Demuxed conference this year, and was exposed for the first time to the nitty-gritty of how video works behind the scenes.
Huge props for the content not being a sales pitch, and truly being educational and informative.
I've already tinkered with ffmpeg, including throwing every movflags option I could find at it, and it was never able to stream the file properly in chunks.
My current working solution is using MP4Box with this command:
MP4Box -dash 1000 -rap -frag-rap test.mp4
Great explanation tho.