Also, it's not uncommon to get odd frame rates in the containers. Even something as "innocent" as listing the frame rate as 29.97 vs 30000/1001 will affect timing (depending on usage). The variations on 23.976 are fun too: 24000/1001, 2997/125.
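To put a number on the 29.97 vs 30000/1001 difference, here is a rough Python sketch using exact fractions. The function names are mine, and it assumes a constant frame rate where frame N is presented at N / fps, which real containers don't guarantee.

```python
from fractions import Fraction

NTSC = Fraction(30000, 1001)   # the exact NTSC rate
ROUNDED = Fraction(2997, 100)  # "29.97" taken literally


def timestamp(frame_index: int, fps: Fraction) -> Fraction:
    """Presentation time in seconds of a frame, assuming a constant frame rate."""
    return Fraction(frame_index) / fps


def drift_after(seconds: int) -> float:
    """Timestamp difference (in seconds) between the two spellings of the rate
    for the last whole frame that fits in `seconds` of 30000/1001 footage."""
    frames = int(NTSC * seconds)
    return float(timestamp(frames, ROUNDED) - timestamp(frames, NTSC))


if __name__ == "__main__":
    print(f"drift after one hour: {drift_after(3600) * 1000:.3f} ms")
    # The two "23.976" spellings are not the same number either.
    print(Fraction(24000, 1001) == Fraction(2997, 125))
```

Under those assumptions the two spellings drift apart by about 3.6 ms per hour of footage. That is small, but it accumulates, and whether it matters depends on usage.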
The muxer is an important step. When using software decoders, things can be a lot more flexible. Back when shiny round discs were popular, there were verifiers that ensured your muxed data was correct. When your decoders are in hardware, there is a very strict set of parameters the input is expected to meet. Any deviation means the hardware cannot play the video. In the early days, "cheaper" DVD software had issues with the muxing.
Niche knowledge really can creep up on you over the years as you gradually encounter problems and work to solve them a few hours at a time.
Nowadays it's mostly moot since MP3 is obsolete.
What should we be using instead for lossy audio?
I would guess YouTube will do some sort of fix or sanity check.
For example, I wrote an iTunes-in-the-browser web app; I needed to know durations of songs to display them. MP3 doesn't include these in metadata IIRC, so I needed to pre-process them with ffmpeg just to have duration data. I wasn't doing anything with that other than displaying it. But it would have been nice to just have that info in the metadata.
This jogged my memory from (part of) the first thing I ever built in a general purpose programming language, all of probably 20 years ago! I was doing exactly this: using ffmpeg to get duration metadata from MP3s.
My memory was fuzzy so I looked it up, which (surprisingly!) confirmed what I remembered. MP3s may include metadata (ID3) which may include duration (or start/end times).
I knew my input source (it was me, my music, my MP3 conversions), so I was able to rely on the metadata directly. IIRC I even processed it on demand in my first naive version, which was “slow” but not nearly as slow as stuff I’d complain about today.
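For anyone curious what "relying on the metadata directly" can look like: ID3v2.3 and v2.4 define a `TLEN` text frame holding the length in milliseconds. Below is a minimal sketch of reading it from the start of a file. The function name is hypothetical, and it deliberately gives up on ID3v2.2 tags, unsynchronised tags, and tags with an extended header. Plenty of files have no `TLEN` at all, so a decoder-based fallback is still needed.

```python
from typing import Optional


def _synchsafe(b: bytes) -> int:
    """Decode a 4-byte synchsafe integer (7 useful bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def id3_tlen_ms(data: bytes) -> Optional[int]:
    """Return the TLEN value (milliseconds) from an ID3v2.3/2.4 tag, or None."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    major, flags = data[3], data[5]
    if major not in (3, 4) or flags & 0xC0:
        return None  # sketch only: skip v2.2, unsynchronisation, extended header
    end = min(len(data), 10 + _synchsafe(data[6:10]))
    pos = 10
    while pos + 10 <= end:
        frame_id = data[pos:pos + 4]
        if frame_id == b"\x00\x00\x00\x00":
            break  # reached padding
        raw = data[pos + 4:pos + 8]
        # v2.4 frame sizes are synchsafe, v2.3 sizes are plain big-endian
        size = _synchsafe(raw) if major == 4 else int.from_bytes(raw, "big")
        body = data[pos + 10:pos + 10 + size]
        if frame_id == b"TLEN" and len(body) >= 2:
            encodings = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}
            enc = encodings.get(body[0])
            if enc is None:
                return None
            text = body[1:].decode(enc, errors="replace").strip("\x00 ")
            return int(text) if text.isascii() and text.isdigit() else None
        pos += 10 + size
    return None
```

Usage would be something like `id3_tlen_ms(open(path, "rb").read(65536))`, treating `None` as "go ask ffmpeg".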
I can set up a broken service that outputs a gajillion lines of the same error to syslog, creating terabytes of logs, zip all that into a few megabytes, and that'd be a valid zip that would fill up most modern laptops and servers.
A surveillance camera video, with a very high frame rate when motion is detected and a very low frame rate when not (high framerate -> timelapse), can be a perfectly valid video, taking a few gigabytes in this format, and a few terabytes when converted to fixed 60fps.
https://www.youtube.com/watch?v=5Grsvyt5xps
The video is 22 minutes but it's reported at nearly 3 hours in length.
Just because its file size is small does not mean it can't be 15 hours long (one of the author's takeaways is "[t]he size of a video file is not an proper indicator for how long it is", but even without this hack you can't do that either, since video can have whatever bitrate).
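The bitrate point is just arithmetic: duration is size times 8 divided by average bitrate, so the same size maps to wildly different lengths. A tiny sketch, with made-up example numbers and a function name of my own:

```python
def duration_seconds(size_bytes: int, avg_bitrate_bps: float) -> float:
    """Playing time implied by a file size at a given average bitrate (bits/s)."""
    return size_bytes * 8 / avg_bitrate_bps


if __name__ == "__main__":
    size = 10_000_000  # a 10 MB file (illustrative)
    for bitrate in (8_000_000, 500_000, 1_500):
        hours = duration_seconds(size, bitrate) / 3600
        print(f"{bitrate:>9} bit/s -> {hours:.2f} h")
```

At 8 Mbit/s those 10 MB last 10 seconds; at 1,500 bit/s (think a mostly static picture) they last close to 15 hours.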
> Herbie Hancock on Miles: Don't play the butter notes!
I'm interested in learning more about the MP4 format. Where can I read more? Is there a canonical read that everyone but me knows about?
OP seems like he has some kind of file explorer UI for it - also interested in that
The MP4 format is fundamentally pretty easy, at least the box structure. But there have been so many standards that overlap that the MP4 format is also really messy. Aaand you need to pay to get access to the specs of the format..
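For what it's worth, the box structure itself fits in a few lines. A minimal sketch, assuming the basic layout: a 32-bit big-endian size, a four-character type, size 1 meaning a 64-bit size follows the type, and size 0 meaning the box runs to the end of the data. `iter_boxes` is a hypothetical helper name, and it does not handle `uuid` extended types or the version/flags fields of full boxes.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(
    data: bytes, start: int = 0, end: Optional[int] = None
) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_offset, payload_size) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # 64-bit "largesize" stored right after the type
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # box extends to the end of the enclosing data
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box: stop rather than guess
        yield box_type.decode("latin-1"), pos + header, size - header
        pos += size
```

Container boxes such as `moov` simply hold more boxes in their payload, so you walk them by calling `iter_boxes(data, payload_offset, payload_offset + payload_size)` again. The messy part is what lives inside the individual boxes, not this framing.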
> To the best of my knowledge, the impact was rather low because their transcoders are setten up in such a way that they will eventually give up on file if it takes too many resources.
In the end I just encoded them the 'correct' way, but it was eye opening to the wildness going on in video files. I just assumed I would be able to set a duration, a frame rate, and things would "work".
ffmpeg -i INPUT -map 0:v -map 0:a -enc_time_base -1 -c copy -f null -
Note the time= value at the end of the process.
So rather than loading the bogus videos in a sandboxed Chromium instance, you want to load them in an unsandboxed VLC instance? I smell eventual RCE.
They were hacking around with MP4 muxers and YouTube. This is definitely the hacker spirit. The word doesn't need to be re-appropriated by Hollywood caricatures.
For your average person, a hacker is a person in a Guy Fawkes mask and a black hoodie who steals your Facebook password.
For people in the industry, a "hack" is code that works but might be a placeholder or potentially dangerous. The author would want to write a better version of it but perhaps is not able to due to time or design constraints.
I don't think anyone other than HN is using "hacker" the way it was meant to be used; perhaps it is time to catch up with the times.