It’s sort of optional for most playback stacks, because they leave frame reordering to the individual decoder as a codec-specific implementation detail. Apple’s stack actually cares about frame-accurate random access, though, so it relies on the codec-independent container timestamps.
The inaccurate seeking you get without container pts is okay for playback, but it falls apart for editing or for tools like av1an, which split the video into chunks and need exact frame boundaries.
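
To make the reordering point concrete, here is a rough Python sketch. It is a toy model, not any real demuxer or Apple API: the `Packet` type, the timestamps, and both helper functions are made up for illustration. It shows a tiny GOP stored in decode order and how a container-level pts lets you find the frame shown at a given time without asking the decoder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    kind: str  # "I", "P" or "B" (hypothetical frame type label)
    dts: int   # decode timestamp: the order packets are stored and decoded in
    pts: int   # presentation timestamp: the order frames are displayed in


# Hypothetical mini-GOP in storage (decode) order. The P frame is stored
# before the two B frames that are displayed ahead of it, so storage order
# and display order differ.
packets = [
    Packet("I", dts=-1, pts=0),
    Packet("P", dts=0, pts=3),
    Packet("B", dts=1, pts=1),
    Packet("B", dts=2, pts=2),
]


def presentation_order(pkts: List[Packet]) -> List[Packet]:
    """Reorder packets into display order using only container pts."""
    return sorted(pkts, key=lambda p: p.pts)


def packet_for_time(pkts: List[Packet], target_pts: int) -> Optional[Packet]:
    """Return the packet whose frame is displayed at target_pts, or None."""
    for p in pkts:
        if p.pts == target_pts:
            return p
    return None


if __name__ == "__main__":
    print([p.kind for p in packets])                      # storage order
    print([p.kind for p in presentation_order(packets)])  # display order
    print(packet_for_time(packets, 3))                    # the P frame
```

Without the `pts` field, the only way to learn that the second stored packet is displayed last would be to run the codec's own reordering logic, which is the part most playback stacks leave to the decoder.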