Video is hard. It's a lot of data that needs to be read, processed, and displayed under tight time constraints and needs to be synchronized with associated audio playback.
This is all made more challenging if you're trying to do it on consumer hardware that is constrained in storage, bandwidth, memory, and processing power. Compression eased the storage and bandwidth problems but not the processing one: higher-fidelity codecs needed a lot more processing power to decode.
In the early 90s you had MPEG-1 at the high end of the quality scale, but it was so processor intensive you needed decoder ASICs in consumer hardware just to play it back. Then you had codecs like Cinepak that were far less processor intensive but of middling quality. Below that you had much lower fidelity codecs like Microsoft Video 1, Apple Video, and even Smacker, which had very low decoding requirements but didn't look great.
Network delivery of any of those to consumer hardware was a pipe dream when 14.4k modems were still rare and 9.6k modems were common. The H.261 codec, which MPEG-1 was based on, had a minimum bitrate of 64 kbps, which was out of reach of pretty much everyone.
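To put rough numbers on that gap, here's a back-of-the-envelope sketch. It only uses the figures mentioned above (64 kbps for H.261, 9.6k and 14.4k modems); the function name is made up for illustration, and it assumes the modems hit their nominal line rate, which real connections rarely did.

```python
# Back-of-the-envelope check: how far short of H.261's 64 kbps floor
# were the dial-up modems of the time?
H261_MIN_KBPS = 64.0  # minimum bitrate cited above

# Nominal line rates in kbps (assumption: real-world throughput was lower).
MODEM_KBPS = {"9.6k": 9.6, "14.4k": 14.4}


def shortfall_factor(link_kbps: float, required_kbps: float = H261_MIN_KBPS) -> float:
    """Return how many times more bandwidth is required than the link offers."""
    return required_kbps / link_kbps


for name, rate in MODEM_KBPS.items():
    print(f"{name} modem: needs {shortfall_factor(rate):.1f}x more bandwidth")
```

So even a then-rare 14.4k modem was short by a factor of about 4.4, and the common 9.6k modem by about 6.7, before accounting for audio or protocol overhead.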
Besides its hardware decoding requirements, MPEG-1 was entirely unsuited for editing. Both QuickTime and Video for Windows were meant for editing on consumer desktop machines. The codecs they supported were meant for editing and then delivery (typically on CD-ROM).
By the mid to late 90s processing power had advanced enough that MPEG-1 and H.261/H.263 could be decoded in real time in software. RealVideo and Sorenson Video 1 were both based on drafts of the H.263 spec, which included video conferencing over POTS connections in its design criteria.
Again, I'm not seeing dark times for digital video. There were lots of codecs because they had different uses, limitations, and strengths. The H.26x codecs were designed for video conferencing, and it was Real and, to a lesser extent, Apple that realized they were also useful for streaming over the internet. Both MPEG-1 and MPEG-2 were unsuited for streaming, as they didn't support variable frame rates and handled low dial-up bitrates poorly at best. It wasn't until the MPEG-4 überspec that internet streaming, video conferencing, and disc-based delivery settled under a single specification.
While FFmpeg is an amazing project and widely used, it didn't really do anything to settle the proliferation of codecs and containers. It really was MPEG-4 that allowed that to happen, to the extent it has happened.