I wasn't suggesting anything about your siblings, only about you, since you're a developer. I was just talking about the actual download step, not what you did after that. (Obviously you were going to host them somewhere else in some other form. Probably not DVDs, but a quick little website or maybe just a flash drive with an HTML index file, say. I don't know; there are lots of options for making it user-friendly for your siblings on Christmas Day. The hard drive or flash drive idea has the benefit of LOCKSS, especially if you use up the spare space on PAR2 FEC.)
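To make the "use up the spare space" idea concrete, here is a rough sketch of the arithmetic: how much redundancy, as a percentage of the data size, you could ask a PAR2 tool for so that the recovery files roughly fill the rest of the drive. The function name, the 5% safety margin for filesystem and PAR2 overhead, and the 100% cap are all my own assumptions, not anything the PAR2 tools prescribe, so check your tool's actual limits before relying on it.

```python
def par2_redundancy_percent(data_bytes, drive_bytes, margin_percent=5, cap=100):
    """Redundancy (as % of data size) that would roughly fill the drive's spare space.

    margin_percent and cap are illustrative assumptions, not PAR2 requirements.
    Integer arithmetic is used throughout to avoid float rounding at boundaries.
    """
    if data_bytes <= 0:
        raise ValueError("data_bytes must be positive")
    # Hold back a slice of the drive for filesystem and recovery-file overhead.
    usable = drive_bytes * (100 - margin_percent) // 100
    spare = usable - data_bytes
    if spare <= 0:
        return 0  # no room left for recovery data
    return min(cap, spare * 100 // data_bytes)


# Example: 40 GB of video on a 64 GB flash drive.
print(par2_redundancy_percent(40_000_000_000, 64_000_000_000))  # 52
```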
> I doubt any model could effectively label locations and people over 20 years of video.
Actually, Gemini is highly promptable and has a large context window, and a single still image only takes up ~300 tokens IIRC, so I think you probably could! Just include, say, 3 photos of each person over time with a natural-language description, plus 1 photo of each location, and that might be enough to get back useful labels. Gemini can even do bounding boxes. (Google is quite proud of its vision and video analysis capabilities.) And you can run multiple passes, split up the videos, etc.
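As a back-of-the-envelope check on whether that fits, here is a small sketch of the token budget. The ~300 tokens per image is just my recalled figure from above; the 1,000,000-token context window, the 50 tokens per description, and the 10,000 tokens held back for output are assumptions I picked for illustration, so substitute the numbers for whatever model you actually use.

```python
TOKENS_PER_IMAGE = 300      # the "~300 tokens IIRC" figure; unverified
CONTEXT_WINDOW = 1_000_000  # assumed context size; check your model


def reference_tokens(num_people, num_locations, photos_per_person=3,
                     photos_per_location=1, description_tokens=50):
    """Tokens spent on reference photos plus one short description each."""
    images = num_people * photos_per_person + num_locations * photos_per_location
    descriptions = num_people + num_locations
    return images * TOKENS_PER_IMAGE + descriptions * description_tokens


def frames_that_fit(num_people, num_locations, reserve_for_output=10_000):
    """How many video stills fit in one request after the reference material."""
    remaining = (CONTEXT_WINDOW
                 - reference_tokens(num_people, num_locations)
                 - reserve_for_output)
    return max(0, remaining // TOKENS_PER_IMAGE)


# Example: 10 family members and 15 recurring locations.
print(reference_tokens(10, 15))  # 14750
print(frames_that_fit(10, 15))   # 3250
```

Under those assumptions the reference material is a tiny fraction of the window, which is why sampling stills and running several passes per video seems feasible.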