7000 pictures at 5 seconds per picture is "only" about 10 hours of work (35,000 seconds, or roughly 9.7 hours). The per-picture time could possibly be lower than that, too. Seems quite doable over 2-4 afternoons.
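For anyone who wants to plug in their own numbers, here is a quick back-of-the-envelope sketch of that estimate. The function name `labeling_hours` is just something I made up for illustration, and the 5 seconds per picture is my guess from above, not a measured figure.

```python
def labeling_hours(num_pictures: int, seconds_per_picture: float) -> float:
    """Total labeling time in hours, assuming a constant pace per picture."""
    return num_pictures * seconds_per_picture / 3600


# 7000 pictures at an assumed 5 seconds each.
hours = labeling_hours(7000, 5)
print(f"{hours:.1f} hours total")

# Split evenly over 2, 3, or 4 afternoons.
for afternoons in (2, 3, 4):
    print(f"{afternoons} afternoons -> {hours / afternoons:.1f} hours each")
```

At 5 seconds per picture that comes to about 9.7 hours, so somewhere between roughly 2.4 and 4.9 hours per afternoon depending on how you split it.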
Props for doing the project end-to-end, including the non-trivial (and typically skipped) part of collecting training data.