2. Create a backend that queries each camera for its current image.
3. Run the images through some object recognition neural network (fastai has some great object recognition tutorials).
4. Have your backend stitch each camera's images into a video covering the periods when objects of interest were detected (each camera gets its own video).
5. Create a nice little UI/App for everyone to view the videos with some filtering in place (time of day, object(s) of interest).
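For what it's worth, here is a rough Python sketch of the bookkeeping behind steps 4 and 5, assuming the recognition step already hands back a set of labels per frame. Every name in it (`Detection`, `Clip`, `group_into_clips`, `filter_clips`, the 10-second `max_gap`) is made up for illustration and not taken from any library. It only decides which time ranges become a video and which clips match a filter; the actual camera polling, model inference, and video encoding are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Detection:
    """One analysed frame (hypothetical shape of the recognizer's output)."""
    camera_id: str
    timestamp: datetime
    labels: frozenset  # object classes found in this frame; empty if none


@dataclass
class Clip:
    """A time range on one camera that should be stitched into a video."""
    camera_id: str
    start: datetime
    end: datetime
    labels: set


def group_into_clips(detections, max_gap=timedelta(seconds=10)):
    """Merge frames with detections into per-camera clips.

    A frame extends the camera's latest clip if it arrives within
    max_gap of that clip's last frame; otherwise it starts a new clip.
    """
    clips = []
    latest = {}  # camera_id -> most recent Clip for that camera
    for det in sorted(detections, key=lambda d: d.timestamp):
        if not det.labels:
            continue  # nothing of interest in this frame
        clip = latest.get(det.camera_id)
        if clip is not None and det.timestamp - clip.end <= max_gap:
            clip.end = det.timestamp
            clip.labels |= det.labels
        else:
            clip = Clip(det.camera_id, det.timestamp, det.timestamp, set(det.labels))
            latest[det.camera_id] = clip
            clips.append(clip)
    return clips


def filter_clips(clips, label=None, start_hour=0, end_hour=24):
    """Keep clips that contain `label` (if given) and start within the hour range."""
    return [
        c for c in clips
        if (label is None or label in c.labels)
        and start_hour <= c.start.hour < end_hour
    ]
```

The UI in step 5 would then call something like `filter_clips(clips, label="person", start_hour=22)` to list only late-evening clips containing a person.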
I want to create something like this as well. All of the pieces seem pretty straightforward to me except for power and network connectivity. I guess each neighbor would have to configure the device with their own Wi-Fi SSID and password... and hope nobody steals the Raspberry Pi and pulls those credentials off it... Maybe just have the neighbors put it on a separate network?