All the major cloud providers offer some form of face detection and numberplate reading, with many supporting object detection (e.g. package, vehicle, person) on the camera itself.
It's definitely creeping into things, though most of the features I've seen are fairly simplistic compared to what would be possible if the video were being reviewed and indexed by current SoTA multimodal LLMs.