* "Temporal SVC", in which the frame dependencies are structured so you can discard frames down to 1/2 or 1/4 of the nominal frame rate and still decode the remainder.
* Three output streams, which you could configure for, say, forensics (high-bandwidth/high-resolution/high-fps), inference (mid-bandwidth/mid-resolution/low-fps), and viewing multiple streams or viewing over mobile networks (low-bandwidth/low-resolution/mid-fps).
* On-camera ML tasks too. (Although I haven't seen one that lets you upload your own model.)
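To make the temporal SVC bullet concrete, here's a rough sketch. It assumes a dyadic three-layer hierarchy (a common arrangement, but an assumption here; a given camera may structure its layers differently), and the function names `temporal_layer` and `keep_frames` are made up for illustration. The idea is that a frame only references frames in its own layer or a lower one, so dropping the top layers leaves a stream that still decodes.

```python
def temporal_layer(frame_index: int, num_layers: int = 3) -> int:
    """Return the temporal layer (0 = base) of a frame in a dyadic hierarchy.

    Assumed layout for num_layers == 3, repeating every 4 frames:
    frame 0 -> layer 0, frame 2 -> layer 1, frames 1 and 3 -> layer 2.
    """
    period = 1 << (num_layers - 1)  # frames per repeating group
    pos = frame_index % period
    if pos == 0:
        return 0
    # More trailing zero bits in the position means a lower (more important) layer.
    trailing_zeros = (pos & -pos).bit_length() - 1
    return num_layers - 1 - trailing_zeros


def keep_frames(frame_indices, max_layer: int, num_layers: int = 3):
    """Keep only the frames whose layer is at or below max_layer."""
    return [i for i in frame_indices if temporal_layer(i, num_layers) <= max_layer]


if __name__ == "__main__":
    frames = range(8)
    print(keep_frames(frames, 2))  # full frame rate
    print(keep_frames(frames, 1))  # half rate
    print(keep_frames(frames, 0))  # quarter rate
```

With this layout, keeping layers 0 and 1 gives half the nominal frame rate and keeping only layer 0 gives a quarter.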
But other cameras are less good. E.g. some Reolinks [2] only support two streams, and the "sub" stream is fixed at 640x352, which is uncomfortably low. Your inference network may not accept more resolution than that, but even so, you might want to crop a higher-resolution frame down to the area of interest (where there's motion and/or where the user has configured an alert) to improve quality. (You probably wouldn't pair that cheap Reolink camera with this expensive inference card, but the point stands in general.)
Even the "better" cameras' timestamp handling is awful, so it's hard to reliably match up the main stream, sub stream, analytics output, and wall clock time. Given that limitation, it'd be desirable to just use the main stream for everything, but the on-NVR transcoding's likely unaffordable.
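Here's a rough sketch of the cropping idea. It assumes the area of interest is found as a box in sub-stream pixels and that the main and sub streams cover the same field of view; the function name `crop_for_inference` and the example resolutions are hypothetical. It picks a main-stream crop that covers the box and is at least as large as the network's input, so the network sees native-resolution pixels instead of a downscaled full frame.

```python
def crop_for_inference(box, sub_size, main_size, net_size):
    """Map a box in sub-stream pixels to a crop rectangle in the main stream.

    box:       (x0, y0, x1, y1) in sub-stream pixels
    sub_size:  (width, height) of the sub stream
    main_size: (width, height) of the main stream
    net_size:  (width, height) the inference network accepts
    Returns (left, top, width, height) in main-stream pixels.
    """
    x0, y0, x1, y1 = box
    sx = main_size[0] / sub_size[0]
    sy = main_size[1] / sub_size[1]
    # Box center in main-stream pixels.
    cx = (x0 + x1) / 2 * sx
    cy = (y0 + y1) / 2 * sy
    # At least the network input size, large enough to cover the box,
    # and never larger than the main frame itself.
    w = min(main_size[0], max(net_size[0], (x1 - x0) * sx))
    h = min(main_size[1], max(net_size[1], (y1 - y0) * sy))
    # Center the crop on the box, then clamp it inside the frame.
    left = min(max(cx - w / 2, 0), main_size[0] - w)
    top = min(max(cy - h / 2, 0), main_size[1] - h)
    return (round(left), round(top), round(w), round(h))


if __name__ == "__main__":
    # Hypothetical 2560x1408 main stream alongside the 640x352 sub stream.
    print(crop_for_inference((300, 150, 340, 200), (640, 352), (2560, 1408), (640, 352)))
```

If the box is bigger than the network input, the crop would still need downscaling afterwards, just by less than the full frame would.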
[1] https://github.com/scottlamb/moonfire-nvr
[2] https://github.com/scottlamb/moonfire-nvr/wiki/Cameras:-Reol...
I also have a sneaking suspicion that using lower channel counts lets you raise the FPS, but the max of 96 channels is the hard limit, tuned to allow for use cases like recognition from unprocessed feeds. The documentation access seems to be a manual approval process, though, so I can't verify for sure.
Good point. At that scale, the price might make sense. (I'd still hesitate to buy this card, though. Based on experience with Amazon VT1 instances, I don't have any faith in Xilinx's software quality.)
There are much lower-cost solutions if you don't need that many cameras, e.g.:
* The Coral TPU is nice and cheap. I keep hoping to see a new version and/or someone making M.2/PCIe cards with several of these chips on them. It doesn't do the video decoding, though, so you need other hardware for that.
* There was an Axelera card just announced. [1] I'm curious to read the reviews when it actually ships to folks.
* The newer Rockchip SoCs advertise decent video decoding and some ML acceleration. I have one and will be trying it out sooner or later.
> The onboard camera processing is usually about justifying a cloud pitch ("we use data to send video when something interesting happens" or "we send only the best picture of the face in HD to save bandwidth but still be able to ID them later") not so much letting you go in and solve your own problem.
My software's aimed more at the home/hobbyist side of things. There, some folks go with the canned/cloud stuff (Ring/Nest/whatever), similar to what you're saying. Some do everything at home with e.g. BlueIris and use the on-camera ML stuff as-is. The lack of flexibility (mostly due to closed-source, low-quality software IMHO) is a real problem, though. Some folks use something like Frigate that does on-NVR analytics, and I'll eventually add that feature to my own software.
> I also have a sneaking suspicion that using lower channel counts lets you raise the FPS, but the max of 96 channels is the hard limit, tuned to allow for use cases like recognition from unprocessed feeds. The documentation access seems to be a manual approval process, though, so I can't verify for sure.
I bet you're right.
[1] https://www.cnx-software.com/2023/01/02/150-axelera-m2-ai-ac...