It would be nice to be able to decode smaller frame dimensions with faster decoding time. That would be useful for viewing 4K material on computers which can't decode the full resolution.
The same goes for 10- and 12-bit videos: it would be nice to be able to decode an 8-bit version for 8-bit displays with faster decoding time.
Update: a more thorough look at the code quickly disillusioned me about this idea. Same as libaom...
Feel free to come on IRC, happy to help you dive into this, it's not very difficult.
Here's a comment that gives a clue - https://news.ycombinator.com/item?id=17539791
Great news though!
No access to those machines, so I cannot guess...
a) can be targeted by C, but not by Rust
b) provide enough performance to make porting a next-gen video decoder a worthwhile exercise?
https://code.videolan.org/videolan/dav1d/graphs/master/chart...
A few have begun to embrace LLVM, and so don't care about the front-end language -- they still only say they support C, but turn out to not notice if you feed in IR from something else. Then it becomes a question of how badly your code needs the language runtime support code, or how good you are at porting it, because they will not pick up maintaining any of that under any circumstance. GC? Ha.
I think the lack of mention of GPUs in the post means the answer will be "no", but is this an area where open-source folks could realistically someday lean on the GPU for any help with decoding at all?
I see mentions of CPU/GPU "hybrid decoding" from GPU vendors, but can imagine that might only be something realistically possible with the lower-level access to the GPU the vendor's own driver team has, not via the documented shader languages and APIs.
Very, very hard to do with standard GPU APIs. You need GPU assembly to do great stuff, and that is rarely available or portable across GPUs.
Also, the issue is that, after SIMD, the run time of the things that are easy to parallelize (and therefore GPU-izable) is around 25% or 30% of the total. That could offer some improvement, but not a 2x improvement.
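To put rough numbers on that, here is a back-of-the-envelope sketch using Amdahl's law. It assumes the 25-30% figure is the share of total decode time that could move to the GPU and that everything else stays on the CPU unchanged; the function names and the 10x acceleration factor are made up for illustration, not measurements.

```python
def amdahl_speedup(parallel_fraction, accel_factor):
    """Overall speedup when only `parallel_fraction` of the run time
    is accelerated by `accel_factor` (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / accel_factor)


def max_speedup(parallel_fraction):
    """Upper bound: the offloaded part takes zero time."""
    return 1.0 / (1.0 - parallel_fraction)


# Hypothetical 10x GPU acceleration of the offloadable part.
for p in (0.25, 0.30):
    print(f"p={p:.2f}: 10x GPU -> {amdahl_speedup(p, 10):.2f}x, "
          f"limit -> {max_speedup(p):.2f}x")
# p=0.25: 10x GPU -> 1.29x, limit -> 1.33x
# p=0.30: 10x GPU -> 1.37x, limit -> 1.43x
```

Even with an infinitely fast GPU and free transfers, the bound stays well under 2x under these assumptions.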
Also, CPU <-> GPU memory transfers need to be avoided on desktop, or on mobile devices where memory access is not uniform, because they add a lot of I/O latency.
So, some things are doable, but a full "GPGPU decoder" is unlikely...
Reasons a GPU vendor might be better able to do this sort of thing than an outsider who can sling OpenGL include: 1) some hybrid decoders are described as leaning partly on special-purpose video decoding hardware, which tends to be a black box to us, and 2) more-detailed understanding of and access to the details of the hardware might let you efficiently express something that's inefficient or awkward in just GLSL--in other words, same kind of reason people care about Metal/Vulkan vs. OpenGL or asm vs. C.
(The further down in the weeds I get the less sure I am of precise technical correctness, but a couple concrete things that seem to make shaderizing decoding tricky are: 1) AV1 has a ton of control-flow-y elements--blocks can be split many different ways and be different sizes, and there are lots of prediction modes--and branchy code can be bad for shader efficiency, and 2) some things seem to block parallelism, e.g. for intra prediction you need the blocks you're predicting from before you can do predictions for the next block. And given the CPU-GPU transfer latency you can't ping-pong back and forth at will; you need large chunks that run well strictly on the GPU. It could be that pieces like the transforms and post-filtering can be cleanly separated into GPU steps, though.)
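To make the intra-prediction dependency point concrete, here is a toy sketch of how such a dependency limits parallelism. It assumes a deliberately simplified model in which every block is the same size and needs only its left and above neighbours to be finished first; real AV1 block partitioning and prediction dependencies are more involved, and `wavefront_schedule` is a made-up name.

```python
from collections import defaultdict


def wavefront_schedule(rows, cols):
    """Group a rows x cols grid of blocks into 'waves'.

    Assumed model: block (r, c) depends on (r, c-1) and (r-1, c).
    Blocks on the same anti-diagonal (r + c constant) do not depend
    on each other, so each wave could run in parallel.
    """
    waves = defaultdict(list)
    for r in range(rows):
        for c in range(cols):
            waves[r + c].append((r, c))
    return [waves[k] for k in range(rows + cols - 1)]


schedule = wavefront_schedule(4, 6)
for step, blocks in enumerate(schedule):
    print(f"step {step}: {len(blocks)} block(s) in parallel")
```

In this toy grid, 24 blocks need 9 sequential steps and never more than 4 run at once, so the available parallelism is capped by the shorter grid dimension rather than by the number of blocks.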
An efficient open-source AV1 decoder based just on OpenGL/GLSL would be great! But since it wasn't mentioned as an ambition in the post, community-written hybrid decoders seem rare, and we had an expert about AV1 decoders in the thread, it did not seem unreasonable to me to ask how realistic it was.
Though if you manage to write an open-source OpenGL-accelerated AV1 decoder, that would definitely answer my question and leave everyone happy. :)
Is there a need to separate VideoLAN and VLC?
Anyway, nice progress; I didn't expect such good results so soon. My main question right now is what the slowest system is on which AV1 is still playable. I know that older CPU and ARM optimizations are on the horizon ("On the other platforms, SSE and ARM assembly will follow very quickly, and we're already as fast on ARMv8."), but I'm curious if my raspberry pi/odroid will ever be able to play 1080p AV1 Videos.
Yes, the communities are not the same. VideoLAN has numerous people not working on VLC.
> Raspberry pi/odroid will ever be able to play 1080p AV1 Videos.
rPi? no. Recent o-Droid, yes.
Whoa, what? What else is going on? Oh, x264 & x265, I bet.
No actual measurements, just a feeling from what we've seen.