You've seen QR codes, maybe Microsoft HCCB [0], maybe jabcode [1] – well, here's a prototype of something new for the pile. :)
I saw txqr [2] a while back and was impressed, but also curious about how much throughput was possible with animated bar codes. I may have gotten carried away in my research into the question.
cimbar is a single- and multi-frame color barcode format using Reed-Solomon (for now) and wirehair [3] for error correction. Files are encoded into an animated series of bar codes drawn to the display. Files are decoded by an Android app [4] with its camera pointed at the cimbar code. This works with all antennas off, e.g. in airplane mode, because it's only using the visual data channel.
I ported the encoder to wasm, because I could: https://cimbar.org
Sustained transfer speed is currently on the order of 800 kilobits/second. So it's not very practical for files larger than a couple of MB, unless you have a lot of free time. :)
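To put the 800 kilobits/second figure in perspective, here is a quick back-of-the-envelope calculation. It assumes decimal units (1 kilobit = 1000 bits, 1 MB = 1,000,000 bytes) and a perfectly sustained rate; `transfer_seconds` is just an illustrative helper, not part of cimbar.

```python
def transfer_seconds(size_bytes, kilobits_per_second=800):
    """Seconds to move size_bytes at a sustained rate given in kilobits/second.

    Assumes decimal units: 1 kilobit = 1000 bits.
    """
    return size_bytes * 8 / (kilobits_per_second * 1000)


for megabytes in (1, 2, 10, 100):
    secs = transfer_seconds(megabytes * 1_000_000)
    print(f"{megabytes:>4} MB: {secs:7.0f} s")
```

So 2 MB is about 20 seconds of holding a phone steady, while 100 MB is closer to 17 minutes.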
[0] https://en.wikipedia.org/wiki/High_Capacity_Color_Barcode
[1] https://github.com/jabcode/jabcode
[2] https://github.com/divan/txqr
(air-gapped computers be damned)
That said... given the somewhat remarkable way fountain codes work, there's nothing stopping us from having a protocol that uses the audio and the video channels simultaneously for better throughput...
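A toy illustration of why that could work: with a fountain code, the receiver only needs enough independent coded symbols, and it doesn't matter which channel delivered them. The sketch below is a minimal random linear fountain code over GF(2), not wirehair and not anything cimbar actually does. The "video" and "audio" channels are just hypothetical, disjoint seed ranges, and blocks are small integers to keep it short.

```python
import random


def encode_symbol(blocks, seed):
    """XOR a pseudo-random, non-empty subset of the source blocks.

    Returns (mask, value). Bit i of mask says whether block i was included.
    A real receiver would regenerate the mask from the seed; it is returned
    directly here to keep the sketch short.
    """
    rng = random.Random(seed)
    k = len(blocks)
    mask = 0
    while mask == 0:
        mask = rng.getrandbits(k)
    value = 0
    for i in range(k):
        if mask >> i & 1:
            value ^= blocks[i]
    return mask, value


def decode(symbols, k):
    """Gaussian elimination over GF(2). Returns the k blocks, or None if
    the symbols received so far do not determine them yet."""
    pivots = {}  # lowest set bit of a reduced row -> (mask, value)
    for mask, value in symbols:
        while mask:
            low = (mask & -mask).bit_length() - 1
            if low not in pivots:
                pivots[low] = (mask, value)
                break
            pmask, pvalue = pivots[low]
            mask ^= pmask
            value ^= pvalue
    if len(pivots) < k:
        return None
    solved = [0] * k
    for bit in reversed(range(k)):  # back-substitute from the highest block
        mask, value = pivots[bit]
        for higher in range(bit + 1, k):
            if mask >> higher & 1:
                value ^= solved[higher]
        solved[bit] = value
    return solved


def demo():
    """Interleave symbols from two hypothetical channels until decoding works."""
    blocks = [0x63, 0x69, 0x6D, 0x62, 0x61, 0x72]  # "cimbar", one byte per block
    video = (encode_symbol(blocks, s) for s in range(0, 1000, 2))  # even seeds
    audio = (encode_symbol(blocks, s) for s in range(1, 1000, 2))  # odd seeds
    received = []
    for v, a in zip(video, audio):
        received += [v, a]
        out = decode(received, len(blocks))
        if out is not None:
            return out, len(received)
    return None, len(received)
```

Calling `demo()` returns the recovered blocks along with how many symbols it took. The decoder never asks where a symbol came from, which is the property that would let two channels add up.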
Having said that, I am actually working on a small library for data-over-sound which can be used for small data chunk transmissions across the room [0].
I'm not 100% sure I understand your example. Is it RDP server (remote) -> RDP client (local) -> cell phone?
A barcode size parameter may be useful (even though it reduces throughput) so that it is easier to scan on all devices.
Transfers getting stuck like that is usually an indication that the decoder can't reliably find the 3+1 corner pattern. I might add a toggle to the decoder app that trades speed for more reliability.
That said, there are some failure modes that are harder to fix: I had a mysteriously slow transfer that was driving me nuts, until I noticed that my mouse cursor was on top of one of the corners.
Curious though, can't you make a web app to record the video, too?
There are a number of loose ends that I'd like to look into down the road, and more flexibility for different use cases is on the list.
edit: I should mention: the main reason the decoder is a native Android app and not a WebAssembly app is that decoding performance is a throughput bottleneck. I wasn't too eager to pay the wasm tax, when I wasn't sure if even native performance would be good enough. As it happens, I now think decode performance is going to be ok -- but it's still a bit of a weak spot in the scheme.
Wonder if someone can decode the .mp4 file from the video. It's not a rickroll, I promise ;-)
Though I think 720p is towards the low end of resolutions this'll work on -- the cimbar code itself is 1024x1024, and while there's some wiggle room (Hamming distance between the symbols), in my experience it gets pretty iffy below 900x900.
Also... I should probably update the overlay to auto-hide when a file is selected -- it helps a lot to have the full color range.
The decoder has a lot of image processing work to do (intriguingly well-suited for the GPU), and also lots of popcnts. I've optimized it a fair bit, but there are probably some tricks I still need to learn. It turns out that mobile processors don't like heat very much, and blow out their cache much quicker than you'd hope. :)
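For anyone wondering where the popcnts come in: matching a noisy pattern to the nearest known symbol by Hamming distance is an XOR followed by a bit count. A minimal sketch, assuming each tile has already been reduced to a small bit pattern. The 16-bit reference values are made up for illustration and are not cimbar's real symbol set.

```python
def hamming(a, b):
    """Number of differing bits between two integers (popcount of the XOR)."""
    return bin(a ^ b).count("1")


def nearest_symbol(tile_bits, symbol_bits):
    """Return (index, distance) of the reference pattern closest to tile_bits."""
    distances = [hamming(tile_bits, ref) for ref in symbol_bits]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical 16-bit reference patterns, for illustration only.
SYMBOLS = [0x0000, 0xFFFF, 0x00FF, 0xF0F0]

noisy = 0x00FF ^ 0b101  # symbol 2 with two bits flipped by "camera noise"
print(nearest_symbol(noisy, SYMBOLS))  # (2, 2)
```

The further apart the reference patterns are, the more flipped bits a tile can absorb before it gets mistaken for a different symbol.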
In the long run, I think your intuition is correct. The hard physics of the camera constraints (exposure time, etc.) will put a hard upper bound on FPS+fidelity, and thus bandwidth.
How much data can this one hold?
4-color (standard): 7500 bytes
8-color: 8750 bytes
monochrome: 5000 bytes
Those numbers are with the standard, fairly high ECC setting (~20% of the image) that I settled on for video-based transfer. If we want to be more aggressive and use half the error correction:
4-color (standard): 8400 bytes
8-color: 9800 bytes
monochrome: 5600 bytes
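The two sets of numbers are enough to back out the raw size of a frame. This is a sketch under two assumptions that the thread doesn't state outright: that payload = raw bytes minus parity bytes, and that "half the error correction" means exactly half as many parity bytes. `infer_raw_capacity` is an illustrative helper, not part of cimbar.

```python
def infer_raw_capacity(payload_full_ecc, payload_half_ecc):
    """Back out raw bytes per frame and the ECC fraction.

    Assumes raw = payload + parity, and that the half-ECC setting uses
    exactly half as many parity bytes as the standard setting.
    """
    parity_full = 2 * (payload_half_ecc - payload_full_ecc)
    raw = payload_full_ecc + parity_full
    return raw, parity_full / raw


MODES = {
    "4-color": (7500, 8400),
    "8-color": (8750, 9800),
    "monochrome": (5000, 5600),
}

for name, (full, half) in MODES.items():
    raw, fraction = infer_raw_capacity(full, half)
    print(f"{name:>10}: {raw} raw bytes, {fraction:.1%} ECC")
```

Under those assumptions, all three modes come out at the same ECC fraction of about 19.4%, which lines up with the "~20% of the image" figure, and the raw frame sizes work out to 9300, 10850, and 6200 bytes.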