What's cool about the era in which we live is if you look at high-performance graphics for games or simulations, for instance, it may in fact be faster to a the model to "enhance" a low-resolution frame rather than trying to render it fully on the machine.
ex. AMD's FSR vs NVIDIA DLSS
- AMD FSR (Fidelity FX Super Resolution): https://www.amd.com/en/technologies/fidelityfx-super-resolut...
- NVIDIA DLSS (Deep Learning Super Sampling): jhttps://www.nvidia.com/en-us/geforce/technologies/dlss/
AMD's approach renders the game at a crummy, low-detail resolution then each frame uses "upscales"
Both FSR and DLSS aim to improve frames-per-second in games by rendering them below your monitor’s native resolution, then upscaling them to make up the difference in sharpness. Currently, FSR uses spatial upscaling, meaning it only applies its upscaling algorithm to one frame at a time. Temporal upscalers, like DLSS, can compare multiple frames at once, to reconstruct a more finely-detailed image that both more closely resembles native res and can better handle motion. DLSS specifically uses the machine learning capabilities of GeForce RTX graphics cards to process all that data in (more or less) real time.
Video is really a series of frames, the framerate for film/human could get away with 24 frames/second-- ~40ms/image for real-time.
What's cool about the era in which we live is if you look at high-performance graphics for games or simulations, it may in fact be faster to run the model on each frame to "enhance" a low-resolution frame rather than trying to render it fully on the machine.
ex. AMD's FSR vs NVIDIA DLSS
- AMD FSR (Fidelity FX Super Resolution): https://www.amd.com/en/technologies/fidelityfx-super-resolut...
- NVIDIA DLSS (Deep Learning Super Sampling): https://www.nvidia.com/en-us/geforce/technologies/dlss/
AMD's approach renders the game at a crummy, low-detail resolution then use "spatial upscaling" to enhance the images one frame at a time.
NVIDIA DLSS uses "temporal upscaling" to pass over multiple frames and uses other capabilities exclusive to Nvidia's cards to stitch together the frames.
This is a different challenge than generating the content from scratch
I don't think this is possible in real-time yet, but someone put a filter trained on the German country side to produce photorealistic Grand Theft Auto driving gameplay:
https://www.youtube.com/watch?v=P1IcaBn3ej0
Notice the mountains in the background go from Southern California brown to lush green
https://www.rockpapershotgun.com/amd-fsr-20-is-a-more-demand....