Just another entry on the "things that are supposed to be impossible that convolutional nets can do now."
but man, it can make your internet pics look smooth! :) thanks for the comment!
(Created by the super talented duncanrobson)
A good content-unaware upscaler would be nice too (as one of the default Photoshop algorithms).
I also wonder what they used for the downscaling. I see 4x4 pixel blocks, but also some with 3px or 7px lengths.
This looks pixely and is supposed to be a source file?: https://raw.githubusercontent.com/Tetrachrome/subpixel/d2e28...
https://arxiv.org/abs/1609.04802
The pic with the boat on page 13 is interesting. In the SRGAN version I would take the shore for some sort of cliff, while the original shows separated boulders.
I'm not familiar enough with the field to understand how the "neural net" part feeds in, other than to do parallel computation on the x-pos, y-pos, (RGB) color-type-intensity tensor interpolated/weighted into a larger/finer tensor.
(linear algebra speak for upscaling my old DVD to HD, that sort of thing)
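To make that "tensor interpolated/weighted into a larger/finer tensor" picture concrete, here is a minimal sketch of what a fixed (non-learned) bilinear upscaler looks like. `bilinear_upscale` is a hypothetical helper, not from the article; every output pixel is just a position-dependent weighted average of its four nearest input pixels, with no learning anywhere:

```python
import numpy as np

def bilinear_upscale(img, scale):
    """Fixed bilinear upscaling of an (H, W, C) array by an integer factor.

    The weights (1 - wx), wx, (1 - wy), wy are a fixed function of pixel
    position -- this is the "just a calculation" baseline, nothing learned.
    """
    h, w, c = img.shape
    hh, ww = h * scale, w * scale
    # Sample positions in input coordinates (align pixel centers).
    ys = (np.arange(hh) + 0.5) / scale - 0.5
    xs = (np.arange(ww) + 0.5) / scale - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    wy = np.clip(ys - y0, 0, 1)[:, None, None]
    wx = np.clip(xs - x0, 0, 1)[None, :, None]
    # Weighted average of the 4 nearest input pixels.
    top = (1 - wx) * img[y0][:, x0] + wx * img[y0][:, x1]
    bot = (1 - wx) * img[y1][:, x0] + wx * img[y1][:, x1]
    return (1 - wy) * top + wy * bot
```

The neural-net approaches replace those fixed weights with filters fitted to a training set of image pairs, which is where the "AI" part actually enters.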
At the risk of exposing my ignorance, this has nothing to do with "AI", right? It's "just" parallel computation?
this may make you feel disappointed now, but in the write-up we are also pitching this same module to be used in generative networks and other models that do build an understanding of the scene. Let's see what the community (and we ourselves) can do next...
Approaches in the past used heuristics (like finding edges and upsampling them, etc). Those were fragile systems. In this approach, the system learns what's appropriate on its own.
I missed the part, though, where there was some "learning"/"adjusted prediction" in the interpolation function(s), rather than just a fixed calculation such as a literal linear interpolation.
I was happy just to be able to tease apart the big equation before the Python code sample, but was too lazy to drill down into what the "delta-x"/"delta-y" factor-functions were.
Still, this was a good presentation: somebody with little to no knowledge of the field, but some math, could get the gist of it. Kudos to the author.
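For anyone else who stopped at the big equation: the learned part lives in the convolution filters, and the final "upscaling" step is just a fixed rearrangement of the r*r feature maps those filters produce (the periodic-shuffling / pixel-shuffle operator). A minimal sketch of that rearrangement, assuming channels-last layout and one particular interleaving convention:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange an (H, W, C*r*r) tensor into (H*r, W*r, C).

    In sub-pixel convolution, a learned conv layer emits r*r values per
    low-res pixel per output channel; this op just interleaves them into
    an r-times-larger grid. The shuffle itself contains no parameters --
    all the "learning" happened in the conv filters that produced x.
    """
    h, w, crr = x.shape
    c = crr // (r * r)
    # Split the channel axis into (r, r, C) sub-pixel offsets, then
    # interleave those offsets with the spatial axes.
    out = x.reshape(h, w, r, r, c)
    out = out.transpose(0, 2, 1, 3, 4)  # (h, r, w, r, c)
    return out.reshape(h * r, w * r, c)
```

So a fixed bilinear kernel is replaced by whatever filters training settles on, which is what lets the system hallucinate plausible texture instead of only smoothing.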
Unless we are telling it to "be intelligent" whatever that means.
How were the input images prepared?
edit:
example: https://youtu.be/yZyIYUEfT3U?t=71
Also, masks themselves can be motion blurred, and if motion blur approximation is close enough to the footage, then it's good https://www.youtube.com/watch?v=biginQL6NIo
And, what it looks like pulling a matte with state-of-the-art tools https://www.youtube.com/watch?v=8oQqr6Lfmag Still a pain.
It's still useful though, browsers, for instance, could use it for displaying downscaled images.
You're right though, and that's why chroma hinting for subpixel AA has fallen out of favor. It also doesn't work on mobile where the screen can be rotated from RGB-horz to RGB-vert at a moment's notice. This was changed for ClearType in Windows 8 (DirectWrite never did chroma hinting).