Here’s an example [1] of how bokeh really looks, using a high quality fast lens on a decent camera. We’ve got a long way to go.
[1] https://upload.wikimedia.org/wikipedia/commons/8/8a/Josefina...
Edit: I really have no idea what I’m talking about. I’m just guessing it can definitely be improved upon. I probably should’ve stopped at “it’s a nice proof of concept” (like most free content) :)
Rather than relying on a hardware solution, this is a good software approach that can work on more devices.
The effect is that blurred colors mix incorrectly, as you can see where the red from the flower mixes with the green. The transition looks wrong, and not how a lens would render it.
- a background scene is usually static. if a pixel matches its historical average (i.e. not changing) then it's probably background.
- if a pixel color matches its neighbors, it's probably the same as them. i noticed my black shirt tripped it up on occasion, but it was all-or-nothing.
- people are blobs, not diffuse, so try to segment large regions.
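The first heuristic can be sketched as a running-average background model (a minimal NumPy sketch; the decay rate and tolerance are my own illustrative choices, not anything from the article):

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Exponential moving average of past frames approximates the static background."""
    return (1 - alpha) * bg + alpha * frame

def background_mask(bg, frame, tol=20.0):
    """A pixel close to its historical average is probably background."""
    diff = np.abs(frame.astype(float) - bg)
    return diff.max(axis=-1) < tol  # True where the pixel looks like background

# toy example: a static grey scene with a bright foreground blob in the new frame
bg = np.full((4, 4, 3), 100.0)
frame = bg.copy()
frame[1:3, 1:3] = 250.0  # the "person"
mask = background_mask(bg, frame)  # True on the static border, False on the blob
```

The all-or-nothing failure on dark clothing would show up here too: a black shirt on a dark background falls inside the tolerance and gets classified as background wholesale.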
--
By the way, Google does some absolutely nuts stuff with this on the Pixel 3 and 4 - they actually calculate a stereo depth map using the two autofocus sites in individual pixels. Essentially, some modern CMOS sensors use a technology called dual pixel autofocus (DPAF): by measuring the response from two photodiodes in the same pixel, the camera adjusts focus until each pixel reads roughly the same intensity from both photodiodes. If the camera is out of focus, the two photodiodes will have different intensities.
However what this gives you is two separate images with an extremely small (but detectable) parallax which can be used to give coarse 3D reconstruction, and you can segment foreground and background. It's nice because you get a strong physical prior, rather than having to worry about using a convnet to identify fore/background regions. (They of course apply a convnet anyway to refine the result).
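The parallax-to-depth idea can be sketched with brute-force block matching between the two sub-images (a toy illustration only, nothing like Google's actual pipeline; the patch size and search range are arbitrary assumptions):

```python
import numpy as np

def disparity_1d(left, right, patch=3, max_shift=4):
    """Brute-force 1D block matching: for each pixel, find the horizontal
    shift of `right` that best matches a patch of `left`.
    Larger disparity ~ closer object, giving a coarse depth map."""
    h, w = left.shape
    half = patch // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(half, h - half):
        for x in range(half, w - half):
            ref = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            for d in range(max_shift + 1):
                if x - half - d < 0:
                    break  # shifted window would fall off the image
                cand = right[y - half:y + half + 1, x - half - d:x + half + 1 - d]
                cost = np.abs(ref - cand).sum()  # sum of absolute differences
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

With DPAF the baseline is a fraction of a millimetre, so the disparities are tiny; that's why the result is only a coarse prior that the convnet then refines.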
https://ai.googleblog.com/2018/11/learning-to-predict-depth-...
https://ai.googleblog.com/2019/12/improvements-to-portrait-m...
You can do better by playing with intensities of pixel values as suggested in the article I linked.
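If "playing with intensities" means blurring in linear light rather than on gamma-encoded sRGB values (my assumption; I can't see the linked article), the difference is easy to demonstrate: averaging gamma-encoded red and green gives a dark olive, while averaging in linear light gives the brighter blend a real lens produces:

```python
import numpy as np

def srgb_to_linear(c):
    # approximate sRGB decoding with a pure 2.2 gamma (good enough to show the effect)
    return (np.asarray(c, dtype=float) / 255.0) ** 2.2

def linear_to_srgb(c):
    return np.clip(c, 0.0, 1.0) ** (1 / 2.2) * 255.0

red = np.array([255.0, 0.0, 0.0])
green = np.array([0.0, 255.0, 0.0])

naive = (red + green) / 2  # averaging sRGB values directly: dark olive
linear = linear_to_srgb((srgb_to_linear(red) + srgb_to_linear(green)) / 2)
# the linear-light blend is noticeably brighter than the naive one
```

The same principle applies to a full blur kernel: decode to linear, convolve, re-encode.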
mask = masks[0][0]
Presumably 0 is the class ID? For someone new to ML or object detection, it might not be obvious why you take the first channel here. Also, recent related reading: https://bartwronski.com/2020/03/15/using-jax-numpy-and-optim...
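If the model follows torchvision's Mask R-CNN output convention (an assumption on my part), `masks` has shape `[N, 1, H, W]`: the first index picks a detected instance (ordered by score) and the second picks the single mask channel; it isn't a class ID. A small NumPy illustration with fake output:

```python
import numpy as np

# pretend model output: 3 detected instances, 1 mask channel each, on a 4x4 image
masks = np.random.rand(3, 1, 4, 4)

mask = masks[0][0]    # soft mask of the highest-scoring instance, shape (4, 4)
binary = mask > 0.5   # threshold to a boolean foreground mask
```

Class labels would live in a separate `labels` array in that convention, not in the mask tensor.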
HN Discussion: https://news.ycombinator.com/item?id=22590360
https://ai.facebook.com/blog/-powered-by-ai-turning-any-2d-p...
Anyway, the entire technique is also very easy to do manually with Photoshop.
If not, this approach seems to have merit.
This is bokeh: https://cdn.mos.cms.futurecdn.net/JgrDuxQPCAvgd5VzqiKN5a-650...
And bokeh is 3-dimensional and surrounds the focus area, increasing with distance: https://www.flickr.com/photos/scottcartwrightphotography/145...
I suggest brushing up a bit on photo skills, although I agree that the Instagram and iPhone culture can shift your perception of "bokehlicious" quite a bit.
I don't want to make you feel bad, so I'll also tell you that Apple's "bokeh" in portrait mode looks disgusting, doesn't make any sense, and breaks the image in a lot of trivial cases... and they spent a lot more money & effort on it than you :).
I just wanted to see how far I could take this idea.
I was able to scroll to the bottom footer, but I can’t scroll back to the top, so now there’s nothing on the page but the footer.