We've seen some really interesting stuff on HN about tracking thru crowds, reconstructing images from fragments etc. If these folks can do anything like that, they aren't showing it.
Regarding (2), if I understand the paper properly, this should allow for a massive increase in lens quality while also allowing for lenses to be much smaller. Both are worth tons of money... together, it's massive. As a guy who carries around a $1900 lens that weighs 2.5 lbs, because of the creative options it gives me that no other lens can, this appeals to me greatly!
Granted, their demo isn't impressive, but they're underutilizing their technology, and honestly I can't think of a better demo either. Don't be misled, though. This light field camera is capturing far more information. Meaningful information. I wonder if it's possible to, like, create 3D models of objects in these images? That would probably be more "computational camerawork". What's impressive is that that could be done after the fact.
https://secure.wikimedia.org/wikipedia/en/wiki/Focus_stackin...
One can put a diffractive optical element in front of the sensor and obtain 25 instantaneous images, each at a different depth.
http://waf.eps.hw.ac.uk/research/live_cell_imaging.html
Couple this with high resolution techniques and you're at the current research front and could possibly answer the question:
What happens in a synapse?
It is theoretically possible to image at 40nm resolution with 100fps and therefore see the transport of vesicles.
One can expect important discoveries on how the brain works using these techniques.
custom binary format?
And I suppose this is the key
So here is the parser script. The example.html is very messy, sorry for that. https://gist.github.com/997861
(note this downloads 12 Mb of data)
You can look at the source: http://lightfield.stanford.edu/aperture.html
EDIT: And a few job listings [2] [3].
[1]: https://twitter.com/#!/ManuKumar/portfolio/members
[2]: http://www.indeed.com/q-Lytro-l-Mountain-View,-CA-jobs.html
The company seems to be doing well, and recently changed names to Lytro in order to not be pigeonholed into refocusing applications only.
McDonald's bought Chipotle but didn't rename it "McDonald's", nor did they rename McDonald's "Chipotle". McDonald's is for hamburgers and Chipotle is for burritos, even if the money ends up in the hands of the same people.
It is almost never beneficial to merge existing brands (unless one of the brands has a horrible reputation), so why should someone generalize away from a successful niche application and lose the associated branding instead of just starting a new brand for the new field?
Kinect's image capture is very low resolution (even by today's cellphone camera standards), and it doesn't give you depth information at per-pixel resolution. Even ignoring those issues, in addition to depth information you also need a source image with critically sharp focus across the entire viewing range (you can't selectively focus in software that which was captured out of focus with a standard digital sensor), which means using a very small aperture (large f-number). So it'll be very difficult to capture anything but still-life images, because the small aperture means a long exposure time, and thus motion blur if anything moves. Granted, this is already less of an issue with Kinect because the sensor in it is so tiny that out-of-focus areas are not much of a concern, but the cost of that is that the image resolution is also atrocious.
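To put a rough number on the aperture vs. exposure time trade-off: the light reaching the sensor falls off with the square of the f-number, so the shutter time needed for the same exposure grows with the square of it. A minimal sketch; the function name and the example values (1/250 s at f/2.8, stopped down to f/22) are made up for illustration.

```python
def exposure_time(base_time_s, base_f_number, new_f_number):
    """Shutter time that keeps the exposure constant after changing the f-number.

    Light on the sensor scales with 1 / N^2, so the time scales with N^2.
    ISO and scene brightness are assumed unchanged.
    """
    return base_time_s * (new_f_number / base_f_number) ** 2


# Hypothetical example: 1/250 s at f/2.8, stopped down to f/22.
t = exposure_time(1 / 250, 2.8, 22)
print(f"{t:.3f} s")  # prints "0.247 s"
```

So a shot that freezes motion at f/2.8 turns into roughly a quarter-second exposure at f/22, which is tripod territory.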
Once you get up to usable sensor resolutions, if you're already limited to taking long exposures of still-life images on a tripod, you might as well skip the IR depth perception and just take a series of wider aperture pictures at different focus plane levels, focus stack the results and preprocess the image series for blur levels to work out the relative depths of the in-focus bits of each source image. At least doing it that way you can use a DSLR to get quality photos.
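A rough sketch of the "work out relative depth from blur levels" step, in plain Python. It assumes the focus stack is already aligned and uses a discrete Laplacian as the sharpness measure; that measure is my choice for illustration, and real focus-stacking tools may use something else. All names here are hypothetical.

```python
def sharpness(img, y, x):
    """Absolute discrete Laplacian at (y, x); large where the image is in focus."""
    return abs(4 * img[y][x] - img[y - 1][x] - img[y + 1][x]
               - img[y][x - 1] - img[y][x + 1])


def depth_from_focus(stack):
    """For each interior pixel, return the index of the sharpest slice.

    `stack` is a list of equally sized grayscale images (lists of rows),
    one per focus setting. Border pixels are left at -1.
    """
    h, w = len(stack[0]), len(stack[0][0])
    depth = [[-1] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            scores = [sharpness(img, y, x) for img in stack]
            depth[y][x] = scores.index(max(scores))
    return depth


# Toy stack: the left detail is sharp in slice 0, the right detail in slice 1.
slice0 = [[0, 0, 0, 0], [0, 9, 3, 0], [0, 0, 0, 0]]
slice1 = [[0, 0, 0, 0], [0, 3, 9, 0], [0, 0, 0, 0]]
depth = depth_from_focus([slice0, slice1])
print(depth[1])  # prints [-1, 0, 1, -1]
```

The slice index is only a relative depth; turning it into a distance would need the focus distance of each shot.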
Neither of these is a true replacement for what they are doing here, though.
However, one captures a lot more information with the light field camera. For example, transparent things like smoke, fog and glass, and things with weird optical properties like polished steel or the chatoyant tiger's eye mineral, will be captured by this camera.
This gives the photographer tremendously more artistic space. Just imagine photographing a close-up of an eye with the Kinect technology.
One can argue that the light field camera will maintain a quality in the image one could never achieve with Kinect based systems (without a lot of photoshopping).
I guess Ren Ng is behind the company as he was first author: http://graphics.stanford.edu/~renng/
This method works nicely for photography, where all the dimensions involved are much bigger than the wavelength. I work on something like that in fluorescence microscopes. I can tell you, it is much harder when you have to consider wave optics.
Here is a related talk: http://www.youtube.com/watch?v=THzykL_BLLI
BTW, your comment illustrates that much of "computational photography" is just re-application of tricky imaging tech from other fields. Nothing wrong with that, but something to keep in mind.
[1]http://www.tgeorgiev.net/Lippmann/index.html [2]http://www.futurepicture.org/?p=34
It captures a video while you move the iPhone camera and combines it into an image that looks like it was captured with a big aperture.
I don't have an iPhone and have never seen this app in real life, though.
It also had a feature that automatically saved the frames of a video that had a face with a smile.
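I don't know how that app does it internally, but the basic idea of combining frames from a moving camera into a big-aperture look can be sketched as shift-and-add: shift each frame so that objects at the chosen depth line up, then average. A toy version in plain Python, assuming purely sideways camera motion and known camera positions (a real app would have to estimate those); all names are hypothetical.

```python
def shift_row(row, s):
    """Shift a row right by s pixels (left if s is negative), repeating edge values."""
    w = len(row)
    return [row[min(max(x - s, 0), w - 1)] for x in range(w)]


def synthetic_aperture(frames, offsets, disparity):
    """Average the frames after shifting each by offset * disparity pixels.

    frames:    grayscale images (lists of rows) from a camera moved sideways.
    offsets:   camera position for each frame, arbitrary units (assumed known).
    disparity: pixels of image shift per unit of camera motion at the depth
               that should end up sharp; other depths get smeared out.
    """
    h, w = len(frames[0]), len(frames[0][0])
    out = [[0.0] * w for _ in range(h)]
    for frame, off in zip(frames, offsets):
        s = round(off * disparity)
        for y in range(h):
            shifted = shift_row(frame[y], s)
            for x in range(w):
                out[y][x] += shifted[x] / len(frames)
    return out


# One bright point seen from three camera positions; it moves 1 px per step.
frames = [
    [[0, 0, 0, 0, 9, 0, 0]],  # camera at -1
    [[0, 0, 0, 9, 0, 0, 0]],  # camera at  0
    [[0, 0, 9, 0, 0, 0, 0]],  # camera at +1
]
offsets = [-1, 0, 1]
in_focus = synthetic_aperture(frames, offsets, disparity=1)
blurred = synthetic_aperture(frames, offsets, disparity=0)
print(in_focus[0])
print(blurred[0])
```

With the matching disparity the point adds up in one pixel; with the wrong one it is spread over three, which is the out-of-focus blur of the synthetic aperture.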
Why is Sports Photography Hard? (and what we can do about it)
http://research.microsoft.com/apps/video/dl.aspx?id=146906
So for sports photography it seems very useful. Replacing the green screen doesn't seem to make sense.
For macroscopic objects depth reconstruction with two cameras like in the Kinect seems the better alternative.
Object exemption seems to be a nice idea. I can't remember having read anything about this. In principle it should be possible to recover an unsectioned stack of images.
One can then use an iterative algorithm to subtract the in-focus information of one slice of the stack from each of the other slices and end up with a deconvolved stack of sectioned images.
Then one could delete one object in the stack and recalculate a superposition of blurred sectioned images to recover a reconstruction representing the image without the object.
This is quite complicated. Just imagine removing a wine glass from a scene. One needs to take all the rays that went through the wine glass and bend them back as though the wine glass wasn't there.
One can argue that polarization and absorption effects will be very hard or even impossible to handle correctly.
Certainly light fields contain A LOT of potential.
http://visualnary.com/2008/04/13/lens-that-takes-multiple-pi... http://visualnary.com/2008/04/13/nab-predictions.html
This is the capture part of the capture and display of true 3D images.
What do I mean by 'true'? Imagine a screen that works like a window.
If you think about a window or a mirror as a display screen, you can imagine that every point on the screen is a tiny hemispherical lens, and light exits the screen in all directions through these lenses. By producing light in every direction (as opposed to just perpendicular to the screen + diffusion) you could let your eye decide what to focus on. Additionally, such a system would be view-angle agnostic, so you could look from the side and see a wider 'view' into the scene (again noting this works for n viewers).
Such a display would be complex to implement, but even if you had one you'd need image capture such as Lytro is providing to make it work.
Exciting times!