Splash of Color: Instance Segmentation with Mask R-CNN and TensorFlow

(engineering.matterport.com)

47 points · waleedka · 8y ago · 5 comments

5 comments · 2 top-level

Magic results, but R-CNN (and other similar architectures with a large number of forward passes per frame) is pretty darn slow, which severely limits its applicability.

Q6T46nT668w6i3m · 8y ago

It really depends on the application. If you’re comparing to classical instance segmentation methods like a distance-based watershed, Mask R-CNN (e.g. using the Keras-RCNN package) has nearly identical performance. However, if you’re looking for real-time performance, you’ll need to look elsewhere (and likely reframe your problem as an object detection problem rather than an instance segmentation problem).

mliker · 8y ago

R-CNN is slow, but Fast R-CNN and Faster R-CNN simply use one forward pass per image, making them feasible for use in real time.

metaobject · 8y ago · 1 in thread

For people familiar with Mask R-CNN, how might this model be used to detect clouds (in the sky) in all-sky imager data? An all-sky imager is basically a camera with a fisheye lens, mounted on the ground and facing the sky. Assume training data is available on a pixel-by-pixel basis, with each pixel classified as either sky or cloud.

Since clouds are amorphous, it seems there would be problems trying to feed training data to the model. Could one simply use the entire training image by setting the cloud's bounding box to the entire image bounds?

I'm exploring new models, having already tried Fully Convolutional DenseNet with semi-satisfactory results (but with a very large GPU memory footprint).

waleedka (OP) · 8y ago

I’m very curious about the use of such a cloud finder. But, to answer your question, the Mask R-CNN model would be useful if you want to identify individual clouds. As in, find clouds surrounded by empty sky. Or, for example, you want to find clouds that look like a puppy or a sheep. On the other hand, if you don’t care about identifying individual clouds and only want to find the pixels that belong to the ‘cloud’ class, then a semantic segmentation model would likely do better.

Also, while I don’t know much about your use case, using DenseNet seems like it might be overkill since you only have two classes, cloud and sky. A lighter network might give you better results, especially if you don’t have a lot of training data.
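To make the instance-versus-semantic distinction above concrete, here is a minimal sketch in plain Python. It is not Mask R-CNN and not anything from the linked article. It uses classical 4-connected component labeling to split a per-pixel sky/cloud mask (the semantic view) into individual clouds with one bounding box each (the instance view). The function name `label_instances`, the toy mask, and the choice of 4-connectivity are all assumptions made for illustration. Real clouds that touch or overlap would not separate this cleanly.

```python
from collections import deque


def label_instances(mask):
    """Split a binary mask (1 = cloud, 0 = sky) into 4-connected components.

    Returns a list of (pixels, bbox) tuples, where bbox is
    (row_min, col_min, row_max, col_max) with inclusive bounds.
    Hypothetical helper for illustration only.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    instances = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited cloud pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            instances.append((pixels, (min(ys), min(xs), max(ys), max(xs))))
    return instances


# Toy 4x6 "sky" containing two separate clouds (assumed data).
toy_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Semantic view: just count the pixels labeled "cloud".
semantic_pixel_count = sum(sum(row) for row in toy_mask)

# Instance view: separate clouds, each with its own pixel set and box.
clouds = label_instances(toy_mask)
for i, (pixels, bbox) in enumerate(clouds):
    print(f"cloud {i}: {len(pixels)} pixels, bbox={bbox}")
```

The semantic view only answers "which pixels are cloud", while the instance view also answers "which cloud does each pixel belong to". Using the whole image as a single bounding box, as asked upthread, amounts to treating all cloud pixels as one instance, which is effectively the semantic framing.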
