However, testing in waves is very important, and this article shows none of that (barely mentions it, in fact). But at least the big boats have the advantage of being able to put a sensor 50 meters up in the air, if that helps.
[1] http://www.smh.com.au/sport/sailing/sydney-to-hobart-comanch...
Another modern shipwreck: http://www.newyorker.com/news/sporting-scene/twenty-first-ce...
Detecting the horizon seems trivial enough (Hough transform? or a robust edge detector). If you know where that line is, then you can take a series of gradients normal to it and form an average temperature model as a function of distance from the horizon. If you take enough of these gradients across the image, you should be able to get a robust average.
You then use this model to predict the temperature at each pixel and take the difference with the observed data. What should happen is you get a ton of ~zeros plus a bit of noise due to waves, and any non-sea object will stick out like a sore thumb.
The rule here is that absolute measurements will kill you when it comes to dynamic range. You're much better off taking relative measurements.
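For what it's worth, here is a rough sketch of that idea in Python with NumPy. It is only an illustration under some loud assumptions: the horizon is level (so "normal to the horizon" is just the image columns), the horizon finder is a crude stand-in for a real Hough or edge detector, and I use a median instead of a plain average so the object itself doesn't drag the model. The function names and the threshold `k` are made up for the example.

```python
import numpy as np

def find_horizon_row(img):
    """Crude stand-in for a Hough/edge detector (assumption: level horizon).
    Returns the index of the first sea row, taken as the row just below the
    largest jump in row-median temperature."""
    row_med = np.median(img, axis=1)
    return int(np.argmax(np.abs(np.diff(row_med)))) + 1

def sea_anomalies(img, k=6.0):
    """Flag sea pixels that deviate from a per-distance temperature model.
    k is a hypothetical threshold in units of robust standard deviations."""
    h = find_horizon_row(img)
    sea = img[h:].astype(float)
    # Model: one expected temperature per distance (row) below the horizon.
    # Median rather than mean, so a warm object does not bias the model.
    model = np.median(sea, axis=1, keepdims=True)
    # Relative measurement: observed minus predicted, mostly ~zero over water.
    residual = sea - model
    # Robust noise scale from the median absolute deviation (MAD).
    mad = np.median(np.abs(residual - np.median(residual)))
    sigma = 1.4826 * mad + 1e-9
    mask = np.zeros(img.shape, dtype=bool)
    mask[h:] = np.abs(residual) > k * sigma
    return h, mask
```

Calling `sea_anomalies(frame)` on a 2-D array of temperatures returns the estimated horizon row and a boolean mask of suspicious pixels. Because everything is thresholded on the residual rather than on absolute temperature, the threshold doesn't care whether the sea is at 5 °C or 25 °C. A tilted horizon would need the distance computed perpendicular to the fitted line instead of along rows.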