https://www.reddit.com/r/apple/comments/9fkb3t/im_an_emergen...
"Medical grade" is hard to define, but you might actually be very surprised by how accurate this simple sensor could be in an era of deep learning AI.
Earlier this year, a paper published at AAAI (https://arxiv.org/pdf/1802.02511.pdf) found that, using just the sparse, noisy data from the Apple Watch (AW) sensors (non-continuous heart rate measurements and a handful of HRV estimates every day), a model could diagnose diabetes, high cholesterol, high blood pressure, and sleep apnea with relatively high accuracy.
In fact, the diabetes diagnoses were comparable in accuracy to cheap lab tests specifically for diabetes. And even more surprisingly, the sleep apnea diagnoses could be made even if one doesn't wear the watch during sleep.
There are other recent papers showing extremely good accuracy in detecting rhythm abnormalities.
Deep learning can often magnify the power of cheap, simple sensors in ways that can seem unimaginable, partly because of the power of multi-dimensional inference, and partly because the volume of data you get from wearing a device 24/7 helps to compensate for all the noise and sparsity.
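To make the "volume compensates for noise" point concrete, here's a toy sketch. It is not the paper's method (that's a neural network), and it assumes independent Gaussian noise, which real wrist sensors won't give you. The 60 bpm true value, the 8 bpm noise level, and the function name are all made up for illustration. It just shows that the error of an averaged estimate shrinks roughly like noise / sqrt(n) as readings pile up.

```python
import math
import random


def rms_error_of_mean(n_samples, true_value=60.0, noise_sd=8.0, trials=500, seed=0):
    """RMS error of the average of n_samples noisy readings, measured over many trials.

    Assumption (hypothetical numbers): each reading is the true value plus
    independent Gaussian noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    squared_error = 0.0
    for _ in range(trials):
        readings = [rng.gauss(true_value, noise_sd) for _ in range(n_samples)]
        estimate = sum(readings) / n_samples
        squared_error += (estimate - true_value) ** 2
    return math.sqrt(squared_error / trials)


if __name__ == "__main__":
    # Compare the simulated error with the textbook noise_sd / sqrt(n) figure.
    for n in (1, 10, 100, 1000):
        print(n, round(rms_error_of_mean(n), 2), round(8.0 / math.sqrt(n), 2))
```

Averaging only handles the noise half of the argument. The multi-dimensional inference half is what the learned model adds on top.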
And that was with the current gen AW sensors, which are now a generation behind some other consumer devices. I'm sure the AW4 catches up (and can likely gather data like continuous HRV). Add in another data point like ECG, even if it's the simplest possible form of ECG, and I wouldn't be surprised if the diagnostic accuracy for many conditions ends up higher than some lab tests. That's especially true for transient rhythm abnormalities like transient afib, which might show up late at night after drinking but not occur during a lab test; that probably makes transient afib hard to detect in a lab until a lot of cardiac changes have occurred.
Commenter /u/AlanYx continues in grandchild:
Yup. Humans also have difficulty thinking and reasoning multi-dimensionally, especially with conditional probabilities across those dimensions. So while the watch probably can't apply the "simplistic fill-in-the-blank style" of reading EKGs taught in the Dubin book that someone recommended above (because the sensors are just too simplistic and the data collected too limited), deep learning can see through the data in ways people often can't.
There's a good example of this in the DeepHeart paper I linked to: the authors mention that no diagnostically predictive relationship between heart rate patterns during waking hours and sleep apnea was known prior to the work, likely because it is too complex to spot by just looking at heart rate graphs the way humans do. (And in fact, because a convolutional neural network is used, it's not easy right now to explain to humans exactly what the computer is "seeing", although its specificity and sensitivity characteristics are known.)
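For anyone unfamiliar with the terms, sensitivity and specificity describe how a black-box classifier behaves even when nobody can explain its internals. Here's a minimal sketch of how they're computed from labelled outcomes. The labels below are invented for illustration and are not results from the paper, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = has the condition."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of true cases the model catches
    specificity = tn / (tn + fp)  # share of healthy cases the model clears
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up example: 4 people with the condition, 6 without.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    print(sensitivity_specificity(y_true, y_pred))
```

Both numbers come purely from comparing predictions against known diagnoses, which is why they can be measured without understanding what the network has learned.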